Hi,
I want to implement an RDD where the number of partitions is based on the
number of executors that have been set up. Is there some way I can determine
the number of executors within the getPartitions() call?
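One possible approach (an assumption, not a confirmed answer from this thread): in Spark 1.x, SparkContext.getExecutorMemoryStatus reports the block managers that have registered, which you could consult before constructing the RDD. A minimal spark-shell sketch, assuming a live cluster:

```shell
# Sketch only: print the number of registered executors on a running cluster.
# getExecutorMemoryStatus includes the driver's block manager, hence the -1.
echo 'println(sc.getExecutorMemoryStatus.size - 1)' | ./bin/spark-shell --master yarn-client
```

Note that executors may register asynchronously after startup, so the count can be lower than expected if read too early.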
Hello.
I recently upgraded Storm to 0.9.2.
On the Component summary page of the Storm UI, the Executors for a
spout/bolt are not displayed only when its "emitted" count is 0.
Please tell me the solution.
Thanks.
Up until last week we had no problems running a Spark standalone cluster. We
now have a problem registering executors with the driver node in any
application. Although we can run start-all and see the worker on port 8080,
no executors are registered with the BlockManager.
The feedback we have is scant, but we're getting output like this, which
suggests it's a name resolution issue of some kind:
14/04/09
Running on Amazon EMR w/Yarn and Spark 1.1.1, I have trouble getting Yarn
to use the number of executors that I specify in spark-submit:
A cluster with two core nodes will typically end up with only one executor
running at a time. I can play with the memory settings and
num-cores-per-executor, and sometimes I can get 2 executors running
consistently.
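For reference, these are the spark-submit flags that usually govern this on YARN; the values below are illustrative (an assumed two-node layout, and my_job.jar is a hypothetical name), and YARN must have enough container memory and vcore headroom per node to actually grant the requests:

```shell
# Illustrative sketch: explicitly request two executors sized to fit the
# nodes. If executor memory + overhead exceeds YARN's per-container maximum,
# YARN silently grants fewer containers than requested.
spark-submit \
  --master yarn-client \
  --num-executors 2 \
  --executor-cores 2 \
  --executor-memory 4g \
  my_job.jar
```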
Hi,
I am using Spark 1.0.1 on Yarn 2.5, and doing everything through spark
shell.
I am running a job that essentially reads a bunch of HBase keys, looks up
HBase data, and performs some filtering and aggregation. The job works fine
on smaller datasets, but when I try to execute it on the full dataset, the
job never completes. The few symptoms I notice are:
a. The job shows progress for a
I am running Spark 1.1.0 on AWS EMR, with a batch job that seems to be
highly parallelizable, in yarn-client mode. But Spark stops spawning any
more executors after spawning 6, even though the YARN cluster has 15 healthy
m1.large nodes. I even tried providing '--num-executors 60' during
spark-submit, but even that doesn't help. A quick look at the Spark admin UI
suggests
Hi,
I have a simple spout and bolt. My spout reads 10 files, each containing 200
MB of data, and my bolt writes the received tuples to a file, in order to
check how much time this takes in a Storm cluster.
In my cluster I have 3 machines, for nimbus, supervisor-1, and supervisor-2
respectively.
Please suggest how many workers, and how many executors for the spout and
bolt, I should use.
I tried
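One way to experiment without resubmitting the topology each time is the storm CLI's rebalance command, which can change worker and executor counts on a running topology. A sketch, assuming a topology named file-topology with components named "spout" and "bolt" (all three names hypothetical):

```shell
# Adjust a running topology: 2 workers, 2 executors for the spout,
# 4 executors for the bolt. Executor counts cannot exceed the task counts
# fixed at submission time.
storm rebalance file-topology -n 2 -e spout=2 -e bolt=4
```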
Hi,
I'm trying to execute a streaming application using local[4], but I see
only one executor in the web UI. Shouldn't there be more? One executor per
worker thread?
I'm trying to open connections to a MySQL database on all the worker nodes
and keep them open until the end of the stream.
Do you know of a better way to do this? Right now I'm just trying to
create static connections in each
[ https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lars Hofhansl updated HBASE-3809:
---------------------------------
Fix Version/s: (was: 0.94.0)
0.96.0
Moving out of 0.94.
> .META. may not come back online if > number of executors servers crash and one of
those > number of executors
Can someone explain what each of these terms means in terms of Spark? I'm
confused about the difference between slaves, workers, and executors.
My understanding is that slaves and workers are interchangeable?
Thanks.
I'm using Spark 1.0.1 on quite a large cluster, with gobs of memory, etc.
Cluster resources are available to me via YARN, and I am seeing these
errors quite often:
ERROR YarnClientClusterScheduler: Lost executor 63 on : remote Akka
client disassociated
This is in an interactive shell session. I don't know a lot about YARN
plumbing and am wondering if there's some constraint in play -- executors
Hi, I asked a similar question before and didn't get any answers, so I'll
try again:
I am using updateStateByKey, pretty much exactly as shown in the examples
shipping with Spark:
def createContext(master: String, dropDir: String, checkpointDirectory: String) = {
  val updateFunc = (values: Seq[Int], state: Option[Int]) => {
    val currentCount = values.sum
    val previousCount = state
I'm launching a Spark shell with the following parameters
./spark-shell --master yarn-client --executor-memory 32g --driver-memory 4g
but when I look at the Spark UI it shows only 209.3 GB total memory.
Executors (10)
- *Memory:* 55.9 GB Used (209.3 GB Total)
This is a 10 node YARN cluster where each node has 48G of memory.
Any idea what I'm missing here?
Thanks
-Soumya
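Part of the gap between the 10 x 32 GB requested and the 209.3 GB shown is expected: in Spark 1.x the Executors page reports storage memory, which is only a fraction of each executor heap (spark.storage.memoryFraction, default 0.6, further scaled by a safety fraction), not the full --executor-memory value. An illustrative launch making that setting explicit (the values shown are the defaults, not a recommendation):

```shell
# Illustrative (Spark 1.x): "Memory Total" in the UI is storage memory,
# roughly memoryFraction * safetyFraction of each heap, so it will always
# read well below num_executors * executor-memory.
./spark-shell --master yarn-client \
  --executor-memory 32g --driver-memory 4g \
  --conf spark.storage.memoryFraction=0.6
```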
I'm trying to compare the performance of Spark running on Mesos vs. YARN.
However, I am having trouble configuring the Spark workload to run in a
similar way on Mesos and YARN.
When running Spark on YARN, you can specify the number of executors per
node. So if I have a node with 4 CPUs, I can specify 6 executors on that
node. When running Spark on Mesos, there doesn't seem to be
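As far as I know, in Spark 1.x there is no --num-executors analogue on Mesos; in coarse-grained mode you bound totals instead, via spark.cores.max and the per-executor memory. A sketch, with an assumed Mesos master address and a hypothetical my_job.jar:

```shell
# Illustrative: on Mesos coarse-grained mode (Spark 1.x), cap the total
# cores the job may take across the cluster rather than setting a per-node
# executor count.
spark-submit \
  --master mesos://master:5050 \
  --conf spark.mesos.coarse=true \
  --conf spark.cores.max=24 \
  --executor-memory 8g \
  my_job.jar
```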
Hi,
When I try requesting a large number of executors - e.g. 242 - it doesn't
seem to actually reach that number. E.g., under the Executors tab, I only
see executor IDs up to 234.
This is despite the fact that there is plenty more memory available, as well
as CPU cores, etc. in the system. In fact, the YARN page shows that 243
containers are running (242 executors + driver).
Anyone
Hi,
I am currently running a single-node Storm deployment with 6 workers.
But when I try to deploy multiple topologies in a way that they utilize all
the workers, there are still idle workers at the end, even though there are
topologies that do not get the number of workers they asked for. The same
applies to the number of executors. With three topologies that each
initialize around 150 executors, the actual
Hi all,
I am running a simple analysis using Spark Streaming. I set the executor
number and the default parallelism both to 300. The program consumes data
from Kafka and does a simple groupBy operation with 300 as the parameter.
The batch size is one minute. In the first two batches, there are around 50
executors. However, after the first two batches, there are always 2
executors for the groupBy operation
I've set up a YARN (Hadoop 2.4.1) cluster with Spark 1.0.1, and I've been
seeing intermittent out-of-memory errors
(java.lang.OutOfMemoryError: unable to create new native thread) when
increasing the number of executors for a simple job (wordcount).
The general format of my submission is:
spark-submit \
--master yarn-client \
--num-executors=$EXECUTORS \
--executor-cores
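That particular OutOfMemoryError usually indicates an OS-level limit on processes/threads per user rather than JVM heap exhaustion, since each executor JVM's threads count against it. Not a confirmed diagnosis of this cluster, but a quick check to run on each NodeManager host:

```shell
# Show the per-user process/thread limit. If many executors share a host,
# their combined thread counts can hit this ceiling and trigger
# "unable to create new native thread".
ulimit -u
```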
We recently upgraded to Storm 0.9.2-incubating, and found that on the UI, Num workers and Num executors are switched.
Example:
In the older version (0.9.0.1): [screenshot attachment]
In the new version (0.9.2-incubating): [screenshot attachment]
Is this a UI bug? Or did something change in Storm core functionality?
Thanks,
Jing
Hi,
I did a small example on Storm in cluster mode, which contains one spout and
one bolt. In my spout I am reading a list of files (10 files, each
containing 100 records), while in my bolt I am just writing the received
tuples to a file.
When I run this application with 2 executors for the bolt, 2 executors for
the spout, and 2 workers, it executes fine. There are no duplicate tuples; I
received 1000