I am just getting started with Spark. I am trying to use it to implement distributed processing for a deduplication application. The part I am working on now is supposed to take an RDD of Pair<String[],String[]> elements, which hold the column values of the record pairs. This process should be highly parallelisable, but currently I am just working locally. When debugging, everything seems to work as expected in the map function, but when the collect tries to execute, everything breaks :( and I have no idea why; it's not even running on a cluster.
This is the relevant part of the code; let me know if you need to see more:
JavaRDD<Pair<String[], String[]>> rddData = javaSparkContext.parallelize(data.stream()
        .map(p -> Pair.of(p.getLeft().fields(), p.getRight().fields()))
        .collect(Collectors.toList()), 2);
final int featureVecSize = featureCombinationToFeatureIndex.size();
List<double[]> distributedArchFeatures = rddData.map(recordPair -> {
    String[] valuesA = recordPair.getLeft();
    String[] valuesB = recordPair.getRight();
    double[] scores = new double[featureVecSize];
    // featureCombinationToFeatureIndex maps (column index, feature, modifier) -> slot in the score vector
    for (Map.Entry<Triple<Integer, ComparisonFeature, ComparisonModifier>, Integer> t : featureCombinationToFeatureIndex.entrySet()) {
        int index = t.getKey().getLeft();
        ComparisonFeature feature = t.getKey().getMiddle();
        ComparisonModifier modifier = t.getKey().getRight();
        double score = modifier.calculateScore(feature, valuesA[index], valuesB[index]);
        scores[t.getValue()] = score;
    }
    return scores;
}).collect();
And this is the stack trace I get; it seems to keep retrying to retrieve the data, but no dice:
[analyzerbeans-pool1-thread-27] INFO com.hi.identify7.execution.simple.SimpleMatchingExecutionContext - Scoring all matches...
[analyzerbeans-pool1-thread-27] INFO org.apache.spark.SparkContext - Starting job: collect at SimpleMatchingExecutionContext.java:177
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Got job 0 (collect at SimpleMatchingExecutionContext.java:177) with 2 output partitions
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Final stage: ResultStage 0 (collect at SimpleMatchingExecutionContext.java:177)
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Parents of final stage: List()
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Missing parents: List()
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SimpleMatchingExecutionContext.java:165), which has no missing parents
[dag-scheduler-event-loop] INFO org.apache.spark.storage.memory.MemoryStore - Block broadcast_0 stored as values in memory (estimated size 4.9 KB, free 2.5 GB)
[dag-scheduler-event-loop] INFO org.apache.spark.storage.memory.MemoryStore - Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.7 KB, free 2.5 GB)
[dispatcher-event-loop-0] INFO org.apache.spark.storage.BlockManagerInfo - Added broadcast_0_piece0 in memory on 10.209.1.88:52600 (size: 2.7 KB, free: 2.5 GB)
[dag-scheduler-event-loop] INFO org.apache.spark.SparkContext - Created broadcast 0 from broadcast at DAGScheduler.scala:1006
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SimpleMatchingExecutionContext.java:165) (first 15 tasks are for partitions Vector(0, 1))
[dag-scheduler-event-loop] INFO org.apache.spark.scheduler.TaskSchedulerImpl - Adding task set 0.0 with 2 tasks
[dispatcher-event-loop-1] WARN org.apache.spark.scheduler.TaskSetManager - Stage 0 contains a task of very large size (11664 KB). The maximum recommended task size is 100 KB.
[dispatcher-event-loop-1] INFO org.apache.spark.scheduler.TaskSetManager - Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 11944196 bytes)
[Executor task launch worker for task 0] INFO org.apache.spark.executor.Executor - Running task 0.0 in stage 0.0 (TID 0)
[Executor task launch worker for task 0] INFO org.apache.spark.storage.memory.MemoryStore - Block taskresult_0 stored as bytes in memory (estimated size 185.3 MB, free 2.3 GB)
[dispatcher-event-loop-1] INFO org.apache.spark.storage.BlockManagerInfo - Added taskresult_0 in memory on 10.209.1.88:52600 (size: 185.3 MB, free: 2.3 GB)
[Executor task launch worker for task 0] INFO org.apache.spark.executor.Executor - Finished task 0.0 in stage 0.0 (TID 0). 194298473 bytes result sent via BlockManager)
[dispatcher-event-loop-2] INFO org.apache.spark.scheduler.TaskSetManager - Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 11932301 bytes)
[Executor task launch worker for task 1] INFO org.apache.spark.executor.Executor - Running task 1.0 in stage 0.0 (TID 1)
[task-result-getter-0] INFO org.apache.spark.network.client.TransportClientFactory - Successfully created connection to /10.209.1.88:52600 after 59 ms (0 ms spent in bootstraps)
[shuffle-client-4-1] ERROR org.apache.spark.network.client.TransportClient - Failed to send RPC 7173534709356817937 to /10.209.1.88:52600: java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
at io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:810)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:816)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
at io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:302)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1089)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1136)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1078)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
[shuffle-client-4-1] ERROR org.apache.spark.network.shuffle.OneForOneBlockFetcher - Failed while starting block fetches
java.io.IOException: Failed to send RPC 7173534709356817937 to /10.209.1.88:52600: java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64)
at io.netty.channel.AbstractChannelHandlerContext.notifyOutboundHandlerException(AbstractChannelHandlerContext.java:837)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:740)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:816)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
at io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:302)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1089)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1136)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1078)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
at io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:810)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
... 17 more
[shuffle-client-4-1] INFO org.apache.spark.network.shuffle.RetryingBlockFetcher - Retrying fetch (1/3) for 1 outstanding blocks after 5000 ms
[Block Fetch Retry-6-1] INFO org.apache.spark.network.client.TransportClientFactory - Found inactive connection to /10.209.1.88:52600, creating a new one.
[Block Fetch Retry-6-1] INFO org.apache.spark.network.client.TransportClientFactory - Successfully created connection to /10.209.1.88:52600 after 1 ms (0 ms spent in bootstraps)
[shuffle-client-4-1] ERROR org.apache.spark.network.client.TransportClient - Failed to send RPC 6680513752030028512 to /10.209.1.88:52600: java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
at io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:810)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:816)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
at io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:302)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1089)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1136)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1078)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
[shuffle-client-4-1] ERROR org.apache.spark.network.shuffle.OneForOneBlockFetcher - Failed while starting block fetches
java.io.IOException: Failed to send RPC 6680513752030028512 to /10.209.1.88:52600: java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)
at io.netty.util.internal.PromiseNotificationUtil.tryFailure(PromiseNotificationUtil.java:64)
at io.netty.channel.AbstractChannelHandlerContext.notifyOutboundHandlerException(AbstractChannelHandlerContext.java:837)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:740)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:816)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
at io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:302)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:730)
at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:38)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1089)
at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1136)
at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1078)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.AbstractMethodError: org.apache.spark.network.protocol.MessageWithHeader.touch(Ljava/lang/Object;)Lio/netty/util/ReferenceCounted;
at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
at io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:810)
at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:723)
at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:738)
This is due to a Netty dependency conflict. The AbstractMethodError means Spark's MessageWithHeader was compiled against a different Netty version than the one actually on your runtime classpath; ReferenceCounted.touch(Object) only exists since Netty 4.1, so most likely a Netty 4.1.x jar is shadowing the 4.0.x version this Spark build expects. Check the dependency tree of your project and force the Netty version that your Spark release needs.
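If you want to confirm this at runtime, here is a small diagnostic sketch. It only uses the classes already named in the stack trace, plus io.netty.util.Version, which ships inside Netty itself: it prints which jars the conflicting classes are actually loaded from and which Netty artifacts are visible on the classpath.
import java.util.Map;
import io.netty.util.Version;

public class NettyConflictCheck {
    public static void main(String[] args) throws Exception {
        // Which jar provides the Spark class that throws the AbstractMethodError?
        Class<?> sparkClass = Class.forName("org.apache.spark.network.protocol.MessageWithHeader");
        System.out.println(sparkClass.getProtectionDomain().getCodeSource().getLocation());
        // Which jar provides Netty's ReferenceCounted interface?
        Class<?> nettyClass = Class.forName("io.netty.util.ReferenceCounted");
        System.out.println(nettyClass.getProtectionDomain().getCodeSource().getLocation());
        // List every Netty artifact on the classpath and its version.
        for (Map.Entry<String, Version> e : Version.identify().entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().artifactVersion());
        }
    }
}
With Maven, mvn dependency:tree -Dincludes=io.netty will show where the conflicting version comes from, so you can add an exclusion or pin the version your Spark release expects.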
Related
I tried using Spark Streaming to process Kafka messages, following this guide: https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html
My code is below:
SparkConf sparkConf = new SparkConf().setAppName("JavaDirectKafkaWordCount").setMaster("spark://sl:7077");
JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, Durations.seconds(10));
Map<String, Object> kafkaParams = new HashMap<>();
kafkaParams.put("bootstrap.servers", "10.0.1.5:9092");
kafkaParams.put("key.deserializer", StringDeserializer.class);
kafkaParams.put("value.deserializer", StringDeserializer.class);
kafkaParams.put("group.id", "group1");
kafkaParams.put("auto.offset.reset", "earliest");
kafkaParams.put("enable.auto.commit", false);
Collection<String> topics = Collections.singletonList("test");
final JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(jssc,
LocationStrategies.PreferConsistent(),
ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));
stream.print();
jssc.start(); // start the streaming computation
jssc.awaitTermination(); // block until it terminates
After submitting, it returns:
17/04/05 22:43:10 INFO SparkContext: Starting job: print at JavaDirectKafkaWordCount.java:47
17/04/05 22:43:10 INFO DAGScheduler: Got job 0 (print at JavaDirectKafkaWordCount.java:47) with 1 output partitions
17/04/05 22:43:10 INFO DAGScheduler: Final stage: ResultStage 0 (print at JavaDirectKafkaWordCount.java:47)
17/04/05 22:43:10 INFO DAGScheduler: Parents of final stage: List()
17/04/05 22:43:10 INFO DAGScheduler: Missing parents: List()
17/04/05 22:43:10 INFO DAGScheduler: Submitting ResultStage 0 (KafkaRDD[0] at createDirectStream at JavaDirectKafkaWordCount.java:44), which has no missing parents
17/04/05 22:43:10 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 2.3 KB, free 366.3 MB)
17/04/05 22:43:10 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1529.0 B, free 366.3 MB)
17/04/05 22:43:10 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.245.226.155:15258 (size: 1529.0 B, free: 366.3 MB)
17/04/05 22:43:10 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:996
17/04/05 22:43:10 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (KafkaRDD[0] at createDirectStream at JavaDirectKafkaWordCount.java:44)
17/04/05 22:43:10 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
17/04/05 22:43:10 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(null) (10.245.226.155:53448) with ID 0
17/04/05 22:43:10 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 10.245.226.155, executor 0, partition 0, PROCESS_LOCAL, 7295 bytes)
17/04/05 22:43:10 INFO BlockManagerMasterEndpoint: Registering block manager 10.245.226.155:14669 with 366.3 MB RAM, BlockManagerId(0, 10.245.226.155, 14669, None)
17/04/05 22:43:10 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(null) (10.245.226.155:53447) with ID 1
17/04/05 22:43:10 INFO BlockManagerMasterEndpoint: Registering block manager 10.245.226.155:33754 with 366.3 MB RAM, BlockManagerId(1, 10.245.226.155, 33754, None)
17/04/05 22:43:11 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 10.245.226.155, executor 0): java.lang.NullPointerException
at org.apache.spark.util.Utils$.decodeFileNameInURI(Utils.scala:409)
at org.apache.spark.util.Utils$.fetchFile(Utils.scala:434)
at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:508)
at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:500)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:500)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:257)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Can someone help with this? Thanks very much.
Can you provide the parameters passed to spark-submit?
You might have passed a jar-file name instead of an absolute path to the jar file. The class org.apache.spark.executor.Executor tries to load the "Added Jars" and "Added Files" in its updateDependencies method, but the URI path is not in the form Spark expects.
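For example, submitting with the application jar given as an absolute path (a sketch; the class and file names are placeholders, and the master URL is taken from the code above):
spark-submit \
  --class com.example.JavaDirectKafkaWordCount \
  --master spark://sl:7077 \
  /absolute/path/to/kafka-wordcount-with-dependencies.jar
The same applies to anything passed via --jars or --files: each entry must be a well-formed path or URI, which is exactly what Utils.fetchFile is failing to parse in the trace above.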
I am trying to create a JavaRDD that contains another series of RDDs inside it.
machineRDD.foreach(machine -> machine.startDetectNow())
Inside, each machine starts a query to ES and gets another RDD. I collect all of this (1200 hits) and convert it to Lists. After that, the Machine starts working with this list.
First: is it possible to do this or not? If not, in what other way can I try to do it?
Let me show what I am trying to do:
SparkConf conf = new SparkConf().setAppName("Algo").setMaster("local");
conf.set("es.index.auto.create", "true");
conf.set("es.nodes", "IP_ES");
conf.set("es.port", "9200");
sparkContext = new JavaSparkContext(conf);
MyAlgoConfig config_algo = new MyAlgoConfig(Detection.byPrevisionMerge);
Machine m1 = new Machine("AL-27", "IP1", config_algo);
Machine m2 = new Machine("AL-20", "IP2", config_algo);
Machine m3 = new Machine("AL-24", "IP3", config_algo);
Machine m4 = new Machine("AL-21", "IP4", config_algo);
ArrayList<Machine> Machines = new ArrayList<>();
Machines.add(m1);
Machines.add(m2);
Machines.add(m3);
Machines.add(m4);
JavaRDD<Machine> machineRDD = sparkContext.parallelize(Machines);
machineRDD.foreach(machine -> machine.startDetectNow());
I am trying to start my algorithm on each machine; each one must learn from data located in Elasticsearch.
public boolean startDetectNow()
    // Huge ELK query
    JavaRDD dataForLearn = Elastic.loadElasticsearch(
            Algo.sparkContext,
            "logstash-*/Collector",
            Elastic.req_AvgOfCall(
                    getIP(),
                    "hour",
                    "2016-04-16T00:00:00",
                    "2016-06-10T00:00:00"));
    JavaRDD<Hit> RDD_hits = Elastic.mapToHit(dataForLearn);
    List<Hit> hits = Elastic.RddToListHits(RDD_hits);
So I am trying to fetch all the data of a query inside every "Machine".
My question is: is it possible to do this with Spark? Or maybe in another way?
When I run it in Spark, it seems to lock up when the code reaches the second RDD.
And the error message is:
16/08/17 00:17:13 INFO SparkContext: Starting job: collect at Elastic.java:94
16/08/17 00:17:13 INFO DAGScheduler: Got job 1 (collect at Elastic.java:94) with 1 output partitions
16/08/17 00:17:13 INFO DAGScheduler: Final stage: ResultStage 1 (collect at Elastic.java:94)
16/08/17 00:17:13 INFO DAGScheduler: Parents of final stage: List()
16/08/17 00:17:13 INFO DAGScheduler: Missing parents: List()
16/08/17 00:17:13 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[4] at map at Elastic.java:106), which has no missing parents
16/08/17 00:17:13 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.3 KB, free 7.7 KB)
16/08/17 00:17:13 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.5 KB, free 10.2 KB)
16/08/17 00:17:13 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:46356 (size: 2.5 KB, free: 511.1 MB)
16/08/17 00:17:13 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
16/08/17 00:17:13 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[4] at map at Elastic.java:106)
16/08/17 00:17:13 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
^C16/08/17 00:17:22 INFO SparkContext: Invoking stop() from shutdown hook
16/08/17 00:17:22 INFO SparkUI: Stopped Spark web UI at http://192.168.10.23:4040
16/08/17 00:17:22 INFO DAGScheduler: ResultStage 0 (foreach at GuardConnect.java:60) failed in 10,292 s
16/08/17 00:17:22 INFO DAGScheduler: Job 0 failed: foreach at GuardConnect.java:60, took 10,470974 s
Exception in thread "main" org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:806)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:804)
at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:804)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onStop(DAGScheduler.scala:1658)
at org.apache.spark.util.EventLoop.stop(EventLoop.scala:84)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1581)
at org.apache.spark.SparkContext$$anonfun$stop$9.apply$mcV$sp(SparkContext.scala:1740)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1229)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1739)
at org.apache.spark.SparkContext$$anonfun$3.apply$mcV$sp(SparkContext.scala:596)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:267)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:218)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:912)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:910)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.RDD.foreach(RDD.scala:910)
at org.apache.spark.api.java.JavaRDDLike$class.foreach(JavaRDDLike.scala:332)
at org.apache.spark.api.java.AbstractJavaRDDLike.foreach(JavaRDDLike.scala:46)
at com.seigneurin.spark.GuardConnect.main(GuardConnect.java:60)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/08/17 00:17:22 ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerStageCompleted(org.apache.spark.scheduler.StageInfo@4a7e0846)
16/08/17 00:17:22 INFO DAGScheduler: ResultStage 1 (collect at Elastic.java:94) failed in 9,301 s
16/08/17 00:17:22 ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerStageCompleted(org.apache.spark.scheduler.StageInfo@6c6b4cb8)
16/08/17 00:17:22 ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerJobEnd(0,1471385842813,JobFailed(org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down))
16/08/17 00:17:22 INFO DAGScheduler: Job 1 failed: collect at Elastic.java:94, took 9,317650 s
16/08/17 00:17:22 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.spark.SparkException: Job 1 cancelled because SparkContext was shut down
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:806)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:804)
at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:804)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onStop(DAGScheduler.scala:1658)
at org.apache.spark.util.EventLoop.stop(EventLoop.scala:84)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1581)
at org.apache.spark.SparkContext$$anonfun$stop$9.apply$mcV$sp(SparkContext.scala:1740)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1229)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1739)
at org.apache.spark.SparkContext$$anonfun$3.apply$mcV$sp(SparkContext.scala:596)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:267)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:218)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.RDD.collect(RDD.scala:926)
at org.apache.spark.api.java.JavaRDDLike$class.collect(JavaRDDLike.scala:339)
at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:46)
at com.seigneurin.spark.Elastic.RddToListHits(Elastic.java:94)
at com.seigneurin.spark.OXO.prepareDataAndLearn(OXO.java:126)
at com.seigneurin.spark.OXO.startDetectNow(OXO.java:148)
at com.seigneurin.spark.GuardConnect.lambda$main$1282d8df$1(GuardConnect.java:60)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreach$1.apply(JavaRDDLike.scala:332)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreach$1.apply(JavaRDDLike.scala:332)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$32.apply(RDD.scala:912)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/08/17 00:17:22 ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerJobEnd(1,1471385842814,JobFailed(org.apache.spark.SparkException: Job 1 cancelled because SparkContext was shut down))
16/08/17 00:17:22 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/08/17 00:17:22 INFO MemoryStore: MemoryStore cleared
16/08/17 00:17:22 INFO BlockManager: BlockManager stopped
16/08/17 00:17:22 INFO BlockManagerMaster: BlockManagerMaster stopped
16/08/17 00:17:22 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/08/17 00:17:22 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/08/17 00:17:22 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/08/17 00:17:22 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, partition 0,ANY, 6751 bytes)
16/08/17 00:17:22 ERROR Inbox: Ignoring error
java.util.concurrent.RejectedExecutionException: Task org.apache.spark.executor.Executor$TaskRunner@65fd4104 rejected from java.util.concurrent.ThreadPoolExecutor@4387a1bf[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
at org.apache.spark.executor.Executor.launchTask(Executor.scala:128)
at org.apache.spark.scheduler.local.LocalEndpoint$$anonfun$reviveOffers$1.apply(LocalBackend.scala:86)
at org.apache.spark.scheduler.local.LocalEndpoint$$anonfun$reviveOffers$1.apply(LocalBackend.scala:84)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.scheduler.local.LocalEndpoint.reviveOffers(LocalBackend.scala:84)
at org.apache.spark.scheduler.local.LocalEndpoint$$anonfun$receive$1.applyOrElse(LocalBackend.scala:69)
at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:116)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/08/17 00:17:22 INFO SparkContext: Successfully stopped SparkContext
16/08/17 00:17:22 INFO ShutdownHookManager: Shutdown hook called
16/08/17 00:17:22 INFO ShutdownHookManager: Deleting directory /tmp/spark-8bf65e78-a916-4cc0-b4d1-1b0ec9a07157
16/08/17 00:17:22 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
16/08/17 00:17:22 INFO ShutdownHookManager: Deleting directory /tmp/spark-8bf65e78-a916-4cc0-b4d1-1b0ec9a07157/httpd-6d3aeb80-808c-4749-8f8b-ac9341f990a7
Thanks if you can give me some advice.
You cannot create an RDD inside an RDD, whatever the type of RDD may be.
This is the first rule, because an RDD is a driver-side abstraction pointing to your data: the functions you pass to foreach run on the executors, where no SparkContext is available to build new RDDs.
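A minimal sketch of the usual restructuring, reusing the names from the code above and assuming the machine list stays small: keep the Machine objects on the driver and loop over them there, so each startDetectNow() call runs on the driver and can legitimately create its Elasticsearch RDDs through the shared context.
import java.util.Arrays;
import java.util.List;

// Instead of machineRDD.foreach(machine -> machine.startDetectNow()):
List<Machine> machines = Arrays.asList(m1, m2, m3, m4);
for (Machine machine : machines) {
    // Runs on the driver, where Algo.sparkContext exists, so the
    // RDDs inside startDetectNow() can be created normally.
    machine.startDetectNow();
}
You lose the parallelism across the four machines, but the real parallelism here is inside each Elasticsearch query anyway.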
I have a Spark job that runs on YARN. It works with about 150 GB of data, does multiple shuffle operations, and finally stores the data into HBase. It keeps failing at saveAsHadoopDataset: multiple executors fail at this stage after reporting high GC activity. However, none of the executor logs, driver logs, or node manager logs indicate any OutOfMemory errors, GC overhead exceeded errors, or memory-limits-exceeded errors. I don't see any other reason for the executor failures in the Spark UI either.
val hConf = HBaseConfiguration.create
hConf.setInt("hbase.client.scanner.caching", 10000)
hConf.setBoolean("hbase.cluster.distributed", true)
new PairRDDFunctions(hbaseRdd).saveAsHadoopDataset(jobConfig)
Driver Logs:
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Job aborted due to stage failure: Task 388 in stage 22.0 failed 4 times, most recent failure: Lost task 388.3 in stage 22.0 (TID 32141, maprnode5): ExecutorLostFailure (executor 5 lost)
Driver stacktrace:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 388 in stage 22.0 failed 4 times, most recent failure: Lost task 388.3 in stage 22.0 (TID 32141, maprnode5): ExecutorLostFailure (executor 5 lost)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1914)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1124)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1065)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1065)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1065)
Executor Logs:
16/02/24 11:09:47 INFO executor.Executor: Finished task 224.0 in stage 8.0 (TID 15318). 2099 bytes result sent to driver
16/02/24 11:09:47 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 15333
16/02/24 11:09:47 INFO executor.Executor: Running task 239.0 in stage 8.0 (TID 15333)
16/02/24 11:09:47 INFO storage.ShuffleBlockFetcherIterator: Getting 125 non-empty blocks out of 3007 blocks
16/02/24 11:09:47 INFO storage.ShuffleBlockFetcherIterator: Started 14 remote fetches in 10 ms
16/02/24 11:11:47 ERROR server.TransportChannelHandler: Connection to maprnode5 has been quiet for 120000 ms while there are outstanding requests. Assuming connection is dead; please adjust spark.network.timeout if this is wrong.
16/02/24 11:11:47 ERROR client.TransportResponseHandler: Still have 1 requests outstanding when connection from maprnode5 is closed
16/02/24 11:11:47 ERROR shuffle.OneForOneBlockFetcher: Failed while starting block fetches
java.io.IOException: Connection from maprnode5 closed
at org.apache.spark.network.client.TransportResponseHandler.channelUnregistered(TransportResponseHandler.java:104)
at org.apache.spark.network.server.TransportChannelHandler.channelUnregistered(TransportChannelHandler.java:91)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelUnregistered(AbstractChannelHandlerContext.java:158)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelUnregistered(AbstractChannelHandlerContext.java:144)
at io.netty.channel.ChannelInboundHandlerAdapter.channelUnregistered(ChannelInboundHandlerAdapter.java:53)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelUnregistered(AbstractChannelHandlerContext.java:158)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelUnregistered(AbstractChannelHandlerContext.java:144)
at io.netty.channel.ChannelInboundHandlerAdapter.channelUnregistered(ChannelInboundHandlerAdapter.java:53)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelUnregistered(AbstractChannelHandlerContext.java:158)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelUnregistered(AbstractChannelHandlerContext.java:144)
at io.netty.channel.ChannelInboundHandlerAdapter.channelUnregistered(ChannelInboundHandlerAdapter.java:53)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelUnregistered(AbstractChannelHandlerContext.java:158)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelUnregistered(AbstractChannelHandlerContext.java:144)
at io.netty.channel.DefaultChannelPipeline.fireChannelUnregistered(DefaultChannelPipeline.java:739)
at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:659)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:744)
16/02/24 11:11:47 INFO shuffle.RetryingBlockFetcher: Retrying fetch (1/3) for 6 outstanding blocks after 5000 ms
16/02/24 11:11:52 INFO client.TransportClientFactory: Found inactive connection to maprnode5, creating a new one.
16/02/24 11:12:16 WARN server.TransportChannelHandler: Exception in connection from maprnode5
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313)
at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:744)
16/02/24 11:12:16 ERROR client.TransportResponseHandler: Still have 1 requests outstanding when connection from maprnode5 is closed
16/02/24 11:12:16 ERROR shuffle.OneForOneBlockFetcher: Failed while starting block fetches
So it turns out that although the Spark UI says it failed at saveAsHadoopDataset, it was in fact failing at the first step of the stage in which saveAsHadoopDataset was merely the last step. To elaborate: Spark draws stage boundaries around sequences of narrow transformations, or a wide transformation followed by narrow ones. In my particular case the sequence was groupByKey (wide dependency) -> mapValues (narrow) -> map (narrow), where the last map is what actually performs the saveAsHadoopDataset. The executors were in fact reporting high GC activity and memory usage at the shuffle step, groupByKey. I changed my application logic to use reduceByKey instead of groupByKey. Now it is very slow, but at least it is not failing.
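For illustration, the shape of that change in Java (a sketch with made-up key and value types; the point is that reduceByKey combines values map-side before the shuffle, while groupByKey ships and buffers every value of a key on a single executor):
import org.apache.spark.api.java.JavaPairRDD;

// pairs is a hypothetical JavaPairRDD<String, Long>.
// Before: groupByKey materializes all values of each key in one executor's memory.
// JavaPairRDD<String, Iterable<Long>> grouped = pairs.groupByKey();

// After: reduceByKey pre-aggregates locally, so far less data is shuffled and buffered.
JavaPairRDD<String, Long> sums = pairs.reduceByKey((a, b) -> a + b);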
I'm trying to load a big graph (60 GB) using GraphX in Spark 1.4.1 in local mode with 16 threads. The driver memory is set to 500 GB in spark-defaults.conf. I work on a machine that has 590341 MB free (shown by the free -m command), which is about 576 GB.
Some details about the graph I'm trying to load: it is the Friendster graph downloaded from snap.stanford.edu. The actual size is 30 GB, because Friendster is a directed graph, but I doubled the edges to create an undirected version (that's why the size is doubled: 60 = 2 * 30). I use GraphLoader.edgeListFile from Scala to load the graph, as shown below:
def main(args: Array[String]): Unit = {
  val logFile = "/home/panos/spark-1.2.0/README.md"
  // Create spark configuration and spark context
  val conf = new SparkConf().setAppName("My App").setMaster("local[16]")
  val sc = new SparkContext(conf)
  val currentDir = System.getProperty("user.dir") // get the current directory
  val edgeFile = "file://" + currentDir + "/friendsterUndirected.txt"
  // Load the edges as a graph
  val graph = GraphLoader.edgeListFile(sc, edgeFile, false, 1,
    StorageLevel.MEMORY_AND_DISK, StorageLevel.MEMORY_AND_DISK)
}
The problem is that the graph cannot be loaded: I get a java.lang.NegativeArraySizeException, as shown below. I tried with smaller graphs (~4 GB) with success. Any idea?
16/01/10 13:04:09 WARN TaskSetManager: Stage 0 contains a task of very large size (440 KB). The maximum recommended task size is 100 KB.
16/01/10 13:32:56 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.NegativeArraySizeException
at java.lang.reflect.Array.newArray(Native Method)
at java.lang.reflect.Array.newInstance(Array.java:75)
at scala.reflect.ClassTag$class.newArray(ClassTag.scala:62)
at scala.reflect.ClassTag$$anon$1.newArray(ClassTag.scala:144)
at org.apache.spark.util.collection.PrimitiveVector.copyArrayWithLength(PrimitiveVector.scala:87)
at org.apache.spark.util.collection.PrimitiveVector.resize(PrimitiveVector.scala:74)
at org.apache.spark.util.collection.PrimitiveVector.$plus$eq(PrimitiveVector.scala:41)
at org.apache.spark.graphx.impl.EdgePartitionBuilder$mcI$sp.add$mcI$sp(EdgePartitionBuilder.scala:34)
at org.apache.spark.graphx.GraphLoader$$anonfun$1$$anonfun$apply$1.apply(GraphLoader.scala:87)
at org.apache.spark.graphx.GraphLoader$$anonfun$1$$anonfun$apply$1.apply(GraphLoader.scala:76)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.graphx.GraphLoader$$anonfun$1.apply(GraphLoader.scala:76)
at org.apache.spark.graphx.GraphLoader$$anonfun$1.apply(GraphLoader.scala:74)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:703)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:703)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:242)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/01/10 13:32:56 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.NegativeArraySizeException
at java.lang.reflect.Array.newArray(Native Method)
at java.lang.reflect.Array.newInstance(Array.java:75)
at scala.reflect.ClassTag$class.newArray(ClassTag.scala:62)
at scala.reflect.ClassTag$$anon$1.newArray(ClassTag.scala:144)
at org.apache.spark.util.collection.PrimitiveVector.copyArrayWithLength(PrimitiveVector.scala:87)
at org.apache.spark.util.collection.PrimitiveVector.resize(PrimitiveVector.scala:74)
at org.apache.spark.util.collection.PrimitiveVector.$plus$eq(PrimitiveVector.scala:41)
at org.apache.spark.graphx.impl.EdgePartitionBuilder$mcI$sp.add$mcI$sp(EdgePartitionBuilder.scala:34)
at org.apache.spark.graphx.GraphLoader$$anonfun$1$$anonfun$apply$1.apply(GraphLoader.scala:87)
at org.apache.spark.graphx.GraphLoader$$anonfun$1$$anonfun$apply$1.apply(GraphLoader.scala:76)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.graphx.GraphLoader$$anonfun$1.apply(GraphLoader.scala:76)
at org.apache.spark.graphx.GraphLoader$$anonfun$1.apply(GraphLoader.scala:74)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:703)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:703)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:242)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/01/10 13:32:56 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.NegativeArraySizeException
at java.lang.reflect.Array.newArray(Native Method)
at java.lang.reflect.Array.newInstance(Array.java:75)
at scala.reflect.ClassTag$class.newArray(ClassTag.scala:62)
at scala.reflect.ClassTag$$anon$1.newArray(ClassTag.scala:144)
at org.apache.spark.util.collection.PrimitiveVector.copyArrayWithLength(PrimitiveVector.scala:87)
at org.apache.spark.util.collection.PrimitiveVector.resize(PrimitiveVector.scala:74)
at org.apache.spark.util.collection.PrimitiveVector.$plus$eq(PrimitiveVector.scala:41)
at org.apache.spark.graphx.impl.EdgePartitionBuilder$mcI$sp.add$mcI$sp(EdgePartitionBuilder.scala:34)
at org.apache.spark.graphx.GraphLoader$$anonfun$1$$anonfun$apply$1.apply(GraphLoader.scala:87)
at org.apache.spark.graphx.GraphLoader$$anonfun$1$$anonfun$apply$1.apply(GraphLoader.scala:76)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.graphx.GraphLoader$$anonfun$1.apply(GraphLoader.scala:76)
at org.apache.spark.graphx.GraphLoader$$anonfun$1.apply(GraphLoader.scala:74)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:703)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:703)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:242)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1273)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1264)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1263)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1263)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1457)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Hello, I am a noob with Spark. These days I am using spark-submit to run SQL operations on files in HDFS. Here is my question; please correct me if I'm wrong. In my view, as soon as spark-submit finishes, the RDDs cached in memory are cleared, but I want to reuse those cached RDDs.
So I made the program executed by spark-submit stay alive and act as a server: it listens on a port, continually receives queries over a socket, runs the SQL queries, and returns the results to the clients. But the program can only handle this once.
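The server loop is roughly like this (simplified; the port, the one-query-per-line protocol, and the class name are placeholders):
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SQLContext;

public class SqlQueryServer {
    // sqlContext is created once from the long-lived SparkContext,
    // so cached RDDs should stay available across requests.
    public static void serve(SQLContext sqlContext) throws Exception {
        try (ServerSocket server = new ServerSocket(9999)) {
            while (true) {
                try (Socket client = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(client.getInputStream()));
                     PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                    String sql;
                    while ((sql = in.readLine()) != null) {
                        for (Row row : sqlContext.sql(sql).collectAsList()) {
                            out.println(row);
                        }
                    }
                } catch (Exception e) {
                    e.printStackTrace(); // keep serving even if one query fails
                }
            }
        }
    }
}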
The client sends the server 10 queries at a time. Only the first 10 SQL jobs that the server receives from one client succeed; all subsequent jobs are aborted, with the exception below:
Exception in thread "Thread-47" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 10.0 failed 4 times, most recent failure: Lost task 1.3 in stage 10.0 (TID 51, 10.10.108.63): java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_29_piece0 of broadcast_29
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1178)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:165)
at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:88)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to get broadcast_29_piece0 of broadcast_29
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:138)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:138)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:137)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:120)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:175)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1175)
... 11 more
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1280)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1268)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1267)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1267)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1493)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1455)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1444)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1813)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1826)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1839)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1910)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:905)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.RDD.collect(RDD.scala:904)
at org.apache.spark.api.java.JavaRDDLike$class.collect(JavaRDDLike.scala:338)
at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:47)
at cn.ac.ict.chinasoft.netty.SQLServerHandler.runSQLOnSpark(SQLServerHandler.java:40)
at cn.ac.ict.chinasoft.netty.SQLServerHandler.access$000(SQLServerHandler.java:17)
at cn.ac.ict.chinasoft.netty.SQLServerHandler$ExecuteSql.run(SQLServerHandler.java:63)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_29_piece0 of broadcast_29
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1178)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:165)
at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:88)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
... 1 more
Caused by: org.apache.spark.SparkException: Failed to get broadcast_29_piece0 of broadcast_29
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:138)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:138)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:137)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:120)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:120)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:175)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1175)
... 11 more
15/09/11 08:21:48 ERROR scheduler.TaskSetManager: Task 1 in stage 15.0 failed 4 times; aborting job
15/09/11 08:21:48 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 15.0, whose tasks have all completed, from pool
15/09/11 08:21:48 INFO scheduler.TaskSetManager: Lost task 0.1 in stage 16.0 (TID 56) on executor 10.10.108.63: java.io.IOException (org.apache.spark.SparkException: Failed to get broadcast_35_piece0 of broadcast_35) [duplicate 6]
15/09/11 08:21:48 INFO scheduler.TaskSetManager: Starting task 0.2 in stage 16.0 (TID 63, 10.10.108.63, ANY, 2213 bytes)
15/09/11 08:21:48 INFO scheduler.TaskSchedulerImpl: Cancelling stage 12
15/09/11 08:21:48 INFO scheduler.DAGScheduler: ResultStage 12 (collect at SQLServerHandler.java:40) failed in 0.210 s
15/09/11 08:21:48 INFO scheduler.DAGScheduler: Job 12 failed: collect at SQLServerHandler.java:40, took 0.246739 s
15/09/11 08:21:48 INFO scheduler.TaskSchedulerImpl: Cancelling stage 13
15/09/11 08:21:48 INFO scheduler.DAGScheduler: ResultStage 13 (collect at SQLServerHandler.java:40) failed in 0.206 s
15/09/11 08:21:48 INFO scheduler.DAGScheduler: Job 13 failed: collect at SQLServerHandler.java:40, took 0.231847 s
Please help me or give some suggestions on this approach. Thanks!