I was working through the movie recommendations walkthrough using crcsmnky's repository: https://github.com/crcsmnky/mongodb-spark-demo
I compiled mongo-hadoop and mongo-java-driver and placed the jars (mongo-hadoop-core-1.3.2-SNAPSHOT and mongo-java-driver-2.13.3.jar) in the $HADOOP_HOME/lib folder.
After that, I built the project and ran it following the instructions in the README file.
I get the error:
Exception in thread "main" java.lang.NoClassDefFoundError: com/mongodb/hadoop/BSONFileInputFormat
at com.mongodb.spark.demo.Recommender.main(Recommender.java:59)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.mongodb.hadoop.BSONFileInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
What could have possibly gone wrong? I followed all instructions correctly.
I had the exact same problem and it took me forever to solve. Try this:
Locate your mongo-hadoop-core-1.4.1-SNAPSHOT.jar and mongo-java-driver-2.12.3.jar.
Pass them with --jars in the spark-submit command before your --master and before the application jar location. This is the crucial step. If you put --jars after those two, you will keep getting the BSONFileInputFormat exception. So effectively your spark-submit command would be:
./bin/spark-submit --class "com.mongodb.spark.demo.Recommender" --jars /home/killshot/Downloads/mongo-hadoop/core/build/libs/mongo-hadoop-core-1.4.1-SNAPSHOT.jar,/home/killshot/Downloads/mongo-hadoop/work/mongodb-spark-demo/target/lib/mongo-java-driver-2.12.3.jar --master local[4]
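One likely explanation for the ordering requirement: spark-submit stops reading its own options at the application jar and hands everything after it to the main class as program arguments, so a trailing --jars never reaches Spark. The sketch below is a hypothetical diagnostic (the class ClasspathProbe and its methods are illustrative names, not part of mongodb-spark-demo) that could be run with the same spark-submit command to report whether the class is visible and whether any options leaked into the program arguments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of mongodb-spark-demo.
class ClasspathProbe {

    // Returns true if the named class can be located by this class's loader.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the program arguments that look like launcher options (start with "--").
    static List<String> strayOptions(String[] args) {
        List<String> stray = new ArrayList<>();
        for (String arg : args) {
            if (arg.startsWith("--")) {
                stray.add(arg);
            }
        }
        return stray;
    }

    public static void main(String[] args) {
        String needed = "com.mongodb.hadoop.BSONFileInputFormat";
        System.out.println(needed + " loadable: " + isLoadable(needed));
        System.out.println("options received as program arguments: " + strayOptions(args));
    }
}
```

If the second line of output lists --jars, the option was placed after the application jar and was never seen by spark-submit.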
Related
I have a java spark program that uses google secret manager client libraries (https://cloud.google.com/secret-manager/docs/reference/libraries#client-libraries-install-java).
While doing spark submit I got following errors. Not sure what is going on here. Sounds like some dependency issue but could not find a solution yet. Any help is appreciated. "MyJar.jar" is an uber-jar that I created with maven.
spark-submit --class com.myProgram.MyMain --master local[2] --jars libs/spark-bigquery-latest_2.12.jar target/MyJar.jar
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;CLjava/lang/Object;)V
at io.grpc.Metadata$Key.validateName(Metadata.java:754)
at io.grpc.Metadata$Key.<init>(Metadata.java:762)
at io.grpc.Metadata$Key.<init>(Metadata.java:671)
at io.grpc.Metadata$AsciiKey.<init>(Metadata.java:971)
at io.grpc.Metadata$AsciiKey.<init>(Metadata.java:966)
at io.grpc.Metadata$Key.of(Metadata.java:708)
at io.grpc.Metadata$Key.of(Metadata.java:704)
at com.google.api.gax.grpc.GrpcHeaderInterceptor.<init>(GrpcHeaderInterceptor.java:60)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.createSingleChannel(InstantiatingGrpcChannelProvider.java:321)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.access$1900(InstantiatingGrpcChannelProvider.java:82)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider$1.createSingleChannel(InstantiatingGrpcChannelProvider.java:240)
at com.google.api.gax.grpc.ChannelPool.create(ChannelPool.java:72)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.createChannel(InstantiatingGrpcChannelProvider.java:250)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.getTransportChannel(InstantiatingGrpcChannelProvider.java:228)
at com.google.api.gax.rpc.ClientContext.create(ClientContext.java:205)
at com.google.cloud.secretmanager.v1.stub.GrpcSecretManagerServiceStub.create(GrpcSecretManagerServiceStub.java:248)
at com.google.cloud.secretmanager.v1.stub.SecretManagerServiceStubSettings.createStub(SecretManagerServiceStubSettings.java:342)
at com.google.cloud.secretmanager.v1.SecretManagerServiceClient.<init>(SecretManagerServiceClient.java:152)
at com.google.cloud.secretmanager.v1.SecretManagerServiceClient.create(SecretManagerServiceClient.java:133)
at com.google.cloud.secretmanager.v1.SecretManagerServiceClient.create(SecretManagerServiceClient.java:124)
at com.myProgram.shared.AccessSecretVersion.getPrivateKey(AccessSecretVersion.java:12)
at com.myProgram.MyMain.main(MyMain.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
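A NoSuchMethodError like this one generally means the class was found, but in a version that lacks the requested method; the descriptor (ZLjava/lang/String;CLjava/lang/Object;)V denotes checkArgument(boolean, String, char, Object). A common cause, though not confirmed for this setup, is an older Guava on the runtime classpath taking precedence over the one packed into the uber-jar. A minimal sketch for narrowing it down, using a hypothetical helper class DependencyCheck (not part of any library mentioned here) that reports where a class was loaded from and whether a given method signature exists in the loaded version:

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; the class and method names are illustrative.
class DependencyCheck {

    // Returns the jar or directory a class was loaded from, or a placeholder string.
    static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, DependencyCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "<bootstrap or unknown>";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "<not found>";
        }
    }

    // Returns true if the loaded class has a public method with exactly these parameter types.
    static boolean hasMethod(String className, String method, Class<?>... params) {
        try {
            Class<?> c = Class.forName(className, false, DependencyCheck.class.getClassLoader());
            c.getMethod(method, params);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String cls = "com.google.common.base.Preconditions";
        System.out.println(cls + " loaded from: " + locationOf(cls));
        // (boolean, String, char, Object) is the signature named in the NoSuchMethodError.
        System.out.println("has checkArgument(boolean, String, char, Object): "
                + hasMethod(cls, "checkArgument", boolean.class, String.class, char.class, Object.class));
    }
}
```

Running this through the same spark-submit command shows which copy of the class wins. If the location is not the uber-jar, relocating the conflicting packages during the Maven build (for example with the shade plugin) is one remedy worth trying.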
I am trying to run the sample JavaStatefulNetworkWordCount program from the Apache Spark examples, but when I run it with spark-submit I get the following exception:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/streaming/StateSpec
at JavaStatefulNetworkWordCount.main(JavaStatefulNetworkWordCount.java:109)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.streaming.StateSpec
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
I have imported the StateSpec classes and the code is the same as the one provided over here: https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/streaming/JavaStatefulNetworkWordCount.java
I would appreciate any help in understanding why this problem arises and how I can fix it.
You should probably add the actual submit command as well.
I strongly suspect that you are either missing the --class parameter entirely when submitting the Spark jar, or you did not use the correct fully qualified class name in it.
I tried using the following command:
spark-submit --class JavaStatefulNetworkWordCount --master local[2] target/SparkStreaming.jar localhost 2222
I downloaded the Confluent package, which includes the Camus jars, and followed the instructions online.
Hadoop is properly set up (meaning I can use hadoop fs -ls and other hadoop jar commands). However, when I tried to run
hadoop jar confluent-camus-1.0.jar com.linkedin.camus.etl.kafka.CamusJob
I got a ClassNotFoundException for the main class:
Exception in thread "main" java.lang.ClassNotFoundException: com.linkedin.camus.etl.kafka.CamusJob
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:344)
at org.apache.hadoop.util.RunJar.main(RunJar.java:205)
The path to "confluent-camus-1.0.jar" is correct (it is right under the current folder). I didn't start the Kafka service; I just wanted to try running it.
Anyone got similar problems?
Thanks.
You should try to inspect your jar file:
jar tvf confluent-camus-1.0.jar | grep com.linkedin.camus.etl.kafka.CamusJob
If you do not find this class, look for it in the other jars generated by Camus.
Then add the jar that contains it with
hadoop jar confluent-camus-1.0.jar com.linkedin.camus.etl.kafka.CamusJob -libjars {JAR_NAME}
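When there are many jars to search, the same check can be scripted. A minimal sketch using only the JDK's java.util.jar API; the class JarClassFinder and its methods are illustrative names, not part of Camus or Hadoop.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper: the programmatic equivalent of "jar tvf x.jar | grep SomeClass".
class JarClassFinder {

    // Converts "a.b.C" into the entry name used inside a jar: "a/b/C.class".
    static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the jar has an entry for the given fully qualified class name.
    static boolean containsClass(File jar, String className) throws IOException {
        try (JarFile jf = new JarFile(jar)) {
            return jf.getEntry(entryName(className)) != null;
        }
    }

    // Returns every *.jar directly inside dir that contains the class.
    static List<File> jarsContaining(File dir, String className) throws IOException {
        List<File> hits = new ArrayList<>();
        File[] jars = dir.listFiles((d, name) -> name.endsWith(".jar"));
        if (jars == null) {
            return hits; // dir is not a directory or cannot be read
        }
        for (File jar : jars) {
            if (containsClass(jar, className)) {
                hits.add(jar);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JarClassFinder <directory> <fully.qualified.ClassName>
        for (File hit : jarsContaining(new File(args[0]), args[1])) {
            System.out.println(hit);
        }
    }
}
```

For this question it could be invoked as java JarClassFinder . com.linkedin.camus.etl.kafka.CamusJob from the folder holding the Camus jars.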
I am trying to run a hadoop job on a server. The version is 0.20.2.
I have a large number of jars, and I am running:
hadoop jar GenData.jar -libjars /path/jar1,path/jar2,...
I am getting the error below even though the corresponding classes are inside the jars:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/avro/mapreduce/AvroKeyInputFormat
at GenerateTrainningData.main(GenerateTrainningData.java:256)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Caused by: java.lang.ClassNotFoundException: org.apache.avro.mapreduce.AvroKeyInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
It looks like this exception comes from the Hadoop client side: the MapReduce driver code runs in the client JVM. In Hadoop, -libjars is a generic option used for adding dependent jars to the mappers and reducers. To add jars to the client classpath, set the following environment variable before executing the hadoop command:
export HADOOP_CLASSPATH=<PATH_to_jar>/Jar1:<PATH_to_jar>/Jar2;
(A colon ":" separates multiple jars. In your case, add the jar that contains the class org.apache.avro.mapreduce.AvroKeyInputFormat.)
New edits
First of all, you need to find the jar containing the class org.apache.avro.mapreduce.AvroKeyInputFormat. The class is in avro-mapred*.jar. Download a compatible version of avro-mapred-version.jar and add it to your classpath using the command above.
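To confirm that the jar really reached the client JVM, the driver can print the matching entries of its own classpath before submitting the job. A small sketch, assuming (as is typical, but worth verifying on your installation) that the hadoop launcher script puts the HADOOP_CLASSPATH entries on the JVM's java.class.path; ClientClasspath is a hypothetical helper name.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting the classpath of the JVM that runs the driver.
class ClientClasspath {

    // Returns the classpath entries whose file name contains the given fragment.
    static List<String> entriesMatching(String classpath, String nameFragment) {
        List<String> hits = new ArrayList<>();
        // Entries are separated by ":" on Unix and ";" on Windows.
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty() && new File(entry).getName().contains(nameFragment)) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path");
        // An empty list suggests the avro-mapred jar is not visible to the client JVM.
        System.out.println(entriesMatching(classpath, "avro-mapred"));
    }
}
```

The same two lines from main could be placed at the start of the job's own main method, so the check runs under exactly the classpath that the hadoop command builds.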
You are missing avro-mapred dependency.
I'm very new to Hadoop. I followed the basic tutorial on how to create a word count program in Hadoop, and everything was fine. I then tried to create my own MapReduce job and put it in a separate jar file. When I try to run the program, it gives me this error:
shean@ubuntu-PC:~/hadoop/bin$ hadoop jar ../weather.jar weather.Weather /user/hadoop/weather_log_sample.txt /user/hadoop/output
Warning: $HADOOP_HOME is deprecated.
Exception in thread "main" java.lang.NoClassDefFoundError: org/myorg/WordCount
at weather.Weather.main(Weather.java:45)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.ClassNotFoundException: org.myorg.WordCount
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
... 6 more
But the problem is that it's looking for the WordCount class...
If I am not wrong, you are missing the jar wordcount.jar. Please add it to the build path.
My advice: remove the "package" declaration first. That makes NoClassDefFoundError less likely to occur. Compile with javac: javac -classpath "$HADOOP_HOME/hadoop-core-1.2.0.jar:$HADOOP_HOME/lib/commons-cli-1.2.jar" -d ./weather
litianmin#gmail.com