Hadoop external jars - java

I am trying to run a Hadoop job on a server. The version is 0.20.2.
I have a large number of jars, and I am running:
hadoop jar GenData.jar -libjars /path/jar1,path/jar2,...
I am getting the error below even though the corresponding classes are inside the jars:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/avro/mapreduce/AvroKeyInputFormat
at GenerateTrainningData.main(GenerateTrainningData.java:256)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Caused by: java.lang.ClassNotFoundException: org.apache.avro.mapreduce.AvroKeyInputFormat
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)

Looks like you are getting this exception on the Hadoop client side: MapReduce driver code executes in the client JVM. In Hadoop, -libjars is a generic option used for adding dependent jars to the mapper/reducer classpath. To add jars to the client classpath as well, set the following environment variable before executing the hadoop command:
export HADOOP_CLASSPATH=<PATH_to_jar>/Jar1:<PATH_to_jar>/Jar2
(A colon ":" separates multiple jars. In your case, add the jar that contains the class org.apache.avro.mapreduce.AvroKeyInputFormat.)
Edit:
First of all you need to find the jar containing the class org.apache.avro.mapreduce.AvroKeyInputFormat. The class lives in avro-mapred*.jar (get a version of avro-mapred compatible with your Avro and Hadoop versions), and you include that jar in your classpath using the command above.
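For example, a minimal sketch (the path and avro-mapred version are placeholders; use whichever build matches your dependencies):
# confirm the class is really inside the jar
jar tf <PATH_to_jar>/avro-mapred-<version>.jar | grep AvroKeyInputFormat
# put it on the client classpath, then resubmit the job
export HADOOP_CLASSPATH=<PATH_to_jar>/avro-mapred-<version>.jar:$HADOOP_CLASSPATH
hadoop jar GenData.jar -libjars /path/jar1,/path/jar2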

You are missing the avro-mapred dependency.

Related

Hadoop always uses my installation path to expand the classpath on the remote node and then fails to load the jar

I'm a Hadoop & HBase newbie. I've already run the WordCount example successfully. Now I'm modifying the Mapper to use an HBase row as input data, so I need to import some HBase classes.
After I rebuild WordCount.jar and run:
$ hadoop jar ./out/artifacts/WordCount_jar/WordCount.jar WordCount
I get an error like:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
at WordCount.main(WordCount.java:83)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
So I copied all the HBase libraries to one folder and set HADOOP_CLASSPATH:
$ export HADOOP_CLASSPATH=/home/kayuuzu/jar/*
$ hadoop fs -put /home/kayuuzu/jar/* /home/kayuuzu/jar/
$ hadoop jar ./out/artifacts/WordCount_jar/WordCount.jar WordCount
Now it finds the HBase classes but prints an error like:
Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://mycluster/ldata/bin/hadoop-2.3.0-cdh5.0.1/share/hadoop/mapreduce2/hadoop-mapreduce-client-core-2.3.0-cdh5.0.1.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1313)
at WordCount.main(WordCount.java:101)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
The output of hadoop classpath:
$ hadoop classpath
/home/kayuzu/jar/*:/ldata/bin/hadoop-2.3.0-cdh5.0.1/etc/hadoop:/ldata/bin/hadoop-2.3.0-cdh5.0.1/share/hadoop/common/lib/*:/ldata/bin/hadoop-2.3.0-cdh5.0.1/share/hadoop/common/*:/ldata/bin/hadoop-2.3.0-cdh5.0.1/share/hadoop/hdfs:/ldata/bin/hadoop-2.3.0-cdh5.0.1/share/hadoop/hdfs/lib/*:/ldata/bin/hadoop-2.3.0-cdh5.0.1/share/hadoop/hdfs/*:/ldata/bin/hadoop-2.3.0-cdh5.0.1/share/hadoop/yarn/lib/*:/ldata/bin/hadoop-2.3.0-cdh5.0.1/share/hadoop/yarn/*:/ldata/bin/hadoop-2.3.0-cdh5.0.1/share/hadoop/mapreduce/lib/*:/ldata/bin/hadoop-2.3.0-cdh5.0.1/share/hadoop/mapreduce/*
Strangely, it seems to use "ldata/bin/hadoop-2.3.0-cdh5.0.1" (my Hadoop installation path) to expand the classpath, and then tries to load the jars from the HDFS filesystem as if they were local.
If I move hadoop-mapreduce-client-core-2.3.0-cdh5.0.1.jar to /home/kayuuzu/jar/ and upload it to hdfs://home/kayuuzu/jar/, this error goes away and it then fails to load some other class. It seems Hadoop tries to load classes from HDFS using the same paths as on my local machine.
I guess it would work if I moved all the Hadoop library files to one directory and uploaded them to HDFS under the same path, but that would wreck my local Hadoop installation, and there are a lot of jar files.
Have I misunderstood something? How do I specify the library path for the remote MapReduce job?

HBase example: Exception in thread "main" java.lang.NoClassDefFoundError

We are trying to execute a basic HBase example on the Hortonworks sandbox (2.3):
hadoop jar /usr/hdp/2.3.0.0-2557/hbase/lib/hbase-examples.jar org.apache.hadoop.hbase.mapreduce.IndexBuilder
We get the exception below after executing this command:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/util/Bytes
at org.apache.hadoop.hbase.mapreduce.IndexBuilder.<clinit>(IndexBuilder.java:67)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:278)
at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.util.Bytes
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 5 more
Based on this error we tried to set the Hadoop classpath in hbase-env.sh:
/usr/hdp/2.3.0.0-2557/hbase/lib/hbase-client-1.1.1.2.3.0.0-2557.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/hbase-common-1.1.1.2.3.0.0-2557.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/protobuf-java-2.5.0.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/guava-12.0.1.jar:$/usr/hdp/2.3.0.0-2557/hbase/lib/zookeeper.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/hbase-protocol-1.1.1.2.3.0.0-2557.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/commons-configuration-1.6.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/hadoop-common.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/hbase-0.94.27.jar
But we are still getting the same error.
Instead of manually adding jars to the classpath, you can use the command below directly.
$(hbase classpath) searches the Hortonworks Hadoop folders recursively and picks up the required jars on the sandbox:
HADOOP_CLASSPATH=$(hbase classpath):/usr/hdp/2.3.0.0-2557/hbase/conf hadoop jar /usr/hdp/2.3.0.0-2557/hbase/lib/hbase-examples.jar org.apache.hadoop.hbase.mapreduce.IndexBuilder
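Equivalently, you can export the variable once so that every subsequent hadoop command in the same shell picks it up (same paths as above):
export HADOOP_CLASSPATH=$(hbase classpath):/usr/hdp/2.3.0.0-2557/hbase/conf
hadoop jar /usr/hdp/2.3.0.0-2557/hbase/lib/hbase-examples.jar org.apache.hadoop.hbase.mapreduce.IndexBuilder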
When I face a NoClassDefFoundError with MapReduce, I resolve it by registering the jar through one of the classes it contains when building the Job, e.g.:
Job job = new Job(conf);
job.setJarByClass(org.apache.hadoop.hbase.util.Bytes.class);
Alternatively, supply jars to your job using the -libjars parameter, e.g.:
LIB=hbase-x.x.x.jar
hadoop jar /usr/hdp/2.3.0.0-2557/hbase/lib/hbase-examples.jar org.apache.hadoop.hbase.mapreduce.IndexBuilder -libjars ${LIB}
You can also add the jar to the HADOOP_CLASSPATH variable before launching the job.
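For example, a sketch using the sandbox paths from the question (in HBase 1.x the class org.apache.hadoop.hbase.util.Bytes ships in hbase-common):
export HADOOP_CLASSPATH=/usr/hdp/2.3.0.0-2557/hbase/lib/hbase-common-1.1.1.2.3.0.0-2557.jar:$HADOOP_CLASSPATH
hadoop jar /usr/hdp/2.3.0.0-2557/hbase/lib/hbase-examples.jar org.apache.hadoop.hbase.mapreduce.IndexBuilder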
Is all the latest code included in the jar? Use a Java decompiler such as JD-GUI to look inside the jar file and make sure the class you are referencing is actually there. Also check that the necessary import statements are present in the Java class.
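You can also check the jar contents from the command line without a decompiler, e.g. with the hbase-common jar path quoted in the question above:
jar tf /usr/hdp/2.3.0.0-2557/hbase/lib/hbase-common-1.1.1.2.3.0.0-2557.jar | grep 'util/Bytes'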

java.lang.ClassNotFoundException when trying to run camus

I downloaded the Confluent package, which includes the Camus jars, and followed the instructions online.
Hadoop is properly set up (meaning I can use hadoop fs -ls and other hadoop jar commands). However, when I tried to run
hadoop jar confluent-camus-1.0.jar com.linkedin.camus.etl.kafka.CamusJob
I got a ClassNotFoundException for the main class:
Exception in thread "main" java.lang.ClassNotFoundException: com.linkedin.camus.etl.kafka.CamusJob
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:344)
at org.apache.hadoop.util.RunJar.main(RunJar.java:205)
The path to confluent-camus-1.0.jar is correct (the jar is right under the folder). I didn't start the Kafka service; I just tried to run the job.
Has anyone run into similar problems?
Thanks.
You should try to inspect your jar file:
jar tvf confluent-camus-1.0.jar | grep com.linkedin.camus.etl.kafka.CamusJob
If you do not find the class there, look for it in the other jars generated by Camus.
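If there are many jars, a quick loop (a sketch, assuming the jars sit in the current directory) prints each one that contains the class:
for j in *.jar; do jar tf "$j" | grep -q 'com/linkedin/camus/etl/kafka/CamusJob' && echo "$j"; done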
Then add the jar that contains it via -libjars:
hadoop jar confluent-camus-1.0.jar com.linkedin.camus.etl.kafka.CamusJob -libjars {JAR_NAME}

Java program executes in Eclipse but not in the terminal

I am using Eclipse with the Eclipse Maven Plugin (m2e).
My Java program compiles and runs correctly from the Eclipse interface, but I am unable to compile and run it from the terminal.
My Eclipse settings:
I am using two third-party APIs, which I added to the Eclipse build path:
"/home/syed/workspace/FirstMaven/target/resources/fuse-jna-master/build/classes" (as an external class folder)
"/home/syed/workspace/FirstMaven/target/resources/apache-jena-2.11.1/lib" (as external jars)
Package:
package org.organization.upesh.FirstMaven;
My Project path:
syed@ubuntu:~/workspace/FirstMaven$
Source Code Directory Path:
syed@ubuntu:~/workspace/FirstMaven/src/main/java/org/organization/upesh/FirstMaven$
Classes Directory:
syed@ubuntu:~/workspace/FirstMaven/target/classes/org/organization/upesh/FirstMaven$
When I try to execute myProgram via the command below:
syed@ubuntu:~/workspace/FirstMaven/target/classes$ java org.organization.upesh.FirstMaven.myProgram
it gives me this error:
Exception in thread "main" java.lang.NoClassDefFoundError: net/fusejna/util/FuseFilesystemAdapterFull
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:482)
Caused by: java.lang.ClassNotFoundException: net.fusejna.util.FuseFilesystemAdapterFull
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 13 more
But my test program, which does not use the third-party APIs, runs correctly via:
syed@ubuntu:~/workspace/FirstMaven/target/classes$ java org.organization.upesh.FirstMaven.test
I think myProgram is not executing because of the two APIs (the class folder and the jar folder) that I am using.
I have added the paths of the APIs' class and jar folders to /etc/environment (given below) and restarted my computer, but I still get the same error:
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/home/syed/workspace/FirstMaven/target/resources/apache-jena-2.11.1/lib:/home/syed/workspace/FirstMaven/target/resources/fuse-jna-master/build/classes"
Please guide me on how to run my program correctly.
The JVM does not pick up libraries from PATH. It uses a dedicated environment variable, CLASSPATH, which can contain a list of directories or jar files separated by colons on Unix or semicolons on Windows.
So, just define CLASSPATH and put references to all your libraries there.
Alternatively (and IMHO better), use the command-line switch -classpath (or its alias -cp) when running java:
java -cp mylib1.jar:mylib2.jar com.mycompany.Main
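Applied to the paths from the question, a sketch (run from target/classes; the lib/* wildcard requires Java 6+ and is quoted so the shell passes it through to the JVM):
java -cp .:/home/syed/workspace/FirstMaven/target/resources/fuse-jna-master/build/classes:"/home/syed/workspace/FirstMaven/target/resources/apache-jena-2.11.1/lib/*" org.organization.upesh.FirstMaven.myProgram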

java.lang.ClassNotFoundException: org.restlet.service.TunnelService command line

I am using the Restlet framework.
I am trying to run my project from a jar file created with Eclipse via Export -> Runnable JAR File, selecting the option "Package required libraries into generated JAR".
However, when I try to execute the jar file on the command line by typing:
java -Djava.security.policy=Client.Policy -jar identiscopeRunnable.jar
I get the following:
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.NoClassDefFoundError: org/restlet/service/TunnelService
at rest.IdentiscopeServer.main(IdentiscopeServer.java:24)
... 5 more
Caused by: java.lang.ClassNotFoundException: org.restlet.service.TunnelService
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
... 6 more
I have added all the jar files downloaded from the Restlet Framework to my project, so I presume it is not a problem with them. Does anyone have any clue about this?
Just in case anyone asks, line 24 of IdentiscopeServer.java is:
IdentiscopeServerApplication identiscopeServerApp = new IdentiscopeServerApplication();
The class IdentiscopeServerApplication basically does this:
@Override
public Restlet createInboundRoot() {
    Router router = new Router(getContext());
    // attaches the /tweet path to the TweetRest class
    router.attach("/collectionPublic", CollectionPublicREST.class);
    router.attach("/collectionPrivate", CollectionPrivateREST.class);
    router.attach("/analysis", AnalysisREST.class);
    return router;
}
Adding the jars to your Eclipse project will not add them to your command-line classpath. Note also that the JVM ignores -cp whenever -jar is given, so drop -jar and name the main class explicitly:
java -cp identiscopeRunnable.jar:<your Restlet jars, separated by ';' (win) or ':' (linux)> -Djava.security.policy=Client.Policy rest.IdentiscopeServer
See if this helps.
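For example, a minimal sketch assuming you re-export without the "package libraries" option and keep the Restlet jars in a lib/ folder next to the runnable jar (rest.IdentiscopeServer is the main class, per the stack trace):
java -cp "identiscopeRunnable.jar:lib/*" -Djava.security.policy=Client.Policy rest.IdentiscopeServer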
