Configuring memory for mappers and reducers during MapReduce job submission - java

I am trying to configure the mapper/reducer memory during MapReduce job submission, as below:
hadoop jar Word-0.0.1-SNAPSHOT.jar -Dmapreduce.map.memory.mb=5120 com.test.Word.App /tmp/ilango/input /tmp/ilango/output/
Is there anything wrong with the command above? I am getting the following exception. Do we need to position the arguments around the JAR file differently, or configure something else to use the -D option in Hadoop? Thanks in advance.
Exception in thread "main" java.lang.ClassNotFoundException: -Dmapreduce.map.memory.mb=5120
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.util.RunJar.main(RunJar.java:205)

The command to run an MR job is
hadoop jar jarname classname input output
As per your command,
hadoop jar jarname -D mapreduce.map.memory.mb=5120 classname input output
Hadoop looks for a driver class named "-Dmapreduce.map.memory.mb=5120".
That is why it throws a java.lang.ClassNotFoundException.
The -D option should be supplied after your driver class.
Try the command below:
hadoop jar Word-0.0.1-SNAPSHOT.jar com.test.Word.App -D mapreduce.map.memory.mb=5120 /tmp/ilango/input /tmp/ilango/output/
Hope this solves your issue.
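Note that a -D supplied after the driver class is only picked up if the driver parses generic options, typically by implementing Tool and launching through ToolRunner. Below is a minimal sketch of such a driver, assuming a shape like the com.test.Word.App from the question (the job wiring is illustrative):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class App extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() already carries any -D overrides that ToolRunner parsed,
        // e.g. mapreduce.map.memory.mb=5120
        Job job = Job.getInstance(getConf(), "word");
        job.setJarByClass(App.class);
        // Mapper/reducer setup omitted; only the option plumbing is shown
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips generic options (-D, -conf, -libjars, ...) off the
        // argument list and applies them to the Configuration before run()
        System.exit(ToolRunner.run(new Configuration(), new App(), args));
    }
}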

It looks like you are missing a space after -D
Try -D mapreduce.map.memory.mb=5120 instead.
There is a difference between -Dproperty=value and -D property=value. The first one sets a JVM system property, whereas the second one sets a Hadoop configuration property.
Quoting from the book Hadoop: The Definitive Guide:
-D property=value Sets the given Hadoop configuration property to the given value. Overrides any default or site properties in the configuration, and any properties set via the -conf option.
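To see the distinction concretely, here is a minimal sketch (the class name PropertyDemo and the property my.prop are made up for illustration):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.GenericOptionsParser;

public class PropertyDemo {
    public static void main(String[] args) throws Exception {
        // java -Dmy.prop=x PropertyDemo  -> consumed by the JVM as a system property
        System.out.println("JVM system property: " + System.getProperty("my.prop"));

        // java PropertyDemo -D my.prop=x -> plain application arguments, which
        // Hadoop's GenericOptionsParser turns into Configuration entries
        Configuration conf = new Configuration();
        new GenericOptionsParser(conf, args);
        System.out.println("Hadoop configuration: " + conf.get("my.prop"));
    }
}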

If you're using Maven and have added the main class to the JAR's manifest (in this case com.test.Word.App), then -D mapreduce.map.memory.mb=5120 will be passed to your program as an argument instead of being mistaken for the class name.
So just remove com.test.Word.App from the command line.
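Assuming the manifest's Main-Class entry really does point at com.test.Word.App, the invocation would then look like this (an untested sketch based on the command in the question):
hadoop jar Word-0.0.1-SNAPSHOT.jar -D mapreduce.map.memory.mb=5120 /tmp/ilango/input /tmp/ilango/output/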

Related

Why does a java.lang.NoClassDefFoundError occur when I use args4j in a batch built by mvn:assembly?

For various experiments, I maintain a Java project on GitHub.
After the Maven build, the program runs via a .bat script.
I have now opened a branch because I want to use the args4j library to parse the arguments.
The build works fine and the JARs exist in the lib directory, but when I run the program I get this exception stack trace:
Exception in thread "main" java.lang.NoClassDefFoundError:
org/kohsuke/args4j/CmdLineException
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2625)
at java.lang.Class.getMethod0(Class.java:2866)
at java.lang.Class.getMethod(Class.java:1676)
at sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486)
Caused by: java.lang.ClassNotFoundException:
org.kohsuke.args4j.CmdLineException
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 6 more
In the .bat script I configured the classpath so that the args4j JAR in lib is included; these are the relevant instructions of the script:
SET JAVA_DIR=C:\Program Files\Java\jdk1.7.0_80\bin\
>CUT
"%JAVA_DIR%\java" -jar ".\lib\buildCSS-1.0.jar" -cp ".\lib\" -conf "./conf/environment.properties"
I don't understand the cause of the java.lang.NoClassDefFoundError: the JARs are present and linked by the -cp option.
Do you have any idea (and a solution), please?
You cannot combine -jar and -cp arguments on the command line. If the java command sees -jar, it treats everything after the JAR file name as application arguments, and it ignores any earlier classpath arguments.
You have two choices:
Use -cp, include the main JAR in the classpath, and put the full class name for the main class on the command line (see the sketch after this list).
Use -jar, and add a "Class-Path" attribute to the main JAR's manifest file listing all of the dependencies.
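For instance, the first choice might look like this in the .bat script (com.example.buildcss.Main is a placeholder; use your project's real main class):
"%JAVA_DIR%\java" -cp ".\lib\*" com.example.buildcss.Main -conf "./conf/environment.properties"
On Windows the classpath separator is ; rather than :, and the .\lib\* wildcard (supported since Java 6) picks up every JAR in lib, including buildCSS-1.0.jar.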
References:
The java command page - explains -jar versus -cp
The JAR file specification - explains the "Class-Path" attribute
Note: since you are building the JAR file using Maven, there are other options; for example
Use the "Shade" plugin to create an executable "uber-jar" containing all of the dependencies in a single JAR.

HBase example: Exception in thread "main" java.lang.NoClassDefFoundError

We are trying to execute a basic HBase example on the Hortonworks sandbox (2.3).
hadoop jar /usr/hdp/2.3.0.0-2557/hbase/lib/hbase-examples.jar org.apache.hadoop.hbase.mapreduce.IndexBuilder
We get the exception below after executing this command.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/util/Bytes
at org.apache.hadoop.hbase.mapreduce.IndexBuilder.<clinit>(IndexBuilder.java:67)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:278)
at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.util.Bytes
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 5 more
Based on this error, we tried to set the Hadoop classpath in hbase-env.sh:
/usr/hdp/2.3.0.0-2557/hbase/lib/hbase-client-1.1.1.2.3.0.0-2557.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/hbase-common-1.1.1.2.3.0.0-2557.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/protobuf-java-2.5.0.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/guava-12.0.1.jar:$/usr/hdp/2.3.0.0-2557/hbase/lib/zookeeper.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/hbase-protocol-1.1.1.2.3.0.0-2557.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/commons-configuration-1.6.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/hadoop-common.jar:/usr/hdp/2.3.0.0-2557/hbase/lib/hbase-0.94.27.jar
But still getting the same error.
Instead of manually adding JARs to the classpath, you can directly use the command below.
$(hbase classpath) recursively searches the Hortonworks Hadoop folders and finds the required JARs in the sandbox.
HADOOP_CLASSPATH=$(hbase classpath):/usr/hdp/2.3.0.0-2557/hbase/conf hadoop jar /usr/hdp/2.3.0.0-2557/hbase/lib/hbase-examples.jar org.apache.hadoop.hbase.mapreduce.IndexBuilder
When I face a NoClassDefFoundError with MapReduce, I add the JAR by pointing setJarByClass at one of the classes inside it during job setup, which ships that JAR with the job, e.g.:
// Job.getInstance replaces the deprecated new Job(conf)
Job job = Job.getInstance(conf);
// Locates the JAR containing Bytes (hbase-common) and submits it with the job
job.setJarByClass(org.apache.hadoop.hbase.util.Bytes.class);
Supply JARs to your job using the -libjars parameter (a generic option, so the driver must parse it via ToolRunner for it to take effect), e.g.:
LIB=hbase-x.x.x.jar
hadoop jar /usr/hdp/2.3.0.0-2557/hbase/lib/hbase-examples.jar org.apache.hadoop.hbase.mapreduce.IndexBuilder -libjars ${LIB}
You can also add the JAR to the HADOOP_CLASSPATH variable before launching the job.
Is all the latest code included in the JAR? Use a Java decompiler such as JD-GUI to look inside the JAR file and make sure the class you are referencing is actually there. Also check that the necessary import statements are present in the Java class.
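If you don't have a decompiler at hand, listing the JAR's contents works too; for example, to confirm that hbase-common really contains the missing Bytes class (path taken from the classpath in the question):
jar tf /usr/hdp/2.3.0.0-2557/hbase/lib/hbase-common-1.1.1.2.3.0.0-2557.jar | grep util/Bytes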

Errors running a Flume agent with an interceptor

I am trying to run a custom Flume agent from the terminal on Linux. I am working on a Cloudera VM. The command running Flume looks like:
flume-ng agent --conf . -f spoolDirLocal2hdfs_memoryChannel.conf -Dflume.root.logger=DEBUG,console -n Agent5
The source with the interceptor looks like:
Agent5.sources.spooldir-source.interceptors = i1
Agent5.sources.spooldir-source.interceptors.i1.type = org.flumefiles.flume.HtmlInterceptor$Buider
I've placed my JAR file both in /usr/lib/hadoop/lib/ and /usr/lib/flume-ng/lib/. I've also created plugins.d at /usr/lib/flume-ng/plugins.d/ and placed the JAR there. But when running the Flume agent I get an error:
15/02/18 06:10:46 ERROR channel.ChannelProcessor: Builder class not found. Exception follows.
java.lang.ClassNotFoundException: org.intropro.flume.HtmlInterceptor$Buider
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
.....
Where should I place my JAR file so that the builder is found?
Place it into FLUME_HOME/lib and then restart Flume.
If that doesn't work, make sure your interceptor's builder class actually implements the Interceptor.Builder interface; that might be another reason.
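Two details in your own post are also worth checking: the type in your configuration ends in $Buider (missing the 'l' of Builder), which by itself would cause exactly this ClassNotFoundException, and the package in the error (org.intropro.flume) differs from the one in your config (org.flumefiles.flume). For reference, here is a minimal sketch of the shape Flume expects, using the names from your config (the event handling is illustrative):
package org.flumefiles.flume;

import java.util.List;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

public class HtmlInterceptor implements Interceptor {

    @Override
    public void initialize() {
        // one-time setup, if needed
    }

    @Override
    public Event intercept(Event event) {
        // inspect or modify a single event here
        return event;
    }

    @Override
    public List<Event> intercept(List<Event> events) {
        for (Event event : events) {
            intercept(event);
        }
        return events;
    }

    @Override
    public void close() {
        // cleanup, if needed
    }

    // The interceptor type in the agent config must name this inner class
    // with the $Builder suffix, and it must implement Interceptor.Builder
    public static class Builder implements Interceptor.Builder {
        @Override
        public Interceptor build() {
            return new HtmlInterceptor();
        }

        @Override
        public void configure(Context context) {
            // read interceptor parameters from the agent config, if any
        }
    }
}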

Weka from the command line on Mac OS

I usually use Weka from the command line on Linux systems to perform feature selection on attributes, as in:
java -cp PATH_TO_WEKA_JAR weka.attributeSelection.CfsSubsetEval ... (other parameters)
I'm trying to run the same code on Mac OS, but I get this error:
Exception in thread "main" java.lang.NoClassDefFoundError: weka.attributeSelection.CfsSubsetEval
Caused by: java.lang.ClassNotFoundException: weka.attributeSelection.CfsSubsetEval
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
How can I run the same command on Mac OS? Shouldn't it be the same, since it's a UNIX-based OS?
I also tried to enclose the path in quotes, but there is no difference. What is wrong?
Try this command:
jar tf PATH_TO_WEKA_JAR | grep weka.attributeSelection.CfsSubsetEval
In the output you should see a line with weka/attributeSelection/CfsSubsetEval.class.
If you don't see such a line, then the JAR file doesn't contain that class,
and the command cannot work.
In that case, try to run this:
jar tf PATH_TO_WEKA_JAR | less
to just see what is in the jar file.
One way or another, this is a simple classpath issue: the class weka.attributeSelection.CfsSubsetEval is simply not on your classpath. You need to find the correct path to the JAR, and possibly to other dependencies as well, and construct the correct parameter to use in:
java -cp CORRECT_CLASSPATH weka.attributeSelection.CfsSubsetEval # ... your other params
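For instance, if the JAR lived at /Applications/weka-3-8-6/weka.jar (an illustrative path), the call would become:
java -cp /Applications/weka-3-8-6/weka.jar weka.attributeSelection.CfsSubsetEval ... (other parameters)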
I guess there is something wrong with the Weka JAR file: it tells you it can't find a particular class in the file.
Mac OS has evolved away from Unix quite a bit, which may make it necessary to use a different JAR file.
This may help you: Weka Site download

How to execute Mahout with a Hadoop installation

I'm trying to figure out how to run the Mahout example JARs with Hadoop. I configured Mahout and Hadoop; now I enter the Hadoop directory and type something like this:
/Users/hadoop/hadoop-0.20.2/bin/hadoop jar /Users/hadoop/trunk/examples/mahout-examples-0.5-SNAPSHOT-job.jar org.apache.mahout.SpareVectorsFromSequenceFile -w -i ratings -o ratings_vectors
My goal is to run a Hadoop job on the GroupLens dataset. I executed the put command to upload my ratings.dat to Hadoop, and then? The command always gives me something like this:
Exception in thread "main" java.lang.ClassNotFoundException: org.apache.mahout.SpareVectorsFromSequenceFile
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
My questions are:
How can I set the right path in the Hadoop directory to call Mahout?
How can I use org.apache.mahout.cf.taste.example.grouplens.GroupLensRecommenderEvaluatorRunner to process my ratings.dat data with Hadoop?
Thank you very much; I'm just beginning with Hadoop and Mahout ;)
You have a typo: they are "sparse vectors", not "spare vectors". Your SpareVectorsFromSequenceFile should be SparseVectorsFromSequenceFile.
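Since the class's package has also moved around between Mahout versions, it is worth verifying the fully qualified name before rerunning; listing the job JAR's contents is one way (paths as in the question):
jar tf /Users/hadoop/trunk/examples/mahout-examples-0.5-SNAPSHOT-job.jar | grep SparseVectorsFromSequenceFile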
