Hadoop standalone mode: dirName.className gives ClassNotFoundException

I am trying to run Hadoop in standalone mode. I have set up all the configuration files and have successfully run the WordCount example. The problem arises when I try to organize my source code and jar files into a directory hierarchy.
hadoop --config ~/myconfig jar ~/MYPROGRAMSRC/WordCount.jar MYPROGRAMSRC.WordCount ~/wordCountInput/allData ~/wordCountOutput
I use the above command to invoke hadoop from a script file in my home directory. It fails to find the WordCount class one level below, in the MYPROGRAMSRC directory.
The ~/MYPROGRAMSRC directory contains:
WordCount.jar, WordCount.java, WordCount.class, WordCount$Map.class and WordCount$Reduce.class.
But why is hadoop throwing a ClassNotFoundException?
Exception in thread "main" java.lang.ClassNotFoundException: MYPROGRAMSRC.WordCount
I know my program runs because if I transfer the script file into the same directory as the WordCount.class file and run the following command:
hadoop --config ~/myconfig jar WordCount.jar WordCount ~/wordCountInput/allData ~/wordCountOutput
It runs fine.

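The argument after the jar file is a Java class name, not a filesystem path. Your second command shows that the class is named plain WordCount (it declares no package), so keep that name and give the path only for the jar. Try
hadoop --config ~/myconfig jar ~/MYPROGRAMSRC/WordCount.jar WordCount ~/wordCountInput/allData ~/wordCountOutput
MYPROGRAMSRC.WordCount makes no sense if MYPROGRAMSRC is a directory rather than a package.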
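The difference between a directory and a class name can be seen in plain Java, without Hadoop. The sketch below is only an illustration: ClassNameDemo and its helper canLoad are names made up for this example. It asks the JVM to load classes by name, which is, as far as I know, also what hadoop jar ends up doing with the class-name argument. A name resolves only when it matches the package declared inside the class, regardless of the directory the class file or jar sits in.

```java
public class ClassNameDemo {

    // Hypothetical helper: true if the JVM can load a class with exactly this name.
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // This class declares no package, so its full name is just "ClassNameDemo".
        System.out.println(canLoad("ClassNameDemo"));
        // Prefixing a directory name does not turn the directory into a package.
        System.out.println(canLoad("MYPROGRAMSRC.ClassNameDemo"));
        // A class that really lives in a package needs its package prefix.
        System.out.println(canLoad("java.lang.String"));
        System.out.println(canLoad("String"));
    }
}
```

If WordCount.java started with a line such as package MYPROGRAMSRC; and the class files were stored under a MYPROGRAMSRC/ folder inside the jar, then MYPROGRAMSRC.WordCount would be the right name to pass.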

Related

Commands for executing JAR file containing mapreduce programming on hadoop in Singleton Mode

I have Hadoop set up in standalone mode on Ubuntu.
I have a JAR file with a MapReduce program: runner.JAR, in the /home/ubuntu folder
The package in the JAR file: mypackage3
I have an input file: demo.csv, in the /home/ubuntu folder
I want to execute this jar file with demo.csv as the input.
I have used the commands below to create an input directory and copy Hadoop's configuration files into it:
mkdir ~/input
cp /usr/local/hadoop/etc/hadoop/*.xml ~/input
Can you please tell me how to execute this MapReduce program?

Java command line using an imported jar

I am sure this is a stupid question that every Java programmer has asked before, but I cannot find a related question at all.
"Executable jar file error" talks about subdirectories, but I don't have any: everything is in the same directory as the java file, which is also the directory I run the commands from.
The solution in "Java command line with external .jar" gives me the same error as the one I describe below.
Others (I don't have links to them) talk about Eclipse and other IDEs, but I am not using an IDE, just a Linux terminal.
I am trying to import a public jar file from http://www.hummeling.com/IF97. The downloaded jar file has been renamed to if97.jar.
I have a java file called steam.java with these commands inside the file:
'
import com.hummeling.if97.IF97;
IF97 H2O = new IF97(IF97.UnitSystem.ENGINEERING);
System.out.println("test H2O table PSpecificEnthalpy(1): "+H2O.specificEnthalpyPT(1,300));
System.out.println("test H2O table PSpecificEnthalpy(5): "+H2O.specificEnthalpyPT(5,300));
'
But I do not know how to run this file in the command line.
I successfully compiled by typing:
'javac -cp if97.jar ~/test/steam.java'
Now I have a file called steam.class
But when I execute it with:
'java steam -cp if97.jar'
or
'java steam -jar if97.jar'
I get error:
Exception in thread "main" java.lang.NoClassDefFoundError: com/hummeling/if97/IF97
at steam.start(steam.java:364)
at steam.main(steam.java:341)
Caused by: java.lang.ClassNotFoundException: com.hummeling.if97.IF97
I am trying to execute this on Ubuntu Linux 16.04 using the terminal. Both files (steam.java and if97.jar) are in the same home directory from which I run the javac and java commands.
I believe (though I may be mistaken) that the problem is that java isn't able to find the jar file. But I don't know why.
Please advise, thank you in advance.

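You need to specify the class name after the JVM options, because whatever comes after the class name is treated as arguments for the class, not for the JVM.
Note also that -cp replaces the default classpath (the current directory), so include . in it so that steam.class is still found.
Try this:
'java -cp .:if97.jar steam'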
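To see where the launcher stops reading options, a tiny program that prints its own arguments is enough. This is a minimal sketch: ArgsEcho and describe are hypothetical names chosen for the illustration and are not part of the question's code.

```java
import java.util.Arrays;

public class ArgsEcho {

    // Formats the arguments exactly as the program received them.
    static String describe(String[] args) {
        return args.length + " program argument(s): " + Arrays.toString(args);
    }

    public static void main(String[] args) {
        // Anything typed after the class name on the command line ends up here.
        System.out.println(describe(args));
    }
}
```

Run as java ArgsEcho -cp if97.jar, the strings -cp and if97.jar are handed to main as ordinary arguments and the classpath is left at its default. Run as java -cp .:if97.jar ArgsEcho, the launcher consumes the option and the program receives no arguments.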

java.io.exception Cannot run program "python"

I'm trying to run the wordcount topology on Apache Storm via the command line on Ubuntu. It uses the multilang feature to split sentences into words with a program written in Python.
I've added the multilang dir to the classpath in my .bashrc file, but at execution time I still get this error:
java.lang.RuntimeException: Error when launching multilang subprocess
Caused by: java.io.IOException: Cannot run program "python" (in directory "/tmp/eaf0b6b3-67c1-4f89-b3d8-23edada49b04/supervisor/stormdist/word-count-1-1414559082/resources"): error=2, No such file or directory

I found my answer. I was submitting the jar to Storm, but the code created a local cluster, so the classpath was not applied when the jar was uploaded. I modified the code to use the Storm cluster instead of the local cluster, and the jar was then uploaded successfully. Along with this, I also added the multilang folder to the classpath in the Eclipse IDE itself instead of setting it in the .bashrc file.

The Python interpreter installed on the system usually lives in a default location such as /usr/bin or /usr/local/bin. Python modules may live in different paths.
Do not completely override the $PATH environment variable in .bashrc; extend it instead.
Alternatively, set the execute bit on the Python script you would like to run, and call the script as a normal program in Storm.
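The error=2, No such file or directory in the trace most likely means that the Java process could not find an executable named python in any directory on its own PATH. The following sketch performs the same kind of lookup by hand, so it can be run under the same user and environment as the Storm supervisor to check what that process sees. PathLookup and findOnPath are hypothetical names for this illustration, and treating empty PATH entries as skippable is a simplification.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class PathLookup {

    // Searches each directory in a PATH-style string for an executable file
    // with the given name and returns the first match, if any.
    static Optional<Path> findOnPath(String program, String pathValue) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // simplification: ignore empty entries
            }
            Path candidate = Paths.get(dir, program);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Defaults to "python", the program name from the error message.
        String program = args.length > 0 ? args[0] : "python";
        Optional<Path> found = findOnPath(program, System.getenv("PATH"));
        System.out.println(found.map(p -> program + " found at " + p)
                .orElse(program + " is not on PATH"));
    }
}
```

If this reports that python is not on PATH while an interactive shell finds it, the two environments differ, which is what happens when PATH is overwritten in .bashrc or when the supervisor is started without reading that file.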

Hadoop is installed: how do I write and use the WordCount example?

I'm really new to Hadoop and not familiar with terminal commands.
I followed a step-by-step guide to install Hadoop on my Mac and can run some of the built-in Hadoop examples. However, when I tried to run the WordCount example, it generated many errors such as org.apache can't be resolved.
A post online said you should put it where you write your Java code. I normally use Eclipse. However, in Eclipse there are so many errors that the project cannot be compiled.
Any suggestions?
Thanks!

Assuming you have also followed the directions to start up a local cluster or a pseudo-distributed cluster, here is the easiest way.
Go to the hadoop directory, which should be the directory that was unpacked when you downloaded the Hadoop distribution from Apache. From there you can run these commands to run the example.
For Hadoop version 0.23.*:
cd $HOME/path/to/hadoop-0.23.*
./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-0.23.5.jar wordcount myinput outputdir
For Hadoop version 0.20.*:
cd $HOME/path/to/hadoop-0.20.*
./bin/hadoop jar hadoop-0.20.2-examples.jar wordcount myinput outputdir

Running mapreduce java programs on hadoop cluster

I am learning to work on hadoop cluster. I have worked for some time on hadoop streaming where I coded map-reduce scripts in perl/python and ran the job.
However, I didn't find any good explanation for running a java map reduce job.
For example:
I have the following program:
http://www.infosci.cornell.edu/hadoop/wordcount.html
Can somebody tell me how to actually compile this program and run the job?

Create a directory to hold the compiled class:
mkdir WordCount_classes
Compile your class:
javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d WordCount_classes WordCount.java
Create a jar file from your compiled class:
jar -cvf $HOME/code/hadoop/WordCount.jar -C WordCount_classes/ .
Create a directory for your input and copy all your input files into it, then run your job, pointing at the jar created in the previous step:
bin/hadoop jar $HOME/code/hadoop/WordCount.jar WordCount ${INPUTDIR} ${OUTPUTDIR}
The output of your job will be put in the ${OUTPUTDIR} directory. This directory is created by the Hadoop job, so make sure it doesn't exist before you run the job.
See here for a full example.
