I can write a MapReduce program, having configured Hadoop in Eclipse. I create 1. a Mapper, 2. a Reducer, 3. a MapReduce driver.
Then I create a jar file with the help of a Makefile at the shell prompt, and run this command:
hadoop jar ${JarFile} ${MainFunc} input output
Makefile:
JarFile="Sample-0.1.jar"
MainFunc="mypack.Mapreduce"
LocalOutDir="/tmp/output"
Then I use:
jar -cvf ${Sample-0.1.jar} -C bin/ .
The jar file is created, and finally I run this command:
hadoop jar ${Sample-0.1.jar} ${mypack.Mapreduce} input output
Finally I get an error like this at the command prompt:
bash: ${mypack.Mapreduce}: bad substitution
How can I solve this problem? Please help me.
I have now found a solution:
hadoop jar ${Sample-0.1.jar} mypack.Mapreduce input output
Then Hadoop runs.
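The cause is a shell rule: bash only allows letters, digits, and underscores in a parameter name, so ${mypack.Mapreduce} (with its dot) is rejected as a bad substitution, while ${Sample-0.1.jar} happens to parse as the ${var-default} fallback form and quietly expands to 0.1.jar. The variable names belong in the script; the values should only appear after expansion. A minimal sketch of the intended invocation, reusing the variable names from the Makefile above:
#!/bin/bash
# run.sh: define the variables once, then let the shell expand them
JarFile="Sample-0.1.jar"
MainFunc="mypack.Mapreduce"
jar -cvf "${JarFile}" -C bin/ .                      # package the compiled classes
hadoop jar "${JarFile}" "${MainFunc}" input output   # runs: hadoop jar Sample-0.1.jar mypack.Mapreduce input output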
Write a script like compile.sh
$ mkdir wordcount_classes
$ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d wordcount_classes WordCount.java
$ jar -cvf /usr/joe/wordcount.jar -C wordcount_classes/ .
For reference: http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html
Related
I am following the documentation found at this link
https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Usage
When I try to compile WordCount.java and create a jar, I get the following error:
bin/hadoop com.sun.tools.javac.Main WordCount.java
Error: Could not find or load main class com.sun.tools.javac.Main
I have verified my $JAVA_HOME and $HADOOP_CLASSPATH in the hadoop-env.sh file and also verified that I have the JDK installed.
Here are the contents from hadoop-env.sh
export JAVA_HOME="/Library/Java/JavaVirtualMachines/jdk1.8.0_111.jdk/Contents/Home/"
.......
.........
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH="$JAVA_HOME/lib/tools.jar"
  else
    export HADOOP_CLASSPATH=$f
  fi
done
I am not sure of the reason behind the error, or whether I am missing another key configuration.
This doesn't make sense in that loop, and neither does checking the existence of the variable first:
if [ "$HADOOP_CLASSPATH" ]; then
export HADOOP_CLASSPATH="$JAVA_HOME/lib/tools.jar"
else
You need to set HADOOP_CLASSPATH="$JAVA_HOME/lib/tools.jar", as the documentation says, for that class to be found, and that class is only available in the JDK.
But you could just run the javac command to compile the code; I'm not sure why the docs have you calling that class.
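Put differently, a minimal hadoop-env.sh sketch (keeping the JAVA_HOME from the question, and setting the classpath once before the loop):
export JAVA_HOME="/Library/Java/JavaVirtualMachines/jdk1.8.0_111.jdk/Contents/Home/"
export HADOOP_CLASSPATH="$JAVA_HOME/lib/tools.jar"
# if the capacity-scheduler jars are still wanted, append rather than overwrite:
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$f"
done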
How to compile a Hadoop program
$ javac -classpath ${HADOOP_CLASSPATH} -d WordCount/ WordCount.java
To create jar:
$ jar -cvf WordCount.jar -C WordCount/ .
To run:
$ hadoop jar WordCount.jar WordCount input/ output
Suggestion: please use Maven/Gradle to create proper JAR files, and an IDE to write the code.
P.S. Not many people actually write plain MapReduce
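As a sketch of that suggestion (the groupId/artifactId here are placeholders, not from the question):
mvn archetype:generate -DgroupId=com.example -DartifactId=wordcount \
    -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
cd wordcount
# add the hadoop-client dependency to pom.xml, then:
mvn package    # produces target/wordcount-1.0-SNAPSHOT.jar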
I am trying to learn MapReduce from the official documentation. To make a jar file for the WordCount class, the documentation says to run the following command:
javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d wordcount_classes WordCount.java
But I found that my Hadoop directory has no such core jar present. I suppose my Hadoop installation is fine, as I can execute the Hadoop shell script from the bin folder.
Try it with this:
javac -classpath `hadoop classpath` -d wordcount_classes WordCount.java
It isn't best practice, I think, but it works for me: hadoop classpath prints the full list of jars and directories Hadoop runs with, and the backticks splice that list into the javac command.
Check in your hadoop-1.2.1 folder (as in my case), which you unzipped in the "Prepare to Start the Cluster" step of the single-node setup. There you will find hadoop-core-1.2.1.jar; that is the file being used to compile here.
I am trying to create a jar file from the command line using the -C flag, but every time it returns a help screen.
I am giving the following command:
user#ubuntu:~/CDH/JAVA_WORKSPACE/JAVA-SETUP$ jar cvfm ./build/jar/Setup.jar MANIFEST -C build/classes/com/demo/Setup.class
If I remove the -C option, then it archives fine.
But if the -C flag is there, it always returns the jar help page.
Am I doing something wrong here?
Your command line option is:
-C build/classes/com/demo/Setup.class
The jar tool wants a directory name to follow the -C, and then the file. You need two words to follow "-C", like this:
-C build/classes/com/demo Setup.class
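Note that jar stores entries relative to the directory given to -C, so the form above puts Setup.class at the root of the jar. To keep the com/demo package path (which a class in package com.demo needs at runtime), the question's command would presumably become:
jar cvfm ./build/jar/Setup.jar MANIFEST -C build/classes com/demo/Setup.class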
I understand the command would be javac file_name.java, but how would I put together a shell script that could compile several java files?
I was also thinking about copying the files, for which I presume I would just use cp with absolute file paths.
Create a .sh file with the following contents, make the file executable, and run it.
(Specify the complete path along with the file name.)
#! /bin/sh
javac sample.java
Try this script: compile_java_files.sh
#!/bin/sh
# Compile every .java file in JAVA_FILES_DIR, logging failures
JAVA_FILES_DIR=$(cd full_path_to_java_files 2>/dev/null; pwd)   # JAVA FILES DIRECTORY
LOG_DIR="/tmp/java_compilation/logs"                            # create this dir or use another one
for java_file in "$JAVA_FILES_DIR"/*.java
do
    javac "$java_file"
    if [ $? -ne 0 ]
    then
        echo "Failed to compile $java_file" >> "$LOG_DIR/$(basename "$java_file").ERR"
        exit 1
    fi
done
Then run your script (don't forget to specify the path to the directory that contains the java files):
chmod +x compile_java_files.sh
./compile_java_files.sh
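As for the copying mentioned in the question: plain cp with absolute paths does the job. A one-line sketch, with both paths purely illustrative:
cp /home/user/src/*.class /opt/app/classes/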
I'm trying to compile a simple WordCount.java map-reduce example on a Linux (CentOS) installation of Cloudera 4. I keep hitting compiler errors when I reference any of the Hadoop classes, but I can't figure out which of the hundreds of jars under /usr/lib/hadoop I need to add to my classpath to get things to compile. Any help would be greatly appreciated! What I'd like most is a Java file for word count (just in case the one I found is bad for some reason), along with the associated commands to compile and run it.
I am trying to do this using just javac rather than Eclipse. My main issue either way is exactly which Hadoop libraries from the Cloudera 4 install I need to include to get the classic WordCount example to compile. Basically, I need to put the Java MapReduce API classes (Mapper, Reducer, etc.) on my classpath.
I have a script that builds my Hadoop classes. Try:
#!/bin/bash
# Strip the .java extension from the first argument to get the program name
program=`echo $1 | awk -F "." '{print $1}'`
if [ ! -d "${program}_classes" ]; then
    mkdir "${program}_classes"
fi
javac -classpath /usr/lib/hadoop/hadoop-common-2.0.0-cdh4.0.1.jar:/usr/lib/hadoop/client/hadoop-mapreduce-client-core-2.0.0-cdh4.0.1.jar -d "${program}_classes" "$1"
jar -cvf "${program}.jar" -C "${program}_classes" .
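Presumably saved as something like compile.sh (the name is hypothetical) and invoked as:
./compile.sh WordCount.java    # compiles into WordCount_classes/ and packages WordCount.jar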
You were probably missing the key jars:
/usr/lib/hadoop/hadoop-common-2.0.0-cdh4.0.1.jar
and
/usr/lib/hadoop/client/hadoop-mapreduce-client-core-2.0.0-cdh4.0.1.jar
If you are running the Cloudera CDH4 Virtual Machine, then the following should get you running:
javac -classpath /usr/lib/hadoop/hadoop-common-2.0.0-cdh4.0.0.jar:/usr/lib/hadoop/client/hadoop-mapreduce-client-core-2.0.0-cdh4.0.0.jar -d wordcount_classes WordCount.java
Or you can export the environment variables:
export JAVA_HOME=/usr/java/default
export PATH=${JAVA_HOME}/bin:${PATH}
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar
and use the commands below:
$ bin/hadoop com.sun.tools.javac.Main WordCount.java
$ jar cf wc.jar WordCount*.class
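After that, a run should look like the tutorial's (the HDFS paths are the tutorial's examples):
$ bin/hadoop jar wc.jar WordCount /user/joe/wordcount/input /user/joe/wordcount/output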
If you are using Eclipse, please do add the Hadoop packages; you can get them from java2s or any similar site. I couldn't say more without knowing what you have done so far.