Java: Class not found for PageRank algorithm in Apache Hadoop

I am trying to run the PageRank algorithm on an Apache Hadoop (2.6.5) cluster (1 master, 2 slaves). I am using the program in this repository: https://github.com/danielepantaleone/hadoop-pagerank.git. I was able to compile all the sources using this command:
sudo javac -classpath ${HADOOP_CLASSPATH} -d ./build src/it/uniroma1/hadoop/pagerank/PageRank.java src/it/uniroma1/hadoop/pagerank/job1/PageRankJob1Mapper.java src/it/uniroma1/hadoop/pagerank/job1/PageRankJob1Reducer.java src/it/uniroma1/hadoop/pagerank/job2/PageRankJob2Mapper.java src/it/uniroma1/hadoop/pagerank/job2/PageRankJob2Reducer.java src/it/uniroma1/hadoop/pagerank/job3/PageRankJob3Mapper.java
I created the jar file using this command:
sudo jar -cf build/pagerank.jar build/.
I am trying to run the program the same way as the wordcount example:
sudo bin/hadoop jar hadoop-pagerank/build/pagerank.jar PageRank --input /usr/local/hdfs/web-Google.txt --output /usr/local/hdfs-out-PR
Sometimes I get an error like this:
Exception in thread "main" java.lang.NoClassDefFoundError: PageRank (wrong name: it/uniroma1/hadoop/pagerank/PageRank)
and sometimes, depending on how I compiled, an error like this:
Exception in thread "main" java.lang.ClassNotFoundException: PageRank
I am not sure what I am doing wrong. Can anyone please help me with the proper steps to compile and run the program in Hadoop? I don't have a pom.xml file, and I am able to run the provided wordcount example jar.

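You have to put the package name in front of the class name. That is, use the fully-qualified name
it.uniroma1.hadoop.pagerank.PageRank
rather than PageRank in your command, like this:
hadoop jar hadoop-pagerank/build/pagerank.jar it.uniroma1.hadoop.pagerank.PageRank --input /usr/local/hdfs/web-Google.txt --output /usr/local/hdfs-out-PR
If the class is still not found, list the jar with jar tf build/pagerank.jar and check that the entries start with it/uniroma1/ rather than build/. The jar tool stores paths as they were given on the command line; its -C option changes into a directory before adding files.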
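To see why the bare name fails, here is a minimal sketch that has nothing to do with Hadoop itself. It only assumes that class lookup goes by the exact fully-qualified name, which is what the error messages in the question suggest. ClassNameLookup and canLoad are made-up names for the illustration, and java.util.ArrayList just stands in for the PageRank class.

```java
// Sketch: the JVM looks classes up by their fully-qualified name only.
// ClassNameLookup and canLoad are illustrative names, not part of Hadoop.
class ClassNameLookup {

    // Returns true if a class with exactly this name can be loaded.
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // ClassNotFoundException: no class of that name on the classpath.
            // LinkageError covers NoClassDefFoundError ("wrong name: ...").
            return false;
        }
    }

    public static void main(String[] args) {
        // The simple name alone is not enough, even for a JDK class.
        System.out.println("ArrayList           -> " + canLoad("ArrayList"));
        // Package name plus class name is what the loader expects.
        System.out.println("java.util.ArrayList -> " + canLoad("java.util.ArrayList"));
    }
}
```

The same rule applies to the class name passed to hadoop jar: the loader has to be given the package-qualified name, not just PageRank.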

Related

Classpath issues - getJNIEnv failed

I have successfully compiled the JNI based Apache libhdfs (C++) on my Hadoop Sandbox / CentOS - no compilation errors or warnings:
g++ test.cpp -o test -I/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.151.x86_64/include/
-I/usr/hdp/2.6.3.0-235/usr/include/ -I/usr/hdp/2.6.3.0-235/hadoop/bin
-I/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.151-1.b12.el6_9.x86_64/include/
-I/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.151-1.b12.el6_9.x86_64/jre/lib/amd64/
-L/usr/hdp/2.6.3.0-235/hadoop/lib/ -L/usr/hdp/2.6.3.0-235/hadoop/lib/native
-L/usr/hdp/2.6.3.0-235/hadoop/lib/ -L/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.151-1.b12.el6_9.x86_64/jre/lib/amd64/
-L/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.151-1.b12.el6_9.x86_64/jre/lib/amd64/server/
-lhdfs -pthread -ljvm
Once I try to run the code, I get the following errors:
[root@sandbox-hdp ~]# ./test
Environment variable CLASSPATH not set!
getJNIEnv: getGlobalJNIEnv failed
Environment variable CLASSPATH not set!
getJNIEnv: getGlobalJNIEnv failed
If I run hadoop classpath in the terminal, I get the following output:
[root@sandbox-hdp ~]# hadoop classpath
/usr/hdp/2.6.3.0-235/hadoop/conf:/usr/hdp/2.6.3.0-235/hadoop/lib/*:/usr/hdp/2.6.3.0-235/hadoop/.//*:/usr/hdp/2.6.3.0-235/hadoop-hdfs/./:/usr/hdp/2.6.3.0-235/hadoop-hdfs/lib/*:/usr/hdp/2.6.3.0-235/hadoop-hdfs/.//*:/usr/hdp/2.6.3.0-235/hadoop-yarn/lib/*:/usr/hdp/2.6.3.0-235/hadoop-yarn/.//*:/usr/hdp/2.6.3.0-235/hadoop-mapreduce/lib/*:/usr/hdp/2.6.3.0-235/hadoop-mapreduce/.//*::jdbc-mysql.jar:mysql-connector-java-5.1.17.jar:mysql-connector-java-5.1.37.jar:mysql-connector-java.jar:/usr/hdp/2.6.3.0-235/tez/*:/usr/hdp/2.6.3.0-235/tez/lib/*:/usr/hdp/2.6.3.0-235/tez/conf
On the Apache libhdfs page it says:
The most common problem is the CLASSPATH is not set properly when
calling a program that uses libhdfs. Make sure you set it to all the
Hadoop jars needed to run Hadoop itself as well as the right
configuration directory containing hdfs-site.xml. It is not valid to
use wildcard syntax for specifying multiple jars. It may be useful to
run hadoop classpath --glob or hadoop classpath --jar to generate the
correct classpath for your deployment. See Hadoop Commands Reference
for more information on this command.
However, after many trial-and-error attempts I still don't see how to proceed, so I would appreciate any help with this problem.
Edit: I tried the following:
CLASSPATH=`hadoop classpath` ./test
...which gave me the following error:
libjvm.so: cannot open shared object file: No such file or directory
I then tried the following:
export LD_LIBRARY_PATH=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.151-1.b12.el6_9.x86_64/jre/lib/amd64/server
...and now the error is:
[root@sandbox-hdp ~]# CLASSPATH=$CLASSPATH:`hadoop classpath` ./test
loadFileSystems error:
(unable to get stack trace for java.lang.NoClassDefFoundError exception: ExceptionUtils::getStackTrace error.)
hdfsBuilderConnect(forceNewInstance=0, nn=default, port=0, kerbTicketCachePath=(NULL), userName=(NULL)) error:
(unable to get stack trace for java.lang.NoClassDefFoundError exception: ExceptionUtils::getStackTrace error.)
hdfsOpenFile(/tmp/testfile.txt): constructNewObjectOfPath error:
(unable to get stack trace for java.lang.NoClassDefFoundError exception: ExceptionUtils::getStackTrace error.)
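One of the following may work for you:
CLASSPATH=$CLASSPATH:`hadoop classpath` ./test
or just:
CLASSPATH=`hadoop classpath` ./test
Since the libhdfs documentation quoted in the question says wildcard syntax is not valid, it may be necessary to use hadoop classpath --glob rather than plain hadoop classpath.
Also check the JAVA_HOME environment variable; it may change which Java libraries are used.
Finally, a wrapper script like the one below could be useful: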
#!/bin/bash
export CLASSPATH="AllTheJARs"
ARG0="$0"
EXEC_PATH="$( dirname "$ARG0" )"
"${EXEC_PATH}/test" "$@"
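For the "AllTheJARs" part, the libhdfs note quoted in the question says wildcard entries are not accepted, so every jar has to be listed explicitly. The following is only a rough sketch of that expansion, in case hadoop classpath --glob is not available. ClasspathExpander and expand are hypothetical names, and the sketch assumes that only entries ending in * need expanding and that only files ending in .jar should be picked up.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Sketch: turn "dir/*" classpath entries into an explicit list of jars.
// ClasspathExpander is an illustrative helper, not a Hadoop class.
class ClasspathExpander {

    static String expand(String classpath) {
        List<String> out = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (entry.isEmpty()) {
                continue; // skip empty entries such as the "::" in the output above
            }
            if (entry.endsWith("*")) {
                String dirName = entry.substring(0, entry.length() - 1);
                File dir = new File(dirName.isEmpty() ? "." : dirName);
                // listFiles returns null if the directory does not exist.
                File[] jars = dir.listFiles((d, name) -> name.endsWith(".jar"));
                if (jars != null) {
                    Arrays.sort(jars); // stable, predictable order
                    for (File jar : jars) {
                        out.add(jar.getPath());
                    }
                }
            } else {
                out.add(entry); // plain directories and jars are kept as they are
            }
        }
        return String.join(File.pathSeparator, out);
    }

    public static void main(String[] args) {
        // Takes the classpath as the first argument, else from CLASSPATH.
        String cp = args.length > 0 ? args[0] : System.getenv("CLASSPATH");
        System.out.println(cp == null ? "" : expand(cp));
    }
}
```

The printed value could then be exported as CLASSPATH in the wrapper script before the native program is started.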

Could not find or load main class ...commons-logging-1.1.1.jar while running a spring batch job from linux command line

I am trying to run a simple spring batch job from linux command line using the following command:
$PATH_TO_JAVA -cp $PATH_TO_JAR/dependency-jars/*;$PATH_TO_JAR/SpringBatchExample.jar org.springframework.batch.core.launch.support.CommandLineJobRunner file:resources/spring/batch/jobs/job-read-files.xml readMultiFileJob
But I get the following error:
Error: Could not find or load main class ...dependency-jars.commons-logging-1.1.1.jar
The commons-logging-1.1.1.jar file is present in the $PATH_TO_JAR/dependency-jars folder. Please advise.

Command: java -jar [...] Fails with Error Message

I am running java 1.8.0_65 on Windows 7.
I create a JAR and run it with the following command:
java -jar printxml.jar
And get this error:
Error: Could not find or load main class printxml.PrintXml
Here is my command to create the JAR:
jar cmfev manifest.txt printxml.jar printxml.PrintXml @filelist.txt
Contents of file "manifest.txt":
Class-Path: C:\Users\Me\SQLSER~1\JDBC\jtds-1.3.1.jar
I checked whether printxml.PrintXml class is in the JAR via this command:
jar tvf printxml.jar printxml/PrintXml.class
The command succeeded, i.e. PrintXml class is in the JAR.
I then checked if the PrintXml class in the JAR has a "main" method via this command:
javap -classpath printxml.jar -public printxml.PrintXml
The command succeeded and its output included...
public static void main(java.lang.String[]);
Searching the Internet, I found only the obvious answers, like:
Your classpath is wrong.
Your class doesn't have a "main" method.
Can someone please tell me how to resolve this problem?
Thanks,
Avi.
As Homer Simpson would say: D'OH
The value of the Class-Path entry in file "manifest.txt" is wrong!
It needs to be a URL!
So I changed it to:
file:/C:/Users/Me/SQLSER~1/JDBC/jtds-1.3.1.jar
Hey presto! No more error message. Now it runs!
Thanks to all who helped. ;-)
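For anyone who wants to check what ends up in the manifest, here is a small sketch using the standard java.util.jar.Manifest class. It only prints a manifest with the same Main-Class and the corrected Class-Path value from above; ManifestSketch and build are made-up names, and this is not how the jar tool itself is implemented.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Sketch: build a manifest with a Main-Class and a URL-style Class-Path.
// ManifestSketch is an illustrative name only.
class ManifestSketch {

    static String build(String mainClass, String classPath) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise write() emits no main attributes.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        // Class-Path entries are URLs separated by spaces, not OS paths.
        attrs.put(Attributes.Name.CLASS_PATH, classPath);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            manifest.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.print(build("printxml.PrintXml",
                "file:/C:/Users/Me/SQLSER~1/JDBC/jtds-1.3.1.jar"));
    }
}
```

Comparing this output with the META-INF/MANIFEST.MF inside the jar is a quick way to spot a Class-Path value that was written as a Windows path instead of a URL.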

How to execute an AQL query with Java API?

I have a collection named docCollection and I want to run a normal AQL query on it, for example:
FOR id IN docCollection FILTER id.center == "Germany" RETURN id
I have tried to use the example as stated here:
https://docs.arangodb.com/cookbook/JavaDriverXmlData.html
But it didn't work for me; it gave me:
Exception in thread "main" java.lang.NullPointerException
Normally you have to use driver.executeDocumentQuery(...) for document queries.
To illustrate the differences between driver.executeDocumentQuery(...) and driver.executeAqlQuery(...), I have added an example.
Download the ArangoDB Java driver from GitHub and compile it with Maven:
mvn clean install -DskipTests=true -Dgpg.skip=true -Dmaven.javadoc.skip=true -B
Maven creates the standalone driver JAR file (arangodb-java-driver-X.X.X-SNAPSHOT-standalone.jar) containing all dependencies in the target directory.
Fetch the example code:
wget https://gist.githubusercontent.com/anonymous/bd68b523647548e5fb36d27c29561cfe/raw/f2922d431b9f1e5a3f3239e9024cf342536f55f7/AqlExample.java
Compile the example code:
javac -classpath arangodb-java-driver-X.X.X-SNAPSHOT-standalone.jar AqlExample.java
Start ArangoDB without authentication on the default port and run the example code:
java -classpath arangodb-java-driver-X.X.X-SNAPSHOT-standalone.jar:. AqlExample

Could not find or load main class

I want to run a Java project from the terminal. Compiling produces no errors, but when I run the program I get one of the following errors:
Could not find or load main class
or
Exception in thread "main" java.lang.NoClassDefFoundError: Appium (wrong name: com/appiumproj/test/Appium)
Please help me solve this problem.
iMac:~ Samuel$ javac -cp /Users/Samuel/Downloads/AppiumTest/lib/selenium-server-standalone-2.45.0.jar:/Users/Samuel/Downloads/AppiumTest/lib/gson-2.3.1.jar:/Users/Samuel/Downloads/AppiumTest/lib/java-client-2.2.0.jar: /Users/Samuel/Downloads/AppiumTest/src/com/appiumproj/test/Appium.java
iMac:~ Samuel$ java -cp /Users/Samuel/Downloads/AppiumTest/lib/selenium-server-standalone-2.45.0.jar:/Users/Samuel/Downloads/AppiumTest/lib/gson-2.3.1.jar:/Users/Samuel/Downloads/AppiumTest/lib/java-client-2.2.0.jar: /Users/Samuel/Downloads/AppiumTest/src/com/appiumproj/test/Appium
Error: Could not find or load main class .Users.Samuel.Downloads.AppiumTest.src.com.appiumproj.test.Appium
iMac:~ Samuel$
You need to specify the name of the class - not a filename. It needs to be the fully-qualified class name, and it needs to be on the classpath. So after compiling, you'd want something like this (just spread out on multiple lines for readability; the backslashes are line continuations - you should be able to copy and paste this straight into your shell):
java -cp /Users/Samuel/Downloads/AppiumTest/lib/selenium-server-standalone-2.45.0.jar\
:/Users/Samuel/Downloads/AppiumTest/lib/gson-2.3.1.jar\
:/Users/Samuel/Downloads/AppiumTest/lib/java-client-2.2.0.jar\
:/Users/Samuel/Downloads/AppiumTest/src \
com.appiumproj.test.Appium
Are you sure the compiled class is in /Users/Samuel/Downloads/AppiumTest/src/com/appiumproj/test/? When javac is run without -d, it normally writes each .class file next to its source file, so check that Appium.class is actually there. Then put the directory that contains the com/ package tree on the classpath.
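If it helps, the relationship between the file path and the name that java expects can be sketched like this. MainClassName and fromPath are hypothetical names; the sketch assumes the first argument is the directory that goes on -cp and the second is a .class or .java file somewhere below it.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Sketch: derive the fully-qualified class name from a file path
// relative to the classpath root. MainClassName is an illustrative name.
class MainClassName {

    static String fromPath(String classpathRoot, String file) {
        Path root = Paths.get(classpathRoot).normalize();
        Path path = Paths.get(file).normalize();
        if (!path.startsWith(root)) {
            throw new IllegalArgumentException(file + " is not under " + classpathRoot);
        }
        // Each directory below the root becomes one package segment.
        List<String> parts = new ArrayList<>();
        for (Path segment : root.relativize(path)) {
            parts.add(segment.toString());
        }
        // Drop the .class / .java extension from the last segment.
        int last = parts.size() - 1;
        String name = parts.get(last);
        int dot = name.lastIndexOf('.');
        if (dot > 0) {
            parts.set(last, name.substring(0, dot));
        }
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        System.out.println(fromPath(
                "/Users/Samuel/Downloads/AppiumTest/src",
                "/Users/Samuel/Downloads/AppiumTest/src/com/appiumproj/test/Appium.class"));
    }
}
```

So the directory goes after -cp and the dotted name goes where the file path was in the original command.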
