I am new to Hadoop/Giraph and Java. As part of a task, I downloaded the Cloudera QuickStart VM and installed Giraph on top of it. I am working from the book "Practical Graph Analytics with Apache Giraph" (Shaposhnik, Martella, Logothetis), from which I tried to run the first example on page 111 (Twitter Followership Graph).
Defining the Shell Environment for Giraph Execution
$ export HADOOP_HOME=/usr/lib/hadoop
$ export GIRAPH_HOME=/usr/local/giraph
$ export HADOOP_CONF_DIR=$GIRAPH_HOME/conf
$ export PATH=$HADOOP_HOME/bin:$GIRAPH_HOME/bin:$PATH
Running the Giraph Application
$ giraph target/*.jar GiraphHelloWorld -vip src/main/resources/1 \
    -vif org.apache.giraph.io.formats.IntIntNullTextInputFormat \
    -w 1 -ca giraph.SplitMasterWorker=false,giraph.logLevel=error
I created both the jar file and the Java program in the /home/cloudera/target folder, and the graph input (a .txt file) is in src/main/resources/1.
Running the above commands with the attached program produces the errors shown below.
https://i.stack.imgur.com/tAQaT.jpg (Error1)
https://i.stack.imgur.com/GqY2O.jpg (Error2)
https://i.stack.imgur.com/ATacy.jpg (Java Program)
Please let me know if anything else is needed.
The issue behind the above errors was how the jar file and classes were built. They need to be built in Eclipse as a new Maven project. I created my own pom file and Java program and built the project.
Once the jars and classes were built successfully, I ran the GiraphHelloWorld example following the same systematic approach as before. Also make sure to set HADOOP_CLASSPATH to the folder that contains the "classes" folder.
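For example (the path below is an assumption based on my layout, where the Maven build output lands in /home/cloudera/target; adjust it to your own build directory):

```shell
# Point HADOOP_CLASSPATH at the directory that contains the compiled
# "classes" folder produced by the Maven build (path is an example).
export HADOOP_CLASSPATH=/home/cloudera/target/classes
echo "$HADOOP_CLASSPATH"
```

After this, re-run the giraph command above from the same shell so the exported variable is visible to it.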
I have Jenkins on a CentOS server with only one job, called "HOMEPAGE".
I would like to run this job in parallel, which is why I set up 5 executors for the master node.
This "HOMEPAGE" job runs a Java program that creates some folders and files that I need to use, so I need to know the full path to the files created on the server during the build.
The problem starts when I run this job in parallel.
For the first build, Jenkins assigned the path /var/lib/jenkins/workspace/HOMEPAGE/ and executor_number=2.
For the second build: /var/lib/jenkins/workspace/HOMEPAGE#2/ and executor_number=4.
For the third: /var/lib/jenkins/workspace/HOMEPAGE#3/ and executor_number=1.
For the fourth: /var/lib/jenkins/workspace/HOMEPAGE#4/ and executor_number=3.
After execution I could see these folders on the server:
As you can see, the numbering of the HOMEPAGE folders does not match the executor_number variable in Jenkins.
How can I get information from Jenkins about where it is saving results for the current build, i.e. whether it is the HOMEPAGE#2 or the HOMEPAGE#4 folder? I need this information for my Java program.
Here are fragments from the Console Output:
First build:
<===[JENKINS REMOTING CAPACITY]===>channel started
Executing Maven: -B -f /var/lib/jenkins/workspace/HOMEPAGE/pom.xml -PHomepage -Djob_name=HOMEPAGE -Dexecutor_number=2
Third build:
<===[JENKINS REMOTING CAPACITY]===>channel started
Executing Maven: -B -f /var/lib/jenkins/workspace/HOMEPAGE#3/pom.xml -PHomepage -Djob_name=HOMEPAGE -Dexecutor_number=1
The environment variable 'WORKSPACE' will always contain the proper path, including any #<n> suffix.
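For instance, in an "Execute shell" build step you can hand the real workspace path straight to the Java program (the directory value and the "build.dir" property name below are illustrative, not Jenkins conventions):

```shell
# WORKSPACE is set by Jenkins for every build; when builds run
# concurrently it already carries the #<n> suffix.
# Simulated value for illustration only:
WORKSPACE="/var/lib/jenkins/workspace/HOMEPAGE#2"

# Pass the path to the Java program, e.g. as a system property.
# (echo is used here so the sketch prints the command instead of running it)
echo java -Dbuild.dir="$WORKSPACE" -jar homepage.jar
```

Inside the Java program the same value is also available via System.getenv("WORKSPACE").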
I am using Jenkins, and using a Github repo as Source Code.
In the Build section, I am executing this as a Windows Batch command:
set path=%path%;C:\Program Files\Java\jdk1.8.0_144\bin
cd \Users\harr\JenkinsServer\JenkinsTest\src
javac SimpleTest.java    (the error occurs right after this line executes)
java SimpleTest
I know it has something to do with the classpath, but I am unsure how to solve this problem in Jenkins.
Let me know if more information would be helpful.
Suppose you deploy the Jenkins server on a Linux platform; then you have to install the JDK, Tomcat, and so on, and set the environment PATH accordingly. After that you don't have to run "set path" before every build.
You can create a script, copy the commands into it, and have Jenkins execute that script when it performs the build task. Refer to the Jenkins tutorial to learn about it.
I want to build a JAR library for the OpenConnect client, but I couldn't find the configure file; there is only a configure.ac file, which seems to contain the source for the configure script. The contents of the README file didn't help me:
Description:
This directory contains a JNI interface layer for libopenconnect, and a
demo program to show how it can be used.
Build instructions:
From the top level, run:
./configure --with-java
make
cd java
ant
sudo java -Djava.library.path=../.libs -jar dist/example.jar <server_ip>
If ocproxy[1] is installed somewhere in your $PATH, this can be run as a
non-root user and it should be pingable from across the VPN.
Test/demo code is in src/com/example/
OpenConnect wrapper library is in src/org/infradead/libopenconnect/
[1] http://repo.or.cz/w/ocproxy.git
ant worked but ./configure didn't work.
Any suggestions?
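When a source checkout ships only configure.ac, the configure script has to be generated first with the GNU autotools. A sketch of the usual sequence, assuming autoconf, automake, and libtool are installed:

```shell
# Generate ./configure from configure.ac; -i installs any missing
# helper files (install-sh, ltmain.sh, etc.) into the tree.
autoreconf -i
./configure --with-java
make
```

After that, the "cd java && ant" steps from the README should work as documented.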
I'm trying to run the wordcount topology on Apache Storm from the command line on Ubuntu; it uses the multilang feature to split sentences into words with a program written in Python.
I've set the classpath of the multilang dir in the .bashrc file, but at execution time it still gives this error:
java.lang.RuntimeException: Error when launching multilang subprocess
Caused by: java.io.IOException: Cannot run program "python" (in directory "/tmp/eaf0b6b3-67c1-4f89-b3d8-23edada49b04/supervisor/stormdist/word-count-1-1414559082/resources"): error=2, No such file or directory
I found my answer. I was submitting the jar to Storm, but the cluster it contained was a LocalCluster, so the classpath did not take effect when the jar was uploaded to Storm. I modified the code to submit to the Storm cluster instead of the local cluster, and then it was uploaded successfully. Along with this, I also included the classpath of the multilang folder in the Eclipse IDE itself instead of setting it in the .bashrc file.
The Python installed on the system may have its own default path, such as /usr/bin or /usr/local/bin, and Python modules may live in different paths.
Do not fully override the $PATH environment variable in .bashrc.
Alternatively, you can set the execute bit on the Python script you would like to run, and call the script as a normal program in Storm.
I'm really new to Hadoop and not familiar with terminal commands.
I followed the steps to install Hadoop on my Mac and can run some of the bundled Hadoop examples. However, when I tried to run the WordCount example, it generated many errors, such as "org.apache cannot be resolved".
Posts online said you should put it where you write your Java code. I used to use Eclipse; however, in Eclipse there were so many errors that the project could not be compiled.
Any suggestions?
Thanks!
Assuming you have also followed the directions to start up a local cluster, or a pseudo-distributed cluster, then here is the easiest way.
Go to the Hadoop directory, which should be whatever directory is unzipped when you download the Hadoop library from Apache. From there you can run these commands to run Hadoop:
for Hadoop version 0.23.*
cd $HOME/path/to/hadoop-0.23.*
./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-0.23.5.jar wordcount myinput outputdir
for Hadoop version 0.20.*
cd $HOME/path/to/hadoop-0.20.*
./bin/hadoop jar hadoop-0.20.2-examples.jar wordcount myinput outputdir