I'm trying to run the WordCount topology on Apache Storm from the command line in Ubuntu. It uses Storm's multilang feature to split sentences into words with a program written in Python.
I've added the multilang directory to the classpath in my .bashrc file, but at execution time it still fails with this error:
java.lang.RuntimeException: Error when launching multilang subprocess
Caused by: java.io.IOException: Cannot run program "python" (in directory "/tmp/eaf0b6b3-67c1-4f89-b3d8-23edada49b04/supervisor/stormdist/word-count-1-1414559082/resources"): error=2, No such file or directory
I found my answer. I was submitting the jar to Storm, but the topology was configured to run on a LocalCluster, so the classpath was not applied when the jar was uploaded. I modified the code to submit to the Storm cluster instead of the local one, and the upload then succeeded. Along with this, I also added the multilang folder to the classpath in the Eclipse IDE itself instead of setting it in the .bashrc file.
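For reference, a minimal sketch of the working submission, assuming the code now calls StormSubmitter.submitTopology instead of running a LocalCluster (the jar, class, and topology names here are placeholders):

# Submit the packaged topology to the cluster; the supervisor extracts the
# jar, so the multilang scripts under resources/ ship with it automatically.
storm jar wordcount-topology.jar com.example.WordCountTopology word-count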
The Python installed on the system has a default path, such as /usr/bin or /usr/local/bin; Python modules may live in other locations.
Do not fully override the $PATH environment variable in .bashrc; append to it instead, or the launcher will no longer find python.
Alternatively, set the execute bit on the Python script you would like to run, and call the script as a normal program in Storm.
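A minimal sketch of both suggestions (the extra PATH entry and the script name are placeholders):

# In .bashrc: append to PATH instead of overwriting it, so "python"
# stays resolvable when Storm launches the multilang subprocess.
export PATH="$PATH:/opt/extra/bin"

# Or make the script itself executable; it then needs a shebang line,
# e.g. #!/usr/bin/env python, as its first line.
chmod +x multilang/resources/splitsentence.py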
I'm trying to install and start Hadoop 2.7.1 on my computer (Windows 10) from the command line, and I have followed steps from different websites. I have configured the system variables and Hadoop (edited some files in the etc folder: hadoop-env.cmd, core-site.xml, mapred-site.xml, yarn-site.xml, hdfs-site.xml) and downloaded a new bin folder. I'm currently trying to start Hadoop, and I have executed the command hdfs namenode -format successfully.
However, when pointing the command prompt at the sbin folder and trying to execute start-dfs.cmd, I get an error message telling me: The system cannot find the file hadoop. Does anyone have an idea of what I should do or have done wrong?
Set the Hadoop home and path variables in the environment variables.
Rename the file hadoop to hadoop.cmd in the bin folder,
then run start-all in cmd and check whether it works.
Also check the JAVA_HOME path in the environment variables.
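For example, a hedged sketch from a command prompt (the install and JDK paths are assumptions; adjust them to your machine):

:: Persist the variables for new shells; keep JAVA_HOME free of spaces,
:: which Hadoop's .cmd scripts handle poorly.
setx HADOOP_HOME "C:\hadoop-2.7.1"
setx JAVA_HOME "C:\Java\jdk1.8.0_101"
setx PATH "%PATH%;C:\hadoop-2.7.1\bin;C:\hadoop-2.7.1\sbin"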
Old question, but for others starting out, this is how I fixed this problem. I will assume you have followed:
https://github.com/MuhammadBilalYar/Hadoop-On-Window/wiki/Step-by-step-Hadoop-2.8.0-installation-on-Window-10
and are having trouble.
1. Open start-all.cmd in C:\hadoop-2.8.0\sbin in a text editor like Notepad++.
2. Replace line 24 with: set HADOOP_BIN_PATH=C:\hadoop-2.8.0\bin
3. In this file, note the calls to hadoop-config.cmd, start-dfs.cmd, and start-yarn.cmd. Open these in a text editor.
4. Replace the Hadoop path in each as in step 2: set HADOOP_BIN_PATH=C:\hadoop-2.8.0\bin
5. Save the files and re-run start-all.cmd.
Hope this helps.
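For reference, the edited line should end up looking like this in each of the scripts mentioned above (the exact line number and install path may differ between releases):

@rem in start-all.cmd, hadoop-config.cmd, start-dfs.cmd, start-yarn.cmd
set HADOOP_BIN_PATH=C:\hadoop-2.8.0\bin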
I am following this tutorial to set up Hadoop on my Windows 10 machine. I successfully followed the tutorial up to page 5, and after setting everything up I tried to check the Hadoop version on the command line. But it returns the following error:
D:\hadoop-2.6.0\bin>hadoop version
Error: Could not find or load main class Azfar
Instead, if I use the following command, it somehow works:
D:\hadoop-2.6.0\bin>hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
fs run a generic filesystem user client
version print the version
jar <jar> run a jar file
checknative [-a|-h] check native hadoop and compression libraries availability
distcp <srcurl> <desturl> copy file or directories recursively
archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
classpath prints the class path needed to get the
Hadoop jar and the required libraries
credential interact with credential providers
key manage keys via the KeyProvider
daemonlog get/set the log level for each daemon
or
CLASSNAME run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
I think it is doing this because my username "Azfar Faizan" contains a space, but I didn't use any path that includes my username folder during setup. Can anybody tell me where exactly the problem is, or am I doing it totally wrong?
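The space theory matches the error exactly: if an unquoted "Azfar Faizan" ends up on the java command line somewhere inside Hadoop's .cmd scripts, java takes the first stray token as the main class. A hypothetical reproduction from any command prompt:

:: java parses "Azfar" as the class name and "Faizan" as its argument
java Azfar Faizan
:: -> Error: Could not find or load main class Azfar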
I am able to get localhost:16010 running, but somehow the HBase shell does not launch when I use:
01HW993798:bin tcssig$ cd /Users/tcssig/Downloads/hbase-1.0.3/bin
01HW993798:bin tcssig$ hbase shell
-bash: hbase: command not found
When I directly launch the HBase Unix executable, it generates the error log below.
Error: JAVA_HOME is not set
Although I have set it. Even with this error, localhost:16010 is running.
NOTE: I know there is a similar question, but no relevant answers are present there.
Using this, I am able to invoke the command, but now it gives this error:
./hbase: line 403: /Users/tcssig/Downloads/hbase-1.0.3/bin/JAVA_HOME:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/bin/java: No such file or directory
Although I have the java binary there.
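The doubled path in that error suggests JAVA_HOME was set to more than the JDK home. It should contain only the home directory, not .../bin/java; a minimal sketch for ~/.bash_profile or conf/hbase-env.sh, reusing the JDK path from the error above:

# JAVA_HOME must point at the JDK home directory itself
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home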
Your hbase invocation should be like this:
cd /Users/tcssig/Downloads/hbase-1.0.3/bin
./hbase shell [Note the ./]
When you just type hbase shell, Linux searches for the hbase executable in the directories listed in the PATH environment variable. Since the bin directory above is not included, it errors out.
Alternatively, you can update your PATH variable. The exact command may vary by Linux distribution, but it should be something like:
export PATH=/Users/tcssig/Downloads/hbase-1.0.3/bin:$PATH
Put this command in your .bashrc or .bash_profile and then source that file. That way the bin directory is included in PATH and the hbase command becomes available.
Go into the $HBASE_HOME/bin directory and try:
./hbase shell
I am trying to run Hadoop in standalone mode, have set up all the correct configuration files, and have successfully run the WordCount example. The problem arises when I try to organize my source code and jar files into a directory hierarchy to keep things a little tidier.
hadoop --config ~/myconfig jar ~/MYPROGRAMSRC/WordCount.jar MYPROGRAMSRC.WordCount ~/wordCountInput/allData ~/wordCountOutput
I use the above command to invoke Hadoop from a script file in my home directory. It fails to find the WordCount class one level below, in the MYPROGRAMSRC directory.
The ~/MYPROGRAMSRC directory contains these files:
WordCount.jar, WordCount.java, WordCount.class, WordCount$Map.class, and WordCount$Reduce.class.
But why is Hadoop throwing a ClassNotFoundException:
Exception in thread "main" java.lang.ClassNotFoundException: MYPROGRAMSRC.WordCount
I know my program runs, because if I move the script file into the same directory as the WordCount.class file and run the following command:
hadoop --config ~/myconfig jar WordCount.jar WordCount ~/wordCountInput/allData ~/wordCountOutput
It runs fine.
Try
hadoop --config ~/myconfig jar ~/MYPROGRAMSRC/WordCount.jar WordCount ~/wordCountInput/allData ~/wordCountOutput
MYPROGRAMSRC.WordCount makes no sense if MYPROGRAMSRC is a directory rather than a Java package.
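For contrast, a qualified name would only be needed if the source declared a package; a hypothetical example (the package name myprog is made up):

# If WordCount.java began with "package myprog;", the jar would need the
# entry myprog/WordCount.class, and the class argument would be qualified:
hadoop --config ~/myconfig jar ~/MYPROGRAMSRC/WordCount.jar myprog.WordCount ~/wordCountInput/allData ~/wordCountOutput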
I have class files loaded onto the Hadoop file system, and I have also loaded the input file into HDFS.
When I run the class file through the hadoop command in a terminal, I get a class-not-found error.
For example, my HDFS contents are:
WordCount.class
WordCountMapper.class
WordCountReducer.class
SampleInput.txt
Can someone tell me where I am going wrong, or whether this can be done at all?
Below is the command line we use daily for running a Java MapReduce job on our 4-node Hadoop 2.2.0 cluster, and it works fine. We run it from the namenode, but any machine in the cluster should work.
hadoop jar ~/..path../mr_orchestrate/target/mr-orchestrate-1.0.jar com.rr.ap.orchestrate.MROrchestrate /user/hduser/in/Sample_15Feb2014.txt /user/hduser/out/out15Feb2014
You may need the "-libjars" option to add other library paths.
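A minimal sketch tying this back to the question (file names echo the question; the -libjars form assumes the main class runs through ToolRunner/GenericOptionsParser, as the standard examples do):

# hadoop jar needs a jar on the local filesystem; class files sitting in
# HDFS cannot be launched directly, so package them locally first.
jar cf WordCount.jar WordCount*.class
hadoop jar WordCount.jar WordCount /user/hduser/in/SampleInput.txt /user/hduser/out
# Extra dependency jars go on -libjars, after the class name:
# hadoop jar WordCount.jar WordCount -libjars /local/dep.jar <in> <out>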