I am trying to run the wordcount example that ships with Hadoop, in pseudo-distributed mode, on the following environment:
Windows 7
Hadoop 2.7.1
JDK 1.7.x
RAM 4 GB
The jps command returns
C:\deploy\hadoop-2.7.1>jps
2336 ResourceManager
7500 NameNode
4984 Jps
6900 NodeManager
4940 DataNode
The command I use for setting the hadoop heap size
set HADOOP_HEAPSIZE=512
The command I use from the hadoop home installation directory is
bin\yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input /output
I see the following stack trace
C:\deploy\hadoop-2.7.1>bin\yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input /output
15/08/14 22:36:26 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/08/14 22:36:27 INFO input.FileInputFormat: Total input paths to process : 1
15/08/14 22:36:28 INFO mapreduce.JobSubmitter: number of splits:1
15/08/14 22:36:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1439571873038_0001
15/08/14 22:36:28 INFO impl.YarnClientImpl: Submitted application application_1439571873038_0001
15/08/14 22:36:28 INFO mapreduce.Job: The url to track the job: http://XXX-PC:8088/proxy/application_1439571873038_0001/
15/08/14 22:36:28 INFO mapreduce.Job: Running job: job_1439571873038_0001
15/08/14 22:36:37 INFO mapreduce.Job: Job job_1439571873038_0001 running in uber mode : false
15/08/14 22:36:37 INFO mapreduce.Job: map 0% reduce 0%
15/08/14 22:36:37 INFO mapreduce.Job: Job job_1439571873038_0001 failed with state FAILED due to: Application application_1439571873038_0001 failed 2 times due to AM Container for appattempt_1439571873038_0001_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://XXX-PC:8088/cluster/app/application_1439571873038_0001Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1439571873038_0001_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Shell output: 1 file(s) moved.
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
15/08/14 22:36:37 INFO mapreduce.Job: Counters: 0
When I opened the stderr logs referenced in the output above, the actual error turned out to be:
Error: Could not create the Java Virtual Machine.
When I increase HADOOP_HEAPSIZE to 1024, the NameNode, DataNode and YARN daemons do not start at all and fail with the same "Could not create the Java Virtual Machine" error.
Has anyone run into the same problem? How can I solve it?
You can solve this problem by editing the file /etc/hadoop/hadoop-env.sh as shown below:
export HADOOP_HEAPSIZE=3072
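Before changing the heap size, it may help to check whether the requested heaps could fit in physical memory at all. The sketch below is only a rough budget check under stated assumptions: it assumes each of the four daemons listed by jps is started with the full HADOOP_HEAPSIZE, and it ignores JVM overhead, the operating system and the job's own containers. A maximum heap is a ceiling rather than an immediate allocation, so this is a conservative estimate. The function name is illustrative; the 4 GB figure comes from the question.

```python
def total_daemon_heap_mb(heap_mb, daemon_count):
    """Upper bound on requested heap if every daemon gets heap_mb (an assumption)."""
    return heap_mb * daemon_count


# Daemons reported by jps in the question (Jps itself is excluded).
daemons = ["ResourceManager", "NameNode", "NodeManager", "DataNode"]
physical_ram_mb = 4 * 1024  # 4 GB of RAM, as stated in the question

for heap_mb in (512, 1024, 3072):
    requested = total_daemon_heap_mb(heap_mb, len(daemons))
    verdict = "fits" if requested < physical_ram_mb else "does not fit"
    print(f"HADOOP_HEAPSIZE={heap_mb}: up to {requested} MB requested, "
          f"{verdict} in {physical_ram_mb} MB")
```

Under these assumptions only the 512 MB setting stays below physical RAM on a 4 GB machine, so a heap size that works on one machine may not work on another.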
Related
I tried this command in command prompt (run as administrator):
hadoop jar C:\Users\tejashri\Desktop\Hadoopproject\WordCount.jar WordcountDemo.WordCount /work /out
but my application stopped with the following error message:
2020-04-04 23:53:27,918 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2020-04-04 23:53:28,881 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2020-04-04 23:53:28,951 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/tejashri/.staging/job_1586024027199_0006
2020-04-04 23:53:29,162 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-04-04 23:53:29,396 INFO input.FileInputFormat: Total input files to process : 1
2020-04-04 23:53:29,570 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-04-04 23:53:29,762 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-04-04 23:53:29,802 INFO mapreduce.JobSubmitter: number of splits:1
2020-04-04 23:53:30,059 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2020-04-04 23:53:30,156 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1586024027199_0006
2020-04-04 23:53:30,156 INFO mapreduce.JobSubmitter: Executing with tokens: []
2020-04-04 23:53:30,504 INFO conf.Configuration: resource-types.xml not found
2020-04-04 23:53:30,507 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2020-04-04 23:53:30,586 INFO impl.YarnClientImpl: Submitted application application_1586024027199_0006
2020-04-04 23:53:30,638 INFO mapreduce.Job: The url to track the job: http://LAPTOP-2UBC7TG1:8088/proxy/application_1586024027199_0006/
2020-04-04 23:53:30,640 INFO mapreduce.Job: Running job: job_1586024027199_0006
2020-04-04 23:53:35,676 INFO mapreduce.Job: Job job_1586024027199_0006 running in uber mode : false
2020-04-04 23:53:35,679 INFO mapreduce.Job: map 0% reduce 0%
2020-04-04 23:53:35,698 INFO mapreduce.Job: Job job_1586024027199_0006 failed with state FAILED due to: Application application_1586024027199_0006 failed 2 times due to AM Container for appattempt_1586024027199_0006_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2020-04-04 23:53:34.955]Exception from container-launch.
Container id: container_1586024027199_0006_02_000001
Exit code: 1
Shell output: 1 file(s) moved.
"Setting up env variables"
"Setting up job resources"
"Copying debugging information"
C:\hadoop\hdfstmp\nm-local-dir\usercache\tejashri\appcache\application_1586024027199_0006\container_1586024027199_0006_02_000001>rem Creating copy of launch script
C:\hadoop\hdfstmp\nm-local-dir\usercache\tejashri\appcache\application_1586024027199_0006\container_1586024027199_0006_02_000001>copy "launch_container.cmd" "C:/hadoop/logs/userlogs/application_1586024027199_0006/container_1586024027199_0006_02_000001/launch_container.cmd"
1 file(s) copied.
C:\hadoop\hdfstmp\nm-local-dir\usercache\tejashri\appcache\application_1586024027199_0006\container_1586024027199_0006_02_000001>rem Determining directory contents
C:\hadoop\hdfstmp\nm-local-dir\usercache\tejashri\appcache\application_1586024027199_0006\container_1586024027199_0006_02_000001>dir 1>>"C:/hadoop/logs/userlogs/application_1586024027199_0006/container_1586024027199_0006_02_000001/directory.info"
"Launching container"
[2020-04-04 23:53:34.959]Container exited with a non-zero exit code 1. Last 4096 bytes of stderr :
'"C:\Program Files\Java\jdk1.8.0_171"' is not recognized as an internal or external command,
operable program or batch file.
[2020-04-04 23:53:34.960]Container exited with a non-zero exit code 1. Last 4096 bytes of stderr :
'"C:\Program Files\Java\jdk1.8.0_171"' is not recognized as an internal or external command,
operable program or batch file.
For more detailed output, check the application tracking page: http://LAPTOP-2UBC7TG1:8088/cluster/app/application_1586024027199_0006 Then click on links to logs of each attempt.
. Failing the application.
2020-04-04 23:53:35,743 INFO mapreduce.Job: Counters: 0
'"C:\Program Files\Java\jdk1.8.0_171"' is not recognized as an internal or external command,
operable program or batch file.
The JAVA_HOME variable is not set properly in hadoop-env.cmd, most likely because the path contains a space (C:\Program Files).
Move the JDK installation to a folder without whitespace (say, C:\Java\jdk1.8.0_171), update the JAVA_HOME and PATH environment variables to match, and add this line to hadoop-env.cmd:
set JAVA_HOME=C:\Java\jdk1.8.0_171
Then restart the Hadoop daemons and run the job again.
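A quick way to spot this kind of problem before starting the daemons is to inspect the JAVA_HOME value itself. The following is a minimal sketch, not part of Hadoop; the function name and the specific checks (whitespace, surrounding quotes) are assumptions based on the error message above.

```python
def java_home_problems(java_home):
    """Return a list of likely problems with a JAVA_HOME value (heuristic only)."""
    problems = []
    if not java_home:
        problems.append("JAVA_HOME is empty")
        return problems
    if any(ch.isspace() for ch in java_home):
        problems.append("path contains whitespace")
    if java_home.startswith('"') or java_home.endswith('"'):
        problems.append("path is wrapped in quotes")
    return problems


# The path from the error message, and the relocated path from the answer.
print(java_home_problems(r"C:\Program Files\Java\jdk1.8.0_171"))
print(java_home_problems(r"C:\Java\jdk1.8.0_171"))
```

An empty list means none of these particular checks fired; it does not guarantee that the JDK is actually installed at that location.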
I need help running a wordcount application in Hadoop. I work on a multi-node cluster and want to run it on my main (master) node.
I created an input folder and copied a .txt file into it; that works fine.
Now I need to run the MapReduce job, which I built using Eclipse.
Run with:
hadoop jar WordCount.jar WordCount /input /output
When I run it I get:
/input
/output
18/12/30 19:50:13 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/12/30 19:50:13 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
18/12/30 19:50:14 INFO input.FileInputFormat: Total input files to process : 1
18/12/30 19:50:14 INFO mapreduce.JobSubmitter: number of splits:1
18/12/30 19:50:14 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
18/12/30 19:50:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1546180801267_1648
18/12/30 19:50:15 INFO impl.YarnClientImpl: Submitted application application_1546180801267_1648
18/12/30 19:50:15 INFO mapreduce.Job: The url to track the job: http://ec2-3-82-16-179.compute-1.amazonaws.com:8088/proxy/application_1546180801267_1648/
18/12/30 19:50:15 INFO mapreduce.Job: Running job: job_1546180801267_1648
I have followed a lot of tutorials. In every one of them the same INFO lines appear and then the map and reduce phases run, but in my case they never start.
I attached my .profile, yarn-site.xml, mapred-site.xml, hdfs-site.xml and core-site.xml (the attachments are not included here).
Please help me with this. I have been trying to solve it for two days without success.
Thanks!
I have been trying to import data from a MySQL database into Hive using Sqoop. The table has been created, and I have set the fetch size as low as 10. Every time I run the command, I get a Java heap space error and the job is killed after 4 attempts. How can I fix this?
My Sqoop command is as follows:
sqoop import --connect jdbc:mysql://my_local_ip/mydatabase --fetch-size 10 --username root -P --table table_name --hive-import --compression-codec=snappy --as-parquetfile -m 1
and I get:
16/08/29 07:06:24 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1472465929944_0013/
16/08/29 07:06:24 INFO mapreduce.Job: Running job: job_1472465929944_0013
16/08/29 07:06:47 INFO mapreduce.Job: Job job_1472465929944_0013 running in uber mode : false
16/08/29 07:06:47 INFO mapreduce.Job: map 0% reduce 0%
16/08/29 07:07:16 INFO mapreduce.Job: Task Id : attempt_1472465929944_0013_m_000000_0, Status : FAILED
Error: Java heap space
16/08/29 07:07:37 INFO mapreduce.Job: Task Id : attempt_1472465929944_0013_m_000000_1, Status : FAILED
Error: Java heap space
16/08/29 07:07:59 INFO mapreduce.Job: Task Id : attempt_1472465929944_0013_m_000000_2, Status : FAILED
Error: Java heap space
16/08/29 07:08:21 INFO mapreduce.Job: map 100% reduce 0%
16/08/29 07:08:23 INFO mapreduce.Job: Job job_1472465929944_0013 failed with state FAILED due to: Task failed task_1472465929944_0013_m_000000
Try the following:
sqoop import -Dmapreduce.map.memory.mb=1024 -Dmapreduce.map.java.opts=-Xmx7200m -Dmapreduce.task.io.sort.mb=2400 --connect jdbc:mysql://local.ip/database_name --username root -P --hive-import --table table_name --as-parquetfile --warehouse-dir=/home/cloudera/hadoop --split-by 'id' -m 100
Initially I was using 10 mappers to process 10 million records, so each chunk held about 1 million records. That caused the error; when I ran 100 mappers instead, the data was processed successfully. The only drawback I noticed was the run time: it took almost 1 hour to run all 100 map tasks.
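The arithmetic behind that change can be sketched as follows. The row counts and mapper counts come from the answer above, and the 1024 MB container and 7200 MB heap are the values in the posted command. The 0.8 heap-to-container ratio is a commonly cited rule of thumb and an assumption here, not something the answer states; the function names are illustrative. Note that the command as posted requests a heap (-Xmx7200m) larger than its container (mapreduce.map.memory.mb=1024); whether that causes trouble depends on how the cluster enforces memory limits.

```python
def rows_per_mapper(total_rows, mappers):
    """Rows each mapper handles if the split column is evenly distributed."""
    return -(-total_rows // mappers)  # ceiling division


def suggested_heap_mb(container_mb, fraction=0.8):
    """Heap sized as a fraction of the container (rule of thumb, an assumption)."""
    return int(container_mb * fraction)


total_rows = 10_000_000
for mappers in (10, 100):
    print(f"{mappers} mappers: about {rows_per_mapper(total_rows, mappers)} rows each")

container_mb = 1024  # -Dmapreduce.map.memory.mb from the command above
xmx_mb = 7200        # -Xmx from -Dmapreduce.map.java.opts in the command above
print("heap fits inside container:", xmx_mb <= container_mb)
print("heap suggested by the 0.8 rule of thumb:", suggested_heap_mb(container_mb), "MB")
```

More mappers shrink the chunk each task must hold in memory, which matches the observation that 100 mappers succeeded where 10 failed, at the cost of a longer total run time.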
I am new to the Hadoop world. I have set up a Hadoop cluster on Linux and I am trying to run a MapReduce job from Windows.
The wordcount example works fine, but when I try to run another job that I wrote myself, I get this error:
16/07/05 16:04:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/07/05 16:04:42 INFO client.RMProxy: Connecting to ResourceManager at /myIP:8050
16/07/05 16:04:42 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/07/05 16:04:42 WARN mapreduce.JobResourceUploader: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
16/07/05 16:04:42 INFO input.FileInputFormat: Total input paths to process : 3
16/07/05 16:04:42 INFO mapreduce.JobSubmitter: number of splits:3
16/07/05 16:04:43 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1467727275250_0002
16/07/05 16:04:43 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
16/07/05 16:04:43 INFO impl.YarnClientImpl: Submitted application application_1467727275250_0002
16/07/05 16:04:43 INFO mapreduce.Job: The url to track the job: http://hadoopMaster:8088/proxy/application_1467727275250_0002/
16/07/05 16:04:43 INFO mapreduce.Job: Running job: job_1467727275250_0002
16/07/05 16:04:46 INFO mapreduce.Job: Job job_1467727275250_0002 running in uber mode : false
16/07/05 16:04:46 INFO mapreduce.Job: map 0% reduce 0%
16/07/05 16:04:46 INFO mapreduce.Job: Job job_1467727275250_0002 failed with state FAILED due to: Application application_1467727275250_0002 failed 2 times due to AM Container for appattempt_1467727275250_0002_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://hadoopMaster:8088/cluster/app/application_1467727275250_0002Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1467727275250_0002_02_000001
Exit code: 1
Exception message: /bin/bash: line 0: fg: no job control in this shell.
Stack trace: ExitCodeException exitCode=1: /bin/bash: line 0: fg: no Job control in this Shell.
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
16/07/05 16:04:46 INFO mapreduce.Job: Counters: 0
Can someone help me please?
Add the following properties to mapred-site.xml:
<property>
<name>mapred.remote.os</name>
<value>Linux</value>
<description>Remote MapReduce framework's OS, can be either Linux or Windows</description>
</property>
<property>
<name>mapreduce.app-submission.cross-platform</name>
<value>true</value>
</property>
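To confirm that the properties were actually picked up from the file being edited, the configuration can be parsed and inspected. This is a minimal sketch using Python's standard XML parser; the function name is hypothetical, and the sample string stands in for the real mapred-site.xml, wrapped in the <configuration> root element that Hadoop configuration files use.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Stand-in for the contents of mapred-site.xml (illustrative only).
sample = """<configuration>
  <property>
    <name>mapred.remote.os</name>
    <value>Linux</value>
  </property>
  <property>
    <name>mapreduce.app-submission.cross-platform</name>
    <value>true</value>
  </property>
</configuration>"""

props = read_properties(sample)
print("cross-platform submission enabled:",
      props.get("mapreduce.app-submission.cross-platform") == "true")
```

To check a real file, read its text and pass it to read_properties instead of the sample string.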
I have installed the Hadoop 2.7.1 stable version, following Tom White's book for installation in pseudo-distributed mode. I set all the environment variables (JAVA_HOME, HADOOP_HOME, PATH, etc.) and configured yarn-site.xml, hdfs-site.xml, core-site.xml and mapred-site.xml.
I copied the sample file file.txt using the following command:
$hadoop fs -copyFromLocal textFiles/file.txt file.txt
A directory listing afterwards shows:
Found 2 items
-rw-r--r-- 1 RAMA supergroup 3737 2015-12-27 21:52 file.txt
drwxr-xr-x - RAMA supergroup 0 2015-12-27 22:17 input
When I run the wordcount program in hadoop-mapreduce-examples-2.7.1.jar using the command below,
RAMAs-MacBook-Pro:hadoop-2.7.1 RAMA$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount file.txt output
it throws the following exception, for which I have not been able to find a solution.
15/12/27 22:41:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/12/27 22:41:53 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
15/12/27 22:41:53 INFO input.FileInputFormat: Total input paths to process : 1
15/12/27 22:41:53 INFO mapreduce.JobSubmitter: number of splits:1
15/12/27 22:41:53 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1451216397139_0020
15/12/27 22:41:54 INFO impl.YarnClientImpl: Submitted application application_1451216397139_0020
15/12/27 22:41:54 INFO mapreduce.Job: The url to track the job: http://192.168.1.6:8088/proxy/application_1451216397139_0020/
15/12/27 22:41:54 INFO mapreduce.Job: Running job: job_1451216397139_0020
15/12/27 22:41:57 INFO mapreduce.Job: Job job_1451216397139_0020 running in uber mode : false
15/12/27 22:41:57 INFO mapreduce.Job: map 0% reduce 0%
15/12/27 22:41:57 INFO mapreduce.Job: Job job_1451216397139_0020 failed with state FAILED due to: Application application_1451216397139_0020 failed 2 times due to AM Container for appattempt_1451216397139_0020_000002 exited with exitCode: 127
For more detailed output, check application tracking page:http://192.168.1.6:8088/cluster/app/application_1451216397139_0020Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1451216397139_0020_02_000001
Exit code: 127
Stack trace: ExitCodeException exitCode=127:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 127
Failing this attempt. Failing the application.
15/12/27 22:41:57 INFO mapreduce.Job: Counters: 0
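Any suggestions or comments would be a great help.
In hadoop-env.sh, explicitly set the JAVA_HOME variable to the Java home path. Exit code 127 is the shell's "command not found" status, which is consistent with the container launch script being unable to locate java.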
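Whether a given JAVA_HOME value actually points at a usable JDK can be checked before editing hadoop-env.sh. The sketch below is an illustration, not part of Hadoop: the function name is hypothetical, and it only tests that a java launcher file exists under the bin directory. The temporary directory merely stands in for a JDK so the example runs anywhere.

```python
import os
import tempfile


def find_java(java_home):
    """Return the path to the java launcher under java_home, or None."""
    if not java_home:
        return None
    for name in ("java", "java.exe"):
        candidate = os.path.join(java_home, "bin", name)
        if os.path.isfile(candidate):
            return candidate
    return None


# Demonstration with a throwaway directory standing in for a JDK.
with tempfile.TemporaryDirectory() as fake_jdk:
    os.makedirs(os.path.join(fake_jdk, "bin"))
    open(os.path.join(fake_jdk, "bin", "java"), "w").close()
    found_in_fake_jdk = find_java(fake_jdk) is not None

print("launcher found in stand-in JDK:", found_in_fake_jdk)
print("launcher for current JAVA_HOME:", find_java(os.environ.get("JAVA_HOME", "")))
```

If the second line prints None, the JAVA_HOME seen by the current shell is unset or does not contain bin/java, and the daemons launched from that shell would likely hit the same problem.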