Hadoop error on executing job - java

I tried to run an example and got the following output:
12/06/30 12:27:39 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
12/06/30 12:27:39 INFO input.FileInputFormat: Total input paths to process : 7
12/06/30 12:27:40 INFO mapred.JobClient: Running job: job_local_0001
12/06/30 12:27:40 INFO input.FileInputFormat: Total input paths to process : 7
12/06/30 12:27:40 INFO mapred.MapTask: io.sort.mb = 100
12/06/30 12:27:41 INFO mapred.MapTask: data buffer = 79691776/99614720
12/06/30 12:27:41 INFO mapred.MapTask: record buffer = 262144/327680
12/06/30 12:27:41 INFO mapred.JobClient: map 0% reduce 0%
12/06/30 12:27:41 INFO mapred.MapTask: Starting flush of map output
12/06/30 12:27:41 WARN mapred.LocalJobRunner: job_local_0001
java.io.IOException: Expecting a line not the end of stream
at org.apache.hadoop.fs.DF.parseExecResult(DF.java:109)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:179)
at org.apache.hadoop.util.Shell.run(Shell.java:134)
at org.apache.hadoop.fs.DF.getAvailable(DF.java:73)
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:329)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1221)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1129)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:549)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:623)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
12/06/30 12:27:42 INFO mapred.JobClient: Job complete: job_local_0001
12/06/30 12:27:42 INFO mapred.JobClient: Counters: 0
Does anyone know why I get this error? Hadoop version is 0.20.2.

Apparently you need to have the df command available on the machine running Eclipse as well. In my case I had two Ubuntu VMs (acting as master and slave) and was running Eclipse with the Hadoop plugin from Windows. After installing Cygwin and adding it to the PATH, the error no longer appears.
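A quick way to check the prerequisite (a sketch; run it from the machine that submits the job, e.g. the Cygwin shell on Windows): Hadoop's org.apache.hadoop.fs.DF helper shells out to df to measure free disk space, so the command below should print a header line plus at least one data line; if it does not, you get exactly the "Expecting a line not the end of stream" error above.
# must print a header plus one data line for the current directory
df -k .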

Related

Apache Nutch Indexer Plugin to Manticore Search Exception: java.lang.NoClassDefFoundError: com/manticoresearch/client/ApiException

I have created an Apache Nutch indexer plugin to push data to Manticore Search using the Manticore Search Java API.
The build is successful and all the crawling steps before indexing succeed (inject, generate, fetch, parse, updatedb).
When I run the indexing command bin/nutch index /root/nutch_source/crawl/crawldb/ -linkdb /root/nutch_source/crawl/linkdb/ -dir /root/nutch_source/crawl/segments/ -filter -normalize -deleteGone, it fails and logs/hadoop.log includes the following stack trace.
I am running Nutch in a Docker container; the Nutch version in the image is 1.19.
2021-09-07 10:15:46,040 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-09-07 10:16:23,666 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-09-07 10:17:36,020 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-09-07 10:17:36,378 INFO segment.SegmentChecker - Segment dir is complete: file:/root/nutch_source/crawl/segments/20210906001900.
2021-09-07 10:17:36,383 INFO segment.SegmentChecker - Segment dir is complete: file:/root/nutch_source/crawl/segments/20210906001655.
2021-09-07 10:17:36,387 INFO segment.SegmentChecker - Segment dir is complete: file:/root/nutch_source/crawl/segments/20210906002358.
2021-09-07 10:17:36,391 INFO indexer.IndexingJob - Indexer: starting at 2021-09-07 10:17:36
2021-09-07 10:17:36,401 INFO indexer.IndexingJob - Indexer: deleting gone documents: true
2021-09-07 10:17:36,402 INFO indexer.IndexingJob - Indexer: URL filtering: true
2021-09-07 10:17:36,402 INFO indexer.IndexingJob - Indexer: URL normalizing: true
2021-09-07 10:17:36,403 INFO indexer.IndexerMapReduce - IndexerMapReduce: crawldb: /root/nutch_source/crawl/crawldb
2021-09-07 10:17:36,407 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: file:/root/nutch_source/crawl/segments/20210906001900
2021-09-07 10:17:36,408 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: file:/root/nutch_source/crawl/segments/20210906001655
2021-09-07 10:17:36,410 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: file:/root/nutch_source/crawl/segments/20210906002358
2021-09-07 10:17:36,411 INFO indexer.IndexerMapReduce - IndexerMapReduce: linkdb: /root/nutch_source/crawl/linkdb
2021-09-07 10:17:36,528 WARN impl.MetricsConfig - Cannot locate configuration: tried hadoop-metrics2-jobtracker.properties,hadoop-metrics2.properties
2021-09-07 10:17:37,708 INFO mapreduce.Job - The url to track the job: http://localhost:8080/
2021-09-07 10:17:37,711 INFO mapreduce.Job - Running job: job_local250243852_0001
2021-09-07 10:17:38,724 INFO mapreduce.Job - Job job_local250243852_0001 running in uber mode : false
2021-09-07 10:17:38,725 INFO mapreduce.Job - map 0% reduce 0%
2021-09-07 10:17:39,731 INFO mapreduce.Job - map 100% reduce 0%
2021-09-07 10:17:47,677 WARN impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-09-07 10:17:47,992 INFO indexer.IndexWriters - Index writer org.apache.nutch.indexwriter.manticore.ManticoreIndexWriter identified.
2021-09-07 10:17:48,013 WARN mapred.LocalJobRunner - job_local250243852_0001
java.lang.Exception: java.lang.NoClassDefFoundError: com/manticoresearch/client/ApiException
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:559)
Caused by: java.lang.NoClassDefFoundError: com/manticoresearch/client/ApiException
at java.base/java.lang.Class.getDeclaredConstructors0(Native Method)
at java.base/java.lang.Class.privateGetDeclaredConstructors(Class.java:3137)
at java.base/java.lang.Class.getConstructor0(Class.java:3342)
at java.base/java.lang.Class.getConstructor(Class.java:2151)
at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:170)
at org.apache.nutch.indexer.IndexWriters.<init>(IndexWriters.java:97)
at org.apache.nutch.indexer.IndexWriters.lambda$get$0(IndexWriters.java:60)
at java.base/java.util.Map.computeIfAbsent(Map.java:1003)
at org.apache.nutch.indexer.IndexWriters.get(IndexWriters.java:60)
at org.apache.nutch.indexer.IndexerOutputFormat.getRecordWriter(IndexerOutputFormat.java:41)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:542)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:615)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:347)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.ClassNotFoundException: com.manticoresearch.client.ApiException
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
at org.apache.nutch.plugin.PluginClassLoader.loadClassFromSystem(PluginClassLoader.java:105)
at org.apache.nutch.plugin.PluginClassLoader.loadClassFromParent(PluginClassLoader.java:93)
at org.apache.nutch.plugin.PluginClassLoader.loadClass(PluginClassLoader.java:73)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
... 19 more
2021-09-07 10:17:48,742 INFO mapreduce.Job - Job job_local250243852_0001 failed with state FAILED due to: NA
2021-09-07 10:17:48,773 INFO mapreduce.Job - Counters: 30
File System Counters
FILE: Number of bytes read=157397439
FILE: Number of bytes written=332518016
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=51223
Map output records=51223
Map output bytes=24049558
Map output materialized bytes=24158915
Input split bytes=2010
Combine input records=0
Combine output records=0
Reduce input groups=0
Reduce shuffle bytes=24158915
Reduce input records=0
Reduce output records=0
Spilled Records=51223
Shuffled Maps =14
Failed Shuffles=0
Merged Map outputs=14
GC time elapsed (ms)=125
Total committed heap usage (bytes)=5221908480
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=11426452
File Output Format Counters
Bytes Written=0
2021-09-07 10:17:48,774 ERROR indexer.IndexingJob - Indexing job did not succeed, job status:FAILED, reason: NA
2021-09-07 10:17:48,776 ERROR indexer.IndexingJob - Indexer: java.lang.RuntimeException: Indexing job did not succeed, job status:FAILED, reason: NA
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:152)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:293)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:302)
I was able to resolve this issue by adding all of the Manticore Search dependencies to the plugin manifest (the plugin.xml file inside the plugin folder).
I found all the dependent JAR libraries listed in the folder runtime/local/plugins/<plugin-name>/ and included each name under the <runtime> tag of plugin.xml.
After rebuilding, the indexer worked!
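For illustration, a sketch of what the <runtime> section of plugin.xml might look like after this change (the JAR file names below are hypothetical placeholders; use the exact names found in runtime/local/plugins/<plugin-name>/):
<runtime>
  <!-- the plugin's own JAR, exported to Nutch -->
  <library name="indexer-manticore.jar">
    <export name="*"/>
  </library>
  <!-- one <library> entry per dependent JAR shipped in the plugin folder -->
  <library name="manticoresearch-java-x.y.z.jar"/>
  <library name="jackson-databind-x.y.z.jar"/>
</runtime>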

Error when running mapreduce job from windows client

I am new to the Hadoop world. I have set up a Hadoop cluster on Linux and I am trying to run a MapReduce job from Windows.
The wordcount example works fine, but when I try to run another job that I wrote myself, I get this error:
16/07/05 16:04:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/07/05 16:04:42 INFO client.RMProxy: Connecting to ResourceManager at /myIP:8050
16/07/05 16:04:42 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/07/05 16:04:42 WARN mapreduce.JobResourceUploader: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
16/07/05 16:04:42 INFO input.FileInputFormat: Total input paths to process : 3
16/07/05 16:04:42 INFO mapreduce.JobSubmitter: number of splits:3
16/07/05 16:04:43 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1467727275250_0002
16/07/05 16:04:43 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources.
16/07/05 16:04:43 INFO impl.YarnClientImpl: Submitted application application_1467727275250_0002
16/07/05 16:04:43 INFO mapreduce.Job: The url to track the job: http://hadoopMaster:8088/proxy/application_1467727275250_0002/
16/07/05 16:04:43 INFO mapreduce.Job: Running job: job_1467727275250_0002
16/07/05 16:04:46 INFO mapreduce.Job: Job job_1467727275250_0002 running in uber mode : false
16/07/05 16:04:46 INFO mapreduce.Job: map 0% reduce 0%
16/07/05 16:04:46 INFO mapreduce.Job: Job job_1467727275250_0002 failed with state FAILED due to: Application application_1467727275250_0002 failed 2 times due to AM Container for appattempt_1467727275250_0002_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://hadoopMaster:8088/cluster/app/application_1467727275250_0002Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1467727275250_0002_02_000001
Exit code: 1
Exception message: /bin/bash: Zeile 0: fg: Keine Job-Steuerung in dieser Shell.
Stack trace: ExitCodeException exitCode=1: /bin/bash: line 0: fg: no Job control in this Shell.
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
16/07/05 16:04:46 INFO mapreduce.Job: Counters: 0
Can someone help me please?
Add the following properties to mapred-site.xml:
<property>
  <name>mapred.remote.os</name>
  <value>Linux</value>
  <description>Remote MapReduce framework's OS, can be either Linux or Windows</description>
</property>
<property>
  <name>mapreduce.app-submission.cross-platform</name>
  <value>true</value>
</property>
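The same settings can also be applied programmatically in the job driver. Below is a minimal sketch (the job name and jar path are placeholders), which also addresses the "No job jar file set" warning in the log above:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Windows-side driver fragment: enable cross-platform submission and point
// Hadoop at the job jar explicitly.
Configuration conf = new Configuration();
conf.set("mapreduce.app-submission.cross-platform", "true"); // same as the XML property above
Job job = Job.getInstance(conf, "my-job");                   // "my-job" is a placeholder name
job.setJar("C:/path/to/my-job.jar");                         // fixes "No job jar file set"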

Hadoop wordcount Pseudodistributed mode error Exit code:127

I have installed the Hadoop 2.7.1 stable version. I followed Tom White's book for installation in pseudo-distributed mode. I set all environment variables like JAVA_HOME, HADOOP_HOME, PATH etc. I configured yarn-site.xml, hdfs-site.xml, core-site.xml and mapred-site.xml.
I copied the sample file file.txt using the following command:
$hadoop fs -copyFromLocal textFiles/file.txt file.txt
which shows me
Found 2 items
-rw-r--r-- 1 RAMA supergroup 3737 2015-12-27 21:52 file.txt
drwxr-xr-x - RAMA supergroup 0 2015-12-27 22:17 input
When I execute the wordcount program in hadoop-mapreduce-examples-2.7.1.jar using the command below
RAMAs-MacBook-Pro:hadoop-2.7.1 RAMA$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount file.txt output
it throws the following exception, for which I have not been able to find any feasible solution.
15/12/27 22:41:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/12/27 22:41:53 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
15/12/27 22:41:53 INFO input.FileInputFormat: Total input paths to process : 1
15/12/27 22:41:53 INFO mapreduce.JobSubmitter: number of splits:1
15/12/27 22:41:53 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1451216397139_0020
15/12/27 22:41:54 INFO impl.YarnClientImpl: Submitted application application_1451216397139_0020
15/12/27 22:41:54 INFO mapreduce.Job: The url to track the job: http://192.168.1.6:8088/proxy/application_1451216397139_0020/
15/12/27 22:41:54 INFO mapreduce.Job: Running job: job_1451216397139_0020
15/12/27 22:41:57 INFO mapreduce.Job: Job job_1451216397139_0020 running in uber mode : false
15/12/27 22:41:57 INFO mapreduce.Job: map 0% reduce 0%
15/12/27 22:41:57 INFO mapreduce.Job: Job job_1451216397139_0020 failed with state FAILED due to: Application application_1451216397139_0020 failed 2 times due to AM Container for appattempt_1451216397139_0020_000002 exited with exitCode: 127
For more detailed output, check application tracking page:http://192.168.1.6:8088/cluster/app/application_1451216397139_0020Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1451216397139_0020_02_000001
Exit code: 127
Stack trace: ExitCodeException exitCode=127:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 127
Failing this attempt. Failing the application.
15/12/27 22:41:57 INFO mapreduce.Job: Counters: 0
Any suggestions or comments would be of great help...
In hadoop-env.sh, explicitly set the JAVA_HOME variable to the Java home path.
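For example, a minimal sketch (the asker is on a Mac, so the macOS java_home helper is shown; substitute the absolute JDK path on other platforms):
# In <HADOOP_HOME>/etc/hadoop/hadoop-env.sh, replace the indirect
# "export JAVA_HOME=${JAVA_HOME}" line with an explicit path:
export JAVA_HOME=$(/usr/libexec/java_home)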

Error: Could not create the Java Virtual Machine. - Apache Hadoop

I am trying to run the wordcount example already provided in Hadoop on the following environment (pseudo-distributed mode):
Windows 7
Hadoop 2.7.1
JDK 1.7.x
RAM 4 GB
The jps command returns
C:\deploy\hadoop-2.7.1>jps
2336 ResourceManager
7500 NameNode
4984 Jps
6900 NodeManager
4940 DataNode
The command I use to set the Hadoop heap size is
set HADOOP_HEAPSIZE=512
The command I run from the Hadoop installation directory is
bin\yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input /output
I see the following stack trace
C:\deploy\hadoop-2.7.1>bin\yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input /output
15/08/14 22:36:26 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/08/14 22:36:27 INFO input.FileInputFormat: Total input paths to process : 1
15/08/14 22:36:28 INFO mapreduce.JobSubmitter: number of splits:1
15/08/14 22:36:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1439571873038_0001
15/08/14 22:36:28 INFO impl.YarnClientImpl: Submitted application application_1439571873038_0001
15/08/14 22:36:28 INFO mapreduce.Job: The url to track the job: http://XXX-PC:8088/proxy/application_1439571873038_0001/
15/08/14 22:36:28 INFO mapreduce.Job: Running job: job_1439571873038_0001
15/08/14 22:36:37 INFO mapreduce.Job: Job job_1439571873038_0001 running in uber mode : false
15/08/14 22:36:37 INFO mapreduce.Job: map 0% reduce 0%
15/08/14 22:36:37 INFO mapreduce.Job: Job job_1439571873038_0001 failed with state FAILED due to: Application application_1439571873038_0001 failed 2 times due to AM Container for appattempt_1439571873038_0001_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://XXX-PC:8088/cluster/app/application_1439571873038_0001Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1439571873038_0001_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Shell output: 1 file(s) moved.
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
15/08/14 22:36:37 INFO mapreduce.Job: Counters: 0
When I went to the Stderr logs as mentioned in the above stack trace, the actual error came out to be
Error: Could not create the Java Virtual Machine.
When I try to increase HADOOP_HEAPSIZE to 1024, the namenode, datanode and yarn daemons do not start at all and give me the same "Could not create the Java Virtual Machine" error.
Has anyone had the same problem? How can I solve this issue?
You can solve this problem by editing the file /etc/hadoop/hadoop-env.sh as shown below:
export HADOOP_HEAPSIZE=3072
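Since the asker is on Windows, note that the equivalent file on a Windows install is etc\hadoop\hadoop-env.cmd, where the setting (value in MB; 1024 here is only an example) would be:
@rem in <HADOOP_HOME>\etc\hadoop\hadoop-env.cmd
set HADOOP_HEAPSIZE=1024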

Snappy compression error in Hadoop 2.x

I've set up a Hadoop cluster using the new 2.x version, and I installed snappy and hadoop-snappy according to this guide to enable Snappy compression of the map output.
When running the wordcount example, this error occurred:
[dm#node1 ~]$ hadoop jar /opt/hadoop-2.0.5-alpha/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar wordcount /in /out
13/09/06 05:09:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/09/06 05:09:53 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
13/09/06 05:09:53 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
13/09/06 05:10:04 INFO input.FileInputFormat: Total input paths to process : 1
13/09/06 05:10:04 INFO snappy.LoadSnappy: Snappy native library loaded
13/09/06 05:10:04 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
13/09/06 05:10:04 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev d0f5a10f99f1b2af4f6610447052c5a67b8b1cc7]
13/09/06 05:10:04 INFO mapreduce.JobSubmitter: number of splits:1
13/09/06 05:10:04 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
13/09/06 05:10:04 WARN conf.Configuration: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
13/09/06 05:10:04 WARN conf.Configuration: mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class
13/09/06 05:10:04 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
13/09/06 05:10:04 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
13/09/06 05:10:04 WARN conf.Configuration: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
13/09/06 05:10:04 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
13/09/06 05:10:04 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
13/09/06 05:10:04 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
13/09/06 05:10:04 WARN conf.Configuration: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
13/09/06 05:10:04 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
13/09/06 05:10:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1378415309099_0001
13/09/06 05:10:06 INFO client.YarnClientImpl: Submitted application application_1378415309099_0001 to ResourceManager at node1/192.168.56.101:60832
13/09/06 05:10:06 INFO mapreduce.Job: The url to track the job: http://node1:60888/proxy/application_1378415309099_0001/
13/09/06 05:10:06 INFO mapreduce.Job: Running job: job_1378415309099_0001
13/09/06 05:10:32 INFO mapreduce.Job: Job job_1378415309099_0001 running in uber mode : false
13/09/06 05:10:32 INFO mapreduce.Job: map 0% reduce 0%
13/09/06 05:10:53 INFO mapreduce.Job: map 100% reduce 0%
13/09/06 05:11:02 INFO mapreduce.Job: Task Id : attempt_1378415309099_0001_m_000000_0, Status : FAILED
Error: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
Container killed by the ApplicationMaster.
13/09/06 05:11:03 INFO mapreduce.Job: map 0% reduce 0%
13/09/06 05:11:07 INFO mapreduce.Job: Task Id : attempt_1378415309099_0001_m_000000_1, Status : FAILED
Error: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
13/09/06 05:11:13 INFO mapreduce.Job: Task Id : attempt_1378415309099_0001_m_000000_2, Status : FAILED
Error: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
13/09/06 05:11:19 INFO mapreduce.Job: map 100% reduce 0%
13/09/06 05:11:19 INFO mapreduce.Job: Job job_1378415309099_0001 failed with state FAILED due to: Task failed task_1378415309099_0001_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0
13/09/06 05:11:19 INFO mapreduce.Job: Counters: 6
Job Counters
Failed map tasks=4
Launched map tasks=4
Other local map tasks=3
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=42989
Total time spent by all reduces in occupied slots (ms)=0
I searched Google for the error message "Error: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z" but haven't found a solution to this problem. So I want to know: how can I enable Snappy compression in Hadoop 2.x? Thanks.
On the good server:
$ hadoop checknative -a | grep snappy
15/06/18 14:51:05 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
15/06/18 14:51:05 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
snappy: true /opt/cloudera/parcels/CDH-5.4.2-1.cdh5.4.2.p0.2/lib/hadoop/lib/native/libsnappy.so.1
On the bad server:
$ hadoop checknative -a | grep snappy
15/06/18 14:50:31 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
15/06/18 14:50:31 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
snappy: true /usr/lib64/libsnappy.so.1
Notice that in one case the hadoop checknative command returns the system's libsnappy, and in the other case the one from the Hadoop distribution. If some of your hosts report that they use the system's libsnappy, errors like
hadoop.mapred.YarnChild: Error running child : java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native Method)
are quite possible.
From the output you've presented (WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... etc.) it looks like your native library isn't loading. It would be worth building and installing the native library, then placing the Snappy library alongside it in <HADOOP_HOME>/lib/native/. That might resolve the issue. Hope that helps!
It looks like your code is not able to load the Snappy native library. There are two possible explanations for this:
either you are using a version of Snappy that is not compatible with the version of Hadoop you have installed,
or you have not added the library to your classpath.
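One common way to handle the latter (a sketch; it assumes libsnappy has been placed in the standard <HADOOP_HOME>/lib/native directory) is to put that directory on the JVM's library path, e.g. in hadoop-env.sh:
# Make the native libs (including libsnappy) visible to the JVM
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"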
The second parameter in that guide is wrong; use mapreduce.map.output.compress.codec instead.
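In other words, the map-output compression settings in mapred-site.xml should look roughly like this (a sketch using the Hadoop 2.x property names):
<property>
  <name>mapreduce.map.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.map.output.compress.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>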
