I downloaded the latest version of the examples for chapter 09 of “Mahout in Action”. I can successfully run several of them, but three files fail: NewsKMeansClustering.java, ReutersToSparseVectors.java, and NewsFuzzyKMeansClustering.java. Running these three programs gives similar error messages:
Aug 3, 2011 2:03:54 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Initializing JVM Metrics with processName=JobTracker, sessionId=
Aug 3, 2011 2:03:54 PM org.apache.hadoop.mapred.JobClient configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications should
implement Tool for the same.
Aug 3, 2011 2:03:54 PM org.apache.hadoop.mapred.JobClient configureCommandLineOptions
WARNING: No job jar file set. User classes may not be found. See JobConf(Class) or
JobConf#setJar(String).
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/home/user1/workspaceMahout1/recommender/inputDir
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:224)
at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:55)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
at org.apache.mahout.vectorizer.DocumentProcessor.tokenizeDocuments(DocumentProcessor.java:93)
at mia.clustering.ch09.NewsKMeansClustering.main(NewsKMeansClustering.java:54)
For the above messages, I do not quite understand what those two warnings mean. Moreover, it looks like the “input path” should have been created beforehand; how can I create this type of input? Thanks.
You can ignore the warnings. The error is that the input directory you have specified does not exist. Does it exist? What is your command line?
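For reference, the ch09 examples read a directory of SequenceFile entries with Text keys (a document id) and Text values (the raw document text), which DocumentProcessor.tokenizeDocuments then consumes. Below is a minimal sketch of creating such an input directory; the file name and document contents are placeholders.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class MakeClusteringInput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Write one SequenceFile of (document id, document text) pairs
        // into the directory the clustering example will read.
        Path file = new Path("inputDir/part-00000");
        SequenceFile.Writer writer =
                SequenceFile.createWriter(fs, conf, file, Text.class, Text.class);
        try {
            writer.append(new Text("/doc1"), new Text("first document text"));
            writer.append(new Text("/doc2"), new Text("second document text"));
        } finally {
            writer.close();
        }
    }
}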
I ran into a similar mismatch. The MiA files at https://github.com/tdunning/MiA have some cases where a .csv file is kept in the same directory as the Java source, for example https://github.com/tdunning/MiA/tree/master/src/main/java/mia/recommender/ch02. However, when run via Eclipse, loading it using DataModel model = new FileDataModel(new File("intro.csv")); doesn't find it.
Adding
System.out.println("CWD: "+System.getProperty("user.dir"));
...will reveal where Eclipse is looking (in my case, a couple of levels up the file tree, but this might vary depending on how exactly you've set things up).
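Once you know where it is looking, you can either fix the run configuration's working directory or point at the file explicitly. A minimal sketch, assuming you run from the project root so the repo-relative path resolves:

import java.io.File;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.model.DataModel;

public class LoadIntroCsv {
    public static void main(String[] args) throws Exception {
        // Use the path intro.csv has inside the MiA source tree instead of
        // relying on whatever working directory Eclipse happens to use.
        File csv = new File("src/main/java/mia/recommender/ch02/intro.csv");
        System.out.println(csv.getAbsolutePath() + " exists? " + csv.exists());
        DataModel model = new FileDataModel(csv);
        System.out.println("users: " + model.getNumUsers());
    }
}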
To address security vulnerability CVE-2017-3241 (Java RMI Registry.bind() Unvalidated Deserialization), which affects JRE versions prior to 1.8.0_121, we upgraded to JRE 1.8.0_121 and additionally added the lines below to the java.security file:
jdk.serialFilter=*
sun.rmi.registry.registryFilter=*
sun.rmi.transport.dgcFilter=\
java.rmi.server.ObjID;\
java.rmi.server.UID;\
java.rmi.dgc.VMID;\
java.rmi.dgc.Lease;\
maxdepth=2147483647;maxarray=2147483647;maxrefs=2147483647;maxbytes=2147483647
Once we add these lines, we get the following output whenever we make any RMI call:
Feb 13, 2017 1:00:53 AM sun.misc.ObjectInputFilter$Config lambda$static$0
INFO: Creating serialization filter from *
We want to suppress this INFO message; can somebody suggest a solution?
I use -Djdk.serialFilter=maxbytes=10000;!org.* as a JVM argument and don't see any log output.
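Alternatively, if you want to keep the broad filter in java.security and only silence the message, you could try raising the logging level. This is a sketch under an assumption: the output format suggests the message goes through java.util.logging, and I am assuming the logger is named after the emitting class.

import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietSerialFilterLog {
    // Keep a strong reference: java.util.logging may otherwise garbage-collect
    // the logger and silently drop the level we set on it.
    private static final Logger FILTER_LOG =
            Logger.getLogger("sun.misc.ObjectInputFilter"); // assumed logger name

    public static void main(String[] args) {
        FILTER_LOG.setLevel(Level.WARNING); // suppress INFO and below
        // ... perform RMI calls here ...
    }
}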
There is sample Java code for an HBase connectivity program, the well-known "HbaseTest" class, which has been available on the internet for a long time.
I have compiled the code on my server and compilation was successful. When I run my Java class file, I can see that it hangs on this particular line: "HTable table = new HTable(conf, tableName);"
It prints the following messages while running:
Jun 18, 2015 12:16:14 PM org.apache.hadoop.util.NativeCodeLoader
WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Jun 18, 2015 12:16:15 PM org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper
INFO: The identifier of this process is pid#servername
I identified that it is stuck on that particular line by adding print statements.
Please let me know what to do. I have checked that HBase is running properly.
Kindly share your thoughts and ideas.
#hive #hbase #hadoop
Thanks in advance, Sam
I had a similar problem before; it was caused by network issues. Try setting retry and timeout parameters, e.g.:
hbase.client.retries.number=2
zookeeper.session.timeout=2000
zookeeper.recovery.retry=0
hbase.rpc.timeout=100
ipc.socket.timeout=100
hbase.client.pause=100
zookeeper.recovery.retry.intervalmill=100
timeout=100
You may need to modify your network settings according to the errors that are thrown.
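Applied in code before the HTable is created, that might look like the following sketch; the table name is a placeholder, and these properties can equally go in hbase-site.xml.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class HBaseFailFastTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Fail fast instead of hanging: cut retries and timeouts right down.
        conf.setInt("hbase.client.retries.number", 2);
        conf.setInt("zookeeper.session.timeout", 2000);
        conf.setInt("zookeeper.recovery.retry", 0);
        conf.setInt("hbase.rpc.timeout", 100);
        conf.setInt("ipc.socket.timeout", 100);
        conf.setInt("hbase.client.pause", 100);
        HTable table = new HTable(conf, "testtable"); // placeholder table name
        System.out.println("Connected.");
        table.close();
    }
}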
I am using Cobertura 2.0.3 with Java 1.7, and I am trying to instrument the classes in our build system via the command line. I am stuck at instrumenting the classes. Please assist.
Here is the command:
./cobertura-instrument.sh --basedir /ariba/9r2_sourcing/roots-S49r2/install/classes/ariba.app.approvable.zip --destination /ariba/9r2_sourcing/Instrument -auxClasspath /ariba/9r2_sourcing/roots-S49r2/install/classes
Wherein,
/ariba/9r2_sourcing/roots-S49r2/install/classes/ariba.app.approvable.zip – the zip containing the classes I would like to instrument
/ariba/9r2_sourcing/Instrument – the folder in which to save the instrumented classes
/ariba/9r2_sourcing/roots-S49r2/install/classes – the path where all other referenced classes are present
Output:
-bash-4.1$ ./cobertura-instrument.sh --basedir /ariba/9r2_sourcing/roots-S49r2/install/classes/ariba.app.approvable.zip --destination /ariba/9r2_sourcing/Instrument -auxClasspath /ariba/9r2_sourcing/roots-S49r2/install/classes
Cobertura 2.0.3 - GNU GPL License (NO WARRANTY) - See COPYRIGHT file
Apr 29, 2014 4:53:27 AM net.sourceforge.cobertura.coveragedata.CoverageDataFileHandler loadCoverageData
INFO: Cobertura: Loaded information on 0 classes.
Apr 29, 2014 4:53:27 AM net.sourceforge.cobertura.coveragedata.CoverageDataFileHandler saveCoverageData
INFO: Cobertura: Saved information on 0 classes.
I also tried with the ‘archivesdepth’ parameter; it gives the same output as above.
I have updated the cobertura-instrument.sh file with the right versions of the jars present in the Cobertura installation location.
I see that in your command you have not mentioned the classes you need to instrument.
Example: the command below works.
sh cobertura-instrument.sh --basedir `pwd` GenerateReports.class My_lib.class
Note:
Classes need to be given as complete filenames (e.g. mycls.class).
-auxClasspath: add any classes/jars that Cobertura is unable to find during instrumentation, i.e. dependencies that should be resolvable but excluded from coverage.
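Adapted to the paths from the question, that might look like the commands below. This is a hypothetical adaptation: the class path is a placeholder, and I am assuming the zip has to be unpacked first, since pointing --basedir at the archive produced the “0 classes” output above.

cd /ariba/9r2_sourcing/roots-S49r2/install/classes
unzip ariba.app.approvable.zip -d /tmp/approvable-classes
sh cobertura-instrument.sh --basedir /tmp/approvable-classes \
    --destination /ariba/9r2_sourcing/Instrument \
    -auxClasspath /ariba/9r2_sourcing/roots-S49r2/install/classes \
    ariba/app/approvable/SomeClass.class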
I am currently testing Slick2D, and to that end I am writing a Pong game.
In this game you can select music files, which are in the ./music folder.
When I run it through Eclipse, everything works perfectly.
When I export it and run it with all natives and libraries in the folder, it also starts,
but the music menu does not work (it just crashes when I select a track).
You can see the code at:
https://bitbucket.org/JohnnyCrazy/pingpong/src/f2fd635ccfef/src/me/Johnny/Slick2D?at=master
Error:
Mon Feb 04 13:47:05 CET 2013 ERROR:OpenAL error: Invalid Operation (40964)
org.lwjgl.openal.OpenALException: OpenAL error: Invalid Operation (40964)
at org.lwjgl.openal.Util.checkALError(Util.java:64)
at org.lwjgl.openal.AL10.alDeleteBuffers(AL10.java:1097)
at org.newdawn.slick.openal.AudioImpl.release(AudioImpl.java:56)
at org.newdawn.slick.Music.release(Music.java:424)
at me.Johnny.Slick2D.Slick2D.setMusic(Slick2D.java:57)
at me.Johnny.Slick2D.MusikState.update(MusikState.java:108)
at org.newdawn.slick.state.StateBasedGame.update(StateBasedGame.java:278)
at org.newdawn.slick.GameContainer.updateAndRender(GameContainer.java:678)
at org.newdawn.slick.AppGameContainer.gameLoop(AppGameContainer.java:456)
at org.newdawn.slick.AppGameContainer.start(AppGameContainer.java:361)
at me.Johnny.Slick2D.Slick2D.main(Slick2D.java:40)
Mon Feb 04 13:47:05 CET 2013 ERROR:Game.update() failure - check the game code.
org.newdawn.slick.SlickException: Game.update() failure - check the game code.
at org.newdawn.slick.GameContainer.updateAndRender(GameContainer.java:684)
at org.newdawn.slick.AppGameContainer.gameLoop(AppGameContainer.java:456)
at org.newdawn.slick.AppGameContainer.start(AppGameContainer.java:361)
at me.Johnny.Slick2D.Slick2D.main(Slick2D.java:40)
I don't understand why it works in Eclipse but not when exported.
The JavaDocs for the Music(String) constructor are not entirely clear on how it interprets the String. J2SE methods that take a String typically presume it represents the path to a File. If that is the case here, it is also the reason for the failure.
Typically those types of resources become an embedded resource when the app is built. An embedded resource must be accessed by URL. See the embedded-resource info page for tips on forming the URL.
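As a quick check (plain J2SE, with a hypothetical file name), you can verify whether a track is actually reachable inside the exported jar before handing it to Slick:

import java.net.URL;

public class ResourceCheck {
    public static void main(String[] args) {
        // A null URL means the file is not on the classpath / inside the jar,
        // which would explain working-in-Eclipse-but-failing-when-exported.
        URL track = ResourceCheck.class.getResource("/music/track.ogg");
        System.out.println("music resource URL: " + track);
    }
}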
I am invoking a MapReduce job from my Java program.
Today, when I set the MapReduce job's input format to LzoTextInputFormat,
the job fails:
Could not load native gpl library
java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1738)
at java.lang.Runtime.loadLibrary0(Runtime.java:823)
at java.lang.System.loadLibrary(System.java:1028)
at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:32)
at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:67)
at com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:58)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
at com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:85)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
at company.Validation.run(Validation.java:99)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at company.mapreduceTest.main(mapreduceTest.java:18)
Apr 5, 2012 4:40:29 PM com.hadoop.compression.lzo.LzoCodec <clinit>
SEVERE: Cannot load native-lzo without native-hadoop
java.lang.IllegalArgumentException: Wrong FS: hdfs://D-SJC-00535164:9000/local/usecases/gbase014/outbound/seed_2012-03-12_06-34-39/1_1.lzo.index, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:47)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:357)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:648)
at com.hadoop.compression.lzo.LzoIndex.readIndex(LzoIndex.java:169)
at com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:69)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
at com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:85)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
at company.Validation.run(Validation.java:99)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at company.stopTransfer.mapreduceTest.main(mapreduceTest.java:18)
Apr 5, 2012 4:40:29 PM company.Validation run
SEVERE: LinkExtractor: java.lang.IllegalArgumentException: Wrong FS: hdfs://D-SJC-00535164:9000/local/usecases/gbase014/outbound/seed_2012-03-12_06-34-39/1_1.lzo.index, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:47)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:357)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:648)
at com.hadoop.compression.lzo.LzoIndex.readIndex(LzoIndex.java:169)
at com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:69)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
at com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:85)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
at company.Validation.run(Validation.java:99)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at company.stopTransfer.mapreduceTest.main(mapreduceTest.java:18)
But in lib/native there are some files with extensions like .a, .la, .so...
I tried adding them to my path environment variable, but it still doesn't work.
Could anyone please give me a suggestion?
Thank you very much!
Your error relates to the actual shared library for Lzo not being present in the hadoop native library folder.
The code for GPLNativeCodeLoader is looking for a shared library called gplcompression. Java is actually looking for a file named libgplcompression.so. If this file doesn't exist in your lib/native/${arch} folder then you'll see this error.
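A quick way to confirm that from the JVM side (a standalone sketch, not part of the job) is to try loading the library the same way GPLNativeCodeLoader does:

public class CheckGplCompression {
    public static void main(String[] args) {
        System.out.println("java.library.path = "
                + System.getProperty("java.library.path"));
        // Throws UnsatisfiedLinkError if libgplcompression.so (or the
        // platform equivalent) is not on java.library.path.
        System.loadLibrary("gplcompression");
        System.out.println("libgplcompression loaded OK");
    }
}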
In a terminal, navigate to your Hadoop base directory and execute the following to dump the native libraries installed, then post the output back to your original question:
uname -a
find lib/native
If you are using Cloudera Hadoop, you can install LZO easily by following these instructions:
http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/v1/v1-0-1/Installing-and-Using-Impala/ciiu_lzo.html