slurping /proc/cpuinfo with clojure - java

(Clojure newbie)
On my linux machine, slurping /proc/cpuinfo raises an error:
user=> (slurp "/proc/cpuinfo")
java.io.IOException: Invalid argument (NO_SOURCE_FILE:0)
Anybody knows why that is? (is the /proc filesystem some kind of second-class citizen in Java?)
Edit: the following code, adapted from nakkaya.com, works flawlessly:
(with-open [rdr (java.io.BufferedReader.
(java.io.FileReader. "/proc/cpuinfo"))]
(let [seq (line-seq rdr)]
(apply print seq)))
I wonder why this difference ?

I've had a similar problem with files in /proc. The solution is simple though:
(slurp (java.io.FileReader. "/proc/cpuinfo"))

the problem is that java cannot open a DataInputStream on /proc so the slurp function isnt going to help you here, sorry :(
/proc/cpuinfo is a little strange because it has a file size of zero and produces bytes when read. this upsets the smarter java file handling classes.
ls -l /proc/cpuinfo
-r--r--r-- 1 root root 0 2012-01-20 00:10 /proc/cpuinfo
see this thread for more http://www.velocityreviews.com/forums/t131093-java-cannot-access-proc-filesystem-on-linux.html
you are going to have to open it with a FileReader. I'll add an example in a bit

Related

.NoSuchFileException saving error in Java

I'm trying to save my first project and I keep getting this error, I'm following along with a book and I don't really know what went wrong?
jshell> /save ~/Desktop/filename.txt
| File '~/Desktop/filename.txt' for '/save' threw exception: java.nio.file.NoSuchFileException: C:\Users\klbad\Desktop\filename.txt
/save ~/Desktop/filename.txt
~ has no meaning in Windows. It's a Unix shell thing.
I'm not sure whether you'd get away with forward slashes either (I can't test it as not in Windows)

Mallet: java.lang.OutOfMemoryError with 1024GB Memory allocation

I am trying to use Mallet to run topic modeling on a ~1GB text file, with 11403956 rows. From the mallet directory, I cd to bin and upgrade the memory requirement to 1024GB:
set MALLET_MEMORY=1024G
I then try to run the command:
bin/mallet import-file --input combined_bios.txt --output dh_size.mallet --keep-sequence --remove-stopwords
However, this throws a memory error:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at gnu.trove.TObjectIntHashMap.rehash(TObjectIntHashMap.java:170)
at gnu.trove.THash.postInsertHook(THash.java:359)
at gnu.trove.TObjectIntHashMap.put(TObjectIntHashMap.java:155)
at cc.mallet.types.Alphabet.lookupIndex(Alphabet.java:115)
at cc.mallet.types.Alphabet.lookupIndex(Alphabet.java:123)
at cc.mallet.types.FeatureSequence.add(FeatureSequence.java:131)
at cc.mallet.pipe.TokenSequence2FeatureSequence.pipe(TokenSequence2FeatureSequence.java:44)
at cc.mallet.pipe.Pipe$SimplePipeInstanceIterator.next(Pipe.java:294)
at cc.mallet.pipe.Pipe$SimplePipeInstanceIterator.next(Pipe.java:282)
at cc.mallet.types.InstanceList.addThruPipe(InstanceList.java:267)
at cc.mallet.classify.tui.Csv2Vectors.main(Csv2Vectors.java:290)
Is there a workaround for such situations? Any help others can offer would be greatly appreciated!
If you are on Linux or OS X, I think you might be altering the wrong variable. The one you are changing is found in bin/mallet.bat, but you want to change the one in the executable at bin/mallet (i.e. without the .bat file extension):
MEMORY=1g
This is also described under "Issues with Big Data" in this Mallet tutorial:
http://programminghistorian.org/lessons/topic-modeling-and-mallet

Java error while running maxent in biomod2

I am running maxent from R, in the package biomod2 and the following error appeared. I do not come from a technical background and wasn't sure why is this error happening. Is it a memory problem or someone said the java path is not set. But I followed the instructions to set maxent to run in R and also downloaded Java Platform, Standard Edition Development Kit and set a path for it as explained in this pdf: http://modata.ceoe.udel.edu/dev/dhaulsee/class_rcode/r_pkgmanuals/MAXENT4R_directions.pdf
I would be really grateful if you could help me understand this problem and any solution to it.
Thanks a lot
Error in file(file, "rt") : cannot open the connection
In addition: Warning messages:
1: running command 'java' had status 1
2: running command 'java -mx512m -jar E:\bioclim_2.5min\model/maxent.jar environmentallayers
="rainfed/models/1432733200/m_47203134/Back_swd.csv"
samplesfile="rainfed/models/1432733200/m_47203134/Sp_swd.csv"
projectionlayers="rainfed/models/1432733200/m_47203134/Predictions/Pred_swd.csv"
outputdirectory="rainfed/models/1432733200/rainfed_PA1_Full_MAXENT_outputs"
outputformat=logistic redoifexists visible=FALSE linear=TRUE quadratic=TRUE
product=TRUE threshold=TRUE hinge=TRUE lq2lqptthreshold=80 l2lqthreshold=10
hingethreshold=15 beta_threshold=-1 beta_categorical=-1 beta_lqp=-1
beta_hinge=-1 defaultprevalence=0.5 autorun nowarnings notooltips
noaddsamplestobackground' had status 1
3: In file(file, "rt") :
cannot open file 'rainfed/models/1432733200/rainfed_PA1_Full_MAXENT_outputs/rainfed_PA1_
Full_Pred_swd.csv': No such file or directory
I've just manage to solve this problem - it is a problem with the file path specified. For me, I had a space in one of the folder names which was not accepted in the path to the maxent.jar file. From looking at your error, it looks like it might be the two backslashes.
E:\bioclim_2.5min\model/maxent.jar
should probably read
E:/bioclim_2.5min/model/maxent.jar

How to find the containing egg file from Python class?

From Python 2.6 it is possible to run an egg file directly from python command line, by incorporating a main.py file.
Now... is it possible to detect in runtime if the current executing python code is running directly from an egg and obtain the path of it ?
I'm trying to mirror similar Java functionality where it is possible to get the containing jar from a class through Java's ClassLoader.
I didn't find a nicer pythonic way to do this otherwise by hand:
puria#spaghetti:~$ cat __main__.py
import zipfile, os
module_path = os.path.dirname(__file__)
print "Is this an egg? ", zipfile.is_zipfile(module_path)
puria#spaghetti:~$ python __main__.py
Is this an egg? False
puria#spaghetti:~$ zip example.egg __main__.py
adding: __main__.py (deflated 15%)
puria#spaghetti:~$ python example.egg
Is this an egg? True
Hope this is useful and I actually understood your question.

tdbloader on Cygwin: Gettging FileNotFoundException: d:\cygdrive\d\....\node2id.idn

I am completely new to Jena/TDB. All I want to do is to load data from some sample rdf, N3 etc file using tdb scripts or through java api.
I am tried to use tbdloader on Cygwin to load data (tdb-0.9.0, on Windows XP with IBM Java 1.6). Following are the command that I ran:
$ export TDBROOT=/cygdrive/d/Project/Store_DB/jena-tdb-0.9.0-incubating
$ export PATH=$TDBROOT/bin:$PATH
I also changed classpath for java in the tdbloader script as mentioned at tdbloader on Cygwin: java.lang.NoClassDefFoundError :
exec java $JVM_ARGS $SOCKS -cp "PATH_OF_JAR_FILES" "tdb.$TDB_CMD" $TDB_SPEC "$#"
So when I run $ tdbloader --help it shows the help correctly.
But when I run
$ tdbloader --loc /cygdrive/d/Project/Store_DB/data1
OR
$ tdbloader --loc /cygdrive/d/Project/Store_DB/data1 test.rdf
I am getting following exception:
com.hp.hpl.jena.tdb.base.file.FileException: Failed to open: d:\cygdrive\d\Project\Store_DB\data1\node2id.idn (mode=rw)
at com.hp.hpl.jena.tdb.base.file.ChannelManager.open$(ChannelManager.java:83)
at com.hp.hpl.jena.tdb.base.file.ChannelManager.openref$(ChannelManager.java:58)
at com.hp.hpl.jena.tdb.base.file.ChannelManager.acquire(ChannelManager.java:47)
at com.hp.hpl.jena.tdb.base.file.FileBase.<init>(FileBase.java:57)
at com.hp.hpl.jena.tdb.base.file.FileBase.<init>(FileBase.java:46)
at com.hp.hpl.jena.tdb.base.file.FileBase.create(FileBase.java:41)
at com.hp.hpl.jena.tdb.base.file.BlockAccessBase.<init>(BlockAccessBase.java:46)
at com.hp.hpl.jena.tdb.base.block.BlockMgrFactory.createStdFile(BlockMgrFactory.java:98)
at com.hp.hpl.jena.tdb.base.block.BlockMgrFactory.createFile(BlockMgrFactory.java:82)
at com.hp.hpl.jena.tdb.base.block.BlockMgrFactory.create(BlockMgrFactory.java:58)
at com.hp.hpl.jena.tdb.setup.Builder$BlockMgrBuilderStd.buildBlockMgr(Builder.java:196)
at com.hp.hpl.jena.tdb.setup.Builder$RangeIndexBuilderStd.createBPTree(Builder.java:165)
at com.hp.hpl.jena.tdb.setup.Builder$RangeIndexBuilderStd.buildRangeIndex(Builder.java:134)
at com.hp.hpl.jena.tdb.setup.Builder$IndexBuilderStd.buildIndex(Builder.java:112)
at com.hp.hpl.jena.tdb.setup.Builder$NodeTableBuilderStd.buildNodeTable(Builder.java:85)
at com.hp.hpl.jena.tdb.setup.DatasetBuilderStd$NodeTableBuilderRecorder.buildNodeTable(DatasetBuilderStd.java:389)
at com.hp.hpl.jena.tdb.setup.DatasetBuilderStd.makeNodeTable(DatasetBuilderStd.java:300)
at com.hp.hpl.jena.tdb.setup.DatasetBuilderStd._build(DatasetBuilderStd.java:167)
at com.hp.hpl.jena.tdb.setup.DatasetBuilderStd.build(DatasetBuilderStd.java:157)
at com.hp.hpl.jena.tdb.setup.DatasetBuilderStd.build(DatasetBuilderStd.java:70)
at com.hp.hpl.jena.tdb.StoreConnection.make(StoreConnection.java:132)
at com.hp.hpl.jena.tdb.transaction.DatasetGraphTransaction.<init>(DatasetGraphTransaction.java:46)
at com.hp.hpl.jena.tdb.sys.TDBMakerTxn._create(TDBMakerTxn.java:50)
at com.hp.hpl.jena.tdb.sys.TDBMakerTxn.createDatasetGraph(TDBMakerTxn.java:38)
at com.hp.hpl.jena.tdb.TDBFactory._createDatasetGraph(TDBFactory.java:166)
at com.hp.hpl.jena.tdb.TDBFactory.createDatasetGraph(TDBFactory.java:74)
at com.hp.hpl.jena.tdb.TDBFactory.createDataset(TDBFactory.java:53)
at tdb.cmdline.ModTDBDataset.createDataset(ModTDBDataset.java:95)
at arq.cmdline.ModDataset.getDataset(ModDataset.java:34)
at tdb.cmdline.CmdTDB.getDataset(CmdTDB.java:137)
at tdb.cmdline.CmdTDB.getDatasetGraph(CmdTDB.java:126)
at tdb.cmdline.CmdTDB.getDatasetGraphTDB(CmdTDB.java:131)
at tdb.tdbloader.loadQuads(tdbloader.java:163)
at tdb.tdbloader.exec(tdbloader.java:122)
at arq.cmdline.CmdMain.mainMethod(CmdMain.java:97)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:59)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:46)
at tdb.tdbloader.main(tdbloader.java:53)
Caused by: java.io.FileNotFoundException: d:\cygdrive\d\Project\Store_DB\data1\node2id.idn (The system cannot find the path specified.)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:222)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:107)
at com.hp.hpl.jena.tdb.base.file.ChannelManager.open$(ChannelManager.java:80)
... 37 more
I am not sure what node2id.idn file is and why is it expecting it?
The file node2id.idn is one of TDB's internal index files. It's not something that you have to create or manage for yourself. I've just tried tdbloader on cygwin myself, it it worked OK for me. I can think of two basic possibilities:
your disk is full
the TDB index is corrupted
If this is the first file you are loading into an otherwise emtpy TDB, the second possibility is unlikely. If you are loading into a non-empty TDB, try deleting the TDB image and starting again. Note that TDB by itself does not manage concurrent writes: if you have more than one process writing to a single TDB image, you must handle locking at the application level, or use TDB's transactions.
The final possibility, of course, is that your disk is flaky. You might want to try your code on another machine.
If none of these suggestions help, please send a complete minimal test case to the Jena users list.

Categories

Resources