Using Weka gives different results between GUI and API implementation


I am using Weka to classify my dataset. First I did this in the GUI, which gave me some results (accuracy, ROC, ...). Now that I'm using the API to build a small framework around Weka, I run the exact same configuration yet get different results.
Let me give you an example:
Config of .ARFF file in GUI:
@relation 'QueryResult-weka.filters.unsupervised.attribute.NominalToString-Clast
-weka.filters.unsupervised.attribute.StringToWordVector-R2-W1000-prune-rate-1.0-N0
-stemmerweka.core.stemmers.NullStemmer-M1-O-tokenizerweka.core.tokenizers.WordTokenizer -delimiters \",
-weka.filters.unsupervised.attribute.StringToNominal-R2-last
-weka.filters.unsupervised.attribute.NumericToBinary-unset-class-temporarily
-weka.filters.unsupervised.attribute.NominalToString-Cfirst
-weka.filters.unsupervised.attribute.StringToWordVector-R1-W1000000-prune-rate-1.0-C-T-I-N0-L-S
-stemmerweka.core.stemmers.NullStemmer-M1-stopwords/Users/stopwds.txt-tokenizerweka.core.tokenizers.WordTokenizer
-delimiters \" \\r \\t.,;:\\\'\\\"()?!-/<>[]\\t\\r\\n\"-weka.filters.unsupervised.attribute.Remove-R120-1964
-weka.filters.unsupervised.attribute.Remove-R121-123'
As I said, with the API I use the same config:
@relation 'QueryResult-weka.filters.unsupervised.attribute.NominalToString-Clast
-weka.filters.unsupervised.attribute.StringToWordVector-R2-W1000-prune-rate-1.0-N0
-stemmerweka.core.stemmers.NullStemmer-M1-O-tokenizerweka.core.tokenizers.WordTokenizer -delimiters \",
-weka.filters.unsupervised.attribute.StringToNominal-R2-last
-weka.filters.unsupervised.attribute.NumericToBinary-unset-class-temporarily
-weka.filters.unsupervised.attribute.NominalToString-Cfirst
-weka.filters.unsupervised.attribute.StringToWordVector-R1-W1000000-prune-rate-1.0-C-T-I-N0-L-S
-stemmerweka.core.stemmers.NullStemmer-M1-stopwords/Users/stopwds.txt-tokenizerweka.core.tokenizers.WordTokenizer
-delimiters \" \\r \\t.,;:\\\'\\\"()?!-/<>[]\\t\\r\\n\"-weka.filters.unsupervised.attribute.Remove-R120-1964
-weka.filters.unsupervised.attribute.Remove-R121-123'
Now when I run the classifier, again with the same configuration, I get different output! However, the funny part is that when I load the .ARFF file that my Java code generates after running the config and then train the classifier there, I DO get exactly the expected output.
Can someone please explain what I'm doing wrong and why the output is different? I read other posts, such as link, where a similar problem occurred.
--
To clarify, here is the config of my classifier in the GUI:
weka.classifiers.lazy.IBk -K 30 -W 0 -I -A "weka.core.neighboursearch.LinearNNSearch -A \"weka.core.EuclideanDistance -R first-last\""
And this is how it is in my Java code:
iBk.setOptions(weka.core.Utils.splitOptions("-K 30 -W 0 -I -A \"weka.core.neighboursearch.LinearNNSearch -A \\\"weka.core.EuclideanDistance -R first-last\\\"\""));

Related

How to output stream from camera(recording in RTSP) via FFmpeg (Kokorin Jaffree)

I'm trying to watch a live HLS stream in the browser, based on an RTSP camera stream, from a Java client using the Jaffree library (https://github.com/kokorin/Jaffree).
But I could not execute the command because of insufficient permissions for FFmpeg (FFmpeg is installed in /usr/bin/ffmpeg).
Code
And I also tried to execute this command from runtime:
sudo ffmpeg -fflags nobuffer -rtsp_transport tcp -i rtsp://my_url -vsync 0 -copyts -vcodec copy -movflags frag_keyframe+empty_moov -an -hls_flags delete_segments+append_list -f segment -segment_list_flags live -segment_time 1 -segment_list_size 3 -segment_format mpegts -segment_list /temp/stream/index.m3u8 -segment_list_type m3u8 -segment_list_entry_prefix /stream/ /temp/stream/%d.ts
I can execute it in the console and everything is OK, but in the browser I get a CORS error when trying to access it.
(https://i.stack.imgur.com/QUltG.png)
Could you please share a way to get the stream playing in the browser?
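For reference, this is a minimal sketch of how the same command could be launched from Java with ProcessBuilder, passing each argument separately so no shell quoting is involved. The class name FfmpegLauncher and its method names are hypothetical, sudo is left out on the assumption that the Java process itself has permission to run /usr/bin/ffmpeg and write to the output directory, and it does nothing about the CORS error, which comes from how the files are served.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical helper, not part of Jaffree: builds the ffmpeg command from the question.
final class FfmpegLauncher {

    /** Returns the argument list for the segmenting command shown above. */
    static List<String> buildCommand(String rtspUrl, String outDir, String urlPrefix) {
        return List.of(
            "/usr/bin/ffmpeg",
            "-fflags", "nobuffer",
            "-rtsp_transport", "tcp",
            "-i", rtspUrl,
            "-vsync", "0", "-copyts",
            "-vcodec", "copy",
            "-movflags", "frag_keyframe+empty_moov",
            "-an",
            "-hls_flags", "delete_segments+append_list",
            "-f", "segment",
            "-segment_list_flags", "live",
            "-segment_time", "1",
            "-segment_list_size", "3",
            "-segment_format", "mpegts",
            "-segment_list", outDir + "/index.m3u8",
            "-segment_list_type", "m3u8",
            "-segment_list_entry_prefix", urlPrefix,
            outDir + "/%d.ts");
    }

    /** Starts ffmpeg and forwards its output to this process's console. */
    static Process start(List<String> command) throws IOException {
        return new ProcessBuilder(command).inheritIO().start();
    }

    public static void main(String[] args) {
        // Only prints the command; call start(...) to actually run it.
        List<String> cmd = buildCommand("rtsp://my_url", "/temp/stream", "/stream/");
        System.out.println(String.join(" ", cmd));
    }
}
```

If ffmpeg still fails when started this way, its error output (forwarded by inheritIO) should show whether the problem is the executable's permissions or the output directory.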

See the readable names with jemalloc

We are trying to catch a memory leak, and we are using jemalloc. How do you change the tree to display symbol/class names? Right now, our gif looks like this:
On most tutorials I see, they just say to set the following 2 env vars:
echo $LD_PRELOAD
/usr/lib/x86_64-linux-gnu/libjemalloc.so
echo $MALLOC_CONF
prof:true,lg_prof_interval:29,lg_prof_sample:17
The command I'm running is: jeprof --show_bytes --gif `which java` jeprof*.heap > mem.gif
We are running a Java application from a docker-compose file, using the image openjdk:13-jdk-alpine

Weka - Can't find a permissible class

I'm integrating Weka into a plug-in I'm writing for another application. I included weka.jar in my class path and for the most part, things seem to be working well. Unfortunately, when I get to the point of changing the options for some classifiers, I run into problems specific to being unable to find certain classes. For example, when I try to change the name of the classifier in the AdaBoost options, I get an error that ends like so:
java.lang.Exception: Can't find a permissible class called: weka.classifiers.bayes.BayesNet
Model options set to: -P 50 -S 1 -I 10 -W weka.classifiers.bayes.BayesNet
at weka.core.ResourceUtils.forName(ResourceUtils.java:84)
at weka.core.Utils.forName(Utils.java:1080)
at weka.classifiers.AbstractClassifier.forName(AbstractClassifier.java:91)
at weka.classifiers.SingleClassifierEnhancer.setOptions(SingleClassifierEnhancer.java:108)
at weka.classifiers.IteratedSingleClassifierEnhancer.setOptions(IteratedSingleClassifierEnhancer.java:115)
at weka.classifiers.RandomizableIteratedSingleClassifierEnhancer.setOptions(RandomizableIteratedSingleClassifierEnhancer.java:93)
at weka.classifiers.meta.AdaBoostM1.setOptions(AdaBoostM1.java:375)
I'm thinking that this might have something to do with me using the JAR in an OSGi bundle, but I'm not sure. Any ideas? Other than this issue, I'm able to train these classifiers just fine using the default options for them.
Thanks.

You can solve this problem by setting all parameters via setters.
The problem occurs when you set the options like this:
BayesNet processes = new BayesNet();
String options = "-D -Q weka.classifiers.bayes.net.search.local.K2 -- -P 1 -S BAYES -E weka.classifiers.bayes.net.estimate.SimpleEstimator -- -A 0.5";
processes.setOptions(weka.core.Utils.splitOptions(options));
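Instead, set the parameters like this (a small example, not all options are shown):
import weka.classifiers.bayes.BayesNet;
import weka.classifiers.bayes.net.estimate.SimpleEstimator;

BayesNet processes = new BayesNet();
// Equivalent of "-E ...SimpleEstimator -- -A 0.5", set directly instead of by class name
SimpleEstimator newBayesNetEstimator = new SimpleEstimator();
newBayesNetEstimator.setAlpha(0.5);
processes.setEstimator(newBayesNetEstimator);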

Cannot get External Node Classifier in puppet working - bash and java

I have the following simple script, which calls a Java program that writes YAML to the standard output stream and echoes it.
#!/bin/bash
echo `/usr/lib/jvm/jre/bin/java -jar /etc/puppet/enc/enc.jar $1`
I have the above script in the file /etc/puppet/enc/javaEnc.sh. When I execute it with a node name as the argument, I get the following output.
---
classes:
  class1:
  class2:
The problem is, on the agent node, I get the error message
err: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find node 'node-agent-1'; cannot compile
warning: Not using cache on failed catalog
err: Could not retrieve catalog; skipping run
I have found that the script does not execute (or rather, my Java program is not called, and I don't know why). In my Java program I write the output to a file in addition to doing a System.out.print.
I have another script that reads the file (data.yaml), which contains the same data as the output shown above, and writes it to the output stream:
#!/bin/bash
cat "/etc/puppet/enc/data.yaml"
When this script is set as external_nodes, it works fine and the puppet agent configures itself. Can I please get an idea of where I am going wrong? The Java program actually queries some external resources, classifies the classes and produces the output; it takes around 10 seconds to get this done. Could this be a problem? I have seen Ruby and Python solutions but couldn't get them to work either. I would most prefer to have it done with Java.
In my puppet.conf file I have the following.
[master]
node_terminus = exec
external_nodes = /etc/puppet/enc/javaEnc.sh
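For comparison, here is a minimal sketch of what an ENC program in Java could look like. The class name EncSketch and the hard-coded class list are hypothetical; the real program would classify the node from its external resources. It prints the YAML on standard output and exits with status 0. As far as I know, Puppet reports "Could not find node" when the ENC prints nothing or exits with a non-zero status, so both are worth ruling out.

```java
import java.util.List;

// Hypothetical minimal ENC: prints a YAML document listing the classes for a node.
final class EncSketch {

    /** Builds the YAML document for an ENC: a "classes" hash with one key per class. */
    static String buildYaml(List<String> classes) {
        StringBuilder sb = new StringBuilder("---\nclasses:\n");
        for (String c : classes) {
            sb.append("  ").append(c).append(":\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: enc <node-name>");
            System.exit(1); // a non-zero exit status signals failure to the caller
        }
        // Placeholder classification; a real ENC would look up args[0] here.
        List<String> classes = List.of("class1", "class2");
        System.out.print(buildYaml(classes));
        System.out.flush();
    }
}
```

Running the wrapper script as the same user the puppet master runs as, and checking its exit status with echo $?, is a quick way to see whether the Java program is really being started in that environment.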

tdbloader on Cygwin: Getting FileNotFoundException: d:\cygdrive\d\....\node2id.idn

I am completely new to Jena/TDB. All I want to do is load data from some sample RDF, N3 etc. files using the TDB scripts or through the Java API.
I tried to use tdbloader on Cygwin to load data (tdb-0.9.0, on Windows XP with IBM Java 1.6). These are the commands that I ran:
$ export TDBROOT=/cygdrive/d/Project/Store_DB/jena-tdb-0.9.0-incubating
$ export PATH=$TDBROOT/bin:$PATH
I also changed classpath for java in the tdbloader script as mentioned at tdbloader on Cygwin: java.lang.NoClassDefFoundError :
exec java $JVM_ARGS $SOCKS -cp "PATH_OF_JAR_FILES" "tdb.$TDB_CMD" $TDB_SPEC "$@"
So when I run $ tdbloader --help it shows the help correctly.
But when I run
$ tdbloader --loc /cygdrive/d/Project/Store_DB/data1
OR
$ tdbloader --loc /cygdrive/d/Project/Store_DB/data1 test.rdf
I am getting following exception:
com.hp.hpl.jena.tdb.base.file.FileException: Failed to open: d:\cygdrive\d\Project\Store_DB\data1\node2id.idn (mode=rw)
at com.hp.hpl.jena.tdb.base.file.ChannelManager.open$(ChannelManager.java:83)
at com.hp.hpl.jena.tdb.base.file.ChannelManager.openref$(ChannelManager.java:58)
at com.hp.hpl.jena.tdb.base.file.ChannelManager.acquire(ChannelManager.java:47)
at com.hp.hpl.jena.tdb.base.file.FileBase.<init>(FileBase.java:57)
at com.hp.hpl.jena.tdb.base.file.FileBase.<init>(FileBase.java:46)
at com.hp.hpl.jena.tdb.base.file.FileBase.create(FileBase.java:41)
at com.hp.hpl.jena.tdb.base.file.BlockAccessBase.<init>(BlockAccessBase.java:46)
at com.hp.hpl.jena.tdb.base.block.BlockMgrFactory.createStdFile(BlockMgrFactory.java:98)
at com.hp.hpl.jena.tdb.base.block.BlockMgrFactory.createFile(BlockMgrFactory.java:82)
at com.hp.hpl.jena.tdb.base.block.BlockMgrFactory.create(BlockMgrFactory.java:58)
at com.hp.hpl.jena.tdb.setup.Builder$BlockMgrBuilderStd.buildBlockMgr(Builder.java:196)
at com.hp.hpl.jena.tdb.setup.Builder$RangeIndexBuilderStd.createBPTree(Builder.java:165)
at com.hp.hpl.jena.tdb.setup.Builder$RangeIndexBuilderStd.buildRangeIndex(Builder.java:134)
at com.hp.hpl.jena.tdb.setup.Builder$IndexBuilderStd.buildIndex(Builder.java:112)
at com.hp.hpl.jena.tdb.setup.Builder$NodeTableBuilderStd.buildNodeTable(Builder.java:85)
at com.hp.hpl.jena.tdb.setup.DatasetBuilderStd$NodeTableBuilderRecorder.buildNodeTable(DatasetBuilderStd.java:389)
at com.hp.hpl.jena.tdb.setup.DatasetBuilderStd.makeNodeTable(DatasetBuilderStd.java:300)
at com.hp.hpl.jena.tdb.setup.DatasetBuilderStd._build(DatasetBuilderStd.java:167)
at com.hp.hpl.jena.tdb.setup.DatasetBuilderStd.build(DatasetBuilderStd.java:157)
at com.hp.hpl.jena.tdb.setup.DatasetBuilderStd.build(DatasetBuilderStd.java:70)
at com.hp.hpl.jena.tdb.StoreConnection.make(StoreConnection.java:132)
at com.hp.hpl.jena.tdb.transaction.DatasetGraphTransaction.<init>(DatasetGraphTransaction.java:46)
at com.hp.hpl.jena.tdb.sys.TDBMakerTxn._create(TDBMakerTxn.java:50)
at com.hp.hpl.jena.tdb.sys.TDBMakerTxn.createDatasetGraph(TDBMakerTxn.java:38)
at com.hp.hpl.jena.tdb.TDBFactory._createDatasetGraph(TDBFactory.java:166)
at com.hp.hpl.jena.tdb.TDBFactory.createDatasetGraph(TDBFactory.java:74)
at com.hp.hpl.jena.tdb.TDBFactory.createDataset(TDBFactory.java:53)
at tdb.cmdline.ModTDBDataset.createDataset(ModTDBDataset.java:95)
at arq.cmdline.ModDataset.getDataset(ModDataset.java:34)
at tdb.cmdline.CmdTDB.getDataset(CmdTDB.java:137)
at tdb.cmdline.CmdTDB.getDatasetGraph(CmdTDB.java:126)
at tdb.cmdline.CmdTDB.getDatasetGraphTDB(CmdTDB.java:131)
at tdb.tdbloader.loadQuads(tdbloader.java:163)
at tdb.tdbloader.exec(tdbloader.java:122)
at arq.cmdline.CmdMain.mainMethod(CmdMain.java:97)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:59)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:46)
at tdb.tdbloader.main(tdbloader.java:53)
Caused by: java.io.FileNotFoundException: d:\cygdrive\d\Project\Store_DB\data1\node2id.idn (The system cannot find the path specified.)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:222)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:107)
at com.hp.hpl.jena.tdb.base.file.ChannelManager.open$(ChannelManager.java:80)
... 37 more
I am not sure what the node2id.idn file is or why it is expected.

The file node2id.idn is one of TDB's internal index files. It's not something that you have to create or manage yourself. I've just tried tdbloader on Cygwin myself, and it worked OK for me. I can think of two basic possibilities:
your disk is full
the TDB index is corrupted
If this is the first file you are loading into an otherwise empty TDB, the second possibility is unlikely. If you are loading into a non-empty TDB, try deleting the TDB image and starting again. Note that TDB by itself does not manage concurrent writes: if you have more than one process writing to a single TDB image, you must handle locking at the application level, or use TDB's transactions.
The final possibility, of course, is that your disk is flaky. You might want to try your code on another machine.
If none of these suggestions help, please send a complete minimal test case to the Jena users list.
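One more thing that may be worth checking: the path in the exception starts with d:\cygdrive\d\..., which is what you would expect if a Windows JVM received the Cygwin-style path unchanged and resolved it against the current drive. Assuming that is what happens here, the sketch below shows the kind of conversion the --loc value would need before Java sees it. The class and method names are hypothetical and are not part of Jena or TDB.

```java
// Hypothetical helper: rewrites /cygdrive/<drive>/... paths into Windows form.
final class CygwinPaths {

    /** Converts /cygdrive/d/x/y to d:\x\y; any other path is returned unchanged. */
    static String toWindowsPath(String path) {
        String prefix = "/cygdrive/";
        if (path.startsWith(prefix) && path.length() > prefix.length()) {
            char drive = path.charAt(prefix.length());
            String rest = path.substring(prefix.length() + 1);
            if (Character.isLetter(drive) && (rest.isEmpty() || rest.startsWith("/"))) {
                // A bare drive such as /cygdrive/d maps to the drive root.
                return drive + ":" + (rest.isEmpty() ? "\\" : rest.replace('/', '\\'));
            }
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(toWindowsPath("/cygdrive/d/Project/Store_DB/data1"));
    }
}
```

On the command line, Cygwin's cygpath -w performs this conversion, so passing the location as d:/Project/Store_DB/data1 or via cygpath -w is a quick way to test the theory.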
