I'm trying to create a program to compare the amount of time it takes various Haskell scripts to run, which will later be used to create graphs and display them in a GUI. I've tried to create said GUI using Haskell libraries, but I haven't had much luck, especially since I'm having trouble finding up-to-date GUI libraries for Windows. I've tried to use Java to get these results, but I either get errors returned or simply no result.
I've constructed a minimal example to show roughly what I'm doing at the moment:
import java.io.*;
public class TestExec {
public static void main(String[] args) {
try {
Process p = Runtime.getRuntime().exec("ghc test.hs 2 2");
BufferedReader in = new BufferedReader(
new InputStreamReader(p.getInputStream()));
String line = null;
while ((line = in.readLine()) != null) {
System.out.println(line);
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
And here is the Haskell script this is calling, in this case a simple addition:
test x y = x + y
Currently there simply isn't any result stored or printed. Anyone have any ideas?
Since you're attempting to run this as an executable, you need to provide a main. In your case it should look something like
import System.Environment
test :: Integer -> Integer -> Integer
test = (+)
main = do
[x, y] <- map read `fmap` getArgs
print $ x `test` y
This just reads the command line arguments, adds them, then prints the result. Though I did something like this a while ago, it's much easier to do the benchmarking/testing in Haskell and dump the output data to a text file in a more structured format, then parse/display it in Java or whatever language you like.
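If you go that route, a minimal sketch of the Java side might look like the following (the file name results.csv and the one-pair-per-line "scriptName,milliseconds" format are assumptions for illustration, not something the Haskell code above produces):

import java.io.*;
import java.util.*;

public class ResultsParser {
    public static void main(String[] args) throws IOException {
        // Hypothetical format: one "scriptName,milliseconds" pair per line
        Map<String, Double> timings = new LinkedHashMap<String, Double>();
        BufferedReader in = new BufferedReader(new FileReader("results.csv"));
        String line;
        while ((line = in.readLine()) != null) {
            String[] parts = line.split(",");
            timings.put(parts[0], Double.parseDouble(parts[1]));
        }
        in.close();
        for (Map.Entry<String, Double> e : timings.entrySet()) {
            System.out.println(e.getKey() + " took " + e.getValue() + " ms");
        }
    }
}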
This is mostly a Java question. Search for Runtime.getRuntime().exec().
On the Haskell side, you need to write a stand-alone Haskell script. The one by @jozefg is OK. You should be able to run it as
runghc /path/to/script.hs 1 2
from the command line.
Calling it from Java is no different than running any other external process in Java. In Clojure (a JVM language, I use it for brevity) I do:
user=> (def p (-> (Runtime/getRuntime) (.exec "/usr/bin/runghc /tmp/test.hs 1 2")))
#'user/p
user=> (-> p .getInputStream input-stream reader line-seq)
("3")
Please note that I use runghc to run a script (not ghc). Full paths are not necessary, but could be helpful. Your Java program can be modified this way:
--- TestExec.question.java
+++ TestExec.java
@@ -2,7 +2,7 @@
public class TestExec {
public static void main(String[] args) {
try {
- Process p = Runtime.getRuntime().exec("ghc test.hs 2 2");
+ Process p = Runtime.getRuntime().exec("/usr/bin/runghc /tmp/test.hs 2 2");
BufferedReader in = new BufferedReader(
new InputStreamReader(p.getInputStream()));
String line = null;
The modified version runs the Haskell script just fine. You may have to change the paths to your runghc and test.hs locations.
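For completeness, here is a sketch of the whole modified program using ProcessBuilder, which also merges stderr into stdout so compiler/interpreter errors become visible (the runghc and script paths are assumptions you will need to adjust for your machine):

import java.io.*;

public class TestExec {
    public static void main(String[] args) {
        try {
            // Adjust these paths to your runghc and test.hs locations
            ProcessBuilder pb = new ProcessBuilder("/usr/bin/runghc", "/tmp/test.hs", "2", "2");
            pb.redirectErrorStream(true); // send stderr to the same stream as stdout
            Process p = pb.start();
            BufferedReader in = new BufferedReader(new InputStreamReader(p.getInputStream()));
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
            int exitCode = p.waitFor();
            System.out.println("runghc exited with code " + exitCode);
        } catch (IOException | InterruptedException e) {
            e.printStackTrace();
        }
    }
}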
First, to read from the output you need to use OutputStreamReader(p.getOutputStream()) instead of InputStreamReader.
As I said in a comment, such a benchmark is simply incorrect. While benchmarking, one should eliminate as many side costs as possible. The best solution is to use the criterion package. It produces nice graphical output, as you desire.
Small example:
import Criterion
import Criterion.Main
import Criterion.Config
fac 1 = 1
fac n = n * (fac $ n-1)
myConfig = defaultConfig {
cfgReport = ljust "report.html"
}
main = defaultMainWith myConfig (return ()) [
bench "fac 30" $ whnf fac 30
]
After execution it produces a file "report.html" with neat interactive plots.
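Since the original goal was to display the results in a GUI, one option on the Java side is simply to open the generated report in the default browser. A minimal sketch, assuming report.html has already been produced by the benchmark run above:

import java.awt.Desktop;
import java.io.File;

public class ShowReport {
    public static void main(String[] args) throws Exception {
        // Assumes the criterion benchmark has already run and written report.html here
        File report = new File("report.html");
        if (report.exists() && Desktop.isDesktopSupported()) {
            Desktop.getDesktop().browse(report.toURI()); // open the plots in the default browser
        } else {
            System.err.println("report.html not found or Desktop API unavailable");
        }
    }
}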
Related
I've just started looking at ABCL to mix some Lisp into Java. For now, loading some Lisp from a file will be sufficient, and I've been looking at the examples. In every case, the pattern is:
Interpreter interpreter = Interpreter.createInstance();
interpreter.eval("(load \"lispfunctions.lisp\")");
But say I'm building a Maven project with a view to packaging as a JAR: how can I load lispfunctions.lisp from src/main/resources? I can easily get an InputStream—can I go somewhere with that? Or is there another idiom I'm missing here for loading Lisp source from a resource like this?
I've gotten the following to work. I am working with ABCL 1.7.0 on MacOS, although I'm pretty sure this isn't version-specific.
/* load_lisp_within_jar.java -- use ABCL to load Lisp file as resource in jar
* copyright 2020 by Robert Dodier
* I release this work under terms of the GNU General Public License
*/
/* To run this example:
$ javac -cp /path/to/abcl.jar -d . load_lisp_within_jar.java
$ cat << EOF > foo.lisp
(defun f (x) (1+ x))
EOF
$ jar cvf load_lisp_within_jar.jar load_lisp_within_jar.class foo.lisp
$ java -cp load_lisp_within_jar.jar:/path/to/abcl.jar load_lisp_within_jar
*
* Expected output:
(F 100) => 101
*/
import org.armedbear.lisp.*;
import java.io.*;
public class load_lisp_within_jar {
public static void main (String [] args) {
try {
// It appears that interpreter instance is required even though
// it isn't used directly; I guess it arranges global resources.
Interpreter I = Interpreter.createInstance ();
LispObject LOAD_function = Symbol.LOAD.getSymbolFunction ();
// Obtain an input stream for Lisp source code in jar.
ClassLoader L = load_lisp_within_jar.class.getClassLoader ();
InputStream f = L.getResourceAsStream ("foo.lisp");
Stream S = new Stream (Symbol.SYSTEM_STREAM, f, Symbol.CHARACTER);
// Call COMMON-LISP:LOAD with input stream as argument.
LOAD_function.execute (S);
// Verify that function F has been defined.
Symbol F = Packages.findPackage ("COMMON-LISP-USER").findAccessibleSymbol ("F");
LispObject F_function = F.getSymbolFunction ();
LispObject x = F_function.execute (LispInteger.getInstance (100));
System.out.println ("(F 100) => " + x.javaInstance ());
}
catch (Exception e) {
System.err.println ("oops: " + e);
e.printStackTrace ();
}
}
}
As you can see, the program first gets the function associated with the symbol LOAD. (For convenience, many, maybe all of COMMON-LISP symbols have static definitions, so you can just say Symbol.LOAD instead of looking up the symbol via findAccessibleSymbol.) Then the input stream is supplied to the load function. Afterwards we verify that our function F is indeed defined.
I know this stuff can be kind of obscure; I'll be happy to try to answer any questions.
Hello, I am new to using MOA and WEKA.
I need to test the paired learners concept using this code. I have been able to locate the code, but I cannot find any example online,
and I am having a hard time figuring out how to pass my data into the code, run a test, and see my results.
Please, can anyone point me in the right direction or give me a few pointers that I could follow to implement this?
moa/moa/src/main/java/moa/classifiers/meta/PairedLearners.java
I'm trying to use similar code to this:
https://groups.google.com/forum/#!topic/moa-development/3IKcguR2kOk
Best Regards.
//Sample code below
import moa.classifiers.meta.pairedLearner;
Public class SamplePairedlearner{
public static void main(String[] args) {
FileStream fStream = new FileStream();
fStream.arffFileOption.setValue("test.arff");// set the ARFF file name
fStream.normalizeOption.setValue(false);// set normalized to be true or false
fStream.prepareForUse();
int numLines = 0;
PairedLearner learners = PairedLearners();
learners.resetLearning();
learners.resetLearningImpl(); //this is where i get an error message
ClusteringStream stream = fStream;
while (stream.hasMoreInstances()) {
Instance curr = stream.nextInstance().getData();
learners.trainOnInstanceImpl(curr)//this line also generates an error
numLines++;
}
Clustering resDstream = dstream.getClusteringResult();
dstream.getMicroClusteringResult();
System.out.println("Size of result from Dstream: " + resDstream.size());
System.out.println(numLines + " lines have been read");
}
}
I could fix the code that you have there, but it wouldn't do you much good. MOA has its own selection of tasks and evaluators for running these experiments at a much higher level. This is how to run evaluations properly without diving too deeply into the code. I'll assume a few things:
We use PairedLearners as our classifier.
We evaluate stream classification performance.
We evaluate in predictive sequential (prequential) fashion, i.e. train, then test on each example in the sequence.
Therefore, we can define our task quite simply, as follows.
import moa.classifiers.meta.PairedLearners;
import moa.evaluation.BasicClassificationPerformanceEvaluator;
import moa.evaluation.LearningCurve;
import moa.streams.ArffFileStream;
import moa.tasks.EvaluatePrequential;
public class PairedLearnersExample {
public static void main(String[] args) {
ArffFileStream fs = new ArffFileStream(PairedLearnersExample.class.getResource("abalone.arff").getFile(), -1);
fs.prepareForUse();
PairedLearners learners = new PairedLearners();
BasicClassificationPerformanceEvaluator evaluator = new BasicClassificationPerformanceEvaluator();
EvaluatePrequential task = new EvaluatePrequential();
task.learnerOption.setCurrentObject(learners);
task.streamOption.setCurrentObject(fs);
task.evaluatorOption.setCurrentObject(evaluator);
task.prepareForUse();
LearningCurve le = (LearningCurve) task.doTask();
System.out.println(le);
}
}
If you want to do other tasks, you can quite happily swap out the evaluator, stream and learner to achieve whatever it is you want to do.
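For instance, a sketch of swapping the evaluator for a windowed one (assuming moa.evaluation.WindowClassificationPerformanceEvaluator is available in your MOA version) only changes a couple of lines:

// Instead of BasicClassificationPerformanceEvaluator:
WindowClassificationPerformanceEvaluator windowEvaluator = new WindowClassificationPerformanceEvaluator();
task.evaluatorOption.setCurrentObject(windowEvaluator);
// The rest of the task setup and the task.doTask() call stay the same.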
If you refer to the MOA Manual you'll see that all I'm doing is imitating the command line commands - you could equally perform this evaluation at the command line if you wished.
For example,
java -cp .:moa.jar:weka.jar -javaagent:sizeofag.jar moa.DoTask \
"EvaluatePrequential -l PairedLearners \
-e BasicClassificationPerformanceEvaluator \
-s (ArffFileStream -f abalone.arff) \
-i 100000000 -f 1000000" > plresult_abalone.csv
I want to do something like this:
Python Code:
nums = [1,2,3]
Java Code:
nums_Java[] = nums //from python
System.out.println(nums_Java[0])
Output:
1
I have been looking over Jython, but I just can't seem to find the answer. It seems like it should be very simple, but I'm lost. Thanks!
If I understand the question correctly, you'd like to run some embedded Python code from a Java program and get the value of a Python variable.
Based on http://www.jython.org/archive/21/docs/embedding.html , I wrote a small program that might help:
import org.python.util.PythonInterpreter;
import org.python.core.*;
public class SimpleEmbedded {
public static void main(String[] args) throws PyException {
PythonInterpreter interp = new PythonInterpreter();
interp.exec("nums = [1,2,3]");
PyObject nums = interp.get("nums");
System.out.println("nums: " + nums);
System.out.println("nums is of type: " + nums.getClass());
}
}
Unfortunately, I don't have Jython installed at the moment, so the above code is untested. Also, I'm not sure what type you will get back from the interpreter, or how to convert it to a Java array or access its items. But the program should get you started and give you some more information.
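As a rough, equally untested sketch, you could either ask the interpreter to coerce the Python list into a Java array for you, or index into the returned PyObject directly:

// Ask Jython to convert the Python list to a Java int[] (sketch, untested)
int[] numsJava = (int[]) interp.get("nums", int[].class);
System.out.println(numsJava[0]); // should print 1

// Alternatively, index into the PyObject directly
PyObject first = nums.__getitem__(0);
System.out.println(first.asInt()); // should print 1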
I have solved, in various ways, a simple problem on CodeEval, whose specification can be found here (only a few lines long).
I have made 3 working versions (one of them in Scala) and I don't understand the difference in performance for my last Java version, which I expected to be the best both time- and memory-wise.
I also compared this to code found on GitHub. Here are the performance stats returned by CodeEval:
Version 1 is the version found on GitHub.
Version 2 is my Scala solution:
import java.util.regex.Pattern
import scala.collection.mutable.TreeSet
object Main extends App {
val p = Pattern.compile("\\d+")
scala.io.Source.fromFile(args(0)).getLines
.filter(!_.isEmpty)
.map(line => {
val dists = new TreeSet[Int]
val m = p.matcher(line)
while (m.find) dists += m.group.toInt
val list = dists.toList
list.zip(0 +: list).map { case (x,y) => x - y }.mkString(",")
})
.foreach(println)
}
Version 3 is my Java solution, which I expected to be the best:
import java.io.*;
import java.util.*;
import java.util.regex.*;
public class Main {
public static void main(String[] args) throws IOException {
Pattern p = Pattern.compile("\\d+");
File file = new File(args[0]);
BufferedReader br = new BufferedReader(new FileReader(file));
String line;
while ((line = br.readLine()) != null) {
Set<Integer> dists = new TreeSet<Integer>();
Matcher m = p.matcher(line);
while (m.find()) dists.add(Integer.parseInt(m.group()));
Iterator<Integer> it = dists.iterator();
int prev = 0;
StringBuilder sb = new StringBuilder();
while (it.hasNext()) {
int curr = it.next();
sb.append(curr - prev);
sb.append(it.hasNext() ? "," : "");
prev = curr;
}
System.out.println(sb);
}
br.close();
}
}
Version 4 is the same as version 3, except I don't use a StringBuilder to print the output and instead print directly, like in version 1.
Here is how I interpreted those results:
version 1 is too slow because of the high number of System.out.print calls. Moreover, using split on very large lines (which is the case in the tests performed) uses a lot of memory.
version 2 seems slow too, but it is mainly because of an "overhead" on running Scala code on CodeEval; even very efficient code runs slowly on it
version 2 uses unnecessary memory to build a list from the set, which also takes some time but should not be too significant. Writing more efficient Scala would probably look like writing it in Java, so I preferred elegance over performance
version 3 should not use that much memory, in my opinion. The use of a StringBuilder has the same impact on memory as calling mkString in version 2
version 4 proves the calls to System.out.println are slowing down the program
Does anyone see an explanation for those results?
I conducted some tests.
There is a baseline for every type of language. I code in Java and JavaScript. For JavaScript, here are my test results:
Rev 1: Default empty boilerplate for JS with a message to standard output
Rev 2: Same without file reading
Rev 3: Just a message to the standard output
You can see that no matter what, there will be at least 200 ms of runtime and about 5 MB of memory usage. This baseline depends on the load of the servers as well! There was a time when CodeEval was heavily overloaded, which made it impossible to run anything within the max time (10 s).
Check this out, a totally different challenge from the previous one:
Rev4: My solution
Rev5: The same code submitted again now. Scored 8000 more ranking points. :D
Conclusion: I would not worry too much about CPU and memory usage and rank. It is clearly not reliable.
Your Scala solution is slow, not because of "overhead on CodeEval", but because you are building an immutable TreeSet, adding elements to it one by one. Replacing it with something like
val regex = """\d+""".r // in the beginning, instead of your Pattern.compile
...
.map { line =>
val dists = regex.findAllIn(line).map(_.toInt).toIndexedSeq.sorted
...
Should shave about 30-40% off your execution time.
The same approach (build a list, then sort) will probably help your memory utilization in "version 3" (Java sets are real memory hogs). It is also a good idea to give your list an initial size while you are at it (otherwise, it'll grow by 50% every time it runs out of capacity, which is wasteful in both memory and performance). 600 sounds like a good number, since that's the upper bound for the number of cities from the problem description.
Now, since we know the upper boundary, an even faster and slimmer approach is to do away with lists and boxed Integers, and just do int dists[] = new int[600];.
If you wanted to get really fancy, you'd also make use of the "route length" range that's mentioned in the description. For example, instead of throwing ints into an array and sorting (or keeping a TreeSet), make an array of 20,000 bits (or even 20K bytes for speed), and set those that you see in the input as you read it ... That would be both faster and more memory efficient than any of your solutions.
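A sketch of that idea, assuming (per the problem statement) that distances are non-negative integers bounded by 20,000 and that each distance occurs at most once per line:

import java.util.regex.*;

public class GapsByMarking {
    // Marks which distances occur on a line, then emits the gaps between consecutive ones,
    // mirroring the output logic of "version 3" without a TreeSet or boxing.
    static String gaps(String line) {
        boolean[] seen = new boolean[20001]; // distances 0..20000
        Matcher m = Pattern.compile("\\d+").matcher(line);
        while (m.find()) seen[Integer.parseInt(m.group())] = true;
        StringBuilder sb = new StringBuilder();
        int prev = 0;
        boolean first = true;
        for (int d = 0; d <= 20000; d++) {
            if (seen[d]) {
                if (!first) sb.append(',');
                sb.append(d - prev);
                prev = d;
                first = false;
            }
        }
        return sb.toString();
    }
}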
I tried solving this question and figured that you don't need the names of the cities, just the distances in a sorted array.
It has a much better runtime of 738 ms and memory usage of 4513792 with this approach.
Although this may not help improve your piece of code, it seems like a better way to approach the question. Any suggestions to improve the code further are welcome.
import java.io.*;
import java.util.*;
public class Main {
public static void main (String[] args) throws IOException {
File file = new File(args[0]);
BufferedReader buffer = new BufferedReader(new FileReader(file));
String line;
while ((line = buffer.readLine()) != null) {
line = line.trim();
String out = new Main().getDistances(line);
System.out.println(out);
}
}
public String getDistances(String s){
//split the string
String[] arr = s.split(";");
//create an array to hold the distances as integers
int[] distances = new int[arr.length];
for(int i=0; i<arr.length; i++){
//find the index of , - get the characters after that - convert to integer - add to distances array
distances[i] = Integer.parseInt(arr[i].substring(arr[i].lastIndexOf(",")+1));
}
//sort the array
Arrays.sort(distances);
String output = "";
output += distances[0]; //append the distance to the closest city to the string
for(int i=0; i<arr.length-1; i++){
//get distance between current element(city) and next
int distance_between = distances[i+1] - distances[i];
//append the distance to the string
output += "," + distance_between;
}
return output;
}
}
I'm trying to create an "automated trainning" using weka's java api but I guess I'm doing something wrong, whenever I test my ARFF file via weka's interface using MultiLayerPerceptron with 10 Cross Validation or 66% Percentage Split I get some satisfactory results (around 90%), but when I try to test the same file via weka's API every test returns basically a 0% match (every row returns false)
Here's the output from Weka's GUI:
=== Evaluation on test split ===
=== Summary ===
Correctly Classified Instances 78 91.7647 %
Incorrectly Classified Instances 7 8.2353 %
Kappa statistic 0.8081
Mean absolute error 0.0817
Root mean squared error 0.24
Relative absolute error 17.742 %
Root relative squared error 51.0603 %
Total Number of Instances 85
=== Detailed Accuracy By Class ===
TP Rate FP Rate Precision Recall F-Measure ROC Area Class
0.885 0.068 0.852 0.885 0.868 0.958 1
0.932 0.115 0.948 0.932 0.94 0.958 0
Weighted Avg. 0.918 0.101 0.919 0.918 0.918 0.958
=== Confusion Matrix ===
a b <-- classified as
23 3 | a = 1
4 55 | b = 0
And here's the code I've been using in Java (actually it's .NET using IKVM):
var classifier = new weka.classifiers.functions.MultilayerPerceptron();
classifier.setOptions(weka.core.Utils.splitOptions("-L 0.7 -M 0.3 -N 75 -V 0 -S 0 -E 20 -H a")); //these are the same options (the default options) when the test is run under weka gui
string trainingFile = Properties.Settings.Default.WekaTrainingFile; //the path to the same file I use to test on weka explorer
weka.core.Instances data = null;
data = new weka.core.Instances(new java.io.BufferedReader(new java.io.FileReader(trainingFile))); //loads the file
data.setClassIndex(data.numAttributes() - 1); //set the last column as the class attribute
classifier.buildClassifier(data);
var tmp = System.IO.Path.GetTempFileName(); //creates a temp file to create an arff file with a single row with the instance I want to test taken from the arff file loaded previously
using (var f = System.IO.File.CreateText(tmp))
{
//long code to read data from db and regenerate the line, simulating data coming from the source I really want to test
}
var dataToTest = new weka.core.Instances(new java.io.BufferedReader(new java.io.FileReader(tmp)));
dataToTest.setClassIndex(dataToTest.numAttributes() - 1);
double prediction = 0;
for (int i = 0; i < dataToTest.numInstances(); i++)
{
weka.core.Instance curr = dataToTest.instance(i);
weka.core.Instance inst = new weka.core.Instance(data.numAttributes());
inst.setDataset(data);
for (int n = 0; n < data.numAttributes(); n++)
{
weka.core.Attribute att = dataToTest.attribute(data.attribute(n).name());
if (att != null)
{
if (att.isNominal())
{
if ((data.attribute(n).numValues() > 0) && (att.numValues() > 0))
{
String label = curr.stringValue(att);
int index = data.attribute(n).indexOfValue(label);
if (index != -1)
inst.setValue(n, index);
}
}
else if (att.isNumeric())
{
inst.setValue(n, curr.value(att));
}
else
{
throw new InvalidOperationException("Unhandled attribute type!");
}
}
}
prediction += classifier.classifyInstance(inst);
}
//prediction is always 0 here, my ARFF file has two classes: 0 and 1, 92 zeroes and 159 ones
It's funny because if I change the classifier to, let's say, NaiveBayes, the results match the test made via Weka's GUI.
You are using a deprecated way of reading in ARFF files. See this documentation. Try this instead:
import weka.core.converters.ConverterUtils.DataSource;
...
DataSource source = new DataSource("/some/where/data.arff");
Instances data = source.getDataSet();
Note that the documentation also shows how to connect to a database directly and bypass the creation of temporary ARFF files. You could, additionally, read from the database and manually create instances to populate the Instances object with.
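As a rough sketch using the same 3.6-era API your code already uses (the attribute indices and values below are placeholders, not taken from your dataset):

// Assumes "data" was loaded via DataSource as above and has its class index set
weka.core.Instance inst = new weka.core.Instance(data.numAttributes());
inst.setDataset(data); // ties the row to the dataset's header/attributes
inst.setValue(data.attribute(0), 1.5); // numeric attribute (placeholder value)
inst.setValue(data.attribute(1), "someNominalLabel"); // nominal attribute (placeholder label)
data.add(inst); // append the row to the dataset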
Finally, if simply changing the classifier type at the top of the code to NaiveBayes solved the problem, then check the options in your weka gui for MultilayerPerceptron, to see if they are different from the defaults (different settings can cause the same classifier type to produce different results).
Update: it looks like you're using different test data in your code than in your Weka GUI (from a database vs. a fold of the original training file); it might also be the case that the particular data in your database actually does look like class 0 to the MLP classifier. To verify whether this is the case, you can use the Weka interface to split your training ARFF into train/test sets and then repeat the original experiment in your code. If the results are the same as the GUI's, there's a problem with your data. If the results are different, then we need to look more closely at the code. The function you would call is this (from the docs):
public Instances trainCV(int numFolds, int numFold)
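A minimal sketch of that split (the fold count and random seed are arbitrary choices here, and source is the DataSource from the snippet above):

Instances all = source.getDataSet(); // loaded as shown earlier
all.setClassIndex(all.numAttributes() - 1);
all.randomize(new java.util.Random(1)); // shuffle first, like the Explorer does
all.stratify(10); // optional: keep class proportions similar across folds
Instances train = all.trainCV(10, 0); // everything except fold 0
Instances test = all.testCV(10, 0); // fold 0 held out for testing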
I had the same problem.
Weka gave me different results in the Explorer compared to a cross-validation in Java.
Something that helped:
Instances dataSet = ...;
dataSet.stratify(numOfFolds); // use this before splitting the dataset into train and test sets!