I am trying to print an R data frame in Java, but I am not sure which data type to use for it.
I get a null value if I use String or String[]. This is my code:
public static void main(String a[]) {
// Create an R vector in the form of a string.
String javaVector = "c(1,2,3,4,5)";
String name="c('a','b','c','d','e')";
// Start Rengine.
Rengine engine = new Rengine(new String[] { "--no-save" }, false, null);
// The vector that was created in JAVA context is stored in 'rVector' & 'names' which is a variable in R context.
engine.eval("rVector=" + javaVector);
engine.eval("names="+name);
//Calculate MEAN of vector using R syntax.
engine.eval("meanVal=mean(rVector)");
//Making a data frame in R.
engine.eval("data=data.frame(rVector,names)");
engine.eval("initial=paste('aakash','singhal')");
String myname = engine.eval("initial").asString();
//TRYING TO PRINT THE DATA FRAME (NOT SURE OF THE DATA TYPE)
REXP datafinal = engine.eval("data");
//RETRIEVE MEAN VALUE
double mean = engine.eval("meanVal").asDouble();
//Print output values
System.out.println("Mean of given vector is=" + mean);
System.out.println("Data table is :" + datafinal);
System.out.println(myname);
}
It prints:
Mean of given vector is=3.0
Data table is :[VECTOR ([REAL* (1.0, 2.0, 3.0, 4.0, 5.0)], [FACTOR {levels=("a","b","c","d","e"),ids=(0,1,2,3,4)}])]
aakash singhal
Edit: I used REXP, so I got this new result, but it is still not a table.
I'm curious whether it is possible to pass a Java object (of any kind: a .java file, a class, or a jar) to REngine. So far I have successfully executed operations from Java to R and vice versa. For example, I have custom jar files that I use in RStudio, and I would like to have the same ability from Java code as well.
The above code is from RStudio.
The code below is from Java:
String javaVector="c(1,2,3,4,5)";
Rengine rengine = new Rengine(new String[]{"--no-save"}, false, null);
rengine.eval("rVector <-"+javaVector);
rengine.eval("meanVal=mean(rVector)");
double mean = rengine.eval("meanVal").asDouble();
REXP rexp = rengine.eval("meanVal");
System.out.println("Mean of given vector is <-"+mean);
rengine.eval(String.format("greeting <- '%s'", "Hello R World"));
REXP result = rengine.eval("greeting");
System.out.println("Greeting from R: "+result.asString());
I will answer my own question, in case someone is interested. To pass a Java object from Java code to an R script, you can do the following. First pick the object you want to work with, for example:
public class RAccess{
static public Object getObject(String id){
return test;
}
static TestClass test = new TestClass();
}
public class TestClass{
String message;
public void setMessage(String value){
message = value;
}
}
Then evaluate the R script exactly as you would in the R console or RStudio; just wrap it in curly braces.
REXP x = re.eval(rCode3);
System.out.println(RAccess.test.message);
static String rCode3 =
"{ \n" +
"library(rJava) \n" +
".jinit() \n" +
"obj <- .jcall(\"jriTest/RAccess\", \"Ljava/lang/Object;\", \"getObject\", \"id\") \n" +
".jcall(obj, \"V\", \"setMessage\", \"hello from R\") \n" +
"}";
Here jriTest is the package name.
First of all thanks for your help in advance.
I'm writing an investment algorithm and am currently pre-processing CSV historical data. The end goal for this part of the process is to create a symmetrical co-variance matrix of 2k x 2k / 2 (2 million) entries.
The Java class I'm writing takes a folder of CSVs, each with 8 pieces of information, the key ones being Date, Time and Opening stock price. Date and time have been combined into one 'seconds from delta' time measure, and opening stock prices remain unchanged. The output CSV contains these two pieces of information along with a filename index for later referencing.
In order to create the co-variance matrix, each stock on the NYSE must have a price value for every time; if values are missing, the matrix cannot be properly completed. Due to discrepancies between time entries in the historical training CSV, I have to use a polynomial function to estimate missed values, which can then be fed into the next process in the chain.
My problem sounds fairly simple and should be easy to overcome (I'm probably being a massive idiot). The polynomial package I'm using takes in two arrays of doubles (double[] x, double[] y), with x being an array of the 'seconds past delta' time values of a particular stock and y the corresponding prices. When I try to feed these in I get a type error, because what I'm actually trying to input are 'java.lang.Double' objects. Can anyone help me with converting an array of the latter to an array of the former?
I realise there is a load of ridiculousness after the main print statement; this is just me tinkering, trying to miraculously change the type.
Again thanks for your time, I look forward to your replies!
Please find the relevant method below:
public void main(String filePath) throws IOException {
String index = filePath;
index = index.replace("/Users/louislimon/Desktop/Invest Algorithm/Data/Samples US Stock Data/data-1/5 min/us/nyse stocks/1/", "");
index = index.replace(".us.txt", "");
File fout = new File("/Users/louislimon/Desktop/Invest Algorithm/Data.csv");
FileOutputStream fos = new FileOutputStream(fout);
BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(fos));
Reader in = new FileReader(filePath);
Iterable<CSVRecord> records;
try {
records = CSVFormat.EXCEL.withSkipHeaderRecord(true).parse(in);
} catch ( IOException ex ) {
System.out.println ( "[ERROR] " + ex );
return;
}
ZoneId zoneId = ZoneId.of("America/New_York");
boolean tmp = true;
Instant firstInstant = null; // Track the baseline against which we calculate the increasing time
ArrayList<Double> timeVals = new ArrayList<Double>();
ArrayList<Double> priceVals = new ArrayList<Double>();
for ( CSVRecord record : records ) {
if(tmp){
tmp = false;
}
else {
//System.out.println(record.toString());
String dateInput = record.get(0);
String timeInput = record.get(1);
Double price = Double.parseDouble(record.get(2));
LocalDate date = LocalDate.parse(dateInput);
LocalTime time = LocalTime.parse(timeInput);
//Double price = Double.parseDouble(priceInput);
LocalDateTime ldt = LocalDateTime.of(date, time);
ZonedDateTime zdt = ldt.atZone(zoneId);
Instant instant = zdt.toInstant(); // Use Instant (moment on the timeline in UTC) for data storage, exchange, serialization, database, etc.
if (null == firstInstant) {
firstInstant = instant; // Capture the first instant.
}
Duration duration = Duration.between(firstInstant, instant);
Long deltaInSeconds = duration.getSeconds();
double doubleDeltaInSeconds = deltaInSeconds.doubleValue();
timeVals.add(doubleDeltaInSeconds);
priceVals.add(price);
//System.out.println("deltaInSeconds: " + deltaInSeconds + " | price: " + price + " | index: " + index);
}
Double [] timeValsArray = timeVals.toArray(new Double[timeVals.size()]);
Double [] priceValsArray = timeVals.toArray(new Double[priceVals.size()]);
Double[] timeFeed = new Double[timeVals.size()];
Double[] priceFeed = new Double[priceVals.size()];
for(int x = 0;x<timeVals.size(); x++) {
timeFeed[x] = new Double (timeValsArray[x].doubleValue());
priceFeed[x] = new Double (priceValsArray[x]);
}
PolynomialFunctionLagrangeForm pflf = new PolynomialFunctionLagrangeForm(timeFeed,priceFeed);
}
According to the documentation, the PolynomialFunctionLagrangeForm constructor takes two double[] arrays, not Double[].
Hence you need to create primitive arrays and pass those:
...
double[] timeFeed = new double[timeVals.size()];
double[] priceFeed = new double[priceVals.size()];
for(int x = 0; x < timeVals.size(); x++) {
timeFeed[x] = timeValsArray[x].doubleValue();
priceFeed[x] = priceValsArray[x].doubleValue();
}
...
See also "How to convert an ArrayList containing Integers to primitive int array?" for some alternative ways to convert an ArrayList<T> (where T is a wrapper for a primitive type) to the corresponding primitive array.
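One such alternative can be sketched with the Java 8 stream API, which unboxes a List<Double> straight into a double[] without the intermediate Double[] array. The class and method names here (PrimitiveArrays, toPrimitive) are made up for illustration and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the asker's code.
class PrimitiveArrays {
    // Unboxes every element of the list into a new double[].
    // Throws NullPointerException if the list contains a null element.
    static double[] toPrimitive(List<Double> boxed) {
        return boxed.stream().mapToDouble(Double::doubleValue).toArray();
    }

    public static void main(String[] args) {
        List<Double> timeVals = new ArrayList<>();
        timeVals.add(0.0);
        timeVals.add(300.0);
        timeVals.add(600.0);

        double[] timeFeed = toPrimitive(timeVals);
        System.out.println(Arrays.toString(timeFeed)); // prints [0.0, 300.0, 600.0]
    }
}
```

The resulting timeFeed and an equivalent priceFeed could then be passed to the constructor in place of the Double[] arrays.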
Note that there is also obviously a typo in your code:
Double [] priceValsArray = timeVals.toArray(new Double[priceVals.size()]);
needs to be
Double [] priceValsArray = priceVals.toArray(new Double[priceVals.size()]);
I am using Rserve to connect with R from within Java. I have a problem with using a library function in R only when I am accessing it through Java. Here are the details:
In Java I have four float arrays. These are used as input for the SpectrumSimilarity function in the OrgMassSpecR package of R. To provide these float arrays as input using Rserve, I first have to convert them to string arrays. Here is the code:
String[] consensusIMzString = new String[consensusIMz.length];
String[] consensusIIntString = new String[consensusIInt.length];
String[] referenceJMzString = new String[referenceJMz.length];
String[] referenceJIntString = new String[referenceJInt.length];
System.out.println("Filename 1: " + fileNameOfI + "Filename 2: " + fileNameOfJ);
for(int i = 0; i < consensusIMz.length;i++)
{
consensusIMzString[i] = Float.toString(consensusIMz[i]);
consensusIIntString[i] = Float.toString(consensusIInt[i]);
}
for(int j = 0; j < referenceJMz.length; j++)
{
referenceJMzString[j] = Float.toString(referenceJMz[j]);
referenceJIntString[j] = Float.toString(referenceJInt[j]);
}
try {
RConnection rc = new RConnection();
rc.assign("generateSimilarityScore", currentDirPath.concat("/generateSimilarityScore.R"));
rc.eval("source(generateSimilarityScore)");
rc.assign("referenceJMzString", referenceJMzString);
rc.assign("referenceJIntString", referenceJIntString);
rc.assign("consensusIMzString",consensusIMzString);
rc.assign("consensusIIntString", consensusIIntString);
rc.assign("commonMassWindowThreshold", Float.toString(commonMassWindowThreshold));
REXP distanceSimilarityValue;
distanceSimilarityValue = rc.eval("generateSimilarityScore(referenceJMzString,referenceJIntString,consensusIMzString,consensusIIntString,commonMassWindowThreshold)");
System.out.println("***" + distanceSimilarityValue);
distance = Float.parseFloat(distanceSimilarityValue.asString());
System.out.println("Distance value: " + distance);
} catch (RserveException e) {
e.printStackTrace();
} catch (REngineException e) {
e.printStackTrace();
} catch (REXPMismatchException e) {
e.printStackTrace();
}
Here is the R function generateSimilarityScore which takes these values and calls the SpectrumSimilarity function. This function should return a single float value.
## define generateSimilarityScore function
generateSimilarityScore<-function(experimentalSpectrumMz, experimentalSpectrumInt, referenceSpectrumMz, referenceSpectrumInt, commonMassThreshold)
{
library(OrgMassSpecR)
# Convert experimentalSpectrumMz to numeric dataframe
experimentalSpectrumMz <- as.data.frame(sapply(experimentalSpectrumMz, as.numeric))
# Convert experimentalSpectrumInt to numeric dataframe
experimentalSpectrumInt <- as.data.frame(sapply(experimentalSpectrumInt, as.numeric))
# Merge experimentalSpectrumMz and experimentalSpectrumInt columnwise in a single data frame
experimentalSpectrum <- cbind(experimentalSpectrumMz, experimentalSpectrumInt)
experimentalSpectrum <- as.data.frame(experimentalSpectrum)
# Convert referenceSpectrumMz to numeric dataframe
referenceSpectrumMz <- as.data.frame(sapply(referenceSpectrumMz, as.numeric))
# Convert referenceSpectrumInt to numeric dataframe
referenceSpectrumInt <- as.data.frame(sapply(referenceSpectrumInt, as.numeric))
# Merge referenceSpectrumMz and referenceSpectrumInt columnwise in a single data frame
referenceSpectrum <- cbind(referenceSpectrumMz, referenceSpectrumInt)
referenceSpectrum <- as.data.frame(referenceSpectrum)
# Covert commonMassThreshold as numeric
commonMassThreshold <- as.numeric(commonMassThreshold)
# Call the SpectrumSimilarity function which should store a numeric value in similarityScoreValue
similarityScoreValue <- SpectrumSimilarity(experimentalSpectrum, referenceSpectrum, t = commonMassThreshold, b=1, top.label = "df1", bottom.label = "df2")
return(similarityScoreValue)
}
When called directly in R, the SpectrumSimilarity method prints a table with the results and a single distance value on the console. When called through Java, however, no distance value is generated or returned (although a table is displayed in the console, meaning that the function is running). Can someone help me find out why a distance value is not returned? I am completely stuck here.
Try to save the R function in a .R file and call it from Java using R's source() function. In your code, there is no way for Rserve to know the details of your function generateSimilarityScore().
I want to create a WEKA Java program that reads newly created data and feeds it to a premade model built in the GUI version.
Here is the program:
import java.util.ArrayList;
import weka.classifiers.Classifier;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instances;
import weka.core.Utils;
public class UseModelWithData {
public static void main(String[] args) throws Exception {
// load model
String rootPath = "G:/";
Classifier classifier = (Classifier) weka.core.SerializationHelper.read(rootPath+"j48.model");
// create instances
Attribute attr1 = new Attribute("age");
Attribute attr2 = new Attribute("menopause");
Attribute attr3 = new Attribute("tumor-size");
Attribute attr4 = new Attribute("inv-nodes");
Attribute attr5 = new Attribute("node-caps");
Attribute attr6 = new Attribute("deg-malig");
Attribute attr7 = new Attribute("breast");
Attribute attr8 = new Attribute("breast-quad");
Attribute attr9 = new Attribute("irradiat");
Attribute attr10 = new Attribute("Class");
ArrayList<Attribute> attributes = new ArrayList<Attribute>();
attributes.add(attr1);
attributes.add(attr2);
attributes.add(attr3);
attributes.add(attr4);
attributes.add(attr5);
attributes.add(attr6);
attributes.add(attr7);
attributes.add(attr8);
attributes.add(attr9);
attributes.add(attr10);
// predict instance class values
Instances testing = new Instances("Test dataset", attributes, 0);
// add data
double[] values = new double[testing.numAttributes()];
values[0] = testing.attribute(0).addStringValue("60-69");
values[1] = testing.attribute(1).addStringValue("ge40");
values[2] = testing.attribute(2).addStringValue("10-14");
values[3] = testing.attribute(3).addStringValue("15-17");
values[4] = testing.attribute(4).addStringValue("yes");
values[5] = testing.attribute(5).addStringValue("2");
values[6] = testing.attribute(6).addStringValue("right");
values[7] = testing.attribute(7).addStringValue("right_up");
values[8] = testing.attribute(0).addStringValue("yes");
values[9] = Utils.missingValue();
// add data to instance
testing.add(new DenseInstance(1.0, values));
// instance row to predict
int index = 10;
// perform prediction
double myValue = classifier.classifyInstance(testing.instance(10));
// get the name of class value
String prediction = testing.classAttribute().value((int) myValue);
System.out.println("The predicted value of the instance ["
+ Integer.toString(index) + "]: " + prediction);
}
}
My references include:
Using a premade WEKA model in Java
the WEKA Manual provided in the 3.7.10 version - 17.3 Creating datasets in memory
Creating a single instance for classification in WEKA
So far the part where I create a new Instance inside the script causes the following error:
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 10, Size: 1
in the line
double myValue = classifier.classifyInstance(testing.instance(10));
I just want to feed the latest row of instance values to a premade WEKA model. How do I solve this?
Resources
Program file
Arff file
j48.model
You have the error because you are trying to access the 11th instance and have only created one.
If you always want to access the last instance you might try the following:
double myValue = classifier.classifyInstance(testing.lastInstance());
Additionally, I don't believe that you are creating the instances you hope for. After looking at your provided ".arff" file, which I believe you are trying to mimic, I think you should proceed making instances as follows:
FastVector atts;
FastVector attAge;
Instances testing;
double[] vals;
// 1. set up attributes
atts = new FastVector();
//age
attAge = new FastVector();
attAge.addElement("10-19");
attAge.addElement("20-29");
attAge.addElement("30-39");
attAge.addElement("40-49");
attAge.addElement("50-59");
attAge.addElement("60-69");
attAge.addElement("70-79");
attAge.addElement("80-89");
attAge.addElement("90-99");
atts.addElement(new Attribute("age", attAge));
// 2. create Instances object
testing = new Instances("breast-cancer", atts, 0);
// 3. fill with data
vals = new double[testing.numAttributes()];
vals[0] = attAge.indexOf("10-19");
testing.add(new DenseInstance(1.0, vals));
// 4. output data
System.out.println(testing);
Of course I did not create the whole dataset, but the technique would be the same.
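The key idea in the snippet above is that a nominal value is stored as a double holding the index of its label in the attribute's declared list. Below is a minimal plain-Java sketch of that encoding, with no WEKA dependency. The class NominalIndexSketch and its encode method are hypothetical names used only for illustration; unlike a bare indexOf call, encode fails loudly when a label was never declared.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; it does not use or replace any WEKA API.
class NominalIndexSketch {
    // Labels in the order the attribute declares them (same list as in the answer).
    static final List<String> AGE_LABELS = Arrays.asList(
            "10-19", "20-29", "30-39", "40-49", "50-59",
            "60-69", "70-79", "80-89", "90-99");

    // Returns the index to store for a label, or throws if the label is undeclared.
    static double encode(List<String> labels, String label) {
        int idx = labels.indexOf(label);
        if (idx < 0) {
            throw new IllegalArgumentException("Undeclared label: " + label);
        }
        return idx;
    }

    public static void main(String[] args) {
        System.out.println(encode(AGE_LABELS, "60-69")); // prints 5.0
    }
}
```

A check like this guards against silently storing -1 when a value such as "60-69" is misspelled.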
I need to sort data based on the third column of the table data structure. I tried the approach from the answers to the following question, but my sorting does not work. Please help me with this.
Here is my code:
Object[] data = new Object[y];
rst.beforeFirst();
while (rst.next()) {
int p_id = Integer.parseInt(rst.getString(1));
String sw2 = "select sum(quantity) from tbl_order_detail where product_id=" + p_id;
rst1 = stmt1.executeQuery(sw2);
rst1.next();
String sw3 = "select max(order_date) from tbl_order where tbl_order.`Order_ID` in (select tbl_order_detail.`Order_ID` from tbl_order_detail where product_id=" + p_id + ")";
rst2 = stmt2.executeQuery(sw3);
rst2.next();
data[i] = new Object[]{new String(rst.getString(2)), new String(rst.getString(3)), new Integer(rst1.getString(1)), new String(rst2.getString(1))};
i++;
}
ColumnComparator cc = new ColumnComparator(2);
Arrays.sort(data, cc);
if (i == 0) {
table.addCell("");
table.addCell("");
table.addCell("");
table.addCell("");
} else {
for (int j = 0; j < y; j++) {
Object[] theRow = (Object[]) data[j];
table.addCell((String) theRow[0]);
table.addCell((String) theRow[1]);
table.addCell((String) theRow[2]);
table.addCell((String) theRow[3]);
}
Sample Expected Output:
Product_code Product_name Quantity Order_date
FK Cake 3000 2010-12-09
CK Jelly 100 2010-09-23
F juice 30 2010-12-09
but what I get is:
Product_code Product_name Quantity Order_date
CK Jelly 100 2010-09-23
F juice 30 2010-12-09
FK Cake 3000 2010-12-09
You have far too much going on here. You're mingling database access and UI all into a single method. I'd separate those concerns.
I'd also recommend having the database do the sorting. Add an ORDER BY to the SELECT and let the database do the work.
I'd map the data from the SELECT into an object that had a Comparator for sorting. Load the ResultSet into a List of that object; you can have all your wishes that way.
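A minimal sketch of that last suggestion, assuming a row type with the four columns from the question. The class name ProductRow, its field names, and the hard-coded sample rows are illustrative stand-ins for whatever the ResultSet mapping would produce.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical row type; the fields mirror the four columns shown in the question.
class ProductRow {
    final String code;
    final String name;
    final int quantity;
    final String orderDate;

    ProductRow(String code, String name, int quantity, String orderDate) {
        this.code = code;
        this.name = name;
        this.quantity = quantity;
        this.orderDate = orderDate;
    }

    // Compares quantities numerically, highest first.
    static final Comparator<ProductRow> BY_QUANTITY_DESC =
            Comparator.comparingInt((ProductRow row) -> row.quantity).reversed();

    // Stands in for the rows that would be loaded from the ResultSet.
    static List<ProductRow> sortedSample() {
        List<ProductRow> rows = new ArrayList<>();
        rows.add(new ProductRow("CK", "Jelly", 100, "2010-09-23"));
        rows.add(new ProductRow("F", "juice", 30, "2010-12-09"));
        rows.add(new ProductRow("FK", "Cake", 3000, "2010-12-09"));
        rows.sort(BY_QUANTITY_DESC);
        return rows;
    }

    public static void main(String[] args) {
        for (ProductRow row : sortedSample()) {
            System.out.println(row.code + " " + row.name + " " + row.quantity + " " + row.orderDate);
        }
    }
}
```

Because quantity is an int field, it cannot accidentally be compared as text, and the table-filling code no longer needs to cast anything.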
Is the problem the data or the comparator? In the other posting you were shown how to create a simple test program using hard-coded data. The code you posted here doesn't help us, because we don't have access to your database and we don't know whether you are accessing the data correctly.
The output looks like it is sorted in ascending order by "String" value. So it does indeed look like the data is wrong. I don't know what the problem is since it looks like you are adding an Integer value to the array.
You want the output in descending order by amount, so you need to set a Comparator property to do this.
Anyway to make sure the problem wasn't with my Comparator I created a simple test:
import java.util.*;
public class SortSIJ
{
public static void main(String args[])
{
Object[] data = new Object[3];
data[0] = new Object[] {"CK", "Jelly", new Integer(100), "2010-09-23"};
data[1] = new Object[] {"FK", "Cake", new Integer(3000), "2010-12-09"};
data[2] = new Object[] {"F", "juice", new Integer(30), "2010-12-09"};
ColumnComparator cc = new ColumnComparator(2);
cc.setAscending( false );
Arrays.sort(data, cc);
for (Object row: data)
{
Object[] theRow = (Object[])row;
System.out.println( Arrays.asList(theRow) );
}
}
}
The output looks fine to me. All I can suggest is that you modify the ColumnComparator to add the following line of code to verify the Object type that is being sorted.
System.out.println(o1.getClass());
When I do that I get the following output:
class java.lang.Integer
class java.lang.Integer
[FK, Cake, 3000, 2010-12-09]
[CK, Jelly, 100, 2010-09-23]
[F, juice, 30, 2010-12-09]
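The ColumnComparator class itself is not shown in this thread. For readers who want to run the test, here is a minimal sketch of a comparator with the same constructor and setAscending method, assuming every row is an Object[] and the chosen column holds mutually comparable, non-null values. It is named SimpleColumnComparator to make clear that it is a stand-in and not the original class.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical stand-in for the ColumnComparator used above.
class SimpleColumnComparator implements Comparator<Object> {
    private final int column;
    private boolean ascending = true;

    SimpleColumnComparator(int column) {
        this.column = column;
    }

    void setAscending(boolean ascending) {
        this.ascending = ascending;
    }

    @Override
    @SuppressWarnings({"unchecked", "rawtypes"})
    public int compare(Object a, Object b) {
        // Each element of the outer array is itself a row (Object[]).
        Comparable left = (Comparable) ((Object[]) a)[column];
        Object right = ((Object[]) b)[column];
        int result = left.compareTo(right);
        return ascending ? result : -result;
    }

    public static void main(String[] args) {
        Object[] data = {
            new Object[] {"CK", "Jelly", Integer.valueOf(100), "2010-09-23"},
            new Object[] {"FK", "Cake", Integer.valueOf(3000), "2010-12-09"},
            new Object[] {"F", "juice", Integer.valueOf(30), "2010-12-09"}
        };
        SimpleColumnComparator cc = new SimpleColumnComparator(2);
        cc.setAscending(false);
        Arrays.sort(data, cc);
        for (Object row : data) {
            System.out.println(Arrays.asList((Object[]) row));
        }
    }
}
```

If the quantities were stored as the strings "100", "30" and "3000" instead of Integer values, the same comparator in ascending mode would order them 100, 30, 3000, which matches the unexpected output in the question.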