I must massage data in a matrix of type 131072x1 int32 into a Java List<Integer> in Matlab. So far, the only working conversion I've come up with is to roll through the values and directly add them to a LinkedList.
count = size(data_flattened, 1);
ll = java.util.LinkedList;
for i = 1:count
ll.add(data_flattened(i));
end
This is extremely slow (5 seconds). I've tried several ways of converting first to a Java array and then to a List, but I always end up with an array with 1 column and 131072 rows.
I need a way of quickly assigning an N-by-1 Matlab matrix of int32s to a Java List<Integer> type.
Convert to a cell
I found one way of getting Matlab to behave the way I want is to convert the matrix to cells.
cells = num2cell(data_flattened);
the_list = java.util.Arrays.asList(cells)
It is faster than looping over the array and appending to the list, but at 0.25 seconds per conversion on average it is still too slow.
Java 8 Stream
After some research and testing, I implemented a function in Java that handles the conversion from an int[] to a List<Integer> in reasonable time (0.001 seconds).
public static List<Integer> flatten(int[] arr) {
return IntStream.of(arr).parallel().boxed().collect(Collectors.toList());
}
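For reference, a self-contained version of such a helper might look like the sketch below. The class name IntBoxing and both method names are made up for illustration. The loop variant is an alternative I have not timed; it is included only to show that the boxing can also be done without streams.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class; the name and methods are illustrative only.
final class IntBoxing {
    // Stream-based boxing, like the function above but without parallel().
    static List<Integer> viaStream(int[] values) {
        return IntStream.of(values).boxed().collect(Collectors.toList());
    }

    // Plain loop into an ArrayList pre-sized to the input length.
    static List<Integer> viaLoop(int[] values) {
        List<Integer> out = new ArrayList<>(values.length);
        for (int v : values) {
            out.add(v); // autoboxes int to Integer
        }
        return out;
    }

    public static void main(String[] args) {
        // Same element count as the matrix in the question.
        int[] data = IntStream.range(0, 131072).toArray();
        List<Integer> a = viaStream(data);
        List<Integer> b = viaLoop(data);
        System.out.println(a.size() + " elements, lists equal: " + a.equals(b));
    }
}
```

Both methods preserve the order of the input array, so either result can be handed to code that expects a List<Integer>.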
To use Java 8 you'll need to point your MATLAB_JAVA environment variable to the newer JRE. The location of your JRE can be found using java_home on a Mac.
/usr/libexec/java_home
Then in .bashrc or similar
export MATLAB_JAVA="$(/usr/libexec/java_home)/jre"
Launching MATLAB from the terminal will now correctly pick up the new JRE.
In Matlab you can check your Java version
version -java
and then in Matlab
matlab_data_flattened = matlab_data(:);
java_list = com.my.package.ClassName.flatten(matlab_data_flattened);
I have gone through lots of StackOverflow questions and Google search results and read many discussion topics, but I couldn't find a proper answer to my question. I have a sparse matrix in .mat format with 36600 nodes (a 36600x36600 adjacency matrix) that I need to read and manipulate (e.g. matrix-vector multiplication) in Java. I tried many of the answers discussed here, but I always got a NullPointerException even though the .mat file contains data. (Some say this is caused by the size of the data.) The following code, applied to my .mat file, returns null and then throws a NullPointerException.
MatFileReader matfilereader = new MatFileReader("sourceData.mat");
MLArray mlArrayRetrieved = matfilereader.getMLArray("data");
System.out.println(mlArrayRetrieved);
System.out.println(mlArrayRetrieved.contentToString());
I have also tried many times to convert the .mat file to .csv or .xls in MATLAB and in Python (in a Jupyter notebook), but that did not work either.
The .mat file holds an adjacency matrix and will be the input for a specific algorithm in a Cytoscape project. Hence I must use it in Java, and I have decided to use the COLT library for the matrix manipulations. Any suggestions and advice would help me a lot. Thanks for reading.
Just use find to get the rows, columns, and values of the nonzero elements, and save these as text, CSV, or a similar format:
[row, col, v] = find(my_sparse_matrix);
Below is a code snippet using MFL that would result in a MATLAB-like printout of all values in your sparse matrix
Mat5.readFromFile("sourceData.mat")
.getSparse("data")
.forEach((row, col, real, imag) -> {
System.out.println(String.format("(%d,%d) \t %1.4f ", row + 1, col + 1, real));
});
The CSV workaround will work fine for the mentioned 750KB matrix, but it would likely become difficult to work with once data sets grow beyond 50MB. MAT files store sparse data in a (binary) Compressed Sparse Column (CSC) format, which can be loaded with significantly less overhead than CSV files.
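To make the CSC layout concrete, here is a minimal sketch of a matrix stored that way, together with the matrix-vector product the question asks about. It does not use MFL or COLT; the class CscMatrix and its field names are my own and only illustrate the idea.

```java
import java.util.Arrays;

// Hypothetical minimal CSC container; not part of MFL, jmatio, or COLT.
final class CscMatrix {
    final int rows;
    final int cols;
    final int[] colPtr;    // entries of column j sit at indices colPtr[j] .. colPtr[j + 1] - 1
    final int[] rowIdx;    // row index of each stored entry
    final double[] values; // value of each stored entry

    CscMatrix(int rows, int cols, int[] colPtr, int[] rowIdx, double[] values) {
        this.rows = rows;
        this.cols = cols;
        this.colPtr = colPtr;
        this.rowIdx = rowIdx;
        this.values = values;
    }

    // Computes y = A * x, touching only the stored nonzeros.
    double[] multiply(double[] x) {
        if (x.length != cols) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double[] y = new double[rows];
        for (int j = 0; j < cols; j++) {
            for (int k = colPtr[j]; k < colPtr[j + 1]; k++) {
                y[rowIdx[k]] += values[k] * x[j];
            }
        }
        return y;
    }

    public static void main(String[] args) {
        // Dense form of this example: [[1, 0, 2], [0, 3, 0], [4, 0, 5]]
        CscMatrix a = new CscMatrix(3, 3,
                new int[]{0, 2, 3, 5},
                new int[]{0, 2, 1, 0, 2},
                new double[]{1, 4, 3, 2, 5});
        System.out.println(Arrays.toString(a.multiply(new double[]{1, 1, 1})));
    }
}
```

The cost of multiply is proportional to the number of stored entries, not to rows times cols, which is what makes a 36600x36600 adjacency matrix manageable.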
I have a huge number of context vectors and I want to find their average cosine similarity. However, computing it over the whole set is not efficient, so I want to take a random sample from the set.
The problem is that each context vector expresses a degree of meaning for a word, so I want to make a balanced selection (according to the vector values). I searched and found that I can use a Monte Carlo method. I also found a Gibbs sampler example here: https://darrenjw.wordpress.com/2011/07/16/gibbs-sampler-in-various-languages-revisited/
However, I am a little confused. As I understand it, the method draws from a normal distribution and generates double values. I do not understand how to apply this method to my case. Could somebody explain how I can solve this problem?
Thanks in advance.
You don't want a random sample, you want a representative sample. One relatively efficient way to do this is to sort your elements in "strength" order, then take every nth element, which will give you a representative sample of size/n elements.
Try this:
// Given
Set<Vector> mySet;
int reductionFactor = 200; // e.g. sample 0.5% of elements
List<Vector> list = new ArrayList<>(mySet);
Collections.sort(list, new Comparator<Vector>() {
    @Override
    public int compare(Vector o1, Vector o2) {
        // however you compare "strength"; returning 0 is only a placeholder
        return 0;
    }
});
// Despite its name, this is an evenly spaced (not random) sample
List<Vector> randomSample = new ArrayList<>(list.size() / reductionFactor + 1);
for (int i = 0; i < list.size(); i += reductionFactor)
    randomSample.add(list.get(i));
The time complexity is O(n log n) due to the sort operation, and space complexity is O(n).
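A self-contained version of the same idea could look like the sketch below. It assumes the vectors are plain double[] arrays and uses the Euclidean norm as the "strength" measure; both are assumptions on my part, and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; the names are illustrative only.
final class SortedSampler {
    // Sorts a copy of the items by the given order, then keeps every step-th element.
    static <T> List<T> everyNth(Collection<T> items, Comparator<? super T> order, int step) {
        if (step < 1) {
            throw new IllegalArgumentException("step must be >= 1");
        }
        List<T> sorted = new ArrayList<>(items);
        sorted.sort(order);
        List<T> sample = new ArrayList<>(sorted.size() / step + 1);
        for (int i = 0; i < sorted.size(); i += step) {
            sample.add(sorted.get(i));
        }
        return sample;
    }

    // Assumed "strength" measure: the Euclidean norm of the vector.
    static double norm(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x * x;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        List<double[]> vectors = Arrays.asList(
                new double[]{3, 4}, new double[]{1, 0}, new double[]{0, 2}, new double[]{6, 8});
        Comparator<double[]> byNorm = Comparator.comparingDouble(SortedSampler::norm);
        for (double[] v : everyNth(vectors, byNorm, 2)) {
            System.out.println(Arrays.toString(v) + " norm=" + norm(v));
        }
    }
}
```

Any other ordering can be passed in as the Comparator if the norm is not a good proxy for how strongly a vector expresses a meaning.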
The program compiles and runs fine. It needs a jar file ("Java Archive") in order to compile and run. Specifically, it needs the ParallelColt library, "a multithreaded version of Colt - a library for high performance scientific computing in Java." It can be found at this link. Once you have it, get the Java JDK (SE version) from Oracle.
Copy the source you referenced and the parallelcolt-0.9.4.jar file into a directory and compile and run with these commands:
javac -cp parallelcolt-0.9.4.jar Gibbs.java
java -cp parallelcolt-0.9.4.jar;. Gibbs
Note that you will probably need to add the compiler to your path. On Windows I do it like so:
path="c:\program files\java\jdk1.7.0_60\bin";%PATH%
Please select this response as an answer if it helps you.
I have 7 lines of data in a text file (shown below).
name: abcd
temp: 623.2
vel: 8

name: xyz
temp: 432
vel: 7.6
Using regex, I was able to read this data and I have been able to print it out. Now I need to store this data in some variable. I'm leaning towards storing this data in an array/ matrix. So physically, it would look something like this:
data = [abcd, 623.2, 8
xyz, 432, 7.6]
So in effect, 1st row contains the first 3 lines, the 2nd row contains lines from 5 to 7. My reason for choosing this type of variable for storage is that in the long run, calling out the data will be simpler - as in:
data[0][0] = abcd
data[1][1] = 432
I can't use the java matrix files from math.nist.gov because I'm not the root user and getting the IT dept to install stuff on my machine is proving to be a MAJOR waste of time. So I want to work with the resources I have - which is Eclipse and a java installation version 1.6.
I want to get this data and store it in a Java array variable. What I want to know is: is an array the right way to proceed? Or should I use a vector (although, in my opinion, using a vector will complicate things)? Or is there some other type that would let me store the data and read it back easily?
Btw, a little more details regarding my java installation - in case it helps in some way:
OpenJDK Runtime Environment (build 1.6.0-b09)
OpenJDK 64-bit Server VM (build 1.6.0-b09, mixed mode)
Thank you for your help
It seems to me that
name: abcd
temp: 623.2
vel: 8
is some sort of object, and you'd do well to store a list of these e.g. you would define an object
public class MyObject {
private String name;
private double temp;
private double vel;
// etc...
}
(perhaps - there may be more appropriate types), and store these in a list:
List<MyObject>
If you need to index them via their name attribute, then perhaps store a map (e.g. Map<String, MyObject>) where the key is the name of the object.
I'm suggesting creating an object for these since it's trivially easy to ask for obj.getName() etc. rather than remember or calculate array index offsets. Going forwards, you'll be able to add behaviour to these objects (e.g. you have a temp field - with an object you can retrieve that in Celsius/Kelvin/Fahrenheit etc.). Storing the raw data in arrays doesn't really allow you to leverage the functionality of an OO language.
(note re your installation woes - these classes are native to the Java JRE/JDK and don't require installations. They're fundamental to many programs in Java)
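As a rough sketch of how the pieces could fit together, the code below groups the "key: value" lines from the question into one object per record and collects them in a list. The class names and the rule that "vel" closes a record are my assumptions; the code avoids newer syntax so that it should also compile on the Java 1.6 installation mentioned in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
final class ReadingDemo {
    // One record from the file: a name plus its temp and vel values.
    static final class Reading {
        final String name;
        final double temp;
        final double vel;

        Reading(String name, double temp, double vel) {
            this.name = name;
            this.temp = temp;
            this.vel = vel;
        }
    }

    // Turns consecutive name/temp/vel lines into one Reading each.
    static List<Reading> parse(List<String> lines) {
        List<Reading> readings = new ArrayList<Reading>();
        String name = null;
        double temp = 0.0;
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // skip blank or malformed lines
            }
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (key.equals("name")) {
                name = value;
            } else if (key.equals("temp")) {
                temp = Double.parseDouble(value);
            } else if (key.equals("vel")) {
                // Assumption: "vel" is always the last field of a record.
                readings.add(new Reading(name, temp, Double.parseDouble(value)));
            }
        }
        return readings;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "name: abcd", "temp: 623.2", "vel: 8", "",
                "name: xyz", "temp: 432", "vel: 7.6");
        for (Reading r : parse(lines)) {
            System.out.println(r.name + " " + r.temp + " " + r.vel);
        }
    }
}
```

With this layout, the equivalent of data[1][1] is parse(lines).get(1).temp, which reads more clearly than a pair of index offsets.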
You can use an array, but rather than doing a two dimensional array, create a Data Class that holds the elements and then have an array of those elements.
For example:
public class MyData {
String name;
float temp;
double vel; // the sample data contains non-integer values such as 7.6
}
then you could define
MyData arr[];
You could also use a List instead of an array, depending on whether you need to sort or search the data. This approach gives you a lot more flexibility if you ever add an element, or if you want to search or find duplicates.
Wrap this information
name: xyz
temp: 432
vel: 7.6
in a class of its own.
And use whichever implementation of a List<T> you prefer.
Provided that all keys in the key-value pair that you are reading are unique, why don't you store items in a java.util.Map?
Pattern pattern = Pattern.compile("(\\w+): ([\\w.]+)"); // [\\w.]+ also captures decimals such as 7.6
try (BufferedReader reader = new BufferedReader(new FileReader("data.txt"))) {
    Map<String, String> items = new LinkedHashMap<>();
    String line;
    while ((line = reader.readLine()) != null) {
        Matcher matcher = pattern.matcher(line);
        while (matcher.find()) {
            // a repeated key overwrites the earlier value
            items.put(matcher.group(1), matcher.group(2));
        }
    }
    System.out.println(items);
} catch (IOException e) {
    System.out.println(e.getMessage());
}
The map would then contain: {name=xyz, temp=432, vel=7.6}
And you could easily read a particular element like: items.get("name")
I think you can rely on the Java Collections Framework.
You can use an ArrayList instead of an array if the data has a particular sequence.
Moreover, if you want to store the data as key-value pairs, use a Map.
Note: if you need sorted values, use an ArrayList with the Comparator or Comparable interface.
If you are using a Map and you need unique, sorted keys, use a TreeMap.
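The two options might be sketched like this. The class and method names are invented for the example, and the data is the sample from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; names are illustrative only.
final class CollectionChoices {
    // A TreeMap keeps its keys unique and iterates them in sorted order.
    static Map<String, Double> tempsByName(String[] names, double[] temps) {
        Map<String, Double> byName = new TreeMap<String, Double>();
        for (int i = 0; i < names.length; i++) {
            byName.put(names[i], temps[i]);
        }
        return byName;
    }

    // An ArrayList keeps insertion order until it is sorted with a Comparator.
    static List<Double> sortedDescending(double[] temps) {
        List<Double> list = new ArrayList<Double>();
        for (double t : temps) {
            list.add(t);
        }
        Collections.sort(list, Collections.reverseOrder());
        return list;
    }

    public static void main(String[] args) {
        String[] names = {"xyz", "abcd"};
        double[] temps = {432, 623.2};
        System.out.println(tempsByName(names, temps));
        System.out.println(sortedDescending(temps));
    }
}
```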
I'm quite new to MATLAB programming and I ran into some trouble:
I want to call a dSPACE MLIB library function. According to their samples, it requires a string array as its argument:
variables = {'Model Root/Spring-Mass-Damper System/Out1';...
'Model Root/Signal\nGenerator/Out1'};
libFunction(variables);
This variables value is passed to the function. My problem is that I have a frontend application where the user can choose an arbitrary number of strings, which should be passed to the MATLAB function. Since the frontend is written in Java, the type of the incoming data is java.lang.String[].
How can I convert an array of Java strings to something with the same type as the sample variable above? (I think it is a cell array of cell arrays or something like that.)
Thanks in advance!
Take a look at the documentation. MATLAB makes it very easy to convert to and from Java types.
Handling data returned from Java
Dealing with Java arrays
You can convert an array of Java strings to either a cell or char array in MATLAB. Using cell arrays can work even with jagged arrays (which are permitted in Java).
Here are two simple examples:
%# Preparing a java.lang.String[] to play with.
a = javaArray('java.lang.String',10);
b = {'I','am','the','very','model','of','a','modern','major','general'};
for i=1:10; a(i) = java.lang.String(b{i}); end;
%# To cell array of strings. Simple, eh?
c = cell(a);
%# To char array. Also simple.
c = char(a);
I'm trying to read a matrix produced in Matlab into a 2D array in java.
I've been using jmatio so far for writing from java to a .mat file (successfully), but now can't manage to go the other way around.
I've managed to import a matrix into an MLArray object using this code:
matfilereader = new MatFileReader("filename.mat");
MLArray j = matfilereader.getMLArray("dataname");
But other than getting its string representation, I couldn't manage to access the data itself. I found no example for this and no documentation on the library itself. I actually wrote a function to parse the entire string into a double[][] array, but that only works if the matrix has fewer than 1000 items...
Would be grateful for any experience or tips,
thanks,
Amir
MLArray has several subclasses for accessing the different kinds of data, and getMLArray returns the base type.
To work with a double array, cast the MLArray to MLDouble:
MLDouble j = (MLDouble)matfilereader.getMLArray("dataname");
I'm not familiar with that tool, but it's pretty old. Try saving to an older version of the *.mat format and see if your results change. That is, add either the '-v7.0' or the '-v6' flag when you save your *.mat file.
Example code:
save filename var1 var2 -v7.0
or
save filename var1 var2 -v6