I have a pipeline in which I stream data from Python and consume the stream in a Java application. The data records are matrices of complex numbers. I've now learned that json.dumps() can't handle Python's complex type.
For the moment I've converted the complex values to strings and put them in a dictionary like this:
for record in data_array:
    # Each row of complex values becomes one string, because json.dumps()
    # cannot serialize Python's complex type directly.
    data_as_string = [str(row) for row in record["DATA"].tolist()]
    send({'data': data_as_string,
          'coords': record["UVW"].tolist()})
and send it to the pipeline. But this requires extensive and expensive custom deserialization in Java, which increases the running time of the pipeline considerably.
Currently I'm doing the deserialization like this:
JSONObject jsonObject = new JSONObject(string);
try {
data= jsonObject.getString("data");
uvw= jsonObject.getString("coords");
} catch (JSONException ex) {
ex.printStackTrace();
}
And then I'm doing a lot of data.replace(string1, string2) calls to remove some of the characters added by the serialization, and then looping through the matrix to convert every number into a Java Complex type.
My Java deserialization code looks like this:
data = data.replace("(","");
data = data.replace(")","");
data = data.replace("\"","");
data = data.replace("],[","¦");
data = data.replace("[","");
data = data.replace("]","");
uvw = uvw.replace("[","");
uvw = uvw.replace("]","");
String[] frequencyArrays = data.split("¦");
Complex[][] tempData = new Complex[48][4];
for(int i=0;i< frequencyArrays.length;i++){
String[] complexNumbersOfAFrequency = frequencyArrays[i].split(", ");
for(int j =0;j<complexNumbersOfAFrequency.length;j++){
boolean realPartNegative = false;
Complex c;
if(complexNumbersOfAFrequency[j].startsWith("-")){
realPartNegative = true;
//Get rid of the first - sign to be able to split the real & imaginary parts
complexNumbersOfAFrequency[j] =complexNumbersOfAFrequency[j].replaceFirst("-","");
}
if(complexNumbersOfAFrequency[j].contains("+")){
String[] realAndImaginary = complexNumbersOfAFrequency[j].split("\\+");
try {
double real = Double.parseDouble(realAndImaginary[0]);
double imag = Double.parseDouble(realAndImaginary[1].replace("j",""));
if(realPartNegative){
c = new Complex(-real,imag);
} else {
c = new Complex(real,imag);
}
}catch(IndexOutOfBoundsException e) {
//System.out.println("Wrongly formatted number, setting it to 0");
c = new Complex(0,0);
}
catch (NumberFormatException e){
System.out.println("Wrongly formatted number, setting it to 0");
c = new Complex(0,0);
}
} else {
String[] realAndImaginary = complexNumbersOfAFrequency[j].split("-");
try {
double real = Double.parseDouble(realAndImaginary[0]);
double imag = Double.parseDouble(realAndImaginary[1].replace("j", "").replace("e", ""));
if (realPartNegative) {
c = new Complex(-real, -imag);
} else {
c = new Complex(real, -imag);
}
}
catch(IndexOutOfBoundsException e){
System.out.println("Not correctly formatted: ");
for(int temp = 0;temp<realAndImaginary.length;temp++){
System.out.println(realAndImaginary[temp]);
}
System.out.println("Setting it to (0,0)");
c = new Complex(0,0);
}
catch (NumberFormatException e){
c = new Complex(0,0);
}
}
tempData[i][j] = c;
}
}
Now my question is whether there is a way to either
1) deserialize the dictionary in Java without expensive string manipulations and looping through the matrices for each record, or
2) do a better job of serializing the data in Python so that the deserialization in Java becomes easier.
Any hints are appreciated.
Edit: The JSON looks like this:
{"data": ["[(1 + 2j), (3 + 4j), ...]","[(5 + 6j), ...]", ...],
"coords": [1,2,3]}
Edit: For the coordinates I can do the deserialization in Java pretty easily:
uvw = uvw.replace("[","");
uvw = uvw.replace("]","");
String[] coords = uvw.split(",");
And then convert the strings in coords with Double.parseDouble(). However, for the data string this is much more complicated, because the string is full of characters that need to be removed to get at the actual numbers and to put them in the right place in the Complex[][] I want to convert it to.
You are over-using JsonObject.getString, by using it to retrieve non-string data.
Let’s start with the coords property, since it’s a simpler case. [1,2,3] is not a string. It’s an array of numbers. Therefore, you should retrieve it as an array:
JsonArray coords = jsonObject.getJsonArray("coords");
int count = coords.size();
double[] uvw = new double[count];
for (int i = 0; i < count; i++) {
uvw[i] = coords.getJsonNumber(i).doubleValue();
}
The other property, data, is also an array, but with string elements:
JsonArray data = jsonObject.getJsonArray("data");
int count = data.size();
for (int i = 0; i < count; i++) {
String complexValuesStr = data.getString(i);
// ...
}
As for parsing out the complex numbers, I wouldn’t make all those String.replace calls. Instead, you can look for each complex value with a regular expression matcher:
Pattern complexNumberPattern = Pattern.compile(
"\\(\\s*" + // opening parenthesis
"(-?[0-9.]+)" + // group 1: match real part
"\\s*([-+])\\s*" + // group 2: match sign
"([0-9.]+)j" + // group 3: match imaginary part
"\\s*\\)"); // closing parenthesis
Matcher matcher = complexNumberPattern.matcher("");
JsonArray data = jsonObject.getJsonArray("data");
int count = data.size();
List<List<Complex>> allFrequencyValues = new ArrayList<>(count);
for (int i = 0; i < count; i++) {
String complexValuesStr = data.getString(i);
List<Complex> singleFrequencyValues = new ArrayList<>();
matcher.reset(complexValuesStr);
while (matcher.find()) {
double real = Double.parseDouble(matcher.group(1));
boolean positive = matcher.group(2).equals("+");
double imaginary = Double.parseDouble(matcher.group(3));
Complex value = new Complex(real, positive ? imaginary : -imaginary);
singleFrequencyValues.add(value);
}
allFrequencyValues.add(singleFrequencyValues);
}
You should not catch IndexOutOfBoundsException or NumberFormatException. Those indicate the input was invalid. You should not treat invalid input like it’s zero; it means the sender made an error, and you should make sure to let them know it. An exception is a good way to do that.
I have made the assumption that both terms are always present in each complex expression. For instance, 2i would appear as 0 + 2j, not just 2j. And a real number like 5 would appear as 5 + 0j. If that is not a safe assumption, the parsing gets more complicated.
Since you are concerned with performance, I would try the above; if the use of a regular expression makes the program too slow, you can always look for the parentheses and terms yourself, by stepping through the string. It will be more work but may provide a speed increase.
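As a rough sketch of the hand-written alternative mentioned above, the parser below steps through one of the strings from the data array without a regular expression. It is an illustration under assumptions, not a drop-in replacement: ComplexListParser, parseList and parseComplex are hypothetical names, and each value is returned as a {real, imaginary} pair of doubles instead of a Complex object so the example stays free of external libraries. Unlike the pattern above, it also accepts exponents such as 1e-05 and single-term values such as 2j, which Python's str() emits when the real part is zero.

```java
import java.util.ArrayList;
import java.util.List;

final class ComplexListParser {

    /** Parses text such as "[(1 + 2j), (3-4j), 2j]" into {real, imaginary} pairs. */
    static List<double[]> parseList(String text) {
        List<double[]> values = new ArrayList<>();
        int start = -1; // index where the current token began, or -1 between tokens
        for (int i = 0; i <= text.length(); i++) {
            // Treat the end of the string as one final delimiter.
            char c = i < text.length() ? text.charAt(i) : ',';
            boolean delimiter = c == ',' || c == '[' || c == ']';
            if (delimiter) {
                if (start >= 0) {
                    values.add(parseComplex(text.substring(start, i)));
                    start = -1;
                }
            } else if (start < 0 && !Character.isWhitespace(c)) {
                start = i;
            }
        }
        return values;
    }

    /** Parses one token such as "(1+2j)", "(3 - 4j)", "2j" or "(1e-05+2j)". */
    static double[] parseComplex(String token) {
        // Drop parentheses and spaces so only digits, signs, '.', 'e' and 'j' remain.
        String s = token.replace("(", "").replace(")", "").replace(" ", "");
        // The separating sign is the last '+' or '-' that is neither the
        // leading sign nor part of an exponent such as "1e-05".
        int split = -1;
        for (int i = s.length() - 1; i > 0; i--) {
            char c = s.charAt(i);
            char prev = s.charAt(i - 1);
            if ((c == '+' || c == '-') && prev != 'e' && prev != 'E') {
                split = i;
                break;
            }
        }
        boolean hasImaginary = s.endsWith("j");
        if (split < 0) {
            // Only one term is present: purely imaginary or purely real.
            if (hasImaginary) {
                return new double[] {0.0, Double.parseDouble(s.substring(0, s.length() - 1))};
            }
            return new double[] {Double.parseDouble(s), 0.0};
        }
        if (!hasImaginary) {
            throw new NumberFormatException("Missing 'j' in: " + token);
        }
        double real = Double.parseDouble(s.substring(0, split));
        double imaginary = Double.parseDouble(s.substring(split, s.length() - 1));
        return new double[] {real, imaginary};
    }
}
```

Each pair could then be wrapped as new Complex(pair[0], pair[1]). Malformed input still surfaces as a NumberFormatException instead of being replaced by zero.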
If I understand you correctly, your matrix would consist of arrays of complex numbers which in turn would contain a real number and an imaginary one.
If so, your data could look like this:
[[{"r":1,"j":2},{"r":3,"j":4}, ...],[{"r":5,"j":6}, ...]]
That means that you have a JSON array which contains arrays that contain objects. Those objects have 2 properties: r defining the value of the real number and j the value of the imaginary one.
Parsing that in Java should be straightforward, i.e. with some mapper like Jackson or Gson you'd just parse it into something like ComplexNumber[][], where ComplexNumber could look like this (simplified):
public class ComplexNumber {
public double r;
public double j;
}
Of course there may be already existing classes for complex numbers so you might want to use those. Additionally you might have to deserialize that manually (either because the target classes don't make it easy for the mappers or you can't/don't want to use a mapper) but in that case it would be just a matter of iterating over the JSONArray elements and extracting r and j from the JSONObjects.
Hello everyone,
I'm using the Weka Java API for predictions. I was able to get the expected and actual class values from the Java code, but now I want to get the 'prediction margin' information from the final results. I can get it from the GUI, but I need a Java solution. I'd appreciate it if anyone can help. What I want to obtain in Java is the prediction margin that the GUI displays.
Below is the code I'm currently using to print the actual and predicted classes.
for (int i = 0; i < testDataSet.numInstances(); i++) {
double actualClass = testDataSet.instance(i).classValue();
String actual = testDataSet.classAttribute().value((int) actualClass);
Instance newInst = testDataSet.instance(i);
double preJ48 = tree.classifyInstance(newInst);
String predictionString = testDataSet.classAttribute().value((int) preJ48);
System.out.println("Actual : " + actual + " Prediction : " + predictionString);
}
Solution I found:
J48 tree = new J48();
tree.buildClassifier(trainDataSet);
double a = eval.evaluateModelOnceAndRecordPrediction(tree, testDataSet.instance(0));
eval.evaluateModel(tree, testDataSet, plainText);
for (String line : predsBuffer.toString().split("\n")) {
String[] linesplit = line.split("\\s+");
// If there's an error(error flag "+"), the length of linesplit is 6, otherwise 5
System.out.println("linesplit "+linesplit.length);
int id;
String expectedValue, classification;
double probability;
if (line.contains("+")) {
probability = Double.parseDouble(linesplit[5]);
System.out.println("Its Minus "+probability);
} else {
probability = Double.parseDouble(linesplit[4]);
System.out.println("Its Plus "+probability);
}
}
The prediction margin that you are referring to gets generated by the weka.gui.explorer.ClassifierErrorsPlotInstances class. Check the variables probActual and probNext in its process method.
This margin is simply the difference between the probability for the actual class label and the highest probability of the label that isn't the actual class label.
You can use the distributionForInstance method of your classifier to obtain the class distribution array and then determine these two probabilities to calculate the margin for the prediction.
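The calculation described above can be sketched in plain Java as follows. PredictionMargin and margin are hypothetical names, and the method only assumes that it receives the class distribution array together with the index of the actual class; it does not call Weka itself.

```java
final class PredictionMargin {

    /**
     * Margin = probability of the actual class minus the highest probability
     * among all other classes. A negative value means another class was
     * considered more likely than the actual one.
     *
     * @param distribution class probabilities, assumed to come from the
     *                     classifier's distributionForInstance method
     * @param actualIndex  index of the actual class label in that array
     */
    static double margin(double[] distribution, int actualIndex) {
        double probActual = distribution[actualIndex];
        double probNext = 0.0;
        for (int i = 0; i < distribution.length; i++) {
            if (i != actualIndex && distribution[i] > probNext) {
                probNext = distribution[i];
            }
        }
        return probActual - probNext;
    }
}
```

Inside the loop from the question this might be called as margin(tree.distributionForInstance(newInst), (int) newInst.classValue()).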
I was trying to solve an issue which has been bothering me for a while. I created a small parser that reads an .ini file and then stores the data in an ArrayList. However, I got stuck with the following snippet:
while (!(sCurrentLine.equals("[End]"))) {
formats.add(sCurrentLine);
for (int i = 0; formats.size() > 0; i++) {
}
sCurrentLine = br.readLine();
}
This is the place where I have to add values to formats, which is an ArrayList.
The values to be added look like this:
0900.013-017=LABEL
0900.018-029=LABEL
Each line holds a range, and I have to expand it so that the '0900' prefix and the '=LABEL' suffix are repeated for every number in the range, for example:
0900.013=LABEL
0900.014=LABEL
0900.015=LABEL
0900.016=LABEL and so on...
and store it back in the ArrayList.
I don't want to depend upon third-party libraries. Please help me out with this.
Use a regular expression to parse the range, then loop over the parsed values. There is some fine tuning to be done but I think this should get you started.
Pattern rangePattern = Pattern.compile("([0-9]+)\\.([0-9]+)-([0-9]+)=(.*)$");
Matcher rangeMatcher = rangePattern.matcher("0900.13-17=First label");
if (rangeMatcher.matches()) {
String prefix = rangeMatcher.group(1);
int start = Integer.parseInt(rangeMatcher.group(2));
int end = Integer.parseInt(rangeMatcher.group(3));
String label = rangeMatcher.group(4);
for (int r = start; r <= end; r++) { // <= so that the end of the range is included
System.out.println(prefix + "." + r + "=" + label);
}
}
Create the pattern once and then just get new matchers each time through your loop.
The results:
0900.13=First label
0900.14=First label
0900.15=First label
0900.16=First label
0900.17=First label
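Part of the fine tuning is keeping the leading zeros from the question's input (013 rather than 13) and collecting the lines in a list instead of printing them. A possible sketch, assuming a hypothetical RangeExpander class, an inclusive range, and that lines which do not match the pattern should be passed through unchanged:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RangeExpander {

    // prefix "." start "-" end "=" label, e.g. "0900.013-017=LABEL"
    private static final Pattern RANGE =
            Pattern.compile("([0-9]+)\\.([0-9]+)-([0-9]+)=(.*)");

    /** Expands one range line into one line per number, both ends included. */
    static List<String> expand(String line) {
        List<String> result = new ArrayList<>();
        Matcher m = RANGE.matcher(line);
        if (!m.matches()) {
            result.add(line); // not a range line: keep it as it is
            return result;
        }
        String prefix = m.group(1);
        int width = m.group(2).length(); // digits to pad to, e.g. 3 for "013"
        int start = Integer.parseInt(m.group(2));
        int end = Integer.parseInt(m.group(3));
        String label = m.group(4);
        for (int r = start; r <= end; r++) {
            // "%0<width>d" pads the number with leading zeros
            result.add(String.format(Locale.ROOT, "%s.%0" + width + "d=%s", prefix, r, label));
        }
        return result;
    }
}
```

In the question's loop this would replace formats.add(sCurrentLine) with formats.addAll(RangeExpander.expand(sCurrentLine)).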
double pullPrice(String input){
if(input.length() < 3){
System.out.println("Error: 02; invalid item input, valid example: 'milk 8.50'");
System.exit(0);
}
char[] inputArray = input.toCharArray();
char[] itemPriceArray;
double price;
boolean numVal = false;
int numCount = 0;
for(int i = 0; i <= inputArray.length-1; i ++){
//checking if i need to add char to char array of price
if(numVal == true){
//adding number to price array
itemPriceArray[numCount] = inputArray[i];
numCount++;
}
else{
if(inputArray[i] == ' '){
numVal = true;
//initializing price array
itemPriceArray = new char[inputArray.length - i];
}
else{
}
}
}
price = Double.parseDouble(String.valueOf(itemPriceArray));
return price;
}
Problem: I am attempting to pull the sequence of chars after the whitespace in input such as 'milk 8.50'. An initialization error occurs because the char array is only initialized inside an if/else branch that runs when whitespace is found.
Question: since I don't know the char count until I find a whitespace, is there another way I can initialize the array? Does the compiler not trust that I will initialize it before using it?
Also, if I am missing something or there are better ways to code any of this, please let me know. I am in a Java data structures class and learning fundamental data structures, but would also like to focus on efficiency and modularity at the same time. I also have a pullPrice function that does the same thing but pulls the item name. I would like to combine these so I don't have to duplicate the code, but a method can only return one data type unless I create a class. Unfortunately this exercise requires two arrays, since we are practicing how to use ADT bags.
Any help is greatly appreciated.
Try something like this:
// Needs: import java.util.Scanner; import java.util.NoSuchElementException;
double pullPrice(String input)
{
    // try-with-resources closes the scanner and frees its resources,
    // even when parsing fails
    try (Scanner scanner = new Scanner(input))
    {
        // We skip the product (e.g. "milk")
        scanner.next();
        // and read the price (e.g. 8.5)
        return scanner.nextDouble();
    }
    catch (NoSuchElementException ex)
    {
        System.out.println("Error: 02; invalid item input, valid example: 'milk 8.50'");
        System.exit(0);
        // Never reached at run time, but the compiler needs a return on this path
        return 0;
    }
}
If you are sure that your program will only get proper input data, then just initialize your array with null:
char[] itemPriceArray = null;
The reason the compiler complains: what would happen if your program accessed an uninitialized variable (for instance, with wrong input data)? The Java compiler rules out this kind of situation completely.
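Another way to satisfy the compiler, without a null placeholder, is to declare the array only at the point where its size is known. The sketch below is one possible shape and rests on assumptions: PriceParser is a hypothetical class name, the price is taken to be everything after the first space, and bad input is reported with an exception instead of System.exit.

```java
final class PriceParser {

    /** Returns the price from input such as "milk 8.50". */
    static double pullPrice(String input) {
        int space = input.indexOf(' ');
        if (space < 0) {
            throw new IllegalArgumentException(
                    "Expected '<item> <price>', for example 'milk 8.50', but got: " + input);
        }
        // The array is declared only once its size is known, so it is
        // definitely assigned on every path that reaches the next line.
        char[] itemPriceArray = input.substring(space + 1).trim().toCharArray();
        return Double.parseDouble(String.valueOf(itemPriceArray));
    }
}
```

A non-numeric price still fails loudly, because Double.parseDouble throws NumberFormatException, which is itself a subclass of IllegalArgumentException.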
I will add to the other answers: you can't change the size of an array once it is created. You either have to allocate it bigger than you think you'll need, or accept the overhead of reallocating it when it needs to grow. When it does, you'll have to allocate a new one and copy the data from the old to the new:
int oldItems[] = new int[10];
for (int i=0; i<10; i++) {
oldItems[i] = i+10;
}
int newItems[] = new int[20];
System.arraycopy(oldItems, 0, newItems, 0, 10);
oldItems = newItems;
char[] itemPriceArray = new char[inputArray.length];
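The same grow-and-copy step can be written with java.util.Arrays.copyOf, which allocates the larger array and copies the old contents in one call. A minimal sketch, assuming a hypothetical GrowableCharBuffer class; the initial capacity of 4 and the doubling factor are arbitrary choices:

```java
import java.util.Arrays;

final class GrowableCharBuffer {

    private char[] items = new char[4]; // arbitrary initial capacity
    private int size = 0;               // number of slots actually in use

    /** Appends one char, growing the backing array when it is full. */
    void add(char c) {
        if (size == items.length) {
            // Allocates a new array of twice the length and copies the old data.
            items = Arrays.copyOf(items, items.length * 2);
        }
        items[size++] = c;
    }

    int size() {
        return size;
    }

    /** Returns only the used part of the backing array as a String. */
    String asString() {
        return new String(items, 0, size);
    }
}
```

This is essentially what ArrayList and StringBuilder do internally, which is why they are usually preferred when the exercise does not require raw arrays.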
I'm currently working on a program, and any time I call Products[1] there is no null pointer error; however, when I call Products[0] or Products[2] I get one. Yet I am still getting 2 different outputs, almost as if there were a [0] and [1], or a [1] and [2], in the array. Here is my code:
FileReader file = new FileReader(location);
BufferedReader reader = new BufferedReader(file);
int numberOfLines = readLines();
String [] data = new String[numberOfLines];
Products = new Product[numberOfLines];
calc = new Calculator();
int prod_count = 0;
for(int i = 0; i < numberOfLines; i++)
{
data = reader.readLine().split("(?<=\\d)\\s+|\\s+at\\s+");
if(data[i].contains("input"))
{
continue;
}
Products[prod_count] = new Product();
Products[prod_count].setName(data[1]);
System.out.println(Products[prod_count].getName());
BigDecimal price = new BigDecimal(data[2]);
Products[prod_count].setPrice(price);
for(String dataSt : data)
{
if(dataSt.toLowerCase().contains("imported"))
{
Products[prod_count].setImported(true);
}
else{
Products[prod_count].setImported(false);
}
}
calc.calculateTax(Products[prod_count]);
calc.calculateItemTotal(Products[prod_count]);
prod_count++;
This is the output :
imported box of chocolates
1.50
11.50
imported bottle of perfume
7.12
54.62
This print works System.out.println(Products[1].getProductTotal());
This becomes a null pointer System.out.println(Products[2].getProductTotal());
This also becomes a null pointer System.out.println(Products[0].getProductTotal());
You're skipping lines containing "input".
if(data[i].contains("input")) {
continue; // Products[i] will be null
}
Probably it would be better to make products an ArrayList, and add only the meaningful rows to it.
products should also start with lowercase to follow Java conventions. Types start with uppercase, parameters & variables start with lowercase. Not all Java coding conventions are perfect -- but this one's very useful.
The code is otherwise structured fine, but arrays are not a very flexible type to build from program logic (since the length has to be pre-determined, skipping requires you to keep track of the index, and it can't track the size as you build it).
Generally you should build List (ArrayList). Map (HashMap, LinkedHashMap, TreeMap) and Set (HashSet) can be useful too.
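A list-based version of the reading loop might look like the sketch below. It is only an illustration: ProductReader and its nested Product are hypothetical stand-ins for the asker's classes, the input is passed as a list of lines instead of a BufferedReader, and the line format ("1 imported box of chocolates at 10.00", with header lines containing "input") is an assumption inferred from the split expression and the printed output.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class ProductReader {

    /** Hypothetical, minimal stand-in for the asker's Product class. */
    static final class Product {
        final String name;
        final BigDecimal price;

        Product(String name, BigDecimal price) {
            this.name = name;
            this.price = price;
        }
    }

    /** Builds a list holding only the meaningful rows, so no slot is left null. */
    static List<Product> parseProducts(List<String> lines) {
        List<Product> products = new ArrayList<>();
        for (String line : lines) {
            // Skip blank lines and header lines such as "Input 1:" (assumed format).
            if (line.trim().isEmpty() || line.toLowerCase(Locale.ROOT).contains("input")) {
                continue;
            }
            // tokens belong to this one line: quantity, name, price
            String[] tokens = line.split("(?<=\\d)\\s+|\\s+at\\s+");
            if (tokens.length < 3) {
                throw new IllegalArgumentException("Unexpected line format: " + line);
            }
            products.add(new Product(tokens[1], new BigDecimal(tokens[2])));
        }
        return products;
    }
}
```

Because the list grows only when a product is added, there is no separate counter to maintain and products.size() always equals the number of products read.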
Second bug: as Bohemian says: in data[] you've confused the concepts of a list of all lines, and data[] being the tokens parsed/ split from a single line.
"data" is generally a meaningless term. Use meaningful terms/names & your programs are far less likely to have bugs in them.
You should probably just use tokens for the line tokens, not declare it outside/ before it is needed, and not try to index it by line -- because, quite simply, there should be absolutely no need to.
for(int i = 0; i < numberOfLines; i++) {
// we shouldn't need data[] for all lines, and we weren't using it as such.
String line = reader.readLine();
String[] tokens = line.split("(?<=\\d)\\s+|\\s+at\\s+");
//
if (tokens[0].equals("input")) { // unclear which you actually mean.
/* if (line.contains("input")) { */
continue;
}
When you offer sample input for a question, edit it into the body of the question so it's readable. Putting it in the comments, where it can't be read properly, is just wasting the time of people who are trying to help you.
Bug alert: You are overwriting data:
String [] data = new String[numberOfLines];
then in the loop:
data = reader.readLine().split("(?<=\\d)\\s+|\\s+at\\s+");
So who knows how large it is - depends on the success of the split - but your code relies on it being numberOfLines long.
You need to use different indexes for the line number and the new product objects. If you have 20 lines but 5 of them are "input" then you only have 15 new product objects.
For example:
int prod_count = 0;
for (int i = 0; i < numberOfLines; i++)
{
String[] tokens = reader.readLine().split("(?<=\\d)\\s+|\\s+at\\s+");
if (tokens[0].contains("input")) // index the current line's tokens, not the line number
{
continue;
}
Products[prod_count] = new Product();
Products[prod_count].setName(tokens[1]);
// etc.
prod_count++; // last thing to do
}
How can I generate the sum of minterms (boolean algebra) in Java? We can generate the sum of minterms by ANDing each term with (X+X') for every variable X that is missing from it. The following example explains the algorithm for a function with three variables A, B and C:
F(A,B,C)= A + B´*C
= A*(B+B´) + B´*C
= A*B + A*B´ + B´*C
= A*B*(C+C´) + A*B´*(C+C´) + B´*C*(A+A´)
= A*B*C+A*B*C´+A*B´*C+A*B´*C´+B´*C*A+B´*C*A´
= A*B*C+A*B*C´+A*B´*C+A*B´*C´+A*B´*C+A´*B´*C
The method in java looks like this:
String generateSumOfMinterms(String termsOfTheFunction, String variables){}
// Examples for functions with 2 variables A,B
generateSumOfMinterms("A", "A,B"){
//The result should look like this
return "A*B+A*B'";
}
generateSumOfMinterms("A+B'", "A,B"){
//The result should look like this (repeated terms are OK, for example A*B')
return "A*B+A*B'+A'*B'+A*B'";
}
// Example for a function with 3 variables A,B,C
generateSumOfMinterms("A", "A,B,C"){
//The result should look like this
return "A*B*C+A*B*C'+A*B'*C+A*B'*C'";
}
I have tried the following:
public List<Minterm> completeMinterm(Minterm minterm, String variables){
List<Minterm> minterms=new ArrayList<Minterm>();
minterms.add(minterm);
Minterm m1=new Minterm();
Minterm m2=new Minterm();
for (int k = 0; k < minterms.size(); k++) {
//A AB--> AB+AB'
for (int i = 0; i < variables.length(); i++) {
boolean varInMinterm=false;
for (int j = 0; j < minterms.get(k).atoms.size(); j++) {
if(minterms.get(k).atoms.get(j).variable==variables.charAt(i)){
varInMinterm=true;
break;
}
}
if(!varInMinterm){
varInMinterm=false;
m1= minterms.get(k);
m1.addAtom(new Atom(variables.charAt(i),false));
m2 = minterms.get(k);
m2.addAtom(new Atom(variables.charAt(i),true));
minterms.remove(k);
minterms.add(m1);
minterms.add(m2);
k=0;
}
}
}
I used the Eclipse debugger to look for errors, but I don't understand why the atom added to m2 is added to m1 as well when this line runs:
m2.addAtom(new Atom(variables.charAt(i),true));
Here is an outline of a possible approach: First, you should create a more convenient representation of the expression - for example, the expression could be a list of instances of a Minterm class, and Minterm could contain a list of instances of an Atom class, each of which could contain a char that tells which variable it is and a boolean that tells whether the variable is negated or not. The first thing you should do is to loop through termsOfTheFunction and create such objects that represent the expression. Then, you can loop through the minterms, and every time you see a minterm that is missing one variable, you can remove it from the list and add two new minterms with the missing variable. Finally, you can loop through the finished minterms and "print" them to a result String.
Class declarations per request and for clarity (using public fields for brevity):
public class Atom {
public final char variable;
public final boolean negated;
public Atom(char variable, boolean negated) {
this.variable = variable;
this.negated = negated;
}
}
public class Minterm {
public final List<Atom> atoms = new ArrayList<Atom>();
}
In generateSumOfMinterms():
List<Minterm> expression = new ArrayList<Minterm>();
Minterm currentMinterm = new Minterm();
expression.add(currentMinterm);
Then, loop through the characters of termsOfTheFunction. Each time you see a letter, look at the next character to see if it is a ´, and add an Atom with that letter and with the correct negation. Each time you see a +, create a new Minterm and add it to expression, and keep going. Afterwards, you can start analyzing the minterms and expanding them.
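The parsing loop described above could be sketched like this. MintermParser and parse are hypothetical names, Atom and Minterm are repeated as nested classes only to keep the example self-contained, and the sketch assumes single-letter variables, with both ' and ´ accepted as the negation mark and * and whitespace ignored.

```java
import java.util.ArrayList;
import java.util.List;

final class MintermParser {

    static final class Atom {
        final char variable;
        final boolean negated;

        Atom(char variable, boolean negated) {
            this.variable = variable;
            this.negated = negated;
        }
    }

    static final class Minterm {
        final List<Atom> atoms = new ArrayList<>();
    }

    /** Turns a string such as "A+B'*C" into a list of minterms. */
    static List<Minterm> parse(String termsOfTheFunction) {
        List<Minterm> expression = new ArrayList<>();
        Minterm currentMinterm = new Minterm();
        expression.add(currentMinterm);
        for (int i = 0; i < termsOfTheFunction.length(); i++) {
            char c = termsOfTheFunction.charAt(i);
            if (Character.isLetter(c)) {
                // Peek at the next character to see whether the variable is negated.
                boolean negated = false;
                if (i + 1 < termsOfTheFunction.length()) {
                    char next = termsOfTheFunction.charAt(i + 1);
                    negated = next == '\'' || next == '\u00B4'; // ' or ´
                }
                currentMinterm.atoms.add(new Atom(c, negated));
                if (negated) {
                    i++; // the negation mark has been consumed
                }
            } else if (c == '+') {
                // A plus sign starts the next minterm.
                currentMinterm = new Minterm();
                expression.add(currentMinterm);
            }
            // '*' and whitespace carry no extra information and are skipped.
        }
        return expression;
    }
}
```

For example, parse("A+B'*C") yields two minterms: one containing A, and one containing the negated B followed by C.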
Edit in response to your code: Looks like you're well on your way! The reason both atoms get added to the same minterm is that both m1 and m2 refer to the k'th minterm since you say m1 = minterms.get(k); and m2 = minterms.get(k);. get() does not copy or remove an element from a list; the element will still be inside the list. So for m2, you need to create a new minterm that has all of the atoms from the old one, plus the new atom.
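To make the copying step concrete, here is one way the expansion could be written so that the two new minterms never share a list of atoms. This is a sketch under assumptions rather than a fix of the exact code in the question: MintermExpansion, complete and the copy constructor are hypothetical, and any character in variables that is not a letter (such as the commas in "A,B,C") is skipped.

```java
import java.util.ArrayList;
import java.util.List;

final class MintermExpansion {

    static final class Atom {
        final char variable;
        final boolean negated;

        Atom(char variable, boolean negated) {
            this.variable = variable;
            this.negated = negated;
        }
    }

    static final class Minterm {
        final List<Atom> atoms = new ArrayList<>();

        Minterm() { }

        /** Copy constructor: the copy gets its own list of atoms. */
        Minterm(Minterm other) {
            atoms.addAll(other.atoms);
        }

        boolean contains(char variable) {
            for (Atom a : atoms) {
                if (a.variable == variable) return true;
            }
            return false;
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder();
            for (Atom a : atoms) {
                if (sb.length() > 0) sb.append('*');
                sb.append(a.variable);
                if (a.negated) sb.append('\'');
            }
            return sb.toString();
        }
    }

    /** Expands one minterm until it mentions every variable. */
    static List<Minterm> complete(Minterm minterm, String variables) {
        List<Minterm> current = new ArrayList<>();
        current.add(minterm);
        for (char v : variables.toCharArray()) {
            if (!Character.isLetter(v)) continue; // skip separators such as ','
            List<Minterm> next = new ArrayList<>();
            for (Minterm m : current) {
                if (m.contains(v)) {
                    next.add(m);
                    continue;
                }
                // Two independent copies: one gets v, the other gets v'.
                Minterm plain = new Minterm(m);
                plain.atoms.add(new Atom(v, false));
                Minterm negated = new Minterm(m);
                negated.atoms.add(new Atom(v, true));
                next.add(plain);
                next.add(negated);
            }
            current = next;
        }
        return current;
    }
}
```

Building a fresh list for each variable also avoids removing elements from minterms while looping over it, so the k=0 reset from the question is no longer needed.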