Finding matching objects in Java

I'm currently trying to match 2 objects based on their values. Except, it's not a.a = b.a, but a.a = b.b and a.b = b.a. This means that overriding equals is an option, but it's certainly not the right one.
While sorting these objects would make the matching quicker, the population will be small, so it is unnecessary. Also, compareTo isn't exactly right either, for the same reason given for equals.
Do I simply write my own method in this case? There will be 4 fields to match, which is why I am not using an if statement up front.
public boolean isOpposite(Obj other) {
return this.a.equals(other.b); // compare String contents, not references
}
There is also the possibility that the object will implement/extend a base object to take on more fields and implement its own way of matching.
I'm considering using a LinkedList because I know it to be quicker for use than ArrayList, however I've also been considering Maps.
Edit: better explanation of objects
public class Obj {
public String a;
public String b;
public String c;
public double d;
}
The relationships are as follows:
Obj obj1, obj2;
obj1.a == obj2.b //.equals for String of course
obj1.b == obj2.a
obj1.c == obj2.c
obj1.d == obj2.d * -1

Overriding equals or compareTo is not the right way to go, as you've mentioned, because both methods are assumed to be transitive, i.e. A eq B and B eq C => A eq C, which doesn't hold for the "opposite" objects. That is good to know: you can't define an equivalence relation and partition the objects into equivalence classes; instead you need to find all the pairs (depending on your use case).
I'm not sure what your goal is. If you have some containers of such objects and you need to find all pairs that satisfy the condition, then I am afraid a pairwise search would need about n^2 comparisons.
I'd probably create two hash sets, one with the originals and a second with the opposites, and check whether the second hash set contains the opposite of each member of the original hash set.
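The hash-based idea above could be sketched roughly as follows. This is only a minimal sketch under assumptions that are not in the question: the class name OppositeMatcher, the helpers ownKey and oppositeKey, and the constructor on Obj are hypothetical, and d is assumed to match exactly (no floating-point tolerance). Instead of two separate sets it keeps one map of still-unmatched objects, so a pair is formed as soon as an object's opposite has been seen.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class OppositeMatcher {
    // Hypothetical stand-in for the Obj class from the question.
    static final class Obj {
        final String a, b, c;
        final double d;

        Obj(String a, String b, String c, double d) {
            this.a = a;
            this.b = b;
            this.c = c;
            this.d = d;
        }
    }

    // Key describing the object itself. Adding 0.0 turns -0.0 into 0.0
    // so that both zeros hash and compare the same once boxed.
    static List<Object> ownKey(Obj o) {
        return Arrays.<Object>asList(o.a, o.b, o.c, o.d + 0.0);
    }

    // Key that the opposite of this object would have: a and b swapped, d negated.
    static List<Object> oppositeKey(Obj o) {
        return Arrays.<Object>asList(o.b, o.a, o.c, -o.d + 0.0);
    }

    // One pass over the input; each object is either paired with a waiting
    // opposite or parked in the map under its own key.
    static List<Obj[]> findPairs(List<Obj> objs) {
        Map<List<Object>, Deque<Obj>> waiting = new HashMap<>();
        List<Obj[]> pairs = new ArrayList<>();
        for (Obj o : objs) {
            Deque<Obj> candidates = waiting.get(oppositeKey(o));
            if (candidates != null && !candidates.isEmpty()) {
                pairs.add(new Obj[] {candidates.poll(), o});
            } else {
                waiting.computeIfAbsent(ownKey(o), k -> new ArrayDeque<>()).add(o);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Obj> objs = Arrays.asList(
                new Obj("aval", "bval", "cval", 5.0),
                new Obj("bval", "aval", "cval", -5.0),
                new Obj("aval", "bval", "cval", 7.0));
        System.out.println(findPairs(objs).size() + " pair(s) found");
    }
}
```

With hash lookups the expected work is proportional to the number of objects rather than to its square, at the cost of holding the unmatched objects in memory.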

I've done some testing and determined that the cleanest way I knew to implement this was using ArrayList<Obj>.
This was my implementation:
public static List<ObjGroup> getNewSampleGroup(int size) {
List<ObjGroup> sampleGroup = new ArrayList<ObjGroup>();
sampleGroup.add(new ObjGroup((generateNumbers(size, 1)))); //Positives
sampleGroup.add(new ObjGroup((generateNumbers(size, -1)))); //Negatives
return sampleGroup;
}
private static List<Obj> generateNumbers(int size, int x) {
List<Obj> sampleGroup = new ArrayList<Obj>();
for (int i = 0; i < size; i ++) {
Random rand = new Random();
String randC;
String randA;
String randB;
double randD;
if (x == 1) {
randD = rand.nextInt(maxP - minP + 1) + minP; // uniform in [minP, maxP]
randA = "aval";// + String.valueOf(rand.nextInt((max - min + 1) + min));
randB = "bval";// + String.valueOf(rand.nextInt((max - min + 1) + min));
randC = "cval";// + String.valueOf(rand.nextInt((max - min + 1) + min));
} else {
randD = (rand.nextInt(maxP - minP + 1) + minP) * -1; // uniform in [-maxP, -minP]
randA = "bval";// + String.valueOf(rand.nextInt((max - min + 1) + min));
randB = "aval";// + String.valueOf(rand.nextInt((max - min + 1) + min));
randC = "cval";// + String.valueOf(rand.nextInt((max - min + 1) + min));
}
sampleGroup.add(new Obj(randA, randB, randC, randD));
}
return sampleGroup;
}
public List<ObjGroup> findMatches(List<ObjGroup> unmatchedList) {
List<Obj> pivotPos = unmatchedList.get(0).getObjs(); //First grouping are positives
List<Obj> pivotNeg = unmatchedList.get(1).getObjs(); //Second grouping are negatives
List<ObjGroup> matchedList = new ArrayList<ObjGroup>();
long iterations = 0;
Collections.sort(pivotPos);
Collections.sort(pivotNeg, Collections.reverseOrder());
for (Iterator<Obj> iteratorPos = pivotPos.iterator(); iteratorPos.hasNext();) {
final Obj focus = iteratorPos.next();
iteratorPos.remove(); //Remove this once pulled as you won't match it again.
for (Iterator<Obj> iteratorNeg = pivotNeg.iterator(); iteratorNeg.hasNext();) {
final Obj candidate = iteratorNeg.next();
if (compare(focus, candidate)) {
matchedList.add(new ObjGroup(new ArrayList<Obj>() {
{
add(focus);
add(candidate);
}
}));
iteratorNeg.remove(); //Remove this once matched as you won't match it again.
break;
}
iterations ++;
}
iterations ++;
}
return matchedList;
}
I ran this against a sample size of 4,000,000 pseudo-random Obj objects. This was my output:
Starting matching test.
18481512007 iterations.
3979042 matched objects.
10479 unmatched objects.
Processing time: 44 minutes.
There were 1989521 number of matches found.
Closing matching test.

Related

Java Recursion Scope Variable

I was looking for a program to generate balanced parentheses for a given n pairs. Link: https://leetcode.com/problems/generate-parentheses/
While studying a solution, I came across the code below:
public void P(List<String> list, int openB, int closeB, String output, int n) {
if (output.length() == 2 * n) {
list.add(output);
return;
}
if (openB < n) {
String op1 = output + "(";
// openB=openB + 1;
//P(list, openB, closeB, op1, n); using this is giving different output.
P(list, openB + 1, closeB, op1, n);
}
if (closeB < openB) {
String op2 = output + ")";
P(list, openB, closeB + 1, op2, n);
}
}
Here, using openB = openB + 1; before the call gives a different result compared to passing openB + 1 directly as the argument.
Well, when you pass openB + 1 as an argument, it doesn't change the local openB variable. On the other hand, when you do openB = openB + 1, it does change its value, and, since we use it later in the method (in the closeB < openB branch), the program can behave differently.
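To make the difference concrete, here is a small sketch that runs both variants side by side. The class name ParenDemo and the method names generate and generateMutating are made up for this illustration; generate follows the code from the question, and generateMutating is the variant with the commented-out reassignment enabled.

```java
import java.util.ArrayList;
import java.util.List;

final class ParenDemo {
    // As in the question: openB + 1 is passed along, the local openB stays unchanged.
    static void generate(List<String> out, int openB, int closeB, String s, int n) {
        if (s.length() == 2 * n) {
            out.add(s);
            return;
        }
        if (openB < n) {
            generate(out, openB + 1, closeB, s + "(", n);
        }
        if (closeB < openB) {
            generate(out, openB, closeB + 1, s + ")", n);
        }
    }

    // Variant with reassignment: the second branch now sees the incremented openB,
    // so it may append ")" to a string that has not received the matching "(".
    static void generateMutating(List<String> out, int openB, int closeB, String s, int n) {
        if (s.length() == 2 * n) {
            out.add(s);
            return;
        }
        if (openB < n) {
            openB = openB + 1;
            generateMutating(out, openB, closeB, s + "(", n);
        }
        if (closeB < openB) {
            generateMutating(out, openB, closeB + 1, s + ")", n);
        }
    }

    public static void main(String[] args) {
        List<String> correct = new ArrayList<>();
        generate(correct, 0, 0, "", 2);
        List<String> mutated = new ArrayList<>();
        generateMutating(mutated, 0, 0, "", 2);
        System.out.println("argument only: " + correct);
        System.out.println("reassigned:    " + mutated);
    }
}
```

In the mutating variant the counters no longer describe the string s that is being extended in the second branch, which is why some valid combinations are lost.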

Optimizing Recursive Function in Java

I'm having a problem optimizing a function, AltOper. On the set {a, b, c} there is a given multiplication (a binary operation) which is not associative. AltOper takes a string consisting of a, b, c, such as "abbac", and evaluates every possible parenthesization of it, such as ((ab)b)(ac) = c or (a(b(ba)))c = a. It counts how many parenthesizations end in a, b, and c respectively, and returns the counts as a triple.
Though this code runs well for small inputs, it takes too much time for somewhat larger ones. I tried memoization for some small inputs, but apparently it's not enough. After struggling for some hours, I finally figured out that its time complexity is simply too large, but I couldn't find a better algorithm. Can anyone suggest an idea for significantly improving or rebuilding the code? It doesn't need to be specific; even a vague idea would be helpful.
public long[] AltOper(String str){
long[] triTuple = new long[3]; // result: {number-of-a, number-of-b, number-of-c}
if (str.length() == 1){ // Ending recursion condition
if (str.equals("a")) triTuple[0]++;
else if (str.equals("b")) triTuple[1]++;
else triTuple[2]++;
return triTuple;
}
String left = "";
String right = str;
while (right.length() > 1){
// splitting string into two, by one character each
left = left + right.substring(0, 1);
right = right.substring(1, right.length());
long[] ltemp = AltOper(left);
long[] rtemp = AltOper(right);
// calculating possible answers from left/right split strings
triTuple[0] += ((ltemp[0] + ltemp[1]) * rtemp[2] + ltemp[2] * rtemp[0]);
triTuple[1] += (ltemp[0] * rtemp[0] + (ltemp[0] + ltemp[1]) * rtemp[1]);
triTuple[2] += (ltemp[1] * rtemp[0] + ltemp[2] * (rtemp[1] + rtemp[2]));
}
return triTuple;
}
One comment up front: I would modify the signature to accept the binary operation as a parameter, so you can easily swap out your "input operation".
public long[] AltOper(BiFunction<long[], long[], long[]> op, String str) {
I recommend using some sort of lookup table for subportions you have already answered. You hinted that you tried this already:
I tried memoization for some small ones, but apparently it's not enough
I wonder what went wrong, since this is a good idea, especially since your inputs are strings, which are both quickly hashable and comparable, so putting them in a map is cheap. You just need to make sure that the map does not eat up all the memory, by ensuring that old, unused entries are dropped. Cache-like maps can do this. I leave it to you to find one that suits your personal preferences.
From there, I would run any recursions through the cache check, to find precalculated results in the map. Small substrings that would otherwise be calculated insanely often are then looked up quickly, which cheapens your algorithm drastically.
I rewrote your code a bit, to allow for various inputs (including different operations):
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiFunction;
import org.junit.jupiter.api.Test;
public class StringOpExplorer {
@Test
public void test() {
BiFunction<long[], long[], long[]> op = (left, right) -> {
long[] r = new long[3];
r[0] += ((left[0] + left[1]) * right[2] + left[2] * right[0]);
r[1] += (left[0] * right[0] + (left[0] + left[1]) * right[1]);
r[2] += (left[1] * right[0] + left[2] * (right[1] + right[2]));
return r;
};
long[] result = new StringOpExplorer().opExplore(op, "abcacbabc");
System.out.println(Arrays.toString(result));
}
@SuppressWarnings("serial")
final LinkedHashMap<String, long[]> cache = new LinkedHashMap<String, long[]>() {
@Override
protected boolean removeEldestEntry(final Map.Entry<String, long[]> eldest) {
return size() > 1_000_000;
}
};
public long[] opExplore(BiFunction<long[], long[], long[]> op, String input) {
// if the input is length 1, we return.
int length = input.length();
if (length == 1) {
long[] result = new long[3];
if (input.equals("a")) {
++result[0];
} else if (input.equals("b")) {
++result[1];
} else if (input.equals("c")) {
++result[2];
}
return result;
}
// This will check, if the result is already known.
long[] result = cache.get(input);
if (result == null) {
// This will calculate the result, if it is not yet known.
result = applyOp(op, input);
cache.put(input, result);
}
return result;
}
public long[] applyOp(BiFunction<long[], long[], long[]> op, String input) {
long[] result = new long[3];
int length = input.length();
for (int i = 1; i < length; ++i) {
// This might be easier to read...
String left = input.substring(0, i);
String right = input.substring(i, length);
// Subcalculation.
long[] leftResult = opExplore(op, left);
long[] rightResult = opExplore(op, right);
// apply operation and add result.
long[] operationResult = op.apply(leftResult, rightResult);
for (int d = 0; d < 3; ++d) {
result[d] += operationResult[d];
}
}
return result;
}
}
The idea of the rewrite was to introduce caching and to isolate the operation from the exploration. After all, your algorithm is in itself an operation, but not the 'operation under test'. So now you could (theoretically) test any operation by changing the BiFunction parameter.
This result is extremely fast, though I really wonder about the applicability...
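As a variation on the caching idea, the same counts can also be filled in bottom-up over index ranges instead of substring keys. The following is a hedged sketch, not the code from the answer: the class AltOperDp and the method count are hypothetical names, the input is assumed to be non-empty and to contain only the characters a, b and c, and the three update formulas are copied from the question. Note that the counts grow very quickly, so a long can overflow for inputs longer than a few dozen characters.

```java
final class AltOperDp {
    // t[i][j] holds the {a, b, c} counts for the substring s[i, j).
    static long[] count(String s) {
        int n = s.length();
        long[][][] t = new long[n][n + 1][];

        // Substrings of length 1: exactly one way, yielding the character itself.
        for (int i = 0; i < n; i++) {
            long[] leaf = new long[3];
            leaf[s.charAt(i) - 'a']++;
            t[i][i + 1] = leaf;
        }

        // Longer substrings, shortest first, so both halves are already known.
        for (int len = 2; len <= n; len++) {
            for (int i = 0; i + len <= n; i++) {
                int j = i + len;
                long[] r = new long[3];
                for (int k = i + 1; k < j; k++) {
                    long[] left = t[i][k];
                    long[] right = t[k][j];
                    r[0] += (left[0] + left[1]) * right[2] + left[2] * right[0];
                    r[1] += left[0] * right[0] + (left[0] + left[1]) * right[1];
                    r[2] += left[1] * right[0] + left[2] * (right[1] + right[2]);
                }
                t[i][j] = r;
            }
        }
        return t[0][n];
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(count("abbac")));
    }
}
```

This does O(n^3) additions and multiplications and needs O(n^2) table entries for an input of length n, with no string hashing or substring allocation.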

Print Tree with 4 nodes (simple forest) for checking a benchmark

I implemented an experimental OOP language and am now benchmarking its garbage collection using a Storage benchmark. I want to check/print the following benchmark for small depths (n = 2, 3, 4, ...).
The tree (each inner node has 4 subnodes) is generated by the buildTreeDepth method. The code is as follows:
import java.util.Arrays;
public final class StorageSimple {
private int count;
private int seed = 74755;
public int randomNext() {
seed = ((seed * 1309) + 13849) & 65535;
return seed;
}
private Object buildTreeDepth(final int depth) {
count++;
if (depth == 1) {
return new Object[randomNext() % 10 + 1];
} else {
Object[] arr = new Object[4];
Arrays.setAll(arr, v -> buildTreeDepth(depth - 1));
return arr;
}
}
public Object benchmark() {
count = 0;
buildTreeDepth(7);
return count;
}
public boolean verifyResult(final Object result) {
return 5461 == (int) result;
}
public static void main(String[] args) {
StorageSimple store = new StorageSimple();
System.out.println("Result: " + store.verifyResult(store.benchmark()));
}
}
Is there a somewhat simple/straightforward way to print the tree generated by buildTreeDepth? Just the short trees of n=3, 4, 5.
As others have already suggested, you could use a library for this. But if you just want a simple algorithm to try on the command line, you can do the following, which is what I always use when printing a tree on the command line (written by hand, so it may contain bugs, but it should convey how this BFS approach works):
queue.add(root);
queue.add(empty);
int count = 1;
while (queue.size() != 1) {
Node poll = queue.poll();
if (poll == empty) {
count = 1;
queue.add(empty);
}
for (Node n : poll.getChildNodes()) {
n.setNodeName(poll.getNodeName(), count++);
queue.add(n);
}
System.out.println(poll.getNodeName());
}
Sample output:
1
1-1 1-2 1-3 1-4
1-1-1 1-1-2 1-1-3 1-2-1 1-2-2 1-3-1 1-3-2 1-4-1
...
And in your case you use array, which seems even easier to print.
Instead of using object arrays, use a List implementation like ArrayList. For a better result, subclass ArrayList to also hold a 'level' value and add indentation in the toString() method.
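If the nested Object[] structure from the question is kept as it is, an indented printout can also be produced with a short recursive walk. The sketch below is only an illustration: ArrayTreePrinter, render and append are hypothetical names, and it assumes, as in buildTreeDepth, that inner nodes contain only Object[] children while leaves are arrays whose elements are all null.

```java
final class ArrayTreePrinter {
    // Returns one line per node, indented by two spaces per level.
    static String render(Object root) {
        StringBuilder out = new StringBuilder();
        append(out, (Object[]) root, 0);
        return out.toString();
    }

    private static void append(StringBuilder out, Object[] node, int level) {
        for (int i = 0; i < level; i++) {
            out.append("  ");
        }
        // A leaf holds only nulls; an inner node holds child arrays.
        boolean inner = node.length > 0 && node[0] instanceof Object[];
        if (!inner) {
            out.append("leaf[").append(node.length).append("]\n");
            return;
        }
        out.append("node\n");
        for (Object child : node) {
            append(out, (Object[]) child, level + 1);
        }
    }

    public static void main(String[] args) {
        // Tiny hand-built tree; in the benchmark the root would come from buildTreeDepth.
        Object[] root = {new Object[3], new Object[1]};
        System.out.print(render(root));
    }
}
```

To use it with the benchmark, buildTreeDepth would have to be made accessible and its return value passed to render.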

How to compare elements of a list with elements of a map in Java?

I have the following code (with some sample data), and wished to check whether there is any better or performant way to compare each element of the list of map to the subsequent one:
import java.util.*;
public class CompareElements {
private static List<Map<String, String>> sample = new ArrayList<>(0);
private static int MIN = 0;
private static int MAX = 10;
static {
populateListOfMaps();
}
/*
* This is the main part of the question, rest is just to generate test data..
*/
public static void main(String[] args){
// Can we simplify this part using lambda's or any library?
for (int i = 0; i < sample.size() -1; i++) {
for (int j = i+1; j < sample.size(); j++) {
Map<String, String> referenceMap = sample.get(i);
Map<String, String> candideMap = sample.get(j);
if(referenceMap.get("key").equalsIgnoreCase(candideMap.get("key"))){
System.out.println("Equal : " + i + " || " + referenceMap.get("key") + " and "+ j + " || " + candideMap.get("key") + " are pairs");
} else {
System.out.println("Not equal : " + i + " || " + referenceMap.get("key") + " and "+ j + " || " + candideMap.get("key") + " are pairs");
}
}
}
}
private static void populateListOfMaps(){
if(sample.size() <= 10){
Map<String, String> someMap = new HashMap<>(0);
someMap.put("key", "value" + randInt(MIN, MAX));
sample.add(someMap);
populateListOfMaps();
}
}
public static int randInt(int min, int max) {
Random rand = new Random();
int randomNum = rand.nextInt((max - min) + 1) + min;
return randomNum;
}
}
My requirement is to compare each element of the list of maps and check for equality in order to remove duplicates. This is the simpler part, but each map in my real-time application has 2 key-value pairs (both are Strings, no custom POJO).
The above code works, but I wish to make it more concise and performant.
Can we use lambdas or streams?
As you are getting data from MongoDB, I assume you have no control over the schema, so using a POJO isn't a simple option. (it can be done with generated code, but you probably don't want to go there)
What you can do is use groupingBy to turn these O(n^2) loops into a single O(n) pass:
public static void main(String... args) {
List<Map<String, String>> sample = populateListOfMaps();
sample.stream()
.collect(Collectors.groupingBy(m -> m.get("key")))
.forEach((key, list) -> System.out.println(key + " : " + list));
}
private static List<Map<String, String>> populateListOfMaps() {
Random rand = new Random();
return IntStream.range(0, 10)
.mapToObj(i -> {
Map<String, String> someMap = new HashMap<>(2);
someMap.put("key", "value-" + rand.nextInt(10));
return someMap;
})
.collect(Collectors.toList());
}
This will print all the entries which have the same "key" value with O(n) time complexity. e.g.
value-9 : [{key=value-9}]
value-8 : [{key=value-8}, {key=value-8}, {key=value-8}]
value-5 : [{key=value-5}]
value-7 : [{key=value-7}, {key=value-7}]
value-1 : [{key=value-1}]
value-0 : [{key=value-0}]
value-2 : [{key=value-2}]
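Since the stated goal is to remove duplicates, the same single-pass idea can keep just the first map per key value instead of grouping all of them. This is a sketch under assumptions: DedupByKey and dedupe are hypothetical names, every map is assumed to contain a non-null "key" entry, and the comparison is made case-insensitive to mirror the equalsIgnoreCase call in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

final class DedupByKey {
    // Keeps the first map seen for each (lower-cased) "key" value, in input order.
    static List<Map<String, String>> dedupe(List<Map<String, String>> maps) {
        Map<String, Map<String, String>> firstPerKey = maps.stream()
                .collect(Collectors.toMap(
                        m -> m.get("key").toLowerCase(Locale.ROOT),
                        m -> m,
                        (first, second) -> first, // on a duplicate, keep the earlier map
                        LinkedHashMap::new));
        return new ArrayList<>(firstPerKey.values());
    }

    public static void main(String[] args) {
        Map<String, String> m1 = new HashMap<>();
        m1.put("key", "Value1");
        Map<String, String> m2 = new HashMap<>();
        m2.put("key", "value1");
        Map<String, String> m3 = new HashMap<>();
        m3.put("key", "value2");
        System.out.println(dedupe(Arrays.asList(m1, m2, m3)));
    }
}
```

For maps with two relevant entries, the key function could combine both values, for example by joining them with a separator that cannot occur in the data.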
I'm not really sure what your exact requirements are, so I'll tackle your question one part at a time:
check whether there is any better or performant way to compare each element of the list of map to the subsequent one:
How about using keySets?
Set<String> s1 = new HashSet< String >(referenceMap.values());
Set<String> s2 = new HashSet< String >(candideMap.values());
// Get intersection of values
s1.retainAll(s2);
// You can also get corresponding keys for each value later
This should reduce your complexity from O(n^2) to O(n)
each map in my real time application has 2 keys-values (but both are String.. no custom POJO object).
Not sure what you mean by real-time. Are the maps changing in real time? Neither your solution nor mine would be thread safe.
Do you mean 2 key-value pairs for each entry? If you mean 2 values for each key, you would probably need to override hashCode() and equals(), and then your code should work.
Let me know if I misunderstood your question.

How to create a Mixed Comparator?

I have a table full of values where the first 2 digits are a year, the next 3 digits are a value between 0 and 999, and the final 2 characters are alphabetic. A few example values: 0, 0, 99001AG, 99002FG, 54001AG, 54050AB. There are also a few instances where the value is just a 6-character String such as SGP4DC, and there will be multiple values that are SGP4DC. The 0's are bad data, but I have to account for them for testing purposes.
Special cases: Because of the two-digit year, when sorting descending, launches from 1999 (e.g. 99001A) are always sorted as "greater" than launches from the 2000s (e.g. 06001A). The special handler should ensure that any items between 00 and 56 are sorted as greater than any items between 57 and 99.
My sorting goal is to first order by the first 2 digits, taking care of the special case above, then by the following 3 digits, and then by a string sort on the last 2 characters. Finally, values that don't start with 2 numeric digits are ordered by a plain String compare.
Example of expected sorted in ascending order would be
0
0
60001AG
60002FB
42001AG
42002GD
APG4GP
APG4GP
Again, note that if the 2 leading digits are greater than or equal to 57, they represent 1957-1999, and if the 2 leading digits are less than 57, they represent 2000-2056.
Finally, my code. Note that I currently have some bogus data in the table with values of 0, so I was attempting to make them less than everything else. I do not have the power to remove the 0's, so I'm trying to code around them, i.e. 0's will always show up after the above sorted list.
@Override
public int compare(String o1, String o2) {
if(o1.equals("0") && o2.equals("0")){
return 0;
}
System.out.println("Comparing " + o1 + " and " + o2);
if (o1.length() == 1) {
return -1;
}
if (o2.length() == 1) {
return 1;
}
String o1year = null;
String o2year = null;
Integer obj1year;
Integer obj2year;
if (o1.length() >= 2) {
o1year = o1.substring(0, 2);
}
if (o2.length() >= 2) {
o2year = o2.substring(0, 2);
}
if (isInteger(o1year)) {
if (isInteger(o2year)) {
obj1year = Integer.parseInt(o1year);
obj2year = Integer.parseInt(o2year);
// handles years 2000 - 2056 being greater than anything from
// ##57-##99
if (obj1year < 57 && obj2year > 56) {
return 1;
}
if (obj1year == obj2year) {
int returnValue = compareIncriment(o1, o2);
if(returnValue == 0){
return o1.compareToIgnoreCase(o2);
}
return returnValue;
}
if (obj1year > obj2year) {
return 1;
} else {
return -1;
}
}
return 1;
}
// object 2 starts with a 2 digit year and object 1 didnt
if (isInteger(o2year)) {
return -1;
}
// final return
return o1.compareToIgnoreCase(o2);
}
private int compareIncriment(String o1, String o2) {
// TODO Auto-generated method stub
int inc1;
int inc2;
if(isInteger(o1.substring(2, 4))){
inc1 = Integer.parseInt(o1.substring(2, 4));
}else if(isInteger(o1.substring(2, 3))){
inc1 = Integer.parseInt(o1.substring(2, 3));
}else{
inc1 = Integer.parseInt(o1.substring(2, 2));
}
if(isInteger(o2.substring(2, 4))){
inc2 = Integer.parseInt(o2.substring(2, 4));
}else if(isInteger(o2.substring(2, 3))){
inc2 = Integer.parseInt(o2.substring(2, 3));
}else{
inc2 = Integer.parseInt(o2.substring(2, 2));
}
return inc1 - inc2;
}
Updated Code***
I currently see nothing in my table, and I'm getting a "Comparison method violates its general contract" error.
You should write unit tests for your comparator to discover bugs. You should also factor your code better, because your function is very hard to understand. First, classify the product code into the "0" case, the year-included case, and the no-year case. If the two codes are not in the same class, return the appropriate result.
If they are in the same class, factor out the specific comparisons into separate functions, or even separate Comparators. Having separate Comparators makes them easier to test; separate functions would be harder to justify making public.
I found one bug by looking at the code: for c.compare("0", "0") it returns -1 when it should return 0. Beyond that, it's really hard to tell.
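The classify-first structure described above might look roughly like this. It is a sketch rather than a drop-in fix: LaunchCodeComparator and its helper methods are hypothetical names, and the class order (the "0" placeholders first, then year-prefixed codes, then all other strings) is an assumption taken from the expected ascending example in the question. Two-digit years are mapped to full years so that 57-99 sort before 00-56.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class LaunchCodeComparator implements Comparator<String> {
    private static boolean isAsciiDigit(char ch) {
        return ch >= '0' && ch <= '9';
    }

    private static boolean hasYear(String s) {
        return s.length() >= 2 && isAsciiDigit(s.charAt(0)) && isAsciiDigit(s.charAt(1));
    }

    // 0 = the "0" placeholder, 1 = year-prefixed code, 2 = anything else.
    private static int rank(String s) {
        if (s.equals("0")) {
            return 0;
        }
        return hasYear(s) ? 1 : 2;
    }

    // 57..99 mean 1957..1999, 00..56 mean 2000..2056.
    private static int fullYear(String s) {
        int yy = Integer.parseInt(s.substring(0, 2));
        return yy >= 57 ? 1900 + yy : 2000 + yy;
    }

    // Up to three digits following the year; 0 if there are none.
    private static int sequence(String s) {
        int end = 2;
        while (end < s.length() && end < 5 && isAsciiDigit(s.charAt(end))) {
            end++;
        }
        return end == 2 ? 0 : Integer.parseInt(s.substring(2, end));
    }

    @Override
    public int compare(String o1, String o2) {
        int r1 = rank(o1);
        int r2 = rank(o2);
        if (r1 != r2) {
            return Integer.compare(r1, r2);
        }
        if (r1 == 1) {
            int byYear = Integer.compare(fullYear(o1), fullYear(o2));
            if (byYear != 0) {
                return byYear;
            }
            int bySequence = Integer.compare(sequence(o1), sequence(o2));
            if (bySequence != 0) {
                return bySequence;
            }
        }
        return o1.compareToIgnoreCase(o2);
    }

    public static void main(String[] args) {
        List<String> codes = new ArrayList<>(Arrays.asList(
                "APG4GP", "42002GD", "0", "60002FB", "42001AG", "60001AG", "0", "APG4GP"));
        codes.sort(new LaunchCodeComparator());
        System.out.println(codes);
    }
}
```

Because every branch compares the same derived values in both directions, compare(a, b) and compare(b, a) always have opposite signs, which is the property the "general contract" error complains about. Each helper is small enough to unit test on its own.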
