I have the following decision table:
My task is to compare all objects (S1, S2, S3, ...) with each other using a chosen attribute set (e.g. {Distance, Capacity}). To achieve this I have to create two LOOPS (one nested) and use an IF condition.
When the object set is small, everything works fine. But when the set is big (e.g. 10,000 objects) the performance of this solution gets worse...
Is there another, faster, "smarter" way to do it?
Pseudocode:
Step 1. HashMap<String, ArrayList<String>> hashMap = new HashMap<>();
Step 2. For each object s do
String key = getSelectedAttributesValueInString();
if (!hashMap.containsKey(key)) {
hashMap.put(key, new ArrayList<String>());
}
hashMap.get(key).add(s.getName());
Here getSelectedAttributesValueInString is the concatenation of all the selected attribute values.
For example: for object s1 with attributes {Distance, Capacity}, the function returns "ShortYES".
Step 3. Now print the hashMap's ArrayList values that have a length greater than 1.
Complexity analysis:
Your approach: O(n^2)
My approach: O(n), because HashMap put and get are O(1) on average.
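For reference, here is a minimal runnable sketch of the same idea using computeIfAbsent. The DecisionObject record and its fields are placeholders of my own, since the original decision table isn't shown:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class GroupByAttributes {
    // Hypothetical stand-in for one row of the decision table.
    record DecisionObject(String name, String distance, String capacity) {}
    public static void main(String[] args) {
        List<DecisionObject> objects = List.of(
                new DecisionObject("S1", "Short", "YES"),
                new DecisionObject("S2", "Short", "YES"),
                new DecisionObject("S3", "Long", "NO"));
        // Key = concatenation of the selected attribute values, e.g. "ShortYES".
        Map<String, List<String>> groups = new HashMap<>();
        for (DecisionObject s : objects) {
            String key = s.distance() + s.capacity();
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(s.name());
        }
        // Only groups with more than one member contain matching objects.
        groups.values().stream()
                .filter(names -> names.size() > 1)
                .forEach(System.out::println);   // prints [S1, S2]
    }
}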
Is there any way I can assert that 2 lists of Maps are equal, ignoring the order? Thanks.
Example:
List<Map<String, Object>> dataList1 = new ArrayList<>();
List<String> headers1 = new ArrayList<>();
headers1.add("Header1");
headers1.add("Header2");
headers1.add("Header3");
Map<String, Object> dataMap1 = new LinkedHashMap<>();
dataMap1.put(headers1.get(0), "testData1");
dataMap1.put(headers1.get(1), "testData2");
dataMap1.put(headers1.get(2), "testData3");
Map<String, Object> dataMap2 = new LinkedHashMap<>();
dataMap2.put(headers1.get(0), "testData4");
dataMap2.put(headers1.get(1), "testData5");
dataMap2.put(headers1.get(2), "testData6");
dataList1.add(dataMap1);
dataList1.add(dataMap2);
List<Map<String, Object>> dataList2 = new ArrayList<>();
List<String> headers3 = new ArrayList<>();
headers3.add("Header1");
headers3.add("Header2");
headers3.add("Header3");
Map<String, Object> dataMap3 = new LinkedHashMap<>();
dataMap3.put(headers3.get(0), "testData1");
dataMap3.put(headers3.get(1), "testData2");
dataMap3.put(headers3.get(2), "testData3");
Map<String, Object> dataMap4 = new LinkedHashMap<>();
dataMap4.put(headers3.get(0), "testData4");
dataMap4.put(headers3.get(1), "testData5");
dataMap4.put(headers3.get(2), "testData6");
dataList2.add(dataMap4);
dataList2.add(dataMap3);
System.out.println(dataList1);
System.out.println(dataList2);
and the results would be:
[{Header1=testData1, Header2=testData2, Header3=testData3}, {Header1=testData4, Header2=testData5, Header3=testData6}]
[{Header1=testData4, Header2=testData5, Header3=testData6}, {Header1=testData1, Header2=testData2, Header3=testData3}]
I want to get a TRUE result since they are actually the same, just in a different order. Thank you in advance!
EDIT:
Just to add: I am trying to check if the 2 lists of maps from 2 different sources (Excel file vs database data) are equal, so there's a chance that the lists of data contain duplicates.
That is not what a list is for. If you don't care about the order, you should use a Set instead.
However if the order is important but you just want to make sure they contain the same Maps then you could just convert them to Sets for the assertion:
new HashSet<Map<String, Object>>(dataList1).equals(new HashSet<Map<String, Object>>(dataList2))
In the case you're describing, the value lists would rather have "bag" semantics, i.e. they behave like sets that can also contain duplicates (multisets).
To compare those you need to write your own comparison logic (or find a library that provides it, e.g. Apache Commons Collections' CollectionUtils#isEqualCollection() or Hamcrest's Matchers#containsInAnyOrder()) since the default methods won't help. The assert would then be something like assertTrue(mapsAreEqual(actual, expected)) or assertEqual(new MapEqualWrapper(actual), new MapEqualWrapper(expected)) where MapEqualWrapper would implement the logic in its equals().
For the check you could sort the lists (or copies of them) and do a traditional comparison or use a frequency map (after other checks of course):
first check the sizes - if they are different, the lists aren't equal
build a frequency map for the first list, i.e. increment the value by 1 for each occurrence
check the elements in the second list by decreasing the occurrences and removing any frequencies that hit 0
if you hit an element that has no entry in the frequency map, you can stop right there since the bags aren't equal (see the sketch after this answer)
Sorting and comparing would be easier to implement with just a couple of lines but time complexity would be O(n*log(n)) due to the sorting.
On the other hand, time complexity for using the frequency map would basically be O(n) (iterations are O(n) and map put/get/remove should be O(1) in theory). This, however, shouldn't matter unless you need to quickly compare large lists and thus I'd go with the sort and compare method first.
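A small sketch of the frequency-map variant, assuming the elements (here the maps) have sensible equals()/hashCode(), as LinkedHashMap does; the method name is my own:
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class BagEquality {
    // True if both lists contain the same elements with the same multiplicities.
    static <T> boolean equalsIgnoringOrder(List<T> first, List<T> second) {
        if (first.size() != second.size()) {
            return false;                                  // different sizes: not equal bags
        }
        Map<T, Integer> frequency = new HashMap<>();
        for (T element : first) {
            frequency.merge(element, 1, Integer::sum);     // count occurrences in the first list
        }
        for (T element : second) {
            Integer count = frequency.get(element);
            if (count == null) {
                return false;                              // element occurs more often in the second list
            }
            if (count == 1) {
                frequency.remove(element);                 // last matching occurrence consumed
            } else {
                frequency.put(element, count - 1);
            }
        }
        return frequency.isEmpty();
    }
}
With the example above, equalsIgnoringOrder(dataList1, dataList2) would return true.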
I've been searching StackOverflow for an answer regarding this issue.
Let's say I created two array lists:
arraylist1 holds Strings,
arraylist2 holds Integers.
NOTE - I've added things into both of these arraylists. The value at each index of one list is related to the value at the same index of the other.
Meaning, let's say index 1 of arraylist1 = "Name" and index 1 of arraylist2 = 3; they are related in that I want to put them into a TreeMap (for the purpose of sorting by key) so that the TreeMap contains the entry ("Name", 3).
My problem -
TreeMap<String, Integer> mymap = new TreeMap<String, Integer>();
for (String s : arraylist1) {
    for (Integer v : arraylist2) {
        mymap.put(s, v);
    }
}
The problem with this is if I added a bunch of random things for testing,
arraylist1.add("h");
arraylist1.add("i");
arraylist1.add("e");
arraylist2.add(1);
arraylist2.add(3);
arraylist2.add(2);
And I did the for loops, my result would come out to...
Key e Value: 2
Key h Value: 2
Key i Value: 2
Which solves the problem of sorting by key. However, the problem is that only the last value in the Integer arraylist, arraylist2, is being put into the TreeMap.
Don't you want to iterate through both lists at the same time?
for (int i = 0; i < list1.size(); i++) {
    map.put(list1.get(i), list2.get(i));
}
As it currently stands, you're iterating over your entire second list for each key and inserting each of its values in turn. Since a map only holds one value per key, each key ends up with the final value of your second list.
As an aside, if these 2 values are intrinsically linked, perhaps create (or use an existing) Pair object to store these from the outset. The problem with using a standard map is that you can only store one value per key. e.g. you can't store (A,B) and (A,C)
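For what it's worth, a sketch of both suggestions; the Pair record here is illustrative, not an existing library class:
import java.util.List;
import java.util.TreeMap;
public class ZipIntoTreeMap {
    // Illustrative pair type; Map.Entry or a library class would also work.
    record Pair<A, B>(A first, B second) {}
    public static void main(String[] args) {
        List<String> names = List.of("h", "i", "e");
        List<Integer> values = List.of(1, 3, 2);
        // Walk both lists by index instead of nesting the loops.
        TreeMap<String, Integer> map = new TreeMap<>();
        for (int i = 0; i < Math.min(names.size(), values.size()); i++) {
            map.put(names.get(i), values.get(i));
        }
        System.out.println(map);   // {e=2, h=1, i=3} -- sorted by key
        // If one key could need several values, keep pairs in a list instead of a map.
        List<Pair<String, Integer>> pairs = List.of(new Pair<>("A", 1), new Pair<>("A", 2));
        System.out.println(pairs);
    }
}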
Well fark, I was typing this for your deleted question before you deleted it, but didn't finish until after you deleted it:
I'm not sure if I fully understand your question, but consider:
Creating a Game class that is in charge of the logic of the game.
Creating Player objects that interact with the Game.
When a Player wants to move, it calls Game's move(Piece piece, Position position) method with the appropriate arguments.
Have this method return a boolean: true if the move is valid and false if not (a rough sketch follows below)...
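A very rough sketch of that shape; every class and method here is illustrative, since the original (deleted) question isn't visible:
public class GameSketch {
    // Placeholder domain types.
    record Piece(String name) {}
    record Position(int row, int col) {}
    static class Game {
        // Returns true if the move is valid and was applied, false otherwise.
        boolean move(Piece piece, Position position) {
            boolean valid = position.row() >= 0 && position.col() >= 0;  // stand-in rule
            if (valid) {
                // ... update the board state here ...
            }
            return valid;
        }
    }
    static class Player {
        private final Game game;
        Player(Game game) { this.game = game; }
        void tryMove(Piece piece, Position position) {
            if (!game.move(piece, position)) {
                System.out.println("Invalid move, try again");
            }
        }
    }
    public static void main(String[] args) {
        new Player(new Game()).tryMove(new Piece("pawn"), new Position(1, 1));
    }
}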
I have a set of strings. Out of these, groups of 2 or more may represent the same thing. These groups should be stored in a way that given any member of the group, you can fetch other members of the group with high efficiency.
So given this initial set: ["a","b1","b2","c1","c2","c3"] the result structure should be something like ["a",["b1","b2"],["c1","c2","c3"]] and Fetch("b") should return ["b1","b2"].
Is there a specific data structure and/or algorithm for this purpose?
EDIT: "b1" and "b2" are not actual strings, they're indicating that the 2 belong to the same group. Otherwise a Trie would be a perfect fit.
I may be misinterpreting the initial problem setup, but I believe that there is a simple and elegant solution to this problem using off-the-shelf data structures. The idea is, at a high level, to create a map from strings to sets of strings. Each key in the map will be associated with the set of strings that it's equal to. Assuming that each string in a group is mapped to the same set of strings, this can be done time- and space-efficiently.
The algorithm would probably look like this:
Construct a map M from strings to sets of strings.
Group all strings together that are equal to one another (this step depends on how the strings and groups are specified).
For each cluster:
Create a canonical set of the strings in that cluster.
Add each string to the map as a key whose value is the canonical set.
This algorithm and the resulting data structure are quite efficient. Assuming that you already know the clusters in advance, this process (using a trie as the implementation of the map and a simple list as the data structure for the sets) requires you to visit each character of each input string exactly twice - once when inserting it into the trie and once when adding it to the set of strings equal to it, assuming that you're making a deep copy. It is therefore an O(n) algorithm; a compact sketch follows below.
Moreover, lookup is quite fast - to find the set of strings equal to some string, just walk the trie to find the string, look up the associated set of strings, then iterate over it. This takes O(L + k) time, where L is the length of the string and k is the number of matches.
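A compact sketch of the same idea with a HashMap standing in for the trie, assuming the clusters are already known:
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class CanonicalSets {
    public static void main(String[] args) {
        // Clusters assumed to be given up front.
        List<List<String>> clusters = List.of(
                List.of("a"),
                List.of("b1", "b2"),
                List.of("c1", "c2", "c3"));
        // Map every member to the one canonical collection shared by its cluster.
        Map<String, List<String>> groupOf = new HashMap<>();
        for (List<String> cluster : clusters) {
            for (String member : cluster) {
                groupOf.put(member, cluster);   // all members point at the same list instance
            }
        }
        System.out.println(groupOf.get("b2"));  // [b1, b2]
    }
}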
Hope this helps, and let me know if I've misinterpreted the problem statement!
Since this is Java, I would use a HashMap<String, Set<String>>. This would map each string to its equivalence set (which would contain that string and all others that belong to the same group). How you would construct the equivalence sets from the input depends on how you define "equivalent". If the inputs are in order by group (but not actually grouped), and if you had a predicate implemented to test equivalence, you could do something like this:
boolean differentGroups(String a, String b) {
    // equivalence test (must handle a == null)
    return a == null;   // placeholder - replace with your real equivalence test
}
Map<String, Set<String>> makeMap(ArrayList<String> input) {
Map<String, Set<String>> map = new HashMap<String, Set<String>>();
String representative = null;
Set<String> group = null;   // assigned as soon as the first representative is seen
for (String next : input) {
if (differentGroups(representative, next)) {
representative = next;
group = new HashSet<String>();
}
group.add(next);
map.put(next, group);
}
return map;
}
Note that this only works if the groups are contiguous elements in the input. If they aren't you'll need more complex bookkeeping to build the group structure.
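One possible shape for that bookkeeping, as a hedged sketch of my own (not part of the answer above): when equivalences arrive as arbitrary pairs, merge the two groups and re-point every member of the absorbed group at the merged set.
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
public class MergeGroups {
    // Record that a and b belong to the same group, merging existing groups if needed.
    static void union(Map<String, Set<String>> map, String a, String b) {
        Set<String> groupA = map.computeIfAbsent(a, k -> new HashSet<>(Set.of(k)));
        Set<String> groupB = map.computeIfAbsent(b, k -> new HashSet<>(Set.of(k)));
        if (groupA == groupB) {
            return;                      // already in the same group
        }
        groupA.addAll(groupB);           // naive merge; merging the smaller set into the larger is faster
        for (String member : groupB) {
            map.put(member, groupA);     // re-point members of the absorbed group
        }
    }
    public static void main(String[] args) {
        Map<String, Set<String>> map = new HashMap<>();
        union(map, "b1", "b2");
        union(map, "c1", "c3");
        union(map, "c2", "c3");
        System.out.println(map.get("c1"));   // prints c1, c2, c3 in some order
    }
}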
I have two ArrayList<Long>, each with a huge size of about 500,000 elements. I have tried using a for loop with list.contains(object), but it takes too much time. I have tried splitting one list and comparing in multiple threads, but found no effective result.
I need the number of elements that are the same in both lists.
Any optimized way?
Let l1 be the first list and l2 the second list. In Big O notation, your approach runs in O(l1*l2).
Another approach could be to insert one list into a HashSet, then for every element in the other list test whether it exists in the HashSet. This would take roughly 2*l1 + l2 operations -> O(l1 + l2).
Have you considered putting your elements into a HashSet instead? This would make the lookups much faster. This would of course only work if you don't have duplicates.
If you have duplicates you could construct a HashMap that has the value as the key and the count as the value (see the sketch below).
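A sketch of that count-based variant for the duplicate case (counting how many elements the two lists have in common):
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class CommonElementCount {
    // Counts elements common to both lists, respecting duplicates.
    static long countCommon(List<Long> first, List<Long> second) {
        Map<Long, Integer> counts = new HashMap<>();
        for (Long value : first) {
            counts.merge(value, 1, Integer::sum);     // tally occurrences in the first list
        }
        long common = 0;
        for (Long value : second) {
            Integer remaining = counts.get(value);
            if (remaining != null && remaining > 0) {
                counts.put(value, remaining - 1);     // consume one matching occurrence
                common++;
            }
        }
        return common;
    }
    public static void main(String[] args) {
        List<Long> l1 = new ArrayList<>(List.of(1L, 2L, 2L, 3L));
        List<Long> l2 = new ArrayList<>(List.of(2L, 2L, 4L));
        System.out.println(countCommon(l1, l2));      // 2
    }
}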
The general mechanism would be to sort both lists and then iterate over the sorted lists looking for matches.
A list isn't an efficient data structure when you have many elements; you should use a data structure that is more efficient for searching, for example a tree or a hash map!
Let us assume that list one has m elements and list two has n elements, m > n. If the elements are not numerically ordered, and it seems that they are not, the total number of comparison steps - that is, the cost of the method - is on the order of m*n (roughly n^2/2 on average when the lists are of similar size). In this case the cost factor is about 50000 x 49999.
Keeping both lists ordered would be the optimal solution. If the lists are ordered, the cost of comparing them is on the order of m, in this case about 50000. This optimal result is achieved when both lists are iterated with two cursors. The method can be written as follows:
int i = 0, j = 0;
int count = 0;
while (i < list1.size() && j < list2.size()) {
    int cmp = list1.get(i).compareTo(list2.get(j));   // compare the Long values, not references
    if (cmp == 0) {
        count++;
        i++;
    } else if (cmp < 0) {
        i++;
    } else {
        j++;
    }
}
If it is possible for you to keep the lists ordered all the time, this method will make a difference. Also, I consider that it is not possible to split and compare unless the lists are ordered.
HashMap<Integer, Float> selections = new HashMap<Integer, Float>();
How can I get the Integer key of the 3rd smallest Float value in the whole HashMap?
Edit
I'm using the HashMap for this:
for (InflatedRunner runner : prices.getRunners()) {
for (InflatedMarketPrices.InflatedPrice price : runner.getLayPrices()) {
if (price.getDepth() == 1) {
selections.put(new Integer(runner.getSelectionId()), new Float(price.getPrice()));
}
}
}
I need the runner with the 3rd smallest price with depth 1.
Maybe I should implement this in another way?
Michael Mrozek nails it with his question about whether you're using HashMap right: this is a highly atypical scenario for a HashMap. That said, you can do something like this:
get the Set<Map.Entry<K,V>> from the HashMap<K,V>.entrySet().
addAll to a List<Map.Entry<K,V>>
Collections.sort the list with a custom Comparator<Map.Entry<K,V>> that sorts based on V (a sketch follows below).
If you just need the 3rd Map.Entry<K,V> only, then an O(N) selection algorithm may suffice.
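A short sketch of those steps, using the Integer-to-Float shape from the question:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class ThirdSmallestValue {
    public static void main(String[] args) {
        Map<Integer, Float> selections = new HashMap<>();
        selections.put(101, 3.5f);
        selections.put(102, 1.2f);
        selections.put(103, 2.8f);
        selections.put(104, 4.0f);
        // Copy the entries into a list and sort them by value.
        List<Map.Entry<Integer, Float>> entries = new ArrayList<>(selections.entrySet());
        entries.sort(Map.Entry.comparingByValue());
        // Key of the 3rd smallest value (assumes at least 3 entries).
        Integer thirdKey = entries.get(2).getKey();
        System.out.println(thirdKey);   // prints 101 (its value, 3.5, is the 3rd smallest)
    }
}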
//after edit
It looks like selections should really be a SortedMap<Float, InflatedRunner>. You should look at java.util.TreeMap.
Here's an example of how TreeMap can be used to get the 3rd lowest key:
TreeMap<Integer,String> map = new TreeMap<Integer,String>();
map.put(33, "Three");
map.put(44, "Four");
map.put(11, "One");
map.put(22, "Two");
int thirdKey = map.higherKey(map.higherKey(map.firstKey()));
System.out.println(thirdKey); // prints "33"
Also note how I take advantage of Java's auto-boxing/unboxing feature between int and Integer. I noticed that you used new Integer and new Float in your original code; this is unnecessary.
//another edit
It should be noted that if you have multiple InflatedRunner with the same price, only one will be kept. If this is a problem, and you want to keep all runners, then you can do one of a few things:
If you really need a multi-map (one key can map to multiple values), then you can:
have TreeMap<Float,Set<InflatedRunner>>
Use MultiMap from Google Collections
If you don't need the map functionality, then just have a List<RunnerPricePair> (sorry, I'm not familiar enough with the domain to name it appropriately), where RunnerPricePair implements Comparable<RunnerPricePair> and compares on prices. You can just add all the pairs to the list, then either (see the sketch after this list):
Collections.sort the list and get the 3rd pair
Use an O(N) selection algorithm
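A minimal sketch of the sort-based option, with RunnerPricePair as the placeholder name suggested above and the runner reduced to a plain selection id:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class ThirdLowestPrice {
    // Placeholder pair: a runner's selection id and its price, ordered by price.
    record RunnerPricePair(int selectionId, float price) implements Comparable<RunnerPricePair> {
        @Override
        public int compareTo(RunnerPricePair other) {
            return Float.compare(this.price, other.price);
        }
    }
    public static void main(String[] args) {
        List<RunnerPricePair> pairs = new ArrayList<>(List.of(
                new RunnerPricePair(1, 3.5f),
                new RunnerPricePair(2, 1.2f),
                new RunnerPricePair(3, 2.8f),
                new RunnerPricePair(4, 4.0f)));
        Collections.sort(pairs);                   // duplicate prices are kept, unlike with a map
        RunnerPricePair third = pairs.get(2);      // pair with the 3rd lowest price
        System.out.println(third.selectionId());   // prints 1
    }
}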
Are you sure you're using hashmaps right? They're used to quickly look up a value given a key; it's highly unusual to sort the values and then try to find a corresponding key. If anything, you should be mapping the float to the int, so you could at least sort the float keys and get the integer value of the third smallest that way.
You have to do it in steps:
Get the Collection<V> of values from the Map
Sort the values
Choose the index of the nth smallest
Think about how you want to handle ties.
You could do it with the google collections BiMap, assuming that the Floats are unique.
If you regularly need to get the key of the nth item, consider:
using a TreeMap, which efficiently keeps keys in sorted order
then using a double map (i.e. one TreeMap mapping Integer -> Float, the other mapping Float -> Integer)
You have to weigh up the inelegance and potential risk of bugs from needing to maintain two maps against the scalability benefit of having a structure that efficiently keeps the keys in order.
You may need to think about two keys mapping to the same float (the sketch below handles this with a list of keys per value)...
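A sketch of the double-map idea, keeping a list of ids per price so that two keys mapping to the same float are not lost:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
public class DoubleMapIndex {
    public static void main(String[] args) {
        Map<Integer, Float> byId = new HashMap<>();              // id -> price
        TreeMap<Float, List<Integer>> byPrice = new TreeMap<>(); // price -> ids, kept sorted
        int[] ids = {101, 102, 103, 104};
        float[] prices = {3.5f, 1.2f, 2.8f, 1.2f};
        for (int i = 0; i < ids.length; i++) {
            byId.put(ids[i], prices[i]);
            byPrice.computeIfAbsent(prices[i], p -> new ArrayList<>()).add(ids[i]);
        }
        // Walk the prices in ascending order and report the id holding the 3rd smallest entry.
        int seen = 0;
        for (Map.Entry<Float, List<Integer>> entry : byPrice.entrySet()) {
            for (Integer id : entry.getValue()) {
                if (++seen == 3) {
                    System.out.println(id);   // prints 103 (after the two 1.2 prices comes 2.8)
                }
            }
        }
    }
}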
P.S. Forgot to mention: if this is an occasional operation, and you just need to find the nth smallest (or largest) of a large number of items, you could consider implementing a selection algorithm (effectively, you do a sort, but don't actually bother sorting the subparts of the list whose order makes no difference to the position of the item you're looking for).