Grouping and sorting by date: which is better, HashMap, TreeMap, or a custom implementation? (Java)

I have an ArrayList of Subjects, each with a Date parameter. I want to group them into a sorted array (or some other sorted structure) of Day objects, so that every Day has a Date and contains only the subjects with that date. In other words, I want to group the subjects by date and then retrieve them. I have seen grouping implemented with a HashMap, but that gives me a grouped structure that is not sorted, so afterwards I would have to convert it to, for example, an ArrayList and sort that. Alternatively I could use a TreeMap, which does the same but gives me back a sorted structure, or maybe the best way is simply to write my own sorter that takes an ArrayList<Subject> and returns an ArrayList<Day>. A LinkedHashMap would also work.
I have no idea which is better. What should I choose? One important point: I will most likely not add or delete values in the structure; I will only get them.
UPD: If I use a map, then Date will be the key and the Day object will be the value.
By "get them" I mean iterate through them.
I am doing all this to fill my UI elements with this information, so I will most likely not search the structure later.

Here's what I think you are asking for, but hopefully my answer can help even if it's not exactly it:
fast lookup using a Day as the key
the result of that lookup should be sorted (i.e. multiple times of the same day are ordered)
the possibility to see all subjects sorted by their Day
Here's one option. Use a Map that associates a Day to a sorted list of Subjects, so Map<Day, List<Subject>>. Since you don't need to add to it, you can build your mapping at the start and then sort it before you do any lookups. Here's an outline:
Map<Day, List<Subject>> buildMap(List<Subject> subjects) {
Map<Day, List<Subject>> map = new HashMap<Day, List<Subject>>();
// create a list of subjects for each day
for (Subject subject : subjects) {
if (!map.containsKey(subject.getDate().getDay())) {
map.put(subject.getDate().getDay(), new ArrayList<Subject>());
}
map.get(subject.getDate().getDay()).add(subject);
}
// sort each day's list now that the subjects are grouped;
// this assumes Subject implements Comparable (a Comparator would also work)
for (List<Subject> daySubjects : map.values()) {
Collections.sort(daySubjects);
}
return map;
}
If you also need to be able to 'get' every entry sorted throughout the map, you could maintain a sorted list of days. Like so:
List<Day> buildSortedDaysList(Map<Day, List<Subject>> map) {
List<Day> sortedDays = new ArrayList<Day>(map.keySet());
// again, many ways to sort, but I assume Day implements Comparable
Collections.sort(sortedDays);
return sortedDays;
}
You could then wrap it in a class, for which I recommend you choose a better name:
class SortedMapThing {
Map<Day, List<Subject>> map;
List<Day> orderedDays;
SortedMapThing(List<Subject> subjects) {
map = buildMap(subjects);
orderedDays = buildSortedDaysList(map);
}
List<Subject> getSubject(Day day) {
return map.get(day);
}
List<Subject> getAllSubjects() {
List<Subject> subjects = new ArrayList<Subject>();
for (Day day : orderedDays) {
subjects.addAll(map.get(day));
}
return subjects;
}
}
This implementation puts the work up front and gives you efficient lookup speed. If I misunderstood your question slightly, you should be able to adjust it accordingly. If I misunderstood your question entirely...I will be sad. Cheers!
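Since the question only needs to iterate over the days in order, a TreeMap can do the grouping and the sorting in one pass. Here is a minimal sketch of that alternative, assuming a hypothetical Subject class that carries a java.time.LocalDate (the class and field names are illustrative and not taken from the question):

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupSubjectsByDate {
    // Hypothetical stand-in for the asker's Subject class.
    static class Subject {
        final String name;
        final LocalDate date;

        Subject(String name, LocalDate date) {
            this.name = name;
            this.date = date;
        }
    }

    // Groups subjects by date; the TreeMap keeps its keys in ascending date order.
    static TreeMap<LocalDate, List<Subject>> groupByDate(List<Subject> subjects) {
        TreeMap<LocalDate, List<Subject>> byDate = new TreeMap<>();
        for (Subject s : subjects) {
            // Create the day's list on first sight of the date, then append.
            byDate.computeIfAbsent(s.date, d -> new ArrayList<>()).add(s);
        }
        return byDate;
    }

    public static void main(String[] args) {
        List<Subject> subjects = Arrays.asList(
                new Subject("Math", LocalDate.of(2024, 3, 2)),
                new Subject("Art", LocalDate.of(2024, 3, 1)),
                new Subject("Physics", LocalDate.of(2024, 3, 2)));
        // Iteration visits the dates in sorted order, ready to fill UI elements.
        for (Map.Entry<LocalDate, List<Subject>> e : groupByDate(subjects).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().size() + " subject(s)");
        }
    }
}
```

For a structure that is built once and then only iterated, the difference between this and HashMap plus a sorted key list is likely small; the TreeMap version mainly saves the separate sorting step.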

Related

Efficient ways to traverse and group similar objects from a huge collection

I am currently working on an implementation that involves going through an ArrayList of objects, say 1000 of them, finding commonalities in their properties, and grouping them.
For example
ArrayList itemList<CustomJaxbObj> = {Obj1,obj2,....objn} //n can reach to 1000
Object attributes - year of registration, location, amount
Grouping criteria - for objects with same year of reg and location...add the amount
If there are 10 objects, of which 8 have the same location and year of registration, add the amounts of all 8, and do the same for the other 2, whose year of registration and location match each other. So at the end of the operation I am left with 2 objects: one holding the total of the 8 matched objects and one holding the total of the other 2.
Currently I am using two nested traditional loops. Enhanced for loops are nicer, but they don't offer much control over indices, which I need to perform the grouping: the indices let me keep track of which individual entries were combined to form each grouped entry.
for (int i = 0; i < objList.size(); i++) {
for (int j = i + 1; j < objList.size(); j++) {
// Perform the check with an if/else condition and traverse the whole list
}
}
Although this does the job, it looks very inefficient and process-heavy. Is there a better way to do this? I have seen other answers that suggest Java 8 streams, but the operations are complex and the grouping still needs to be done. The example above only adds amounts when there is a match, but there is more to it than just adding.
Is there a better approach to this? A better data structure to hold data of this kind which makes searching and grouping easier?
Adding more perspective, apologies for not furnishing this info before.
The arraylist is a collection of jaxb objects from an incoming payload xml.
XML hierarchy
<Item>
<Item1>
<Item-Loc/>
<ItemID/>
<Item-YearofReg/>
<Item-Details>
<ItemID/>
<Item-RefurbishMentDate/>
<ItemRefurbLoc/>
</Item-Details>
</Item1>
<Item2></Item2>
<Item3></Item3>
....
</Item>
So the JAXB object for Item has a list of 900-1000 items. Each item might have an ItemDetails subsection that contains a refurbishment date. The problem I face is that the nested loops work fine when there is no ItemDetails section and every item can be traversed and checked. The requirement says that if the item has been refurbished, we ignore its year of registration and use the year of refurbishment to match the criteria instead.
Another point is that item details need not sit under the item they belong to: Item1's details can appear in Item2's Item-Details section. The item ID is the field used to map each item to its item details.
This means I cannot start making changes until I have read through the complete list. A normal for loop could do that, but it would increase the cyclomatic complexity, which is already high because of the nested loops.
Hence the question: I need a data structure that first stores and analyses the list of objects before the grouping is performed.
Apologies for not mentioning this before. My first question in stackoverflow, hence the inexperience.
I'm not 100% sure what your end goal is, but here is something to get you started. To group by the two properties, you can do something like:
Map<String, Map<Integer, List<MyObjectType>>> map = itemList.stream()
.collect(Collectors.groupingBy(MyObjectType::getLoc,
Collectors.groupingBy(MyObjectType::getYear)));
The solution above assumes getLoc returns a String and getYear returns an Integer; you can then perform further stream operations to get the sum you want.
You can use a hash map to add up the amounts of the elements that have the same year of registration and location.
You can use Collectors.groupingBy(classifier, downstream) with Collectors.summingInt as the downstream collector. You didn't post the class of the objects, so I took the liberty of defining my own, but the idea is similar. I also used AbstractMap.SimpleEntry as the key of the final map.
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
public class GroupByYearAndLoc {
static class Node {
private Integer year;
private String loc;
private int value;
Node(final Integer year, final String loc, final int value) {
this.year = year;
this.loc = loc;
this.value = value;
}
}
public static void main(String[] args) {
List<Node> nodes = new ArrayList<>();
nodes.add(new Node(2017, "A", 10));
nodes.add(new Node(2017, "A", 12));
nodes.add(new Node(2017, "B", 13));
nodes.add(new Node(2016, "A", 10));
Map<AbstractMap.SimpleEntry<Integer, String>, Integer> sums = nodes.stream()
// group by year and location, then sum the value.
.collect(Collectors.groupingBy(n-> new AbstractMap.SimpleEntry<>(n.year, n.loc), Collectors.summingInt(x->x.value)));
sums.forEach((k, v)->{
System.out.printf("(%d, %s) = %d\n", k.getKey(), k.getValue(), v);
});
}
}
And the output:
(2017, A) = 22
(2016, A) = 10
(2017, B) = 13
I would make "Year+Location" concatenated be the key in a hashmap, and then let that map hold whatever is associated with each unique key. Then you can just have one "for loop" (not nested looping). That's the simplest approach.
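The single-loop idea above could look roughly like the following sketch. The Item class and its fields are assumptions made for illustration (the asker's JAXB class was not posted), and only the summing case is shown:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SumByYearAndLocation {
    // Hypothetical item type; the field names are assumptions, not the asker's JAXB class.
    static class Item {
        final int year;
        final String location;
        final int amount;

        Item(int year, String location, int amount) {
            this.year = year;
            this.location = location;
            this.amount = amount;
        }
    }

    // One pass over the list: each item is added to the running total for its key.
    static Map<String, Integer> sumAmounts(List<Item> items) {
        // LinkedHashMap keeps the keys in first-seen order, which makes output predictable.
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (Item item : items) {
            // The separator keeps the year part and the location part of the key apart.
            String key = item.year + "|" + item.location;
            totals.merge(key, item.amount, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item(2017, "A", 10),
                new Item(2017, "A", 12),
                new Item(2017, "B", 13),
                new Item(2016, "A", 10));
        System.out.println(sumAmounts(items));
    }
}
```

If the grouped result must also record which entries were combined, the map value could be a list of items instead of a running sum, still filled in the same single loop.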

Fastest and optimized way to search for value in a List<T>

I have a List<Person> persons = new ArrayList<Person>(), the size of this list is 100+. I want to check whether a particular personID object is contained in this list or not. Currently I am doing it in this way :
for(Person person : persons)
{
for(Long pid : listOfIDs)
{
if(person.personid == pid)
{
// do something
}
else
{
// do something else
}
} // end of inner for
}
But I don't want to traverse through the persons list for each element in listOfIDs. I thought of taking HashMap of Person with personid as the key and Person object as value. So that I can only traverse through listOfIDs and check for contains()
Is there any other way to do it?
Your implementation with nested loops will not scale well if the lists get long. The number of operations you will do is the product of the length of the two lists.
If at least one of your lists is sorted by ID, you can use binary search. This will be an improvement over nested loops.
Building a Map is a good idea and will scale well. Using this technique, you will iterate over the list of Persons once to build the map and then iterate over the list of IDs once to do the lookups. Make sure that you initialize the size of the HashMap with the number of Persons (so you don't have to rehash as you put the Persons into the Map). This is a very scalable option and does not require that either list be sorted.
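A minimal sketch of the map-based approach, assuming a hypothetical Person class with a long personId field (the names are illustrative). Note that the HashMap constructor argument is a capacity, not an element count, so the sketch sizes it for the default 0.75 load factor:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PersonLookup {
    // Hypothetical Person type; only the ID matters for the lookup.
    static class Person {
        final long personId;
        final String name;

        Person(long personId, String name) {
            this.personId = personId;
            this.name = name;
        }
    }

    // One pass over the persons to build the index.
    static Map<Long, Person> indexById(List<Person> persons) {
        // Capacity chosen so that persons.size() entries fit without a rehash
        // at the default load factor of 0.75.
        Map<Long, Person> byId = new HashMap<>((int) (persons.size() / 0.75f) + 1);
        for (Person p : persons) {
            byId.put(p.personId, p);
        }
        return byId;
    }

    // One pass over the IDs; each lookup is a constant-time hash probe on average.
    static int countKnownIds(List<Person> persons, List<Long> ids) {
        Map<Long, Person> byId = indexById(persons);
        int found = 0;
        for (Long id : ids) {
            if (byId.containsKey(id)) {
                found++;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(new Person(1L, "Ann"), new Person(2L, "Bob"));
        System.out.println(countKnownIds(persons, Arrays.asList(2L, 3L)));
    }
}
```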
If BOTH lists happen to be sorted by ID, you have another attractive alternative: jointly walk down the two lists. You will start at the beginning of both lists and move forward in the list with the smallest ID. If the IDs are equal, then you do your business logic for having found the person with that ID and step forward in both lists. As soon as you get to the end of either list, you are done.
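The joint walk over two sorted lists can be sketched as follows. For brevity this illustration works on two lists of IDs rather than on Person objects, which is an assumption; both lists must already be sorted in ascending order:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortedIdWalk {
    // Returns the IDs present in both lists; both inputs must be sorted ascending.
    static List<Long> commonIds(List<Long> personIds, List<Long> wantedIds) {
        List<Long> matches = new ArrayList<>();
        int i = 0;
        int j = 0;
        // Stop as soon as either list is exhausted.
        while (i < personIds.size() && j < wantedIds.size()) {
            int cmp = Long.compare(personIds.get(i), wantedIds.get(j));
            if (cmp < 0) {
                i++; // person ID is smaller: advance in the person list
            } else if (cmp > 0) {
                j++; // wanted ID is smaller: advance in the ID list
            } else {
                matches.add(personIds.get(i)); // equal: record the match, advance both
                i++;
                j++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(commonIds(Arrays.asList(1L, 3L, 5L, 7L), Arrays.asList(2L, 3L, 7L, 9L)));
    }
}
```

Each element of either list is visited at most once, so the walk is linear in the combined length of the two lists.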
Java's Collections provides a binary search which is very fast but it assumes you are searching for a member of the list. You could implement your own using your ID criteria:
// sort by ID first (assumes personID is a long); Long.compare avoids
// the overflow and type problems of subtracting the IDs
Collections.sort(persons, (p1, p2) -> Long.compare(p1.personID, p2.personID));
if (binarySearch(persons, id)) {
    ...
}

boolean binarySearch(List<Person> personList, Long id) {
    if (personList.isEmpty())
        return false;
    int indexToTest = personList.size() / 2;
    long idToTest = personList.get(indexToTest).personID;
    if (idToTest < id)
        // the target can only be in the upper half
        return binarySearch(personList.subList(indexToTest + 1, personList.size()), id);
    else if (idToTest > id)
        // the target can only be in the lower half
        return binarySearch(personList.subList(0, indexToTest), id);
    else
        return true;
}
If you don't want to sort your list then you could copy it to a sorted list and search on that: for large lists that would still be much faster than iterating through it. In fact that's pretty similar to keeping a separate hash map (though a hash map could be faster).
If you must iterate, then you can at least use a parallel stream to take advantage of multiple cores if you have them:
if (persons.parallelStream().anyMatch(p -> p.personID == id)) {
...
}

ArrayList Retrieve object by Id

Suppose I have an ArrayList<Account> of my custom Objects which is very simple. For example:
class Account
{
public String Name;
public Integer Id;
}
I want to retrieve the particular Account object based on an Id parameter in many parts of my application. What would be best way of going about this ?
I was thinking of extending ArrayList but I am sure there must be better way.
It sounds like what you really want to use is a Map, which allows you to retrieve values based on a key. If you stick to ArrayList, your only option is to iterate through the whole list and search for the object.
Something like:
for(Account account : accountsList) {
if(account.getId().equals(someId)) {
//found it!
}
}
versus
accountsMap.get(someId)
This sort of operation is O(1) in a Map, vs O(n) in a List.
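If the accounts already live in a list, the map can be built from it once and then reused wherever a lookup is needed. A minimal sketch, assuming an Account class with a getId() getter (the question's class exposes public fields instead) and unique IDs:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class AccountIndex {
    // Illustrative version of the Account class, with a getter added as an assumption.
    static class Account {
        final String name;
        final Integer id;

        Account(String name, Integer id) {
            this.name = name;
            this.id = id;
        }

        Integer getId() {
            return id;
        }
    }

    // Builds an id -> account index in one pass over the list.
    // Collectors.toMap throws IllegalStateException if two accounts share an id.
    static Map<Integer, Account> indexById(List<Account> accounts) {
        return accounts.stream()
                .collect(Collectors.toMap(Account::getId, Function.identity()));
    }

    public static void main(String[] args) {
        Map<Integer, Account> accountsMap = indexById(Arrays.asList(
                new Account("Ann", 1),
                new Account("Bob", 2)));
        System.out.println(accountsMap.get(2).name);
    }
}
```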
I was thinking of extending ArrayList but I am sure there must be
better way.
Generally speaking, this is poor design. Read Effective Java Item 16 for a better understanding as to why - or check out this article.
Java Solution:
Account account = accountList.stream().filter(a -> a.getId().equals(YOUR_ID)).findFirst().orElse(null);
Kotlin Solution 1:
val index = accountList.indexOfFirst { it.id == YOUR_ID }
val account = accountList[index]
Kotlin Solution 2:
val account = accountList.first { it.id == YOUR_ID }
A better way to do this would be to use a Map.
In your case, you could implement it in the following way
Map<Integer, Account> accountMap
with account.getId() as the key and the account itself as the value. You can then use the "get" method to retrieve the appropriate Account object.
accountMap.get(id);
Assuming that it is an unordered list, you will need to iterate over the list and check each object.
for(int i = 0; i < sizeOfList; i++) {
list.get(i).equals(/* What you compare against */)
}
There's also the other for syntax:
for(Account a : accountList)
You could put this loop into a helper method that takes an Account and compares it against each item.
For ordered lists, you have more efficient search options, but you will need to implement a search no matter what.
You can use a Map, for example:
private Map<String, Integer> accountMap = new HashMap<>();
for (String account : accounts)
accountMap.put(account, numberofid);
ArrayList does not sort the elements contained. If you want to look for a single element in an ArrayList, you're going to need to loop through the list and compare each one to the value you're looking for.
Account foundAccount = null;
for(Account a : accountList){
if(a.Id.equals(targetID)){
foundAccount = a;
break;
}
}
if(foundAccount != null){
//handle foundAccount
}
else{
//not found
}
Alternatively, you can use a data structure that organizes its contents for lookup.
You'll want to research the Map interface, specifically the HashMap implementation. This lets you store each element under a certain key. So you could place each of your objects in a HashMap with the Id as the key, and then you can directly ask the HashMap whether it has an object for a certain key.
Extending ArrayList is almost never a good solution to your problem. This is a base Java implementation of List, which allows you to store objects in a specific order, and retrieve them by their index.
If you want to be able to index elements using an unique identifier, you may have a look into Map, and its implementation HashMap.
It could help you to solve your problem, by using a Map<Integer, Account>.
Inserting objects: map.put(id, account) instead of list.add(account)
Retrieving objects: map.get(id)
This will be the fastest implementation. But, if you cannot change this, you can still iterate through your ArrayList and find the right account:
for (Account acc : accounts) {
if (acc.getId().equals(yourId)) {
return acc;
}
}
throw new NoSuchElementException();

Sort and dedupe java collections

I have a collection of dates in list form that I want deduplicated and sorted. I'm using Collections.sort to sort the list in ascending date order and then a TreeSet to copy and dedupe the elements from the list. This is a two-step approach. Is there a faster, one-step approach?
EDIT::
Metadata
{
String name;
Date sourceDate;
}
Basically I want to order Metadata object based on the sourceDate and dedupe it too.
You can skip the Collections#sort step: TreeSet will remove duplicates and sort the entries. So basically it is a one line operation:
Set<Date> sortedWithoutDupes = new TreeSet<Date> (yourList);
If the Date is a field in your object, you can either:
have your object implement Comparable and compare objects based on their date
or pass a Comparator<YourObject> as an argument to the TreeSet constructor, that sorts your objects by date
In both cases, you don't need to pre-sort your list.
IMPORTANT NOTE:
TreeSet uses compareTo (or the supplied Comparator's compare method) to decide whether two elements are equal. So if 2 objects have the same date but different names, you should make sure that your compare or compareTo method returns a non-0 value, otherwise the 2 objects will be considered equal and only one will be inserted.
EDIT
The code could look like this (not tested + you should handle nulls):
Comparator<Metadata> comparator = new Comparator<Metadata>() {
@Override
public int compare(Metadata o1, Metadata o2) {
if (o1.sourceDate.equals(o2.sourceDate)) {
return o1.name.compareTo(o2.name);
} else {
return o1.sourceDate.compareTo(o2.sourceDate);
}
}
};
Set<Metadata> sortedWithoutDupes = new TreeSet<Metadata> (comparator);
sortedWithoutDupes.addAll(yourList);
TreeSet will automatically sort its elements, so you shouldn't need to sort the list before adding to the set.
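On Java 8 or later, the same date-then-name ordering can be written more compactly with Comparator.comparing and thenComparing. The following sketch assumes the Metadata class from the question with non-null fields; the constructor is added here only for illustration:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Date;
import java.util.List;
import java.util.TreeSet;

class DedupeMetadata {
    // Mirrors the Metadata class from the question; the constructor is an addition.
    static class Metadata {
        final String name;
        final Date sourceDate;

        Metadata(String name, Date sourceDate) {
            this.name = name;
            this.sourceDate = sourceDate;
        }
    }

    // Orders by date first and uses the name as a tie-breaker, so only entries
    // with the same date AND the same name count as duplicates.
    static final Comparator<Metadata> BY_DATE_THEN_NAME =
            Comparator.comparing((Metadata m) -> m.sourceDate)
                    .thenComparing((Metadata m) -> m.name);

    // Sorting and deduplication happen together while the set is filled.
    static TreeSet<Metadata> sortedWithoutDupes(List<Metadata> input) {
        TreeSet<Metadata> result = new TreeSet<>(BY_DATE_THEN_NAME);
        result.addAll(input);
        return result;
    }

    public static void main(String[] args) {
        List<Metadata> input = Arrays.asList(
                new Metadata("b", new Date(2000L)),
                new Metadata("a", new Date(1000L)),
                new Metadata("b", new Date(2000L)));
        for (Metadata m : sortedWithoutDupes(input)) {
            System.out.println(m.name + " " + m.sourceDate.getTime());
        }
    }
}
```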

Implementing search based on 2 fields in a java class

I am trying to present a simplified version of my requirement here for ease of understanding.
I have this class
public class MyClass {
private byte[] data1;
private byte[] data2;
private long hash1; // Hash value for data1
private long hash2; // Hash value for data2
// getters and setters
}
Now I need to search between 2 List instances of this class, find how many hash1 values match between the 2 lists and, for all matches, how many of the corresponding hash2 values match. The 2 lists will each have about 10 million objects of MyClass.
Now I am planning to iterate over first list and search in the second one. Is there a way I can optimize the search by sorting or ordering in any particular way? Should I sort both list or only 1?
The best solution would be to iterate; there is no faster solution than this. You could create a HashMap and take advantage of the fact that a map does not store the same key twice, but then it has its own creation overhead.
Sort only the second list, iterate over the first, and binary-search in the second: the sort is O(n log n) and the binary searches for n items are O(n log n) in total.
Or use a HashSet for the second list, iterate over the first, and look each element up in the set: O(n).
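The sort-then-binary-search variant can be sketched on plain long arrays of hash values. Extracting the hash1 values into arrays is an assumption made to keep the illustration short:

```java
import java.util.Arrays;

class HashMatchCount {
    // Counts how many values in `first` also occur in `second`.
    // Cost: one O(n log n) sort plus one O(log n) binary search per element of `first`.
    static int countMatches(long[] first, long[] second) {
        long[] sorted = second.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int matches = 0;
        for (long hash : first) {
            // Arrays.binarySearch returns a non-negative index when the key is found.
            if (Arrays.binarySearch(sorted, hash) >= 0) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(countMatches(new long[] {5, 1, 9, 5}, new long[] {9, 2, 5}));
    }
}
```

With around 10 million entries, primitive arrays also avoid the boxing overhead of a HashSet<Long>, which may matter as much as the asymptotic difference.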
If you have to check all the elements, I think you should iterate over the first list and use a HashMap for the second one, as AmitD said.
You just have to correctly override equals and hashCode in your MyClass class. Finally, I recommend using primitive types as much as possible. For example, for the first list, a simple array would be better than a List.
Also, at the beginning you could select which of the two lists is the shorter one (if there's a difference in the size) and iterate over that one.
I think you should create a hashmap for one of the lists (say list1) -
Map<Long, MyClass> map = new HashMap<Long, MyClass>(list1.size());//specify the capacity
//populate map like - put(myClass.getHash1(), myClass) : for each element in the list
Now just iterate through the second list (there is no point in sorting both) -
int hash1MatchCount = 0;
int hash2MatchCount = 0;
for(MyClass myClass : list2) {
MyClass mc = map.get(myClass.getHash1());
if(mc != null) {
hash1MatchCount++;
if(myClass.getHash2() == mc.getHash2()) {
hash2MatchCount++;
}
}
}
Note: Assuming that there is no problem regarding hash1 being duplicates.
