TreeMap filtered view performance


I have a class that has (among other things):
public class TimeSeries {
private final NavigableMap<LocalDate, Double> prices;
public TimeSeries() { prices = new TreeMap<>(); }
private TimeSeries(NavigableMap<LocalDate, Double> prices) {
this.prices = prices;
}
public void add(LocalDate date, double price) { prices.put(date, price); }
public Set<LocalDate> dates() { return prices.keySet(); }
//the 2 methods below are examples of why I need a TreeMap
public double lastPriceAt(LocalDate date) {
Map.Entry<LocalDate, Double> price = prices.floorEntry(date);
return price.getValue(); //after some null checks
}
public TimeSeries between(LocalDate from, LocalDate to) {
return new TimeSeries(this.prices.subMap(from, true, to, true));
}
}
Now I need to have a "filtered" view on the map where only some of the dates are available. To that effect I have added the following method:
public TimeSeries onDates(Set<LocalDate> retainDates) {
TimeSeries filtered = new TimeSeries(new TreeMap<> (this.prices));
filtered.dates().retainAll(retainDates);
return filtered;
}
The onDates method is a huge performance bottleneck, representing 85% of the processing time of the program. And since the program is running millions of simulations, that means hours spent in that method.
How could I improve the performance of that method?

I'd give Guava's ImmutableSortedMap a try, assuming you can use it. It's based on a sorted array rather than a balanced tree, so I guess its overhead is much smaller(*). For building it, you need to employ biziclop's idea as the builder supports no removals.
(*) There's a call to Collections.sort there, but it should be harmless as the collection is already sorted and TimSort is optimized for such a case.
In case your original map doesn't change after creating onDates, maybe a view could help. In case it does, you'd need some "persistent" map, which sounds rather complicated.
Maybe some hacky solution based on sorted arrays and binary search could be fastest, maybe you could even convert LocalDate first to int and then to double and put everything into a single interleaved double[] in order to save memory (and hopefully also time). You'd need your own binary search, but this is rather trivial.
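The sorted-array idea could look roughly like the sketch below. It is only an illustration under a few assumptions: the class name SortedPriceArray and its methods are hypothetical, it keeps two parallel arrays (epoch days and prices) instead of a single interleaved double[], and because the keys live in their own long[] it can use Arrays.binarySearch rather than a hand-written search.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;

// Hypothetical immutable series backed by two parallel arrays sorted by date.
final class SortedPriceArray {
    private final long[] days;     // LocalDate.toEpochDay(), ascending
    private final double[] prices; // prices[i] belongs to days[i]

    private SortedPriceArray(long[] days, double[] prices) {
        this.days = days;
        this.prices = prices;
    }

    // Copies a map into arrays; a NavigableMap iterates in ascending key order.
    static SortedPriceArray of(NavigableMap<LocalDate, Double> source) {
        long[] d = new long[source.size()];
        double[] p = new double[source.size()];
        int i = 0;
        for (Map.Entry<LocalDate, Double> e : source.entrySet()) {
            d[i] = e.getKey().toEpochDay();
            p[i] = e.getValue();
            i++;
        }
        return new SortedPriceArray(d, p);
    }

    // Keeps only the retained dates in a single pass; order is preserved.
    SortedPriceArray onDates(Set<LocalDate> retainDates) {
        long[] d = new long[days.length];
        double[] p = new double[days.length];
        int n = 0;
        for (int i = 0; i < days.length; i++) {
            if (retainDates.contains(LocalDate.ofEpochDay(days[i]))) {
                d[n] = days[i];
                p[n] = prices[i];
                n++;
            }
        }
        return new SortedPriceArray(Arrays.copyOf(d, n), Arrays.copyOf(p, n));
    }

    // Counterpart of floorEntry: price at the latest date <= the given date.
    double lastPriceAt(LocalDate date) {
        int idx = Arrays.binarySearch(days, date.toEpochDay());
        if (idx < 0) {
            idx = -idx - 2; // (insertion point) - 1
        }
        if (idx < 0) {
            throw new IllegalArgumentException("no price on or before " + date);
        }
        return prices[idx];
    }

    int size() {
        return days.length;
    }
}
```

Whether this actually beats a TreeMap in the simulation would have to be measured; the sketch only shows that filtering becomes a linear copy and lookup a binary search.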
The view idea is rather simple, assuming that:
- you don't need all NavigableMap methods, but just a couple of methods
- the original map doesn't change
- only a few elements are missing in retainDates
An example method:
public double lastPriceAt(LocalDate date) {
Map.Entry<LocalDate, Double> price = prices.floorEntry(date);
// retainDates is assumed to be a field of the view; walk back until a retained date is found
while (price != null && !retainDates.contains(price.getKey())) {
price = prices.lowerEntry(price.getKey());
}
return price.getValue(); // after some null checks
}

The simplest optimisation:
public TimeSeries onDates(Set<LocalDate> retainDates) {
TreeMap<LocalDate, Double> filteredPrices = new TreeMap<>();
for (Entry<LocalDate, Double> entry : prices.entrySet() ) {
if (retainDates.contains( entry.getKey() ) ) {
filteredPrices.put( entry.getKey(), entry.getValue() );
}
}
TimeSeries filtered = new TimeSeries( filteredPrices );
return filtered;
}
Saves you the cost of creating a full copy of your map first, then iterating across the copy again to filter.
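A self-contained variation on the same idea, as a rough sketch: when retainDates is smaller than the map, it may be cheaper to look up each retained date than to walk every entry. The DateFilter class and its retain method are hypothetical names, and the sketch assumes the map holds no null values (which is the case when prices are added as primitive doubles).

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper; not part of the original TimeSeries class.
final class DateFilter {
    private DateFilter() {}

    static NavigableMap<LocalDate, Double> retain(NavigableMap<LocalDate, Double> prices,
                                                  Set<LocalDate> retainDates) {
        NavigableMap<LocalDate, Double> filtered = new TreeMap<>();
        if (retainDates.size() < prices.size()) {
            // Few dates to keep: one O(log n) lookup per retained date.
            for (LocalDate date : retainDates) {
                Double price = prices.get(date);
                if (price != null) {
                    filtered.put(date, price);
                }
            }
        } else {
            // Otherwise walk the map once and test membership in the set.
            for (Map.Entry<LocalDate, Double> e : prices.entrySet()) {
                if (retainDates.contains(e.getKey())) {
                    filtered.put(e.getKey(), e.getValue());
                }
            }
        }
        return filtered;
    }
}
```

Both branches produce the same result; which one is faster depends on the relative sizes and should be profiled against the real data.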

Related

How do I manipulate a list of objects by date parameters in a HashMap?

I have a hashmap with key and object like
HashMap<String, List<Object>> profileMap = new HashMap<>();
ArrayList eventList = new ArrayList();
for(Profile profile:Plist) {
profileMap.putIfAbsent(profile.getprofileID(),eventList );
cpToEvent.get(event.getContact_profile()).add(event);
}
Profile object contains information about different events, event date, and profileID associated with that event.
I need to delete the events of a profile where the gap between two events in that profile is more than 1 year.
For that, I need to sort the list so that I can calculate the gap between the events before deleting them.
How do I achieve this?

If you are trying to keep the elements sorted, I recommend using an existing sorted collection type, i.e. a SortedSet implementation such as TreeSet:
Map<String, TreeSet<Object>> profileMap = new HashMap<>();
This requires you to implement the Comparator interface, in which you can define sorting by date.
public class Objekt implements Comparator<Objekt> {
@Override
public int compare(Objekt o1, Objekt o2) {
if (o1.getDate().before(o2.getDate())) {
return -1;
} else if (o1.getDate().after(o2.getDate())) {
return 1;
} else {
return 0;
}
}
}
More on how to implement that here: Compare Object by dates ( implements Comparator)
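As a rough sketch of how the TreeSet suggestion could be wired up: the Event and EventsByProfile classes below are hypothetical stand-ins for the asker's types, and the comparator is passed to the TreeSet instead of being implemented by the element class.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical event type with only the two fields the example needs.
final class Event {
    private final String profileId;
    private final Date date;

    Event(String profileId, Date date) {
        this.profileId = profileId;
        this.date = date;
    }

    String getProfileId() { return profileId; }
    Date getDate() { return date; }
}

// Hypothetical index that keeps each profile's events ordered by date.
final class EventsByProfile {
    private static final Comparator<Event> BY_DATE = Comparator.comparing(Event::getDate);
    private final Map<String, TreeSet<Event>> events = new HashMap<>();

    void add(Event event) {
        // Caveat: a TreeSet treats events with equal dates as duplicates and keeps only one.
        events.computeIfAbsent(event.getProfileId(), id -> new TreeSet<>(BY_DATE)).add(event);
    }

    SortedSet<Event> eventsOf(String profileId) {
        TreeSet<Event> set = events.get(profileId);
        if (set == null) {
            return Collections.emptySortedSet();
        }
        return Collections.unmodifiableSortedSet(set);
    }
}
```

If two events of one profile can share the same date, the comparator would need a tie-breaker (or a sorted List would be the safer choice).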

You can iterate over the HashMap entries and filter on each element's Date, for example to drop everything older than a given year.
Given the Profile class as below
public class Profile {
private Date createdAt;
public Date getCreatedAt() {
return createdAt;
}
public void setCreatedAt(Date createdAt) {
this.createdAt = createdAt;
}
}
And our map is
HashMap<String, Profile> profiles = new HashMap<>();
Then we can simply do as below to get the list of Map.Entry that matches your requirement.
// Date.getYear() is deprecated and returns the year minus 1900, so add 1900 before comparing
List<Map.Entry<String, Profile>> matchProfile = profiles.entrySet().stream().filter(item -> item.getValue().getCreatedAt().getYear() + 1900 > 2015)
.collect(Collectors.toList());

There are several constraints you should have in mind, mostly regarding modifying your existing objects.
The simplest code that processes your items is this:
Map<String, List<Profile>> profileMap= ...;
profileMap.forEach((k, v) -> {
v.sort(Comparator.comparing(Profile::getDate));
// additional processing on "v" here (v is the value in the Map.Entry, i.e. the list of profiles)
});
But the code above modifies the List which exists in your map.
If you need to preserve the existing List as-is, then instead of sorting v, you should create a new List and then process that.
profileMap.entrySet().forEach(e -> {
List<Profile> profiles = new ArrayList<>(e.getValue());
profiles.sort(Comparator.comparing(Profile::getDate));
e.setValue(profiles);
});
The code above modifies the profileMap, it now maps the original keys to new values.
Again, if that is not ok, and you want to preserve the original profileMap entirely, then in the forEach above you need to fill a new Map instead of setValue-ing the existing entries.
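Filling a new map could be sketched as follows. The helper is written generically so it does not depend on the Profile class; SortedCopies and sortedCopy are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: leaves the source map and its lists untouched.
final class SortedCopies {
    private SortedCopies() {}

    static <K, T> Map<K, List<T>> sortedCopy(Map<K, List<T>> source, Comparator<? super T> order) {
        Map<K, List<T>> result = new HashMap<>();
        for (Map.Entry<K, List<T>> entry : source.entrySet()) {
            List<T> copy = new ArrayList<>(entry.getValue()); // copy first, then sort the copy
            copy.sort(order);
            result.put(entry.getKey(), copy);
        }
        return result;
    }
}
```

With the types from this answer, a call would look like SortedCopies.sortedCopy(profileMap, Comparator.comparing(Profile::getDate)).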
Make sure to focus on solving or improving the overall product, not just on a small piece of the processing. Sometimes, the best way to improve a process is to eliminate some parts of it entirely and adjust the remaining pieces.
Why do you have a huge list of events that is both unsorted and containing obsolete entries? Can you sort the events when receiving them? Or when reading them from the database?

Should I sort a hashmap that contains frequency with bucketsort or heapsort?

I have a hashmap in Java in this form HashMap<String, Integer> frequency. The key is a string where I hold the name of a movie and the value is the frequency of the said movie.
My program takes input from users so whenever someone is adding a video to favorite I go in the hashmap and I increment its frequency.
Now the problem is that at some point I need to get the k most frequent movies. I've found that I could use bucketsort or heapsort in this leetcode problem (check the first comment), however I am not sure if it is more efficient in my case. My hashmap constantly updates, therefore I need to call the sorting algorithm again every time a frequency changes.
From my understanding, it takes O(N) time to build the map, where 'N' is the number of movies including duplicates (each occurrence increments a frequency), which gets me 'M' unique movie titles. Would that mean that heapsort will result in O(M * log(k)) and bucketsort O(M) for any given k?

Having a map that sorts on values (the thing you map to) isn't a thing, unfortunately. You could instead have a set whose keys sort themselves on frequency, but given that frequency is the key at that point, you couldn't look up entries in this set without knowing the frequency beforehand which eliminates the point of the exercise.
One strategy that comes to mind is to have 2 separate data structures. One serves to let you look up the actual object based on the name of the movie, the other is to be self-sorting:
@Data
public class MovieFrequencyTuple implements Comparable<MovieFrequencyTuple> {
@NonNull private final String name;
private int frequency;
public void incrementFrequency() {
frequency++;
}
@Override public int compareTo(MovieFrequencyTuple other) {
int c = Integer.compare(frequency, other.frequency);
if (c != 0) return -c;
return name.compareTo(other.name);
}
}
and with that available to you:
SortedSet<MovieFrequencyTuple> frequencies = new TreeSet<>();
Map<String, MovieFrequencyTuple> movies = new HashMap<>();
public int increment(String movieName) {
MovieFrequencyTuple tuple = movies.get(movieName);
if (tuple == null) {
tuple = new MovieFrequencyTuple(movieName);
movies.put(movieName, tuple);
}
// Self-sorting data structures will just fail
// to do the job if you modify a sorting order on
// an object already in the collection. Thus,
// we take it out, modify, put it back in.
frequencies.remove(tuple);
tuple.incrementFrequency();
frequencies.add(tuple);
return tuple.getFrequency();
}
public int get(String movieName) {
MovieFrequencyTuple tuple = movies.get(movieName);
if (tuple == null) return 0;
return tuple.getFrequency();
}
public List<String> getTop10() {
var out = new ArrayList<String>();
for (MovieFrequencyTuple tuple : frequencies) {
out.add(tuple.getName());
if (out.size() == 10) break;
}
return out;
}
Each operation is amortized O(1) or O(logn), even the top10 operation. So, if you run a million times 'increment a movie's frequency, then obtain the top 10', with n = # of times we do that, then the worst case scenario is O(nlogn) performance.
NB: Uses lombok for constructors, getters, etc - if you don't like that, have your IDE generate these things.
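For comparison, the heap approach mentioned in the question could be sketched like this. It recomputes the top k from the frequency map on each call using a min-heap bounded to k elements, which is where the O(M * log(k)) estimate comes from. TopK and topK are hypothetical names, and the order among movies with equal counts is left unspecified.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper: k most frequent keys, most frequent first.
final class TopK {
    private TopK() {}

    static List<String> topK(Map<String, Integer> frequency, int k) {
        List<String> out = new ArrayList<>();
        if (k <= 0) {
            return out;
        }
        Comparator<Map.Entry<String, Integer>> byCount = Map.Entry.comparingByValue();
        // Min-heap: the head is always the least frequent of the current candidates.
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(byCount);
        for (Map.Entry<String, Integer> entry : frequency.entrySet()) {
            heap.offer(entry);
            if (heap.size() > k) {
                heap.poll(); // evict the current minimum, keeping at most k entries
            }
        }
        while (!heap.isEmpty()) {
            out.add(heap.poll().getKey()); // ascending by count
        }
        Collections.reverse(out); // most frequent first
        return out;
    }
}
```

Unlike the two-structure solution above, nothing is kept sorted between calls, so this fits best when top-k queries are rare compared to increments.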

Appending to a list within a stream to a map

I'm attempting to consolidate multiple unnecessary web requests into a map, with the key connected to a location's ID, and the value being a list of products at that location.
The idea is to reduce the amount of requests to my flask server by creating a single request for each location, with a list of required products mapped to it.
I have tried to find others who have faced a similar problem using Java 8's streaming functionality, but I cannot find anyone who is trying to append to a list within a map.
Example:
public class Product {
public Integer productNumber();
public Integer locationNumber();
}
List<Product> products = ... (imagine many products in this list)
Map<Integer, List<Integer>> results = products.stream()
.collect(Collectors.toMap(p -> p.locationNumber, p -> Arrays.asList(p.productNumber));
Also, the second p parameter cannot access the current product in stream.
Because of this, I have been unable to test if I can append to a List when the location number matches a pre-existing list. I don't believe I can use Arrays.asList(), as I believe it's immutable.
At the end, the map should have many product numbers in a list per location. Is it possible to append Integers to a pre-existing list within a map?

You may do it like so,
Map<Integer, List<Integer>> res = products.stream()
.collect(Collectors.groupingBy(Product::locationNumber,
Collectors.mapping(Product::productNumber, Collectors.toList())));

The Java Collectors API is pretty powerful and has lots of nice utility methods to solve this.
public class Learn {
static class Product {
final Integer productNumber;
final Integer locationNumber;
Product(Integer productNumber, Integer locationNumber) {
this.productNumber = productNumber;
this.locationNumber = locationNumber;
}
Integer getProductNumber() {
return productNumber;
}
Integer getLocationNumber() {
return locationNumber;
}
}
public static Product of(int i, int j){
return new Product(i,j);
}
public static void main(String[] args) {
List<Product> productList = Arrays.asList(of(1,1),of(2,1),of(3,1),
of(7,2),of(8,2),of(9,2));
Map<Integer, List<Integer>> results = productList.stream().collect(Collectors.groupingBy(Product::getLocationNumber,
Collectors.collectingAndThen(Collectors.toList(), pl->pl.stream().map(Product::getProductNumber).collect(Collectors.toList()))));
System.out.println(results);
}
}
So, what we are doing here is we are streaming the product list and grouping the stream by the location attribute but with the twist that we want to transform the collected list of products to list of product numbers.
Collectors.collectingAndThen is precisely the method for this: it lets you specify a main collector, toList(), and a transformer function, which is again just a stream that maps each product to its product number. In the Java API docs the main collector and the transformer are called the downstream collector and the finisher.
Please go through the Collectors source code to have a complete understanding as to how all these different collectors are defined.

Iterate over two TreeMap at the same time in Java

I have two maps:
Map<Date, List<Journey>> journeyMap = new TreeMap<Date, List<Journey>>();
Map<Date, List<Job>> jobMap = new TreeMap<Date, List<Job>>();
I used TreeMap because that means they're sorted by date, but I want to go through both maps at the same time, get the values of Journey/Job, then do some work.
I think I could use generics, storing the Job/Journey as an Object and then checking with instanceof, but I'm not sure if that's the solution?
Thanks.

Even though the others are right that there are better, safer and more comfortable ways to achieve whatever you want, it is possible to iterate over (the entries of) two Maps (aka Collections) at the same time.
//replace keySet() with your favorite method in for-each-loops
Iterator<Date> journeyIterator = journeyMap.keySet().iterator();
Iterator<Date> jobIterator = jobMap.keySet().iterator();
while(journeyIterator.hasNext() && jobIterator.hasNext()){
Date journeyDate = journeyIterator.next();
Date jobDate = jobIterator.next();
//... do whatever you want with the data
}
This code does explicitly, what a for-each-loop can do implicitly for one Collection. It retrieves the Iterator and gets the element from the Collection from it, much like reading a file.
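Note that lockstep iteration pairs the i-th key of one map with the i-th key of the other, and those need not be the same date. If the goal is to process both maps date by date, one possible sketch is to walk the union of the keys in order. DateWalk, DayVisitor and forEachDate are hypothetical names, and the generic parameters A and B stand in for Journey and Job.

```java
import java.util.Collections;
import java.util.Date;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper that visits every date present in either map, in ascending order.
final class DateWalk {
    private DateWalk() {}

    interface DayVisitor<A, B> {
        void visit(Date date, List<A> journeys, List<B> jobs);
    }

    static <A, B> void forEachDate(Map<Date, List<A>> journeyMap,
                                   Map<Date, List<B>> jobMap,
                                   DayVisitor<A, B> visitor) {
        TreeSet<Date> allDates = new TreeSet<>(journeyMap.keySet());
        allDates.addAll(jobMap.keySet());
        for (Date date : allDates) {
            // A date missing from one map yields an empty list rather than null.
            List<A> journeys = journeyMap.getOrDefault(date, Collections.emptyList());
            List<B> jobs = jobMap.getOrDefault(date, Collections.emptyList());
            visitor.visit(date, journeys, jobs);
        }
    }
}
```

A call could look like DateWalk.forEachDate(journeyMap, jobMap, (date, journeys, jobs) -> { /* work */ });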

You're assuming that both maps hold entries sorted in exactly the same way, but that is not guaranteed. If you want to write logic like this, you should at least declare the references with the same implementing class:
TreeMap<Date, List<Journey>> journeyMap = new TreeMap<Date, List<Journey>>();
TreeMap<Date, List<Job>> jobMap = new TreeMap<Date, List<Job>>();
but believe me, you don't want to do it.
You're right! Instead of using 2 maps, create 1 that holds pairs of Job/Journey objects: create a JobJourneyHolder class which holds both objects. This will be a good solution.

Yes, defining a new class for that is definitely the solution, because it composes related objects together, which is very welcomed in OOP. And you should not forget to implement hashCode() and equals() methods to make such classes work properly in Java collections:
public final class JourneyJob {
final Journey journey;
final Job job;
public JourneyJob(Journey journey, Job job) {
if (journey == null || job == null)
throw new NullPointerException();
this.journey = journey;
this.job = job;
}
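@Override
public int hashCode() {
return Objects.hash(journey, job);
}
// equals must take Object; a JourneyJob parameter would only overload it and collections would ignore it
@Override
public boolean equals(Object obj) {
if (this == obj) return true;
if (!(obj instanceof JourneyJob)) return false;
JourneyJob other = (JourneyJob) obj;
return other.job.equals(job) && other.journey.equals(journey);
}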
}
To add elements to common Map:
Map<Date, List<JourneyJob>> map = new TreeMap<>();
...
if (map.containsKey(date)) {
map.get(date).add(new JourneyJob(journey, job));
} else {
map.put(date, new ArrayList<>(Arrays.asList(new JourneyJob(journey, job))));
}
...
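Since Java 8, the check-then-put pattern for adding to a list inside a map can also be written with Map.computeIfAbsent. A minimal generic sketch (MultiMapSupport and addTo are hypothetical names):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: appends a value to the list stored under a key.
final class MultiMapSupport {
    private MultiMapSupport() {}

    static <K, V> void addTo(Map<K, List<V>> map, K key, V value) {
        // Creates the list only when the key is absent, then appends to it.
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }
}
```

With the map above, this would be called as MultiMapSupport.addTo(map, date, new JourneyJob(journey, job)).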
To retrieve JourneyJob objects:
for (List<JourneyJob> jjList : map.values()) {
for (JourneyJob jj : jjList) {
journey = jj.journey;
job = jj.job;
//... do your work here
}
}
Or, if you use Java 8, this can be done using nested forEach():
map.values().stream().forEach(list ->
list.stream().forEach(jj -> {
Journey journey = jj.journey;
Job job = jj.job;
//... do your work here
})
);

Which collection to use?

What kind of collection should I use if I need to create a collection that will allow me to store books and how many copies there are in circulation (for a library)?
I would use an ArrayList, but I also want to be able to sort the books by order of issue year.

You can create a Book class with all the attributes you have for a book, implement Comparable for that Book class, and write the sorting logic there.
Maintain a List<Book> and use the Collections.sort method to sort your List according to the implemented sorting logic.
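A minimal sketch of that suggestion, sorting by issue year as the question asks. The class is named LibraryBook here to keep it apart from the Book class further down, and its fields (title, issueYear, copies) are assumptions about what a book record would hold.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical book record whose natural order is issue year, then title.
final class LibraryBook implements Comparable<LibraryBook> {
    private final String title;
    private final int issueYear;
    private final int copies;

    LibraryBook(String title, int issueYear, int copies) {
        this.title = title;
        this.issueYear = issueYear;
        this.copies = copies;
    }

    String getTitle() { return title; }
    int getIssueYear() { return issueYear; }
    int getCopies() { return copies; }

    @Override
    public int compareTo(LibraryBook other) {
        int byYear = Integer.compare(issueYear, other.issueYear);
        return byYear != 0 ? byYear : title.compareTo(other.title);
    }

    // Returns a sorted copy so the caller's list keeps its original order.
    static List<LibraryBook> sortedByIssueYear(List<LibraryBook> books) {
        List<LibraryBook> copy = new ArrayList<>(books);
        Collections.sort(copy);
        return copy;
    }
}
```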
UPDATE: -
As far as fast look-up is concerned, a Map is always the best bet, and it is appropriate for implementing a dictionary-style look-up structure. For that, you would need some attribute that uniquely identifies each book. Then store your books in a Map<String, Book>, where the key might be an id of type String.
Also, in this case, your sorting logic will change a little. Now you would have to sort on the basis of your Map's values, i.e. on the basis of attributes of Book.
Here's some sample code you can make use of. I have just considered sorting on the basis of id. You can change the sorting logic as needed: -
class Book {
private int id;
private String title;
public Book() {
}
public Book(int id, String title) {
this.id = id;
this.title = title;
}
@Override
public String toString() {
return "Book[Title:" + this.getTitle() + ", Id:" + this.getId() + "]";
}
// Getters and Setters
}
public class Demo {
public static void main(String[] args) {
final Map<String, Book> map = new HashMap<String, Book>() {
{
put("b1", new Book(3, "abc"));
put("b2", new Book(2, "c"));
}
};
List<Map.Entry<String, Book>> keyList = new LinkedList<Map.Entry<String, Book>>(map.entrySet());
Collections.sort(keyList, new Comparator<Map.Entry<String, Book>>() {
@Override
public int compare(Map.Entry<String, Book> o1, Map.Entry<String, Book> o2) {
return Integer.compare(o1.getValue().getId(), o2.getValue().getId());
}
});
Map<String, Book> result = new LinkedHashMap<String, Book>();
for (Iterator<Map.Entry<String, Book>> it = keyList.iterator(); it.hasNext();) {
Map.Entry<String, Book> entry = it.next();
result.put(entry.getKey(), entry.getValue());
}
System.out.println(result);
}
}
OUTPUT: -
"{b2=Book[Title:c, Id:2], b1=Book[Title:abc, Id:3]}"

Well, if the entire purpose of your collection is to store the counts of the books, then use a dictionary/map, or whatever Java's key-value collection is called.
It would probably have the title as your key and the count as your value.
Now I suspect that your collection might be a little more complicated than that, so you might want to make a Book class which has Count as a field, and then I'd probably have a string -> Book dictionary/map anyway, with the string as its Dewey Decimal number or some other unique identifier.

Beyond a simple educational or toy project, you'd want to use a database rather than an in-memory collection. (Not really an answer, but I think worth stating.)

java.util.TreeMap can be used to index and sort for this kind of requirement.
Check http://docs.oracle.com/javase/6/docs/api/java/util/TreeMap.html for more details.
You can use your Book object as key mapped to the number of copies as the value.
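A rough sketch of that layout, assuming the key is ordered by issue year and then by name through a Comparator handed to the TreeMap. Catalog and Title are hypothetical names, and Title is a simplified stand-in for a full Book class.

```java
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical catalog: sorted keys mapped to the number of copies in circulation.
final class Catalog {
    static final class Title {
        final String name;
        final int issueYear;

        Title(String name, int issueYear) {
            this.name = name;
            this.issueYear = issueYear;
        }
    }

    private static final Comparator<Title> BY_YEAR_THEN_NAME =
            Comparator.comparingInt((Title t) -> t.issueYear).thenComparing(t -> t.name);

    // The TreeMap uses the comparator (not equals/hashCode) to decide key identity.
    private final TreeMap<Title, Integer> copies = new TreeMap<>(BY_YEAR_THEN_NAME);

    void addCopies(Title title, int count) {
        copies.merge(title, count, Integer::sum);
    }

    int copiesOf(Title title) {
        return copies.getOrDefault(title, 0);
    }

    Title oldest() {
        return copies.firstKey();
    }
}
```

Iterating copies.entrySet() then yields the books in issue-year order without a separate sort.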
