I have the following problem for which I would like a decent solution.
I have a HashMap that maps a String (an email address) to an object (a Person).
The map is populated from a collection via a method updatePersonList(Collection list), as described below:
Every time a new collection is received through that method, the map adds all the elements from the collection. All the map needs is the latest collection; whatever is not in the collection should be discarded from the map.
Now, I want to know how I can update the map efficiently, because, as described above, the following scenarios are possible:
1. Some objects can be found in both the map and the collection; in that case only the new objects from the collection should be added, not all of them.
2. Objects that are in the map but are not in the collection should be removed.
What is the best solution in terms of complexity?
After some investigation, the best I came up with is removing all the objects from the map and adding the ones from the collection. If someone knows a better way, it would be nice if it could be shared.
You will never do better than O(n+m), where n is the size of your Collection and m is the size of your Map, because you always need to read both at least once.
So in O-notation you could simply erase the whole Map and create a new one.
But in practice the constant factor is not unimportant, and you may also want to reduce garbage-collection pressure. In that case it can make sense to iterate through both, delete only the stale entries from the Map, and add the new elements from the Collection.
Only profiling will tell you whether you gained anything for that effort.
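A minimal sketch of the in-place variant, assuming a Person class keyed by email (the class and field names stand in for the asker's actual ones):

```java
import java.util.*;

class PersonRegistry {
    static final class Person {
        final String email;
        Person(String email) { this.email = email; }
    }

    private final Map<String, Person> persons = new HashMap<>();

    // O(n + m): one pass over the collection to collect keys,
    // one pass over the map's keys, one pass to (re)put the elements.
    void updatePersonList(Collection<Person> list) {
        Set<String> incoming = new HashSet<>();
        for (Person p : list) incoming.add(p.email);
        persons.keySet().retainAll(incoming);          // drop stale entries
        for (Person p : list) persons.put(p.email, p); // add/overwrite the rest
    }

    int size() { return persons.size(); }
    boolean contains(String email) { return persons.containsKey(email); }
}
```

The clear-and-rebuild alternative is simply persons.clear() followed by the put loop; both are O(n + m), so the choice comes down to measured constant factors.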
In my view, use a TreeMap instead of a list to improve your iteration performance (a TreeMap holds only unique keys), so newly inserted objects are deduplicated automatically.
Once the TreeMap is set up, don't remove entries from it; instead, just add the new entries and let the TreeMap take care of them.
I hope this helps.
If the list often has the same content as the map, you can check for that before clearing the map and adding the new entries.
In a LinkedHashMap the entries are by default ordered by insertion. The constructor parameter "accessOrder" changes the ordering so that the last entry is the one that was accessed most recently.
However, I need a map where the ordering changes only when entries are added or overwritten via put (put, putAll, ...). Furthermore, the map should directly provide a reverse iterator, i.e. without my having to create a reversed array of the map's keySet to iterate over.
Does anyone know an existing Java class that provides what I'm searching for?
PS: It would be great if the map would allow concurrent modifications, but that is not essential as I can also synchronize the access manually.
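For reference, the two built-in orderings described above can be seen in a short sketch (class name is illustrative):

```java
import java.util.*;

class OrderDemo {
    public static void main(String[] args) {
        // accessOrder = true: iteration runs from least-recently accessed
        // to most-recently accessed; get() reorders entries.
        LinkedHashMap<String, Integer> byAccess = new LinkedHashMap<>(16, 0.75f, true);
        byAccess.put("a", 1);
        byAccess.put("b", 2);
        byAccess.put("c", 3);
        byAccess.get("a");                        // moves "a" to the end
        System.out.println(byAccess.keySet());    // [b, c, a]

        // Default insertion order: get() never reorders, and re-putting an
        // existing key keeps its original position.
        LinkedHashMap<String, Integer> byInsertion = new LinkedHashMap<>();
        byInsertion.put("a", 1);
        byInsertion.put("b", 2);
        byInsertion.get("a");
        byInsertion.put("a", 9);
        System.out.println(byInsertion.keySet()); // [a, b]
    }
}
```

Note that with the default insertion order, re-putting an existing key does not move it (this is documented behavior), which is exactly the gap the question describes.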
I am learning Hibernate with JPA.
With a one-to-many relationship, I ran into a lazy-initialization issue. I then changed the fetch type to EAGER, and it complained "cannot fetch multiple bags".
Then I changed my List into a Set, and it works fine.
But I want to know the reason: why is a Set better than a List here?
Kindly explain the behavior of Set and List so I can understand.
Set is a collection that cannot contain duplicate elements.
List is an ordered collection and can contain duplicate elements. You can access any element by its index; a List is more like an array with a dynamic length.
Lookup in a HashSet is constant time, O(1), while lookup in an ArrayList takes O(n) time, so HashSet performance scales much better than ArrayList.
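A minimal illustration of the semantic difference described above:

```java
import java.util.*;

class SetVsList {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "a", "b"));
        Set<String> set = new HashSet<>(list);

        System.out.println(list.size()); // 3 -- List keeps duplicates and order
        System.out.println(set.size());  // 2 -- Set drops the duplicate "a"

        System.out.println(list.get(1));       // "a" -- index access
        System.out.println(set.contains("b")); // true -- O(1) membership test
    }
}
```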
I have a Java ObservableList with thousands of entries that receives hundreds of updates a second backing a JavaFX TableView.
The ObservableList is backed by an ArrayList. Arbitrary sort orders can be applied to the list, and an update may change the sort position of a single entity in the list. I have performance issues if I try to perform a sort after each update, so currently I have a background task that performs a sort every second. However, I'd like to sort in real time if possible.
Assuming that the list is already sorted and I know the index of the element to change, is there a more efficient way to update the index of the element than calling sort on the list again?
I've already determined I can use Collections.binarySearch() to efficiently find the index of the element to update. Is there also a way I can efficiently find the index the updated element needs to move to and shift the ArrayList so it remains in order?
I also need to handle add and remove operations, but those are far less common.
Regarding your answer: FXCollections.sort() should be even faster, because it handles FX properties better and is written specifically for ObservableLists.
I would use a TreeSet. It can update the ordering in O(log n) time, whereas an ArrayList effectively does an insertion sort at O(n) per entry.
A few suggestions when dealing with sorting on a JavaFX ObservableList/TableView combo:
Ensure your model class includes Property accessors.
Due to a quirk in the JavaFX 2.2 implementation (not present in JavaFX 8+), TableView is far less efficient with large data models that lack property accessors than with models that include them. See "JavaFx tableview sort is really slow how to improve sort speed as in java swing" for more details.
Perform bulk changes on the ObservableList.
Each time you modify an ObservableList that is being observed, the list change listeners on the list are fired to communicate the permutations of the change to the observers. By reducing the number of modifications you make on the list, you can cut down on the number of change events which occur and hence on the overhead of observer notification and processing.
An example technique for this might be to keep a mirror copy of the list data in a standard non-observable list, sort that data, then set that sorted data into the observable list in a single operation.
To avoid premature optimization issues, only do this sort of optimization if the operation is initially slow and the optimization itself provides a significant measurable improvement.
Don't update the ObservableList more often than necessary.
JavaFX display framerate is capped, by default, at 60fps. Updating visible components more often than once a pulse (a frame render trigger) is unnecessary, so batch up all of your changes for each pulse.
For example, if you have a new record coming in every millisecond, collate all records that come in every 20 milliseconds and apply those changes all at once.
As noted above, only apply this optimization if the operation is initially slow and the optimization itself provides a significant measurable improvement.
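The batching idea above can be sketched with a plain List standing in for the ObservableList, so the mechanics are visible without the JavaFX runtime (all names here are illustrative; in JavaFX you would call flush() once per pulse, e.g. from an AnimationTimer):

```java
import java.util.*;

// Collect incoming records and apply them to the target list in one bulk
// operation, so observers fire once per batch instead of once per record.
class BatchingBuffer<T> {
    private final List<T> pending = new ArrayList<>();
    private final List<T> target;

    BatchingBuffer(List<T> target) { this.target = target; }

    // Called for every incoming record; cheap, no listener notification yet.
    void offer(T record) { pending.add(record); }

    // Called once per pulse: a single bulk change, a single change event.
    void flush() {
        target.addAll(pending);
        pending.clear();
    }
}
```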
Java 8 contains some new classes to assist in using sorted content in tables.
I don't really know how the TableView sorting functions and SortedList work in Java 8. You can request that Oracle write a tutorial with samples and best practices for the Java 8 TableView sort feature by emailing jfx-docs-feedback_ww@oracle.com.
For further reference, see the javadoc:
the sorting section of the JavaFX 8 TableView javadoc.
new SortEvent class.
SortedList class.
What is not quite clear is whether you need the list to be sorted the whole time.
If you sort it only to retrieve and update entries more quickly, you can do that faster with a HashMap. You can create a HashMap<YourClass, YourClass> if you implement proper hashCode() and equals() methods over the key fields of the class.
If you only need sorted output occasionally, also implement the Comparable<YourClass> interface and just create a TreeSet<YourClass>(map.keySet()); that builds a sorted representation while the data in your HashMap stays in place. If you need the data sorted at all times, consider using a TreeMap<YourClass, YourClass> instead of a HashMap.
Maps are easier to work with than Sets here because they provide a way to retrieve the objects.
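A sketch of that approach, where "Row" is a hypothetical stand-in for the asker's model class: equals/hashCode use the key field (for HashMap lookup), compareTo defines the sort order (for the TreeSet view).

```java
import java.util.*;

class Row implements Comparable<Row> {
    final String key;  // identity and sort field
    int value;         // mutable payload

    Row(String key, int value) { this.key = key; this.value = value; }

    @Override public boolean equals(Object o) {
        return o instanceof Row && ((Row) o).key.equals(key);
    }
    @Override public int hashCode() { return key.hashCode(); }
    @Override public int compareTo(Row other) { return key.compareTo(other.key); }
}
```

Usage: store with `map.put(r, r)`, look up in O(1) with `map.get(new Row("k", 0))`, and build a sorted view on demand with `new TreeSet<>(map.keySet())`.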
After some research, I concluded that Collections.sort() is pretty fast, even for a single out-of-place item. I haven't found a more efficient way than updating the item in the list and just calling sort. I can't use a TreeSet, since the TableView relies on the List interface and I'd have to rebuild the TreeSet every time the sort order changes.
I found that I could update at 60 FPS by using a Timer or KeyFrame and still have reasonable performance. I haven't found a better solution without upgrading to JavaFX 8.
You could pull the element out of the array list and insert (in sorted order) the updated element.
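That remove-and-reinsert idea can be sketched as follows, using the Collections.binarySearch insertion-point encoding to find the new position (method and class names are illustrative):

```java
import java.util.*;

class SortedReposition {
    // Move the updated element at changedIndex back to its sorted position
    // with one binary search, instead of re-sorting the whole list.
    static <T> void reposition(List<T> list, int changedIndex, Comparator<? super T> cmp) {
        T item = list.remove(changedIndex);
        int pos = Collections.binarySearch(list, item, cmp);
        if (pos < 0) pos = -pos - 1;  // convert "not found" to insertion point
        list.add(pos, item);
    }
}
```

This is still O(n) in the worst case because ArrayList shifts elements on remove/add, but it avoids the full sort and keeps the resulting change events small.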
I need some help storing some data efficiently. I have a large list of objects (about 100,000) and want to store associations between these items together with a coefficient. Not all items are associated; in fact I have about 1 million associations. I need fast access to an association given its two items. What I built is a structure like this:
Map<Item, Map<Item, Float>>
I tried this with HashMap and Hashtable. Both work fine and are fast enough. My problem is that all those maps create a lot of memory overhead: more than 300 MB in the given scenario. Is there a Map implementation with a smaller footprint? Is there perhaps a better algorithm for storing this kind of data?
Here are some ideas:
Store in a Map<Pair<Item,Item>,Float>. If you are worried about allocating a new Pair for each lookup, and your code is synchronized, you can keep a single lookup Pair instance.
Loosen the outer map to Map<Item, ?>. The value can be a simple {Item, Float} tuple for the first association, then a small array of tuples for a handful of associations, and only be promoted to a full-fledged Map beyond that.
Use Commons Collections' Flat3Map for the inner maps.
If you are in tight control of the Items, and Item equivalence is referential (i.e. each Item instance is not equal to any other Item instance), then you can number each instance. Since you are talking about fewer than 2 billion instances, a single long can represent an Item pair with some bit manipulation. The map then gets much smaller if you use Trove's TLongObjectHashMap.
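The bit manipulation in that last idea might look like this (assuming each Item has been assigned a non-negative int id):

```java
class PairKeys {
    // Pack two int ids into one long key: the first id in the high 32 bits,
    // the second in the low 32 bits. Note the key is order-sensitive.
    static long pairKey(int idA, int idB) {
        return ((long) idA << 32) | (idB & 0xFFFFFFFFL);
    }
}
```

With this, the nested Map<Item, Map<Item, Float>> collapses to a single Map<Long, Float>, or to a primitive-keyed map such as Trove's TLongObjectHashMap for a further saving.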
You have two options.
1) Reduce what you're storing.
If your data can be recomputed, using a WeakHashMap will allow the garbage collector to remove entries. You will probably want to decorate it with a mechanism that recomputes lost or absent key/value pairs on the fly. This is basically a cache.
Another possibility that might trim a relatively small amount of RAM is to instruct your JVM to use compressed object pointers (on HotSpot, the -XX:+UseCompressedOops flag, enabled by default on modern 64-bit JVMs). That may save you about 3 MB at your current data size.
2) Expand your capacity.
I'm not sure what your constraint is (run-time memory on a desktop, serialization, etc.), but you can either expand the heap size and live with it, or you can push the data out of process. With all the "NoSQL" stores out there, one will probably fit your needs; alternatively, an indexed database table can be quite fast. If you're looking for a simple key-value store, Voldemort is extremely easy to set up and integrate.
However, I don't know what you're doing with your working set. Can you give more details? Are you performing aggregations, partitioning, cluster analysis, etc.? Where are you running into trouble?
I wonder what can be an effective way to add/remove items from a really large list when your storage is memcached-like? Maybe there is some distributed storage with Java interface that deals with this problem well?
Someone may recommend Terracotta. I know about it, but that's not exactly what I need. ;)
Hazelcast 1.6 will have a distributed MultiMap implementation, where a key can be associated with a set of values.
MultiMap<String, String> multimap = Hazelcast.getMultiMap("mymultimap");
multimap.put("1", "a");
multimap.put("1", "b");
multimap.put("1", "c");
multimap.put("2", "x");
multimap.put("2", "y");
Collection<String> values = multimap.get("1"); // contains a, b, c
Hazelcast is an open-source transactional, distributed/partitioned implementation of queue, topic, map, set, list, lock and executor service. It is super easy to work with: just add hazelcast.jar to your classpath and start coding. Almost no configuration is required.
Hazelcast is released under the Apache license, and enterprise-grade support is also available. Code is hosted at Google Code.
Maybe you should also have a look at Scalaris!
You can use a key-value store to model most data structures if you ignore concurrency issues. Your requirements aren't entirely clear, so I'm going to make some assumptions about your use case. Hopefully if they are incorrect you can generalize the approach.
You can trivially create a linked list in the storage by having a known root node (let's call it 'node_root') which points to a value tuple of {data, prev_key, next_key}. The prev_key and next_key elements are key names which should follow the convention 'node_foo', where foo is a UUID (ideally generated sequentially; if not, some other type of UUID works). This provides ordered access to your data.
Now if you need O(1) removal of a key, you can add a second index on the structure with key 'data' and value 'node_foo' for the right foo. Then you can perform the removal just as you would a linked list in memory. Remove the index node when you're done.
Now, keep in mind that concurrent modification of this list is just as bad as concurrent modification of any shared data structure. If you're using something like BDBs, you can use their (excellent) transaction support to avoid this. For something without transactions or concurrency control, you'll want to provide external locking or serialize accesses to a single thread.
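The scheme above, including the second index for O(1) removal, can be sketched with an in-memory HashMap standing in for the key-value store (class and key names are illustrative; a real store would replace the two maps with get/put/delete calls):

```java
import java.util.*;

class KvList {
    static final class Node {
        final String data, prev, next;
        Node(String data, String prev, String next) {
            this.data = data; this.prev = prev; this.next = next;
        }
    }

    private final Map<String, Node> store = new HashMap<>(); // the "kv store"
    private final Map<String, String> index = new HashMap<>(); // data -> node key
    private String head, tail;
    private long seq = 0;

    void append(String data) {
        String key = "node_" + (seq++);            // sequential "UUID"
        store.put(key, new Node(data, tail, null));
        if (tail != null) {                         // relink the old tail
            Node t = store.get(tail);
            store.put(tail, new Node(t.data, t.prev, key));
        } else {
            head = key;
        }
        tail = key;
        index.put(data, key);                       // second index for O(1) removal
    }

    void remove(String data) {
        String key = index.remove(data);
        if (key == null) return;
        Node n = store.remove(key);
        if (n.prev != null) {                       // splice predecessor
            Node p = store.get(n.prev);
            store.put(n.prev, new Node(p.data, p.prev, n.next));
        } else {
            head = n.next;
        }
        if (n.next != null) {                       // splice successor
            Node x = store.get(n.next);
            store.put(n.next, new Node(x.data, n.prev, x.next));
        } else {
            tail = n.prev;
        }
    }

    List<String> toList() {                         // ordered traversal
        List<String> out = new ArrayList<>();
        for (String k = head; k != null; k = store.get(k).next) {
            out.add(store.get(k).data);
        }
        return out;
    }
}
```

As the caveat above says, none of this is safe under concurrent modification without transactions or external locking.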