When to use Android's ArrayMap instead of a HashMap? - java

Android has its own alternative to HashMap which doesn't use autoboxing and is somehow better for performance (CPU or RAM)?
https://developer.android.com/reference/android/support/v4/util/ArrayMap.html
From what I read here, I should replace my HashMap objects with ArrayMap objects if my HashMaps hold fewer than a few hundred records and are frequently written to, and there is no point in replacing my HashMaps with ArrayMaps if they are going to contain hundreds of objects and will be written once and read frequently. Am I correct?

You should take a look at this video: https://www.youtube.com/watch?v=ORgucLTtTDI
Perfect situations:
1. A small number of items (< 1000) with lots of accesses, or where insertions and deletions are infrequent enough that their overhead isn't really noticed.
2. Containers of maps (maps of maps), where the submaps tend to hold a low number of items and you often iterate over them a lot (see the sketch after this list).
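For illustration, a minimal sketch of the maps-of-maps case, assuming androidx.collection.ArrayMap (the successor of the support-library class linked above; android.util.ArrayMap has the same keyAt()/valueAt() methods) and an invented sumValues() helper:

    import androidx.collection.ArrayMap;

    public class SubMapIteration {
        // Iterating by index avoids allocating an Iterator and Map.Entry
        // objects on every pass, which adds up when small submaps are
        // iterated over and over.
        static long sumValues(ArrayMap<String, Integer> subMap) {
            long total = 0;
            for (int i = 0, n = subMap.size(); i < n; i++) {
                // keyAt(i) and valueAt(i) read straight out of the backing arrays.
                total += subMap.valueAt(i);
            }
            return total;
        }
    }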

ArrayMap uses way less memory than HashMap and is recommended for up to a few hundred items, especially if the map is not updated frequently. Spending less time allocating and freeing memory may also provide some general performance gains.
Update performance is a bit worse because any insert requires an array copy. Read performance is comparable for a small number of items and uses binary search.
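A minimal usage sketch, assuming androidx.collection.ArrayMap (it implements java.util.Map, so it can stand in for a small HashMap; the class and method names here are invented for the example):

    import androidx.collection.ArrayMap;
    import java.util.Map;

    public class SmallLookupTable {
        // A few dozen entries, written once and read often: keys live in one
        // sorted array and values in a parallel array, with no per-entry
        // node objects like HashMap's.
        private final Map<String, String> mimeTypes = new ArrayMap<>();

        void register(String extension, String mimeType) {
            mimeTypes.put(extension, mimeType);  // an insert may copy the backing arrays
        }

        String lookup(String extension) {
            return mimeTypes.get(extension);     // a lookup is a binary search over the keys
        }
    }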

Is there any reason for you to attempt such a replacement?
If it's to improve performance, then you have to measure before and after the replacement and see whether it helped.
It's probably not worth the effort.

Related

What is the difference between HashMap and ArrayMap, and which one is faster? [duplicate]


No definite answer: Which Java Map is the cheapest?

It probably has been asked before, but I come across this situation time and time again: I want to store a very small amount of properties that I am absolutely certain will never exceed, say, 20 keys. It seems a complete waste of CPU and memory to use a HashMap, with all its overhead, plus the cost of calculating a hash value for each key lookup, when there are only < 20 keys (probably more like 5 most of the time). I am absolutely certain that calculating a hash value takes a hundred times longer than just iterating and comparing... no?
There is this talk about premature optimization, but I don't totally agree here. I am on Android mostly, and any CPU/memory saved means more juice for other stuff. Not necessarily talking about the consumer market here. The use case here is very well defined and doesn't change much; furthermore, it would be trivial to replace a very cheap map with a HashMap in case (something that will never happen) there is suddenly a very large number of new keys.
So, my question is: which is the very cheapest, most basic Map I can use in Java?
To everything in your first paragraph: no! There won't be a dramatic memory overhead since, as far as I know, a HashMap is initialized with 16 buckets and then doubles its size each time it rehashes, so in the worst case you would have about 12 excess buckets for your map, which is no big deal.
Concerning the lookup time, it is constant and equivalent to the time of accessing an element of an array, which is always better than looping over O(n) elements (even if n < 20). The only drawback of HashMap is that it is unsorted, but as far as I am concerned, I consider it the default Map implementation in Java when I have no particular requirement about the order.
To conclude: use HashMap!
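If the 16 default buckets and the doubling still bother you, a small sketch of sizing the map up front so ~20 entries never trigger a rehash (the numbers are just an example):

    import java.util.HashMap;
    import java.util.Map;

    public class PresizedMap {
        // With the default load factor of 0.75, a capacity of 32 holds up to
        // 24 entries before resizing, so ~20 keys never cause a rehash.
        static Map<String, String> newSmallMap() {
            return new HashMap<>(32, 0.75f);
        }
    }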
If you worry about hashCode() computation time on your keys, consider caching the computed values, as java.lang.String does, for example. See the question "How does caching the hash code work in Java, as suggested by Joshua Bloch in Effective Java?" for more on that.
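A hypothetical immutable key class that caches its hash the way java.lang.String does (the class and field names are made up for the example):

    public final class CachedKey {
        private final String namespace;
        private final String name;
        private int hash;   // 0 means "not computed yet", the same trick String uses

        public CachedKey(String namespace, String name) {
            this.namespace = namespace;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof CachedKey)) return false;
            CachedKey other = (CachedKey) o;
            return namespace.equals(other.namespace) && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            int h = hash;
            if (h == 0) {   // computed at most once per instance (barring a real hash of 0)
                h = 31 * namespace.hashCode() + name.hashCode();
                hash = h;
            }
            return h;
        }
    }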
Caveat: I suggest you take seriously cautions about premature optimization. For most programmers in most apps, I seriously doubt you need to worry about the performance of your Map. More important is to consider needs of concurrency, iteration-order, and nulls. But since you asked, here is my specific answer.
EnumMap
If your keys are enums, then your very fastest Map implementation will be EnumMap.
Backed internally by a plain array indexed by the enum's ordinal, an EnumMap is very fast while using very little memory.
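A small sketch (the enum and its labels are invented for the example):

    import java.util.EnumMap;
    import java.util.Map;

    public class EnumMapExample {
        enum State { IDLE, LOADING, READY, ERROR }

        public static void main(String[] args) {
            // Keys are constrained to the State enum; storage is a tiny array
            // indexed by the enum's ordinal, so get/put are very cheap.
            Map<State, String> labels = new EnumMap<>(State.class);
            labels.put(State.IDLE, "Waiting");
            labels.put(State.READY, "Done");
            System.out.println(labels.get(State.READY));   // prints "Done"
        }
    }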
IdentityHashMap
If you are really so concerned about performance, then consider using IdentityHashMap.
This implementation of Map uses reference equality rather than object equality. While there is still a hash value involved, it is a hash of the object's address in memory (so to speak; we do not have direct memory access in Java). The possibly lengthy call to each key object's own hashCode method is therefore avoided entirely, so performance may be better than with a HashMap. You will see constant-time performance for the basic operations (get and put).
Study the documentation carefully to see if you want to take this route. Note the discussion about linear-probe versus chaining for better performance. Be aware that this class partially breaks the Map contract which mandates the use of the equals method when comparing objects. And this map does not provide concurrency.
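A short sketch showing the reference-equality behaviour:

    import java.util.IdentityHashMap;
    import java.util.Map;

    public class IdentityMapExample {
        public static void main(String[] args) {
            Map<String, Integer> byIdentity = new IdentityHashMap<>();
            String a = new String("key");
            String b = new String("key");   // equal by equals(), but a != b

            byIdentity.put(a, 1);
            byIdentity.put(b, 2);

            // Two entries, because keys are compared with == rather than
            // equals(); a regular HashMap would have kept only one.
            System.out.println(byIdentity.size());   // prints 2
        }
    }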
Here is a table I made to help compare the various Map implementations bundled with Java 11.

Java hashtable or hashmap? [duplicate]

I've been researching to find a faster alternative to a list. In an algorithm book, a hash table seems to be the fastest, using separate chaining. Then I found that Java has an implementation called Hashtable, and from what I read it seems it uses separate chaining. However, there is the overhead of synchronization, so HashMap is suggested as a faster alternative to Hashtable.
My questions are:
1. Is the Java HashMap the fastest data structure implemented in Java for insert/delete/search?
2. While reading, a few posts had concerns about the memory usage of HashMap. One post mentioned that an empty HashMap occupies 300 bytes. Is Hashtable more memory efficient than HashMap?
3. Also, is the hash function in each the most efficient for strings?
There is too much context missing to be able to answer the question, which suggests to me that you should use the simplest option and not worry about performance until you have measured that you have a problem.
Is the Java HashMap the fastest data structure implemented in Java for insert/delete/search?
An ArrayList can be significantly faster than a HashMap, depending on what you need it for. I have seen people use Maps when they should have used objects; in such a case a custom class instance can be 10x faster and smaller.
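A hypothetical illustration of that point, comparing a tiny class with the equivalent Map:

    import java.util.HashMap;
    import java.util.Map;

    public class PointVsMap {
        // A field access compiles to a direct load: no hashing, no boxing,
        // no per-entry objects, and the instance itself is far smaller.
        static final class Point {
            final int x;
            final int y;
            Point(int x, int y) { this.x = x; this.y = y; }
        }

        public static void main(String[] args) {
            Point p = new Point(3, 4);
            int sum1 = p.x + p.y;

            // The Map version of the same data: every value is boxed and
            // every read pays for hashCode() plus a bucket lookup.
            Map<String, Integer> m = new HashMap<>();
            m.put("x", 3);
            m.put("y", 4);
            int sum2 = m.get("x") + m.get("y");

            System.out.println(sum1 + " " + sum2);   // prints "7 7"
        }
    }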
While reading, a few posts had concerns about the memory usage of HashMap. One post mentioned that an empty HashMap occupies 300 bytes.
Unless you know that 300 bytes (which costs less than what you would be paid on minimum wage to blink) matters, I would assume it doesn't.
Is Hashtable more memory efficient than HashMap?
It can be, but not enough to matter. Hashtable starts with a smaller size by default. If you make a HashMap with a smaller capacity, it will be smaller too.
Also, is the hash function in each the most efficient for strings?
In the general case it is efficient enough. In rare cases you may want to change the strategy, e.g. to prevent denial-of-service attacks. If you really care about memory efficiency and performance, perhaps you shouldn't be using String in the first place.
HashMap (or, more likely, HashSet) is probably a good place to start at this point. It's not perfect, and it does consume more memory than e.g. a list, but it should be your default when you need fast add, remove, and contains operations. The String.hashCode() implementation is not the best hash function, though it is fast and good enough for most purposes.
The access time of HashMap (and Hashtable as well, I believe) is O(1), since the internal bucket placement of a given value during put() is determined by computing (hash of the value's key) % (total number of buckets). This O(1) is the average access time; if, however, many keys hash to the same bucket, then the access time tends towards O(n), as all the values placed into that same bucket grow in linked-list fashion.
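A simplified sketch of that bucket calculation; note the real java.util.HashMap keeps a power-of-two table and uses bit masking plus hash spreading rather than a plain modulo, so this only illustrates the idea:

    public class BucketIndex {
        static int bucketFor(Object key, int numberOfBuckets) {
            int h = key.hashCode();
            // Clear the sign bit so the index is never negative, then take
            // the remainder to pick a bucket.
            return (h & 0x7fffffff) % numberOfBuckets;
        }

        public static void main(String[] args) {
            // Keys that land in the same bucket end up chained together,
            // which is where the O(n) worst case comes from.
            System.out.println(bucketFor("apple", 16));
            System.out.println(bucketFor("orange", 16));
        }
    }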
As you said, considering the overhead of synchronization inside Hashtable, I would probably opt for HashMap. Besides, you can fine-tune a HashMap by setting its various parameters, like the load factor, which offers a means of memory optimization. I vote for HashMap...
As you've pointed out, Hashtable is fully synchronized, so it depends on your environment. If you have many threads then ConcurrentHashMap will be a better solution. However, you can also look at Trove4J; maybe it will better suit your needs. Trove uses open-addressed hashing rather than the separate chaining of HashMap and Hashtable.
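A brief sketch of both suggestions; the capacity and load-factor values are just illustrative:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class TunedMaps {
        // Single-threaded: a HashMap tuned with an initial capacity and a
        // higher load factor trades a little lookup speed for a denser table.
        static Map<String, String> singleThreaded() {
            return new HashMap<>(64, 0.9f);
        }

        // Many threads: ConcurrentHashMap gives thread safety without the
        // single global lock that makes Hashtable slow under contention.
        static Map<String, String> multiThreaded() {
            return new ConcurrentHashMap<>();
        }
    }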
1. HashMap is only one of the fastest data structures in Java for insert/delete/search; HashSet is as fast as HashMap for insert/delete/search, and ArrayList is as fast as HashMap when inserting an element at the end.
2. Hashtable is not more memory efficient than HashMap; they are both implemented with separate chaining.
3. The hash functions of the two data structures are the same, but you can write a subclass that extends them and override the hashing behaviour to best fit your application.
As others pointed out, a set would be a good replacement for a list, but don't forget that lists allow duplicate elements while sets do not, so while certain operations (e.g., contains) are faster, sets and lists represent solutions to different problems.
As a start I recommend HashSet or TreeSet (in case ordering is important). A HashMap maps keys to values, which is different. Refer to this discussion to understand the differences between HashMap and Hashtable. I personally haven't used a Hashtable since 2007.
Finally, if you don't mind using a third-party library, I highly recommend taking a look at the Guava immutable collections. Immutability automatically provides thread safety and programs that are easier to understand.
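A small sketch assuming Guava is on the classpath (the map contents are just an example):

    import com.google.common.collect.ImmutableMap;
    import java.util.Map;

    public class GuavaExample {
        // Built once, safe to share between threads, and any attempt to
        // modify it throws UnsupportedOperationException.
        static final Map<String, Integer> WELL_KNOWN_PORTS = ImmutableMap.of(
                "http", 80,
                "https", 443,
                "ssh", 22);
    }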
EDIT: Regarding efficiency concerns, this is a moot point. As a guideline, use the data structure (as in the abstract concept of a data structure) that best fits your problem and choose the vanilla implementation available. If you can prove you have a performance problem in your code, you might start thinking about using something 'more efficient'. That's in quotes because it's a very loose definition: are we talking about memory efficiency, computing-time efficiency, garbage-collection efficiency, etc.? Never forget the rules for code optimization.

How are Trove collections more efficient than the standard Java collections?

In an interview recently, I was asked about how HashMap works in Java and I was able to explain it well and explain that in the worst case the HashMap may degenerate into a list due to chaining. I was asked to figure out a way to improve this performance but I was unable to do that during the interview. The interviewer asked me to look up "Trove".
I believe he was pointing to this page. I have read the description provided on that page but still can't figure out how it overcomes the limitations of the java.util.HashMap.
Even a hint would be appreciated. Thanks!!
The key phrase there is open addressing. Instead of hashing to an array of buckets, all the entries are in one big array. When you add an element, if the space for it is already in use you just move down the array to find a free space.
As long as the array is kept sufficiently bigger than the number of entries and the hash function is well distributed it's possible to keep average lookup times small. And by having one array you can get better performance - it's more cache friendly.
However it still has worst-case linear behaviour if (say) every key hashes to the same value, so it doesn't avoid that issue.
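To make that concrete, here is a toy open-addressing table with linear probing (no resizing or removal; as noted above, the backing array must stay larger than the number of entries):

    public class LinearProbeMap {
        private final Object[] keys;
        private final Object[] values;

        public LinearProbeMap(int capacity) {
            keys = new Object[capacity];
            values = new Object[capacity];
        }

        public void put(Object key, Object value) {
            int i = (key.hashCode() & 0x7fffffff) % keys.length;
            // If the slot is taken by a different key, walk down the array
            // until a free slot (or the same key) is found.
            while (keys[i] != null && !keys[i].equals(key)) {
                i = (i + 1) % keys.length;
            }
            keys[i] = key;
            values[i] = value;
        }

        public Object get(Object key) {
            int i = (key.hashCode() & 0x7fffffff) % keys.length;
            while (keys[i] != null) {
                if (keys[i].equals(key)) {
                    return values[i];
                }
                i = (i + 1) % keys.length;   // probe the next slot
            }
            return null;   // hit an empty slot: the key is not present
        }
    }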
It seems to me from the Trove page that there are two main differences that improve performance.
The first is the use of open addressing (http://en.wikipedia.org/wiki/Hash_table#Open_addressing). This doesn't avoid the collision issue, but it does mean that there's no need to create "Entry" objects for every item that goes in the map.
The second important difference is being able to provide your own hash function, which differs from the one provided by the class of the keys. So you could provide a much faster hash function if it made sense to do so.
One advantage of Trove is that it avoids object creation, especially for primitives.
For big hash tables on an embedded Java device this can be advantageous due to lower memory consumption.
The other advantage I saw is the use of custom hash codes/functions without the need to override hashCode(). For a specific data set, and for an expert in writing hash functions, this can be an advantage.
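A sketch assuming Trove 3.x is on the classpath; as far as I recall, gnu.trove.map.hash.TIntIntHashMap is its primitive int-to-int map, so neither keys nor values are boxed:

    import gnu.trove.map.hash.TIntIntHashMap;

    public class TroveExample {
        public static void main(String[] args) {
            // Keys and values are stored in plain int[] arrays: no Integer
            // boxing and no per-entry objects, unlike HashMap<Integer, Integer>.
            TIntIntHashMap counts = new TIntIntHashMap();
            counts.put(42, 7);
            System.out.println(counts.get(42));          // prints 7
            System.out.println(counts.containsKey(99));  // prints false
        }
    }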

Should I change from buffered read to in-memory/tokenize on Android app for reading a 100,000 line file?

Currently I am loading a text file that contains 100,000 lines into a SortedMap using buffered reads. Should I abandon this approach and instead load the entire file into memory and then tokenize by line feeds into the SortedMap? Note, I have to parse each line to extract the key and create a per-key supporting object that I then insert into the SortedMap. The file is less than 4MB in size so that fits in line with Android's in-memory file size limitations. I am wondering if it's worth the effort to switch to the in-memory approach or if the speed-up gained just isn't worth it.
Also, would a HashMap be a lot faster than a SortedMap? I only need lookup-by-key and can live without the sorted keys if necessary, but it would be nice to have around. If there is a better structure than what I am using let me know and if you have any Android speed tips related to this issue please mention those too.
-- roschler
It's unclear to me why it would be simpler to load the entire file into memory and then tokenize. Reading a line at a time and parsing it that way is pretty simple, isn't it? While I'm all for loading things all at once when it genuinely makes things simpler, I can't see that it would be significantly easier here.
As for SortedMap vs HashMap: typically a HashMap lookup is O(1) if you don't have many hash collisions, whereas a SortedMap lookup is only O(log n) (assuming there aren't equal elements). How expensive are comparisons compared with hash computations in your object model? With 100,000 elements you'll have around 16-17 comparisons per lookup. Ultimately, I wouldn't want to guess which will be faster; you should test it, as with all performance options. Look at the memory usage too... I would expect a SortedMap to use less memory, but I could easily be wrong.
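A hedged sketch of the buffered, line-at-a-time approach; the "key=value" line format and the Record class are hypothetical stand-ins for the real per-line parsing:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.Reader;
    import java.util.Map;
    import java.util.TreeMap;

    public class LineLoader {
        static final class Record {
            final String payload;
            Record(String payload) { this.payload = payload; }
        }

        // Swap TreeMap for HashMap here if sorted keys turn out not to be
        // needed; measure both, as suggested above.
        static Map<String, Record> load(Reader source) throws IOException {
            Map<String, Record> map = new TreeMap<>();
            try (BufferedReader reader = new BufferedReader(source)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    int sep = line.indexOf('=');
                    if (sep < 0) continue;               // skip malformed lines
                    String key = line.substring(0, sep);
                    map.put(key, new Record(line.substring(sep + 1)));
                }
            }
            return map;
        }
    }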
