TreeMap or HashMap faster [duplicate] - java

I am writing a dictionary that makes heavy use of String keys in a Map<String, Index>. My concern is which of HashMap and TreeMap will give better (faster) performance when searching for a key in the map?

Given that there are not many collisions, a HashMap will give you O(1) performance (with a lot of collisions this can degrade to O(n), where n is the number of entries in any single bucket). A TreeMap, on the other hand, is used when you want some sort of balanced tree structure, which yields O(log n) retrieval. So it really depends on your particular use case. But if you just want to access elements, irrespective of their order, use HashMap.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class MapsInvestigation {
    public static HashMap<String, String> hashMap = new HashMap<String, String>();
    public static TreeMap<String, String> treeMap = new TreeMap<String, String>();
    public static ArrayList<String> list = new ArrayList<String>();

    static {
        for (int i = 0; i < 10000; i++) {
            list.add(Integer.toString(i, 16));
        }
    }

    public static void main(String[] args) {
        System.out.println("Warmup populate");
        for (int i = 0; i < 1000; i++) {
            populateSet(hashMap);
            populateSet(treeMap);
        }
        measureTimeToPopulate(hashMap, "HashMap", 1000);
        measureTimeToPopulate(treeMap, "TreeMap", 1000);
        System.out.println("Warmup get");
        for (int i = 0; i < 1000; i++) {
            get(hashMap);
            get(treeMap);
        }
        measureTimeToContains(hashMap, "HashMap", 1000);
        measureTimeToContains(treeMap, "TreeMap", 1000);
    }

    private static void get(Map<String, String> map) {
        for (String s : list) {
            map.get(s);
        }
    }

    private static void populateSet(Map<String, String> map) {
        map.clear();
        for (String s : list) {
            map.put(s, s);
        }
    }

    private static void measureTimeToPopulate(Map<String, String> map, String setName, int reps) {
        long start = System.currentTimeMillis();
        for (int i = 0; i < reps; i++) {
            populateSet(map);
        }
        long finish = System.currentTimeMillis();
        System.out.println("Time to populate " + (reps * map.size()) + " entries in a " + setName + ": " + (finish - start));
    }

    private static void measureTimeToContains(Map<String, String> map, String setName, int reps) {
        long start = System.currentTimeMillis();
        for (int i = 0; i < reps; i++) {
            get(map);
        }
        long finish = System.currentTimeMillis();
        System.out.println("Time to get() " + (reps * map.size()) + " entries in a " + setName + ": " + (finish - start));
    }
}
Gives these results (times in milliseconds):
Warmup populate
Time to populate 10000000 entries in a HashMap: 230
Time to populate 10000000 entries in a TreeMap: 1995
Warmup get
Time to get() 10000000 entries in a HashMap: 140
Time to get() 10000000 entries in a TreeMap: 1164

HashMap is O(1) (usually) for access; TreeMap is O(log n) (guaranteed).
This assumes that your key objects are immutable and have properly written equals and hashCode methods. See Joshua Bloch's "Effective Java" chapter 3 for how to override equals and hashCode correctly.
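As an aside, a minimal sketch of that conventional equals/hashCode pattern (the Key class here is hypothetical, just for illustration):

import java.util.Objects;

// Hypothetical immutable key class showing the usual equals/hashCode contract.
public final class Key {
    private final String name;
    private final int id;

    public Key(String name, int id) {
        this.name = name;
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Key)) return false;
        Key other = (Key) o;
        return id == other.id && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, id); // must be consistent with equals
    }
}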

A HashMap is O(1) on average, so it is supposed to be faster, and for large maps it will probably have better throughput.
However, a HashMap requires rehashing when its load factor becomes too high. Rehashing is O(n), so at any point in the program's life you may suffer an unexpected performance hit due to a rehash, which might be critical in some applications (high latency). So think twice before using HashMap if latency is an issue!
A HashMap is also vulnerable to poor hash functions, which can cause O(n) behaviour if many of the keys in use hash to the same bucket.
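If the maximum size is known in advance, one way to avoid mid-run rehashing is to pre-size the HashMap; a sketch (the sizing formula compensates for the default 0.75 load factor):

// Pre-sizing: with capacity >= expected / 0.75, the map never needs to
// rehash while the expected number of entries is being inserted.
int expectedEntries = 1_000_000;
Map<String, String> map = new HashMap<>((int) (expectedEntries / 0.75f) + 1);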

HashMap is faster. However, if you will often need to process your dictionary in alphabetical order, you are better off with the TreeMap, since otherwise you would have to sort all your words every time you need to process them in alphabetical order.
For your application HashMap is the better choice, since I doubt you will need the alphabetically sorted list often, if ever.
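To illustrate the difference, iterating a TreeMap visits keys in sorted order with no extra work, whereas a HashMap's iteration order is unspecified; a small sketch:

Map<String, Integer> dict = new TreeMap<>();
dict.put("banana", 2);
dict.put("apple", 1);
dict.put("cherry", 3);
// TreeMap iterates in natural key order: apple, banana, cherry
for (Map.Entry<String, Integer> e : dict.entrySet()) {
    System.out.println(e.getKey() + " -> " + e.getValue());
}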

Related

Are Java concurrent collections performance tips documented anywhere (e.g., for ConcurrentHashMap, calling `get()` before `putIfAbsent()`) [duplicate]

I am aggregating multiple values for keys in a multi-threaded environment. The keys are not known in advance. I thought I would do something like this:
class Aggregator {
    protected ConcurrentHashMap<String, List<String>> entries =
            new ConcurrentHashMap<String, List<String>>();

    public Aggregator() {}

    public void record(String key, String value) {
        List<String> newList =
                Collections.synchronizedList(new ArrayList<String>());
        List<String> existingList = entries.putIfAbsent(key, newList);
        List<String> values = existingList == null ? newList : existingList;
        values.add(value);
    }
}
The problem I see is that every time this method runs, I need to create a new instance of an ArrayList, which I then throw away (in most cases). This seems like unjustified abuse of the garbage collector. Is there a better, thread-safe way of initializing this kind of a structure without having to synchronize the record method? I am somewhat surprised by the decision to have the putIfAbsent method not return the newly-created element, and by the lack of a way to defer instantiation unless it is called for (so to speak).
Java 8 introduced an API to cater for this exact problem, making a 1-line solution:
public void record(String key, String value) {
    entries.computeIfAbsent(key, k -> Collections.synchronizedList(new ArrayList<String>())).add(value);
}
For Java 7:
public void record(String key, String value) {
    List<String> values = entries.get(key);
    if (values == null) {
        entries.putIfAbsent(key, Collections.synchronizedList(new ArrayList<String>()));
        // At this point, there will definitely be a list for the key.
        // We don't know or care which thread's new object is in there, so:
        values = entries.get(key);
    }
    values.add(value);
}
This is the standard code pattern when populating a ConcurrentHashMap.
The special method putIfAbsent(K, V) will either put your value object in, or, if another thread got there before you, ignore your value object. Either way, after the call to putIfAbsent(K, V), get(key) is guaranteed to be consistent between threads, and therefore the above code is thread-safe.
The only wasted overhead is if some other thread adds a new entry at the same time for the same key: you may end up throwing away the newly created value, but that only happens if there is not already an entry and there's a race that your thread loses, which should be rare.
As of Java 8 you can create multimaps using the following pattern:
public void record(String key, String value) {
    entries.computeIfAbsent(key,
            k -> Collections.synchronizedList(new ArrayList<String>()))
        .add(value);
}
The ConcurrentHashMap documentation (not the general contract) specifies that the ArrayList will only be created once for each key, at the slight initial cost of delaying updates while the ArrayList is being created for a new key:
http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html#computeIfAbsent-K-java.util.function.Function-
In the end, I implemented a slight modification of @Bohemian's answer. His proposed solution overwrites the values variable with the putIfAbsent call, which recreates the same problem I had before. The code that seems to work looks like this:
public void record(String key, String value) {
    List<String> values = entries.get(key);
    if (values == null) {
        values = Collections.synchronizedList(new ArrayList<String>());
        List<String> values2 = entries.putIfAbsent(key, values);
        if (values2 != null)
            values = values2;
    }
    values.add(value);
}
It's not as elegant as I'd like, but it's better than the original that creates a new ArrayList instance at every call.
Created two versions based on Gene's answer:
public static <K, V> void putIfAbsentMultiValue(ConcurrentHashMap<K, List<V>> entries, K key, V value) {
    List<V> values = entries.get(key);
    if (values == null) {
        values = Collections.synchronizedList(new ArrayList<V>());
        List<V> values2 = entries.putIfAbsent(key, values);
        if (values2 != null)
            values = values2;
    }
    values.add(value);
}

public static <K, V> void putIfAbsentMultiValueSet(ConcurrentMap<K, Set<V>> entries, K key, V value) {
    Set<V> values = entries.get(key);
    if (values == null) {
        values = Collections.synchronizedSet(new HashSet<V>());
        Set<V> values2 = entries.putIfAbsent(key, values);
        if (values2 != null)
            values = values2;
    }
    values.add(value);
}
It works well.
This is a problem I also looked for an answer to. The method putIfAbsent does not actually solve the extra object-creation problem; it just makes sure that one of those objects doesn't replace another. But races between threads can still cause multiple objects to be instantiated. I found 3 solutions for this problem (in my order of preference):
1- If you are on Java 8, the best way to achieve this is probably the new computeIfAbsent method of ConcurrentMap. You just need to give it a computation function, which will be executed synchronously (at least for the ConcurrentHashMap implementation). Example:
private final ConcurrentMap<String, List<String>> entries =
        new ConcurrentHashMap<String, List<String>>();

public void method1(String key, String value) {
    entries.computeIfAbsent(key, s -> new ArrayList<String>())
           .add(value);
}
This is from the javadoc of ConcurrentHashMap.computeIfAbsent:
If the specified key is not already associated with a value, attempts
to compute its value using the given mapping function and enters it
into this map unless null. The entire method invocation is performed
atomically, so the function is applied at most once per key. Some
attempted update operations on this map by other threads may be
blocked while computation is in progress, so the computation should be
short and simple, and must not attempt to update any other mappings of
this map.
2- If you cannot use Java 8, you can use Guava's LoadingCache, which is thread-safe. You define a load function for it (just like the compute function above), and you can be sure that it'll be called synchronously. Example:
private final LoadingCache<String, List<String>> entries = CacheBuilder.newBuilder()
        .build(new CacheLoader<String, List<String>>() {
            @Override
            public List<String> load(String s) throws Exception {
                return new ArrayList<String>();
            }
        });

public void method2(String key, String value) {
    entries.getUnchecked(key).add(value);
}
3- If you cannot use Guava either, you can always synchronise manually and use double-checked locking. Example:
private final ConcurrentMap<String, List<String>> entries =
        new ConcurrentHashMap<String, List<String>>();

public void method3(String key, String value) {
    List<String> existing = entries.get(key);
    if (existing != null) {
        existing.add(value);
    } else {
        synchronized (entries) {
            List<String> existingSynchronized = entries.get(key);
            if (existingSynchronized != null) {
                existingSynchronized.add(value);
            } else {
                List<String> newList = new ArrayList<>();
                newList.add(value);
                entries.put(key, newList);
            }
        }
    }
}
I made an example implementation of all those 3 methods and additionally, the non-synchronized method, which causes extra object creation: http://pastebin.com/qZ4DUjTr
The waste of memory (and GC pressure) from creating empty ArrayLists is mitigated as of Java 7u40, where empty ArrayList and HashMap instances defer allocating their backing arrays until the first element is added. So don't worry too much about creating empty ArrayLists.
Reference: http://javarevisited.blogspot.com.tr/2014/07/java-optimization-empty-arraylist-and-Hashmap-cost-less-memory-jdk-17040-update.html
The approach with putIfAbsent has the fastest execution time; it is from 2 to 50 times faster than the computeIfAbsent ("lambda") approach in environments with high contention. The lambda isn't the reason for this performance loss; the issue is the compulsory synchronisation inside computeIfAbsent prior to the Java 9 optimisations.
The benchmark:
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class ConcurrentHashMapTest {
    private final static int numberOfRuns = 1000000;
    private final static int numberOfThreads = Runtime.getRuntime().availableProcessors();
    private final static int keysSize = 10;
    private final static String[] strings = new String[keysSize];

    static {
        for (int n = 0; n < keysSize; n++) {
            strings[n] = "" + (char) ('A' + n);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        for (int n = 0; n < 20; n++) {
            testPutIfAbsent();
            testComputeIfAbsentLamda();
        }
    }

    private static void testPutIfAbsent() throws InterruptedException {
        final AtomicLong totalTime = new AtomicLong();
        final ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<String, AtomicInteger>();
        final Random random = new Random();
        ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);
        for (int i = 0; i < numberOfThreads; i++) {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    long start, end;
                    for (int n = 0; n < numberOfRuns; n++) {
                        String s = strings[random.nextInt(strings.length)];
                        start = System.nanoTime();
                        AtomicInteger count = map.get(s);
                        if (count == null) {
                            count = new AtomicInteger(0);
                            AtomicInteger prevCount = map.putIfAbsent(s, count);
                            if (prevCount != null) {
                                count = prevCount;
                            }
                        }
                        count.incrementAndGet();
                        end = System.nanoTime();
                        totalTime.addAndGet(end - start);
                    }
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
                + " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
    }

    private static void testComputeIfAbsentLamda() throws InterruptedException {
        final AtomicLong totalTime = new AtomicLong();
        final ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<String, AtomicInteger>();
        final Random random = new Random();
        ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);
        for (int i = 0; i < numberOfThreads; i++) {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    long start, end;
                    for (int n = 0; n < numberOfRuns; n++) {
                        String s = strings[random.nextInt(strings.length)];
                        start = System.nanoTime();
                        AtomicInteger count = map.computeIfAbsent(s, (k) -> new AtomicInteger(0));
                        count.incrementAndGet();
                        end = System.nanoTime();
                        totalTime.addAndGet(end - start);
                    }
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
                + " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
    }
}
The results:
Test testPutIfAbsent average time per run: 115.756501 ns
Test testComputeIfAbsentLamda average time per run: 276.9667055 ns
Test testPutIfAbsent average time per run: 134.2332435 ns
Test testComputeIfAbsentLamda average time per run: 223.222063625 ns
Test testPutIfAbsent average time per run: 119.968893625 ns
Test testComputeIfAbsentLamda average time per run: 216.707419875 ns
Test testPutIfAbsent average time per run: 116.173902375 ns
Test testComputeIfAbsentLamda average time per run: 215.632467375 ns
Test testPutIfAbsent average time per run: 112.21422775 ns
Test testComputeIfAbsentLamda average time per run: 210.29563725 ns
Test testPutIfAbsent average time per run: 120.50643475 ns
Test testComputeIfAbsentLamda average time per run: 200.79536475 ns
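One common mitigation under high contention (and, as I understand it, roughly what the Java 9 optimisation does internally) is to try a plain get() first and fall back to computeIfAbsent() only on a miss; a sketch reusing the map and s from the benchmark above:

// Cheap lock-free read first; only a miss pays for computeIfAbsent's
// synchronisation, so hot keys skip it entirely.
AtomicInteger count = map.get(s);
if (count == null) {
    count = map.computeIfAbsent(s, k -> new AtomicInteger(0));
}
count.incrementAndGet();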

Genetic Algorithm using Roulette Wheel Selection

I'm trying to create different selection methods for a genetic algorithm I'm working on, but one problem I meet in all selection methods is that the fitness of each node must be different. This is a problem for me, as my fitness calculator is quite basic and will yield several identical fitnesses.
public static Map<String, Double> calculateRouletteSelection(Map<String, Double> population) {
    String[] keys = new String[population.size()];
    Double[] values = new Double[population.size()];
    Double[] unsortedValues = new Double[population.size()];
    int index = 0;
    for (Map.Entry<String, Double> mapEntry : population.entrySet()) {
        keys[index] = mapEntry.getKey();
        values[index] = mapEntry.getValue();
        unsortedValues[index] = mapEntry.getValue();
        index++;
    }
    Arrays.sort(values);

    ArrayList<Integer> numbers = new ArrayList<>();
    while (numbers.size() < values.length / 2) {
        int random = rnd.nextInt(values.length);
        if (!numbers.contains(random)) {
            numbers.add(random);
        }
    }

    HashMap<String, Double> finalHashMap = new HashMap<>();
    for (int i = 0; i < numbers.size(); i++) {
        for (int j = 0; j < values.length; j++) {
            if (values[numbers.get(i)] == unsortedValues[j]) {
                finalHashMap.put(keys[j], unsortedValues[j]);
            }
        }
    }
    return finalHashMap;
}
90% of my different selection methods are the same, so I'm sure that if I can solve it for one I can solve it for all.
Any help on what I'm doing wrong would be appreciated.
EDIT: I was told I should post the general behaviour of what's happening: essentially the method takes in a HashMap<>, sorts the values based on their fitness, picks half of the sorted values randomly, and adds these to a new HashMap<> with their corresponding chromosomes.
I think you'd be much better off using collection classes.
List<Map.Entry<String, Double>> sorted = new ArrayList<>(population.entrySet());
// sort by fitness
Collections.sort(sorted, Comparator.comparing(Map.Entry::getValue));

Set<Integer> usedIndices = new HashSet<>(); // keep track of used indices
Map<String, Double> result = new HashMap<>();
while (result.size() < sorted.size() / 2) {
    int index = rnd.nextInt(sorted.size());
    if (!usedIndices.add(index)) {
        continue; // was already used
    }
    Map.Entry<String, Double> survivor = sorted.get(index);
    result.put(survivor.getKey(), survivor.getValue());
}
return result;
But, as Sergey stated, I don't believe this is what you need for your algorithm; you do need to favor the individuals with higher fitness.
As mentioned in the comments, in roulette wheel selection order is not important, only weights are. A roulette wheel is like a pie chart with different sections occupying different portions of the disk, but in the end they all sum up to unit area (the area of the disk).
I'm not sure if there is an equivalent in Java, but in C++ you have std::discrete_distribution. It generates a distribution [0,n) which you initialise with weights representing the probability of each of those integers being picked. So what I normally do is have the IDs of my agents in an array and their corresponding fitness values in another array. Order is not important as long as indices match. I pass the array of fitness values to the discrete distribution, which returns an integer interpretable as an array index. I then use that integer to select the individual from the other array.
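In Java the JDK has no direct equivalent, but the same behaviour is easy to sketch with cumulative weights and a linear scan (spinWheel is an illustrative name, not a library method). Note that identical fitness values are no problem here; they simply occupy equal-sized slices of the wheel:

import java.util.Random;

// Minimal fitness-proportionate (roulette wheel) pick: returns an index into
// the fitness array with probability proportional to each fitness value.
static int spinWheel(double[] fitness, Random rnd) {
    double[] cumulative = new double[fitness.length];
    double total = 0;
    for (int i = 0; i < fitness.length; i++) {
        total += fitness[i];
        cumulative[i] = total;
    }
    double r = rnd.nextDouble() * total; // a uniform point on the wheel
    for (int i = 0; i < cumulative.length; i++) {
        if (r < cumulative[i]) {
            return i;
        }
    }
    return cumulative.length - 1; // guard against floating-point edge cases
}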

Elegant solution for string-counting?

The problem I have is an example of something I've seen often. I have a series of strings (one string per line, let's say) as input, and all I need to do is return how many times each string has appeared. What is the most elegant way to solve this, without using a trie or other string-specific structure? The solution I've used in the past has been a hashtable-esque collection of custom-made (String, integer) objects implementing Comparable to keep track of how many times each string has appeared, but this method seems clunky for several reasons:
1) This method requires the creation of a compare function which is identical to String's own compareTo().
2) The impression I get is that I'm misusing TreeSet, which has been my collection of choice. Updating the counter for a given string requires checking whether the object is in the set, removing the object, updating it, and then reinserting it. This seems wrong.
Is there a more clever way to solve this problem? Perhaps there is a better Collections interface I could use to solve this problem?
Thanks.
One possibility:
public class Counter {
    public int count = 1;
}

public void count(String[] values) {
    Map<String, Counter> stringMap = new HashMap<String, Counter>();
    for (String value : values) {
        Counter count = stringMap.get(value);
        if (count != null) {
            count.count++;
        } else {
            stringMap.put(value, new Counter());
        }
    }
}
In this way you still need to keep a map, but at least you don't need to regenerate the entry every time you match a string you've already seen: you can fetch the Counter object, which is a mutable wrapper around an int, and increase its value by one.
TreeMap is much better for this problem, or better yet, Guava's Multiset.
To use a TreeMap, you'd use something like
Map<String, Integer> map = new TreeMap<>();
for (String word : words) {
    Integer count = map.get(word);
    if (count == null) {
        map.put(word, 1);
    } else {
        map.put(word, count + 1);
    }
}

// print out each word and each count:
for (Map.Entry<String, Integer> entry : map.entrySet()) {
    System.out.printf("Word: %s Count: %d%n", entry.getKey(), entry.getValue());
}

Integer theCount = map.get("the");
if (theCount == null) {
    theCount = 0;
}
System.out.println(theCount); // number of times "the" appeared, or 0 if it never appeared
Multiset would be much simpler than that; you'd just write
Multiset<String> multiset = TreeMultiset.create();
for (String word : words) {
    multiset.add(word);
}
for (Multiset.Entry<String> entry : multiset.entrySet()) {
    System.out.printf("Word: %s Count: %d%n", entry.getElement(), entry.getCount());
}
System.out.println(multiset.count("the")); // number of times "the" appeared
You can use a hash-map (no need to "create a comparable function"):
Map<String, Integer> count(String[] strings)
{
    Map<String, Integer> map = new HashMap<String, Integer>();
    for (String key : strings)
    {
        Integer value = map.get(key);
        if (value == null)
            map.put(key, 1);
        else
            map.put(key, value + 1);
    }
    return map;
}
Here is how you can use this method in order to print (for example) the string-count of your input:
Map<String, Integer> map = count(input);
for (String key : map.keySet())
    System.out.println(key + " " + map.get(key));
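On Java 8 and later the null check can be folded into a single call with Map.merge; a sketch of the same loop:

Map<String, Integer> map = new HashMap<String, Integer>();
for (String key : strings) {
    map.merge(key, 1, Integer::sum); // insert 1, or add 1 to the existing count
}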
You can use a Bag data structure from Apache Commons Collections, like the HashBag.
A Bag does exactly what you need: It keeps track of how often an element got added to the collections.
HashBag<String> bag = new HashBag<>();
bag.add("foo");
bag.add("foo");
bag.getCount("foo"); // 2

How can I update Java hashmap values by previous value

This question is a bit more complex than the title states.
What I am trying to do is store a map of {Object: Item} for a game, where the Object represents a cupboard and the Item represents the content of the cupboard (i.e. the item inside).
Essentially what I need to do is update the values of the items in a clockwise (positive) rotation; I do NOT want to modify the list in any way after it is created, only shift the positions of the values by 1.
I am currently doing almost all that I need. However, there are more Objects than Items, so I use null to represent empty cupboards. When I run my code, though, the map is being modified as it is iterated (likely because of the for loop), and in turn elements are being overwritten incorrectly, which after a while may leave me with a list full of nulls (all empty cupboards).
What I have so far...
private static Map<Integer, Integer> cupboardItems = new HashMap<Integer, Integer>();
private static Map<Integer, Integer> rewardPrices = new HashMap<Integer, Integer>();
private static final int[] objects = { 10783, 10785, 10787, 10789, 10791, 10793, 10795, 10797 };
private static final int[] rewards = { 6893, 6894, 6895, 6896, 6897 };

static {
    int reward = rewards[0];
    for (int i = 0; i < objects.length; i++) {
        if (reward > rewards[rewards.length - 1])
            cupboardItems.put(objects[i], null);
        else
            cupboardItems.put(objects[i], reward);
        reward++;
    }
}

// updates the items in the cupboards in clockwise rotation
for (int i = 0; i < cupboardItems.size(); i++) {
    if (objects[i] == objects[objects.length - 2])
        cupboardItems.put(objects[i], cupboardItems.get(objects[0]));
    else if (objects[i] == objects[objects.length - 1])
        cupboardItems.put(objects[i], cupboardItems.get(objects[1]));
    else
        cupboardItems.put(objects[i], cupboardItems.get(objects[i + 2]));
}
So how can I modify my code so that a single rotation turns the first state below into the second?
======
k1:v1
k2:v2
k3:v3
k4:none
======
k1:none
k2:v1
k3:v2
k4:v3
HashMap doesn't guarantee ordering, so if you need ordering, use an ArrayList or LinkedList.
If you want to stick with a HashMap, you need to sort its keys before each rotation. You can sort easily, since the keys are Integer objects, but this will affect performance.
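Alternatively, a TreeMap keeps its keys sorted at all times, so no explicit sort step is needed before each rotation; a sketch reusing the objects and rewards arrays from the question:

// A TreeMap keeps its Integer keys in ascending order automatically,
// so each rotation pass visits the cupboards in a stable, sorted order.
Map<Integer, Integer> cupboardItems = new TreeMap<Integer, Integer>();
for (int i = 0; i < objects.length; i++) {
    cupboardItems.put(objects[i], i < rewards.length ? rewards[i] : null);
}
for (Map.Entry<Integer, Integer> e : cupboardItems.entrySet()) {
    System.out.println(e.getKey() + " : " + e.getValue());
}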
Ragavan has a good answer if you want to stick to your approach. However, you are doing a lot of work just to rotate the items. It would be much more efficient to rotate the index (using the modulus) and keep the arrays the same:
final static List<Integer> objects = new ArrayList<Integer>(
        Arrays.asList(10783, 10785, 10787, 10789, 10791, 10793, 10795, 10797));
final static List<Integer> rewards = new ArrayList<Integer>(
        Arrays.asList(6893, 6894, 6895, 6896, 6897, -1, -1, -1));

public static int getReward(int obj, int rot) {
    int rotIndex = (objects.indexOf(obj) - rot) % objects.size();
    // modulus in java can be negative
    rotIndex = rotIndex < 0 ? rotIndex + objects.size() : rotIndex;
    return rewards.get(rotIndex);
}

public static void main(String... args) {
    // This should give 6897, which is the reward for obj 10783 after 4 rotations
    System.out.println(getReward(10783, 4));
}

Using ConcurrentHashMap for parallelism

I have a program where I am trying to understand thread parallelism. This program deals with coin flips and counts the number of heads and tails (and the total number of coin flips).
Please see the following code:
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

public class CoinFlip {
    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("CoinFlip #threads #iterations");
            return;
        }

        // check if arguments are integers
        int numberOfThreads = 0;
        long iterations = 0;
        try {
            numberOfThreads = Integer.parseInt(args[0]);
            iterations = Long.parseLong(args[1]);
        } catch (NumberFormatException e) {
            System.out.println("error: I asked for numbers mate.");
            System.out.println("error: " + e);
            System.exit(1);
        }

        // create a hashmap to store the counts for heads, tails and iterations
        ConcurrentHashMap<String, Long> universalMap = new ConcurrentHashMap<String, Long>();
        universalMap.put("HEADS", new Long(0));
        universalMap.put("TAILS", new Long(0));
        universalMap.put("ITERATIONS", new Long(0));

        long startTime = System.currentTimeMillis();
        Thread[] doFlip = new Thread[numberOfThreads];
        for (int i = 0; i < numberOfThreads; i++) {
            doFlip[i] = new Thread(new DoFlip(iterations / numberOfThreads, universalMap));
            doFlip[i].start();
        }
        for (int i = 0; i < numberOfThreads; i++) {
            try {
                doFlip[i].join();
            } catch (InterruptedException e) {
                System.out.println(e);
            }
        }

        // log time taken to accomplish the task
        long elapsedTime = System.currentTimeMillis() - startTime;
        System.out.println("Runtime:" + elapsedTime);

        // print the output to check if the values are legal:
        // iterations = heads + tails = args[1]
        System.out.println(
                universalMap.get("HEADS") + " " +
                universalMap.get("TAILS") + " " +
                universalMap.get("ITERATIONS") + ".");
    }

    private static class DoFlip implements Runnable {
        // local counters for heads/tails/iterations
        long heads = 0, tails = 0, iterations = 0;
        Random randomHT = new Random();

        long times = 0; // number of iterations
        ConcurrentHashMap<String, Long> map; // reference to the shared map

        DoFlip(long times, ConcurrentHashMap<String, Long> map) {
            this.times = times;
            this.map = map;
        }

        public void run() {
            while (this.times > 0) {
                int r = randomHT.nextInt(2); // 0 or 1
                if (r == 1) {
                    this.heads++;
                } else {
                    this.tails++;
                }
                this.iterations++;
                this.times--;
            }
            updateStats();
        }

        public void updateStats() {
            // read the existing values from the hashmap
            Long nHeads = this.map.get("HEADS");
            Long nTails = this.map.get("TAILS");
            Long nIterations = this.map.get("ITERATIONS");
            // update values
            nHeads = nHeads + this.heads;
            nTails = nTails + this.tails;
            nIterations = nIterations + this.iterations;
            // push updated values back to the hashmap
            this.map.put("HEADS", nHeads);
            this.map.put("TAILS", nTails);
            this.map.put("ITERATIONS", nIterations);
        }
    }
}
I am using a ConcurrentHashMap to store the different counts. Apparently, though, it returns wrong values.
I wrote a Perl script to check the sums of the heads and tails values (individually for each thread), and those look correct. I cannot understand why I get different values from the hashmap.
A ConcurrentHashMap provides you with guarantees about the visibility of changes to the map itself, not to its values. In this case you retrieve some values from the map, hold them for an arbitrary amount of time, then store them back into the map. In between the read and the subsequent write, though, any number of operations might have happened on the map.
The concurrent in ConcurrentHashMap just guarantees, for example, that if I put a value into a map, I will actually be able to read that value in another thread (i.e. it will be visible).
What you need to do is ensure that all threads accessing the map wait their turn, so to speak, when updating the shared counters. To do this, you either have to use an atomic operation like addAndGet, which requires storing AtomicLong values in the map instead of plain Longs:
    this.map.get("HEADS").addAndGet(this.heads);
or you need to synchronize both the read and the write manually (most easily accomplished by synchronizing on the map itself):
    synchronized (this.map) {
        Long currentHeads = this.map.get("HEADS");
        this.map.put("HEADS", Long.valueOf(currentHeads.longValue() + this.heads));
    }
Personally, I prefer to leverage the SDK whenever I can, so I would go with the use of an Atomic data type.
You should use AtomicLongs as values, create them only once, and increment them instead of doing get/put:
ConcurrentHashMap<String, AtomicLong> universalMap = new ConcurrentHashMap<String, AtomicLong>();
...
universalMap.put("HEADS", new AtomicLong(0));
universalMap.put("TAILS", new AtomicLong(0));
universalMap.put("ITERATIONS", new AtomicLong(0));
...
public void updateStats() {
    // add each thread's local tallies to the shared counters atomically
    this.map.get("HEADS").getAndAdd(heads);
    this.map.get("TAILS").getAndAdd(tails);
    this.map.get("ITERATIONS").getAndAdd(iterations);
}
Long is immutable, so your get-then-put sequence is not atomic. An example:
Thread 1: get 0
Thread 2: get 0
Thread 2: put 10
Thread 3: get 10
Thread 3: put 15
Thread 1: put 5
Now your map contains 5 instead of 20.
Basically, your problem is not the map. With the AtomicLong approach you could even use a regular HashMap, since you do not modify its structure after initialization; of course you would then have to make the map field final.
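Alternatively, if the values are to stay plain Longs, the lost update can be avoided with ConcurrentHashMap's atomic replace(key, oldValue, newValue) in a retry loop; a sketch using the names from the question:

// Compare-and-swap style retry: replace() succeeds only if no other
// thread changed the value between our get and our replace.
for (;;) {
    Long current = this.map.get("HEADS");
    if (this.map.replace("HEADS", current, current + this.heads)) {
        break;
    }
}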
A couple of things. First, you really don't need a ConcurrentHashMap here. A ConcurrentHashMap is only useful when you are dealing with concurrent puts/removes. In this case the map is essentially static as far as the keys go, so simply use an unmodifiable Map to make that explicit.
Second, if you are dealing with concurrent adds, you should really consider using a LongAdder. It scales far better when many parallel adds occur and you don't need the total until the end.
public class HeadsTails {
    private final Map<String, LongAdder> map;

    public HeadsTails() {
        Map<String, LongAdder> local = new HashMap<String, LongAdder>();
        local.put("HEADS", new LongAdder());
        local.put("TAILS", new LongAdder());
        local.put("ITERATIONS", new LongAdder());
        map = Collections.unmodifiableMap(local);
    }

    public void count() {
        map.get("HEADS").increment();
        map.get("TAILS").increment();
    }

    public void print() {
        System.out.println(map.get("HEADS").sum());
        // etc...
    }
}
I mean, in reality I wouldn't even use a map...
public class HeadsTails {
    private final LongAdder heads = new LongAdder();
    private final LongAdder tails = new LongAdder();
    private final LongAdder iterations = new LongAdder();

    public void count() {
        heads.increment();
        tails.increment();
    }

    public void print() {
        System.out.println(iterations.sum());
    }
}
