Collections.synchronizedMap vs synchronizing individual methods in HashMap

What is the difference between Collections.synchronizedMap() and a wrapper around a HashMap with all of its methods synchronized? I don't see any difference, because Collections.synchronizedMap() internally uses the same lock for all methods.
Basically, what is the difference between the following code snippets?
class C {
private final Object o = new Object();
public void foo() {
synchronized(o) {
// thread safe code here
}
}
}
and
class C {
Object o;
public synchronized void foo() {
}
}

There is only one difference:
Collections.synchronizedMap is able to use a monitor other than itself.
Using synchronized methods is the same as using synchronized(this) blocks, which means the wrapper itself is the monitor and can be locked from outside the wrapper.
If you don't want outside code to lock your monitor, you need to hide it.
On the other hand, if you want to call multiple methods in a thread-safe fashion, the easiest way is to lock the whole collection (though that does not scale well).
PS: For reuse, it's better to delegate the method calls to a backing Map than to subclass HashMap, because you can switch to another Map implementation later without changing your wrapper.
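A minimal sketch of the delegation idea, assuming a hypothetical wrapper called GuardedMap (the name and its three methods are illustrative, not from any library). The lock is a private field, so code outside the wrapper cannot acquire it, and the backing Map can be swapped without touching the wrapper:

```java
import java.util.Map;

// Hypothetical wrapper: delegates to a backing Map and guards it with a hidden lock.
final class GuardedMap<K, V> {
    // Private monitor: unlike synchronized methods, nobody outside can lock on it.
    private final Object lock = new Object();
    private final Map<K, V> backing;

    GuardedMap(Map<K, V> backing) {
        this.backing = backing;
    }

    V put(K key, V value) {
        synchronized (lock) {
            return backing.put(key, value);
        }
    }

    V get(K key) {
        synchronized (lock) {
            return backing.get(key);
        }
    }

    int size() {
        synchronized (lock) {
            return backing.size();
        }
    }
}
```

Passing a TreeMap instead of a HashMap to the constructor changes the implementation without any change to GuardedMap itself.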

Both approaches acquire a monitor on an object and so should perform essentially the same. The main reason for the difference is architectural: the synchronized wrapper lets you add thread safety to the basic non-thread-safe implementation easily.
Having said that, don't use either; use ConcurrentHashMap. It uses lock striping (in Java 7 and earlier; later versions use finer-grained per-bin locking), so it is much quicker than either approach (which are the same in terms of overhead and contention). Lock striping allows segments of the backing array to be locked independently, so it is less likely that two threads will contend for the same lock.
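As a rough illustration of the ConcurrentHashMap suggestion, here is a small sketch of a counter updated from two threads. The class name WordCounter and the demo method are made up for this example; merge, getOrDefault and the constructor are standard ConcurrentHashMap calls:

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical example class: counts occurrences without any external lock.
final class WordCounter {
    private final ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();

    void record(String word) {
        // merge performs the read-modify-write atomically for this key
        counts.merge(word, 1, Integer::sum);
    }

    int count(String word) {
        return counts.getOrDefault(word, 0);
    }

    // Runs two threads that each record the same word 1000 times.
    static int demo() {
        WordCounter counter = new WordCounter();
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                counter.record("x");
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.count("x");
    }
}
```

Because each update goes through merge, no increments are lost even though no synchronized block appears in the class.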

Do not reinvent the wheel; use what is provided by the API.

You should always decorate rather than lump all the features into one big class.
Always take the plain Map and decorate it with Collections, or use java.util.concurrent and a real lock, so that you can atomically inspect and update the map. Tomorrow you might want to change the Hashtable to a TreeMap, and you will be in trouble if you're stuck with a Hashtable.

So, why do you ask? :) Do you really believe that if a class is placed in the java.util package, some magic happens and its Java code works in some tricky way?
It really just wraps all methods in a synchronized {} block and nothing more.
UPD: the difference is that you have much less chance of making a mistake if you use a synchronized collection instead of doing all the synchronization yourself.
UPD 2: as you can see in the sources, they use a 'mutex' object as the monitor. When you use the synchronized modifier in a method signature (i.e. synchronized void doSmth()), the current instance of your object (i.e. this) is used as the monitor; for a static method, the Class object is used. The two blocks of code below are the same:
1.
synchronized public void doSmth () {
someLogic ();
moreLogic ();
}
synchronized public static void doSmthStatic () {
someStaticLogic ();
moreStaticLogic ();
}
2.
public void doSmth () {
synchronized (this) {
someLogic ();
moreLogic ();
}
}
public static void doSmthStatic () {
synchronized (ClassName.class) {
someStaticLogic ();
moreStaticLogic ();
}
}

If thread safety is the concern, use the data structures in the concurrency package. Using the wrapper class serializes all access to the Map.
a) Threads waiting to operate on totally different parts of the Map will wait for the same lock. Depending on the number of threads, this can affect application performance.
b) Consider compound operations on the Map, for example "check if present, then add". The wrapper's single lock covers each call separately, not the sequence, so thread synchronization again becomes an issue unless the caller holds the lock around the whole sequence.
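The compound-operation problem in (b) can be sketched as follows. CheckThenAct and addIfMissing are hypothetical names; the sketch relies on the documented convention that callers synchronize on the map returned by Collections.synchronizedMap when they need several calls to be atomic:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: makes "check if present, then add" atomic on a synchronized map.
final class CheckThenAct {
    private final Map<String, Integer> map =
            Collections.synchronizedMap(new HashMap<String, Integer>());

    // Returns true if this call added the key, false if it was already present.
    boolean addIfMissing(String key, Integer value) {
        // Hold the wrapper's monitor across both calls, not just inside each one.
        synchronized (map) {
            if (!map.containsKey(key)) {
                map.put(key, value);
                return true;
            }
            return false;
        }
    }

    Integer get(String key) {
        return map.get(key);
    }
}
```

Without the outer synchronized block, two threads could both see the key as missing and both call put. Map.putIfAbsent covers this particular case on its own, but the same pattern applies to any multi-call sequence.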

Related

Difference between two ways of defining a thread safe class

I am wondering what the functional difference is between the two example classes below. Which style should be preferred, and why?
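public class MyQueue {
    // Collections.synchronizedList returns a List, so the field is typed as List
    private List<Abc> q;
    public MyQueue() {
        q = Collections.synchronizedList(new LinkedList<Abc>());
    }
    public void put(Abc obj) {
        q.add(obj);
    }
    public Abc get() {
        // remove the head of the list
        return q.remove(0);
    }
}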
OR
public class MyQueue {
private Queue<Abc> q;
public MyQueue() {
q = new LinkedList<Abc>();
}
public synchronized void put(Abc obj) {
q.add(obj);
}
public synchronized Abc get() {
return q.remove();
}
}
My take is that both will work perfectly fine for the functionality shown in the class, and it is just a matter of personal preference.
Please let me know if there is more difference to it.

The main architectural difference is that the second implementation exposes the synchronization monitor (the object itself) to the outside world. It means that everyone can potentially acquire the same lock you use for internal synchronization:
MyQueue myQueue = new MyQueue(); // a shared instance
synchronized(myQueue) {
// No one else can call synchronized methods while you're here
}
It might bring benefits or cause problems depending on your class use cases.
The fact that the first implementation hides the details of synchronization gives you a bit more flexibility for adding new functionality in the future (for instance, you can add some not synchronized code into your put() and get() methods if you need), but comes with a minor penalty of having an additional layer around your list.
Otherwise, there's no difference given the presented functionality.
PS: Don't forget to add final to the q declaration. Otherwise your class doesn't guarantee so called safe publication and cannot be called fully thread-safe.

In the case of a list, there should not be any difference in functionality between accessing it through the wrapper and guarding every access to it with a synchronized block.
But there are cases where it is better to use the synchronization built into a collection, as with ConcurrentHashMap.
If you guard the accesses to a plain non-thread-safe HashMap yourself, you lock the entire map (all the keys) for every read and write, which limits concurrency on the map. ConcurrentHashMap instead locks only part of the map at a time, so you get better performance for concurrent read/write operations.
Thanks!

Is this HashMap usage thread safe?

I have a static HashMap which will cache objects identified by unique integers; it will be accessed from multiple threads. I will have multiple instances of the type HashmapUser running in different threads, each of which will want to use the same HashMap (which is why it's static).
Generally, the HashmapUsers will be retrieving from the HashMap. If it is empty, though, it needs to be populated from a database. Also, in some cases the HashMap will be cleared because the data has changed and it needs to be repopulated.
So I just make all interactions with the Map synchronized. But I'm not positive that this is safe, smart, or that it works for a static variable.
Is the below implementation of this thread safe? Any suggestions to simplify or otherwise improve it?
public class HashmapUser {
private static HashMap<Integer, AType> theMap = new HashMap<>();
public HashmapUser() {
//....
}
public void performTask(boolean needsRefresh, Integer id) {
//....
AType x = getAtype(needsRefresh, id);
//....
}
private synchronized AType getAtype(boolean needsRefresh, Integer id) {
if (needsRefresh) {
theMap.clear();
}
if (theMap.size() == 0) {
// populate the set
}
return theMap.get(id);
}
}

As it is, it is definitely not thread-safe. Each instance of HashmapUser will use a different lock (this), which does nothing useful. You have to synchronize on the same object, such as the HashMap itself.
Change getAtype to:
private AType getAtype(boolean needsRefresh, Integer id) {
synchronized(theMap) {
if (needsRefresh) {
theMap.clear();
}
if (theMap.size() == 0) {
// populate the set
}
return theMap.get(id);
}
}
Edit:
Note that you can synchronize on any object, provided that all instances use the same object for synchronization. You could synchronize on HashmapUser.class, which also allows other objects to lock access to the map (though it is typically best practice to use a private lock).
Because of this, simply making your getAtype method static would work, since the implied lock would then be HashmapUser.class instead of this. However, this exposes your lock, which may or may not be what you want.
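The private-lock variant mentioned above might look roughly like this. SharedCache, LOCK and loadAll are hypothetical names, String stands in for AType, and loadAll fakes the database load with two fixed entries:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one static map guarded by one private static lock.
final class SharedCache {
    // A single lock shared by every caller, and not reachable from outside the class.
    private static final Object LOCK = new Object();
    private static final Map<Integer, String> CACHE = new HashMap<>();

    static String lookup(boolean needsRefresh, Integer id) {
        synchronized (LOCK) {
            if (needsRefresh) {
                CACHE.clear();
            }
            if (CACHE.isEmpty()) {
                loadAll();
            }
            return CACHE.get(id);
        }
    }

    // Stand-in for the database load (assumption for this example only).
    private static void loadAll() {
        CACHE.put(1, "one");
        CACHE.put(2, "two");
    }
}
```

Every thread goes through the same LOCK regardless of which object calls lookup, which is the property the instance-level synchronized method lacked.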

No, this won't work at all.
If you don't specify a lock object, e.g. you declare the method synchronized, the implicit lock is the instance, unless the method is static, in which case the lock is the class. Since there are multiple instances, there are also multiple locks, which I doubt is what you want.
What you should do is create another class that is the only class with access to the HashMap.
Clients of the HashMap, such as HashmapUser, should not even be aware that synchronization is in place. Instead, thread safety should be ensured by a class that wraps the HashMap and hides the synchronization from its clients.
This lets you easily add more clients of the HashMap, since synchronization is hidden from them; otherwise you would have to add some kind of synchronization between the different client types too.

I would suggest you go with either ConcurrentHashMap or Collections.synchronizedMap.
More info here: http://crunchify.com/hashmap-vs-concurrenthashmap-vs-synchronizedmap-how-a-hashmap-can-be-synchronized-in-java/
ConcurrentHashMap is more suitable for high-concurrency scenarios. This implementation doesn't synchronize on the whole object but does so in an optimized way, so different threads accessing different keys can proceed simultaneously.
Collections.synchronizedMap is simpler and synchronizes at the object level: access to the instance is serialized.
I think you need performance, so you should probably go with ConcurrentHashMap.
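For a cache like the one in the question, one possible shape with ConcurrentHashMap is sketched below. LazyCache and its load method are hypothetical; note that this variant loads entries one key at a time on first access instead of populating the whole map at once, which is a change from the original design:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical per-key cache: each value is loaded on first access.
final class LazyCache {
    private final ConcurrentHashMap<Integer, String> cache = new ConcurrentHashMap<>();
    private final AtomicInteger loads = new AtomicInteger();

    String get(Integer id) {
        // computeIfAbsent runs the loader at most once per absent key, atomically
        return cache.computeIfAbsent(id, this::load);
    }

    // Stand-in for a database read (assumption for this example only).
    private String load(Integer id) {
        loads.incrementAndGet();
        return "value-" + id;
    }

    void invalidate() {
        cache.clear();
    }

    int loadCount() {
        return loads.get();
    }
}
```

Threads asking for different ids generally do not block one another, which is the advantage over a single lock around a plain HashMap.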

Java - Is calling a synchronized getter function during a synchronized setter function the right way to manipulate a shared variable?

I have several threads trying to increment a counter for a certain key in a custom data structure that is not thread-safe (which you can imagine to be similar to a HashMap). I was wondering what the right way to increment the counter would be in this case.
Is it sufficient to synchronize the increment function or do I also need to synchronize the get operation?
public class Example {
private MyDataStructure<Key, Integer> datastructure = new CustomDataStructure<Key, Integer>();
private class MyThread implements Runnable {
private synchronized void incrementCnt(Key key) {
// from the datastructure documentation: if a value already exists for the given key, the
// previous value will be replaced by this value
datastructure.put(key, getCnt(key)+1);
// or can I do it without using the getCnt() function? like this:
datastructure.put(key, datastructure.get(key)+1);
}
private synchronized int getCnt(Key key) {
return datastructure.get(key);
}
// run method...
}
}
If I have two threads t1, t2 for example, I would do something like:
t1.incrementCnt();
t2.incrementCnt();
Can this lead to any kind of deadlock? Is there a better way to solve this?

The main issue with this code is that it is likely to fail to synchronize access to datastructure, because the accessing code synchronizes on this of the inner class. That is a different object for each instance of MyThread, so no mutual exclusion will happen.
A more correct way is to make datastructure a final field and then synchronize on it:
private final MyDataStructure<Key, Integer> datastructure = new CustomDataStructure<Key, Integer>();
private class MyThread implements Runnable {
    private void incrementCnt(Key key) {
        // every thread locks the same shared object
        synchronized (datastructure) {
            datastructure.put(key, datastructure.get(key) + 1);
        }
    }
    // run method...
}
As long as all data access is done inside synchronized (datastructure), the code is thread-safe and it is safe to just use datastructure.get(...). There should be no deadlocks, since deadlocks can occur only when there is more than one lock to compete for.

As the other answer told you, you should synchronize on your data structure, rather than on the thread/runnable object. It is a common mistake to try to use synchronized methods in the thread or runnable object. Synchronization locks are instance-based, not class-based (unless the method is static), and when you are running multiple threads, this means that there are actually multiple thread instances.
It's less clear-cut about Runnables: you could be using a single instance of your Runnable class with several threads. So in principle you could synchronize on it. But I still think it's bad form because in the future you may want to create more than one instance of it, and get a really nasty bug.
So the general best practice is to synchronize on the actual item that you are accessing.
Furthermore, the design conundrum of whether or not to use two methods should be solved by moving the whole thing into the data structure itself, if you can do so (if the class source is under your control). This is an operation that is confined to the data structure and applies only to it, and doing the increment outside of it is not good encapsulation. If your data structure exposes a synchronized incrementCnt method, then:
It synchronizes on itself, which is what you wanted.
It can use its own private fields directly, which means you don't actually need to call a getter and a setter.
It is free to have the implementation changed to one of the atomic structures in the future if it becomes possible, or add other implementation details (such as logging increment operations separately from setter access operations).
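A sketch of what such a data structure could look like, assuming a hypothetical CounterTable backed by a plain HashMap (the asker's MyDataStructure is not shown in the thread, so this is only a stand-in):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the custom data structure, with the increment moved inside it.
final class CounterTable<K> {
    private final Map<K, Integer> counts = new HashMap<>();

    // Read and write happen under one lock (this), so no separate getter call is needed.
    synchronized void increment(K key) {
        Integer current = counts.get(key);
        counts.put(key, current == null ? 1 : current + 1);
    }

    synchronized int get(K key) {
        Integer current = counts.get(key);
        return current == null ? 0 : current;
    }

    // Two threads share one table and increment the same key 1000 times each.
    static int demo() {
        CounterTable<String> table = new CounterTable<>();
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                table.increment("hits");
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return table.get("hits");
    }
}
```

The threads no longer need to know anything about locking; they just call increment on the shared table.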

What is synchronized in the Java Collections Framework?

I'm trying to figure out what is synchronized in the Java Collections Framework, but I still haven't found a clear answer.
I mean, if we take
List
Queue
Set
and, among the lists,
ArrayList
LinkedList
Vector
which of these are synchronized?
If we take HashMap and Hashtable, we know that Hashtable is synchronized on the table while access to HashMap isn't.

Take a look at the following utility methods:
Collections.synchronizedCollection()
Collections.synchronizedList()
Collections.synchronizedSet()
Collections.synchronizedMap()

Not a single implementation of the collection is synchronized because synchronized is not a class property, it is only applicable to methods and blocks.

Each class in the jdk collection api documents whether it is thread safe or not. Older classes like java.util.Vector tended to be synchronized on every method, until they became replaced with a non synchronized version. ArrayList in this case. Then the concurrent package was added, and everything in there had a thread safety strategy of one kind or another. In general though, if the class documentation does not say that the class is thread safe then treat it as not being thread safe.

When you write
private Object syncObject = new Object();
public void someFunction(Stuff stuff)
{
synchronized(syncObject)
{
list.add(stuff);
}
}
public void someOtherFunction()
{
synchronized(syncObject)
{
for(Stuff stuff : list)
{
stuff.doStuff();
}
}
}
This means that the monitor of syncObject does not allow multiple threads inside it; only a single thread at a time can hold that object's monitor. This is called mutual exclusion: http://en.wikipedia.org/wiki/Mutual_exclusion
This is basically so that, if you have multiple threads, you can make specific code blocks execute on only one thread at a time. For example, you might iterate over the list in one thread while adding items to it in another and removing items in a third. You don't want them to interfere with each other, because that can produce inconsistent results.
The function
Collections.synchronizedCollection(Collection<T> c)
creates a Decorator around the Collection to make its methods synchronized, for example.
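As a small illustration of the decorator, the sketch below uses Collections.synchronizedList. The class SyncListDemo and its methods are made up for this example. Individual calls such as add are synchronized by the wrapper, but the Collections documentation requires the caller to synchronize on the returned list while iterating:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper showing single calls versus iteration on a synchronized list.
final class SyncListDemo {
    static List<Integer> newList() {
        // The returned wrapper synchronizes each individual method call.
        return Collections.synchronizedList(new ArrayList<Integer>());
    }

    static int sum(List<Integer> syncList) {
        int total = 0;
        // Iteration spans many calls, so the caller must hold the list's monitor itself.
        synchronized (syncList) {
            for (int n : syncList) {
                total += n;
            }
        }
        return total;
    }
}
```

Calling add from several threads needs no extra locking; only multi-step uses such as the loop in sum do.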

"Synchronized" means the methods of that class are synchronized. All the legacy classes(e.g. Vector, Hashtable, Stack) are considered to be synchronized. More specifically, the personal methods of these legacy classes are synchronized.
Take Vector as an example: it implements the Collection interface, so it also has the Collection methods (e.g. add(), remove()). In the JDK these are thread-safe as well: they are either declared synchronized or delegate to a synchronized method.

All legacy classes are synchronized classes, which means they are thread-safe classes.
Vector and Stack are synchronized, and Enumeration is used to access the values in a synchronized class.

Legacy classes such as Vector, Stack and Hashtable are synchronized. You can also check whether the methods of a particular class are synchronized, like this:
javap java.util.HashMap
which is not a synchronized class, and
javap java.util.Hashtable
which is synchronized, as all the methods of Hashtable are synchronized.

Is there reasoning behind choosing a specific object?

Is the locking object used for synchronization arbitrary or is there reasoning behind choosing a specific object?

Why would you lock an object? Because it is shared among various threads. That's all there is. How you implement locking and threading is probably the difficult part, as opposed to choosing which object to lock on.

You'd be better off using one of the more modern locking techniques where much of the complexity and pitfalls have been removed/smoothed over. Package java.util.concurrent.locks would be a good start.
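For reference, a minimal sketch of the java.util.concurrent.locks style, using ReentrantLock. The class LockedCounter is a made-up example; lock() and unlock() are the standard Lock methods, and the unlock goes in a finally block so the lock is released even if the guarded code throws:

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical counter guarded by an explicit lock instead of synchronized.
final class LockedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private int value;

    void increment() {
        lock.lock();
        try {
            value++;
        } finally {
            lock.unlock(); // always release, even on exceptions
        }
    }

    int value() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }
}
```

The lock is a private field, so the question of which object to lock on does not leak out of the class.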

Your question is rather unclear.
You may be referring to a Semaphore object as a lock. You may also be referring to synchronized objects.
1) A semaphore may be an arbitrary object. Its intended purpose is to hold threads at the semaphore until other threads release it.
2) Synchronized objects make all of their functions atomic: if a thread is operating on the object, any other thread waits before running its own function. This is usually implemented using a semaphore internally.
Semaphores are the objects used to solve synchronization problems.

The locking object needs to represent the part that is used exclusively.
If you lock the whole object, meaning it is used exclusively by one thread, you may use the object "this" as the lock. This is how "synchronized" works on methods.
public class A
{
public synchronized void do1 ()
{
...
}
public synchronized void do2 ()
{
...
}
}
If your object only has some sets of members that should be used exclusively, you need separate (explicit) locking objects:
public class B
{
    private X x;
    private Y y;
    private final Object lockXY = new Object (); // guards x and y
    private R r;
    private S s;
    private final Object lockRS = new Object (); // guards r and s
    public void do1 ()
    {
        synchronized (lockXY) {
        }
        ...
    }
    public void do2 ()
    {
        synchronized (lockRS) {
        }
    }
}
Beware of making the locking too complex; you may run into deadlocks.

As in the accepted answer, the object you choose is arbitrary; just make sure you use it correctly. However, some objects are better than others. It's best practice not to use an object that may be accessible outside the context of the locking: if it is, some other piece of code may also decide to synchronize on it, call notify on it, or whatever. So preferably use java.util.concurrent instead, or use private objects.
