Difference between two ways of defining a thread safe class - java

I am wondering what functional difference there is between the two example classes below. Which style should be preferred, and why?
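public class MyQueue {
    // Collections.synchronizedList returns a List, so the field is declared as List
    private List<Abc> q;
    public MyQueue() {
        q = Collections.synchronizedList(new LinkedList<Abc>());
    }
    public void put(Abc obj) {
        q.add(obj);
    }
    public Abc get() {
        return q.remove(0); // removes and returns the head of the list
    }
}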
OR
public class MyQueue {
private Queue<Abc> q;
public MyQueue() {
q = new LinkedList<Abc>();
}
public synchronized void put(Abc obj) {
q.add(obj);
}
public synchronized Abc get() {
return q.remove();
}
}
My take is that both work perfectly well for the functionality shown here, so it is just a matter of personal preference.
Please let me know if there is more difference to it.

The main architectural difference is that the second implementation exposes the synchronization monitor (the object itself) to the outside world. It means that everyone can potentially acquire the same lock you use for internal synchronization:
MyQueue myQueue = new MyQueue(); // a shared instance
synchronized(myQueue) {
// No one else can call synchronized methods while you're here
}
It might bring benefits or cause problems depending on your class use cases.
The fact that the first implementation hides the details of synchronization gives you a bit more flexibility for adding new functionality in the future (for instance, you can add some unsynchronized code to your put() and get() methods if you need to), but it comes with the minor penalty of an additional layer around your list.
Otherwise, there's no difference given the presented functionality.
PS: Don't forget to add final to the q declaration. Otherwise your class doesn't guarantee so-called safe publication and cannot be called fully thread-safe.
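A minimal sketch of a middle ground between the two versions in the question: explicit synchronization, but on a private final lock object, so outside code cannot acquire the monitor. The class name PrivateLockQueue, the lock field, and the String element type are illustrative assumptions, not part of the question.

```java
import java.util.LinkedList;
import java.util.Queue;

// Hypothetical variant of MyQueue: the monitor is a private object, so callers
// cannot block put()/get() by synchronizing on the queue instance itself.
class PrivateLockQueue {
    private final Object lock = new Object();
    // final fields are safely published once the constructor completes
    private final Queue<String> q = new LinkedList<>();

    public void put(String obj) {
        synchronized (lock) {
            q.add(obj);
        }
    }

    // Returns the head of the queue, or null if the queue is empty.
    public String get() {
        synchronized (lock) {
            return q.poll();
        }
    }
}
```

With this shape, synchronized(myQueue) in client code no longer interferes with the internal locking, which may or may not be what you want.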

In case of a list there should not be any difference in functionality between accessing it through the wrapper or guarding every access to it using the synchronized block.
But there are cases where it is better to rely on the synchronization built into the collection itself, as with ConcurrentHashMap.
If you guard every access to a plain, non-thread-safe HashMap yourself, you lock the entire map (all the keys) for any read or write, which limits concurrency on the map. ConcurrentHashMap instead locks only a small part of the map at a time (a segment or a single bucket, depending on the Java version), and its retrieval operations do not lock at all, so it performs better under concurrent reads and writes.

Related

Is this HashMap usage thread safe?

I have a static HashMap which will cache objects identifed by unique integers; it will be accessed from multiple threads. I will have multiple instances of the type HashmapUser running in different threads, each of which will want to utilize the same HashMap (which is why it's static).
Generally, the HashmapUsers will be retrieving from the HashMap, though if it is empty, it needs to be populated from a database. Also, in some cases the HashMap will be cleared because the data has changed and it needs to be repopulated.
So, I just make all interactions with the map synchronized. But I'm not positive that this is safe, smart, or that it works for a static variable.
Is the below implementation of this thread safe? Any suggestions to simplify or otherwise improve it?
public class HashmapUser {
private static HashMap<Integer, AType> theMap = new HashMap<>();
public HashmapUser() {
//....
}
public void performTask(boolean needsRefresh, Integer id) {
//....
AType x = getAtype(needsRefresh, id);
//....
}
private synchronized AType getAtype(boolean needsRefresh, Integer id) {
if (needsRefresh) {
theMap.clear();
}
if (theMap.size() == 0) {
// populate the map
}
return theMap.get(id);
}
}
As it is, it is definitely not thread-safe. Each instance of HashmapUser will use a different lock (this), so the threads never exclude one another. You have to synchronize on the same object, such as the HashMap itself.
Change getAtype to:
private AType getAtype(boolean needsRefresh, Integer id) {
synchronized(theMap) {
if (needsRefresh) {
theMap.clear();
}
if (theMap.size() == 0) {
// populate the map
}
return theMap.get(id);
}
}
Edit:
Note that you can synchronize on any object, provided that all instances use the same object for synchronization. You could synchronize on HashmapUser.class, which also allows other objects to lock access to the map (though it is typically best practice to use a private lock).
Because of this, simply making your getAtype method static would work, since the implied lock would then be HashmapUser.class instead of this. However, this exposes your lock, which may or may not be what you want.
No, this won't work at all.
If you don't specify a lock object, i.e. you just declare the method synchronized, the implicit lock is the instance, unless the method is static, in which case the lock is the class. Since there are multiple instances, there are also multiple locks, which I doubt is what you want.
What you should do is create another class which is the only class with the access to HashMap.
Clients of HashMap, such as the HashMapUser must not even be aware that there is synchronization in place. Instead, thread safety should be assured by the proper class wrapping the HashMap hiding the synchronization from the clients.
This lets you easily add additional clients to the HashMap since synchronization is hidden from them, otherwise you would have to add some kind of synchronization between the different client types too.
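The wrapper idea described above might look roughly like the sketch below. ATypeCache, its loader parameter (standing in for the database query), and the use of String in place of AType are assumptions made for illustration; they do not come from the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical wrapper: the only class that touches the map. Every access goes
// through one private lock, so clients need no synchronization of their own.
final class ATypeCache {
    private final Object lock = new Object();
    private final Map<Integer, String> map = new HashMap<>();
    // Stand-in for the database query that repopulates the cache (assumption).
    private final Supplier<Map<Integer, String>> loader;

    ATypeCache(Supplier<Map<Integer, String>> loader) {
        this.loader = loader;
    }

    // Clears if requested, repopulates when empty, then reads, all under one lock.
    String get(boolean needsRefresh, Integer id) {
        synchronized (lock) {
            if (needsRefresh) {
                map.clear();
            }
            if (map.isEmpty()) {
                map.putAll(loader.get());
            }
            return map.get(id);
        }
    }
}
```

HashmapUser would then share a single ATypeCache instance (for example through a static final field) and simply call get(needsRefresh, id).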
I would suggest you go with either ConcurrentHashMap or a synchronized map (Collections.synchronizedMap).
More info here: http://crunchify.com/hashmap-vs-concurrenthashmap-vs-synchronizedmap-how-a-hashmap-can-be-synchronized-in-java/
ConcurrentHashMap is more suitable for high-concurrency scenarios. It doesn't synchronize on the whole object but locks in a more fine-grained way, so different threads accessing different keys can usually proceed simultaneously.
A synchronized map is simpler and synchronizes at the object level: access to the instance is serial.
I think you need performance, so I think you should probably go with ConcurrentHashMap.
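As a rough illustration of the ConcurrentHashMap route, the sketch below loads entries lazily, one id at a time, with computeIfAbsent. Note that this is a different design from the question's bulk "populate when empty" logic. LazyCache, the loader function, and the String value type are hypothetical names chosen for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical per-key cache built on ConcurrentHashMap.
class LazyCache {
    private final ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
    // Stand-in for a per-id database lookup (assumption).
    private final Function<Integer, String> loader;

    LazyCache(Function<Integer, String> loader) {
        this.loader = loader;
    }

    // computeIfAbsent is atomic per key: the loader runs at most once
    // for an id that is not yet cached.
    String get(Integer id) {
        return map.computeIfAbsent(id, loader);
    }

    // Drops all cached entries; later reads reload them on demand.
    void invalidateAll() {
        map.clear();
    }
}
```

Clearing and reading are not atomic with respect to each other here, so a reader may still see an old value that was fetched just before invalidateAll() ran. Whether that is acceptable depends on the application.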

Java - Is calling a synchronized getter function during a synchronized setter function the right way to manipulate a shared variable?

I have several threads trying to increment a counter for a certain key in a custom data structure that is not thread-safe (you can imagine it to be similar to a HashMap). I was wondering what the right way to increment the counter would be in this case.
Is it sufficient to synchronize the increment function or do I also need to synchronize the get operation?
public class Example {
private MyDataStructure<Key, Integer> datastructure = new CustomDataStructure<Key, Integer>();
private class MyThread implements Runnable {
private synchronized void incrementCnt(Key key) {
// from the datastructure documentation: if a value already exists for the given key, the
// previous value will be replaced by this value
datastructure.put(key, getCnt(key)+1);
// or can I do it without using the getCnt() function? like this:
datastructure.put(key, datastructure.get(key)+1);
}
private synchronized int getCnt(Key key) {
return datastructure.get(key);
}
// run method...
}
}
If I have two threads t1, t2 for example, I would do something like:
t1.incrementCnt();
t2.incrementCnt();
Can this lead to any kind of deadlock? Is there a better way to solve this?
The main issue with this code is that it is likely to fail to synchronize access to datastructure, because the methods synchronize on this of the inner class. That object is different for each instance of MyThread, so no mutual exclusion happens.
A more correct way is to make datastructure a final field and then synchronize on it:
private final MyDataStructure<Key, Integer> datastructure = new CustomDataStructure<Key, Integer>();
private class MyThread implements Runnable {
    private void incrementCnt(Key key) {
        // all threads lock the same shared object, so the read-modify-write cannot interleave
        synchronized (datastructure) {
            datastructure.put(key, datastructure.get(key) + 1);
        }
    }
    // run method...
}
As long as all data access is done inside synchronized (datastructure), the code is thread-safe and it's safe to just use datastructure.get(...). There should be no deadlocks, since a deadlock requires more than one lock to compete for.
As the other answer told you, you should synchronize on your data structure, rather than on the thread/runnable object. It is a common mistake to try to use synchronized methods in the thread or runnable object. Synchronization locks are instance-based, not class-based (unless the method is static), and when you are running multiple threads, this means that there are actually multiple thread instances.
It's less clear-cut about Runnables: you could be using a single instance of your Runnable class with several threads. So in principle you could synchronize on it. But I still think it's bad form because in the future you may want to create more than one instance of it, and get a really nasty bug.
So the general best practice is to synchronize on the actual item that you are accessing.
Furthermore, the design conundrum of whether or not to use two methods should be solved by moving the whole thing into the data structure itself, if you can do so (if the class source is under your control). This is an operation that is confined to the data structure and applies only to it, and doing the increment outside of it is not good encapsulation. If your data structure exposes a synchronized incrementCnt method, then:
It synchronizes on itself, which is what you wanted.
It can use its own private fields directly, which means you don't actually need to call a getter and a setter.
It is free to have the implementation changed to one of the atomic structures in the future if it becomes possible, or add other implementation details (such as logging increment operations separately from setter access operations).
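A minimal sketch of that encapsulated design, assuming the data structure's source can be changed. CounterStore is a hypothetical stand-in for the custom data structure, backed by a plain HashMap purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the custom data structure: the increment lives
// inside the structure and synchronizes on the structure itself.
class CounterStore<K> {
    private final Map<K, Integer> counts = new HashMap<>();

    // The read-modify-write happens under this object's monitor, so two
    // threads cannot interleave between the read and the write.
    public synchronized int increment(K key) {
        int next = counts.getOrDefault(key, 0) + 1;
        counts.put(key, next);
        return next;
    }

    // Reads take the same monitor so they see the latest written value.
    public synchronized int get(K key) {
        return counts.getOrDefault(key, 0);
    }
}
```

Every thread shares one CounterStore instance and calls increment(key); no locking code is needed in the Runnable.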

Is there reasoning behind choosing a specific object?

Is the locking object used for synchronization arbitrary or is there reasoning behind choosing a specific object?
Why would you lock an object? Because it is shared among various threads. That's all there is. How you implement locking and threading is probably the difficult part, as opposed to choosing which object to lock on.
You'd be better off using one of the more modern locking techniques where much of the complexity and pitfalls have been removed/smoothed over. Package java.util.concurrent.locks would be a good start.
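For reference, a small sketch of what using java.util.concurrent.locks can look like. LockedCounter is a made-up example class; ReentrantLock and the lock()/unlock() calls are the standard library API.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical counter guarded by an explicit Lock instead of synchronized.
class LockedCounter {
    private final Lock lock = new ReentrantLock();
    private int value;

    void increment() {
        lock.lock();
        try {
            value++;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    int get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }
}
```

The lock object is private, so the question of which object to lock on is answered by construction: only this class can ever acquire it.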
Your question is rather unclear.
You may be referring to a Semaphore object as a lock. You may also be referring to synchronized objects.
1) A semaphore may well be an arbitrary object. Its intended purpose is to hold threads at the semaphore until other threads release it.
2) Synchronized objects make all of their methods atomic: if one thread is operating on the object, other threads wait until it has completed its method. This is usually implemented using a semaphore internally.
Semaphores are the objects used to solve synchronization problems.
The locking object needs to represent the exclusive part.
If you lock the whole object, meaning it is used exclusively by one thread at a time, you may use the object "this" as the lock. This is how "synchronized" works on methods.
public class A
{
public synchronized void do1 ()
{
...
}
public synchronized void do2 ()
{
...
}
}
If your object just has some sets of members that should each be used exclusively, you need separate (explicit) lock objects:
public class B
{
private X x;
private Y y;
private Object lockXY = new Object ();
private R r;
private S s;
private Object lockRS = new Object ();
public void do1 ()
{
synchronized (lockXY) {
}
...
}
public void do2 ()
{
synchronized (lockRS) {
}
}
}
Beware of making locking too complex; you may run into deadlocks.
As in the accepted answer, the object you choose is arbitrary; just make sure you use it correctly. However, some objects are better than others. It's best practice not to use an object that may be accessible outside the context of the locking: if it is, some other piece of code may also decide to synchronize on it, or call notify on it, or whatever. So preferably use java.util.concurrent instead, or use private objects.

What is the difference between synchronized on lockObject and using this as the lock?

I know the difference between synchronized method and synchronized block but I am not sure about the synchronized block part.
Assuming I have this code
class Test {
private int x=0;
private Object lockObject = new Object();
public void incBlock() {
synchronized(lockObject) {
x++;
}
System.out.println("x="+x);
}
public void incThis() { // same as synchronized method
synchronized(this) {
x++;
}
System.out.println("x="+x);
}
}
In this case what is the difference between using lockObject and using this as the lock? It seems to be the same to me..
When you decide to use synchronized block, how do you decide which object to be the lock?
Personally I almost never lock on "this". I usually lock on a privately held reference which I know that no other code is going to lock on. If you lock on "this" then any other code which knows about your object might choose to lock on it. While it's unlikely to happen, it certainly could do - and could cause deadlocks, or just excessive locking.
There's nothing particularly magical about what you lock on - you can think of it as a token, effectively. Anyone locking with the same token will be trying to acquire the same lock. Unless you want other code to be able to acquire the same lock, use a private variable. I'd also encourage you to make the variable final - I can't remember a situation where I've ever wanted to change a lock variable over the lifetime of an object.
I had this same question when I was reading Java Concurrency In Practice, and I thought I'd add some perspective on the answers provided by Jon Skeet and spullara.
Here's some example code which will block even the "quick" setValue(int)/getValue() methods while the doStuff(ValueHolder) method executes.
public class ValueHolder {
private int value = 0;
public synchronized void setValue(int v) {
// Or could use a synchronized(this) block...
this.value = v;
}
public synchronized int getValue() {
return this.value;
}
}
public class MaliciousClass {
public void doStuff(ValueHolder holder) {
synchronized(holder) {
// Do something "expensive" so setter/getter calls are blocked
}
}
}
The downside of using this for synchronization is other classes can synchronize on a reference to your class (not via this, of course). Malicious or unintentional use of the synchronized keyword while locking on your object's reference can cause your class to behave poorly under concurrent usage, as an external class can effectively block your this-synchronized methods and there is nothing you can do (in your class) to prohibit this at runtime. To avoid this potential pitfall, you would synchronize on a private final Object or use the Lock interface in java.util.concurrent.locks.
For this simple example, you could alternately use an AtomicInteger rather than synchronizing the setter/getter.
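A sketch of that AtomicInteger alternative follows. AtomicValueHolder and its increment() method are illustrative additions; there is no monitor for outside code to grab, so the MaliciousClass trick above has no effect on it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical lock-free version of ValueHolder.
class AtomicValueHolder {
    private final AtomicInteger value = new AtomicInteger();

    public void setValue(int v) {
        value.set(v);
    }

    public int getValue() {
        return value.get();
    }

    // Atomic read-modify-write without any synchronized block.
    public int increment() {
        return value.incrementAndGet();
    }
}
```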
Item 67 of Effective Java Second Edition is Avoid excessive synchronization, thus I would synchronize on a private lock object.
Every object in Java can act as a monitor. Which one you choose depends on the granularity you want. Choosing 'this' has the advantage and disadvantage that other classes could also synchronize on the same monitor. My advice, though, is to avoid using the synchronized keyword directly and instead use constructs from the java.util.concurrent library, which are higher level and have well-defined semantics. This book has a lot of great advice in it from very notable experts:
Java Concurrency in Practice
http://amzn.com/0321349601
In this case it does not matter which object you choose as the lock. But you must consistently use the same object for locking to achieve correct synchronization. The code above does not ensure proper synchronization, because it uses 'this' as the lock in one method and 'lockObject' in the other.

Collection.synchronizedMap vs synchronizing individual methods in HashMap

What is the difference between a Collections.synchronizedMap() and a wrapper around a HashMap with all the methods synchronized? I don't see any difference, because Collections.synchronizedMap() internally uses the same lock for all methods.
Basically, what is the difference between the following code snippets
class C {
Object o;
public void foo() {
synchronized(o) {
// thread safe code here
}
}
}
and
class C {
Object o;
public synchronized void foo() {
}
}
There is only one difference:
Collections.synchronizedMap is able to use a different monitor than itself.
Using synchronized methods is the same as using synchronized(this) blocks, which means the wrapper itself would be the monitor and could be locked from outside the wrapper.
If you don't want an outside application to lock your monitor, you need to hide it.
On the other hand, if you want to call multiple methods in a thread-safe fashion, locking the whole collection is the easiest way (although it is not very scalable).
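To illustrate locking the whole collection around several calls, here is a small sketch using Collections.synchronizedMap, whose Javadoc requires manual synchronization on the returned map when traversing its collection views. Inventory and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class that needs multi-call sequences on a synchronized map.
class Inventory {
    // Each single call on this wrapper is synchronized on the wrapper itself.
    private final Map<String, Integer> stock =
            Collections.synchronizedMap(new HashMap<String, Integer>());

    // get + put is two calls, so hold the wrapper's monitor across both.
    void add(String item, int qty) {
        synchronized (stock) {
            Integer old = stock.get(item);
            stock.put(item, old == null ? qty : old + qty);
        }
    }

    // Iteration is many calls, so hold the monitor for the whole loop.
    int total() {
        synchronized (stock) {
            int sum = 0;
            for (int qty : stock.values()) {
                sum += qty;
            }
            return sum;
        }
    }
}
```

This works only because the monitor is reachable by the caller. A wrapper that hides its lock would have to offer such compound operations itself.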
Ps: For reuse, it's better to delegate the method calls to a backup-Map than to override the class, because you can switch to another Map implementation later, without changing your wrapper.
Both approaches acquire a monitor on the object and so should perform exactly the same. The main reason for the difference is architectural. The synchronized wrapper allows extending the basic non-thread safe variation easily.
Having said that don't use either, use ConcurrentHashMap. It uses lock striping so it's much quicker to use than either approach (as they are the same in terms of overhead + contention). Lock striping allows segments of the backing array to be locked independently. This means it's less probable that two threads will request to acquire the same lock.
Do not reinvent the wheel and use what is provided by the API.
You should always decorate rather than lumping all features into one big class.
Always take the plain Map and decorate it with Collections, or use java.util.concurrent and a real lock, so that one can atomically inspect and update the map. Tomorrow you might want to change the Hashtable to a TreeMap, and you will be in trouble if you're stuck with a Hashtable.
So, why do you ask? :) Do you really believe that if a class is placed in the java.util package, some magic happens and its Java code works in some tricky way?
It really just wraps all methods in a synchronized {} block and nothing more.
UPD: the difference is that you have far less chance of making a mistake if you use a synchronized collection instead of doing all the synchronization yourself.
UPD 2: as you can see in the sources, they use a 'mutex' object as the monitor. When you use the synchronized modifier in a method signature (i.e. synchronized void doSmth()), the current instance of your object (i.e. this) is used as the monitor. The two blocks of code below are equivalent:
1.
synchronized public void doSmth () {
someLogic ();
moreLogic ();
}
synchronized public static void doSmthStatic () {
someStaticLogic ();
moreStaticLogic ();
}
2.
public void doSmth () {
synchronized (this) {
someLogic ();
moreLogic ();
}
}
public static void doSmthStatic () {
synchronized (ClassName.class) {
someStaticLogic ();
moreStaticLogic ();
}
}
If thread safety is the concern, use the data structures from the concurrency package. Using the wrapper class reduces all accesses to the Map to a sequential queue.
a) Threads waiting to operate on totally different parts of the Map will be waiting for the same lock. Depending on the number of threads, this can affect application performance.
b) Consider compound operations on the Map. A wrapper with a single lock will not help by itself, for example with "check if present, then add" operations. Thread synchronization becomes an issue again.
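For the "check if present, then add" case in (b), the concurrent collections offer atomic compound methods. The sketch below uses ConcurrentMap.putIfAbsent; Registry and its method names are made up for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical registry where each name may be claimed only once.
class Registry {
    private final ConcurrentMap<String, String> owners = new ConcurrentHashMap<>();

    // putIfAbsent performs the check and the insert as one atomic step.
    // Returns true if this caller claimed the name, false if it was taken.
    boolean register(String name, String owner) {
        return owners.putIfAbsent(name, owner) == null;
    }

    String ownerOf(String name) {
        return owners.get(name);
    }
}
```

Written as containsKey() followed by put(), the same logic would let two threads both pass the check before either one inserts.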
