What is synchronized in the Java Collections Framework? - java

I'm trying to figure out which classes in the Java Collections Framework are synchronized, but I still haven't found a clear answer.
I mean, if we take the interfaces
List
Queue
Set
and, for List, the implementations
ArrayList
LinkedList
Vector
which of these are synchronized?
For HashMap and Hashtable, we know that Hashtable synchronizes access to the table while HashMap does not.

Take a look at the following utility methods:
Collections.synchronizedCollection()
Collections.synchronizedList()
Collections.synchronizedSet()
Collections.synchronizedMap()
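As a rough illustration of what these wrappers give you, here is a minimal sketch that wraps a plain ArrayList with Collections.synchronizedList and adds to it from several threads. The class name SynchronizedWrapperDemo and the method fillConcurrently are made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedWrapperDemo {
    // Adds perThread items from each of `threads` threads into a wrapped list
    // and returns the final size. (Hypothetical helper for illustration.)
    static int fillConcurrently(int threads, int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i); // each individual add() runs under the wrapper's lock
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join(); // wait until every writer has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 1000)); // prints 4000
    }
}
```

With a bare ArrayList in place of the wrapper, the same code could lose elements or throw, because add() would then run without any lock.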

No collection implementation is synchronized as a class, because synchronized is not a property of a class; the keyword applies only to methods and blocks.

Each class in the JDK collections API documents whether it is thread safe or not. Older classes like java.util.Vector tended to be synchronized on every method, until they were replaced by a non-synchronized version (ArrayList in this case). Then the java.util.concurrent package was added, and everything in there has a thread-safety strategy of one kind or another. In general, though, if the class documentation does not say that the class is thread safe, then treat it as not being thread safe.

When you write
private Object syncObject = new Object();
public void someFunction(Stuff stuff)
{
    synchronized(syncObject)
    {
        list.add(stuff);
    }
}
public void someOtherFunction()
{
    synchronized(syncObject)
    {
        for(Stuff stuff : list)
        {
            stuff.doStuff();
        }
    }
}
This means that the monitor of syncObject admits only one thread at a time: while one thread is inside a block synchronized on syncObject, any other thread that tries to enter a block synchronized on the same object has to wait. This is called mutual exclusion, http://en.wikipedia.org/wiki/Mutual_exclusion
This lets you make sure that specific code blocks are executed by only one thread at a time. For example, one thread may be iterating over the list while a second is adding items to it and a third is removing them. You don't want them to interfere with each other, because that can produce inconsistent results.
The function
Collections.synchronizedCollection(Collection<T> c)
creates a Decorator around the Collection to make its methods synchronized, for example.

"Synchronized" means the methods of that class are synchronized. All the legacy classes(e.g. Vector, Hashtable, Stack) are considered to be synchronized. More specifically, the personal methods of these legacy classes are synchronized.
Take the Vector class as an example: it implements the Collection interface, and its implementations of methods such as add() and remove() are declared synchronized as well. What is not protected is a sequence of calls: iterating over a Vector, or checking it and then modifying it, still needs external locking.

All legacy classes are synchronized, which means their individual methods are thread safe.
Vector and Stack are synchronized, and Enumeration is the legacy way to access the values in these classes.

Legacy classes such as Vector, Stack and Hashtable are synchronized. You can also check whether the methods of a particular class are synchronized, for example with
javap java.util.HashMap
which shows a class that is not synchronized, and
javap java.util.Hashtable
which shows a synchronized class: the public methods that access the table are declared synchronized.
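The same check can be sketched in code with reflection, assuming you only care about the synchronized modifier on a method declaration. A method that locks internally with a synchronized block would not be detected this way. SyncModifierCheck and isSynchronized are names invented for this example.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Vector;

class SyncModifierCheck {
    // Returns true if the named public method is declared with the
    // synchronized modifier. (Hypothetical helper for illustration.)
    static boolean isSynchronized(Class<?> type, String name, Class<?>... params) {
        try {
            Method m = type.getMethod(name, params);
            return Modifier.isSynchronized(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Generic parameters are erased, so put(K, V) is looked up as put(Object, Object).
        System.out.println(isSynchronized(Hashtable.class, "put", Object.class, Object.class)); // true
        System.out.println(isSynchronized(HashMap.class, "put", Object.class, Object.class));   // false
        System.out.println(isSynchronized(Vector.class, "add", Object.class));                  // true
        System.out.println(isSynchronized(ArrayList.class, "add", Object.class));               // false
    }
}
```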

Related

How do I copy ArrayList<T> in java multi-threaded environment?

In thread A, an ArrayList is created. It is managed from thread A only.
In thread B, I want to copy that to a new instance.
The requirement is that copyList should not fail and should return a consistent version of the list (= existed at some time at least during the copying process).
My approach is this:
public static <T> ArrayList<T> copyList(ArrayList<? extends T> list) {
List<? extends T> unmodifiableList = Collections.unmodifiableList(list);
return new ArrayList<T>(unmodifiableList);
}
Q1: Does that satisfy the requirements?
Q2: How can I do the same without Collections.unmodifiableList, probably with iterators and try-catch blocks?
UPD. This is an interview question I was asked a year ago. I understand that it is a bad idea to use non-thread-safe collections like ArrayList in a multithreaded environment.
No. ArrayList is not thread safe and you are not using any explicit synchronization.
While the copy is being made, the first thread can modify the original list, and the unmodifiable view does nothing to prevent that, so you can end up with an invalid copy.
The simplest way I think is the following:
Replace the List with a synchronized version of it.
In copyList, synchronize on the list and make the copy.
For example, something like:
List<T> l = Collections.synchronizedList(new ArrayList<T>());
...
public static <T> List<T> copyList(List<? extends T> list) {
    List<T> copyList = null;
    synchronized(list) {
        copyList = new ArrayList<T>(list);
    }
    return copyList;
}
You should synchronize access to the ArrayList, or replace ArrayList with a concurrent collection like CopyOnWriteArrayList.
Without doing that you might end up with a copy that is inconsistent.
There is absolutely no way to create a consistent copy of a plain ArrayList if the "owning" thread does not offer some protocol to do so.
Without any protocol, thread A can modify the list potentially at any time, meaning thread B never gets a chance to ensure that it sees a consistent state of the list.
To actually allow a consistent copy to be made, thread A must ensure that any modifications it has made are written to memory and are visible to other threads.
Normally, the VM is allowed to reorder instructions, reads and writes as it sees fit, provided no difference can be observed from within the thread executing the program. This includes, for example, delaying writes by holding values in CPU registers or on the local stack.
The only way to ensure that everything is consistently written to main memory is for thread A to execute an instruction that presents a reordering barrier to the VM (e.g. a synchronized block or a volatile field access).
So without some cooperation from thread A, there is no way to ensure above conditions are guaranteed to be fulfilled.
Common methods of circumventing this are to synchronize access to the List by only using it in a safely wrapped form (Collections.synchronizedCollection), or use of a List implementation that has these guarantees built in (any type of concurrent list implementation).
The javadoc for Collections.unmodifiableList(...) says, "Returns an unmodifiable view of the specified list."
The key word there is "view". That means it does not copy the data. All it does is create a wrapper for the given list with mutators that all throw exceptions rather than modify the base list.
Yes, but I actually create new ArrayList(Collections.unmodif...), wouldn't this work?
Oops! I missed that. If you're going to copy the list, then there's no point in calling unmodifiableList(). The only code that will ever access the unmodifiable view is the code that's right there in the same method where it's created. You don't have to worry about that code modifying the list contents because you wrote it.
On the other hand, if you're going to copy the list when other threads could be updating the list, then you're going to need synchronized all around. Every place where code could update the list needs to be in a synchronized block, as does the code that makes the copy. Of course, all of those synchronized blocks must synchronize on the same object.
Some programmers will use the list object itself as the lock object. Others will prefer to use a separate object.
Q1: Does that satisfy the requirements?
If the provided list is modified while it is being copied with new ArrayList<T>(unmodifiableList), the copy is not protected, even though you wrapped the list using Collections.unmodifiableList: the unmodifiable view simply delegates to the wrapped list, and since that list is not thread safe you can still get an inconsistent copy or a ConcurrentModificationException.
What you could do is use CopyOnWriteArrayList instead, as it is a thread-safe list implementation that provides consistent snapshots of the list when you iterate over it. Another way would be to have thread A regularly publish a safe copy to the other threads using new ArrayList<T>(myList): since thread A is the only thread that modifies the list, we know that no other thread will modify it while the copy is created, so it would be safe.
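A minimal sketch of the CopyOnWriteArrayList option, assuming the writer thread only appends. The class SnapshotCopyDemo and the method copyWhileWriting are hypothetical names used only here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotCopyDemo {
    // Starts a writer that appends 0..count-1 and copies the list while the
    // writer may still be running. (Hypothetical helper for illustration.)
    static List<Integer> copyWhileWriting(int count) {
        List<Integer> shared = new CopyOnWriteArrayList<>();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                shared.add(i); // every add installs a fresh internal array
            }
        });
        writer.start();
        // The copy is built from one internal array snapshot, so it reflects
        // the list as it existed at a single moment.
        List<Integer> copy = new ArrayList<>(shared);
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return copy;
    }
}
```

How many elements the copy contains depends on timing, but it is always a prefix 0, 1, 2, ... of what the writer added, never a torn state. The price is that every write copies the whole backing array, so this fits lists that are read far more often than they are written.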
Q2: How can I do the same without Collections.unmodifiableList with probably iterators and try-catch blocks?
As mentioned above, Collections.unmodifiableList does not help to make this thread safe. For me, the only thing that could make sense is actually the opposite: thread A (the only thread that can modify the list) creates a safe copy of the ArrayList using new ArrayList<T>(list), then hands the other threads an unmodifiable view of that copy using Collections.unmodifiableList.
Generally speaking, you should avoid naming implementations in your method signatures, especially public ones. Use interfaces or abstract classes instead, because otherwise you expose implementation details to the users of your API, which is not expected. So here it should be List or Collection, not ArrayList.
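The "owner publishes a copy" idea could look roughly like this. It is only a sketch under the assumption that a single owner thread calls addAndPublish. SnapshotPublisher and its members are invented names, and the volatile field is my choice of hand-over mechanism, not something from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SnapshotPublisher {
    // Working list: touched only by the owning thread (assumption of this sketch).
    private final List<String> working = new ArrayList<>();

    // Latest published snapshot. The volatile write makes the fully built
    // copy visible to any thread that later reads this field.
    private volatile List<String> published = Collections.emptyList();

    // Called by the owning thread only.
    void addAndPublish(String item) {
        working.add(item);
        published = Collections.unmodifiableList(new ArrayList<>(working));
    }

    // Safe to call from any thread; the returned list never changes.
    List<String> latest() {
        return published;
    }
}
```

Readers never touch the working list, so they need no lock. The trade-off is one full copy per publication.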

Java - Is calling a synchronized getter function during a synchronized setter function the right way to manipulate a shared variable?

I have several threads trying to increment a counter for a certain key in a custom data structure that is not thread safe (you can imagine it to be similar to a HashMap). I was wondering what the right way to increment the counter in this case would be.
Is it sufficient to synchronize the increment function, or do I also need to synchronize the get operation?
public class Example {
    private MyDataStructure<Key, Integer> datastructure = new CustomDataStructure<Key, Integer>();
    private class MyThread implements Runnable {
        private synchronized void incrementCnt(Key key) {
            // from the datastructure documentation: if a value already exists for the given key, the
            // previous value will be replaced by this value
            datastructure.put(key, getCnt(key) + 1);
            // or can I do it without using the getCnt() function? like this:
            datastructure.put(key, datastructure.get(key) + 1);
        }
        private synchronized int getCnt(Key key) {
            return datastructure.get(key);
        }
        // run method...
    }
}
If I have two threads t1, t2 for example, I would do something like:
t1.incrementCnt();
t2.incrementCnt();
Can this lead to any kind of deadlock? Is there a better way to solve this?
The main issue with this code is that it is likely to fail to provide synchronized access to datastructure, because the synchronized methods lock on this of the inner class. That object is different for each instance of MyThread, so no mutual exclusion will happen.
A more correct way is to make datastructure a final field and then synchronize on it:
private final MyDataStructure<Key, Integer> datastructure = new CustomDataStructure<Key, Integer>();
private class MyThread implements Runnable {
    private void incrementCnt(Key key) {
        synchronized (datastructure) {
            // the read and the write happen under the same lock
            datastructure.put(key, datastructure.get(key) + 1);
        }
    }
    // run method...
}
As long as all data access is done using synchronized (datastructure), code is thread-safe and it's safe to just use datastructure.get(...). There should be no dead-locks, since deadlocks can occur only when there's more than one lock to compete for.
As the other answer told you, you should synchronize on your data structure, rather than on the thread/runnable object. It is a common mistake to try to use synchronized methods in the thread or runnable object. Synchronization locks are instance-based, not class-based (unless the method is static), and when you are running multiple threads, this means that there are actually multiple thread instances.
It's less clear-cut about Runnables: you could be using a single instance of your Runnable class with several threads. So in principle you could synchronize on it. But I still think it's bad form because in the future you may want to create more than one instance of it, and get a really nasty bug.
So the general best practice is to synchronize on the actual item that you are accessing.
Furthermore, the design conundrum of whether or not to use two methods should be solved by moving the whole thing into the data structure itself, if you can do so (if the class source is under your control). This is an operation that is confined to the data structure and applies only to it, and doing the increment outside of it is not good encapsulation. If your data structure exposes a synchronized incrementCnt method, then:
It synchronizes on itself, which is what you wanted.
It can use its own private fields directly, which means you don't actually need to call a getter and a setter.
It is free to have the implementation changed to one of the atomic structures in the future if it becomes possible, or add other implementation details (such as logging increment operations separately from setter access operations).
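A minimal sketch of that last suggestion, with the increment moved into the data structure itself. CounterMap, increment, get and runDemo are hypothetical names, and a plain HashMap stands in for the custom structure from the question.

```java
import java.util.HashMap;
import java.util.Map;

class CounterMap<K> {
    // Private backing map: reachable only through the synchronized methods below.
    private final Map<K, Integer> counts = new HashMap<>();

    // Read-modify-write under one lock (the CounterMap instance itself).
    synchronized int increment(K key) {
        int next = counts.getOrDefault(key, 0) + 1;
        counts.put(key, next);
        return next;
    }

    synchronized int get(K key) {
        return counts.getOrDefault(key, 0);
    }

    // Demo: several threads increment the same key on ONE shared instance.
    static int runDemo(int threads, int perThread) {
        CounterMap<String> counter = new CounterMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment("hits");
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get("hits");
    }
}
```

Because all threads share one CounterMap instance, they all contend for the same monitor. That is exactly what was missing when the synchronized methods lived on separate MyThread objects.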

Synchronized keyword- how it works?

Say I have a class X that contains a collection (assume I am not using one of the synchronized collections, just a normal one).
If I were to write my own synchronized add() method, how would the locking work? Is the locking done on the instance of X, and not on the collection object?
So synchronizing my add() method would not stop many instances of X from calling add() and inserting into the collection, and therefore I could still have threading problems?
A synchronized method locks the object. If your X.add is synchronized, it will prevent concurrent execution of other synchronized methods of the same X object. If anyone outside that X object has access to the same collection, the collection will not be protected.
If you want your collection to be protected, make sure it is not accessible to the rest of the world in any way other than a synchronized method of X. Also, this is a bit unclear in your question, but note that a synchronized non-static method locks the object. Assuming each X instance will have a collection of its own, they won't interfere with each other.
Another option, BTW, is to lock the collection instead of the X object:
void add(Object o) {
    synchronized(myCollection) {
        myCollection.add(o);
    }
}
This will synchronize access to the locked collection instead of the X object. Use whichever you find easier and more effective.
In your example, synchronized will make sure only one thread can invoke the method on one instance of the class at a time. Other methods could access that collection, which would not be safe. Look up concurrent collections for more information on thread-safe collection implementations.
If I was to write my own method synchronized add()- how does the locking work? Is the locking done on the instance of X, and not on the collection object?
The locking is done on the object that you synchronized on -- not any fields within the object. For locking to work, all of the threads must synchronize on the same exact object. Typically a private final object is best to be locked on.
private final Collection<...> myCollection = ...
...
synchronized (myCollection) {
    myCollection.add(...);
}
Although a common pattern is to lock on the object that you are protecting, it really can be any constant object. You could also do:
private final Object lockObject = new Object();
...
synchronized (lockObject) {
    myCollection.add(...);
}
So synchronizing my add() method would not stop many instances of X from calling add() and inserting into the collection- therefore I could still have threading problems?
If other parts of your application are accessing myCollection without being inside a synchronized (myCollection) block, then yes, you are going to have threading problems. You would need to synchronize around all accesses to properly protect the collection and provide a memory barrier. That means add(...), contains(...), iterators, etc.
Often, if you are trying to protect a collection or other class, it makes sense to wrap it in a class which does the synchronization. This hides the locking and protects the collection from unintended modifications from code that is missing a synchronized block.
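Such a wrapper might look like the following sketch. GuardedNames and its methods are hypothetical, and using a private lock object rather than this is one possible choice, not the only one.

```java
import java.util.ArrayList;
import java.util.List;

class GuardedNames {
    // Private lock: code outside this class cannot acquire it by accident.
    private final Object lock = new Object();
    // The collection never escapes this class.
    private final List<String> names = new ArrayList<>();

    void add(String name) {
        synchronized (lock) {
            names.add(name);
        }
    }

    boolean contains(String name) {
        synchronized (lock) {
            return names.contains(name);
        }
    }

    // Hands out a copy instead of the live list, so callers can iterate
    // without holding the lock.
    List<String> snapshot() {
        synchronized (lock) {
            return new ArrayList<>(names);
        }
    }
}
```

Since every access path goes through a synchronized block on the same lock, there is no way for other code to reach the list without it.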
Is it true that you are sharing one collection across many X instances? Then you need to synchronize on the collection instance itself. Don't make the method itself synchronized, but wrap all its code in a synchronized(coll) { ... } block.
If, on the other hand, each X has its own collection, then synchronized add() is all you need. This will guarantee that no two threads are executing add on the same instance at the same time.

Manually need to synchronize access to synchronized list/map/set etc

The Collections class provides various methods to get thread-safe collections. Why, then, is it necessary to manually synchronize access while iterating?
Each method is thread safe. If you make multiple calls to a synchronized collection this is not thread safe unless you hold a lock explicitly. Using an Iterator involves making multiple calls to the iterator implicitly so there is no way around this.
What some of the Concurrency Libraries collections do is provide weak consistency. They provide a pragmatic solution which is that an added or removed element may, or may not be seen when Iterating.
A simple example of a thread safe collection used in an unsafe manner.
private final List<String> list = Collections.synchronizedList(
new ArrayList<String>());
list.add("hello");
String hi = list.remove(list.size()-1);
Both add and remove are thread safe and you won't get an error using them individually. The problem is another thread can alter the collection BETWEEN calls (not within calls) causing this code to break in a number of ways.
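One way to close that gap, sketched under the assumption that the list really was created by Collections.synchronizedList (whose documentation tells callers to synchronize on the returned list while iterating). CompoundOps, removeLast and totalLength are names invented for this example.

```java
import java.util.List;

class CompoundOps {
    // size() and remove() as one atomic step: no other thread can change
    // the list between the two calls while we hold its lock.
    static <T> T removeLast(List<T> syncList) {
        synchronized (syncList) {
            if (syncList.isEmpty()) {
                return null;
            }
            return syncList.remove(syncList.size() - 1);
        }
    }

    // Iteration is many calls in a row, so the whole loop holds the lock.
    static int totalLength(List<String> syncList) {
        synchronized (syncList) {
            int total = 0;
            for (String s : syncList) {
                total += s.length();
            }
            return total;
        }
    }
}
```

This only works if every thread uses the wrapper and nobody keeps a reference to the underlying ArrayList.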

Collections.synchronizedMap vs synchronizing individual methods in HashMap

What is the difference between Collections.synchronizedMap() and a wrapper around a HashMap with all the methods synchronized? I don't see any difference, because Collections.synchronizedMap() internally uses the same lock for all methods.
Basically, what is the difference between the following code snippets
class C {
    Object o;
    public void foo() {
        synchronized(o) {
            // thread safe code here
        }
    }
}
and
class C {
    Object o;
    public synchronized void foo() {
    }
}
There is only one difference:
Collections.synchronizedMap is able to use a different monitor than itself.
Using synchronized methods is the same as using synchronized(this) blocks, which means the wrapper itself would be the monitor and could be locked from outside the wrapper.
If you don't want an outside application to lock your monitor, you need to hide it.
On the other hand, if you want to call multiple methods in a thread-safe fashion, locking the whole collection is the easiest way (though it is not very scalable).
PS: For reuse, it's better to delegate the method calls to a backing Map than to extend the class, because you can then switch to another Map implementation later without changing your wrapper.
Both approaches acquire a monitor on the object and so should perform exactly the same. The main reason for the difference is architectural. The synchronized wrapper allows extending the basic non-thread safe variation easily.
Having said that, don't use either; use ConcurrentHashMap. Its locking is much finer grained than a single monitor, so under contention it's much quicker than either approach (which are the same in terms of overhead and contention). Versions before Java 8 did this with lock striping, which allows segments of the backing array to be locked independently. This means it's less probable that two threads will request the same lock.
Do not reinvent the wheel and use what is provided by the API.
You should always decorate rather than lumping all features into one big class.
Always take the plain Map and decorate it with Collections, or use java.util.concurrent and a real lock, so that one can atomically inspect and update the map. Tomorrow you might want to change the Hashtable to a TreeMap, and you will be in trouble if you're stuck with a Hashtable.
So, why do you ask? :) Do you really believe that if a class is placed in the java.util package then some magic happens and its Java code works in some tricky way?
It really just wraps every method in a synchronized block and nothing more.
UPD: the difference is that you have far fewer chances to make a mistake if you use a synchronized collection instead of doing all the synchronization yourself.
UPD 2: as you can see in the sources, they use a 'mutex' object as the monitor. When you use the synchronized modifier in a method signature (i.e. synchronized void doSmth()), the current instance of your object (i.e. this) is used as the monitor. The two blocks of code below are equivalent:
1.
public synchronized void doSmth() {
    someLogic();
    moreLogic();
}
public static synchronized void doSmthStatic() {
    someStaticLogic();
    moreStaticLogic();
}
2.
public void doSmth() {
    synchronized (this) {
        someLogic();
        moreLogic();
    }
}
public static void doSmthStatic() {
    synchronized (ClassName.class) {
        someStaticLogic();
        moreStaticLogic();
    }
}
If thread safety is a requirement, use the data structures from the concurrency package. Using the wrapper class will reduce all accesses to the Map to a sequential queue.
a) Threads waiting to do operations at totally different points in the Map will be waiting for the same lock. Depending on the number of threads, this can affect application performance.
b) Consider compound operations on the Map, for example "check if present, then add" kinds of operations. A wrapper with a single lock per call will not help there, and thread synchronization becomes an issue again.
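For point b), ConcurrentMap already offers atomic compound operations, so "check if present, then add" does not need an external lock. A small sketch follows. AtomicMapOps, registerOnce and countHit are hypothetical helper names, while putIfAbsent and merge are standard ConcurrentMap methods.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class AtomicMapOps {
    // "Add only if absent" as a single atomic call. Returns the value that
    // ends up associated with the key (the existing one if there was one).
    static String registerOnce(ConcurrentMap<String, String> map, String key, String value) {
        String previous = map.putIfAbsent(key, value);
        return previous != null ? previous : value;
    }

    // Atomic increment: merge() inserts 1 if the key is missing, otherwise
    // it replaces the old count with old + 1.
    static int countHit(ConcurrentMap<String, Integer> hits, String key) {
        return hits.merge(key, 1, Integer::sum);
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> owners = new ConcurrentHashMap<>();
        System.out.println(registerOnce(owners, "job", "t1")); // prints t1
        System.out.println(registerOnce(owners, "job", "t2")); // prints t1
        ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();
        countHit(hits, "page");
        System.out.println(countHit(hits, "page")); // prints 2
    }
}
```

With Collections.synchronizedMap, the equivalent check-then-put would need an explicit synchronized block around both calls.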
