I am using JCS for caching in my application.Recently some errors are occuring where concurrent access to data in the cache results in null value i.e one thread writes to the cache and one thread reads to the cache.I want to know whether JCS inherently supports thread safe implementations when writing and reading from the cache.
I also want to know how to make my implementation thread safe.Because I have multiple classes writing to the cache,say PutData which implements Runnable is used for writing to the cache and GetData which also implements Runnable for reading from the cache,so making the method synchronized has no meaning and also making them atomic also is not meaningful since the data is not shared among the classes and I pass around the data object to the individual classes.BTW I am using a POJO serialiazed class.
Is there anyway to overcome this,or do I have to change my implementation in such a way that it forcefully completes writing and then reading,which is foolish I think.
This is more of like a producer-consumer problem except in the fact that my consumer thread here is not consuming data,but just reading data.
So synchronizing does guarantee that only one thread writes to the cache,but that does not solve my problem because the other thread accesses objects of different key.
Expecting your answers,
Thanks,
Madhu.
Recently I started using JCS and I had the problem "Is JCS thread safe?".
Well I had a look at the source code and found that the implementation has considered about thread safety.
JCS.getInstance(String region) does always return the same CompositeCache object per region key, wrapped in a new JCS object.In other words, reference of the one an only CompositeCache object is kept in a newly created wrapper JCS object. When we call methods like JCS.get(), JCS.put(), JCS.remove() etc. it will always ended up in invoking a method of the one an only CompositeCache object. So, it is singleton.
Importantly, CompositeCache object has synchronized methods for its write operations (put remove etc.) and in the inner implementation Hashtable objects have been used, which are also thread safe. So I think the JCS has taken care of thread safety at atomic levels.
What Thomas has mentioned above is true. If the cache object was synchronized, then a concurrency issue should have been avoided, which seems to be not the case as mentioned above, may be the issue is something else not really concurrency.
However, I just wanted to share the fact that, one should not plan to use the JCS by gaining an object level lock as discussed above, as the implementation seems to be thread safe, and we should let the concurrency to be handled at more atomic levels, looking for better performance.
I don't know JCS but you can synchronize on objects, so you might want to synchronize on the cache object.
Someting like this:
public void putToCache(...) {
synchronized (cache) { //cache is your actual cache instance here
//put to cache here
}
}
Related
public BlockingQueue<Message> Queue;
Queue = new LinkedBlockingQueue<>();
I know if I use, say a synchronized List, I need to surround it in synchronized blocks to safely use it across threads
Is that the same for Blocking Queues?
No you do not need to surround with synchronized blocks.
From the JDK javadocs...
BlockingQueue implementations are thread-safe. All queuing methods achieve their effects atomically using internal locks or other forms of concurrency control. However, the bulk Collection operations addAll, containsAll, retainAll and removeAll are not necessarily performed atomically unless specified otherwise in an implementation. So it is possible, for example, for addAll(c) to fail (throwing an exception) after adding only some of the elements in c.
Just want to point out that from my experience the classes in the java.util.concurrent package of the JDK do not need synchronization blocks. Those classes manage the concurrency for you and are typically thread-safe. Whether intentional or not, seems like the java.util.concurrent has superseded the need to use synchronization blocks in modern Java code.
Depends on use case, will explain 2 scenarios where you may need synchronized blocks or dont need it.
Case 1: Not required while using queuing methods e.g. put, take etc.
Why not required is explained here, important line is below:
BlockingQueue implementations are thread-safe. All queuing methods
achieve their effects atomically using internal locks or other forms
of concurrency control.
Case 2: Required while iterating over blocking queues and most concurrent collections
Since iterator (one example from comments) is weakly consistent, meaning it reflects some but not necessarily all of the changes that have been made to its backing collection since it was created. So if you care about reflecting all changes you need to use synchronized blocks/ Locks while iterating.
You are thinking about synchronization at too low a level. It doesn't have anything to do with what classes you use. It's about protecting data and objects that are shared between threads.
If one thread is able to modify any single data object or group of related data objects while other threads are able to look at or modify the same object(s) at the same time, then you probably need synchronization. The reason is, it often is not possible for one thread to modify data in a meaningful way without temporarily putting the data into an invalid state.
The purpose of synchronization is to prevent other threads from seeing the invalid state and possibly doing bad things to the same data or to other data as a result.
Java's Collections.synchronizedList(...) gives you a way for two or more threads to share a List in such a way that the list itself is safe from being corrupted by the action of the different threads. But, It does not offer any protection for the data objects that are in the List. If your application needs that protection, then it's up to you to supply it.
If you need the equivalent protection for a queue, you can use any of the several classes that implement java.util.concurrent.BlockingQueue. But beware! The same caveat applies. The queue itself will be protected from corruption, but the protection does not automatically extend to the objects that your threads pass through the queue.
In Java an Object itself can act as a lock for guarding its own state . This convention is used in many built in classes like Vector and other synchronized collections where every method is synchronized and thus guarded by the intrinsic lock of the object itself . Is this good or bad ? Please give reasons also .
Pros
It's simple.
You can control the lock externally.
Cons
It breaks encapuslation.
You can't change its locking behaviour without changing its implied contract.
For the most part, it doesn't matter unless you are developing an API which will be widely used. So while using synchronised(this) is not ideal, it is simple.
Well Vector, Hashtable, etc. were synchronized like this internally and we all know what happened to them...
I honestly can't find any good reason to do synchronization like this. Here are the disadvantages that I see:
There's almost always a more efficient way of ensuring thread-safety than just putting a lock on the entire method.
It slows down the code in single threaded environments because you pay the overhead of locking and unlocking without actually needing the lock.
It gives a false sense of security because although each operation is synchronized, sequences of operations are not and you can still accidentally create data races. Imagine a collection which is synchronized on each method and the following code:
if(collection.isEmpty()) {
collection.add(...);
}
Assuming the aim is to have only a single item added, the above code is not thread safe because a thread can be interrupted between the if check and the actual call to add, even though both operations are synchronized individually, so it is possible to actually get two items in the collection.
I have started learning concurrency and threads in Java. I know the basics of synchronized (i.e. what it does). Conceptually I understand that it provides mutually exclusive access to a shared resource with multiple threads in Java. But when faced with an example like the one below I am confused about whether it is a good idea to have it synchronized. I know that critical sections of the code should be synchronized and this keyword should not be overused or it effects the performance.
public static synchronized List<AClass> sortA(AClass[] aArray)
{
List<AClass> aObj = getList(aArray);
Collections.sort(aObj, new AComparator());
return aObj;
}
public static synchronized List<AClass> getList(AClass[] anArray)
{
//It converts an array to a list and returns
}
Assuming each thread passes a different array then no synchronization is needed, because the rest of the variables are local.
If instead you fire off a few threads all calling sortA and passing a reference to the same array, you'd be in trouble without synchronized, because they would interfere with eachother.
Beware, that it would seem from the example that the getList method returns a new List from an array, such that even if the threads pass the same array, you get different List objects. This is misleading. For example, using Arrays.asList creates a List backed by the given array, but the javadoc clearly states that Changes to the returned list "write through" to the array. so be careful about this.
Synchronization is usually needed when you are sharing data between multiple invocations and there is a possibility that the data would be modified resulting in inconsistency. If the data is read-only then you dont need to synchronize.
In the code snippet above, there is no data that is being shared. The methods work on the input provided and return the output. If multiple threads invoke one of your method, each invocation will have its own input and output. Hence, there is no chance of in-consistency anywhere. So, your methods in the above snippet need not be synchornized.
Synchronisation, if unnecessarily used, would sure degrade the performance due to the overheads involved and hence should be cautiously used only when required.
Your static methods don't depend on any shared state, so need not be synchronized.
There is no rule defined like when to use synchronized and when not, when you are sure that your code will not be accessed by concurrent threads then you can avoid using synchronised.
Synchronization as you have correctly figured has an impact on the throughput of your application, and can also lead to starving thread.
All get basically should be non blocking as Collections under concurrency package have implemented.
As in your example all calling thread will pass there own copy of array, getList doesn't need to be synchronized so is sortA method as all other variables are local.
Local variables live on stack and every thread has its own stack so other threads cannot interfere with it.
You need synchronization when you change the state of the Object that other threads should see in an consistent state, if your calls don't change the state of the object you don't need synchronization.
I wouldn't use synchronized on single threaded code. i.e. where there is no chance an object will be accessed by multiple threads.
This may appear obvious but ~99% of StringBuffer used in the JDK can only be used by one thread can be replaced with a StringBuilder (which is not synchronized)
We know that ConcurrentHashMap can provide concurrent access to multiple threads to boost performance , and inside this class, segments are synchronized up (am I right?). Question is, can this design guarantee the thread safety? Say we have 30+ threads accessing &changing an object mapped by the same key in a ConcurrentHashMap instance, my guess is, they still have to line up for that, don't they?
From my recollection that the book "Java Concurrency in Practice" says the ConcurrentHashMap provide concurrent reading and a decent level of concurrent writing. in the aforementioned scenario, and if my guess is correct, the performance won't be better than using the Collection's static synchonization wrapper api?
Thanks for clarifying,
John
You will still have to synchronize any access to the object being modified, and as you suspect all access to the same key will still have contention. The performance improvement comes in access to different keys, which is of course the more typical case.
All a ConcurrentMap can give you wrt to concurrency is that modifications to the map itself are done atomically, and that any writes happen-before any reads (this is important as it provides safe publishing of any reference from the map.
Safe-publishing means that any (mutable) object retrieved from the map will be seen with all writes to it before it was placed in the map. It won't help for publishing modifications that are made after retrieving it though.
However, concurrency and thread-safety is generally hard to reason about and make correct if you have mutable objects that are being modified by multiple parties. Usually you have to lock in order to get it right. A better approach is often to use immutable objects in conjunction with the ConcurrentMap conditional putIfAbsent/replace methods and linearize your algorithm that way. This lock-free style tends to be easier to reason about.
Question is, can this design guarantee the thread safety?
It guarantees the thread safety of the map; i.e. that access and updates on the map have a well defined and orderly behaviour in the presence of multiple threads performing updates simultaneously.
It does guarantee thread safety of the key or value objects. And it does not provide any form of higher level synchronization.
Say we have 30+ threads accessing &changing an object mapped by the same key in a ConcurrentHashMap instance, my guess is, they still have to line up for that, don't they?
If you have multiple threads trying to use the same key, then their operations will inevitably be serialized to some degree. That is unavoidable.
In fact, from briefly looking at the source code, it looks like ConcurrentHashMap falls back to using conventional locks if there is too much contention for a particular segment of the map. And if you have multiple threads trying to access AND update the same key simultaneously, that will trigger locking.
first remember that a thread safe tool doesn't guarantee thread safe usage of it in and of itself
the if(!map.contains(k))map.put(k,v); construct to putIfAbsent for example is not thread safe
and each value access/modification still has to be made thread safe independently
Reads are concurrent, even for the same key, so performance will be better for typical applications.
Naively, I expected a ThreadLocal to be some kind of WeakHashMap of Thread to the value type. So I was a little puzzled when I learned that the values of a ThreadLocal is actually saved in a map in the Thread. Why was it done that way? I would expect that the resource leaks associated with ThreadLocal would not be there if the values are saved in the ThreadLocal itself.
Clarification: I was thinking of something like
public class AlternativeThreadLocal<T> {
private final Map<Thread, T> values =
Collections.synchronizedMap(new WeakHashMap<Thread, T>());
public void set(T value) { values.put(Thread.currentThread(), value); }
public T get() { return values.get(Thread.currentThread());}
}
As far as I can see this would prevent the weird problem that neither the ThreadLocal nor it's left over values could ever be garbage-collected until the Thread dies if the value somehow strongly references the ThreadLocal itself.
(Probably the most devious form of this occurs when the ThreadLocal is a static variable on a class the value references. Now you have a big resource leak on redeployments in application servers since neither the objects nor their classes can be collected.)
Sometimes you get enlightened by just asking a question. :-) Now I just saw one possible answer: thread-safety. If the map with the values is in the Thread object, the insertion of a new value is trivially thread-safe. If the map is on the ThreadLocal you have the usual concurrency issues, which could slow things down. (Of course you would use a ReadWriteLock instead of synchronize, but the problem remains.)
You seem to be misunderstanding the problem of ThreadLocal leaks. ThreadLocal leaks occur when the same thread is used repeatedly, such as in a thread pool, and the ThreadLocal state is not cleared between usages. They're not a consequence of the ThreadLocal remaining when the Thread is destroyed, because nothing references the ThreadLocal Map aside from the thread itself.
Having a weakly reference map of Thread to thread-local objects would not prevent the ThreadLocal leak problem because the thread still exists in the thread pool, so the thread-local objects are not eligible for collection when the thread is reused from the pool. You'd still need to manually clear the ThreadLocal to avoid the leak.
As you said in your answer, concurrency control is simplified with the ThreadLocal Map being a single instance per thread. It also makes it impossible for one thread to access another's thread local objects, which might not be the case if the ThreadLocal object exposed an API on the Map you suggest.
I remember some years ago Sun changed the implementation of thread locals to its current form. I don't remember what version it was and what the old impl was like.
Anyway, for a variable that each thread should have a slot for, Thread is the natural container of choice. If we could, we would also add our thread local variable directly as a member of Thread class.
Why would the Map be on ThreadLocal? That doesn't make a lot of sense. So it'd be a Map of ThreadLocals to objects inside a ThreadLocal?
The simple reason it's a Map of Threads to Objects is because:
It's an implementation detail ie that Map isn't exposed in any way;
It's always easy to figure out the current thread (with Thread.currentThread()).
Also the idea is that a ThreadLocal can store a different value for each Thread that uses it so it makes sense that it is based on Thread, doesn't it?