synchronization ideas for a method - java

I have a multithreaded class A which accesses the following insert() method of another class B (A has only a single instance of B).
Instead of making the entire method synchronized, are there any better ways to synchronize the following method? (to reduce the synchronization overhead)
private void insert(byte[] shardKey, byte[] queueKey,
byte[] value, PipelineMessageType msgType) {
PipelineMessage pipelineMessage = new PipelineMessage(queueKey,
value, msgType);
LinkedBlockingQueue<PipelineMessage> queue;
JedisShardInfo shardInfo = shardedJedis.getShardInfo(shardKey); // shardedJedis is an instance variable of this class
String mapKey = shardInfo.getHost() + shardInfo.getPort();
queue = shardQueue.get(mapKey); // shardQueue is an instance variable of this class
boolean insertSuccessful = queue.offer(pipelineMessage);
if(!insertSuccessful) {
// perform the pipeline sync - flush the queue
// use another thread for this
// (processing of queue entries is given to another thread here)
// queue would be empty now. Insert (k,v)
queue.offer(pipelineMessage);
}
}
I tried to synchronize only the fragment which accesses the instance variables but there might be a scenario where 2 threads try to insert into a full queue and enter the if block. Then 2 threads might process the queue entries which I don't want to happen.
Any suggestions are appreciated. Thank you in advance.

It seems to me that if JedisShardInfo is a read-only item, then you should not need to protect/synchronize it. So you could synchronize only from the line
queue= ...
Otherwise, almost everything should be synchronized, except the first statement (declaration of pipeline message), and then I really wonder if it changes much compared to declaring the whole method synchronized.
Also, if you have other points of synchronization, I mean other methods or code blocks that are synchronized on this, you should consider splitting them up and synchronizing on different lock objects, one per group of data members that you wish to protect from concurrent access:
Object lockerA = new Object() {};
synchronized( lockerA )
{}//sync
Well, not much to say. :)
Regards,
Stéphane

The key to correct synchronization is to follow this pattern:
synchronized (lockForState) { // All code that alters state must also synchronize on the same lock
    while (!stateOkToProceed()) {
        try {
            lockForState.wait();
        } catch (InterruptedException e) {
            // The thread was interrupted, usually as a signal to exit: restore the flag and stop waiting.
            // Spurious wakeups do not throw; the while loop re-checks the condition for them.
            Thread.currentThread().interrupt();
            return;
        }
    }
    updateState();
    lockForState.notifyAll();
}
The java.util.concurrent package offers many thread-safe implementations of classes needed to solve common threading problems. Consider using a BlockingQueue.
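For the insert() method in the question, one option is to leave the common path unlocked and take a lock only when the queue is full. The following is a minimal sketch under stated assumptions, not the asker's actual code: the class name FlushOnFullQueue, the String payload, the flushLock field and the flushedBatches list are all placeholders, and the flush is done inline with drainTo rather than handed to another thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the per-shard queue handling in the question.
class FlushOnFullQueue {
    private final LinkedBlockingQueue<String> queue;
    private final Object flushLock = new Object();
    // Guarded by flushLock; stands in for "perform the pipeline sync".
    private final List<List<String>> flushedBatches = new ArrayList<>();

    FlushOnFullQueue(int capacity) {
        queue = new LinkedBlockingQueue<>(capacity);
    }

    void insert(String message) {
        // Fast path: LinkedBlockingQueue.offer is already thread-safe.
        if (queue.offer(message)) {
            return;
        }
        // Slow path: only one thread at a time may flush.
        synchronized (flushLock) {
            // Retry first: another thread may have flushed while we waited for the lock.
            while (!queue.offer(message)) {
                List<String> batch = new ArrayList<>();
                queue.drainTo(batch);
                flushedBatches.add(batch);
            }
        }
    }

    int flushCount() {
        synchronized (flushLock) {
            return flushedBatches.size();
        }
    }

    int pending() {
        return queue.size();
    }
}
```

Because the second offer is retried inside the lock, two threads that both find the queue full cannot both drain it: the second one normally finds room and skips the flush.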

Related

Synchronizing on entire object v/s synchronizing on particular field of an object - which is a better approach

While working on a Producer Consumer problem I stumbled upon a scenario wherein I can synchronize on the field "sharedLinkedList" or on the entire object "this" of class "SharedObject".
"sharedLinkedList" is the field the Producer and Consumer threads are going to work on.
Please find a snippet of the class with the Produce method:-
public class SharedObject {
LinkedList<Integer> sharedLinkedList = new LinkedList<Integer>();
int capacity = 5;
public void produce() {
while (true) {
synchronized (sharedLinkedList) {// Alternately synchronized(this)
try {
while(sharedLinkedList.size() == capacity){
sharedLinkedList.wait();//Alternately this.wait()
}
} catch (InterruptedException e) {
e.printStackTrace();
}
sharedLinkedList.add(123);
sharedLinkedList.notify();//Alternately this.notify
}
}
}
}
Both the approaches work.
1. Synchronizing on the field "sharedLinkedList"
2. Synchronizing on the entire object (put as a comment in the code snippet).
Which is a better approach and why?
With the amount of code you show, both approaches are equivalent. That doesn't mean they are equivalent under all circumstances.
synchronized(x) acquires the lock on x, so in general you should synchronize on an object that is common to all threads using the shared resources. If threads use a single common resource (such as the list), then synchronizing on that list works. If threads share more than one resource, you have to synchronize on a separate object to control access to those shared resources. This can be a dedicated lock object, or this, depending on the context.
synchronized(this) also provides mutual exclusion among the synchronized methods of a class, so it (or synchronized methods) is useful if object state needs to be modified in a thread-safe manner.
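To illustrate the case of more than one shared resource, here is a small sketch in which a single dedicated lock object guards two fields that must change together. The class SharedLedger and its fields are invented for this example and do not come from the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: one lock guards two related pieces of state.
class SharedLedger {
    private final Object lock = new Object();
    private final List<Integer> entries = new ArrayList<>(); // guarded by lock
    private long total = 0;                                   // guarded by lock

    void add(int amount) {
        synchronized (lock) {
            // Both updates happen under the same lock, so no thread
            // can observe the list and the total out of step.
            entries.add(amount);
            total += amount;
        }
    }

    boolean isConsistent() {
        synchronized (lock) {
            long sum = 0;
            for (int e : entries) {
                sum += e;
            }
            return sum == total;
        }
    }

    int size() {
        synchronized (lock) {
            return entries.size();
        }
    }
}
```

Synchronizing on the list alone would work here too, but a separate private lock keeps the choice of monitor independent of which fields exist and prevents outside code from locking on the same object.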

What does it mean when somebody says "Result is not Thread-Safe"

I was writing an app specific wrapper over Java HBase APIs when I read this doc:
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Result.html
It says This class is **NOT THREAD SAFE**.
What exactly does it mean by not thread safe? I'm basically a C++ programmer, and if someone says the function strtok() is not thread safe, I'll not use it in a multithreaded environment. It's something like: strtok() uses a static variable, so calling this function from two different threads is not a good idea.
Is it the same when it comes to Java?
I have a function:
public String get(String key, String family) {
Get get = new Get(key.getBytes());
get.addFamily(family.getBytes());
Result result = null;
try {
result = _table.get(get);
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return "";
}
The function get might be called by multiple threads. Does it make Result unsafe to use somehow?
What exactly does it mean by not thread safe.
It means that if an instance of the class is accessed from several threads, then calling its methods from those threads may produce unpredictable results because of unwanted interaction between the threads. The basic reason for this is that the threads share the same object data. See the Wikipedia article on thread safety to learn more.
After going through your code I am seeing that you are using a member variable _table in line result = _table.get(get); . So , most probably it is not Thread-safe.
If a class is "not Thread-Safe", its methods cannot be called (on the same instance) by multiple threads without additional synchronization. For example, you can't simultaneously iterate over an ArrayList in one thread, and modify its contents in other.
In your case it shouldn't be a problem, because in each invocation of function get new instance of Result is created, so these threads operate on different Result objects.
When you read that Result is not thread safe, it means that if multiple threads can access the same Result object, you need to make sure its usage is synchronized.
If your code stays as is, i.e. each call to get generates its own new instance of Result, you are OK. If you kept this Result instance in the object between calls and used it in multiple calls, you would need to protect access to that Result object.
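The distinction between a per-call instance and a shared instance can be sketched as follows. WordTally is an invented stand-in for any class documented as not thread safe (it is not the HBase Result class), and TallyService is likewise hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical non-thread-safe class: a plain HashMap with no locking.
class WordTally {
    private final Map<String, Integer> counts = new HashMap<>();

    void record(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    int count(String word) {
        return counts.getOrDefault(word, 0);
    }
}

class TallyService {
    private final WordTally shared = new WordTally();

    // Safe without locking: each call works on its own local instance,
    // which no other thread can reach.
    int countLocally(String[] words, String target) {
        WordTally local = new WordTally();
        for (String w : words) {
            local.record(w);
        }
        return local.count(target);
    }

    // The shared instance is reachable from many threads,
    // so every access goes through the same lock.
    void recordShared(String word) {
        synchronized (shared) {
            shared.record(word);
        }
    }

    int sharedCount(String word) {
        synchronized (shared) {
            return shared.count(word);
        }
    }
}
```

countLocally corresponds to the get method in the question, where the object is created and used inside one call; recordShared corresponds to keeping the object in a field between calls.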

Java synchronized - am I doing it right?

I'm not used to working with synchronized. Does the following snippet look right?
public void setNewSessionListener(NewSessionListener newSessionListener) {
if (this.newSessionListener != null)
synchronized (this.newSessionListener) {
this.newSessionListener = newSessionListener;
}
else
this.newSessionListener = newSessionListener;
}
More specifically do I need to perform a null check? I have a gut feeling there is something fundamentally wrong with that code.
There are two mistakes. The first one is that if you access a field which requires synchronization, you always have to access it with the same lock held. You also have to check whether the field is null and write to the field in the same synchronized block, because otherwise, by the time you write to the field, it may already be non-null.
The second one is that it is best to synchronize on something that doesn't change, in other words, on a static final field or on the instance itself. For example, you can create a lock object specifically for this purpose:
private static final Object LOCK = new Object();
And then you will write:
synchronized (LOCK) {
if (this.newSessionListener == null) this.newSessionListener = newSessionListener;
}
Your feeling is right. You should do the null check inside the synchronized block. Otherwise the block won't prevent double initialization. Furthermore, you shouldn't synchronize on this.newSessionListener which you are about to change - choose an object (reference) which is going to stay around for the whole scope of the block. This is the only way to guarantee that only one thread can enter this block of code at any point in time. The typical way to achieve this is to synchronize on this. Alternatively, you may synchronize on a private final object, kept for this sole purpose.
Moreover, ultimately you are performing the same assignment in both the if and the else branches, which is probably not what you want.
This is, at a minimum, a very bad idea. You are synchronizing on an object you then assign to.
Because you are using synchronized I assume this is called asynchronously and it could be called by one thread while another thread is inside this code. If so, you are not locking on a common object, you are locking on the value it is holding at that point in time.
Probably, and I stress probably, you can use synchronized (this). That will ensure that all calls to this method on this specific object are synchronized. Calls on other instances of the class lock on those other objects, so nothing is synchronized across instances.
If you want to synchronize across all instances, use synchronized (YourClass.class).
Here's another possibility (I tend to prefer explicit locks over the synchronized block):
private ReentrantLock lock = new ReentrantLock();
lock.lock();
try {
// do your synchronized code here.
}
finally {
lock.unlock();
}
Though just by looking at your code, I'm not sure why there's even an if block. Why do you synchronize in one case and not the other, especially considering you're making the same assignment in either case?
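Pulling the answers together, a setter that locks on an object which never changes might look like the sketch below. The names SessionRegistry, setIfAbsent, fireNewSession and the onNewSession method are assumptions made for illustration; the question does not show the NewSessionListener interface or the enclosing class.

```java
// Hypothetical listener interface; the real one is not shown in the question.
interface NewSessionListener {
    void onNewSession(String sessionId);
}

class SessionRegistry {
    // The lock never changes, unlike the field it protects.
    private final Object lock = new Object();
    private NewSessionListener newSessionListener; // guarded by lock

    // Plain replacement: no null check is needed at all.
    void setNewSessionListener(NewSessionListener listener) {
        synchronized (lock) {
            this.newSessionListener = listener;
        }
    }

    // "Initialize once" variant: check and write under the same lock.
    boolean setIfAbsent(NewSessionListener listener) {
        synchronized (lock) {
            if (newSessionListener != null) {
                return false;
            }
            newSessionListener = listener;
            return true;
        }
    }

    void fireNewSession(String sessionId) {
        NewSessionListener current;
        synchronized (lock) {
            current = newSessionListener; // read under the same lock
        }
        if (current != null) {
            current.onNewSession(sessionId); // call outside the lock
        }
    }
}
```

Reads go through the same lock as writes, which is what makes a newly set listener visible to other threads.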

Implementing a blocking queue in JavaME: how to optimize it?

I'm trying to implement a simple blocking queue in Java ME. In JavaME API, the concurrency utilities of Java SE are not available, so I have to use wait-notify like in the old times.
This is my provisional implementation. I'm using notify instead of notifyAll because in my project there are multiple producers but only a single consumer. I deliberately used a separate object for wait-notify to improve readability, even though it wastes a reference:
import java.util.Vector;
public class BlockingQueue {
private Vector queue = new Vector();
private Object queueLock = new Object();
public void put(Object o){
synchronized(queueLock){
queue.addElement(o);
queueLock.notify();
}
}
public Object take(){
Object ret = null;
synchronized (queueLock) {
while (queue.isEmpty()){
try {
queueLock.wait();
} catch (InterruptedException e) {}
}
ret = queue.elementAt(0);
queue.removeElementAt(0);
}
return ret;
}
}
My main question is about the put method. Could I put the queue.addElement line out of the synchronized block? Will performance improve if so?
Also, the same applies to take: could I take the two operations on queue out of the synchronized block?
Any other possible optimization?
EDIT:
As @Raam correctly pointed out, the consumer thread can starve when being awakened in wait. So what are the alternatives to prevent this? (Note: In JavaME I don't have all these nice classes from Java SE. Think of it as the old Java v1.2)
The individual methods of Vector are synchronized, but a sequence such as isEmpty followed by elementAt and removeElementAt is not atomic as a whole, so you should still synchronize access to it, like you have done. Unless you have evidence that your current solution has performance problems, I wouldn't worry about it.
On a side note, I see no harm in using notifyAll rather than notify to support multiple consumers.
synchronized is used to protect access to shared state and ensure atomicity.
Note that methods of Vector are already synchronized, therefore Vector protects it own shared state itself. So, your synchronization blocks are only needed to ensure atomicity of your operations.
You certainly cannot move operations on queue from the synchronized block in your take() method, because atomicity is crucial for correctness of that method. But, as far as I understand, you can move queue operation from the synchronized block in the put() method (I cannot imagine a situation when it can go wrong).
However, the reasoning above is purely theoretical, because in all cases you have double synchronization: you synchronize on queueLock, and the methods of Vector implicitly synchronize on queue. Therefore the proposed optimization doesn't make much sense, since its correctness depends on the presence of that double synchronization.
To avoid locking two different monitors, you can synchronize on queue itself instead of on queueLock (and call wait and notify on queue as well):
synchronized (queue) { ... }
Another option would be to use non-synchronized collection (such as ArrayList) instead of Vector, but JavaME doesn't support it. In this case you won't be able to use proposed optimization as well because synchronized blocks also protect shared state of the non-synchronized collection.
Unless you have performance issues specifically due to garbage collection, I would rather use a linked list than a Vector to implement a queue (first in,first out).
I would also write code that could be reused when your project (or another) gets multiple consumers. In that case, though, you need to be aware that the Java language specification does not prescribe how monitors are implemented. In practice, that means that you don't control which consumer thread gets notified (some Java Virtual Machines wake waiting threads in FIFO order and others in LIFO order).
I also think that whoever is using the blocking class is also supposed to deal with the InterruptedException. After all, the client code would have to deal with a null Object return otherwise.
So, something like this:
/*package*/ class LinkedObject {
private Object iCurrentObject = null;
private LinkedObject iNextLinkedObject = null;
LinkedObject(Object aNewObject, LinkedObject aNextLinkedObject) {
iCurrentObject = aNewObject;
iNextLinkedObject = aNextLinkedObject;
}
Object getCurrentObject() {
return iCurrentObject;
}
LinkedObject getNextLinkedObject() {
return iNextLinkedObject;
}
}
public class BlockingQueue {
private LinkedObject iLinkedListContainer = null;
private Object iQueueLock = new Object();
private int iBlockedThreadCount = 0;
public void appendObject(Object aNewObject) {
synchronized(iQueueLock) {
iLinkedListContainer = new LinkedObject(aNewObject, iLinkedListContainer); // prepends, so the newest object is returned first
if(iBlockedThreadCount > 0) {
iQueueLock.notify();//one at a time because we only appended one object
}
} //synchonized(iQueueLock)
}
public Object getFirstObject() throws InterruptedException {
Object result = null;
synchronized(iQueueLock) {
while(null == iLinkedListContainer) { // loop, not if: re-check the condition after every wake-up
++iBlockedThreadCount;
try {
iQueueLock.wait();
--iBlockedThreadCount; // instead of having a "finally" statement
} catch (InterruptedException iex) {
--iBlockedThreadCount;
throw iex;
}
}
result = iLinkedListContainer.getCurrentObject();
iLinkedListContainer = iLinkedListContainer.getNextLinkedObject();
if((iBlockedThreadCount > 0) && (null != iLinkedListContainer )) {
iQueueLock.notify();
}
}//synchronized(iQueueLock)
return result;
}
}
I think that if you try to put less code in the synchronized blocks, the class will not be correct anymore.
There seem to be some issues with this approach. You can have scenarios where the consumer can miss notifications and wait on the queue even when there are elements in the queue.
Consider the following sequence in chronological order
T1 - Consumer acquires the queueLock and then calls wait. Wait will release the lock and cause the thread to wait for a notification
T2 - One producer acquires the queueLock and adds an element to the queue and calls notify
T3 - The Consumer thread is notified and attempts to acquire queueLock BUT fails as another producer comes at the same time. (from the notify java doc - The awakened thread will compete in the usual manner with any other threads that might be actively competing to synchronize on this object; for example, the awakened thread enjoys no reliable privilege or disadvantage in being the next thread to lock this object.)
T4 - The second producer now adds another element and calls notify. This notify is lost as the consumer is waiting on queueLock.
So theoretically it's possible for the consumer to starve (forever stuck trying to get the queueLock). You can also run into a memory issue, with multiple producers adding elements to the queue that are never read and removed.
Some changes that I would suggest are as follows -
Keep an upper bound to the number of items that can be added to the queue.
Ensure that the consumer always reads all the elements. Here is a program which shows how the producer-consumer problem can be coded.
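A minimal sketch of the bounded variant suggested above, using only wait and notifyAll. The class name BoundedBlockingQueue and the capacity parameter are assumptions for this example; it is written for Java SE, so on Java ME the generic Vector<Object> would have to become a raw Vector.

```java
import java.util.Vector;

// Hypothetical bounded queue built on wait/notifyAll only.
class BoundedBlockingQueue {
    private final Vector<Object> queue = new Vector<>();
    private final Object queueLock = new Object();
    private final int capacity;

    BoundedBlockingQueue(int capacity) {
        this.capacity = capacity;
    }

    void put(Object o) throws InterruptedException {
        synchronized (queueLock) {
            // Producers block while the queue is full, which bounds memory use.
            while (queue.size() >= capacity) {
                queueLock.wait();
            }
            queue.addElement(o);
            // notifyAll: producers and consumers wait on the same monitor.
            queueLock.notifyAll();
        }
    }

    Object take() throws InterruptedException {
        synchronized (queueLock) {
            // The loop re-checks the condition after every wake-up.
            while (queue.isEmpty()) {
                queueLock.wait();
            }
            Object head = queue.elementAt(0);
            queue.removeElementAt(0);
            queueLock.notifyAll(); // a slot is free again
            return head;
        }
    }
}
```

Because waiting threads always re-check the queue state in a loop, a wake-up that loses the race for the lock costs nothing but a retry; the state of the queue, not the notification itself, decides whether a thread proceeds.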

Piping data between threads with Java

I am writing a multi-threaded application that mimics a movie theater. Each person involved is its own thread and concurrency must be done completely by semaphores. The only issue I am having is how to basically link threads so that they can communicate (via a pipe for instance).
For instance:
Customer[1] which is a thread, acquires a semaphore that lets it walk up to the Box Office. Now Customer[1] must tell the Box Office Agent that they want to see movie "X". Then BoxOfficeAgent[1] also a thread, must check to make sure the movie isn't full and either sell a ticket or tell Customer[1] to pick another movie.
How do I pass that data back and forth while still maintaining concurrency with the semaphores?
Also, the only class I can use from java.util.concurrent is the Semaphore class.
One easy way to pass data back and forth between threads is to use the implementations of the interface BlockingQueue<E>, located in the package java.util.concurrent.
This interface has methods to add elements to the collection with different behaviors:
add(E): adds if possible, otherwise throws exception
boolean offer(E): returns true if the element has been added, false otherwise
boolean offer(E, long, TimeUnit): tries to add the element, waiting the specified amount of time
put(E): blocks the calling thread until the element has been added
It also defines methods for element retrieval with similar behaviors:
take(): blocks until there's an element available
poll(long, TimeUnit): waits up to the specified amount of time for an element, and returns null if none becomes available
The implementations I use most frequently are: ArrayBlockingQueue, LinkedBlockingQueue and SynchronousQueue.
The first one, ArrayBlockingQueue, has a fixed size, defined by a parameter passed to its constructor.
The second, LinkedBlockingQueue, is unbounded unless you pass a capacity to its constructor (the default capacity is Integer.MAX_VALUE). In its unbounded form it effectively always accepts elements: offer returns true immediately and add does not throw an exception.
The third, and to me the most interesting one, SynchronousQueue, is exactly a pipe. You can think of it as a queue with size 0. It will never keep an element: this queue will only accept elements if there's some other thread trying to retrieve elements from it. Conversely, a retrieval operation will only return an element if there's another thread trying to push it.
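The difference between the three implementations shows up with the non-blocking offer(E). In this small sketch the helper class QueueBehaviourDemo and its method offerTwice are invented for the example; the queue classes and offer are the real java.util.concurrent API.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

// Hypothetical helper that offers two elements and reports both results.
class QueueBehaviourDemo {
    static boolean[] offerTwice(BlockingQueue<String> queue) {
        return new boolean[] { queue.offer("first"), queue.offer("second") };
    }

    static boolean[] arrayOfOne() {
        // Capacity 1: the first offer succeeds, the second finds the queue full.
        return offerTwice(new ArrayBlockingQueue<>(1));
    }

    static boolean[] linkedUnbounded() {
        // No capacity given: both offers succeed.
        return offerTwice(new LinkedBlockingQueue<>());
    }

    static boolean[] synchronousNoConsumer() {
        // No thread is waiting in take(), so both offers are rejected.
        return offerTwice(new SynchronousQueue<>());
    }
}
```

With no consumer running, the three methods return {true, false}, {true, true} and {false, false} respectively.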
To fulfill the homework requirement of synchronization done exclusively with semaphores, you could get inspired by the description I gave you about the SynchronousQueue, and write something quite similar:
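import java.util.concurrent.Semaphore;

class Pipe<E> {
    private E e;
    private final Semaphore read = new Semaphore(0);  // permits = elements ready to be taken
    private final Semaphore write = new Semaphore(1); // permits = free slots
    // acquire() can be interrupted, so both methods declare InterruptedException
    public final void put(final E e) throws InterruptedException {
        write.acquire();
        this.e = e;
        read.release();
    }
    public final E take() throws InterruptedException {
        read.acquire();
        E e = this.e;
        write.release();
        return e;
    }
}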
Notice that this class presents similar behavior to what I described about the SynchronousQueue.
Once the method put(E) is called, it acquires the write semaphore, which is then left empty, so that another call to the same method would block at its first line. This method then stores a reference to the object being passed, and releases the read semaphore. This release makes it possible for any thread calling the take() method to proceed.
The first step of the take() method is then, naturally, to acquire the read semaphore, in order to disallow any other thread to retrieve the element concurrently. After the element has been retrieved and kept in a local variable (exercise: what would happen if that line, E e = this.e, were removed?), the method releases the write semaphore, so that the method put(E) may be called again by any thread, and returns what has been saved in the local variable.
As an important remark, observe that the reference to the object being passed is kept in a private field, and the methods take() and put(E) are both final. This is of utmost importance, and often missed. If these methods were not final (or worse, the field not private), an inheriting class would be able to alter the behavior of take() and put(E) breaking the contract.
Finally, you could avoid the need to declare a local variable in the take() method by using try {} finally {} as follows:
class Pipe<E> {
    // ...
    public final E take() throws InterruptedException {
        read.acquire(); // outside the try, so an interrupted acquire does not release write
        try {
            return e;
        } finally {
            write.release();
        }
    }
}
Here, the point of this example is just to show a use of try/finally that often goes unnoticed among inexperienced developers. Obviously, in this case, there's no real gain.
Oh damn, I've mostly finished your homework for you. In return, and to test your knowledge of semaphores, why don't you implement some of the other methods defined by the BlockingQueue contract? For example, you could implement an offer(E) method and a poll(long, TimeUnit)!
Good luck.
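As a starting point for that exercise, here is one possible shape for the non-blocking methods, using Semaphore.tryAcquire() so that nothing else from java.util.concurrent is needed. The class name NonBlockingPipe is invented, and this is only a sketch: unlike SynchronousQueue.offer, this offer succeeds whenever the single slot is free, even if no consumer is waiting.

```java
import java.util.concurrent.Semaphore;

// Hypothetical one-slot pipe with non-blocking offer and poll.
class NonBlockingPipe<E> {
    private E e;
    private final Semaphore read = new Semaphore(0);  // permits = elements ready
    private final Semaphore write = new Semaphore(1); // permits = free slots

    // Returns false immediately if the slot is already occupied.
    public final boolean offer(final E e) {
        if (!write.tryAcquire()) {
            return false;
        }
        this.e = e;
        read.release();
        return true;
    }

    // Returns null immediately if no element is available.
    public final E poll() {
        if (!read.tryAcquire()) {
            return null;
        }
        E taken = this.e;
        write.release();
        return taken;
    }
}
```

A timed variant could follow the same pattern with tryAcquire(long, TimeUnit), at the cost of also importing TimeUnit from java.util.concurrent.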
Think it in terms of shared memory with read/write lock.
Create a buffer to put the message.
The access to the buffer should be controlled by using a lock/semaphore.
Use this buffer for inter thread communication purpose.
Regards
PKV
