Piping data between threads with Java

I am writing a multi-threaded application that mimics a movie theater. Each person involved is its own thread, and concurrency must be handled entirely with semaphores. The only issue I am having is how to link threads so that they can communicate (via a pipe, for instance).
For instance:
Customer[1], which is a thread, acquires a semaphore that lets it walk up to the Box Office. Now Customer[1] must tell the Box Office Agent that they want to see movie "X". Then BoxOfficeAgent[1], also a thread, must check that the movie isn't full and either sell a ticket or tell Customer[1] to pick another movie.
How do I pass that data back and forth while still maintaining concurrency with the semaphores?
Also, the only class I can use from java.util.concurrent is the Semaphore class.

One easy way to pass data back and forth between threads is to use the implementations of the interface BlockingQueue<E>, located in the package java.util.concurrent.
This interface has methods to add elements to the collection, each with different behavior:
add(E): adds the element if possible, otherwise throws an exception
boolean offer(E): returns true if the element has been added, false otherwise
boolean offer(E, long, TimeUnit): tries to add the element, waiting up to the specified amount of time
put(E): blocks the calling thread until the element has been added
It also defines methods for element retrieval with similar behaviors:
take(): blocks until there's an element available
poll(long, TimeUnit): retrieves an element, waiting up to the specified time, and returns null if none becomes available
The implementations I use most frequently are: ArrayBlockingQueue, LinkedBlockingQueue and SynchronousQueue.
The first one, ArrayBlockingQueue, has a fixed size, defined by a parameter passed to its constructor.
The second, LinkedBlockingQueue, has unlimited size. It will always accept new elements, that is, offer will return true immediately and add will never throw an exception.
The third, and to me the most interesting one, SynchronousQueue, is exactly a pipe. You can think of it as a queue with size 0. It will never keep an element: this queue will only accept elements if there's some other thread trying to retrieve elements from it. Conversely, a retrieval operation will only return an element if there's another thread trying to push it.
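To make that hand-off behavior concrete, here is a minimal sketch of a SynchronousQueue exchange between two threads (the class name HandOffDemo and the message text are just illustrative; note this uses more of java.util.concurrent than your homework constraint allows, so it is for understanding only):

import java.util.concurrent.SynchronousQueue;

public class HandOffDemo {
    public static void main(String[] args) throws InterruptedException {
        SynchronousQueue<String> pipe = new SynchronousQueue<>();

        Thread producer = new Thread(() -> {
            try {
                pipe.put("ticket request for movie X"); // blocks until someone takes it
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        String message = pipe.take(); // blocks until the producer has handed something over
        System.out.println("Received: " + message);
    }
}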
To fulfill the homework requirement of synchronization done exclusively with semaphores, you could take inspiration from the description of the SynchronousQueue above and write something quite similar:
import java.util.concurrent.Semaphore;

class Pipe<E> {
    private E e;
    private final Semaphore read = new Semaphore(0);
    private final Semaphore write = new Semaphore(1);

    // acquire() can be interrupted, hence the checked exception on both methods
    public final void put(final E e) throws InterruptedException {
        write.acquire();
        this.e = e;
        read.release();
    }

    public final E take() throws InterruptedException {
        read.acquire();
        E e = this.e;
        write.release();
        return e;
    }
}
Notice that this class presents similar behavior to what I described about the SynchronousQueue.
Once the method put(E) is called, it acquires the write semaphore, leaving it with no permits, so that another call to the same method would block at its first line. The method then stores a reference to the object being passed and releases the read semaphore. That release makes it possible for any thread calling the take() method to proceed.
The first step of the take() method is then, naturally, to acquire the read semaphore, so that no other thread can retrieve the element concurrently. After the element has been retrieved and kept in a local variable (exercise: what would happen if the line E e = this.e; were removed?), the method releases the write semaphore, so that put(E) may be called again by any thread, and returns what was saved in the local variable.
As an important remark, observe that the reference to the object being passed is kept in a private field, and the methods take() and put(E) are both final. This is of utmost importance, and often missed. If these methods were not final (or worse, the field not private), an inheriting class would be able to alter the behavior of take() and put(E), breaking the contract.
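Tying it back to your scenario, usage could look roughly like the sketch below (the Customer/BoxOfficeAgent wiring and the request string are made up for illustration; they are not part of the Pipe contract):

class BoxOfficeDemo {
    public static void main(String[] args) {
        Pipe<String> window = new Pipe<>();

        // Customer thread: hands a request to the agent through the pipe.
        Thread customer = new Thread(() -> {
            try {
                window.put("Customer[1] wants movie X");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Box office agent thread: waits for the next request.
        Thread agent = new Thread(() -> {
            try {
                String request = window.take();
                System.out.println("BoxOfficeAgent[1] received: " + request);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        customer.start();
        agent.start();
    }
}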
Finally, you could avoid the need to declare a local variable in the take() method by using try {} finally {} as follows:
class Pipe<E> {
    // ...
    public final E take() throws InterruptedException {
        read.acquire(); // kept outside the try: if acquire() is interrupted, write must not be released
        try {
            return e;
        } finally {
            write.release();
        }
    }
}
Here, the point of this example is just to show a use of try/finally that often goes unnoticed among inexperienced developers. Obviously, in this case, there's no real gain.
Oh damn, I've mostly finished your homework for you. In return -- and for you to test your knowledge about Semaphores -- why don't you implement some of the other methods defined by the BlockingQueue contract? For example, you could implement an offer(E) method and a poll(long, TimeUnit)!
Good luck.

Think of it in terms of shared memory with a read/write lock.
Create a buffer to hold the message.
Access to the buffer should be controlled using a lock/semaphore.
Use this buffer for inter-thread communication (see the sketch below).
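A rough sketch of that idea, assuming a fixed-size buffer guarded by counting semaphores (all class and field names here are illustrative):

import java.util.concurrent.Semaphore;

class SharedBuffer {
    private final Object[] slots;
    private int putIndex = 0;
    private int takeIndex = 0;
    private final Semaphore empty;                     // free slots left
    private final Semaphore full = new Semaphore(0);   // filled slots available
    private final Semaphore mutex = new Semaphore(1);  // protects the indices

    SharedBuffer(int capacity) {
        slots = new Object[capacity];
        empty = new Semaphore(capacity);
    }

    void put(Object item) throws InterruptedException {
        empty.acquire();   // wait for a free slot
        mutex.acquire();
        slots[putIndex] = item;
        putIndex = (putIndex + 1) % slots.length;
        mutex.release();
        full.release();    // signal that one more item is available
    }

    Object take() throws InterruptedException {
        full.acquire();    // wait for an item
        mutex.acquire();
        Object item = slots[takeIndex];
        takeIndex = (takeIndex + 1) % slots.length;
        mutex.release();
        empty.release();   // signal that one more slot is free
        return item;
    }
}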
Regards
PKV

Related

How does a synchronized block in Java work? Is the variable reference or the memory blocked?

I have a situation and I need some advice about synchronized blocks in Java. I have the class Test below:
class Test {
    private A a;
    private Object result; // result of the last call to a.process

    public void doSomething1(String input) {
        synchronized (a) {
            result = a.process(input);
        }
    }

    public void doSomething2(String input) {
        synchronized (a) {
            result = a.process(input);
        }
    }

    public void doSomething3(String input) {
        result = a.process(input);
    }
}
What I want is that when multiple threads call doSomething1() or doSomething2(), object "a" is used and shared among those threads (it has to be) and only processes one input at a time (the other threads wait until object "a" is free), while when doSomething3() is called, the input is processed immediately.
My question is: will doSomething3() be impacted by doSomething1() and doSomething2()? Will it have to wait if doSomething1() or doSomething2() is using object "a"?
A method is never impacted by anything that your threads do. What gets impacted is data, and the answer to your question depends entirely on what data are updated (if any) inside the a.process() call.
You asked "Variable reference or memory is blocked?"
First of all, "variable" and "memory" are the same thing. Variables, and fields and objects are all higher level abstractions that are built on top of the lower-level idea of "memory".
Second of all, No. Locking an object does not prevent other threads from accessing or modifying the object or, from accessing or modifying anything else.
Locking an object does two things: It prevents other threads from locking the same object at the same time, and it makes certain guarantees about the visibility of memory updates. The simple explanation is, if thread X updates some variables and then releases a lock, thread Y will be guaranteed to see the updates only after it has acquired the same lock.
What that means for your example is: if thread X calls doSomething1() and modifies the object a, and then thread Y later calls doSomething3(), thread Y is not guaranteed to see the updates. It might see the a object in its original state, it might see it in the fully updated state, or it might see it in some invalid half-way state. The reason is that, even though thread X locked the object, modified it, and then released the lock, thread Y never locked the same object.
In your code, doSomething3() can proceed in parallel with doSomething1() or doSomething2(), so in that sense it does what you want. However, depending on exactly what a.process() does, this may cause a race condition and corrupt data. Note that even if doSomething3() is called, any calls to doSomething1() or doSomething2() that have started will continue; they won't be put in abeyance while doSomething3() is processed.
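If doSomething3() also needs to see those updates, one option is to have it synchronize on the same object. Here is a minimal sketch of that change (note it gives up the "process immediately" behavior asked for in the question):

public void doSomething3(String input) {
    synchronized (a) {             // same lock as doSomething1/doSomething2,
        result = a.process(input); // so updates to "a" are guaranteed to be visible
    }
}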

what thread safety issues do java.util.Stack or java.util.Queue have

If I ignore the size() inaccuracy, and assume I allocated a large enough underlying Vector so that no reallocation happens, what thread safety issues do java.util.Stack or java.util.Queue have?
I cannot think of a valid/reasonable consistency argument to say they are thread unsafe.
Does anybody have some insights?
"Thread safe" isn't an absolute attribute for a class -- what's safe or unsafe is your usage of the object. You can come up with unsafe ways to use a ConcurrentHashMap, and you can come up with thread-safe ways to use a plain HashMap.
When people say a class is thread-safe, they generally mean that each method is implemented in a way that's thread-safe on its own. In that sense, a Stack is thread-safe. But its interface doesn't allow for easy/safe handling of common use cases, so in that sense it's not very thread-safe.
For instance, if your code checks that the Stack is not empty and, if so, pops an element -- that's unsafe, because it could be that it had one element (and thus was not empty), but someone else popped it before you got a chance to (in which case you're trying to pop an empty stack, and will get an exception).
To be more thread-safe, you really need a single method that handles that case for you. A BlockingQueue gives you that. For instance, take() will block until there's a value to pop, while poll() will immediately return a value, or null if there's no element to pop.
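As an illustration, here is a sketch of the unsafe check-then-act pattern next to the single-call alternative (the helper class and method names are made up):

import java.util.Stack;
import java.util.concurrent.BlockingQueue;

class CheckThenActDemo {
    // Unsafe: another thread may pop between the isEmpty() check and pop(),
    // so pop() can throw EmptyStackException even though the check passed.
    static Object unsafePop(Stack<Object> stack) {
        if (!stack.isEmpty()) {
            return stack.pop();
        }
        return null;
    }

    // Safer: the check and the retrieval happen in a single atomic call.
    static Object safePoll(BlockingQueue<Object> queue) {
        return queue.poll(); // returns null immediately if the queue is empty
    }
}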
Stack, which extends Vector, has every method synchronized. This means that interactions with individual methods are thread-safe.
Queue is an interface. The safety of use across threads is up to the individual implementations. For example, an ArrayBlockingQueue is thread safe, but a LinkedList is not.
Look at this method from ArrayBlockingQueue (leave any existing synchronisation aside):
private void insert(E x) {
    items[putIndex] = x;
    // HERE
    putIndex = inc(putIndex);
    ++count;
    notEmpty.signal();
}
Let thread A progress until HERE, and let thread B take over and execute the whole method; then let A continue. It is easy to see that B's x overwrites A's x, while count is incremented by 2 and putIndex is advanced twice, even though only one element was actually stored.
Similar HEREs can be found in other methods as well.
All data structures with memory for data and variables for bookkeeping are blatantly vulnerable to unsynced concurrent access.

Hashtable: why is get method synchronized?

I know a Hashtable is synchronized, but why is its get() method synchronized?
Isn't it only a read method?
If the read was not synchronized, then the Hashtable could be modified during the execution of read. New elements could be added, the underlying array could become too small and could be replaced by a bigger one, etc. Without sequential execution, it is difficult to deal with these situations.
However, even if get did not crash when the Hashtable is modified by another thread, there is another important aspect of the synchronized keyword, namely cache synchronization. Let's use a simplified example:
class Flag {
    private boolean value;

    boolean get() { return value; }                               // WARNING: not synchronized
    synchronized void set(boolean value) { this.value = value; }
}
set is synchronized, but get isn't. What happens if two threads A and B read and write this flag at the same time?
1. A calls get
2. B calls set
3. A calls get
Is it guaranteed that at step 3 A sees the modification made by thread B?
No, it isn't, as A could be running on a different core that uses a separate cache, where the old value is still present. So we have to force B to publish its write to memory, and force A to fetch the new data.
How can we enforce that? Every time a thread enters or leaves a synchronized block, an implicit memory barrier is executed. A memory barrier forces the cache to be brought up to date. However, both the writer and the reader have to execute a memory barrier; otherwise the information is not properly communicated.
In our example, thread B already uses the synchronized method set, so its modification is published at the end of that method. However, A still does not see the modified data. The solution is to make get synchronized as well, so that it is forced to read the up-to-date value.
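A sketch of the corrected class, with both accessors synchronized on the same lock:

class Flag {
    private boolean value;

    // Both methods lock on "this", so a value written by set() is guaranteed
    // to be visible to a later get() from another thread.
    synchronized boolean get() { return value; }
    synchronized void set(boolean value) { this.value = value; }
}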
Have a look at the Hashtable source code and you can think of lots of race conditions that can cause problems in an unsynchronized get().
(I am reading the JDK6 source code.)
For example, rehash() creates an empty array, assigns it to the instance variable table, and then puts the entries from the old table into the new one. Therefore, if your get occurs after the empty-array assignment but before the entries have actually been put in, you cannot find your key even though it is in the table.
Another example: there is a loop that iterates through the linked list at the table index. If a rehash happens in the middle of your iteration, you may also fail to find the entry even though it exists in the hashtable.
Hashtable is synchronized, meaning the whole class is intended to be thread-safe.
Inside Hashtable, not only the get() method is synchronized but many other methods are as well. In particular, the put() method is synchronized, like Tom said.
A read method must be synchronized just like a write method, because that is what ensures the visibility and the consistency of the variable.

Implementing a blocking queue in JavaME: how to optimize it?

I'm trying to implement a simple blocking queue in Java ME. In the Java ME API, the concurrency utilities of Java SE are not available, so I have to use wait/notify like in the old times.
This is my provisional implementation. I'm using notify instead of notifyAll because in my project there are multiple producers but only a single consumer. I used a dedicated object for wait/notify on purpose, to improve readability, even though it wastes a reference:
import java.util.Vector;

public class BlockingQueue {

    private Vector queue = new Vector();
    private Object queueLock = new Object();

    public void put(Object o) {
        synchronized (queueLock) {
            queue.addElement(o);
            queueLock.notify();
        }
    }

    public Object take() {
        Object ret = null;
        synchronized (queueLock) {
            while (queue.isEmpty()) {
                try {
                    queueLock.wait();
                } catch (InterruptedException e) {}
            }
            ret = queue.elementAt(0);
            queue.removeElementAt(0);
        }
        return ret;
    }
}
My main question is about the put method. Could I put the queue.addElement line out of the synchronized block? Will performance improve if so?
Also, the same applies to take: could I take the two operations on queue out of the synchronized block?
Any other possible optimization?
EDIT:
As #Raam correctly pointed out, the consumer thread can starve when being awakened in wait. So what are the alternatives to prevent this? (Note: In JavaME I don't have all these nice classes from Java SE. Think of it as the old Java v1.2)
Vector's individual methods are synchronized, but compound operations (such as checking isEmpty() and then removing an element) are not atomic, so you do need your own synchronization around them, like you have done. Unless you have evidence that your current solution has performance problems, I wouldn't worry about it.
On a side note, I see no harm in using notifyAll rather than notify to support multiple consumers.
synchronized is used to protect access to shared state and ensure atomicity.
Note that methods of Vector are already synchronized, therefore Vector protects it own shared state itself. So, your synchronization blocks are only needed to ensure atomicity of your operations.
You certainly cannot move the operations on queue out of the synchronized block in your take() method, because atomicity is crucial for the correctness of that method. But, as far as I understand, you can move the queue operation out of the synchronized block in the put() method (I cannot imagine a situation where it would go wrong).
However, the reasoning above is purely theoretical, because in all cases you have double synchronization: you synchronize on queueLock, and the methods of Vector implicitly synchronize on queue. Therefore the proposed optimization gains little, and its correctness relies on that double synchronization being present.
To avoid double synchronization you need to synchronize on queue as well:
synchronized (queue) { ... }
Another option would be to use a non-synchronized collection (such as ArrayList) instead of Vector, but JavaME doesn't support it. In that case you wouldn't be able to use the proposed optimization either, because the synchronized blocks would also have to protect the shared state of the non-synchronized collection.
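For illustration, here is a sketch of the same class using the Vector itself as the monitor, so the separate queueLock and the double synchronization disappear (this version also propagates InterruptedException instead of swallowing it, which is a separate design choice):

import java.util.Vector;

public class BlockingQueue {

    private final Vector queue = new Vector();

    public void put(Object o) {
        synchronized (queue) {
            queue.addElement(o);
            queue.notify();
        }
    }

    public Object take() throws InterruptedException {
        synchronized (queue) {
            while (queue.isEmpty()) {
                queue.wait();
            }
            Object ret = queue.elementAt(0);
            queue.removeElementAt(0);
            return ret;
        }
    }
}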
Unless you have performance issues specifically due to garbage collection, I would rather use a linked list than a Vector to implement a queue (first in, first out).
I would also write code that could be reused when your project (or another) gets multiple consumers. Although in that case, you need to be aware that the Java language specification does not impose a particular way to implement monitors. In practice, that means you don't control which consumer thread gets notified (half of the existing Java Virtual Machines implement monitors using a FIFO model and the other half using a LIFO model).
I also think that whoever uses the blocking class is supposed to deal with the InterruptedException. After all, the client code would otherwise have to deal with a null Object return.
So, something like this:
/*package*/ class LinkedObject {

    private Object iCurrentObject = null;
    private LinkedObject iNextLinkedObject = null;

    LinkedObject(Object aNewObject, LinkedObject aNextLinkedObject) {
        iCurrentObject = aNewObject;
        iNextLinkedObject = aNextLinkedObject;
    }

    Object getCurrentObject() {
        return iCurrentObject;
    }

    LinkedObject getNextLinkedObject() {
        return iNextLinkedObject;
    }
}

public class BlockingQueue {

    private LinkedObject iLinkedListContainer = null;
    private Object iQueueLock = new Object();
    private int iBlockedThreadCount = 0;

    public void appendObject(Object aNewObject) {
        synchronized (iQueueLock) {
            iLinkedListContainer = new LinkedObject(aNewObject, iLinkedListContainer);
            if (iBlockedThreadCount > 0) {
                iQueueLock.notify(); // one at a time because we only appended one object
            }
        } // synchronized(iQueueLock)
    }

    public Object getFirstObject() throws InterruptedException {
        Object result = null;
        synchronized (iQueueLock) {
            if (null == iLinkedListContainer) {
                ++iBlockedThreadCount;
                try {
                    iQueueLock.wait();
                    --iBlockedThreadCount; // instead of having a "finally" statement
                } catch (InterruptedException iex) {
                    --iBlockedThreadCount;
                    throw iex;
                }
            }
            result = iLinkedListContainer.getCurrentObject();
            iLinkedListContainer = iLinkedListContainer.getNextLinkedObject();
            if ((iBlockedThreadCount > 0) && (null != iLinkedListContainer)) {
                iQueueLock.notify();
            }
        } // synchronized(iQueueLock)
        return result;
    }
}
I think that if you try to put less code in the synchronized blocks, the class will not be correct anymore.
There seem to be some issues with this approach. You can have scenarios where the consumer can miss notifications and wait on the queue even when there are elements in the queue.
Consider the following sequence in chronological order
T1 - Consumer acquires the queueLock and then calls wait. Wait will release the lock and cause the thread to wait for a notification
T2 - One producer acquires the queueLock and adds an element to the queue and calls notify
T3 - The Consumer thread is notified and attempts to acquire queueLock BUT fails as another producer comes at the same time. (from the notify java doc - The awakened thread will compete in the usual manner with any other threads that might be actively competing to synchronize on this object; for example, the awakened thread enjoys no reliable privilege or disadvantage in being the next thread to lock this object.)
T4 - The second producer now adds another element and calls notify. This notify is lost as the consumer is waiting on queueLock.
So theoretically it's possible for the consumer to starve (forever stuck trying to get the queueLock), and you can also run into a memory issue with multiple producers adding elements to the queue that are never read and removed from it.
Some changes that I would suggest are as follows -
Keep an upper bound on the number of items that can be added to the queue.
Ensure that the consumer always re-checks the queue and reads the available elements after waking up. A sketch of how the producer-consumer problem can be coded with these changes follows.
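A rough sketch of both suggestions applied to the original class: a bounded capacity so producers block when the queue is full, while loops that re-check the condition after every wakeup, and notifyAll so no waiter is missed (the capacity is an arbitrary constructor parameter):

import java.util.Vector;

public class BoundedBlockingQueue {

    private final Vector queue = new Vector();
    private final int capacity;

    public BoundedBlockingQueue(int capacity) {
        this.capacity = capacity;
    }

    public void put(Object o) throws InterruptedException {
        synchronized (queue) {
            while (queue.size() >= capacity) { // upper bound: producers wait when full
                queue.wait();
            }
            queue.addElement(o);
            queue.notifyAll(); // wake any waiting consumers (and producers)
        }
    }

    public Object take() throws InterruptedException {
        synchronized (queue) {
            while (queue.isEmpty()) { // re-check the condition after every wakeup
                queue.wait();
            }
            Object ret = queue.elementAt(0);
            queue.removeElementAt(0);
            queue.notifyAll(); // a slot was freed: wake any waiting producers
            return ret;
        }
    }
}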

synchronization ideas for a method

I have a multithreaded class A which accesses the following insert() method of another class B (A has only a single instance of B).
Instead of making the entire method synchronized, are there any better ways to synchronize the following method? (to reduce the synchronization overhead)
private void insert(byte[] shardKey, byte[] queueKey,
                    byte[] value, PipelineMessageType msgType) {
    PipelineMessage pipelineMessage = new PipelineMessage(queueKey, value, msgType);
    LinkedBlockingQueue<PipelineMessage> queue;
    JedisShardInfo shardInfo = shardedJedis.getShardInfo(shardKey); // shardedJedis is an instance variable of this class
    String mapKey = shardInfo.getHost() + shardInfo.getPort();
    queue = shardQueue.get(mapKey); // shardQueue is an instance variable of this class
    boolean insertSuccessful = queue.offer(pipelineMessage);
    if (!insertSuccessful) {
        // perform the pipeline sync - flush the queue
        // use another thread for this
        // (processing of queue entries is given to another thread here)
        // queue would be empty now. Insert (k,v)
        queue.offer(pipelineMessage);
    }
}
I tried to synchronize only the fragment which accesses the instance variables but there might be a scenario where 2 threads try to insert into a full queue and enter the if block. Then 2 threads might process the queue entries which I don't want to happen.
Any suggestions are appreciated. Thank you in advance.
It seems to me that if JedisShardInfo is a read-only item, then you shouldn't need to protect/synchronize access to it. So you could synchronize only from the line
queue = ...
onward. Otherwise, almost everything should be synchronized, except the first statement (the creation of the pipeline message), and then I really wonder whether it changes much compared to declaring the whole method synchronized.
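A minimal sketch of that partial synchronization, reusing the fields from the question (shardedJedis, shardQueue) and a hypothetical dedicated lock object; it narrows the synchronized region but does not by itself solve the double-flush concern raised in the question:

private final Object queueAccessLock = new Object(); // hypothetical dedicated lock

private void insert(byte[] shardKey, byte[] queueKey,
                    byte[] value, PipelineMessageType msgType) {
    // Creating the message and reading the (read-only) shard info needs no lock.
    PipelineMessage pipelineMessage = new PipelineMessage(queueKey, value, msgType);
    JedisShardInfo shardInfo = shardedJedis.getShardInfo(shardKey);
    String mapKey = shardInfo.getHost() + shardInfo.getPort();

    synchronized (queueAccessLock) {
        LinkedBlockingQueue<PipelineMessage> queue = shardQueue.get(mapKey);
        if (!queue.offer(pipelineMessage)) {
            // flush the queue (hand the entries to another thread), then retry
            queue.offer(pipelineMessage);
        }
    }
}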
Also, if you have other points of synchronization, I mean other methods or code blocks that are synchronized on this, you should consider splitting them up and synchronizing on different data members of this, depending on which data members you wish to protect from multi-threading:
Object lockerA = new Object() {};

synchronized (lockerA) {
} // sync
Well, not much to say. :)
Regards,
Stéphane
The key to correct synchronization is to follow this pattern:
synchronized (lockForState) { // all code that alters the state must also synchronize on the same lock
    while (!stateOkToProceed()) {
        try {
            lockForState.wait();
        } catch (InterruptedException e) {
            // handle the case where your thread was interrupted deliberately as a signal to exit,
            // or spuriously (in which case do nothing)
        }
    }
    updateState();
    lockForState.notifyAll();
}
The java.util.concurrent package offers many thread-safe implementations of the classes needed to solve common threading problems. Consider using a BlockingQueue.
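For instance, a minimal sketch of handing flush work to a single dedicated thread through a BlockingQueue (all names and the capacity are illustrative, not from the question):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class FlushWorker {
    // Work items for the flushing thread; the capacity of 100 is arbitrary.
    private final BlockingQueue<Runnable> flushTasks = new LinkedBlockingQueue<>(100);

    void start() {
        Thread flusher = new Thread(() -> {
            try {
                while (true) {
                    flushTasks.take().run(); // blocks until a flush task is submitted
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        flusher.setDaemon(true);
        flusher.start();
    }

    // Called by producer threads: hand over a flush task, refusing if the backlog is full.
    boolean submitFlush(Runnable task) {
        return flushTasks.offer(task);
    }
}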
