Is there really a race condition in this multi-threaded java code? - java

I saw a snippet of code in this question which I could not understand (most probably because I am a beginner in this area). The question talks about "an obvious race condition where sometimes the producer will finish, signal it, and the ConsumerWorkers will stop BEFORE consuming everything in the queue."
In my understanding, "isRunning" will be set to false on the consumers only after the producer decides not to add any more items to the queue. So, if a consumer thread sees isRunning as FALSE AND then sees that inputQueue is empty, there is NO possibility of anything more being added to the queue in the future.
Obviously, I am wrong and missing something, as no one who responded to that question said the scenario in the question is impossible. So, can someone please explain what sequence of events causes this race condition?
In fact, I see a problem with something else. For example, if multiple consumer threads saw that the producer isRunning, and say the queue had ONE item, many threads could enter the blocking 'take'. If the producer STOPS now, one thread would come out of the 'take', while the other threads would stay blocked on the 'take' forever. Interestingly, no one who answered the question pointed out this problem either. So, my understanding of this is also probably faulty?!
I didn't want to add this as a comment on that question, as it is an old question and my doubt might never get answered!
I am copy/pasting the code from that question here for quick reference.
public class ConsumerWorker implements Runnable{
private BlockingQueue<Produced> inputQueue;
private volatile boolean isRunning = true;
public ConsumerWorker(BlockingQueue<Produced> inputQueue) {
this.inputQueue = inputQueue;
}
@Override
public void run() {
//worker loop keeps taking an element from the queue as long as the producer is still running or as
//long as the queue is not empty:
while(isRunning || !inputQueue.isEmpty()) {
System.out.println("Consumer "+Thread.currentThread().getName()+" START");
try {
Object queueElement = inputQueue.take();
//process queueElement
} catch (Exception e) {
e.printStackTrace();
}
}
}
//this is used to signal from the main thread that the producer has finished adding stuff to the queue
public void setRunning(boolean isRunning) {
this.isRunning = isRunning;
}
}

I think OP of the original question probably meant
while(isRunning && !inputQueue.isEmpty())
rather than
while(isRunning || !inputQueue.isEmpty())
The former clearly produces the issue described by the original poster (*), while the latter does indeed have the problem you described in your second point. It was a simple oversight there, but note that both approaches are incorrect.
(*) and somehow assumes that the queue will never be empty.

You are correct on both points. Yes, && is correct and || is not. As for the second problem, the answers there suggested using a poison pill or a timeout; either resolves it.
As for me, I would create a new synchronization class that aggregates both the queue and the isRunning variable, so that changing isRunning causes an exception in take(), thus signalling the end of work.
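The timeout approach mentioned above might look like the following minimal sketch. It is not the code from the original question: TimedConsumerWorker, the String element type, the 100 ms timeout and the consumeAll helper are all assumptions made for illustration. The idea is that poll() with a timeout returns null instead of blocking forever, so every consumer gets to re-check the loop condition.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variant of ConsumerWorker that uses poll() with a timeout instead of take().
class TimedConsumerWorker implements Runnable {
    private final BlockingQueue<String> inputQueue;
    private final AtomicInteger processed; // counts consumed items, for the demo only
    private volatile boolean isRunning = true;

    TimedConsumerWorker(BlockingQueue<String> inputQueue, AtomicInteger processed) {
        this.inputQueue = inputQueue;
        this.processed = processed;
    }

    @Override
    public void run() {
        try {
            while (isRunning || !inputQueue.isEmpty()) {
                // Returns null if nothing arrives within 100 ms (an arbitrary choice).
                String element = inputQueue.poll(100, TimeUnit.MILLISECONDS);
                if (element != null) {
                    processed.incrementAndGet(); // stand-in for real processing
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt status and exit
        }
    }

    void setRunning(boolean isRunning) {
        this.isRunning = isRunning;
    }

    // Demo helper: starts the consumers, produces itemCount items, signals the end
    // of production, waits for the consumers and returns how many items they consumed.
    static int consumeAll(int itemCount, int consumerCount) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicInteger processed = new AtomicInteger();
        TimedConsumerWorker[] workers = new TimedConsumerWorker[consumerCount];
        Thread[] threads = new Thread[consumerCount];
        for (int i = 0; i < consumerCount; i++) {
            workers[i] = new TimedConsumerWorker(queue, processed);
            threads[i] = new Thread(workers[i]);
            threads[i].start();
        }
        for (int i = 0; i < itemCount; i++) {
            queue.add("item-" + i);
        }
        for (TimedConsumerWorker worker : workers) {
            worker.setRunning(false); // the producer has finished
        }
        for (Thread thread : threads) {
            thread.join();
        }
        return processed.get();
    }
}
```

With this loop no consumer can stay blocked forever: a consumer that finds the queue empty wakes up after the timeout, sees isRunning == false and an empty queue, and leaves the loop.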

Related

Is this a proper customized synchronizer?

I had a strong need for a synchronizer similar to a CountDownLatch, but the starting number for the countdown is unknown. To add context, if I'm going through a buffered recordset (say from a text file or a query) and kicking off a runnable for each record, but I don't know how many records there will be... I need a synchronizer that signals when the iteration is complete and all runnables are complete.
This is the synchronizer I came up with... a BufferedLatch. A method is called in the iteration loop for each record incrementing the recordSetSize. At the end of each runnable kicked off for each record, the processedRecordSetSize is incremented. When the iteration through all records is complete (but runnables may still be in queue), the setDownloadComplete() method is called letting the BufferedLatch know the recordSetSize is now fixed. The await() method waits for the iterationComplete variable to be true (recordsetSize is now fixed) and recordsetSize == processedRecordSetSize;
Is this an optimal implementation of this synchronizer? Is the synchronization holding back concurrency that could otherwise be exploited? Although it seems to work fine in testing, are there any gotchas I'm overlooking?
import java.util.concurrent.atomic.AtomicInteger;
public final class BufferedLatch {
/** A customized synchronizer built for concurrent iteration processes where the number of objects to be iterated is unknown
* and a runnable will be kicked off for each object, and the await() method will wait for all runnables to be complete
*/
private final AtomicInteger recordsetSize = new AtomicInteger(0);
private final AtomicInteger processedRecordsetSize = new AtomicInteger(0);
private volatile boolean iterationComplete = false;
public int incrementRecordsetSize() throws Exception {
if (iterationComplete) {
throw new Exception("Cannot increase recordsize after download is flagged complete!");
}
else {
return recordsetSize.incrementAndGet();
}
}
public void incrementProcessedRecordSize() {
synchronized(this) {
processedRecordsetSize.incrementAndGet();
if (iterationComplete) {
if (processedRecordsetSize.get() == recordsetSize.get()) {
this.notifyAll();
}
}
}
}
public void setDownloadComplete() {
synchronized(this) {
iterationComplete = true;
}
}
public void await() throws InterruptedException {
while (! (iterationComplete && (processedRecordsetSize.get() == recordsetSize.get()))) {
synchronized(this) {
while (! (iterationComplete && (processedRecordsetSize.get() == recordsetSize.get()))) {
this.wait();
}
}
}
}
}
UPDATE-- NEW CODE
public final class BufferedLatch {
/** A customized synchronizer built for concurrent iteration processes where the number of objects to be iterated is unknown
* and a runnable will be kicked off for each object, and the await() method will wait for all runnables to be complete
*/
private int recordCount = 0;
private int processedRecordCount = 0;
private boolean iterationComplete = false;
public synchronized void incrementRecordCount() throws Exception {
if (iterationComplete) {
throw new Exception("Cannot increase recordCount after download is flagged complete!");
}
else {
recordCount++;
}
}
public synchronized void incrementProcessedRecordCount() {
processedRecordCount++;
if (iterationComplete && recordCount == processedRecordCount) {
this.notifyAll();
}
}
public synchronized void setIterationComplete() {
iterationComplete = true;
if (iterationComplete && recordCount == processedRecordCount) {
this.notifyAll();
}
}
public synchronized void await() throws InterruptedException {
while (! (iterationComplete && (recordCount == processedRecordCount))) {
this.wait();
}
}
}
Probably not. I think conceptually you're onto something here, as it looks like your application needs something more than just a CountDownLatch. However, the implementation seems to have several problems.
First, I note that it looks odd to mix atomics/volatiles AND ordinary object monitor locks (synchronized). While there may be proper uses that mix these different constructs, mixing in this case I believe will lead to errors.
Consider incrementRecordsetSize() which first checks iterationComplete and only if it's false does it increment recordsetSize. The iterationComplete variable is volatile so updates from other threads will be visible. However, the fact that no locking is done here allows TOCTOU race conditions (time-of-check vs time-of-use). The rule seems to be, recordsetSize must not be incremented if iterationComplete is true. Suppose thread T1 comes along and finds iterationComplete to be false, so it decides to increment recordsetSize. Before it does so, another thread T2 comes along and sets iterationComplete to be true. This would allow T1 to do the increment improperly. Worse, before it does so, suppose another thread T3 came along and called incrementProcessedRecordSize(). It would increment processedRecordsetSize and then find iterationComplete true. It further might find that processedRecordsetSize equals recordsetSize and then notify all waiters, who then proceed as if the processing is complete. But it's not, as T1 then proceeds to increment recordsetSize and presumably continues with its processing.
The problem here is that this object's state consists of the fusion of three independent pieces of state -- two int counters and a boolean -- and all three must be read and written atomically. If certain bits of logic attempt to take advantage of individual volatile or atomic properties, it introduces the possibility of race conditions such as the one I described.
I'd suggest rewriting this as a plain object with two plain ints and a boolean (not atomic, not volatile) and just lock around everything. This should certainly clear up the logic and make things easier to understand.
In incrementProcessedRecordSize I note that the condition essentially duplicates the condition in the await method. A simplifying convention is for all updates to notify and have the condition evaluated only by the waiters. This may result in some unnecessary wakeups. If this is a problem, you might consider minimizing the number of notifies, but you need to think about maintainability. If you're not careful, the wait/notify conditions will become spread across the code and will be very hard to reason about. Alternatively, you could refactor the condition into a method and call it from the different places that do waiting and notification.
It looks like await() does a complicated form of double-checked locking. Instead of testing a volatile boolean outside the lock, it tests several separate pieces of information both outside and inside the lock. This seems susceptible to TOCTOU problems (as above) but it might be safe if you can prove the state really latches, that is, that once it becomes true it never returns to false. I'd have to stare at the code for a long time before I'd be able to convince myself it's correct.
On the other hand, what does this buy you? It seems to optimize away just the taking of the lock. If you have a zillion threads that are going to come by after processing is complete, it might be worth it, but it doesn't seem like it. I'd just remove the outer while loop and check the variables within a synchronized block.
Finally, having an object that represents counters and a boolean may very well be sensible for what you're doing, but other things you've said (in the question and in comments) are that some threads are generating a workload (e.g. reading lines from a file) and other threads are retiring that workload. This implies that there is some other data structure like a queue that contains this workload, and you have a producer-consumer problem here. That other structure has to be thread-safe, of course, since multiple threads are interacting over it. But the counters and boolean in this structure need to be updated in lockstep with the updates to the workload structure, otherwise there could be race conditions between checking and updating these separate objects.
It seems to me you could replace the counters in this object with the queue and just put simple locks around everything. The producers would append to the queue until they're done, at which time they set iterationComplete to true which prevents more work from being added. The consumers pull from the queue until iterationComplete is true and the queue is empty, at which point they're done. If they find the queue empty but iterationComplete is false, they know to block while awaiting further work.
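The queue-based design described above could be sketched roughly as follows. This is only an illustration under assumptions: the WorkQueue name and its method names are made up here, and returning null from take() to mean "no more work" is one possible convention among several.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch: the work queue and the completion flag are guarded by one lock.
class WorkQueue<T> {
    private final Queue<T> queue = new ArrayDeque<T>();
    private boolean iterationComplete = false;

    // Called by producers; rejected once iteration has been flagged complete.
    public synchronized void put(T item) {
        if (iterationComplete) {
            throw new IllegalStateException("Cannot add work after iteration is complete");
        }
        queue.add(item);
        notifyAll(); // wake consumers waiting for work
    }

    // Called by the producer when there is no more work to add.
    public synchronized void setIterationComplete() {
        iterationComplete = true;
        notifyAll(); // wake consumers so they can observe the end of work
    }

    // Blocks while the queue is empty and more work may still arrive.
    // Returns null once the queue is empty and iteration is complete.
    public synchronized T take() throws InterruptedException {
        while (queue.isEmpty() && !iterationComplete) {
            wait();
        }
        return queue.poll(); // null only when the queue is empty and iteration is complete
    }
}
```

Because the check of iterationComplete and the change to the queue happen under the same lock, the check-then-act race described earlier cannot occur. A consumer would loop on take() until it returns null.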
I'd say to stick with simple locking and avoid volatiles/atomics until you get the basics correct. If there are bottlenecks in that code, then apply optimizations selectively while preserving the same invariants.

implement-your-own blocking queue in java

I know this question has been asked and answered many times before, but I just couldn't figure out a trick on the examples found around internet, like this or that one.
Both of these solutions check for emptiness of the blocking queue's array/queue/linkedlist to notifyAll waiting threads in put() method and vice versa in get() methods. A comment in the second link emphasizes this situation and mentions that that's not necessary.
So the question is: why check whether the queue is empty or full before notifying all waiting threads? That also seems a bit odd to me. Any ideas?
Thanks in advance.
I know this is an old question by now, but after reading the question and answers I couldn't help myself. I hope you find this useful.
Regarding checking whether the queue is actually full or empty before notifying other waiting threads: put (T t) and T get() are both synchronized methods on the same object, so they share a single monitor and only one thread can be executing inside either of them at a time. The one exception is a thread blocked in wait(), which releases the monitor until it is woken up. So if thread-a has entered put (T t), thread-b cannot start executing T get() until thread-a has exited put (T t) or is waiting inside it. The extra check is therefore not needed for safety; it only skips the notification when no thread can be waiting, although it may make the developer feel a little safer.
A better and a more recommended approach is to use Reentrant Locks and Conditions:
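//I've edited the source code from this link
import java.util.LinkedList;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
public class BQueue<T> {
    // The backing queue, limit field and the isFull()/isEmpty() helpers were not part
    // of the snippet as posted; they are filled in here as assumptions.
    private final Queue<T> q = new LinkedList<T>();
    private final int limit;
    private final Lock lock;
    private final Condition isFullCondition;  // producers wait here while the queue is full
    private final Condition isEmptyCondition; // consumers wait here while the queue is empty
    public BQueue() {
        this(Integer.MAX_VALUE);
    }
    public BQueue(int limit) {
        this.limit = limit;
        lock = new ReentrantLock();
        isFullCondition = lock.newCondition();
        isEmptyCondition = lock.newCondition();
    }
    private boolean isFull() {
        return q.size() >= limit;
    }
    private boolean isEmpty() {
        return q.isEmpty();
    }
    // Propagates InterruptedException instead of silently swallowing it.
    public void put (T t) throws InterruptedException {
        lock.lock();
        try {
            while (isFull()) {
                isFullCondition.await();
            }
            q.add(t);
            isEmptyCondition.signalAll(); // only consumers are woken up
        } finally {
            lock.unlock();
        }
    }
    public T get() throws InterruptedException {
        lock.lock();
        try {
            while (isEmpty()) {
                isEmptyCondition.await();
            }
            T t = q.poll();
            isFullCondition.signalAll(); // only producers are woken up
            return t;
        } finally {
            lock.unlock();
        }
    }
}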
With this approach there is no need for the extra check. As with synchronized methods, the lock object is shared between the two methods, so only one thread can be executing inside either of them at a time. The difference is the two separate conditions: an object monitor has a single wait set, so notifyAll() wakes producers and consumers alike, whereas here only the threads waiting because the queue is full are signalled when space becomes available, and only the threads waiting because the queue is empty are signalled when an element arrives. This leads to better CPU utilization.
You can find a more detailed example with source code here.
I think logically there is no harm doing that extra check before notifyAll().
You can simply notifyAll() once you put/get something from the queue. Everything will still work, and your code is shorter. However, there is also no harm checking if anyone is potentially waiting (by checking if hitting the boundary of queue) before you invoke notifyAll(). This extra piece of logic saves unnecessary notifyAll() invocations.
It just depends on whether you want shorter, cleaner code or code that runs more efficiently. (I haven't looked into the implementation of notifyAll(). If it is a cheap operation when no one is waiting, the performance gain from the extra check may not be noticeable anyway.)
The reason why the authors used notifyAll() is simple: they had no clue whether or not it was necessary, so they decided for the "safer" option.
In the above example it would be sufficient to just call notify() as for each single element added, only a single thread waiting can be served under all circumstances.
This becomes more obvious, if your queue as well has the option to add multiple elements in one step like addAll(Collection<T> list), as in this case more than one thread waiting on an empty list could be served, to be exact: as many threads as elements have been added.
The notifyAll() however causes an extra overhead in the special single-element case, as many threads are woken up unnecessarily and therefore have to be put to sleep again, blocking queue access in the meantime. So replacing notifyAll() with notify() would improve speed in this special case.
But then, dropping wait/notify and synchronized altogether and using the java.util.concurrent package instead would increase speed far more than any clever wait/notify implementation ever could.
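For comparison, here is a minimal sketch of what using the concurrent package could look like, with java.util.concurrent.ArrayBlockingQueue as the bounded queue. The BoundedQueueDemo class and its transfer method are made up for this illustration; only put() and take() are the library's own blocking operations.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedQueueDemo {
    // Moves the integers 1..count from a producer thread to the calling thread
    // through a bounded queue of the given capacity and returns their sum.
    static long transfer(int count, int capacity) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) {
                    queue.put(i); // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += queue.take(); // blocks while the queue is empty
        }
        producer.join();
        return sum;
    }
}
```

All waiting and signalling happens inside the library class, so the calling code contains no wait(), notify() or synchronized at all.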
I would like to write a simple blocking queue implementation that should be easy to understand for someone who is new to this.
import java.util.LinkedList;
import java.util.List;
class BlockingQueue<T> {
    private final List<T> queue = new LinkedList<T>();
    private final int limit;
    public BlockingQueue(int limit){
        this.limit = limit;
    }
    public synchronized void enqueue(T ele) throws InterruptedException {
        // wait while the queue is full
        while(queue.size() == limit)
            wait();
        // the queue is about to become non-empty: wake any consumers waiting in dequeue()
        if(queue.size() == 0)
            notifyAll();
        // add
        queue.add(ele);
    }
    public synchronized T dequeue() throws InterruptedException {
        // wait while the queue is empty
        while (queue.size() == 0)
            wait();
        // the queue is about to have free space: wake any producers waiting in enqueue()
        if(queue.size() == limit)
            notifyAll();
        return queue.remove(0);
    }
}

How to prevent tiny race condition in a consumer

In my applications, two things happen:
Various threads produce jobs.
There is a function (but not 1 constantly running thread) that consumes the jobs. This function is started by the producers, but is locked so that it only runs once.
For example, a job is produced:
addJobToDatabase(...);
triggerPass();
And this is how the consumer function is started:
public void triggerPass() {
// prevent running more than once
if (onceLock.tryLock()) { // onceLock is a ReentrantLock
try {
while (haveJobs()) {
doJobs();
}
} finally {
onceLock.unlock();
}
} else {
log.info("Pass triggered, but already running");
}
}
Now, there is a tiny race condition possible here. If
Thread A has left the while but not yet done onceLock.unlock()
Thread B does onceLock.tryLock() which returns false
...thread B's job is not executed until a later call to triggerPass();
While I doubt it will get me into trouble in practice, can this little gap be closed for correctness?
I think I've worked around it by replacing tryLock() with tryLock(1, TimeUnit.SECONDS). It's not nice though, and not even foolproof if the delay between finding no more jobs and calling unlock() is longer than 1 second (and who knows what happens in the database).
Unfortunately, the race condition is unavoidable in this design. If you want it to be correct, why not create something like an iterator or a queue where the read-modify-write action is a single atomic operation?
public void triggerPass() {
Job job = null;
while ((job = jobIterator.next()) != null) {
doJob(job);
}
}
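One way to read this suggestion is to keep the jobs in a thread-safe queue whose poll() removes and returns an element in one atomic step. The sketch below assumes in-memory jobs represented as Runnable and a made-up JobRunner class; the original question keeps its jobs in a database, so this only illustrates the idea. Note also that it drops the "only one pass at a time" property: several threads may be running jobs at once.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical JobRunner: jobs live in a thread-safe queue instead of being
// discovered through separate haveJobs() and doJobs() calls.
class JobRunner {
    private final Queue<Runnable> jobs = new ConcurrentLinkedQueue<>();

    // Called by producers: enqueue the job, then help drain the queue.
    void addJob(Runnable job) {
        jobs.add(job);
        triggerPass();
    }

    void triggerPass() {
        Runnable job;
        // poll() atomically removes and returns the head, or returns null if the
        // queue is empty, so each job is handed to exactly one caller.
        while ((job = jobs.poll()) != null) {
            job.run();
        }
    }
}
```

Every job added before a producer's own triggerPass() call is either run by that producer or has already been taken by another thread, so no job is left behind waiting for a later trigger.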

Proper implementation of producer-consumer scenario and "graceful" termination of thread pool

I am working on my first multi-threaded project and thus have a couple of things that I am unsure of. Details of my setup are in a previous question; in short: I have a thread pool implemented by Executors.newFixedThreadPool(N). One thread is given an action which does a series of queries to local and remote resources and iteratively populates an ArrayBlockingQueue, while the rest of the threads invoke the take() method on the queue and process the objects in the queue.
Even though small and supervised tests seem to run OK, I am unsure about how I handle special scenarios such as the beginning (the queue has no items yet), the end (the queue is emptied), and any eventual InterruptedExceptions. I have done some reading here on SO, which then led me to two really nice articles by Goetz and Kabutz. The consensus seems to be that one should not ignore these exceptions. However, I am unsure how the examples supplied relate to my situation: I have not invoked thread.interrupt() anywhere in my code... Speaking of which, I'm now unsure whether I should have done so...
To sum it up, given the code below, how do I best handle the special cases, such as termination criteria and InterruptedExceptions? I hope the questions make sense; otherwise I'll do my best to describe it further.
Thanks in advance,
edit: I have been working on the implementation for a while now, and I have come across a new hiccup, so I figured I'd update the situation. I have had the misfortune of coming across a ConcurrentModificationException, which was most likely due to incomplete shutdown/termination of the thread pool. As soon as I figured out I could use isTerminated() I tried that, and then got an IllegalMonitorStateException due to an unsynchronized wait(). The current state of the code is below:
I have followed some of the advice from @Jonathan's answer; however, I don't think his proposal works quite like what I need/want. The background story is the same as I have mentioned above, and the relevant bits of code are as follows:
Class holding/managing the pool, and submission of runnables:
public void serve() {
try {
this.started = true;
pool.execute(new QueryingAction(pcqs));
for(;;){
PathwayImpl p = bq.take();
if (p.getId().equals("0")){
System.out.println("--DEBUG: Termination criteria found, shutdown initiated..");
pool.shutdown();
// give 3 minutes per item in queue to finish up
pool.awaitTermination(3 * bq.size(), TimeUnit.MINUTES);
break;
}
int sortMethod = AnalysisParameters.getInstance().getSort_method();
pool.submit(new AnalysisAction(p));
}
} catch (Exception ex) {
ex.printStackTrace();
System.err.println("Unexpected error in core analysis, terminating execution!");
System.exit(0);
}finally{ pool.shutdown(); }
}
public boolean isDone(){
if(this.started)
return pool.isTerminated();
else
return false;
}
Elements are added to the queue by the following code, located in a separate class:
this.queue.offer(path, offer_wait, TimeUnit.MINUTES);
... the motivation behind offer() instead of take() is as Jonathan mentioned. Unforeseen blocks are annoying and hard to figure out, as my analysis takes a long time as it is. So I need to know relatively quickly whether it fails due to a bad block or is just crunching numbers...
and finally; here's the code in my test class where I check the interaction between the "concurrency service" (named cs here) and the rest of the objects to be analyzed:
cs.serve();
synchronized (this) {
while(!cs.isDone())
this.wait(5000);
}
ReportGenerator rg = new ReportGenerator();
rg.doReports();
I realize that this has been a VERY long question but I tried to be detailed and specific. Hopefully it won't be too much of a drag, and I apologize in case it is...
Instead of using take, which blocks, use something more like this:
PathwayImpl p = null;
synchronized (bq) {
try {
while (bq.isEmpty() && !stopSignal) {
bq.wait(3000); // Wait up to 3 seconds and check again
}
if (!stopSignal) {
p = bq.poll();
}
}
catch (InterruptedException ie) {
// Broke us out of waiting, loop around to test the stopSignal again
}
}
This assumes that the block is enclosed in some sort of while (!stopSignal) {...}.
Then, in the code that adds to the queue, do this:
synchronized (bq) {
bq.add(item);
bq.notify();
}
As for InterruptedExceptions, they are good for signaling the thread to test the stop signal immediately, instead of waiting until the next timeout-and-test. I suggest just testing your stop signal again, and possibly logging the exception.
I use them when signaling a panic, versus a normal shutdown, but it is rare that such a situation is necessary.

killing an infinite loop in java

I am using a third-party library to process a large number of data sets. The process very occasionally goes into an infinite loop (or is blocked - don't know why and can't get into the code). I'd like to kill this after a set time and continue to the next case. A simple example is:
for (Object data : dataList) {
Object result = TheirLibrary.processData(data);
store(result);
}
processData normally takes 1 second max. I'd like to set a timer which kills processData() after, say, 10 seconds.
EDIT
I would appreciate a code snippet (I am not practiced in using Threads). The Executor approach looks useful but I don't quite know how to start. Also, the pseudocode for the more conventional approach is too general for me to code.
@Steven Schlansker suggests that unless the third-party app anticipates the interrupt, it won't work. Again, detail and examples would be appreciated.
EDIT
I got the precise solution I was wanting from my colleague Sam Adams, which I am appending as an answer. It has more detail than the other answers, but I will give them both a vote. I'll mark Sam's as the accepted answer.
One of the ExecutorService.invokeAll(...) methods takes a timeout argument. Create a single Callable that calls the library, and wrap it in a List as the argument to that method. The Future in the returned list indicates how it went.
(Note: untested by me)
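A minimal sketch of that approach follows. The TimedCall class and the callWithTimeout helper are names invented for this example; invokeAll, Future.isCancelled() and Future.get() are the standard java.util.concurrent calls. When the timeout expires, invokeAll cancels the unfinished task, which only requests an interrupt: if the library call ignores interrupts, its thread keeps running in the background.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class TimedCall {
    // Runs the task on the executor and returns its result, or null if the task
    // did not finish within the timeout and was therefore cancelled.
    static <T> T callWithTimeout(ExecutorService executor, Callable<T> task,
                                 long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException {
        List<Future<T>> futures =
                executor.invokeAll(Collections.singletonList(task), timeout, unit);
        Future<T> future = futures.get(0);
        if (future.isCancelled()) {
            return null; // timed out; an interrupt was requested on the worker thread
        }
        return future.get(); // the task is already done, so this does not block
    }
}
```

In the loop from the question, the Callable would wrap TheirLibrary.processData(data), and a null result would mean "skip this data set and continue with the next one".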
Put the call to the library in another thread and kill this thread after a timeout. That way you could also process multiple objects at the same time if they are not dependent on each other.
EDIT: Democode request
This is pseudocode, so you will have to improve and extend it. Error checking to see whether a call was successful will also help.
for (Object data : dataList) {
Thread t = new LibThread(data);
// store the thread somewhere with an id
// tid and starting time tstart
// threads
t.start();
}
while(!all threads finished)
{
for (Thread t : threads)
{
// get start time of thread
// and check the timeout
if (runtime > timeout)
{
t.stop();
}
}
}
class LibThread extends Thread {
Object data;
public LibThread(Object data)
{
this.data = data;
}
// run() is what start() executes in the new thread
@Override
public void run()
{
Object result = TheirLibrary.processData(data);
store(result);
}
}
Sam Adams sent me the following answer, which is my accepted one
Thread thread = new Thread(myRunnableCode);
thread.start();
thread.join(timeoutMs);
if (thread.isAlive()) {
thread.interrupt();
}
and myRunnableCode regularly checks Thread.currentThread().isInterrupted(), and exits cleanly if this returns true.
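As an illustration of the cooperative variant, here is a minimal sketch. InterruptibleTask and TimeoutRunner are made-up names, and the counting loop stands in for real work. It only helps when the code being run actually checks the interrupt status, which a third-party library may not do.

```java
// Hypothetical task that checks the interrupt status between units of work.
class InterruptibleTask implements Runnable {
    private volatile long iterations = 0;

    @Override
    public void run() {
        // Keep working until another thread calls interrupt() on this thread.
        while (!Thread.currentThread().isInterrupted()) {
            iterations++; // stand-in for one unit of real work
        }
        // Falling out of the loop ends run(), so the thread exits cleanly.
    }

    long getIterations() {
        return iterations;
    }
}

class TimeoutRunner {
    // Starts the task, waits up to timeoutMs, then interrupts it if it is still running.
    // Returns true if the thread has ended by the time this method returns.
    static boolean runWithTimeout(Runnable task, long timeoutMs) throws InterruptedException {
        Thread thread = new Thread(task);
        thread.start();
        thread.join(timeoutMs);
        if (thread.isAlive()) {
            thread.interrupt();
            thread.join(1000); // give the task a moment to notice the interrupt
        }
        return !thread.isAlive();
    }
}
```

If the task is blocked in a call such as Thread.sleep() or BlockingQueue.take(), the interrupt surfaces as an InterruptedException instead, and the task should treat that as the signal to stop.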
Alternatively you can do:
Thread thread = new Thread(myRunnableCode);
thread.start();
thread.join(timeoutMs);
if (thread.isAlive()) {
thread.stop();
}
But this method has been deprecated since it is DANGEROUS.
http://download.oracle.com/javase/1.4.2/docs/api/java/lang/Thread.html#stop()
"This method is inherently unsafe. Stopping a thread with Thread.stop causes it to unlock all of the monitors that it has locked (as a natural consequence of the unchecked ThreadDeath exception propagating up the stack). If any of the objects previously protected by these monitors were in an inconsistent state, the damaged objects become visible to other threads, potentially resulting in arbitrary behavior."
I've implemented the second and it does what I want at present.
