I'm trying to debug a race condition in which one of our application's poller threads never returns, so future pollers never get scheduled. In abstract terms, to hide our business logic while still capturing the problem, here is our code path.
We have to update some state X of resource Y on a remote server. We have a resource manager that changes the resource state and updates X as a side effect of the change. This manager polls the resource continually, and when it believes the resource has been updated, it uses a ThreadPoolExecutor to do the work. This thread pool executor has a reasonably sized blocking queue but a fairly small maximum number of threads. According to the thread dump, the hang happens in the invokeAll call (among other things).
We have reason to believe that all of the core/max threads in this pool executor are busy doing other work (more resource state updates, if you will).
Since invokeAll returns futures that we wait on, the question is: does invokeAll hang when the executor's blocking queue is big enough to accept the work passed in via invokeAll, but there are not enough threads available to run it?
As other users have pointed out, without some code (even pseudo-code), and a clearer understanding of what "state X" is, and what "resource Y" is, it is virtually impossible for anybody here to provide an intelligent answer. In short, you need an SSCCE. Nevertheless, I'll do my best here ;-). And if you do post code and/or provide more info, I'll update my answer accordingly.
From the Java 7 ExecutorService#invokeAll javadoc:
Executes the given tasks, returning a list of Futures holding their status and results when all complete. Future.isDone() is true for each element of the returned list. Note that a completed task could have terminated either normally or by throwing an exception. The results of this method are undefined if the given collection is modified while this operation is in progress.
From your description (and again, I can't tell for sure because of the lack of details), one of your worker threads is hanging. Since you're calling invokeAll(...), the executor is hanging because it's waiting for the hung thread to finish. But it never does. Now, as to why you're getting a hung thread, that's an entirely different issue, and we would definitely need to see some code. HTH.
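To make the invokeAll behaviour concrete, here is a minimal sketch (not your code: the class name InvokeAllDemo, the pool sizes, and the 5-second timeout are all assumptions for illustration). It uses a pool with a single worker thread and a roomy queue. invokeAll does not fail just because there are fewer threads than tasks; the extra tasks wait in the queue and the call returns once every task has finished. The untimed form waits as long as that takes, so if the workers never free up, it never returns. The timed overload shown here cancels whatever has not completed when the timeout expires.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class InvokeAllDemo {
    // Runs taskCount small tasks on a pool with ONE worker thread and a queue of 100.
    static List<Integer> runAll(int taskCount) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(100));
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int n = i;
                tasks.add(() -> n * n);
            }
            // Blocks until every task is done, or until the timeout cancels the rest.
            List<Future<Integer>> futures = pool.invokeAll(tasks, 5, TimeUnit.SECONDS);
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get()); // would throw CancellationException if timed out
            }
            return results;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(5)); // prints [0, 1, 4, 9, 16]
    }
}
```

One thing worth checking in your thread dump: if invokeAll is called from a task that is itself running on the same small pool, the caller occupies a worker while it waits for tasks that need a worker, and that can hang regardless of queue size.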
I'm trying to run a number of jobs concurrently using Java's ForkJoinPool. The main task (which is already running in the pool) spawns all the jobs and then does a series of joins. I was sure that a task calling join would free the thread it is running in, but it seems like it is actually blocked on it, and therefore it is "wasting" the thread, i.e., since the number of threads equals the number of CPU cores, one core will be inactive.
I know that if I run invokeAll instead, then the first of the sub-jobs gets to run in the same thread, and indeed this works. However, this seems sub-optimal, because if the first task is actually a very fast one, I have the same problem: one of the threads is blocked waiting on join. There are more jobs than threads, so I would rather have another one of the jobs get started.
I can try to bypass all this manually, but it's not so nice, and it seems like I am redoing what ForkJoinPool is supposed to do.
So the question is: am I understanding ForkJoinPool wrong? Or, if what I'm saying is correct, is there a simple way to utilize the threads more efficiently?
ForkJoinPool is designed to prevent you having to think about thread utilization in this way. The 'work stealing' algorithm ensures that each thread is always busy so long as there are tasks in the queue.
Check out these notes for a high-level discussion:
https://www.dre.vanderbilt.edu/~schmidt/cs891f/2018-PDFs/L4-ForkJoinPool-pt3.pdf
To see the ugly details go down the rabbit hole of the ForkJoinPool#awaitJoin source.
Roughly, if I'm reading the (very complex) code correctly: When a thread joins a sub-task, it attempts to complete that task itself, otherwise if the sub-task's worker queue is non-empty (i.e. it is also depending on other tasks), the joining thread repeatedly attempts to complete one of those tasks, via ForkJoinPool#tryHelpStealer, whose Javadoc entry provides some insight:
Tries to locate and execute tasks for a stealer of the given
task, or in turn one of its stealers, Traces currentSteal ->
currentJoin links looking for a thread working on a descendant
of the given task and with a non-empty queue to steal back and
execute tasks from. The first call to this method upon a
waiting join will often entail scanning/search, (which is OK
because the joiner has nothing better to do), but this method
leaves hints in workers to speed up subsequent calls. The
implementation is very branchy to cope with potential
inconsistencies or loops encountering chains that are stale,
unknown, or so long that they are likely cyclic.
Notice that ForkJoinTask does not extend Thread, so 'blocking' of the join operation means something different here than usual. It doesn't mean that the underlying thread is in a blocked state, rather it means that the computation of the current task is held up further up the call stack while join goes off and attempts to resolve the tree of sub-tasks impeding progress.
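A small experiment that fits this description, as a sketch only (the class names JoinHelpsDemo and RangeSum and the 1,000-element threshold are my own choices, not anything from the question): a recursive sum run on a ForkJoinPool with a single worker thread. If join simply parked the worker, this could never finish, because no other thread exists to run the forked halves. It does finish, because the joining worker runs the pending sub-task itself.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class JoinHelpsDemo {
    // Sums the integers in [lo, hi) by splitting the range in half.
    static class RangeSum extends RecursiveTask<Long> {
        private final int lo;
        private final int hi;

        RangeSum(int lo, int hi) {
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= 1_000) {
                long sum = 0;
                for (int i = lo; i < hi; i++) {
                    sum += i;
                }
                return sum;
            }
            int mid = (lo + hi) >>> 1;
            RangeSum left = new RangeSum(lo, mid);
            RangeSum right = new RangeSum(mid, hi);
            left.fork();                       // push the left half onto this worker's queue
            long rightSum = right.compute();   // work on the right half directly
            return left.join() + rightSum;     // join runs 'left' here if nobody stole it
        }
    }

    static long sum(int n) {
        ForkJoinPool pool = new ForkJoinPool(1); // a single worker thread
        try {
            return pool.invoke(new RangeSum(0, n));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(1_000_000)); // prints 499999500000
    }
}
```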
So basically I am learning a bit more serious concurrency (studying how things actually work, instead of just using random stuff if needed).
And my professor, when I asked him about this, told me that he and his colleagues hadn't been able to reproduce a spurious wakeup. He believes that line is a leftover that was never deleted (it was there, Java got "better", it's no longer needed, but the line is still there), and that spurious wakeups no longer happen.
Link:
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Condition.html
It's right below the point called:
Implementation Considerations
In his opinion, a condition that looked kind of like this:
lock.lock();
if (p > q) {
    lock.newCondition().await();
}
Would be perfectly fine: since he says a spurious wakeup cannot happen, a loop like this would not be needed:
lock.lock();
while (p > q) {
    lock.newCondition().await();
}
I am MORE than likely mixing things up and misunderstanding both the doc and my teacher, but I have spent some time trying to understand the reasoning behind each position and can't come up with an "answer" of my own. I either believe one or the other (not that it matters, it's purely I-want-to-learn).
My teacher does spend time telling us that explaining concurrency in Java is pretty silly, but I didn't choose it either, so there's that.
Would be perfectly fine: since he says a spurious wakeup cannot happen, a loop like this would not be needed:
Your teacher is wrong for two reasons:
Spurious wakeups do happen. They may not happen on the architecture that they tested on, but if you don't take them into account, you will see problems when you move your application to a different piece of hardware or a different OS revision. It may also be that the spurious wakeups happen only occasionally, during an exceptional kernel event such as a signal getting delivered at precisely the wrong time. Again, your application may run fine in testing, but when you move it into production under a much higher load, the frequency of the exceptional event may increase...
The underlying problem is that certain native thread implementations may choose to wake up all conditions associated with an application instead of the specific one that was notified. This is well documented in the javadocs for Object.wait():
As in the one argument version, interrupts and spurious wakeups are possible, and this method should always be used in a loop:
Here's one example of an architecture that has this limitation. I'll quote from this interesting blog entry:
Internally, wait is implemented as a call to the 'futex' system call. Each blocking system call on Linux returns abruptly when the process receives a signal -- because calling signal handler from kernel call is tricky. What if the signal handler calls some other system function? And a new signal arrives? It's easy to run out of kernel stack for a process. Exactly because each system call can be interrupted, when glibc calls any blocking function, like 'read', it does it in a loop, and if 'read' returns EINTR, calls 'read' again.
The while loop is also very important to protect against race conditions, especially in producer/consumer models with multiple threads. If you have multiple threads that are consuming from a queue (for example), a notification that there are items in the queue may wake up a thread, but by the time it is able to get the lock, another thread has already dequeued the item.
This is well documented on my page here with a sample program that demonstrates the race condition without the use of while.
Producer Consumer Thread Race Conditions
In your example, thread A may be waiting in await() while another thread B may be waiting to get the lock(). Thread C has the lock and is adding to the queue.
// B is here waiting for the lock
lock.lock();
while (p > q) {
// A is here waiting for the signal
lock.newCondition().await();
}
// dequeue
lock.unlock();
Then if the producer adds something to the queue and calls signal() the thread A moves from the WAIT state to the BLOCKED state to get the lock itself. But it may be behind thread B which is already waiting. Once the lock is released, thread B dequeues the element, not thread A. When thread A then gets a chance to dequeue, the queue is empty. Without the while loop, you can get out-of-bounds exceptions or other problems by trying to dequeue from an empty queue.
See my link for more explicit details of the race.
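For completeness, here is a minimal sketch of the consumer side written the safe way (the class name GuardedQueue and its fields are hypothetical, introduced only for this example). Two things differ from the snippet above: the wait sits in a while loop, and a single Condition object is created once and shared. Calling lock.newCondition() at each wait produces a fresh Condition that no producer holds a reference to, so it can never be signalled.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class GuardedQueue<T> {
    private final ReentrantLock lock = new ReentrantLock();
    // One Condition, created once, shared by every waiter and every signaller.
    private final Condition notEmpty = lock.newCondition();
    private final Deque<T> items = new ArrayDeque<>();

    public void put(T item) {
        lock.lock();
        try {
            items.addLast(item);
            notEmpty.signal(); // wake one waiting consumer
        } finally {
            lock.unlock();
        }
    }

    public T take() throws InterruptedException {
        lock.lock();
        try {
            // Re-check after every wakeup: it may be spurious, or another
            // consumer may have taken the item before we reacquired the lock.
            while (items.isEmpty()) {
                notEmpty.await();
            }
            return items.removeFirst();
        } finally {
            lock.unlock();
        }
    }
}
```

With an if in place of the while, a consumer that loses the race described above would fall through to removeFirst() on an empty deque and get a NoSuchElementException.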
It is still necessary. Your professor is not necessarily incorrect, but has created a strawman argument to knock down.
There are two reasons why you must protect your conditions in a loop.
The first is spurious wakeup. Your professor seems to have been unable to reproduce this, and it may well not be a problem on the platforms he tests on. This does not mean it is unreproducible on all platforms.
The second is that between the times that you wake up and actually go to do the logic, the condition may no longer be true. You must guard against this potential race condition. This is also notoriously difficult to reproduce in the lab, and will probably only happen in bizarre circumstances in production.
The ArrayBlockingQueue will block the producer thread if the queue is full and it will block the consumer thread if the queue is empty.
Doesn't this concept of blocking go against the very idea of multithreading? Say I have a 'main' thread and I want to delegate all 'logging' activities to another thread. So inside my main thread, I create a Runnable to log the output and put that Runnable on an ArrayBlockingQueue. The whole purpose of doing this is to have the 'main' thread return immediately without wasting any time in an expensive logging operation.
But if the queue is full, the main thread will be blocked and will wait until a spot is available. So how does it help us?
The queue doesn't block out of spite, it blocks to introduce an additional quality into the system. In this case, it's prevention of starvation.
Picture a set of threads, one of which produces work units really fast. If the queue were to be allowed unbounded growth, potentially, the "rapid producer" queue could hog all the producing capacity. Sometimes, prevention of such side-effects is more important than having all threads unblocked.
I think this is the designer's decision. If the designer chooses blocking mode, ArrayBlockingQueue provides it through the put method. If the designer doesn't want blocking, ArrayBlockingQueue has the offer method, which returns false when the queue is full, but then the designer needs to decide what to do with the rejected logging event.
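A rough sketch of the non-blocking choice, assuming a logging queue of capacity 2 and no consumer running (the class name OfferDemo and the "count and drop" policy are illustrative assumptions, not a recommendation). It tries the plain offer first and then the timed offer, which waits a bounded time for space instead of blocking indefinitely like put.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class OfferDemo {
    // Tries to enqueue 'attempts' log lines; returns how many were accepted.
    static int enqueue(BlockingQueue<String> queue, int attempts) throws InterruptedException {
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            String line = "log line " + i;
            // offer(e) never blocks: it returns false immediately when the queue is full.
            boolean ok = queue.offer(line);
            if (!ok) {
                // The timed variant waits up to 10 ms for space before giving up.
                ok = queue.offer(line, 10, TimeUnit.MILLISECONDS);
            }
            if (ok) {
                accepted++;
            }
            // else: the event is dropped here; a real logger might count or sample drops.
        }
        return accepted;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(2);
        System.out.println(enqueue(queue, 5)); // prints 2, since nothing drains the queue
    }
}
```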
In your example I would consider blocking to be a feature: It prevents an OutOfMemoryError.
Generally speaking, one of your threads is just not fast enough to cope with the assigned load. So the others must slow down somehow in order not to endanger the whole application.
On the other hand, if the load is balanced, the queue will not block.
Blocking is a necessary function of multithreading. You must block to have synchronized access to data. It does not defeat the purpose of multithreading.
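I would suggest throwing an exception when the producer attempts to submit an item to a queue that is full. ArrayBlockingQueue's add method already does this (it throws IllegalStateException when the queue is full), and remainingCapacity() lets you check for free space beforehand, although in a concurrent program that check can be stale by the time you act on it.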
This would allow the invoking code to decide how it wants to handle a full queue.
If execution order when processing items from the queue is unimportant, I recommend using a threadpool (known as an ExecutorService in Java).
It depends on the nature of your multi threading philosophy. For those of us who favour Communicating Sequential Processes a blocking queue is nearly perfect. In fact, the ideal would be one where no message can be put into the queue at all unless the receiver is ready to receive it.
So no, I don't think that a blocking queue goes against the very purpose of multi-threading. In fact, the scenario that you describe (the main thread eventually getting stalled) is a good illustration of the major problem with the actor-model of multi-threading; you've no idea whether or not it will deadlock / block, and you can't exhaustively test for it either.
In contrast, imagine a blocking queue that is zero messages deep. That way for the system to work at all you'd have to find a way to ensure that the logger is always guaranteed to be able to receive a message from the main thread. That's CSP. It might mean that in your hypothetical logger thread you have to have application defined buffering (as opposed to some framework developer's best guess of how deep a FIFO should be), a fast I/O subsystem, checks for keeping up, ways of dealing with falling behind, etc. In short it doesn't let you get away with it, you're forced to address every aspect of your system's performance.
That is of course harder, but that way you end up with a system that's definitely OK rather than the questionable "maybe" that you have if your blocking queues are an unknown number of messages deep.
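In Java, the "zero messages deep" queue exists as java.util.concurrent.SynchronousQueue. The sketch below (the class name RendezvousDemo and the method names are mine, for illustration only) shows the two behaviours that matter: offer fails immediately when no receiver is waiting, and put blocks until a receiver actually takes the message.

```java
import java.util.concurrent.SynchronousQueue;

public class RendezvousDemo {
    // With no receiver waiting, a non-blocking offer is refused.
    static boolean offerWithoutReceiver() {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        return channel.offer("nobody is listening");
    }

    // Hands one message to a receiver thread and returns what it received.
    static String handOff(String message) throws InterruptedException {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        String[] received = new String[1];
        Thread logger = new Thread(() -> {
            try {
                received[0] = channel.take(); // waits for the sender
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        logger.start();
        channel.put(message); // blocks until the logger thread takes the message
        logger.join();        // join makes the write to received[0] visible here
        return received[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(offerWithoutReceiver()); // prints false
        System.out.println(handOff("hello"));       // prints hello
    }
}
```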
It sounds like you have the general idea right of why you'd use something like an ArrayBlockingQueue to talk between threads.
Having a blocking queue gives you the option to do something different in case something goes wrong with your background worker threads, rather than blindly adding more requests to the queue. If there is room in the queue, there is no blocking.
For your specific use case, though, I would use ExecutorService rather than reading/writing queues directly, which creates a pool of background worker threads:
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ExecutorService.html
ExecutorService pool = Executors.newFixedThreadPool(poolSize);
pool.submit(myRunnable);
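Fleshed out a little, as a sketch under my own assumptions (the class name AsyncLogger and the in-memory list standing in for the real log destination are hypothetical): the caller hands each message to a one-thread pool and returns immediately. Keep in mind that Executors.newFixedThreadPool uses an unbounded queue, so the caller never blocks, but the backlog can grow without limit if the logger falls behind.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncLogger {
    // Writes every message on a background thread; returns what was written.
    static List<String> logAll(List<String> messages) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        // Stand-in for a file or socket; synchronized because another thread writes it.
        List<String> sink = Collections.synchronizedList(new ArrayList<>());
        for (String message : messages) {
            pool.execute(() -> sink.add(message)); // returns at once; work is queued
        }
        pool.shutdown();                             // accept no new work, finish the queue
        pool.awaitTermination(5, TimeUnit.SECONDS);  // wait for the backlog to drain
        return sink;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(logAll(List.of("started", "working", "done")));
        // prints [started, working, done]
    }
}
```

With a single worker thread the messages are written in submission order; with a larger pool that ordering is no longer guaranteed.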
A multithreaded program is non-deterministic insofar as you can't say beforehand: n producer actions will take exactly as long as m consumer actions. Therefore, synchronization between n producers and m consumers is necessary in every case.
You'll want to choose the queue size so that the number of active producers and consumers is maximized most of the time. But the thread model of Java does not guarantee that any consumer will run unless it is the only unblocked thread. (Of course, on multi-core CPUs it is very likely that the consumer will run.)
You have to make a choice about what to do when a queue is full. In the case of an ArrayBlockingQueue's put, that choice is to wait.
Another option would be to just throw away new Objects if the queue was full; you can achieve this with offer.
You have to make a trade-off.
I have a bit of an issue with an application running multiple Java threads.
The application runs a number of working threads that peek continuously at an input queue and if there are messages in the queue they pull them out and process them.
Alongside those working threads there is a verification thread, scheduled to check at a fixed period whether the host (on which the application runs) is still in "good shape" to run the application. This thread updates an AtomicBoolean value, which the working threads check to see if the host is OK before they start peeking.
My problem is that under high CPU load the thread responsible for the verification takes longer because it has to compete with all the other threads. If the AtomicBoolean does not get updated within a certain period, it is automatically set to false, causing a nasty bottleneck.
My initial approach was to increase the priority of the verification thread, but digging into it deeper I found that this is not a guaranteed behavior and an algorithm shouldn't rely on thread priority to function correctly.
Anyone got any alternative ideas? Thanks!
Instead of peeking into a regular queue data structure, use the java.util.concurrent package's LinkedBlockingQueue.
What you can do is run a pool of threads (you could use the executor service's fixed thread pool, i.e., a number of workers of your choice) and have each one call LinkedBlockingQueue.take().
If a message arrives at the queue, it is fed to one of the waiting threads (yeah, take does block the thread until there is something to be fed with).
Java API Reference for Linked Blocking Queue's take method
HTH.
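A minimal sketch of that arrangement, with assumed names throughout (TakeWorkers, inbox, and the latch used to know when the demo is finished are not from the question). Each worker sits in take(), which parks the thread until a message arrives, so idle workers consume no CPU and leave more of it for the verification thread.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class TakeWorkers {
    // Processes all messages with the given number of workers; returns the count handled.
    static int process(List<String> messages, int workers) throws InterruptedException {
        BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
        AtomicInteger handled = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(messages.size());
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        inbox.take();              // blocks (no busy peeking) until a message arrives
                        handled.incrementAndGet(); // stand-in for real message processing
                        done.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // shutdownNow() ends the loop
                }
            });
        }
        inbox.addAll(messages);
        done.await(5, TimeUnit.SECONDS);
        pool.shutdownNow();
        return handled.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(process(List.of("a", "b", "c"), 2)); // prints 3
    }
}
```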
One old-school approach to throttling the rate of work, which does not use a health check thread at all (and so bypasses these problems), is to block or reject requests to add to the queue if the queue is longer than, say, 100. This applies dynamic back pressure to the clients generating the load, slowing them down when the worker threads are overloaded.
This approach was added to the Java 1.5 library, see java.util.concurrent.ArrayBlockingQueue. Its put(o) method blocks if the queue is full.
Are you using the Executor framework (from Java's java.util.concurrent package)? If not, give it a shot. You could try using a ScheduledExecutorService for the verification thread.
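Something along these lines, as a sketch only: the class name HealthCheck, the 100 ms period, and especially checkHost() are placeholders I made up, since the question doesn't say what the real check does. Note that this just tidies up the scheduling; on a saturated CPU the check still competes with the workers, so it is worth combining with blocking workers rather than peeking ones.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class HealthCheck {
    // Hypothetical placeholder for the real "is the host in good shape" test.
    static boolean checkHost() {
        return Runtime.getRuntime().availableProcessors() > 0;
    }

    // Starts the periodic check, waits for its first run, and returns the flag.
    static boolean runOnce() throws InterruptedException {
        AtomicBoolean hostOk = new AtomicBoolean(false);
        CountDownLatch firstRun = new CountDownLatch(1);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            // Run immediately, then every 100 ms on the scheduler's own thread.
            scheduler.scheduleAtFixedRate(() -> {
                hostOk.set(checkHost());
                firstRun.countDown();
            }, 0, 100, TimeUnit.MILLISECONDS);
            firstRun.await(5, TimeUnit.SECONDS);
            return hostOk.get(); // workers would read this flag before taking work
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runOnce());
    }
}
```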
More threads do not mean better performance. For CPU-bound work, a dual-core machine usually performs best with 2 threads, and 3 or more start making things worse. A quad core should handle 4 threads best, and so on. So be careful how many threads you use.
You can put the other threads to sleep after they perform their work, which allows other threads to do their part. Thread.yield() is another option, but it is only a hint to the scheduler that the current thread is willing to give up the CPU, and the scheduler is free to ignore it.
If you want your thread to run continuously, I would suggest creating two main threads, thread A and B. Use A for the verification thread, and from B, create the other threads. Therefore thread A gets more execution time.
Seems you need to utilize Condition variables. Peeking will take cpu cycles.
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/locks/Condition.html
I have a series of concurrent tasks to run. If any one of them fails, I want to interrupt them all and await termination. But assuming none of them fail, I want to wait for all of them to finish.
ExecutorCompletionService seems like almost what I want here, but there doesn't appear to be a way to tell if all of my tasks are done, except by keeping a separate count of the number of tasks. (Note that both of the examples in the Javadoc for ExecutorCompletionService keep track of the count "n" of the tasks, and use that to determine if the service is finished.)
Am I overlooking something, or do I really have to write this code myself?
Yes, you do need to keep track if you're using an ExecutorCompletionService. Typically, you would call get() on the futures to see if an error occurred. Without iterating over the tasks, how else could you tell that one failed?
If your series of tasks is of a known size, then you should use the second example in the javadoc.
However, if you don't know the number of tasks which you will submit to the CompletionService, then you have a sort of Producer-Consumer problem. One thread is producing tasks and placing them in the ECS, another would be consuming the task futures via take(). A shared Semaphore could be used, allowing the Producer to call release() and the Consumer to call acquire(). Completion semantics would depend on your application, but a volatile or atomic boolean on the producer to indicate that it is done would suffice.
I suggest a Semaphore over wait/notify with poll() because there is a non-deterministic delay between the time a task is produced and the time that task's future is available for consumption. Therefore the consumer and producer need to be just slightly smarter.
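For the known-size case, the bookkeeping is small. Here is a sketch of "wait for all, but cancel everything on the first failure" (the class name FailFast, the pool size of 4, and the boolean return value are my own choices for illustration). The count is simply the size of the task list, and the submitted futures are kept so they can be cancelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class FailFast {
    // Returns true if every task succeeded; on the first failure, cancels the rest.
    static boolean runAll(List<Callable<Integer>> tasks) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CompletionService<Integer> ecs = new ExecutorCompletionService<>(pool);
        List<Future<Integer>> futures = new ArrayList<>();
        try {
            for (Callable<Integer> task : tasks) {
                futures.add(ecs.submit(task));
            }
            // The explicit count: take exactly as many results as were submitted.
            for (int i = 0; i < tasks.size(); i++) {
                try {
                    ecs.take().get(); // take() yields futures in completion order
                } catch (ExecutionException e) {
                    for (Future<Integer> f : futures) {
                        f.cancel(true); // interrupt whatever is still running
                    }
                    return false;
                }
            }
            return true;
        } finally {
            pool.shutdownNow();
            pool.awaitTermination(5, TimeUnit.SECONDS); // await termination, as asked
        }
    }
}
```

Cancellation here relies on interruption, so tasks that ignore interrupts will keep running until they finish on their own.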