I have a large number of state machines. Occasionally, a state machine will need to be moved from one state to another, which may be cheap or expensive and may involve DB reads and writes and so on.
These state changes occur because of incoming commands from clients, and can occur at any time.
I want to parallelise the workload. I want a queue saying 'move this machine from this state to this state'. Obviously the commands for any one machine need to be performed in sequence, but I can be moving many machines forward in parallel if I have many threads.
I could have a thread per state machine, but the number of state machines is data-dependent and may be many hundreds or thousands; I don't want a dedicated thread per state machine, I want a pool of some sort.
How can I have a pool of workers but ensure that the commands for each state machine are processed strictly sequentially?
UPDATE: so imagine the Machine instance has a list of outstanding commands. When an executor in the thread pool has finished consuming a command, it puts the Machine back into the thread-pool's task queue if it has more outstanding commands. So the question is, how to atomically put the Machine into the thread pool when you append the first command? And ensure this is all thread safe?
I suggest this scenario:
Create a thread pool, probably a fixed-size one, with Executors.newFixedThreadPool.
Create some structure (probably a HashMap) which holds one Semaphore for each state machine. Those semaphores will have a value of 1 and should be fair semaphores, to keep the sequence.
In the Runnable which does the job, call semaphore.acquire() on its state machine's semaphore at the beginning of the run method, and semaphore.release() at the end.
The size of the thread pool controls the level of parallelism.
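A minimal sketch of this scenario; the class and method names are my own, and I use a ConcurrentHashMap so the per-machine semaphores can be created lazily without extra locking:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

class MachineCommandExecutor {
    private final ExecutorService pool = Executors.newFixedThreadPool(8);
    // One fair, single-permit semaphore per state machine, created on first use.
    private final ConcurrentMap<String, Semaphore> semaphores = new ConcurrentHashMap<>();

    void submit(String machineId, Runnable command) {
        Semaphore s = semaphores.computeIfAbsent(machineId, id -> new Semaphore(1, true));
        pool.submit(() -> {
            try {
                s.acquire(); // blocks while another command for this machine runs
                try {
                    command.run(); // perform the state transition
                } finally {
                    s.release();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }
}

Note two caveats: a worker blocked on acquire() still occupies a pool thread, and strict ordering relies on pool threads reaching acquire() in submission order, which the fair semaphore alone does not fully guarantee.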
I suggest another approach. Instead of using a thread pool only to move states in a state machine, use a thread pool for everything, including doing the work. After doing some work that results in a state change, the state-change event should be added to the queue. After the state change is processed, another do-work event should be added to the queue.
Assuming that state transitions are work-driven, and vice versa, out-of-order processing is then impossible.
The idea of storing semaphores in a special map is very dangerous. The map will have to be synchronized (adding/removing objects is not thread-safe), and there is a relatively large overhead in doing the lookups (possibly synchronizing on the map) and then using the semaphore.
Besides, if you want to use a multithreaded architecture in your application, I think you should go all the way. Mixing different architectures may prove troublesome later on.
Have a thread ID per machine. Spawn the desired number of threads. Have all the threads greedily process messages from the global queue. Each thread locks the current message's machine for its own exclusive use (until it's done processing the current message and all messages on its queue), and the other threads put messages for that machine on the owning thread's internal queue.
EDIT: Handling message pseudo-code:
void handle(message)
    targetMachine = message.targetMachine
    if (targetMachine.thread != null)
        targetMachine.thread.addToQueue(message);
    else
        targetMachine.thread = this;
        process(message);
        processAllQueueMessages();
        targetMachine.thread = null;
Handling message Java code: (I may be overcomplicating things slightly, but this should be thread-safe)
/* class ThreadClass */
void handle(Message message)
{
    // get targetMachine from message
    targetMachine.mutexInc.acquire(); // blocking
    targetMachine.messages++;
    boolean acquired = targetMachine.mutex.tryAcquire(); // non-blocking
    if (acquired)
        targetMachine.threadID = this.ID; // this thread now owns the machine
    targetMachine.mutexInc.release();
    if (!acquired)
        // can put this before the release, it may speed things up
        threads[targetMachine.threadID].addToQueue(message);
    else
    {
        process(message);
        targetMachine.messages--;
        while (true)
        {
            // drain this thread's internal queue for the machine
            while (!queue.isEmpty())
            {
                process(queue.pop());
                targetMachine.messages--;
            }
            targetMachine.mutexInc.acquire(); // blocking
            if (targetMachine.messages > 0)
            {
                // a message is still being routed to this thread; let it arrive
                targetMachine.mutexInc.release();
                Thread.sleep(1);
            }
            else
                break;
        }
        // mutexInc is still held here, so no new message can sneak in
        // between the emptiness check and releasing ownership
        targetMachine.mutex.release();
        targetMachine.mutexInc.release();
    }
}
I'm writing a Java command line application that scrapes a website and downloads video files. The video files range in size from a few megs to 20 GB or more. This means downloading a file can take as little as a few seconds or as much as a few hours.

I've decided to implement a producer/consumer pattern to handle the scraping and downloading of files. A producer thread scrapes the site, retrieves the links to the video files, puts those links into an object, and puts that object into an unbounded blocking queue. There are N consumer threads that handle the downloads. They retrieve the objects containing the URLs from the blocking queue and each thread downloads a file. The object that the producer puts on the queue contains the URL along with some other information that the consumer will need to save the file to the correct location in local storage.

Before a file is downloaded, the consumer thread first checks if the file already exists in local storage. If the file exists, the download is skipped and the next object is pulled from the queue. If a consumer experiences a problem while downloading a file (connection reset, etc.), the consumer puts the object containing the URL into a separate queue for failed requests and sleeps for 15 minutes. This allows the application to deal with temporary network interruptions. While the producer is active, it checks the failed-URLs queue, removes those URLs from that queue, and puts them back into the main queue.
After implementing this initial design, I quickly realized that I had a problem. Because I'm using a blocking queue and the worker threads are polling without a timeout, once the producer was finished, it couldn't just complete its execution, because it needed to hang around to put failed URLs back into the queue. My first attempt at a solution was to remove the second "failure" queue and have workers put failed URLs back into the main queue. This meant that the application now had N consumers and N + 1 producers. This approach allowed the main producer thread to simply exit when it was finished, because it didn't have to worry about putting failed requests back into the queue.

Once that problem was solved, there was still another: notifying the worker threads that they could exit once the queue was empty. A blocking queue has no mechanism for the producer to signal that it won't be putting more data into the queue. I thought about having the consumers poll the queue with a timeout and having the primary producer set some sort of flag when it exits. When a consumer times out, it checks the flag: if the flag is set, the consumer exits; if not, it polls the queue again. While this approach will work, I don't like the design. I don't like the idea of having threads sitting around unnecessarily, and I hate even more the use of a magic flag. The only interaction between producer and consumers should be via the queue. The consumers have no knowledge of the producer, and checking a magic flag breaks that principle.
I ditched the blocking queue and decided to use a regular non-blocking queue. To prevent the worker threads from exiting as soon as they started, I used a CyclicBarrier. When a worker thread starts, it waits at the barrier before polling the queue. Meanwhile, the producer thread was coded to lower the barrier once the queue contained 10 x N URLs. Once the barrier was lowered, the worker threads would begin processing the Queue. This approach quickly failed because in some cases the consumers would consume the queue faster than the producer could replenish it. This happens in cases where a large number of files are already stored on disk so the consumers don't need to download anything. Once the queue was empty, the consumers exited, even though the producer was still scraping the site looking for URLs.
This tells me that I need to use a blocking queue. I'm continuing to try to find a clean, elegant solution that doesn't depend on timeouts and magic flags. I would love to hear your approach to solving this problem given the requirements.
UPDATE:
I finally settled on a solution based on comments made by user Martin James. Since these were comments and not an answer, there isn't an answer for me to accept. If Martin summarizes his comments into an answer, I'll accept it. Now here's the solution.
When the producer thread completes, it places N objects into the queue that contain null as the value for the URL. I updated the consumer thread to check for a null URL when they pull an object from the queue. If the URL is null, the consumer exits. This approach solves the notification to consumers that the producer is complete. However, it doesn't solve the problem of consumers putting URLs into the queue after the producer has exited. To solve that problem, I switched to a priority blocking queue. I made the object that gets put into the queue a Comparable and the compareTo logic was coded such that objects with null values for the URL will always be last in the queue. So when the producer exits and it places the terminating objects in the queue, if/when a consumer places an object back into the queue, those objects will always be ahead of the terminating objects.
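A minimal sketch of that poison-pill-last ordering (the type name is illustrative):

import java.util.concurrent.PriorityBlockingQueue;

// Work item; a null URL is the terminating "poison pill".
class DownloadTask implements Comparable<DownloadTask> {
    final String url; // null marks a terminating object

    DownloadTask(String url) {
        this.url = url;
    }

    boolean isPoisonPill() {
        return url == null;
    }

    @Override
    public int compareTo(DownloadTask other) {
        // Poison pills always sort after real work, so re-queued failures
        // are consumed before any terminator reaches a consumer.
        if (isPoisonPill() == other.isPoisonPill()) return 0;
        return isPoisonPill() ? 1 : -1;
    }
}

// PriorityBlockingQueue<DownloadTask> queue = new PriorityBlockingQueue<>();
// Producer, when finished: one pill per consumer thread.
//     for (int i = 0; i < N; i++) queue.put(new DownloadTask(null));
// Consumer loop:
//     DownloadTask t = queue.take();
//     if (t.isPoisonPill()) return; // exit this consumer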
Thanks all for the comments and feedback. Very much appreciated.
My approach would be to use a framework with back-pressure support, for example Vert.x reactive streams.
Good examples of systems handling back-pressure built on Vert.x can be found in the book Vert.x in Action.
ExecutorService in Java, for example, is a producer-consumer model with a series of worker threads trying to fetch tasks from a work queue. You can close the thread pool with ExecutorService#shutdownNow; this method sets the thread pool state to STOP and interrupts each worker. Take a look at the shutdownNow method and the worker's run method (I removed the irrelevant code):
public List<Runnable> shutdownNow() {
    advanceRunState(STOP);
    interruptWorkers();
}

final void runWorker(Worker w) {
    try {
        while (task != null || (task = getTask()) != null) {
            // ...
        }
    }
}
private Runnable getTask() {
    boolean timedOut = false; // Did the last poll() time out?
    for (;;) {
        // ...
        // Check if queue empty only if necessary.
        if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
            decrementWorkerCount();
            return null;
        }
        try {
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                workQueue.take();
            if (r != null)
                return r;
            timedOut = true;
        } catch (InterruptedException retry) {
            timedOut = false;
        }
    }
}
I think this is an example of using a flag and interrupts to stop consumers. I don't think it's inelegant.
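As a sketch of how the same flag-and-interrupt idea could be applied to the downloader (all names here are illustrative), the consumers can run as pool tasks that treat interruption as the exit signal:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class Downloader {
    private final BlockingQueue<String> urls = new LinkedBlockingQueue<>();
    private final ExecutorService consumers = Executors.newFixedThreadPool(4);

    void start(int n) {
        for (int i = 0; i < n; i++) {
            consumers.submit(() -> {
                try {
                    while (!Thread.currentThread().isInterrupted()) {
                        String url = urls.take(); // interruptible blocking call
                        download(url);
                    }
                } catch (InterruptedException e) {
                    // shutdownNow() interrupted us: fall through and exit
                }
            });
        }
    }

    void stop() {
        consumers.shutdownNow(); // sets STOP state and interrupts each worker
    }

    private void download(String url) { /* fetch and store the file */ }
}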
I have a piece of code which writes an object to disk as and when the object is put into the LinkedBlockingQueue.
As of now, this is single-threaded. I need to make it multithreaded, as the contents are written to different files on disk, and therefore there is no harm in writing them independently.
I am not sure if I can use a thread pool here, as I don't know when an object will be placed on the queue. Now if I decide to have a fixedThreadPool of 5 threads, how do I distribute it among multiple objects?
Any suggestions are highly appreciated.
Here is my existing code. I want to spawn a new thread as and when I get a new object in the queue.
how do I distribute it among multiple objects?
Well, you don't have to worry about task distribution. All you need to do is submit a Runnable or Callable (which describes your task) and it will be handed to an idle thread in the pool; if all the threads are busy processing, the new task will wait in the queue.
Below is what you can try...
1) Create a thread pool that suits your need best.
ExecutorService es = Executors.newFixedThreadPool(desiredNoOfThreads);
2) As you already have the queue in place -
while (true) {
    // peek or poll the queue and check for a non-null value;
    // if not null, create a Runnable or Callable and submit
    // it to the ExecutorService
}
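One way to flesh that loop out, as a sketch: a single dispatcher thread blocks on take() (so there is no busy-waiting) and hands each object to the pool. The FileJob type and writeToDisk method are illustrative:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class DiskWriter {
    private final BlockingQueue<FileJob> queue = new LinkedBlockingQueue<>();
    private final ExecutorService pool = Executors.newFixedThreadPool(5);

    // One dispatcher thread blocks on take(), so no CPU is burned while
    // the queue is empty; each write runs on an idle pool thread.
    void startDispatcher() {
        Thread dispatcher = new Thread(() -> {
            try {
                while (true) {
                    FileJob job = queue.take(); // blocks until an object arrives
                    pool.submit(() -> writeToDisk(job)); // hypothetical write method
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        dispatcher.setDaemon(true);
        dispatcher.start();
    }

    void enqueue(FileJob job) {
        queue.add(job);
    }

    private void writeToDisk(FileJob job) { /* write job.payload to job.fileName */ }

    static class FileJob {
        final String fileName;
        final byte[] payload;
        FileJob(String fileName, byte[] payload) {
            this.fileName = fileName;
            this.payload = payload;
        }
    }
}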
Generally speaking, if your files are on the same physical device you will get no performance benefit, since storage devices work synchronously on read/write operations. So you will get your threads blocked on I/O, which can lead to poorer speed, and will definitely waste threads that could be doing useful work.
In a web app I have a method that waits for another thread to generate reports if the number of customers is less than 10. But if it is greater than 10, I start my thread without calling the join method, and when the thread finishes I notify the user by e-mail.
I'm a little worried about orphan threads with long execution times and their impact on the server.
Is it good to launch a "heavy" process in the background (asynchronously) without using the join method, or is there a better way to do it?
try {
    thread.start();
    if (flagSendEmail > 10) {
        return "{\"message\":\"success\", \"text\":\"you will be notified by email\"}";
    } else {
        thread.join(); // the customer waits until it finishes
    }
} catch (InterruptedException e) {
    LogError.saveErrorApp(e.getMessage(), e);
    return "{\"message\":\"danger\", \"text\":\"can't generate the reports\"}";
}
Orphan threads aren't the problem, simply make sure that the run() method has a finally block that sends out the email.
The problem is that you have no control over the number of threads and that's got nothing to do with calling join(). (Unless you always wait for every single thread in the caller, at which point there's no point launching a background thread in the first place.)
The solution is to use an ExecutorService, which gives you a thread pool, and thus precise control over how many of these background threads are running at any one time. If you submit more tasks than the executor can handle at a given time, the remaining ones are queued up, waiting to be run. This way you can control the load on your server.
An added bonus is that because an executor service will typically recycle the same worker threads, the overhead of submitting a new task is less, meaning that you don't need to bother about whether you've got more than 10 items or not, everything can be run the same way.
In your case you could even consider using two separate executors: one for running the report generation and another one for sending out the emails. The reason for this is that you may want to limit the number of emails sent out in a busy period but without slowing report generation down.
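A minimal sketch of the two-executor idea (pool sizes and method names are illustrative):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ReportService {
    // Separate pools: a backlog of outgoing mail never slows report generation.
    private final ExecutorService reportPool = Executors.newFixedThreadPool(4);
    private final ExecutorService emailPool = Executors.newFixedThreadPool(2);

    void requestReport(int customerId) {
        reportPool.submit(() -> {
            String report = generateReport(customerId); // hypothetical
            emailPool.submit(() -> sendEmail(customerId, report)); // hypothetical
        });
    }

    private String generateReport(int customerId) {
        return "report-" + customerId; // placeholder for the real generation
    }

    private void sendEmail(int customerId, String report) {
        /* send the notification e-mail */
    }
}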
There's no point in starting a thread if the very next thing you do is join() it.
I'm not sure I understand what you're trying to do, but if your example is on the right path, then this would be even better because it avoids creating and destroying a new thread (expensive) in the flagSendEmail <= 10 case:
Runnable r = ...;
if (flagSendEmail > 10) {
    Thread thread = new Thread(r);
    thread.start();
    return "...";
} else {
    r.run();
    return ???
}
But chances are, you should not be explicitly creating new Threads at all. Any time a program continually creates and destroys threads, that's a sign that it should be using a thread pool instead. (See the javadoc for java.util.concurrent.ThreadPoolExecutor)
By the way: t.join() does not do anything to thread t. It doesn't do anything at all except wait until thread t is dead.
Yes, it is safe; I don't recall seeing many actual invocations of Thread#join().
But it will depend on what you are trying to do. I don't know whether you mean to use a pool of threads that generate reports or have some resource assigned. In any case, you should limit yourself to a maximum number of threads for reports. If they get blocked or stuck in a loop (because of some bug or poor synchronization), allowing more and more threads will utterly clog your application.
Thread#join waits for the referred thread to die. Are those threads actually ending? Are you waiting for a thread to die just to launch another thread? Usually synchronization is done with wait() and notify() over the synchronization object.
Launching a process (Runtime#exec()) will probably make things even worse, unless it helps work around some weird limitation.
There are some tools, like JConsole, which can give you a heads-up about threads getting locked and other issues.
I have a thread pool of m threads. Let's say m is 10 and fixed. Then there are n queues, with the possibility of n becoming large (like 100'000 or more). Every queue holds tasks to be executed by those m threads. Now, very importantly, every queue must be worked off sequentially, task by task. This is a requirement to make sure that tasks are executed in the order they were added to the queue. Otherwise the data could become inconsistent (same as, say, with JMS queues).
So the question is now how to make sure that the tasks in those n queues are processed by the available m threads in a way that no task added to the same queue can be executed "at the same time" by different threads.
I tried to solve this problem myself and figured out that it is quite demanding. Java ThreadPoolExecutor is nice, but you would have to add quite a bit of functionality that is not easy to develop. So the question is whether anyone knows of some framework or system for Java that already solves this problem?
Update
Thanks to Adrian and Tanmay for their suggestions. The number of queues may be very large (like 100'000 or more), so one thread per queue is unfortunately not possible, although it would be simple and easy. I will look into the fork/join framework; it looks like an interesting path to pursue.
My current first iteration solution is to have a global queue to which all tasks are added (using a JDK8 TransferQueue, which has very little locking overhead). Tasks are wrapped into a queue stub with the lock of the queue and its size. The queue itself does not exist physically, only its stub.
An idle thread first needs to obtain a token before it can access the global queue (the token would be a single element in a blocking queue, e.g. JDK8 TransferQueue). Then it does a blocking take on the global queue. When a task is obtained, it checks whether the queue lock of the task's queue stub is down. Actually, I think just using an AtomicBoolean would be sufficient and would create less lock contention than a lock or synchronized block.
When the queue lock is obtained, the token is returned to the global queue and the task is executed. If it is not obtained, the task is added to a 2nd level queue and another blocking take from the global queue is done. Threads need to check whether the 2nd level queue is empty and take a task from it to be executed as well.
This solution seems to work. However, the token every thread needs to acquire before being allowed to access the global queue and the 2nd level queue looks like a bottleneck. I believe it will create high lock contention. So, I'm not so happy with this. Maybe I start with this solution and elaborate on it.
Update 2
All right, here now the "best" solution I have come up with so far. The following queues are defined:
Ready Queue (RQ): Contains all tasks that can be executed immediately by any thread in the thread pool
Entry Queue (EQ): Contains all tasks the user wants to be executed as well as internal admin tasks. The EQ is a priority queue. Admin tasks have highest priority.
Channel Queues (CQ): For every channel there is an internal channel queue that is used to preserve the ordering of the tasks, e.g. to make sure tasks are executed sequentially in the order they were added to the EQ
Scheduler: Dedicated thread that takes tasks from the EQ. If the task is a user task, it is added to the CQ of the channel the task was submitted to. If the head of the CQ equals the just-inserted user task, it is also added to the RQ (but remains in the CQ) so that it is executed as soon as the next thread of the thread pool becomes available.
When a user task has finished execution, an internal TaskFinished task is added to the EQ. When it is executed by the scheduler, the head is taken from the associated CQ. If the CQ is not empty after the take, the next task is peeked (but not taken) from the CQ and added to the RQ. The TaskFinished tasks have higher priority than user tasks.
This approach contains, in my opinion, no logical errors. Note that the EQ and RQ need to be synchronized. I prefer using TransferQueue from JDK8, which is very fast and where checking whether it is empty, as well as polling the head item, is also very fast. The CQs need not be synchronized, as they are only ever accessed by the scheduler.
So far I'm quite happy with this solution. What makes me think is whether the scheduler could turn into a bottleneck: if many more tasks arrive in the EQ than it can handle, the EQ might grow, building up a backlog. Any opinions about that would be appreciated :-)
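A condensed sketch of the scheduler loop described above, under my own naming assumptions (a Task type, channels keyed by string). In the full design the EQ would be a PriorityBlockingQueue so TaskFinished outranks user tasks; the CQs are plain ArrayDeques since only the scheduler touches them, and pool threads simply take() from the RQ and run what they get:

import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;

class Scheduler implements Runnable {
    private final BlockingQueue<Task> eq;     // entry queue (priority queue in the full design)
    private final BlockingQueue<Runnable> rq; // ready queue, consumed by the worker pool
    private final Map<String, ArrayDeque<Task>> channelQueues = new HashMap<>(); // scheduler-only

    Scheduler(BlockingQueue<Task> eq, BlockingQueue<Runnable> rq) {
        this.eq = eq;
        this.rq = rq;
    }

    @Override
    public void run() {
        try {
            while (true) {
                Task t = eq.take();
                ArrayDeque<Task> cq = channelQueues.computeIfAbsent(t.channel, c -> new ArrayDeque<>());
                if (t.isTaskFinished) {
                    cq.poll();                    // remove the completed head
                    Task next = cq.peek();        // peek, do not take
                    if (next != null)
                        rq.put(toRunnable(next)); // next task of this channel becomes ready
                } else {
                    cq.add(t);
                    if (cq.peek() == t)
                        rq.put(toRunnable(t));    // channel was idle: ready immediately
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private Runnable toRunnable(Task t) {
        return () -> {
            t.body.run();
            eq.add(new Task(t.channel, null, true)); // signals: release the next task
        };
    }

    static class Task {
        final String channel;
        final Runnable body;
        final boolean isTaskFinished;

        Task(String channel, Runnable body, boolean isTaskFinished) {
            this.channel = channel;
            this.body = body;
            this.isTaskFinished = isTaskFinished;
        }
    }
}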
You can use Fork Join Framework if you are working in Java 7 or Java 8.
You can create a RecursiveTask using the first element popped from each queue.
Remember to provide a reference to the queues to the corresponding RecursiveTasks.
Invoke all of them at once (in a loop or stream).
Now at the end of the compute method (after processing of a task is completed), create another RecursiveTask by popping another element from the corresponding queue and call invoke on it.
Notes:
Each task will be responsible for extracting the next element from its queue, so all tasks from one queue are executed sequentially.
There should be a new RecursiveTask created and invoked separately for each element in the queues. This ensures that some queues do not hog the threads and starvation is avoided.
Using an ExecutorService is also a viable option, but IMO ForkJoin's API is friendlier for your use case.
Hope this helps.
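A minimal sketch of this chaining, under my own naming assumptions; I fork() the follow-up action rather than calling a nested invoke(), and assume the queues are filled before processing starts:

import java.util.Queue;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// One action per element; when it finishes, it schedules an action for the
// next element of the same queue, so each queue is worked off sequentially.
class QueueTask extends RecursiveAction {
    private final Runnable task;
    private final Queue<Runnable> sourceQueue; // reference to the owning queue

    QueueTask(Runnable task, Queue<Runnable> sourceQueue) {
        this.task = task;
        this.sourceQueue = sourceQueue;
    }

    @Override
    protected void compute() {
        task.run();                                   // process the current element
        Runnable next = sourceQueue.poll();           // pop the next element, if any
        if (next != null)
            new QueueTask(next, sourceQueue).fork();  // chain the follower
    }
}

// Kick-off, one initial task per non-empty queue:
//     ForkJoinPool pool = new ForkJoinPool(10);
//     for (Queue<Runnable> q : queues) {
//         Runnable first = q.poll();
//         if (first != null) pool.execute(new QueueTask(first, q));
//     }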
One simple solution would be to create a task whenever an element is added to an empty queue. This task would be responsible for only that queue and would end when the queue has been worked off. Ensure that the Queue implementations are thread-safe and the task stops after removing the last element.
EDIT: These tasks should be added to a ThreadPoolExecutor with an internal queue, for example one created by Executors.newFixedThreadPool, which will work off the tasks in parallel with a limited number of threads.
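A minimal sketch of this "drain task per queue" idea, using an AtomicBoolean per queue so the add-and-schedule step is atomic (class and method names are my own):

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.atomic.AtomicBoolean;

class SerialQueue {
    private final Queue<Runnable> tasks = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final ExecutorService pool;

    SerialQueue(ExecutorService pool) {
        this.pool = pool;
    }

    void add(Runnable task) {
        tasks.add(task);
        // Only the thread that flips the flag submits the drain task,
        // so at most one drainer runs per queue at any time.
        if (scheduled.compareAndSet(false, true))
            pool.submit(this::drain);
    }

    private void drain() {
        Runnable t;
        while ((t = tasks.poll()) != null)
            t.run();
        scheduled.set(false);
        // Re-check: a task may have arrived after the last poll
        // but before the flag was cleared.
        if (!tasks.isEmpty() && scheduled.compareAndSet(false, true))
            pool.submit(this::drain);
    }
}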
Alternatively, just divide the queues among a fixed number of threads:
public class QueueWorker implements Runnable {
    // should be unique and < NUM_THREADS:
    int threadId;

    QueueWorker(int threadId) {
        this.threadId = threadId;
    }

    @Override
    public void run() {
        int currentQueueIndex = threadId;
        while (true) {
            Queue currentQueue = queues.get(currentQueueIndex);
            // execute tasks until empty
            currentQueueIndex += NUM_THREADS;
            if (currentQueueIndex >= queues.size()) {
                currentQueueIndex = threadId;
            }
        }
    }
}
I have a bit of an issue with an application running multiple Java threads.
The application runs a number of worker threads that peek continuously at an input queue and, if there are messages in the queue, pull them out and process them.
Among those worker threads there is another verification thread, scheduled to check at a fixed period whether the host (on which the application runs) is still in "good shape" to run the application. This thread updates an AtomicBoolean value, which in turn is checked by the worker threads before they start peeking, to see if the host is OK.
My problem is that in cases with high CPU load the thread responsible with the verification will take longer because it has to compete with all the other threads. If the AtomicBoolean does not get updated after a certain period it is automatically set to false, causing me a nasty bottleneck.
My initial approach was to increase the priority of the verification thread, but digging into it deeper I found that this is not a guaranteed behavior and an algorithm shouldn't rely on thread priority to function correctly.
Anyone got any alternative ideas? Thanks!
Instead of peeking into a regular queue data structure, use the java.util.concurrent package's LinkedBlockingQueue.
What you can do is run a pool of threads (you could use the executor service's fixed thread pool, i.e., a number of workers of your choice) and call LinkedBlockingQueue.take().
If a message arrives at the queue, it is fed to one of the waiting threads (yes, take() blocks the thread until there is something to feed it with).
Java API Reference for Linked Blocking Queue's take method
HTH.
One old-school approach to throttling the rate of work, which does not use a health-check thread at all (and so bypasses these problems), is to block or reject requests to add to the queue if the queue is longer than, say, 100. This applies dynamic back-pressure to the clients generating the load, slowing them down when the worker threads are overloaded.
This approach was added to the Java 1.5 library, see java.util.concurrent.ArrayBlockingQueue. Its put(o) method blocks if the queue is full.
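A minimal sketch of that back-pressure (names are illustrative):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedIntake {
    // Capacity 100: when workers fall behind, put() blocks the producer,
    // slowing the clients down instead of growing an unbounded backlog.
    private final BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(100);

    void submit(Runnable work) throws InterruptedException {
        queue.put(work); // blocks while the queue is full
    }

    Runnable next() throws InterruptedException {
        return queue.take(); // workers block while the queue is empty
    }
}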
Are you using the Executor framework (from Java's concurrency package)? If not, give it a shot. You could try using a ScheduledExecutorService for the verification thread.
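A minimal sketch with a dedicated single-threaded scheduler (interval and check body are illustrative); this does not raise the check's CPU priority, but it does guarantee the check is never queued behind worker tasks:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class HealthMonitor {
    final AtomicBoolean hostHealthy = new AtomicBoolean(true);
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    void start() {
        // Dedicated scheduler thread: the check does not wait for a pool worker.
        scheduler.scheduleAtFixedRate(
                () -> hostHealthy.set(checkHost()),
                0, 5, TimeUnit.SECONDS);
    }

    private boolean checkHost() {
        return true; // placeholder for the real host check
    }
}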
More threads do not mean better performance. For CPU-bound work, a dual core usually performs best with 2 threads; with 3 or more, performance starts getting worse. A quad core should handle 4 threads best, and so on. So be careful how many threads you use.
You can put the other threads to sleep after they perform their work, to allow other threads to run. I believe Thread.yield() hints to the scheduler that the current thread is willing to give time to other threads.
If you want your thread to run continuously, I would suggest creating two main threads, A and B. Use A for the verification thread, and from B create the other threads. That way thread A gets more execution time.
It seems you need to utilize condition variables; peeking will burn CPU cycles.
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/locks/Condition.html
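A minimal condition-variable sketch, so consumers sleep instead of burning CPU on peeks (a LinkedBlockingQueue does essentially this internally, so in practice you would usually just use one):

import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class NotifyingQueue<T> {
    private final Queue<T> items = new ArrayDeque<>();
    private final Lock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();

    void put(T item) {
        lock.lock();
        try {
            items.add(item);
            notEmpty.signal(); // wake one waiting consumer
        } finally {
            lock.unlock();
        }
    }

    T take() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty())
                notEmpty.await(); // sleeps instead of spinning on peeks
            return items.poll();
        } finally {
            lock.unlock();
        }
    }
}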