I have two threads: a primary thread that does the main processing of the application, and a secondary thread that receives data batches from the primary thread and processes and outputs them, either to the user, to a file, or over the network. In general, data should be processed at a much faster rate than it is produced. I would like to ensure that the main thread never waits for the secondary thread. The secondary thread can accept any amount of overhead, expanding buffers, redoing work, and so on, with the sole objective of maximizing performance of the main thread. Ideally the main thread will never synchronize at all. Is there any way to push synchronization costs onto one thread in Java?
This is an outline of a solution:
The main thread works in isolation for some time, piling up data into a collection;
when it has generated a nice batch, it:
i. creates a new collection for itself;
ii. sets the filled-up collection aside, available to be picked up by the reading thread;
iii. CASes this collection into an AtomicReference.
The reading thread polls this AtomicReference for updates;
when it notices it has been set, it picks up the batch, CASing null into the shared reference, so that the main thread knows it can put another collection in.
This has negligible coordination costs for the main thread (just one CAS operation per batch) assuming that the reference is always already null when it's time to share a new batch.
The reading thread may run a busy loop polling the shared reference, sleeping a small amount of time each time it reads null. The best technique to make the thread sleep for a really short time is
LockSupport.parkNanos(1);
which will typically sleep for some 30 µs and the whole loop will consume about 2-3% CPU time. You could use a longer pause, of course, if you want to bring down the CPU time even more.
Note that coordination techniques which make the thread wait in a wait set impose a very large latency on both sides, so you should stay away from them if, say, 1 ms latency is a big concern for you.
The simplest approach is
a BlockingQueue without size limit (LinkedBlockingQueu) as the was of communication would prevent your main thread from 'synchronization' costs if you meant by them waiting for other thread when sending the data.
Related
While reading about Java synchronized, I just wondered, if the processing should be in synchronization, why not just creating a single thread (not main thread) and process one by one instead of creating multiple threads.
Because, by 'synchronized', all other threads will be just waiting except single running thread. It seems like the only single thread is working in the time.
Please advise me what I'm missing it.
I would very appreciate it if you could give some use cases.
I read an example, that example about accessing bank account from 2 ATM devices. but it makes me more confused, the blocking(Lock) should be done by the Database side, I think. and I think the 'synchronized' would not work in between multiple EC2 instances.
If my thinking is wrong, please fix me.
If all the code you run with several threads is within a synchronized block, then indeed it makes no difference vs. using a single thread.
However in general your code contains parts which can be run on several threads in parallel and parts which can't. The latter need synchronization but not the former. By using several threads you can speed up the "parallelisable" bits.
Let's consider the following use-case :
Your application is a internet browser game. Every player has a score and can click a button. Every time a player clicks the button, their score is increased and their opponent's is decreased. The first player to reach 10 wins.
As per the nature of the game, and to single a unique winner, you have to consider the two counters increase (and the check for the winner) atomically.
You'll have each player send clickEvents on their own thread and every event will be translated into the increase of the owner's counter, the check on whether the counter reached 10 and the decrease of the opponent's counter.
This is very easily done by synchronizing the method which handles modifying the counters : every concurrent thread will try to obtain the lock, and when they do, they'll execute the code (and finally release the lock).
The locking mechanism is pretty lightweight and only requires a single keyword of code.
If we follow your suggestion to implement another thread that will handle the execution, we'd have to implement the whole thread management logic (more code), to initialize that Thread (more resource) and even so, to guarantee fairness in the handling of events, you still need a way for your client threads to pass the event to your executor thread. The only way I see to do so, is to implement a BlockingQueue, which is also synchronized to prevent the race condition that naturally occurs when trying to add elements from two other thread.
I honnestly don't see a way to resolve this very simple use-case without synchronization (or implementing your own locking algorithm that basically does the same).
You can have a single thread and process one-by-one (and this is done), but there are considerable overheads in doing so and it does not remove the need for synchronization.
You are in a situation where you are starting with multiple threads (for example, you have lots of simultaneous web sessions). You want to do a part of the processing in a single thread - let's say updating some common structure with some new data. You need to pass the new data to the single thread - how do you get it there? You would have to use some kind of message queue (or an equivalent thing) and have the single thread pick requests off the message queue and that would have have to be synchronized anyway, plus there is the overhead of managing the queue, plus the issue that you need to get a reply back from the single thread asynchronously. So you are back to square one.
This technique is used where the processing you need to do is considerable and you don't want to block your main threads for a long time.
In summary: having a single thread does not remove the need for synchronization.
I have read many similar questions . However I was not quite satisfied with answers.
I would like to build an algorithm that would adjust the number of threads depending on the average speed.
Let's say as I introduce a new thread, the average speed of task execution increases , it means that the new thread is good. Then the algorithm should try to add another thread ... until the optimal number of threads is achieved .......
Also the algorithm should be keeping track of the average speed. If at some point the average speed goes down significantly, let's say by 10 % (for any reason e.g. i open a different application or whatever) , then the algorithm should terminate one thread and see if the speed goes up ...
Maybe such an API exists. Please, give me any directions or any code example how I could implement such an algorithm
Thank You !
I do not know self-tune system that you are describing but it sounds like not so complicated task once you are using ready thread pool. Take thread pool from concurrency package, implement class TimeConsumptionCallable implements Callable that wraps any other callable and just measures the execution time.
Now you just have to change (increase or decrease) number of working threads when average execution time increases or decreases.
Just do not forget that you need enough statistics before you decide to change number of working threads. Otherwise various random effects that do not depend on your application can cause your thread pool to grow and go down all the time that can itself kill overall performance.
newCachedThreadPool() V/s newFixedThreadPool suggests that perhaps you should be looking at ExecutorService.newCachedThreadPool().
Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available. These pools will typically improve the performance of programs that execute many short-lived asynchronous tasks. Calls to execute will reuse previously constructed threads if available. If no existing thread is available, a new thread will be created and added to the pool. Threads that have not been used for sixty seconds are terminated and removed from the cache. Thus, a pool that remains idle for long enough will not consume any resources. Note that pools with similar properties but different details (for example, timeout parameters) may be created using ThreadPoolExecutor constructors.
If your threads do not block at any time, then the maximum execution speed is reached when you have as many threads as cores, as simply more than 100% CPU usage is not possible.
In other situations it is very difficult to measure how much a new thread will increase/decrease the execution speed, as you just watch a moment in time and make assumptions based on something that could be entirely different the next second.
One idea would be to use an Executor class in combination with a Queue that you specified. So you can measure the size of the queue and make assumptions based on that. If the queue is empty, threads are idle and you can remove one. If the queue fills up, threads cannot handle the load, you need to add more. If the queue is stable, you are about right.
You can come up with your own algorithm by using existing API of java :
public void setCorePoolSize(int corePoolSize) in ThreadPoolExecutor
Sets the core number of threads. This overrides any value set in the constructor.
If the new value is smaller than the current value, excess existing threads will be terminated when they next become idle.
If larger, new threads will, if needed, be started to execute any queued tasks.
Initialization:
ExecutorService service = Executors.newFixedThreadPool(5); // initializaiton
On your need basis, resize the pool by using below API
((ThreadPoolExecutor)service).setCorePoolSize(newLimit);//newLimit is new size of the pool
And one important point: If the queue is full, and new value of number of threads is greater than or equal to maxPoolSize defined earlier, Task will be rejected.
Be careful when setting maxPoolSize so that setCorePoolSize works properly.
I have a bit of an issue with an application running multiple Java threads.
The application runs a number of working threads that peek continuously at an input queue and if there are messages in the queue they pull them out and process them.
Among those working threads there is another verification thread scheduled to perform at a fixed period a check to see if the host (on which the application runs) is still in "good shape" to run the application. This thread updates an AtomicBoolean value which in turn is verified by the working thread before they start peeking to see if the host is OK.
My problem is that in cases with high CPU load the thread responsible with the verification will take longer because it has to compete with all the other threads. If the AtomicBoolean does not get updated after a certain period it is automatically set to false, causing me a nasty bottleneck.
My initial approach was to increase the priority of the verification thread, but digging into it deeper I found that this is not a guaranteed behavior and an algorithm shouldn't rely on thread priority to function correctly.
Anyone got any alternative ideas? Thanks!
Instead of peeking into a regular queue data structure, use the java.util.concurrent package's LinkedBlockingQueue.
What you can do is, run an pool of threads (you could use executer service's fixed thread pool, i.e., a number of workers of your choice) and do LinkedBlockingQueue.take().
If a message arrives at the queue, it is fed to one of the waiting threads (yeah, take does block the thread until there is something to be fed with).
Java API Reference for Linked Blocking Queue's take method
HTH.
One old school approach to throttling rate of work, that does not use a health check thread at all (and so by-passes these problems) is to block or reject requests to add to the queue if the queue is longer than say 100. This applies dynamic back pressure on to the clients generating the load, slowing them down when the worker threads are over loaded.
This approach was added to the Java 1.5 library, see java.util.concurrent.ArrayBlockingQueue. Its put(o) method blocks if the queue is full.
Are u using Executor framework (from Java's concurrency package)? If not give it a shot. You could try using ScheduledExecutorService for the verification thread.
More threads does not mean better performance. Usually if you have dual core, 2 threads gives best performance, 3 or more starts getting worse. Quad core should handle 4 threads best, etc. So be careful how much threads you use.
You can put the other threads to sleep after they perform their work, and allow other threads to do their part. I believe Thread.yield() will pause the current thread to give time to other threads.
If you want your thread to run continuously, I would suggest creating two main threads, thread A and B. Use A for the verification thread, and from B, create the other threads. Therefore thread A gets more execution time.
Seems you need to utilize Condition variables. Peeking will take cpu cycles.
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/locks/Condition.html
From a Thread perspective, what is a block, wait and lock? Rather,is it necessary to have all these three in any operation? For example, in a producer-consumer pattern how this things are implemented.
Thanks in advance
A blocking operation is one that blocks the thread until the operation completes. Blocking a thread is the process of telling the thread scheduler (usually the operating system, although there are user-level thread libraries) not to run a thread until that thread is woken up. There are many kinds of blocking operations, and one example is file I/O. As with any other blocking operation, the method doesn't return until the relevant operation (in this case, file I/O) has completed.
A wait is a particular kind of blocking operation used for thread synchronization. Specifically, it says "please block the thread that called wait until some other thread wakes it up." In Java, wait is a method. The corresponding wake-up method is notify.
A lock is a higher-level abstraction that says "only allow a limited number of threads into this region of code." Most commonly, that limited number is 1, in which case a mutex (which I explain in plenty of detail in this SO answer) is the preferred locking primitive in a lower-level language like C. In Java, the most common locking primitive is called a monitor. There is a notion of owning an object's monitor (every object has a monitor), and waiting on a monitor, and waking up a thread that is waiting on a monitor. How do we accomplish this? You guessed it - we use the wait method to wait on a monitor, and notify to wake up one of the threads that is waiting on the monitor.
Now an answer that will probably sound a bit like Greek, given that you are just starting with concurrency: To implement the producer-consumer pattern, the most common strategy is to use two semaphores (plus a mutex to synchronize access to the buffer). A semaphore is usually implemented with a mutex, but is a higher-order construct because it allows counting some resource. So you keep one semaphore to count the number of items in the buffer, and one to count the number of empty spaces in the buffer. The producer waits on the empty space semaphore and adds items to the buffer whenever space becomes available, and the consumer waits on the items semaphore and consumes an item whenever an item becomes available.
Now I've defined what these things are, but I haven't really talked about how to use them. That, however, is worth several lectures in a college course, and is certainly too much for a StackOverflow answer. I'd recommend the concurrency lessons in the Java tutorials as a way to get started with threading. Also, look up college courses on the web. Many schools post notes publicly online, so with a little searching you can often find high-quality material.
EDIT: A description of the difference between wait and blocking I/O
Before you begin reading this, make sure you're familiar with what a thread is, and what a process is. I give an explanation in the first four paragraphs of this SO answer, and Wikipedia has a more detailed explanation (albeit with less historical context).
Each thread has one very important piece of information: an instruction pointer (there are other important pieces of information associated with each thread, but they aren't important now). The instruction pointer is a JVM-maintained pointer to the currently executing bytecode instruction. Every time you execute an instruction (each instruction is an abstract representation of a very simple operation, such as "call method foo on object x), the instruction pointer is moved forward to some "next instruction." To run your program, the JVM sets the instruction pointer to the beginning of main and keeps executing instructions and moving the instruction pointer forward until the program exits somehow.
A blocking operation stops the instruction pointer from moving forward until some event occurs to cause the instruction pointer to move forward again. Certainly the thread that initiated the blocking operation can't make this event happen, because that thread's instruction pointer isn't moving forward i.e. that thread is doing nothing.
Now, there are a lot of different kinds of blocking operations. One is blocking I/O. If you call System.out.println, for example, the println method doesn't return until the text is written out to the console. In this case, the instruction pointer stops somewhere inside System.out.println, and the operating system signals the thread to wake up whenever the console printing finishes. So the thread doesn't have to start its own instruction pointer moving again, but the method still returns just after the text is written to the console. So, at a very high level:
Thread 0 calls System.out.println("foo")
Thread 0's instruction pointer stops moving while the operating system writes "foo" to the console
When the operating system is done writing to the console, it notifies the JVM, and the JVM automatically starts moving thread 0's instruction pointer moving again. All of this happens without the programmer who writes System.out.println having to think about it.
Another completely separate kind of blocking operation is encapsulated in the Object.wait method. Whenever a thread calls Object.wait, that thread's instruction pointer stops moving, but instead of the operating system starting the movement of the instruction pointer again, another thread does the job. In this case, there is no external event that will cause the thread's instruction pointer to be restarted (as in the blocking I/O case), but there is an event internal to the program. As I said, another thread will start the instruction pointer moving again by calling Object.notify. So, at a very high level:
Thread 0 calls x.wait() on some object
Thread 0's instruction pointer stops moving
Thread 1 calls x.notify() on the same object x
Thread 0's instruction pointer starts moving again
Thread 0 and thread 1 are now executing concurrently
Notice that a lot more work has to go into writing wait/notify code correctly - the JVM and the operating system don't do all the work for you this time. They still actually do most of the work for you, but you actually have to think about calling wait and notify, and how they allow you to communicate between threads, implement locks, and more.
So there are two morals to this story. The first is that blocking I/O and wait are completely different beasts. In both cases, a thread is blocked, but in the blocking I/O case the thread is woken up automatically by the operating system, while in the wait case the thread has to rely on another thread calling notify in order to wake it up. The second is that concurrent programming is harder to reason about than serial programming. The toy examples I've put in this answer don't really do the second point justice.
No, you don't necessarily need a lock or a wait just because you're using threads. However, if you want the threads to exchange data, they are often useful.
Here's a good explanation with an example of the consumer producer model:
http://www.ase.md/~aursu/JavaThreadsSynchronization.html
Cheers!
Block : Prevent the Executing.
Wait : Suspends the current thread.
Lock : When you lock it others can't Use it.
Consider online purchase when a customer buys a Movie Ticket
As soon as he chooses the seat. Others won't be able to get those seat at the same time(Locking those seats).
We are developing a Java application with several worker threads. These threads will have to deliver a lot of computation results to our UI thread. The order in which the results are delivered does not matter.
Right now, all threads simply push their results onto a synchronized Stack - but this means that every thread must wait for the other threads before results can be delivered.
Is there a data structure that supports simultaneous insertions with each insertion completing in constant time?
Thanks,
Martin
ConcurrentLinkedQueue is designed for high contention. Producers enqueue stuff on one end and consumers collect elements at the other end, so everything will be processed in the order it's added.
ArrayBlockingQueue is a better for lower contention, with lower space overhead.
Edit: Although that's not what you asked for. Simultaneuos inserts? You may want to give every thread one output queue (say, an ArrayBlockingQueue) and then have the UI thread poll the separate queues. However, I'd think you'll find one of the two above Queue implementations sufficient.
Right now, all threads simply push
their results onto a synchronized
Stack - but this means that every
thread must wait for the other threads
before results can be delivered.
Do you have any evidence indicating that this is actually a problem? If the computation performed by those threads is even the least little bit complex (and you don't have literally millions of threads), then lock contention on the result stack is simply a non-issue because when any given thread delivers its results, all others are most likely busy doing their computations.
Take a step back and evaluate whether performance is the key design consideration here. Don't think, know: does profiling back it up?
If not, I'd say a bigger concern is clarity and readability of design, and not introducing new code to maintain. It just so happens that, if you're using Swing, there is a library for doing exactly what you're trying to do, called SwingWorker.
Take a look at java.util.concurrent.ConcurrentLinkedQueue, java.util.concurrent.ConcurrentHashMap or java.util.concurrent.ConcurrentSkipListSet. They might do what you need. ConcurrentSkipListSet, for instance, claims to have "expected average log(n) time cost for the contains, add and remove operations and their variants. Insertion, removal, and access operations safely execute concurrently by multiple threads."
Two other patterns you might want to look at are
each thread has its own collection, when polled it returns the collection and creates a new one, so the collection only holds the pending items between polls. The thread needs to protect operations on its collection, but there is no contention between threads. This is blocking (each thread cannot add to its collection while the UI thread pulls updates from it), but can reduce contention (no contention between threads).
each thread has its own collection, and appends the results to a common queue which is protected using a Lock.tryLock(). The thread continues processing if it fails to acquire the lock. This makes it less likely that a thread will block waiting for the shared queue.