While reading about Java's synchronized keyword, I wondered: if the processing has to be synchronized anyway, why not just create a single thread (not the main thread) and process the work one item at a time, instead of creating multiple threads?
After all, with synchronized, all other threads just wait while a single thread runs. It seems that only one thread is working at any given time.
Please tell me what I'm missing.
I would really appreciate it if you could give some use cases.
I read an example about accessing a bank account from two ATMs, but it confused me even more: I think the locking should be done on the database side, and I think synchronized would not work across multiple EC2 instances.
If my thinking is wrong, please correct me.
If all the code you run with several threads is within a synchronized block, then indeed it makes no difference vs. using a single thread.
However in general your code contains parts which can be run on several threads in parallel and parts which can't. The latter need synchronization but not the former. By using several threads you can speed up the "parallelisable" bits.
Let's consider the following use case:
Your application is an internet browser game. Every player has a score and can click a button. Every time a player clicks the button, their score is increased and their opponent's is decreased. The first player to reach 10 wins.
By the nature of the game, and to single out a unique winner, you have to treat the two counter updates (and the check for the winner) as one atomic operation.
You'll have each player send click events on their own thread, and every event translates into an increase of the owner's counter, a check on whether the counter has reached 10, and a decrease of the opponent's counter.
This is very easily done by synchronizing the method which handles modifying the counters : every concurrent thread will try to obtain the lock, and when they do, they'll execute the code (and finally release the lock).
The locking mechanism is pretty lightweight and only requires a single keyword of code.
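To make that concrete, here is a minimal sketch of the click handler, assuming two players identified by the indices 0 and 1. The names ClickGame, click and play are made up for illustration and are not from any library.

```java
// Hypothetical sketch of the click game; all names are illustrative.
class ClickGame {
    private final int[] scores = new int[2];
    private int winner = -1; // -1 until one player reaches 10

    // synchronized turns "increase, check, decrease" into one atomic step.
    public synchronized void click(int player) {
        if (winner != -1) {
            return; // the game is already decided, ignore late clicks
        }
        scores[player]++;
        if (scores[player] >= 10) {
            winner = player;
        }
        scores[1 - player]--;
    }

    public synchronized int winner() {
        return winner;
    }

    public synchronized int score(int player) {
        return scores[player];
    }

    // Lets two player threads click concurrently and returns the finished game.
    static ClickGame play(int clicksPerPlayer) {
        ClickGame game = new ClickGame();
        Thread[] players = new Thread[2];
        for (int p = 0; p < 2; p++) {
            final int player = p;
            players[p] = new Thread(() -> {
                for (int i = 0; i < clicksPerPlayer; i++) {
                    game.click(player);
                }
            });
            players[p].start();
        }
        for (Thread t : players) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return game;
    }
}
```

Whether anybody wins depends on how the threads interleave, but the two scores always add up to zero. Without the synchronized keyword, lost updates could break that invariant.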
If we follow your suggestion and implement another thread to handle the execution, we'd have to implement the whole thread management logic (more code) and initialize that thread (more resources). Even then, to guarantee fairness in the handling of events, you still need a way for your client threads to pass the events to your executor thread. The only way I see to do so is to use a BlockingQueue, which is itself synchronized to prevent the race condition that naturally occurs when two other threads try to add elements at the same time.
I honestly don't see a way to resolve this very simple use case without synchronization (or without implementing your own locking algorithm that basically does the same).
You can have a single thread and process one-by-one (and this is done), but there are considerable overheads in doing so and it does not remove the need for synchronization.
You are in a situation where you are starting with multiple threads (for example, you have lots of simultaneous web sessions). You want to do a part of the processing in a single thread, let's say updating some common structure with some new data. You need to pass the new data to the single thread, so how do you get it there? You would have to use some kind of message queue (or an equivalent) and have the single thread pick requests off that queue, and the queue would have to be synchronized anyway. On top of that there is the overhead of managing the queue, plus the issue that you need to get a reply back from the single thread asynchronously. So you are back to square one.
This technique is used where the processing you need to do is considerable and you don't want to block your main threads for a long time.
In summary: having a single thread does not remove the need for synchronization.
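As a rough illustration of that hand-off, here is a sketch that uses a single-thread ExecutorService as the "single thread" and a Future for the asynchronous reply. SingleWriterDemo, submitUpdate and run are hypothetical names. The synchronization has not gone away: it lives inside the executor's work queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: many caller threads, one thread that owns the shared list.
class SingleWriterDemo {
    // Only ever touched by the executor's single thread.
    private final List<String> shared = new ArrayList<>();
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    // Safe to call from any thread; the executor's internal queue is synchronized.
    Future<Integer> submitUpdate(String data) {
        return writer.submit(() -> {
            shared.add(data);
            return shared.size();
        });
    }

    // Simulates several "web sessions" and returns the final size of the list.
    static int run(int sessions) {
        SingleWriterDemo demo = new SingleWriterDemo();
        List<Thread> callers = new ArrayList<>();
        for (int i = 0; i < sessions; i++) {
            String data = "session-" + i;
            Thread t = new Thread(() -> demo.submitUpdate(data));
            callers.add(t);
            t.start();
        }
        try {
            for (Thread t : callers) {
                t.join();
            }
            // The reply comes back asynchronously through a Future.
            return demo.writer.submit(() -> demo.shared.size()).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            demo.writer.shutdown();
        }
    }
}
```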
I have two threads: a primary thread that does the main processing of the application, and a secondary thread that receives data batches from the primary thread and processes and outputs them, either to the user, to a file, or over the network. In general, data should be processed at a much faster rate than it is produced. I would like to ensure that the main thread never waits for the secondary thread. The secondary thread can accept any amount of overhead, expanding buffers, redoing work, and so on, with the sole objective of maximizing performance of the main thread. Ideally the main thread will never synchronize at all. Is there any way to push synchronization costs onto one thread in Java?
This is an outline of a solution:
1. The main thread works in isolation for some time, piling up data into a collection;
2. when it has generated a nice batch, it:
   i. creates a new collection for itself;
   ii. sets the filled-up collection aside, available to be picked up by the reading thread;
   iii. CASes this collection into an AtomicReference.
3. The reading thread polls this AtomicReference for updates;
4. when it notices it has been set, it picks up the batch, CASing null into the shared reference, so that the main thread knows it can put another collection in.
This has negligible coordination costs for the main thread (just one CAS operation per batch) assuming that the reference is always already null when it's time to share a new batch.
The reading thread may run a busy loop polling the shared reference, sleeping a small amount of time each time it reads null. The best technique to make the thread sleep for a really short time is
LockSupport.parkNanos(1);
which will typically sleep for some 30 µs (the exact figure depends on the OS and hardware), and the whole loop will consume about 2-3% CPU time. You could use a longer pause, of course, if you want to bring down the CPU time even more.
Note that coordination techniques which make the thread wait in a wait set impose a very large latency on both sides, so you should stay away from them if, say, 1 ms latency is a big concern for you.
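The outline above could be sketched roughly as follows. BatchHandoff, tryPublish, poll and run are hypothetical names, and the batches hold plain integers only to keep the example small. One assumption on my part: if the slot is still occupied when a batch is ready, the main thread simply keeps piling data into the same collection instead of waiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

// Hypothetical sketch of the AtomicReference hand-off described above.
class BatchHandoff {
    private final AtomicReference<List<Integer>> slot = new AtomicReference<>();

    // Main thread: a single CAS per batch; false means the slot is still occupied.
    boolean tryPublish(List<Integer> batch) {
        return slot.compareAndSet(null, batch);
    }

    // Reading thread: picks up the batch and CASes null back into the slot.
    List<Integer> poll() {
        List<Integer> batch = slot.get();
        return (batch != null && slot.compareAndSet(batch, null)) ? batch : null;
    }

    // Produces `items` values in batches and returns how many the reader received.
    static int run(int items, int batchSize) {
        BatchHandoff handoff = new BatchHandoff();
        AtomicBoolean done = new AtomicBoolean(false);
        int[] consumed = new int[1]; // written by the reader, read after join()
        Thread reader = new Thread(() -> {
            while (true) {
                List<Integer> batch = handoff.poll();
                if (batch != null) {
                    consumed[0] += batch.size();
                } else if (done.get()) {
                    // One last look: a batch may have landed just before `done`.
                    List<Integer> last = handoff.poll();
                    if (last != null) {
                        consumed[0] += last.size();
                    }
                    return;
                } else {
                    LockSupport.parkNanos(1); // short pause between polls
                }
            }
        });
        reader.start();

        List<Integer> pending = new ArrayList<>();
        for (int i = 0; i < items; i++) {
            pending.add(i);
            if (pending.size() >= batchSize && handoff.tryPublish(pending)) {
                pending = new ArrayList<>(); // fresh collection for the main thread
            }
            // Slot still occupied: keep piling up, never wait.
        }
        // Final flush: the only place where the main thread waits for the reader.
        while (!pending.isEmpty() && !handoff.tryPublish(pending)) {
            LockSupport.parkNanos(1);
        }
        done.set(true);
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return consumed[0];
    }
}
```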
The simplest approach is to use a BlockingQueue without a size limit (LinkedBlockingQueue) as the means of communication. It spares your main thread the 'synchronization' costs, if by that you mean waiting for the other thread when sending the data.
I have a bit of an issue with an application running multiple Java threads.
The application runs a number of working threads that peek continuously at an input queue and if there are messages in the queue they pull them out and process them.
Alongside those working threads there is a verification thread, scheduled to check at a fixed period whether the host (on which the application runs) is still in "good shape" to run the application. This thread updates an AtomicBoolean value, which the working threads check before they start peeking, to see if the host is OK.
My problem is that under high CPU load the thread responsible for the verification takes longer, because it has to compete with all the other threads. If the AtomicBoolean does not get updated within a certain period, it is automatically set to false, causing me a nasty bottleneck.
My initial approach was to increase the priority of the verification thread, but digging into it deeper I found that this is not a guaranteed behavior and an algorithm shouldn't rely on thread priority to function correctly.
Anyone got any alternative ideas? Thanks!
Instead of peeking into a regular queue data structure, use the java.util.concurrent package's LinkedBlockingQueue.
What you can do is run a pool of threads (you could use the executor service's fixed thread pool, i.e., a number of workers of your choice) and have each worker call LinkedBlockingQueue.take().
If a message arrives at the queue, it is fed to one of the waiting threads (yeah, take does block the thread until there is something to be fed with).
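A small sketch of that setup, assuming integer messages and a worker body that only counts what it receives. TakeWorkers and run are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a fixed pool of workers blocking on take().
class TakeWorkers {
    // Feeds `messages` items to `workers` threads and returns how many were processed.
    static int run(int workers, int messages) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        AtomicInteger processed = new AtomicInteger();
        CountDownLatch allDone = new CountDownLatch(messages);

        for (int w = 0; w < workers; w++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        queue.take(); // blocks until a message arrives, no busy peeking
                        processed.incrementAndGet();
                        allDone.countDown();
                    }
                } catch (InterruptedException e) {
                    // shutdownNow() interrupts the worker and ends the loop
                    Thread.currentThread().interrupt();
                }
            });
        }

        for (int m = 0; m < messages; m++) {
            queue.add(m);
        }
        try {
            allDone.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow();
        return processed.get();
    }
}
```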
Java API Reference for Linked Blocking Queue's take method
HTH.
One old school approach to throttling rate of work, that does not use a health check thread at all (and so by-passes these problems) is to block or reject requests to add to the queue if the queue is longer than say 100. This applies dynamic back pressure on to the clients generating the load, slowing them down when the worker threads are over loaded.
This approach was added to the Java 1.5 library, see java.util.concurrent.ArrayBlockingQueue. Its put(o) method blocks if the queue is full.
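Here is a minimal sketch of the rejecting variant, using offer(), which returns false instead of blocking when the queue is full. BackPressureDemo and rejectedCount are hypothetical names, and no consumer is running, so the queue simply fills up.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a bounded queue that pushes back on producers.
class BackPressureDemo {
    // Offers `requests` items to a queue of the given capacity (nobody consumes)
    // and returns how many of them were rejected.
    static int rejectedCount(int capacity, int requests) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            // offer() fails fast when full; put() would block the caller instead.
            if (!queue.offer(i)) {
                rejected++;
            }
        }
        return rejected;
    }
}
```

With a capacity of 100 and 130 requests, the last 30 are rejected. Using put() instead would make the producing threads wait until the workers catch up.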
Are you using the Executor framework (from Java's concurrency package)? If not, give it a shot. You could try using a ScheduledExecutorService for the verification thread.
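Something along these lines, as a sketch only. HealthCheckDemo, checkHost and the 100 ms period are assumptions for illustration, and checkHost is a placeholder for whatever your real verification does. Note that this only tidies up the scheduling; by itself it does not guarantee that the check runs on time when the CPU is saturated.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: a periodic health check on its own scheduler thread.
class HealthCheckDemo {
    // Placeholder for the real host verification.
    static boolean checkHost() {
        return Runtime.getRuntime().availableProcessors() > 0;
    }

    // Starts the periodic check, waits for the first result and returns the flag.
    static boolean run() {
        AtomicBoolean hostOk = new AtomicBoolean(false);
        CountDownLatch firstCheck = new CountDownLatch(1);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            hostOk.set(checkHost()); // the flag the working threads would read
            firstCheck.countDown();
        }, 0, 100, TimeUnit.MILLISECONDS);
        try {
            firstCheck.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return hostOk.get();
    }
}
```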
More threads do not mean better performance. For CPU-bound work, a dual core usually performs best with 2 threads, and 3 or more start making things worse; a quad core should handle 4 threads best, and so on. So be careful how many threads you use.
You can put the other threads to sleep after they perform their work, to let other threads do their part. Thread.yield() hints to the scheduler that the current thread is willing to give up the CPU, although the scheduler is free to ignore the hint.
If you want your thread to run continuously, I would suggest creating two main threads, thread A and B. Use A for the verification thread, and from B, create the other threads. Therefore thread A gets more execution time.
Seems you need to utilize Condition variables. Peeking will take cpu cycles.
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/locks/Condition.html
This is homework.
I do not want the solution, just a small number of links or ideas.
Simply speaking what I want to do is,
Simple example :
public class Example
{
    public void method()
    {
        int x = doThat();
        // Call other methods which do not depend on x
        return;
    }
}
doThat() is a method that is known to be time consuming, so my program blocks until its result is back. I want to use other methods of this object in the meantime, but the program is frozen until doThat() is finished. Those other methods do not necessarily have to be invoked from the method() used in this example; they may be called from outside the object.
I thought about using threads, but if I have a huge number of objects (1000+) this probably won't be very efficient (please correct me if I am wrong). I guess if I use threads I have to use one thread per object?
Is there any other way besides threads to keep the invoking object from blocking when calling doThat()? If threading is the only way, could you provide a link?
I know questions like this get downvoted, and I will accept any downvotes. But please, just a link would be more than great.
Thanks in advance. I hope the question is in line with the rules.
I'd also use threads for this, but I simply wanted to add that it would probably be interesting to look at java.util.concurrent.Executors (to create thread pools as you have a number of objects) and the java.util.concurrent.Future and java.util.concurrent.Callable classes which will allow you to launch threads that can return a value.
Take a look at the concurrency tutorial for more info.
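Just to give an idea of the shape, here is a minimal sketch with a fixed thread pool, so that 1000 objects do not need 1000 threads. DoThatAsync and run are hypothetical names, and this doThat (with its input parameter and its sleep) is only a stand-in for the slow method in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: running a slow method on a small pool and collecting results.
class DoThatAsync {
    // Stand-in for the slow doThat(); the sleep only simulates the work.
    static int doThat(int input) {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return input * 2;
    }

    // Submits one task per object and returns the sum of all results.
    static int run(int objects, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < objects; i++) {
                final int input = i;
                Callable<Integer> task = () -> doThat(input);
                futures.add(pool.submit(task)); // returns immediately
            }
            // The caller is free to do other work here.
            int sum = 0;
            for (Future<Integer> f : futures) {
                sum += f.get(); // blocks only when the result is actually needed
            }
            return sum;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```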
I recommend that you create a class that implements Runnable, whose run method does what doThat() does in your sample. Then you can easily execute it in a separate Thread: Java's Thread class has a constructor that takes a Runnable. Use the start and join methods (calling run directly would execute the code on the calling thread).
Cheers
Matthias
Of course threads are the only solution to handle some jobs in the background, but
you are not forced to use a thread just for a single operation to be performed.
You can use only one thread that maintains a queue of operations to be performed, in a way that every call to the method doThat adds a new entry into the queue.
Maybe some design patterns like "Strategy" can help you to generalize the concept of operation to be performed, in order to store "operation objects" into the thread's queue.
You want to perform several things concurrently, so using threads is indeed the way to go. The Java tutorial concurrency lesson will probably help you.
1000 concurrent threads will impose a heavy memory load, because a certain amount of stack memory is reserved for each thread (the default depends on the platform, commonly around 1 MB on 64-bit JVMs, and can be changed with -Xss). If, however, you can somehow make sure there will be only one thread running at a time, you can still take the thread-per-object approach. This would require you to make sure that doThat() is only called once the thread produced by a former invocation on another object has finished.
If you cannot ensure that easily, the other approach would be to construct one worker thread which reads from a double ended queue which object to work on. The doThat() method would then just add this to the end of the queue, from which the worker thread will later extract it. You have to properly synchronize when accessing any data structure from concurrent threads. And the main thread should somehow notify the worker thread of the condition, that it will not add any more objects to the queue, so the worker thread can cleanly terminate.
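A sketch of that worker thread could look like this. For simplicity it assumes a plain FIFO LinkedBlockingQueue instead of a double-ended queue, and it uses a special STOP marker as the "no more objects" notification. QueueWorker and its methods are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one worker thread draining a queue of jobs.
class QueueWorker {
    // Marker job telling the worker that nothing more will be added.
    private static final Runnable STOP = () -> { };

    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread worker = new Thread(this::drain);

    void start() {
        worker.start();
    }

    // What doThat() would do: just add the job to the end of the queue.
    void enqueue(Runnable job) {
        queue.add(job);
    }

    // Signals the end of input and waits for the worker to terminate cleanly.
    void stopAndWait() throws InterruptedException {
        queue.add(STOP);
        worker.join();
    }

    private void drain() {
        try {
            while (true) {
                Runnable job = queue.take(); // the queue does the synchronization
                if (job == STOP) {
                    return;
                }
                job.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Enqueues `jobs` counting jobs and returns how many were executed.
    static int run(int jobs) {
        QueueWorker w = new QueueWorker();
        w.start();
        AtomicInteger executed = new AtomicInteger();
        for (int i = 0; i < jobs; i++) {
            w.enqueue(executed::incrementAndGet);
        }
        try {
            w.stopAndWait();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return executed.get();
    }
}
```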
We are developing a Java application with several worker threads. These threads will have to deliver a lot of computation results to our UI thread. The order in which the results are delivered does not matter.
Right now, all threads simply push their results onto a synchronized Stack - but this means that every thread must wait for the other threads before results can be delivered.
Is there a data structure that supports simultaneous insertions with each insertion completing in constant time?
Thanks,
Martin
ConcurrentLinkedQueue is designed for high contention. Producers enqueue stuff on one end and consumers collect elements at the other end, so everything will be processed in the order it's added.
ArrayBlockingQueue is a better fit for lower contention, with lower space overhead.
Edit: Although that's not what you asked for. Simultaneous inserts? You may want to give every thread its own output queue (say, an ArrayBlockingQueue) and then have the UI thread poll the separate queues. However, I'd think you'll find one of the two queue implementations above sufficient.
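For the ConcurrentLinkedQueue variant, a minimal sketch might look like this. ResultQueueDemo and run are hypothetical names, and the draining loop at the end stands in for whatever the UI thread does with the results.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: several workers delivering results without a shared lock.
class ResultQueueDemo {
    // Each worker offers `resultsPerWorker` results; returns how many were drained.
    static int run(int workers, int resultsPerWorker) {
        ConcurrentLinkedQueue<Integer> results = new ConcurrentLinkedQueue<>();
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            threads[w] = new Thread(() -> {
                for (int r = 0; r < resultsPerWorker; r++) {
                    results.offer(r); // non-blocking insert, never waits for other threads
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        // Stand-in for the UI thread: drain whatever has been delivered.
        int count = 0;
        while (results.poll() != null) {
            count++;
        }
        return count;
    }
}
```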
"Right now, all threads simply push their results onto a synchronized Stack - but this means that every thread must wait for the other threads before results can be delivered."
Do you have any evidence indicating that this is actually a problem? If the computation performed by those threads is even the least little bit complex (and you don't have literally millions of threads), then lock contention on the result stack is simply a non-issue because when any given thread delivers its results, all others are most likely busy doing their computations.
Take a step back and evaluate whether performance is the key design consideration here. Don't think, know: does profiling back it up?
If not, I'd say a bigger concern is clarity and readability of design, and not introducing new code to maintain. It just so happens that, if you're using Swing, there is a library for doing exactly what you're trying to do, called SwingWorker.
Take a look at java.util.concurrent.ConcurrentLinkedQueue, java.util.concurrent.ConcurrentHashMap or java.util.concurrent.ConcurrentSkipListSet. They might do what you need. ConcurrentSkipListSet, for instance, claims to have "expected average log(n) time cost for the contains, add and remove operations and their variants. Insertion, removal, and access operations safely execute concurrently by multiple threads."
Two other patterns you might want to look at are
each thread has its own collection, when polled it returns the collection and creates a new one, so the collection only holds the pending items between polls. The thread needs to protect operations on its collection, but there is no contention between threads. This is blocking (each thread cannot add to its collection while the UI thread pulls updates from it), but can reduce contention (no contention between threads).
each thread has its own collection, and appends the results to a common queue which is protected using a Lock.tryLock(). The thread continues processing if it fails to acquire the lock. This makes it less likely that a thread will block waiting for the shared queue.
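The second pattern could be sketched like this, assuming a ReentrantLock and integer results. TryLockPublisher, tryFlush, flush and run are hypothetical names. Each worker keeps a private buffer and only moves it into the shared list when the lock happens to be free.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: workers flush private buffers with tryLock().
class TryLockPublisher {
    private final List<Integer> shared = new ArrayList<>();
    private final ReentrantLock lock = new ReentrantLock();

    // Moves the local buffer into the shared list if the lock is free right now.
    boolean tryFlush(List<Integer> local) {
        if (!lock.tryLock()) {
            return false; // busy: keep computing and retry later
        }
        try {
            shared.addAll(local);
            local.clear();
            return true;
        } finally {
            lock.unlock();
        }
    }

    // Blocking flush, used once at the end so nothing is left behind.
    void flush(List<Integer> local) {
        lock.lock();
        try {
            shared.addAll(local);
            local.clear();
        } finally {
            lock.unlock();
        }
    }

    int size() {
        lock.lock();
        try {
            return shared.size();
        } finally {
            lock.unlock();
        }
    }

    // Runs the workers and returns the number of results in the shared list.
    static int run(int workers, int resultsPerWorker) {
        TryLockPublisher publisher = new TryLockPublisher();
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            threads[w] = new Thread(() -> {
                List<Integer> local = new ArrayList<>();
                for (int r = 0; r < resultsPerWorker; r++) {
                    local.add(r);
                    publisher.tryFlush(local); // never blocks
                }
                publisher.flush(local);
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return publisher.size();
    }
}
```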
I have a producer app that generates an index (stores it in some in-memory tree data structure). And a consumer app will use the index to search for partial matches.
I don't want the consumer UI to have to block (e.g. via some progress bar) while the producer is indexing the data. Basically if the user wishes to use the partial index, it will just do so. In this case, the producer will potentially have to stop indexing for a while until the user goes away to another screen.
Roughly, I know I will need the wait/notify protocol to achieve this. My question: is it possible to interrupt the producer thread using wait/notify while it is doing its business ? What java.util.concurrent primitives do I need to achieve this ?
The way you've described this, there's no reason that you need wait/notify. Simply synchronize access to your data structure, to ensure that it is in a consistent state when accessed.
Edit: by "synchronize access", I do not mean synchronize the entire data structure (which would end up blocking either producer or consumer). Instead, synchronize only those bits that are being updated, and only at the time that you update them. You'll find that most of the producer's work can take place in an unsynchronized manner: for example, if you're building a tree, you can identify the node where the insert needs to happen, synchronize on that node, do the insert, then continue on.
In your producer thread, you are likely to have some kind of main loop. This is probably the best place to interrupt your producer. Instead of using wait() and notify(), I suggest you use the Java synchronization objects introduced in Java 5.
You could potentially do something like this:
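import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class Indexer {
    // Pass true to the constructor for a fair lock if lookups get starved.
    private final Lock lock = new ReentrantLock();

    public void index() {
        while (somecondition) {
            lock.lock();
            try {
                // perform one indexing step
            } finally {
                lock.unlock(); // gives a waiting lookup() a chance to run
            }
        }
    }

    public Item lookup() {
        lock.lock();
        try {
            // perform your lookup
            return null; // placeholder: return the item that was found
        } finally {
            lock.unlock();
        }
    }
}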
You need to make sure that each time the indexer releases the lock, your index is in a consistent, legal state. In this scenario, when the indexer releases the lock, it leaves a chance for a new or waiting lookup() operation to take the lock, complete and release the lock, at which point your indexer can proceed to its next step. If no lookup() is currently waiting, then your indexer just reacquires the lock itself and goes on with its next operation.
If you think you might have more than one thread trying to do the lookup at the same time, you might want to have a look at the ReadWriteLock interface and the ReentrantReadWriteLock implementation.
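As a sketch of the read/write variant, assuming the index is a sorted set of strings and a lookup is a prefix match (both assumptions, as are the names RwIndex, add and lookup): several lookups can hold the read lock at the same time, while an insert takes the write lock exclusively.

```java
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: an index guarded by a read/write lock.
class RwIndex {
    private final TreeSet<String> index = new TreeSet<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // One indexing step: exclusive access while the tree is modified.
    void add(String term) {
        lock.writeLock().lock();
        try {
            index.add(term);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Partial match: the first term that starts with the prefix, or null.
    String lookup(String prefix) {
        lock.readLock().lock(); // shared: many lookups may run in parallel
        try {
            String candidate = index.ceiling(prefix);
            return (candidate != null && candidate.startsWith(prefix)) ? candidate : null;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```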
Of course this solution is the simple way to do it. It will block either one of the threads that doesn't have the lock. You may want to check if you can just synchronize on your data structure directly, but that might prove tricky since building indexes tends to use some sort of balanced tree or B-Tree or whatnot where node insertion is far from being trivial.
I suggest you first try that simple approach, then see if the way it behaves suits you. If it doesn't, you may either try breaking up the indexing steps into smaller steps, or try synchronizing on only parts of your data structure.
Don't worry too much about the performance of locking: in Java, uncontended locking (when only one thread is trying to take the lock) is cheap. As long as most of your locking is uncontended, locking performance is nothing to be concerned about.
The producer application can have two indices: published and in-work. The producer works only with the in-work index, and the consumer works only with the published one. Once the producer is done indexing, it can replace the published index with the in-work one (usually by swapping one pointer). The producer may also publish a copy of the partial index if that brings value. This way you avoid long-term locks, which is useful when the index is accessed by lots of consumers.
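A minimal sketch of the two-index idea, assuming the index is a sorted set of strings and that publishing means copying the in-work index and swapping a volatile reference. PublishedIndex and its method names are hypothetical.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical sketch: the producer owns inWork, consumers only see published.
class PublishedIndex {
    // Swapped as a whole; a published snapshot is never modified afterwards.
    private volatile NavigableSet<String> published = new TreeSet<>();
    // Touched by the producer thread only, so it needs no lock.
    private final NavigableSet<String> inWork = new TreeSet<>();

    // Producer thread: one indexing step.
    void index(String term) {
        inWork.add(term);
    }

    // Producer thread: publish a copy of the (possibly partial) index.
    void publish() {
        published = new TreeSet<>(inWork); // the "pointer swap"
    }

    // Consumer threads: lock-free read of the latest snapshot.
    boolean contains(String term) {
        return published.contains(term);
    }
}
```

The copy costs time and memory proportional to the size of the index, so how often to publish is a trade-off the sketch leaves open.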
No, that's not possible.
The only way of notifying a thread without any explicit code in the thread itself is to use Thread.interrupt(), which sets the thread's interrupt flag and makes blocking calls such as wait(), sleep() or join() throw an InterruptedException. interrupt() is usually not very reliable though, because handling an exception that can surface at any blocking call is a nightmare to get right in all code paths. Besides that, a single try{}catch(Throwable){} somewhere in the thread (including any libraries that you use) could be enough to swallow the signal.
In most cases, the only correct solution is to use a shared flag or a queue that the consumer can use to pass messages to the producer. If you worry about the producer being unresponsive or freezing, run it in a separate thread and require it to send heartbeat messages every n seconds. If it does not send a heartbeat, kill it. (Note that determining whether a producer is actually frozen, and not just waiting for an external event, is often very hard to get right as well.)
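The shared-flag variant could be sketched as follows, assuming the producer's work can be cut into small steps. PausableProducer and its methods are hypothetical names. The consumer sets the flag, and the producer checks it between steps.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

// Hypothetical sketch: a producer that pauses cooperatively via a shared flag.
class PausableProducer {
    private final AtomicBoolean paused = new AtomicBoolean(false);
    private final AtomicInteger completedSteps = new AtomicInteger();

    // Called by the consumer (for example the UI) to pause or resume indexing.
    void setPaused(boolean value) {
        paused.set(value);
    }

    // One unit of indexing; does nothing and returns false while paused.
    boolean step() {
        if (paused.get()) {
            return false;
        }
        completedSteps.incrementAndGet(); // stand-in for the real indexing work
        return true;
    }

    // Producer main loop: checks the flag between steps and backs off while paused.
    void runLoop(int totalSteps) {
        while (completedSteps.get() < totalSteps) {
            if (!step()) {
                LockSupport.parkNanos(1_000_000L); // about 1 ms before checking again
            }
        }
    }

    int completedSteps() {
        return completedSteps.get();
    }
}
```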