I have a problem with Java threads. I must run two threads separately for a defined number of steps and then I have to make them communicate.
Thread 1 must read an ArrayList that thread 2 owns and modifies and same for thread 2.
Which is the better method to synchronize them? Can I use the arrayList of thread1 for thread 2 or must I define a different shared area?
Thank everybody.
It is much cleaner to use a dedicated synchronizer than to lock on one of the ArrayLists.
I would suggest using a CyclicBarrier. To quote the JavaDoc:
A synchronization aid that allows a set of threads to all wait for each other to reach a common barrier point. CyclicBarriers are useful in programs involving a fixed sized party of threads that must occasionally wait for each other. The barrier is called cyclic because it can be re-used after the waiting threads are released.
Since the situation is symmetric, you should not chose one ArrayList over the other. Use additional object. Exchanger looks like the best choice.
Related
While reading about Java synchronized, I just wondered, if the processing should be in synchronization, why not just creating a single thread (not main thread) and process one by one instead of creating multiple threads.
Because, by 'synchronized', all other threads will be just waiting except single running thread. It seems like the only single thread is working in the time.
Please advise me what I'm missing it.
I would very appreciate it if you could give some use cases.
I read an example, that example about accessing bank account from 2 ATM devices. but it makes me more confused, the blocking(Lock) should be done by the Database side, I think. and I think the 'synchronized' would not work in between multiple EC2 instances.
If my thinking is wrong, please fix me.
If all the code you run with several threads is within a synchronized block, then indeed it makes no difference vs. using a single thread.
However in general your code contains parts which can be run on several threads in parallel and parts which can't. The latter need synchronization but not the former. By using several threads you can speed up the "parallelisable" bits.
Let's consider the following use-case :
Your application is a internet browser game. Every player has a score and can click a button. Every time a player clicks the button, their score is increased and their opponent's is decreased. The first player to reach 10 wins.
As per the nature of the game, and to single a unique winner, you have to consider the two counters increase (and the check for the winner) atomically.
You'll have each player send clickEvents on their own thread and every event will be translated into the increase of the owner's counter, the check on whether the counter reached 10 and the decrease of the opponent's counter.
This is very easily done by synchronizing the method which handles modifying the counters : every concurrent thread will try to obtain the lock, and when they do, they'll execute the code (and finally release the lock).
The locking mechanism is pretty lightweight and only requires a single keyword of code.
If we follow your suggestion to implement another thread that will handle the execution, we'd have to implement the whole thread management logic (more code), to initialize that Thread (more resource) and even so, to guarantee fairness in the handling of events, you still need a way for your client threads to pass the event to your executor thread. The only way I see to do so, is to implement a BlockingQueue, which is also synchronized to prevent the race condition that naturally occurs when trying to add elements from two other thread.
I honnestly don't see a way to resolve this very simple use-case without synchronization (or implementing your own locking algorithm that basically does the same).
You can have a single thread and process one-by-one (and this is done), but there are considerable overheads in doing so and it does not remove the need for synchronization.
You are in a situation where you are starting with multiple threads (for example, you have lots of simultaneous web sessions). You want to do a part of the processing in a single thread - let's say updating some common structure with some new data. You need to pass the new data to the single thread - how do you get it there? You would have to use some kind of message queue (or an equivalent thing) and have the single thread pick requests off the message queue and that would have have to be synchronized anyway, plus there is the overhead of managing the queue, plus the issue that you need to get a reply back from the single thread asynchronously. So you are back to square one.
This technique is used where the processing you need to do is considerable and you don't want to block your main threads for a long time.
In summary: having a single thread does not remove the need for synchronization.
Let's say i've got a list which holds 100 threads inside it.
i want to allow only 5 threads run simultaneously until all my threads in list have done their job.
i also can't use the java threadpool for this task.
could you give me a clue how to bound the number of my running threads?
I think what you need is semaphore. Check the doc.
Briefly speaking, semaphore is a locking method. You ask the resource from the semaphore singleton class, which is process scope unique, control the running number of function process.
I got one main thread that will start up other threads. Those other threads will ask for jobs to be done, and the main thread will make jobs available for the other threads to see and do.
The job that must be done is to set indexes in the a huge boolean array to true. They are by default false, and the other threads can only set them to true, never false. The various jobs may involve setting the same indexes to true.
The main thread finds new jobs depending on two things.
The values in the huge boolean array.
Which jobs has already been done.
How do I make sure the main thread reads fresh values from the huge boolean array?
I can't have the update of the array be through a synchronized method, because that's pretty much all the other threads do, and as such I would only get a pretty much sequential performance.
Let's say the other threads update the huge boolean array by setting many of it's indexes to true through a non-synchronized function. How can I make sure the main thread reads the updates and make sure it's not just locally cached at the thread? Is there any ways to make it "push" the update? I'm guessing the main thread should just use a synchronized method to "get" the updates?
For the really complete answer to your question, you should open up a copy of the Java Language Spec, and search for "happens before".
When the JLS says that A "happens before" B, it means that in a valid implementation of the Java language, A is required to actually happen before B. The spec says things like:
If some thread updates a field, and then releases a lock (e.g.,
leaves a synchronized block), the update "happens before" the lock is
released,
If some thread releases a lock, and some other thread subsequently
acquires the same lock, the release "happens before" the acquisition.
If some thread acquires a lock, and then reads a field, the
acquisition "happens before" the read.
Since "happens before" is a transitive relationship, you can infer that if thread A updates some variables inside a synchronized block and then thread B examines the variables in a block that is synchronized on the same object, then thread B will see what thread A wrote.
Besides entering and leaving synchronized blocks, there are lots of other events (constructing objects, wait()ing/notify()ing objects, start()ing and join()ing threads, reading and writing volatile variables) that allow you to establish "happens before" relationships between threads.
It's not a quick solution to your problem, but the chapter is worth reading.
...the main thread will make jobs available for the other threads to see and do...
I can't have the update of the array be through a synchronized method, because that's pretty much all the other threads do, and ...
Sounds like you're saying that each worker thread can only do a trivial amount of work before it must wait for further instructions from the main() thread. If that's true, then most of the workers are going to be waiting most of the time. You'd probably get better performance if you just do all of the work in a single thread.
Assuming that your goal is to make the best use of available cycles a multi-processor machine, you will need to partition the work in some way that lets each worker thread go off and do a significant chunk of it before needing to synchronize with any other thread.
I'd use another design pattern. For instance, you could add to a Set the indexes of the boolean values as they're turned on, for instance, and then synchronize access to that. Then you can use wait/notify to wake up.
First of all, don't use boolean arrays in general, use BitSets. See this: boolean[] vs. BitSet: Which is more efficient?
In this case you need an atomic BitSet, so you can't use the java.util.BitSet, but here is one: AtomicBitSet implementation for java
You could instead model this as message passing rather than mutating shared state. In your description the workers never read the boolean array and only write the completion status. Have you considered using a pending job queue that workers consume from and a completion queue that the master reads? The job status fields can be efficiently maintained by the master thread without any shared state concerns. Depending on your needs, you can use either blocking or non-blocking queues.
I'm looking for the best approach to my problem:
I have a bayesian model class that you can call classify on. I want as many threads to access this as possible. However, I have another method that would alter the internal structure of the object removeCategory.
Is it possible for me to prevent threads from accessing classify only while removeCategory is being accessed?
I think I might achieve this with a semaphore, where classify acquires 1 permit per thread, while removeCategory acquires max permits. That way it'll block til all threads are done with classify, and another thread cannot start a classify call until the permits have been released.
Is there a better solution?
what you're describing is in essence a readwritelock. treat your classify() method as a reader and your removeCategory() method as a writer and you'll get your desired behavior - the class will allow an unlimited amount of readers in but a writer will block all others (including other writers)
there's an example for using this class here (and in plenty of other places)
Many methods like stop(), resume(), suspend() etc are deprecated.
So is it useful to create threads using ThreadGroup?
Using ThreadGroup can be a useful diagnostic technique in big application servers with thousands of threads. If your threads are logically grouped together, then when you get a stack trace you can see which group the offending thread was part of (e.g. "Tomcat threads", "MDB threads", "thread pool X", etc), which can be a big help in tracking down and fixing the problem.
Don't use ThreadGroup for new code. Use the Executor stuff in java.util.concurrent instead.
Somewhat complimentary to the answer provided (6 years ago or so). But, while the Concurrency API provides a lot of constructs, the ThreadGroup might still be useful to use. It provides the following functionality:
Logical organisation of your threads (for diagnostic purposes).
You can interrupt() all the threads in the group. (Interrupting is perfectly fine, unlike suspend(), resume() and stop()).
You can set the maximum priority of the threads in the group. (not sure how widely useful is that, but there you have it).
Sets the ThreadGroup as a daemon. (So all new threads added to it will be daemon threads).
It allows you to override its uncaughtExceptionHandler so that if one of the threads in the group throws an Exception, you have a callback to handle it.
It provides you some extra tools such as getting the list of threads, how many active ones you have etc. Useful when having a group of worker threads, or some thread pool of some kind.
The short answer is - no, not really. There's little if any benefit to using one.
To expand on that slightly, if you want to group worker threads together you're much better off using an ExecutorService. If you want to quickly count how many threads in a conceptual group are alive, you still need to check each Thread individually (as ThreadGroup.activeCount() is an estimation, meaning it's not useful if the correctness of your code depends on its output).
I'd go so far as to say that the only thing you'd get from one these days, aside from the semantic compartmentalisation, is that Threads constructed as part of a group will pick up the daemon flag and a sensible name based on their group. And using this as a shortcut for filling in a few primitives in a constructor call (which typically you'd only have to write once anyway, sicne you're probably starting the threads in a loop and/or method call).
So - I really don't see any compelling reason to use one at all. I specifically tried to, a few months back, and failed.
EDIT - I suppose one potential use would be if you're running with a SecurityManager, and want to assert that only threads in the same group can interrupt each other. Even that's pretty borderline, as the default implementation always returns true for a Thread in any non-system thread group. And if you're implementing your own SecurityManager, you've got the possibility to have it make its decision on any other criteria (including the typical technique of storing Threads in collections as they get created).
Great answer for #skaffman. I want to add one more advantage:
Thread groups helps manipulating all the threads which are defined in this at once.