Is this java code thread-safe?

I am planning to use this schema in my application, but I was not sure whether this is safe.
To give a little background, a bunch of servers will compute the results of sub-tasks that belong to a single task and report them back to the central server. This piece of code is used to register the results, and also to check whether all the subtasks for the task have completed and, if so, report that fact only once.
The important point is that each task must be reported once and only once, as soon as it is completed (all subTaskResults are set).
Can anybody help? Thank you! (Also, if you have a better idea to solve this problem, please let me know!)
*Note that I simplified the code for brevity.
Solution I
class Task {
    // Populated with a bunch of (Id, new AtomicReference()) pairs.
    // The actual app uses a read-only HashMap.
    Map<Id, AtomicReference<SubTaskResult>> subtasks = populatedMap();
    Semaphore permission = new Semaphore(1);
    public Task set(Id id, SubTaskResult result) {
        // null check omitted
        subtasks.get(id).set(result);
        return check() ? this : null;
    }
    private boolean check() {
        // Scan every slot; any missing result means the task is not complete yet.
        for (AtomicReference<SubTaskResult> ref : subtasks.values()) {
            if (ref.get() == null) {
                return false;
            }
        }
        // Only the first thread to get here obtains the single permit.
        return permission.tryAcquire();
    }
}
Stephen C kindly suggested to use a counter. Actually, I have considered that once, but I reasoned that the JVM could reorder the operations and thus, a thread can observe a decremented counter (by another thread) before the result is set in AtomicReference (by that other thread).
*EDIT: I now see this is thread safe. I'll go with this solution. Thanks, Stephen!
Solution II
class Task {
    // Populated with a bunch of (Id, new AtomicReference()) pairs.
    // The actual app uses a read-only HashMap.
    Map<Id, AtomicReference<SubTaskResult>> subtasks = populatedMap();
    AtomicInteger counter = new AtomicInteger(subtasks.size());
    public Task set(Id id, SubTaskResult result) {
        // null check omitted
        subtasks.get(id).set(result);
        // In the actual app: if (!compareAndSet(null, result)) return null;
        return check() ? this : null;
    }
    private boolean check() {
        // Exactly one thread observes the transition to zero.
        return counter.decrementAndGet() == 0;
    }
}

I assume that your use-case is that there are multiple threads calling set, but for any given value of id, the set method will be called once only. I'm also assuming that populatedMap creates the entries for all used id values, and that subtasks and permission are really private.
If so, I think that the code is thread-safe.
Each thread should see the initialized state of the subtasks Map, complete with all keys and all AtomicReference references. This state never changes, so subtasks.get(id) will always give the right reference. The set(result) call operates on an AtomicReference, so the subsequent get() method calls in check() will give the most up-to-date values ... in all threads. Any potential races with multiple threads calling check seem to sort themselves out.
However, this is a rather complicated solution. A simpler solution would be to use a concurrent counter; e.g. replace the Semaphore with an AtomicInteger and use decrementAndGet instead of repeatedly scanning the subtasks map in check.
In response to this comment in the updated solution:
Actually, I have considered that once,
but I reasoned that the JVM could
reorder the operations and thus, a
thread can observe a decremented
counter (by another thread) before the
result is set in AtomicReference (by
that other thread).
The AtomicInteger and AtomicReference by definition are atomic. Any thread that tries to access one is guaranteed to see the "current" value at the time of the access.
In this particular case, each thread calls set on the relevant AtomicReference before it calls decrementAndGet on the AtomicInteger. This cannot be reordered. Actions performed by a thread take effect in program order, and since both operations have volatile memory semantics, their effects will be visible to other threads in that order as well.
In other words, it should be thread-safe ... AFAIK.
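To make the counter-based approach concrete, here is a minimal sketch, not the asker's actual code. The class name CountingTask, the Long ids and the String results are placeholders chosen for illustration. It also folds in the compareAndSet guard mentioned in the question's comment, so a duplicate report for the same id does not decrement the counter twice.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for Task: Long ids and String results are assumptions.
final class CountingTask {
    private final Map<Long, AtomicReference<String>> subtasks;
    private final AtomicInteger remaining;

    CountingTask(List<Long> ids) {
        Map<Long, AtomicReference<String>> slots = new HashMap<>();
        for (Long id : ids) {
            slots.put(id, new AtomicReference<>());
        }
        // The map is never modified after construction and is reached through a final field.
        this.subtasks = Collections.unmodifiableMap(slots);
        this.remaining = new AtomicInteger(slots.size());
    }

    /** Returns true for exactly one call: the one that stores the last missing result. */
    boolean set(long id, String result) {
        Objects.requireNonNull(result, "result");
        AtomicReference<String> slot = subtasks.get(id);
        if (slot == null) {
            throw new IllegalArgumentException("unknown subtask id " + id);
        }
        if (!slot.compareAndSet(null, result)) {
            return false; // duplicate report: keep the first result, do not count again
        }
        // The result is stored before the decrement, so whoever sees zero sees all results.
        return remaining.decrementAndGet() == 0;
    }

    String get(long id) {
        return subtasks.get(id).get();
    }
}
```

Because the decrement only happens after a successful compareAndSet, the counter can reach zero only once, and only after every slot has been filled.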

The atomicity guaranteed (per class documentation) explicitly for AtomicReference.compareAndSet extends to set and get methods (per package documentation), so in that regard your code appears to be thread-safe.
I am not sure, however, why you have Semaphore.tryAcquire as a side-effect there, but without complementary code to release the semaphore, that part of your code looks wrong.

The second solution does provide a thread-safe latch, but it's vulnerable to calls to set() that provide an ID that's not in the map -- which would trigger a NullPointerException -- or more than one call to set() with the same ID. The latter would mistakenly decrement the counter too many times and falsely report completion when there are presumably other subtask IDs for which no result has been submitted. My criticism isn't with regard to the thread safety, but rather to the invariant maintenance; the same flaw would be present even without the thread-related concern.
Another way to solve this problem is with AbstractQueuedSynchronizer, but it's somewhat gratuitous: you can implement a stripped-down counting semaphore, where each call to set() would call releaseShared(), decrementing the counter via a spin on compareAndSetState(), and tryAcquireShared() would only succeed when the count is zero. That's more or less what you implemented above with the AtomicInteger, but you'd be reusing a facility that offers more capabilities you can use for other portions of your design.
To flesh out the AbstractQueuedSynchronizer-based solution requires adding one more operation to justify the complexity: being able to wait on the results from all the subtasks to come back, such that the entire task is complete. That's Task#awaitCompletion() and Task#awaitCompletion(long, TimeUnit) in the code below.
Again, it's possibly overkill, but I'll share it for the purpose of discussion.
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.AbstractQueuedSynchronizer;
final class Task
{
private static final class Sync extends AbstractQueuedSynchronizer
{
public Sync(int count)
{
setState(count);
}
@Override
protected int tryAcquireShared(int ignored)
{
return 0 == getState() ? 1 : -1;
}
@Override
protected boolean tryReleaseShared(int ignored)
{
int current;
do
{
current = getState();
if (0 == current)
return true;
}
while (!compareAndSetState(current, current - 1));
return 1 == current;
}
}
public Task(int count)
{
if (count < 0)
throw new IllegalArgumentException();
sync_ = new Sync(count);
}
public boolean set(int id, Object result)
{
// Ensure that "id" refers to an incomplete task. Doing so requires
// additional synchronization over the structure mapping subtask
// identifiers to results.
// Store result somehow.
return sync_.releaseShared(1);
}
public void awaitCompletion()
throws InterruptedException
{
sync_.acquireSharedInterruptibly(0);
}
public boolean awaitCompletion(long time, TimeUnit unit)
throws InterruptedException
{
// Returns false if the timeout elapsed before all subtasks completed.
return sync_.tryAcquireSharedNanos(0, unit.toNanos(time));
}
private final Sync sync_;
}

I have a weird feeling reading your example program, but it depends on the larger structure of your program what to do about that. A set function that also checks for completion is almost a code smell. :-) Just a few ideas.
If you have synchronous communication with your servers you might use an ExecutorService with as many threads as there are servers doing the communication. From this you get a bunch of Futures, and you can naturally proceed with your calculation - the get calls will block at the moment the result is needed but not yet there.
If you have asynchronous communication with the servers you might also use a CountDownLatch after submitting the task to the servers. The await call blocks the main thread until the completion of all subtasks, and other threads can receive the results and call countDown on each received result.
With all these methods you don't need special threadsafety measures other than that the concurrent storing of the results in your structure is threadsafe. And I bet there are even better patterns for this.
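The CountDownLatch idea could look roughly like the sketch below. The class name LatchedResults, the Long ids and the String results are made up for illustration, and it assumes callers only report ids that belong to the task. Results go into a ConcurrentHashMap, and putIfAbsent ensures the latch is counted down at most once per id.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical collector; assumes every reported id is a valid subtask of this task.
final class LatchedResults {
    private final ConcurrentHashMap<Long, String> results = new ConcurrentHashMap<>();
    private final CountDownLatch pending;

    LatchedResults(int subtaskCount) {
        this.pending = new CountDownLatch(subtaskCount);
    }

    /** Called by receiver threads; a repeated id is ignored. */
    void report(long id, String result) {
        if (results.putIfAbsent(id, result) == null) {
            pending.countDown();
        }
    }

    /** Blocks until every subtask has reported or the timeout elapses. */
    boolean awaitAll(long timeout, TimeUnit unit) throws InterruptedException {
        return pending.await(timeout, unit);
    }

    boolean isComplete() {
        return pending.getCount() == 0;
    }

    String get(long id) {
        return results.get(id);
    }
}
```

The main thread would call awaitAll after dispatching the subtasks, while the threads receiving server responses call report.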

Related

Which synchronize statements are unnecessary here?

First the code fragments...
final class AddedOrders {
private final Set<Order> orders = Sets.newConcurrentHashSet();
private final Set<String> ignoredItems = Sets.newConcurrentHashSet();
private boolean added = false;
public synchronized void clear() {
added = false;
}
public synchronized void add(Order order) {
added = orders.add(order);
}
public synchronized void remove(Order order) {
if (added) orders.remove(order);
}
public synchronized void ban(String item) {
ignoredItems.add(item);
}
public synchronized boolean has(Order order) {
return orders.contains(order);
}
public synchronized Set<Order> getOrders() {
return orders;
}
public synchronized boolean ignored(String item) {
return ignoredItems.contains(item);
}
}
private final AddedOrders added = new AddedOrders();
...
boolean subscribed;
int i = 10;
synchronized (added) {
while (!(subscribed = client.getSubscribedOrders().containsAll(added.getOrders())) && (i>0)) {
Helper.out("...order not subscribed yet (try: %d)", i);
Thread.sleep(200);
i--;
}
}
What I'd like to know...
Could someone point out which synchronized are not necessary?
Of course this is not the full code, but assume that in the full project all methods are called, and that some combinations of methods are called in a check-the-value-first, then-modify style
added (an instance of the class) is accessed by multiple threads
client is part of an external Server API, that I'm not entirely sure if it is Thread-Safe yet but I think it must be
ConcurrentHashSet is a google guava Class but it is based on ConcurrentHashMap apparently and the docs say it carries all the same concurrency guarantees.
But I don't really understand completely what those guarantees all are, even though I did some reading. Namely I know it's not ok to just check and set a value in a synchronized HashMap (without synchronizing on the synchronized Map using a synchronized block), however I do not know if you can do that in ConcurrentHashMap or not (without synchronizing on the ConcurrentHashMap using a synchronized block).

The only cases in your code where you really need synchronized are the ones where you test or update the added flag. You need the synchronized block to make sure that changes to the flag are visible across threads, and you also need to make sure that the added flag change is made in step with the change to the orders data structure. The synchronized keyword keeps another thread from barging in and doing something in between checking the flag and changing the data structure (the remove method could be broken like this if you remove the synchronization).
The code toward the end seems problematic: you lock on the added object and then hold the lock for the whole loop, so no other thread gets an opportunity to make the changes through added that this thread might be waiting for. It looks like you're actually waiting for another object (client) to change, though, so this criticism may be invalid. Sleeping with a lock held seems dangerous all the same. This kind of thing is why Object#wait releases the lock while it waits.
Also note that since you're handing out a reference to the orders set, code outside this class can add orders. You should do something to protect this internal data, such as returning it wrapped in an unmodifiable set (or copied into an immutable one) so callers can't make changes.
In general synchronization is used when you want to impose some granularity on changes, where you have 2 or more changes you want made together, without possibility of interleaving. An example is a check-then-act sequence where you execute some code that makes a change based on the value of something else, and you don't want some other thread to execute in between the check and the action (so the decision to act could be made, then the condition that allowed that action changes, so that the action could be invalid). If individual values are changed but they are unrelated, then you can make them volatile or use Atomic variables, and reduce the amount of locking you have to do.
It's a valid point that the synchronized keyword could be removed in cases like the clear method, where the only thing that changes is the added flag, which could be made volatile. The purpose of the added flag continues to elude me. Any add of a value that's already present turns the flag back to false, so it's not apparent that reasoning about any action based on the current value of the flag makes sense while this structure is being modified concurrently.
Without knowing the exact context it's hard to say, but in general, classes created without considering their being used with multiple threads probably need to be reworked extensively before being used in a concurrent environment.
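On the question of check-then-set with a concurrent collection: a single method call such as add or remove on a ConcurrentHashMap-backed set is atomic and reports whether it changed anything, so its return value can replace a separate contains check. A sequence of two calls is still not atomic. The sketch below illustrates this with hypothetical names (OrderBook, String orders). It uses the JDK's ConcurrentHashMap.newKeySet() (Java 8+) as a stand-in for Guava's Sets.newConcurrentHashSet(), and hands out a read-only view.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical example; String stands in for the Order type.
final class OrderBook {
    private final Set<String> orders = ConcurrentHashMap.newKeySet();

    /** Atomic check-and-add: true only for the caller that actually inserted the order. */
    boolean addIfAbsent(String order) {
        return orders.add(order);
    }

    /** Atomic check-and-remove: true only for the caller that actually removed the order. */
    boolean removeIfPresent(String order) {
        return orders.remove(order);
    }

    /** Read-only live view, so callers cannot modify the internal set. */
    Set<String> view() {
        return Collections.unmodifiableSet(orders);
    }
}
```

With this shape no synchronized is needed for the individual operations; a lock is only required again once two or more of them have to happen as one step.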

Is this a proper customized synchronizer?

I had a strong need for a synchronizer similar to a CountDownLatch, but the starting number for the countdown is unknown. To add context, if I'm going through a buffered recordset (say from a text file or a query) and kicking off a runnable for each record, but I don't know how many records there will be... I need a synchronizer that signals when the iteration is complete and all runnables are complete.
This is the synchronizer I came up with... a BufferedLatch. A method is called in the iteration loop for each record incrementing the recordSetSize. At the end of each runnable kicked off for each record, the processedRecordSetSize is incremented. When the iteration through all records is complete (but runnables may still be in queue), the setDownloadComplete() method is called letting the BufferedLatch know the recordSetSize is now fixed. The await() method waits for the iterationComplete variable to be true (recordsetSize is now fixed) and recordsetSize == processedRecordSetSize;
Is this an optimal implementation of this synchronizer? Is there more opportunity for concurrency that the synchronization is holding back? Although testing seems to work fine, are there any gotchas I'm overlooking?
import java.util.concurrent.atomic.AtomicInteger;
public final class BufferedLatch {
/** A customized synchronizer built for concurrent iteration processes where the number of objects to be iterated is unknown
* and a runnable will be kicked off for each object, and the await() method will wait for all runnables to be complete
*/
private final AtomicInteger recordsetSize = new AtomicInteger(0);
private final AtomicInteger processedRecordsetSize = new AtomicInteger(0);
private volatile boolean iterationComplete = false;
public int incrementRecordsetSize() throws Exception {
if (iterationComplete) {
throw new Exception("Cannot increase recordsize after download is flagged complete!");
}
else {
return recordsetSize.incrementAndGet();
}
}
public void incrementProcessedRecordSize() {
synchronized(this) {
processedRecordsetSize.incrementAndGet();
if (iterationComplete) {
if (processedRecordsetSize.get() == recordsetSize.get()) {
this.notifyAll();
}
}
}
}
public void setDownloadComplete() {
synchronized(this) {
iterationComplete = true;
}
}
public void await() throws InterruptedException {
while (! (iterationComplete && (processedRecordsetSize.get() == recordsetSize.get()))) {
synchronized(this) {
while (! (iterationComplete && (processedRecordsetSize.get() == recordsetSize.get()))) {
this.wait();
}
}
}
}
}
UPDATE-- NEW CODE
public final class BufferedLatch {
/** A customized synchronizer built for concurrent iteration processes where the number of objects to be iterated is unknown
* and a runnable will be kicked off for each object, and the await() method will wait for all runnables to be complete
*/
private int recordCount = 0;
private int processedRecordCount = 0;
private boolean iterationComplete = false;
public synchronized void incrementRecordCount() throws Exception {
if (iterationComplete) {
throw new Exception("Cannot increase recordCount after download is flagged complete!");
}
else {
recordCount++;
}
}
public synchronized void incrementProcessedRecordCount() {
processedRecordCount++;
if (iterationComplete && recordCount == processedRecordCount) {
this.notifyAll();
}
}
public synchronized void setIterationComplete() {
iterationComplete = true;
if (iterationComplete && recordCount == processedRecordCount) {
this.notifyAll();
}
}
public synchronized void await() throws InterruptedException {
while (! (iterationComplete && (recordCount == processedRecordCount))) {
this.wait();
}
}
}

Probably not. I think conceptually you're onto something here, as it looks like your application needs something more than just a CountDownLatch. However, the implementation seems to have several problems.
First, I note that it looks odd to mix atomics/volatiles AND ordinary object monitor locks (synchronized). While there may be proper uses that mix these different constructs, mixing in this case I believe will lead to errors.
Consider incrementRecordsetSize() which first checks iterationComplete and only if it's false does it increment recordsetSize. The iterationComplete variable is volatile so updates from other threads will be visible. However, the fact that no locking is done here allows TOCTOU race conditions (time-of-check vs time-of-use). The rule seems to be, recordsetSize must not be incremented if iterationComplete is true. Suppose thread T1 comes along and finds iterationComplete to be false, so it decides to increment recordsetSize. Before it does so, another thread T2 comes along and sets iterationComplete to be true. This would allow T1 to do the increment improperly. Worse, before it does so, suppose another thread T3 came along and called incrementProcessedRecordSize(). It would increment processedRecordsetSize and then find iterationComplete true. It further might find that processedRecordsetSize equals recordsetSize and then notify all waiters, who then proceed as if the processing is complete. But it's not, as T1 then proceeds to increment recordsetSize and presumably continues with its processing.
The problem here is that this object's state consists of the fusion of three independent pieces of state -- two int counters and a boolean -- and all three must be read and written atomically. If certain bits of logic attempt to take advantage of individual volatile or atomic properties, it introduces the possibility of race conditions such as the one I described.
I'd suggest rewriting this as a plain object with two plain ints and a boolean (not atomic, not volatile) and just lock around everything. This should certainly clear up the logic and make things easier to understand.
In incrementProcessedRecordSize I note that the condition essentially duplicates the condition in the await method. A simplifying convention is for all updates to notify and have the condition evaluated only by the waiters. This may result in some unnecessary wakeups. If this is a problem, you might consider minimizing the number of notifies, but you need to think about maintainability. If you're not careful, the wait/notify conditions will become spread across the code and will be very hard to reason about. Alternatively, you could refactor the condition into a method and call it from the different places that do waiting and notification.
It looks like await() does a complicated form of double-checked locking. Instead of testing a volatile boolean outside the lock, it tests several separate pieces of information both outside and inside the lock. This seems susceptible to TOCTOU problems (as above) but it might be safe if you can prove the state really latches, that is, that once it becomes true it never returns to false. I'd have to stare at the code for a long time before I'd be able to convince myself it's correct.
On the other hand, what does this buy you? It seems to optimize away just the taking of the lock. If you have a zillion threads that are going to come by after processing is complete, it might be worth it, but it doesn't seem like it. I'd just remove the outer while loop and check the variables within a synchronized block.
Finally, having an object that represents counters and a boolean may very well be sensible for what you're doing, but other things you've said (in the question and in comments) are that some threads are generating a workload (e.g. reading lines from a file) and other threads are retiring that workload. This implies that there is some other data structure like a queue that contains this workload, and you have a producer-consumer problem here. That other structure has to be thread-safe, of course, since multiple threads are interacting over it. But the counters and boolean in this structure need to be updated in lockstep with the updates to the workload structure, otherwise there could be race conditions between checking and updating these separate objects.
It seems to me you could replace the counters in this object with the queue and just put simple locks around everything. The producers would append to the queue until they're done, at which time they set iterationComplete to true which prevents more work from being added. The consumers pull from the queue until iterationComplete is true and the queue is empty, at which point they're done. If they find the queue empty but iterationComplete is false, they know to block while awaiting further work.
I'd say to stick with simple locking and avoid volatiles/atomics until you get the basics correct. If there are bottlenecks in that code, then apply optimizations selectively while preserving the same invariants.
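A minimal sketch of the "plain fields, lock around everything" suggestion, with the completion condition factored into one method and every update simply notifying. The class and method names are hypothetical, and this is one possible shape rather than a definitive implementation.

```java
// Hypothetical rewrite: all state is guarded by the object's own monitor.
final class SimpleBufferedLatch {
    private int submitted;
    private int processed;
    private boolean iterationComplete;

    // The single place where the completion condition is written down.
    private boolean done() {
        return iterationComplete && submitted == processed;
    }

    synchronized void recordSubmitted() {
        if (iterationComplete) {
            throw new IllegalStateException("iteration already flagged complete");
        }
        submitted++;
    }

    synchronized void recordProcessed() {
        processed++;
        notifyAll(); // waiters re-evaluate done() themselves
    }

    synchronized void markIterationComplete() {
        iterationComplete = true;
        notifyAll();
    }

    synchronized void await() throws InterruptedException {
        while (!done()) {
            wait();
        }
    }

    synchronized boolean isDone() {
        return done();
    }
}
```

Notifying on every update may cause a few extra wakeups, but the condition lives in exactly one method, which keeps the wait/notify logic easy to reason about.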

Multi-threading -- a faster way?

I have a class with a getter getInt() and a setter setInt() on a certain field, say field
Integer Int;
of an object of a class, say SomeClass.
The setInt() here is synchronized-- getInt() isn't.
I am updating the value of Int from within multiple threads.
Each thread is getting the value Int, and setting it appropriately.
The threads aren't sharing any other resources in any way.
The code executed in each thread is as follows.
public void update(SomeClass c) {
while (<condition-1>) // the conditions here and the calculation of
// k below dont have anything to do
// with the members of c
if (<condition-2>) {
// calculate k here
synchronized (c) {
c.setInt(c.getInt()+k);
// System.out.println("in "+this.toString());
}
}
}
The run() method is just invoking the above method on the members updated from within the constructor by the params passed to it:
public void run() { update(c); }
When I run this on large sequences, the threads aren't interleaving much -- I see one thread executing for a long time without any other thread running in between.
There must be a better way of doing this.
I can't change the internals of SomeClass, or of the class invoking the threads.
How can this be done better?
TIA.
//=====================================
EDIT:
I'm not after manipulating the execution sequence of the threads. They all have the same priority. It's just that what I see in the outcome suggests that the threads aren't sharing the execution time evenly -- one of them, once it takes over, keeps executing. However, I can't see why this code should be doing this.

It's just that what I see in the outcome suggests that the threads aren't sharing the execution time evenly
Well, this is exactly what you don't want if you are after efficiency. Yanking a thread from being executed and scheduling another thread is generally very costly. Therefore it's actually advantageous for one of them, once it takes over, to keep executing. Of course, when this is overdone you could see higher throughput but longer response times. In theory. In practice, the JVM's thread scheduling is well tuned for almost all purposes, and you don't want to try changing it in almost all situations. As a rule of thumb, if you are interested in response times on the order of milliseconds, you probably want to stay away from messing with it.
tl;dr: It's not being inefficient, you probably want to leave it as it is.
EDIT:
Having said that, using an AtomicInteger may help in performance, and is in my opinion less error prone than using a lock (synchronized keyword). You need to be hitting that variable really very hard in order to get a measurable benefit though.

The JDK provides a nice solution for multi threaded int access, AtomicInteger:
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/AtomicInteger.html
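For illustration, a small sketch of what the AtomicInteger variant might look like when the shared counter is under your control (in the question, SomeClass cannot be changed, so this is only an assumption-laden example). The class AtomicSum and its method are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: several threads add to one AtomicInteger without any lock.
final class AtomicSum {
    static int addConcurrently(int threadCount, int addsPerThread) {
        AtomicInteger total = new AtomicInteger();
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < addsPerThread; j++) {
                    total.addAndGet(1); // atomic read-modify-write, replaces get + set under a lock
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return total.get();
    }
}
```

addAndGet(k) does in one atomic step what the synchronized block does with c.setInt(c.getInt()+k).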

As Enno Shioji has pointed out, letting one thread proceed might be the most efficient way to execute your code in some scenarios.
It depends on how much cost the thread synchronization imposes in relation to the other work of your code (which we don’t know). If you have a loop like:
while (<condition-1>)
if (<condition-2>) {
// calculate k here
synchronized (c) {
c.setInt(c.getInt()+k);
}
}
and the tests for condition-1 and condition-2 and the calculation of k are rather cheap compared to the synchronization cost, the HotSpot optimizer might decide to reduce the overhead by transforming the code to something like this:
synchronized (c) {
while (<condition-1>)
if (<condition-2>) {
// calculate k here
c.setInt(c.getInt()+k);
}
}
(or a rather more complicated structure, performing loop unrolling and spanning the synchronized block over multiple iterations). The bottom line is that the optimized code might block other threads longer but let the one owning the lock finish faster, resulting in an overall faster execution.
This does not mean that a single-threaded execution is the fastest way to handle your problem. It also doesn’t mean that using an AtomicInteger here would be the best option to solve the problem. It would create a higher CPU load and possibly a small acceleration but it doesn’t solve your real mistake:
It is completely unnecessary to update c within the loop at a high frequency. After all, your threads do not depend on seeing updates to c timely. It even looks like they are not using it at all. So the correct fix would be to move the update out of the loop:
int kTotal=0;
while (<condition-1>)
if (<condition-2>) {
// calculate k here
kTotal += k;
}
synchronized (c) {
c.setInt(c.getInt()+kTotal);
}
Now, all threads can run in parallel (assuming the code you haven’t posted here doesn’t contain inter-thread dependencies) and the synchronization cost is reduced to a minimum. You could still change it to an AtomicInteger as well but that’s not that important anymore.

Answering to this
I see one thread executing for a long time without any other thread running in between.
There must be a better way of doing this.
You cannot control how threads will be scheduled. The JVM and the operating system do this for you, and they do not like you to interfere with their work.
You can still look at yield as an option, but it does not ensure that the same thread will not be picked again.
The java.lang.Thread.yield() method is only a hint to the scheduler that the current thread is willing to give up its current use of a processor; the scheduler is free to ignore it.

I've found it better to use wait() and notify() than yield. Check out this example (adapted from a book):
class Q {
    int n;
    boolean valueSet = false;
    synchronized int get() throws InterruptedException {
        // Loop rather than "if": wait() can return without the condition being true.
        while (!valueSet)
            wait();
        valueSet = false;
        notifyAll(); // wakes a thread waiting in put()
        return n;
    }
    synchronized void put(int n) throws InterruptedException {
        while (valueSet)
            wait();
        this.n = n;
        valueSet = true;
        notifyAll(); // wakes a thread waiting in get()
    }
}
Or you could try using sleep() and joining the threads at the end of main(), but that isn't a foolproof way.

Your code has a public void update(SomeClass c) method; it is an instance method to which you pass the object as a parameter.
synchronized(c) in your code does nothing if every thread passes a different SomeClass instance. Let me show you with an example.
Suppose you create several objects of this class and run each as its own thread, like this:
class A extends Thread {
    // Assumption for this example: every thread creates its own SomeClass instance.
    private final SomeClass c = new SomeClass();
    public void update(SomeClass c) {
        synchronized (c) {
            // ...
        }
    }
    public void run() {
        update(c);
    }
    public static void main(String[] args) {
        A t1 = new A();
        A t2 = new A();
        t1.start();
        t2.start();
    }
}
Then t1 and t2 each call update() with their own c, so the object they synchronize on is different for each thread. Each thread locks a different monitor, no thread ever waits for another, and the synchronization has no effect.
Synchronization will work when you have something common for both the threads.
Something like,
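class A extends Thread {
    // One instance shared by all threads, so they all lock the same monitor.
    static final SomeClass c = new SomeClass();
    public void update() {
        synchronized (c) {
            // ...
        }
    }
    public void run() {
        update();
    }
    public static void main(String[] args) {
        A t1 = new A();
        A t2 = new A();
        t1.start();
        t2.start();
    }
}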
This way the actual concept of synchronization will be applied.
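A static field is not the only way to share the lock object. Passing the same instance to every thread's constructor has the same effect, which is what the question describes. The sketch below is illustrative only. SharedBox is a hypothetical stand-in for SomeClass (synchronized setter, unsynchronized getter), and Updater and SharedLockDemo are made-up names.

```java
// Hypothetical stand-in for SomeClass: synchronized setter, plain getter.
final class SharedBox {
    private int value;

    int getInt() {
        return value;
    }

    synchronized void setInt(int newValue) {
        value = newValue;
    }
}

final class Updater extends Thread {
    private final SharedBox box;
    private final int increments;

    Updater(SharedBox box, int increments) {
        this.box = box;
        this.increments = increments;
    }

    @Override
    public void run() {
        for (int i = 0; i < increments; i++) {
            // All threads were given the same box, so they contend for one monitor
            // and the read-modify-write below cannot interleave.
            synchronized (box) {
                box.setInt(box.getInt() + 1);
            }
        }
    }
}

final class SharedLockDemo {
    static int run(int threadCount, int increments) {
        SharedBox shared = new SharedBox();
        Updater[] updaters = new Updater[threadCount];
        for (int i = 0; i < threadCount; i++) {
            updaters[i] = new Updater(shared, increments);
            updaters[i].start();
        }
        for (Updater updater : updaters) {
            try {
                updater.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return shared.getInt(); // join() makes the workers' writes visible here
    }
}
```

If each Updater were given its own SharedBox instead, the synchronized block would still compile but would no longer protect anything shared.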

Java - concurrent clear of the list

I am trying to find a good way to achieve the following API:
void add(Object o);
void processAndClear();
The class would store the objects and upon calling processAndClear would iterate through the currently stored ones, process them somehow, and then clear the store. This class should be thread safe.
The obvious approach is to use locking, but I wanted to be more "concurrent". This is the approach I would use:
class Store{
private AtomicReference<CopyOnWriteArrayList<Object>> store = new AtomicReference<>(new CopyOnWriteArrayList <>());
void add(Object o){
store.get().add(o);
}
void processAndClear(){
CopyOnWriteArrayList<Object> objects = store.get();
store.compareAndSet(objects, new CopyOnWriteArrayList<>());
for (Object object : objects) {
//do sth
}
}
}
This would allow threads that try to add objects to proceed almost immediately without any locking/waiting for the clearing to complete. Is this a more or less correct approach?

Your above code is not thread-safe. Imagine the following:
Thread A is put on hold at add() right after store.get()
Thread B is in processAndClear(), replaces the list, processes all elements of the old one, then returns.
Thread A resumes and adds a new item to the now obsolete list that will never be processed.
Probably the easiest solution here would be to use a LinkedBlockingQueue, which would also simplify the task a lot:
class Store{
final LinkedBlockingQueue<Object> queue = new LinkedBlockingQueue<>();
void add(final Object o){
queue.add(o); // never blocks here: the queue is unbounded by default (put(o) would block on a full bounded queue, but throws InterruptedException)
}
void processAndClear(){
Object element;
while ((element = queue.poll()) != null) { // does not block on empty list but returns null instead
doSomething(element);
}
}
}
Edit: How to do this with synchronized:
class Store{
final LinkedList<Object> queue = new LinkedList<>(); // final, so every thread synchronizes on the same object
void add(final Object o){
synchronized(queue) { // on the queue as this is the shared object in question
queue.add(o);
}
}
void processAndClear() {
final LinkedList<Object> elements = new LinkedList<>(); // temporary local list
synchronized(queue) { // here as well, as every access needs to be properly synchronized
elements.addAll(queue);
queue.clear();
}
for (Object e : elements) {
doSomething(e); // this is thread-safe as only this thread can access these now local elements
}
}
}
Why this is not a good idea
Although this is thread-safe, it is likely to be much slower than the concurrent version under contention. Assume that you have a system with 100 threads that frequently call add, while one thread calls processAndClear. Then the following performance bottlenecks will occur:
If one thread calls add, the other 99 are put on hold in the meantime.
During the first part of processAndClear, all 100 threads are put on hold.
If you assume that those 100 adding threads have nothing else to do, you can easily show that the application runs at the same speed as a single-threaded application minus the cost of synchronization. That means: adding will effectively be slower with 100 threads than with 1. This is not the case if you use a concurrent queue as in the first example.
There will however be a minor performance gain for the processing thread, as doSomething can be run on the old elements while new ones are added. But again the concurrent example could be faster, as you could have multiple threads do the processing simultaneously.
In effect, synchronized can be used as well, but you will automatically introduce performance bottlenecks, potentially causing the application to run slower than a single-threaded one and forcing you to do complicated performance tests. In addition, extending the functionality always carries a risk of introducing threading issues, as locking needs to be done manually. A concurrent queue, in contrast, solves all these problems without additional code, and the code can easily be changed or extended later on.

The class would store the objects and upon calling processAndClear would iterate through the currently stored ones, process them somehow, and then clear the store.
It seems like you should use a BlockingQueue for this task. Your add(...) method would add to the queue and your consumer would call take(), which blocks waiting for the next item. The BlockingQueue (ArrayBlockingQueue is a typical implementation) takes care of all of the synchronization and signaling for you.
This means that you need neither a CopyOnWriteArrayList nor an AtomicReference. What you would lose is a collection that you can iterate through for reasons other than the ones your post currently describes.
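As a rough sketch of that idea (the class name QueueStore, the method next() and the capacity of 1024 are made up for illustration and are not from the question), the store could look something like this:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: QueueStore, next() and the capacity are illustrative assumptions.
class QueueStore {
    // Bounded queue: put() blocks while it is full, take() blocks while it is empty.
    private final BlockingQueue<Object> queue = new ArrayBlockingQueue<>(1024);

    // Called by the producing threads.
    void add(Object o) throws InterruptedException {
        queue.put(o);
    }

    // Called by the consumer: waits until an element is available and returns it.
    Object next() throws InterruptedException {
        return queue.take();
    }
}
```

A consumer thread would typically call next() in a loop and process each element as it arrives, instead of periodically draining the whole store.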

Java class as a Monitor

I need to write a Java program, but I need some advice before starting on my own.
The program I will be writing should do the following:
Simulate a shop that takes advance orders for donuts
The shop would not take further orders once 5000 donuts have been ordered
I am kind of stuck deciding whether I should write the Java class to act as a monitor, or use the Java Semaphore class instead.
Please advise me. Thanks for the help.

Any java object can work as a monitor via the wait/notify methods inherited from Object:
Object monitor = new Object();
// thread 1
synchronized(monitor) {
monitor.wait();
}
// thread 2
synchronized(monitor) {
monitor.notify();
}
Just make sure to hold the lock on the monitor object when calling these methods (don't worry about the wait, the lock is released automatically to allow other threads to acquire it). This way, you have a convenient mechanism for signalling among threads.
It seems to me like you are implementing a bounded producer-consumer queue. In this case:
The producer will keep putting items in a shared queue.
If the queue size reaches 5000, it will call wait on a shared monitor and go to sleep.
When it puts an item, it will call notify on the monitor to wake up the consumer if it's waiting.
The consumer will keep taking items from the queue.
When it takes an item, it will call notify on the monitor to wake up the producer.
If the queue size reaches 0 the consumer calls wait and goes to sleep.
For an even simpler approach, have a look at the various implementations of BlockingQueue, which provide the above features out of the box!
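If you want to write the wait/notify version by hand, the steps listed above could be sketched roughly as follows. BoundedBuffer and its methods are hypothetical names; the sketch uses the buffer itself as the monitor, calls wait in a loop (a waiting thread may wake up without the condition being true) and uses notifyAll rather than notify so that neither producers nor consumers are left sleeping:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch of a hand-written bounded producer-consumer buffer.
class BoundedBuffer<T> {
    private final Queue<T> items = new ArrayDeque<>();
    private final int capacity; // e.g. 5000 for the donut shop

    BoundedBuffer(int capacity) {
        this.capacity = capacity;
    }

    // Producer side: waits while the buffer is full.
    synchronized void put(T item) throws InterruptedException {
        while (items.size() == capacity) { // re-check the condition after every wakeup
            wait(); // releases the lock on this object while waiting
        }
        items.add(item);
        notifyAll(); // wake up consumers waiting for an item
    }

    // Consumer side: waits while the buffer is empty.
    synchronized T take() throws InterruptedException {
        while (items.isEmpty()) {
            wait();
        }
        T item = items.remove();
        notifyAll(); // wake up producers waiting for free space
        return item;
    }

    synchronized int size() {
        return items.size();
    }
}
```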

It seems to me that the core of this exercise is updating a counter (number of orders taken), in a thread-safe and atomic fashion. If implemented incorrectly, your shop could end up taking more than 5000 pre-orders due to missed updates and possibly different threads seeing stale values of the counter.
The simplest way to update a counter atomically is to use synchronized methods to get and increment it:
class DonutShop {
private int ordersTaken = 0;
public synchronized int getOrdersTaken() {
return ordersTaken;
}
public synchronized void increaseOrdersBy(int n) {
ordersTaken += n;
}
// Other methods here
}
The synchronized methods mean that only one thread can be calling either method at any time (and they also provide a memory barrier to ensure that different threads see the same value rather than locally cached ones which may be outdated). This ensures a consistent view of the counter across all threads in your application.
(Note that I didn't have a "set" method but an "increment" method. The problem with "set" is that if a client has to call shop.set(shop.get() + 1);, another thread could have incremented the value between the calls to get and set, so this update would be lost. By making the whole increment operation atomic, because it's in a synchronized method, this situation cannot occur.)
In practice I would probably use an AtomicInteger instead, which is basically a wrapper around an int to allow for atomic queries and updates, just like the DonutShop class above. It also has the advantage that it's more efficient in terms of minimising exclusive blocking, and it's part of the standard library so will be more immediately familiar to other developers than a class you've written yourself.
In terms of correctness, either will suffice.
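For illustration, here is a minimal sketch of the AtomicInteger variant. It goes one step beyond the class above by folding the 5000 limit check into the update, so that checking and incrementing happen as one atomic step. The class name AtomicDonutShop and the method tryOrder are my own assumptions, not something from the question:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: AtomicDonutShop and tryOrder are illustrative names.
class AtomicDonutShop {
    private static final int LIMIT = 5000; // cap taken from the question
    private final AtomicInteger ordersTaken = new AtomicInteger();

    // Tries to reserve n donuts; returns false if n is not positive
    // or if the order would push the total past the limit.
    boolean tryOrder(int n) {
        while (true) {
            int current = ordersTaken.get();
            if (n <= 0 || n > LIMIT - current) {
                return false;
            }
            // Succeeds only if no other thread changed the counter in the meantime.
            if (ordersTaken.compareAndSet(current, current + n)) {
                return true;
            }
            // Another thread won the race; re-read the counter and try again.
        }
    }

    int getOrdersTaken() {
        return ordersTaken.get();
    }
}
```

The retry loop matters here: a plain get() followed by addAndGet() would let two threads both pass the limit check and overshoot 5000.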

Like Tudor wrote, you can use any object as monitor for general purpose locking and synchronization.
However, if your requirement is that only x orders (x=5000 in your case) can be processed at any one time, you could use the java.util.concurrent.Semaphore class. It is made specifically for use cases where only a fixed number of jobs may run at once; in Semaphore terminology, these slots are called permits.
If you do the processing immediately, you can go with
private Semaphore semaphore = new Semaphore(5000);
public void process(Order order)
{
if (semaphore.tryAcquire())
{
try
{
//do your processing here
}
finally
{
semaphore.release();
}
}
else
{
throw new IllegalStateException("can't take more orders");
}
}
If it takes longer than that (human input required, starting another thread/process, etc.), you need to add a callback for when the processing is over, like:
private Semaphore semaphore = new Semaphore(5000);
public void process(Order order)
{
if (semaphore.tryAcquire())
{
//start a new job to process order
}
else
{
throw new IllegalStateException("can't take more orders");
}
}
//call this from the job you started, once it is finished
public void processingFinished(Order order)
{
semaphore.release();
//any other post-processing for that order
}
