Asynchronous atomic array

Asynchronous atomic array - java

I have a critical section of my (Java) code which basically goes like the snippet below. They're coming in from a nio server.
void messageReceived(User user, Message message) {
synchronized(entryLock) {
userRegistry.updateLastMessageReceived(user,time());
server.receive(user,message);
}
}
However, a high percentage of my messages are not going to change the server state, really. They're merely the client saying "hello, I'm still here". I really don't want to have to make that inside the synchronization block.
I could use a synchronous map or something like that, but it's still going to incur a synchronization penalty.
What I would really like to do is to have something like a drop box, like this
void messageReceived(User user, Message message) {
dropbox.add(new UserReceived(user,time());
if(message.getType() != message.TYPE_KEPT_ALIVE) {
synchronized(entryLock) {
server.receive(user,message);
}
}
}
I have a cleanup routine to automatically put clients that aren't active to sleep. So instead of synchronizing on every kept alive message to update the registry, the cleanup routine can simply compile the kept alive messages in a single synchronization block.
So naturally, reconigizing a need for this, the first thing I did was start making a solution. Then I decided this was a non-trivial class, and a problem that was more than likely fairly common. so here I am.
tl;dr Is there a Java library or other solution I can use to facilitate atomically adding to a list of objects in an asynchronous manner? Collecting from the list in an asychronous manner is not required. I just don't want to synchronize on every add to the list.

ConcurrentLinkedQueue claims to be:
This implementation employs an efficient "wait-free" algorithm based on one described in Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms by Maged M. Michael and Michael L. Scott.
I'm not sure what the quotes on "wait-free" entail but the Concurrent* classes are good places to look for structures like you're looking for.
You might also be interested in the following: Effective Concurrency: Lock-Free Code — A False Sense of Security. It talks about how hard these things are to get right, even for experts.

Well, there are few things you must bear in mind.
First, there is very little "synchronization cost" if there is little contention (more than one thread trying to enter the synchronized block at the same time).
Second, if there is contention, you're going to incur some cost no matter what technique you're using. Paul is right about ConcurrentLinkedQueue and the "wait-free" means that thread concurrency control is not done using locks, but still, you will always pay some price for contention. You may also want to look at ConcurrentHashMap because I'm not sure a list is what you're looking for. Using both classes is quite simple and common.
If you want to be more adventurous, you might find some non-locking synchronization primitives in java.util.concurrent.atomic.

One thing we could do is to use a simple ArrayList for keep-alive messages:
Keep adding to this list whenever each keep-alive message comes.
The other thread would synch on a lock X and read and process
keep-alives. Note that this thread is not removing from list only
reading/copying.
Finally in messageReceived itself you check if the list has grown
say beyond 1000, in which case you synch on the lock X and clear the
list.
List keepAliveList = new ArrayList();
void messageReceived(User user, Message message) {
if(message.getType() == message.TYPE_KEPT_ALIVE) {
if(keepAliveList.size() > THRESHOLD) {
synchronized(X) {
processList.addAll(list);
list.clear();
}
}
keepAliveList.add(message);
}
}
//on another thread
void checkKeepAlives() {
synchronized(X) {
processList.addAll(list)
}
processKeepAlives(processList);
}

Related

Get/Set the value in the cache using the AtomicReference in java

I've already posted this question on codereview site https://codereview.stackexchange.com/questions/158999/get-set-the-value-in-the-cache-using-the-atomicreference-in-java , but thought of posting here, so that it reaches the wider audience and i can get the quicker solution posting it here as well.
I am having below code which get and set the data in the cache using the synchronized block and i want to know if i can optimize the below code :-
public int getValue() {
AtomicReferenceTest<Integer> cachedIntRef = new AtomicReference<Integer>();
boolean wasCached = true;
Integer cachedInt = cachedIntRef.get();
if (cachedInt == null) {
synchronized (cachedIntRef) {
cachedInt = cachedIntRef.get();
if (cachedInt == null) {
wasCached = false;
// Make DB call to get the data and update the cache.
cachedInt = baseDao.getCloudMaximumWeight();
cachedIntRef.set(cachedInt);
}
}
}
}
I want to know if is there is any way by which i can remove the synchronized block and optimize further or this code is already optimized?
EDIT :- i'll remove the question from one of the site, if i get the answer on any of the site. Also when i profile my application sometime even with less no of threads, i see threads blocking on synchronized piece of code. which made me think as i code is using the AtomicRef , somehow i can get rid of syncronized or is there is some other better way of optimize the code.

I want to know if is there is any way by which i can remove the synchronized block and optimize further or this code is already optimized?
I assume that optimizing the code means removing the synchronized block. The problem with that thinking is that most likely your dao call is significantly more expensive than synchronized. Any IO (especially to a remote database) is going to be at least 4+ orders of magnitude more expensive than the locking.
That said, you can remove the synchronized block if you don't mind multiple DAO calls when initializing the cache. If the DAO calls are inexpensive then having 2 threads making them maybe isn't a problem. There is a race condition on which one's answer will be put into the cache but chances are their results will be the same anyway. I often do this and assume that as the application starts up, the first couple of calls are going to be more expensive as the cache warms. But are 2 threads making the same DAO request ever going to be faster than 1 thread doing it and 1 waiting for the other thread to finish?
If there is a number of different DAO calls then you can try some sort of lock segregation so not all cache requests go through the same lock. This would allow some parallelization which might help. I can't tell if your code is specific or an example of the problem. This is how the ConcurrentHashMap works for example.
But really I would be sure that this section of code has performance problems before I worry too much about it. And even if a profiler is saying that it is a primary time sink, it may just be that the DAO calls are the most expensive part of the equation so saving a couple with synchronization would be the best way to speed it up anyway. You can take out the dao calls and replace with a straight assignment if you need to see if it the synchronized or dao.* calls that is the problem.

Try using volatile integer instead. Maybe I am missing something here but I don't see the use case for the AtomicReference here.

Why should we use HashMap in multi-threaded environments?

Today I was reading about how HashMap works in Java. I came across a blog and I am quoting directly from the article of the blog. I have gone through this article on Stack Overflow. Still
I want to know the detail.
So the answer is Yes there is potential race condition exists while
resizing HashMap in Java, if two thread at the same time found that
now HashMap needs resizing and they both try to resizing. on the
process of resizing of HashMap in Java , the element in bucket which
is stored in linked list get reversed in order during there migration
to new bucket because java HashMap doesn't append the new element at
tail instead it append new element at head to avoid tail traversing.
If race condition happens then you will end up with an infinite loop.
It states that as HashMap is not thread-safe during resizing of the HashMap a potential race condition can occur. I have seen in our office projects even, people are extensively using HashMaps knowing they are not thread safe. If it is not thread safe, why should we use HashMap then? Is it just lack of knowledge among developers as they might not be aware about structures like ConcurrentHashMap or some other reason. Can anyone put a light on this puzzle.

I can confidently say ConcurrentHashMap is a pretty ignored class. Not many people know about it and not many people care to use it. The class offers a very robust and fast method of synchronizing a Map collection. I have read a few comparisons of HashMap and ConcurrentHashMap on the web. Let me just say that they’re totally wrong. There is no way you can compare the two, one offers synchronized methods to access a map while the other offers no synchronization whatsoever.
What most of us fail to notice is that while our applications, web applications especially, work fine during the development & testing phase, they usually go tilts up under heavy (or even moderately heavy) load. This is due to the fact that we expect our HashMap’s to behave a certain way but under load they usually misbehave. Hashtable’s offer concurrent access to their entries, with a small caveat, the entire map is locked to perform any sort of operation.
While this overhead is ignorable in a web application under normal load, under heavy load it can lead to delayed response times and overtaxing of your server for no good reason. This is where ConcurrentHashMap’s step in. They offer all the features of Hashtable with a performance almost as good as a HashMap. ConcurrentHashMap’s accomplish this by a very simple mechanism.
Instead of a map wide lock, the collection maintains a list of 16 locks by default, each of which is used to guard (or lock on) a single bucket of the map. This effectively means that 16 threads can modify the collection at a single time (as long as they’re all working on different buckets). Infact there is no operation performed by this collection that locks the entire map.

There are several aspects to this: First of all, most of the collections are not thread safe. If you want a thread safe collection you can call synchronizedCollection or synchronizedMap
But the main point is this: You want your threads to run in parallel, no synchronization at all - if possible of course. This is something you should strive for but of course cannot be achieved every time you deal with multithreading.
But there is no point in making the default collection/map thread safe, because it should be an edge case that a map is shared. Synchronization means more work for the jvm.

In a multithreaded environment, you have to ensure that it is not modified concurrently or you can reach a critical memory problem, because it is not synchronized in any way.
Dear just check Api previously I also thinking in same manner.
I thought that the solution was to use the static Collections.synchronizedMap method. I was expecting it to return a better implementation. But if you look at the source code you will realize that all they do in there is just a wrapper with a synchronized call on a mutex, which happens to be the same map, not allowing reads to occur concurrently.
In the Jakarta commons project, there is an implementation that is called FastHashMap. This implementation has a property called fast. If fast is true, then the reads are non-synchronized, and the writes will perform the following steps:
Clone the current structure
Perform the modification on the clone
Replace the existing structure with the modified clone
public class FastSynchronizedMap implements Map,
Serializable {
private final Map m;
private ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
.
.
.
public V get(Object key) {
lock.readLock().lock();
V value = null;
try {
value = m.get(key);
} finally {
lock.readLock().unlock();
}
return value;
}
public V put(K key, V value) {
lock.writeLock().lock();
V v = null;
try {
v = m.put(key, value);
} finally {
lock.writeLock().lock();
}
return v;
}
.
.
.
}
Note that we do a try finally block, we want to guarantee that the lock is released no matter what problem is encountered in the block.
This implementation works well when you have almost no write operations, and mostly read operations.

Hashmap can be used when a single thread has an access to it. However when multiple threads start accessing the Hashmap there will be 2 main problems:
1. resizing of hashmap is not gauranteed to work as expected.
2. Concurrent Modification exception would be thrown. This can also be thrown when its accessed by single thread to read and write onto the hashmap at the same time.

A workaround for using HashMap in multi-threaded environment is to initialize it with the expected number of objects' count, hence avoiding the need for a re-sizing.

Thread.sleep() in a while loop

I notice that NetBeans is warning me about using Thread.sleep() in a while loop in my Java code, so I've done some research on the subject. It seems primarily the issue is one of performance, where your while condition may become true while the counter is still sleeping, thus wasting wall-clock time as you wait for the next iteration. This all makes perfect sense.
My application has a need to contact a remote system and periodically poll for the state of an operation, waiting until the operation is complete before sending the next request. At the moment the code logically does this:
String state = get state via RPC call
while (!state.equals("complete")) {
Thread.sleep(10000); // Wait 10 seconds
state = {update state via RPC call}
}
Given that the circumstance is checking a remote operation (which is a somewhat expensive process, in that it runs for several seconds), is this a valid use of Thread.sleep() in a while loop? Is there a better way to structure this logic? I've seen some examples where I could use a Timer class, but I fail to see the benefit, as it still seems to boil down to the same straightforward logic above, but with a lot more complexity thrown in.
Bear in mind that the remote system in this case is neither under my direct control, nor is it written in Java, so changing that end to be more "cooperative" in this scenario is not an option. My only option for updating my application's value for state is to create and send an XML message, receive a response, parse it, and then extract the piece of information I need.
Any suggestions or comments would be most welcome.

Unless your remote system can issue an event or otherwise notify you asynchronously, I don't think the above is at all unreasonable. You need to balance your sleep() time vs. the time/load that the RPC call makes, but I think that's the only issue and the above doesn't seem of concern at all.

Without being able to change the remote end to provide a "push" notification that it is done with its long-running process, that's about as well as you're going to be able to do. As long as the Thread.sleep time is long compared to the cost of polling, you should be OK.

You should (almost) never use sleep since its very inefficient and its not a good practice. Always use locks and condition variables where threads signal each other. See Mike Dahlin's Coding Standards for Programming with threads
A template is:
public class Foo{
private Lock lock;
private Condition c1;
private Condition c2;
public Foo()
{
lock = new SimpleLock();
c1 = lock.newCondition();
c2 = lock.newCondition();
...
}
public void doIt()
{
try{
lock.lock();
...
while(...){
c1.awaitUninterruptibly();
}
...
c2.signal();
}
finally{
lock.unlock();
}
}
}

Java concurrency - use which technique to achieve safety?

I have a list of personId. There are two API calls to update it (add and remove):
public void add(String newPersonName) {
if (personNameIdMap.get(newPersonName) != null) {
myPersonId.add(personNameIdMap.get(newPersonName)
} else {
// get the id from Twitter and add to the list
}
// make an API call to Twitter
}
public void delete(String personNAme) {
if (personNameIdMap.get(newPersonName) != null) {
myPersonId.remove(personNameIdMap.get(newPersonName)
} else {
// wrong person name
}
// make an API call to Twitter
}
I know there can be concurrency problem. I read about 3 solutions:
synchronized the method
use Collections.synchronizedlist()
CopyOnWriteArrayList
I am not sure which one to prefer to prevent the inconsistency.

1) synchronized the method
2) use Collections.synchronizedlist
3) CopyOnWriteArrayList ..
All will work, it's a matter of what kind of performance / features you need.
Method #1 and #2 are blocking methods. If you synchronize the methods, you handle concurrency yourself. If you wrap a list in Collections.synchronizedList, it handles it for you. (IMHO #2 is safer -- just be sure to use it as the docs say, and don't let anything access the raw list that is wrapped inside the synchronizedList.)
CopyOnWriteArrayList is one of those weird things that has use in certain applications. It's a non-blocking quasi-immutable list, namely, if Thread A iterates through the list while Thread B is changing it, Thread A will iterate through a snapshot of the old list. If you need non-blocking performance, and you are rarely writing to the list, but frequently reading from it, then perhaps this is the best one to use.
edit: There are at least two other options:
4) use Vector instead of ArrayList; Vector implements List and is already synchronized. However, it's generally frowned, upon as it's considered an old-school class (was there since Java 1.0!), and should be equivalent to #2.
5) access the List serially from only one thread. If you do this, you're guaranteed not to have any concurrency problems with the List itself. One way to do this is to use Executors.newSingleThreadExecutor and queue up tasks one-by-one to access the list. This moves the resource contention from your list to the ExecutorService; if the tasks are short, it may be fine, but if some are lengthy they may cause others to block longer than desired.
In the end you need to think about concurrency at the application level: thread-safety should be a requirement, and find out how to get the performance you need with the simplest design possible.
On a side note, you're calling personNameIdMap.get(newPersonName) twice in add() and delete(). This suffers from concurrency problems if another thread modifies personNameIdMap between the two calls in each method. You're better off doing
PersonId id = personNameIdMap.get(newPersonName);
if (id != null){
myPersonId.add(id);
}
else
{
// something else
}

Collections.synchronizedList is the easiest to use and probably the best option. It simply wraps the underlying list with synchronized. Note that multi-step operations (eg for loop) still need to be synchronized by you.
Some quick things
Don't synchronize the method unless you really need to - It just locks the entire object until the method completes; hardly a desirable effect
CopyOnWriteArrayList is a very specialized list that most likely you wouldn't want since you have an add method. Its essentially a normal ArrayList but each time something is added the whole array is rebuilt, a very expensive task. Its thread safe, but not really the desired result

Synchronized is the old way of working with threads. Avoid it in favor of new idioms mostly expressed in the java.util.concurrent package.
See 1.
A CopyOnWriteArrayList has fast read and slow writes. If you're making a lot of changes to it, it might start to drag on your performance.
Concurrency isn't about an isolated choice of what mechanism or type to use in a single method. You'll need to think about it from a higher level to understand all of its impacts.

Are you making changes to personNameIdMap within those methods, or any other data structures access to which should also be synchronized? If so, it may be easiest to mark the methods as synchronized; otherwise, you might consider using Collections.synchronizedList to get a synchronized view of myPersonId and then doing all list operations through that synchronized view. Note that you should not manipulate myPersonId directly in this case, but do all accesses solely through the list returned from the Collections.synchronizedList call.
Either way, you have to make sure that there can never be a situation where a read and a write or two writes could occur simultaneously to the same unsynchronized data structure. Data structures documented as thread-safe or returned from Collections.synchronizedList, Collections.synchronizedMap, etc. are exceptions to this rule, so calls to those can be put anywhere. Non-synchronized data structures can still be used safely inside methods declared to be synchronized, however, because such methods are guaranteed by the JVM to never run at the same time, and therefore there could be no concurrent reading / writing.

In your case from the code that you posted, all 3 ways are acceptable. However, there are some specific characteristics:
#3: This should have the same effect as #2 but may run faster or slower depending on the system and workload.
#1: This way is the most flexible. Only with #1 can you make the the add() and delete() methods more complex. For example, if you need to read or write multiple items in the list, then you cannot use #2 or #3, because some other thread can still see the list being half updated.

Java concurrency (multi-threading) :
Concurrency is the ability to run several programs or several parts of a program in parallel. If a time consuming task can be performed asynchronously or in parallel, this improve the throughput and the interactivity of the program.
We can do concurrent programming with Java. By java concurrency we can do parallel programming, immutability, threads, the executor framework (thread pools), futures, callables and the fork-join framework programmings.

Why does unsynchronization make ArrayList faster and less secure?

I read the following statement:
ArrayLists are unsynchronized and therefore faster than Vector, but less secure in a multithreaded environment.
I would like to know why unsynchronization can improve the speed, and why it will be less secure?

I will try to address both of your questions:
Improve speed
If the ArrayList were synchronized and multiple threads were trying to read data out of the list at the same time, the threads would have to wait to get an exclusive lock on the list. By leaving the list unsynchronized, the threads don't have to wait and the program will run faster.
Unsafe
If multiple threads are reading and writing to a list at the same time, the threads can have unstable view of the list, and this can cause instability in multi-threaded programs.

The whole point of synchronization is that it means only one thread has access to an object at any given time. Take a box of chocolates as an example. If the box is synchronized (Vector), and you get there first, no one else can take any and you get your pick. If the box is NOT synchronized (ArrayList), anyone walking by can snag a chocolate - It will disappear faster, but you may not get the ones you want.

ArrayLists are unsynchronized and
therefore faster than Vector, but less
secure in a multithreaded environment.
I would like to know why
unsynchronization can improve the
speed,and why it will be less secure?
When multiple threads are reading/writing to a shared memory location, the program might compute incorrect results due to lack of mutual exclusion and proper visibility. Hence lack of synchronization is considered "unsafe". This blog post by Jeremy Manson might provide a good introduction to the topic.
When the JVM executes a synchronized method, it makes sure that the current thread has an exclusive lock on the object on which the method is invoked. Similarly when the method finishes execution, the JVM releases the lock held by the executing thread. Synchronized methods provide mutual exclusion and visibility guarantees - and is important for "safety" (i.e. guaranteeing correctness) of the executing code. But, if only one thread is ever accessing the methods of the object, there is no safety issues to worry about. Although the JVM performance has improved over the years, uncontended synchronization (i.e. locking/unlocking of objects accessed by only one thread) still takes non-zero amount of time. For unsynchronized methods, the JVM does not pay this extra penalty - hence they are faster than their synchronized counterparts.
Vectors force their choice on you. All methods are synchronized and it is difficult to use them incorrectly. But when Vectors are used in a single-threaded context, you pay the price for the extra synchronization unnecessarily. ArrayLists leave the choice to you. When used in the multi-threaded context, it is up to you (the programmer) to correctly synchronizing the code; but when used in a single-threaded context you are guaranteed not to pay any extra synchronization overhead.
Also, when an collection is populated initially, and read subsequently ArrayLists perform better even in a multi-threaded context. For example, consider this method:
public synchronized List<String> getList() {
List<String> list = new Vector<String>();
list.add("Foo");
list.add("Bar");
return Collections.unmodifiableList(list);
}
A list is created, populated, and an immutable view of it is safely published. Looking at the code above it is clear that all subsequent uses of this list are reads and won't need any synchronization even when used by multiple threads - the object is effectively immutable. Using a Vector here incurs the synchronization overhead even for reads where it is not needed; using an ArrayList instead would perform better.

Data structures that synchronize use locks (or other synchronization constructs) to ensure that their data is always in a consistent state. Oftentimes, this requires that one or more threads wait on another thread to finish updating the structure's state, which will then reduce performance, since a wait has been introduced where before there was none.

2 threads can modify the list at the same time and add a new item or delete/modify the same item in the list at the same time because no synchronization (or lock mechanism if you prefer) exists. So imagine you delete one item of the list while somebody else is trying to work with it or you modify an item while someone uses it, it's not very secure.
http://download.oracle.com/javase/1.4.2/docs/api/java/util/ArrayList.html
Read the "Note that this implementation is not synchronized." paragraph, it explains a bit better.
And I forgot, considering speed, it seems quite trivial to imagine that when you try to control the access to a data, you add some mechanisms that prevent other people from accessing your data. Thus, you add some more computations so it is slower...

Non-blocking data structures will be faster than ones that bock, because of that fact. With blocking data structures, if a resources is acquired by some entity it will take time for another entity to acquire that same resource, once it becomes available.
However, this can be less secure in some instances depending on the situation. The main points of contention are during writes. If it can be guaranteed that the data contained in a data structure will not change it has been added and will only be accessed to read the value than there will not be a problem. The issues arise when there is a conflict between a write and a read, or a write and a write.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.