I've already posted this question on Code Review (https://codereview.stackexchange.com/questions/158999/get-set-the-value-in-the-cache-using-the-atomicreference-in-java), but I'm posting it here as well so that it reaches a wider audience and I can get a quicker answer.
I have the code below, which gets and sets data in a cache using a synchronized block, and I want to know whether I can optimize it:
private final AtomicReference<Integer> cachedIntRef = new AtomicReference<Integer>();

public int getValue() {
    boolean wasCached = true;
    Integer cachedInt = cachedIntRef.get();
    if (cachedInt == null) {
        synchronized (cachedIntRef) {
            cachedInt = cachedIntRef.get();
            if (cachedInt == null) {
                wasCached = false;
                // Make DB call to get the data and update the cache.
                cachedInt = baseDao.getCloudMaximumWeight();
                cachedIntRef.set(cachedInt);
            }
        }
    }
    return cachedInt;
}
I want to know whether there is any way I can remove the synchronized block and optimize this further, or whether the code is already optimal.
EDIT: I'll remove the question from one of the sites once I get an answer on either of them. Also, when I profile my application, sometimes even with a small number of threads I see threads blocking on this synchronized piece of code. That made me think that, since the code already uses an AtomicReference, I might somehow be able to get rid of synchronized, or that there is some other, better way to optimize the code.
I want to know whether there is any way I can remove the synchronized block and optimize this further, or whether the code is already optimal.
I assume that optimizing the code means removing the synchronized block. The problem with that thinking is that your DAO call is most likely significantly more expensive than the synchronization. Any IO (especially to a remote database) is going to be at least four orders of magnitude more expensive than the locking.
That said, you can remove the synchronized block if you don't mind multiple DAO calls when initializing the cache. If the DAO calls are inexpensive then having 2 threads making them maybe isn't a problem. There is a race condition on which one's answer will be put into the cache but chances are their results will be the same anyway. I often do this and assume that as the application starts up, the first couple of calls are going to be more expensive as the cache warms. But are 2 threads making the same DAO request ever going to be faster than 1 thread doing it and 1 waiting for the other thread to finish?
If there are a number of different DAO calls, you can try some sort of lock segregation so that not all cache requests go through the same lock. This would allow some parallelization, which might help. I can't tell whether your code is the real thing or just an example of the problem. This is how ConcurrentHashMap works, for example.
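If the real code caches several values keyed by name, one way to get that per-key locking is ConcurrentHashMap.computeIfAbsent, which only blocks callers racing on the same key. This is only a sketch; the WeightDao interface and loadWeight method are placeholders for whatever your DAO actually offers:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class WeightCache {
    // Hypothetical DAO interface standing in for whatever baseDao exposes.
    interface WeightDao {
        Integer loadWeight(String key);
    }

    private final ConcurrentMap<String, Integer> cache = new ConcurrentHashMap<>();
    private final WeightDao dao;

    public WeightCache(WeightDao dao) {
        this.dao = dao;
    }

    public int getValue(String key) {
        // computeIfAbsent locks only the bin for this key, so misses on
        // different keys can usually hit the database in parallel.
        return cache.computeIfAbsent(key, dao::loadWeight);
    }
}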
But really, I would make sure this section of code actually has performance problems before worrying too much about it. Even if a profiler says it is a primary time sink, it may just be that the DAO calls are the most expensive part of the equation, so saving a couple of them with synchronization would be the best way to speed it up anyway. You can take out the DAO calls and replace them with a straight assignment if you need to see whether it is the synchronized block or the dao.* calls that is the problem.
Try using a volatile Integer instead. Maybe I am missing something here, but I don't see the use case for the AtomicReference.
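A minimal sketch of the volatile variant, assuming the same baseDao as in the question and that you still want at most one DB call; the enclosing class is an assumption:

public class MaxWeightCache {
    private final BaseDao baseDao;          // the DAO from the question
    private volatile Integer cachedInt;     // volatile publishes the write to all threads

    public MaxWeightCache(BaseDao baseDao) {
        this.baseDao = baseDao;
    }

    public int getValue() {
        Integer local = cachedInt;
        if (local == null) {
            synchronized (this) {
                local = cachedInt;
                if (local == null) {
                    // The lock is still needed if you want exactly one DB call.
                    local = baseDao.getCloudMaximumWeight();
                    cachedInt = local;
                }
            }
        }
        return local;
    }
}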
Related
I encountered the following question in a recent System Design Interview:
Design an AppServer that interfaces with a Cache and a DB.
I came up with this:
public class AppServer {
    public Database DB;
    public Cache cache;

    public Value get(Key k) {
        Value res = cache.get(k);
        if (res == null) {
            res = DB.get(k);
            cache.set(k, res);
        }
        return res;
    }

    public void set(Key k, Value v) {
        cache.set(k, v);
        DB.set(k, v);
    }
}
This code is fine and works correctly, but the follow-ups to the question are:
What if there are multiple threads?
What if there are multiple instances of the AppServer?
Suddenly AppServer performance degrades a ton, and we find out this is because our cache is consistently missing. The cache size is fixed (already the largest it can be). How can we prevent this?
Response:
I answered that we can use locks or condition variables. In Java, we can mark each method as synchronized to get mutual exclusion, but the interviewer mentioned that this isn't very efficient and wanted only the critical parts synchronized.
I thought that we only need to synchronize the two set lines in void set(Key k, Value v) and the one cache.set call in Value get(Key k); however, the interviewer pushed for also synchronizing res = DB.get(k);. I agreed with him in the end, but I don't fully understand why. Don't threads have independent stacks and a shared heap? So when a thread executes get, it stores res in a local variable on its own stack frame; even if another thread executes get at the same time, the first thread keeps its own value. Then each thread sets its respective fetched value.
How can we handle multiple instances of the AppServer?
I came up with a distributed queue solution like Kafka: every time we perform a set/get command, we enqueue that command. He said set is OK because the action sets a value in the cache/DB, but how would you return the correct value for get? Can someone explain this?
Are there also possible solutions with a versioning system or an event system?
Possible solutions:
L1, L2, L3 caches - layers and more caches
Regional / Segmentation caches - use different cache for user groups.
Any other ideas?
Will upvote all insightful responses :)
1
Although JDBC is "supposed" to be thread-safe, some drivers aren't, and I'm going to assume that Cache isn't thread-safe either (although most caches should be). In that case, you would need to make the following changes to your code:
Make both fields final
Synchronize the ENTIRE get(...) method
Synchronize the ENTIRE set(...) method
Assuming there is no other way to access those fields, the correctness of your get(...) method depends on two things: first, that updates from the set(...) method are visible, and second, that on a cache miss only a single thread loads and stores the value. You need to synchronize because the idea is to have only one thread perform the expensive DB query when there is a cache miss. If you do not synchronize the entire get(...) method, or you split the synchronized statement, it is possible for another thread to also see a cache miss between the lookup and the insertion.
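As a rough sketch (keeping the field and type names from the question, which are assumed to exist), the fully synchronized version could look like this:

public class AppServer {
    private final Database DB;   // final, so the references are safely published
    private final Cache cache;

    public AppServer(Database db, Cache cache) {
        this.DB = db;
        this.cache = cache;
    }

    // The whole method is synchronized so that only one thread performs the
    // DB lookup and cache fill on a miss; no other thread can slip in
    // between the cache.get and the cache.set.
    public synchronized Value get(Key k) {
        Value res = cache.get(k);
        if (res == null) {
            res = DB.get(k);
            cache.set(k, res);
        }
        return res;
    }

    // set is synchronized as well, so a concurrent get cannot observe the
    // cache and the DB half-updated.
    public synchronized void set(Key k, Value v) {
        cache.set(k, v);
        DB.set(k, v);
    }
}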
The way I would answer this question is honestly just to toss the entire thing. I would look at how JCIP wrote the cache and base my answer on that.
2
I think your queue solution is fine.
I believe your interviewer means that if another instance of AppServer did not have cached what was already set(...) by a different instance, it would look up and find the correct value in the DB. This solution would be incorrect with multiple threads, because it is possible for two threads to set(...) conflicting values; the caches would then hold two different values and, depending on the thread safety of your DB, the DB might not even have the value at all.
Ideally, you'd never create more than a single instance of your AppServer.
3
I don't have enough experience to evaluate this question specifically, but perhaps an LRU cache would improve performance somewhat, or a hash ring buffer. It might be a stretch, but perhaps even using ML to determine the best values to preload or retain at certain times of the day could also work.
If you are always missing values from your cache, there is no way to improve your code. Performance would be dependent on your database.
This question is more about asking if my way of doing something is the "correct" way or not. I have some program that involves constantly updating graphical components. To that effect, I have the method below.
public void update(){
    for (BaseGameEntity movingEntity : movingEntityList) {
        ((MovingEntity) movingEntity).update();
    }
}
Essentially, the class containing this method has a list of all graphical objects that need updating and it loops through, calling their respective update methods.
The issue comes when I have to add new entities or remove current entities from this list. The addition and removal of entities is handled by a different thread, and as you can guess, this results in a ConcurrentModificationException if I try to add/remove entities while also looping through and updating their graphical components.
My ad hoc solution was to simply wrap this in a try-catch block and ignore any ConcurrentModificationExceptions that crop up - in effect, skipping the update at that specific time. This does exactly what I want and no problems have occurred.
public void update(){
    try {
        for (BaseGameEntity movingEntity : movingEntityList) {
            ((MovingEntity) movingEntity).update();
        }
    } catch (ConcurrentModificationException e) {
        // Do Nothing
    }
}
However, my question is: is this a "proper" way of handling this issue? Should I perhaps be doing something akin to what is outlined in this answer? What is the "correct" way to handle this, if mine is wrong? I'm not looking for ways to make my ArrayList thread-safe, such as synchronized lists; I'm specifically asking whether my method is valid or whether there is some reason I should avoid it and actually use a synchronized list.
The proper way would be to synchronize the list with Collections.synchronizedList():
List list = Collections.synchronizedList(new ArrayList());
...
synchronized (list) {
Iterator i = list.iterator(); // Must be in synchronized block
while (i.hasNext())
foo(i.next());
}
If you are traversing the list far more often than you update it, you can also use CopyOnWriteArrayList.
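For example, if entities are added or removed far less often than update() runs, swapping the backing list might be enough. This is only a sketch reusing the types from the question:

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Iteration works on a snapshot of the backing array, so the update loop can
// never throw ConcurrentModificationException; each add/remove copies the array.
private final List<BaseGameEntity> movingEntityList = new CopyOnWriteArrayList<>();

public void update() {
    for (BaseGameEntity movingEntity : movingEntityList) {
        ((MovingEntity) movingEntity).update();
    }
}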
If you don't mind occasional missing updates (or if they happen way too infrequently for the price of synchronization), your way is fine.
Is this a "proper" way of handling this issue?
If you do not mind getting an increase of concurrency at the expense of dropping the updates on error, then the answer is "yes". You do run the risk of not completing an update multiple times in a row, when significant additions and removals to the list happen often.
On the other hand, when the frequency of updates is significantly higher than the frequency of adding/removing an object, this solution sounds reasonable.
Should I perhaps be [using synchronized]?
This is also a viable solution. The difference is that an update would no longer be able to proceed while an add or remove is in progress. This may not be desirable when the timing of calls to update is critical (yet it is not critical to update everything on every single call).
Some people consider this a duplicate of all the generic synchronization questions. I think that is not the case: you are asking about a very specific constellation, and whether your solution is "OK" in that sense.
Based on what you described, the actual goal seems to be clear: You want to quickly and concurrently iterate over the entities to call the update method, and avoid any synchronization overhead that may be implied by using Collections#synchronizedList or similar approaches.
Additionally, I assume that the main idea behind the solution that you proposed was that the update calls have to be done very often and as fast as possible, whereas adding or removing entities happens "rarely".
So, adding and removing elements is an exception, compared to the regular operations ;-)
And (as dasblinkenlight already pointed out in his answer) for such a setup, the solution of catching and ignoring the ConcurrentModificationException is reasonable, but you should be aware of the consequences.
It might happen that the update method of some entities is called, and then the loop bails out due to the ConcurrentModificationException. You should be absolutely sure that this does not have undesirable side-effects. Depending on what update actually does, this might, for example, cause some entities to move faster over the screen, and others to not move at all, because their update calls had been missed due to several ConcurrentModificationExceptions. This may be particularly problematic if adding and removing entities is not an operation that happens rarely: If one thread constantly adds or removes elements, then the last elements of the list may never receive an update call at all.
If you want some "justification by example": I first encountered this pattern in the JUNG Graph Library, for example, in the SpringLayout class and others. When I first saw it, I cringed a little, because at the first glance it looks horribly hacky and dangerous. But here, the justification is the same: The process has to be as fast as possible, and modifications to the graph structure (which would cause the exception) are rare. Note that the JUNG guys actually do recursive calls to the respective method when the ConcurrentModificationException happens - simply because they can't always assume the method to be called constantly by another thread. This, in turn, can have nasty side-effects: If another thread does constant modifications, and the ConcurrentModificationException is thrown each time when the method is called, then this will end with a StackOverflowError... But this is not the case for you, fortunately.
Note: I'm not looking for workarounds; I'm sure I can find other methods if necessary. I simply feel like I'm missing something fundamental or quirky and I want to know what I'm missing. Or if there is a way to use the debugger to get more info that would be nice too. Thanks!
I'm having an issue with the use of synchronized. I'm hitting a deadlock, but it seems utterly impossible. I've placed print statements before each and every synchronized call, just inside each call, and just before exiting, so I can see who holds which monitors. I'm finding that a thread will not enter one of my synchronized blocks even though nobody currently holds the lock on that object. Are there some quirks I'm missing, or illegal nesting operations? Here's the gist of what I am doing.
Oh yeah, and the oddest thing is that removing the two "busyFlagObject" synchronizations makes it work fine...
Thread 1:
public void DrawFunction()
{
    synchronized(drawObject)
    {
        ...
        // Hangs here though nobody has a lock on this object
        synchronized(animationObject)
        {
        }
    }
}
Thread 2:
public void AnotherFunction()
{
    synchronized(busyFlagObject)
    {
        // Calls a function that also uses this same Synchronized call
        synchronized(busyFlagObject)
        {
            // Calls another function that uses another Synchronized call
            // Hangs here waiting for the draw function to complete which it SHOULD
            // be able to do no problem.
            synchronized(drawObject)
            {
            }
            // Never gets to this one assuming the Log statements don't
            // buffer and aren't flushed but still shouldn't be a problem anyway.
            synchronized(animationObject)
            {
            }
        }
    }
}
Run your app under the debugger or use "jstack" from the JDK tools. That will show you directly which threads wait for locks and which hold locks, so we don't have to guess where your problem is :-)
That said, you mention you synchronize on a Boolean. Keep in mind that the class is intended to have only two instances, and many things (particularly autoboxing) will implicitly map your Boolean to one of those shared values. Are you sure your lock objects are not the same instance? You might consider using new Object() as your monitor object.
It's worth noting that this isn't the only place that this can happen and there's a good entry on this problem in Java Concurrency in Practice, specifically with string interning, that I'm failing to find a link to at the moment. Don't use a type that isn't under your control as something it wasn't intended to do :-)
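A sketch of what dedicated monitor objects could look like; the field names are assumptions mirroring the snippets above:

// Private, dedicated lock objects: nothing else in the JVM can ever hold them,
// unlike Boolean.TRUE/FALSE or interned Strings, which are shared instances.
private final Object drawLock = new Object();
private final Object animationLock = new Object();
private final Object busyFlagLock = new Object();

public void drawFunction() {
    synchronized (drawLock) {
        synchronized (animationLock) {
            // ... draw and animate ...
        }
    }
}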
I have a critical section in my (Java) code which basically goes like the snippet below. The messages are coming in from an NIO server.
void messageReceived(User user, Message message) {
    synchronized(entryLock) {
        userRegistry.updateLastMessageReceived(user, time());
        server.receive(user, message);
    }
}
However, a high percentage of my messages are not actually going to change the server state; they're merely the client saying "hello, I'm still here". I really don't want to have to do that work inside the synchronized block.
I could use a synchronized map or something like that, but it's still going to incur a synchronization penalty.
What I would really like to have is something like a drop box, like this:
void messageReceived(User user, Message message) {
    dropbox.add(new UserReceived(user, time()));
    if (message.getType() != Message.TYPE_KEPT_ALIVE) {
        synchronized(entryLock) {
            server.receive(user, message);
        }
    }
}
I have a cleanup routine that automatically puts clients that aren't active to sleep. So instead of synchronizing on every keep-alive message to update the registry, the cleanup routine can simply process the accumulated keep-alive messages in a single synchronized block.
So naturally, recognizing a need for this, the first thing I did was start writing a solution. Then I decided this was a non-trivial class, and a problem that is more than likely fairly common. So here I am.
tl;dr Is there a Java library or other solution I can use to facilitate atomically adding to a list of objects without blocking? Collecting from the list asynchronously is not required; I just don't want to synchronize on every add to the list.
The documentation for ConcurrentLinkedQueue says:
This implementation employs an efficient "wait-free" algorithm based on one described in Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms by Maged M. Michael and Michael L. Scott.
I'm not sure exactly what the quotes around "wait-free" entail, but the Concurrent* classes are good places to look for the kind of structure you're after.
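A minimal sketch of the "drop box" with ConcurrentLinkedQueue, assuming a simple UserReceived holder like the one in the question and that time() returns a long:

import java.util.concurrent.ConcurrentLinkedQueue;

// Simple immutable holder, mirroring the UserReceived from the question.
final class UserReceived {
    final User user;
    final long time;
    UserReceived(User user, long time) { this.user = user; this.time = time; }
}

private final ConcurrentLinkedQueue<UserReceived> dropbox = new ConcurrentLinkedQueue<>();

void messageReceived(User user, Message message) {
    // add() on ConcurrentLinkedQueue is lock-free, so keep-alives never touch entryLock.
    dropbox.add(new UserReceived(user, time()));
    if (message.getType() != Message.TYPE_KEPT_ALIVE) {
        synchronized (entryLock) {
            server.receive(user, message);
        }
    }
}

// Called from the cleanup routine: drain whatever has been queued so far.
void drainKeepAlives() {
    UserReceived ur;
    while ((ur = dropbox.poll()) != null) {
        userRegistry.updateLastMessageReceived(ur.user, ur.time);
    }
}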
You might also be interested in the following: Effective Concurrency: Lock-Free Code — A False Sense of Security. It talks about how hard these things are to get right, even for experts.
Well, there are a few things you must bear in mind.
First, there is very little "synchronization cost" if there is little contention (more than one thread trying to enter the synchronized block at the same time).
Second, if there is contention, you're going to incur some cost no matter what technique you're using. Paul is right about ConcurrentLinkedQueue and the "wait-free" means that thread concurrency control is not done using locks, but still, you will always pay some price for contention. You may also want to look at ConcurrentHashMap because I'm not sure a list is what you're looking for. Using both classes is quite simple and common.
If you want to be more adventurous, you might find some non-locking synchronization primitives in java.util.concurrent.atomic.
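For the keep-alive case specifically, a map keyed by user may fit better than a list. Here is a sketch under the assumption that User has sensible equals/hashCode and that time() returns a long:

import java.util.concurrent.ConcurrentHashMap;

private final ConcurrentHashMap<User, Long> lastSeen = new ConcurrentHashMap<>();

void messageReceived(User user, Message message) {
    // put() is thread-safe and only contends with writers hitting the same bin.
    lastSeen.put(user, time());
    if (message.getType() != Message.TYPE_KEPT_ALIVE) {
        synchronized (entryLock) {
            server.receive(user, message);
        }
    }
}

// The cleanup routine can scan the map without stopping the writers.
void sweepIdleUsers(long idleMillis) {
    long cutoff = time() - idleMillis;
    lastSeen.forEach((user, seen) -> {
        if (seen < cutoff) {
            // put the user to sleep / remove it from the registry
        }
    });
}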
One thing we could do is to use a simple ArrayList for the keep-alive messages:
Keep adding to this list whenever a keep-alive message comes in.
The other thread would synchronize on a lock X and read and process the keep-alives. Note that this thread is not removing from the list, only reading/copying.
Finally, in messageReceived itself, you check whether the list has grown beyond, say, 1000 entries, in which case you synchronize on lock X and clear the list.
final Object X = new Object();   // lock guarding the hand-off
List<Message> keepAliveList = new ArrayList<>();
List<Message> processList = new ArrayList<>();

void messageReceived(User user, Message message) {
    if (message.getType() == Message.TYPE_KEPT_ALIVE) {
        if (keepAliveList.size() > THRESHOLD) {
            synchronized(X) {
                processList.addAll(keepAliveList);
                keepAliveList.clear();
            }
        }
        keepAliveList.add(message);
    }
}

// on another thread
void checkKeepAlives() {
    synchronized(X) {
        processList.addAll(keepAliveList);
    }
    processKeepAlives(processList);
}
I have a list of personId. There are two API calls to update it (add and delete):
public void add(String newPersonName) {
    if (personNameIdMap.get(newPersonName) != null) {
        myPersonId.add(personNameIdMap.get(newPersonName));
    } else {
        // get the id from Twitter and add to the list
    }
    // make an API call to Twitter
}

public void delete(String personName) {
    if (personNameIdMap.get(personName) != null) {
        myPersonId.remove(personNameIdMap.get(personName));
    } else {
        // wrong person name
    }
    // make an API call to Twitter
}
I know there can be concurrency problem. I read about 3 solutions:
synchronize the methods
use Collections.synchronizedList()
CopyOnWriteArrayList
I am not sure which one to prefer to prevent the inconsistency.
1) synchronize the methods
2) use Collections.synchronizedList
3) CopyOnWriteArrayList
All will work, it's a matter of what kind of performance / features you need.
Methods #1 and #2 are blocking approaches. If you synchronize the methods, you handle the concurrency yourself. If you wrap the list in Collections.synchronizedList, it handles it for you. (IMHO #2 is safer - just be sure to use it as the docs say, and don't let anything access the raw list wrapped inside the synchronizedList.)
CopyOnWriteArrayList is one of those weird things that has use in certain applications. It's a non-blocking quasi-immutable list, namely, if Thread A iterates through the list while Thread B is changing it, Thread A will iterate through a snapshot of the old list. If you need non-blocking performance, and you are rarely writing to the list, but frequently reading from it, then perhaps this is the best one to use.
edit: There are at least two other options:
4) use Vector instead of ArrayList; Vector implements List and is already synchronized. However, it's generally frowned upon as it's considered an old-school class (it has been around since Java 1.0!), and it should be roughly equivalent to #2.
5) access the List serially from only one thread. If you do this, you're guaranteed not to have any concurrency problems with the List itself. One way to do this is to use Executors.newSingleThreadExecutor and queue up tasks one-by-one to access the list. This moves the resource contention from your list to the ExecutorService; if the tasks are short, it may be fine, but if some are lengthy they may cause others to block longer than desired.
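A sketch of option #5 with a single-threaded executor; the list and map are the ones from the question, and PersonId is a placeholder for whatever type the map actually stores:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

private final ExecutorService listWorker = Executors.newSingleThreadExecutor();

public void add(String newPersonName) {
    // All list access happens on the single worker thread, so the list itself
    // never needs synchronization; tasks are executed one by one in order.
    listWorker.submit(() -> {
        PersonId id = personNameIdMap.get(newPersonName);
        if (id != null) {
            myPersonId.add(id);
        }
        // make the API call to Twitter here, or submit it elsewhere
    });
}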
In the end you need to think about concurrency at the application level: thread-safety should be a requirement, and find out how to get the performance you need with the simplest design possible.
On a side note, you're calling personNameIdMap.get(newPersonName) twice in add() and delete(). This has concurrency problems of its own if another thread modifies personNameIdMap between the two calls. You're better off doing:
PersonId id = personNameIdMap.get(newPersonName);
if (id != null) {
    myPersonId.add(id);
} else {
    // something else
}
Collections.synchronizedList is the easiest to use and probably the best option. It simply wraps the underlying list with synchronized. Note that multi-step operations (eg for loop) still need to be synchronized by you.
Some quick things
Don't synchronize the method unless you really need to - It just locks the entire object until the method completes; hardly a desirable effect
CopyOnWriteArrayList is a very specialized list that you most likely wouldn't want here, since you have an add method. It's essentially a normal ArrayList, but each time something is added the whole backing array is copied, a very expensive operation. It's thread-safe, but not really the desired result.
Synchronized is the old way of working with threads. Avoid it in favor of new idioms mostly expressed in the java.util.concurrent package.
See 1.
A CopyOnWriteArrayList has fast reads and slow writes. If you're making a lot of changes to it, it might start to drag on your performance.
Concurrency isn't about an isolated choice of what mechanism or type to use in a single method. You'll need to think about it from a higher level to understand all of its impacts.
Are you making changes to personNameIdMap within those methods, or to any other data structures whose access should also be synchronized? If so, it may be easiest to mark the methods as synchronized; otherwise, you might consider using Collections.synchronizedList to get a synchronized view of myPersonId and then doing all list operations through that view. Note that you should not manipulate myPersonId directly in this case, but do all accesses solely through the list returned from the Collections.synchronizedList call.
Either way, you have to make sure that there can never be a situation where a read and a write or two writes could occur simultaneously to the same unsynchronized data structure. Data structures documented as thread-safe or returned from Collections.synchronizedList, Collections.synchronizedMap, etc. are exceptions to this rule, so calls to those can be put anywhere. Non-synchronized data structures can still be used safely inside methods declared to be synchronized, however, because such methods are guaranteed by the JVM to never run at the same time, and therefore there could be no concurrent reading / writing.
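For example, publishing only the synchronized view and routing every access through it might look like the sketch below; PersonId is again a placeholder, and the raw ArrayList never escapes:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Only this reference is ever used; the backing ArrayList is not exposed.
private final List<PersonId> myPersonId =
        Collections.synchronizedList(new ArrayList<>());

public void add(String newPersonName) {
    PersonId id = personNameIdMap.get(newPersonName);
    if (id != null) {
        myPersonId.add(id);   // a single call, already synchronized by the wrapper
    }
}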
In your case from the code that you posted, all 3 ways are acceptable. However, there are some specific characteristics:
#3: This should have the same effect as #2 but may run faster or slower depending on the system and workload.
#1: This way is the most flexible. Only with #1 can you make the add() and delete() methods more complex. For example, if you need to read or write multiple items in the list, then you cannot use #2 or #3, because some other thread could still see the list half-updated; see the sketch below.
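For instance, a rename that must remove one id and add another as a single step is only atomic if the whole method holds the lock. The method below is a hypothetical addition, not part of the original API:

public synchronized void rename(String oldName, String newName) {
    PersonId oldId = personNameIdMap.get(oldName);
    PersonId newId = personNameIdMap.get(newName);
    if (oldId != null && newId != null) {
        // Both steps happen under one lock, so no reader ever sees the list
        // with the old entry removed but the new one not yet added.
        myPersonId.remove(oldId);
        myPersonId.add(newId);
    }
}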
Java concurrency (multi-threading):
Concurrency is the ability to run several programs, or several parts of a program, in parallel. If a time-consuming task can be performed asynchronously or in parallel, this improves the throughput and the interactivity of the program.
We can do concurrent programming in Java. Java's concurrency support covers parallel programming, immutability, threads, the executor framework (thread pools), futures, callables, and the fork/join framework.