I have a function that inserts strings passed as a parameter into an ArrayList. This function can be called from different threads:
public void adding(String newStringForEachInvocation){
arrayList.add(newStringForEachInvocation);
}
I want the add method to run concurrently. My doubt is: if two threads have two different strings, is it possible for them to compete for the same bucket?
Another alternative is a BlockingQueue, but would that also enforce mutual exclusion between threads competing for the same bucket?
Yes. ArrayList is not thread-safe, so all accesses to the list must be synchronized if it is accessed by multiple threads (explicitly, and/or by wrapping it with Collections.synchronizedList()). Anything could happen if you don't (data corruption, exceptions, etc.).
There are alternative List implementations whose reads never block, like CopyOnWriteArrayList (its writes copy the backing array). Depending on the use case, it could be faster or slower than a synchronized list.
Use Collections.synchronizedList; every single operation on that list will be synchronized:
http://docs.oracle.com/javase/7/docs/api/java/util/Collections.html#synchronizedList(java.util.List)
Be careful, though: if you perform more than one operation on that list as a unit, such as an iteration, use a synchronized block to ensure the integrity of the list, as specified in the Javadoc:
It is imperative that the user manually synchronize on the returned list when iterating over it
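As a minimal sketch of that advice (the class name and the totalLength() method are made up for illustration, not taken from the question), single calls go through the wrapper unguarded, while the iteration holds the list's own monitor:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedListSketch {
    // Every access goes through the synchronized wrapper; the raw ArrayList is never exposed.
    private final List<String> list = Collections.synchronizedList(new ArrayList<>());

    void adding(String newString) {
        list.add(newString); // a single call is atomic on its own
    }

    int size() {
        return list.size();
    }

    // Hypothetical multi-step operation: iteration must hold the list's monitor.
    int totalLength() {
        synchronized (list) {
            int total = 0;
            for (String s : list) {
                total += s.length();
            }
            return total;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedListSketch sketch = new SynchronizedListSketch();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    sketch.adding("t" + id);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        System.out.println(sketch.size()); // 4000: no additions are lost
    }
}
```

With a plain ArrayList in place of the wrapper, the same program may print fewer than 4000 or throw, which is the "anything could happen" case described above.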
I'm writing a simple message queue program with multiple producers and multiple serializers (the consumer is not considered right now). The producer specifies which queue it wants to send a message to by using a String queueName. The serializer can only be initialized during the sending procedure, because the exact number and names of the queues are not known until runtime. Since I have to use a Map, I think I can use either
HashMap together with lock/synchronized
ConcurrentHashMap
I want to avoid using an explicit lock, so I chose ConcurrentHashMap. However, using ConcurrentHashMap doesn't mean my program is thread-safe: the gap between containsKey() and put() might cause some chaos. So I am considering its putIfAbsent() method.
However, when I call putIfAbsent(queuename, new MySerializer()), it creates a new instance of MySerializer every time I call putIfAbsent. But if I don't use putIfAbsent, I'll have to use something like a lock.
My question is how to concurrently add elements into ConcurrentHashMap while avoiding using lock at the same time?
Java 8 added new methods to the Map interface which allow the potentially-new value to be evaluated lazily. For example:
map.computeIfAbsent(queuename, name -> new MySerializer());
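To see the lazy evaluation at work, here is a small sketch. SerializerRegistry, serializerFor and the CREATED counter are hypothetical names added for illustration; MySerializer is a stand-in for the class in the question, assumed to have a no-argument constructor.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class SerializerRegistry {
    // Hypothetical stand-in for the question's MySerializer; it counts constructions.
    static final class MySerializer {
        static final AtomicInteger CREATED = new AtomicInteger();

        MySerializer() {
            CREATED.incrementAndGet();
        }
    }

    private final Map<String, MySerializer> serializers = new ConcurrentHashMap<>();

    MySerializer serializerFor(String queueName) {
        // The lambda runs only when queueName has no mapping yet.
        return serializers.computeIfAbsent(queueName, name -> new MySerializer());
    }

    public static void main(String[] args) {
        SerializerRegistry registry = new SerializerRegistry();
        MySerializer first = registry.serializerFor("orders");
        MySerializer second = registry.serializerFor("orders");
        System.out.println(first == second);            // true
        System.out.println(MySerializer.CREATED.get()); // 1
    }
}
```

ConcurrentHashMap documents computeIfAbsent as atomic, with the function applied at most once per key. Other updates to the same part of the map may briefly block while the function runs, so it is best kept short.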
I have a class which has:
2 fields holding time-ordered list (list1, list2).
3 read-only methods which iterate over the lists above to generate summary statistics.
1 mutating method, which looks for a match of given 'new-item' in list1. If match is not found, it adds 'new-item' to list1. If match is found, it removes the match from list1 and adds both match and 'new-item' to list2.
Let's assume that multiple concurrent invocations of all methods are possible. I need to achieve thread-safety while maximising performance.
Approach1 (extremely slow) - Declare the field types as ArrayList and use the synchronized keyword on all methods.
Approach2 - Declare the field types as CopyOnWriteArrayList and synchronize the mutating method.
Questions
Does Approach2 ensure thread-safety?
Are there better alternatives?
Do you need the random access offered by an ArrayList? Can you instead use a thread-safe ordered collection like ConcurrentSkipListSet (non-blocking) or PriorityBlockingQueue (blocking)? Both have O(log n) insertions.
Mutation methods in both cases are thread-safe.
Edit: Just note that you would still run into atomicity concerns. If you need the adds to be done atomically, you would need coarser locking.
Approach number 2 does not guarantee thread-safety.
The two operations on collections are not atomic: first you remove an item, then you add it to the other collection. Some thread might in the meantime execute a read-only method to find out that the item is missing in list 1, and is not yet added to the list 2. It depends on your application whether this is acceptable.
On the other hand, it is also possible that a read-only method first iterates through list 1 and finds that it contains item x; in the meantime the updating method executes and transfers item x; the read-only method then continues, iterates through list 2, and finds item x a second time. Again, it depends on your application whether this is acceptable.
Other solutions are possible, but they would require more details about what exactly you are trying to achieve.
One obvious way would be to modify approach number 1, and instead of using synchronized on every method, use a readers-writer lock. You would read-lock in every read-only method and write-lock in the mutating one.
You could also use two separate readers-writer locks. One for the first collection and one for the other. If your read-only methods iterate through both of the lists, they would have to read-acquire both of the locks up front, before doing anything. On the other hand the mutating method would have to first write-acquire the first lock, and if it wishes to transfer an item, then it should write-acquire the second lock.
You'd need to do some testing to see if it works nicely for you. Still there are definitely even better ways to handle it, but you'd need to provide more details.
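A rough sketch of the single readers-writer lock variant follows. MatchBook, submit and sizes are invented names, and a "match" is assumed to mean equals(), since the question does not say how matching works.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class MatchBook {
    private final List<String> list1 = new ArrayList<>(); // unmatched items
    private final List<String> list2 = new ArrayList<>(); // matched pairs
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Mutating method: exclusive access, so readers never see an item "in flight".
    void submit(String newItem) {
        lock.writeLock().lock();
        try {
            int index = list1.indexOf(newItem); // assumption: a match means equals()
            if (index < 0) {
                list1.add(newItem);
            } else {
                String match = list1.remove(index);
                list2.add(match);
                list2.add(newItem);
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Read-only method: any number of readers may hold the read lock at once.
    int[] sizes() {
        lock.readLock().lock();
        try {
            return new int[] { list1.size(), list2.size() };
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Because the lists are only touched while the lock is held, plain ArrayList is enough here. Whether this beats synchronized methods depends on the ratio of reads to writes, so it is worth measuring.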
The time it takes to acquire an uncontended lock is less than a microsecond. If a fraction of a microsecond matters, you might consider something more complex, but otherwise something simple is usually better.
Just using a thread-safe collection is not enough when you perform multiple operations. For example, removing from one list and adding to another is two operations, and any number of threads can get in between them.
Note: if you do lots of updates, CopyOnWriteArrayList can be slower.
I have a data store that is written to by multiple message listeners. Each of these message listeners can also be in the hundreds of individual threads.
The data store is a PriorityBlockingQueue as it needs to order the inserted objects by a timestamp. To make checking of the queue of items efficient rather than looping over the queue a concurrent hashmap is used as a form of index.
private Map<String, SLAData> SLADataIndex = new ConcurrentHashMap<String, SLAData>();
private BlockingQueue<SLAData> SLADataQueue;
Question 1: Is this an acceptable design, or should I just use a single PriorityBlockingQueue?
Each message listener performs an operation, these listeners are scaled up to multiple threads.
Insert Method so it inserts into both.
this.SLADataIndex.put(dataToWrite.getMessageId(), dataToWrite);
this.SLADataQueue.add(dataToWrite);
Update Method
this.SLADataIndex.get(messageId).setNodeId(
updatedNodeId);
Delete Method
SLAData data = this.SLADataIndex.get(messageId);
// remove(Object) on a PriorityBlockingQueue is O(n): it has to search for the element
this.SLADataQueue.remove(data);
// remove from index
this.SLADataIndex.remove(messageId);
Question Two Using these methods is this the most efficient way? They have wrappers around them via another object for error handling.
Question Three Using a concurrent HashMap and BlockingQueue does this mean these operations are thread safe? I don't need to use a lock object?
Question Four When these methods are called by multiple threads and listeners without any sort of synchronized block, can they be called at the same time by different threads or listeners?
Question 1: Is this an acceptable design, or should I just use a single PriorityBlockingQueue?
Certainly you should try to use a single Queue. Keeping the two collections in sync is going to require a lot more synchronization complexity and worry in your code.
Why do you need the Map? If it is just to call setNodeId(...) then I would have the processing thread do that itself when it pulls from the Queue.
// processing thread
while (!Thread.currentThread().isInterrupted()) {
dataToWrite = queue.take();
dataToWrite.setNodeId(myNodeId);
// process data
...
}
Question Two Using these methods is this the most efficient way? They have wrappers around them via another object for error handling.
Sure, that seems fine but, again, you will need to do some synchronization locking otherwise you will suffer from race conditions keeping the 2 collections in sync.
Question Three Using a concurrent HashMap and BlockingQueue does this mean these operations are thread safe? I don't need to use a lock object?
Both of those classes (ConcurrentHashMap and the BlockingQueue implementations) are thread-safe, yes. BUT since there are two of them, you can have race conditions where one collection has been updated but the other one has not. Most likely, you will have to use a lock object to ensure that both collections are properly kept in sync.
Question Four When these methods are called by multiple threads and listeners without any sort of synchronized block, can they be called at the same time by different threads or listeners?
That's a tough question to answer without seeing the code in question. For example, one thread might be calling Insert(...) and have added the item to the Map but not yet to the queue when another thread calls Delete(...). The item would be found in the Map and removed, but queue.remove() would not find it in the queue, since the Insert(...) has not finished in the other thread.
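If both collections are kept, one way to close that race is to guard them with a single lock, roughly as below. SlaStore, peekOldest and the fields of SLAData are assumptions made for the sketch; only the index-plus-queue layout comes from the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

class SlaStore {
    // Hypothetical stand-in for the question's SLAData, ordered by timestamp.
    static final class SLAData implements Comparable<SLAData> {
        final String messageId;
        final long timestamp;

        SLAData(String messageId, long timestamp) {
            this.messageId = messageId;
            this.timestamp = timestamp;
        }

        @Override
        public int compareTo(SLAData other) {
            return Long.compare(timestamp, other.timestamp);
        }
    }

    private final Object lock = new Object();
    private final Map<String, SLAData> index = new HashMap<>();
    private final PriorityQueue<SLAData> queue = new PriorityQueue<>();

    void insert(SLAData data) {
        synchronized (lock) { // both structures change together or not at all
            index.put(data.messageId, data);
            queue.add(data);
        }
    }

    boolean delete(String messageId) {
        synchronized (lock) {
            SLAData data = index.remove(messageId);
            return data != null && queue.remove(data);
        }
    }

    SLAData peekOldest() {
        synchronized (lock) {
            return queue.peek();
        }
    }
}
```

Once every access holds the same lock, the concurrent collection classes add nothing, which is why the sketch falls back to HashMap and PriorityQueue. A blocking take() for consumers would need extra signalling that is left out here.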
I have a list of personId. There are two API calls to update it (add and remove):
public void add(String newPersonName) {
    if (personNameIdMap.get(newPersonName) != null) {
        myPersonId.add(personNameIdMap.get(newPersonName));
    } else {
        // get the id from Twitter and add to the list
    }
    // make an API call to Twitter
}
public void delete(String personName) {
    if (personNameIdMap.get(personName) != null) {
        myPersonId.remove(personNameIdMap.get(personName));
    } else {
        // wrong person name
    }
    // make an API call to Twitter
}
I know there can be concurrency problem. I read about 3 solutions:
synchronize the method
use Collections.synchronizedList()
CopyOnWriteArrayList
I am not sure which one to prefer to prevent the inconsistency.
1) synchronize the method
2) use Collections.synchronizedList()
3) CopyOnWriteArrayList ..
All will work, it's a matter of what kind of performance / features you need.
Method #1 and #2 are blocking methods. If you synchronize the methods, you handle concurrency yourself. If you wrap a list in Collections.synchronizedList, it handles it for you. (IMHO #2 is safer -- just be sure to use it as the docs say, and don't let anything access the raw list that is wrapped inside the synchronizedList.)
CopyOnWriteArrayList is one of those weird things that has use in certain applications. It's a non-blocking quasi-immutable list, namely, if Thread A iterates through the list while Thread B is changing it, Thread A will iterate through a snapshot of the old list. If you need non-blocking performance, and you are rarely writing to the list, but frequently reading from it, then perhaps this is the best one to use.
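The snapshot behaviour can be seen in a single thread. This is only an illustration; the class and method names are made up.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIteration {
    // Returns how many elements an iterator sees when the list grows after it was created.
    static int countSeenByOldIterator() {
        List<String> ids = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        Iterator<String> iterator = ids.iterator(); // bound to the array as it is right now
        ids.add("c");                               // the write copies the backing array
        int seen = 0;
        while (iterator.hasNext()) {
            iterator.next();
            seen++;
        }
        return seen; // the old iterator never sees "c"
    }

    public static void main(String[] args) {
        System.out.println(countSeenByOldIterator()); // 2
    }
}
```

The same sequence on an ArrayList would throw ConcurrentModificationException from iterator.next().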
edit: There are at least two other options:
4) use Vector instead of ArrayList; Vector implements List and is already synchronized. However, it's generally frowned upon, as it's considered an old-school class (it has been there since Java 1.0!), and it should be equivalent to #2.
5) access the List serially from only one thread. If you do this, you're guaranteed not to have any concurrency problems with the List itself. One way to do this is to use Executors.newSingleThreadExecutor and queue up tasks one-by-one to access the list. This moves the resource contention from your list to the ExecutorService; if the tasks are short, it may be fine, but if some are lengthy they may cause others to block longer than desired.
In the end you need to think about concurrency at the application level: treat thread-safety as a requirement, then find out how to get the performance you need with the simplest design possible.
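For option 5, a sketch along these lines may help. SerialListAccess, snapshot and shutdown are hypothetical names, and the Twitter calls from the question are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SerialListAccess {
    // Only the executor's single worker thread ever touches this list.
    private final List<String> myPersonId = new ArrayList<>();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    void add(String id) {
        executor.execute(() -> myPersonId.add(id));
    }

    void remove(String id) {
        executor.execute(() -> myPersonId.remove(id));
    }

    // Reads are queued as well, so they observe every task submitted before them.
    List<String> snapshot() {
        Callable<List<String>> copy = () -> new ArrayList<>(myPersonId);
        try {
            return executor.submit(copy).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

A single-thread executor runs tasks sequentially in submission order, so no extra locking is needed on the list. The executor has to be shut down at some point, or its worker thread keeps the JVM alive.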
On a side note, you're calling personNameIdMap.get(...) twice in both add() and delete(). This suffers from concurrency problems if another thread modifies personNameIdMap between the two calls in each method. You're better off doing
PersonId id = personNameIdMap.get(newPersonName);
if (id != null){
myPersonId.add(id);
}
else
{
// something else
}
Collections.synchronizedList is the easiest to use and probably the best option. It simply wraps the underlying list with synchronized. Note that multi-step operations (e.g. a for loop) still need to be synchronized by you.
Some quick things
Don't synchronize the method unless you really need to. It locks the entire object until the method completes, which is hardly a desirable effect.
CopyOnWriteArrayList is a very specialized list that you most likely don't want, since you have an add method. It's essentially a normal ArrayList, but each time something is added the whole array is rebuilt, which is a very expensive task. It's thread-safe, but not really the desired result.
1. Synchronized is the old way of working with threads. Avoid it in favor of the newer idioms, mostly found in the java.util.concurrent package.
2. See 1.
3. A CopyOnWriteArrayList has fast reads and slow writes. If you're making a lot of changes to it, it might start to drag on your performance.
Concurrency isn't about an isolated choice of what mechanism or type to use in a single method. You'll need to think about it from a higher level to understand all of its impacts.
Are you making changes to personNameIdMap within those methods, or any other data structures access to which should also be synchronized? If so, it may be easiest to mark the methods as synchronized; otherwise, you might consider using Collections.synchronizedList to get a synchronized view of myPersonId and then doing all list operations through that synchronized view. Note that you should not manipulate myPersonId directly in this case, but do all accesses solely through the list returned from the Collections.synchronizedList call.
Either way, you have to make sure that there can never be a situation where a read and a write, or two writes, could occur simultaneously on the same unsynchronized data structure. Data structures documented as thread-safe or returned from Collections.synchronizedList, Collections.synchronizedMap, etc. are exceptions to this rule, so calls to those can be put anywhere. Non-synchronized data structures can still be used safely inside methods declared synchronized, however, because the JVM guarantees that synchronized methods of the same object never run at the same time, so there can be no concurrent reading or writing.
In your case from the code that you posted, all 3 ways are acceptable. However, there are some specific characteristics:
#3: This should have the same effect as #2 but may run faster or slower depending on the system and workload.
#1: This way is the most flexible. Only with #1 can you make the add() and delete() methods more complex. For example, if you need to read or write multiple items in the list, then you cannot use #2 or #3, because some other thread can still see the list half updated.
Java concurrency (multi-threading) :
Concurrency is the ability to run several programs or several parts of a program in parallel. If a time-consuming task can be performed asynchronously or in parallel, this improves the throughput and the interactivity of the program.
Java supports concurrent programming. Its concurrency toolkit covers threads, immutability, the executor framework (thread pools), futures, callables, and the fork-join framework.
I have a kind of async task managing class, which has an array like this:
public static int[][][] tasks;
Mostly I access the cells like this:
synchronized(tasks[A][B]) {
// Doing something with tasks[A][B][0]
}
My question is, if I do this:
synchronized(tasks[A]) {
// ...
}
will it also block threads trying to enter synchronized(tasks[A][B])?
In other words, does synchronized access to an array also synchronize access to its cells?
If not, how do I block the WHOLE array of tasks[A] for my thread?
Edit: the answer is NO. When someone is in a synchronized block on tasks[A], someone else can simultaneously be in a synchronized block on tasks[A][B], for example, because it's a different object. So when it comes to accessing objects from one place at a time, arrays are no different: to touch object X from one place at a time, you need to surround it with synchronized(X) EVERYWHERE you touch it.
No. Each array is an Object (with a monitor) unto itself, and the array tasks[A] is a distinct object from the array tasks[A][B]. The solution, if you want to synchronize all accesses to "sub" arrays of tasks[A], is simply that you must do synchronized (tasks[A]). If all accesses into the descendant objects (say, tasks[A][B]) do this, then any further synchronization is redundant.
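A quick way to convince yourself is Thread.holdsLock, which reports whether the current thread owns a given object's monitor. The array dimensions and the class name below are arbitrary choices for the sketch.

```java
class ArrayMonitors {
    // Enters the monitor of tasks[0] and reports which monitors the thread then holds.
    static boolean[] probe() {
        int[][][] tasks = new int[2][3][4];
        boolean[] held = new boolean[2];
        synchronized (tasks[0]) {
            held[0] = Thread.holdsLock(tasks[0]);    // the monitor we entered
            held[1] = Thread.holdsLock(tasks[0][1]); // a sub-array is a separate object
        }
        return held;
    }

    public static void main(String[] args) {
        boolean[] held = probe();
        System.out.println(held[0]); // true
        System.out.println(held[1]); // false
    }
}
```

So another thread is free to enter synchronized (tasks[0][1]) while this one is inside synchronized (tasks[0]).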
It appears your underlying question is actually something like "how can I safely modify the structure as well as the contents of a data structure while retaining the best concurrency possible?" If you augment your question a bit more about the problem space, we might be able to delve deeper. A three-dimensional array may not be the best solution.
int[][][] is an array of arrays of integer arrays, so your synchronized(tasks[A][B]) is synchronizing on the lowest level object, an array of integers, blocking other synchronized access to that same array.
synchronized(tasks[A]), on the other hand, synchronizes on the object at the next level up, an array of integer arrays. This prevents synchronized access to that array, which means in practice that any other code which uses synchronized(tasks[A]) will be blocked. That seems to be what you want, so long as all your accesses to tasks synchronize at the same level.
Note that synchronize does not lock anything! However, if two threads attempt to synchronize on the same object, one will have to wait.
It doesn't matter that you then work on another object (your array of integers).
I'm afraid I'm saying that andersoj's answer is misleading. You're doing the right thing.
Whenever I see code that grabs lots of different mutexes or monitors, my first reaction is to worry about deadlock; in your current code, do you ever lock multiple monitors in the same thread? If so, do you ensure that they are locked in a canonical ordering each time?
It would probably help if you could explain what you are trying to accomplish and how you are using and modifying this tasks array. There are a surprising (or perhaps unsurprising) number of cases where the utilities in java.util.concurrent are sufficient and individual monitors aren't necessary. Of course, it all depends on what exactly you are trying to do. Also, if the reason you are trying to grab so many different locks is that you are reading them very frequently, a single read-write lock for the entire 3D jagged-array object might be sufficient for your needs.