Accessing list using multiple threads - java

Is the compute() function thread safe? Will multiple threads loop correctly over the list?
class Foo {
private List<Integer> list;
public Foo(List<Integer> list) {
this.list = list;
}
public void compute() {
for (Integer i: list) {
// do some thing with it
// NO LIST modifications
}
}
}

Considering that data does not mutate (as you mentioned in the comment) there will not be any dirty / phantom reads.

If the list is created specifically for the purposes of that method, then you're good to go. That is, if the list isn't modified in any other method or class, then that code is thread safe, since you're only reading.
A general recommendation is to make a read-only copy of the collection, if you're not sure the argument comes from a trustworthy origin (and even if you are sure).
this.list = Collections.unmodifiableList(new ArrayList<Integer>(list));
Note, however, that the elements of the list must also be thread-safe. If, in your real scenario, the list contains some mutable structure, instead of Integer (which are immutable), you should make sure that any modifications to the elements are also thread-safe.

If you can guarantee that the list is not modified elsewhere while you're iterating over it, that code is thread safe.
I would create a read-only copy of the list though to be absolutely sure that it won't be modified elsewhere:
class Foo {
private List<Integer> list;
public Foo(List<Integer> list) {
this.list = Collections.unmodifiableList(new ArrayList<>(list));
}
public void compute() {
for (Integer i: list) {
// do some thing with it
// NO LIST modifications
}
}
}
If you don't mind adding a dependency to your project I suggest using Guava's ImmutableList:
this.list = ImmutableList.copyOf(list);
It is also a good idea to use Guava's immutable collections wherever your collections don't change, since they are inherently thread safe due to being immutable.
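If adding Guava is not an option, the JDK's own List.copyOf (available since Java 10) gives a comparable unmodifiable snapshot. The following is only a sketch to illustrate the idea; the class name SnapshotFoo and the helper sumInParallel are invented for this example and are not part of the question's code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

class SnapshotFoo {
    private final List<Integer> list;

    SnapshotFoo(List<Integer> list) {
        // List.copyOf (Java 10+) returns an unmodifiable copy, so later
        // changes to the caller's list are not visible through this field.
        this.list = List.copyOf(list);
    }

    // Read-only traversal: safe to call from many threads at once.
    long compute() {
        long sum = 0;
        for (Integer i : list) {
            sum += i;
        }
        return sum;
    }

    // Hypothetical helper: runs compute() on several threads and adds up the results.
    static long sumInParallel(SnapshotFoo foo, int threads) {
        AtomicLong total = new AtomicLong();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> total.addAndGet(foo.compute()));
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return total.get();
    }
}
```

Note that List.copyOf rejects null elements, which Collections.unmodifiableList(new ArrayList<>(list)) does not.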

You can easily inspect the behavior when having for example 2 threads:
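import java.util.Arrays;
public class Test {
public static void main(String[] args) {
// Foo has no no-arg constructor, so build one instance and share it between both threads
Foo foo = new Foo(Arrays.asList(1, 2, 3));
Runnable task1 = () -> { foo.compute(); };
Runnable task2 = () -> { foo.compute(); };
new Thread(task1).start();
new Thread(task2).start();
}
}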
If the list is guaranteed not to be changed anywhere else, iterating over it is thread safe. If you implement compute to simply print the list contents, running or debugging the code should help you see that each thread reads the whole list. Keep in mind that a test like this can illustrate thread safety but cannot prove it.

There is a thread-safe list in the java.util.concurrent package: CopyOnWriteArrayList. If you want thread-safe collections, use the ones from that package.
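As a rough illustration of what CopyOnWriteArrayList provides: its iterators read a snapshot of the backing array taken when iteration starts, so the list can be modified mid-iteration without a ConcurrentModificationException. Every write copies the array, so it is best suited to read-mostly lists. The class and method names below are invented for the example:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CowDemo {
    // Returns how many elements the loop saw; the list is modified mid-iteration.
    static int iterateWhileAdding(List<Integer> list) {
        int seen = 0;
        for (Integer i : list) {
            // With CopyOnWriteArrayList this add does not disturb the running
            // iterator, which keeps reading its snapshot of the old array.
            list.add(i + 100);
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> list = new CopyOnWriteArrayList<>(List.of(1, 2, 3));
        int seen = iterateWhileAdding(list);
        System.out.println(seen + " seen, size now " + list.size());
    }
}
```

The same loop on a plain ArrayList would fail, because its iterator detects the structural change.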

This version
class Foo {
private final List<Integer> list;
public Foo(List<Integer> list) {
this.list = new ArrayList<>(list);
}
public void compute() {
for(Integer i: list) {
// ...
}
}
}
is thread-safe, provided the following hold:
the list argument to the constructor can't be modified while the constructor runs (e.g., it is a local variable in the caller), or it is itself thread-safe (e.g., CopyOnWriteArrayList);
compute won't modify the list contents (just as the OP stated). I guess compute should not be void but should return some numeric value, to be of any use...

Related

Thread-safe HashMap of Objects with Nested ArrayList

I have a HashMap of objects with nested ArrayLists that is accessed by multiple threads.
I am wondering if declaring it as a synchronized HashMap is enough to make it thread-safe.
public class ExampleRepository {
private static Map<String, Example> examples = Collections.synchronizedMap(new HashMap<>());
public static void addExample(Example example) {
examples.put(example.getKey(), example);
}
public static Example getExample(String key) {
return examples.get(key);
}
}
public class Example {
private String key;
// More attributes
private List<AnotherObject> anotherObjectList = new ArrayList<>();
// Constructor
public List<AnotherObject> getAnotherObjectList() {
return anotherObjectList;
}
// More getters & Setters
}
public class Doer {
// This function runs in an ExecutorService with many threads
public static void one(String key) {
Example example = ExampleRepository.getExample(key);
if (example == null) {
// Do stuff
example = new Example(values);
AnotherObject anotherObject = new AnotherObject(values);
example.getAnotherObjectList().add(anotherObject);
ExampleRepository.addExample(example);
}
two(example);
}
private static void two(Example example) {
// Do stuff
AnotherObject anotherObject = new AnotherObject(values);
trim(example.getAnotherObjectList(), time);
example.getAnotherObjectList().add(anotherObject);
}
private static void trim(List<AnotherObject> anotherObjectList, int time) {
short counter = 0;
for (AnotherObject anotherObject : anotherObjectList) {
if (anotherObject.getTime() < time - ONE_HOUR) {
counter++;
} else {
break;
}
}
if (counter > 0) {
anotherObjectList.subList(0, counter).clear();
}
}
}
I guess the question is: is adding Example objects to the HashMap thread safe? Also, is removing and adding AnotherObject objects to the nested list thread-safe, or should I declare it as a synchronized ArrayList?
I would greatly appreciate any insights. Thank you very much!
Thank you very much for the answers. I just realized that I actually loop a little over the nested AnotherObject list. If I make the ArrayList a synchronized ArrayList, should I still put the loop in a synchronized block?
Thank you again!
The thing you have to be clear about is what you mean by "thread safe".
I guess the question is: is adding Example objects to the HashMap thread safe?
Making the map synchronized guarantees that the structural modifications you make to the map are visible to all threads.
Also, is removing and adding AnotherObject objects to the nested list thread-safe, or should I declare it as a synchronized ArrayList?
No: you would need to externally synchronize accesses to the lists if you want structural modifications to the lists to be visible in all threads.
That could mean using synchronizedList, but you could "manually" synchronize on the list, or even on the map (or one of a number of other ways that create happens-before guarantees).
I guess the question is: is adding Example objects to the HashMap thread safe?
-> Yes, putting an Example object into the map is thread-safe.
Also, is removing and adding AnotherObject objects to the nested list thread-safe, or should I declare it as a synchronized ArrayList?
-> No. Removing objects from or adding objects to the nested list is not thread-safe.
Any operation on the map itself will be thread-safe because you used a synchronized map. The ArrayList inside each Example object is still not thread-safe.
Making a collection or map thread-safe does not make the API of the objects it contains thread-safe.
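One way to make the nested list safe is to stop handing out the raw ArrayList and let the owning object guard it, so that "trim, then add" runs as one atomic step. This is only a sketch under simplifying assumptions: AnotherObject is reduced to an int time stamp, and the names GuardedExample, addTrimmed and snapshot are hypothetical, not from the question:

```java
import java.util.ArrayList;
import java.util.List;

class GuardedExample {
    // Guarded by the intrinsic lock of this GuardedExample instance.
    private final List<Integer> times = new ArrayList<>();

    // Drops leading entries older than the cutoff, then appends the new one.
    // Both steps happen under the same lock, so no other thread can interleave.
    synchronized void addTrimmed(int newTime, int cutoff) {
        int counter = 0;
        for (int time : times) {
            if (time < cutoff) {
                counter++;
            } else {
                break;
            }
        }
        if (counter > 0) {
            times.subList(0, counter).clear();
        }
        times.add(newTime);
    }

    // Callers get a copy, so they never iterate over the shared list directly.
    synchronized List<Integer> snapshot() {
        return new ArrayList<>(times);
    }
}
```

With this shape, the synchronized map only has to protect the map itself, and each Example protects its own list.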

Static Collection update inside CompletableFuture#runAsync

Preconditions (generic description):
1. static class field
static List<String> ids = new ArrayList<>();
2. CompletableFuture#runAsync(Runnable runnable, Executor executor) called within the static void main(String args[]) method
3. elements are added to the collection inside the runAsync call from step 2
Code snippet (specific description):
private static List<String> ids = new ArrayList<>();
public static void main(String[] args) throws ExecutionException, InterruptedException {
//...
final List<String> lines = Files.lines(path).collect(Collectors.toList());
for (List<String> chunk : CollectionUtils.split(1024, lines)) {
CompletableFuture<Void> future = CompletableFuture.runAsync(() -> {
List<User> users = buildUsers();
populate(users);
}, executorService);
futures.add(future);
}
}
private static void populate(List<User> users){
//...
ids.add(User.getId);
//...
}
Problem description:
As I understand it, from a concurrency point of view a static variable cannot be safely shared between threads, so data could be lost in some way.
Should it be changed to volatile, or would it be reasonable to use ConcurrentSkipListSet<String>?
Based on the code snippet:
volatile is not required here because it works at the reference level, while the tasks don't update the reference to the collection object; they mutate its state. If the reference itself were updated, either volatile or AtomicReference could be used.
A static object can be shared between threads, but the object must be thread-safe. A concurrent collection will do the job for light to medium load.
But the modern way to do this would involve streams instead of using a shared collection:
List<CompletableFuture<List<String>>> futures = lines.stream()
.map(line -> CompletableFuture.supplyAsync(() -> buildUsers().stream()
.map(User::getId)
.collect(Collectors.toList()),
executorService))
.collect(Collectors.toList());
ids.addAll(futures.stream()
.map(CompletableFuture::join)
.flatMap(List::stream)
.collect(Collectors.toList()));
In your particular case there are ways to guarantee thread safety for ids:
Use a thread-safe collection (for example, ConcurrentSkipListSet, CopyOnWriteArrayList, Collections.synchronizedList(new ArrayList<>()), or Collections.newSetFromMap(new ConcurrentHashMap<>()));
Use synchronization as shown below.
Examples of synchronized:
private static synchronized void populate(List<User> users){
//...
ids.add(User.getId);
//...
}
private static void populate(List<User> users){
//...
synchronized (ids) {
ids.add(User.getId);
}
//...
}
I assume that Collections.newSetFromMap(new ConcurrentHashMap<>()) would be the fastest if you expect a lot of user ids. Otherwise, you could use ConcurrentSkipListSet, which you already seem to be familiar with.
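To make the concurrent-set option concrete, here is a small self-contained sketch. It is not the asker's program: IdCollector, collect and the generated "user-..." ids are stand-ins for populate(users) and real user ids, and the pool size of 4 is arbitrary:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class IdCollector {
    // A Set view backed by a ConcurrentHashMap; add() is safe from any thread.
    private static final Set<String> ids =
            Collections.newSetFromMap(new ConcurrentHashMap<>());

    // Hypothetical stand-in for populate(users): each task adds idsPerTask distinct ids.
    static int collect(int tasks, int idsPerTask) {
        ids.clear();
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int t = 0; t < tasks; t++) {
                final int task = t;
                futures.add(CompletableFuture.runAsync(() -> {
                    for (int i = 0; i < idsPerTask; i++) {
                        ids.add("user-" + task + "-" + i);
                    }
                }, executor));
            }
            // Wait for every task before reading the result.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return ids.size();
        } finally {
            executor.shutdown();
        }
    }
}
```

Because every id is distinct, collect(8, 1000) should report 8000 entries; with a plain ArrayList or HashSet the count could come up short or the collection could be corrupted.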
volatile is a bad option here. volatile guarantees visibility, but not atomicity. A typical example of volatile usage is:
volatile int a = 1;
void threadOne() {
if (a == 1) {
// do something
}
}
void threadTwo() {
// do something
a = 2;
}
In that case, each thread performs only a single read or a single write. Because a is volatile, every thread is guaranteed to see either exactly 1 or exactly 2.
Another (bad) example:
void threadOne() {
if (a == 1) {
// do something
a++;
}
}
void threadTwo() {
if (a == 1) {
// do something
a = 2;
} else if (a == 2) {
a++;
}
}
Here we perform an increment (a read followed by a write), and a can end up with different values because the operation is not atomic. That's why AtomicInteger, AtomicLong, etc. exist. In your case, making ids volatile would let all threads see the current ids reference, but they would still write to the list concurrently, and if you look inside the add method of ArrayList, you will see something like:
elementData[size++] = e;
So nothing guarantees that size is updated atomically, and two threads could write different ids into the same array cell.
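The difference between visibility and atomicity can be observed with a small experiment. This is a sketch with made-up names (CounterDemo, run); the volatile counter's final value varies from run to run, so only the atomic counter's result is predictable:

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {
    static volatile int volatileCounter = 0;
    static final AtomicInteger atomicCounter = new AtomicInteger();

    // Performs the increments on the given number of threads and waits for all of them.
    static void run(int threads, int incrementsPerThread) {
        volatileCounter = 0;
        atomicCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    volatileCounter++;               // separate read and write: updates can be lost
                    atomicCounter.incrementAndGet(); // one atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        run(4, 100_000);
        System.out.println("volatile: " + volatileCounter + ", atomic: " + atomicCounter.get());
    }
}
```

The atomic counter always ends at threads * incrementsPerThread, while the volatile one may end lower because concurrent increments overwrite each other.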
In terms of thread safety, it doesn't matter whether the variable is static or not.
What matters is:
Visibility of shared state between threads.
Safety (preserving class invariants) when an object is used by multiple threads through its methods.
Your code sample is fine from a visibility perspective because ids is static and is initialized during class initialization. However, it's better to mark it as final or volatile, depending on whether the ids reference can change. But safety is violated because ArrayList, by design, doesn't preserve its invariants in a multithreaded environment. So you should use a collection that is designed to be accessed by multiple threads. This topic should help with the choice.

How do I synchronize access to a member of a different class?

I'm trying to figure out how to synchronize read/write access to a synchronized list from a different class.
A small example: I have a synchronized list in one class ListProvider and I access it in a different class Filter. As the name suggests, this class performs some filtering based on a (in)validation check isInvalid.
The filter method first gets the list reference, then collects the entries to remove in a temporary list to not run into concurrent modification issues, and finally removes the entries from the list:
public class Filter {
ListProvider listProvider;
...
public void filter() {
List<String> listProviderList = listProvider.getList();
List<String> entriesToRemove = new ArrayList<>();
// collect
for (String entry : listProviderList) {
if (isInvalid(entry)) {
entriesToRemove.add(entry);
}
}
// remove
for (String entry : entriesToRemove) {
listProviderList.remove(entry);
}
}
}
My question: How can I make sure that no other thread modifies the list while filter does its reading and writing?
If it were Filter's own list, I'd just do:
synchronized(myList) {
// collect
// remove
}
but in this case I'm not sure what to use as a monitor.
but in this case I'm not sure what to use as a monitor.
To create a monitor for a specific task, it is a good pattern to use a private final Object:
private final Object listUpdateLock = new Object();
...
synchronized(listUpdateLock) {
...
}
It's important to make sure that ListProvider is private and that all accesses to the list are done within a synchronized block -- even if only reading from it.
Since you are updating the list in this case, you could instead build a temporary list and replace the original with it when you are done, although I'm not sure you can do that with ListProvider. If you can, you could just make the list field volatile.
Here it seems like you should use a lock. A lock is like synchronized but it's a bit more flexible. It doesn't require a surrounding block and it has some extended features. There are also some different kinds of locks. ReentrantLock is much like synchronized.
public class ListProvider<E> {
private final List<E> theList = new ArrayList<E>();
private final ReentrantLock listLock = new ReentrantLock();
public final List<E> lockList() {
listLock.lock();
return theList;
}
public final void unlockList() {
listLock.unlock();
}
}
/* somewhere else */ {
List<E> theList = listProvider.lockList();
try {
/*
* perform
* multiple
* operations
*
*/
} finally {
// always release the lock, even if one of the operations throws
listProvider.unlockList();
}
}
The main differences between this and synchronized are:
The actual locking mechanism is hidden. This is good for abstraction; however,
Clients must remember to unlock explicitly whereas a synchronized monitor exit is at a block delimiter.
There is a lock called ReentrantReadWriteLock which you might find useful because multiple threads can read simultaneously. ReadWriteLock explains how it works.
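A rough sketch of how ReentrantReadWriteLock could be applied here, assuming you are free to change ListProvider so that it performs the locking itself instead of exposing the list. The class name ReadWriteListProvider and its methods are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Predicate;

class ReadWriteListProvider<E> {
    private final List<E> theList = new ArrayList<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    void add(E element) {
        lock.writeLock().lock();
        try {
            theList.add(element);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // The whole "find invalid entries and remove them" step runs under the
    // write lock, so no other thread can read or modify the list meanwhile.
    boolean removeIf(Predicate<? super E> isInvalid) {
        lock.writeLock().lock();
        try {
            return theList.removeIf(isInvalid);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Several threads may hold the read lock at the same time.
    List<E> snapshot() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(theList);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

A Filter could then call something like listProvider.removeIf(entry -> isInvalid(entry)) and never needs to know which lock is used.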
Do not iterate over the original list; create a copy of it and search the copy for invalid elements. When you are done filtering, you can safely remove the invalid elements from the original list:
public class Filter {
ListProvider listProvider;
...
public void filter() {
List<String> listProviderCopy = new ArrayList<>(listProvider.getList());
List<String> entriesToRemove = new ArrayList<>();
// collect
for (String entry : listProviderCopy) {
if (isInvalid(entry)) {
entriesToRemove.add(entry);
}
}
listProvider.getList().removeAll(entriesToRemove);
}
}
You may want to use a synchronized list:
List<String> list = new ArrayList<>();
List<String> synch = Collections.synchronizedList(list);

List ConcurrentModificationException in servlet

There are plenty of questions regarding ConcurrentModificationException for ArrayList objects, but I have not yet found an answer to my problem.
In my servlet I have an ArrayList as a member object:
List<Object> myList = new ArrayList<Object>(...);
The list must be shared among users and sessions.
In one of the servlet's methods, method1, I need to iterate over the ArrayList items and then clear the list after the iteration. Here is a snippet:
for (Object o : myList) {
// read item o
}
myList.clear();
In another method, method2, I simply add a new Item to the list.
Most of the time the method finishes its job without errors. Sometimes, probably due to concurrent invocation of this method by different users, I get the famous java.util.ConcurrentModificationException.
Should I define my List as:
List myList = Collections.synchronizedList(new ArrayList(...));
Would this be enough or am I missing something? What's behind the scenes? When there is a possible concurrency, is the second thread held in standby by the container?
EDIT: I have added the answers to some comments.
Using a synchronized list will not solve your problem. The core of the problem is that you are iterating over a list and modifying it at the same time. You need to use mutual exclusion mechanisms (synchronized blocks, locks etc) to ensure that they do not happen at the same time. To elaborate, if you start with:
methodA() {
iterate over list {
}
edit list;
}
methodB() {
edit list;
}
If you use a synchronized list, what you essentially get is:
methodA() {
iterate over list {
}
synchronized {
edit list;
}
}
methodB() {
synchronized {
edit list;
}
}
but what you actually want is:
methodA() {
synchronized {
iterate over list {
}
edit list;
}
}
methodB() {
synchronized {
edit list;
}
}
Just using synchronizedList makes all methods thread safe EXCEPT Iterators.
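In Java terms, the "what you actually want" variant could look like the sketch below. It relies on the documented behaviour that a list returned by Collections.synchronizedList must be locked manually while it is traversed; the class SharedItems and the method drain are names made up for the example:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SharedItems {
    private final List<String> myList =
            Collections.synchronizedList(new ArrayList<>());

    void add(String item) {
        // A single call is already synchronized by the wrapper.
        myList.add(item);
    }

    // Iterates and clears as one atomic step. The wrapper locks on itself,
    // so synchronizing on myList also blocks concurrent add() calls.
    List<String> drain() {
        List<String> read = new ArrayList<>();
        synchronized (myList) {
            for (String item : myList) {
                read.add(item);
            }
            myList.clear();
        }
        return read;
    }
}
```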
I would use CopyOnWriteArrayList. It is thread safe and doesn't produce ConcurrentModificationException.
ConcurrentModificationException occurs when you attempt to modify a collection while you're iterating through it. I imagine that the error only gets thrown when you perform some conditional operation.
I'd suggest pushing the values you want to add/remove into a separate list and performing the add/remove after you're done iterating.
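A single-threaded sketch of the difference, with invented names (DeferredRemoval and its two methods). When several threads share the list, the deferred approach still needs the locking discussed in the other answers:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

class DeferredRemoval {
    // Removes even numbers inside the for-each loop. Returns true if the
    // iterator noticed the modification and threw.
    static boolean removeWhileIteratingFails(List<Integer> list) {
        try {
            for (Integer i : list) {
                if (i % 2 == 0) {
                    list.remove(i); // i is an Integer, so this calls remove(Object)
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Collects the even numbers first and removes them after the loop has finished.
    static void removeEvensDeferred(List<Integer> list) {
        List<Integer> toRemove = new ArrayList<>();
        for (Integer i : list) {
            if (i % 2 == 0) {
                toRemove.add(i);
            }
        }
        list.removeAll(toRemove);
    }
}
```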
You need to lock not just over the method accesses but over your use of the list.
So if you allocate a paired Object like:
Object myList_LOCK = new Object();
then you can lock that object whenever you are accessing the List, like this:
synchronized(myList_LOCK) {
//Iterate through list AND modify all within the same lock
}
At the moment the only locking you're doing is within the individual methods of the List, which isn't enough in your case because you need atomicity over the entire sequence of iteration and modification.
You could use the actual object (myList) to lock rather than a paired object but in my experience you are better off using another dedicated object as it avoids unexpected deadlock conditions that can arise as a result of the code internal to the object locking on the object itself.
This is kind of an add-on to Peter Lawery's answer. Since copying wouldn't affect you too negatively, you can combine copying with synchronization.
private final List<Object> myList = new ArrayList<Object>();
public void iterateAndClear(){
List<Object> local = null;
synchronized(myList){
local = new ArrayList<Object>(myList);
myList.clear();
}
for(Object o : local){
//read o
}
}
public void add(Object o){
synchronized(myList){
myList.add(o);
}
}
Here you can iterate over o elements without fear of comodifications (and outside of any type of synchronization), all while myList is safely cleared and added to.

Java concurrency question - synchronizing on a collection

Will the following code snippet of a synchronized ArrayList work in a multi-threaded environment?
class MyList {
private final ArrayList<String> internalList = new ArrayList<String>();
void add(String newValue) {
synchronized (internalList) {
internalList.add(newValue);
}
}
boolean find(String match) {
synchronized (internalList) {
for (String value : internalList) {
if (value.equals(match)) {
return true;
}
}
}
return false;
}
}
I'm concerned that one thread won't be able to see changes made by another thread.
Your code will work and is thread-safe, but it is not concurrent. You may want to consider using ConcurrentLinkedQueue or another concurrent thread-safe data structure, such as ConcurrentHashMap or CopyOnWriteArraySet (suggested by notnoop), and use its contains method.
class MyList {
private final ConcurrentLinkedQueue<String> internalList =
new ConcurrentLinkedQueue<String>();
void add(String newValue) {
internalList.add(newValue);
}
boolean find(String match) {
return internalList.contains(match);
}
}
This should work, because synchronizing on the same object establishes a happens-before relationship, and writes that happen-before reads are guaranteed to be visible.
See the Java Language Specification, section 17.4.5 for details on happens-before.
It will work fine, because all access to the list is synchronized. However, you can use CopyOnWriteArrayList to improve concurrency by avoiding locks (especially if you have many threads executing find).
It will work, but a better solution is to create the List by calling Collections.synchronizedList().
You may want to consider using a Set(Tree or Hash) for your data as you are doing lookups by a key. They have methods that will be much faster than your current find method.
HashSet<String> set = new HashSet<String>();
Boolean result = set.contains(match); // O(1) time
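A plain HashSet is not thread-safe on its own, so in the multi-threaded setting of the question it would still need the same synchronized blocks. One possible alternative, sketched here with a hypothetical class name MySet, is the concurrent set returned by ConcurrentHashMap.newKeySet() (Java 8+):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class MySet {
    // A thread-safe Set backed by a ConcurrentHashMap; no external locking needed
    // for individual add and contains calls.
    private final Set<String> values = ConcurrentHashMap.newKeySet();

    void add(String newValue) {
        values.add(newValue);
    }

    boolean find(String match) {
        return values.contains(match);
    }
}
```

Unlike the original list, a set drops duplicates and does not keep insertion order, so this only fits if lookups are all you need.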
