How to individually synchronize data members - java

I have been searching the web for this, but have been unable to find any article that comes close to this, and I'm quite surprised by that. Maybe the wisdom is hidden somewhere that I have yet to find.
Suppose I have a class with 10 members of various types (for simplicity, say a mix of ints and Strings), each with its own accessor methods. I want to make this class thread-safe, but some of these data members don't interact with each other. For example, the Person class below has age, name, and other properties.
public class Person {
    private volatile int age;
    private String name;
    private volatile long blabla;
    // ... and so on
    public synchronized int getAge() {
        return age;
    }
    public synchronized void setAge(int age) {
        this.age = age;
    }
    // .. and so on for each data member
}
One thread may only need to read/write age, and other threads only need to modify name. Obviously, adding synchronized to each and every one of the accessor methods is a bad idea as it locks the entire instance of the object. A thread that's calling getAge() has to wait for another thread that's calling getName() even though age and name are two separate fields.
So, one obvious solution is to create a lock for each field (or add volatile to primitive types). However, this seems like overkill. If I have 10 data members, do I also need 10 locks? I'm wondering if there's another way of achieving this without excessive locking.

If you are concerned about synchronizing primitive types, this is an excellent use case for AtomicInteger etc... They are very fast and ensure thread-safety. For more info:
http://docs.oracle.com/javase/tutorial/essential/concurrency/atomicvars.html
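As a minimal sketch of that suggestion, assuming age is the only field that needs atomic updates (AtomicPerson and haveBirthday are made-up names, not part of the question):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variant of the question's Person; only the age field is shown.
class AtomicPerson {
    private final AtomicInteger age = new AtomicInteger();

    int getAge() {
        return age.get();
    }

    void setAge(int newAge) {
        age.set(newAge);
    }

    // Read-modify-write as one atomic step; a plain volatile int cannot do this safely.
    int haveBirthday() {
        return age.incrementAndGet();
    }
}
```

No lock is taken here, so a thread reading age never waits for a thread that is working on some other field.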

First off, if you are talking about primitives (or immutable objects like String) then all you should need is to mark each of the fields volatile. Locks won't be necessary if all you are doing is getting and setting field values.
However, if your get/set methods perform multiple operations and synchronized blocks are needed, having a synchronized block per field seems like premature optimization to me. I think synchronized methods on a small object like your Person are a perfectly appropriate way to accomplish this. Unless you have real reasons (e.g. profiler output), I would not try to make it more complicated. Certainly a lock per field is overkill in just about any situation.
It would make a difference if a method takes a long time. Then you would not want to lock the entire object and block the other accessors. That is a good time to have multiple locks, one for each separate calculation. But if your object truly is just trying to protect get/set, then a synchronized method is fine.
Couple of other comments:
If you can get away with just volatile fields then you don't need any synchronized blocks.
If you have synchronized methods then you do not need to make your fields volatile.
The name field should probably be marked final if it is never written to.
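The volatile-only option can be sketched as follows, assuming every accessor does exactly one read or one write and the fields are independent of each other (VolatilePerson is an illustrative name):

```java
// Sketch only: no locks, because each accessor performs a single read or write.
class VolatilePerson {
    private volatile int age;
    // String is immutable, so publishing the reference through a volatile field is enough.
    private volatile String name;

    int getAge() {
        return age;
    }

    void setAge(int age) {
        this.age = age;
    }

    String getName() {
        return name;
    }

    void setName(String name) {
        this.name = name;
    }
}
```

This stops being sufficient as soon as a method combines a read and a write, for example age++, which is where a lock or an atomic class becomes necessary.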

Related

How to make java class thread safe?

I have a java class as below
class User {
    String name;
    String phone;
    public String getName() {
        return name;
    }
    public String getPhone() {
        return phone;
    }
}
The way this class is used is that one object of this User class is created for every thread. Since there is one copy of the object per thread, can I call this class thread-safe?
Do I need to synchronize these methods?
The way you presented it, if each thread has its own copy, then it can be called thread-safe, since at most one thread ever accesses a given instance.
Another thing: if you declare your fields private and hold the instance in a final variable (final User user = new User(...)), then it is effectively immutable. There are no setters, so the object cannot be modified, and the reference cannot be reassigned. If you wanted to keep the immutability, you would have to make setters return a new instance of this object with the changed fields.
@markspace pointed out that a better approach is to declare the fields themselves final, because with the previous approach, if you make User a member of some other class, it won't work (unless that member is final too).
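A rough sketch of the final-fields approach, with a copy-returning method in place of a setter (ImmutableUser and withPhone are illustrative names, not part of the question):

```java
// Hypothetical immutable variant of User: final class, final fields, no setters.
final class ImmutableUser {
    private final String name;
    private final String phone;

    ImmutableUser(String name, String phone) {
        this.name = name;
        this.phone = phone;
    }

    String getName() {
        return name;
    }

    String getPhone() {
        return phone;
    }

    // Leaves this instance untouched and returns a modified copy.
    ImmutableUser withPhone(String newPhone) {
        return new ImmutableUser(name, newPhone);
    }
}
```

Because the fields are final, an instance can be shared between threads without synchronization once its constructor has finished.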
For a class to be thread safe, no matter how many threads are accessing it, its invariants and post-conditions should hold true.
For this class, although there are no write methods, you still need to synchronize the reads. This is because the compiler can cache the state variables (in this case name and phone) for each thread (remember each thread has its own set of registers). Thus, if one thread updates the value of any of the state variables, the other thread may not see it and it may read a stale value.
A very easy way to avoid this would be to make the state variables volatile. It's a weak synchronization primitive though, and does not provide atomic behavior like synchronized does.
Here's the proper way to make this class thread safe:
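import net.jcip.annotations.GuardedBy; // documentation-only annotation; the JCIP library is assumed here
class User {
    @GuardedBy("this") String name;
    @GuardedBy("this") String phone;
    public synchronized String getName() {
        return name;
    }
    public synchronized String getPhone() {
        return phone;
    }
}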
Note: Each state variable can use a different lock, depending upon your requirements. I have assumed that somehow both of these variables participate in an invariant together.
Weak synchronization:
class User {
    volatile String name;
    volatile String phone;
    public String getName() {
        return name;
    }
    public String getPhone() {
        return phone;
    }
}
With synchronized methods, every time a thread enters one, it flushes its cache and reads the latest values from memory, and every time it exits one, it writes the latest values of the variables back to memory.
Stale reads can be even more dangerous with 64-bit double and long, because reads and writes of non-volatile long and double values are not guaranteed to be atomic in Java and may be performed as two 32-bit operations. A reader could then see half of one write and half of another.
Edit: Just saw that each thread will have its own copy of the object. In that case, no synchronization is needed.
A thread-safe class is one in which every access to the state of your POJO (getting or setting values) is performed in a thread-safe way.
This can be achieved with a synchronization mechanism.
The general solution is to use the synchronized keyword on the methods, or on a private object used solely as a lock.
The keyword locks the object, guaranteeing that only one thread at a time can execute the code guarded by that lock.
The best practice, though, is to keep critical sections as small as possible rather than always reaching for synchronized as the quick and easy solution.

Do I have to extend ConcurrentHashMap, or can I have a ConcurrentHashMap field for thread safety?

I am creating a socket-based server-client reservation service and have a question about a class that will be accessed by multiple threads: does it need to extend ConcurrentHashMap, or is a ConcurrentHashMap field enough to make it thread-safe?
I have two ideas, but I am not sure the first one will work. The first is a class that only implements Serializable and has a date field plus a ConcurrentHashMap field on which the threads operate. The second is a class that extends ConcurrentHashMap, so it is just a ConcurrentHashMap with an additional field that makes it distinguishable from the others.
import static java.lang.Boolean.TRUE;
import java.io.Serializable;
import java.time.LocalDate;
import java.time.LocalTime;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;
public class Day implements Serializable {
    private LocalDate date;
    private ConcurrentHashMap<String, Boolean> schedule;
    public Day(LocalDate date){
        this.date = date;
        this.schedule = new ConcurrentHashMap<>();
        // one slot per hour from 10:00 to 17:00, all initially available
        IntStream.range(10, 18).forEachOrdered(
            n -> this.schedule.put(LocalTime.of(n, 0).toString(), TRUE));
    }
    public void changeaval(String key,Boolean status) {
        this.schedule.replace(key,status);
    }
    public boolean aval(String key){
        return this.schedule.get(key);
    }
    public LocalDate getDate(){return this.date;}
    public ConcurrentHashMap<String, Boolean> getSchedule(){return this.schedule;}
}
I just want to have Class/Object which can be accessed by multiple threads and can be distinguishable from others/comparable and has ConcurrentHashMap which maps Int -> Boolean
This is my first time using Stack Overflow and my first project in Java, so I don't know much; sorry if something is not right.
There are basically two things to look out for when dealing with objects accessed by multiple threads:
Race condition - Because of thread scheduling by the operating system and instruction reordering by the compiler, instructions may execute in an order the programmer did not intend, causing bugs.
Memory visibility - In a multiprocessor system, changes made by one processor are not always immediately visible to other processors. Processors keep values in their local registers and caches for performance reasons, so those values may not be visible to threads running on other processors.
Luckily, we can handle both of these situations with proper synchronization.
Let's talk about this particular program.
LocalDate by itself is an immutable and thread-safe class. If we look at the source code of this class, we'd see that all of its fields are final. This means that as soon as the constructor of LocalDate finishes initializing the object, the object itself will be visible across threads. But when it is assigned to a reference variable in a different object, whether the assignment (in other words, the content of the reference variable) is visible to other threads is what we need to look out for.
Given the constructor in your case, we can ensure the visibility of the field date across threads provided date is either final or volatile. Since you are not modifying the date field in your class, you can very well make it final and that ensures safe initialization. If you later decide to have a setter method for this field (depending on your business logic and your design), you should make the field volatile instead of final. volatile creates a happens-before relationship which means that any instruction that is executed in the particular thread before writing to the volatile variable would be immediately visible to the other threads as soon as they read the same volatile variable.
Same goes for ConcurrentHashMap. You should make the field schedule final. Since ConcurrentHashMap by itself has all the necessary synchronizations in it, any value you set against a key would be visible to the other threads when they try to read it.
Note, however, that if you had some mutable objects as ConcurrentHashMap values instead of Boolean, you would have to design it in the same way as mentioned above.
Also, it may be good to know that there is a concept called piggy-backing which means that if one thread writes to all its fields and then writes to a volatile variable, everything written by the thread before writing to the volatile variable would be visible to the other threads, provided the other threads first read value of the volatile variable after it is written by the first thread. But when you do this you have to ensure very carefully the sequence of reading and writing and it is error prone. So, this is done when you want to squeeze out the last drop of performance from the piece of code which is rare. Favor safety, maintainability, readability before performance.
Finally, there is no race condition in the code. The only write that is happening is on the ConcurrentHashMap which is thread safe by itself.
Basically, both approaches are equivalent. From an architectural point of view, holding the map as a field inside a dedicated class is preferred because it gives better control over which methods are accessible to the user. When extending, a user can access many methods of the underlying ConcurrentHashMap and misuse them.
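Putting the advice from these answers together, a composition-based version might look like the sketch below. BookableDay, isAvailable and reserve are hypothetical names, and the atomic reserve method is an addition the question did not ask for:

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch: the map is a private final field and only the needed operations are exposed.
class BookableDay {
    private final LocalDate date;
    private final ConcurrentMap<String, Boolean> schedule = new ConcurrentHashMap<>();

    BookableDay(LocalDate date) {
        this.date = date;
        // one slot per hour from 10:00 to 17:00, all initially available
        for (int hour = 10; hour < 18; hour++) {
            schedule.put(LocalTime.of(hour, 0).toString(), Boolean.TRUE);
        }
    }

    LocalDate getDate() {
        return date;
    }

    boolean isAvailable(String slot) {
        // getOrDefault avoids unboxing null for an unknown slot
        return schedule.getOrDefault(slot, Boolean.FALSE);
    }

    // Atomically flips an available slot to taken.
    // Returns false if the slot was already taken or does not exist.
    boolean reserve(String slot) {
        return schedule.replace(slot, Boolean.TRUE, Boolean.FALSE);
    }
}
```

The three-argument replace is a single atomic compare-and-update, so two clients racing for the same slot cannot both get true, which a separate get followed by a put would not guarantee.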

Is this HashMap usage thread safe?

I have a static HashMap which will cache objects identified by unique integers; it will be accessed from multiple threads. I will have multiple instances of the type HashmapUser running in different threads, each of which will want to use the same HashMap (which is why it's static).
Generally, the HashmapUsers will be retrieving from the HashMap. If it is empty, though, it needs to be populated from a database. Also, in some cases the HashMap will be cleared because the data has changed and it needs to be repopulated.
So I just made all interactions with the map synchronized. But I'm not positive that this is safe, smart, or that it works for a static variable.
Is the below implementation of this thread safe? Any suggestions to simplify or otherwise improve it?
import java.util.HashMap;
public class HashmapUser {
    private static HashMap<Integer, AType> theMap = new HashMap<>();
    public HashmapUser() {
        //....
    }
    public void performTask(boolean needsRefresh, Integer id) {
        //....
        AType x = getAtype(needsRefresh, id);
        //....
    }
    private synchronized AType getAtype(boolean needsRefresh, Integer id) {
        if (needsRefresh) {
            theMap.clear();
        }
        if (theMap.size() == 0) {
            // populate the map
        }
        return theMap.get(id);
    }
}
As it is, it is definitely not thread-safe. Each instance of HashmapUser will use a different lock (this), which does nothing useful. You have to synchronize on the same object, such as the HashMap itself.
Change getAtype to:
private AType getAtype(boolean needsRefresh, Integer id) {
    synchronized(theMap) {
        if (needsRefresh) {
            theMap.clear();
        }
        if (theMap.size() == 0) {
            // populate the map
        }
        return theMap.get(id);
    }
}
Edit:
Note that you can synchronize on any object, provided that all instances use the same object for synchronization. You could synchronize on HashmapUser.class, which also allows other objects to lock access to the map (though it is typically best practice to use a private lock).
Because of this, simply making your getAtype method static would work, since the implied lock would then be HashmapUser.class instead of this. However, this exposes your lock, which may or may not be what you want.
No, this won't work at all.
If you don't specify a lock object, e.g. by declaring the method synchronized, the implicit lock is the instance, unless the method is static, in which case the lock is the class. Since there are multiple instances, there are also multiple locks, which I doubt is what you want.
What you should do is create another class which is the only class with access to the HashMap.
Clients of the HashMap, such as HashmapUser, should not even be aware that there is synchronization in place. Instead, thread safety should be ensured by a dedicated class that wraps the HashMap and hides the synchronization from the clients.
This lets you easily add more clients of the HashMap, since synchronization is hidden from them; otherwise you would have to add some kind of synchronization between the different client types too.
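One way such a wrapper could look is sketched below. RefreshableCache is a made-up name, the type parameter V stands in for the question's AType, and the Supplier stands in for the database query, none of which appear in the question:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical wrapper: the only class that ever touches the HashMap.
final class RefreshableCache<V> {
    private final Object lock = new Object();          // private lock, invisible to clients
    private final Map<Integer, V> map = new HashMap<>();
    private final Supplier<Map<Integer, V>> loader;    // assumed stand-in for the database load

    RefreshableCache(Supplier<Map<Integer, V>> loader) {
        this.loader = loader;
    }

    V get(boolean needsRefresh, Integer id) {
        synchronized (lock) {
            if (needsRefresh) {
                map.clear();
            }
            if (map.isEmpty()) {
                map.putAll(loader.get());              // repopulate while holding the lock
            }
            return map.get(id);
        }
    }
}
```

A client such as HashmapUser would then hold one shared instance, for example in a private static final field, and call get(needsRefresh, id) without any synchronized keyword of its own.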
I would suggest you go with either ConcurrentHashMap or a synchronized map (Collections.synchronizedMap).
More info here: http://crunchify.com/hashmap-vs-concurrenthashmap-vs-synchronizedmap-how-a-hashmap-can-be-synchronized-in-java/
ConcurrentHashMap is more suitable for high-concurrency scenarios. This implementation doesn't synchronize on the whole object, but does so in an optimized way, so different threads accessing different keys can do so simultaneously.
A synchronized map is simpler and synchronizes at the object level, so access to the instance is serialized.
I think you need performance, so you should probably go with ConcurrentHashMap.
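For comparison, here is a small sketch of both options. MapChoices and load are illustrative names, and loading one key at a time is an assumption that differs from the question's populate-everything-when-empty design:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class MapChoices {
    // Every call locks the single wrapper object, so access is serialized.
    static final Map<Integer, String> SYNCED = Collections.synchronizedMap(new HashMap<>());

    // Finer-grained internal locking; threads working on different keys rarely contend.
    static final Map<Integer, String> CONCURRENT = new ConcurrentHashMap<>();

    // Hypothetical stand-in for a database lookup.
    static String load(Integer id) {
        return "value-" + id;
    }

    static String cached(Integer id) {
        // Atomic check-then-load for a single key; no external lock is needed.
        return CONCURRENT.computeIfAbsent(id, MapChoices::load);
    }
}
```

Neither map makes a clear-then-repopulate sequence atomic on its own; if the whole cache must be swapped in one step, an explicit lock around that sequence is still needed.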

Which is more efficient and why?

Of the two synchronization strategies below, which one is better optimized (in terms of processing and generated bytecode), and in which scenarios should each be used?
public synchronized void addName(String name)
{
    lastName = name;
    nameCount++;
    nameList.add(name);
}
or
public void addName(String name) {
    synchronized(this) {
        lastName = name;
        nameCount++;
        nameList.add(name);
    }
}
Also, what is the advisable way to handle concurrency:
using java.util.concurrent package
using the above low level methods
using Job or UIJob API (if working in eclipse PDE environment)
Thanks
which one is optimized (as in processing and generated byte code)
According to this IBM DeveloperWorks Article Section 1, a synchronized method generates less bytecode when compared to a synchronized block. The article explains why.
Snippet from the article:
When the JVM executes a synchronized method, the executing thread identifies that the method's method_info structure has the ACC_SYNCHRONIZED flag set, then it automatically acquires the object's lock, calls the method, and releases the lock. If an exception occurs, the thread automatically releases the lock.
Synchronizing a method block, on the other hand, bypasses the JVM's built-in support for acquiring an object's lock and exception handling and requires that the functionality be explicitly written in byte code. If you read the byte code for a method with a synchronized block, you will see more than a dozen additional operations to manage this functionality. Listing 1 shows calls to generate both a synchronized method and a synchronized block:
Edited to address first comment
To give other SOers credit, here is a good discussion about why one would use a sync. block. I am sure you can find more interesting discussions if you search around :)
Is there an advantage to use a Synchronized Method instead of a Synchronized Block?
I personally have not had to use a sync. block to lock on another object other than this, but that is one use SOers point out about sync. blocks.
Your updated two pieces of code are semantically identical. However, using a synchronized block as in the second piece allows you more control, as you could synchronize on a different object or, indeed, not synchronize parts of the method that don't need to be.
Using java.util.concurrent is very much preferable to using synchronization primitives wherever possible, since it allows you to work at a higher level of abstraction, and use code that was written by very skilled people and tested intensively.
If you're working in Eclipse PDE, using its APIs is most likely preferable, as it ties in with the rest of the platform.
This totally does not matter from any efficiency point of view.
The point of having blocks is you can specify your own lock. You can choose a lock that is encapsulated within the object, as opposed to using this, with the consequence that you have more control over who can acquire the lock (since you can make that lock inaccessible from outside your object).
If you use this as the lock (whether you put synchronized on the method or use the block), anything in your program can acquire the lock on your object, and it's much harder to reason about what your program is doing.
Restricting access to the lock buys you a massive gain in decidability. It's much more beneficial to have that kind of certainty than to shave off a bytecode somewhere.
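A minimal sketch of the encapsulated-lock idea applied to the question's addName, with the class name NameRegistry and the two getters added purely for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class NameRegistry {
    // Private lock: no code outside this class can acquire it.
    private final Object lock = new Object();

    private String lastName;
    private int nameCount;
    private final List<String> nameList = new ArrayList<>();

    void addName(String name) {
        synchronized (lock) {
            lastName = name;
            nameCount++;
            nameList.add(name);
        }
    }

    // Readers take the same lock so they see a consistent, up-to-date value.
    int getNameCount() {
        synchronized (lock) {
            return nameCount;
        }
    }

    String getLastName() {
        synchronized (lock) {
            return lastName;
        }
    }
}
```

Outside code can still synchronize on a NameRegistry instance, but doing so no longer interferes with addName, because the instance's own monitor is not the lock being used.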
I know this might be an example, but if you plan on writing such code - think again.
To me it looks like you are duplicating information, and you should not do that unless you find that you need to make performance changes to your code (which you almost never should).
If you really need this code to run in several threads, I'd make nameList a synchronized list using Collections.synchronizedList.
The last name should be a getter and it could pick the last element in the list.
The nameCount should be the size of the list.
If you do stuff like you have done now, you must also synchronize the access to all of the places where the variables are referenced, and that would make the code a lot less readable and harder to maintain.
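That suggestion could be sketched like this, assuming the list is the single source of truth (DerivedNames is an illustrative name):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class DerivedNames {
    private final List<String> nameList = Collections.synchronizedList(new ArrayList<>());

    void addName(String name) {
        nameList.add(name);   // a single call, already synchronized by the wrapper
    }

    // The count is derived from the list instead of being stored separately.
    int getNameCount() {
        return nameList.size();
    }

    // The last name is derived as well; returns null when the list is empty.
    String getLastName() {
        // size() followed by get() is a compound action, so hold the list's own lock
        synchronized (nameList) {
            return nameList.isEmpty() ? null : nameList.get(nameList.size() - 1);
        }
    }
}
```

With nothing duplicated, there is no second or third variable that could fall out of step with the list.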
You could remove all locking:
import java.util.concurrent.atomic.AtomicReference;
class Names {
    // head of the stack; null while no name has been added
    AtomicReference<Node> names = new AtomicReference<Node>();
    public void addName(final String name) {
        Node old = names.get();
        // retry until no other thread has changed the head in between
        while (!names.compareAndSet(old, new Node(old, name))) {
            old = names.get();
        }
    }
    public String getName() {
        final Node node = names.get();
        return (node == null) ? null : node.name;
    }
    static class Node {
        final Node parent;
        final String name;
        Node(final Node parent, final String name) {
            this.parent = parent;
            this.name = name;
        }
        // number of ancestors of this node; for a non-null head, the stack size is count() + 1
        int count() {
            int count = 0;
            Node p = parent;
            while (p != null) {
                count++;
                p = p.parent;
            }
            return count;
        }
    }
}
This is basically a Treiber stack implementation. You can get the size, the current name, and you can easily implement an Iterator (albeit reverse to the one in your example) over the contents. Alternative copy-on-write containers could be used as well, depending on your needs.
Impossible to say, since the two code snippets aren't equivalent.
The difference (the lack of synchronization of the call to add) may be significant, or it might not be. From what you've given us, it's impossible to say.

Java. How to properly synchronize getters and setters?

If I have several mutable properties in an object that will be acted upon by several threads, I understand they should be synchronized.
class Doggie {
    private String name;
    private int age;
    public void setName(String name) { this.name = name; }
    public String getName() { return this.name; }
    public void setAge(int age) { this.age = age; }
    public int getAge() { return this.age; }
}
Questions:
Are not return and assignment atomic operations in Java?
Since properties might not necessarily be interrelated, it does not always make sense to synchronize with the same lock. How to organize the locking structure?
Is it better to go with the intrinsic lock or a private Object lock pattern?
Are not return and assignment atomic operations in Java?
Yes they are atomic (in some cases at least), but atomicity is not the only issue. Another important issue is whether the action of a write to an attribute by one thread is guaranteed to be visible to a following read for the same attribute made by a different thread.
When the reads and writes are in the same thread, the read is guaranteed to see the earlier write.
When the reads and writes are in different threads, the read is only guaranteed to see the earlier write if the two threads synchronize properly ... or if the attribute is declared as volatile.
Note that primitive locks/mutexes are not the only way to synchronize.
Since properties might not necessarily be interrelated, it does not always make sense to synchronize with the same lock. How to organize the locking structure?
It makes sense to use multiple locks if (and only if) lock contention is likely. In your example, lock contention is only likely to be an issue if some Doggie instance receives a very high rate of get and/or set operations.
Is it better to go with the intrinsic lock or a private Object lock pattern?
It depends. If your application is going to use the Doggie object's primitive lock, then you might get lock contention or even unintended locking out of get and set operations. In that case a private lock might be advisable. Otherwise, a private lock is an unnecessary overhead.
Your example begs for an immutable object. http://java.sun.com/docs/books/tutorial/essential/concurrency/imstrat.html
Operations with references are atomic, but not volatile - you will always see the old value or the new value, but there's no guarantee you'll see the new value without some sort of memory barrier. I can't remember the details of which primitives are guaranteed to be atomic - probably all but long and double.
Personally I'd use a single private lock until I saw any evidence that it was a bottleneck. I would advise against locking on "this" as other code might lock on it too. If you're the only code that knows about the lock, it's harder to get interference. Having said that, if callers want to atomically change more than one property, you may want to expose the lock via a property.
Do you definitely need a threadsafe mutable type? If you could avoid that requirement it would make life simpler.
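A sketch of the single-private-lock approach for Doggie follows. SafeDoggie and update are made-up names, and the combined update method is one alternative to exposing the lock, not something this answer prescribes:

```java
class SafeDoggie {
    // One private lock guards both properties; outside code cannot acquire it.
    private final Object lock = new Object();
    private String name;
    private int age;

    void setName(String name) {
        synchronized (lock) { this.name = name; }
    }

    String getName() {
        synchronized (lock) { return name; }
    }

    void setAge(int age) {
        synchronized (lock) { this.age = age; }
    }

    int getAge() {
        synchronized (lock) { return age; }
    }

    // Changes both properties as one atomic step without exposing the lock.
    void update(String newName, int newAge) {
        synchronized (lock) {
            this.name = newName;
            this.age = newAge;
        }
    }
}
```

If profiling later shows contention on this lock, it can be split per property without touching any caller, since the lock is private.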
They are atomic operations, but picture a scenario where two clients are trying to get and set a piece of data at the same time. There is no guarantee as to which order things are going to be called which could greatly affect the results of your application. (The classic example is money transactions.)
It may or may not make sense to synchronize with the same lock - that really depends on your application. However, it typically is not a good idea to lock the entire object out if it is not necessary.
As with what Jon said, start with a single, private lock and go from there depending on results.
You're right to note that non-interrelated properties can have different locks. Considering that lock objects require trivial memory, I would personally go with a lock per property instead of one for the entire object.
The lightweight way to do this is just to have a boolean that's set while the property is being written to and clear otherwise. The heavyweight way to do this, to support timeouts etc., is with a mutex.
