What happens if I use a non-final ConcurrentHashMap? - java

I read somewhere that even though ConcurrentHashMap is guaranteed to be safe to use from multiple threads, it should be declared final, or even private final. My questions are the following:
1) Will ConcurrentHashMap still be thread-safe if it is not declared final?
2) The same question about the private keyword. It's probably better to ask a more general question: do the public/private keywords affect runtime behavior? I understand their meaning in terms of visibility and usage in internal/external classes, but what do they mean in the context of multithreading at runtime? I believe code like public ConcurrentHashMap may be incorrect only in terms of coding style, not at runtime. Am I right?

It might be helpful to give a more concrete example of what I was talking about in the comments. Let's say I do something like this:
public class CHMHolder {
private static /*non-final*/ CHMHolder instance;
public static CHMHolder getInstance() {
if (instance == null) {
instance = new CHMHolder();
}
return instance;
}
private ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
public ConcurrentHashMap<String, String> getMap() {
return map;
}
}
Now, this is not thread-safe for a whole bunch of reasons! But let's say that threadA sees a null value for instance and thus instantiates the CHMHolder, and then threadB, by a happy coincidence, sees that same CHMHolder instance (which is not guaranteed, since there's no synchronization). You would think that threadB sees a non-null CHMHolder.map, right? It might not, since there's no formal happens-before edge between threadA's map = new ... and threadB's return map.
What this means in practice is that something like CHMHolder.getInstance().getMap().isEmpty() could throw a NullPointerException, which would be confusing — after all, getInstance looks like it should always return a non-null CHMHolder, and CHMHolder looks like it should always have a non-null map. Ah, the joys of multithreading!
If map were marked final, then the JLS bit that user2864740 referenced applies. That means that if threadB sees the same instance that threadA sees (which, again, it might not), then it'll also see the map = new... action that threadA did -- that is, it will see the non-null CHM instance. Once it sees that, CHM's internal thread safety will be enough to ensure safe access.
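To make the role of final concrete, here is a minimal sketch of a holder that does not depend on lucky timing. It is a variant for illustration, not the code from the question: the name SafeCHMHolder is hypothetical, the instance is created eagerly in a static final field instead of lazily, and the map field is final.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical variant of CHMHolder, shown only to illustrate the final-field guarantee.
final class SafeCHMHolder {
    // Assigned during class initialization, before any caller can obtain the instance.
    private static final SafeCHMHolder INSTANCE = new SafeCHMHolder();

    // final: a thread that sees this holder after construction also sees the non-null map.
    private final ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();

    private SafeCHMHolder() {
    }

    static SafeCHMHolder getInstance() {
        return INSTANCE;
    }

    ConcurrentHashMap<String, String> getMap() {
        return map;
    }
}
```

With this shape, SafeCHMHolder.getInstance().getMap().isEmpty() should not be able to throw a NullPointerException, because neither field can be observed in its default null state.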

final and private say nothing about the thread-safety (or lack thereof) of the object named by said variable. (They modify the variable, not the object.)
The variable will be consistent across threads if it is a final field:
An object is considered to be completely initialized when its constructor finishes. A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object's final fields.
The actual ConcurrentHashMap object is "thread safe" only as far as the guarantees it makes. In particular, only individual method calls/operations are guaranteed to be atomic, so compound actions that span several calls may still need additional synchronization, which is easily controlled if the CHM is only accessible from the object that created it.
Using private is normally considered good because it prevents other code from "accidentally" accessing a variable (and thus the object it names) when it should not. However, the private modifier does not establish the same happens-before guarantee as the final modifier and is thus orthogonal to thread-safety.
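The point about single operations can be sketched as follows. The WordCounter class and its method names are hypothetical and exist only for this illustration. It contrasts a get-then-put sequence, which can lose updates, with ConcurrentHashMap.merge, which performs the whole update as one atomic call.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical word counter; the class and method names are illustrative only.
final class WordCounter {
    private final ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();

    // Racy: get and put are each thread-safe, but another thread can update
    // the entry between the two calls, so increments can be lost.
    void incrementRacy(String word) {
        Integer old = counts.get(word);
        counts.put(word, old == null ? 1 : old + 1);
    }

    // Atomic: merge performs the read-modify-write as a single operation.
    void incrementAtomic(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    int count(String word) {
        return counts.getOrDefault(word, 0);
    }

    // Calls incrementAtomic from several threads and returns the final count.
    static int demo(int threads, int perThread) {
        WordCounter counter = new WordCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.incrementAtomic("hi");
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.count("hi");
    }
}
```

Keeping the map private and final, as here, means the only compound operations on it are the ones this class defines, which is what makes them easy to audit.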

Related

Are objects that have no state always visible when published?

Say I have a class
public class Foo {
public void printHi() {
System.out.print("Hi");
}
}
and in some client code I do something like
public static void main() {
Foo foo = new Foo();
(new Thread(() -> {foo.printHi();})).start();
}
and take away the happens-before guarantee that Thread.start() provides.
Is it then possible that the Foo reference might not be visible to the thread using it or, worse, that the Foo reference is visible but the method belonging to that class is not? I am not sure whether a method is stored in an object the way fields are; I am assuming it is just something in memory belonging to that object, so it might have visibility issues. Can someone also explain that part to me?
I ask this because Foo is immutable, and in the JCIP book Goetz says that
"Immutable objects, on the other hand, can be safely accessed even when synchronization is not used to publish the object reference. For this guarantee of initialization safety to hold, all of the requirements for immutability must be met: unmodifiable state, all fields are final, and proper construction" (Goetz, 3.5.2)
However, it doesn't have any final fields, so does it count as if all fields are final? Since no fields = all fields?
There are different ways to get to the same answer.
Your object is immutable, as it bears no state that can be modified.
All of its fields are final, as there is no field that is not final.
There is no possible race condition, as there are no data that could be modified while being accessed.
Even if there were some non-final fields declared in Foo, the invocation of printHi(), which does not read the object’s state, bears no potential data race. Note that this only applies to exact instances of Foo, produced by new Foo(…) expressions, as subclasses could override printHi() to access shared mutable state.
It’s important to emphasize that race conditions are about shared mutable data, not necessarily objects. So if printHi() accesses a shared static variable of a different class, it could produce a data race, even if the Foo instance is immutable and/or properly published. The code invoking foo.printHi() in another thread is safe if printHi() does not access shared mutable state (or accesses it only with proper guarding).
As Elliott Frisch mentioned, a lambda expression behaves like an immutable object anyway, so the code would be safe even without the happens-before relationship of Thread.start() or the immutability of Foo (assuming the Foo instance is not modified afterwards).
foo must be (effectively) final to use here.
Foo foo = null; // <-- for example,
foo = new Foo();
(new Thread(() -> {
foo.printHi(); // <-- compiler error
})).start();
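For contrast, here is a small sketch of a capture that does compile. The Greeter class is a hypothetical stand-in for Foo that returns its text instead of printing it, so the result can be inspected. Note that this sketch relies on the ordinary guarantees of Thread.start() and Thread.join(); it does not model the hypothetical scenario where those guarantees are removed.

```java
// Hypothetical stand-in for Foo that returns its greeting instead of printing it.
final class Greeter {
    String hi() {
        return "Hi";
    }
}

final class LambdaCaptureDemo {
    static String run() {
        // Assigned exactly once, so the variable is effectively final
        // and the lambda below is allowed to capture it.
        Greeter greeter = new Greeter();
        String[] result = new String[1];
        Thread worker = new Thread(() -> result[0] = greeter.hi());
        worker.start();
        try {
            // join() makes the worker's write to result[0] visible to this thread.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }
}
```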

Instance methods and thread-safety of instance variables

I would like to know whether each instance of a class has its own copy of the methods in that class.
Let's say I have the following class MyClass:
public class MyClass {
private String s1;
private String s2;
private String method1(String s1){
...
}
private String method2(String s2){
...
}
}
So if two different users each create an instance of MyClass like:
MyClass instanceOfUser1 = new MyClass();
MyClass instanceOfUser2 = new MyClass();
Does each user then have, in their thread, a copy of the methods of MyClass? If yes, the instance variables are thread-safe as long as only the instance methods manipulate them, right?
I am asking this question because I often read that instance variables are not thread-safe. I cannot see why that should be the case when each user gets their own instance by calling the new operator.
Each object gets its own copy of the class's instance variables - it's static variables that are shared between all instances of a class. The reason that instance variables are not necessarily thread-safe is that they might be simultaneously modified by multiple threads calling unsynchronized instance methods.
class Example {
private int instanceVariable = 0;
public void increment() {
instanceVariable++;
}
}
Now if two different threads call increment on the same instance at the same time, then you've got a data race: instanceVariable might have been incremented by 1 or by 2 by the time both calls return. You could eliminate this data race by adding the synchronized keyword to increment, or by using an AtomicInteger instead of an int, etc., but the point is that just because each object gets its own copy of the class's instance variables does not necessarily mean that the variables are accessed in a thread-safe manner. This depends on the class's methods. (The exception is final immutable variables, which can't be accessed in a thread-unsafe manner, short of something goofy like a serialization hack.)
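The two fixes mentioned above can be sketched like this. SynchronizedCounter, AtomicCounter and CounterDemo are hypothetical names introduced for this illustration; they are not part of the original answer.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical fix 1: guard the read-modify-write with the object's own monitor.
final class SynchronizedCounter {
    private int value = 0;

    synchronized void increment() {
        value++;
    }

    synchronized int get() {
        return value;
    }
}

// Hypothetical fix 2: delegate the atomic update to AtomicInteger.
final class AtomicCounter {
    private final AtomicInteger value = new AtomicInteger();

    void increment() {
        value.incrementAndGet();
    }

    int get() {
        return value.get();
    }
}

final class CounterDemo {
    // Runs the same task on two threads and waits for both to finish.
    static void runOnTwoThreads(Runnable task) {
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Both variants keep the state private, so the only way to touch it is through methods that are themselves thread-safe. Which one to prefer is a design choice: AtomicInteger avoids blocking, while synchronized extends more naturally when several fields must change together.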
Issues with multi-threading arise primarily with static variables and instances of a class being accessed at the same time.
You shouldn't worry about methods in the class but more about the fields (meaning scoped at the class level). If multiple references to an instance of a class exist, different execution paths may attempt to access the instance at the same time, causing unintended consequences such as race conditions.
A class is basically a blueprint for making an instance of an object. When the object is instantiated it receives a spot in memory that is accessed by a reference. If more than one thread has a handle to this reference it can cause occurrences where the instance is accessed simultaneously, this will cause fields to be manipulated by both threads.
'Instance Variables are not thread safe' - this statement depends on the context.
It is true, if for example you are talking about Servlets. It is because, Servlets create only one instance and multiple threads access it. So in that case Instance Variables are not thread safe.
In the above simplified case, if you are creating new instance for each thread, then your instance variables are thread safe.
Hope this answers your question
A method is nothing but a set of instructions. Whichever thread calls the method executes those instructions with its own stack frame. The method may use local variables, which are method- and thread-scoped, or it may use shared resources, like static resources, shared objects or other resources, which are visible across threads.
Each instance has its own set of instance variables. How would you detect whether every instance had a distinct "copy" of the methods? Wouldn't the difference be visible only by examining the state of the instance variables?
In fact, no, there is only one copy of the method, meaning the set of instructions executed when the method is invoked. But, when executing, an instance method can refer to the instance on which it's being invoked with the reserved identifier this. The this identifier refers to the current instance. If you don't qualify an instance variable (or method) with something else, this is implied.
For example,
final class Example {
private boolean flag;
public void setFlag(boolean value) {
this.flag = value;
}
public void setAnotherFlag(Example friend) {
friend.flag = this.flag;
}
}
There's only one copy of the bytes that make up the VM instructions for the setFlag() and setAnotherFlag() methods. But when they are invoked, this is set to the instance upon which the invocation occurred. Because this is implied for an unqualified variable, you could delete all the references to this in the example, and it would still function exactly the same.
However, if a variable is qualified, like friend.flag above, the variables of another instance can be referenced. This is how you can get into trouble in a multi-threaded program. But, as long as an object doesn't "escape" from one thread to be visible to others, there's nothing to worry about.
There are many situations in which an instance may be accessible from multiple classes. For example, if your instance is a static variable in another class, then all threads would share that instance, and you can get into big trouble that way. That's just the first way that pops into my mind...

Constructor synchronization in Java

Someone somewhere told me that Java constructors are synchronized so that it can't be accessed concurrently during construction, and I was wondering: if I have a constructor that stores the object in a map, and another thread retrieves it from that map before its construction is finished, will that thread block until the constructor completes?
Let me demonstrate with some code:
public class Test {
private static final Map<Integer, Test> testsById =
Collections.synchronizedMap(new HashMap<>());
private static final AtomicInteger atomicIdGenerator = new AtomicInteger();
private final int id;
public Test() {
this.id = atomicIdGenerator.getAndIncrement();
testsById.put(this.id, this);
// Some lengthy operation to fully initialize this object
}
public static Test getTestById(int id) {
return testsById.get(id);
}
}
Assume that put/get are the only operations on the map, so I won't get CME's via something like iteration, and try to ignore other obvious flaws here.
What I want to know is if another thread (that's not the one constructing the object, obviously) tries to access the object using getTestById and calling something on it, will it block? In other words:
Test test = getTestById(someId);
test.doSomething(); // Does this line block until the constructor is done?
I'm just trying to clarify how far the constructor synchronization goes in Java and if code like this would be problematic. I've seen code like this recently that did this instead of using a static factory method, and I was wondering just how dangerous (or safe) this is in a multi-threaded system.
Someone somewhere told me that Java constructors are synchronized so that it can't be accessed concurrently during construction
This is certainly not the case. There is no implied synchronization with constructors. Not only can multiple constructors happen at the same time but you can get concurrency issues by, for example, forking a thread inside of a constructor with a reference to the this being constructed.
if I have a constructor that stores the object in a map, and another thread retrieves it from that map before its construction is finished, will that thread block until the constructor completes?
No it won't.
The big problem with constructors in threaded applications is that the compiler has the permission, under the Java memory model, to reorder the operations inside of the constructor so they take place after (of all things) the object reference is created and the constructor finishes. final fields will be guaranteed to be fully initialized by the time the constructor finishes but not other "normal" fields.
In your case, since you are putting your Test into the synchronized map and then continuing to do initialization, as @Tim mentioned, this will allow other threads to get hold of the object in a possibly semi-initialized state. One solution would be to use a static method to create your object:
private Test() {
this.id = atomicIdGenerator.getAndIncrement();
// Some lengthy operation to fully initialize this object
}
public static Test createTest() {
Test test = new Test();
// this put to a synchronized map forces a happens-before of Test constructor
testsById.put(test.id, test);
return test;
}
My example code works because you are dealing with a synchronized map: its put acquires a lock, which ensures that the Test constructor has completed and that its writes have been memory-synchronized before the object becomes reachable through the map.
The big problems in your example are both the "happens before" guarantee (the constructor may not finish before Test is put into the map) and memory synchronization (the constructing thread and the get-ing thread may see different memory for the Test instance). If you move the put outside of the constructor then both are handled by the synchronized map. It doesn't matter which object it synchronizes on; what matters is that the constructor has finished before the object was put into the map and that the memory has been synchronized.
I believe that if you called testsById.put(this.id, this); at the very end of your constructor, you may in practice be okay however this is not good form and at the least would need careful commenting/documentation. This would not solve the problem if the class was subclassed and initialization was done in the subclass after the super(). The static solution I showed is a better pattern.
Someone somewhere told me that Java constructors are synchronized
'Somebody somewhere' is seriously misinformed. Constructors are not synchronized. Proof:
public class A
{
public A() throws InterruptedException
{
wait();
}
public static void main(String[] args) throws Exception
{
A a = new A();
}
}
This code throws java.lang.IllegalMonitorStateException at the wait() call. If there was synchronization in effect, it wouldn't.
It doesn't even make sense. There is no need for them to be synchronized. A constructor can only be invoked after a new(), and by definition each invocation of new() returns a different value. So there is zero possibility of a constructor being invoked by two threads simultaneously with the same value of this. So there is no need for synchronization of constructors.
if I have a constructor that stores the object in a map, and another thread retrieves it from that map before its construction is finished, will that thread block until the constructor completes?
No. Why would it do that? Who's going to block it? Letting 'this' escape from a constructor like that is poor practice: it allows other threads to access an object that is still under construction.
You've been misinformed. What you describe is actually referred to as improper publication and discussed at length in the Java Concurrency In Practice book.
So yes, it will be possible for another thread to obtain a reference to your object and begin trying to use it before it has finished initializing. But wait, it gets worse. Consider this answer: https://stackoverflow.com/a/2624784/122207. Basically, there can be a reordering of reference assignment and constructor completion. In the example referenced, one thread can assign h = new Holder(i) and another thread can call h.assertSanity() on the new instance with timing just right to get two different values for the n member that is assigned in Holder's constructor.
Constructors are just like other methods; there's no additional synchronization (except for handling final fields).
The code would work if this is published later:
public Test()
{
// Some lengthy operation to fully initialize this object
this.id = atomicIdGenerator.getAndIncrement();
testsById.put(this.id, this);
}
Although this question has already been answered, the code pasted in the question doesn't follow safe construction techniques, as it allows the this reference to escape from the constructor. I would like to share a beautiful explanation presented by Brian Goetz in the article "Java theory and practice: Safe construction techniques" on the IBM developerWorks website.
It's unsafe. There is no additional synchronization in the JVM. You can do something like this:
public class Test {
private final Object lock = new Object();
public Test() {
synchronized (lock) {
// your improper object reference publication
// long initialization
}
}
public void doSomething() {
synchronized (lock) {
// do something
}
}
}

Why is this static final variable in a singleton thread-safe?

Reading this site, I've found this:
[The] line private static final Foo INSTANCE = new Foo(); is only executed when the class is actually used, this takes care of the lazy instantiation, and it is guaranteed to be thread safe.
Why is this guaranteed to be thread safe? Because this field is final? Or for some other reason?
Because it's final, yes. Final variables have special thread-safety semantics, in that other threads are guaranteed to see the final field in at least the state it was in when its constructor finished.
This is in JLS 17.5, though the language there is a bit dense. These semantics were introduced in Java 1.5, in particular by JSR-133. See this page for a non-spec discussion of JSR-133 and its various implications.
Note that if you modify the instance after its constructor, that is not necessarily thread safe. In that case, you have to take the usual thread safety precautions to ensure happens-before edges.
I'm fairly sure (though not quite 100%) that the fact that only one thread does the class initialization is not a factor here. It's true that the class is only initialized by one thread, but I don't believe there are any specific happens-before edges established between that thread and any other thread that uses the class (other than that other thread not having to re-initialize the class). So, without the final keyword, another thread would be able to see a partially-constructed instance of the object. The specific happens-before edges the JMM defines are in JLS 17.4.5, and class initialization is not listed there.
Class constructors and static/instance initializers are guaranteed to be atomically executed, and since private static final FOO INSTANCE = new FOO(); is equivalent to
private static final FOO INSTANCE;
static{
INSTANCE = new FOO();
}
this case falls in the above category.
It is guaranteed to be thread safe because the JVM guarantees that static initializers are executed on a single thread.
It doesn't mean that the instance of Foo is internally thread safe- it just means that you are guaranteed that the constructor of Foo will be called exactly once, on one thread, via this particular code path.
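The "exactly once" claim can be checked with a small sketch. All names here (ConstructionLog, LazySingleton, SingletonDemo) are hypothetical. The counter lives in a separate class so that reading it does not itself trigger initialization of the singleton class.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper that records how many times the singleton constructor ran.
final class ConstructionLog {
    static final AtomicInteger COUNT = new AtomicInteger();
}

final class LazySingleton {
    // Runs once, when the class is initialized on first use.
    private static final LazySingleton INSTANCE = new LazySingleton();

    private LazySingleton() {
        ConstructionLog.COUNT.incrementAndGet();
    }

    static LazySingleton getInstance() {
        return INSTANCE;
    }
}

final class SingletonDemo {
    // Starts several threads that all call getInstance(), waits for them,
    // and returns how many times the constructor has run so far.
    static int constructionsAfterRace(int threads) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(LazySingleton::getInstance);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return ConstructionLog.COUNT.get();
    }
}
```

As the answer notes, this only shows that construction happens once. Any mutable state inside the singleton still needs its own protection.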
The static initialisation block of any class is guaranteed to be single threaded. A simpler singleton is to use an enum
enum Singleton {
INSTANCE;
}
This is also thread-safe, and the class is lazily initialised.

Final variable and synchronized block in java

What is a final variable in Java? For example: if I write final int temp; in function what is the meaning of the final keyword?
Also, when would I want to use final variable (both as a class variable and as a function variable)?
Why must variables in a synchronized block be declared final?
Final variables and synchronized code blocks do have something in common... If you declare a non-final variable a and then write synchronized (a) { System.out.println("xxx"); } you will get the warning "Synchronization on non-final field", at least in NetBeans.
Why should you not synchronize on a non-final field? Because if the field's value may change, then different threads may be synchronizing on different objects (different values of the field), so there could be no synchronization at all (every thread may enter the synchronized block at the same time).
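A common way to avoid that warning is a dedicated private final lock object, sketched below. The Inventory class and its members are hypothetical and only illustrate the pattern.

```java
// Hypothetical example; the class and field names are illustrative only.
final class Inventory {
    // final: the field can never point at a different object, so every
    // thread that enters a synchronized (lock) block uses the same monitor.
    private final Object lock = new Object();
    private int stock;

    void add(int amount) {
        synchronized (lock) {
            stock += amount;
        }
    }

    int stock() {
        synchronized (lock) {
            return stock;
        }
    }

    // Adds 1 to the stock perThread times from each of two threads.
    static int demo(int perThread) {
        Inventory inventory = new Inventory();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                inventory.add(1);
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return inventory.stock();
    }
}
```

If lock were non-final and some method reassigned it, two threads could hold two different monitors at once and both be inside the "protected" section, which is the failure the warning is about.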
Look here for example of real-life trouble caused by synchronizing on non-final field: http://forums.sun.com/thread.jspa?threadID=5379204
Basically it just means you can't change the value. For instance variables, you have to assign any final variables once (and only once) in the constructor (or with a variable initializer). Synchronization is a pretty orthogonal concept.
The primary reason for making a local variable final is so you can use it in an anonymous inner class... this has nothing to do with being in a synchronized block.
Final variables are useful for immutable classes, admittedly - and immutability makes life easier in a multi-threaded environment - but that's the only relationship between the two that I can think of...
EDIT: Wildwezyr's comment makes sense in terms of not changing the variable on which you are synchronizing. That would be dangerous, for the reasons he's given. Is that what you meant by "variable in synchronized block"?
In addition to what Jon Skeet said, the value can't be changed but the contents may be changed.
final Integer one = new Integer(1);
...
one = new Integer(2); // compile error
final List list = new ArrayList();
...
list = new ArrayList(); // compile error
list.add(1); // Changes list, but that's fine!
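The difference between a final reference and unchangeable contents can be sketched as follows. The FinalVsImmutable class and its method names are hypothetical. The second method uses Collections.unmodifiableList to show one way of also rejecting changes to the contents.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
final class FinalVsImmutable {
    // A final reference to a mutable list: the reference cannot be
    // reassigned, but the list's contents can still change.
    static List<Integer> mutableContents() {
        final List<Integer> list = new ArrayList<>();
        list.add(1);
        return list;
    }

    // An unmodifiable view rejects mutating calls at runtime.
    // Returns true if add() was refused.
    static boolean addIsRejected() {
        final List<Integer> view = Collections.unmodifiableList(mutableContents());
        try {
            view.add(2);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }
}
```

Note that an unmodifiable view only blocks changes made through the view. Code that still holds the underlying ArrayList can modify it, so the view alone does not make the data immutable.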
Also be aware that final and static final are not the same. final is within the scope of the instance, whereas static final is the same for all instances of a class (in other languages this could be called a constant).
Personally I think the advantage of final, even when not absolutely required to get your software working, is in the semantical meaning. It offers you the possibility to say to the compiler and the next person working on that code that this variable is not meant to be changed, and that trying to change it could result in a bug.
