Is initialization of objects thread safe in Java?

I've written similar code to this in one of my applications but I'm not sure whether it is thread safe.
public class MyClass {
    private MyObject myObject = new MyObject();

    public void setObject(MyObject o) {
        myObject = o;
    }

    public MyObject getObject() {
        return myObject;
    }
}
The setObject() and getObject() methods will be called by different threads. The getObject() method is going to be called by a thread that keeps drawing on a Canvas. For optimum FPS and smooth motion, I don't want that thread to be kept waiting for a synchronization lock. Hence, I want to avoid using synchronization unless it is really necessary. So is it really necessary here? Or is there any other better way to solve this problem?
And by the way, it doesn't matter if the thread receives an older copy of the object.

As for the status of your current version, it is definitely not thread-safe, because concurrent access to myObject creates a data race.
You didn't specify this, but if MyObject is not thread-safe itself, then your program will not be thread-safe, regardless of what you do to the code you have shown.
it doesn't matter if the thread receives an older copy of the object.
The Java Memory Model allows much worse things than that to happen to objects accessed via a data race:
a thread may always receive the same object (the first one it happened to read);
a thread may observe the object with only some of the reachable values initialized (a torn object).
For optimum FPS and smooth motion, I don't want that thread to be kept waiting for a synchronization lock.
Have you spent any effort to actually measure how much time your threads are waiting for the lock? My guess: you didn't because that time is so short as to be undetectable.
However, your case doesn't even call for locks: just making your instance variable volatile will be enough to guarantee safe sharing of the object between threads.
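A minimal sketch of the volatile approach, assuming MyObject is immutable (or otherwise thread-safe). The MyObject body below, with its frame field, is a hypothetical stand-in, since the question does not show that class.

```java
// Hypothetical stand-in for the question's MyObject; immutable, so safe to share.
final class MyObject {
    private final int frame;

    MyObject(int frame) {
        this.frame = frame;
    }

    int frame() {
        return frame;
    }
}

class MyClass {
    // volatile: a write by one thread is visible to subsequent reads by other
    // threads, and everything written before the assignment is visible too.
    private volatile MyObject myObject = new MyObject(0);

    public void setObject(MyObject o) {
        myObject = o; // no lock taken
    }

    public MyObject getObject() {
        return myObject; // no lock taken, so the drawing thread never blocks
    }
}
```

The drawing thread can call getObject() on every frame without waiting on a monitor; it simply picks up whichever reference was most recently published.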

No it's not thread safe - this could happen:
a thread may not see the latest version of myObject when calling getObject (which you can live with apparently)
a thread may never see any updates made to myObject by other threads when calling getObject
a thread may see an updated reference to a MyObject that is in an inconsistent state (partially constructed, for example)
The easiest way to solve these issues is to mark myObject volatile.

You actually have a number of complications here:
myObject needs to be volatile. (Otherwise other threads may NEVER see changes).
The initial value of myObject will be fully constructed before MyClass is accessed, so that is safe in this case. In general, however, you need to be careful about combining object construction and multithreading.

Yes, it is thread-safe under the said conditions, so long as you mark myObject as volatile. You will always get a correct MyObject instance from getObject().

You should make the shared variable volatile, to let a thread know that other threads/processes/etc may change its value.
Besides that, there's no concurrency issue in your code.
The second line, which creates an instance of MyObject when an instance of MyClass is created, is perfectly fine. No one will have access to the shared variable until the instance of MyObject is fully constructed (unless you leak the shared variable from within the constructor).
The setObject method is also fine: all it does is assign an object to the shared variable myObject. And since reference assignments are atomic in Java, there's nothing to worry about.

Related

What is the use of ThreadLocal?

What is the use of ThreadLocal when a thread normally works on a variable by keeping it in its local cache?
That would mean thread1 does not know the value of the same variable in thread2 even if no ThreadLocal is used.
With multiple threads, although you have to do work to make sure you read the "most recent" value of a variable, you expect there to be effectively one variable per instance (assuming we're talking about instance fields here). You might read an out of date value unless you're careful, but basically you've got one variable.
With ThreadLocal, you're explicitly wanting to have one value per thread that reads the variable. That's typically for the sake of context. For example, a web server with some authentication layer might set a thread-local variable early in request handling so that any code within the execution of that request can access the authentication details, without needing any explicit reference to a context object. So long as all the handling is done on the one thread, and that's the only thing that thread does, you're fine.
A thread doesn't have to keep variables in its local cache -- it's just that it's allowed to, unless you tell it otherwise.
So:
If you want to force a thread to share its state with other threads, you have to use synchronization of some sort (including synchronized blocks, volatile variables, etc).
If you want to prevent a thread from sharing its state with other threads, you have to use ThreadLocal (assuming the object that holds the variable is known to multiple threads -- if it's not, then everything is thread-local anyway!).
It's kind of a global variable for the thread itself, so that any code running in the thread can access it directly. (A "really" global variable can be accessed by any code running in the "process"; we could call it ProcessLocal:)
Are global variables bad? Maybe; they should be avoided if we can. But sometimes we have no choice because we cannot pass the object through method parameters, and ThreadLocal proves to be useful in many designs without causing too much trouble.
ThreadLocal is useful when an object is not thread-safe but you want to avoid synchronizing access to it: each thread stores its own instance in its own thread-local storage. By default, data is shared between threads.
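The "one value per thread" behaviour described in these answers can be sketched as follows. This is only an illustration: ThreadLocalDemo, PER_THREAD, incrementAndGet and runOnNewThread are made-up names, and a plain counter stands in for whatever per-thread context a real application would keep.

```java
class ThreadLocalDemo {
    // Each thread that touches this variable gets its own Integer, lazily initialised to 0.
    private static final ThreadLocal<Integer> PER_THREAD = ThreadLocal.withInitial(() -> 0);

    // Increments the calling thread's private counter and returns the new value.
    static int incrementAndGet() {
        int next = PER_THREAD.get() + 1;
        PER_THREAD.set(next);
        return next;
    }

    // Calls incrementAndGet() `times` times on a brand-new thread and
    // returns the last value that thread observed.
    static int runOnNewThread(int times) {
        int[] result = new int[1];
        Thread t = new Thread(() -> {
            for (int i = 0; i < times; i++) {
                result[0] = incrementAndGet();
            }
        });
        t.start();
        try {
            t.join(); // join() makes the worker's write to result[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }
}
```

Every new thread starts counting from zero again, even though all of them go through the same static field, which is the difference from an ordinary shared variable.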

Synchronization, When to or not to use?

I have started learning concurrency and threads in Java. I know the basics of synchronized (i.e. what it does). Conceptually I understand that it provides mutually exclusive access to a shared resource with multiple threads in Java. But when faced with an example like the one below I am confused about whether it is a good idea to have it synchronized. I know that critical sections of the code should be synchronized and that this keyword should not be overused or it affects performance.
public static synchronized List<AClass> sortA(AClass[] aArray)
{
    List<AClass> aObj = getList(aArray);
    Collections.sort(aObj, new AComparator());
    return aObj;
}

public static synchronized List<AClass> getList(AClass[] anArray)
{
    // It converts an array to a list and returns it
}
Assuming each thread passes a different array then no synchronization is needed, because the rest of the variables are local.
If instead you fire off a few threads all calling sortA and passing a reference to the same array, you'd be in trouble without synchronized, because they would interfere with each other.
Beware: the example makes it look as if the getList method returns a new List from an array, so that even if the threads pass the same array, you get different List objects. This can be misleading. For example, Arrays.asList creates a List backed by the given array, and the javadoc clearly states that changes to the returned list "write through" to the array, so be careful about this.
Synchronization is usually needed when you are sharing data between multiple invocations and there is a possibility that the data would be modified, resulting in inconsistency. If the data is read-only then you don't need to synchronize.
In the code snippet above, there is no data that is being shared. The methods work on the input provided and return the output. If multiple threads invoke one of your methods, each invocation will have its own input and output. Hence, there is no chance of inconsistency anywhere. So, your methods in the above snippet need not be synchronized.
Synchronization, if used unnecessarily, would surely degrade performance due to the overhead involved, and hence should be used cautiously and only when required.
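A small sketch of the write-through behaviour mentioned above. AsListDemo, viewOf and copyOf are hypothetical names chosen for illustration; the question does not show how getList is actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AsListDemo {
    // Backed by the caller's array: sorting this list reorders the array itself,
    // so two threads passing the same array would still share mutable state.
    static List<Integer> viewOf(Integer[] array) {
        return Arrays.asList(array);
    }

    // Independent copy: sorting this list leaves the caller's array untouched.
    static List<Integer> copyOf(Integer[] array) {
        return new ArrayList<>(Arrays.asList(array));
    }
}
```

If getList behaves like copyOf, each invocation of sortA works on its own list; if it behaves like viewOf, threads that pass the same array are sorting shared data.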
Your static methods don't depend on any shared state, so need not be synchronized.
There is no fixed rule for when to use synchronized and when not to. When you are sure that your code will not be accessed by concurrent threads, you can avoid using synchronized.
Synchronization, as you have correctly figured, has an impact on the throughput of your application, and can also lead to thread starvation.
In general, getters should be non-blocking, as they are in the collections of the java.util.concurrent package.
In your example every calling thread will pass its own copy of the array, so getList doesn't need to be synchronized, and neither does the sortA method, as all other variables are local.
Local variables live on stack and every thread has its own stack so other threads cannot interfere with it.
You need synchronization when you change the state of an object that other threads should see in a consistent state. If your calls don't change the state of the object, you don't need synchronization.
I wouldn't use synchronized on single-threaded code, i.e. where there is no chance an object will be accessed by multiple threads.
This may appear obvious, but ~99% of the StringBuffer instances used in the JDK can only ever be used by one thread and could be replaced with a StringBuilder (which is not synchronized).

How many threads can simultaneously invoke an unsynchronized method of an object?

So let's say I have a class X with a method m. Method m is NOT synchronized and it doesn't need to be since it doesn't really change the state of the object x of type X.
In some threads I call the method like this: x.m(). All these threads use the same object x.
How many threads can call this method (method m) on object x simultaneously?
Could the fact that the method is called by, let's say, 100 threads become a bottleneck for my application?
thanks.
Others have answered your direct question.
I'd like to clear up something that could be a misconception on your part ... and if it is, it is a dangerous one.
Method m is NOT synchronized and it doesn't need to be since it doesn't really change the state of the object x of type X.
That is not a sufficient condition. Methods that don't change state typically need to be synchronized too.
Suppose that you have a class Test with a simple getter and setter:
public class Test {
    private int foo;

    public int getFoo() {
        return foo;
    }

    public synchronized void setFoo(int foo) {
        this.foo = foo;
    }
}
Is the getter thread-safe?
According to your rule, yes.
In reality, no.
Why? Because unless the threads that call getFoo and setFoo synchronize properly, a call to getFoo() after a call to setFoo(...) may see a stale value for foo.
This is one of those nasty cases where you will get away with it nearly all of the time. But very occasionally, the timing of the two calls will be such that the bug bites you. This kind of bug is likely to slip through the cracks of your testing, and be very difficult to reproduce when it occurs in production.
The only case where it is absolutely safe to access an object's state from multiple threads without synchronizing is when the state is declared as final, AND the constructor doesn't publish the object.
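Two possible repairs for the getter/setter example, as a sketch only. The class names VisibleFoo and VolatileFoo are invented here so that they do not clash with the Test class in the answer.

```java
// Option 1: synchronize the getter on the same monitor as the setter.
// The unlock in setFoo happens-before a later lock in getFoo, so the
// reader is guaranteed to see the value that was written.
class VisibleFoo {
    private int foo;

    public synchronized int getFoo() {
        return foo;
    }

    public synchronized void setFoo(int foo) {
        this.foo = foo;
    }
}

// Option 2: for a single independent field, volatile gives the same
// visibility guarantee without taking a lock.
class VolatileFoo {
    private volatile int foo;

    public int getFoo() {
        return foo;
    }

    public void setFoo(int foo) {
        this.foo = foo;
    }
}
```

Which option fits depends on the class: volatile is enough when the field is read and written on its own, while synchronized is the safer choice once several fields must change together.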
If you have more threads in the runnable state than you have physical cores, you'll end up wasting time by context switching... but that's about it. The fact that those threads are executing the same method is irrelevant if there's no coordination between them.
Remember the difference between threads and instances: one is executing, the other is data. If the data is not under some locking mechanism or some resource constraint, then access is limited only by the number of threads that the underlying infrastructure can run. This is a system (JVM implementation + OS + machine) limitation.
Yep, an unsynchronized method doesn't "care" how many threads are invoking it. It's a purely passive entity and nothing special occurs when a new thread enters it.
Perhaps one thing that confuses some people is the "auto" storage used by a method. This storage is allocated on the thread's stack, and does not require the active participation of the method. The method's code is simply given a pointer to the storage.
(Many, many moons ago, it wasn't thus. Either the "auto" storage was allocated from heap when the method was called, or the method maintained a list of "auto" storage areas. But that paradigm disappeared maybe 40 years ago, and I doubt that there is any system in existence that still uses it. And I'm certain that no JVM uses the scheme.)
You'd have a bottleneck if one thread acquired a resource that others needed and held onto it for a long-running operation. If that isn't the situation for your method, I don't see how you'll experience a bottleneck.
Is this a theoretical question, or are you observing behavior in a real application that's running more slowly than you think it should?
The best answer of all is to get some data and see. Run a test and monitor it. Be a scientist.

Why is java.lang.ThreadLocal a map on Thread instead on the ThreadLocal?

Naively, I expected a ThreadLocal to be some kind of WeakHashMap of Thread to the value type. So I was a little puzzled when I learned that the values of a ThreadLocal are actually saved in a map in the Thread. Why was it done that way? I would expect that the resource leaks associated with ThreadLocal would not be there if the values were saved in the ThreadLocal itself.
Clarification: I was thinking of something like
public class AlternativeThreadLocal<T> {
    private final Map<Thread, T> values =
        Collections.synchronizedMap(new WeakHashMap<Thread, T>());

    public void set(T value) { values.put(Thread.currentThread(), value); }

    public T get() { return values.get(Thread.currentThread()); }
}
As far as I can see this would prevent the weird problem that neither the ThreadLocal nor its leftover values could ever be garbage-collected until the Thread dies if the value somehow strongly references the ThreadLocal itself.
(Probably the most devious form of this occurs when the ThreadLocal is a static variable on a class the value references. Now you have a big resource leak on redeployments in application servers since neither the objects nor their classes can be collected.)
Sometimes you get enlightened by just asking a question. :-) Now I just saw one possible answer: thread-safety. If the map with the values is in the Thread object, the insertion of a new value is trivially thread-safe. If the map is on the ThreadLocal you have the usual concurrency issues, which could slow things down. (Of course you would use a ReadWriteLock instead of synchronized, but the problem remains.)
You seem to be misunderstanding the problem of ThreadLocal leaks. ThreadLocal leaks occur when the same thread is used repeatedly, such as in a thread pool, and the ThreadLocal state is not cleared between usages. They're not a consequence of the ThreadLocal remaining when the Thread is destroyed, because nothing references the ThreadLocal Map aside from the thread itself.
Having a map of weakly referenced Threads to thread-local objects would not prevent the ThreadLocal leak problem, because the thread still exists in the thread pool, so the thread-local objects are not eligible for collection when the thread is reused from the pool. You'd still need to manually clear the ThreadLocal to avoid the leak.
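The pooled-thread scenario described here can be reproduced with a small sketch. PooledThreadLocalDemo, CONTEXT and secondTaskSees are hypothetical names, and the "request-1" string stands in for whatever state a real task would leave behind.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledThreadLocalDemo {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Runs two tasks back to back on one pooled thread and returns what the
    // second task finds in CONTEXT. With cleanUp == false the first task's
    // value survives on the reused thread.
    static String secondTaskSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable first = () -> {
                CONTEXT.set("request-1");
                try {
                    // ... handle the request ...
                } finally {
                    if (cleanUp) {
                        CONTEXT.remove(); // manual clearing, as the answer recommends
                    }
                }
            };
            pool.submit(first).get(); // wait until the first task has finished

            Callable<String> second = CONTEXT::get;
            return pool.submit(second).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling remove() in a finally block is a common way to make sure the value is cleared even when the task throws.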
As you said in your answer, concurrency control is simplified with the ThreadLocal Map being a single instance per thread. It also makes it impossible for one thread to access another's thread local objects, which might not be the case if the ThreadLocal object exposed an API on the Map you suggest.
I remember some years ago Sun changed the implementation of thread locals to its current form. I don't remember what version it was and what the old impl was like.
Anyway, for a variable that each thread should have a slot for, Thread is the natural container of choice. If we could, we would also add our thread local variable directly as a member of Thread class.
Why would the Map be on ThreadLocal? That doesn't make a lot of sense. So it'd be a Map of ThreadLocals to objects inside a ThreadLocal?
The simple reason it's a Map of Threads to Objects is because:
It's an implementation detail, i.e. that Map isn't exposed in any way;
It's always easy to figure out the current thread (with Thread.currentThread()).
Also the idea is that a ThreadLocal can store a different value for each Thread that uses it so it makes sense that it is based on Thread, doesn't it?

Threadsafe publishing of java object structure?

Assuming that I have the following code:
final Catalog catalog = createCatalog();
for (int i = 0; i < 100; i++) {
    new Thread(new CatalogWorker(catalog)).start();
}
"Catalog" is an object structure, and neither the method createCatalog() nor the "Catalog" object structure has been written with concurrency in mind. There are several non-final, non-volatile references within the product catalog, and there may even be mutable state (but that's going to have to be handled).
The way I understand the memory model, this code is not thread-safe. Is there any simple way to make it safe ? (The generalized version of this problem is really about single-threaded construction of shared structures that are created before the threads explode into action)
No, there's no simple way to make it safe. Concurrent use of mutable data types is always tricky. In some situations, making each operation on Catalog synchronized (preferably on a privately-held lock) may work, but usually you'll find that a thread actually wants to perform multiple operations without risking any other threads messing around with things.
Just synchronizing every access to variables should be enough to make the Java memory model problem less relevant - you would always see the most recent values, for example - but the bigger problem itself is still significant.
Any immutable state in Catalog should be fine already: there's a "happens-before" between the construction of the Catalog and the new thread being started. From section 17.4.5 of the spec:
A call to start() on a thread happens-before any actions in the started thread.
(And the construction finishing happens before the call to start(), so the construction happens before any actions in the started thread.)
You need to synchronize every method that changes the state of Catalog to make it thread-safe.
public synchronized <return type> method(<parameter list>){
...
}
Assuming you handle the "non-final, non-volatile references [and] mutable state" (presumably by not actually mutating anything while these threads are running) then I believe this is thread-safe. From the JSR-133 FAQ:
When one action happens before another, the first is guaranteed to be ordered before and visible to the second. The rules of this ordering are as follows:
Each action in a thread happens before every action in that thread that comes later in the program's order.
An unlock on a monitor happens before every subsequent lock on that same monitor.
A write to a volatile field happens before every subsequent read of that same volatile.
A call to start() on a thread happens before any actions in the started thread.
All actions in a thread happen before any other thread successfully returns from a join() on that thread.
Since the threads are started after the call to createCatalog, the result of createCatalog should be visible to those threads without any problems. It's only changes to the Catalog objects that occur after start() is called on the thread that would cause trouble.
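A minimal sketch of the start() rule, assuming the catalog is fully built before any worker starts and is only read afterwards. StartVisibilityDemo and its nested Catalog class are stand-ins invented for this example, not the asker's real types.

```java
import java.util.ArrayList;
import java.util.List;

class StartVisibilityDemo {
    // Deliberately plain: no final fields, no volatile, no locks.
    static class Catalog {
        List<String> products = new ArrayList<>();
    }

    // Builds a catalog on the calling thread, then lets each worker read it.
    // Returns the product count each worker observed.
    static int[] countSeenByWorkers(int workers) {
        Catalog catalog = new Catalog();
        catalog.products.add("widget");
        catalog.products.add("gadget"); // all writes complete before any start()

        int[] seen = new int[workers];
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int id = i;
            // Workers only read; start() happens-before everything they do.
            threads[i] = new Thread(() -> seen[id] = catalog.products.size());
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // join() makes each worker's write to seen[] visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }
}
```

The guarantee covers only what was written before start(); any mutation of the catalog after the workers are running would need its own synchronization.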
