Perl shared variables atomicity and visibility

I read this in the threads::shared description:
By default, variables are private to each thread, and each newly created thread gets a private copy of each existing variable. This module allows you to share variables across different threads ...
Let's say I have a shared variable like this:
my $var :shared;
$var = 10;
This means the variable exists only once for all the threads I create.
Now about atomicity and visibility:
If thread_A assigns a new value, let's say 11:
$var = 11;
Is it guaranteed that thread_B (and all the other threads I might have created) will see the value 11?
And is the assignment performed atomically?
Or do we, as in Java, first have to acquire a lock, then do the assignment, and then release the lock, with only threads that use the same lock guaranteed to see the updated value?
Or does this behave like volatile primitive variables in Java?

It's always good practice to enforce atomicity in updates. Perl provides lock() for this. You can lock the variable itself: if the variable is shared between threads, then so is its lock state.
If you update $var, the other threads will see the new value.
But you do have a potential race condition, depending on when they access it. If that's a problem, lock; if it's not, carry on.
Bear in mind that operations such as $var++ are not guaranteed to be atomic. (http://perldoc.perl.org/perlthrtut.html#Thread-Pitfalls%3a-Races)

Related

Java - Multithreading - volatile [duplicate]

We use volatile in one of our projects to maintain the same copy of variable accessed by different threads. My question is whether it is alright to use volatile with static. The compiler does not give any errors but I don't understand the reason of using both.

Short of reading the memory model specification, I recommend you read http://jeremymanson.blogspot.com/2008/11/what-volatile-means-in-java.html. It's written by one of the JMM authors and should answer your question. Thinking of memory reads and writes in terms of the happens-before clause is also helpful; the JMM for Java 5 onwards adds happens-before semantics to volatile.
Specifically, when you read a volatile variable from one thread, all writes up to and including the write to that volatile variable from other threads are now visible to that one thread.
And, yes, you can use static with volatile. They do different things.
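To make the static/volatile distinction concrete, here is a minimal sketch. The class, field, and method names are made up for illustration and come from neither the question nor the linked article. static gives one field for the whole class, and volatile makes the main thread's write visible to the worker that polls it.

```java
class StopFlagSketch {
    // static: a single field for the whole class, not one per instance.
    // volatile: a write by one thread is visible to later reads by other threads.
    static volatile boolean stopRequested = false;

    // Starts a worker that spins on the flag, then sets the flag from the
    // calling thread. Returns true if the worker noticed and exited in time.
    static boolean runDemo() {
        stopRequested = false;
        Thread worker = new Thread(() -> {
            while (!stopRequested) {
                Thread.yield(); // keep polling the volatile flag
            }
        });
        worker.start();
        stopRequested = true; // volatile write
        try {
            worker.join(5_000); // wait up to five seconds for the worker
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + runDemo());
    }
}
```

Without volatile on the flag, the JMM would not guarantee that the worker ever observes the write, so the loop might in principle never terminate.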

In Java, volatile has a similar general meaning as it does in C. The Java Memory Model (see the excellent link in ide's answer) allows threads to "see" a different value at the same time for variables marked as non-volatile. For example:
Thread A:
n = 1;
// wait...
n = 2;
Threads B and C:
while (true) {
    System.out.println(name + ": " + n);
}
This output is allowed to happen (note that you're not guaranteed to strictly alternate between B and C, I'm just trying to show the "changeover" of B and C here):
C: 1
B: 1
C: 2
B: 1
C: 2
B: 2
This is entirely separate from the lock taken by println; thread B is allowed to see n as 1 even after C finds out that it's 2. There are a variety of very good reasons for this that I can't pretend to fully understand, many pertaining to speed, and some pertaining to security.
If it's volatile, you're guaranteed (apart from the println's locking, which I'll ignore for the moment) that B and C will both "simultaneously" see the new value of n as soon as it is set.
You can use volatile with static because they affect different things. volatile causes changes to a variable to be "replicated" to all threads that use that variable before they use it, while static shares a single variable across all instances of the class that declares it. (This can be rather confusing to people new to threading in Java, because every Thread happens to be implemented as a class.)

volatile means that the variable may change at runtime (for example, from another thread) and that the compiler should not cache its value for any reason.
This only really matters when the variable is shared among threads: you don't want a thread working with stale data, so the compiler should never cache the value of a volatile variable.

Consider a scenario in which two threads (Thread1 and Thread2) access the same variable 'mObject', which has the value 1.
When Thread1 runs, it doesn't expect other threads to modify 'mObject', so it may cache the value 1.
If Thread2 then changes 'mObject' to 2, Thread1 may still see 1 because it is working from its cached copy.
To avoid this caching, declare the variable as
private volatile int mObject;
In this scenario Thread1 will get the updated value of mObject.

A small elaboration: the volatile keyword isn't just for memory visibility. Before Java 1.5 was released, the volatile keyword declared that the field would get the most recent value of the object by hitting main memory each time for reads and flushing for writes.
In the latest Java versions, the volatile keyword says two very important things:
1. Don't worry about how, but know that when reading a volatile field you will always have the most up-to-date value.
2. A compiler cannot reorder a volatile read/write, so that program order is maintained.

The Java volatile keyword is used to mark a Java variable as "being stored in main memory". Conceptually, every read of a volatile variable behaves as if it came from the computer's main memory rather than from the CPU cache, and every write to a volatile variable behaves as if it went to main memory and not just to the CPU cache. The value is not cached thread-locally.
This avoids the data inconsistency problem, but reading from and writing to main memory is more expensive than accessing the CPU cache. Hence volatile is not recommended unless there is a specific need for it.
class Test
{
    static int var = 5;
}
In the above example, assume that two threads are working on the same class. The threads run on different processors, and each thread may keep its own local copy of var. If one thread modifies the value, the change may not be reflected in main memory right away. This leads to data inconsistency, because the other thread is not aware of the modified value.
class Test
{
    static volatile int var = 5;
}
In the above example, the value of the volatile variable is not cached thread-locally. Reads and writes behave as if they went from and to main memory.

volatile barrier in Hibernate source code would "syncs state with other threads". How?

I was digging inside the source code of hibernate-jpa today and stumbled upon the following code snippet (that you can also find here):
private static class PersistenceProviderResolverPerClassLoader implements PersistenceProviderResolver {
    //FIXME use a ConcurrentHashMap with weak entry
    private final WeakHashMap<ClassLoader, PersistenceProviderResolver> resolvers =
            new WeakHashMap<ClassLoader, PersistenceProviderResolver>();
    private volatile short barrier = 1;

    /**
     * {@inheritDoc}
     */
    public List<PersistenceProvider> getPersistenceProviders() {
        ClassLoader cl = getContextualClassLoader();
        if ( barrier == 1 ) {} //read barrier syncs state with other threads
        PersistenceProviderResolver currentResolver = resolvers.get( cl );
        if ( currentResolver == null ) {
            currentResolver = new CachingPersistenceProviderResolver( cl );
            resolvers.put( cl, currentResolver );
            barrier = 1;
        }
        return currentResolver.getPersistenceProviders();
    }
    // ...
}
That weird statement if ( barrier == 1 ) {} //read barrier syncs state with other threads disturbed me. I took the time to dig into the volatile keyword specification.
To put it simply, in my understanding, it ensures that any READ or WRITE operation on the corresponding variable will always be performed directly in the memory at the place the value is usually stored. It specifically prevents accesses through caches or registers that hold a copy of the value and are not necessarily aware if the value has changed or is being modified by a concurrent thread on another core.
As a consequence, it causes a drop in performance, because every access has to go all the way to memory instead of using the usual (pipelined?) shortcuts. But it also ensures that whenever a thread reads the variable, it will always be up to date.
I provided those details to let you know what my understanding of the keyword is. But now when I re-read the code I am telling myself "OK, so we are slowing the execution by ensuring that a value which is always 1 is always 1 (and setting it to 1). How does that help?"
Can anybody explain this?

You misunderstand volatile.
it ensures that any READ or WRITE operation on the corresponding
variable will always be performed directly in the memory at the place
the value is usually stored. It specifically prevents accesses through
caches or registers that hold a copy of the value and are not
necessarily aware if the value has changed or is being modified by a
concurrent thread on another core.
You are talking about the implementation, and the implementation may differ from JVM to JVM.
volatile is better understood as a specification or rule. It guarantees that
Write to a volatile variable establishes a happens-before relationship
with subsequent reads of that same variable. This means that changes
to a volatile variable are always visible to other threads. What's
more, it also means that when a thread reads a volatile variable, it
sees not just the latest change to the volatile, but also the side
effects of the code that led up the change.
and
Using simple atomic variable access is more efficient than accessing
these variables through synchronized code, but requires more care by
the programmer to avoid memory consistency errors. Whether the extra
effort is worthwhile depends on the size and complexity of the
application.
In this case, volatile is not used to guarantee barrier == 1:
if ( barrier == 1 ) {} //read
PersistenceProviderResolver currentResolver = resolvers.get( cl );
if ( currentResolver == null ) {
    currentResolver = new CachingPersistenceProviderResolver( cl );
    resolvers.put( cl, currentResolver );
    barrier = 1; //write
}
It is used to guarantee that the side effects between the read and the write are visible to other threads.
Without it, if you put something in resolvers in Thread1, Thread2 might not notice it.
With it, if Thread2 reads barrier after Thread1 writes it, Thread2 is guaranteed to see the put.
There are also many other synchronization mechanisms, such as:
synchronized keyword
ReentrantLock
AtomicInteger
....
Usually, they also build this happens-before relationship between different threads.
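As a point of comparison, here is a rough sketch of the same get-or-create lookup built on ConcurrentHashMap, which the FIXME comment in the quoted source hints at. This is not Hibernate's code: the class name, the String keys, and the created counter are assumptions for illustration, and the sketch drops the weak-key behaviour of WeakHashMap.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ResolverCacheSketch {
    // Thread-safe map: no separate volatile barrier field is needed.
    private final ConcurrentHashMap<String, Object> resolvers = new ConcurrentHashMap<>();

    // Counts how many resolver objects were actually created (illustration only).
    final AtomicInteger created = new AtomicInteger();

    // Returns the cached resolver for the key, creating it at most once.
    Object resolverFor(String key) {
        return resolvers.computeIfAbsent(key, k -> {
            created.incrementAndGet();
            return new Object(); // hypothetical stand-in for a real resolver
        });
    }

    public static void main(String[] args) {
        ResolverCacheSketch cache = new ResolverCacheSketch();
        Object first = cache.resolverFor("loader-1");
        Object second = cache.resolverFor("loader-1");
        System.out.println("same instance: " + (first == second));
    }
}
```

computeIfAbsent performs the check and the insertion as one atomic step, and the map itself provides the happens-before edge between the thread that inserted an entry and a thread that later reads it.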

This is done to make updates to the resolvers map visible to other threads by establishing a happens-before relationship (https://www.logicbig.com/tutorials/core-java-tutorial/java-multi-threading/happens-before.html).
Within a single thread, the following instructions have a happens-before relationship:
resolvers.put( cl, currentResolver );
barrier = 1;
But to make the change in resolvers visible to other threads, they need to read the volatile variable barrier, because a write and a subsequent read of the same volatile variable establish a happens-before relationship (which is also transitive). So basically this is the overall result:
1. Update resolvers
2. Write to volatile barrier
3. Read from volatile barrier to make the update from step 1 visible to the thread that reads barrier
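The three steps can be sketched in isolation like this. It is a simplified illustration, not the Hibernate class: the names are hypothetical, the map holds strings, and barrier starts at 0 so the reader can tell when the write has happened (in the original it is always 1).

```java
import java.util.HashMap;
import java.util.Map;

class BarrierPiggybackSketch {
    private final Map<String, String> resolvers = new HashMap<>(); // plain, not thread-safe
    private volatile int barrier = 0;

    // Writer side: ordinary write first, volatile write second.
    void publish(String key, String value) {
        resolvers.put(key, value); // step 1: update resolvers
        barrier = 1;               // step 2: write to volatile barrier
    }

    // Reader side: volatile read first, ordinary read second.
    String awaitAndRead(String key) {
        while (barrier != 1) {     // step 3: read from volatile barrier
            Thread.yield();
        }
        return resolvers.get(key); // the put from step 1 is now visible
    }

    // Runs one writer (the calling thread) and one reader thread.
    static String runDemo() {
        BarrierPiggybackSketch sketch = new BarrierPiggybackSketch();
        String[] seen = new String[1];
        Thread reader = new Thread(() -> seen[0] = sketch.awaitAndRead("cl"));
        reader.start();
        sketch.publish("cl", "resolver");
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Note that this only covers visibility. It does not make concurrent put() and get() calls on the plain HashMap safe, which is why the sketch lets the reader touch the map only after the single write has completed.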

Volatile variables are a lightweight form of synchronization in Java.
Declaring a field volatile has the following effects:
The compiler will not reorder operations on it
The variable will not be cached in registers
Reads and writes of 64-bit values (long and double) will be atomic
It affects the visibility of other variables
A quote from Brian Goetz's Java Concurrency in Practice:
The visibility effects of volatile variables extend beyond the value
of the volatile variable itself. When thread A writes to a volatile
variable and subsequently thread B reads that same variable, the
values of all variables that were visible to A prior to writing to the
volatile variable become visible to B after reading the volatile
variable.
Okay, what is the point of keeping the value 1 instead of declaring resolvers as a volatile WeakHashMap?
This safe publication guarantee applies only to primitive fields and object references. For the purposes of this visibility guarantee, the actual member is the object reference; the objects referred to by volatile object references are beyond the scope of the safe publication guarantee. Consequently, declaring an object reference to be volatile is insufficient to guarantee that changes to the members of the referent are published to other threads. A thread may fail to observe a recent write from another thread to a member field of such an object referent.
Furthermore, when the referent is mutable and lacks thread safety, other threads might see a partially constructed object or an object in an inconsistent state.
The instance of the Map object is mutable because of its put() method.
Interleaved calls to get() and put() may result in the retrieval of internally inconsistent values from the Map object because put() modifies its state. Declaring the object reference volatile is insufficient to eliminate this data race.
Since a volatile variable establishes a happens-before relationship, a thread that has made an update can simply inform the others through their access to barrier.
From a memory visibility perspective, writing a volatile
variable is like exiting a synchronized block and reading a volatile
variable is like entering a synchronized block.

What happens if a volatile variable is written from 2 threads?

Consider this snippet from Java Concurrency in Practice:
import net.jcip.annotations.GuardedBy;
import net.jcip.annotations.ThreadSafe;

@ThreadSafe
public class SynchronizedInteger {
    @GuardedBy("this") private int value;

    public synchronized int getValue() {
        return value;
    }

    public synchronized void setValue(int value) {
        this.value = value;
    }
}
An extract from the same book:
A good way to think about volatile variables is to imagine that they
behave roughly like the SynchronizedInteger class in Listing above,
replacing reads and writes of the volatile variable with calls to get
and set. Yet accessing a volatile variable performs no locking and so
cannot cause the executing thread to block, making volatile variables
a lighter-weight synchronization mechanism than synchronized.
A special case of thread confinement applies to volatile variables. It is safe to perform read-modify-write operations on shared volatile variables as long as you ensure that the volatile variable is only written from a single thread.
So, suppose you make the instance variable in the above class volatile and remove the synchronized keyword, and there are 3 threads:
Thread A and Thread B write to the same volatile variable.
Thread C reads the volatile variable.
Since the volatile variable is now written from 2 threads, why is it unsafe to perform read-modify-write operations on this shared volatile variable?

The keyword volatile is used to ensure that changes to your variable will be seen by other threads.
It does not ensure that non-atomic operations on the variable complete without another thread interfering before the operation is finished.
To enforce that, you need the synchronized keyword.

It's because read-modify-write operations on volatile variables are not atomic. v++ is actually something like:
r1 = v;
r2 = r1 + 1;
v = r2;
So if you have two threads performing this operation once each, it could possibly result in the variable being incremented only once, as they both read the old value. That's an example of why it's not safe.
In your example it would not be safe if you removed synchronized, made the field volatile, and had two threads calling setValue after some conditional logic based on the return value of getValue: the value could have been modified by the other thread in between.
If you want atomic operations look at the java.util.concurrent.atomic package.
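For instance, a counter shared by two threads could be sketched with AtomicInteger as follows. The class and method names are illustrative only; incrementAndGet performs the read, add, and write as one atomic step, so no increment is lost.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterSketch {
    // Two threads each increment the same counter; returns the final value.
    static int countWithTwoThreads(int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger();
        Runnable work = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                counter.incrementAndGet(); // atomic read-modify-write
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(100_000));
    }
}
```

With a volatile int and v++ in place of the AtomicInteger, the final value could come out lower than the total number of increments, for exactly the reason given above.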

If you write to a volatile variable from multiple threads without using any synchronization constructs, you are liable to get data inconsistency errors.
Use volatile variables without synchronization only when there is a single writer thread and multiple reader threads.
volatile makes sure that the variable's value is fetched from main memory instead of a thread's cache. It's safe to use in the case of a single writer and multiple readers.
Use atomic variables, synchronization, or the Lock API to update and read variables from multiple threads.
Refer to related SE question:
What is meant by "thread-safe" code?

If two threads write without first reading the variable, there is no problem; it is safe. The problem arises if a thread first reads, then modifies, and then writes. What if a second thread reads at the same time, sees the same old value as the first thread, and modifies it? When it writes, it simply overwrites the first thread's update. BOOM.
val i = 1
-> Thread 1 reads 1 -> Thread 2 reads 1 -> Thread 1 does 1 * 1.2 = 1.2 -> Thread 2 does 1 * 1.3 = 1.3 -> Thread 1 writes 1.2 back -> Thread 2 coolly overwrites it with 1.3 instead of doing 1.2 * 1.3
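One way to avoid that lost update, sketched here under a few assumptions: the value is held in an AtomicLong, and the two threads multiply by the integer factors 2 and 3 in place of the fractional ones above, so the result is exact. The class name is made up. updateAndGet retries its function if another thread changed the value in between, so both multiplications are always applied.

```java
import java.util.concurrent.atomic.AtomicLong;

class LostUpdateFixSketch {
    // Two threads each multiply the shared value; returns the final value.
    static long multiplyConcurrently() {
        AtomicLong value = new AtomicLong(1);
        // Each update is an atomic read-modify-write with automatic retry.
        Thread t1 = new Thread(() -> value.updateAndGet(v -> v * 2));
        Thread t2 = new Thread(() -> value.updateAndGet(v -> v * 3));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return value.get(); // 1 * 2 * 3, whichever thread ran first
    }

    public static void main(String[] args) {
        System.out.println(multiplyConcurrently());
    }
}
```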

Volatile and more threads

I'm trying to understand the volatile keyword and its proper use. Looking at Brian Goetz's article Java theory and practice: Fixing the Java Memory Model, I'm stuck on this example:
Map configOptions;
char[] configText;
volatile boolean initialized = false;
// In Thread A
configOptions = new HashMap();
configText = readConfigFile(fileName);
processConfigOptions(configText, configOptions);
initialized = true;
// In Thread B
while (!initialized)
    sleep();
// use configOptions
The volatile variable above is used as a "guard" to indicate that a set of shared variables had been initialized.
I understand that since Java 1.5, volatile is strong enough to ensure that when thread B reads the volatile variable, it sees all variables that were visible to thread A at the time thread A wrote to the volatile variable.
But what if there would be a thread C doing something like this:
// In Thread C
configOptions = new HashMap();
// put something to configOptions
My question: is volatile strong enough to ensure that when thread B reads the volatile variable, it sees all variables from all threads? Maybe through some kind of flushing of all caches? If not, then such code with 3 threads is broken, right?

Per the language spec (http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.4):
A write to a volatile variable v (§8.3.1.4) synchronizes-with all subsequent reads of v by any thread (where "subsequent" is defined according to the synchronization order).
and
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
So the volatile variable itself is safe from stale cache problems. Your question is: "what about all other variables?" Well, no. The guarantee covers only what the writing thread wrote before its write to the volatile variable; writes made by other threads (such as thread C) that are not ordered through a volatile write and a subsequent read are unsynchronized.

In this answer I will try to explain what volatile variables in Java are.
So, where to start?
1. Read and write operations on volatile variables are guaranteed to be atomic, even for 64-bit variables. Note: i++; is not atomic, because technically it is three operations (read, increment, write).
2. Writing a value to a volatile variable happens-before that value can be read from it. You can find lots of questions on SO about what happens-before is. Important: in the JVM it is implemented with memory fences, a store fence on writing and a load fence on reading. In practice this means that when you read a value from it, you're guaranteed to see all values written to non-volatile variables before the volatile write.
3. Values written to volatile variables are available to all CPUs and all threads at once, without going stale in any CPU cache.
Now, regarding your question.
Is the volatile strong enough to ensure that when thread B reads the volatile variable, it sees all variables from all threads?
No. It is strong enough to ensure that when thread B reads a value from the volatile variable, it sees the values of variables written before the volatile write.
Maybe some kind of flushing all caches?
Actually yes: on the x86 architecture a volatile write empties the store order buffer and a volatile read empties the load order buffer. If you want more details on that, you may want to read the answers to this question: Java 8 Unsafe: xxxFence() instructions
If not, then such a code with 3 threads is broken, right?
This code works as intended (I guess), because thread B does a volatile read prior to reading configOptions, which guarantees its visibility.

Is the volatile strong enough to ensure that when thread B reads the
volatile variable, it sees all variables from all threads. Maybe some
kind of flushing all caches?
All variables from the volatile-writing-thread that are written prior to the volatile store will be visible.
So there is no 'flush all caches' magic.
If not, then such a code with 3 threads is broken, right?
It could very well be broken with two threads if you do not synchronize correctly. There is a reason the initialized flag is written to. That effectively flushes all the writes that occurred on that thread.
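If several threads really do need to replace the configuration, one common alternative (not from the article, and only a sketch with made-up names) is to let each writer build a complete immutable map and publish it through a single volatile reference. Every writer then performs its own volatile write, and a reader always sees one fully built map or the other.

```java
import java.util.Collections;
import java.util.Map;

class ConfigSnapshotSketch {
    // The only shared mutable state: a volatile reference to an immutable map.
    private static volatile Map<String, String> configOptions = Collections.emptyMap();

    // Builds a complete map first, then publishes it with one volatile write.
    static void publish(String source) {
        configOptions = Collections.singletonMap("source", source);
    }

    // Volatile read: returns whichever fully built map was published last.
    static Map<String, String> current() {
        return configOptions;
    }

    // Threads A and C both publish; the caller then reads the result.
    static Map<String, String> runDemo() {
        Thread a = new Thread(() -> publish("A"));
        Thread c = new Thread(() -> publish("C"));
        a.start();
        c.start();
        try {
            a.join();
            c.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return current();
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Which of the two maps wins is still a race; what this design removes is the possibility of a reader seeing a half-populated or stale map.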
