In a previous question (my question here) I received a good solution (which works) to resolve my issue, but I haven't understood how exactly it works.
I have many threads that can enter this synchronized block concurrently, and according to the Java docs this code is:
synchronized(...){
//atomic for the operation inside
}
So I'm asking: why is this operation atomic:
for (int j = 0; j < column; j++) {
matrix[row][j] += 1;
}
and not this one:
System.out.println("begin print");
for (int i = 0; i < this.row; i++) {
System.out.println();
for (int j = 0; j < column; j++)
System.out.print(matrix[i][j]);
}
System.out.println();
System.out.println("end print");
My full function is this:
public void increaseRow(Integer row) {
synchronized (rows.get(row)) {
for (int j = 0; j < column; j++) {
matrix[row][j] += 1;
}
System.out.println("begin print");
for (int i = 0; i < this.row; i++) {
System.out.println();
for (int j = 0; j < column; j++)
System.out.print(matrix[i][j]);
}
System.out.println();
System.out.println("end print");
}
}
Could someone provide a useful explanation? I'd appreciate it a lot.
As stated in the comments, System.out.println is not a thread-safe operation.
The problem is the way you lock your critical section.
synchronized (rows.get(row)) { }
This code means that you are locking on a specific row, not the whole table, so if you have N rows, N locks exist at the same time, and therefore N threads can run simultaneously, writing to System.out in parallel.
Locking on a row gives you better parallelism: a thread working on row 2 can work at the same time as a thread working on row 3.
Another option is to have a single lock for the whole table:
Object lock = new Object();
...
public void someMethod(){
synchronized(lock){...}
}
In this case there is only one lock, and only one thread executing inside it at a time, so you are effectively calling System.out synchronously from your code.
Locking on the table decreases parallelism, since you decrease the number of available locks: a thread working on row 2 would need to wait for a thread working on row 3 to release the lock.
The thread safety that synchronized guarantees applies only to the code written in the block, not to externally called functions; it does not make System.out.println an atomic operation.
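One way to keep the parallel per-row updates while still serializing the output is to guard the print section with a single shared lock. A minimal sketch, assuming a new printLock field that is not in the original code:
private final Object printLock = new Object(); // hypothetical: one lock shared by all rows

public void increaseRow(Integer row) {
    synchronized (rows.get(row)) { // per-row lock: updates to different rows stay parallel
        for (int j = 0; j < column; j++) {
            matrix[row][j] += 1;
        }
    }
    synchronized (printLock) { // single lock: only one thread prints at a time
        System.out.println("begin print");
        for (int i = 0; i < this.row; i++) {
            System.out.println();
            for (int j = 0; j < column; j++)
                System.out.print(matrix[i][j]);
        }
        System.out.println();
        System.out.println("end print");
    }
}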
Why don't you use your class object: synchronized(this),
or, even broader: synchronized(YourClassName.class),
or some other lock?
Object lock = new Object();
...
public void someMethod(){
synchronized(lock){...}
}
Every Java object created, including every Class loaded, has an associated lock or monitor. Putting code inside a synchronized block makes the compiler append instructions to acquire the lock on the specified object before executing the code, and release it afterwards (either because the code finishes normally or abnormally). Between acquiring the lock and releasing it, a thread is said to "own" the lock. At the point of Thread A wanting to acquire the lock, if Thread B already owns it, then Thread A must wait for Thread B to release it.
(http://www.javamex.com/tutorials/synchronization_concurrency_synchronized1.shtml)
But if your lock changes while a thread is using it in a synchronized block, it can occur that another thread enters the synchronized block through this changed lock.
Example:
Object lock = new Object();
int value = 0;
public void increment(){
synchronized(lock){value++;}
}
public void printValue(){
synchronized(lock){System.out.println(value);}
}
timeline:
thread1:
calling printValue() // taking the lock
thread2:
lock = new Object(); // the lock changes; it's another object now
calling increment() // taking this new lock; the old lock is still held by thread1
value is incremented.
thread1:
printing the wrong value.
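The usual guard against this hazard is to declare the lock reference final, so it can never be swapped while a thread is synchronizing on it. A minimal sketch of the example above with that fix:
private final Object lock = new Object(); // final: the reference can never change
private int value = 0;

public void increment() {
    synchronized (lock) { value++; }
}

public void printValue() {
    synchronized (lock) { System.out.println(value); }
}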
EDIT: Didn't see that he needs a lock for each row.
I am referencing an article from Baeldung.com (Article). Unfortunately, the article does not explain why this is not thread-safe code.
My goal is to understand how to create a thread safe method with the synchronized keyword.
My actual result is: The count value is 1.
package NotSoThreadSafe;
public class CounterNotSoThreadSafe {
private int count = 0;
public int getCount() { return count; }
// synchronized specifies that the method can only be accessed by 1 thread at a time.
public synchronized void increment() throws InterruptedException { int temp = count; wait(100); count = temp + 1; }
}
My expected result is: The count value should be 10 because:
I created 10 threads in a pool.
I executed Counter.increment() 10 times.
I make sure I only test after the CountDownLatch reached 0.
Therefore, it should be 10. However, if you release the lock of synchronized using Object.wait(100), the method becomes not thread-safe.
package NotSoThreadSafe;
import org.junit.jupiter.api.Test;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import static org.junit.jupiter.api.Assertions.assertEquals;
class CounterNotSoThreadSafeTest {
@Test
void incrementConcurrency() throws InterruptedException {
int numberOfThreads = 10;
ExecutorService service = Executors.newFixedThreadPool(numberOfThreads);
CountDownLatch latch = new CountDownLatch(numberOfThreads);
CounterNotSoThreadSafe counter = new CounterNotSoThreadSafe();
for (int i = 0; i < numberOfThreads; i++) {
service.execute(() -> {
try { counter.increment(); } catch (InterruptedException e) { e.printStackTrace(); }
latch.countDown();
});
}
latch.await();
assertEquals(numberOfThreads, counter.getCount());
}
}
This code has both of the classical concurrency problems: a race condition (a semantic problem) and a data race (a memory-model problem).
Object.wait() releases the object's monitor, and another thread can enter the synchronized block/method while the current one is waiting. Obviously, the author's intention was to make the method atomic, but Object.wait() breaks the atomicity. As a result, if we call .increment() from, let's say, 10 threads simultaneously and each thread calls the method 100_000 times, we get count < 10 * 100_000 almost always, and this isn't what we want. This is a race condition, a logical/semantic problem. We can rephrase the code: since we release the monitor (this is equal to exiting the synchronized block), the code works as follows (like two separate synchronized parts):
public void increment() {
int temp = incrementPart1();
incrementPart2(temp);
}
private synchronized int incrementPart1() {
int temp = count;
return temp;
}
private synchronized void incrementPart2(int temp) {
count = temp + 1;
}
and, therefore, our increment does not increment the counter atomically. Now, let's assume that the 1st thread calls incrementPart1, then the 2nd one calls incrementPart1, then the 2nd one calls incrementPart2, and finally the 1st one calls incrementPart2. We made 2 calls to increment(), but the result is 1, not 2.
Another problem is a data race. There is the Java Memory Model (JMM) described in the Java Language Specification (JLS). The JMM introduces a happens-before (HB) order between actions like volatile memory writes/reads, Object monitor operations, etc. https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.4.5 HB gives us guarantees that a value written by one thread will be visible to another one. The rules for how to get these guarantees are also known as safe publication rules. The most common/useful ones are:
Publish the value/reference via a volatile field (https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.4.5), or, as a consequence of this rule, via the AtomicX classes
Publish the value/reference through a properly locked field (https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.4.5)
Use the static initializer to do the initializing stores
(http://docs.oracle.com/javase/specs/jls/se11/html/jls-12.html#jls-12.4)
Initialize the value/reference in a final field, which leads to the freeze action (https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.5).
So, to have the counter correctly visible (as the JMM defines it), we must make it volatile
private volatile int count = 0;
or do the read under synchronization on the same object monitor
public synchronized int getCount() { return count; }
I'd say that in practice, on Intel processors, you read the correct value without any of these additional efforts, with just a plain read, because of the TSO (Total Store Ordering) model they implement. But on a more relaxed architecture, like ARM, you get the problem. Follow the JMM formally to be sure your code is really thread-safe and doesn't contain any data races.
Why is int temp = count; wait(100); count = temp + 1; not thread-safe? One possible flow:
The first thread reads count (0), saves it in temp for later, and waits, allowing the second thread to run (lock released);
the second thread reads count (also 0), saves it in temp, and waits, eventually allowing the first thread to continue;
the first thread increments the value from temp and saves it in count (1);
but the second thread still holds the old value of count (0) in temp; eventually it will run and store temp + 1 (1) into count, not incrementing its new value.
very simplified, just considering 2 threads
In short: wait() releases the lock, allowing other (synchronized) methods to run.
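A minimal sketch of one fix, assuming the 100 ms pause was only meant to simulate slow work: Thread.sleep(), unlike Object.wait(), does not release the monitor, so the read-modify-write stays atomic and the test sees 10:
public synchronized void increment() throws InterruptedException {
    int temp = count;
    Thread.sleep(100); // sleep() keeps this object's monitor held, unlike wait()
    count = temp + 1;
}

// Read under the same monitor so the final value is visible per the JMM.
public synchronized int getCount() { return count; }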
I am interested in how synchronized works, in the sense of how/when it flushes writes from local caches. Let's imagine I have the following code:
class Scratch1 {
int counter = 0;
Scratch1() throws ExecutionException, InterruptedException {
counter += 5;
counter += 5;
// Does this flush the possibly cached value written by the main thread, even if it locks
// on a totally unrelated object and the write doesn't happen inside the sync block?
synchronized (String.class) {}
Executors.newCachedThreadPool().submit(() -> {
for (int i = 0; i < 1000; i++) {
counter += 5;
}
synchronized (Integer.class) {}
}).get();
System.out.println(counter);
}
}
class Scratch2 {
int counter = 0;
Scratch2() throws ExecutionException, InterruptedException {
// Or is this the only working way to flush written data?
synchronized (String.class) {
counter += 5;
counter += 5;
}
Executors.newCachedThreadPool().submit(() -> {
synchronized (Integer.class) {
for (int i = 0; i < 1000; i++) {
counter += 5;
}
}
}).get();
System.out.println(counter);
}
}
class Scratch3 {
volatile int counter = 0;
Scratch3() throws ExecutionException, InterruptedException {
counter += 5;
counter += 5;
Executors.newCachedThreadPool().submit(() -> {
for (int i = 0; i < 1000; i++) {
counter += 5;
}
}).get();
System.out.println(counter);
}
}
I have several questions:
Do all three examples share the same "thread-safety" level (taking into account specifics like the first write being done by one thread and the second write being done after the first one (is it?) and by another thread), i.e. "is it guaranteed that 5010 is printed"?
Is there a performance difference (at least theoretically) in "operating" outside a synchronized block or working with non-volatile properties (I would expect volatile access to be slower, as this post confirms)? And in the case of a synchronized block, is the "flushing" price paid only when crossing the synchronized start/end, or is there also a difference while inside the block?
I am interested in how synchronized works, in the sense of how/when it flushes writes from local caches.
Actually, synchronized doesn't flush writes from local caches. It just acts as if it did so.
Do all three examples share the same "thread-safety" level (taking into account specifics like the first write being done by one thread and the second write being done after the first one (is it?) and by another thread), i.e. "is it guaranteed that 5010 is printed"?
They all provide slightly different forms of thread safety. None of them are really safe if other threads are accessing the object at the same time. For example, another thread accessing counter would have to hold both the String.class and the Integer.class locks to ensure it didn't see counter mid-operation. The third one uses increment operations that aren't atomic, though it's safe if no other thread tries to modify counter.
Is there a performance difference (at least theoretically) in "operating" outside a synchronized block or working with non-volatile properties (I would expect volatile access to be slower, as this post confirms)? And in the case of a synchronized block, is the "flushing" price paid only when crossing the synchronized start/end, or is there also a difference while inside the block?
No difference. Entering a synchronized block has a cost because the lock has to be acquired and some optimizations have to be disabled across the entry point. Exiting the block has similar costs.
Inside the block, there are no costs, because the safety is provided by the programmer ensuring they don't allow any thread to modify the object unless it holds the same lock, and no two threads can hold the same lock at the same time. Generally speaking, code may not even know whether or not it's inside one or more synchronized blocks, because it can be deep down the call tree.
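For contrast, a minimal sketch (the class name ScratchFixed is mine, not from the question) in which every access to counter goes through one and the same lock, so printing 5010 is guaranteed on any architecture:
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ScratchFixed {
    private final Object lock = new Object(); // one lock guards every access to counter
    private int counter = 0;

    ScratchFixed() throws ExecutionException, InterruptedException {
        synchronized (lock) {
            counter += 5;
            counter += 5;
        }
        ExecutorService pool = Executors.newCachedThreadPool();
        pool.submit(() -> {
            synchronized (lock) {
                for (int i = 0; i < 1000; i++) {
                    counter += 5;
                }
            }
        }).get(); // get() additionally establishes happens-before with the task
        synchronized (lock) {
            System.out.println(counter); // prints 5010
        }
        pool.shutdown();
    }
}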
I have this scenario (this is Java pseudo-code):
There is a main thread which:
1) creates an instance of an array of type C:
C[] arr = new C[LARGE];
2) creates and submits tasks which populate arr (by doing CPU-bound operations) to a pool P1:
for (int i = 0; i < populateThreadCount; i++) {
p1.submit(new PopulateTask(arr, start, end))
}
Each task populates a different range of indexes in arr, so at this point synchronization is not needed between the threads in pool P1.
3) the main thread waits until all populate tasks are finished.
4) once arr is populated, the main thread creates and submits tasks which upload the content of arr (IO-bound operations) to a pool P2:
for (int i = 0; i < uploadThreadCount; i++) {
p2.submit(new UploadTask(arr, start, end));
}
As previously, the ranges do not overlap and each thread has its own range, so no internal synchronization between the threads in pool P2 is necessary.
In the populate and upload tasks the ranges are different as there is a different number of threads to handle each type.
Now I am thinking about the most efficient way to synchronize it.
Using CopyOnWriteArrayList is not an option as it can be very large (millions of elements).
My initial idea was to synchronize briefly in a populate task after creating an instance of the C class, and then do the same in an upload task:
C[] arr = new C[LARGE];
for (int i = 0; i < populateThreadCount; i++) {
p1.submit(new PopulateTask(arr, start, end) {
void run() {
for (int j = start; j <= end; j++) {
... do some heavy computation ...
arr[j] = new C(some_computed_data);
synchronized(arr[j]) {}
}
}
});
}
for (int i = 0; i < uploadThreadCount; i++) {
p2.submit(new UploadTask(arr, start, end) {
void run() {
for (int j = start; j <= end; j++) {
synchronized(arr[j]) {
upload(arr[j]);
}
}
}
});
}
but I am not sure if this is correct, especially whether this empty synchronized block will be optimized away by javac or the JIT.
I cannot create the instances of the C class before starting the populate tasks, as I need the computed data for that.
Any ideas whether that's correct, and if not, a better way to do it?
You don't need to synchronize anything. The executor offers the memory visibility guarantees you need. In particular, see the concurrent package documentation:
Actions in a thread prior to the submission of a Runnable to an Executor happen-before its execution begins. Similarly for Callables submitted to an ExecutorService.
Actions taken by the asynchronous computation represented by a Future happen-before actions subsequent to the retrieval of the result via Future.get() in another thread.
So, the changes done by the tasks submitted to the first executor happen before what the main thread does after the executor has finished executing them (second rule), and what the main thread does with the array happens before the actions executed by the tasks submitted to the second executor (first rule).
Since happen-before is transitive, the tasks submitted to the second executor will see the changes made by the tasks submitted to the first one.
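Applied to the question's pseudo-code, the pattern looks roughly like this (PopulateTask, UploadTask, p1, p2, start and end are the question's names; collecting the Futures in a list is an added detail):
// Keep the futures so the main thread can wait for every populate task.
List<Future<?>> populateFutures = new ArrayList<>();
for (int i = 0; i < populateThreadCount; i++) {
    populateFutures.add(p1.submit(new PopulateTask(arr, start, end)));
}
for (Future<?> f : populateFutures) {
    f.get(); // second rule: the task's writes happen-before this get() returns
}
// first rule: everything above happens-before the upload tasks execute
for (int i = 0; i < uploadThreadCount; i++) {
    p2.submit(new UploadTask(arr, start, end));
}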
I would like an explanation of these different implementations:
First:
public int foo(Object key){
    synchronized (map.get(key)) { // -> a thread can enter with a different key
        int variable = 0;
        for (int j = 0; j < new Random().nextInt(10); j++)
            variable += j;
        return variable;
    }
}
Second:
public int foo(Object key){
    int variable = 0;
    synchronized (map.get(key)) {
        for (int j = 0; j < new Random().nextInt(10); j++)
            variable += j;
        return variable;
    }
}
Third:
public int foo(Object key){
    int variable = 0;
    synchronized (map.get(key)) {
        for (int j = 0; j < new Random().nextInt(10); j++)
            variable += j;
        lock.lock(); // class instance lock
        try {
            setTheVariable(variable); // -> example
        } finally {
            lock.unlock();
        }
        return variable;
    }
}
In my opinion the first two implementations are the same: if multiple threads enter the synchronized block (for different keys) they run the for loop concurrently, but each has its own copy of variable. Is that right?
I have a doubt about the third implementation: threads that entered the synchronized block must then take the lock one at a time, and the others have to wait. In this case, when one thread returns, does the variable resulting from its for loop remain attached to its own thread?
Thanks in advance.
Variables declared inside the foo() method remain attached to the individual threads, because they are local variables. Here you are declaring j and variable inside the method, and those variables will remain attached to the thread executing the method.
Your first two implementations are the same.
In your third implementation, only one thread at a time can enter the lock.lock()/unlock() section, irrespective of which per-key monitor it holds, so the extra lock is somewhat redundant unless your setTheVariable(variable) section has a compelling reason for it.
Because all the variables involved are local variables, each thread has its own copy of these variables. The returned value of one thread will not be affected by another thread.
However, always watch out for deadlock if two locks are used in this fashion.
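On that last point, a minimal hypothetical sketch (lockA and lockB are illustrative names, not from the question) of how two locks acquired in opposite orders can deadlock:
final Object lockA = new Object();
final Object lockB = new Object();

void thread1Path() {
    synchronized (lockA) {       // thread 1 holds A...
        synchronized (lockB) { } // ...and waits for B
    }
}

void thread2Path() {
    synchronized (lockB) {       // thread 2 holds B...
        synchronized (lockA) { } // ...and waits for A -> possible deadlock
    }
}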
public class MyStack2 {
private int[] values = new int[10];
private int index = 0;
public synchronized void push(int x) {
if (index <= 9) {
values[index] = x;
Thread.yield();
index++;
}
}
public synchronized int pop() {
if (index > 0) {
index--;
return values[index];
} else {
return -1;
}
}
public synchronized String toString() {
String reply = "";
for (int i = 0; i < values.length; i++) {
reply += values[i] + " ";
}
return reply;
}
}
public class Pusher extends Thread {
private MyStack2 stack;
public Pusher(MyStack2 stack) {
this.stack = stack;
}
public void run() {
for (int i = 1; i <= 5; i++) {
stack.push(i);
}
}
}
public class Test {
public static void main(String args[]) {
MyStack2 stack = new MyStack2();
Pusher one = new Pusher(stack);
Pusher two = new Pusher(stack);
one.start();
two.start();
try {
one.join();
two.join();
} catch (InterruptedException e) {
}
System.out.println(stack.toString());
}
}
Since the methods of the MyStack2 class are synchronized, I was expecting the output
1 2 3 4 5 1 2 3 4 5. But the output is indeterminate. Often it gives: 1 1 2 2 3 3 4 4 5 5
As per my understanding, when thread one is started it acquires a lock on the push method. Inside push(), thread one yields for some time. But does it release the lock when yield() is called? Now when thread two is started, would thread two acquire the lock before thread one completes execution? Can someone explain when thread one releases the lock on the stack object?
A synchronized method only stops other threads from executing it while it is being executed. As soon as it returns, other threads can (and often will immediately) get access.
The scenario to get your 1 1 2 2 ... could be:
Thread 1 calls push(1) and is allowed in.
Thread 2 calls push(1) and is blocked while Thread 1 is using it.
Thread 1 exits push(1).
Thread 2 gains access to push and pushes 1, but at the same time Thread 1 calls push(2).
Result 1 1 2 - you can clearly see how it continues.
When you say:
As per my understanding, when thread one is started it acquires a lock on the push method.
that is not quite right, in that the lock isn't just on the push method. The lock that the push method uses is on the instance of MyStack2 that push is called on. The methods pop and toString use the same lock as push. When a thread calls any of these methods on an object, it has to wait until it can acquire the lock. A thread in the middle of calling push will block another thread from calling pop. The threads are calling different methods to access the same data structure; using the same lock for all the methods that access the structure prevents the threads from accessing it concurrently.
Once a thread gives up the lock on exiting a synchronized method the scheduler decides which thread gets the lock next. Your threads are acquiring locks and letting them go multiple times, every time a lock is released there is a decision for the scheduler to make. You can't make any assumptions about which will get picked, it can be any of them. Output from multiple threads is typically jumbled up.
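To make the shared lock explicit: a synchronized instance method is shorthand for synchronizing on this, so push could equivalently be written as below (a sketch of the equivalence, not a suggested change):
// Equivalent to "public synchronized void push(int x)": both acquire the
// monitor of this MyStack2 instance, the same one pop() and toString() use.
public void push(int x) {
    synchronized (this) {
        if (index <= 9) {
            values[index] = x;
            Thread.yield();
            index++;
        }
    }
}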
It seems like you may have some confusion about exactly what the synchronized and yield keywords mean.
Synchronized means that only one thread can enter that code block at a time. Imagine it as a gate where you need a key to get through. Each thread takes the only key as it enters and returns it when done, allowing the next thread to get the key and execute the code inside. It doesn't matter how long a thread stays in the synchronized method; only one thread can enter at a time.
Yield suggests (and yes, it's only a suggestion) to the thread scheduler that the current thread can give up its allotted time so another thread can begin execution. It doesn't always happen that way, however.
In your code, even though the current thread suggests that it can give up its execution time, it still holds the key to the synchronized methods, and therefore another thread cannot enter them.
The unpredictable behavior comes from yield not giving up the execution time as you predicted.
Hope that helped!
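If the 1 2 3 4 5 1 2 3 4 5 output is what you want, a minimal sketch (assuming it is acceptable to hold the stack's monitor across the whole loop) is to synchronize the loop in Pusher.run() on the stack itself; the monitor is re-entrant, so the synchronized push() calls still work:
public void run() {
    synchronized (stack) { // hold the MyStack2 monitor for all five pushes
        for (int i = 1; i <= 5; i++) {
            stack.push(i); // re-entrant: we already own the monitor push() needs
        }
    }
}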