Implementing a cyclicbarrier in java using semaphores


The question is as follows: the barrier is only ever invoked through down(), which should wait for n threads to arrive and then let all n threads run the critical region together. How do I tell the threads blocked in barrier.down() that they can move on? I tried adding notifyAll() after phase2(), and that doesn't work. Help? :)
public class cyclicBarrier {
private int n;
private int count;
private semaphore mutex;
private semaphore turnstile;
private semaphore turnstile2;
public cyclicBarrier(int n){
this.n = n;
this.count = 0;
this.mutex = new semaphore(1);
this.turnstile = new semaphore(0);
this.turnstile2 = new semaphore(0);
}
public synchronized void down() throws InterruptedException{
this.phase1(); //waits for n threads to arrive
this.phase2(); //waits for n threads to execute
}
private synchronized void phase1() throws InterruptedException {
this.mutex.down();
this.count++;
if(this.count == this.n){
for(int i = 0; i < this.n; i++){
this.turnstile.signal(); //when n threads received then move on to phase 2
}
}
this.mutex.signal();
this.turnstile.down(); //keeps waiting till I get n threads
}
private synchronized void phase2() throws InterruptedException {
this.mutex.down();
this.count--;
if(this.count == 0){
for(int i = 0; i < this.n; i++){
this.turnstile2.signal(); //reset the barrier for reuse
}
}
this.mutex.signal();
this.turnstile2.down(); //keeps waiting till n threads get executed
}
}
public class semaphore {
private int counter;
public semaphore(int number){
if (number > 0) {
this.counter = number;
}
}
public synchronized void signal(){
this.counter++;
notifyAll();
}
public synchronized void down() throws InterruptedException{
while (this.counter <= 0){
wait();
}
this.counter--;
}
}

I see you're using the solution from The Little Book of Semaphores. One main point of the book is that you can solve many coordination problems using semaphores as the only coordination primitive. It is perfectly fine to use synchronized to implement a semaphore, since that is necessary to do it correctly. It misses the point, however, to use synchronized in the methods which solve a puzzle that is supposed to be solved with semaphores.
Also, I think it doesn't work in your case: don't you get a deadlock at this.turnstile.down()? You block on a semaphore while holding an exclusive lock (through synchronized) on the barrier object, and the other threads need that same lock to enter the methods that would release the semaphore.
Addressing the question as stated: you signal to threads that they can proceed by returning from barrier.down(). You ensure that you don't return too soon by doing turnstile.down().
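As a minimal sketch of that idea, here is the two-turnstile barrier written with java.util.concurrent.Semaphore in place of the question's own semaphore class, and with no synchronized on the barrier methods. The names SemaphoreBarrier and BarrierDemo and the violation-counting harness are assumptions made for illustration; they are not from the question.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo harness: counts how often a thread leaves the barrier too early.
class BarrierDemo {
    static int run(int threads, int rounds) {
        SemaphoreBarrier barrier = new SemaphoreBarrier(threads);
        AtomicInteger arrivals = new AtomicInteger();
        AtomicInteger violations = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    for (int r = 0; r < rounds; r++) {
                        arrivals.incrementAndGet();
                        barrier.down();
                        // After round r, every thread must have arrived r + 1 times.
                        if (arrivals.get() < threads * (r + 1)) {
                            violations.incrementAndGet();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return violations.get();
    }

    public static void main(String[] args) {
        System.out.println("violations: " + run(4, 100));
    }
}

// Reusable barrier built only from semaphores; count is guarded by mutex.
class SemaphoreBarrier {
    private final int n;
    private int count = 0;
    private final Semaphore mutex = new Semaphore(1);
    private final Semaphore turnstile = new Semaphore(0);
    private final Semaphore turnstile2 = new Semaphore(0);

    SemaphoreBarrier(int n) {
        this.n = n;
    }

    // Returning from down() is the signal that the caller may proceed.
    void down() throws InterruptedException {
        phase1();
        phase2();
    }

    private void phase1() throws InterruptedException {
        mutex.acquire();
        if (++count == n) {
            turnstile.release(n);   // last arrival opens the first turnstile for all n
        }
        mutex.release();
        turnstile.acquire();        // blocks without holding any lock
    }

    private void phase2() throws InterruptedException {
        mutex.acquire();
        if (--count == 0) {
            turnstile2.release(n);  // last thread through resets the barrier for reuse
        }
        mutex.release();
        turnstile2.acquire();
    }
}
```

The important difference from the code in the question is that a thread never blocks on a turnstile while holding a lock that other threads need in order to reach the release call.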
Aside: Semaphore implementation
Your semaphore implementation looks correct, except that you only allow non-negative initial values, which is at least non-standard. Is there some motivation for doing this that I can't see? If you think negative initial values are wrong, why not throw an error instead of silently doing something else?
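For illustration, a variant of the same class that rejects a negative initial value instead of silently replacing it with zero might look like the sketch below. CheckedSemaphore and its available() accessor are hypothetical names added here; the signal() and down() bodies are taken from the question.

```java
// Hypothetical variant of the question's semaphore with constructor validation.
class CheckedSemaphore {
    private int counter;

    CheckedSemaphore(int initial) {
        if (initial < 0) {
            throw new IllegalArgumentException("initial value must be >= 0, got " + initial);
        }
        this.counter = initial;
    }

    synchronized void signal() {
        counter++;
        notifyAll();
    }

    synchronized void down() throws InterruptedException {
        while (counter <= 0) {
            wait();
        }
        counter--;
    }

    // Accessor added only so the demo can inspect the counter.
    synchronized int available() {
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(new CheckedSemaphore(2).available());
        try {
            new CheckedSemaphore(-1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```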
Aside: Other synchronization primitives
Note that the java constructs synchronized, .wait() and .notify() correspond to the Monitor coordination primitive. It may be instructive to solve the puzzles with monitors (or other coordination primitives) instead of semaphores, but I would recommend keeping those efforts separate. I've had a bit of fun trying to solve a puzzle using Haskell's Software Transactional Memory.
Aside: On runnability
You say you have tried things, which indicates that you have some code that allows you to run the code in the question. It would have been helpful if you had included that code, so we could easily run it too. I probably would have checked that my hypothesized deadlock actually occurs.

Related

Why does wait(100) cause synchronized method to fail in multi threaded?

I am working from an article on Baeldung.com. Unfortunately, the article does not explain why this code is not thread-safe.
My goal is to understand how to create a thread safe method with the synchronized keyword.
My actual result is: The count value is 1.
package NotSoThreadSafe;
public class CounterNotSoThreadSafe {
private int count = 0;
public int getCount() { return count; }
// synchronized specifies that the method can only be accessed by 1 thread at a time.
public synchronized void increment() throws InterruptedException { int temp = count; wait(100); count = temp + 1; }
}
My expected result is: The count value should be 10 because of:
I created 10 threads in a pool.
I executed Counter.increment() 10 times.
I make sure I only test after the CountDownLatch reached 0.
Therefore, it should be 10. However, if you release the synchronized lock using Object.wait(100), the method is no longer thread-safe.
package NotSoThreadSafe;
import org.junit.jupiter.api.Test;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import static org.junit.jupiter.api.Assertions.assertEquals;
class CounterNotSoThreadSafeTest {
@Test
void incrementConcurrency() throws InterruptedException {
int numberOfThreads = 10;
ExecutorService service = Executors.newFixedThreadPool(numberOfThreads);
CountDownLatch latch = new CountDownLatch(numberOfThreads);
CounterNotSoThreadSafe counter = new CounterNotSoThreadSafe();
for (int i = 0; i < numberOfThreads; i++) {
service.execute(() -> {
try { counter.increment(); } catch (InterruptedException e) { e.printStackTrace(); }
latch.countDown();
});
}
latch.await();
assertEquals(numberOfThreads, counter.getCount());
}
}

This code has both of the classical concurrency problems: a race condition (a semantic problem) and a data race (a memory model related problem).
Object.wait() releases the object's monitor, so another thread can enter the synchronized block/method while the current one is waiting. The author's intention was evidently to make the method atomic, but Object.wait() breaks the atomicity. As a result, if we call .increment() from, let's say, 10 threads simultaneously and each thread calls the method 100_000 times, we almost always get count < 10 * 100_000, which isn't what we want. This is a race condition, a logical/semantic problem. We can rephrase the code: since releasing the monitor is equivalent to exiting the synchronized block, the code behaves like two separate synchronized parts:
public void increment() {
int temp = incrementPart1();
incrementPart2(temp);
}
private synchronized int incrementPart1() {
int temp = count;
return temp;
}
private synchronized void incrementPart2(int temp) {
count = temp + 1;
}
and, therefore, our increment increments the counter not atomically. Now, let's assume that 1st thread calls incrementPart1, then 2nd one calls incrementPart1, then 2nd one calls incrementPart2, and finally 1st one calls incrementPart2. We did 2 calls of the increment(), but the result is 1, not 2.
Another problem is a data race. The Java Memory Model (JMM) is described in the Java Language Specification (JLS). The JMM introduces a happens-before (HB) order between actions such as volatile memory writes/reads, object monitor operations, etc. (https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.4.5). HB gives us guarantees that a value written by one thread will be visible to another. The rules for obtaining these guarantees are also known as Safe Publication rules. The most common/useful ones are:
Publish the value/reference via a volatile field (https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.4.5), or as the consequence of this rule, via the AtomicX classes
Publish the value/reference through a properly locked field (https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.4.5)
Use the static initializer to do the initializing stores (http://docs.oracle.com/javase/specs/jls/se11/html/jls-12.html#jls-12.4)
Initialize the value/reference into a final field, which leads to the freeze action (https://docs.oracle.com/javase/specs/jls/se11/html/jls-17.html#jls-17.5).
So, to have the counter correctly (as JMM has defined) visible, we must make it volatile
private volatile int count = 0;
or do the read over the same object monitor's synchronization
public synchronized int getCount() { return count; }
I'd say that in practice, on Intel processors, you read the correct value without any of these additional efforts, with just a simple plain read, because they implement TSO (Total Store Ordering). But on a more relaxed architecture, like ARM, you can hit the problem. Follow the JMM formally to be sure your code is really thread-safe and doesn't contain any data races.

Why is int temp = count; wait(100); count = temp + 1; not thread-safe? One possible flow (very simplified, considering just 2 threads):
The first thread reads count (0), saves it in temp for later, and waits, allowing the second thread to run (lock released);
the second thread reads count (also 0), saves it in temp, and waits, eventually allowing the first thread to continue;
the first thread increments the value from temp and stores it in count (1);
but the second thread still holds the old value of count (0) in temp; eventually it will run and store temp+1 (1) into count, overwriting rather than incrementing the new value.
In short: wait() releases the lock allowing other (synchronized) method to run.
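For contrast, here is a small sketch in which the read-modify-write stays inside a single monitor hold. It swaps wait(100) for Thread.sleep(1), which pauses the thread without releasing the monitor. SafeCounter and runThreads are hypothetical names for this illustration and are not from the article or the question.

```java
// Sketch: the monitor is held for the whole read-modify-write, so no update is lost.
class SafeCounter {
    private int count = 0;

    synchronized void increment() throws InterruptedException {
        int temp = count;
        Thread.sleep(1);   // simulated slow work; unlike wait(), sleep() keeps the monitor
        count = temp + 1;
    }

    // Reading under the same monitor also makes the value visible to the reader.
    synchronized int getCount() {
        return count;
    }

    // Starts n threads that each increment once, then returns the final count.
    static int runThreads(int n) {
        SafeCounter counter = new SafeCounter();
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = new Thread(() -> {
                try {
                    counter.increment();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.getCount();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(10)); // prints 10
    }
}
```

The trade-off is that the threads now run the slow section strictly one after another, which is exactly what atomicity of the whole method requires.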

Terribly slow synchronization

I'm trying to write the Game of Life with many threads, 1 cell = 1 thread. It requires synchronization between threads, so that no thread starts calculating its new state before the other threads have finished reading the previous state. Here is my code:
public class Cell extends Processor{
private static int count = 0;
private static Semaphore waitForAll = new Semaphore(0);
private static Semaphore waiter = new Semaphore(0);
private IntField isDead;
public Cell(int n)
{
super(n);
count ++;
}
public void initialize()
{
this.algorithmName = Cell.class.getSimpleName();
isDead = new IntField(0);
this.addField(isDead, "state");
}
public synchronized void step()
{
int size = neighbours.size();
IntField[] states = new IntField[size];
int readElementValue = 0;
IntField readElement;
sendAll(new IntField(isDead.getDist()));
Cell.waitForAll.release();
//here wait untill all other threads finish reading
while (Cell.waitForAll.availablePermits() != Cell.count) {
}
//here release semaphore neader lower
Cell.waiter.release();
for (int i = 0; i < neighbours.size(); i++) {
readElement = (IntField) reciveMessage(neighbours.get(i));
states[i] = (IntField) reciveMessage(neighbours.get(i));
}
int alive = 0;
int dead = 0;
for(IntField ii: states)
{
if(ii.getDist() == 1)
alive++;
else
dead++;
}
if(isDead.getDist() == 0)
{
if(alive == 3)
isDead.setValue(1);
else
;
}
else
{
if(alive == 3 || alive == 2)
;
else
isDead.setValue(0);
}
try {
while(Cell.waiter.availablePermits() != Cell.count)
{
;
//if every thread finished reading we can acquire this semaphore
}
Cell.waitForAll.acquire();
while(Cell.waitForAll.availablePermits() != 0)
;
//here we make sure every thread ends step in same moment
Cell.waiter.acquire();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
The Processor class extends Thread, and in its run method, if I turn a switch on, it calls the step() method. It works nicely for a small number of cells, but when I run about 36 cells it starts to be very slow. How can I repair my synchronization so it would be faster?

Using large numbers of threads tends not to be very efficient, but 36 is not so many that I would expect that in itself to produce a difference that you would characterize as "very slow". I think more likely the problem is inherent in your strategy. In particular, I suspect this busy-wait is problematic:
Cell.waitForAll.release();
//here wait untill all other threads finish reading
while (Cell.waitForAll.availablePermits() != Cell.count) {
}
Busy-waiting is always a performance problem because you are tying up the CPU with testing the condition over and over again. This busy-wait is worse than most, because it involves testing the state of a synchronization object, and this not only has extra overhead, but also introduces extra interference among threads.
Instead of busy-waiting, you want to use one of the various methods for making threads suspend execution until a condition is satisfied. It looks like what you've actually done is created a poor-man's version of a CyclicBarrier, so you might consider instead using CyclicBarrier itself. Alternatively, since this is a learning exercise you might benefit from learning how to use Object.wait(), Object.notify(), and Object.notifyAll() -- Java's built-in condition variable implementation.
If you insist on using semaphores, then I think you could do it without the busy-wait. The key to using semaphores is that it is being able to acquire the semaphore (at all) that indicates that the thread can proceed, not the number of available permits. If you maintain a separate variable with which to track how many threads are waiting on a given semaphore at a given point, then each thread reaching that point can determine whether to release all the other threads (and proceed itself) or whether to block by attempting to acquire the semaphore.
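To make the CyclicBarrier suggestion concrete, here is a rough sketch of one thread per cell with two barrier waits per step: one after reading the old state and one after writing the new state. The class name BarrierCells, the ring layout, and the toy rule (left XOR right) are assumptions chosen to keep the example short; they stand in for the question's Processor, message passing, and Game of Life rules.

```java
import java.util.Arrays;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

class BarrierCells {
    // One thread per cell; threads block in await() instead of spinning.
    static int[] runParallel(int[] initial, int steps) {
        int n = initial.length;
        int[] state = initial.clone();
        CyclicBarrier barrier = new CyclicBarrier(n);
        Thread[] cells = new Thread[n];
        for (int i = 0; i < n; i++) {
            final int me = i;
            cells[i] = new Thread(() -> {
                try {
                    for (int s = 0; s < steps; s++) {
                        int next = state[(me + n - 1) % n] ^ state[(me + 1) % n];
                        barrier.await();   // every cell has finished reading the old state
                        state[me] = next;
                        barrier.await();   // every cell has finished writing the new state
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            cells[i].start();
        }
        try {
            for (Thread t : cells) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return state;
    }

    // Single-threaded reference implementation of the same toy rule.
    static int[] runSequential(int[] initial, int steps) {
        int n = initial.length;
        int[] cur = initial.clone();
        for (int s = 0; s < steps; s++) {
            int[] next = new int[n];
            for (int i = 0; i < n; i++) {
                next[i] = cur[(i + n - 1) % n] ^ cur[(i + 1) % n];
            }
            cur = next;
        }
        return cur;
    }

    public static void main(String[] args) {
        int[] start = {0, 1, 0, 0, 1, 1, 0, 1};
        System.out.println(Arrays.toString(runParallel(start, 5)));
        System.out.println(Arrays.toString(runSequential(start, 5)));
    }
}
```

Both lines printed by main should be identical, since the barriers keep any cell from writing before all cells have read.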

Is this synchronized block needed?

Is the synchronized block around System.out.println(number); needed in the following code?
import java.util.concurrent.CountDownLatch;
public class Main {
private static final Object LOCK = new Object();
private static long number = 0L;
public static void main(String[] args) throws InterruptedException {
CountDownLatch doneSignal = new CountDownLatch(10);
for (int i = 0; i < 10; i++) {
Worker worker = new Worker(doneSignal);
worker.start();
}
doneSignal.await();
synchronized (LOCK) { // Is this synchronized block needed?
System.out.println(number);
}
}
private static class Worker extends Thread {
private final CountDownLatch doneSignal;
private Worker(CountDownLatch doneSignal) {
this.doneSignal = doneSignal;
}
@Override
public void run() {
synchronized (LOCK) {
number += 1;
}
doneSignal.countDown();
}
}
}
I think it's needed because there is a possibility of reading a cached value.
But some people say that:
It's unnecessary.
Because by the time the main thread reads the variable number, all of the worker threads have completed their writes to number in memory.

doneSignal.await() is a blocking call, so your main() will only proceed when all your Worker threads have called doneSignal.countDown(), making it reach 0, which is what makes the await() method return.
There is no point adding that synchronized block before the System.out.println(), all your threads are already done at that point.
Consider using an AtomicInteger for number instead of synchronizing against a lock to call += 1.
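A rough sketch of that suggestion follows. Because number in the question is a long, it uses AtomicLong rather than AtomicInteger. AtomicCounterDemo and run are hypothetical names for this illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

class AtomicCounterDemo {
    // Each worker increments once and then counts the latch down.
    static long run(int workers) {
        AtomicLong number = new AtomicLong();
        CountDownLatch doneSignal = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                number.incrementAndGet();   // atomic read-modify-write, no lock needed
                doneSignal.countDown();
            }).start();
        }
        try {
            doneSignal.await();             // returns once every worker has counted down
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return number.get();
    }

    public static void main(String[] args) {
        System.out.println(run(10)); // prints 10
    }
}
```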

It is not necessary:
CountDownLatch doneSignal = new CountDownLatch(10);
for (int i = 0; i < 10; i++) {
Worker worker = new Worker(doneSignal);
worker.start();
}
doneSignal.await();
// here the only thread running is the main thread
Just before dying, each thread counts down the CountDownLatch:
@Override
public void run() {
synchronized (LOCK) {
number += 1;
}
doneSignal.countDown();
}
Only when the 10 threads have finished their job will execution pass the doneSignal.await(); line.

It is not necessary because you are waiting for the "done" signal. Waiting on it synchronizes memory in such a way that all values written by the awaited threads become visible to the main thread.
However, you can test that easily: inside the run method, perform a computation that takes several (millions of) steps and does not get optimized away by the compiler. If you see a value different from the final value that you expect, then your final value was not yet visible to the main thread. Of course, the critical part here is to make sure the computation doesn't get optimized away, and a simple "increment" is likely to be. In general this is useful for testing concurrency when you are not sure whether you have the correct memory barriers, so it may turn out to be useful to you later.

synchronized is not needed around System.out.println(number);, but not because the PrintStream.println() implementations are internally synchronized or because by the time doneSignal.await() unblocks all the worker threads have finished.
synchronized is not needed because there's a happens-before edge between everything before each call to doneSignal.countDown and the completion of doneSignal.await(). This guarantees that you'll successfully see the correct value of number.

Needed
No.
However, since there is no documented guarantee against interleaving, output from different threads could in principle be interleaved.
System.out.println("ABC");
System.out.println("123");
could print:
AB1
23C
Worthwhile
Almost certainly not. Most JVMs will implement println with a lock; OpenJDK does.
Edge case
As suggested by @DimitarDimitrov, there is one further possible use for that lock: ensuring that a memory barrier is crossed before accessing number. If that is the concern, then you do not need to lock; all you need to do is make number volatile.
private static volatile long number = 0L;

Java Synchronization at multiple level

As shown in example below, once lock is taken on an object in call method, there is no need for further methods to have synchronized keyword.
public class Prac
{
public static void main(String[] args)
{
new Prac().call();
}
private synchronized void call()
{
further();
}
private synchronized void further()
{
oneMore();
}
private synchronized void oneMore()
{
// do something
}
}
But if I still add the synchronized keyword to further and oneMore, how will performance be impacted? Or will it not be impacted at all?
EDIT: Does it add the cost of checking (after encountering the synchronized keyword) whether the thread already has the lock or needs to acquire it? Does this check add overhead internally?
EDIT: The application will not have only one thread; the code here is just sample code. You could replace main with a run method.

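The performance will barely be impacted. Re-acquiring a lock that the thread already holds is very cheap, because Java monitors are reentrant. HotSpot has also had an optimization called biased locking, which was switched on by default at the time of the Java SE 6 white paper quoted below (it was deprecated and disabled by default in JDK 15). That's why single-threaded applications are hardly affected by calling synchronized methods.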
Java SE 6 Performance White Paper:
An object is "biased" toward the thread which first acquires its monitor via a monitorenter bytecode or synchronized method invocation; subsequent monitor-related operations can be performed by that thread without using atomic operations resulting in much better performance, particularly on multiprocessor machines.
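Independently of how cheap it is, the nested calls are legal because Java monitors are reentrant: a thread that already owns a monitor can enter further synchronized methods on the same object without blocking. The sketch below mirrors the question's call chain. ReentrantDemo and the depth counter are hypothetical additions for illustration, and Thread.holdsLock is used only to make the ownership visible.

```java
// Sketch: the same thread enters three synchronized methods on one object.
class ReentrantDemo {
    private int depth = 0;

    synchronized int call() {
        depth++;
        return further();
    }

    private synchronized int further() {
        depth++;
        return oneMore();
    }

    private synchronized int oneMore() {
        depth++;
        // The current thread already owns this object's monitor at this point.
        if (!Thread.holdsLock(this)) {
            throw new IllegalStateException("monitor not held");
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println(new ReentrantDemo().call()); // prints 3
    }
}
```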

Synchronization makes methods a little slower, so avoid synchronizing methods if you have only one thread.

Since JDK 7, the HotSpot JVM has been able to optimize such code by eliminating nested locks.
The optimization is controlled by the -XX:+EliminateNestedLocks flag and is turned on by default.
The redundant locks are removed during JIT compilation, so there is no run-time overhead even to check whether the lock is already taken. However, this optimization works only when the monitor object is static final or when locking on this.

I modified the benchmark according to the comment below. In this benchmark, acquiring the lock multiple times occasionally takes less time than acquiring it once, but I think this is because of background threads such as GC and JIT.
import java.text.MessageFormat;
public class Benchmark {
final int count = 10000;
boolean the_bool = false; // prevent no-op optimization inside the loop
public static void main(String[] args) {
Benchmark benchmark = new Benchmark();
benchmark.start();
}
public void start() {
//run the test 12000 times
for (int i = 0; i < 12000; i++) {
long start = System.nanoTime();
aqcuire_lock_multiple_times();
long end = System.nanoTime();
long time1 = end - start; // time to acquire lock multiple times
start = System.nanoTime();
acquire_lock_once();
end = System.nanoTime();
long time2 = end - start; // the time to acquire lock once
if (time1 <= time2) {
String m = MessageFormat.format(
"time1:{0}ns < time2:{1}ns, iteration:{2}", time1, time2, i);
System.out.println(m);
}else{
// acquire the lock once is faster as expected
}
}
}
public synchronized void aqcuire_lock_multiple_times() {
for (int i = 0; i < count; i++) {
synchronized (this) {
the_bool = !the_bool;
}
}
}
public synchronized void acquire_lock_once() {
for (int i = 0; i < count; i++) {
the_bool = !the_bool;
}
}
}
Here I compile it with jdk1.7 (the results with eclipse compiler are the same)
So my conclusion is that there is overhead.

Multithreading programming in Java, using semaphores

I'm learning Java multithreading and I have a problem: I can't understand semaphores. How can I execute threads in this order? For example, in image 1, the 5th thread starts running only when the 1st and 2nd have finished executing.
Image 2: [image]
Image 1: [image]
I have uploaded the images for better understanding. :))

Usually in Java you use mutexes (also called monitors), which prevent two or more threads from accessing the code region protected by that mutex at the same time.
That code region is defined using the synchronized statement:
synchronized(mutex) {
// mutual exclusive code begin
// ...
// ...
// mutual exclusive code end
}
where mutex is defined as e.g:
Object mutex = new Object();
To prevent a task from being started, you need more advanced techniques such as barriers, defined in the java.util.concurrent package.
But first make yourself comfortable with the synchronized statement.
If you think that you will often use multithreading in Java, you might want to read
"Java Concurrency in Practice".

Synchronized is used so that threads enter that method or that portion of the code one at a time. If you want to
public class CountingSemaphore {
private int value = 0;
private int waitCount = 0;
private int notifyCount = 0;
public CountingSemaphore(int initial) {
if (initial > 0) {
value = initial;
}
}
public synchronized void waitForNotify() {
if (value <= waitCount) {
waitCount++;
try {
do {
wait();
} while (notifyCount == 0);
} catch (InterruptedException e) {
notify();
} finally {
waitCount--;
}
notifyCount--;
}
value--;
}
public synchronized void notifyToWakeup() {
value++;
if (waitCount > notifyCount) {
notifyCount++;
notify();
}
}
}
This is an implementation of a counting semaphore. It maintains the counter variables ‘value’, ‘waitCount’ and ‘notifyCount’. A thread calling waitForNotify() waits if value is less than or equal to waitCount, and keeps waiting until notifyCount is non-zero.
You can use Java Counting Semaphore. Conceptually, a semaphore maintains a set of permits. Each acquire() blocks if necessary until a permit is available, and then takes it. Each release() adds a permit, potentially releasing a blocking acquirer. However, no actual permit objects are used; the Semaphore just keeps a count of the number available and acts accordingly.
Semaphores are often used to restrict the number of threads that can access some (physical or logical) resource. For example, here is a class that uses a semaphore to control access to a pool of items:
class Pool {
private static final int MAX_AVAILABLE = 100;
private final Semaphore available = new Semaphore(MAX_AVAILABLE, true);
public Object getItem() throws InterruptedException {
available.acquire();
return getNextAvailableItem();
}
public void putItem(Object x) {
if (markAsUnused(x))
available.release();
}
// Not a particularly efficient data structure; just for demo
protected Object[] items = ... whatever kinds of items being managed
protected boolean[] used = new boolean[MAX_AVAILABLE];
protected synchronized Object getNextAvailableItem() {
for (int i = 0; i < MAX_AVAILABLE; ++i) {
if (!used[i]) {
used[i] = true;
return items[i];
}
}
return null; // not reached
}
protected synchronized boolean markAsUnused(Object item) {
for (int i = 0; i < MAX_AVAILABLE; ++i) {
if (item == items[i]) {
if (used[i]) {
used[i] = false;
return true;
} else
return false;
}
}
return false;
}
}
Before obtaining an item each thread must acquire a permit from the semaphore, guaranteeing that an item is available for use. When the thread has finished with the item it is returned back to the pool and a permit is returned to the semaphore, allowing another thread to acquire that item. Note that no synchronization lock is held when acquire() is called as that would prevent an item from being returned to the pool. The semaphore encapsulates the synchronization needed to restrict access to the pool, separately from any synchronization needed to maintain the consistency of the pool itself.
A semaphore initialized to one, and which is used such that it only has at most one permit available, can serve as a mutual exclusion lock. This is more commonly known as a binary semaphore, because it only has two states: one permit available, or zero permits available. When used in this way, the binary semaphore has the property (unlike many Lock implementations), that the "lock" can be released by a thread other than the owner (as semaphores have no notion of ownership). This can be useful in some specialized contexts, such as deadlock recovery.
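Applied to the ordering in the question, one possible sketch uses a semaphore that starts with zero permits: threads 1 and 2 each release a permit when they finish, and thread 5 acquires two permits before it starts its work. Since the images are not available, only the single dependency described in the text (5 after 1 and 2) is modeled. OrderingDemo, the log list, and the labels T1, T2 and T5 are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Semaphore;

class OrderingDemo {
    // Returns the order in which the three threads did their work.
    static List<String> run() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        Semaphore firstTwoDone = new Semaphore(0);   // no permits until T1/T2 finish

        Thread t1 = new Thread(() -> {
            log.add("T1");
            firstTwoDone.release();                  // signal: T1 finished
        });
        Thread t2 = new Thread(() -> {
            log.add("T2");
            firstTwoDone.release();                  // signal: T2 finished
        });
        Thread t5 = new Thread(() -> {
            try {
                firstTwoDone.acquire(2);             // blocks until both signals arrive
                log.add("T5");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Start T5 first to show that start order does not decide execution order.
        t5.start();
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
            t5.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

T1 and T2 may appear in either order, but T5 is always last, because it cannot obtain two permits before both of the other threads have released one.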
