I've been reading Doug Lea's 'Concurrent Programming in Java'. As you may know, Doug wrote much of the original Java concurrency API. However, something has caused me some confusion and I was hoping to get a few opinions on this little conundrum!
Take the following code from Doug Lea's queuing example...
class LinkedQueue {
    protected Node head = new Node(null);
    protected Node last = head;
    protected final Object pollLock = new Object();
    protected final Object putLock = new Object();

    public void put(Object x) {
        Node node = new Node(x);
        synchronized (putLock) {         // insert at end of list
            synchronized (last) {
                last.next = node;        // extend list
                last = node;
            }
        }
    }

    public Object poll() {               // returns null if empty
        synchronized (pollLock) {
            synchronized (head) {
                Object x = null;
                Node first = head.next;  // get to first real node
                if (first != null) {
                    x = first.object;
                    first.object = null; // forget old object
                    head = first;        // first becomes new head
                }
                return x;
            }
        }
    }

    static class Node {                  // local node class for queue
        Object object;
        Node next = null;
        Node(Object x) { object = x; }
    }
}
This is quite a nice queue. It uses two monitors so that a producer and a consumer can access the queue at the same time. Nice! However, the synchronization on 'last' and 'head' is confusing me here. The book states this is needed for the situation where the queue currently has, or is about to have, 0 entries. OK, fair enough, and this kind of makes sense.
However, I then looked at the java.util.concurrent LinkedBlockingQueue. The original version of that queue doesn't synchronize on head or tail (I also wanted to post a link to the modern version, which has the same apparent problem, but I couldn't because I'm a newbie). I wonder why not? Am I missing something here? Is there some part of the idiosyncratic nature of the Java Memory Model that I'm missing? I would have thought this synchronization is needed for visibility purposes. I'd appreciate some expert opinions!
In the version you linked to, as well as the version in the latest JRE, the item field inside the Node class is volatile, which forces reads and writes of it to be visible to all other threads. Here is a more in-depth explanation: http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile
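For reference, the node structure being discussed looks roughly like this (a from-memory sketch, not the exact JDK source):

// Sketch of the node being discussed: the volatile item field is what
// makes a value written under the put lock visible to a poll() that
// runs under a different lock.
static class Node<E> {
    volatile E item;   // payload; volatile for cross-thread visibility
    Node<E> next;      // guarded by the queue's put/take locks
    Node(E x) { item = x; }
}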
The subtlety here is that synchronized(null) would throw a NullPointerException, so neither head nor last is allowed to become null. They are both initialized to the same dummy node, which is never returned from or removed from the list.
put() and poll() are synchronized on two different locks. The methods would only need to synchronize on the same lock to be thread-safe with respect to one another if they could modify the same value from different threads. The only situation in which that happens is when head == last (i.e. they are the same object, referenced through different member variables). This is why the code synchronizes on head and last: most of the time these will be fast, uncontended locks, but occasionally head and last will be the same instance and one of the threads will have to block the other.
The only time visibility is an issue is when the queue is nearly empty; the rest of the time, put() and poll() work on different ends of the queue and don't interfere with each other.
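To make the contention pattern concrete, here is a minimal harness for the class above (the demo class and counts are mine, and it assumes LinkedQueue is visible from the same package); the two threads only ever contend on the same node when the queue drains to empty:

public class LinkedQueueDemo {
    public static void main(String[] args) {
        LinkedQueue q = new LinkedQueue();
        Thread producer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) q.put(i);   // locks putLock, then last
        });
        Thread consumer = new Thread(() -> {
            int seen = 0;
            while (seen < 1000) {
                Object x = q.poll();                   // locks pollLock, then head
                if (x != null) seen++;                 // null means queue was empty
            }
        });
        producer.start();
        consumer.start();
    }
}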
I saw this self-implemented bounded blocking queue.
A change was made to it, aiming to eliminate contention by replacing notifyAll with notify.
But I don't quite get the point of the two extra variables that were added: waitOfferCount and waitPollCount.
Their initial values are both 0.
My understanding is that the purpose of the two variables is to avoid useless notify calls when nothing is waiting on the object. But what harm would it do if it weren't done this way?
Another thought is that they may have something to do with the switch from notifyAll to notify, but again I think we could safely use notify even without them?
Full code below:
class FairnessBoundedBlockingQueue implements Queue {
    protected final int capacity;
    protected Node head;
    protected Node tail;

    // guard: canPollCount, head
    protected final Object pollLock = new Object();
    protected int canPollCount;
    protected int waitPollCount;

    // guard: canOfferCount, tail
    protected final Object offerLock = new Object();
    protected int canOfferCount;
    protected int waitOfferCount;

    public FairnessBoundedBlockingQueue(int capacity) {
        this.capacity = capacity;
        this.canPollCount = 0;
        this.canOfferCount = capacity;
        this.waitPollCount = 0;
        this.waitOfferCount = 0;
        this.head = new Node(null);
        this.tail = head;
    }

    public boolean offer(Object obj) throws InterruptedException {
        synchronized (offerLock) {
            while (canOfferCount <= 0) {
                waitOfferCount++;
                offerLock.wait();
                waitOfferCount--;
            }
            Node node = new Node(obj);
            tail.next = node;
            tail = node;
            canOfferCount--;
        }
        synchronized (pollLock) {
            ++canPollCount;
            if (waitPollCount > 0) {
                pollLock.notify();
            }
        }
        return true;
    }

    public Object poll() throws InterruptedException {
        Object result;
        synchronized (pollLock) {
            while (canPollCount <= 0) {
                waitPollCount++;
                pollLock.wait();
                waitPollCount--;
            }
            result = head.next.value;
            head.next.value = null;
            head = head.next;
            canPollCount--;
        }
        synchronized (offerLock) {
            canOfferCount++;
            if (waitOfferCount > 0) {
                offerLock.notify();
            }
        }
        return result;
    }
}
You would need to ask the authors of that change what they thought they were achieving with that change.
My take is as follows:
Changing from notifyAll() to notify() is a good thing. If there are N threads waiting on a queue's offerLock or pollLock, then this avoids N - 1 unnecessary wakeups.
It seems that the counters are being used to avoid calling notify() when there isn't a thread waiting. This looks to me like a doubtful optimization. AFAIK a notify on a monitor that nothing is waiting on is very cheap, so this may make a small difference ... but it is unlikely to be significant.
If you really want to know, write some benchmarks. Write four versions of this class: no optimization, the notify optimization, the counter optimization, and both together. Then compare the results ... for different levels of queue contention.
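For example, a crude harness along these lines could drive each variant (a sketch only: the timedRun name and pairing scheme are mine, and for trustworthy numbers a proper benchmarking tool such as JMH is preferable):

// Crude throughput harness. Producer and consumer op counts are equal,
// so neither side blocks forever on the bounded queue.
static long timedRun(FairnessBoundedBlockingQueue q, int pairs, int ops)
        throws InterruptedException {
    Thread[] threads = new Thread[2 * pairs];
    for (int i = 0; i < pairs; i++) {
        threads[2 * i] = new Thread(() -> {
            try {
                for (int j = 0; j < ops; j++) q.offer(new Object());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        threads[2 * i + 1] = new Thread(() -> {
            try {
                for (int j = 0; j < ops; j++) q.poll();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }
    long start = System.nanoTime();
    for (Thread t : threads) t.start();
    for (Thread t : threads) t.join();
    return System.nanoTime() - start;   // wall-clock time for all ops
}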
I'm not sure what "fairness" is supposed to mean here, but I can't see anything in this class to guarantee that threads that are waiting in offer or poll get treated fairly.
Another thought is that they may have something to do with the switch from notifyAll to notify, but again I think we can safely use notify even without them?
Yes, since two locks (pollLock and offerLock) are used, it is no problem to change notifyAll to notify even without these two variables. But if you were using a single shared lock, you would have to use notifyAll.
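To see why, consider a hypothetical single-lock variant (the size and removeFirst helpers are illustrative, not from the code above). With one monitor, producers and consumers share the same wait set, so notify() may wake another consumer instead of the blocked producer that actually needs to run:

// Single-lock sketch: every waiter, producer or consumer, waits on
// `this`, so waking one arbitrary thread can strand the queue.
public synchronized Object poll() throws InterruptedException {
    while (size == 0) wait();   // consumers park on the same monitor as producers
    Object x = removeFirst();
    size--;
    notifyAll();                // must wake all waiters, not one at random
    return x;
}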
My understanding is that the 2 variables purpose is that you won't do useless notify calls when there's nothing wait on the object. But what harm would it do if not done this way?
Yes, these two variables are there to avoid useless notify calls, but they also add bookkeeping of their own. I think benchmarking would be needed to determine the performance in different scenarios.
Besides:
1. As a blocking queue, it should implement the BlockingQueue interface, in which both poll and offer are non-blocking methods; the blocking operations should be named take and put.
2. This is not a fair queue.
I have been learning Java concurrent programming recently. I know that the final keyword can guarantee safe publication. However, when I read the LinkedBlockingQueue source code, I noticed that the head and last fields do not use the final keyword. The enqueue method is called from the put method, and enqueue directly assigns a value to last.next. Since last is not declared final, couldn't last be null at that point? Is my understanding correct? The lock can guarantee thread-safe reads and writes of last, but can it guarantee that last holds its correct initial value rather than null?
public class LinkedBlockingQueue<E> extends AbstractQueue<E>
        implements BlockingQueue<E>, java.io.Serializable {

    transient Node<E> head;
    private transient Node<E> last;

    public LinkedBlockingQueue(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException();
        this.capacity = capacity;
        last = head = new Node<E>(null);
    }

    private void enqueue(Node<E> node) {
        // assert putLock.isHeldByCurrentThread();
        // assert last.next == null;
        last = last.next = node;
    }

    public void put(E e) throws InterruptedException {
        if (e == null) throw new NullPointerException();
        // Note: convention in all put/take/etc is to preset local var
        // holding count negative to indicate failure unless set.
        int c = -1;
        Node<E> node = new Node<E>(e);
        final ReentrantLock putLock = this.putLock;
        final AtomicInteger count = this.count;
        putLock.lockInterruptibly();
        try {
            /*
             * Note that count is used in wait guard even though it is
             * not protected by lock. This works because count can
             * only decrease at this point (all other puts are shut
             * out by lock), and we (or some other waiting put) are
             * signalled if it ever changes from capacity. Similarly
             * for all other uses of count in other wait guards.
             */
            while (count.get() == capacity) {
                notFull.await();
            }
            enqueue(node);
            c = count.getAndIncrement();
            if (c + 1 < capacity)
                notFull.signal();
        } finally {
            putLock.unlock();
        }
        if (c == 0)
            signalNotEmpty();
    }
}
According to this blog post, https://shipilev.net/blog/2014/safe-public-construction/, writing even one final field in the constructor is enough to achieve safe initialization (and thus your object will always be published safely). And the capacity field is declared final.
In short, we emit a trailing barrier in three cases:
A final field was written. Notice we do not care about what field was actually written, we unconditionally emit the barrier before exiting the (initializer) method. That means if you have at least one final field write, the final fields semantics extend to every other field written in constructor.
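A minimal sketch of what that quote means for the fields in question (the class name here is mine):

// One final write in the constructor emits the trailing barrier, so the
// non-final head/last written alongside it are published safely too
// (per the blog post's analysis of HotSpot's behaviour).
class SafeInit {
    static class Node { Object item; Node(Object item) { this.item = item; } }

    private final int capacity;   // the single final write
    private Node head;            // non-final ...
    private Node last;            // ... but covered by the same barrier

    SafeInit(int capacity) {
        this.capacity = capacity;
        this.last = this.head = new Node(null);
    }
}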
Maybe you are misunderstanding Java's chained assignment:

// last is first initialized in the constructor:
last = head = new Node<E>(null); // last itself is not null; only its fields (item & next) are
// enqueue:
last = last.next = node;
// which is equivalent to:
last.next = node;
last = last.next;

An NPE could only occur if last itself were null when last.next is evaluated, and that never happens.
You are correct that last refers to a node holding a null value. However, this is intentional. The lock is only meant to ensure that each thread performs its modifications to this class correctly.
Sometimes using null values is intentional, to indicate a lack of value (empty queue in this case). Because the variable is private it can only be modified from within the class, so as long as the one writing the class is aware of the possibility of null, everything is alright.
I think you are confusing multiple different concepts which are not necessarily connected. Note that because last is private there is no publication. In addition head and last are meant to be modified, so they can't be final.
Edit
Perhaps I misunderstood your question...
null is never assigned to last directly, so the only place last could be null is in the constructor, before new Node<E>(null) is assigned to it. And although we can be sure that the constructor finishes before the object is used by many threads, on its own that gives no visibility guarantee for the field values.
However, put uses a lock, which does guarantee visibility. So if no lock were used, last could indeed appear to be null.
I'm learning about threads, locks, etc. Therefore, I don't want to use the synchronized keyword or any thread-safe class other than Semaphore and ReentrantLock (and no Atomic variables).
I want a kind of synchronized LinkedList<T> of Node<T>, ordered by the size of T (assume T implements an interface that has size and increment functions, plus lock and unlock functions). I want to be able to swap two Nodes according to their T.getSize() values without locking the whole list.
For example, if I only had one thread the function will be a "classic" replace function, something like that:
public void IncrementAndcheckReplace(Node<T> node)
{
    node.getData().incrementSize();
    Node nextNode = node.getNext();
    if (nextNode == null)
        return;
    while (node.getData().getSize() > nextNode.getData().getSize())
    {
        node.getPrev().setNext(nextNode);
        nextNode.setPrev(node.getPrev());
        Node nextnext = nextNode.getNext();
        nextNode.setNext(node);
        node.setPrev(nextNode);
        node.setNext(nextnext);
        nextnext.setPrev(node);
        nextNode = node.getNext();
        if (nextNode == null)
            break;
    }
}
Now let's get to the synchronization problem.
I thought about doing something like this to create a lock for my Nodes:
public void IncrementAndcheckReplace(Node<T> node)
{
    node.lock(); // using a fair ReentrantLock for this specific node
    node.getData().incrementSize();
    Node nextNode = node.getNext();
    if (nextNode == null)
    {
        node.unlock();
        return;
    }
    nextNode.lock();
    while (node.getData().getSize() > nextNode.getData().getSize())
    {
        Node prev = node.getPrev();
        if (prev != null)
        {
            prev.lock();
            prev.setNext(nextNode);
        }
        nextNode.setPrev(prev);
        Node nextnext = nextNode.getNext();
        if (nextnext != null)
        {
            nextnext.lock();
            nextnext.setPrev(node);
        }
        nextNode.setNext(node);
        node.setPrev(nextNode);
        node.setNext(nextnext);
        if (prev != null)
            prev.unlock();
        if (nextnext != null)
            nextnext.unlock();
        nextNode.unlock();
        nextNode = node.getNext();
        if (nextNode == null)
            break;
        nextNode.lock();
    }
    node.unlock();
}
The problem is that this is not thread-safe at all, and deadlock may happen. For example, let's assume we have Node a and Node b with a.next == b and b.prev == a. If thread A now tries to use the replace function on a while thread B tries to use it on b, they will both block and I will get nowhere.
How can I make the replace function thread-safe without locking the entire list? I want to avoid deadlock and starvation.
thanks!
The most general answer, permitting the most concurrency, is to lock all of the four nodes involved in the reordering. After they are all locked, check that the ordering hasn't changed - perhaps aborting and retrying if it has - then do the reordering, then release the locks in reverse order.
The tricky part is that, to avoid deadlock, the nodes have to be locked in order according to some fixed order. Unfortunately, ordering them by position in the list won't work, since that ordering can change. The nodes' hashCode and identityHashCode aren't guaranteed to work, since there can be collisions. You'll need to provide some ordering of your own, for example by giving each node a unique permanent ID on construction, which can then be used for the locking order.
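A sketch of that idea, keeping to the question's constraint of using only ReentrantLock (the id counter and lockPair helper are illustrative names of mine, not part of the question's code):

import java.util.concurrent.locks.ReentrantLock;

class Node<T> {
    private static long nextId = 0;
    private static final ReentrantLock ID_LOCK = new ReentrantLock();

    final long id;                                      // permanent, unique
    final ReentrantLock lock = new ReentrantLock(true); // fair, per the question

    Node() {
        ID_LOCK.lock();
        try { id = nextId++; } finally { ID_LOCK.unlock(); }
    }

    // Always acquire locks in ascending id order, regardless of where the
    // nodes currently sit in the list; callers unlock in reverse order.
    static void lockPair(Node<?> a, Node<?> b) {
        Node<?> first  = (a.id < b.id) ? a : b;
        Node<?> second = (a.id < b.id) ? b : a;
        first.lock.lock();
        second.lock.lock();
    }
}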
I'm implementing a concurrent skip list map based on Java's ConcurrentSkipListMap, the differences being that I want the list to allow duplicates, and I also want the list to be indexable (so that finding the Nth element of the list takes O(lg(n)) time, instead of O(n) time as with a standard skip list). These modifications aren't presenting a problem.
In addition, the skip list's keys are mutable. For example, if the list elements are the integers {0, 4, 7}, then the middle element's key can be changed to any value in [0, 7] without prompting a change to the list structure; if the key changes to (-inf, -1] or [8, +inf) then the element is removed and re-added to maintain the list order. Rather than implementing this as a removal followed by a O(lg(n)) insert, I implement this as a removal followed by a linear traversal followed by an O(1) insert (with an expected runtime of O(1) - 99% of the time the node will be swapped with an adjacent node).
Inserting a completely new node is rare (after startup), and deleting a node (without immediately re-adding it) never occurs; almost all of the operations are elementAt(i) to retrieve the element at the ith index, or operations to swap nodes after a key is modified.
The problem I'm running into is in how to implement the key modification class(es). Conceptually, I'd like to do something like
public class Node implements Runnable {
    private int key;
    private Node prev, next;
    private BlockingQueue<Integer> queue;

    public void update(int i) {
        queue.offer(i);
    }

    public void run() {
        while (true) {
            int temp;
            try {
                temp = queue.take();   // take() can block and be interrupted
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            temp += key;
            if (prev.getKey() > temp) {
                // remove node, update key to temp, perform backward linear traversal, and insert
            } else if (next.getKey() < temp) {
                // remove node, update key to temp, perform forward linear traversal, and insert
            } else {
                key = temp;            // node doesn't change position
            }
        }
    }
}
(The insert sub-method being called from run uses CAS in order to handle the problem of two nodes attempting to simultaneously insert at the same location (similar to how the ConcurrentSkipListMap handles conflicting inserts) - conceptually this is the same as if the first node locked the nodes adjacent to the insertion point, except that the overhead is reduced for the case where there's no conflict.)
This way I can ensure that the list is always in order (it's okay if a key update is a bit delayed, because I can be certain that the update will eventually happen; however, if the list became unordered then things might go haywire). The problem is that implementing the list this way will generate an awful lot of threads, one per Node (with several thousand nodes in the list); most of them will be blocked at any given point in time, but I'm concerned that several thousand blocked threads will still result in too much overhead.
Another option is to make the update method synchronized and remove the Runnable interface from Node, so that rather than having two threads enqueuing updates in the Node which then takes care of processing these updates on its separate thread, the two threads would instead take turns executing the Node#update method. The problem is that this could potentially create a bottleneck; if eight different threads all decided to update the same node at once then the queue implementation would scale just fine, but the synchronized implementation would block seven out of the eight threads (and would then block six threads, then five, etc).
So my question is, how would I implement something like the queue implementation except with a reduced number of threads, or else how would I implement something like the synchronized implementation except without the potential bottleneck problem.
I think I may be able to solve this with a ThreadPoolExecutor, something like
public class Node {
    private int key;
    private Node prev, next;
    private ConcurrentLinkedQueue<Integer> queue;
    private AtomicBoolean lock = new AtomicBoolean(false);
    private ThreadPoolExecutor executor;
    private UpdateNode updater = new UpdateNode();

    public void update(int i) {
        queue.offer(i);
        if (lock.compareAndSet(false, true)) {
            executor.execute(updater);
        }
    }

    private class UpdateNode implements Runnable {
        public void run() {
            do {
                try {
                    int temp = key;
                    while (!queue.isEmpty()) {
                        temp += queue.poll();
                    }
                    if (prev.getKey() > temp) {
                        // remove node, update key to temp, perform backward linear traversal, and insert
                    } else if (next.getKey() < temp) {
                        // remove node, update key to temp, perform forward linear traversal, and insert
                    } else {
                        key = temp; // node doesn't change position
                    }
                } finally {
                    lock.set(false);
                }
            } while (!queue.isEmpty() && lock.compareAndSet(false, true));
        }
    }
}
This way I get the advantages of the queue approach without having a thousand threads sitting blocked: I instead execute an UpdateNode each time I need to update a node (unless an UpdateNode is already being executed on that Node, hence the AtomicBoolean acting as a lock), and rely on the ThreadPoolExecutor to make it inexpensive to run several thousand Runnables.
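For what it's worth, the executor would typically be a single pool shared by all nodes rather than one per node, along these lines (a fragment; the pool sizing here is arbitrary and java.util.concurrent imports are assumed):

// Shared pool: each queued UpdateNode drains a whole node's queue,
// so a small fixed pool is usually enough.
ThreadPoolExecutor shared = new ThreadPoolExecutor(
        4, 4, 0L, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<Runnable>());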
I defined an Element class:
class Element<T> {
    T value;
    Element<T> next;

    Element(T value) {
        this.value = value;
    }
}
I also defined a List class based on Element. It is a typical list, just like in any data structures book, with addHead, delete, etc. operations:
public class List<T> implements Iterable<T> {
    private Element<T> head;
    private Element<T> tail;
    private long size;

    public List() {
        this.head = null;
        this.tail = null;
        this.size = 0;
    }

    public void insertHead(T node) {
        Element<T> e = new Element<T>(node);
        if (size == 0) {
            head = e;
            tail = e;
        } else {
            e.next = head;
            head = e;
        }
        size++;
    }

    // Other method code omitted
}
How do I make this List class thread safe?
Put synchronized on all methods? That doesn't seem to work: two threads may work on different methods at the same time and cause a collision.
If I had used an array to keep all the elements in the class, then I might use volatile on the array to make sure only one thread works with the internal elements at a time. But currently all the elements are linked through object references via each element's next pointer, so I have no way to use volatile.
Using volatile on head, tail and size? This may cause deadlocks if two threads running different methods each hold a resource the other is waiting for.
Any suggestions?
If you put synchronized on every method, the data structure WILL BE thread-safe. Because by definition, only one thread will be executing any method on the object at a time, and inter-thread ordering and visibility is also ensured. So it is as good as if one thread is doing all operations.
Putting a synchronized(this) block won't be any different if the area the block covers is the whole method. You might get better performance if the area is smaller than that.
Doing something like
private final Object LOCK = new Object();

public void method() {
    synchronized (LOCK) {
        doStuff();
    }
}
is considered good practice, although not for performance reasons. Doing this ensures that nobody else can use your lock and, for example, unintentionally create a deadlock-prone implementation.
In your case, I think you could use a ReadWriteLock to get better read performance. As the name suggests, a ReadWriteLock lets multiple threads through if they are accessing a "read method", i.e. a method that does not mutate the state of the object (of course you have to correctly identify which of your methods are "read methods" and "write methods", and use the ReadWriteLock accordingly!). It also ensures that no other thread is accessing the object while a "write method" is executed, and it takes care of the scheduling of the read/write threads.
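A minimal sketch of that approach applied to the List from the question (it reuses the question's Element class; only insertHead and a size accessor are shown, and the class name is mine):

import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RwList<T> {
    private final ReadWriteLock rwLock = new ReentrantReadWriteLock();
    private Element<T> head, tail;
    private long size;

    public void insertHead(T node) {
        rwLock.writeLock().lock();   // exclusive: mutates head/tail/size
        try {
            Element<T> e = new Element<T>(node);
            if (size == 0) { head = e; tail = e; }
            else { e.next = head; head = e; }
            size++;
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    public long size() {
        rwLock.readLock().lock();    // shared: many readers may overlap
        try { return size; } finally { rwLock.readLock().unlock(); }
    }
}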
Another well-known way of making a class thread-safe is "copy on write", where you copy the whole data structure upon each mutation. This is only recommended when the object is mostly read and rarely written.
Here is a sample implementation of that strategy.
http://www.codase.com/search/smart?join=class+java.util.concurrent.CopyOnWriteArrayList
private volatile transient E[] array;

/**
 * Returns the element at the specified position in this list.
 *
 * @param index index of element to return.
 * @return the element at the specified position in this list.
 * @throws IndexOutOfBoundsException if index is out of range
 *         (index < 0 || index >= size()).
 */
public E get(int index) {
    E[] elementData = array();
    rangeCheck(index, elementData.length);
    return elementData[index];
}

/**
 * Appends the specified element to the end of this list.
 *
 * @param element element to be appended to this list.
 * @return true (as per the general contract of Collection.add).
 */
public synchronized boolean add(E element) {
    int len = array.length;
    E[] newArray = (E[]) new Object[len + 1];
    System.arraycopy(array, 0, newArray, 0, len);
    newArray[len] = element;
    array = newArray;
    return true;
}
Here, the read method proceeds without taking any lock, while the write method has to be synchronized. Inter-thread ordering and visibility for read methods are ensured by the use of volatile for the array.
The reason the write methods have to "copy" is that the assignment array = newArray has to happen in one shot (in Java, assignment of an object reference is atomic), and you may not touch the original array during the manipulation.
I'd look at the source code for the java.util.LinkedList class for a real implementation.
Synchronized by default will lock on the instance of the class, which may not be what you want (especially if Element is externally accessible). If you synchronize all the methods on the same lock, you'll have poor concurrent performance, but it will prevent them from executing at the same time, effectively single-threading access to the class.
Also, I see a tail reference, but I don't see a corresponding prev field in Element for a doubly linked list. Any reason?
I'd suggest using a ReentrantLock, which you can pass to every element of the list, though you will have to use a factory to instantiate every element.
Any time you need to take something out of the list, you will lock on that very same lock, so you can be sure that no two threads are modifying it at the same time.
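A minimal sketch of that suggestion (the two-argument Element constructor is hypothetical; the question's Element would need an extra field to hold the shared lock):

import java.util.concurrent.locks.ReentrantLock;

public class ElementFactory<T> {
    private final ReentrantLock lock = new ReentrantLock();

    public Element<T> create(T value) {
        return new Element<T>(value, lock);   // every element shares one lock
    }
}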