It is often said that one advantage of Lock (java.util.concurrent.locks.Lock) over intrinsic locks is that Lock facilitates "chain locking". Chain locking means: hold the lock for A, acquire B, release A once B has been acquired, then acquire C, and so on.
I am just curious, have you guys encountered any situation in which the use of chain locking was necessary?
Cheers,
Vic
Any situation where you have a series of critical sections which are mutually independent, but you wish to execute in order would be appropriate.
Think of this like a burrito bar, you have a queue of consumers, and four or so workers on the other side. You don't want any consumers to skip ahead of others, nor do you want any of the workers to serve more than one consumer at a time. You could create queues between each server, however you know that the pipeline is strictly sequential, and sometimes that abstraction isn't the best representation in code.
HOWEVER, you may have exceptional handling where you want to be able to acquire one of the stages of the pipeline. E.g., the cashier at the end. If someone comes in for a gift-card, they could skip the queue and go straight to the cashier. This model reduces average wait times/latency, while providing the necessary locks and sequencing guarantees for other workers.
As with anything in computing, there are many ways to achieve the same effect, however the cognitive distance between the domain model and the implementation model impacts code clarity. Therefore if you have an application where you want to ensure that you don't release one resource before you have acquired the next in the sequence, a lock chain is a convenient solution.
Finally, don't forget that Java's synchronized blocks are strictly nested: intrinsic locks are always released in the reverse of the order in which they were acquired, so you cannot release A while still holding B. That is not ideal if you have long, complicated pipelines.
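To make the hand-over-hand idea concrete, here is a minimal sketch of the burrito bar, assuming one ReentrantLock per stage. The names BurritoLine, serve and payOnly are invented for illustration and do not come from any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical pipeline: every stage (worker) is guarded by its own lock.
class BurritoLine {
    private final Lock[] stages;
    // Guarded by the last stage's lock (the cashier).
    private final List<String> cashierLog = new ArrayList<>();

    BurritoLine(int stageCount) {
        stages = new Lock[stageCount];
        for (int i = 0; i < stageCount; i++) {
            stages[i] = new ReentrantLock(true); // fair: waiters are served in arrival order
        }
    }

    // Walks a customer through all stages, releasing the current stage
    // only after the next one has been acquired.
    void serve(String customer) {
        Lock held = stages[0];
        held.lock();
        try {
            for (int i = 1; i < stages.length; i++) {
                Lock next = stages[i];
                next.lock();   // acquire B while still holding A
                held.unlock(); // only now release A
                held = next;
            }
            cashierLog.add(customer); // we hold the cashier lock here
        } finally {
            held.unlock();
        }
    }

    // The gift-card case: go straight to the last stage.
    void payOnly(String customer) {
        Lock cashier = stages[stages.length - 1];
        cashier.lock();
        try {
            cashierLog.add(customer);
        } finally {
            cashier.unlock();
        }
    }

    List<String> log() {
        Lock cashier = stages[stages.length - 1];
        cashier.lock();
        try {
            return new ArrayList<>(cashierLog);
        } finally {
            cashier.unlock();
        }
    }
}
```

The unlock of the previous stage in the middle of the loop is exactly what a synchronized block cannot express, because an intrinsic lock is released only when its block exits.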
I currently have a Spring dispatcher ensuring various concurrency limitation policies based on bounded queues.
Basically, multiple request types are handled, some memory-expensive, others less so. The request threads that happen to hit the memory-expensive tasks put a token in a bounded blocking queue (ArrayBlockingQueue), so that only N of them actually run while the others wait.
Now, the waiting list is managed internally by a ReentrantLock, which in turn leverages a Condition implementation found in AbstractQueuedSynchronizer. It uses a linked list and notifies the longest-waiting thread when a token is removed from the queue.
Now I need a different behavior, so that the list maintained by the Condition is sorted by a user defined priority too (straight one, no counter-starvation measures needed for lower priority requests).
Unfortunately the classes in question have a wall of "final" declarations making it hard to inject this seemingly small behavioral change.
Is there any concurrent data structure out there providing the behavior I'm looking for, or that would allow customization?
Alternatively, are there any suggestions for implementing it without rewriting ArrayBlockingQueue/ReentrantLock/Condition from scratch?
Note: really looking for a bounded blocking queue with priority in the waiting list, other approaches requiring a redesign of the whole application, secondary execution thread pools and the like are unfortunately not feasible (time and material limitations)
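One possible direction, sketched under assumptions: replace the token queue with a small permit counter whose waiters sit in a PriorityQueue, each with its own Condition. This is not a drop-in replacement for ArrayBlockingQueue and is not an existing JDK class. PriorityPermits and its methods are hypothetical names, and interrupt handling is left out to keep the sketch short.

```java
import java.util.PriorityQueue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper: N permits, blocked callers are woken by priority.
class PriorityPermits {
    private static final class Waiter implements Comparable<Waiter> {
        final int priority;
        final long seq;        // arrival order, breaks ties between equal priorities
        final Condition ready;
        boolean granted;

        Waiter(int priority, long seq, Condition ready) {
            this.priority = priority;
            this.seq = seq;
            this.ready = ready;
        }

        @Override
        public int compareTo(Waiter other) {
            int byPriority = Integer.compare(other.priority, priority); // higher first
            return byPriority != 0 ? byPriority : Long.compare(seq, other.seq);
        }
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final PriorityQueue<Waiter> waiters = new PriorityQueue<>();
    private int available;
    private long nextSeq;

    PriorityPermits(int permits) {
        this.available = permits;
    }

    // Blocks until a permit is granted (interrupts are ignored in this sketch).
    void acquire(int priority) {
        lock.lock();
        try {
            if (available > 0) { // free permits imply that nobody is waiting
                available--;
                return;
            }
            Waiter me = new Waiter(priority, nextSeq++, lock.newCondition());
            waiters.add(me);
            while (!me.granted) {
                me.ready.awaitUninterruptibly();
            }
        } finally {
            lock.unlock();
        }
    }

    // Hands the permit directly to the highest-priority waiter, if any.
    void release() {
        lock.lock();
        try {
            Waiter next = waiters.poll();
            if (next == null) {
                available++;
            } else {
                next.granted = true;
                next.ready.signal();
            }
        } finally {
            lock.unlock();
        }
    }

    int waiting() {
        lock.lock();
        try {
            return waiters.size();
        } finally {
            lock.unlock();
        }
    }
}
```

A request thread would call acquire(priority) before the memory-expensive work and release() in a finally block afterwards. Because the permit is handed over directly, a late low-priority caller cannot barge in ahead of a queued high-priority one.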
By clustered environment I mean the same code running on multiple server machines. The scenario I can think of is as follows.
Multiple requests arrive at the same time, on different threads, to update card details based on the expiry time. A snippet of the code follows:
synchronized (card) { // card object
    if (card.isExpired()) {
        updateCard();
    }
}
My understanding is that a synchronized block works at the JVM level, so how is the same exclusion achieved in a multi-server environment?
Please suggest edit to rephrase question. I asked what I can recollect from a question asked to me.
As you said, synchronized block is only for "local JVM" threads.
When it comes to cluster, it is up to you how you drive your distributed transaction.
It really depends where your objects (e.g. card) are stored.
Database - You will probably need to use some locking strategy. Very likely optimistic locking, which stores a version of the entity and checks it whenever a change is made. Or the "safer" pessimistic locking, where you lock the whole row while making changes.
Memory - You will probably need some memory grid solution (e.g. Hazelcast...) and either make use of its transaction support or implement it yourself.
Any other? You will have to specify...
See, in a clustered environment, you will usually have multiple JVMs running the same code. If traffic is high, then actually the number of JVMs could auto-scale and increase (new instances could be spawned). This is one of the reasons why you should be really careful when using static fields to keep data in a distributed environment.
Next, coming to your actual question: if you have a single JVM serving requests, all other threads have to wait to get that lock. If you have multiple JVMs running, a lock acquired by one thread in one JVM will not prevent a thread in a different JVM from acquiring the (conceptually, though not actually, the same) lock.
I am assuming you want to ensure that only one thread at a time can edit the object or perform the action (judging by the method name, updateCard). I suggest you implement optimistic locking (versioning), which Hibernate supports quite easily, to prevent lost updates.
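As a rough illustration of the versioning idea, here is an in-memory sketch in which an AtomicReference stands in for the database row. It is not Hibernate code. VersionedCardStore, Card and renewIfUnchanged are hypothetical names, and the compare-and-set plays the role of an "UPDATE ... WHERE version = ?" statement.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a versioned database row.
class VersionedCardStore {
    // Immutable snapshot of the row: payload plus a version counter.
    static final class Card {
        final long version;
        final boolean expired;

        Card(long version, boolean expired) {
            this.version = version;
            this.expired = expired;
        }
    }

    private final AtomicReference<Card> row =
            new AtomicReference<>(new Card(0, true));

    Card read() {
        return row.get();
    }

    // Succeeds only if nobody has replaced the snapshot we read earlier.
    // Every update installs a new Card object, so reference equality here
    // is equivalent to comparing version numbers.
    boolean renewIfUnchanged(Card seen) {
        if (!seen.expired) {
            return false; // nothing to do
        }
        Card renewed = new Card(seen.version + 1, false);
        return row.compareAndSet(seen, renewed);
    }
}
```

If two servers read the same expired card, only the first update succeeds. The second one sees that the version has moved on and must re-read the row before deciding what to do.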
When there are two concurrent transactions t1 and t2 (I'll skip boilerplate, just assume that I'm doing everything by the book):
Thread A : t1:
it1 = db.findNodes(label);
it1.forEach(node -> println(node.hasLabel(label)));
Thread B : t2:
it2 = db.findNodes(label);
it2.forEach(node -> node.removeLabel(label));
Now, from my point of view, we have a huge inconsistency here: if thread A executes slower than thread B, at the point we check if the node in A has a label label, the result will be false!
As a developer, I understand that since iterators are lazy, this is kind of predictable and I see the reason behind this behaviour, but as an API user, I was really annoyed with the fact that I can't be 100% sure that the nodes that I requested as those having the label, turn out not to have it!
Also, there might be a situation where it's not possible to obtain a write lock on any entity that will guard all those nodes from concurrent modification, hence I can't have consistency even with some fine tools.
I really don't think that this is a bug - rather a feature gone wild. However, I would be really glad to know if there is any solution that will help me with my issue.
Update: here's how this pseudo-race condition happens:
Before: create 100 nodes with :Label
A: get iterator for all nodes with :Label
B: get iterator for all nodes with :Label
A: consume e.g. 50 nodes
B: remove labels from all nodes, commit
A: see the rest of the nodes as the ones not having :Label
You've got a tough question there - since no one has attempted an answer, I'll attempt one.
I think the answer is wrapped up in the details of how neo4j deals with transactions. This particular link dealing with transaction isolation seems very relevant to me, which says:
Transactions in Neo4j use a read-committed isolation level, which
means they will see data as soon as it has been committed and will not
see data in other transactions that have not yet been committed. This
type of isolation is weaker than serialization but offers significant
performance advantages whilst being sufficient for the overwhelming
majority of cases.
I take for granted that you removing those labels is happening within a transaction. My read says that not a single label in Thread A can change, until all of Thread B has finished. This is because you might remove labels from many nodes, but none of that is real/visible to any other thread until the removal transaction commits. At least this is how it should be.
So your race condition here is when Thread A starts while Thread B is executing, but before Thread B commits.
I think your best answer probably comes from the second paragraph of that link:
In addition, the Neo4j Java API (see Advanced Usage) enables explicit
locking of nodes and relationships. Using locks gives the opportunity
to simulate the effects of higher levels of isolation by obtaining and
releasing locks explicitly. For example, if a write lock is taken on a
common node or relationship, then all transactions will serialize on
that lock — giving the effect of a serialization isolation level.
Inside of Thread B, you might want to acquire a read lock on the nodes that you're modifying. That lock would be released when the transaction commits.
I'm not 100% sure of this answer, but I think it makes sense. If someone more experienced can improve on this or contradict, please jump in.
I'm trying to model a situation in Java in which many producers (at least 2) access the same LinkedBlockingQueue at a fixed rate. They produce, put, and then start over again.
I was wondering whether this could eventually lead to race conditions between those producers which try to gain write access on the queue at the same time. Are java.util.concurrent.BlockingQueue's implementations already set up to handle such an issue, or should I manually create mutexes in order to avoid this kind of problems?
Thank you for your attention.
Java's blocking queues are thread-safe for single operations such as put and take. Bulk operations such as addAll are not necessarily performed atomically, so another thread may observe only part of a bulk insert.
So in your case the answer is no: you do not need to handle thread safety yourself, unless you want a producer to add several items to the queue as a single atomic operation.
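A small sketch of the scenario from the question, assuming an unbounded LinkedBlockingQueue and plain threads as producers. ProducerDemo and produce are names made up for this example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ProducerDemo {
    // Starts `producers` threads that each put `itemsEach` items,
    // waits for them to finish and returns the resulting queue size.
    static int produce(int producers, int itemsEach) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        Thread[] threads = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            threads[p] = new Thread(() -> {
                for (int i = 0; i < itemsEach; i++) {
                    try {
                        queue.put(i); // no external mutex needed
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });
            threads[p].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return queue.size();
    }
}
```

With 4 producers and 1000 items each, the queue ends up holding all 4000 items because each put is thread-safe on its own.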
I read the following statement:
ArrayLists are unsynchronized and therefore faster than Vector, but less secure in a multithreaded environment.
I would like to know why unsynchronization can improve the speed, and why it will be less secure?
I will try to address both of your questions:
Improve speed
If the ArrayList were synchronized and multiple threads were trying to read data out of the list at the same time, the threads would have to wait to get an exclusive lock on the list. By leaving the list unsynchronized, the threads don't have to wait and the program will run faster.
Unsafe
If multiple threads are reading and writing to a list at the same time, the threads can see an inconsistent view of the list, and this can cause incorrect behaviour in multi-threaded programs.
The whole point of synchronization is that it means only one thread has access to an object at any given time. Take a box of chocolates as an example. If the box is synchronized (Vector), and you get there first, no one else can take any and you get your pick. If the box is NOT synchronized (ArrayList), anyone walking by can snag a chocolate - It will disappear faster, but you may not get the ones you want.
When multiple threads are reading/writing to a shared memory location, the program might compute incorrect results due to lack of mutual exclusion and proper visibility. Hence lack of synchronization is considered "unsafe". This blog post by Jeremy Manson might provide a good introduction to the topic.
When the JVM executes a synchronized method, it makes sure that the current thread has an exclusive lock on the object on which the method is invoked. Similarly, when the method finishes execution, the JVM releases the lock held by the executing thread. Synchronized methods provide mutual exclusion and visibility guarantees, which are important for the "safety" (i.e. guaranteed correctness) of the executing code. But if only one thread ever accesses the methods of the object, there are no safety issues to worry about. Although JVM performance has improved over the years, uncontended synchronization (i.e. locking/unlocking of objects accessed by only one thread) still takes a non-zero amount of time. For unsynchronized methods, the JVM does not pay this extra penalty, hence they are faster than their synchronized counterparts.
Vectors force their choice on you. All methods are synchronized and it is difficult to use them incorrectly. But when Vectors are used in a single-threaded context, you pay the price for the extra synchronization unnecessarily. ArrayLists leave the choice to you. When used in a multi-threaded context, it is up to you (the programmer) to synchronize the code correctly; but when used in a single-threaded context you are guaranteed not to pay any extra synchronization overhead.
Also, when a collection is populated initially and only read subsequently, ArrayLists perform better even in a multi-threaded context. For example, consider this method:
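One way of making that choice explicit is to wrap an ArrayList with Collections.synchronizedList only where it is actually shared. The sketch below is illustrative. SharedListDemo and its methods are invented names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SharedListDemo {
    // Appends perThread elements from each of `threads` threads
    // and returns the final size of the list.
    static int fill(List<Integer> target, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    target.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return target.size();
    }

    // The wrapper serializes every add, so no element is lost.
    static int synchronizedFill() {
        return fill(Collections.synchronizedList(new ArrayList<>()), 4, 10_000);
    }
}
```

Passing a plain ArrayList to fill instead may lose elements or throw an exception from inside a worker thread, and the outcome varies from run to run. Note that even with the wrapper, iterating over the list still has to be synchronized manually on the list object.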
public synchronized List<String> getList() {
List<String> list = new Vector<String>();
list.add("Foo");
list.add("Bar");
return Collections.unmodifiableList(list);
}
A list is created, populated, and an immutable view of it is safely published. Looking at the code above it is clear that all subsequent uses of this list are reads and won't need any synchronization even when used by multiple threads - the object is effectively immutable. Using a Vector here incurs the synchronization overhead even for reads where it is not needed; using an ArrayList instead would perform better.
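For comparison, a sketch of the same method backed by an ArrayList. ListPublisher is a hypothetical holder class, and the sketch assumes the returned reference reaches other threads through some safe publication mechanism (for example a final field or a synchronized getter, as in the original method).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ListPublisher {
    // Builds the list once; callers can only read the returned view.
    static List<String> getList() {
        List<String> list = new ArrayList<>();
        list.add("Foo");
        list.add("Bar");
        // The backing ArrayList never escapes, so nobody can modify it later.
        return Collections.unmodifiableList(list);
    }
}
```

Reads on the returned list go straight to the ArrayList without acquiring any lock, and any attempt to modify it throws UnsupportedOperationException.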
Data structures that synchronize use locks (or other synchronization constructs) to ensure that their data is always in a consistent state. Oftentimes, this requires that one or more threads wait on another thread to finish updating the structure's state, which will then reduce performance, since a wait has been introduced where before there was none.
2 threads can modify the list at the same time and add a new item or delete/modify the same item in the list at the same time because no synchronization (or lock mechanism if you prefer) exists. So imagine you delete one item of the list while somebody else is trying to work with it or you modify an item while someone uses it, it's not very secure.
http://download.oracle.com/javase/1.4.2/docs/api/java/util/ArrayList.html
Read the "Note that this implementation is not synchronized." paragraph, it explains a bit better.
And I forgot: regarding speed, it is easy to see that when you control access to data, you add mechanisms that prevent others from accessing it. Those mechanisms are extra computation, so the code is slower...
Non-blocking data structures will be faster than ones that block, because of that fact. With blocking data structures, if a resource is acquired by some entity, it takes time for another entity to acquire that same resource once it becomes available.
However, this can be less secure depending on the situation. The main points of contention are during writes. If it can be guaranteed that the data contained in a data structure will not change after it has been added and will only be accessed to read the value, then there will not be a problem. The issues arise when there is a conflict between a write and a read, or between a write and a write.