Which is more efficient and why? - java

Of the two synchronization strategies below, which one is more efficient (in terms of processing and generated bytecode), and in which scenario should one use each of them?
public synchronized void addName(String name)
{
    lastName = name;
    nameCount++;
    nameList.add(name);
}
or
public void addName(String name) {
    synchronized (this) {
        lastName = name;
        nameCount++;
        nameList.add(name);
    }
}
Also, what is the advisable way to handle concurrency:
using java.util.concurrent package
using the above low level methods
using Job or UIJob API (if working in eclipse PDE environment)
Thanks

which one is more efficient (in terms of processing and generated bytecode)
According to Section 1 of this IBM developerWorks article, a synchronized method generates less bytecode than an equivalent synchronized block. The article explains why.
Snippet from the article:
When the JVM executes a synchronized method, the executing thread identifies that the method's method_info structure has the ACC_SYNCHRONIZED flag set, then it automatically acquires the object's lock, calls the method, and releases the lock. If an exception occurs, the thread automatically releases the lock.
Synchronizing a method block, on the other hand, bypasses the JVM's built-in support for acquiring an object's lock and exception handling and requires that the functionality be explicitly written in byte code. If you read the byte code for a method with a synchronized block, you will see more than a dozen additional operations to manage this functionality. Listing 1 shows calls to generate both a synchronized method and a synchronized block:
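Listing 1 itself is not reproduced above; as a rough sketch of the difference (the class name is illustrative; the comments describe what javap typically shows, and the exact bytecode varies by compiler):
public class SyncDemo {
    private int counter;

    // javap -v shows no extra instructions for this method, just the
    // ACC_SYNCHRONIZED flag in its access flags; the JVM acquires and
    // releases the monitor implicitly around the call.
    public synchronized void incrementA() {
        counter++;
    }

    // javap -c shows explicit monitorenter/monitorexit instructions here,
    // plus an exception-table entry that releases the monitor if the
    // body throws.
    public void incrementB() {
        synchronized (this) {
            counter++;
        }
    }
}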
Edited to address first comment
To give other SOers credit, here is a good discussion about why one would use a sync. block. I am sure you can find more interesting discussions if you search around :)
Is there an advantage to use a Synchronized Method instead of a Synchronized Block?
I personally have not had to use a sync. block to lock on any object other than this, but that is one use SOers point out in favour of sync. blocks.

Your updated two pieces of code are semantically identical. However, using a synchronized block as in the second piece allows you more control, as you could synchronize on a different object or, indeed, not synchronize parts of the method that don't need to be.
Using java.util.concurrent is very much preferable to using synchronization primitives wherever possible, since it allows you to work at a higher level of abstraction and use code that was written by very skilled people and tested intensively.
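For instance, here is a minimal sketch of the question's addName rewritten on top of java.util.concurrent types (the choice of CopyOnWriteArrayList is an assumption; note that the three updates are no longer atomic as a group, so if readers must observe them change together, a lock is still required):
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

class Names {
    private final AtomicReference<String> lastName = new AtomicReference<String>();
    private final AtomicInteger nameCount = new AtomicInteger();
    private final List<String> nameList = new CopyOnWriteArrayList<String>();

    public void addName(String name) {
        lastName.set(name);          // atomic write, visible to all threads
        nameCount.incrementAndGet(); // atomic increment
        nameList.add(name);          // thread-safe copy-on-write add
    }
}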
If you're working in Eclipse PDE, using its APIs is most likely preferable, as they tie in with the rest of the platform.

This totally does not matter from any efficiency point of view.
The point of having blocks is you can specify your own lock. You can choose a lock that is encapsulated within the object, as opposed to using this, with the consequence that you have more control over who can acquire the lock (since you can make that lock inaccessible from outside your object).
If you use this as the lock (whether you put synchronized on the method or use the block), anything in your program can acquire the lock on your object, and it's much harder to reason about what your program is doing.
Restricting access to the lock buys you a massive gain in your ability to reason about the program; it's much more beneficial to have that kind of certainty than to shave off a bytecode somewhere.
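A minimal sketch of such an encapsulated lock (field names are illustrative):
class Names {
    // The lock object never escapes this class, so no outside code can acquire it.
    private final Object lock = new Object();
    private String lastName;
    private int nameCount;

    public void addName(String name) {
        synchronized (lock) {
            lastName = name;
            nameCount++;
        }
    }
}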

I know this might be an example, but if you plan on writing such code - think again.
To me it looks like you are duplicating information, and you should not do that unless profiling shows you need the change for performance (which it almost never will).
If you really need this code to run on several threads, I'd make nameList a synchronized list using Collections.synchronizedList.
lastName should become a getter that picks the last element of the list.
nameCount should be the size of the list.
If you do it the way you have now, you must also synchronize all the places where these variables are read, and that would make the code a lot less readable and harder to maintain. A sketch follows.
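A sketch of that refactoring, assuming the getters below are what callers actually need:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Names {
    private final List<String> nameList =
            Collections.synchronizedList(new ArrayList<String>());

    public void addName(String name) {
        nameList.add(name);
    }

    public int getNameCount() {
        return nameList.size();
    }

    public String getLastName() {
        // Compound action (check emptiness, then read the last element), so it
        // must hold the list's own lock across both steps.
        synchronized (nameList) {
            return nameList.isEmpty() ? null : nameList.get(nameList.size() - 1);
        }
    }
}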

You could remove all locking:
import java.util.concurrent.atomic.AtomicReference;

class Names {
    // Head of an immutable linked list; updated atomically via CAS.
    private final AtomicReference<Node> names = new AtomicReference<Node>();

    public void addName(final String name) {
        Node old = names.get();
        // Retry until the CAS succeeds: the classic lock-free update loop.
        while (!names.compareAndSet(old, new Node(old, name))) {
            old = names.get();
        }
    }

    public String getName() {
        final Node node = names.get();
        return (node == null) ? null : node.name;
    }

    static class Node {
        final Node parent;
        final String name;

        Node(final Node parent, final String name) {
            this.parent = parent;
            this.name = name;
        }

        // Walks the chain of parents; O(n) in the number of ancestors.
        int count() {
            int count = 0;
            Node p = parent;
            while (p != null) {
                count++;
                p = p.parent;
            }
            return count;
        }
    }
}
This is basically a Treiber stack implementation. You can get the size, the current name, and you can easily implement an Iterator (albeit reverse to the one in your example) over the contents. Alternative copy-on-write containers could be used as well, depending on your needs.
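A quick usage sketch:
Names names = new Names();
names.addName("alice");
names.addName("bob");
System.out.println(names.getName()); // prints "bob", the most recently added name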

Impossible to say, since the two code snippets aren't equivalent.
The difference (the lack of synchronization of the call to add) may or may not be significant. From what you've given us, it's impossible to say.


Two threads updating the same object, will it work? java

I have two methods as follows:
class A {
    void method1() {
        someObj.setSomeAttribute(true);
        someOtherObj.callMethod(someObj);
    }

    void method2() {
        someObj.setSomeAttribute(false);
        someOtherObj.callMethod(someObj);
    }
}
where in another place that attribute is evaluated:
class B {
    void callMethod(Foo someObj) {
        if (someObj.getAttribute()) {
            //do one thing
        } else {
            //do another thing
        }
    }
}
Note that A.method1 and A.method2 are updating the attribute of the same object. If those 2 methods are run in 2 threads, will this work or will there be unexpected results?
Will there be unexpected results? Yes, guaranteed, in the sense that things you would never want to affect your app (such as the phase of the moon, the current song playing in your winamp, whether your dog is cuddling near the CPU, or whether it's the 5th Tuesday of the month) may have an effect on behaviour. Which you don't want.
What you've described is a violation of the Java memory model: the end result is that any Java implementation is free to return any of multiple values and nevertheless be operating properly according to the Java specification, even if it does so seemingly arbitrarily.
As a general rule, each thread gets an unfair coin. Unfair, in that it will try to mess with you: It'll flip correctly every time when you test it out, and then in production, and only when you're giving a demo to that crucial customer, it'll get ya.
Every time it reads from or writes to any field, it will flip this mean coin. On heads, it will use the actual field. On tails, it will use a local copy it made.
That's oversimplifying the model quite a bit, but it's a good start to try to get your head around how this works.
The way out is to force so-called 'happens-before' relationships: what Java will do is ensure that what you can observe matches these relationships. If event A is defined as having a happens-before relationship with event B, then anything A did will be observed, exactly as is, by B, guaranteed. No more coin flips.
Examples of establishing happens-before relationships include using volatile, synchronized, and any methods that use these things internally.
NB: of course, if your setSomeAttribute method, which you did not paste, includes some happens-before-establishing act, then there's no problem here, but as a rule a method called setX will not be doing that.
An example of one that doesn't:
class Example {
    private String attr;

    public void setAttr(String attr) {
        this.attr = attr;
    }
}
some examples of ones that do:
Let's say method B.callMethod is executed in the same thread as method1 - then you are guaranteed to at least observe the change method1 made, though whether you actually see what method2 did is still a coin flip. What would not be possible is seeing the value the attribute had before either method1 or method2 ran, because code running in a single thread has happens-before across the entire run (any line executed before another in the same thread has a happens-before relationship with it).
The set method looks like:
class Example {
    private String attr;
    private final Object lock = new Object();

    public void setAttr(String attr) {
        synchronized (lock) {
            this.attr = attr;
        }
    }

    public String getAttr() {
        synchronized (lock) {
            return this.attr;
        }
    }
}
Now the get and set ops lock on the same object; that's one of the ways to establish happens-before. Which thread gets to a lock first is observable behaviour: if method1's set got there before B's get, then you are guaranteed to observe method1's set.
More generally, sharing state between threads is extremely tricky and you should endeavour not to do so. The alternatives are:
Initialize all state before starting a thread, then do the job, and only when it is finished relay all results back. Fork/join does this.
Use a messaging system with great concurrency fundamentals, such as a database (which has transactions) or a message queue library.
If you must share state, try to write things in terms of the nice classes in j.u.concurrent; a sketch follows this list.
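As a small illustration of the last point, the question's attribute could live in a j.u.c type instead of a plain field (a sketch with hypothetical names; this gives visibility, though it does not make a set-then-call sequence atomic):
import java.util.concurrent.atomic.AtomicBoolean;

class Foo {
    private final AtomicBoolean attribute = new AtomicBoolean();

    public void setSomeAttribute(boolean value) {
        attribute.set(value);   // atomic write with volatile visibility semantics
    }

    public boolean getAttribute() {
        return attribute.get(); // guaranteed to see the most recent set()
    }
}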
I assume what you expected is that when you call A.method1, someObj.getAttribute() will return true in B.callMethod, and when you call A.method2 it will return false.
Unfortunately, this will not work, because between the setSomeAttribute line and the callMethod line another thread may have changed the value of the attribute.
If you only use the attribute in callMethod, why not just pass the attribute instead of the Foo object? Code as follows:
class A {
    void method1() {
        someOtherObj.callMethod(true);
    }
}

class B {
    void callMethod(boolean flag) {
        if (flag) {
            //do one thing
        } else {
            //do another thing
        }
    }
}
If you must use Foo as the parameter, what you can do is make setAttribute and callMethod atomic as a pair.
The easiest way to achieve that is to make the methods synchronized. Code as follows:
synchronized void method1() {
    someObj.setSomeAttribute(true);
    someOtherObj.callMethod(someObj);
}

synchronized void method2() {
    someObj.setSomeAttribute(false);
    someOtherObj.callMethod(someObj);
}
But this may perform badly; you can achieve the same thing with a more fine-grained lock.
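For example, a dedicated lock shared by just these two methods (a sketch; the fields are as in the question). Unlike synchronizing on this, unrelated synchronized methods of A are not serialized behind the same monitor:
class A {
    private final Object attrLock = new Object();

    void method1() {
        synchronized (attrLock) {
            someObj.setSomeAttribute(true);
            someOtherObj.callMethod(someObj);
        }
    }

    void method2() {
        synchronized (attrLock) {
            someObj.setSomeAttribute(false);
            someOtherObj.callMethod(someObj);
        }
    }
}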

Which synchronize statements are unnecessary here?

First the code fragments...
final class AddedOrders {
    private final Set<Order> orders = Sets.newConcurrentHashSet();
    private final Set<String> ignoredItems = Sets.newConcurrentHashSet();
    private boolean added = false;

    public synchronized void clear() {
        added = false;
    }

    public synchronized void add(Order order) {
        added = orders.add(order);
    }

    public synchronized void remove(Order order) {
        if (added) orders.remove(order);
    }

    public synchronized void ban(String item) {
        ignoredItems.add(item);
    }

    public synchronized boolean has(Order order) {
        return orders.contains(order);
    }

    public synchronized Set<Order> getOrders() {
        return orders;
    }

    public synchronized boolean ignored(String item) {
        return ignoredItems.contains(item);
    }
}
private final AddedOrders added = new AddedOrders();
...
boolean subscribed;
int i = 10;

synchronized (added) {
    while (!(subscribed = client.getSubscribedOrders().containsAll(added.getOrders())) && (i > 0)) {
        Helper.out("...order not subscribed yet (try: %d)", i);
        Thread.sleep(200);
        i--;
    }
}
What I'd like to know...
Could someone point out which synchronized declarations are not necessary?
Of course this is not the full code, but assume that in the full project all methods are called, and that some combinations of methods are called in a check-the-value-first-then-modify style.
added (the class instance) is accessed by multiple threads.
client is part of an external server API that I'm not entirely sure is thread-safe yet, but I think it must be.
ConcurrentHashSet is a Google Guava class, but it is based on ConcurrentHashMap and the docs say it carries all the same concurrency guarantees.
I don't completely understand what all those guarantees are, even though I did some reading. Namely, I know it's not OK to just check and then set a value in a synchronized HashMap without holding a synchronized block on the map across both operations, but I do not know whether you can do that with a ConcurrentHashMap.
The only cases in your code where you really need synchronized are the ones where you test or update the added flag. You need the synchronized block to make sure that changes to the flag are visible across threads, and you also need to make sure that the added flag change is made in step with the change to the orders data structure. The synchronized keyword keeps another thread from barging in and doing something in between checking the flag and changing the data structure (the remove method could be broken like this if you remove the synchronization).
The code toward the end seems problematic because you're locking on the added object and then not letting go of the lock, there's not an opportunity for any other thread to make the changes that the thread is looking for. Although it looks like you're waiting for another object to change, so this criticism may be invalid. Sleeping with a lock held seems dangerous, though. This kind of thing is why Object#wait releases the lock it acquired.
Also note that since you're passing references out to the orders set, code outside this class can add orders. You should do something to protect this internal data, like returning it wrapped in an ImmutableSet so callers can't make changes.
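For example (ImmutableSet is from Guava, which the class already uses; an unmodifiable wrapper would also work):
public synchronized Set<Order> getOrders() {
    // Callers receive a snapshot they cannot use to mutate internal state.
    return ImmutableSet.copyOf(orders);
}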
In general synchronization is used when you want to impose some granularity on changes, where you have 2 or more changes you want made together, without possibility of interleaving. An example is a check-then-act sequence where you execute some code that makes a change based on the value of something else, and you don't want some other thread to execute in between the check and the action (so the decision to act could be made, then the condition that allowed that action changes, so that the action could be invalid). If individual values are changed but they are unrelated, then you can make them volatile or use Atomic variables, and reduce the amount of locking you have to do.
It's a valid point that the synchronized keyword could be removed in cases like the clear method, where the only thing that changes is the added flag, which could be made volatile. The purpose of the added flag continues to elude me, though: anything that enters a value that's already present can turn the flag back to false, so it's not apparent that reasoning about any action based on the current value of the flag makes sense if this structure is being modified concurrently.
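A sketch of that simplification, assuming the flag really only needs visibility (not compound atomicity) in clear():
private volatile boolean added = false;

public void clear() {
    added = false; // a single volatile write: visible to all threads, no lock needed
}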
Without knowing the exact context it's hard to say, but in general, classes created without considering their being used with multiple threads probably need to be reworked extensively before being used in a concurrent environment.

Lazy initialization without synchronization or volatile keyword

The other day Howard Lewis Ship posted a blog entry called "Things I Learned at Hacker Bed and Breakfast", one of the bullet points is:
A Java instance field that is assigned exactly once via lazy initialization does not have to be synchronized or volatile (as long as you can accept race conditions across threads to assign to the field); this is from Rich Hickey
On the face of it this seems at odds with the accepted wisdom about visibility of changes to memory across threads, and if this is covered in the Java Concurrency in Practice book or in the Java language spec then I have missed it. But this was something HLS got from Rich Hickey at an event where Brian Goetz was present, so it would seem there must be something to it. Could someone please explain the logic behind this statement?
This statement sounds a little bit cryptic. However, I guess HLS refers to the case where you lazily initialize an instance field and don't care if several threads perform this initialization more than once.
As an example, I can point to the hashCode() method of the String class:
private int hashCode;

public int hashCode() {
    int hash = hashCode;
    if (hash == 0) {
        if (count == 0) {
            return 0;
        }
        final int end = count + offset;
        final char[] chars = value;
        for (int i = offset; i < end; ++i) {
            hash = 31 * hash + chars[i];
        }
        hashCode = hash;
    }
    return hash;
}
As you can see, access to the hashCode field (which holds the cached value of the computed String hash) is not synchronized, and the field isn't declared volatile. Any thread which calls the hashCode() method will still receive the same value, though the hashCode field may be written more than once by different threads.
This technique has limited usability. IMHO it's usable mostly for cases like the one in the example: a cached primitive/immutable object which is computed from other final/immutable fields, but whose computation in the constructor would be overkill.
Hrm. As I read this it is technically incorrect but okay in practice with some caveats. Only final fields can safely be initialized once and accessed in multiple threads without synchronization.
Lazily initialized fields can suffer from synchronization issues in a number of ways. For example, you can have constructor race conditions where the reference to an object has been published without the object itself being fully initialized.
I think it highly depends on whether you have a primitive field or an object. Primitive fields that can be initialized multiple times, where you don't mind that multiple threads do the initialization, would work fine. However, HashMap-style initialization in this manner may be problematic. Even long values on some architectures may be stored as separate words in multiple operations, so a reader might see half of the value, although I suspect a long would never cross a memory page, so it may never happen in practice.
I think it depends highly on whether the application has any memory barriers - any synchronized blocks or accesses to volatile fields. The devil is certainly in the details here, and the lazy-initialization code may work fine on one architecture with one set of code and not in a different thread model or with an application that synchronizes rarely.
Here's a good piece on final fields as a comparison:
http://www.javamex.com/tutorials/synchronization_final.shtml
As of Java 5, one particular use of the final keyword is a very important and often overlooked weapon in your concurrency armoury. Essentially, final can be used to make sure that when you construct an object, another thread accessing that object doesn't see that object in a partially-constructed state, as could otherwise happen. This is because when used as an attribute on the variables of an object, final has the following important characteristic as part of its definition:
Now, even if the field is marked final, if it is a class, you can modify the fields within the class. This is a different issue and you must still have synchronization for this.
This works fine under some conditions.
it's okay to try to set the field more than once;
it's okay if individual threads see different values.
Often when you create an object which is not changed, e.g. loading a Properties from disk, having more than one copy for a short amount of time is not an issue.
private static Properties prop = null;

public static Properties getProperties() {
    if (prop == null) {
        prop = new Properties();
        try {
            prop.load(new FileReader("my.properties"));
        } catch (IOException e) {
            throw new AssertionError(e);
        }
    }
    return prop;
}
In the short term this is less efficient than using locking, but in the long term it could be more efficient. (Properties actually has a lock of its own, but you get the idea ;)
IMHO, it's not a solution which works in all cases.
Perhaps the point is that you can use more relaxed memory consistency techniques in some cases.
I think the statement is untrue. Another thread can see a partially initialized object, so the reference can be visible to another thread even though the constructor hasn't finished running. This is covered in Java Concurrency in Practice, section 3.5.1:
public class Holder {
    private int n;

    public Holder(int n) { this.n = n; }

    public void assertSanity() {
        if (n != n)
            throw new AssertionError("This statement is false.");
    }
}
This class isn't thread-safe.
If the visible object is immutable, then you are OK, because the semantics of final fields mean you won't see them until the constructor has finished running (section 3.5.2).
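The fix from section 3.5.2 is a one-word change: with n declared final, the Java memory model guarantees that any thread which sees the Holder reference also sees the fully written field:
public class Holder {
    private final int n; // final: safely published once the constructor completes

    public Holder(int n) { this.n = n; }

    public void assertSanity() {
        if (n != n)
            throw new AssertionError("This statement is false.");
    }
}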

Java: How exactly do synchronized operations relate to volatility?

Sorry this is such a long question.
I've been doing lots of research lately into multi-threading as I slowly implement it in a personal project. However, probably due to an abundance of slightly incorrect examples, the use of synchronized blocks and volatility in certain situations is still a bit unclear to me.
My core question is this: Are changes to references and primitives automatically volatile (that is, performed on the main memory and not a cache) when a thread is inside a synchronized block, or does the read also have to be synchronized for it to work properly?
If so: what is the purpose of synchronizing a simple getter method? (see example 1) Also, are ALL changes sent to main memory as long as the thread has synchronized on anything? E.g. if it is sent off to do loads of work all over the place inside a very high-level sync, will every single change then made go to main memory, and nothing ever to cache, until it's unlocked again?
If not: does the change have to be explicitly inside a synchronized block, or can Java actually pick up on, for example, uses of the Lock object? (see example 3)
If either: does the synchronized object need to be related to the reference/primitive being changed in any way (e.g. the immediate object that contains it)? Can I write by syncing on one object and read by syncing on another, if it's otherwise safe? (see example 2)
(please note for the following examples that I know that synchronized methods and synchronized(this) are frowned upon and why, but discussion about that is beyond the scope of my question)
Example 1:
class Counter {
    int count = 0;

    public synchronized void increment() {
        count++;
    }

    public int getCount() {
        return count;
    }
}
In this example, increment() needs to be synchronized since ++ is not an atomic operation. As such, two threads incrementing at the same time may result in an overall increase of only 1 to the count. Reads and writes of the count primitive need to be atomic (i.e. not long/double), and they are, so that's fine.
Does getCount() need to be synchronized here, and why exactly? The explanation I have heard most often is that I will have no guarantee whether the count returned will be the pre- or post-increment value. However, this seems like the explanation for something slightly different that's found itself in the wrong place. I mean, if I were to synchronize getCount(), I still see no such guarantee - it now comes down to not knowing the locking order instead of not knowing whether the actual read happens before or after the actual write.
Example 2:
Is the following example thread-safe, if you assume that through trickery not shown here none of these methods will ever be called at the same time? Will count increment in the expected way if it's incremented via a random one of these methods each time, and then be read properly, or does the lock have to be the same object? (BTW, I fully realise how ridiculous this example is, but I'm more interested in theory than practice.)
class Counter {
    private final Object lock1 = new Object();
    private final Object lock2 = new Object();
    private final Object lock3 = new Object();
    int count = 0;

    public void increment1() {
        synchronized (lock1) {
            count++;
        }
    }

    public void increment2() {
        synchronized (lock2) {
            count++;
        }
    }

    public int getCount() {
        synchronized (lock3) {
            return count;
        }
    }
}
Example 3:
Is the happens-before relationship simply a Java concept, or is it an actual thing built into the JVM? Even though I can guarantee a conceptual happens-before relationship in this next example, is Java smart enough to pick it up if it's a built-in thing? I am assuming it is not, but is this example actually thread-safe? And if it's thread-safe, what about if getCount() did no locking?
class Counter {
    // Lock is an interface; ReentrantLock is the usual implementation.
    private final Lock lock = new ReentrantLock();
    int count = 0;

    public void increment() {
        lock.lock();
        count++;
        lock.unlock();
    }

    public int getCount() {
        lock.lock();
        int count = this.count;
        lock.unlock();
        return count;
    }
}
Yes, the read has to be synchronized as well. This page says:
The results of a write by one thread are guaranteed to be visible to a read by another thread only if the write operation happens-before the read operation.
[...]
An unlock (synchronized block or method exit) of a monitor happens-before every subsequent lock (synchronized block or method entry) of that same monitor
The same page says:
Actions prior to "releasing" synchronizer methods such as Lock.unlock, Semaphore.release, and CountDownLatch.countDown happen-before actions subsequent to a successful "acquiring" method such as Lock.lock
So locks offer the same visibility guarantees as synchronized blocks.
Whether you use synchronized blocks or locks, the visibility is only guaranteed if the reader thread uses the same monitor or lock as the writer thread.
Your Example 1 is incorrect: the getter must be synchronized as well if you want to see the latest value of the count.
Your example 2 is incorrect because it uses different locks to guard the same count.
Your example 3 is OK. If the getter did not lock, you could see an older value of the count. The happens-before is something that is guaranteed by the JVM. The JVM has to respect the rules specified, by flushing caches to the main memory for example.
Try to view it in terms of two distinct, simple operations:
Locking (mutual exclusion),
Memory barrier (cache sync, instruction reordering barrier).
Entering a synchronized block entails both locking and a memory barrier; leaving the synchronized block entails unlocking plus a memory barrier; reading or writing a volatile field entails a memory barrier only. Thinking in these terms, I think you can clarify for yourself all the questions above.
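A conceptual sketch of where those two operations occur (the comments mark the lock and barrier points; real JVMs optimize heavily, but the visibility guarantees hold):
class Demo {
    private final Object lock = new Object();
    private volatile boolean flag;
    private int data;

    void writer() {
        synchronized (lock) { // locking + memory barrier on entry
            data = 42;
        }                     // memory barrier + unlocking on exit
        flag = true;          // volatile write: memory barrier only, no locking
    }
}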
As for Example 1, the reading thread will not have any kind of memory barrier. It's not just a matter of seeing the value before/after the read - the thread may never observe any change to the var after it has started.
Example 2. is the most interesting issue you raise. You are indeed given no guarantees by the JLS in this case. In practice you won't be given any ordering guarantees (it's as if the locking aspect wasn't there at all), but you'll still have the benefit of the memory barriers so you will observe changes, unlike the first example. Basically, this is exactly the same as removing synchronized and tagging the int as volatile (apart from the runtime costs of acquiring locks).
Regarding Example 3, by "just a Java thing" I think you have generics with erasure in mind, something that only static code checking is aware of. It is not like that - both locks and memory barriers are pure runtime artifacts. In fact, the compiler can't reason about them at all.

Java. How to properly synchronize getters and setters?

If I have several mutable properties in an object that will be acted upon by several threads, I understand they should be synchronized.
class Doggie {
    private String name;
    private int age;

    public void setName(String name) { this.name = name; }
    public String getName() { return this.name; }
    public void setAge(int age) { this.age = age; }
    public int getAge() { return this.age; }
}
Questions:
Are not return and assignment atomic operations in Java?
Since properties might not necessarily be interrelated, it does not always make sense to synchronize with the same lock. How to organize the locking structure?
Is it better to go with the intrinsic lock or a private Object lock pattern?
Are not return and assignment atomic operations in Java?
Yes they are atomic (in some cases at least), but atomicity is not the only issue. Another important issue is whether the action of a write to an attribute by one thread is guaranteed to be visible to a following read for the same attribute made by a different thread.
When the reads and writes are in the same thread, the read is guaranteed to see the earlier write.
When the reads and writes are in different threads, the read is only guaranteed to see the earlier write if the two threads synchronize properly ... or if the attribute is declared as volatile.
Note that primitive locks/mutexes are not the only way to synchronize.
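For instance, if each property is only ever read and written individually (no check-then-act sequences), declaring the fields volatile gives the visibility guarantee without any locking; a sketch:
class Doggie {
    private volatile String name;
    private volatile int age;

    public void setName(String name) { this.name = name; } // volatile write: later reads see it
    public String getName() { return name; }
    public void setAge(int age) { this.age = age; }
    public int getAge() { return age; }
}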
Since properties might not necessarily be interrelated, it does not always make sense to synchronize with the same lock. How to organize the locking structure?
It makes sense to use multiple locks if (and only if) lock contention is likely. In your example, lock contention is only likely to be an issue if some Doggie instance receives a very high rate of get and/or set operations.
Is it better to go with the intrinsic lock or a private Object lock pattern?
It depends. If your application is going to use the Doggie object's primitive lock, then you might get lock contention or even unintended locking out of get and set operations. In that case a private lock might be advisable. Otherwise, a private lock is an unnecessary overhead.
Your example begs for an immutable object. http://java.sun.com/docs/books/tutorial/essential/concurrency/imstrat.html
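A sketch of the immutable version: "setters" return a new instance instead of mutating, so instances can be shared freely across threads with no synchronization at all:
final class Doggie {
    private final String name;
    private final int age;

    public Doggie(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String getName() { return name; }
    public int getAge() { return age; }

    public Doggie withName(String name) { return new Doggie(name, age); }
    public Doggie withAge(int age) { return new Doggie(name, age); }
}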
Operations with references are atomic, but not volatile - you will always see the old value or the new value, but there's no guarantee you'll see the new value without some sort of memory barrier. I can't remember the details of which primitives are guaranteed to be atomic - probably all but long and double.
Personally I'd use a single private lock until I saw any evidence that it was a bottleneck. I would advise against locking on "this" as other code might lock on it too. If you're the only code that knows about the lock, it's harder to get interference. Having said that, if callers want to atomically change more than one property, you may want to expose the lock via a property.
Do you definitely need a threadsafe mutable type? If you could avoid that requirement it would make life simpler.
They are atomic operations, but picture a scenario where two clients are trying to get and set a piece of data at the same time. There is no guarantee as to the order in which things are going to be called, which could greatly affect the results of your application. (The classic example is money transactions.)
It may or may not make sense to synchronize with the same lock - that really depends on your application. However, it typically is not a good idea to lock the entire object out if it is not necessary.
As with what Jon said, start with a single, private lock and go from there depending on results.
You're right to take note that non-interrelated properties can have different locks. Considering that locking objects require trivial memory, I would personally go with a lock per property instead of one for the entire object.
The lightweight way to do this is just to have a boolean that's set while the property is being written to and clear otherwise. The heavyweight way to do this, to support timeouts etc., is with a mutex.
