I would like to know if this piece of code is correct or not. Will this not lead to issues as I am submitting the runnable object to the executor service while constructing the object itself?
public class A implements Runnable {
public A() {
Executors.newSingleThreadExecutor().execute(this);
// some other initializations
}
}
Will this lead to any issues as we are trying to submit the object to the executor even before creating it completely? If the run() method is called even before all the initializing is done (if at all it's possible), will the variables still be null which were not yet initialized?
Please do not ask me to come up with the complete code, as I have been asking this as a general question which requires clarification.
Yes, there may be issues. The Executor might read a field that you set in the constructor, even before the corresponding code in the constructor was executed. In general you should not expose this from inside a constructor. Java provides useful guarantees for objects after their constructor finished, but in order to benefit from those you have to wait for the result of new X(...) before using it.
Will this lead to any issues as we are trying to submit the object to
the executor even before creating it completely?
For one thing, you can observe a final field changing value, which violates the semantics of final. It can lead to very hard-to-trace concurrency bugs in multi-threaded code.
This code will usually print a few zeros and even the occasional 4 (when the check a != 4 still read 0 but the print already read 4), even though the final field a is only ever assigned the value 4 and should never be seen holding any other value.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class A implements Runnable {
private static ExecutorService threads = Executors.newSingleThreadExecutor();
final int a;
public A() {
threads.execute(this);
Thread.yield();
a = 4;
}
@Override
public void run() {
if (a != 4) {
System.out.println(a);
}
}
public static void main(String[] args) {
for (int i = 0; i < 50_000; i++) {
new A();
}
threads.shutdown();
}
}
If the run() method is called even before all the initializing is done
(if at all it's possible), will the variables still be null which were
not yet initialized?
Yes, the ones not yet initialized will be null for reference variables, or the default value (0, false, '\0', 0d, 0f, etc.) for the primitive types. According to the specification, it is even possible with long and double fields to see only 32 of the 64 bits initialized (although on 64-bit architectures it is unlikely that you will ever observe this).
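The default values are easy to observe without any threads. The following is a minimal sketch, not taken from the question: the class names EarlyEscape and EarlyEscapeDemo are made up for illustration, and the constructor deliberately hands this to a callback before assigning its fields.

```java
import java.util.function.Consumer;

// Hypothetical class: hands `this` to a callback before its fields are assigned.
final class EarlyEscape {
    private String name;   // reference type: default is null
    private int count;     // int: default is 0
    private boolean ready; // boolean: default is false

    EarlyEscape(Consumer<EarlyEscape> callback) {
        callback.accept(this); // `this` escapes here, before the assignments below
        name = "set";
        count = 42;
        ready = true;
    }

    String describe() {
        return name + "," + count + "," + ready;
    }
}

final class EarlyEscapeDemo {
    // Returns what the callback saw during construction and what is visible afterwards.
    static String[] observe() {
        String[] seen = new String[2];
        EarlyEscape e = new EarlyEscape(partial -> seen[0] = partial.describe());
        seen[1] = e.describe();
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = observe();
        System.out.println("during construction: " + seen[0]); // null,0,false
        System.out.println("after construction:  " + seen[1]); // set,42,true
    }
}
```

With a second thread the picture is the same, except that which values are seen depends on timing.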
There will almost certainly be issues. What you have is called a "this escape" where you pass a this reference in a ctor to an external method. It's super bad, the object is incompletely constructed at that point and anything could happen.
What you should probably do instead is make the constructor private and use a factory method to get a new instance and execute it, if that's the goal.
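public class A implements Runnable {
    // static factory: the object is fully constructed before it is handed to the executor
    public static A getNew() {
        A a = new A();
        Executors.newSingleThreadExecutor().execute(a);
        return a;
    }
    private A() {
        // some other initializations
    }
    @Override
    public void run() {
        // the task's work goes here
    }
}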
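For something that can be run and checked, here is a hedged variant of the factory idea. It is a sketch under assumptions: the names SafeTask and SafeTaskDemo are invented, the executor is passed in rather than created per object, and Callable is used instead of Runnable only so that the value seen by the task can be returned.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical task: Callable instead of Runnable so the observed value can be returned.
final class SafeTask implements Callable<Integer> {
    private final int a;

    private SafeTask() {
        a = 4; // every field is assigned before anyone else can see the object
    }

    static Future<Integer> createAndSubmit(ExecutorService executor) {
        SafeTask task = new SafeTask(); // 1. construct completely
        return executor.submit(task);   // 2. only then publish to the executor
    }

    @Override
    public Integer call() {
        return a;
    }
}

final class SafeTaskDemo {
    static int runOnce() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            return SafeTask.createAndSubmit(executor).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 4
    }
}
```

Because the task is submitted only after the constructor has returned, the worker thread cannot see the default value of a.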
I have a question regarding the java garbage collection and enum types.
Let's say I have an enum like so:
enum ConnectionHelper {
INSTANCE;
private boolean initialized = false;
private static int someVar;
ConnectionHelper initialize() {
if (!initialized) {
// Do some initialisation...
someVar = 10;
// NOTE: 1
initialized = true;
}
return this;
}
void start() {
// NOTE: 2
if (!initialized) {
throw new IllegalStateException("ConnectionHelper has to be initialized.");
}
// do some work...
}
}
Now is there a scenario where initialized may revert back to false due to the garbage collector? The reason I'm asking is that if that's the case I need to take additional precautions for this scenario.
And also, if I represent a singleton with an enum, would it matter if I use static or non-static variables for state? For example, in this example, there are two variables, someVar and initialized; would it make a difference to question 1 if initialized was static as well? Or if both were non-static?
Thanks!
We can answer all kinds of “will the garbage collector make this program behave strangely” questions with “no”, in general. The very purpose of a garbage collector is to clean up the memory of unused objects transparently, without your program even noticing. Producing an observable behavior like a variable flipping from true to false is definitely outside the actions allowed for a garbage collector.
That said, your program is not thread safe. If multiple threads access your ConnectionHelper without additional synchronization, they may perceive inconsistent results, including seeing false for the initialized variable while another thread already saw true for it at an earlier time from an external clock’s perspective, or seeing true for initialized while still not seeing the value of 10 written for someVar.
The solution is simple. Don’t implement a superfluous lazy initialization. The enum constants are initialized during the class initialization, which is already lazy (as specified in JLS §12.4.1) and made thread safe (as specified in JLS §12.4.2) by the JVM.
enum ConnectionHelper {
INSTANCE;
private static final int someVar = 10;
void start() {
// do some work, use the already initialized someVar...
}
}
or
enum ConnectionHelper {
INSTANCE;
private final int someVar = 10;
void start() {
// do some work, use the already initialized someVar...
}
}
or
enum ConnectionHelper {
INSTANCE;
private final int someVar;
ConnectionHelper() {
someVar = 10; // as place-holder for more complex initialization
}
void start() {
// do some work, use the already initialized someVar...
}
}
It doesn’t matter whether you declare the variable static or not.
The first thread calling start() will perform the class initialization, including the initialization of someVar. If other threads call the method while the initialization is still ongoing, they will wait for its completion. After the completion of the initialization, all threads may execute the start() method using the initialized values without any slowdown.
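The behaviour described above can be exercised with a small sketch. Everything here is hypothetical scaffolding: EagerHelper stands in for ConnectionHelper, and InitStats is a separate counter class (an enum constructor is not allowed to read the enum's own non-constant static fields, so the counters live elsewhere).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical counters, kept outside the enum on purpose.
final class InitStats {
    static final AtomicInteger CONSTRUCTED = new AtomicInteger();
    static final AtomicInteger WRONG_VALUES = new AtomicInteger();
}

enum EagerHelper {
    INSTANCE;

    private final int someVar;

    EagerHelper() {
        InitStats.CONSTRUCTED.incrementAndGet(); // runs during class initialization
        someVar = 10;
    }

    int start() {
        return someVar;
    }
}

final class EnumInitDemo {
    // Lets several threads touch the enum at once; returns {constructor runs, wrong values}.
    static int[] race(int threadCount) {
        CountDownLatch go = new CountDownLatch(1);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    go.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (EagerHelper.INSTANCE.start() != 10) {
                    InitStats.WRONG_VALUES.incrementAndGet();
                }
            });
            threads[i].start();
        }
        go.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] {InitStats.CONSTRUCTED.get(), InitStats.WRONG_VALUES.get()};
    }

    public static void main(String[] args) {
        int[] r = race(8);
        System.out.println("constructor runs: " + r[0] + ", wrong values: " + r[1]);
    }
}
```

The constructor runs once no matter how many threads race for INSTANCE, and no thread sees someVar before it is set.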
Enum constants are not garbage collected while their enum class is loaded. ConnectionHelper.INSTANCE is held in a static field of the enum class, so the object stays reachable, and therefore in memory, once instantiated.
So, to your questions:
AD 1: No, it cannot revert. The only way to set it back to false is to set it manually.
AD 2: No difference for a singleton. There would be a difference if you had more instances, as they would share the static variables but not the instance ones.
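To illustrate the second point, here is a small sketch with an enum that has two constants. The enum Counter and its members are invented for this example and are not part of the question's code.

```java
// Hypothetical enum with two constants, to show what is shared and what is not.
enum Counter {
    FIRST, SECOND;

    private static int shared; // one copy for the whole enum type
    private int own;           // one copy per constant

    void hit() {
        shared++;
        own++;
    }

    int own() {
        return own;
    }

    static int shared() {
        return shared;
    }
}

final class CounterDemo {
    public static void main(String[] args) {
        Counter.FIRST.hit();
        Counter.FIRST.hit();
        Counter.SECOND.hit();
        System.out.println("shared = " + Counter.shared());           // 3
        System.out.println("FIRST.own = " + Counter.FIRST.own());     // 2
        System.out.println("SECOND.own = " + Counter.SECOND.own());   // 1
    }
}
```

With a single constant, as in the singleton case, the two kinds of field behave the same from the caller's point of view.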
From reading the JLS after a frustrating debugging session I find that lambdas will capture the value of effectively-final local variables, but if you refer to an instance variable it captures a reference to the variable, which has serious implications for multi-threaded code.
For example, the following is an MCVE distilled from a much larger program:
public class LambdaCapture
{
public static void main(String[] args) throws Exception
{
Launcher i1 = new Launcher();
i1.launchAsynchTask();
}
public static class Launcher
{
private int value = 10;
public void launchAsynchTask() throws Exception
{
System.out.printf("In launchAsynchTask value is %s\n",value);
Thread t = new Thread(()->doSomething(value));
t.start();
value = -1;
t.join();
}
public void doSomething(int value)
{
System.out.printf("In asynch task, value is %s\n",value);
}
}
}
I found the output surprising. It is
In launchAsynchTask value is 10
In asynch task, value is -1
since I initially (prior to JLS research) and intuitively expected the lambda to capture the value of the variable value instead of a reference to it.
If I have to guarantee that the current value is captured instead of a reference the obvious solution is to create a local final temporary:
final int capture = this.value;
Thread t = new Thread(()->doSomething(capture));
My question: Is this the accepted idiomatic way to force value capture, or is there some other more natural way to do it?
I ... intuitively expected the lambda to capture the value of the variable value instead of a reference to it.
That (capturing the value) is what happens with local variables.
With fields, what is actually happening is that you are capturing a reference to the instance of the object that the field belongs to. In your case, it is a reference to the Launcher.this object. (The same thing happens when you declare an inner class.)
My question: Is this the accepted idiomatic way to force value capture, or is there some other more natural way to do it?
I can't think of a better way.
Because you're using shorthand syntax, it's not as obvious what is going on.
When you write value to access the field, it implicitly means this.value.
The lambda expression is capturing the absolutely final "local variable" this that is implicit to all non-static methods.
The lambda expression
()->doSomething(value)
is logically equivalent to
new Lambda$1(this)
where Lambda$1 is declared like this (using arbitrary names):
private static final class Lambda$1 implements Runnable {
private final Launcher ref;
Lambda$1(Launcher ref) {
this.ref = ref;
}
@Override
public void run() {
this.ref.doSomething(this.ref.value);
}
}
As you can see, the lambda expression ()->doSomething(value) is not actually capturing value. The unqualified field access is obscuring what is actually happening.
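The difference can be reproduced without any threads. This is a minimal single-threaded sketch; the class CaptureDemo and its method names are made up for illustration.

```java
import java.util.function.IntSupplier;

// Hypothetical class contrasting a lambda that reads a field with one that reads a local copy.
final class CaptureDemo {
    private int value = 10;

    // The lambda captures `this`, so it reads this.value each time it runs.
    IntSupplier readsField() {
        return () -> value;
    }

    // The field is copied into a local first, so the lambda captures the number itself.
    IntSupplier readsSnapshot() {
        final int snapshot = value;
        return () -> snapshot;
    }

    void setValue(int newValue) {
        value = newValue;
    }

    static int[] run() {
        CaptureDemo demo = new CaptureDemo();
        IntSupplier field = demo.readsField();
        IntSupplier snapshot = demo.readsSnapshot();
        demo.setValue(-1); // happens after both lambdas were created
        return new int[] {field.getAsInt(), snapshot.getAsInt()};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("via field: " + r[0]);    // -1
        System.out.println("via snapshot: " + r[1]); // 10
    }
}
```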
FYI: Hiding field value behind parameter value in the doSomething() method is a bad idea. The name conflict makes the code very vulnerable to misinterpretation by programmers, and good IDEs will warn you about it (unless you disabled that warning).
Hopefully that just happened by mistake here when creating an MCVE, and you wouldn't do that in real code. :-)
What I normally like to do is to minimize code parts that access fields directly, so you could wrap the part starting the thread in a function like this:
public void launchAsynchTask() throws Exception
{
System.out.printf("In launchAsynchTask value is %s\n", this.value);
Thread t = launchAsynchTaskWithValue(this.value);
this.value = -1;
t.join();
}
public Thread launchAsynchTaskWithValue(int launchValue) throws Exception
{
Thread t = new Thread(()->doSomething(launchValue));
t.start();
return t;
}
I had some confusion about inner classes and lambda expressions, and I tried to ask a question about that, but then another doubt arose, and it's probably better to post another question than to comment on the previous one.
Straight to the point: I know (thank you Jon) that something like this won't compile
public class Main {
public static void main(String[] args) {
One one = new One();
F f = new F(){ //1
public void foo(){one.bar();} //compilation error
};
one = new One();
}
}
class One { void bar() {} }
interface F { void foo(); }
due to how Java manages closures, because one is not [effectively] final and so on.
But then, how come this is allowed?
public class Main {
public static void main(String[] args) {
One one = new One();
F f = one::bar; //2
one = new One();
}
}
class One { void bar() {} }
interface F { void foo(); }
Is not //2 equivalent to //1? Am I not, in the second case, facing the risks of "working with an out-of-date variable"?
I mean, in the latter case, after one = new One(); is executed f still has an out-of-date copy of one (i.e. it references the old object). Isn't this the kind of ambiguity we're trying to avoid?
A method reference is not a lambda expression, although they can be used in the same way. I think that is what is causing the confusion. Below is a simplification of how Java works; it is not how it really works, but it is close enough.
Say we have a lambda expression:
Runnable f = () -> one.bar();
This is the equivalent of an anonymous class that implements Runnable:
Runnable f = new Runnable() {
public void run() {
one.bar();
}
};
Here the same rules apply as for an anonymous class (or method local class). This means that one needs to be effectively final for it to work.
On the other hand the method reference:
Runnable f = one::bar;
Is more like:
Runnable f = new MethodHandle(one, one.getClass().getMethod("bar"));
With MethodHandle being:
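import java.lang.reflect.Method;

public class MethodHandle implements Runnable {
    private final Object object;
    private final Method method;

    public MethodHandle(Object object, Method method) {
        this.object = object; // the receiver is stored once, when the handle is created
        this.method = method;
    }

    @Override
    public void run() {
        try {
            method.invoke(object);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}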
In this case, the object assigned to one is assigned as part of the method handle created, so one itself doesn't need to be effectively final for this to work.
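The different evaluation times become visible when the receiver comes from a mutable field. The following is a small sketch; Holder and BindingDemo are hypothetical names used only here.

```java
import java.util.function.Supplier;

// Hypothetical holder with a reassignable field.
final class Holder {
    String current = "old";
}

final class BindingDemo {
    static String[] run() {
        Holder h = new Holder();
        // The lambda body runs on every call, so h.current is read at call time.
        Supplier<String> viaLambda = () -> h.current.toString();
        // The receiver expression h.current is evaluated once, right here.
        Supplier<String> viaMethodRef = h.current::toString;
        h.current = "new";
        return new String[] {viaLambda.get(), viaMethodRef.get()};
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println("lambda: " + r[0]);           // new
        System.out.println("method reference: " + r[1]); // old
    }
}
```

The method reference keeps using the object that was in the field when the reference was created, which matches the simplified MethodHandle picture above.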
Your second example is simply not a lambda expression. It's a method reference. In this particular case, it chooses a method from a particular object, which is currently referenced by the variable one. But the reference is to the object, not to the variable one.
This is the same as the classical Java case:
One one = new One();
One two = one;
one = new One();
two.bar();
So what if one changed? two references the object that one used to be, and can access its method.
Your first example, on the other hand, is an anonymous class, which is a classical Java structure that can refer to local variables around it. The code refers to the actual variable one, not the object to which it refers. This is restricted for the reasons that Jon mentioned in the answer you referred to. Note that the change in Java 8 is merely that the variable has to be effectively final. That is, it still can't be changed after initialization. The compiler simply became sophisticated enough to determine which cases will not be confusing even when the final modifier is not explicitly used.
The consensus appears to be that this is because when you do it using an anonymous class, one refers to a variable, whereas when you do it using a method reference, the value of one is captured when the method handle is created. In fact, I think that in both cases one is a value rather than a variable. Let's consider anonymous classes, lambda expressions and method references in a bit more detail.
Anonymous classes
Consider the following example:
static Supplier<String> getStringSupplier() {
final Object o = new Object();
return new Supplier<String>() {
@Override
public String get() {
return o.toString();
}
};
}
public static void main(String[] args) {
Supplier<String> supplier = getStringSupplier();
System.out.println(supplier.get()); // Use o after the getStringSupplier method returned.
}
In this example, we are calling toString on o after the method getStringSupplier has returned, so when it appears in the get method, o cannot refer to a local variable of the getStringSupplier method. In fact it is essentially equivalent to this:
static Supplier<String> getStringSupplier() {
final Object o = new Object();
return new StringSupplier(o);
}
private static class StringSupplier implements Supplier<String> {
private final Object o;
StringSupplier(Object o) {
this.o = o;
}
@Override
public String get() {
return o.toString();
}
}
Anonymous classes make it look as if you are using local variables, when in fact the values of these variables are captured.
In contrast to this, if a method of an anonymous class references the fields of the enclosing instance, the values of these fields are not captured, and the instance of the anonymous class does not hold references to them; instead the anonymous class holds a reference to the enclosing instance and can access its fields (either directly or via synthetic accessors, depending on the visibility). One advantage is that an extra reference to just one object, rather than several, is required.
Lambda expressions
Lambda expressions also close over values, not variables. The reason given by Brian Goetz here is that
idioms like this:
int sum = 0;
list.forEach(e -> { sum += e.size(); }); // ERROR
are fundamentally serial; it is quite difficult to write lambda bodies
like this that do not have race conditions. Unless we are willing to
enforce -- preferably at compile time -- that such a function cannot
escape its capturing thread, this feature may well cause more trouble
than it solves.
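For the summing idiom in the quote, one common alternative is to let the stream do the accumulation, so nothing captured needs to be mutated. This is only a sketch: the quote does not say what the list holds, so a list of lists is assumed here, and SumDemo is an invented name.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SumDemo {
    // Sums the sizes without mutating any captured local variable.
    static int totalSize(List<List<String>> lists) {
        return lists.stream().mapToInt(List::size).sum();
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"),
                Collections.<String>emptyList());
        System.out.println(totalSize(lists)); // prints 3
    }
}
```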
Method references
The fact that method references capture the value of the variable when the method handle is created is easy to check.
For example, the following code prints "a" twice:
String s = "a";
Supplier<String> supplier = s::toString;
System.out.println(supplier.get());
s = "b";
System.out.println(supplier.get());
Summary
So in summary, lambda expressions and method references close over values, not variables. Anonymous classes also close over values in the case of local variables. In the case of fields, the situation is different: what is captured is the reference to the enclosing instance, and the field is read through that reference each time it is used.
In view of this, the question is, why do the rules that apply to anonymous classes and lambda expressions not apply to method references, i.e. why are you allowed to write o::toString when o is not effectively final? I do not know the answer to that, but it does seem to me to be an inconsistency. I guess it's because you can't do as much harm with a method reference; examples like the one quoted above for lambda expressions do not apply.
No. In your first example you define the implementation of F inline and try to access the local variable one.
In the second example you basically define your lambda expression to be the call of bar() on the object one.
Now this might be a bit confusing. The benefit of this notation is that you can define a method (most of the time it is a static method or in a static context) once and then reference the same method wherever a matching functional interface is expected:
Consumer<String> printer = System.out::println; // same as msg -> System.out.println(msg)
Yes, this is an academic question, and I know people will complain that I'm not posting any code,
but I'm genuinely stuck on this question and really don't know where to begin. I would really appreciate an explanation and maybe a code example.
If an object constructor starts a new thread that executes the method
run of an anonymous inner class object, it is possible that this new
thread can access its surrounding outer object before it has been
fully constructed and its fields fully initialized. How would you
prevent this from happening?
This is called "leaking this". Here you have the code
public class Test {
// this is guaranteed to be initialized after the constructor
private final int val;
public Test(int v) {
new Thread(new Runnable() {
@Override public void run() {
System.out.println("Val is " + val);
}
}).start();
this.val = v;
}
}
Guess what it will (may, since it's a thread) print. I used a final field to stress that the object is accessed before it has been fully initialized (final fields must be definitely assigned after the last line of every constructor)
How do you recover
You don't want to pass this around while you are in a constructor. This also means you shouldn't call overridable methods of the same class (non-static, non-private, non-final) from the constructor, and you shouldn't create inner classes there (anonymous classes are inner classes): they are implicitly linked to the enclosing instance, so it is as if they could access this.
Think about the single-threaded situation first:
Whenever you create an object via new, its constructor is called which (hopefully) initializes the fields of the new object before a reference to this object is returned. That is, from the point of view of the caller, this new is almost like an atomic operation:
Before calling new, there is no object. After returning from new, the object exists fully initialized.
So all is good.
The situation changes slightly when multiple threads come into play. But we have to read your quote carefully:
...has been fully constructed and its fields fully initialized.
The crucial point is fully. The subject line of your question says "before created", but what is meant here is not before the object has been created, but between object creation and initialization. In a multi-threaded situation, new can no longer be considered (pseudo-)atomic because of this (time flows from left to right):
Thread1 --> create object --> initialize object --> return from `new`
^
|
| (messing with the object)
Thread2 ------------------/
So how can Thread2 mess with the object? It would need a reference to that object, but since new will only return the object after it has been both created and initialized, this should be impossible, right?
Well, no - there is one way where it's still possible -- namely if Thread 2 is created inside the object's constructor. Then the situation would be like this:
Thread1 --> create object --> create Thread2 --> initialize object --> return from `new`
| ^
| |
| | (messing with the object)
\-----/
Since Thread2 is created after the object has been created (but before it has been fully initialized), there is already a reference to the object that Thread2 could get a hold of. One way is simply if the constructor of Thread2 explicitly takes a reference to the object as a parameter. Another way is by using a non-static inner class of the object for Thread2's run method.
I would change the title of the question, as the threads are not accessing themselves; rather, the second one accesses the object being constructed by the first one. I mean:
You have one thread, creating an object.
Inside the constructor for this object, you declare an anonymous inner class that implements Runnable.
In the same constructor of the first thread, you start a new thread to run your anonymous inner class.
Thus, you have two threads. If you want to ensure that the new thread doesn't do anything before the constructor has fully finished, I would use a lock in the constructor. This way, the second thread can be started but will wait until the first thread has finished constructing the object.
public class A {
    final int number;
    A() {
        new Thread(
            new Runnable() {
                public void run() {
                    System.out.println("Number: " + number);
                }
            }).start();
        number = 2;
    }
}
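The locking idea mentioned in this answer could be sketched with a CountDownLatch, which is one possible choice and not necessarily what the answerer had in mind. GatedStart is a hypothetical class; number is a plain (non-final) field here, and a CompletableFuture is added only so that the value seen by the second thread can be inspected.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

// Hypothetical class: the thread is started inside the constructor but waits at a gate.
final class GatedStart {
    private final CountDownLatch constructed = new CountDownLatch(1);
    private final CompletableFuture<Integer> observed = new CompletableFuture<>();
    private int number;

    GatedStart() {
        new Thread(() -> {
            try {
                constructed.await();       // blocks until the constructor opens the gate
                observed.complete(number); // the write to number is visible after await()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                observed.completeExceptionally(e);
            }
        }).start();
        number = 2;
        constructed.countDown(); // last statement: open the gate
    }

    // Blocks until the second thread has reported what it saw.
    int observedNumber() {
        return observed.join();
    }

    public static void main(String[] args) {
        System.out.println("Number: " + new GatedStart().observedNumber()); // Number: 2
    }
}
```

This still lets this escape from the constructor, so it only helps while the class is final and the gate really is opened last; starting the thread after construction avoids the problem altogether.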
I do not fully agree with Pablo's answer because it heavily depends on your initialization method.
public class ThreadQuestion {
public volatile int number = 0;
public static void main(String[] args) {
ThreadQuestion q = new ThreadQuestion();
}
public ThreadQuestion() {
Thread t = new Thread(new Runnable() {
@Override
public void run() {
System.out.println(number);
}
});
try {
Thread.sleep(500);
} catch(Exception e) {
e.printStackTrace();
}
number = 1;
t.start();
}
}
When you
1. place t.start() at the end, the correct data is printed.
2. place t.start() before the sleep command, it will print 0.
3. remove the sleep command and place t.start() before the assignment, it may still print 1, but the result is not deterministic.
As a thought experiment on 3: a "tiny" assignment of one simple data type may appear to work as expected, but with something slow such as creating a database connection you will not get a reliable result.
Do not hesitate to raise any question.
So a situation like this?
public class MyClass {
private Object something;
public MyClass() {
new Thread() {
public void run() {
something = new Object();
}
}.start();
}
}
Depending on the actual code used, the behaviour could vary. This is why constructors should be carefully made so that they don't for example call non-private methods (a subclass could override it, allowing the superclass this to be accessed from a subclass before the superclass is fully initialized). Although this particular example deals with a single class and a thread, it's related to the reference leaking problem.
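The overridable-method case mentioned above can be shown deterministically in a single thread. This is a minimal sketch; Base, Derived and OverrideDemo are invented names.

```java
// Hypothetical superclass whose constructor calls an overridable method.
class Base {
    final String seenDuringConstruction;

    Base() {
        // describe() is overridable, so subclass code may run here too early.
        seenDuringConstruction = describe();
    }

    String describe() {
        return "base";
    }
}

final class Derived extends Base {
    private final String label;

    Derived(String label) {
        // the implicit super() call runs first, before the assignment below
        this.label = label;
    }

    @Override
    String describe() {
        return "label=" + label;
    }
}

final class OverrideDemo {
    public static void main(String[] args) {
        Derived d = new Derived("x");
        System.out.println(d.seenDuringConstruction); // label=null
        System.out.println(d.describe());             // label=x
    }
}
```

The override ran while label still held its default value, even though label is final.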
I've heard about this happening in non-thread-safe code due to improperly constructed objects, but I really don't have the concept down, even after reading about it in Goetz's book. I'd like to solidify my understanding of this code smell, as I may be doing it and not even realize it. Please provide code in your explanation to make it stick, thanks.
Example: in a constructor, you create an event listener inner class (it has an implicit reference to the current object), and register it to a list of listeners.
=> So your object can be used by another thread, even though it did not finish executing its constructor.
public class A {
private boolean isIt;
private String yesItIs;
public A() {
EventListener el = new EventListener() { ....};
StaticListeners.register(el);
isIt = true;
yesItIs = "yesItIs";
}
}
An additional problem could happen later: the object A could be fully created and made available to all threads, yet another thread using it could see the A instance with yesItIs holding its "yesItIs" value but isIt still false! Believe it or not, this could happen. The reason is:
=> synchronization is only half about blocking thread, the other half is about inter-thread visibility.
The reason for that Java choice is performance : inter-thread visibility would kill performance if all data would be shared with all threads, so only synchronized data is guaranteed to be shared...
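The visibility half can be illustrated with a volatile flag. The sketch below is an assumption-laden example rather than anything from the answer: Publication and PublicationDemo are made-up names, and volatile is just one of several ways (synchronized blocks, java.util.concurrent classes) to get the guarantee.

```java
// Hypothetical class: a plain field published through a volatile flag.
final class Publication {
    private int data;               // plain field
    private volatile boolean ready; // volatile flag that publishes `data`

    void publish(int value) {
        data = value; // ordinary write
        ready = true; // volatile write: makes the write above visible to readers that see true
    }

    int awaitData() {
        while (!ready) { // volatile read
            Thread.yield();
        }
        return data; // guaranteed to see the value written before ready = true
    }
}

final class PublicationDemo {
    static int run() {
        Publication p = new Publication();
        Thread writer = new Thread(() -> p.publish(42));
        writer.start();
        int seen = p.awaitData();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 42
    }
}
```

Without volatile on ready, the reader would have no guarantee of ever seeing the flag change, or of seeing data once it does.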
Really simple example:
public class Test
{
private static Test lastCreatedInstance;
public Test()
{
lastCreatedInstance = this;
}
}
This is the reason why double-checked locking doesn't work. The naive code
if(obj == null)
{
synchronized(something)
{
if (obj == null) obj = BuildObject(...);
}
}
// do something with obj
is not safe because the assignment to obj can occur before the rest of the construction (constructor or factory method) has completed. Thus thread 1 can be in the BuildObject step when thread 2 reaches the outer check, detects a non-null obj, and then proceeds to operate on an incomplete object (thread 1 having been scheduled out in mid-call).
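For completeness, under the Java 5 and later memory model the pattern can be repaired by declaring the shared reference volatile. The following is a sketch of that well-known variant, with LazySingleton as a made-up name standing in for whatever BuildObject creates.

```java
// Hypothetical lazily created singleton using double-checked locking with a volatile field.
final class LazySingleton {
    private static volatile LazySingleton instance; // volatile is what makes the pattern safe

    private final int payload;

    private LazySingleton() {
        payload = 7;
    }

    static LazySingleton get() {
        LazySingleton local = instance; // first check, without locking
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;       // second check, under the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local;   // volatile write publishes the fully built object
                }
            }
        }
        return local;
    }

    int payload() {
        return payload;
    }

    public static void main(String[] args) {
        System.out.println(get() == get());  // true
        System.out.println(get().payload()); // 7
    }
}
```

When the instance is a true singleton, relying on class initialization (for example an enum or a static holder class) is usually simpler than hand-written double-checked locking.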
public class MyClass{
String name;
public MyClass(String s)
{
if(s==null)
{
throw new IllegalArgumentException();
}
OtherClass.method(this);
name= s;
}
public String getName(){ return name; }
}
In the above code, OtherClass.method() is passed an instance of MyClass which is at that point incompletely constructed, i.e. not yet fulfilling the contract that the name property is non-null.
Steve Gilham is correct in his assessment of why double-checked locking is broken. If thread A enters that method and obj is null, that thread will begin to create an instance of the object and assign it to obj. Thread B can possibly enter while thread A is still instantiating that object (but has not completed) and will then view the object as not null, although that object's fields may not have been initialized: a partially constructed object.
However, the same type of problem can arise if you allow the keyword this to escape the constructor. Say your constructor creates an instance of an object which forks a thread, and that object accepts your type of object. Now your object may not be fully initialized, that is, some of your fields may be null. The object you created in your constructor can now reference you as a non-null object but get null field values.
A bit more explanation:
Your constructor can initialize every field in your class, but if you allow 'this' to escape before any of the other objects are created, the fields can be null (or the primitive default) when viewed by other threads if (1) they are not declared final or (2) they are not declared volatile.
public class Test extends SomeUnknownClass{
public Test(){
this.addListner(new SomeEventListner(){
@Override
void act(){}
});
}
}
After this operation the instance of SomeEventListner will hold a reference to the Test object, as any inner class does.
More examples can be found here:
http://www.ibm.com/developerworks/java/library/j-jtp0618/index.html
Here's an example of how uninitialized this of OuterClass can be accessed from inside of inner class:
public class OuterClass {
public Integer num;
public OuterClass() {
Runnable runnable = new Runnable() { // might lead to this reference escape
@Override
public void run() {
// example of how uninitialized this of outer class
// can be accessed from inside of inner class
System.out.println(OuterClass.this.num); // may print null, depending on thread timing
}
};
new Thread(runnable).start();
new Thread().start(); // just some logic to keep JVM busy
new Thread().start(); // just some logic to keep JVM busy
this.num = 8;
System.out.println(this.num); // will print 8
}
public static void main(String[] args) {
new OuterClass();
}
}
Possible output:
null
8
Pay attention to the OuterClass.this.num expression in the code.