Java lambda expression with local variable capture

A local variable captured by a lambda must be final or effectively final because of concurrency issues. The Java 8 language specification (JLS) states the following:
The restriction to effectively final variables prohibits access to
dynamically-changing local variables, whose capture would likely
introduce concurrency problems.
All well and good, but I did a little experiment. What if I synchronize the method? That should eliminate the possibility of a dynamically-changing local variable, because I am guaranteed that only a single thread can execute this code at a time. But the compiler still reported an error saying the variable has to be final or effectively final.
Is my logic right?
Consider the following code:
public synchronized void capture() {
    int localVariable = 100;
    // Compile error here: localVariable is reassigned below, so it is not effectively final.
    Interf i = (text) -> System.out.println(text + localVariable);
    i.m1("This local variable is: ");
    localVariable = 1000;
}

The answer is simply that your variable goes out of scope at the end of the method. With effectively final variables this is easily solved, because the compiler just copies the value into the lambda. Since the code in the lambda expression can also run after the method has returned (when the local variable's stack frame no longer exists), capturing a modifiable variable won't work. You also can't expect the compiler to somehow make a copy of your variable and then dynamically update it when the variable is modified outside of your lambda expression. I hope that clears it up.

But the compile threw an error saying it has to be final or effectively final.
That's because it does, as per the rules. No ifs, no buts; it doesn't matter whether you've actually guarded against all concurrency issues or not. If it's not effectively final, it won't compile.
In your simplistic example here it's probably OK. However, making the method synchronized is irrelevant, since local variables are always tied to their per-thread invocation anyway. It's threading issues within the context of the method itself that the compiler is worried about, and those can easily arise with lambdas, which may be executed at some arbitrary time in the future, after the state of a non-final variable may have changed. If it has changed, it's not at all clear which state should be used: the initial state or the updated state.

Imagine that your lambda creates a CompletableFuture to be executed by the ForkJoinPool or another executor.
That is why synchronized on this method would not suffice to override the rule that the local variable must be effectively final. The lambda will execute synchronously and will be synchronized, but the async task that it creates would not.
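The point can be illustrated with a minimal sketch. The class name DeferredCapture and the latch are hypothetical and exist only to force the ordering: the synchronized method returns and releases its lock before the submitted task reads the captured value, so the lock says nothing about when the lambda runs. The task still sees 100 because the value was copied into the lambda when it was created.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

class DeferredCapture {
    // Hypothetical helper: opened only after capture() has returned.
    private final CountDownLatch methodReturned = new CountDownLatch(1);

    synchronized CompletableFuture<String> capture() {
        int localVariable = 100; // effectively final: never reassigned
        return CompletableFuture.supplyAsync(() -> {
            try {
                methodReturned.await(); // block until capture() has returned
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // Reads the value copied into the lambda, not a live stack slot.
            return "This local variable is: " + localVariable;
        });
    }

    static String run() {
        DeferredCapture d = new DeferredCapture();
        CompletableFuture<String> future = d.capture();
        d.methodReturned.countDown(); // capture() is done; let the task proceed
        return future.join();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "This local variable is: 100"
    }
}
```

If localVariable were reassigned anywhere in capture(), the lambda would no longer compile, regardless of the synchronized keyword.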

Related

Read Indirect state Vs. Read Direct state of a variable in java

In the following code, compilation fails for the static variable j inside the static block, as mentioned in the comment.
However, it works fine inside the method m1().
class StaticBlock {
    static {
        m1();
        // compilation fails because the variable is in "read indirect" state
        System.out.println(j);
    }

    static void m1() {
        System.out.println(j);
    }

    static int j = 10;
}
I know the root cause of the compilation failure: variable j is in Read Indirect State.
My question: why this behavior? We could also print 0 inside the static block, as we are doing in m1().
What made the API developers introduce this discrepancy?

Why is this behavior, we can also print 0 inside static block as we
are doing in m1().
What made API developers to have this discrepancy
There are competing priorities revolving around simple specifications for the order of events during class initialization, consistency of constant (i.e. final) class fields, programmer expectations, and ease of implementation.
The values of constant fields provide a good starting point. Java would like to avoid the default values of such fields being observable, and especially to avoid their default values being used in the initialization of other class variables. Therefore, these are initialized first, before static initializer blocks or the initializers of other class variables. They are initialized in the order they appear in the source code, which is a rule that is easy for both humans and compilers to understand. But that affords the possibility that the initializer of one class variable sees the default value of another, yielding surprising, unwanted results. Java therefore specifies that that case must be detected and rejected at compile time.
Static initializer blocks and the initializers of other class variables are executed afterward, in the order they appear in the source code. The case for the constraint you're asking about is not as strong here, but it's reasonable to choose consistency by applying the same rule here as is applied to class constants. Combined, the effect is to have easy to understand and predict initialization order that is also mostly consistent with a model of class variables' initializers being evaluated and assigned before static initializer blocks are evaluated.
But then come static methods. It is highly desirable for static methods to be available for use during initialization, but they are also usable after initialization is complete, when none of the initialization-order considerations are relevant. It is therefore unfeasible to restrict static methods' access to variables based on order of appearance in source code. Conceivably, the VM could instead be required to keep track of class variables' individual initialization state, either by control-flow analysis at compile time or by some form of runtime monitoring, but rather than require such complexities, Java opts for simplicity, allowing people who insist on making a mess (by observing default values of class variables) to do so.
I emphasize, finally, that so-called "Read Indirect Write Only state" is part of a third-party model of how this all works. Java itself has no such concept -- such a thing is exactly what it rejects in favor of simplicity when it comes to requirements on static methods' use of class variables.
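A small sketch of the behavior described above, using a hypothetical class InitOrderDemo: a static method called during initialization is allowed to read j before its initializer has run, so it observes the default value. Reading j by its simple name directly in an initializer that appears before the declaration would be rejected as an illegal forward reference.

```java
class InitOrderDemo {
    // Initializers run in textual order, so readJ() executes before j is assigned 10.
    static int seenDuringInit = readJ();
    static int j = 10;

    // A static method may be called at any time, so the compiler applies
    // no forward-reference check to the read of j here.
    static int readJ() {
        return j;
    }

    public static void main(String[] args) {
        System.out.println(seenDuringInit); // prints 0, the default value
        System.out.println(j);              // prints 10
    }
}
```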

Is a public static final int thread safe?

I want to have a file with constants being accessed from multiple threads. Is it a safe implementation to have a class with a lot of public static final ints for this?

Yes, it is thread-safe. Any static final variable is guaranteed to be initialized by the time class initialization completes. Thus, once a class containing such a static final variable is used anywhere in your code, the variable is always fully initialized (i.e. its value is set), as required by the JVMS.
With a primitive int, this condition is even tighter. A primitive static final variable (the same goes for String) is a so-called compile-time constant, which is inlined by the Java compiler javac. The only requirement is that the value can be computed by the Java compiler, i.e. it must not be the result of a non-constant evaluation. As you write that you want to define constants, I assume that this does not apply to your use case. Those constant values are therefore copied directly to their access locations, which eliminates the corner case of a static final variable being altered via reflection, something that is hypothetically an issue with non-primitive types.
Furthermore, using such variables is a good idea because it avoids the use of so-called magic numbers.

Yes, it is safe.
The value never changes so there is no risk of race conditions. Java guarantees that the value will be initialized before anything tries to use it.
Whether it is the best architecture for other reasons (clarity of design etc) is another question.

Yes, 100% safe. It's final, so nobody can alter it. Every thread has to access as reader only, and there is no contention for reading only.

For primitives, making them final makes them compile-time constants (if initialized directly and not as the result of a method call), and an int is a primitive. So final int makes it immutable and hence thread-safe.
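As a minimal sketch of the setup the question describes, the following reads two constants from several threads without any synchronization. The class names Limits and ConstantsDemo and the constant values are made up for illustration.

```java
class ConstantsDemo {
    // Each thread reads the constants and stores its result in its own slot.
    static int[] readFromThreads(int threadCount) {
        int[] seen = new int[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int slot = i;
            threads[i] = new Thread(() -> seen[slot] = Limits.MAX_RETRIES * Limits.TIMEOUT_MS);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // join() makes each thread's write visible to the caller
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        for (int value : readFromThreads(4)) {
            System.out.println(value); // 6000 for every thread
        }
    }
}

// Hypothetical constants class; names and values are illustrative only.
final class Limits {
    public static final int MAX_RETRIES = 3;    // compile-time constant
    public static final int TIMEOUT_MS = 2_000; // compile-time constant

    private Limits() {} // not meant to be instantiated
}
```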

Is there a pthread_once equivalent in Java?

In the C/C++ world, it is very easy to make a routine execute just once by using pthread_once. In Java, I generally use static atomic variables to explicitly check whether the routine has already run. But that looks ugly, hence I am wondering whether there is something like pthread_once in Java.

Since you refer to “static atomic variables” you seem to talk about static resources which do not need special actions if you initialize them within the class initializer itself:
class Foo {
    static ResourceType X = createResource();
}
Here, createResource() will be executed exactly once in a thread-safe manner on the first use of Foo, e.g. when Foo.X is accessed the first time. Threads accessing X while the class initialization is in progress are forced to wait, but subsequent access will be performed without any synchronization overhead. Typically, but not necessarily, the variable will be declared final as well.
If you have multiple resources whose creation should be deferred independently, the owner class might use nested classes, each of them holding one resource.
If your question is about an action which should be executed exactly once without returning a value, the static initialization can be used as well. You only have to add a member you can access to trigger the class initialization, e.g.:
class Foo {
    static { performAction(); }
    static void performActionOnce() {}
}
Here, calling Foo.performActionOnce() will cause performAction() to be executed the first time, while all subsequent invocations do nothing. You can also rely on the fact that when performActionOnce() returns, the action within performAction() has been completed, even when there is contention on the first invocation.
This is different from any atomic variable approach, as atomic variables do not provide a sufficient waiting capability for the case where the first invocation is contended. If you combine the atomic variable with a waiting queue, you end up with what Lock (or any other AQS-based concurrency tool) provides. For instance variables, where static initialization does not work, there is no simple workaround (besides thinking about whether initialization really has to be lazy).
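The once-only behavior can be sketched with a nested holder class. OnceDemo, Holder, and the counter are hypothetical names, and the counter stands in for performAction() purely so the effect can be observed; this is one possible arrangement, not the only one.

```java
import java.util.concurrent.atomic.AtomicInteger;

class OnceDemo {
    // Counts how often the one-time action ran; present only to make the effect visible.
    static final AtomicInteger runs = new AtomicInteger();

    // Nested holder: its static initializer runs on first use of Holder,
    // not when OnceDemo itself is initialized.
    private static class Holder {
        static {
            runs.incrementAndGet(); // stands in for performAction()
        }

        static void touch() {} // empty; calling it triggers Holder's initialization
    }

    static void performActionOnce() {
        Holder.touch();
    }

    static int runFromManyThreads(int threadCount) {
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(OnceDemo::performActionOnce);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println(runFromManyThreads(8)); // prints 1
    }
}
```

Because the JVM serializes class initialization, contending threads wait until the initializer has finished, which is the waiting behavior a bare atomic flag lacks.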

In Java, do methods that don't use static or class variables need to be synchronized?

Do methods that only use local variables suffer from any threading issues? Somewhere it was mentioned that a method's local variables are copied to each thread's stack frame, and that such methods do not need to be synchronized for multithreaded use unless they use class-level or static references/variables.

If your method only operates on parameters and locally-defined (as opposed to class member) variables then there are zero synchronization problems to worry about.
But...
This means any mutable reference types you use must live and die only within the scope of your method. (Immutable reference types aren't a problem here.) For example this is no problem:
int doSomething(int myParameter)
{
    MyObject working_set = new MyObject();
    int interim = working_set.doSomethingElse(myParameter);
    return working_set.doSomethingElseAgain(interim);
}
A MyObject instance is created within your method, does all of its work in your method and is coughing up blood, waiting to be culled by the GC when you exit your method.
This, on the other hand, can be a problem:
int doSomething(int myParameter)
{
    MyObject working_set = new MyObject();
    int interim = working_set.doSomethingElse(myParameter);
    int another_interim = doSomethingSneaky(working_set);
    return working_set.doSomethingElseAgain(another_interim);
}
Unless you know for sure what's going on in doSomethingSneaky(), you may have a need for synchronization somewhere. Specifically you may have to do synchronization on the operations on working_set because doSomethingSneaky() could possibly store the reference to your local working_set object and pass that off to another thread while you're still doing stuff in your method or in the working_set's methods. Here you'll have to be more defensive.
If, of course, you're only working with primitive types, even calling out to other methods, passing those values along, won't be a problem.
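To make the difference concrete, here is a minimal sketch with hypothetical names (EscapeDemo, leaked), using StringBuilder as a stand-in for MyObject. It does not provoke a race; it only shows that after the "sneaky" call the local object is reachable from outside the method, which is the precondition for one.

```java
class EscapeDemo {
    // Reachable from every thread; stands in for whatever doSomethingSneaky() might keep.
    static volatile StringBuilder leaked;

    static int confined(int n) {
        StringBuilder workingSet = new StringBuilder(); // never leaves this method
        workingSet.append(n);
        return workingSet.length();
    }

    static int escaping(int n) {
        StringBuilder workingSet = new StringBuilder();
        doSomethingSneaky(workingSet); // the reference is now shared
        workingSet.append(n);          // unsynchronized use of a shared, non-thread-safe object
        return workingSet.length();
    }

    static void doSomethingSneaky(StringBuilder sb) {
        leaked = sb; // stores the caller's "local" object where other threads can find it
    }

    public static void main(String[] args) {
        System.out.println(confined(123));      // prints 3
        System.out.println(escaping(4567));     // prints 4
        System.out.println(leaked.toString());  // prints 4567: same object, visible outside
    }
}
```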

Does methods that only use local variables inside, do not suffer any threading issues ?
True in a very simplistic sense, but let's be clear. I think this is only true if:
such a method uses only local variables that are primitives or references to mutable instances that cannot otherwise be accessed outside the method by any other means.
such a method invokes only methods that are thread-safe.
Some ways these rules could be violated:
A local variable could be initialized to point to an object that is also accessible outside the method. For example, a local variable could point to a singleton (Foo bar = Foo.getSingleton()).
A local instance held by a local variable could "leak" if the instance is passed as an argument to an external method that keeps a reference to the instance.
A class with no instance variables and with only a single method with no local variables could still call the static method of another class that is not thread-safe.

The question is very generic, so please do not expect any specificity from my answer.
1_ We need to be more careful with static methods than with, say, instance methods.
2_ @Justmycorrectopinion is about right, but some of the terms he used need more elaboration. (Even if a static method only works on local variables, there is still a possibility of a race condition.)
3_ For me there are simple rules that have helped me analyze thread safety.
Understand whether each component encapsulated within a method is shareable or not. The simplest solution is to reduce the scope of all variables and only widen the scope if absolutely necessary. If a component mutates an object, it is usually not thread-safe.
4_ Use tooling support to perform static code analysis for thread safety. (IDEA has a CheckThread plugin.)
5_ Never use a static method to perform object mutation. If calling a static method causes object mutation, then the developer is just circumventing OOP.
6_ Always document thread safety. Remember that a method may not need to be synchronized when you develop it, but it can very easily be made non-thread-safe later.
7_ Last but probably my most important point: make sure most of your objects are immutable. In my experience, I have rarely had to make my objects mutable. (In the rare cases when object state needs to change, defensive copying or new object creation is almost always better.)

You do not need to worry about local variables. Instance variables however are something to care about.

What does using volatile when using a java.util.concurrent.Concurrent* containers provide?

The question came up when I saw this code:
private static volatile ConcurrentHashMap<String, String> cMap = null;
static {
cMap = new ConcurrentHashMap<String, String>();
}
To me it looks like the volatile there is redundant, as the container is a ConcurrentHashMap, which according to the Javadoc already has synchronized puts. Besides, the class that uses cMap only instantiates it once and doesn't have any methods for setting or getting it.
The only thing I see volatile providing here is that if I were to set cMap to reference a new object in the near future, those reads and writes would be synchronized.
Am I missing something?

The volatile modifier doesn't have anything to do with the class involved - it's only to do with the variable cMap. It only affects how a thread fetches or changes the value of that variable. By the time you've got as far as invoking methods on the referenced object, you've gone beyond the bailiwick of volatile.
As you say, it basically makes sure that all threads would be guaranteed to see changes to the cMap value (i.e. making it refer to a different map).
That may be a good idea - or it may not, depending on what the rest of the code does. If you could make it final for example, you wouldn't need it to be volatile...

Unless the variable is re-assigned later, the volatile is totally unnecessary.
Any writes that occur in the static initializer are visible to any code using the class (i.e. when a static method or field is accessed, or when a constructor is invoked).
Without that guarantee, we would be in deep trouble: millions of lines of code would be wrong.
See JLS3 §12.4.2:
The procedure for initializing a class
or interface is then as follows:
Synchronize (§14.19) on the Class object that represents the class
or interface to be initialized.

Declaring the cMap reference as volatile ensures that its initialized value is visible to all threads. AFAIK without this, it is not guaranteed, i.e. some threads might see a null reference instead of the reference to a properly initialized map object.
[Update] As @irreputable pointed out (and as is explained in Java Concurrency in Practice, section 3.5.3), I was wrong with the above statement: static initializers are indeed executed by the JVM at class initialization time, guarded by internal synchronization of the JVM. So volatile is not necessary.[/Update]
OTOH declaring it final (and initializing it straight away, not in a separate static block) would guarantee visibility too.
Note that the reference being volatile or not has nothing to do with the class of the variable it refers to. Even if the referred class is threadsafe, the reference itself may not be.

You would declare cMap volatile only if its value changes. Declaring it volatile says nothing about the objects held in the map.
If cMap changes, all threads will need to see the new, up-to-date value of the CHM. That being said, I would highly recommend making cMap final. A non-final static variable can be dangerous.
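A sketch of the recommended alternative, assuming the map reference is never reassigned: declare it final and let the map itself handle concurrent updates. FinalMapDemo, C_MAP, and the counting workload are hypothetical and only there to exercise the map from several threads.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class FinalMapDemo {
    // final instead of volatile: the reference is assigned once, and class
    // initialization already publishes it safely to every thread.
    private static final ConcurrentMap<String, Integer> C_MAP = new ConcurrentHashMap<>();

    static void record(String key) {
        C_MAP.merge(key, 1, Integer::sum); // atomic per-key update provided by the map
    }

    static int hammer(String key, int threadCount, int perThread) {
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    record(key);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return C_MAP.getOrDefault(key, 0);
    }

    public static void main(String[] args) {
        System.out.println(hammer("hits", 4, 1000)); // prints 4000
    }
}
```

The thread safety of the updates comes from ConcurrentHashMap; final only guarantees that every thread sees the same, fully constructed map.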
