Related
In a lambda, local variables need to be final, but instance variables don't. Why so?
The fundamental difference between a field and a local variable is that the local variable is copied when JVM creates a lambda instance. On the other hand, fields can be changed freely, because the changes to them are propagated to the outside class instance as well (their scope is the whole outside class, as Boris pointed out below).
The easiest way of thinking about anonymous classes, closures and labmdas is from the variable scope perspective; imagine a copy constructor added for all local variables you pass to a closure.
In a document of project lambda, State of the Lambda v4, under Section 7. Variable capture, it is mentioned that:
It is our intent to prohibit capture of mutable local variables. The
reason is that idioms like this:
int sum = 0;
list.forEach(e -> { sum += e.size(); });
are fundamentally serial; it is quite difficult to write lambda bodies
like this that do not have race conditions. Unless we are willing to
enforce—preferably at compile time—that such a function cannot escape
its capturing thread, this feature may well cause more trouble than it
solves.
Another thing to note here is, local variables are passed in the constructor of an inner class when you access them inside your inner class, and this won't work with non-final variable because value of non-final variables can be changed after construction.
While in case of an instance variable, the compiler passes a reference of the object and object reference will be used to access instance variables. So, it is not required in case of instance variables.
PS : It is worth mentioning that anonymous classes can access only final local variables (in Java SE 7), while in Java SE 8 you can access effectively final variables also inside lambda as well as inner classes.
In Java 8 in Action book, this situation is explained as:
You may be asking yourself why local variables have these restrictions.
First, there’s a key
difference in how instance and local variables are implemented behind the scenes. Instance
variables are stored on the heap, whereas local variables live on the stack. If a lambda could
access the local variable directly and the lambda were used in a thread, then the thread using the
lambda could try to access the variable after the thread that allocated the variable had
deallocated it. Hence, Java implements access to a free local variable as access to a copy of it
rather than access to the original variable. This makes no difference if the local variable is
assigned to only once—hence the restriction.
Second, this restriction also discourages typical imperative programming patterns (which, as we
explain in later chapters, prevent easy parallelization) that mutate an outer variable.
Because instance variables are always accessed through a field access operation on a reference to some object, i.e. some_expression.instance_variable. Even when you don't explicitly access it through dot notation, like instance_variable, it is implicitly treated as this.instance_variable (or if you're in an inner class accessing an outer class's instance variable, OuterClass.this.instance_variable, which is under the hood this.<hidden reference to outer this>.instance_variable).
Thus an instance variable is never directly accessed, and the real "variable" you're directly accessing is this (which is "effectively final" since it is not assignable), or a variable at the beginning of some other expression.
Putting up some concepts for future visitors:
Basically it all boils down to the point that compiler should be able to deterministically tell that lambda expression body is not working on a stale copy of the variables.
In case of local variables, compiler has no way to be sure that lambda expression body is not working on a stale copy of the variable unless that variable is final or effectively final, so local variables should be either final or effectively final.
Now, in case of instance fields, when you access an instance field inside the lambda expression then compiler will append a this to that variable access (if you have not done it explicitly) and since this is effectively final so compiler is sure that lambda expression body will always have the latest copy of the variable (please note that multi-threading is out of scope right now for this discussion). So, in case instance fields, compiler can tell that lambda body has latest copy of instance variable so instance variables need not to be final or effectively final. Please refer below screen shot from an Oracle slide:
Also, please note that if you are accessing an instance field in lambda expression and that is getting executed in multi-threaded environment then you could potentially run in problem.
It seems like you are asking about variables that you can reference from a lambda body.
From the JLS §15.27.2
Any local variable, formal parameter, or exception parameter used but not declared in a lambda expression must either be declared final or be effectively final (§4.12.4), or a compile-time error occurs where the use is attempted.
So you don't need to declare variables as final you just need to make sure that they are "effectively final". This is the same rule as applies to anonymous classes.
Within Lambda expressions you can use effectively final variables from the surrounding scope.
Effectively means that it is not mandatory to declare variable final but make sure you do not change its state within the lambda expresssion.
You can also use this within closures and using "this" means the enclosing object but not the lambda itself as closures are anonymous functions and they do not have class associated with them.
So when you use any field (let say private Integer i;)from the enclosing class which is not declared final and not effectively final it will still work as the compiler makes the trick on your behalf and insert "this" (this.i).
private Integer i = 0;
public void process(){
Consumer<Integer> c = (i)-> System.out.println(++this.i);
c.accept(i);
}
Here is a code example, as I didn't expect this either, I expected to be unable to modify anything outside my lambda
public class LambdaNonFinalExample {
static boolean odd = false;
public static void main(String[] args) throws Exception {
//boolean odd = false; - If declared inside the method then I get the expected "Effectively Final" compile error
runLambda(() -> odd = true);
System.out.println("Odd=" + odd);
}
public static void runLambda(Callable c) throws Exception {
c.call();
}
}
Output:
Odd=true
YES, you can change the member variables of the instance but you CANNOT change the instance itself just like when you handle variables.
Something like this as mentioned:
class Car {
public String name;
}
public void testLocal() {
int theLocal = 6;
Car bmw = new Car();
bmw.name = "BMW";
Stream.iterate(0, i -> i + 2).limit(2)
.forEach(i -> {
// bmw = new Car(); // LINE - 1;
bmw.name = "BMW NEW"; // LINE - 2;
System.out.println("Testing local variables: " + (theLocal + i));
});
// have to comment this to ensure it's `effectively final`;
// theLocal = 2;
}
The basic principle to restrict the local variables is about data and computation validity
If the lambda, evaluated by the second thread, were given the ability to mutate local variables. Even the ability to read the value of mutable local variables from a different thread would introduce the necessity for synchronization or the use of volatile in order to avoid reading stale data.
But as we know the principal purpose of the lambdas
Amongst the different reasons for this, the most pressing one for the Java platform is that they make it easier to distribute processing of collections over multiple threads.
Quite unlike local variables, local instance can be mutated, because it's shared globally. We can understand this better via the heap and stack difference:
Whenever an object is created, it’s always stored in the Heap space and stack memory contains the reference to it. Stack memory only contains local primitive variables and reference variables to objects in heap space.
So to sum up, there are two points I think really matter:
It's really hard to make the instance effectively final, which might cause lots of senseless burden (just imagine the deep-nested class);
the instance itself is already globally shared and lambda is also shareable among threads, so they can work together properly since we know we're handling the mutation and want to pass this mutation around;
Balance point here is clear: if you know what you are doing, you can do it easily but if not then the default restriction will help to avoid insidious bugs.
P.S. If the synchronization required in instance mutation, you can use directly the stream reduction methods or if there is dependency issue in instance mutation, you still can use thenApply or thenCompose in Function while mapping or methods similar.
First, there is a key difference in how local and instance variables are implemented behind the scenes. Instance variables are stored in the heap, whereas local variables stored in the stack.
If the lambda could access the local variable directly and the lambda was used in a thread, then the thread using the lambda could try to access the variable after the thread that allocated the variable had deallocated it.
In short: to ensure another thread does not override the original value, it is better to provide access to the copy variable rather than the original one.
I read somewhere that even if ConcurrentHashMap is guaranteed to be safe for using in multiple threads it should be declared as final, even private final. My questions are the following:
1) Will CocurrentHashMap still keep thread safety without declaring it as final?
2) The same question about private keyword. Probably it's better to ask more general question - do public/private keywords affect on runtime behavior? I understand their meaning in terms of visibility/usage in internal/external classes but what about meaning in the context of multithreading runtime? I believe code like public ConcurrentHashMap may be incorrect only in coding style terms not in runtime, am I right?
It might be helpful to give a more concrete example of what I was talking about in the comments. Let's say I do something like this:
public class CHMHolder {
private /*non-final*/ CHMHolder instance;
public static CHMHolder getInstance() {
if (instance == null) {
instance = new CHMHolder();
}
return instance;
}
private ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
public ConcurrentHashMap<String, String> getMap() {
return map;
}
}
Now, this is not thread-safe for a whole bunch of reasons! But let's say that threadA sees a null value for instance and thus instantiates the CHMHolder, and then threadB, by a happy coincidence, sees that same CHMHolder instance (which is not guaranteed, since there's no synchronization). You would think that threadB sees a non-null CHMHolder.map, right? It might not, since there's no formal happens-before edge between threadA's map = new ... and threadB's return map.
What this means in practice is that something like CHMHolder.getInstance().getMap().isEmpty() could throw a NullPointerException, which would be confusing — after all, getInstance looks like it should always return a non-null CHMHolder, and CHMHolder looks like it should always have a non-null map. Ah, the joys of multithreading!
If map were marked final, then the JLS bit that user2864740 referenced applies. That means that if threadB sees the same instance that threadA sees (which, again, it might not), then it'll also see the map = new... action that threadA did -- that is, it will see the non-null CHM instance. Once it sees that, CHM's internal thread safety will be enough to ensure safe access.
final and private say nothing about the thread-safety (or lack thereof) of the object named by said variable. (They modify the variable, not the object.) Anyway ..
The variable will be consistent across threads if it is a final field:
An object is considered to be completely initialized when its constructor finishes. A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object's final fields.
The actual ConcurrentHashMap object is "thread safe" insofar as the guarantees it makes. In particular, only single method calls/operations are guaranteed and as such using larger synchronization code may be required .. which is easily controlled if the CHM is only accessible from the object that created it.
Using private is normally considered good because it prevents other code from "accidently" accessing a variable (and thus the object it names) when they should not. However, the private modifier does not establish the same happens-before guarantee as the final modifier and is thus orthogonal to thread-safety.
I would like to known if each instance of a class has its own copy of the methods in that class?
Lets say, I have following class MyClass:
public MyClass {
private String s1;
private String s2;
private String method1(String s1){
...
}
private String method2(String s2){
...
}
}
So if two differents users make an instance of MyClass like:
MyClass instanceOfUser1 = new MyClass();
MyClass instanceOfUser2 = new MyClass();
Does know each user have in his thread a copy of the methods of MyClass? If yes, the instance variables are then thread-safe, as long as only the instance methods manipulate them, right?
I am asking this question because I often read that instance variables are not thread-safe. And I can not see why it should be like that, when each user gets an instance by calling the new operator?
Each object gets its own copy of the class's instance variables - it's static variables that are shared between all instances of a class. The reason that instance variables are not necessarily thread-safe is that they might be simultaneously modified by multiple threads calling unsynchronized instance methods.
class Example {
private int instanceVariable = 0;
public void increment() {
instanceVariable++;
}
}
Now if two different threads call increment at the same then you've got a data race - instanceVariable might increment by 1 or 2 at the end of the two methods returning. You could eliminate this data race by adding the synchronized keyword to increment, or using an AtomicInteger instead of an int, etc, but the point is that just because each object gets its own copy of the class's instance variables does not necessarily mean that the variables are accessed in a thread-safe manner - this depends on the class's methods. (The exception is final immutable variables, which can't be accessed in a thread-unsafe manner, short of something goofy like a serialization hack.)
Issues with multi-threading arise primarily with static variables and instances of a class being accessed at the same time.
You shouldn't worry about methods in the class but more about the fields (meaning scoped at the class level). If multiple references to an instance of a class exist, different execution paths may attempt to access the instance at the same time, causing unintended consequences such as race conditions.
A class is basically a blueprint for making an instance of an object. When the object is instantiated it receives a spot in memory that is accessed by a reference. If more than one thread has a handle to this reference it can cause occurrences where the instance is accessed simultaneously, this will cause fields to be manipulated by both threads.
'Instance Variables are not thread safe' - this statement depends on the context.
It is true, if for example you are talking about Servlets. It is because, Servlets create only one instance and multiple threads access it. So in that case Instance Variables are not thread safe.
In the above simplified case, if you are creating new instance for each thread, then your instance variables are thread safe.
Hope this answers your question
A method is nothing but a set of instructions. Whichever thread calls the method, get a copy of those instructions. After that the execution begins. The method may use local variables which are method and thread-scoped, or it may use shared resources, like static resources, shared objects or other resources, which are visible across threads.
Each instance has its own set of instance variables. How would you detect whether every instance had a distinct "copy" of the methods? Wouldn't the difference be visible only by examining the state of the instance variables?
In fact, no, there is only one copy of the method, meaning the set of instructions executed when the method is invoked. But, when executing, an instance method can refer to the instance on which it's being invoked with the reserved identifier this. The this identifier refers to the current instance. If you don't qualify an instance variable (or method) with something else, this is implied.
For example,
final class Example {
private boolean flag;
public void setFlag(boolean value) {
this.flag = value;
}
public void setAnotherFlag(Example friend) {
friend.flag = this.flag;
}
}
There's only one copy of the bytes that make up the VM instructions for the setFlag() and setAnotherFlag() methods. But when they are invoked, this is set to the instance upon which the invocation occurred. Because this is implied for an unqualified variable, you could delete all the references to this in the example, and it would still function exactly the same.
However, if a variable is qualified, like friend.flag above, the variables of another instance can be referenced. This is how you can get into trouble in a multi-threaded program. But, as long as an object doesn't "escape" from one thread to be visible to others, there's nothing to worry about.
There are many situations in which an instance may be accessible from multiple classes. For example, if your instance is a static variable in another class, then all threads would share that instance, and you can get into big trouble that way. That's just the first way that pops into my mind...
Generally, final static members especially, variables (or static final of course, they can be used in either order without overlapping the meaning) are extensively used with interfaces in Java to define a protocol behavior for the implementing class which implies that the class that implements (inherits) an interface must incorporate all of the members of that interface.
I'm unable to differentiate between a final and a final static member. The final static member is the one which is a static member declared as final or something else? In which particular situations should they be used specifically?
A static variable or a final static variable can never be declared inside a method neither inside a static method nor inside an instance method. Why?
The following segment of code accordingly, will not be compiled and an compile-time error will be issued by the compiler, if an attempt is made to compile it.
public static void main(String args[])
{
final int a=0; //ok
int b=1; //ok
static int c=2; //wrong
final static int x=0; //wrong
}
You are making a huge mix of many different concepts. Even the question in the title does not correspond to the question in the body.
Anyways, these are the concepts you are mixing up:
variables
final variables
fields
final fields
static fields
final static fields
The keyword static makes sense only for fields, but in the code you show you are trying to use it inside a function, where you cannot declare fields (fields are members of classes; variables are declared in methods).
Let's try to rapidly describe them.
variables are declared in methods, and used as some kind of mutable local storage (int x; x = 5; x++)
final variables are also declared in methods, and are used as an immutable local storage (final int y; y = 0; y++; // won't compile). They are useful to catch bugs where someone would try to modify something that should not be modified. I personally make most of my local variables and methods parameters final. Also, they are necessary when you reference them from inner, anonymous classes. In some programming languages, the only kind of variable is an immutable variable (in other languages, the "default" kind of variable is the immutable variable) -- as an exercise, try to figure out how to write a loop that would run an specified number of times when you are not allowed to change anything after initialization! (try, for example, to solve fizzbuzz with only final variables!).
fields define the mutable state of objects, and are declared in classes (class x { int myField; }).
final fields define the immutable state of objects, are declared in classes and must be initialized before the constructor finishes (class x { final int myField = 5; }). They cannot be modified. They are very useful when doing multithreading, since they have special properties related to sharing objects among threads (you are guaranteed that every thread will see the correctly initialized value of an object's final fields, if the object is shared after the constructor has finished, and even if it is shared with data races). If you want another exercise, try to solve fizzbuzz again using only final fields, and no other fields, not any variables nor method parameters (obviously, you are allowed to declare parameters in constructors, but thats all!).
static fields are shared among all instances of any class. You can think of them as some kind of global mutable storage (class x { static int globalField = 5; }). The most trivial (and usually useless) example would be to count instances of an object (ie, class x { static int count = 0; x() { count++; } }, here the constructor increments the count each time it is called, ie, each time you create an instance of x with new x()). Beware that, unlike final fields, they are not inherently thread-safe; in other words, you will most certainly get a wrong count of instances of x with the code above if you are instantiating from different threads; to make it correct, you'd have to add some synchronization mechanism or use some specialized class for this purpose, but that is another question (actually, it might be the subject of a whole book).
final static fields are global constants (class MyConstants { public static final double PI = 3.1415926535897932384626433; }).
There are many other subtle characteristics (like: compilers are free to replace references to a final static field to their values directly, which makes reflection useless on such fields; final fields might actually be modified with reflection, but this is very error prone; and so on), but I'd say you have a long way to go before digging in further.
Finally, there are also other keywords that might be used with fields, like transient, volatile and the access levels (public, protected, private). But that is another question (actually, in case you want to ask about them, many other questions, I'd say).
Static members are those which can be accessed without creating an object. This means that those are class members and nothing to do with any instances. and hence can not be defined in the method.
Final in other terms, is a constant (as in C). You can have final variable inside the method as well as at class level. If you put final as static it becomes "a class member which is constant".
I'm unable to differentiate between a final and a final static member.
The final static member is the one which is a static member declared
as final or something else? In which particular situations should they
be used specifically?
Use a final static when you want it to be static. Use a final (non-static) when you don't want it to be static.
A static variable or a final static variable can never be declared
inside a method neither inside a static method nor inside an instance
method. Why?
Design decision. There's just no way to answer that without asking James Gosling.
The following segment of code accordingly, will not be compiled and an
compile-time error will be issued by the compiler, if an attempt is
made to compile it.
Because it violates the rule you just described.
final keyword simply means "this cannot be changed".It can be used with both fields and variables in a method.When a variable is declared final an attempt to change the variable will result to a compile-time error.For example if i declare a variable as final int x = 12; trying to increment x that is (++x) will produce an error.In short with primitives final makes a value a constant.
On the other hand static can only be applied with fields but not in methods.A field that is final static has only one piece of storage.final shows that it is a constant(cannot be changed), static shows it is only one.
In Java, a static variable is one that belongs to class rather than the object of a class, different instances of the same class will contain the same static variable value.
A final variable is one that once after initialized ,after the instantiation of a class (creation of an object) cannot be altered in the program. However this differ from objects if a different value is passed post creation of another object of the same class.
final static means that the variable belongs to the class as well as cannot be change once initialized. So it will be accessible to the same value throughout different instances of the same class.
Just to add a minor information to #Bruno Reis 's answer, which I sought to complete the answer, as he spoke about important condition to initialize final fields before constructor ends, final static fields must also be initialized before before static blocks' execution finishes.
You cannot declare static fields in static block, static fields can only belong to a class, hence the compiler error.
When you declare a final variable (constant) in a class, for example:
private static final int MyVar = 255;
How much memory will this require if I have 100,000 instances of the class which declared this?
Will it link the variable to the class and thus have 1*MyVar memory usage (disregarding internal pointers), or will it link to the instance of this variable and create 100,000*MyVar copies of this variable?
Unbelievably fast response! The consensus seems to be that if a variable is both static and final then it will require 1*MyVar. Thanks all!
The final keyword is irrelevant to the amount of memory used, since it only means that you can't change the value of the variable.
However, since the variable is declared static, there will be only one such variable that belongs to the class and not to a specific instance.
Taken from here:
If a field is declared static, there exists exactly one incarnation of the field, no matter how many instances (possibly zero) of the class may eventually be created. A static field, sometimes called a class variable, is incarnated when the class is initialized . A field that is not declared static (sometimes called a non-static field) is called an instance variable. Whenever a new instance of a class is created, a new variable associated with that instance is created for every instance variable declared in that class or any of its superclasses.
There will be only 1*MyVar memory usage because it is declared as static.
The static declaration means it will only have one instance for that class and it's subclasses (unless they override MyVar).
An int is a 32-bit signed 2's complement integer primitive, so it takes 4 bytes to hold it, if your example wasn't using static you'd just multiply that by the number of instances you have (for your example of 100,000 instances that's 0.38 of a megabyte - for the field alone, extra overhead for actual classes).
The final modifier on a field means it cannot be repointed to another value (whereas final on a class of method means it cannot be overridden).
It's static and thus class scope -> 1.
Edit: actually, it depends on the class loaders. In the general case you have one copy of the class but if you have multiple class loaders/class repositories (might be the case in application servers etc.) you could end up with more.
In addition to the fact that static fields belong to their classes, and thus there is only one instance of static varaible per class (and per classloader), it's important to understand that static final variables initialized by compile-time constant expressions are inlined into classes that use them.
JLS §13.1 The Form of a Binary:
References to fields that are constant variables (§4.12.4) are resolved at compile time to the constant value that is denoted. No reference to such a constant field should be present in the code in a binary file (except in the class or interface containing the constant field, which will have code to initialize it), and such constant fields must always appear to have been initialized; the default initial value for the type of such a field must never be observed.
So, in practice, the instance of static final variable that belong to its class is not the only instance of value of that variable - there are other instances of that value inlined into constant pools (or code) of classes that use the variable in question.
class Foo {
public static final String S = "Hello, world!";
}
class Bar {
public static void main(String[] args) {
// No real access to class Foo here
// String "Hello, world!" is inlined into the constant pool of class Bar
String s = Foo.S;
System.out.println(s);
}
}
In practice it means that if you change the value of Foo.S in class Foo, but don't recompile class Bar, class Bar will print the old value of Foo.S.
static means you will have only one instatnce
final just means, that you can't reassign that value.
The crucial part here is that you declared the variable as static because static variables are shared among all instances of the class, thus requiring only as much space as one instance of the variable. Declaring a variable final makes it immutable outside its declaration or constructor.
final makes is 1*instances memory usage.
However, static makes it simply 1.
The keyword "final" helps you to declare a constant with a specific amount of memory, where as the keyword "static" as its prefix will gives a single instance of this constant, what ever be the amount of memory consumed...!!!
Static means, one instance per class, static variable created once and can be shared between different object.
Final variable, once value is initialized it can't be changed. Final static variable use to create constant (Immutable) and refer directly without using the object.
It's static, so there will be only one instance created, so however many bytes are required to hold one int primitive will be allocated
You will have one instance per class. If you have the class loaded more than once (in different class loaders) it will be loaded once per class loader which loads it.
BTW: Memory is surprising cheap these days. Even if there was a copy per instance, the time it takes you to ask the question is worth more than the memory you save. You should make it static final for clarity rather than performance. Clearer code is easier to maintain and is often more efficient as well.