Effect of Java 'final' declaration for class member variables


This is more of a theory question than a solution question, so hear me out.
In C/C++ as well as PHP, you can declare constants. There are usually a couple of ways to do this (#define for example, or 'const type'...), and the ultimate effect is that during compilation a replacement is done so that all of those named constants become literals. This helps because instead of having to access a memory location to find the data, the data is hardcoded in, but without the downsides of hardcoding: having to recall the value whenever it is reused, and having to change every instance of that value when it needs to change.
But Java's final declaration is slightly inscrutable; because I can create a class with unset final vars and initialize them on construction, it means that they are not precompiled as literals as far as I can tell. Other than guaranteeing that they cannot logically change after construction, does the final declaration provide any benefit to efficiency?
References to articles are fine, as long as you make note of the part which explains what final really does and what are if any its benefits other than stopping value changes after construction.
As a corollary, is it possible to actually declare compilation-level constants in Java in any other way than simply using literals (a bad idea anyway?)

Java does have constant expressions; see the Java Language Specification (§15.28).
A compile-time constant expression is an expression denoting a value of primitive type or a String that does not complete abruptly and is composed using only the following:
Literals of primitive type and literals of type String (§3.10.5)
Casts to primitive types and casts to type String
The unary operators +, -, ~, and ! (but not ++ or --)
The multiplicative operators *, /, and %
The additive operators + and -
The shift operators <<, >>, and >>>
The relational operators <, <=, >, and >= (but not instanceof)
The equality operators == and !=
The bitwise and logical operators &, ^, and |
The conditional-and operator && and the conditional-or operator ||
The ternary conditional operator ? :
Parenthesized expressions whose contained expression is a constant expression.
Simple names that refer to constant variables (§4.12.4).
Qualified names of the form TypeName . Identifier that refer to constant variables (§4.12.4).
But in Java, unlike C/C++, you also have a JIT compiler, so additional inlining is possible. So the bottom line is, don't worry about it until you see a real performance problem.
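A minimal sketch of what being a constant expression means in practice. The class and member names are made up for illustration. It relies on two rules from the JLS: String-valued constant expressions are interned, while a string concatenation that is not a constant expression produces a newly created String at run time.

```java
class ConstantFoldingDemo {
    // Constant variable: final, of type String, initialized with a constant expression.
    static final String PREFIX = "AB";
    // Final, but NOT a constant variable: the initializer is a method call.
    static final String SUFFIX = String.valueOf('C');

    // PREFIX + "C" is a constant expression, so it denotes the same interned
    // String object as the literal "ABC".
    static boolean foldedAtCompileTime() {
        return (PREFIX + "C") == "ABC";
    }

    // PREFIX + SUFFIX is not a constant expression, so the concatenation
    // happens at run time and yields a new String object.
    static boolean sameObjectWhenNotConstant() {
        return (PREFIX + SUFFIX) == "ABC";
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime());       // true
        System.out.println(sameObjectWhenNotConstant()); // false
    }
}
```

The == comparisons are used here only to observe object identity; ordinary code should compare strings with equals.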

Java does have constants that the Java compiler will replace with their values at compile time. For example, member variables that are final, of type String, and initialized with a constant expression are effectively constants which are replaced in this way. (This is allowed because class String is immutable.) One consequence is that if you change the string in your source code but don't recompile the classes where this string is used, those classes will not see the new value of the string.
The JLS explains this in the following paragraphs:
4.12.4 final variables
13.4.9 final Fields and Constants

Final fields are aimed to make immutable objects.
Static final fields are your kind of constants.
Compiler optimisation, such as data flow analysis, happens to some degree. Try javap to see the JVM bytecode, if you are interested that far.

does the final declaration provide any benefit to efficiency?
Not really. This is because the JIT can often determine that the value is not changed at runtime and can treat it as final. (Which is a problem if the value is not volatile and is changed in another thread)
In Java 8, you can use local variables in closures if they could be made final, rather than having to declare them as final.
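As a rough illustration of the Java 8 rule mentioned above, the sketch below captures a local variable that is never declared final. The class and method names are hypothetical.

```java
import java.util.function.Supplier;

class EffectivelyFinalDemo {
    static Supplier<String> greeter(String name) {
        // Not declared final, but never reassigned, so it is "effectively final"
        // and may be captured by the lambda below (Java 8 and later).
        String greeting = "Hello, " + name;
        // Uncommenting a reassignment such as the next line would turn the
        // capture into a compile-time error:
        // greeting = greeting + "!";
        return () -> greeting;
    }

    public static void main(String[] args) {
        System.out.println(greeter("Ann").get()); // Hello, Ann
    }
}
```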

Related

Why is instanceof a keyword?

If Java allowed "instanceof" as a name for variables (and fields, type names, package names), it appears, at a first glance, that the language would still remain unambiguous.
In most or all of the productions in Java where an Identifier can appear, there are contextual cues that would prevent confusion with a binary operator.
Regarding the basic production:
RelationalExpression:
...
RelationalExpression instanceof ReferenceType
There are no expressions of the form RelationalExpression Identifier ReferenceType, since appending a single Identifier to any Expression is never valid, and no ReferenceType can be extended by adding an Identifier on the front.
The only other reason I can think of why instanceof must be a keyword would be if there were some other production containing an Identifier which can be broken up into an instanceof expression. That is, there may be productions which are ambiguous if we allow instanceof as an Identifier. However, I can't seem to find any, since an Identifier is almost always separated from its surrounding tokens by a dot (or is identifiable as a MethodName by a following lparen).
Is instanceof a keyword simply out of tradition, rather than necessity? Could new relational operators be introduced in future Java versions, with tokens that collide with identifiers? (For example, could a hypothetical "relatedto" operator be introduced without making it a keyword, which would break existing code?)

That question is different, it's asking why "instanceof" isn't a method, I'm asking whether there are reasons syntactically why
You have a point in that it could have been a method on Object or we have
if (myClass.class.isInstance(obj))
This is more cumbersome, however I would say that chains of instanceof are not considered best practice and making it a little harder might not have been a bad idea.
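For comparison, a small sketch of the two forms side by side; the class and method names are invented for the example. The operator fixes the type in the source, while Class.isInstance takes the type as an ordinary value.

```java
class InstanceCheckDemo {
    // Operator form: the type is fixed in the source code.
    static boolean viaOperator(Object obj) {
        return obj instanceof String;
    }

    // Method form: the type is a Class object, so it can be chosen at run time.
    static boolean viaMethod(Class<?> type, Object obj) {
        return type.isInstance(obj);
    }

    public static void main(String[] args) {
        System.out.println(viaOperator("text"));              // true
        System.out.println(viaMethod(String.class, "text"));  // true
        System.out.println(viaMethod(Integer.class, "text")); // false
        System.out.println(viaMethod(String.class, null));    // false, like instanceof
    }
}
```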
It is worth noting that earlier versions of Java didn't use intrinsics as much as they do now, and using a method would have been far less efficient than a native keyword, though I don't believe that would have to be true today.
Is instanceof a keyword simply out of tradition, rather than necessity?
IMHO keywords were/are considered good practice to make words with special meaning stand out as having and only having a special purpose.
Could new relational operators be introduced in future Java versions, with tokens that collide with identifiers?
Yes. One of the proposals for adding val and var is that they be special types rather than keywords, to avoid conflicting with code which has used them for variable names.
Given a choice, a new language would make these keywords, and it is only for backward compatibility that they might be otherwise. Alternatively, it has been considered to use final rather than val and transient rather than var.
Personally, I think they should add them how other languages do it for consistency otherwise you are going to have every new Java developer asking basic questions like How do I compare strings in Java? What they did made sense but it confuses just about every new developer.
By comparison, they banned making _ a lambda variable to avoid confusion with other languages where this has a special meaning and they have a warning about using _ as a variable that it might be removed in future versions.
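For what it's worth, this is how var ended up in Java 10: it is a reserved type name rather than a keyword, so older code that used var as a variable name still compiles. A minimal sketch, assuming a Java 10 or later compiler; the class and method names are made up.

```java
class VarIdentifierDemo {
    static int sum() {
        int var = 40;   // 'var' as an ordinary variable name, as older code might have it
        var other = 2;  // 'var' as the inferred type of a local variable (Java 10+)
        return var + other;
    }

    public static void main(String[] args) {
        System.out.println(sum()); // 42
    }
}
```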

Why did language designers use angle brackets instead of parentheses?

Reading through the javase api docs, I noticed that pretty much all of the methods in the collections framework use angle brackets. For example:
Collection<String> c = new HashSet<String>();
or
Map<String, Integer> m = new HashMap<String, Integer>();
To the eye they seem to serve the same function as a set of parentheses. I still don't know enough of the Java language to be able to see an overarching connection where angle brackets are used and why that might be the case.
My question is specifically: Is there a significance to the way angle brackets are interpreted by the JVM as opposed to parens? Or is it just a common practice across multiple languages?

The angle brackets came with the introduction of generics in Java 1.5.
Since this is a later addition to an existing language, I guess the angle brackets were chosen to make a clear distinction from the existing parentheses (method and constructor calls), square brackets (array member access) and curly brackets (block delimiters). I'd say angle brackets are the logical choice here.

I guess they are used in Java because they are used in C++, just like everything from int to void.
Found some interesting references, though partial:
From C++ templates: the complete guide By David Vandevoorde, Nicolai M. Josuttis, page 139:
Relatively early during the development of templates, Tom Pennello—a
widely recognized parsing expert working for Metaware—noted some of
the problems associated with angle brackets. Stroustrup also comments
on that topic in [DnE] and argues that humans prefer to read angle
brackets rather than parentheses. However, other possibilities exist,
and Pennello specifically proposed braces (for example, List{::X}) at
a C++ standards meeting in 1991 (held in Dallas). At that time the
extent of the problem was more limited because templates nested inside
other templates—so-called nested templates —were not valid and thus
the discussion of Section 9.3.3 on page 132 was largely irrelevant. As
a result, the committee declined the proposal to replace the angle
brackets.
So I may have been mistaken that the angled brackets were used to help the parser, perhaps they were used to help the programmer, because Bjarne Stroustrup thought they were better.

Parentheses are already reserved for method calls and expression grouping. Angle brackets are used for generic type parameters.
If parentheses were used for both, things could become ambiguous, if not for the compiler, then at least for the reader.

Is there a significance to the way angle brackets are interpreted by
the JVM as opposed to parens?
Neither is interpreted by the JVM: both parentheses and angle brackets are parsed at compile time, and the JVM never sees them, since the JVM is only active at run time.
As side notes:
The <> are used for generics, and their usage is also common in other languages such as C++.
You are referring to new HashSet<String>(); as a method - it is not, it is invoking a constructor. A constructor is not a method.
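One way to see that the type arguments in angle brackets are a compile-time matter is to compare run-time classes. This is only a sketch with invented names; it relies on type erasure, which removes the type arguments from the compiled class.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        // The type arguments exist only for the compiler; at run time both
        // objects are plain ArrayList instances.
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // true
    }
}
```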

In Java, angle brackets indicate the use of a generic type, which is a type that has different semantics depending on the type passed in.
The simplest use case for generics is in specialized collections, to indicate that they should hold only objects of a particular type.
In Java, generics do not actually add a lot to the language except basic enforcement functionality at compile time. The compiler checks that objects inserted into a collection match the given type, and it inserts a cast to that type where objects are retrieved (risking a ClassCastException at run time in the worst case, for example when raw types are mixed in). Because of type erasure, the type arguments themselves are not checked at run time.
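A hedged sketch of where the checks actually happen, with made-up names. The compiler rejects a mismatched add on the generic reference; going through a raw type bypasses that check, and the failure only appears later at the cast the compiler inserted for the retrieval.

```java
import java.util.ArrayList;
import java.util.List;

class GenericCheckDemo {
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean rawTypeBypassFails() {
        List<String> strings = new ArrayList<>();
        // strings.add(42);  // rejected by the compiler: an int is not a String

        List raw = strings;  // raw type: opts out of the compile-time checks
        raw.add(42);         // compiles (unchecked warning), nothing is checked here

        try {
            String first = strings.get(0); // the compiler-inserted cast to String fails
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rawTypeBypassFails()); // true
    }
}
```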

Angle brackets are used to denote type parameter lists for polymorphic ("generic") classes and methods. These are a very different beast from value parameter lists (the stuff in parentheses). If they were the same, then imagine you have the expression new Foo(bar)... How would the parser interpret this? Is bar the name of a type or the name of a variable?
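To make the two kinds of parameter list concrete, here is a small sketch using a hypothetical Box class. The type argument goes in angle brackets and the value argument in parentheses, so the parser never has to guess which is which.

```java
// Hypothetical generic class used only for this illustration.
class Box<T> {
    private final T value;

    Box(T value) {
        this.value = value;
    }

    T get() {
        return value;
    }
}

class TypeVersusValueDemo {
    static String demo() {
        String bar = "a value";
        // <String> is the type argument list, (bar) is the value argument list.
        Box<String> box = new Box<String>(bar);
        return box.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // a value
    }
}
```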

Imagine that C++ used () instead of <> for templates. Now consider this line of code:
foo(bar)*bang;
Is that:
Declaring a local variable bang whose type is a pointer to the template type foo with type argument bar?
Calling the function foo, passing in bar, then multiplying the result by bang?
It's grammatically ambiguous. We could tweak the grammar such that it would always prefer one over the other, but that makes the (already painfully complex) grammar even hairier. Worse, whichever way you pick, users will probably guess wrong sometimes.
So, for C++, it makes sense to use a different grouping character for templates.
Java then just followed in C++'s footsteps.
Most of the trouble here stems from C's decision to not have explicit syntax for variable declaration and instead just use a type annotation to implicitly mean "make a new variable of that type". Languages like Scala which have explicit keywords (var and val) for variables have more freedom with type declaration syntax, which is why they can use [] for generics.

Compile-time constants and variables

The Java language documentation says:
If a primitive type or a string is defined as a constant and the value
is known at compile time, the compiler replaces the constant name
everywhere in the code with its value. This is called a compile-time
constant.
My understanding is if we have a piece of code:
private final int x = 10;
Then, the compiler will replace every occurrence of x in the code with literal 10.
But suppose the constant is initialized at run-time:
private final int x = getX(); // here getX() returns an integer value at run-time.
Will there be any performance drop (howsoever negligible it may be) compared to the compile-time constant?
Another question is whether the below line of code:
private int y = 10; // here y is not final
is treated in same way as compile-time constant by the compiler?
Finally, what I understand from the answers are:
final static means compile-time constant
just final means it's a constant but is initialized at run-time
just static means initialized at run-time
without final is a variable and wouldn't be treated as constant.
Is my understanding correct?

Compile time constant must be:
declared final
primitive or String
initialized within declaration
initialized with constant expression
So private final int x = getX(); is not constant.
As for the second question: private int y = 10; is not a constant (it is non-final), so the optimizer cannot be sure that the value will not change in the future, and it cannot optimize it as well as a constant value. The answer is: no, it is not treated the same way as a compile-time constant.

The JLS makes the following distinctions between final variables and constants:
final variables
A variable can be declared final. A final variable may only be
assigned to once. It is a compile-time error if a final variable is
assigned to unless it is definitely unassigned immediately prior to
the assignment (§16 (Definite Assignment)).
Once a final variable has been assigned, it always contains the same
value. If a final variable holds a reference to an object, then the
state of the object may be changed by operations on the object, but
the variable will always refer to the same object. This applies also
to arrays, because arrays are objects; if a final variable holds a
reference to an array, then the components of the array may be changed
by operations on the array, but the variable will always refer to the
same array.
A blank final is a final variable whose declaration lacks an
initializer.
constants
A constant variable is a final variable of primitive type or type
String that is initialized with a constant expression (§15.28).
From this definition, we can discern that a constant must be:
declared final
of primitive type or type String
initialized within its declaration (not a blank final)
initialized with a constant expression
What about compile-time constants?
The JLS does not contain the phrase compile-time constant. However, programmers often use the terms compile-time constant and constant interchangeably.
If a final variable does not meet the criteria outlined above to be considered a constant, it should technically be referred to as a final variable.

According to JLS, there is no requirement that "constant variable" should be static.
So a "constant variable" may be static or non-static (an instance variable).
But JLS imposes some other requirements for a variable to be a "constant variable" (besides being just final):
being only String or primitive
initialized inline only, because it is final, and blank final is not allowed
initialized with "constant expression" = "compile-time constant expression" (see JLS quote below)
4.12.4. final Variables (JLS)
A constant variable is a final variable of primitive type or type String that is initialized with a constant expression (§15.28).
15.28. Constant Expressions
A compile-time constant expression is an expression denoting a value
of primitive type or a String that does not complete abruptly and is
composed using only the following:
Literals of primitive type and literals of type String (§3.10.1,
§3.10.2, §3.10.3, §3.10.4, §3.10.5)
Casts to primitive types and casts to type String (§15.16)
The unary operators +, -, ~, and ! (but not ++ or --) (§15.15.3,
§15.15.4, §15.15.5, §15.15.6)
The multiplicative operators *, /, and % (§15.17)
The additive operators + and - (§15.18)
The shift operators <<, >>, and >>> (§15.19)
The relational operators <, <=, >, and >= (but not instanceof)
(§15.20)
The equality operators == and != (§15.21)
The bitwise and logical operators &, ^, and | (§15.22)
The conditional-and operator && and the conditional-or operator ||
(§15.23, §15.24)
The ternary conditional operator ? : (§15.25)
Parenthesized expressions (§15.8.5) whose contained expression is a
constant expression.
Simple names (§6.5.6.1) that refer to constant variables (§4.12.4).
Qualified names (§6.5.6.2) of the form TypeName . Identifier that
refer to constant variables (§4.12.4).
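The point that a constant variable need not be static can be observed with a small sketch (class and member names are invented). It uses the JLS rule that String-valued constant expressions are interned, whereas a concatenation that is not a constant expression creates a new String at run time.

```java
class InstanceConstantDemo {
    // Non-static, yet a constant variable: final, String, constant initializer.
    final String inline = "x";
    // Blank final assigned in the constructor: final, but not a constant variable.
    final String assignedLater;

    InstanceConstantDemo() {
        assignedLater = "x";
    }

    // inline + "y" is a constant expression, so it is the interned "xy".
    boolean inlineIsConstant() {
        return (inline + "y") == "xy";
    }

    // assignedLater + "y" is concatenated at run time into a new String.
    boolean assignedLaterIsConstant() {
        return (assignedLater + "y") == "xy";
    }

    public static void main(String[] args) {
        InstanceConstantDemo demo = new InstanceConstantDemo();
        System.out.println(demo.inlineIsConstant());        // true
        System.out.println(demo.assignedLaterIsConstant()); // false
    }
}
```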

There might be a really small performance drop on some machines for private final int x = getX(); since that would involve at least one method call (besides the fact that this isn't a compile-time constant) but as you said, it would be negligible so why bother?
As for the second question: y isn't final and thus is not a compile time constant, since it might change at runtime.

The final keyword means that a variable will be initialized once and only once. A real constant needs to be declared static as well.
So, none of your examples are treated as constants by the compiler. Nevertheless, the final keyword tells you (and the compiler) that your variables will be initialized only once (in the constructor or with a literal).
If you need their values assigned at compile time your fields must be static.
Performance is not really affected that much, but keep in mind that primitive types are immutable: once you have created one it will hold that value in memory until the garbage collector removes it.
So, if you have a variable y = 1; and then you change it to y = 2; in memory the JVM will have both values, but your variable will "point" to the latter.
private int y = 10; // here y is not final
is treated in same way as compile time constant by the compiler ?
No. This is an instance variable, created, initialized, and used at runtime.

Just keep in mind that in the following code, x is not compile time constant:
public static void main(String[] args) {
final int x;
x= 5;
}
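One observable difference between the two kinds of local final variable is implicit narrowing, which the language only permits for constant expressions. A minimal sketch with hypothetical names:

```java
class LocalConstantDemo {
    static byte fromConstant() {
        final int x = 5;   // initialized in the declaration: a constant variable
        byte b = x;        // allowed: a constant expression whose value fits in a byte
        return b;
    }

    static byte fromBlankFinal() {
        final int x;       // blank final
        x = 5;             // assigned later: final, but not a constant variable
        // byte b = x;     // would not compile: possible lossy conversion from int to byte
        byte b = (byte) x; // an explicit cast is required instead
        return b;
    }

    public static void main(String[] args) {
        System.out.println(fromConstant());   // 5
        System.out.println(fromBlankFinal()); // 5
    }
}
```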

private final int x = getX();
getX() will be called each time an object of the class is constructed. The performance "drop" will depend on getX(), but that is not the kind of thing that creates a bottleneck.
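A quick way to see when such an initializer runs is to count the calls. The sketch below uses a stand-in getX() that returns a fixed value; the counter and class name are assumptions made for the example.

```java
class InitializerCallDemo {
    static int calls = 0;

    // Hypothetical stand-in for the getX() from the question.
    static int getX() {
        calls++;
        return 10;
    }

    // Evaluated every time an instance is constructed.
    private final int x = getX();

    int x() {
        return x;
    }

    public static void main(String[] args) {
        new InitializerCallDemo();
        new InitializerCallDemo();
        System.out.println(calls); // 2
    }
}
```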

Simply speaking, during compilation the compiler replaces the reference with the actual value specified, instead of using the reference parameter.
public static void main(String[] args) {
final int x = 5;
}
That is, during compilation the compiler uses the initialized value 5 directly rather than the reference variable 'x'.
Please check this explanation

difference between keyword and literal in java

I am new to java and was reading a book and came across these lines:
"The literals true, false, and null are lowercase, not uppercase as in C++ language. Strictly speaking, these are not keywords but literals."
Why are these literals, and what requirements must be met for some keywords to be called literals?

Keywords are words that are used as part of code structure, like for or while. They change the way a compiler handles a block of code, e.g. a for tells the compiler to execute the code within the specified scope repeatedly, until the given exit condition is reached. The class keyword tells the compiler to treat everything within the specified scope to be part of a particular class. Keyword names are restricted, so you can't use them as variable names.
Literals like true, false and null are values that can be assigned, but their names are restricted in the same way that keywords are, i.e. you can't have a variable called true or for. They form parts of expressions, but don't change the way a compiler handles code.

true, false and null are expressions. They denote special built-in values, so they are considered literals (along with more traditional literals, such as 123 and "xyz").
for, if, class, etc. are keywords. They communicate your declarations and statements to the compiler, but they do not represent values. That is why they are not literals.

The keywords are defined in the Java Language Specification #3.9. 'true' is not among them. The literals are defined in #3.10, and they include 'true'. The text of those sections answers your question completely.

Literals are the symbols whereas the identifiers are the keywords.
Well, no, but an example may help:
12, 1e3, 0x4a, 'a', "Hello\n" -- literals
_debug, n, stdio, main, argc, printf -- identifiers

Is Java guaranteed to inline string constants if they can be determined at compile time

Consider this case:
public class Class1 {
public static final String ONE = "ABC";
public static final String TWO = "DEF";
}
public class Class2 {
public void someMethod() {
System.out.println(Class1.ONE + Class1.TWO);
}
}
Typically you would expect the compiler to inline the ONE and TWO constants. However, is this behavior guaranteed? Can you deploy at runtime Class2 without Class1 in the classpath, and expect it to work regardless of compilers, or is this an optional compiler optimization?
EDIT: Why on earth do this? Well, I have a constant that would be shared between two ends of an application (client and server over RMI), and it would be very convenient in this particular case to put the constant on a class that can only be on one side of that divide (as it is logically the one that owns that constant value) rather than have it in an arbitrary constants class just because it needs to be shared by both sides of the code. At compile time it's all one set of source files, but at build time it is divided by package.

It's guaranteed to be treated as a constant expression, and guaranteed to be interned by section 15.28 of the JLS:
A compile-time constant expression is
an expression denoting a value of
primitive type or a String that does
not complete abruptly and is composed
using only the following:
Literals of primitive type and literals of type String (§3.10.5)
Casts to primitive types and casts to type String
The unary operators +, -, ~, and ! (but not ++ or --)
The multiplicative operators *, /, and %
The additive operators + and -
...
...
Compile-time constants of type String
are always "interned" so as to share
unique instances, using the method
String.intern.
Now, that doesn't quite say it's guaranteed to be inlined. However, section 13.1 of the spec says:
References to fields that are constant
variables (§4.12.4) are resolved at
compile time to the constant value
that is denoted. No reference to such
a constant field should be present in
the code in a binary file (except in
the class or interface containing the
constant field, which will have code
to initialize it), and such constant
fields must always appear to have been
initialized; the default initial value
for the type of such a field must
never be observed.
In other words, even if the expression itself weren't a constant, there should be no reference to Class1. So yes, you're okay. That doesn't necessarily guarantee that the concatenated value is used in the bytecode, but the bits referenced earlier guarantee that the concatenated value is interned, so I'd be hugely surprised if it didn't just inline the concatenated value. Even if it doesn't, you're guaranteed that it'll work without Class1.
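A related effect can be observed inside a single program: because a reference to a constant variable is resolved at compile time, reading it does not even trigger initialization of the declaring class. The sketch below is an illustration under that assumption, with invented names; it does not reproduce the split-classpath deployment itself.

```java
class InliningDemo {
    static boolean holderInitialized = false;

    static class Holder {
        static final String ONE = "ABC";             // constant variable
        static final String TWO = new String("DEF"); // final, but not a constant variable

        static {
            holderInitialized = true; // runs when Holder is initialized
        }
    }

    static String readConstant() {
        return Holder.ONE; // resolved at compile time, no access to Holder at run time
    }

    static String readNonConstant() {
        return Holder.TWO; // a real field access, which initializes Holder
    }

    public static void main(String[] args) {
        readConstant();
        System.out.println(holderInitialized); // false
        readNonConstant();
        System.out.println(holderInitialized); // true
    }
}
```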

Compiling that with javac 1.6.0_14 produces the following bytecode:
public void someMethod();
Code:
0: getstatic #2; //Field java/lang/System.out:Ljava/io/PrintStream;
3: ldc #3; //String ABCDEF
5: invokevirtual #4; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
8: return
So the strings are concatenated at compile time and the result is included in Class2's constant pool.

It won't be inlined by the compiler but by the interpreter at runtime and, if possible, converted to assembly code.
It cannot be guaranteed, because not all interpreters (JVMs) work the same way. But the most important implementations will do it.
Unfortunately I don't have a link to sustain this :(

I suspect, but don't know for sure, that this will work, but it doesn't sound like a good idea.
The "normal" ways to do this are:
Put the constants in a package that's shared between the client and the server. Presumably, there is such a package, because that's where the interfaces go.
If there's no such package, create 2 classes with the shared constants: one for the server and one for the client.

See JLS 13.4.9. While it does not explicitly require that constants are inlined by the compiler, it hints that conditional compilation and support for constants in switch statements cause the compiler to always inline constants.
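The switch case mentioned above is easy to try out. In this sketch (names are hypothetical) a case label accepts a constant variable, while a final field initialized by a method call is rejected by the compiler and has to be handled another way.

```java
class SwitchLabelDemo {
    static final int ONE = 1;                     // constant variable
    static final int TWO = Integer.parseInt("2"); // final, but not a constant variable

    static String describe(int value) {
        switch (value) {
            case ONE:        // allowed: ONE is a constant expression
                return "one";
            // case TWO:     // would not compile: constant expression required
            default:
                return value == TWO ? "two" : "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(1)); // one
        System.out.println(describe(2)); // two
        System.out.println(describe(3)); // other
    }
}
```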

It looks like you're coding your own version of the capability built into enum, which does public static final for you, proper naming via name() and toString() (as well as having some other advantages, but perhaps having the disadvantage of a larger memory footprint).
Are you using an older version of Java that doesn't include enum yet?
