difference between keyword and literal in java

I am new to Java and was reading a book and came across these lines:
"The literals true, false and null are lowercase, not uppercase as in the C++ language. Strictly speaking, these are not keywords but literals."
Why are these literals, and what makes a reserved word a literal rather than a keyword?

Keywords are words that are used as part of code structure, like for or while. They change the way a compiler handles a block of code, e.g. a for tells the compiler to execute the code within the specified scope repeatedly, until the given exit condition is reached. The class keyword tells the compiler to treat everything within the specified scope as part of a particular class. Keyword names are restricted, so you can't use them as variable names.
Literals like true, false and null are values that can be assigned, but their names are restricted in the same way that keywords are, i.e. you can't have a variable called true or for. They form parts of expressions, but don't change the way a compiler handles code.

true, false and null are expressions. They denote special built-in values, so they are considered literals (along with more traditional literals, such as 123 and "xyz").
for, if, class, etc. are keywords. They communicate your declarations and statements to the compiler, but they do not represent values. That is why they are not literals.
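To make the distinction concrete, here is a minimal sketch (the class and method names are invented for illustration) showing that true, null, 123 and "xyz" can all be passed around as values, which is exactly what a keyword cannot do:

```java
// Illustrative only: the class and method names are made up for this sketch.
class LiteralsAreValues {
    // Reports the runtime type of a value; null has no class, so handle it first.
    static String typeOf(Object value) {
        return value == null ? "null" : value.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        boolean done = true;    // 'true' is a boolean literal: it denotes a value
        String missing = null;  // 'null' is the null literal: assignable to any reference type
        System.out.println(typeOf(done));     // Boolean (after autoboxing)
        System.out.println(typeOf(missing));  // null
        System.out.println(typeOf(123));      // Integer
        System.out.println(typeOf("xyz"));    // String
        // A keyword such as 'for' or 'class' is not a value,
        // so something like typeOf(for) does not compile.
    }
}
```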

The keywords are defined in the Java Language Specification #3.9. 'true' is not among them. The literals are defined in #3.10, and they include 'true'. The text of those sections answers your question completely.

Literals are the symbols whereas the identifiers are the keywords.
Well, no, but an example may help:
12, 1e3, 0x4a, 'a', "Hello\n" -- literals
_debug, n, stdio, main, argc, printf -- identifiers

Related

What is the convention for variable names in lambda expressions when the variable is not used?

I'm trying to find a name similar to what i,j are for loops, or x,y is for coordinates etc.
I have a code like this:
DbSetup.setupCommon(x -> HibernateHelper.addResource(SpecificEntityHelper.HIBERNATE_RESOURCE, schemaName));
In this case x is not a required variable name and I found it is used in few places in code base of our project, so probably is
Of course, in this specific case I cannot use a method reference such as HibernateHelper::addResource, and it seems there is no way to avoid naming the variable at all.

Currently, Java syntax doesn't provide a way to have no identifier at all if the argument is to be ignored.
A single _ may serve that purpose in the future, but as of Java 16 _ is just a keyword that is "reserved for possible future use in parameter declarations" (see JLS 3.9).
It is inadvisable to use $ because all identifiers that contain $ are reserved for use by source code generators or for legacy purposes; see JLS 3.8.
Also, there isn't an established conventional name for a dummy argument to a lambda expression.
My advice would be to just use a single letter identifier; e.g. x. A lambda expression will typically be small enough that you can easily see that (say) x is not used in the expression.
Alternatively, you could pick a name like unused or dummy to flag your intent to not use the value.
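As a small sketch of the second suggestion (the class and constant names here are hypothetical, not from the question's code base), a descriptive name makes the intent visible at the call site:

```java
import java.util.function.Function;

// Hypothetical example: a lambda that must accept an argument but ignores it.
class UnusedLambdaParameter {
    // The parameter is deliberately ignored; the name 'unused' signals that intent.
    static final Function<String, Integer> ALWAYS_42 = unused -> 42;

    public static void main(String[] args) {
        System.out.println(ALWAYS_42.apply("anything")); // 42
    }
}
```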

Why is instanceof a keyword?

If Java allowed "instanceof" as a name for variables (and fields, type names, package names), it appears, at a first glance, that the language would still remain unambiguous.
In most or all of the productions in Java where an Identifier can appear, there are contextual cues that would prevent confusion with a binary operator.
Regarding the basic production:
RelationalExpression:
...
RelationalExpression instanceof ReferenceType
There are no expressions of the form RelationalExpression Identifier ReferenceType, since appending a single Identifier to any Expression is never valid, and no ReferenceType can be extended by adding an Identifier on the front.
The only other reason I can think of why instanceof must be a keyword would be if there were some other production containing an Identifier which can be broken up into an instanceof expression. That is, there may be productions which are ambiguous if we allow instanceof as an Identifier. However, I can't seem to find any, since an Identifier is almost always separated from its surrounding tokens by a dot (or is identifiable as a MethodName by a following lparen).
Is instanceof a keyword simply out of tradition, rather than necessity? Could new relational operators be introduced in future Java versions, with tokens that collide with identifiers? (For example, could a hypothetical "relatedto" operator be introduced without making it a keyword, which would break existing code?)

That question is different, it's asking why "instanceof" isn't a method, I'm asking whether there are reasons syntactically why
You have a point in that it could have been a method on Object or we have
if (myClass.class.isInstance(obj))
This is more cumbersome, however I would say that chains of instanceof are not considered best practice and making it a little harder might not have been a bad idea.
It is worth noting that earlier versions of Java didn't use intrinsics as much as they do now, and using a method would have been far less efficient than a native keyword, though I don't believe that would have to be true today.
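For comparison, the two forms can be put side by side. This is only a sketch with invented names; both checks use standard Java (the instanceof operator and Class.isInstance) and give the same answer, including for null:

```java
// Sketch: the operator form and the method form of the same type check.
class InstanceChecks {
    static boolean viaOperator(Object o) {
        return o instanceof CharSequence;        // keyword-based operator
    }

    static boolean viaMethod(Object o) {
        return CharSequence.class.isInstance(o); // ordinary method call, no keyword needed
    }

    public static void main(String[] args) {
        System.out.println(viaOperator("text") + " " + viaMethod("text")); // true true
        System.out.println(viaOperator(42) + " " + viaMethod(42));         // false false
        System.out.println(viaOperator(null) + " " + viaMethod(null));     // false false
    }
}
```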
Is instanceof a keyword simply out of tradition, rather than necessity?
IMHO keywords were/are considered good practice to make words with special meaning stand out as having and only having a special purpose.
Could new relational operators be introduced in future Java versions, with tokens that collide with identifiers?
Yes. One of the proposals for adding val and var is that they be special types rather than keywords, to avoid conflicting with code that has used them as variable names.
Given a choice, a new language would make these keywords, and it is only for backward compatibility that they might be otherwise. Alternatively, it has been considered to use final rather than val and transient rather than var.
Personally, I think they should add them how other languages do it for consistency otherwise you are going to have every new Java developer asking basic questions like How do I compare strings in Java? What they did made sense but it confuses just about every new developer.
By comparison, they banned making _ a lambda variable to avoid confusion with other languages where this has a special meaning and they have a warning about using _ as a variable that it might be removed in future versions.
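For what it is worth, this is how var eventually shipped: since Java 10 it is a reserved type name rather than a keyword, so it can still be used as a variable name. A minimal sketch, assuming a Java 10 or later compiler (the class name is invented):

```java
// Assumes Java 10+; on older compilers the 'var' declaration does not compile.
class VarIsNotAKeyword {
    static int demo() {
        var var = 41; // first 'var' asks for type inference, second is an ordinary variable name
        var += 1;
        return var;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 42
    }
}
```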

Why variable names in Java cannot have same names as keywords?

In most programming languages that I know, you cannot declare a variable with a name that is also a keyword.
For example in Java:
public class SomeClass
{
Class<?> clazz = Integer.class; // OK.
Class<?> class = Integer.class; // Compilation error.
}
But it's very easy to figure out what is what. Humans reading it will not confuse the variable name with a class declaration, and the compiler will most likely not confuse them either.
The same goes for variable names like 'for', 'extends', 'goto' or any other Java keyword, if we are talking about the Java programming language.
What is the reason that we have this limitation?

What is the reason that we have this limitation?
There are two reasons in general:
As you identified in your Question: it would be extremely confusing for the human reader. And a programming language that is confusing by design is not going to get significant traction as a practical programming language.
If identifiers can be the same as keywords, it makes it much more difficult to write a formal grammar for the language. (Certainly, a grammar like that with the rules for disambiguation cannot be expressed in BNF / EBNF or similar.) That means that writing a parser for such a language would be a lot more complicated.
Anyhow, while neither of these reasons is a total "show stopper", they would be sufficient to cause most people attempting a new programming language design / implementation to reject the idea.
And that of course is the real reason that you (almost) never see languages where keywords can be used as identifiers. Programming language designers nearly always reject the idea ...
(In the case of Java, there was a conscious effort to make the syntax accessible to people used to the C language. C doesn't support this. That would have been a 3rd reason ... if they were looking for one.)
There is one interesting (semi-) counter example in a mainstream programming language. In early versions of FORTRAN, spaces in identifiers were not significant. Thus
I J = 1
and
IJ = 1
meant the same thing. That is cool (depending on your "taste" ...). But compare these two:
DO 20 I = 10, 1, -2
versus
DO 20 I = 10
One is an assignment, but the other one is a "DO loop" statement. As a reader, would you notice this?

It allows the lexer to classify symbols without having to disambiguate context - this in turn allows the language to be parsed according to grammar rules without needing knowledge about other ("higher") parts of the compilation process, including analysis of types.
As an example of complications (and ambiguity) removing such a distinction adds to parsing, consider the following. Under standard Java rules it declares and assigns a variable - there is no ambiguity of how it will be parsed.
final Foo x = 2; // roughly: <keyword> <identifier> <identifier> = <value>
Now, in a hypothetical language without a strict keyword distinction, imagine the following, where final may be a declared type; there are now two possible readings. The first is when final is not a type and the standard reading exists:
final Foo = 2; // roughly: <keyword> <identifier> ?error? = <value>
But if final was a "final type", then the reading may be:
final Foo = 2; // hypothetical: <identifier> <identifier> = <value>
Which interpretation of the source is correct?
Java makes this question even harder to answer due to separate compilation. Should adding a new "final type" in (or accidentally importing) a namespace now change how the code is parsed? Reporting an unresolved symbol is one thing - changing how the grammar is parsed based on such resolution is another.
These sort of issues are simply bypassed with the clear distinction of reserved words.
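A rough sketch of why reserved words make the lexer's job easy: with a fixed word list, each word can be classified by a simple lookup, with no knowledge of declared types or surrounding context. The class name and the (deliberately incomplete) word lists below are assumptions for illustration, not how javac is implemented:

```java
import java.util.Set;

// Toy classifier, not a real lexer: it only decides what kind of word it is looking at.
class TinyWordClassifier {
    enum Kind { KEYWORD, LITERAL, IDENTIFIER }

    // Deliberately incomplete subset of the JLS 3.9 keyword list.
    private static final Set<String> KEYWORDS =
            Set.of("class", "final", "for", "if", "return", "while");
    private static final Set<String> LITERAL_WORDS = Set.of("true", "false", "null");

    // Context-free: the answer depends only on the spelling of the word.
    static Kind classify(String word) {
        if (KEYWORDS.contains(word)) return Kind.KEYWORD;
        if (LITERAL_WORDS.contains(word)) return Kind.LITERAL;
        return Kind.IDENTIFIER;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"final", "Foo", "x", "true"}) {
            System.out.println(word + " -> " + classify(word));
        }
    }
}
```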
Arguably, there could be special productions to change the recognition of keywords dynamically (some languages allow controllable operator precedence), but this is not done in mainstream languages and is most certainly not supported in Java. At the very least it requires additional syntax and adds complexity to the system for not-enough benefit.
The most "clean" approach I've seen to such a problem is in C#, which allows one to prefix reserved words with @ to remove their special meaning, such as class @class { float @int = 2; } - although such should be done rarely, and ick!
Now, some words in Java that are reserved could be "reserved only in context", such as extends. Such is seen in SQL all the time; there are reserved words (eg. OVER) and then words that only have special meaning in a given statement construct (eg. ROW_NUMBER). But it's easier to say reserved is reserved, go pick something else.
Except for a very simple-to-parse language like LISP dialects, which effectively treat every bareword as an identifier, keywords and the distinction from identifiers is very prevalent in language grammars.

You're not quite right there. A key word is a word that has meaning in the syntax of the language, and a reserved word is one that you're not allowed to use as an identifier. In Java mostly they are the same, but 'true' and 'goto' are reserved words and not key words ('true' is a literal and 'goto' is not used).
The main reason to make the key words in a language reserved words is to simplify parsing and avoid ambiguities. For example, what does this mean if return could be a method?
return(1);
In my opinion, Java has taken this too far. There are key words that are only meaningful in a particular context in which there could be no ambiguity. Perhaps there is benefit in avoiding confusion on the part of the reader, but I put it down to customary habit of compiler writers. There are other languages which have far fewer key words and/or reserved words and work just fine.

Can we use type names as variable

I've been wondering about this question for a long time :
Can we use a type name as a variable name ?
For example someone on a REST api called one of its variable "protected", is there a way to get it ? I'm developing an Android app, and the api return Json object. To accelerate the process i use the Gson library.

It's mentioned in the docs already that
You cannot use any of the following as identifiers in your programs.
And wiki says
programmers cannot use keywords as names for variables, methods, classes, or as any other identifier.
If still you want to use them add underscores or some extra letters to that name.

No. Please, read this list of reserved words for Java http://docs.oracle.com/javase/tutorial/java/nutsandbolts/_keywords.html
But, if I remember right, Gson lets you annotate a field with the name of the JSON element it maps to (@SerializedName), so the field itself can have a different, legal name. That may help you.

In Java Language Specification:
3.8. Identifiers
An identifier cannot have the same spelling (Unicode character sequence) as a keyword (§3.9), boolean literal (§3.10.3), or the null literal (§3.10.7), or a compile-time error occurs.
However in Java Virtual Machine Specification:
4.2.2. Unqualified Names
Names of methods, fields, and local variables are stored as unqualified names. An unqualified name must not contain any of the ASCII characters . ; [ / (that is, period or semicolon or left square bracket or forward slash).
So
You can't use a type name as a variable name in the Java Language
You can use a type name as a variable name in class file.
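The JLS 3.8 rule quoted above can be checked programmatically with the standard javax.lang.model.SourceVersion class. The wrapper class and method name below are invented for this sketch; isName and isKeyword are the real API calls, and they answer for the latest source version the running JDK supports:

```java
import javax.lang.model.SourceVersion;

// Sketch: asks the JDK whether a word may be used as a name in Java source.
class ReservedWordCheck {
    // isName returns false for keywords, boolean literals and the null literal.
    static boolean usableAsName(String word) {
        return SourceVersion.isName(word);
    }

    public static void main(String[] args) {
        for (String word : new String[] {"clazz", "class", "protected", "true", "null"}) {
            System.out.println(word + ": usable as a name? " + usableAsName(word));
        }
        // isKeyword covers keywords and also the literals true, false and null.
        System.out.println(SourceVersion.isKeyword("true")); // true
    }
}
```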

Effect of Java 'final' declaration for class member variables

This is more of a theory question than a solution question, so hear me out.
In C/C++ as well as PHP, you can declare constants. There are usually a couple of ways to do this (#define for example, or 'const type'...) and the ultimate effect of this is that during compilation a replace is done so that all of those named constants become literals. This helps because instead of having to access a memory location to find the data, the data is hardcoded in, but without the downsides of hardcoding: having to recall the value whenever it is reused, and having to change every instance of that value when it needs to change.
But Java's final declaration is slightly inscrutable; because I can create a class with unset final vars and initialize them on construction, it means that they are not precompiled as literals as far as I can tell. Other than guaranteeing that they cannot logically change after construction, does the final declaration provide any benefit to efficiency?
References to articles are fine, as long as you make note of the part which explains what final really does and what are if any its benefits other than stopping value changes after construction.
As a corollary, is it possible to actually declare compilation-level constants in Java in any other way than simply using literals (a bad idea anyway?)

Java does have constant expressions. See here in the java language specification.
A compile-time constant expression is an expression denoting a value of primitive type or a String that does not complete abruptly and is composed using only the following:
Literals of primitive type and literals of type String (§3.10.5)
Casts to primitive types and casts to type String
The unary operators +, -, ~, and ! (but not ++ or --)
The multiplicative operators *, /, and %
The additive operators + and -
The shift operators <<, >>, and >>>
The relational operators <, <=, >, and >= (but not instanceof)
The equality operators == and !=
The bitwise and logical operators &, ^, and |
The conditional-and operator && and the conditional-or operator ||
The ternary conditional operator ? :
Parenthesized expressions whose contained expression is a constant expression.
Simple names that refer to constant variables (§4.12.4).
Qualified names of the form TypeName . Identifier that refer to constant variables (§4.12.4).
But in Java, unlike C/C++, you also have a JIT compiler, so additional inlining is possible. So the bottom line is, don't worry about it until you see a real performance problem.

Java does have constants that the Java compiler will replace with their values at compile-time. For example, member variables that are final and that are of type String are effectively constants which are replaced in this way. (This is allowed because class String is immutable). One of the consequences of this is that if you change the string in your source code, but you don't recompile the classes where this string is used, those classes will not see the new value of the string.
The JLS explains this in the following paragraphs:
4.12.4 final variables
13.4.9 final Fields and Constants
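A small sketch of the difference between a constant variable and a merely final one (class and member names are invented). It relies on two JLS guarantees: String-typed constant expressions are interned, and switch case labels must be constant expressions:

```java
// Sketch: which 'final' values count as compile-time constants.
class CompileTimeConstants {
    static final String GREETING = "Hello, " + "world";            // constant expression
    static final String RUNTIME_COPY = new String("Hello, world"); // final, but not a constant
    static final int KIB = 1 << 10;                                 // constant expression

    // Both sides are compile-time constants, so they refer to the same interned String.
    static boolean constantIsInterned() {
        return GREETING == "Hello, world";
    }

    // 'new String(...)' runs at class initialization and yields a distinct object.
    static boolean runtimeValueIsInterned() {
        return RUNTIME_COPY == "Hello, world";
    }

    // KIB is accepted as a case label only because it is a constant variable.
    static String describe(int n) {
        switch (n) {
            case KIB: return "one KiB";
            default:  return "something else";
        }
    }

    public static void main(String[] args) {
        System.out.println(constantIsInterned());     // true
        System.out.println(runtimeValueIsInterned()); // false
        System.out.println(describe(1024));           // one KiB
    }
}
```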

Final fields are aimed to make immutable objects.
Static final fields are your kind of constants.
Compiler optimisation (data flow analysis) happens to some degree. Try javap to see the JVM bytecode, if you are interested that far.
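As an illustration of both uses (the class is hypothetical), final instance fields assigned in the constructor give an immutable object, while a static final field with a constant initializer is the closest thing to a C-style constant:

```java
// Hypothetical immutable value class.
final class ImmutablePoint {
    static final int DIMENSIONS = 2; // static final with a constant initializer: a constant

    private final int x; // final instance fields: set once in the constructor,
    private final int y; // so they are not compile-time constants

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // "Modification" returns a new object; the original is never changed.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }

    public static void main(String[] args) {
        ImmutablePoint p = new ImmutablePoint(1, 2);
        ImmutablePoint q = p.withX(5);
        System.out.println(p.x() + "," + p.y()); // 1,2
        System.out.println(q.x() + "," + q.y()); // 5,2
    }
}
```

Compiling this and running javap -c -p ImmutablePoint is one way to inspect the resulting bytecode.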

does the final declaration provide any benefit to efficiency?
Not really. This is because the JIT can often determine that the value is not changed at runtime and can treat it as final. (Which is a problem if the value is not volatile and is changed in another thread)
In Java 8, you can use local variables in closures if they could be made final, rather than having to declare them as final.
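A minimal sketch of that Java 8 rule, with invented names: the local variable is never reassigned, so it is effectively final and the lambda may capture it without an explicit final modifier.

```java
import java.util.function.Supplier;

// Sketch of "effectively final" capture (Java 8+).
class EffectivelyFinal {
    static Supplier<String> greeter(String name) {
        String prefix = "Hi, "; // never reassigned, so effectively final
        // Reassigning 'prefix' anywhere in this method would make the lambda fail to compile.
        return () -> prefix + name;
    }

    public static void main(String[] args) {
        System.out.println(greeter("Ann").get()); // Hi, Ann
    }
}
```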
