In most programming languages that I know, you cannot declare a variable whose name is also a keyword.
For example in Java:
public class SomeClass
{
Class<?> clazz = Integer.class; // OK.
Class<?> class = Integer.class; // Compilation error.
}
But it's very easy to figure out what is what. Humans reading it will not confuse a variable name with a class declaration, and the compiler most likely will not confuse them either.
The same goes for variable names like 'for', 'extends', 'goto', or any other Java keyword, if we are talking about the Java programming language.
What is the reason that we have this limitation?
There are two reasons in general:
As you identified in your Question: it would be extremely confusing for the human reader. And a programming language that is confusing by design is not going to get significant traction as a practical programming language.
If identifiers can be the same as keywords, it makes it much more difficult to write a formal grammar for the language. (Certainly, a grammar like that with the rules for disambiguation cannot be expressed in BNF / EBNF or similar.) That means that writing a parser for such a language would be a lot more complicated.
Anyhow, while neither of these reasons is a total "show stopper", they would be sufficient to cause most people attempting a new programming language design / implementation to reject the idea.
And that of course is the real reason that you (almost) never see languages where keywords can be used as identifiers. Programming language designers nearly always reject the idea ...
(In the case of Java, there was a conscious effort to make the syntax accessible to people used to the C language. C doesn't support this. That would have been a 3rd reason ... if they were looking for one.)
There is one interesting (semi-) counter example in a mainstream programming language. In early versions of FORTRAN, spaces in identifiers were not significant. Thus
I J = 1
and
IJ = 1
meant the same thing. That is cool (depending on your "taste" ...). But compare these two:
DO 20 I = 10, 1, -2
versus
DO 20 I = 10
One is an assignment, but the other one is a "DO loop" statement. As a reader, would you notice this?
It allows the lexer to classify symbols without having to disambiguate context - this in turn allows the language to be parsed according to grammar rules without needing knowledge about other ("higher") parts of the compilation process, including analysis of types.
As an example of complications (and ambiguity) removing such a distinction adds to parsing, consider the following. Under standard Java rules it declares and assigns a variable - there is no ambiguity of how it will be parsed.
final Foo x = 2; // roughly: <keyword> <identifier> <identifier> = <value>
Now, in a hypothetical language without a strict keyword distinction, imagine the following, where final may be a declared type; there are now two possible readings. The first is when final is not a type and the standard reading exists:
final Foo = 2; // roughly: <keyword> <identifier> ?error? = <value>
But if final was a "final type", then the reading may be:
final Foo = 2; // hypothetical: <identifier> <identifier> = <value>
Which interpretation of the source is correct?
Java makes this question even harder to answer due to separate compilation. Should adding a new "final type" in (or accidentally importing) a namespace now change how the code is parsed? Reporting an unresolved symbol is one thing - changing how the grammar is parsed based on such resolution is another.
These sorts of issues are simply bypassed with the clear distinction of reserved words.
Arguably, there could be special productions to change the recognition of keywords dynamically (some languages allow controllable operator precedence), but this is not done in mainstream languages and is most certainly not supported in Java. At the very least it requires additional syntax and adds complexity to the system for not-enough benefit.
The "cleanest" approach I've seen to such a problem is in C#, which allows one to prefix reserved words with @ to remove their special meaning, as in class @class { float @int = 2; } - although this should be done rarely, and ick!
Now, some words in Java that are reserved could be "reserved only in context", such as extends. Such is seen in SQL all the time; there are reserved words (eg. OVER) and then words that only have special meaning in a given statement construct (eg. ROW_NUMBER). But it's easier to say reserved is reserved, go pick something else.
Except in very simple-to-parse languages like the LISP dialects, which effectively treat every bareword as an identifier, the distinction between keywords and identifiers is very prevalent in language grammars.
You're not quite right there. A key word is a word that has meaning in the syntax of the language, and a reserved word is one that you're not allowed to use as an identifier. In Java mostly they are the same, but 'true' and 'goto' are reserved words and not key words ('true' is a literal and 'goto' is not used).
The main reason to make the key words in a language reserved words is to simplify parsing and avoid ambiguities. For example, what does this mean if return could be a method?
return(1);
In my opinion, Java has taken this too far. There are key words that are only meaningful in a particular context in which there could be no ambiguity. Perhaps there is benefit in avoiding confusion on the part of the reader, but I put it down to customary habit of compiler writers. There are other languages which have far fewer key words and/or reserved words and work just fine.
Related
If Java allowed "instanceof" as a name for variables (and fields, type names, package names), it appears, at a first glance, that the language would still remain unambiguous.
In most or all of the productions in Java where an Identifier can appear, there are contextual cues that would prevent confusion with a binary operator.
Regarding the basic production:
RelationalExpression:
...
RelationalExpression instanceof ReferenceType
There are no expressions of the form RelationalExpression Identifier ReferenceType, since appending a single Identifier to any Expression is never valid, and no ReferenceType can be extended by adding an Identifier on the front.
The only other reason I can think of why instanceof must be a keyword would be if there were some other production containing an Identifier which can be broken up into an instanceof expression. That is, there may be productions which are ambiguous if we allow instanceof as an Identifier. However, I can't seem to find any, since an Identifier is almost always separated from its surrounding tokens by a dot (or is identifiable as a MethodName by a following lparen).
Is instanceof a keyword simply out of tradition, rather than necessity? Could new relational operators be introduced in future Java versions, with tokens that collide with identifiers? (For example, could a hypothetical "relatedto" operator be introduced without making it a keyword, which would break existing code?)
That question is different, it's asking why "instanceof" isn't a method, I'm asking whether there are reasons syntactically why
You have a point in that it could have been a method on Object or we have
if (myClass.class.isInstance(obj))
This is more cumbersome, however I would say that chains of instanceof are not considered best practice and making it a little harder might not have been a bad idea.
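To make the comparison concrete, here is a minimal sketch of the two forms side by side. The class and method names (InstanceCheckDemo, viaOperator, viaMethod) are made up for illustration; Class.isInstance is the standard library method mentioned above.

```java
// Hypothetical demo class; its name and method names are not from the original post.
class InstanceCheckDemo {
    // Operator form: the type has to be written out in the source code.
    static boolean viaOperator(Object obj) {
        return obj instanceof Number;
    }

    // Method form: the type is an ordinary Class object and can be picked at run time.
    static boolean viaMethod(Class<?> type, Object obj) {
        return type.isInstance(obj);
    }

    public static void main(String[] args) {
        Object value = Integer.valueOf(42);
        System.out.println(viaOperator(value));
        System.out.println(viaMethod(Number.class, value));
        System.out.println(viaMethod(String.class, value));
    }
}
```

Both forms treat null the same way: neither considers null an instance of any type.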
It is worth noting that earlier versions of Java didn't use intrinsics as much as current ones do, so a method would have been far less efficient than a native keyword, though I don't believe that would have to be true today.
Is instanceof a keyword simply out of tradition, rather than necessity?
IMHO keywords were/are considered good practice because they make words with special meaning stand out as having that special purpose and no other.
Could new relational operators be introduced in future Java versions, with tokens that collide with identifiers?
Yes. One of the proposals for adding val and var is that they be special types rather than keywords, to avoid conflicting with code that has used them as variable names.
Given a choice, a new language would make these keywords; it is only for backward compatibility that they might be otherwise. Alternatively, using final rather than val and transient rather than var has been considered.
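For what it's worth, Java 10 later added var along these lines: it is a reserved type name rather than a keyword, so older code that uses var as a variable name still compiles. A minimal sketch, assuming a Java 10 or later compiler (the class name VarNameDemo is made up for illustration):

```java
// Hypothetical demo class; requires Java 10 or later.
class VarNameDemo {
    static int sum() {
        int var = 1;    // 'var' as an ordinary variable name, as in pre-Java-10 code
        var other = 2;  // 'var' as the inferred-type marker; 'other' is an int
        return var + other;
    }

    public static void main(String[] args) {
        System.out.println(sum());
    }
}
```

The price of this compatibility is that the meaning of var depends on where it appears, which is exactly the kind of context sensitivity that plain reserved words avoid.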
Personally, I think they should add them how other languages do it for consistency otherwise you are going to have every new Java developer asking basic questions like How do I compare strings in Java? What they did made sense but it confuses just about every new developer.
By comparison, they banned making _ a lambda variable to avoid confusion with other languages where this has a special meaning and they have a warning about using _ as a variable that it might be removed in future versions.
Reading through the javase api docs, I noticed that pretty much all of the methods in the collections framework use angle brackets. For example:
Collection<String> c = new HashSet<String>();
or
Map<String, Integer> m = new HashMap<String, Integer>();
To the eye they seem to serve the same function as a set of parentheses. I still don't know enough of the Java language to be able to see an overarching connection where angle brackets are used and why that might be the case.
My question is specifically: Is there a significance to the way angle brackets are interpreted by the JVM as opposed to parens? Or is it just a common practice across multiple languages?
The angle brackets came with the introduction of generics in Java 1.5.
Since this is a later addition to an existing language, I guess the angle brackets were chosen to make a clear distinction from the existing parentheses (method and constructor calls), square brackets (array member access) and curly brackets (block delimiters). I'd say angle brackets are the logical choice here.
I guess they are used in Java because they are used in C++, just like everything from int to void.
Found some interesting references, though partial:
From C++ templates: the complete guide By David Vandevoorde, Nicolai M. Josuttis, page 139:
Relatively early during the development of templates, Tom Pennello—a
widely recognized parsing expert working for Metaware—noted some of
the problems associated with angle brackets. Stroustrup also comments
on that topic in [DnE] and argues that humans prefer to read angle
brackets rather than parentheses. However, other possibilities exist,
and Pennello specifically proposed braces (for example, List{::X}) at
a C++ standards meeting in 1991 (held in Dallas). At that time the
extent of the problem was more limited because templates nested inside
other templates—so-called nested templates —were not valid and thus
the discussion of Section 9.3.3 on page 132 was largely irrelevant. As
a result, the committee declined the proposal to replace the angle
brackets.
So I may have been mistaken that the angled brackets were used to help the parser, perhaps they were used to help the programmer, because Bjarne Stroustrup thought they were better.
Parentheses are already reserved for method calls and expression grouping. Angle brackets are used for generic type parameters.
If parentheses were used for both, things could become ambiguous, if not for the compiler, then at least for the reader.
Is there a significance to the way angle brackets are interpreted by
the JVM as opposed to parens?
Neither of them is interpreted by the JVM. Both parentheses and angle brackets are parsed at compile time, and the JVM never sees them, since the JVM is only active at run time.
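One way to see that the type arguments are gone by run time is to compare the run-time classes of two differently parameterized lists. This is only a sketch; the class name ErasureDemo is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not from the original answer.
class ErasureDemo {
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        // Both objects are plain java.util.ArrayList instances at run time;
        // the <String> and <Integer> parts exist only in the source code.
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
    }
}
```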
As side notes:
The <> are used for generics, and their usage is also common in other languages such as C++.
You are referring to new HashSet<String>(); as a method - it is not, it is invoking a constructor. A constructor is not a method.
In Java, angle brackets indicate the use of a generic type, which is a type that has different semantics depending on the type passed in.
The simplest use case for generics is in specialized collections, to indicate that they should hold only objects of a particular type.
In Java, generics add compile-time type checking rather than run-time enforcement. The compiler checks the type arguments, erases them, and inserts casts where objects are retrieved from a collection; those casts can still fail with a ClassCastException at run time if raw-type or unchecked code has put an object of the wrong type into the collection.
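A minimal sketch of the ClassCastException case: a raw-type reference lets the wrong kind of object into the list with only a compiler warning, and the failure surfaces later, at the cast that guards the read. The class name CastFailureDemo is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not from the original answer.
class CastFailureDemo {
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean readFails() {
        List<String> strings = new ArrayList<String>();
        List raw = strings;            // raw type: the compiler only warns here
        raw.add(Integer.valueOf(1));   // an Integer ends up in a List<String>
        try {
            String s = strings.get(0); // the compiler-inserted cast to String fails
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readFails());
    }
}
```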
Angle brackets are used to denote type parameter lists for polymorphic ("generic") classes and methods. These are a very different beast from value parameter lists (the stuff in parentheses). If they were the same, then imagine you have the expression new Foo(bar)... How would the parser interpret this? Is bar the name of a type or the name of a variable?
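A small sketch of the two kinds of parameter list appearing in one expression. Box and TypeVsValueDemo are hypothetical names used only for this illustration.

```java
// Hypothetical generic class: T is a type parameter, 'value' is a value parameter.
class Box<T> {
    private final T value;

    Box(T value) {
        this.value = value;
    }

    T get() {
        return value;
    }
}

class TypeVsValueDemo {
    static String unwrap() {
        String bar = "bar";
        // <String> names a type; (bar) passes a value. The distinct brackets
        // tell both the parser and the reader which is which.
        Box<String> box = new Box<String>(bar);
        return box.get();
    }

    public static void main(String[] args) {
        System.out.println(unwrap());
    }
}
```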
Imagine that C++ used () instead of <> for templates. Now consider this line of code:
foo(bar)*bang;
Is that:
Declaring a local variable bang whose type is a pointer to the template type foo with type argument bar?
Calling the function foo, passing in bar, then multiplying the result by bang?
It's grammatically ambiguous. We could tweak the grammar such that it would always prefer one over the other, but that makes the (already painfully complex) grammar even hairier. Worse, whichever way you pick, users will probably guess wrong sometimes.
So, for C++, it makes sense to use a different grouping character for templates.
Java then just followed in C++'s footsteps.
Most of the trouble here stems from C's decision to not have explicit syntax for variable declaration and instead just use a type annotation to implicitly mean "make a new variable of that type". Languages like Scala which have explicit keywords (var and val) for variables have more freedom with type declaration syntax, which is why they can use [] for generics.
Why doesn't Java need operator overloading? Is there any way it can be supported in Java?
Java only allows arithmetic operations on elementary numeric types. It's a mixed blessing, because although it's convenient to define operators on other types (like complex numbers, vectors etc), there are always implementation-dependent idiosyncrasies. So operators don't always do what you expect them to do. By avoiding operator overloading, it's more transparent which function is called when. A wise design move in some people's eyes.
Java doesn't "need" operator overloading, because no language needs it.
a + b is just "syntactic sugar" for a.Add(b) (actually, some would argue that a.Add(b) is just syntactic sugar for Add(a,b))
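In Java this is what arithmetic on library number types already looks like. A minimal sketch using java.math.BigInteger, whose add method is real; the wrapper class AddMethodDemo is made up for illustration.

```java
import java.math.BigInteger;

// Hypothetical demo class, not from the original answer.
class AddMethodDemo {
    static BigInteger sum(BigInteger a, BigInteger b) {
        // With operator overloading this could be written 'a + b';
        // in Java it has to be spelled as a method call.
        return a.add(b);
    }

    public static void main(String[] args) {
        System.out.println(sum(BigInteger.valueOf(2), BigInteger.valueOf(3)));
    }
}
```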
This related question might help. In short, operator overloading was intentionally avoided when Java was designed because of issues with overloading in C++.
Scala, a newer JVM language, has a syntax that allows method overloading that functions very much like operator overloading, without the limitations of C++ operator overloading. In Scala, it's possible to define a method named +, for example. It's also possible to omit the . operator and parentheses in method calls:
case class A(value: Int) {
def +(other: A) = new A(value + other.value)
}
scala> new A(1) + new A(3)
res0: A = A(4)
No language needs operator overloading. Some believe that Java would benefit from adding it, but its omission has been publicized as a benefit for so long that adding it is almost certainly politically unacceptable (and it's only since the Oracle buyout that I'd even include the "almost").
The counterpoint generally consists of postulating some meaningless (or even counterintuitive) overload, such as adding together two employees or overloading '+' to do division. While operator overloading in such languages as C++ would allow this, lack of operator overloading in Java does little to prevent or even mitigate the problem. someEmployee.Add(anotherEmployee) is no improvement over someEmployee + anotherEmployee. The same is true if myLargeInteger.Add(anotherLargeInteger) actually does division instead of addition. At least to me, this line of argument appears thoroughly unconvincing at best.
There is, however, another respect in which omitting operator overloading does (almost certainly) have a real benefit. Its omission keeps the language easier to process, which makes it much easier (and quicker) to develop tools that process the language. Just for an obvious example, refactoring tools for Java are much more numerous and comprehensive than for C++. I doubt that this can or should be credited specifically and solely to support for operator overloading in C++ and its omission in Java. Nonetheless, the general attitude of keeping Java simple (including omission of operator overloading) is undoubtedly a major contributing factor.
The possibility of simplifying parsing by requiring spaces between identifiers and operators (e.g., a+b prohibited, but a + b allowed) has been raised. At least in my opinion, this is unlikely to make any real difference in most cases. The reason is fairly simple: at least in a typical compiler, the parser is preceded by a lexer. The lexer extracts tokens from the input stream and feeds them to the parser. With such a structure, the parser wouldn't see any difference at all between a+b and a + b. Either way, it would receive exactly three tokens: identifier, +, and identifier.
Requiring the spaces might simplify the lexer a tiny bit--but to the extent it did, it would be completely independent of operator overloading, at least assuming the operator overloading was done like it is in C++, where only existing tokens are used [1].
So, if that's not the problem, what is? The problem with operator overloading is that you can't hard-code a parser to know the meaning of an operator. With Java, for some given a = b + c, there are exactly two possibilities: a, b and c are each chosen from a small, limited set of types, and the meaning of that + is baked into the language, or else you have an error. So, a tool that needs to look at b + c and make sense of it can do a very minimal parse to assure that b and c are of types that can be added. If they are, it knows what the addition means, what kind of result it produces, and so on. If they aren't, it can underline it in red squiggles (or whatever) to indicate an error.
For C++, things are quite different. For an expression like a = b + c;, b and c could be of almost entirely arbitrary types. The + could be implemented as a member function of b's type, or it could be a free function. In some cases, we might have a number of operator overloads (some of which could be templates) that could carry out that operation, so we need to do overload resolution to determine which one the compiler would actually select based on the types of the parameters (and if some of them are templates, the overload resolution rules get even more complex).
That lets us determine the type of the result from b + c. From there we basically repeat the whole process again to figure out what (if any) overload is used to assign that result to a. It might be built-in, or it might be another operator overload, and there might be multiple possible overloads that could do the job, so we have to do overload resolution again to figure out the right operator to use here.
In short, just figuring out what a = b + c; means in C++ requires nearly an entire compiler front-end. We can do the same in Java with a much smaller subset of a compiler [2].
[1] I suppose things could be somewhat different if you allowed operator overloading like, for example, ML does, where a more or less arbitrary token can be designated as an operator, and that operator can be given a more or less arbitrary associativity and/or precedence. I believe ML handles this entirely in parsing, not lexing, but if you took this basic concept enough further, I can believe it might start to affect lexing, not just parsing.
[2] Not to mention that most Java tools will use the JDK, which ships a complete Java compiler that tools can call as a library, so tools can normally do most such analysis without dealing directly with parsing and such at all.
java-oo compiler plugin can add Operator Overloading support in Java.
It's not that Java doesn't "need" operator overloading; it's just a choice made by its creators, who wanted to keep the language simpler.
Java does not support operator overloading by programmers. This is not the same as stating that Java does not need operator overloading.
Operator overloading is syntactic sugar to express an operation using (arithmetic) symbols. For obvious reasons, the designers of the Java programming language chose to omit support for operator overloading in the language. This declaration can be found in the Java Language Environment whitepaper:
There are no means provided by which
programmers can overload the standard
arithmetic operators. Once again, the
effects of operator overloading can be
just as easily achieved by declaring a
class, appropriate instance variables,
and appropriate methods to manipulate
those variables. Eliminating operator
overloading leads to great
simplification of code.
In my personal opinion, that is a wise decision. Consider the following piece of code:
String b = "b";
String c = "c";
String a = b + c;
Now, it is fairly evident that b and c are concatenated to yield a. But when one considers the following snippet, written in a hypothetical language that supports operator overloading, it is fairly evident that using operator overloading does not make for readable code.
Person b = new Person("B");
Person c = new Person("C");
Person a = b + c;
In order to understand the result of the above operation, one must view the implementation of the overloaded addition operator for the Person class. Surely, that makes for a tedious debugging session, and the code is better implemented as:
Person b = new Person("B");
Person c = new Person("C");
Person a = b.copyAttributesFrom(c);
OK, well... this is a much-discussed and common issue. Today, in the software industry, there are mainly two different types of languages:
Low level languages
High level languages
This distinction was useful about 10 years ago; the situation at present is a bit different.
Today we talk about business-ready applications.
Business models are particular models in which programs need to meet many requirements. They are so complex and so strict that coding an application in a language like C or C++ would be very time-consuming. For this reason, hybrid languages were invented.
We commonly know two types of languages:
Compiled
Interpreted
Well, today there is another one:
Compiled/Interpreted: in one word: MANAGED.
Managed languages are languages that are compiled to produce another code, different from the original one, but much more complex to handle. This INTERMEDIATE LANGUAGE is then INTERPRETED by a program that runs the final program.
It is the common dynamic we came to know from Java... It is a winning approach for business-ready applications.
Well, now going to your question...
Operator overloading is a matter that also concerns multiple inheritance and other advanced characteristics of low level languages.
Java, as well as C#, Python and so on, is a managed language, made to be easy to write and useful for building complex applications in very little time.
If we included operator overloading in Java, the language would become more complex and difficult to handle.
If you program in C++, you surely understand that operator overloading is a very, very delicate matter, because it can lead to very complex situations, and sometimes the compiler might refuse to compile because of conflicts and so on... Introducing operator overloading is to be done carefully. IT IS POWERFUL, but we pay for this power with an incredibly big load of problems to handle.
OK, OK, IT IS TRUE, you might tell me: "HEY, but C# uses operator overloading... What the hell are you telling me? Why does C# support it and Java not?".
Well, this is the answer. C#, yes, implements operator overloading, but not like C++. There are many operators that cannot be overloaded in C#, like "new" and many others that you can overload in C++... So C# supports operator overloading, but to a much lesser extent than C++ or other languages that fully support it. But this is not a good answer to the earlier question...
The real answer is that C# is more complex than Java. This is a pro but also a con. It is a matter of deciding where to place the language: high level, higher level, very high level?
Well, Java does not support op overloading because it wants to be fast and easy to manage and use. When introducing op overloading, a language must also carry a large number of problems caused by this new functionality.
It is exactly like questioning: "Why does Java not support multiple inheritance?"
Because it is tremendously complex to manage. Think about it... IT WOULD BE IMPOSSIBLE for a managed language to support multiple inheritance... No common class tree, no object class as a common base class for all classes, no possibility of upcasting (safely) and many problems to handle, manage, foresee, take into account...
Java wants to be simple.
Even if I believe that future versions of this language will end up supporting op overloading, you will see that the overloading will cover a smaller subset of all the possibilities you have for overloading in C++.
Many others, here, also told you that overloading is useless.
Well I belong to those ones who think this is not true.
Well, if you think this way (op overloading is useless), then many other features of managed languages are useless too. Think about interfaces, classes and so on: you do not really need them. You can use abstract classes for interface implementations... Let's look at C#... so much syntactic sugar, LINQ and so on; they are not really necessary, BUT THEY SPEED UP YOUR WORK...
Well, in managed languages everything that speeds up the development process is welcome and does not imply uselessness. If you think that such features are not useful, then the entire language itself would be useless and we would all go back to programming complex applications in C++, Ada, etc. The added value of managed languages is to be measured precisely by these elements.
Op overloading is a very useful feature. It could be implemented in languages like Java, and this would change the language's structure and purposes; it would be a good thing but a bad thing too, just a matter of taste.
But today, Java is simpler than C# for this reason too, because Java does not support op overloading.
I know, maybe I was a little long, but hope it helps. Bye
Check Java Features Removed from C and C++ p 2.2.7 No More Operator Overloading.
There are no means provided by which
programmers can overload the standard
arithmetic operators. Once again, the
effects of operator overloading can be
just as easily achieved by declaring a
class, appropriate instance variables,
and appropriate methods to manipulate
those variables. Eliminating operator
overloading leads to great
simplification of code.
Java doesn't support operator overloading (one reference is the Wikipedia Operator Overloading page). This was a design decision by Java's creators to avoid perceived problems seen with operator overloading in other languages (especially C++).
Some people say that every programming language has its "complexity budget" which it can use to accomplish its purpose. But if the complexity budget is depleted, every minor change becomes increasingly complicated and hard to implement in a backward-compatible way.
After reading the current provisional syntax for Lambda (≙ Lambda expressions, exception transparency, defender methods and method references) from August 2010 I wonder if people at Oracle completely ignored Java's complexity budget when considering such changes.
These are the questions I'm thinking about - some of them more about language design in general:
Are the proposed additions comparable in complexity to approaches other languages chose?
Is it generally possible to add such additions to a language while protecting the developer from the complexity of the implementation?
Are these additions a sign of reaching the end of the evolution of Java-as-a-language or is this expected when changing a language with a huge history?
Have other languages taken a totally different approach at this point of language evolution?
Thanks!
I have not followed the process and evolution of the Java 7 lambda
proposal, I am not even sure of what the latest proposal wording is.
Consider this as a rant/opinion rather than statements of truth. Also,
I have not used Java for ages, so the syntax might be rusty and
incorrect at places.
First, what are lambdas to the Java language? Syntactic sugar. While
in general lambdas enable code to create small function objects in
place, that support was already present --to some extent-- in the Java
language through the use of inner classes.
So how much better is the syntax of lambdas? Where does it outperform
previous language constructs? Where could it be better?
For starters, I dislike the fact that there are two available syntaxes
for lambda functions (but this goes in the line of C#, so I guess my
opinion is not widespread). I guess if we want to sugar coat, then
#(int x)(x*x) is sweeter than #(int x){ return x*x; } even if the
double syntax does not add anything else. I would have preferred the
second syntax, more generic at the extra cost of writing return and
; in the short versions.
To be really useful, lambdas can take variables from the scope in
where they are defined and form a closure. Being consistent with
Inner classes, lambdas are restricted to capturing 'effectively
final' variables. Consistency with the previous features of the
language is a nice feature, but for sweetness, it would be nice to be
able to capture variables that can be reassigned. For that purpose,
they are considering that variables present in the context and
annotated with @Shared will be captured by-reference, allowing
assignments. To me this seems weird as how a lambda can use a variable
is determined at the place of declaration of the variable rather than
where the lambda is defined. A single variable could be used in more
than one lambda and this forces the same behavior in all of them.
Lambdas try to simulate actual function objects, but the proposal does
not get completely there: to keep the parser simple, the rule that
an identifier denotes either an object or a method has been kept,
and so calling a lambda requires using a ! after the lambda
name: #(int x)(x*x)!(5) will return 25. This brings a new syntax
to use for lambdas that differs from the rest of the language, where
! stands somehow as a synonym for .execute on a virtual generic
interface Lambda<Result,Args...> but, why not make it complete?
A new generic (virtual) interface Lambda could be created. It would
have to be virtual as the interface is not a real interface, but a
family of such: Lambda<Return>, Lambda<Return,Arg1>,
Lambda<Return,Arg1,Arg2>... They could define a single execution
method, which I would like to be like C++ operator(), but if that is
a burden then any other name would be fine, embracing the ! as a
shortcut for the method execution:
interface Lambda<R> {
R exec();
}
interface Lambda<R,A> {
R exec( A a );
}
Then the compiler need only translate identifier!(args) to
identifier.exec( args ), which is simple. The translation of the
lambda syntax would require the compiler to identify the proper
interface being implemented and could be matched as:
#( int x )(x *x)
// translated to
new Lambda<int,int>{ int exec( int x ) { return x*x; } }
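The translation above is written in proposal-style pseudo-Java. A rough approximation in Java as it actually compiles has to work around two things: a generic type cannot be overloaded by number of type parameters, and primitives cannot be type arguments. The names `Lambda1` and `TranslationSketch` are hypothetical.

```java
// Hypothetical interface; Java cannot overload a generic type by arity,
// so the one-argument variant gets its own name here.
interface Lambda1<R, A> {
    R exec(A a);
}

class TranslationSketch {
    // What `#(int x)(x*x)` could expand to. Integer stands in for int
    // because primitives cannot be used as type arguments.
    static Lambda1<Integer, Integer> square() {
        return new Lambda1<Integer, Integer>() {
            @Override
            public Integer exec(Integer x) {
                return x * x;
            }
        };
    }
}
```

With this in place, `identifier!(5)` would simply become `identifier.exec(5)`.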
This would also allow users to define Inner classes that can be used
as lambdas in more complex situations. For example, if a lambda
function needed to capture a variable annotated as #Shared in a
read-only manner, or maintain the state of the captured object at the
place of capture, manual implementation of the Lambda would be
available:
new Lambda<int,int>{ int value = context_value; // copy taken at the place of capture
int exec( int x ) { return x * value; }
};
This is similar to how Inner classes are defined today, and would
thus feel natural to current Java users. It could be used,
for example, in a loop to generate multiplier lambdas:
Lambda<int,int> array[10] = new Lambda<int,int>[10]();
for (int i = 0; i < 10; ++i ) {
array[i] = new Lambda<int,int>{ final int multiplier = i;
int exec( int x ) { return x * multiplier; }
};
}
// note this is disallowed in the current proposal, as `i` is
// not effectively final and as such cannot be 'captured'. Also
// if `i` was marked #Shared, then all the lambdas would share
// the same `i` as the loop and thus would produce the same
// result: multiply by 10 --probably quite unexpectedly.
//
// I am aware that this can be rewritten as:
// for (int ii = 0; ii < 10; ++ii ) { final int i = ii; ...
//
// but that is not simplifying the system, just pushing the
// complexity outside of the lambda.
This would allow usage of lambdas and methods that accept lambdas both
with the new simple syntax: #(int x){ return x*x; } or with the more
complex manual approach for specific cases where the sugar coating
interferes with the intended semantics.
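In Java as later released (Java 8), the multiplier loop ends up in the 'copy to a local' form mentioned in the comments above. This sketch uses the standard `java.util.function.IntUnaryOperator` in place of the hypothetical `Lambda<int,int>`; the class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

class MultiplierSketch {
    static List<IntUnaryOperator> multipliers(int count) {
        List<IntUnaryOperator> result = new ArrayList<>();
        for (int i = 0; i < count; ++i) {
            // `i` is reassigned by the loop, so it cannot be captured directly;
            // copy it into a fresh local that is never reassigned.
            int multiplier = i;
            result.add(x -> x * multiplier);
        }
        return result;
    }
}
```

Each lambda keeps its own copy, so `multipliers(10).get(3)` multiplies by 3 regardless of where the loop counter ended up.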
Overall, I believe that the lambda proposal can be improved in
different directions, that the way it adds syntactic sugar is a
leaky abstraction (you have to deal externally with issues that are
particular to the lambda) and that by not providing a lower level
interface it makes user code less readable in use cases that do not
perfectly fit the simple use case.
Modulo some scope-disambiguation constructs, almost all of these methods follow from the actual definition of a lambda abstraction:
λx.E
To answer your questions in order:
I don't think there are any particular things that make the proposals by the Java community better or worse than anything else. As I said, it follows from the mathematical definition, and therefore all faithful implementations are going to have almost exactly the same form.
Anonymous first-class functions bolted onto imperative languages tend to end up as a feature that some programmers love and use frequently, and that others ignore completely - therefore it is probably a sensible choice to give it some syntax that will not confuse the kinds of people who choose to ignore the presence of this particular language feature. I think hiding the complexity and particulars of implementation is what they have attempted to do by using syntax that blends well with Java, but which has no real connotation for Java programmers.
It's probably desirable for them to use some bits of syntax that are not going to complicate existing definitions, and so they are slightly constrained in the symbols they can choose to use as operators and such. Certainly Java's insistence on remaining backwards-compatible limits the language evolution slightly, but I don't think this is necessarily a bad thing. The PHP approach is at the other end of the spectrum (i.e. "let's break everything every time there is a new point release!"). I don't think that Java's evolution is inherently limited except by some of the fundamental tenets of its design - e.g. adherence to OOP principles, VM-based.
I think it's very difficult to make strong statements about language evolution from Java's perspective. It is in a reasonably unique position. For one, it's very, very popular, but it's relatively old. Microsoft had the benefit of at least 10 years worth of Java legacy before they decided to even start designing a language called "C#". The C programming language basically stopped evolving at all. C++ has had few significant changes that found any mainstream acceptance. Java has continued to evolve through a slow but consistent process - if anything I think it is better-equipped to keep on evolving than any other languages with similarly huge installed code bases.
It's not much more complicated than lambda expressions in other languages.
Consider...
int square(int x) {
return x*x;
}
Java:
#(x){x*x}
Python:
lambda x:x*x
C#:
x => x*x
I think the C# approach is slightly more intuitive. Personally I would prefer...
x#x*x
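For reference, the syntax that eventually shipped in Java 8 dropped the `#` form and ended up close to the C# one, with `->` instead of `=>`. A minimal sketch (the class name is arbitrary):

```java
import java.util.function.IntUnaryOperator;

class SquareSketch {
    // Java 8 lambda: parameter, arrow, expression body.
    static final IntUnaryOperator SQUARE = x -> x * x;

    public static void main(String[] args) {
        System.out.println(SQUARE.applyAsInt(5)); // prints 25
    }
}
```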
Maybe this is not really an answer to your question, but this may be comparable to the way Objective-C (which of course has a very narrow user base in contrast to Java) was extended by blocks (examples). While the syntax does not fit the rest of the language (IMHO), it is a useful addition, and the added complexity in terms of language features is rewarded, for example, with lower complexity of concurrent programming (simple things like concurrent iteration over an array, or complicated techniques like Grand Central Dispatch).
In addition, many common tasks are simpler when using blocks, for example making one object a delegate (or - in Java lingo - "listener") for multiple instances of the same class. In Java, anonymous classes can already be used for that cause, so programmers know the concept and can just spare a few lines of source code using lambda expressions.
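As a rough illustration of the 'spare a few lines' point, here is the same listener written once as an anonymous class and once as a lambda, using the Java 8 syntax as released. `ClickListener` and the other names are hypothetical, not taken from any real toolkit.

```java
import java.util.ArrayList;
import java.util.List;

class ListenerSketch {
    // Hypothetical single-method listener interface.
    interface ClickListener {
        void onClick(String source);
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();

        // Anonymous inner class: the pre-lambda way.
        ClickListener verbose = new ClickListener() {
            @Override
            public void onClick(String source) {
                log.add("clicked " + source);
            }
        };

        // Same behavior as a lambda; one listener can serve several sources.
        ClickListener concise = source -> log.add("clicked " + source);

        verbose.onClick("ok");
        concise.onClick("cancel");
        concise.onClick("help");
        return log;
    }
}
```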
In Objective-C (or the Cocoa/Cocoa Touch frameworks), new functionality is now often only accessible using blocks, and it seems like programmers are adopting it quickly (given that they have to give up backwards compatibility with old OS versions).
This is really close to the lambda functions proposed for the next generation of C++ (C++0x),
so I think the Oracle guys looked at the other implementations before cooking up their own.
http://en.wikipedia.org/wiki/C%2B%2B0x
[](int x, int y) { return x + y; }
I'm looking at some Java code that is maintained by other parts of the company, incidentally some former C and C++ devs. One thing that is ubiquitous is the use of static integer constants, such as
class Engine {
private static int ENGINE_IDLE = 0;
private static int ENGINE_COLLECTING = 1;
...
}
Besides the missing 'final' qualifier, I'm a bit bothered by this kind of code. What I would have liked to see, being trained primarily in Java from school, would be something more like
class Engine {
private enum State { Idle, Collecting };
...
}
However, the arguments fail me. Why, if at all, is the latter better than the former?
Why, if at all, is the latter better
than the former?
It is much better because it gives you type safety and is self-documenting. With integer constants, you have to look at the API doc to find out what values are valid, and nothing prevents you from using invalid values (or, perhaps worse, integer constants that are completely unrelated). With Enums, the method signature tells you directly what values are valid (IDE autocompletion will work) and it's impossible to use an invalid value.
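A small sketch of that difference, assuming a hypothetical `updateState` method on each variant of the engine class (the class names are invented for the comparison):

```java
// Int-constant variant: any int is accepted, valid or not.
class IntEngine {
    static final int ENGINE_IDLE = 0;
    static final int ENGINE_COLLECTING = 1;

    int state = ENGINE_IDLE;

    void updateState(int newState) {
        state = newState; // -3274 compiles and runs just as well as ENGINE_IDLE
    }
}

// Enum variant: the signature itself lists the legal values.
class EnumEngine {
    enum State { Idle, Collecting }

    State state = State.Idle;

    void updateState(State newState) {
        state = newState; // updateState(1) would be a compile-time error
    }
}
```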
The "integer constant enums" pattern is unfortunately very common, even in the Java Standard API (and widely copied from there) because Java did not have Enums prior to Java 5.
An excerpt from the official docs, http://java.sun.com/j2se/1.5.0/docs/guide/language/enums.html:
This pattern has many problems, such as:
Not typesafe - Since a season is just an int you can pass in any other int value where a season is required, or add two seasons together (which makes no sense).
No namespace - You must prefix constants of an int enum with a string (in this case SEASON_) to avoid collisions with other int enum types.
Brittleness - Because int enums are compile-time constants, they are compiled into clients that use them. If a new constant is added between two existing constants or the order is changed, clients must be recompiled. If they are not, they will still run, but their behavior will be undefined.
Printed values are uninformative - Because they are just ints, if you print one out all you get is a number, which tells you nothing about what it represents, or even what type it is.
And this just about covers it. A one word argument would be that enums are just more readable and informative.
One more thing is that enums, like classes, can have fields and methods. This gives you the option to encompass some additional information about each type of state in the enum itself.
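A minimal sketch of an enum carrying per-state information. The `description` and `busy` fields are invented for illustration and are not part of the original Engine class.

```java
class EngineStateSketch {
    enum State {
        Idle("engine is waiting for work", false),
        Collecting("engine is gathering data", true);

        private final String description;
        private final boolean busy;

        State(String description, boolean busy) {
            this.description = description;
            this.busy = busy;
        }

        String description() { return description; }

        boolean isBusy() { return busy; }
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            // An enum prints as its name rather than as a bare number.
            System.out.println(s + ": " + s.description() + ", busy=" + s.isBusy());
        }
    }
}
```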
Because enums provide type safety. In the first case, you can pass any integer; if you use an enum, you are restricted to Idle and Collecting.
FYI : http://www.javapractices.com/topic/TopicAction.do?Id=1.
By using an int to refer to a constant, you're not forcing someone to actually use that constant. So, for example, you might have a method which takes an engine state, which someone might happily invoke with:
engine.updateState(1);
Using an enum forces the user to stick with the explanatory label, so it is more legible.
There is one situation where static constants are preferable (other than legacy code with a ton of dependencies), and that is when the set of values is not, or may later not be, fixed.
Imagine that you may later need to add a new state such as Collected. The only way to do that with an enum is to edit the original code, which can be a problem if a lot of other code already depends on it. Other than this, I personally see no reason not to use an enum.
Just my thought.
Readability - When you use enums and write State.Idle, the reader immediately knows that you are talking about an idle state. Compare this with 4 or 5.
Type Safety - When you use an enum, the user cannot pass a wrong value even by mistake, as the compiler will force him to use one of the pre-declared values in the enum. With plain integers, he could even pass -3274.
Maintainability - If you wanted to add a new state Waiting, it would be very easy to do so by adding a constant Waiting to your enum State without causing any confusion.
The reasons from the spec, which Lajcik quotes, are explained in more detail in Josh Bloch's Effective Java, Item 30. If you have access to that book, I'd recommend perusing it. Java Enums are full-fledged classes which is why you get compile-time type safety. You can also give them behavior, giving you better encapsulation.
The former is common in code that started pre-1.5. Actually, another common idiom was to define your constants in an interface, because they didn't have any code.
Enums also give you a great deal of flexibility. Since Enums are essentially classes, you can augment them with useful methods (such as providing an internationalized resource string corresponding to a certain value in the enumeration, converting back and forth between instances of the enum type and other representations that may be required, etc.)
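As one example of 'converting back and forth', an enum can carry the legacy int code and offer a lookup in the other direction. This is only a sketch; the `code()` and `fromCode()` names are assumptions, and the codes 0 and 1 mirror the ENGINE_IDLE and ENGINE_COLLECTING constants from the question.

```java
class LegacyCodeSketch {
    enum State {
        Idle(0),
        Collecting(1);

        private final int code;

        State(int code) {
            this.code = code;
        }

        // Enum to legacy int.
        int code() {
            return code;
        }

        // Legacy int to enum; unknown values are rejected instead of passed along.
        static State fromCode(int code) {
            for (State s : values()) {
                if (s.code == code) {
                    return s;
                }
            }
            throw new IllegalArgumentException("Unknown engine state code: " + code);
        }
    }
}
```

This keeps the ints at the boundary with old code, while everything else works with the enum type.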