Why do functional programming languages support automated memoization but not imperative languages? - java

This is a question I read in some lecture notes about dynamic programming that I randomly found on the internet. (I have graduated and already know the basics of dynamic programming.)
In the section explaining why memoization is needed, the notes give this example:
// pseudocode
int F[100000] = {0};
int fibonacci(int x){
if(x <= 1) return x;
if(F[x]>0) return F[x];
return F[x] = fibonacci(x-1) + fibonacci(x-2);
}
If memoization is not used, many subproblems are recalculated many times, which makes the complexity very high.
Then on one page, the notes have a question without an answer, which is exactly what I want to ask. Here I am using the exact wording and the examples they show:
Automated memoization: Many functional programming languages (e.g. Lisp) have built-in support for memoization.
Why not in imperative languages (e.g. Java)?
LISP example the notes provide (which they claim is efficient):
(defun F (n)
(if
(<= n 1)
n
(+ (F (- n 1)) (F (- n 2)))))
Java example the notes provide (which they claim is exponential):
static int F(int n) {
if (n <= 1) return n;
else return F(n-1) + F(n-2);
}
Before reading this, I did not even know that some programming languages have built-in support for memoization.
Is the claim in the notes true? If yes, then why do imperative languages not support it?

The claims about "LISP" are very vague: they don't even mention which LISP dialect or implementation they mean. None of the LISP dialects I'm familiar with do automatic memoization, but LISP makes it easy to write a wrapper function which transforms any existing function into a memoized one.
Fully automatic, unconditional memoization would be a very dangerous practice and would lead to out-of-memory errors. In imperative languages it would be even worse because return values are often mutable, therefore not reusable. Imperative languages don't usually support tail-recursion optimization, further reducing the applicability of memoization.
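To make the point about mutable return values concrete, here is a minimal Java sketch. The class and method names are hypothetical, chosen only for illustration. The cache hands the same mutable list to every caller, so one caller's change silently corrupts the cached result for everyone else.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example class; not taken from the question or the lecture notes.
class MutableResultCache {
    private static final Map<Integer, List<Integer>> CACHE = new HashMap<>();

    // Returns the list 1..n, naively cached: every caller receives the same list object.
    static List<Integer> upTo(int n) {
        List<Integer> cached = CACHE.get(n);
        if (cached != null) return cached;
        List<Integer> result = new ArrayList<>();
        for (int i = 1; i <= n; i++) result.add(i);
        CACHE.put(n, result);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> first = upTo(3);
        first.add(99);                // the caller mutates what it believes is its own result
        System.out.println(upTo(3));  // the cached value now contains 99 as well
    }
}
```

Returning an unmodifiable or copied list would avoid this, at the price of extra work on every cache hit.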

The support for memoization is nothing more than having first-class functions.
If you want to memoize the Java version for one specific case, you can write it explicitly: create a hashtable, check for existing values, etc. Unfortunately, you cannot easily generalize this in order to memoize any function. Languages with first-class functions make writing functions and memoizing them almost orthogonal problems.
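For reference, the explicit, one-off version in Java could look roughly like this. It is a sketch with a hypothetical class name; the cache is written by hand and only works for this particular function.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name; illustrates hand-written memoization for one function only.
class FibMemo {
    // Maps an argument to its already computed result.
    private static final Map<Integer, Long> CACHE = new HashMap<>();

    static long fib(int n) {
        if (n <= 1) return n;
        Long cached = CACHE.get(n);
        if (cached != null) return cached;       // reuse the stored value
        long result = fib(n - 1) + fib(n - 2);   // each subproblem is computed once
        CACHE.put(n, result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(fib(50)); // 12586269025
    }
}
```

Note that the lookup and store are written out instead of using Map.computeIfAbsent, because the mapping function would recursively modify the same map.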
The basic case is easy, but you have to take into account recursive calls.
In statically typed functional languages like OCaml, a function that is memoized cannot just call itself recursively, because it would call the non-memoized version. However, the only change to your existing function is to accept a function as an argument, named for example self, which should be called whenever your function wants to recurse. The generic memoization facility then provides the appropriate function. A full example of this is available in this answer.
The Lisp version has two features that make memoizing an existing function even more straightforward.
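The same "pass self as an argument" idea can be sketched in Java 8 and later, now that functions can be passed around as values. This is only an illustration of the technique, not the OCaml code from the linked answer; Memoizer, memoize and FIB are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical helper; a sketch of generic memoization through an explicit "self" parameter.
class Memoizer {
    // 'open' receives the memoized function as its first argument and must recurse through it.
    static <A, R> Function<A, R> memoize(BiFunction<Function<A, R>, A, R> open) {
        Map<A, R> cache = new HashMap<>();
        return new Function<A, R>() {
            @Override
            public R apply(A arg) {
                R cached = cache.get(arg);
                if (cached != null) return cached;
                R result = open.apply(this, arg);  // recursive calls come back through the cache
                cache.put(arg, result);
                return result;
            }
        };
    }

    // The function body never names itself; it only calls 'self'.
    static final Function<Integer, Long> FIB = memoize((self, n) ->
            n <= 1 ? (long) n : self.apply(n - 1) + self.apply(n - 2));

    public static void main(String[] args) {
        System.out.println(FIB.apply(50)); // 12586269025
    }
}
```

The function being memoized and the caching policy are now separate pieces, which is the "almost orthogonal" property described above.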
You can manipulate functions like any other value
You can redefine functions at runtime
So for example, in Common Lisp, you define F:
(defun F (n)
(if (<= n 1)
n
(+ (F (- n 1))
(F (- n 2)))))
Then, you see that you need to memoize the function, so you load a library:
(ql:quickload :memoize)
... and you memoize F:
(org.tfeb.hax.memoize:memoize-function 'F)
The facility accepts arguments to specify which input should be cached and which test function to use. Then, the function F is replaced by a fresh one, which introduces the necessary code to use an internal hash-table. Recursive calls to F inside F are now calling the wrapping function, not the original one (you don't even recompile F). The only potential problem is if the original F was subject to tail-call optimization. You should probably declare it notinline or use DEF-MEMOIZED-FUNCTION.

Although I'm not sure any widely-used Lisps have supported automatic memoization, I think there are two reasons why memoization is more common in functional languages, and an additional one for Lisp-family languages.
First of all, people write functions in functional languages: computations whose result depends only on their arguments and which do not side-effect the environment. Anything which doesn't meet that requirement isn't amenable to memoization at all. And, well, imperative languages are just those languages in which those requirements are not, or may not be, met, because they would not be imperative otherwise!
Of course, even in merely functional-friendly languages like (most) Lisps you have to be careful: you probably should not memoize the following, for instance:
(defvar *p* 1)
(defun foo (n)
(if (<= n 0)
*p*
(+ (foo (1- n)) (foo (- n *p*)))))
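A rough Java transcription of the same trap, with hypothetical names: the result depends on the mutable global p as well as on n, so a cache keyed only on n returns stale values once p changes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class; p plays the role of the special variable *p* in the Lisp code.
class StaleCache {
    static int p = 1;
    private static final Map<Integer, Integer> CACHE = new HashMap<>();

    // Not a pure function: the result depends on n and on the current value of p.
    static int foo(int n) {
        if (n <= 0) return p;
        return foo(n - 1) + foo(n - p);
    }

    // Memoized on n only, which silently ignores the dependency on p.
    static int memoFoo(int n) {
        Integer cached = CACHE.get(n);
        if (cached != null) return cached;
        int result = (n <= 0) ? p : memoFoo(n - 1) + memoFoo(n - p);
        CACHE.put(n, result);
        return result;
    }

    public static void main(String[] args) {
        p = 1;
        System.out.println(memoFoo(3)); // 8, computed with p = 1
        p = 2;
        System.out.println(memoFoo(3)); // still 8, served from the cache
        System.out.println(foo(3));     // 10, the correct value for p = 2
    }
}
```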
Secondly, functional languages generally want to talk about immutable data structures. This means two things:
(1) It is actually safe to memoize a function which returns a large data structure.
(2) Functions which build very large data structures often need to cons an enormous amount of garbage, because they can't mutate interim structures.
(2) is slightly controversial: the received wisdom is that GCs are now so good that it's not a problem, copying is very cheap, compilers can do magic and so on. Well, people who have written such functions will know that this is only partly true: GCs are good, copying is cheap (but pointer-chasing large structures to copy them is often very hostile to caches), but it's not actually enough (and compilers almost never do the magic they are claimed to do). So you either cheat by gratuitously resorting to non-functional code, or you memoize. If you memoize the function then you only build all the interim structures once, and everything becomes cheap (other than in memory, but suitable weakness in the memoization can handle that).
Thirdly: if your language does not support easy metalinguistic abstraction, it's a serious pain to implement memoization. Or to put it another way: you need Lisp-style macros.
To memoize a function you need to do at least two things:
You need to control which arguments are the keys for the memoization -- not all functions have just one argument, and not all functions with multiple arguments should be memoized on the first;
You need to intervene inside the function to disable any self-tail-call optimization, which will completely subvert memoization.
Although it's kind of cruel to do so because it's so easy, I will demonstrate this by poking fun at Python.
You might think that decorators are what you need to memoize functions in Python. And indeed, you can write memoizing tools using decorators (and I have written a bunch of them). And these even sort-of work, although they do so mostly by chance.
For a start, a decorator can't easily know anything about the function it is decorating. So you end up either trying to memoize based on a tuple of all the arguments to the function, or having to specify in the decorator which arguments to memoize on, or something equally grotty.
Secondly, the decorator gets the function it is decorating as an argument: it doesn't get to poke around inside it. That's actually OK, because Python, as part of its 'no concepts invented after 1956' policy, of course, does not assume that calls to f lexically within the definition of f (and with no intervening bindings) are in fact self-calls. But perhaps one day it will, and all your memoization will now break.
So in summary: to memoize functions robustly, you need Lisp-style macros. Probably the only imperative languages which have those are Lisps.

Why does Java CharSequence.chars() return an IntStream? [duplicate]

In Java 8, there is a new method String.chars() which returns a stream of ints (IntStream) that represent the character codes. I guess many people would expect a stream of chars here instead. What was the motivation to design the API this way?
As others have already mentioned, the design decision behind this was to prevent the explosion of methods and classes.
Still, personally I think this was a very bad decision. Given that they did not want to add a CharStream, which is reasonable, there should have been different methods instead of chars(). I would think of:
Stream<Character> chars(), which gives a stream of boxed characters and has a light performance penalty.
IntStream unboxedChars(), which would be used for performance-sensitive code.
However, instead of focusing on why it is done this way currently, I think this answer should focus on showing a way to do it with the API that we have gotten with Java 8.
In Java 7 I would have done it like this:
for (int i = 0; i < hello.length(); i++) {
System.out.println(hello.charAt(i));
}
And I think a reasonable method to do it in Java 8 is the following:
hello.chars()
.mapToObj(i -> (char)i)
.forEach(System.out::println);
Here I obtain an IntStream and map it to an object via the lambda i -> (char)i, this will automatically box it into a Stream<Character>, and then we can do what we want, and still use method references as a plus.
Be aware, though, that you must use mapToObj. If you forget and use map, nothing will complain, but you will still end up with an IntStream, and you might be left wondering why it prints the integer values instead of the strings representing the characters.
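To see the difference side by side, here is a small sketch (the class and method names are made up for this example). Both pipelines compile, but only the second one ever produces Character objects.

```java
import java.util.stream.Collectors;

// Hypothetical demo class contrasting IntStream.map with IntStream.mapToObj.
class MapVsMapToObj {
    // map keeps the stream an IntStream: the (char) cast is widened straight back to int.
    static String withMap(String s) {
        return s.chars()
                .map(i -> (char) i)
                .mapToObj(i -> Integer.toString(i))  // render each int as println(int) would
                .collect(Collectors.joining(","));
    }

    // mapToObj boxes each value, giving a Stream<Character>.
    static String withMapToObj(String s) {
        return s.chars()
                .mapToObj(i -> (char) i)
                .map(c -> c.toString())
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(withMap("hi"));      // 104,105
        System.out.println(withMapToObj("hi")); // h,i
    }
}
```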
Other ugly alternatives for Java 8:
By remaining in an IntStream and wanting to print them ultimately, you cannot use method references anymore for printing:
hello.chars()
.forEach(i -> System.out.println((char)i));
Moreover, using method references to your own methods does not work anymore! Consider the following:
private void print(char c) {
System.out.println(c);
}
and then
hello.chars()
.forEach(this::print);
This will give a compile error, as there possibly is a lossy conversion.
Conclusion:
The API was designed this way because the designers did not want to add CharStream. I personally think that the method should return a Stream<Character>; the current workaround is to use mapToObj(i -> (char)i) on the IntStream to be able to work with the characters properly.
The answer from skiwi covered many of the major points already. I'll fill in a bit more background.
The design of any API is a series of tradeoffs. In Java, one of the difficult issues is dealing with design decisions that were made long ago.
Primitives have been in Java since 1.0. They make Java an "impure" object-oriented language, since the primitives are not objects. The addition of primitives was, I believe, a pragmatic decision to improve performance at the expense of object-oriented purity.
This is a tradeoff we're still living with today, nearly 20 years later. The autoboxing feature added in Java 5 mostly eliminated the need to clutter source code with boxing and unboxing method calls, but the overhead is still there. In many cases it's not noticeable. However, if you were to perform boxing or unboxing within an inner loop, you'd see that it can impose significant CPU and garbage collection overhead.
When designing the Streams API, it was clear that we had to support primitives. The boxing/unboxing overhead would kill any performance benefit from parallelism. We didn't want to support all of the primitives, though, since that would have added a huge amount of clutter to the API. (Can you really see a use for a ShortStream?) "All" or "none" are comfortable places for a design to be, yet neither was acceptable. So we had to find a reasonable value of "some". We ended up with primitive specializations for int, long, and double. (Personally I would have left out int but that's just me.)
For CharSequence.chars() we considered returning Stream<Character> (an early prototype might have implemented this) but it was rejected because of boxing overhead. Considering that a String has char values as primitives, it would seem to be a mistake to impose boxing unconditionally when the caller would probably just do a bit of processing on the value and unbox it right back into a string.
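As an illustration of the "do a bit of processing and put it right back into a string" case, the following sketch stays in an IntStream the whole way, so no Character is ever boxed. The class and method names are hypothetical.

```java
// Hypothetical demo class: filter the chars of a string without boxing them.
class LettersOnly {
    static String lettersOnly(String s) {
        return s.chars()
                .filter(Character::isLetter)            // uses Character.isLetter(int)
                .collect(StringBuilder::new,            // one builder per (sub)stream
                         StringBuilder::appendCodePoint, // append each int as a character
                         StringBuilder::append)          // merge builders if run in parallel
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(lettersOnly("a1b2c3")); // abc
    }
}
```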
We also considered a CharStream primitive specialization, but its use would seem to be quite narrow compared to the amount of bulk it would add to the API. It didn't seem worthwhile to add it.
The penalty this imposes on callers is that they have to know that the IntStream contains char values represented as ints and that casting must be done at the proper place. This is doubly confusing because there are overloaded API calls like PrintStream.print(char) and PrintStream.print(int) that differ markedly in their behavior. An additional point of confusion possibly arises because the codePoints() call also returns an IntStream but the values it contains are quite different.
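Both points of confusion are easy to reproduce. In this sketch (hypothetical class name) the same value prints differently depending on the overload chosen, and chars() and codePoints() disagree as soon as the string contains a character outside the Basic Multilingual Plane.

```java
// Hypothetical demo class for the int/char overloads and chars() versus codePoints().
class CharsVsCodePoints {
    static long charCount(String s) { return s.chars().count(); }

    static long codePointCount(String s) { return s.codePoints().count(); }

    public static void main(String[] args) {
        int code = "A".chars().findFirst().getAsInt();
        System.out.println(code);         // println(int) overload: prints 65
        System.out.println((char) code);  // println(char) overload: prints A

        // 'a' followed by U+1F600, which is stored as a surrogate pair of two chars.
        String s = "a\uD83D\uDE00";
        System.out.println(charCount(s));      // 3
        System.out.println(codePointCount(s)); // 2
    }
}
```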
So, this boils down to choosing pragmatically among several alternatives:
We could provide no primitive specializations, resulting in a simple, elegant, consistent API, but which imposes a high performance and GC overhead;
we could provide a complete set of primitive specializations, at the cost of cluttering up the API and imposing a maintenance burden on JDK developers; or
we could provide a subset of primitive specializations, giving a moderately sized, high performing API that imposes a relatively small burden on callers in a fairly narrow range of use cases (char processing).
We chose the last one.

Why does Scala implement for as a closure?

Recent events on the blogosphere have indicated that a possible performance problem with Scala is its use of closures to implement for.
What are the reasons for this design decision, as opposed to a C or Java-style "primitive for" - that is one which will be turned into a simple loop?
(I'm making a distinction between Java's for and its "foreach" construct here, as the latter involves an implicit Iterator).
More detail, following up from Peter. This bit of Scala:
object ScratchFor {
def main(args : Array[String]) : Unit = {
for (val s <- args) {
println(s)
}
}
}
creates 3 classes: ScratchFor$$anonfun$main$1.class ScratchFor$.class ScratchFor.class
ScratchFor::main just forwards to the companion object, ScratchFor$.MODULE$::main which spins up an ScratchFor$$anonfun$main$1 (which is an implementation of AbstractFunction1).
It's in the apply() method of this anonymous inner impl of AbstractFunction1 that the actual code lives, which is effectively the loop body.
I don't see HotSpot being able to rewrite this into a simple loop. Happy to be proved wrong on this, though.
Traditional for loops are clumsy, verbose and error-prone. I think it is proof enough of this that "for-each" loops were added to Java, C# and C++, but if you want more details you may check item 46 of Effective Java.
Now, for-each loops are still much faster than Scala for-comprehension, but they are also much less powerful (and more clumsy) because they cannot return values. If you want to transform or filter a collection (or do both to a group of collections), you'll still have to handle all the mechanical details of constructing the result collection in addition to computing the values. Not to mention it inevitably uses some mutable state.
Finally, even though for-each loops are adequate for collections, they are not suited to other monadic classes (of which collections are a subset).
So Scala has a general method which takes care of all of the above. Yes, it is slower, but the goal is to have the compiler effectively optimise it well enough so that this doesn't become a hindrance (and, of course, JIT could help here as well).
That has not been accomplished to date, but -optimise has closed much of the gap between common for-each loops and for-comprehensions on the latest versions of Scala. If performance is essential, you can always use while or tail recursion.
Now, it would be possible for Scala to have common for loops or for-each loops as special cases specifically targeted at performance issues (since for-comprehensions can do everything they do). However, that violates two principles that guide Scala's design:
Reduce complexity. Yes, contrary to what some say, that is a design goal, and special cases that serve no purpose other than optimising performance -- even though a workable solution exists for performance cases -- would needlessly increase the complexity of the language.
Scalability. This is in the sense that the user can scale the language for any size of problem by writing libraries. The point here is that having the compiler optimise one particular class, such as Range, would make it impossible for the user to create a replacement class that would perform just as well.
The for comprehension in Scala is a powerful general-purpose looping and pattern-matching construct. Look at what it can do:
case class Person(first: String, last: String) {}
val people = List(Person("Isaac","Newton"), Person("Michael","Jordan"))
val lastfirst = for (Person(f,l) <- people) yield l+", "+f
for (n <- lastfirst) println(n)
The second case looks pretty straightforward--take each item in a collection and print it. But the first takes apart a list containing a custom data structure and transforms it into a different collection type!
The first for there highlights only a small portion of the capability of the construct; it is both extremely powerful and extremely general. In order to maintain this power, the for must be able to turn into something very general, which means closures. Then the question is: do you also introduce special cases that operate on known collections in simple ways with improved performance? The answer thus far has been mostly no, instead preferring solutions that optimize the general closure-taking methods that for turns into.
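For readers more at home in Java, here is a rough analog of what "turning into closures" means. It is not what scalac emits, only a sketch under that analogy: the body of each for becomes a function object handed to a general method (map or forEach). The Person class here is a hypothetical stand-in for the Scala case class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical Java analog of the two Scala for expressions above.
class ForAsClosures {
    // Stand-in for: case class Person(first: String, last: String)
    static final class Person {
        final String first;
        final String last;
        Person(String first, String last) { this.first = first; this.last = last; }
    }

    // Rough analog of: for (Person(f, l) <- people) yield l + ", " + f
    static List<String> lastFirst(List<Person> people) {
        return people.stream()
                     .map(p -> p.last + ", " + p.first)  // the loop body is a closure passed to map
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Isaac", "Newton"), new Person("Michael", "Jordan"));
        // Rough analog of: for (n <- lastfirst) println(n)
        lastFirst(people).forEach(n -> System.out.println(n));
    }
}
```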
Whether this is useful for you in particular depends on whether you are using the general capabilities a lot (in which case you will be glad) or not (in which case you may wish progress was faster).
Still, try -optimize. It often usefully speeds up simple for-comprehensions these days.
The for-comprehension is much more than a simple loop.
If you need an imperative loop, use while. If you want to write performant code in Scala, you need to know this. Just like you have to know about language implementation when you want to write fast code in every other language.
So, since the for-comprehension is not a simple loop, I hope you understand that it's not compiled down to a simple loop.
I would assume using a closure is a general solution. A more optimal solution in some cases would be to "inline" the closure as a loop and eliminate the need to create an object. Perhaps the Scala designers feel the JIT should do this, rather than having the compiler do this.
Let's say in Java this is the same as writing
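public class ForLoopDemo {
    // Stand-in for Scala's Function1: apply() holds the loop body.
    interface Function<T> {
        void apply(T s);
    }

    // A varargs parameter must come last in Java, so the array is passed explicitly.
    public static <T> void for_loop(T[] ts, Function<T> tFunc) {
        for (T t : ts) tFunc.apply(t);
    }

    public static void main(String... args) {
        for_loop(args, new Function<String>() {
            public void apply(String s) {
                System.out.println(s);
            }
        });
    }
}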
This is fairly easy to inline (if you're a human). What is surprising is that Scala doesn't have an intrinsic to perform the optimisation to eliminate the need for a new object. Certainly the JIT could do it in theory, but in practice, it might be a while before it handles this specific case.
I'm surprised that no one has mentioned one of the pitfalls you can get into if for does not create a closure.
In Python for example:
ls = [None] * 3
for i in [0, 1, 2]:
    ls[i] = lambda: i
print(ls[0]())
print(ls[1]())
print(ls[2]())
This prints 2 2 2, because i has a longer lifetime than the for loop. I run into this trap all the time in Python and R.
So even in the very simplest of cases, it is important for for in Scala to be implemented using an anonymous function, because it creates an environment to store variables.
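For contrast, here is a small Java sketch of the behaviour you get when every iteration has its own binding (hypothetical class name). The variable of Java's enhanced for is a fresh, effectively final variable in each iteration, so each lambda captures its own value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical demo class: closures created in a loop with one binding per iteration.
class CaptureDemo {
    static List<Integer> captured() {
        List<Supplier<Integer>> ls = new ArrayList<>();
        for (int i : new int[] {0, 1, 2}) {
            ls.add(() -> i);  // captures this iteration's i, not a shared variable
        }
        List<Integer> out = new ArrayList<>();
        for (Supplier<Integer> s : ls) out.add(s.get());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(captured()); // [0, 1, 2]
    }
}
```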

Why doesn't Java need Operator Overloading? [closed]

Why doesn't Java need operator overloading? Is there any way it can be supported in Java?
Java only allows arithmetic operations on elementary numeric types. It's a mixed blessing, because although it's convenient to define operators on other types (like complex numbers, vectors etc), there are always implementation-dependent idiosyncrasies. So operators don't always do what you expect them to do. By avoiding operator overloading, it's more transparent which function is called when. A wise design move in some people's eyes.
Java doesn't "need" operator overloading, because no language needs it.
a + b is just "syntactic sugar" for a.Add(b) (actually, some would argue that a.Add(b) is just syntactic sugar for Add(a,b))
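Java's own java.math.BigInteger shows what that looks like in practice: the built-in + works on primitives, while the library type spells the same operation as a method call. The wrapper class and method names below are hypothetical.

```java
import java.math.BigInteger;

// Hypothetical demo class: addition written as a method call instead of an operator.
class AddDemo {
    static BigInteger sum(BigInteger a, BigInteger b) {
        return a.add(b);  // what "a + b" would mean if Java allowed overloading +
    }

    public static void main(String[] args) {
        int x = 2 + 3;    // built-in operator on a primitive type
        BigInteger y = sum(BigInteger.valueOf(2), BigInteger.valueOf(3));
        System.out.println(x + " " + y); // 5 5
    }
}
```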
This related question might help. In short, operator overloading was intentionally avoided when Java was designed because of issues with overloading in C++.
Scala, a newer JVM language, has a syntax that allows method overloading that functions very much like operator overloading, without the limitations of C++ operator overloading. In Scala, it's possible to define a method named +, for example. It's also possible to omit the . operator and parentheses in method calls:
case class A(value: Int) {
def +(other: A) = new A(value + other.value)
}
scala> new A(1) + new A(3)
res0: A = A(4)
No language needs operator overloading. Some believe that Java would benefit from adding it, but its omission has been publicized as a benefit for so long that adding it is almost certainly politically unacceptable (and it's only since the Oracle buyout that I'd even include the "almost").
The counterpoint generally consists of postulating some meaningless (or even counterintuitive) overload, such as adding together two employees or overloading '+' to do division. While operator overloading in such languages as C++ would allow this, lack of operator overloading in Java does little to prevent or even mitigate the problem. someEmployee.Add(anotherEmployee) is no improvement over someEmployee + anotherEmployee. The same applies if myLargeInteger.Add(anotherLargeInteger) actually does division instead of addition. At least to me, this line of argument appears thoroughly unconvincing at best.
There is, however, another respect in which omitting operator overloading does (almost certainly) have a real benefit. Its omission keeps the language easier to process, which makes it much easier (and quicker) to develop tools that process the language. Just for an obvious example, refactoring tools for Java are much more numerous and comprehensive than for C++. I doubt that this can or should be credited specifically and solely to support for operator overloading in C++ and its omission in Java. Nonetheless, the general attitude of keeping Java simple (including omission of operator overloading) is undoubtedly a major contributing factor.
The possibility of simplifying parsing by requiring spaces between identifiers and operators (e.g., a+b prohibited, but a + b allowed) has been raised. At least in my opinion, this is unlikely to make any real difference in most cases. The reason is fairly simple: at least in a typical compiler, the parser is preceded by a lexer. The lexer extracts tokens from the input stream and feeds them to the parser. With such a structure, the parser wouldn't see any difference at all between a+b and a + b. Either way, it would receive exactly three tokens: identifier, +, and identifier.
Requiring the spaces might simplify the lexer a tiny bit--but to the extent it did, it would be completely independent of operator overloading, at least assuming the operator overloading was done like it is in C++, where only existing tokens are used [1].
So, if that's not the problem, what is? The problem with operator overloading is that you can't hard-code a parser to know the meaning of an operator. With Java, for some given a = b + c, there are exactly two possibilities: a, b and c are each chosen from a small, limited set of types, and the meaning of that + is baked into the language, or else you have an error. So, a tool that needs to look at b + c and make sense of it can do a very minimal parse to assure that b and c are of types that can be added. If they are, it knows what the addition means, what kind of result it produces, and so on. If they aren't, it can underline it in red squiggles (or whatever) to indicate an error.
For C++, things are quite different. For an expression like a = b + c;, b and c could be of almost entirely arbitrary types. The + could be implemented as a member function of b's type, or it could be a free function. In some cases, we might have a number of operator overloads (some of which could be templates) that could carry out that operation, so we need to do overload resolution to determine which one the compiler would actually select based on the types of the parameters (and if some of them are templates, the overload resolution rules get even more complex).
That lets us determine the type of the result from b + c. From there we basically repeat the whole process again to figure out what (if any) overload is used to assign that result to a. It might be built-in, or it might be another operator overload, and there might be multiple possible overloads that could do the job, so we have to do overload resolution again to figure out the right operator to use here.
In short, just figuring out what a = b + c; means in C++ requires nearly an entire compiler front-end. We can do the same in Java with a much smaller subset of a compiler [2].
[1] I suppose things could be somewhat different if you allowed operator overloading like, for example, ML does, where a more or less arbitrary token can be designated as an operator, and that operator can be given a more or less arbitrary associativity and/or precedence. I believe ML handles this entirely in parsing, not lexing, but if you took this basic concept enough further, I can believe it might start to affect lexing, not just parsing.
[2] Not to mention that most Java tools will use the JDK, which ships with a complete Java compiler, so tools can normally do most such analysis without dealing directly with parsing and such at all.
The java-oo compiler plugin can add operator overloading support to Java.
It's not that Java doesn't "need" operator overloading; it's just a choice made by its creators, who wanted to keep the language simpler.
Java does not support operator overloading by programmers. This is not the same as stating that Java does not need operator overloading.
Operator overloading is syntactic sugar to express an operation using (arithmetic) symbols. For obvious reasons, the designers of the Java programming language chose to omit support for operator overloading in the language. This declaration can be found in the Java Language Environment whitepaper:
There are no means provided by which
programmers can overload the standard
arithmetic operators. Once again, the
effects of operator overloading can be
just as easily achieved by declaring a
class, appropriate instance variables,
and appropriate methods to manipulate
those variables. Eliminating operator
overloading leads to great
simplification of code.
In my personal opinion, that is a wise decision. Consider the following piece of code:
String b = "b";
String c = "c";
String a = b + c;
Now, it is fairly evident that b and c are concatenated to yield a. But when one considers the following snippet, written in a hypothetical language that supports operator overloading, it is fairly evident that using operator overloading does not make for readable code.
Person b = new Person("B");
Person c = new Person("C");
Person a = b + c;
In order to understand the result of the above operation, one must view the implementation of the overloaded addition operator for the Person class. Surely, that makes for a tedious debugging session, and the code is better implemented as:
Person b = new Person("B");
Person c = new Person("C");
Person a = b.copyAttributesFrom(c);
OK, well... this is a much-discussed and common issue. Today, in the software industry, there are mainly two different types of languages:
Low level languages
High level languages
This distinction was useful about 10 years ago; at present, the situation is a bit different.
Today we talk about business-ready applications.
Business models are particular models in which programs need to meet many requirements. They are so complex and so strict that coding an application in a language like C or C++ would be very time-consuming. For this reason, hybrid languages were invented.
We commonly know two types of languages:
Compiled
Interpreted
Well, today there is another one:
Compiled/Interpreted: in one word: MANAGED.
Managed languages are languages that are compiled to produce another form of code, different from the original and much harder for a human to handle. This INTERMEDIATE LANGUAGE is then INTERPRETED by a program that runs the final program.
This is the familiar model we know from Java... It is a winning approach for business-ready applications.
Well, now going to your question...
Operator overloading is a matter that also concerns multiple inheritance and other advanced features of low-level languages.
Java, like C#, Python and so on, is a managed language, made to be easy to write and useful for building complex applications in very little time.
If we included operator overloading in Java, the language would become more complex and difficult to handle.
If you program in C++, you surely understand that operator overloading is a very, very delicate matter, because it can lead to very complex situations, and sometimes the compiler might refuse to compile because of conflicts and so on... Introducing operator overloading has to be done carefully. IT IS POWERFUL, but we pay for this power with an incredibly big load of problems to handle.
OK, OK, IT IS TRUE, you might tell me: "HEY, but C# uses operator overloading... What the hell are you telling me? Why does C# support it and Java not?".
Well, this is the answer. C#, yes, implements operator overloading, but not like C++. There are many operators that cannot be overloaded in C#, like "new" and many others that you can overload in C++... So C# supports operator overloading, but to a much lesser extent than C++ or other languages that fully support it. But this is not a good answer to the earlier question...
The real answer is that C# is more complex than Java. This is a pro but also a con. It is a matter of deciding where to place the language: high level, higher level, very high level?
Well, Java does not support op overloading because it wants to be fast and easy to manage and use. When introducing op overloading, a language must also take on a large number of problems caused by this new functionality.
It is exactly like asking: "Why does Java not support multiple inheritance?"
Because it is tremendously complex to manage. Think about it... IT WOULD BE IMPOSSIBLE for a managed language to support multiple inheritance... No common class tree, no object class as a common base class for all classes, no possibility of upcasting (safely), and many problems to handle, manage, foresee, keep track of...
Java wants to be simple.
Even though I believe that future versions of this language will end up supporting op overloading, you will see that it will cover only a subset of the possibilities you have for overloading in C++.
Many others here have also told you that overloading is useless.
Well, I am one of those who think this is not true.
Well, if you think this way (op overloading is useless), then many other features of managed languages are useless too. Think about interfaces, classes and so on; you do not really need them. You can use abstract classes for interface implementations... Let's look at C#... so much syntactic sugar, LINQ and so on; they are not really necessary, BUT THEY SPEED UP YOUR WORK...
Well, in managed languages everything that speeds up the development process is welcome and does not imply uselessness. If you think that such features are not useful, then the entire language itself would be useless and we would all go back to programming complex applications in C++, Ada, etc. The added value of managed languages is to be measured precisely by these elements.
Op overloading is a very useful feature. It could be implemented in languages like Java, and this would change the language's structure and purposes; it would be a good thing but a bad thing too, just a matter of taste.
But today, Java is simpler than C# partly for this reason: Java does not support op overloading.
I know, maybe this was a little long, but I hope it helps. Bye
See section 2.2.7, "No More Operator Overloading", under "Features Removed from C and C++" in the Java Language Environment whitepaper:
There are no means provided by which
programmers can overload the standard
arithmetic operators. Once again, the
effects of operator overloading can be
just as easily achieved by declaring a
class, appropriate instance variables,
and appropriate methods to manipulate
those variables. Eliminating operator
overloading leads to great
simplification of code.
Java doesn't support operator overloading (one reference is the Wikipedia Operator Overloading page). This was a design decision by Java's creators to avoid perceived problems seen with operator overloading in other languages (especially C++).
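The whitepaper's claim that a class with appropriate methods can stand in for overloaded operators can be sketched as follows. Vec2 and its plus method are hypothetical names invented for this example; BigInteger.add is the JDK's own use of the same style.

```java
import java.math.BigInteger;

// Hypothetical immutable 2D vector; not part of any library.
final class Vec2 {
    final double x;
    final double y;

    Vec2(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Plays the role that an overloaded '+' would play in C++.
    Vec2 plus(Vec2 other) {
        return new Vec2(x + other.x, y + other.y);
    }

    public static void main(String[] args) {
        Vec2 a = new Vec2(1, 2).plus(new Vec2(3, 4));
        System.out.println(a.x + ", " + a.y); // 4.0, 6.0

        // The JDK follows the same convention for arbitrary-precision integers.
        BigInteger sum = new BigInteger("12345678901234567890").add(BigInteger.ONE);
        System.out.println(sum); // 12345678901234567891
    }
}
```

Whether a.plus(b) reads better or worse than a + b is exactly the matter of taste the answers here disagree about.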

Complexity of Java 7's current Lambda proposal? (August 2010)

Some people say that every programming language has its "complexity budget" which it can use to accomplish its purpose. But if the complexity budget is depleted, every minor change becomes increasingly complicated and hard to implement in a backward-compatible way.
After reading the current provisional syntax for Lambda (≙ Lambda expressions, exception transparency, defender methods and method references) from August 2010 I wonder if people at Oracle completely ignored Java's complexity budget when considering such changes.
These are the questions I'm thinking about - some of them more about language design in general:
Are the proposed additions comparable in complexity to approaches other languages chose?
Is it generally possible to add such features to a language while protecting the developer from the complexity of the implementation?
Are these additions a sign of reaching the end of the evolution of Java-as-a-language or is this expected when changing a language with a huge history?
Have other languages taken a totally different approach at this point of language evolution?
Thanks!
I have not followed the process and evolution of the Java 7 lambda
proposal, I am not even sure of what the latest proposal wording is.
Consider this as a rant/opinion rather than statements of truth. Also,
I have not used Java for ages, so the syntax might be rusty and
incorrect at places.
First, what are lambdas to the Java language? Syntactic sugar. While
in general lambdas enable code to create small function objects in
place, that support was already present --to some extent-- in the Java
language through the use of inner classes.
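To make the "syntactic sugar" point concrete, here is a sketch comparing an anonymous inner class with a lambda. It uses the -> syntax and java.util.function.IntUnaryOperator that eventually shipped in Java 8, not the provisional #(...) syntax discussed in this answer; the class and field names are made up.

```java
import java.util.function.IntUnaryOperator;

class SugarDemo {
    // Pre-lambda style: an anonymous inner class implementing a one-method interface.
    static final IntUnaryOperator SQUARE_INNER = new IntUnaryOperator() {
        @Override
        public int applyAsInt(int x) {
            return x * x;
        }
    };

    // Lambda style: same interface, same behavior, less ceremony.
    static final IntUnaryOperator SQUARE_LAMBDA = x -> x * x;

    public static void main(String[] args) {
        System.out.println(SQUARE_INNER.applyAsInt(5));  // 25
        System.out.println(SQUARE_LAMBDA.applyAsInt(5)); // 25
    }
}
```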
So how much better is the syntax of lambdas? Where does it outperform
previous language constructs? Where could it be better?
For starters, I dislike the fact that there are two available syntaxes
for lambda functions (but this is in line with C#, so I guess my
opinion is not widespread). I guess if we want to sugar coat, then
#(int x)(x*x) is sweeter than #(int x){ return x*x; } even if the
double syntax does not add anything else. I would have preferred the
second syntax, more generic at the extra cost of writing return and
; in the short versions.
To be really useful, lambdas can take variables from the scope in
which they are defined and form a closure. Being consistent with
inner classes, lambdas are restricted to capturing 'effectively
final' variables. Consistency with the previous features of the
language is a nice feature, but for sweetness, it would be nice to be
able to capture variables that can be reassigned. For that purpose,
they are considering that variables present in the context and
annotated with #Shared will be captured by-reference, allowing
assignments. To me this seems weird as how a lambda can use a variable
is determined at the place of declaration of the variable rather than
where the lambda is defined. A single variable could be used in more
than one lambda and this forces the same behavior in all of them.
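The capture rule described above can be sketched with the lambdas that later shipped in Java 8 (not the # and #Shared syntax under discussion here). A local that is never reassigned can be captured directly; shared, mutable state needs an explicit holder. Using AtomicInteger as that holder is an assumption of this example, not something the proposal prescribes, and all names are made up.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

class CaptureDemo {
    // Captures an effectively final local: the lambda sees a fixed value.
    static IntSupplier fixed() {
        int base = 10; // never reassigned, so it may be captured
        return () -> base + 1;
    }

    // Two lambdas share one mutable holder, roughly what by-reference capture would give.
    static int sharedCounter() {
        AtomicInteger counter = new AtomicInteger();
        Runnable increment = () -> counter.incrementAndGet();
        IntSupplier read = () -> counter.get();
        increment.run();
        increment.run();
        return read.getAsInt();
    }

    public static void main(String[] args) {
        System.out.println(fixed().getAsInt()); // 11
        System.out.println(sharedCounter());    // 2
    }
}
```

With the holder, the sharing is chosen where the lambdas are written, which is the alternative to deciding it at the variable's declaration.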
Lambdas try to simulate actual function objects, but the proposal does
not get completely there: to keep the parser simple (up to now an
identifier denotes either an object or a method, and that has been kept
consistent), calling a lambda requires using a ! after the lambda
name: #(int x)(x*x)!(5) will return 25. This brings a new syntax
for lambdas that differs from the rest of the language, where
! stands somehow as a synonym for .execute on a virtual generic
interface Lambda<Result,Args...>. But why not make it complete?
A new generic (virtual) interface Lambda could be created. It would
have to be virtual as the interface is not a real interface, but a
family of such: Lambda<Return>, Lambda<Return,Arg1>,
Lambda<Return,Arg1,Arg2>... They could define a single execution
method, which I would like to be like C++ operator(), but if that is
a burden then any other name would be fine, embracing the ! as a
shortcut for the method execution:
interface Lambda<R> {
R exec();
}
interface Lambda<R,A> {
R exec( A a );
}
Then the compiler need only translate identifier!(args) to
identifier.exec( args ), which is simple. The translation of the
lambda syntax would require the compiler to identify the proper
interface being implemented and could be matched as:
#( int x )(x *x)
// translated to
new Lambda<int,int>{ int exec( int x ) { return x*x; } }
This would also allow users to define Inner classes that can be used
as lambdas, in more complex situations. For example, if a lambda
function needed to capture a variable annotated as #Shared in a
read-only manner, or maintain the state of the captured object at the
place of capture, manual implementation of the Lambda would be
available:
new Lambda<int,int>{ int value = context_value;
int exec( int x ) { return x * value; }
};
In a manner similar to what the current Inner classes definition is,
and thus being natural to current Java users. This could be used,
for example, in a loop to generate multiplier lambdas:
Lambda<int,int>[] array = new Lambda<int,int>[10];
for (int i = 0; i < 10; ++i ) {
array[i] = new Lambda<int,int>{ final int multiplier = i;
int exec( int x ) { return x * multiplier; }
};
}
// note this is disallowed in the current proposal, as `i` is
// not effectively final and as such cannot be 'captured'. Also
// if `i` was marked #Shared, then all the lambdas would share
// the same `i` as the loop and thus would produce the same
// result: multiply by 10 --probably quite unexpectedly.
//
// I am aware that this can be rewritten as:
// for (int ii = 0; ii < 10; ++ii ) { final int i = ii; ...
//
// but that is not simplifying the system, just pushing the
// complexity outside of the lambda.
This would allow usage of lambdas and methods that accept lambdas both
with the new simple syntax: #(int x){ return x*x; } or with the more
complex manual approach for specific cases where the sugar coating
interferes with the intended semantics.
Overall, I believe that the lambda proposal can be improved in
different directions, that the way it adds syntactic sugar is a
leaky abstraction (you have to deal externally with issues that are
particular to the lambda) and that by not providing a lower level
interface it makes user code less readable in use cases that do not
perfectly fit the simple use case.
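The interface-based design argued for above can be approximated in Java as it stands. In this sketch Lambda1 and MultiplierDemo are hypothetical names: Java cannot overload a generic interface by arity, so each member of the Lambda family would need its own name, and Integer replaces int because primitives cannot be type arguments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical one-argument member of the "Lambda" family.
interface Lambda1<R, A> {
    R exec(A a);
}

class MultiplierDemo {
    // Builds ten multipliers; each captures its own copy of the loop index.
    static List<Lambda1<Integer, Integer>> multipliers() {
        List<Lambda1<Integer, Integer>> result = new ArrayList<>();
        for (int i = 0; i < 10; ++i) {
            final int multiplier = i; // fresh final copy per iteration
            result.add(new Lambda1<Integer, Integer>() {
                @Override
                public Integer exec(Integer x) {
                    return x * multiplier;
                }
            });
        }
        return result;
    }

    public static void main(String[] args) {
        List<Lambda1<Integer, Integer>> ms = multipliers();
        System.out.println(ms.get(3).exec(5)); // 15
        System.out.println(ms.get(9).exec(5)); // 45
    }
}
```

The per-iteration final copy is the same workaround mentioned in the comments of the original loop; it keeps each multiplier independent at the cost of an extra local.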
Modulo some scope-disambiguation constructs, almost all of these methods follow from the actual definition of a lambda abstraction:
λx.E
To answer your questions in order:
I don't think there are any particular things that make the proposals by the Java community better or worse than anything else. As I said, it follows from the mathematical definition, and therefore all faithful implementations are going to have almost exactly the same form.
Anonymous first-class functions bolted onto imperative languages tend to end up as a feature that some programmers love and use frequently, and that others ignore completely - therefore it is probably a sensible choice to give it some syntax that will not confuse the kinds of people who choose to ignore the presence of this particular language feature. I think hiding the complexity and particulars of implementation is what they have attempted to do by using syntax that blends well with Java, but which has no real connotation for Java programmers.
It's probably desirable for them to use some bits of syntax that are not going to complicate existing definitions, and so they are slightly constrained in the symbols they can choose to use as operators and such. Certainly Java's insistence on remaining backwards-compatible limits the language evolution slightly, but I don't think this is necessarily a bad thing. The PHP approach is at the other end of the spectrum (i.e. "let's break everything every time there is a new point release!"). I don't think that Java's evolution is inherently limited except by some of the fundamental tenets of its design - e.g. adherence to OOP principles, VM-based.
I think it's very difficult to make strong statements about language evolution from Java's perspective. It is in a reasonably unique position. For one, it's very, very popular, but it's relatively old. Microsoft had the benefit of at least 10 years worth of Java legacy before they decided to even start designing a language called "C#". The C programming language basically stopped evolving at all. C++ has had few significant changes that found any mainstream acceptance. Java has continued to evolve through a slow but consistent process - if anything I think it is better-equipped to keep on evolving than any other languages with similarly huge installed code bases.
It's not much more complicated than lambda expressions in other languages.
Consider...
int square(int x) {
return x*x;
}
Java:
#(x){x*x}
Python:
lambda x:x*x
C#:
x => x*x
I think the C# approach is slightly more intuitive. Personally I would prefer...
x#x*x
Maybe this is not really an answer to your question, but this may be comparable to the way Objective-C (which of course has a very narrow user base in contrast to Java) was extended by blocks (examples). While the syntax does not fit the rest of the language (IMHO), it is a useful addition, and the added complexity in terms of language features is rewarded, for example, with lower complexity of concurrent programming (simple things like concurrent iteration over an array or complicated techniques like Grand Central Dispatch).
In addition, many common tasks are simpler when using blocks, for example making one object a delegate (or - in Java lingo - "listener") for multiple instances of the same class. In Java, anonymous classes can already be used for that cause, so programmers know the concept and can just spare a few lines of source code using lambda expressions.
In Objective-C (or the Cocoa/Cocoa Touch frameworks), new functionality is now often only accessible using blocks, and it seems like programmers are adopting it quickly (given that they have to give up backwards compatibility with old OS versions).
This is really, really close to the lambda functions proposed for the new generation of C++ (C++0x),
so I think the Oracle guys have looked at other implementations before cooking up their own.
http://en.wikipedia.org/wiki/C%2B%2B0x
[](int x, int y) { return x + y; }
