Java 8 lambda expression and first-class values

Are Java 8 closures really first-class values or are they only a syntactic sugar?

I would say that Java 8 closures ("Lambdas") are neither mere syntactic sugar nor are they first-class values.
I've addressed the issue of syntactic sugar in an answer to another StackExchange question.
As for whether lambdas are "first class" it really depends on your definition, but I'll make a case that lambdas aren't really first class.
In some sense a lambda wants to be a function, but Java 8 is not adding function types. Instead, a lambda expression is converted into an instance of a functional interface. This has allowed lambdas to be added to Java 8 with only minor changes to Java's type system. After conversion, the result is a reference just like that of any other reference type. In fact, using a Lambda -- for example, in a method that was passed a lambda expression as parameter -- is indistinguishable from calling a method through an interface. A method that receives a parameter of a functional interface type can't tell whether it was passed a lambda expression or an instance of some class that happens to implement that functional interface.
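To make this concrete, here is a minimal sketch (the Greeter interface and all names are hypothetical): a method that accepts the functional interface cannot tell how its argument was produced.
// A functional interface: exactly one abstract method.
interface Greeter {
    String greet(String name);
}

class GreeterDemo {
    // This method cannot tell whether it was handed a lambda or a class instance.
    static void use(Greeter g) {
        System.out.println(g.greet("world"));
    }

    public static void main(String[] args) {
        use(name -> "hello " + name);        // a lambda expression
        use(new Greeter() {                  // an anonymous inner class
            public String greet(String name) { return "hello " + name; }
        });
    }
}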
For more information about whether lambdas are objects, see the Lambda FAQ Answer to this question.
Given that lambdas are converted into objects, they inherit (literally) all the characteristics of objects. In particular, objects:
have various methods like equals, getClass, hashCode, notify, toString, and wait
have an identity hash code
can be locked by a synchronized block
can be compared using the == and != and instanceof operators
and so forth. In fact, all of these are irrelevant to the intended usage of lambdas. Their behavior is essentially undefined. You can write a program that uses any of these, and you will get some result, but the result may differ from release to release (or even run to run!).
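For instance, whether two evaluations of the same lambda expression produce the same object is deliberately unspecified. A minimal sketch (names hypothetical; the printed result may vary across JDK releases):
import java.util.function.Supplier;

class IdentityDemo {
    static Supplier<String> make() {
        // Whether this allocates a fresh object on every call is unspecified.
        return () -> "hi";
    }

    public static void main(String[] args) {
        // May print true (the runtime is free to cache a non-capturing lambda) or false.
        System.out.println(make() == make());
    }
}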
Restating this more concisely: in Java, objects have identity, but values (particularly function values, if they were to exist) should not have any notion of identity. Java 8 does not have function types. Instead, lambda expressions are converted to objects, so they have a lot of baggage that's irrelevant to functions, particularly identity. That doesn't seem like "first class" to me.
Update 2013-10-24
I've been thinking further on this topic since having posted my answer several months ago. From a technical standpoint everything I wrote above is correct. The conclusion is probably expressed more precisely as Java 8 lambdas not being pure (as opposed to first-class) values, because they carry a lot of object baggage along. However, just because they're impure doesn't mean they aren't first-class. Consider the Wikipedia definition of first-class function. Briefly, the criteria listed there for considering functions first-class are the abilities to:
pass functions as arguments to other functions
return functions from other functions
assign functions to variables
store functions in data structures
have functions be anonymous
Java 8 lambdas meet all of these criteria. So that does make them seem first-class.
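Here is a compact sketch of those criteria using java.util.function.Function (names are illustrative, not canonical):
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class FirstClassDemo {
    // Takes a function as an argument and returns a function as its result.
    static Function<Integer, Integer> twice(Function<Integer, Integer> f) {
        return f.andThen(f);
    }

    public static void main(String[] args) {
        Function<Integer, Integer> inc = x -> x + 1;      // anonymous, assigned to a variable
        Map<String, Function<Integer, Integer>> m = new HashMap<>();
        m.put("inc2", twice(inc));                        // stored in a data structure
        System.out.println(m.get("inc2").apply(40));      // prints 42
    }
}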
The article also mentions that function names should have no special status; instead, a function's name is simply a variable whose type is a function type. Java 8 lambdas do not meet this last criterion. Java 8 doesn't have function types; it has functional interfaces. These are used effectively like function types, but they aren't function types at all. If you have a reference whose type is a functional interface, you have no idea whether it's a lambda, an instance of an anonymous inner class, or an instance of a concrete class that happens to implement that interface.
In summary, Java 8 lambdas are more first-class functions than I had originally thought. They just aren't pure first-class functions.

Yes, they are first class values (or will be, once Java 8 is released...)
In the sense that you can pass them as arguments, compose them to make higher-order functions, store them in data structures, etc. You will be able to use them for a broad range of functional programming techniques.
See also for a bit more definition of what "first class" means in this context:
http://en.wikipedia.org/wiki/First-class_citizen

As I see it, it is syntactic sugar, but combined with type inference, the new java.util.function package, and the semantics of inner classes, it does appear as a first-class value.

A real closure, with variable bindings to the outside context, has some overhead. I would consider Java 8's implementation close to optimal and sufficiently pure.
It is not merely syntactic sugar, at least, and I don't know of any more efficient implementation.

For me, lambdas in Java 8 are just syntax sugar, because you cannot use them as first-class citizens (http://en.wikipedia.org/wiki/First-class_function); each function has to be wrapped into an object, which imposes many limitations compared to a language with pure first-class functions such as Scala. Also, Java 8 closures can only capture immutable ("effectively final") non-local variables.
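To illustrate the capture restriction, a minimal sketch (names hypothetical); uncommenting the increment makes the captured variable no longer effectively final, and the lambda stops compiling:
import java.util.function.Supplier;

class CaptureDemo {
    public static void main(String[] args) {
        int counter = 0;
        // counter++;  // uncommenting this makes counter no longer effectively final,
        // and the lambda below would then fail to compile
        Supplier<Integer> s = () -> counter;  // legal: counter is effectively final
        System.out.println(s.get());
    }
}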
Here is a better explanation of why it is syntax sugar: Java Lambdas and Closures

Related

Java 8 dancing around functions as first class citizens? [duplicate]

So the functional programmer in me likes languages like Python that treat functions as first-class citizens. It looks like Java 8 caved to the pressure and "sort of" implemented things like lambda expressions and method references.
My question is: is Java on its way to having first-class functions, or is this really just syntactic sugar to reduce the amount of code it takes to implement anonymous classes/interfaces such as Runnable? (My gut says the latter.)
My ideal scenario
Map<String,DoubleToDoubleFunc> mymap = new HashMap<String,DoubleToDoubleFunc>();
...
mymap.put("Func 1", (double a, double b) -> a + b);
mymap.put("Func 2", Math::pow);
...
w = mymap.get("Func 1")(y,z);
x = mymap.get("Func 2")(y,z);
There is no structural function type in Java 8. But Java has always had first-class objects, and in fact they have been used as an alternative. Historically, due to the lack of first-class functions, what we have done so far is to wrap the function inside an object. These are the famous SAM types (single abstract method types).
So, instead of
function run() {}
Thread t = new Thread(run);
We do
Runnable run = new Runnable(){ public void run(){} };
Thread t = new Thread(run);
That is, we put the function inside an object in order to be able to pass it around as a value. So, first class objects have been an alternative solution for a long time.
JDK 8 simply makes implementing this concept simpler: these wrapper interfaces are called "functional interfaces", and some syntactic sugar is offered to implement the wrapper objects.
Runnable run = () -> {};
Thread t = new Thread(run);
But ultimately, we are still using first-class objects. And they have similar properties to first-class functions. They encapsulate behavior, they can be passed as arguments and be returned as values.
In the lambda mailing list Brian Goetz gave a good explanation of some of the reasons that motivated this design.
Along the lines that we've been discussing today, here's a peek at
where we're heading. We explored the road of "maybe lambdas should
just be inner class instances, that would be really simple", but
eventually came to the position of "functions are a better direction
for the future of the language".
This exploration played out in stages: first internally before the EG
was formed, and then again when the EG discussed the issues. The
following is my position on the issue. Hopefully, this fills in some
of the gaps between what the spec currently says and what we say
about it.
The issues that have been raised about whether lambdas are objects or
not largely come down to philosophical questions like "what are
lambdas really", "why will Java benefit from lambdas", and ultimately
"how best to evolve the Java language and platform."
Oracle's position is that Java must evolve -- carefully, of course --
in order to remain competitive. This is, of course, a difficult
balancing act.
It is my belief that the best direction for evolving Java is to
encourage a more functional style of programming. The role of Lambda
is primarily to support the development and consumption of more
functional-like libraries; I've offered examples such as
filter-map-reduce to illustrate this direction.
There is plenty of evidence in the ecosystem to support the hypothesis
that, if given the tools to do so easily, object-oriented programmers
are ready to embrace functional techniques (such as immutability) and
work them into an object-oriented view of the world, and will write
better, less error-prone code as a result. Simply put, we believe the
best thing we can do for Java developers is to give them a gentle push
towards a more functional style of programming. We're not going to
turn Java into Haskell, nor even into Scala. But the direction is
clear.
Lambda is the down-payment on this evolution, but it is far from the
end of the story. The rest of the story isn't written yet, but
preserving our options are a key aspect of how we evaluate the
decisions we make here.
This is why I've been so dogged in my insistence that lambdas are not
objects. I believe the "lambdas are just objects" position, while
very comfortable and tempting, slams the door on a number of
potentially useful directions for language evolution.
As a single example, let's take function types. The lambda strawman
offered at devoxx had function types. I insisted we remove them, and
this made me unpopular. But my objection to function types was not
that I don't like function types -- I love function types -- but that
function types fought badly with an existing aspect of the Java type
system, erasure. Erased function types are the worst of both worlds.
So we removed this from the design.
But I am unwilling to say "Java never will have function types"
(though I recognize that Java may never have function types.) I
believe that in order to get to function types, we have to first deal
with erasure. That may, or may not be possible. But in a world of
reified structural types, function types start to make a lot more
sense.
The lambdas-are-objects view of the world conflicts with this possible
future. The lambdas-are-functions view of the world does not, and
preserving this flexibility is one of the points in favor of not
burdening lambdas with even the appearance of object-ness.
You might think that the current design is tightly tied to an object
box for lambdas -- SAM types -- making them effectively objects
anyway. But this has been carefully hidden from the surface area so
as to allow us to consider 'naked' lambdas in the future, or consider
other conversion contexts for lambdas, or integrate lambdas more
tightly into control constructs. We're not doing that now, and we
don't even have a concrete plan for doing so, but the ability to
possibly do that in the future is a critical part of the design.
I am optimistic about Java's future, but to move forward we sometimes
have to let go of some comfortable ideas. Lambdas-are-functions opens
doors. Lambdas-are-objects closes them. We prefer to see those doors
left open.
They are not really first-class functions, IMO. A lambda is ultimately an instance of a functional interface, but it does give you the advantages of a first-class function. You can pass it as an argument to a method that expects an instance of a functional interface. You can assign it to a variable, in which case its type will be inferred using target typing. You can return it from a method, and so on.
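Target typing can be seen by giving the same lambda text two different functional interface types; a small sketch (names hypothetical):
import java.util.concurrent.Callable;
import java.util.function.Supplier;

class TargetTypingDemo {
    public static void main(String[] args) throws Exception {
        Supplier<String> s = () -> "hello";  // the context types this as Supplier<String>
        Callable<String> c = () -> "hello";  // the same text, now typed as Callable<String>
        System.out.println(s.get());
        System.out.println(c.call());
    }
}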
Also, lambdas don't completely replace anonymous classes; not every anonymous class can be converted to a lambda. You can go through this answer for a nice explanation of the difference between the two. So no, lambdas are not syntactic sugar for anonymous classes.
It's not syntactic sugar for anonymous classes; what it compiles to is somewhat more involved. (Not that it would be wrong to do that; any language will have its own implementation detail for how lambdas are compiled, and Java's is no worse than many others.)
See http://cr.openjdk.java.net/~briangoetz/lambda/lambda-translation.html for the full, nitty-gritty details.
import java.util.HashMap;
import java.util.Map;

interface DoubleDoubleToDoubleFunc {
    double f(double x, double y);
}

public class FuncMapDemo {
    public static void main(String[] args) {
        Map<String, DoubleDoubleToDoubleFunc> mymap = new HashMap<>();
        mymap.put("Func 1", (double a, double b) -> a + b);
        mymap.put("Func 2", Math::pow);
        double y = 2.0;
        double z = 3.0;
        double w = mymap.get("Func 1").f(y, z);  // 5.0
        double v = mymap.get("Func 2").f(y, z);  // 8.0
    }
}
So it is still syntactic sugar, but only to some degree: you get close to the ideal scenario, except that the function must still be invoked through the interface method (.f(y, z)) rather than applied directly.

Java Lambdas and Closures

I hear lambdas are coming soon to a Java near you (J8). I found an example of what they will look like on some blog:
SoccerService soccerService = (teamA, teamB) -> {
    SoccerResult result = null;
    if (teamA == teamB) {
        result = SoccerResult.DRAW;
    } else if (teamA < teamB) {
        result = SoccerResult.LOST;
    } else {
        result = SoccerResult.WON;
    }
    return result;
};
So right off the bat:
Where are teamA and teamB typed? Or aren't they (like some weird form of generics)?
Is a lambda a type of closure, or is it the other way around?
What benefits will this give me over a typical anonymous function?
The lambda expression is just syntactic sugar for implementing a target interface; that is, you are implementing a particular method of the interface through the lambda expression. The compiler can infer the types of the parameters from the interface, which is why you do not need to define them explicitly in the lambda expression.
For instance:
Comparator<String> c = (s1, s2) -> s1.compareToIgnoreCase(s2);
In this expression, the lambda evidently implements a Comparator of strings; therefore, the lambda expression is syntactic sugar for implementing compare(String, String).
Thus, the compiler can safely assume the type of s1 and s2 is String.
Your target interface type provides all the information the compiler needs to determine what the actual types of the lambda parameters are.
Brian Goetz, Java Language Architect at Oracle Corporation, has published a couple of articles on the work in progress on JDK 8 lambdas. I believe the answers to your questions are there:
State of Lambda.
State of Lambda Libraries Edition.
Translation of Lambda Expressions
JVMLS 2012: Implementing Lambda Expressions in Java
The last two links explain how lambda expressions are implemented at the bytecode level and may help you delve into the details.
See this page for a full version of that example (however, the relevant parts are shown below).
The types are inferred from the SoccerService interface and SoccerResult enum, not shown in your snippet:
enum SoccerResult {
    WON, LOST, DRAW
}

interface SoccerService {
    SoccerResult getSoccerResult(Integer teamA, Integer teamB);
}
The benefit (of lambdas versus standard anonymous classes) is just reduced verbosity:
(x, y) -> x + y
versus something like:
new Adder() {
    public int add(int x, int y) {
        return x + y;
    }
}
For the difference between a closure and a lambda, see this question.
Where are teamA and teamB typed? Or aren't they (like some weird form of generics)?
Lambdas use target typing, much like generic method calls (since 1.5) and the diamond [not an] operator (since 1.7). Roughly: where the type that the result is applied to is stated (or can be inferred), that type supplies the Single Abstract Method (SAM) base type, and hence the method parameter types.
As an example of generic method inference in 1.5:
Set<Team> noTeams = Collections.emptySet();
And diamond operator in 1.7:
Set<Team> aTeams = new HashSet<>();
Team, team, team, team, team, team. I even love saying the word team.
Is a lambda a type of closure, or is it the other way around?
A lambda is a limited form of closure in almost exactly the same way as anonymous inner classes, but with some random differences to catch you out:
The outer this is not hidden by an inner this. This means that the same text in a lambda and an anonymous inner class can mean subtly but completely different things (see the sketch after this list). That should keep Stack Overflow busy with odd questions.
To make up for the lack of an inner this, if the lambda is assigned directly to a local variable, then that variable is accessible within the lambda. IIRC (I could check, but won't), in an anonymous inner class the local would be in scope and hide variables in an outer scope, but you couldn't use it. I believe the lack of an instance initialiser makes this much easier to specify.
Local variables that do not happen to be marked final but could be are treated as if they were final ("effectively final"). So not only are they in scope, but you can actually read (though not write) them.
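A small sketch of the this difference (class names hypothetical):
class ThisDemo {
    Runnable lambda = () -> System.out.println(this.getClass().getName());
    Runnable anon = new Runnable() {
        public void run() {
            // Here 'this' is the anonymous Runnable instance, not ThisDemo.
            System.out.println(this.getClass().getName());
        }
    };

    public static void main(String[] args) {
        ThisDemo d = new ThisDemo();
        d.lambda.run();  // prints ThisDemo: the lambda's 'this' is the enclosing instance
        d.anon.run();    // prints something like ThisDemo$1
    }
}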
What benefits will this give me over a typical anonymous function?
A briefer syntax. That's about it.
Of course the rest of the Java syntax is just as hopelessly bad as ever.
I don't believe this is in the initial implementation, but instead of being compiled to [inner] classes, lambdas can use method handles. The performance of method handles has fallen somewhat short of earlier predictions. Doing away with classes should reduce bytecode footprint, possibly runtime footprint, and certainly class loading time. A similar implementation could let most anonymous inner classes (those that are not Serializable and have a trivial static initialiser) skip the poorly conceived class loading mechanism without any particularly noticeable incompatibilities.
(Hope I've got the terminology wrt hiding correct.)

Why does erasure complicate implementing function types?

I read from an interview with Neal Gafter:
"For example, adding function types to the programming language is much more difficult with Erasure as part of Generics."
EDIT:
Another place where I've met a similar statement was in Brian Goetz's message on the lambda-dev mailing list, where he says that lambdas are easier to handle when they are just anonymous classes with syntactic sugar:
But my objection to function types was not that I don't like function types -- I love function types -- but that function types fought badly with an existing aspect of the Java type system, erasure. Erased function types are the worst of both worlds. So we removed this from the design.
Can anyone explain these statements? Why would I need runtime type information with lambdas?
The way I understand it is that they decided that, thanks to erasure, it would be messy to go the way of "function types" (e.g. delegates in C#), and so they could only use lambda expressions, which are just a simplification of the single-abstract-method class syntax.
Delegates in C#:
public delegate void DoSomethingDelegate(Object param1, Object param2);
...
//now assign some method to the function type variable (delegate)
DoSomethingDelegate f = DoSomething;
f(new Object(), new Object());
(another sample here
http://geekswithblogs.net/joycsharp/archive/2008/02/15/simple-c-delegate-sample.aspx)
One argument they put forward in Project Lambda docs:
Generic types are erased, which would expose additional places where developers are exposed to erasure. For example, it would not be possible to overload methods m(T->U) and m(X->Y), which would be confusing.
section 2 in:
http://cr.openjdk.java.net/~briangoetz/lambda/lambda-state-3.html
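The clash described there can already be observed with today's generic functional interfaces, which is roughly what erased function types would reduce to. A sketch (the two methods are commented out because declaring both is a compile error):
import java.util.function.Function;

class ErasureClash {
    // void m(Function<String, Integer> f) { }
    // void m(Function<Double, Double> f) { }
    // Uncommenting both fails to compile: after erasure both signatures are m(Function),
    // just as erased m(T->U) and m(X->Y) would collide.
}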
(The final lambda expression syntax will be a bit different from the above document:
http://mail.openjdk.java.net/pipermail/lambda-dev/2011-September/003936.html)
(x, y) -> { System.out.printf("%d + %d = %d%n", x, y, x+y); }
All in all, my best understanding is that only part of the syntax that could have been used actually will be.
What Neal Gafter most likely meant was that not being able to use delegates makes standard APIs more difficult to adjust to a functional style, rather than that the javac/JVM changes would be harder to implement.
If someone understands this better than me, I will be happy to read their account.
Goetz expands on the reasoning in State of the Lambda 4th ed.:
An alternative (or complementary) approach to function types, suggested by some early proposals, would have been to introduce a new, structural function type. A type like "function from a String and an Object to an int" might be expressed as (String,Object)->int. This idea was considered and rejected, at least for now, due to several disadvantages:
It would add complexity to the type system and further mix structural and nominal types.
It would lead to a divergence of library styles—some libraries would continue to use callback interfaces, while others would use structural function types.
The syntax could be unwieldy, especially when checked exceptions were included.
It is unlikely that there would be a runtime representation for each distinct function type, meaning developers would be further exposed to and limited by erasure. For example, it would not be possible (perhaps surprisingly) to overload methods m(T->U) and m(X->Y).
So, we have instead chosen to take the path of "use what you know"—since existing libraries use functional interfaces extensively, we codify and leverage this pattern.
To illustrate, here are some of the functional interfaces in Java SE 7 that are well-suited for being used with the new language features; the examples that follow illustrate the use of a few of them.
java.lang.Runnable
java.util.concurrent.Callable
java.util.Comparator
java.beans.PropertyChangeListener
java.awt.event.ActionListener
javax.swing.event.ChangeListener
...
Note that erasure is just one of the considerations. In general, the Java lambda approach goes in a different direction from Scala, not just on the typed question. It's very Java-centric.
Maybe because what you'd really want would be a type Function<R, P...>, which is parameterised with a return type and some sequence of parameter types. But because of erasure, you can't have a construct like P..., because it could only turn into Object[], which is too loose to be much use at runtime.
This is pure speculation. I am not a type theorist; I haven't even played one on TV.
I think what he means in that statement is that at runtime Java cannot tell the difference between these two function definitions:
void doIt(List<String> strings) {...}
void doIt(List<Integer> ints) {...}
Because at compile time, the information about what type of data the List contains is erased, so the runtime environment wouldn't be able to determine which function you wanted to call.
Trying to compile both of these methods in the same class will produce the following compile-time error:
doIt(List<String>) clashes with doIt(List<Integer>); both methods have the same erasure

Parametric polymorphism vs Ad-hoc polymorphism

I would like to understand the key difference between parametric polymorphism, such as the polymorphism of generic classes/functions in Java/Scala/C++, and "ad-hoc" polymorphism in the Haskell type system. I'm familiar with the first kind of language, but I have never worked with Haskell.
More precisely:
How is the type inference algorithm in, e.g., Java different from the type inference in Haskell?
Please give me an example of a situation where something can be written in Java/Scala but cannot be written in Haskell (taking into account the modular features of these platforms too), and vice versa.
Thanks in advance.
As per the TAPL, §23.2:
Parametric polymorphism (...), allows a single piece of code to be typed "generically," using variables in place of actual types, and then instantiated with particular types as needed. Parametric definitions are uniform: all of their instances behave the same. (...)
Ad-hoc polymorphism, by contrast, allows a polymorphic value to exhibit different behaviors when "viewed" at different types. The most common example of ad-hoc polymorphism is overloading, which associates a single function symbol with many implementations; the compiler (or the runtime system, depending on whether overloading resolution is static or dynamic) chooses an appropriate implementation for each application of the function, based on the types of the arguments.
So, if you consider successive stages of history, non-generic official Java (i.e. pre-J2SE 5.0, before September 2004) had ad-hoc polymorphism — so you could overload a method — but not parametric polymorphism, so you couldn't write a generic method. Afterwards you could do both, of course.
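In Java terms, the two kinds of polymorphism look like this (a minimal sketch with hypothetical names):
class PolyDemo {
    // Ad-hoc polymorphism: one name, a distinct implementation per argument type.
    static String describe(int x)    { return "int: " + x; }
    static String describe(double x) { return "double: " + x; }

    // Parametric polymorphism: one uniform implementation for every type T.
    static <T> T identity(T x) { return x; }

    public static void main(String[] args) {
        System.out.println(describe(1));    // resolves to the int overload
        System.out.println(describe(1.0));  // resolves to the double overload
        System.out.println(identity("the same code runs for any T"));
    }
}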
By comparison, since its very beginning in 1990, Haskell was parametrically polymorphic, meaning you could write:
swap :: (a, b) -> (b, a)
swap (x, y) = (y, x)
where a and b are type variables that can be instantiated to any type, with no assumptions.
But there was no preexisting construct giving ad-hoc polymorphism, which is intended to let you write functions that apply to several, but not all, types. Type classes were implemented as a way of achieving this goal.
They let you describe a class (something akin to a Java interface), giving the type signatures of the functions you want implemented for your generic type. Then you can register some (and hopefully several) instances matching this class. In the meantime, you can write a generic method such as:
between :: (Ord a) => a -> a -> a -> Bool
between x y z = x <= y && y <= z
where Ord is the class that defines the comparison function (<=). When used, (between "abc" "d" "ghi") is resolved statically to select the right instance for strings (rather than, e.g., integers), exactly at the moment when (Java's) method overloading would be resolved.
You can do something similar in Java with bounded wildcards. But the key difference between Haskell and Java on that front is that only Haskell can do dictionary passing automatically: in both languages, given two instances of Ord T, say b0 and b1, you can build a function f that takes those as arguments and produces the instance for the pair type (b0, b1), using, say, the lexicographic order. Say now that you are given (("hello", 2), ((3, "hi"), 5)). In Java you have to remember the instances for string and int, and pass the correct instance (made of four applications of f!) in order to apply between to that object. Haskell can apply compositionality, and figure out how to build the correct instance given just the ground instances and the f constructor (this extends to other constructors, of course).
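The Java side of that contrast can be sketched with Comparator standing in for an Ord dictionary; Pair and pairComparator below are hypothetical helpers, and the point is that the composition must be spelled out by hand at each use site:
import java.util.Comparator;

class DictionaryDemo {
    // A minimal pair type, just for illustration.
    static class Pair<A, B> {
        final A fst; final B snd;
        Pair(A fst, B snd) { this.fst = fst; this.snd = snd; }
    }

    // The analogue of f in the text: build the "instance" for the pair type
    // out of the instances for its components, using lexicographic order.
    static <A, B> Comparator<Pair<A, B>> pairComparator(Comparator<A> ca, Comparator<B> cb) {
        return Comparator.<Pair<A, B>, A>comparing(p -> p.fst, ca)
                         .thenComparing(p -> p.snd, cb);
    }

    public static void main(String[] args) {
        // The programmer composes the dictionaries manually; in Haskell the
        // compiler derives this composition from the ground Ord instances.
        Comparator<Pair<String, Integer>> c =
                pairComparator(Comparator.<String>naturalOrder(), Comparator.<Integer>naturalOrder());
        System.out.println(c.compare(new Pair<>("hello", 2), new Pair<>("hello", 3)) < 0);  // true
    }
}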
Now, as far as type inference goes (and this should probably be a distinct question), for both languages it is incomplete, in the sense that you can always write an un-annotated program for which the compiler won't be able to determine the type.
for Haskell, this is because it has impredicative (a.k.a. first-class) polymorphism, for which type inference is undecidable. Note that on that point, Java is limited to first-order polymorphism (something on which Scala expands).
for Java, this is because it supports contravariant subtyping.
But those languages mainly differ in the range of program statements to which type inference applies in practice, and in the importance given to the correctness of the type inference results.
For Haskell, inference applies to all "non-highly polymorphic" terms, and makes a serious effort to return sound results based on published extensions of a well-known algorithm:
At its core, Haskell's inference is based on Hindley-Milner, which gives you complete results as long as, when inferring the type of an application, type variables (e.g. the a and b in the example above) are instantiated only with non-polymorphic types (I'm simplifying, but this is essentially the ML-style polymorphism you can find in e.g. OCaml).
a recent GHC will make sure that a type annotation may be required only for a let-binding or λ-abstraction that has a non-Damas-Milner type.
Haskell has tried to stay relatively close to this inferrable core across even its most hairy extensions (e.g. GADTs). At any rate, proposed extensions nearly always come in a paper with a proof of the correctness of the extended type inference.
For Java, type inference applies in a much more limited fashion anyway:
Prior to the release of Java 5, there was no type inference in Java. According to the Java language culture, the type of every variable, method, and dynamically allocated object must be explicitly declared by the programmer. When generics (classes and methods parameterized by type) were introduced in Java 5, the language retained this requirement for variables, methods, and allocations. But the introduction of polymorphic methods (parameterized by type) dictated that either (i) the programmer provide the method type arguments at every polymorphic method call site or (ii) the language support the inference of method type arguments. To avoid creating an additional clerical burden for programmers, the designers of Java 5 elected to perform type inference to determine the type arguments for polymorphic method calls. (source, emphasis mine)
The inference algorithm is essentially that of GJ, but with a somewhat kludgy addition of wildcards as an afterthought (Note that I am not up to date on the possible corrections made in J2SE 6.0, though). The large conceptual difference in approach is that Java's inference is local, in the sense that the inferred type of an expression depends only on constraints generated from the type system and on the types of its sub-expressions, but not on the context.
Note that the party line regarding the incomplete & sometimes incorrect type inference is relatively laid back. As per the spec:
Note also that type inference does not affect soundness in any way. If the types inferred are nonsensical, the invocation will yield a type error. The type inference algorithm should be viewed as a heuristic, designed to perform well in practice. If it fails to infer the desired result, explicit type parameters may be used instead.
Parametric polymorphism means we don't care about the type: we implement the function the same way for any type. For example, in Haskell:
length :: [a] -> Int
length [] = 0
length (x:xs) = 1 + length xs
We don't care what the type of the elements of the list are, we just care how many there are.
Ad-hoc polymorphism (aka method overloading), however, means that we'll use a different implementation depending on the type of the parameter.
Here's an example in Haskell. Let's say we want to define a function called makeBreakfast.
If the input parameter is Eggs, I want makeBreakfast to return a message on how to make eggs.
If the input parameter is Toast, I want makeBreakfast to return a message on how to make toast.
We'll create a typeclass called BreakfastFood that declares the makeBreakfast function. The implementation of makeBreakfast will be different depending on the type of the input to makeBreakfast.
data Eggs = Eggs
data Toast = Toast

class BreakfastFood food where
    makeBreakfast :: food -> String

instance BreakfastFood Eggs where
    makeBreakfast _ = "First crack 'em, then fry 'em"

instance BreakfastFood Toast where
    makeBreakfast _ = "Put bread in the toaster until brown"
According to John Mitchell's Concepts in Programming Languages,
The key difference between parametric polymorphism and overloading (aka ad-hoc polymorphism) is that parametric polymorphic functions use one algorithm to operate on arguments of many different types, whereas overloaded functions may use a different algorithm for each type of argument.
A complete discussion of what parametric polymorphism and ad-hoc polymorphism mean and to what extent they're available in Haskell and in Java is longish; however, your concrete questions can be tackled much more simply:
How is the type inference algorithm in, e.g., Java different from the type inference in Haskell?
As far as I know, Java does hardly any type inference (it infers only generic method type arguments), so the main difference is that Haskell does it pervasively.
Please give me an example of a situation where something can be written in Java/Scala but cannot be written in Haskell (taking into account the modular features of these platforms too), and vice versa.
One very simple example of something Haskell can do that Java can't is to define maxBound :: Bounded a => a. I don't know enough Java to point out something it can do that Haskell can't.

Can we take advantage of the type system to make programs more secure?

This question is inspired from Joel's "Making Wrong Code Look Wrong"
http://www.joelonsoftware.com/articles/Wrong.html
Sometimes you can use types to enforce semantics on objects beyond their interfaces. For example, the Java interface Serializable does not actually define methods, but the fact that an object implements Serializable says something about how it should be used.
Can we have UnsafeString and SafeString interfaces/subclasses in, say, Java, that are used in much the same way as Joel's Hungarian notation and Java's Serializable, so that wrong code doesn't just look bad--it doesn't compile?
Is this feasible in Java/C/C++ or are the type systems too weak or too dynamic?
Also, beyond input sanitization, what other security functions can be implemented in this manner?
The type system already enforces a huge number of such safety features. That is essentially what it's for.
For a very simple example, it prevents you from treating a float as an int. That's one aspect of safety -- it guarantees that the types you're working with are going to behave as expected. It guarantees that only string methods are called on a string. Assembly doesn't have that safeguard, for example.
It's also the job of the type system to ensure that you don't call private functions on a class. That's another safety feature.
Java's type system is too anemic to enforce a lot of interesting constraints effectively, but in many other languages (including C++), the type system can be used to enforce far more wide-ranging rules.
In C++, template metaprogramming gives you a lot of tools for prohibiting "bad" code. For example:
#include <boost/noncopyable.hpp>

class myclass : boost::noncopyable {
    ...
};
enforces at compile-time that the class can not be copied. The following will produce compile errors:
myclass m;
myclass m2(m); // copy construction isn't allowed
myclass m3;
m3 = m; // assignment also not allowed
Likewise, we can ensure at compile time that a template function only gets called on types which fulfill certain criteria (say, they must be random-access iterators, while bidirectional ones aren't allowed; or they must be POD types; or they must not be any kind of integer type (char, short, int, long), while all other types are legal).
A textbook example of template metaprogramming in C++ implements a library for computing physical units. It allows you to multiply a value of type "meter" with another value of the same type, and automatically determines that the result must be of type "square meter". Or divide a value of type "mile" with a value of type "hour" and get a unit of type "miles per hour".
Again, this is a safety feature that prevents you from accidentally getting your units mixed up. You'll get a compile error if you compute a value and try to assign it to the wrong type: dividing, say, liters by meters^2 and assigning the result to, say, kilograms will result in a compile error.
Most of this requires some manual work to set up, certainly, but the language gives you the tools you need to basically build the type-checks you want. Some of this could be better supported directly in the language, but the more creative checks would have to be implemented manually in any case.
Yes, you can do such a thing. I don't know about Java, but in C++ it isn't customary and there is no direct support for it, so you have to do some manual work. It is customary in some other languages, Ada for example, which has the equivalent of a typedef that introduces a new type which can't be implicitly converted into the original one (the new type "inherits" some basic operations from the one it is created from, so it stays useful).
BTW, in general, inheritance isn't a good way to introduce such new types: even if there is no implicit conversion in one direction, there is one in the other.
You can do a certain amount of this out of the box in Ada. For example, you can make integer types that cannot implicitly interoperate with each other, and Ada enumerations are not compatible with any integer type. You can still convert between them, but you have to do it explicitly, which calls attention to what you are doing.
You could do the same with present-day C++, but you'd have to wrap all your integers and enums in classes, which is just way too much work for something that should be simple (or better yet, the default way of doing things).
I understand the next version of C++ is going to fix at least the enumeration issue.
In C++, I suppose you could use typedef to create a synonym for a primitive type. Your synonym could imply something about the content of that variable, replacing the function of Apps Hungarian notation.
IntelliSense will report the synonym you used at the declaration, so if you don't like using actual Hungarian, it does save you from scrolling about (or using Go To Definition).
I guess you are thinking of something along the lines of Perl's "tainting" analysis.
In Java, it should be possible to use custom annotations and an annotation processor to implement this. Not necessarily easy though.
You can't have an UnsafeString subclass of String in Java, since java.lang.String is final.
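Since String is final, the usual workaround is a wrapper type rather than a subclass. A minimal sketch (all names hypothetical, and the sanitizer is a stand-in):
// Code that demands a SafeString cannot be handed raw input by accident:
// the only way to obtain one is through the sanitizing factory method.
final class SafeString {
    private final String value;

    private SafeString(String value) { this.value = value; }

    static SafeString sanitize(String raw) {
        // Stand-in sanitization; a real one would be context-appropriate.
        return new SafeString(raw.replace("<", "&lt;").replace(">", "&gt;"));
    }

    String get() { return value; }
}

class SafeStringDemo {
    static void render(SafeString s) { System.out.println(s.get()); }

    public static void main(String[] args) {
        render(SafeString.sanitize("<script>alert(1)</script>"));
        // render("<script>...</script>");  // would not compile: String is not SafeString
    }
}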
In general, you cannot provide any kind of security on the source level - if you want to protect against evil code, you must do that on the binary level (e.g. Java bytecode). That's why private/protected can't be used as a security mechanism in C++: it is possible to bypass that with pointer manipulations.
