Before you start reading: This question is not about understanding monads; it is about identifying the limitations of the Java type system that prevent the declaration of a Monad interface.
In my effort to understand monads I read this SO-answer by Eric Lippert on a question which asks about a simple explanation of monads. There, he also lists the operations which can be executed on a monad:
That there is a way to take a value of an unamplified type and turn it into a value of the amplified type.
That there is a way to transform operations on the unamplified type into operations on the amplified type that obeys the rules of functional composition mentioned before.
That there is usually a way to get the unamplified type back out of the amplified type. (This last point isn't strictly necessary for a monad but it is frequently the case that such an operation exists.)
After reading more about monads, I identified the first operation as the return function and the second operation as the bind function. I was not able to find a commonly used name for the third operation, so I will just call it the unbox function.
To better understand monads, I went ahead and tried to declare a generic Monad interface in Java. For this, I first looked at the signatures of the three functions above. For the Monad M, it looks like this:
return :: T1 -> M<T1>
bind :: M<T1> -> (T1 -> M<T2>) -> M<T2>
unbox :: M<T1> -> T1
The return function is not executed on an instance of M, so it does not belong in the Monad interface. Instead, it will be implemented as a constructor or factory method.
Also for now, I omit the unbox function from the interface declaration, since it is not required. There will be different implementations of this function for the different implementations of the interface.
Thus, the Monad interface only contains the bind function.
Let's try to declare the interface:
public interface Monad {
Monad bind();
}
There are two flaws:
The bind function should return the concrete implementation, but it returns only the interface type. This is a problem, since the unbox operations are declared on the concrete subtypes. I will refer to this as problem 1.
The bind function should receive a function as a parameter. We will address this later.
Using the concrete type in the interface declaration
This addresses problem 1: If my understanding of monads is correct, then the bind function always returns a new monad of the same concrete type as the monad on which it was called. So, if I have an implementation of the Monad interface called M, then M.bind will return another M but not a Monad. I can implement this using generics:
public interface Monad<M extends Monad<M>> {
M bind();
}
public class MonadImpl<M extends MonadImpl<M>> implements Monad<M> {
@Override
public M bind() { /* do stuff and return an instance of M */ }
}
At first, this seems to work, however there are at least two flaws with this:
This breaks down as soon as an implementing class does not provide itself but another implementation of the Monad interface as the type parameter M, because then the bind method will return the wrong type. For example, the class
public class FaultyMonad<M extends MonadImpl<M>> implements Monad<M> { ... }
will return an instance of MonadImpl where it should return an instance of FaultyMonad. However, we can specify this restriction in the documentation and consider such an implementation as a programmer error.
The second flaw is more difficult to resolve. I will call it problem 2: When I try to instantiate the class MonadImpl I need to provide the type of M. Let's try this:
new MonadImpl<MonadImpl<MonadImpl<MonadImpl<MonadImpl< ... >>>>>()
To get a valid type declaration, this has to go on infinitely. Here is another attempt:
public static <M extends MonadImpl<M>> MonadImpl<M> create() {
return new MonadImpl<M>();
}
While this seems to work, we have just deferred the problem to the caller. Here is the only usage of that function that works for me:
public void createAndUseMonad() {
MonadImpl<?> monad = create();
// use monad
}
which essentially boils down to
MonadImpl<?> monad = new MonadImpl<>();
but this is clearly not what we want.
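For reference, the usual way to close such a self-referential bound is a non-generic class that names itself as the type argument. The sketch below uses hypothetical names (Chainable, Counter) and a parameterless next() that stands in for the parameterless bind() above; it only addresses problem 2 and says nothing yet about the function parameter.

```java
// Hypothetical names, for illustration only.
interface Chainable<M extends Chainable<M>> {
    M next(); // stands in for the parameterless bind() above
}

// A non-generic class closes the recursion by passing itself as M,
// so no infinite chain of type arguments is needed at the use site.
final class Counter implements Chainable<Counter> {
    final int count;

    Counter(int count) {
        this.count = count;
    }

    @Override
    public Counter next() {
        return new Counter(count + 1);
    }
}
```

With this shape, new Counter(0).next().next() has the static type Counter rather than Chainable, so any operations declared only on Counter stay reachable.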
Using a type in its own declaration with shifted type parameters
Now, let's add the function parameter to the bind function: As described above, the signature of the bind function looks like this: T1 -> M<T2>. In Java, this is the type Function<T1, M<T2>>. Here is the first attempt to declare the interface with the parameter:
public interface Monad<T1, M extends Monad<?, ?>> {
M bind(Function<T1, M> function);
}
We have to add the type T1 as generic type parameter to the interface declaration, so we can use it in the function signature. The first ? is the T1 of the returned monad of type M. To replace it with T2, we have to add T2 itself as a generic type parameter:
public interface Monad<T1, M extends Monad<T2, ?, ?>,
T2> {
M bind(Function<T1, M> function);
}
Now, we get another problem. We added a third type parameter to the Monad interface, so we had to add a new ? to the usage of it. We will ignore the new ? for now to investigate the now first ?. It is the M of the returned monad of type M. Let's try to remove this ? by renaming M to M1 and by introducing another M2:
public interface Monad<T1, M1 extends Monad<T2, M2, ?, ?>,
T2, M2 extends Monad< ?, ?, ?, ?>> {
M1 bind(Function<T1, M1> function);
}
Introducing another T3 results in:
public interface Monad<T1, M1 extends Monad<T2, M2, T3, ?, ?>,
T2, M2 extends Monad<T3, ?, ?, ?, ?>,
T3> {
M1 bind(Function<T1, M1> function);
}
and introducing another M3 results in:
public interface Monad<T1, M1 extends Monad<T2, M2, T3, M3, ?, ?>,
T2, M2 extends Monad<T3, M3, ?, ?, ?, ?>,
T3, M3 extends Monad< ?, ?, ?, ?, ?, ?>> {
M1 bind(Function<T1, M1> function);
}
We see that this will go on forever if we try to resolve all ?. This is problem 3.
Summing it all up
We identified three problems:
Using the concrete type in the declaration of the abstract type.
Instantiating a type which receives itself as generic type parameter.
Declaring a type which uses itself in its declaration with shifted type parameters.
The question is: What is the feature that is missing in the Java type system? Since there are languages which work with monads, these languages have to somehow declare the Monad type. How do these other languages declare the Monad type? I was not able to find information about this. I only find information about the declaration of concrete monads, like the Maybe monad.
Did I miss anything? Can I properly solve one of these problems with the Java type system? If I cannot solve problem 2 with the Java type system, is there a reason why Java does not warn me about the not instantiable type declaration?
As already stated, this question is not about understanding monads. If my understanding of monads is wrong, you might give a hint about it, but don't attempt to give an explanation. If my understanding of monads is wrong the described problems remain.
This question is also not about whether it is possible to declare the Monad interface in Java. This question already received an answer by Eric Lippert in his SO-answer linked above: It is not. This question is about what exactly is the limitation that prevents me from doing this. Eric Lippert refers to this as higher types, but I can't get my head around them.
Most OOP languages do not have a rich enough type system to represent the monad pattern itself directly; you need a type system that supports types that are higher types than generic types. So I wouldn't try to do that. Rather, I would implement generic types that represent each monad, and implement methods that represent the three operations you need: turning a value into an amplified value, turning an amplified value into a value, and transforming a function on unamplified values into a function on amplified values.
What is the feature that is missing in the Java type system? How do these other languages declare the Monad type?
Good question!
Eric Lippert refers to this as higher types, but I can't get my head around them.
You are not alone. But they are actually not as crazy as they sound.
Let's answer both of your questions by looking at how Haskell declares the monad "type" -- you'll see why the quotes in a minute. I have simplified it somewhat; the standard monad pattern also has a couple other operations in Haskell:
class Monad m where
(>>=) :: m a -> (a -> m b) -> m b
return :: a -> m a
Boy, that looks both incredibly simple and completely opaque at the same time, doesn't it?
Here, let me simplify that a bit more. Haskell lets you declare your own infix operator for bind, but we'll just call it bind:
class Monad m where
bind :: m a -> (a -> m b) -> m b
return :: a -> m a
All right, now at least we can see that there are the two monad operations in there. What does the rest of this mean?
The first thing to get your head around, as you note, is "higher kinded types". (As Brian points out, I somewhat simplified this jargon in my original answer. Also quite amusing that your question attracted the attention of Brian!)
In Java, a "class" is a kind of "type", and a class may be generic. So in Java we've got int and IFrob and List<IBar> and they're all types.
From this point on throw away any intuition you have about Giraffe being a class that is a subclass of Animal, and so on; we won't need that. Think about a world with no inheritance; it will not come into this discussion again.
What are classes in Java? Well, the easiest way to think of a class is that it is a name for a set of values that have something in common, such that any one of those values can be used when an instance of the class is required. You have a class Point, let's say, and if you have a variable of type Point, you can assign any instance of Point to it. The Point class is in some sense just a way to describe the set of all Point instances. Classes are a thing that is higher than instances.
In Haskell there are also generic and non-generic types. A class in Haskell is not a kind of type. In Java, a class describes a set of values; any time you need an instance of the class, you can use a value of that type. In Haskell a class describes a set of types. That is the key feature that the Java type system is missing. In Haskell a class is higher than a type, which is higher than an instance. Java only has two levels of hierarchy; Haskell has three. In Haskell you can express the idea "any time I need a type that has certain operations, I can use a member of this class".
(ASIDE: I want to point out here that I am making a bit of an oversimplification. Consider in Java for example List<Integer> and List<String>. These are two "types", but Java considers them to be one "class", so in a sense Java also has classes which are "higher" than types. But then again, you could say the same in Haskell, that list x and list y are types, and that list is a thing that is higher than a type; it's a thing that can produce a type. So it would in fact be more accurate to say that Java has three levels, and Haskell has four. The point remains though: Haskell has a concept of describing the operations available on a type that is simply more powerful than Java has. We'll look at this in more detail below.)
So how is this different than interfaces? This sounds like interfaces in Java -- you need a type that has certain operations, you define an interface that describes those operations. We'll see what is missing from Java interfaces.
Now we can start making sense of this Haskell:
class Monad m where
So, what is Monad? It's a class. What is a class? It's a set of types that have something in common, such that whenever you need a type that has certain operations, you can use a Monad type.
Suppose we have a type that is a member of this class; call it m. What are the operations that must be on this type in order for that type to be a member of the class Monad?
bind :: m a -> (a -> m b) -> m b
return :: a -> m a
The name of the operation comes to the left of the ::, and the signature comes to the right. So to be a Monad, a type m must have two operations: bind and return. What are the signatures of those operations? Let's look at return first.
a -> m a
m a is Haskell for what in Java would be M<A>. That is, this means m is a generic type, a is a type, m a is m parametrized with a.
x -> y in Haskell is the syntax for "a function which takes type x and returns type y". It's Function<X, Y>.
Put it together, and we have return is a function that takes an argument of type a and returns a value of type m a. Or in Java
static <A> M<A> Return(A a);
bind is a little bit harder. I think the OP well understands this signature, but for readers who are unfamiliar with the terse Haskell syntax, let me expand on this a bit.
In Haskell, functions only take one argument. If you want a function of two arguments, you make a function that takes one argument and returns another function of one argument. So if you have
a -> b -> c
Then what have you got? A function that takes an a and returns a b -> c. So suppose you wanted to make a function that took two numbers and returned their sum. You would make a function that takes the first number, and returns a function that takes a second number and adds it to the first number.
In Java you'd say
static <A, B, C> Function<B, C> F(A a)
So if you wanted a C and you had an A and a B, you could say
F(a).apply(b)
Make sense?
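To make the currying concrete, here is a small sketch using java.util.function.Function. The class name CurriedAdd and its members are made up for this example; it is just the "sum of two numbers" case described above written as a function that returns a function.

```java
import java.util.function.Function;

final class CurriedAdd {
    // a -> b -> c with a, b and c all Integer: takes the first number and
    // returns a function that takes the second number and adds the two.
    static final Function<Integer, Function<Integer, Integer>> ADD =
            a -> b -> a + b;

    // Same shape as F above: one argument in, a function of one argument out.
    static Function<Integer, Integer> addTo(Integer a) {
        return ADD.apply(a);
    }
}
```

A call then reads CurriedAdd.ADD.apply(2).apply(3), or CurriedAdd.addTo(2).apply(3) when going through the method.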
All right, so
bind :: m a -> (a -> m b) -> m b
is effectively a function that takes two things: an m a, and an a -> m b, and it returns an m b. Or, in Java, it is directly:
static <A, B> Function<Function<A, M<B>>, M<B>> Bind(M<A> m)
Or, more idiomatically in Java:
static <A, B> M<B> Bind(M<A> m, Function<A, M<B>> f)
So now you see why Java cannot represent the monad type directly. It does not have the ability to say "I have a class of types that have this pattern in common".
Now, you can make all the monadic types you want in Java. The thing you can't do is make an interface that represents the idea "this type is a monad type". What you would need to do is something like:
typeinterface Monad<M>
{
static <A> M<A> Return(A a);
static <A, B> M<B> Bind(M<A> m, Function<A, M<B>> f);
}
See how the type interface talks about the generic type itself? A monadic type is any type M that is generic with one type parameter and has these two static methods. But you can't do that in the Java or C# type systems. Bind of course could be an instance method that takes an M<A> as this. But there is no way to make Return anything but static. Java gives you no ability to (1) parameterize an interface by an unconstructed generic type, or (2) specify that static members are part of the interface contract.
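As an illustration using only the standard library: Optional.flatMap and Stream.flatMap both have the shape of bind, and Optional.of and Stream.of the shape of return, yet no common supertype captures that shape. The class and helper names below (SameShapeDifferentTypes, halveIfEven, twice, quarter, duplicateAll) are made up for this sketch.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class SameShapeDifferentTypes {
    // A function of shape A -> M<B> with M = Optional.
    static Optional<Integer> halveIfEven(int n) {
        return n % 2 == 0 ? Optional.of(n / 2) : Optional.empty();
    }

    // A function of shape A -> M<B> with M = Stream.
    static Stream<Integer> twice(int n) {
        return Stream.of(n, n);
    }

    // "return" is Optional.of, "bind" is Optional.flatMap.
    static Optional<Integer> quarter(int n) {
        return Optional.of(n)
                .flatMap(SameShapeDifferentTypes::halveIfEven)
                .flatMap(SameShapeDifferentTypes::halveIfEven);
    }

    // "bind" is Stream.flatMap. Same pattern, unrelated type.
    static List<Integer> duplicateAll(List<Integer> xs) {
        return xs.stream()
                .flatMap(SameShapeDifferentTypes::twice)
                .collect(Collectors.toList());
    }
}
```

There is no way to write a single method that accepts "any M with flatMap and of" and works for both Optional and Stream, which is the missing abstraction described above.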
Since there are languages which work with monads, these languages have to somehow declare the Monad type.
Well you'd think so but actually not. First off, of course any language with a sufficient type system can define monadic types; you can define all the monadic types you want in C# or Java, you just can't say what they all have in common in the type system. You can't make a generic class that can only be parameterized by monadic types, for instance.
Second, you can embed the monad pattern in the language in other ways. C# has no way to say "this type matches the monad pattern", but C# has query comprehensions (LINQ) built into the language. Query comprehensions work on any monadic type! It's just that the bind operation has to be called SelectMany, which is a little weird. But if you look at the signature of SelectMany, you'll see that it is just bind:
static IEnumerable<R> SelectMany<S, R>(
IEnumerable<S> source,
Func<S, IEnumerable<R>> selector)
That's the implementation of SelectMany for the sequence monad, IEnumerable<T>, but in C# if you write
from x in a from y in b select z
then a's type can be of any monadic type, not just IEnumerable<T>. What is required is that a is M<A>, that b is M<B>, and that there is a suitable SelectMany that follows the monad pattern. So that's another way of embedding a "monad recognizer" in the language, without representing it directly in the type system.
(The previous paragraph is actually a lie of oversimplification; the binding pattern used by this query is slightly different than the standard monadic bind for performance reasons. Conceptually this recognizes the monad pattern; in actuality the details differ slightly. Read about them here http://ericlippert.com/2013/04/02/monads-part-twelve/ if you're interested.)
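Java has no query comprehension syntax, so the nearest analogue is to write the nested bind by hand. The sketch below (NestedBind and pairs are made-up names) shows what a query like "from x in xs from y in ys select x + y" conceptually corresponds to for the sequence monad, with Stream.flatMap playing the role of SelectMany.

```java
import java.util.List;
import java.util.stream.Collectors;

final class NestedBind {
    // Conceptually: from x in xs from y in ys select x + y
    static List<String> pairs(List<String> xs, List<Integer> ys) {
        return xs.stream()
                // The outer bind: for each x, produce a whole stream of results.
                .flatMap(x -> ys.stream().map(y -> x + y))
                .collect(Collectors.toList());
    }
}
```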
A few more small points:
I was not able to find a commonly used name for the third operation, so I will just call it the unbox function.
Good choice; it is usually called the "extract" operation. A monad need not have an extract operation exposed, but of course somehow bind needs to be able to get the A out of the M<A> in order to call the Function<A, M<B>> on it, so logically some sort of extraction operation usually exists.
A comonad -- a backwards monad, in a sense -- requires an extract operation to be exposed; extract is essentially return backwards. A comonad as well requires an extend operation that is sort of bind turned backwards. It has the signature static M<B> Extend(M<A> m, Func<M<A>, B> f)
If you look at what the AspectJ project is doing, it is similar to applying monads to Java. The way they do it is to post-process the byte code of the classes to add the additional functionality-- and the reason they have to do that is because there is no way within the language without the AspectJ extensions to do what they need to do; the language is not expressive enough.
A concrete example: say you start with class A. You have a monad M such that M(A) is a class that works just like A, but all method entrances and exits get traced to log4j. AspectJ can do this, but there is no facility within the Java language itself that would let you.
This paper describes how Aspect-Oriented Programming as in AspectJ might be formalized as monads
In particular, there is no way within the Java language to specify a type programmatically (short of byte-code manipulation a la AspectJ). All types are pre-defined when the program starts.
Good question indeed! :-)
As @EricLippert pointed out, the type of polymorphism that is known as "type classes" in Haskell is beyond the grasp of Java's type system. However, at least since the introduction of the Frege programming language it has been shown that a Haskell-like type system can indeed be implemented on top of the JVM.
If you want to use higher-kinded types in the Java language itself you have to resort to libraries like highJ or Cyclops. Both libraries do provide a monad type class in the Haskell sense (see here and here, respectively, for the sources of the monad type class). In both cases, be prepared for some major syntactic inconveniences; this code will not look pretty at all and carries a lot of overhead to shoehorn this functionality into Java's type system. Both libraries use a "type witness" to capture the core type separately from the data type, as John McClean explains in his excellent introduction. However, in neither implementation will you find anything as simple and straightforward as Maybe extends Monad or List extends Monad.
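To give a rough idea of what the "type witness" trick looks like, here is a minimal sketch. All names in it (Kind, MonadOf, Maybe.Mu, narrow, MaybeMonad, GenericCode) are hypothetical and chosen for this example; this is not the API of highJ or Cyclops, only the general shape of the encoding.

```java
import java.util.function.Function;

// Hypothetical marker meaning "type constructor F applied to A".
interface Kind<F, A> {}

// The type class: one implementation per witness type F.
interface MonadOf<F> {
    <A> Kind<F, A> pure(A a);
    <A, B> Kind<F, B> flatMap(Kind<F, A> ma, Function<A, Kind<F, B>> f);
}

// A data type registers itself under a witness type (here Maybe.Mu).
final class Maybe<A> implements Kind<Maybe.Mu, A> {
    static final class Mu {}          // the "type witness" for Maybe

    private final A value;            // null encodes Nothing in this sketch

    private Maybe(A value) { this.value = value; }

    static <A> Maybe<A> just(A a) { return new Maybe<>(a); }
    static <A> Maybe<A> nothing() { return new Maybe<>(null); }

    boolean isPresent() { return value != null; }
    A get() { return value; }

    // Downcast from the witness-encoded form back to the concrete type.
    static <A> Maybe<A> narrow(Kind<Mu, A> k) { return (Maybe<A>) k; }
}

// The monad instance for Maybe, written once, outside the data type.
final class MaybeMonad implements MonadOf<Maybe.Mu> {
    @Override
    public <A> Kind<Maybe.Mu, A> pure(A a) { return Maybe.just(a); }

    @Override
    public <A, B> Kind<Maybe.Mu, B> flatMap(Kind<Maybe.Mu, A> ma,
                                            Function<A, Kind<Maybe.Mu, B>> f) {
        Maybe<A> m = Maybe.narrow(ma);
        if (m.isPresent()) return f.apply(m.get());
        return Maybe.<B>nothing();
    }
}

final class GenericCode {
    // Written against any monad F, not against Maybe specifically.
    static <F> Kind<F, Integer> incrementTwice(MonadOf<F> m, Kind<F, Integer> start) {
        Function<Integer, Kind<F, Integer>> inc = x -> m.pure(x + 1);
        return m.flatMap(m.flatMap(start, inc), inc);
    }
}
```

The cost is visible even in this small sketch: callers work with Kind<Maybe.Mu, Integer> and have to narrow back to Maybe<Integer>, and the monad instance has to be passed around explicitly.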
The secondary problem of specifying constructors or static methods with Java interfaces can be easily overcome by introducing a factory (or "companion") interface that declares the static method as a non-static one. Personally, I always try to avoid anything static and use injected singletons instead.
Long story short, yes, it is possible to represent HKTs in Java but at this point it is very inconvenient and not very user friendly.
Yes, we cannot override a static method in a class, and we cannot declare a constructor in an interface.
Use an abstract class to simulate the Monad type class in Haskell:
import java.util.function.Function;
public abstract class Monad<T> {
public static <T> Monad<T> Unit(T a){
throw new UnsupportedOperationException("Call Unit in abstract class: Monad");
}
public <R> Monad<R> OUnit(R a){
throw new UnsupportedOperationException("Call OUnit in abstract class: Monad");
}
public <B> Monad<B> bind(Function<T, Monad<B>> func){
throw new UnsupportedOperationException("Call bind in abstract class: Monad");
}
public <B> Monad<B> combine(Monad<B> b){
return this.bind(unused -> b);
}
}
public class Maybe<T> extends Monad<T> {
public boolean has;
public T val;
public Maybe(T value) {
this.has = true;
this.val = value;
}
public Maybe(){
has = false;
}
public static <T> Maybe<T> Unit(T a) {
return new Maybe<T>(a);
}
public static <T> Maybe<T> Unit() {
return new Maybe<T>();
}
@Override
public <R> Maybe<R> OUnit(R a) {
return new Maybe<R>(a);
}
public <T> Maybe<T> OUnit() {
return new Maybe<T>();
}
@Override
public <B> Monad<B> bind(Function<T, Monad<B>> func){
if (this.has){
return func.apply(this.val);
}
return new Maybe<B>();
}
@Override
public String toString(){
if (this.has){
return "Maybe " + val.toString();
}
return "Nothing";
}
}
public class Main {
/*
example :: (Monad m, Show (m n), Num n) => m n -> m n -> IO ()
example a b = do
print $ a >> b
print $ b >> a
print $ a >>= (\x -> return $ x+x)
print $ b >>= (\x -> return $ x+x)
main = do
example (Just 10) (Just 5)
example (Right 10) (Left 5)
*/
public static void example(Monad<Integer> a, Monad<Integer> b){
System.out.println(a.bind(x -> b));
System.out.println(b.bind(x -> a)); // mirrors "b >> a" in the Haskell comment above
System.out.println(a.bind(x -> a.OUnit(x*2)));
System.out.println(b.bind(x -> b.OUnit(x*2)));
System.out.println(a.combine(a));
System.out.println(a.combine(b));
System.out.println(b.combine(a));
System.out.println(b.combine(b));
}
// Monad can also be used with arbitrary Objects
public static void example2(Monad<Object> a, Monad<Object> b){
System.out.println(a.bind(x -> b));
System.out.println(b.bind(x -> a)); // mirrors "b >> a" in the Haskell comment above
System.out.println(a.combine(a));
System.out.println(a.combine(b));
System.out.println(b.combine(a));
System.out.println(b.combine(b));
}
public static void main(String[] args){
System.out.println("Example 1:");
example(Maybe.<Integer>Unit(10), Maybe.<Integer>Unit());
System.out.println("\n\nExample 2:");
example(Maybe.<Integer>Unit(1), Maybe.<Integer>Unit(3));
System.out.println("\n\nExample 3:");
example2(Maybe.<Object>Unit(10), Maybe.<Object>Unit());
}
}
Use an interface to simulate the Monad type class in Haskell:
import java.util.function.Function;
public interface Monad<T> {
public static <T> Monad<T> Unit(T a){
throw new UnsupportedOperationException("call Unit in Monad interface");
}
public <R> Monad<R> OUnit(R a);
public <B> Monad<B> bind(Function<T, Monad<B>> func);
default public <B> Monad<B> combine(Monad<B> b){
return bind(x-> b);
};
}
// in class Maybe, replace extends with implements
// in class Main, unchanged
and the output is the same
Related
I am going through this link (Chapter 4. Types, Values, and Variables) and did not understand below point:
The relationship of wildcards to established type theory is an interesting one, which we briefly allude to here. Wildcards are a restricted form of existential types. Given a generic type declaration G<T extends B>, G<?> is roughly analogous to Some X <: B. G<X>.
I appreciate if you provide good example to understand above point clearly.
Thanks in advance.
The wording and formatting of this statement are a bit unlucky*. The link in the answer by Maouven actually covers the general topic pretty well, but one can try to focus on the particular case of Java and Wildcards here:
Wildcards are a restricted form of existential types. Given a generic type declaration G<T extends B>, G<?> is roughly analogous to Some X <: B. G<X>.
This basically says that the type parameter of the G is any subtype of B. And this is always the case, even when you don't say it explicitly.
Consider the following snippet, which hopefully illustrates the point:
class B { }
class G<T extends B>
{
T get() { return null; }
}
public class Example
{
public static void main(String[] args)
{
G<?> g = null;
// This works, even though "G<?>" seemingly does not say
// anything about the type parameter:
B b = g.get();
}
}
The object that you obtain by calling g.get() is of type B, because the declaration of G<T extends B> guarantees that any type parameter (even if it is the ? wildcard) will always be "at least" of type B.
(In contrast to that: If the declaration only was G<T>, then the type obtained from g.get() would only be of type Object)
The relationship is described as "roughly analogous" to the type theoretic notation. You can probably imagine this as saying: If the declaration is G<T extends B>, and you use the type G<?>, then this roughly (!) means: There exists a type X extends B, and the ? here stands for this (unknown) type X.
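One place where the "there exists a type X" reading becomes visible in code is wildcard capture: a private generic helper can give the unknown type a name. The following is a sketch of that well-known idiom; the names CaptureDemo, swapFirstTwo and swapHelper are made up for this example.

```java
import java.util.List;

final class CaptureDemo {
    // The caller only knows "a List of some unknown element type".
    static void swapFirstTwo(List<?> list) {
        // Writing list.set(0, list.get(1)) here would not compile,
        // because the compiler cannot relate the two occurrences of "?".
        swapHelper(list);
    }

    // Inside the helper the unknown type has a name, X, so values read from
    // the list can be written back into it.
    private static <X> void swapHelper(List<X> list) {
        X first = list.get(0);
        list.set(0, list.get(1));
        list.set(1, first);
    }
}
```

The call swapHelper(list) compiles because the compiler infers X as the captured (existential) type hidden behind the ?.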
An aside: Note that this also refers to Intersection Types. If you declared the class as class G<T extends B & Runnable>, then the statements
B b = g.get();
Runnable x = g.get();
would both be valid.
* The "unlucky" formatting referred to the fact that the source code of this paragraph actually reads
... is roughly analogous to <span class="type">Some <span class="type">X</span> ...
making clearer that the word "Some" already is part of the type that is being defined there formally...
Wildcards are a restricted form of existential types, in the way they incorporate the principle of existential types in Java. You can refer to the link here, which provides explanatory examples:
What is an existential type?
I've been reading up on type classes in Scala and thought I had a good grasp on it, until I remembered Java's java.util.Comparator.
If I understand properly, Ordering is the prototypical example of a type class. The only difference I can think of between a Comparator and an instance of Ordering is that comparators are necessarily explicit, while orderings can be, and often are, implicit.
Is Comparator a type class? I get the (mistaken?) impression that Java does not actually have type classes. Does this mean that a type class needs to be able to be implicit? I considered implicit conversions of type classes to be mostly syntactic sugar - awesome as it is, it's "simply" giving the compiler enough hint - was I missing something?
The following code example shows how Comparator adds an ordering operation to a type that didn't have it, without having to modify said type.
// Comparator used to retroactively fit the MyExample class with an ordering operation.
public static class MyExampleComparator implements Comparator<MyExample> {
public static final Comparator<MyExample> SINGLETON = new MyExampleComparator();
private MyExampleComparator() {}
public int compare(MyExample a, MyExample b) {
return Integer.compare(a.value, b.value); // avoids overflow of a.value - b.value
}
}
// Custom type, its only purpose is to show that Comparator can add an ordering operation to it when it doesn't
// have one to begin with.
public static class MyExample {
private final int value;
public MyExample(int v) {
value = v;
}
public String toString() {
return Integer.toString(value);
}
}
public static void main(String... args) {
List<MyExample> list = new ArrayList<MyExample>();
for(int i = 0; i < 10; i++)
list.add(new MyExample(-i));
// Sorts the list without having had to modify MyExample to implement an interface.
Collections.sort(list, MyExampleComparator.SINGLETON);
// Prints the expected [-9, -8, -7, -6, -5, -4, -3, -2, -1, 0]
System.out.println(list);
}
I prefer not to talk specifically about type classes but about the type class pattern in Scala; the reason is that when you start asking "what is the type class", you end up concluding that it is just an interface used in a particular way.
(In Haskell it makes more sense to call a specific construct a type class.)
The type class pattern consists of three essential parts (but there are usually a couple more for convenience). The first is an interface parameterized by a single type that abstracts some sort of capability on the parameterized type. java.util.Comparator is a perfect example: it provides an interface for comparison. Let's just use that.
The second thing you need is a method that makes use of that parameterization, which you can specify with short-hand notation in Scala:
// Short signature
// v------------------- "We must be able to find a Comparator for A"
def ordered[A: java.util.Comparator](a0: A, a1: A, a2: A) = {
val cmp = implicitly[java.util.Comparator[A]] // This is the Comparator
cmp.compare(a0, a1) <= 0 && cmp.compare(a1, a2) <= 0
}
// Long signature version
def ordered[A](a0: A, a1: A, a2: A)(implicit cmp: java.util.Comparator[A]) = {
cmp.compare(a0, a1) <= 0 && cmp.compare(a1, a2) <= 0
}
Okay, but where do you get that comparator from? That's the third necessary piece. By default, Scala doesn't give you Comparators for the classes you might like, but you can define your own:
implicit object IntComp extends java.util.Comparator[Int] {
def compare(a: Int, b: Int) = a.compareTo(b)
}
scala> ordered(1,2,3)
res5: Boolean = true
scala> ordered(1,3,2)
res6: Boolean = false
Now that you've provided the functionality for Int (implicitly), the compiler will fill in the implicit parameter to ordered to make it work. If you haven't yet provided the functionality, it gives an error:
scala> ordered("fish","wish","dish")
<console>:12: error: could not find implicit value
for parameter cmp: java.util.Comparator[String]
ordered("fish","wish","dish")
until you supply that functionality:
implicit object StringComp extends java.util.Comparator[String] {
def compare(a: String, b: String) = a.compareTo(b)
}
scala> ordered("fish","wish","dish")
res11: Boolean = false
So, do we call java.util.Comparator a type class? It certainly functions just as well as a Scala trait that handles the equivalent part of the type class pattern. So even though the type class pattern doesn't work as well in Java (since you have to explicitly specify the instance to use instead of having it implicitly looked up), from a Scala perspective java.util.Comparator is as much a type class as anything.
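For comparison, here is roughly what the same ordered method looks like in plain Java, where the instance has to be passed explicitly at every call site. The class name OrderedDemo is made up; Comparator.naturalOrder() and Comparator.reverseOrder() are the standard Java 8 factory methods.

```java
import java.util.Comparator;

final class OrderedDemo {
    // Same logic as the Scala version, but the Comparator is an ordinary
    // parameter instead of an implicit one.
    static <A> boolean ordered(A a0, A a1, A a2, Comparator<? super A> cmp) {
        return cmp.compare(a0, a1) <= 0 && cmp.compare(a1, a2) <= 0;
    }
}
```

A call such as OrderedDemo.ordered(1, 2, 3, Comparator.<Integer>naturalOrder()) plays the role of ordered(1, 2, 3) in the Scala session, and passing Comparator.<Integer>reverseOrder() instead swaps the instance without touching Integer.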
The term type class comes from Haskell, where type classes are part of the language. In Scala they are not; a type class is more of a pattern, which happens to have a lot of language support in Scala (implicits, mostly). The pattern makes sense even without this syntactic support, for instance in Java, and I would say that Comparator is a typical example of that pattern there (although the term type class is not used in Java).
From an object-oriented perspective, the pattern consists in having Comparator rather than Comparable. The most basic object thinking would have the comparison service in the object, say class String implements Comparable<String>. However, extracting it has numerous advantages:
You can provide the service for a class whose code you cannot change (for instance, arrays).
You can provide different implementations of the service. There are a zillion ways to compare strings (case-insensitive, and a lot of language-dependent ones). People may be sorted by their name, their age, whatever. And you may also simply want an ordering reversed.
These two reasons are enough to have Comparator in Java, and to use it in sorted collections (e.g. TreeSet). Comparable is kept, as it gives a convenient default (no need to pass a Comparator when you want the "default" comparison) and it is easier to call (x.compareTo(y) rather than comparator.compare(x, y)). In Scala, with implicits, neither of these reasons is compelling (interoperability with Java would still be a reason to implement Ordered/Comparable in Scala).
There are other, less obvious advantages to type classes. Among them:
A type class implementation is available and may be useful even when you have no instance of the type it operates on. Consider the operation sum(list). It requires that there is some sort of addition available on the elements of the list. This might be available in the elements themselves. Say they could be some Addable[T] with def add(other: T): T. But if you pass the empty list to sum, it should return the "zero" of the element type of the list (0 for ints, the empty string for strings...). Having a def zero: T in Addable[T] would be useless, as at that moment, you have no Addable around. But this works fine with a type class such as Numeric or Monoid.
As type class instances are reified (they are objects rather than methods), they are first class: you can combine them and transform them. A very simple example is reversing an Ordering (you could implement that on Comparable too, in Java probably in a static method). You can combine the orderings of Int and String to have an ordering defined on the pair (Int, String), or, given an Ordering on T, build an ordering on List[T]. Scala does that with implicits, but it still makes sense in Java, explicitly.
A more sophisticated example:
// Comparison by the first comparator which finds the two elements different.
public static <T> Comparator<T> lexicographic(final Comparator<T>... comparators) {
    return new Comparator<T>() {
        public int compare(T t1, T t2) {
            // Try each comparator in turn; the first non-zero result decides.
            for (Comparator<T> comparator : comparators) {
                int result = comparator.compare(t1, t2);
                if (result != 0) return result;
            }
            return 0;
        }
    };
}
(might be simpler in scala, but again, this is of interest in java)
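The sum(list) point above can also be sketched in plain Java. This is only an illustration: the Monoid interface, the two instances, and the class name MonoidDemo are made up for this example and are not part of the JDK. The point it shows is that the "zero" comes from the type class instance, so it is available even when the list is empty.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical type class for this sketch: a zero element plus an addition.
interface Monoid<T> {
    T zero();
    T add(T a, T b);
}

class MonoidDemo {
    // One instance per type, defined outside the element type itself.
    static final Monoid<Integer> INT_SUM = new Monoid<Integer>() {
        public Integer zero() { return 0; }
        public Integer add(Integer a, Integer b) { return a + b; }
    };

    static final Monoid<String> STRING_CONCAT = new Monoid<String>() {
        public String zero() { return ""; }
        public String add(String a, String b) { return a + b; }
    };

    // The instance is passed explicitly, so zero() works for an empty list too.
    static <T> T sum(List<T> items, Monoid<T> monoid) {
        T result = monoid.zero();
        for (T item : items) {
            result = monoid.add(result, item);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3), INT_SUM));
        System.out.println(sum(Collections.<String>emptyList(), STRING_CONCAT).isEmpty());
    }
}
```

Note that the instance has to be handed to sum by the caller, which is exactly the "pass it around" cost of the pattern in Java.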
There are some small disadvantages too (much more so in java than in scala, but still)
You must pass around the type class instance from method to method. This is much easier in Scala with an implicit parameter or the [T : Comparable] constraint, but something still has to be written in the method definitions, if not at the call site, and at run time the parameter has to be passed around.
Everything must be set at compile time (even in Scala, where it is set implicitly). So while you can try if (x instanceof Comparable<?>) {do some sorting}, this would not be possible with a Comparator.
No, java.util.Comparator is an interface:
public interface Comparator<T>
A comparison function, which imposes a total ordering on some collection of objects. Comparators can be passed to a sort method (such as Collections.sort or Arrays.sort) to allow precise control over the sort order. Comparators can also be used to control the order of certain data structures (such as sorted sets or sorted maps), or to provide an ordering for collections of objects that don't have a natural ordering.
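As a small illustration of the description above, here is a sketch that passes one Comparator to Collections.sort and a reversed one to a TreeSet. The class name ComparatorDemo, the BY_LENGTH constant and the helper methods are made up for this example; Collections.sort, Collections.reverseOrder and the TreeSet constructor are the standard JDK ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class ComparatorDemo {
    // An ordering String's natural compareTo does not give: by length, then alphabetically.
    static final Comparator<String> BY_LENGTH = new Comparator<String>() {
        public int compare(String a, String b) {
            int byLength = Integer.compare(a.length(), b.length());
            return byLength != 0 ? byLength : a.compareTo(b);
        }
    };

    // Sorts a copy of the list with the external ordering.
    static List<String> sortedByLength(List<String> words) {
        List<String> copy = new ArrayList<String>(words);
        Collections.sort(copy, BY_LENGTH);
        return copy;
    }

    // The same comparator, reversed, controls the order of a sorted set.
    static TreeSet<String> longestFirst(List<String> words) {
        TreeSet<String> set = new TreeSet<String>(Collections.reverseOrder(BY_LENGTH));
        set.addAll(words);
        return set;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "banana");
        System.out.println(sortedByLength(words));
        System.out.println(longestFirst(words));
    }
}
```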
Assume that we have two generic Java interfaces: Foo<T> and Bar<T>, of which there may be many implementations. Now, assume that we want to store one of each in a single class, both using the same value for T, but keep the exact implementations typed:
public interface FooBar<T, TFoo extends Foo<T>, TBar extends Bar<T>> {
TFoo getFoo();
TBar getBar();
}
Above, T is used for the sole purpose of enforcing that TFoo and TBar's classes use the same type parameter. Adding this type parameter to FooBar seems redundant for two reasons:
FooBar doesn't actually care about T at all.
Even if it did, T can be inferred from TFoo and TBar.
My question is therefore if there is a way to enforce conditions like this without cluttering up FooBar's list of type parameters. Having to write FooBar<String, StringFoo, StringBar> instead of the theoretically equivalent FooBar<StringFoo, StringBar> looks ugly to me.
Unfortunately, there is no better way... The compiler needs the T type to be declared in order to use it and there is no other place to declare it :
EDIT : unrelated link
If bound A is not specified first, you get a compile-time error:
class D <T extends B & A & C> { /* ... */ } // compile-time error
(extract from this doc)
And this is a little out of the subject, but this doc defines the conventions on type parameters names as being single, uppercase letters.
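Coming back to the FooBar declaration from the question, here is a minimal sketch of how the three-parameter form is used in practice. The method names value() and measure(), the classes StringFoo, StringBar and StringFooBar, and FooBarDemo are all invented for illustration; the question does not say what Foo and Bar actually contain.

```java
// Hypothetical contents for the two interfaces from the question.
interface Foo<T> { T value(); }
interface Bar<T> { int measure(T input); }

// T appears only to force TFoo and TBar to agree on their type argument.
interface FooBar<T, TFoo extends Foo<T>, TBar extends Bar<T>> {
    TFoo getFoo();
    TBar getBar();
}

class StringFoo implements Foo<String> {
    public String value() { return "hello"; }
}

class StringBar implements Bar<String> {
    public int measure(String input) { return input.length(); }
}

class StringFooBar implements FooBar<String, StringFoo, StringBar> {
    public StringFoo getFoo() { return new StringFoo(); }
    public StringBar getBar() { return new StringBar(); }
}

class FooBarDemo {
    // Because both halves share T, the Foo's value can be fed to the Bar without casts.
    static <T, F extends Foo<T>, B extends Bar<T>> int measureOwnValue(FooBar<T, F, B> pair) {
        return pair.getBar().measure(pair.getFoo().value());
    }

    public static void main(String[] args) {
        System.out.println(measureOwnValue(new StringFooBar()));
    }
}
```

Mixing, say, StringFoo with a Bar<Integer> implementation in the FooBar type arguments would be rejected by the compiler, which is the guarantee the extra T parameter buys.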
I know quite a bit about how to use C++ templates (not an expert, mind you). With Java generics (and Scala, for that matter), I have my difficulties. Maybe that is because I try to translate my C++ knowledge to the Java world. I read elsewhere, "they are nothing alike: Java Generics are only syntactic sugar saving casts, C++ Templates are only a glorified Preprocessor" :-)
I am quite sure both are a bit of a simplified view. So, to understand the big and the subtle differences, I try to start with specialization:
In C++ I can design a template (class or function) that acts on any type T that supports my required operations:
template<typename T>
T plus(T a, T b) { return a.add(b); }
This now potentially adds the plus() operation to any type that can add().
Thus, if T supports add(T), my template will work. If it doesn't, the compiler will not complain as long as I do not use plus(). In Python we call this "duck typing": *If it acts like a duck, quacks like a duck, it is a duck.* (Of course, using type_traits modifies this a bit, but as long as we have no concepts, this is how C++ templates work, right?)
I guess that's how generics in Java work as well, isn't it? The generic type I devise is used as a "template" for how to operate on anything I try to put in there, right? As far as I understand, I can (or must?) put some constraints on the type arguments: if I want to use add in my template, I have to declare the type argument to implement Addable. Correct? So, no "duck typing" (for better or worse).
Now, in C++ I can choose to specialize on a type that has no add():
template<>
MyX plus<MyX>(MyX a, MyX b) { return a + b; }
And even if all other types still can use the "default" implementation, now I added a special one for MyX -- with no runtime overhead.
Is there any Java Generics mechanism that has the same purpose? Of course, in programming everything is doable, but I mean conceptually, without any tricks and magic?
No, generics in Java don't work this way.
With generics you can't do anything which would not be possible without generics. You just avoid having to write lots of casts, and the compiler ensures that everything is type-safe (as long as you don't get warnings, or suppress them).
So, for each type variable you can only call the methods defined in its bounds (no duck typing).
Also, there is no code generation (apart from some adapter methods to delegate to methods with other parameter types for the purpose of implementing generic types). Assume you had something like this
/**
* interface for objects who allow adding some other objects
*/
interface Addable<T> {
/** returns the sum of this object and another object. */
T plus(T summand);
}
Then we could create our sum method with two arguments:
public static <T extends Addable<T>> T sum(T first, T second) {
return first.plus(second);
}
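The static method is compiled to the same bytecode as this (with the generic type information kept only as additional metadata in the class file):
public static Addable sum(Addable first, Addable second) {
    // plus() is erased to return Object, so the compiler inserts a cast.
    return (Addable) first.plus(second);
}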
This is called type erasure.
Now this method can be called for every pair of two elements of an addable type, like this one:
public class Integer implements Addable<Integer> {
public Integer plus(Integer that) {
return new Integer(this.value + that.value);
}
// private implementation details omitted
}
What happens here is that the compiler creates an additional synthetic method like this:
public Object plus(Object that) {
return this.plus((Integer)that);
}
This method will only be called by generic code with the right types. The compiler guarantees this, assuming you are not doing some unsafe casts somewhere; if you are, the (Integer) cast here will catch the mistake (and throw a ClassCastException).
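You can observe the synthetic method yourself with reflection. The sketch below is an illustration only: Money and BridgeDemo are made-up names, and Addable is redeclared so the example is self-contained. Method.isBridge() is the standard reflection call that reports compiler-generated bridge methods.

```java
import java.lang.reflect.Method;

interface Addable<T> {
    T plus(T summand);
}

// Hypothetical Addable implementation, used only for this demonstration.
class Money implements Addable<Money> {
    final int cents;
    Money(int cents) { this.cents = cents; }
    public Money plus(Money that) { return new Money(this.cents + that.cents); }
}

class BridgeDemo {
    // Counts the compiler-generated bridge methods named "plus" declared on Money.
    static int countBridgePlusMethods() {
        int count = 0;
        for (Method m : Money.class.getDeclaredMethods()) {
            if (m.getName().equals("plus") && m.isBridge()) {
                count++;
            }
        }
        return count;
    }

    // Going through the raw type bypasses the compile-time check;
    // the cast inside the bridge method then fails at run time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawCallFailsWithClassCastException() {
        Addable raw = new Money(1);
        try {
            raw.plus("not money");
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(countBridgePlusMethods());
        System.out.println(rawCallFailsWithClassCastException());
    }
}
```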
The sum method now always calls the plus method of the first object; there is no way around this. There is no code generated for every possible type argument (this is the key difference between Java generics and C++ templates), so we can't simply replace one of the generated methods with a specialized one.
Of course, you can create a second sum method like irreputable proposed (with overloading), but this will only be selected if you use the MyX type directly in source code, not when you are calling the sum method from some other generic code which happens to be parametrized with MyX, like this:
public static <T extends Addable<T>> T product(int times, T factor) {
    T result = factor;
    // Add factor to itself until it has been counted `times` times.
    while (times > 1) {
        result = sum(result, factor);
        times--;
    }
    return result;
}
Now product(5, new MyX(...)) will call our sum(T,T) method (which in turn calls the plus method), not any overloaded sum(MyX, MyX) method.
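Here is a small self-contained sketch of that behaviour. MyX's contents, the lastCalled field and the class name OverloadDemo are assumptions made for the demonstration; the field exists only so you can see which sum was chosen.

```java
interface Addable<T> {
    T plus(T summand);
}

// Hypothetical MyX: the answer does not say what it contains.
class MyX implements Addable<MyX> {
    final int value;
    MyX(int value) { this.value = value; }
    public MyX plus(MyX that) { return new MyX(this.value + that.value); }
}

class OverloadDemo {
    // Records which overload ran last; for demonstration only.
    static String lastCalled = "";

    static <T extends Addable<T>> T sum(T first, T second) {
        lastCalled = "generic";
        return first.plus(second);
    }

    // The "specialized" overload, chosen only when the static types are MyX.
    static MyX sum(MyX first, MyX second) {
        lastCalled = "specific";
        return new MyX(first.value + second.value);
    }

    // Inside generic code, T is only known to be an Addable<T>,
    // so overload resolution can only pick the generic sum.
    static <T extends Addable<T>> T product(int times, T factor) {
        T result = factor;
        for (int i = 1; i < times; i++) {
            result = sum(result, factor);
        }
        return result;
    }

    public static void main(String[] args) {
        sum(new MyX(1), new MyX(2));
        System.out.println(lastCalled);
        product(3, new MyX(2));
        System.out.println(lastCalled);
    }
}
```

The choice is made once, at compile time, from the static types at the call site; the run-time class of the arguments plays no part in it.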
(JDK 7 adds a new dynamic method dispatch mode which allows specialization by every argument on run time, but this is not used by the Java language, only intended to be used by other JVM-based languages.)
no - but your particular problem is more of an overloading issue.
There's no problem to define 2 plus methods like these
<T extends Addable>
T plus(T a, T b) { .. }
MyX plus(MyX a, MyX b) { .. }
This works even if MyX is an Addable; javac knows that the 2nd plus is more specific than the 1st plus, so when you call plus with two MyX args, the 2nd plus is chosen. In a sense Java does allow "specialized" version of methods:
f(T1, T2, .. Tn)
f(S1, S2, .. Sn)
works great if each Si is a subtype of Ti
For generic classes, we can do
class C<T extends Number> { ... }
class C_Integer extends C<Integer>{ ... }
caller must use C_Integer instead of C<Integer> to pick the "specialized" version.
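A rough sketch of the subclass approach, with made-up names (Adder stands in for C and IntegerAdder for C_Integer; the methods are invented too). It shows that the "specialized" code only runs when the caller actually constructs the subclass.

```java
// Stand-in for C<T extends Number>: the general implementation.
class Adder<T extends Number> {
    double total(T a, T b) {
        return a.doubleValue() + b.doubleValue();
    }
    String describe() { return "generic"; }
}

// Stand-in for C_Integer: overrides the behaviour for Integer only.
class IntegerAdder extends Adder<Integer> {
    @Override
    double total(Integer a, Integer b) {
        return a + b; // plain int arithmetic, widened to double on return
    }
    @Override
    String describe() { return "integer-specific"; }
}

class SubclassDemo {
    // new Adder<Integer>() does not magically become an IntegerAdder.
    static String viaGenericClass() {
        Adder<Integer> adder = new Adder<Integer>();
        return adder.describe();
    }

    // The caller has to name the subclass to get the special version.
    static String viaSubclass() {
        Adder<Integer> adder = new IntegerAdder();
        return adder.describe();
    }

    public static void main(String[] args) {
        System.out.println(viaGenericClass());
        System.out.println(viaSubclass());
    }
}
```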
On duck typing: Java is more strict in static typing - unless it is a Duck, it is not a duck.
HI,
Java generics are different from C++ templates.
Example:
Java code:
public <T> T sum(T a, T b) {
T newValue = a.sum(b);
return newValue;
}
In Java this code doesn't work, because an unbounded type parameter is treated as java.lang.Object, so you can only use the methods of that class.
You can write the method like this:
public <T extends Number> T sum(T a, T b) {
T newValue = a.sum(b);
return newValue;
}
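In this case the bound of the generic type is java.lang.Number, so you can pass Integer, Double, Long, etc.
Which methods you can call depends on what java.lang.Number declares. Note that Number itself declares no sum method, so a real version would need a bound type that does declare one.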
Bye
Java has generics and C++ provides a very strong programming model with templates.
So then, what is the difference between C++ and Java generics?
There is a big difference between them. In C++ you don't have to specify a class or an interface for the generic type. That's why you can create truly generic functions and classes, with the caveat of a looser typing.
template <typename T> T sum(T a, T b) { return a + b; }
The method above adds two objects of the same type, and can be used for any type T that has the "+" operator available.
In Java you have to specify a type if you want to call methods on the objects passed, something like:
<T extends Something> T sum(T a, T b) { return a.add ( b ); }
In C++ generic functions/classes can only be defined in headers, since the compiler generates different functions for different types (that it's invoked with). So the compilation is slower. In Java the compilation doesn't have a major penalty, but Java uses a technique called "erasure" where the generic type is erased at runtime, so at runtime Java is actually calling ...
Something sum(Something a, Something b) { return a.add ( b ); }
Nevertheless, Java's generics help with type-safety.
Java Generics are massively different to C++ templates.
In C++, templates are basically a glorified preprocessor/macro set (Note: since some people seem unable to comprehend an analogy, I'm not saying template processing is a macro). In Java they are basically syntactic sugar to minimize boilerplate casting of Objects. Here is a pretty decent introduction to C++ templates vs Java generics.
To elaborate on this point: when you use a C++ template, you're basically creating another copy of the code, just as if you used a #define macro. This allows you to do things like have int parameters in template definitions that determine sizes of arrays and such.
Java doesn't work like that. In Java all objects extend java.lang.Object so, pre-generics, you'd write code like this:
public class PhoneNumbers {
private Map phoneNumbers = new HashMap();
public String getPhoneNumber(String name) {
return (String) phoneNumbers.get(name);
}
}
because all the Java collection types used Object as their base type so you could put anything in them. Java 5 rolls around and adds generics so you can do things like:
public class PhoneNumbers {
private Map<String, String> phoneNumbers = new HashMap<String, String>();
public String getPhoneNumber(String name) {
return phoneNumbers.get(name);
}
}
And that's all Java Generics are: wrappers for casting objects. That's because Java Generics aren't reified. They use type erasure. This decision was made because Java Generics came along so late in the piece that they didn't want to break backward compatibility (a Map<String, String> is usable whenever a Map is called for). Compare this to .Net/C# where type erasure isn't used, which leads to all sorts of differences (e.g. you can use primitive types and IEnumerable and IEnumerable<T> bear no relation to each other).
And a class using generics compiled with a Java 5+ compiler is usable on JDK 1.4 (assuming it doesn't use any other features or classes that require Java 5+).
That's why Java Generics are called syntactic sugar.
But this decision on how to do generics has profound effects so much so that the (superb) Java Generics FAQ has sprung up to answer the many, many questions people have about Java Generics.
C++ templates have a number of features that Java Generics don't:
Use of primitive type arguments.
For example:
template<class T, int i>
class Matrix {
T data[i][i];
...
}
Java does not allow the use of primitive type arguments in generics.
Use of default type arguments, which is one feature I miss in Java but there are backwards compatibility reasons for this;
Java allows bounding of arguments.
For example:
public class ObservableList<T extends List> {
...
}
It really does need to be stressed that template invocations with different arguments really are different types. They don't even share static members. In Java this is not the case.
Aside from the differences with generics, for completeness, here is a basic comparison of C++ and Java (and another one).
And I can also suggest Thinking in Java. As a C++ programmer a lot of the concepts like objects will be second nature already but there are subtle differences so it can be worthwhile to have an introductory text even if you skim parts.
A lot of what you'll learn when learning Java is all the libraries (both standard--what comes in the JDK--and nonstandard, which includes commonly used things like Spring). Java syntax is more verbose than C++ syntax and doesn't have a lot of C++ features (e.g. operator overloading, multiple inheritance, the destructor mechanism, etc) but that doesn't strictly make it a subset of C++ either.
C++ has templates. Java has generics, which look kinda sorta like C++ templates, but they're very, very different.
Templates work, as the name implies, by providing the compiler with a (wait for it...) template that it can use to generate type-safe code by filling in the template parameters.
Generics, as I understand them, work the other way around: the type parameters are used by the compiler to verify that the code using them is type-safe, but the resulting code is generated without types at all.
Think of C++ templates as a really good macro system, and Java generics as a tool for automatically generating typecasts.
Another feature that C++ templates have that Java generics don't is specialization. That allows you to have a different implementation for specific types. So you can, for example, have a highly optimized version for an int, while still having a generic version for the rest of the types. Or you can have different versions for pointer and non-pointer types. This comes in handy if you want to operate on the dereferenced object when handed a pointer.
There is a great explanation of this topic in Java Generics and Collections by Maurice Naftalin and Philip Wadler. I highly recommend this book. To quote:
Generics in Java resemble templates in C++. ... The syntax is deliberately similar and the semantics are deliberately different. ... Semantically, Java generics are defined by erasure, whereas C++ templates are defined by expansion.
Please read the full explanation here.
Basically, AFAIK, C++ templates create a copy of the code for each type, while Java generics use exactly the same code.
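A quick way to see the "exactly the same code" part for yourself (the class name ErasureDemo is just for this example): two lists with different type arguments report the very same runtime class.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both lists are plain ArrayList objects at run time;
    // the type arguments exist only for the compiler.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
    }
}
```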
Yes, you can say that C++ template is equivalent to Java generic concept ( although more properly would be to say Java generics are equivalent to C++ in concept )
If you are familiar with C++'s template mechanism, you might think that generics are similar, but the similarity is superficial. Generics do not generate a new class for each specialization, nor do they permit “template metaprogramming.”
from: Java Generics
Java (and C#) generics seem to be a simple run-time type substitution mechanism.
C++ templates are a compile-time construct which give you a way to modify the language to suit your needs. They are actually a purely-functional language that the compiler executes during a compile.
The answer below is from the book Cracking The Coding Interview Solutions to Chapter 13, which I think is very good.
The implementation of Java generics is rooted in an idea of "type erasure." This technique eliminates the parameterized types when source code is translated to the Java Virtual Machine (JVM) bytecode. For example, suppose you have the Java code below:
Vector<String> vector = new Vector<String>();
vector.add(new String("hello"));
String str = vector.get(0);
During compilation, this code is re-written into:
Vector vector = new Vector();
vector.add(new String("hello"));
String str = (String) vector.get(0);
The use of Java generics didn't really change much about our capabilities; it just made things a bit prettier. For this reason, Java generics are sometimes called "syntactic sugar."
This is quite different from C++. In C++, templates are essentially a glorified macro set, with the compiler creating a new copy of the template code for each type. Proof of this is in the fact that an instance of MyClass<Foo> will not share a static variable with MyClass<Bar>. Two instances of MyClass<Foo>, however, will share a static variable.
/*** MyClass.h ***/
template<class T> class MyClass {
public:
    static int val;
    MyClass(int v) { val = v; }
};
/*** MyClass.cpp ***/
template<typename T>
int MyClass<T>::val;
template class MyClass<Foo>;
template class MyClass<Bar>;
/*** main.cpp ***/
MyClass<Foo> * foo1 = new MyClass<Foo>(10);
MyClass<Foo> * foo2 = new MyClass<Foo>(15);
MyClass<Bar> * bar1 = new MyClass<Bar>(20);
MyClass<Bar> * bar2 = new MyClass<Bar>(35);
int f1 = foo1->val; // will equal 15
int f2 = foo2->val; // will equal 15
int b1 = bar1->val; // will equal 35
int b2 = bar2->val; // will equal 35
In Java, static variables are shared across instances of MyClass, regardless of the different type parameters.
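For comparison, a Java sketch of the same experiment (Counter and StaticDemo are made-up names for this illustration). There is only one Counter class at run time, so there is only one static field, whatever type arguments are used.

```java
// Hypothetical generic class with a static field.
class Counter<T> {
    static int created = 0;

    Counter() {
        created++;
    }
}

class StaticDemo {
    // Creates instances with two different type arguments and
    // returns how much the single shared counter advanced.
    static int createMixed() {
        int before = Counter.created;
        new Counter<String>();
        new Counter<Integer>();
        return Counter.created - before;
    }

    public static void main(String[] args) {
        System.out.println(createMixed());
    }
}
```

Note that the field is accessed as Counter.created; Java does not even allow writing Counter<String>.created.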
Java generics and C++ templates have a number of other differences. These include:
C++ templates can use primitive types, like int. Java cannot and must instead use Integer.
In Java, you can restrict the template's type parameters to be of a certain type. For instance, you might use generics to implement a CardDeck and specify that the type parameter must extend from CardGame.
In C++, the type parameter can be instantiated, whereas Java does not support this.
In Java, the type parameter (i.e., the Foo in MyClass<Foo>) cannot be used for static methods and variables, since these would be shared between MyClass<Foo> and MyClass<Bar>. In C++, these classes are different, so the type parameter can be used for static methods and variables.
In Java, all instances of MyClass, regardless of their type parameters, are the same type. The type parameters are erased at runtime. In C++, instances with different type parameters are different types.
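On the instantiation point: since new T() does not compile in Java, a common workaround is to have the caller hand in a factory. The sketch below assumes Java 8 or later for java.util.function.Supplier; Pool and PoolDemo are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical generic class that needs to create instances of T.
class Pool<T> {
    private final Supplier<T> factory;

    Pool(Supplier<T> factory) {
        this.factory = factory;
    }

    // "new T()" is not allowed here; the supplied factory stands in for it.
    List<T> create(int count) {
        List<T> items = new ArrayList<T>();
        for (int i = 0; i < count; i++) {
            items.add(factory.get());
        }
        return items;
    }
}

class PoolDemo {
    public static void main(String[] args) {
        // The constructor reference tells the pool how to build its elements.
        Pool<StringBuilder> pool = new Pool<StringBuilder>(StringBuilder::new);
        System.out.println(pool.create(3).size());
    }
}
```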
Another advantage of C++ templates is specialization.
template <typename T> T sum(T a, T b) { return a + b; }
template <typename T> T sum(T* a, T* b) { return (*a) + (*b); }
Special sum(const Special& a, const Special& b) { return a.plus(b); }
Now, if you call sum with pointers, the second method will be called, if you call sum with non-pointer objects the first method will be called, and if you call sum with Special objects, the third will be called. I don't think that this is possible with Java.
I will sum it up in a single sentence: templates create new types, generics restricts existing types.
@Keith:
That code is actually wrong. Apart from the smaller glitches (template omitted, specialization syntax looks different), partial specialization doesn't work on function templates, only on class templates. The code would however work without partial template specialization, using plain old overloading instead:
template <typename T> T sum(T a, T b) { return a + b; }
template <typename T> T sum(T* a, T* b) { return (*a) + (*b); }
Templates are nothing but a macro system. Syntax sugar. They are fully expanded before actual compilation (or, at least, compilers behave as if it were the case).
Example:
Let's say we want two functions. One function takes two sequences (lists, arrays, vectors, whatever goes) of numbers, and returns their inner product. Another function takes a length, generates two sequences of that length, passes them to the first function, and returns its result. The catch is that we might make a mistake in the second function, so that these two sequences aren't really of the same length. We need the compiler to warn us in this case. Not when the program is running, but when it's compiling.
In Java you can do something like this:
import java.io.*;
interface ScalarProduct<A> {
public Integer scalarProduct(A second);
}
class Nil implements ScalarProduct<Nil>{
Nil(){}
public Integer scalarProduct(Nil second) {
return 0;
}
}
class Cons<A extends ScalarProduct<A>> implements ScalarProduct<Cons<A>>{
public Integer value;
public A tail;
Cons(Integer _value, A _tail) {
value = _value;
tail = _tail;
}
public Integer scalarProduct(Cons<A> second){
return value * second.value + tail.scalarProduct(second.tail);
}
}
class _Test{
public static Integer main(Integer n){
return _main(n, 0, new Nil(), new Nil());
}
public static <A extends ScalarProduct<A>>
Integer _main(Integer n, Integer i, A first, A second){
if (n == 0) {
return first.scalarProduct(second);
} else {
return _main(n-1, i+1,
new Cons<A>(2*i+1,first), new Cons<A>(i*i, second));
//the following line won't compile, it produces an error:
//return _main(n-1, i+1, first, new Cons<A>(i*i, second));
}
}
}
public class Test{
public static void main(String [] args){
System.out.print("Enter a number: ");
try {
BufferedReader is =
new BufferedReader(new InputStreamReader(System.in));
String line = is.readLine();
Integer val = Integer.parseInt(line);
System.out.println(_Test.main(val));
} catch (NumberFormatException ex) {
System.err.println("Not a valid number");
} catch (IOException e) {
System.err.println("Unexpected IO ERROR");
}
}
}
In C# you can write almost the same thing. Try to rewrite it in C++, and it won't compile, complaining about infinite expansion of templates.
I would like to quote askanydifference here:
The main difference between C++ and Java lies in their dependency on the platform. While, C++ is platform dependent language, Java is platform independent language.
The above statement is the reason why C++ is able to provide true generic types, while Java has strict checking and hence does not allow using generics the way C++ does.