Why does Java need interfaces and Smalltalk does not? - java

I have been programming in Smalltalk for some time, but I never really needed interfaces to implement anything. Then why can't languages such as Java get rid of interfaces? Is it only Smalltalk or is there another language which doesn't need interfaces?

Because Java is statically typed while Smalltalk is not. Interfaces don't serve any purpose when you don't declare types and your variables aren't going to be typechecked. But in a statically typed language like Java, they're extremely handy, because they let you have a variable whose type is defined by the methods the object implements instead of its class. It brings you a lot closer to the dynamic typing Smalltalk has natively without giving up the benefits of typechecking.

It is a polymorphism issue: in Java you have static types, and therefore the compiler needs to know which messages your object can answer. In Smalltalk (and other dynamically typed languages) you just need to implement the right methods to have polymorphism.
For instance:
In Java, to have cloneable objects you implement Cloneable and override
clone() as a public method. (Strictly speaking, Cloneable is a marker
interface and clone() is declared on Object, but the idea is the same for
any interface.) The compiler then knows your object will understand that
method; otherwise it reports an error.
In Smalltalk, you just need to implement the method #clone.
The compiler never knows or cares which messages your objects understand
until one is sent.
That also means you can have polymorphic objects without their being part of the same hierarchy. Multiple inheritance, mixins and other approaches (traits are present in Pharo) are just reuse techniques, not a design constraint.
This way of doing things is often called "duck typing"; see: http://en.wikipedia.org/wiki/Duck_typing
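To make the contrast concrete, here is a minimal Java sketch of what an interface buys you in a statically typed language. The names Speaker, Duck, Robot and InterfaceDemo are made up for illustration and do not come from the answer; the two implementing classes share no superclass other than Object.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical interface: the name Speaker and its method are illustrative only.
interface Speaker {
    String speak();
}

// Two classes with no common superclass other than Object.
class Duck implements Speaker {
    @Override
    public String speak() { return "Quack"; }
}

class Robot implements Speaker {
    @Override
    public String speak() { return "Beep"; }
}

class InterfaceDemo {
    // The parameter type is the interface, so the compiler checks that
    // every element understands speak() before the program runs.
    static String speakAll(List<Speaker> speakers) {
        StringBuilder out = new StringBuilder();
        for (Speaker s : speakers) {
            out.append(s.speak()).append(' ');
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(speakAll(Arrays.asList(new Duck(), new Robot())));
    }
}
```

In Smalltalk the same loop would work with any objects that respond to the message, with no declaration needed; the price is that a missing method is only discovered when the message is sent.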

Do you think there might be a useful role for "interfaces" in Smalltalk?
See - Adding Dynamic Interfaces to Smalltalk

Not sure what exactly you're asking (or rather, which question you most want answered), but have a look at Ruby. From my understanding it's much closer to Smalltalk than Java is.
If I were to answer the question about why Java needs interfaces, I guess I'd say that Java is a statically typed language, and taking that philosophy as far as Java does is what creates the need for interfaces. Effectively, interfaces try to give you something like multiple inheritance in Java without the multiple-inheritance issues that other languages face (C++, I believe).

Java has no need for interfaces. It is something the compiler chose to support rather than discard.
At runtime, interfaces cannot be enforced in any language because all dynamic objects are either 1. structs of pure state or 2. structs of pure state with first member being a pointer to a vtable mapping either integers to members(via array) or strings to members(being dictionary/hashmap). The consequence of this is that you can always change the values at indices of the vtable, or entries of the hashmap, or just change the vtable pointer to another vtable address, or just illegal access memory.
Smalltalk could have easily stored information given at compile time of your classes, and in a way it does, that's how intellisense in smalltalk browsers gives member suggestions, but this would not actually benefit smalltalk.
There are several issues with Smalltalk's syntax that limit the use of interfaces.
Smalltalk has only one primary type
This means that it can't just warn you if you try putting a square into a circle hole, there are no squares, there are no holes, everything is an object to the smalltalk compiler.
The compiler could choose to type deduce variables that were assigned, but smalltalk philosophically objects to doing so.
Smalltalk methods always take one argument
It might seem that myArray at: 1 put: 'hi' takes two arguments, but in reality you are calling the JavaScript equivalent of myArray['at:put:']([1, 'hi']), with myArray being an object (roughly a hashmap). The number of arguments thus cannot be checked without breaking the philosophy of Smalltalk.
There are workarounds Smalltalk could use to check the number of arguments, but they would not give much benefit.
Smalltalk exposes its compiler to the runtime, whereas Java tries very hard to hide the compiler from the runtime.
When you expose your compiler to the runtime, your language becomes a tad more fragile: the information the compiler used on one line may no longer be valid on another, because at runtime the information it relied on being fixed has changed. (All languages from assembly to JavaScript can expose their compiler to the runtime, but few make it an easily accessible part of the language; the more accessible the compiler is at runtime, the higher-level we consider the language to be.)
One consequence of this is that a class might have one interface at one point in the program, but halfway through, the user changes the class to have another interface. If the user wants to use this interface at compile time (after changing the class using code), the compiler needs to be much smarter to realize that the class that didn't support ".Greet()" now suddenly does, or no longer does, or that the methods ".Greet()" and ".Foo()" have been swapped around.
Interfaces are great at compile time, but completely unenforceable at runtime. This is great news for those who want to change the behavior of code without restarting the program, and horrible news for type safety purists: their ideals simply can't be enforced at runtime without manually checking every assertion at an interval.
Unlike C++, Smalltalk does not use arrays for vtables; instead it uses maps from strings to objects. This means that even if you do know that the method exists in the class you're calling, you cannot optimize it to a dispid so that future calls to this method use an array offset instead of hashing to find the method. To demonstrate this, let's use JavaScript:
The current smalltalk objects behave analogous to this:
var myIntVtbl = {
    '+': function(self, obj1) {
        return {lpVtbl: myIntVtbl, data: self.data + obj1.data};
    }
};
var myInt1 = {lpVtbl: myIntVtbl, data: 2};
var myInt2 = {lpVtbl: myIntVtbl, data: 5};
var myInt3 = myInt1['lpVtbl']['+'](myInt1, myInt2);
var myInt4 = myInt3['lpVtbl']['+'](myInt3, myInt3);
var myInt5 = myInt4['lpVtbl']['+'](myInt4, myInt4);
console.log(myInt5);
Each time we call +, we must hash '+' to get the member from the vtable dictionary. Java works similarly, which is why decompilers can tell the names of methods so easily.
One optimization that a compiler can do if it knows interfaces is to compile strings to dispids, like so:
var myIntVtbl = [
    function(self, obj1) {
        return {lpVtbl: myIntVtbl, data: self.data + obj1.data};
    }
];
var myInt1 = {lpVtbl: myIntVtbl, data: 2};
var myInt2 = {lpVtbl: myIntVtbl, data: 5};
var myInt3 = myInt1['lpVtbl'][0](myInt1, myInt2);
var myInt4 = myInt3['lpVtbl'][0](myInt3, myInt3);
var myInt5 = myInt4['lpVtbl'][0](myInt4, myInt4);
console.log(myInt5);
As far as I know, neither Java nor Smalltalk does this for classes, whereas C++ and C# (via the ComVisible attribute) do.
To summarize, Smalltalk could use interfaces, in turn becoming more like PHP, but it would get almost no benefit from them at compile time other than weak reassurances.
Likewise, Java doesn't need interfaces; it can literally work like Smalltalk, on the condition that you make the Java compiler more accessible to Java code. To get a feel for this, you can interoperate between Java and the Nashorn JavaScript engine that comes with all current Java kits and use its eval function as a runtime compiler. Java could easily get rid of interfaces, use reflective polymorphism and treat everything as Object, but talking to objects would be much more verbose unless the language let you index by string and overloaded the index operator for strings to find members dynamically.
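As a rough illustration of the "reflective polymorphism" mentioned above, the sketch below sends a message by name to objects that share no interface, using java.lang.reflect. The classes Cat and Person and the helper send are hypothetical names invented for this example; it is one possible way to do it, not a recommended design.

```java
import java.lang.reflect.Method;

// Two unrelated classes that share no interface; names are hypothetical.
class Cat {
    public String greet() { return "Meow"; }
}

class Person {
    public String greet() { return "Hello"; }
}

class ReflectiveSend {
    // Looks up a public no-argument method by name at run time,
    // loosely similar to a Smalltalk message send.
    static Object send(Object receiver, String selector) {
        try {
            Method m = receiver.getClass().getMethod(selector);
            return m.invoke(receiver);
        } catch (ReflectiveOperationException e) {
            // A missing method is only discovered here, at run time.
            throw new IllegalArgumentException(
                    receiver.getClass().getSimpleName() + " does not understand " + selector, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(send(new Cat(), "greet"));
        System.out.println(send(new Person(), "greet"));
    }
}
```

The verbosity the answer predicts is visible here: every call goes through a string lookup and returns Object, and the compiler can no longer tell you that a receiver lacks the method.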

Related

Why does java need type identifiers?

Since every class in java is a subclass of the Object, and variables in java are not objects themselves but instead are object references, why does java make type specification compulsory, when the Object type could be made implicit? The only time it seems necessary is when using the simple data types.
If a variable is of type Object, the compiler will not let you use the variable as any other type (unless you cast it).
This is called type safety.
For example:
Object str = "abc";
str.toUpperCase(); // Compiler error: Object has no toUpperCase() method
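For completeness, a small sketch of the cast mentioned above. The class CastDemo and the method shout are illustrative names only; the point is that the cast moves the check from compile time to run time.

```java
class CastDemo {
    static String shout(Object value) {
        // The cast tells the compiler to treat the reference as a String;
        // it is checked at run time and throws ClassCastException otherwise.
        String s = (String) value;
        return s.toUpperCase();
    }

    public static void main(String[] args) {
        Object str = "abc";
        // Calling toUpperCase() directly on str would not compile,
        // because the declared type Object has no such method.
        System.out.println(shout(str));
    }
}
```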
Well. Java does it because.. this is how the language was defined.
This boils down to what was considered good practice when the language was designed (nearly 20 years ago), and also to ease of compiler development.
Scala, a language which is closely related to Java (it runs on the same JVM), does not require explicit type identifiers in most cases.
The downside is that the Scala compiler is much slower (for this among other reasons).
The answer is probably that Java is an object-oriented language used in large projects and created by a reputable company. Such strong typing decreases the number of potential bugs, since many of them can be caught at compile time.
btw.
In .NET you have such a thing as var x, but it can be used only locally in a method body, and it is only compiler sugar for developers. So Java is just one example of a strongly typed language.
Java is a strongly typed language. The types are needed to be able to compile code and validate type compatibility at compile time.
Dynamically typed languages such as JavaScript do not attach types to variables. For example, when you say var x in JavaScript you just define a variable, and a value of any type may then be assigned to it. This means that if your code has a bug and you assign a string to this variable and then try to divide it by 2 (y = x / 2), the mistake only shows up at runtime (in JavaScript, as a NaN result rather than an error). Java will not allow you to compile such code.
There is the principle: the bug costs x during development, x*10 during QA and x*100 if it arrives to production. Compiler and strongly typed languages allow to decrease number of (stupid) bugs that arrive to QA and therefore make software development easier, faster and cheaper.

Why does erasure complicate implementing function types?

I read from an interview with Neal Gafter:
"For example, adding function types to the programming language is much more difficult with Erasure as part of Generics."
EDIT:
Another place where I've met similar statement was in Brian Goetz's message in Lambda Dev mailing list, where he says that lambdas are easier to handle when they are just anonymous classes with syntactic sugar:
But my objection to function types was not that I don't like function types -- I love function types -- but that function types fought badly with an existing aspect of the Java type system, erasure. Erased function types are the worst of both worlds. So we removed this from the design.
Can anyone explain these statements? Why would I need runtime type information with lambdas?
The way I understand it, is that they decided that thanks to erasure it would be messy to go the way of 'function types', e.g. delegates in C# and they only could use lambda expressions, which is just a simplification of single abstract method class syntax.
Delegates in C#:
public delegate void DoSomethingDelegate(Object param1, Object param2);
...
//now assign some method to the function type variable (delegate)
DoSomethingDelegate f = DoSomething;
f(new Object(), new Object());
(another sample here
http://geekswithblogs.net/joycsharp/archive/2008/02/15/simple-c-delegate-sample.aspx)
One argument they put forward in Project Lambda docs:
Generic types are erased, which would expose additional places where
developers are exposed to erasure. For example, it would not be
possible to overload methods m(T->U) and m(X->Y), which would be
confusing.
section 2 in:
http://cr.openjdk.java.net/~briangoetz/lambda/lambda-state-3.html
(The final lambda expressions syntax will be a bit different from the above document:
http://mail.openjdk.java.net/pipermail/lambda-dev/2011-September/003936.html)
(x, y) => { System.out.printf("%d + %d = %d%n", x, y, x+y); }
All in all, my best understanding is that only part of the syntax that could have been adopted will actually be used.
What Neal Gafter most likely meant was that not being able to use delegates will make standard APIs more difficult to adjust to functional style, rather than that javac/JVM update would be more difficult to be done.
If someone understands this better than me, I will be happy to read his account.
Goetz expands on the reasoning in State of the Lambda 4th ed.:
An alternative (or complementary) approach to function types,
suggested by some early proposals, would have been to introduce a new,
structural function type. A type like "function from a String and an
Object to an int" might be expressed as (String,Object)->int. This
idea was considered and rejected, at least for now, due to several
disadvantages:
It would add complexity to the type system and further mix structural and nominal types.
It would lead to a divergence of library styles—some libraries would continue to use callback interfaces, while others would use structural
function types.
The syntax could be unwieldy, especially when checked exceptions were included.
It is unlikely that there would be a runtime representation for each distinct function type, meaning developers would be further exposed to
and limited by erasure. For example, it would not be possible (perhaps
surprisingly) to overload methods m(T->U) and m(X->Y).
So, we have instead chosen to take the path of "use what you
know"—since existing libraries use functional interfaces extensively,
we codify and leverage this pattern.
To illustrate, here are some of the functional interfaces in Java SE 7
that are well-suited for being used with the new language features;
the examples that follow illustrate the use of a few of them.
java.lang.Runnable
java.util.concurrent.Callable
java.util.Comparator
java.beans.PropertyChangeListener
java.awt.event.ActionListener
javax.swing.event.ChangeListener
...
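With the lambda syntax that eventually shipped in Java 8, a few of the interfaces listed above can be used as in the sketch below. The class FunctionalInterfaceDemo and the method sortByLength are illustrative names, not taken from the quoted document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;

class FunctionalInterfaceDemo {
    // Sorts a copy by length; the lambda is converted to a Comparator<String>.
    static List<String> sortByLength(List<String> words) {
        Comparator<String> byLength = (a, b) -> Integer.compare(a.length(), b.length());
        List<String> copy = new ArrayList<>(words);
        copy.sort(byLength);
        return copy;
    }

    public static void main(String[] args) throws Exception {
        // The same lambda syntax targets different interfaces depending on context.
        Runnable task = () -> System.out.println("running");
        Callable<Integer> answer = () -> 6 * 7;

        task.run();
        System.out.println(answer.call());
        System.out.println(sortByLength(Arrays.asList("ccc", "a", "bb")));
    }
}
```

There is no function type in sight: each lambda takes its type from the nominal interface it is assigned to, which is the "use what you know" path the quote describes.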
Note that erasure is just one of the considerations. In general, the Java lambda approach goes in a different direction from Scala, not just on the typed question. It's very Java-centric.
Maybe because what you'd really want would be a type Function<R, P...>, which is parameterised with a return type and some sequence of parameter types. But because of erasure, you can't have a construct like P..., because it could only turn into Object[], which is too loose to be much use at runtime.
This is pure speculation. I am not a type theorist; i haven't even played one on TV.
I think what he means in that statement is that at runtime Java cannot tell the difference between these two function definitions:
void doIt(List<String> strings) {...}
void doIt(List<Integer> ints) {...}
Because the information about what type of data the List contains is erased at compile time, the runtime environment wouldn't be able to determine which function you wanted to call.
Trying to compile both of these methods in the same class produces the following error:
doIt(List<String>) clashes with doIt(List<Integer>); both methods have the same erasure
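A quick way to see erasure at work is to compare the runtime classes of two differently parameterised lists. This is a minimal sketch; ErasureDemo and sameRuntimeClass are names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both lists report the same runtime class because the type argument is erased.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
    }
}
```

Since both objects are plain ArrayList instances at run time, nothing is left that could distinguish a List<String> overload from a List<Integer> one.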

Why does MyClass.class exist in Java but MyField.field doesn't?

Let's say I have:
class A {
Integer b;
void c() {}
}
Why does Java have this syntax: A.class, and doesn't have a syntax like this: b.field, c.method?
Is there any use that is so common for class literals?
The A.class syntax looks like a field access, but in fact it is a result of a special syntax rule in a context where normal field access is simply not allowed; i.e. where A is a class name.
Here is what the grammar in the JLS says:
Primary:
    ParExpression
    NonWildcardTypeArguments (ExplicitGenericInvocationSuffix | this Arguments)
    this [Arguments]
    super SuperSuffix
    Literal
    new Creator
    Identifier { . Identifier }[ IdentifierSuffix]
    BasicType {[]} .class
    void.class
Note that there is no equivalent syntax for field or method.
(Aside: The grammar allows b.field, but the JLS states that b.field means the contents of a field named "field" ... and it is a compilation error if no such field exists. Ditto for c.method, with the addition that a field c must exist. So neither of these constructs mean what you want them to mean ... )
Why does this limitation exist? Well, I guess because the Java language designers did not see the need to clutter up the language syntax / semantics to support convenient access to the Field and Method objects. (See * below for some of the problems of changing Java to allow what you want.)
Java reflection is not designed to be easy to use. In Java, it is best practice to use static typing where possible. It is more efficient and less fragile. Limit your use of reflection to the few cases where static typing simply won't work.
This may irk you if you are used to programming to a language where everything is dynamic. But you are better off not fighting it.
Is there any use that is so common for class literals?
I guess, the main reason they supported this for classes is that it avoids programs calling Class.forName("some horrible string") each time you need to do something reflectively. You could call it a compromise / small concession to usability for reflection.
I guess the other reason is that the <type>.class syntax didn't break anything, because class was already a keyword. (IIRC, the syntax was added in Java 1.1.)
* If the language designers tried to retrofit support for this kind of thing there would be all sorts of problems:
The changes would introduce ambiguities into the language, making compilation and other parser-dependent tasks harder.
The changes would undoubtedly break existing code, whether or not method and field were turned into keywords.
You cannot treat b.field as an implicit object attribute, because it doesn't apply to objects. Rather, b.field would need to apply to field / attribute identifiers. But unless we make field a reserved word, we have the anomalous situation that you can create a field called field but you cannot refer to it in Java source code.
For c.method, there is the problem that there can be multiple visible methods called c. A second issue is that if there is a field called c and a method called c, then c.method could be a reference to a field called method on the object referred to by the c field.
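For reference, this is roughly what obtaining the Field and Method objects looks like today, starting from the class literal and using string names. The class A mirrors the one in the question; MemberLookup and describe are names invented for this sketch.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Same shape as the class in the question.
class A {
    Integer b;
    void c() {}
}

class MemberLookup {
    // Looks up the members by name; a typo in either string is only
    // detected at run time, which is the fragility discussed above.
    static String describe() {
        try {
            Field field = A.class.getDeclaredField("b");
            Method method = A.class.getDeclaredMethod("c");
            return field.getType().getSimpleName() + " " + field.getName()
                    + "; " + method.getReturnType().getSimpleName() + " " + method.getName() + "()";
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```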
I take it you want this info for logging and such. It is most unfortunate that such information is not available although the compiler has full access to such information.
With a little creativity you can get the information using reflection. I can't provide any examples, as there are few requirements to go on and I'm not in the mood to completely waste my time :)
I'm not sure if I fully understand your question. You are being unclear in what you mean by A.class syntax. You can use the reflections API to get the class from a given object by:
A a = new A();
Class<?> c = a.getClass();
or
Class c = A.class;
Then do some things using c.
The reflections API is mostly used for debugging tools, since Java has support for polymorphism, you can always know the actual Class of an object at runtime, so the reflections API was developed to help debug problems (sub-class given, when super-class behavior is expected, etc.).
The reason there is no b.field or c.method, is because they have no meaning and no functional purpose in Java. You cannot create a reference to a method, and a field cannot change its type at runtime, these things are set at compile-time. Java is a very rigid language, without much in the way of runtime-flexibility (unless you use dynamic class loading, but even then you need some information on the loaded objects). If you have come from a flexible language like Ruby or Javascript, then you might find Java a little controlling for your tastes.
However, having the compiler help you figure out potential problems in your code is very helpful.
In Java, not everything is an object.
You can have
A a = new A();
Class<?> cls = a.getClass();
or directly from the class
A.class
With this you get the object for the class.
With reflection you can get methods and fields but this gets complicated. Since not everything is an object. This is not a language like Scala or Ruby where everything is an object.
Reflection tutorial : http://download.oracle.com/javase/tutorial/reflect/index.html
BTW: You did not specify the public/private/protected , so by default your things are declared package private. This is package level protected access http://download.oracle.com/javase/tutorial/java/javaOO/accesscontrol.html

What features of Scala cannot be translated to Java?

The Scala compiler compiles directly to Java bytecode (or .NET CIL). Some of the features of Scala could be re-done in Java straightforwardly (e.g. simple for comprehensions, classes, translating anonymous/inner functions etc.). What are the features that cannot be translated that way?
That is presumably mostly of academic interest. More usefully, perhaps, what are the key features or idioms of Scala that YOU use that cannot be easily represented in Java?
Are there any the other way about? Things that can be done straightforwardly in Java that have no straightforward equivalent in Scala? Idioms in Java that don't translate?
This question, in my opinion, misses the point by asking us to compare JVM languages by looking at their generated bytecode.
Scala compiles to Java-equivalent bytecode. That is, the bytecode could have been generated by code written in Java. Indeed you can even get scalac to output an intermediate form which looks a lot like Java.
All features like traits (via static forwarders), non-local returns (via exceptions), lazy values (via references) etc are all expressible by a Java program, although possibly in a most-ugly manner!
But what makes scala scala and not Java is what scalac can do for you, before the bytecode is generated. What scalac has going for it, as a statically typed language, is the ability to check a program for correctness, including type correctness (according to its type system) at compile time.
The major difference, then, between Java and Scala (as of course Java is also statically typed) is Scala's type system, which is capable of expressing programmatic relations which Java-the-language's type system cannot. For example:
class Foo[M[_], A](m : M[A])
trait Bar[+A]
These concepts, that M is a type parameter which itself has type parameters or that Bar is covariant, just do not exist in Java-land.
Traits are one thing that does not have an equivalent. Traits are Interfaces with code in them. You can copy the code to all classes that have a trait mixed in, but that is not the same thing.
Also I believe scala type system is more complete. While it will eventually map to the JVM types (actually suffer erasure). You can express some things in the Scala type system that may not be possible in Java (like variances).
I think there is no equivalent for dynamically mixing in traits. In Scala you can mix in traits at the point where you create a new object.
For example, we create one dog which is hungry and thirsty and one dog which is just hungry.
val hungryThirstyDog = new Dog with Hungry with Thirsty
val onlyHungryDog = new Dog with Hungry
I don't know an equivalent way to do this in Java. In Java, the inheritance is statically defined.
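The closest Java approximation I can sketch uses interfaces with default methods (Java 8 and later) plus a local class. Dog, Hungry and Thirsty are stand-ins for the Scala types above, and MixinDemo and describe are invented names; the combination still has to be declared as a named class rather than written at the point of instantiation.

```java
class Dog {
    String name() { return "dog"; }
}

// Hypothetical stand-ins for the Scala traits; default methods need Java 8+.
interface Hungry {
    default String eat() { return "eating"; }
}

interface Thirsty {
    default String drink() { return "drinking"; }
}

class MixinDemo {
    static String describe() {
        // A local class declares the combination explicitly; an anonymous
        // class could not do this, since it can extend one class or
        // implement one interface, but not both.
        class HungryThirstyDog extends Dog implements Hungry, Thirsty {}

        HungryThirstyDog d = new HungryThirstyDog();
        return d.name() + " " + d.eat() + " " + d.drink();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Default methods cannot hold state, so this covers only part of what a Scala trait can do.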
Implicit conversions don't have a straightforward equivalent in Java.
One feature of scala that I have found a good use for is type reification through Manifests. Since the JVM strips out all type information from generics, scala allows you to conserve this information in variables. This is something that Java reflection AFAIK can't handle, since there are no arguments to types in the bytecode.
The case where I needed them was to pattern match on the element type of a List. That is, I had a VertexBuffer object which stored data on the GPU, and which could be constructed from a List of floats or integers. The Manifest code looked approximately like this:
class VertexBuffer[T](data: List[T])(implicit m: Manifest[T]) {
  m.toString match {
    case "Float" => ...
    case "Int" => ...
  }
}
This link links to a blog post with more information.
There are plenty of SO pages with more information too, like this one.
Three words: higher kinded types.
Your topic is not clear about whether you mean Java the JVM or Java the language. If you mean the JVM, the question makes no sense, as we all know Scala runs on the JVM.
Scala has a "native" support for XML. You can build the XML, find elements, match directly in the Scala code.
Examples: http://programming-scala.labs.oreilly.com/ch10.html

Can we take advantage of the type system to make programs more secure?

This question is inspired from Joel's "Making Wrong Code Look Wrong"
http://www.joelonsoftware.com/articles/Wrong.html
Sometimes you can use types to enforce semantics on objects beyond their interfaces. For example, the Java interface Serializable does not actually define methods, but the fact that an object implements Serializable says something about how it should be used.
Can we have UnsafeString and SafeString interfaces/subclasses in, say Java, that are used in much of the same way as Joel's Hungarian notation and Java's Serializable so that it doesn't just look bad--it doesn't compile?
Is this feasible in Java/C/C++ or are the type systems too weak or too dynamic?
Also, beyond input sanitization, what other security functions can be implemented in this manner?
The type system already enforces a huge number of such safety features. That is essentially what it's for.
For a very simple example, it prevents you from treating a float as an int. That's one aspect of safety -- it guarantees that the type you're working on are going to behave as expected. It guarantees that only string methods are called on a string. Assembly doesn't have that safeguard, for example.
It's also the job of the type system to ensure that you don't call private functions on a class. That's another safety feature.
Java's type system is too anemic to enforce a lot of interesting constraints effectively, but in many other languages (including C++), the type system can be used to enforce far more wide-ranging rules.
In C++, template metaprogramming gives you a lot of tools for prohibiting "bad" code. For example:
class myclass : boost::noncopyable {
...
};
enforces at compile-time that the class can not be copied. The following will produce compile errors:
myclass m;
myclass m2(m); // copy construction isn't allowed
myclass m3;
m3 = m; // assignment also not allowed
Likewise, we can ensure at compile time that a template function only gets called on types which fulfill certain criteria (say, they must be random-access iterators, while bidirectional ones aren't allowed; or they must be POD types; or they must not be any kind of integer type (char, short, int, long), while all other types should be legal).
A textbook example of template metaprogramming in C++ implements a library for computing physical units. It allows you to multiply a value of type "meter" with another value of the same type, and automatically determines that the result must be of type "square meter". Or divide a value of type "mile" with a value of type "hour" and get a unit of type "miles per hour".
Again, a safety feature that prevents you from getting your types mixed up and accidentally getting your units mixed up. You'll get a compile error if you compute a value and try to assign it to the wrong type. trying to divide, say, liters by meters^2 and assigning the result to a value of, say, kilograms, will result in a compile error.
Most of this requires some manual work to set up, certainly, but the language gives you the tools you need to basically build the type-checks you want. Some of this could be better supported directly in the language, but the more creative checks would have to be implemented manually in any case.
Yes, you can do such a thing. I don't know about Java, but in C++ it isn't customary and there is no support for this, so you have to do some manual work. It is customary in some other languages, Ada for example, which have the equivalent of a typedef that introduces a new type which can't be converted implicitly into the original one (this new type "inherits" some basic operations from the one it is created from, so it stays useful).
BTW, in general inheritance isn't a good way to introduce the new types, as even if there is no implicit conversion in one way, there is one in the other one.
You can do a certain amount of this out of the box in Ada. For example, you can make integer types that cannot implicitly interoperate with each other, and Ada enumerations are not compatible with any integer type. You can still convert between them, but you have to do it explicitly, which calls attention to what you are doing.
You could do the same with present-day C++, but you'd have to wrap all your integers and enums in classes, which is just way too much work for something that should be simple (or better yet, the default way of doing things).
I understand the next version of C++ is going to fix at least the enumeration issue.
In C++, I suppose you could use typedef to create a synonym for a primitive type. Your synonym could imply something about the content of that variable, replacing the function of the apps hungarian notation.
Intellisense will report the synonym you used during declaration, so if you don't like using actual hungarian, it does save you from scrolling about (or using Go To Definition).
I guess you are thinking of something along the lines of Perl's "tainting" analysis.
In Java, it should be possible to use custom annotations and an annotation processor to implement this. Not necessarily easy though.
You can't have an UnsafeString subclass of String in Java, since java.lang.String is final.
In general, you cannot provide any kind of security on the source level - if you want to protect against evil code, you must do that on the binary level (e.g. Java bytecode). That's why private/protected can't be used as a security mechanism in C++: it is possible to bypass that with pointer manipulations.
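Since String cannot be subclassed, one way to get the compile-time check the question asks for is to wrap strings in distinct types. The sketch below assumes hypothetical UnsafeString and SafeString wrapper classes and a toy HTML escaping routine; it guards against programmer mistakes, not against hostile code.

```java
// Hypothetical wrapper types; the names follow the question, the escaping is a toy.
final class UnsafeString {
    private final String raw;

    UnsafeString(String raw) { this.raw = raw; }

    // Sanitizing is the intended route from untrusted input to a SafeString.
    SafeString sanitize() {
        String escaped = raw.replace("&", "&amp;")
                            .replace("<", "&lt;")
                            .replace(">", "&gt;");
        return new SafeString(escaped);
    }
}

final class SafeString {
    private final String value;

    // Package-private: code in the same package could still bypass sanitize().
    SafeString(String value) { this.value = value; }

    String value() { return value; }
}

class Page {
    // Accepts only SafeString, so passing an UnsafeString fails to compile.
    static String render(SafeString body) {
        return "<p>" + body.value() + "</p>";
    }

    public static void main(String[] args) {
        UnsafeString input = new UnsafeString("<b>hi</b>");
        // Page.render(input) would be rejected by the compiler.
        System.out.println(render(input.sanitize()));
    }
}
```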
