Since every class in Java is a subclass of Object, and variables in Java are not objects themselves but object references, why does Java make type specification compulsory when the Object type could be made implicit? The only time it seems necessary is when using the primitive data types.
If a variable is of type Object, the compiler will not let you use the variable as any other type (unless you cast it).
This is called type safety.
For example:
Object str = "abc";
str.toUpperCase(); // Compiler error: toUpperCase() is not defined on Object
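For completeness, the cast mentioned above restores access to String's methods (and is checked at runtime):

String s = (String) str; // throws ClassCastException at runtime if str is not really a String
s.toUpperCase(); // compiles fine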
Well, Java does it because that is how the language was defined. It boils down to what was considered good practice when the language was designed (nearly 20 years ago), and also to ease of compiler development.
Scala, a language closely related to Java (it runs on the same JVM), does not require explicit type identifiers in most cases.
The downside is that the Scala compiler is much slower (for this reason, among others).
The answer is probably that Java is an object-oriented language used in large projects and created by a pragmatic company. Strong typing decreases the number of potential bugs, because many of them can be removed at compile time.
By the way, in .NET you have such a thing as var x, but it can be used only locally in a method body, and it is only compiler sugar for developers: the type is still inferred and fixed at compile time. So Java is only one example of a strongly typed language.
Java is a strongly typed language. The types are needed to compile the code and to validate type compatibility at compile time.
Dynamically typed languages do not declare types. For example, when you say var x in JavaScript you just define a variable; a value of any type may then be assigned to it. This means that if your code has a bug and you assign a string to this variable and then try to divide it by 2 (y = x / 2), the script will only misbehave at runtime. Java will not allow you to compile such code.
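A minimal Java illustration of that compile-time check:

String x = "abc";
int y = x / 2; // compile-time error: bad operand types for binary operator '/'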
There is a well-known principle: a bug costs x during development, 10x during QA, and 100x if it arrives in production. Compilers and strongly typed languages decrease the number of (trivial) bugs that reach QA, and therefore make software development easier, faster, and cheaper.
My question is: how does the class info get loaded at runtime?
When someone uses instanceof, is that considered RTTI or reflection? Or does it depend on the actual situation?
The term "RTTI" is a C++-specific term referring to the functionality of the core language that allows the program to determine the dynamic types of various objects at runtime. It usually refers to the dynamic_cast or typeid operators, along with the associated std::type_info object produced by typeid.
The term reflection, on the other hand, is a generic term used across programming languages to refer to the ability of a program to inspect and modify its objects, types, etc. at runtime.
The term I've heard applied to instanceof is type introspection and instanceof is sometimes referred to as object introspection, as the program is allowed to look at the running types to determine what course of action to take. I think this is a weaker term than reflection, as it doesn't allow for elaborate introspection on the fields or methods of an object, but I don't think it would be technically incorrect to call the use of the instanceof operator reflection.
As to your other question - how does class information get loaded at runtime? - that's really up to the JVM implementation. The ClassLoader type is ultimately responsible for loading classes into the system, but the JVM can interpret this however it wants to. I once built a prototype JVM in JavaScript, and internally all reflection calls just queried the underlying JS data structures I had in place to represent classes, fields, and methods. I would imagine that the HotSpot JVM does something totally different, but it's pretty much implementation-defined.
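To make the distinction concrete, here is a small Java sketch (the class name is illustrative): instanceof merely asks about the runtime type, while reflection inspects its structure.

import java.lang.reflect.Method;

public class IntrospectionDemo {
    public static void main(String[] args) {
        Object o = "hello";

        // Type introspection: ask whether the runtime type is String.
        if (o instanceof String) {
            System.out.println("o is a String");
        }

        // Reflection: inspect the runtime type's structure.
        Class<?> c = o.getClass();
        for (Method m : c.getDeclaredMethods()) {
            System.out.println(c.getSimpleName() + "." + m.getName());
        }
    }
}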
Hope this helps!
In short, the true difference between RTTI and reflection is that with RTTI, the compiler opens and examines the .class file at compile time. With reflection, the .class file is unavailable at compile time; it is opened and examined by the runtime environment.
I have been programming in Smalltalk for some time, but I never really needed interfaces to implement anything. Then why can't languages such as Java get rid of interfaces? Is it only Smalltalk or is there another language which doesn't need interfaces?
Because Java is statically typed while Smalltalk is not. Interfaces don't serve any purpose when you don't declare types and your variables aren't going to be typechecked. But in a statically typed language like Java, they're extremely handy, because they let you have a variable whose type is defined by the methods the object implements instead of its class. It brings you a lot closer to the dynamic typing Smalltalk has natively without giving up the benefits of typechecking.
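For example, with the standard List interface, the variable's type names the behavior rather than the concrete class:

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

List<String> names = new ArrayList<String>(); // typed by what it can do...
names.add("Alan");
names = new LinkedList<String>(names); // ...so any List implementation is acceptable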
It is a polymorphism issue: in Java you have static types, and therefore you need to know which messages your object can answer; in Smalltalk (and other non-static languages) you just need to implement the right methods to get polymorphism.
For instance:
In Java, you need to implement Cloneable to have cloneable objects, so that the compiler knows your object will understand the clone method (otherwise it reports an error). (Strictly speaking, Cloneable is a marker interface and clone() is inherited from Object, but the principle is the same for ordinary interfaces.)
In Smalltalk, you just need to implement the method #clone. The compiler never knows or cares which messages your objects understand until one is actually sent.
That also means you can have polymorphic objects without their being part of the same hierarchy; multiple inheritance, mixins, and other approaches (traits are present in Pharo) are just reuse techniques, not a design constraint.
This way of doing things is often called "duck typing"; see: http://en.wikipedia.org/wiki/Duck_typing
Do you think there might be a useful role for "interfaces" in Smalltalk?
See - Adding Dynamic Interfaces to Smalltalk
Not sure exactly what you're asking (or rather, which question you most want answered), but have a look at Ruby. From my understanding it's much closer to Smalltalk than Java is.
If I were to answer the question of why Java needs interfaces, I guess I'd say that Java is a statically typed language, and taking that philosophy as far as Java does is what creates the need for interfaces. Effectively, interfaces give you something like multiple inheritance without the multiple-inheritance issues that other languages face (C++, I believe).
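A minimal sketch of that idea (the types are illustrative): a Java class can implement any number of interfaces but extend only one class, which sidesteps the diamond problem.

interface Swimmer { void swim(); }
interface Flyer { void fly(); }

// Multiple interface inheritance, but only single implementation inheritance.
class Duck implements Swimmer, Flyer {
    public void swim() { System.out.println("paddling"); }
    public void fly() { System.out.println("flapping"); }
}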
Java has no intrinsic need for interfaces; they are something the language designers chose to support rather than discard.
At runtime, interfaces cannot be enforced in any language, because all dynamic objects are either (1) structs of pure state, or (2) structs of pure state whose first member is a pointer to a vtable mapping either integers to members (via an array) or strings to members (via a dictionary/hashmap). The consequence is that you can always change the values at indices of the vtable, or the entries of the hashmap, or simply repoint the vtable pointer at another vtable, or illegally access memory.
Smalltalk could easily have stored the information about your classes that is available at compile time, and in a way it does: that is how code completion in Smalltalk browsers gives member suggestions. But this would not actually benefit Smalltalk much.
There are several issues with Smalltalk's design that limit the use of interfaces.
Smalltalk has only one primary type
This means that it can't warn you if you try to put a square peg into a round hole: there are no squares and no holes; everything is just an object to the Smalltalk compiler.
The compiler could choose to type deduce variables that were assigned, but smalltalk philosophically objects to doing so.
Smalltalk methods always take one argument
It might seem like myArray at: 1 put: 'hi' has two arguments, but in reality you are calling the JavaScript equivalent of myArray['at:put:']([1, 'hi']), with myArray being an object (roughly a hashmap). The number of arguments thus cannot be checked without breaking the philosophy of Smalltalk.
There are workarounds Smalltalk could use to check the number of arguments, but they would not give much benefit.
Smalltalk exposes its compiler at runtime, whereas Java tries very hard to keep the compiler away from runtime.
When you expose your compiler at runtime (all languages from assembly to JavaScript can expose their compiler at runtime, but few make it an easily accessible part of the language; the more accessible the compiler is at runtime, the higher-level we consider the language to be), your language becomes a bit more fragile: the information the compiler relied on being fixed on one line may no longer be valid on another.
One consequence is that a class might have one interface at one point of the program, but halfway through the program the user changes the class to have another interface. If the user then wants to use that interface at compile time (after changing the class via code), the compiler needs to be much smarter to realize that the class that didn't support .Greet() now suddenly does, or no longer does, or that the methods .Greet() and .Foo() have been swapped around.
Interfaces are great at compile time, but completely unenforceable at runtime. This is great news for those who want to change the behavior of code without restarting the program, and horrible news for type-safety purists: their ideals simply can't be enforced at runtime without manually re-checking every assertion at an interval.
Unlike C++, Smalltalk does not use arrays for vtables; instead it uses maps from strings to objects. This means that even if you know the method exists in the class you're calling, you cannot optimize the call to a dispid so that future calls use an array offset instead of hashing to find the method. To demonstrate this, let's use JavaScript:
Current Smalltalk objects behave analogously to this:
// A "vtable" mapping selector strings to functions.
var myIntVtbl = {
    '+': function(self, obj1) {
        return {lpVtbl: myIntVtbl, data: self.data + obj1.data};
    }
};

// Each "object" is a struct: a vtable pointer plus its state.
var myInt1 = {lpVtbl: myIntVtbl, data: 2};
var myInt2 = {lpVtbl: myIntVtbl, data: 5};

// Every send has to hash the selector string '+' to find the method.
var myInt3 = myInt1['lpVtbl']['+'](myInt1, myInt2); // data: 7
var myInt4 = myInt3['lpVtbl']['+'](myInt3, myInt3); // data: 14
var myInt5 = myInt4['lpVtbl']['+'](myInt4, myInt4); // data: 28
console.log(myInt5);
Each time we call +, we must hash '+' to fetch the member from the vtable dictionary. Java works similarly, which is why decompilers can recover the names of methods so easily.
One optimization a compiler can perform if it knows the interfaces is to compile the strings down to dispids, like so:
// With a known interface, the vtable becomes an array and each
// selector compiles down to a fixed index (a dispid).
var myIntVtbl = [
    function(self, obj1) {
        return {lpVtbl: myIntVtbl, data: self.data + obj1.data};
    }
];

var myInt1 = {lpVtbl: myIntVtbl, data: 2};
var myInt2 = {lpVtbl: myIntVtbl, data: 5};

// Calls now use a cheap array offset (0) instead of hashing a string.
var myInt3 = myInt1['lpVtbl'][0](myInt1, myInt2);
var myInt4 = myInt3['lpVtbl'][0](myInt3, myInt3);
var myInt5 = myInt4['lpVtbl'][0](myInt4, myInt4);
console.log(myInt5);
As far as I know, neither Java nor Smalltalk does this for classes, whereas C++ and C# (via the ComVisible attribute) do.
To summarize, Smalltalk could use interfaces, in turn becoming more like PHP, but would get hardly any compile-time benefit from them beyond weak reassurances.
Likewise, Java doesn't strictly need interfaces: it could literally work like Smalltalk, on the condition that the Java compiler is exposed to Java code and made more accessible. To get a feel for this, you can interop between Java and the Nashorn JavaScript engine that ships with current JDKs and use its eval function as a runtime compiler. Java could get rid of interfaces and use reflective polymorphism, treating everything as Object, but it would be much more verbose to talk to objects without being able to index members by string and overload the index operator to find members dynamically.
The Scala compiler compiles directly to Java bytecode (or .NET CIL). Some of the features of Scala could be redone in Java straightforwardly (e.g. simple for comprehensions, classes, translating anonymous/inner functions, etc.). What are the features that cannot be translated that way?
That is presumably mostly of academic interest. More usefully, perhaps, what are the key features or idioms of Scala that YOU use that cannot be easily represented in Java?
Are there any the other way around? Things that can be done straightforwardly in Java that have no straightforward equivalent in Scala? Idioms in Java that don't translate?
This question, in my opinion, misses the point by asking us to compare JVM languages by looking at their generated bytecode.
Scala compiles to Java-equivalent bytecode. That is, the bytecode could have been generated by code written in Java. Indeed you can even get scalac to output an intermediate form which looks a lot like Java.
Features like traits (via static forwarders), non-local returns (via exceptions), lazy values (via references), etc., are all expressible by a Java program, although possibly in a most ugly manner!
But what makes Scala Scala and not Java is what scalac can do for you before the bytecode is generated. What scalac has going for it, as the compiler of a statically typed language, is the ability to check a program for correctness, including type correctness (according to its type system), at compile time.
The major difference between Java and Scala (as Java is of course also statically typed) is therefore Scala's type system, which is capable of expressing programmatic relations that java-the-language's type system cannot. For example:
class Foo[M[_], A](m : M[A])
trait Bar[+A]
These concepts, that M is a type parameter which itself has type parameters, or that Bar is covariant, just do not exist in Java-land.
Traits are one thing that does not have an equivalent. Traits are Interfaces with code in them. You can copy the code to all classes that have a trait mixed in, but that is not the same thing.
Also, I believe Scala's type system is more complete. While it will eventually map to JVM types (and actually suffers erasure), you can express some things in the Scala type system that are not possible in Java (like variance annotations).
I think there is no equivalent for dynamically mixing in traits. In Scala, at the time you create a new object, you can mix in some traits.
For example, we create one dog which is hungry and thirsty and one dog which is just hungry.
val hungryThirstyDog = new Dog with Hungry with Thirsty
val onlyHungryDog = new Dog with Hungry
I don't know an equivalent way to do this in Java; in Java, inheritance is statically defined.
Implicit conversions don't have a straightforward equivalent in Java.
One feature of Scala that I have found a good use for is type reification through Manifests. Since the JVM strips all type information from generics, Scala allows you to preserve this information in variables. This is something that Java reflection AFAIK can't handle, since type arguments are not present in the bytecode.
The case where I needed them was to pattern match on the element type of a List. That is, I had a VertexBuffer object which stored data on the GPU and could be constructed from a List of floats or integers. The Manifest code looked approximately like this:
class VertexBuffer[T](data: List[T])(implicit m: Manifest[T]) {
  m.toString match { // the Manifest carries T's erased type name to runtime
    case "Float" => // ... handle float data
    case "Int"   => // ... handle int data
  }
}
This link links to a blog post with more information.
There are plenty of SO pages with more information too, like this one.
Three words: higher kinded types.
Your topic is not clear about whether you mean Java the JVM or Java the language. If you mean the JVM, the question makes no sense, as we all know Scala runs on the JVM.
Scala has "native" support for XML. You can build XML, find elements, and match on it directly in Scala code.
Examples: http://programming-scala.labs.oreilly.com/ch10.html
This question is inspired from Joel's "Making Wrong Code Look Wrong"
http://www.joelonsoftware.com/articles/Wrong.html
Sometimes you can use types to enforce semantics on objects beyond their interfaces. For example, the Java interface Serializable does not actually define methods, but the fact that an object implements Serializable says something about how it should be used.
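For instance, a minimal sketch with an illustrative class:

import java.io.Serializable;

// No methods to implement, yet ObjectOutputStream will accept a Session
// and reject an otherwise identical class that lacks the marker.
class Session implements Serializable {
    private static final long serialVersionUID = 1L;
    String user;
}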
Can we have UnsafeString and SafeString interfaces/subclasses in, say, Java, that are used in much the same way as Joel's Hungarian notation and Java's Serializable, so that it doesn't just look bad, it doesn't compile?
Is this feasible in Java/C/C++ or are the type systems too weak or too dynamic?
Also, beyond input sanitization, what other security functions can be implemented in this manner?
The type system already enforces a huge number of such safety features. That is essentially what it's for.
For a very simple example, it prevents you from treating a float as an int. That's one aspect of safety: it guarantees that the types you're working with are going to behave as expected. It guarantees that only string methods are called on a string. Assembly doesn't have that safeguard, for example.
It's also the job of the type system to ensure that you don't call private functions on a class. That's another safety feature.
Java's type system is too anemic to enforce a lot of interesting constraints effectively, but in many other languages (including C++), the type system can be used to enforce far more wide-ranging rules.
In C++, template metaprogramming gives you a lot of tools for prohibiting "bad" code. For example:
#include <boost/noncopyable.hpp>

class myclass : private boost::noncopyable {
    // ...
};
enforces at compile-time that the class can not be copied. The following will produce compile errors:
myclass m;
myclass m2(m); // copy construction isn't allowed
myclass m3;
m3 = m; // assignment also not allowed
Likewise, we can ensure at compile time that a template function only gets called on types which fulfill certain criteria (say, they must be random-access iterators, while bidirectional ones aren't allowed; or they must be POD types; or they must not be any kind of integer type (char, short, int, long), while all other types are legal).
A textbook example of template metaprogramming in C++ implements a library for computing physical units. It allows you to multiply a value of type "meter" with another value of the same type, and automatically determines that the result must be of type "square meter". Or divide a value of type "mile" with a value of type "hour" and get a unit of type "miles per hour".
Again, this is a safety feature that prevents you from getting your types, and hence your units, mixed up. You'll get a compile error if you compute a value and try to assign it to the wrong type: trying to divide, say, liters by meters^2 and assigning the result to, say, kilograms will result in a compile error.
Most of this requires some manual work to set up, certainly, but the language gives you the tools you need to basically build the type-checks you want. Some of this could be better supported directly in the language, but the more creative checks would have to be implemented manually in any case.
Yes, you can do such a thing. I don't know about Java, but in C++ it isn't customary and there is no direct support for it, so you have to do some manual work. It is customary in some other languages, Ada for example, which have the equivalent of a typedef that introduces a new type which can't be converted implicitly into the original one (the new type "inherits" some basic operations from the one it is created from, so it stays useful).
BTW, in general inheritance isn't a good way to introduce such new types: even if there is no implicit conversion in one direction, there is one in the other.
You can do a certain amount of this out of the box in Ada. For example, you can make integer types that cannot implicitly interoperate with each other, and Ada enumerations are not compatible with any integer type. You can still convert between them, but you have to do it explicitly, which calls attention to what you are doing.
You could do the same with present-day C++, but you'd have to wrap all your integers and enums in classes, which is just way too much work for something that should be simple (or better yet, the default way of doing things).
I understand the next version of C++ is going to fix at least the enumeration issue.
In C++, I suppose you could use typedef to create a synonym for a primitive type. Your synonym could imply something about the content of that variable, replacing the function of the apps hungarian notation.
Intellisense will report the synonym you used during declaration, so if you don't like using actual hungarian, it does save you from scrolling about (or using Go To Definition).
I guess you are thinking of something along the lines of Perl's "tainting" analysis.
In Java, it should be possible to use custom annotations and an annotation processor to implement this. Not necessarily easy though.
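A sketch of the annotation half of that idea (the annotation name is illustrative; the processor, or a tool like the Checker Framework, would do the actual enforcement):

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Marks values that have passed sanitization; a processor would
// reject calls that pass unannotated strings where @Safe is required.
@Target({ElementType.PARAMETER, ElementType.FIELD})
@Retention(RetentionPolicy.CLASS)
@interface Safe {}

class Renderer {
    void write(@Safe String html) { /* render trusted markup */ }
}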
You can't have a UnsafeString subclass of String in Java, since java.lang.String is final.
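Since String is final, the usual workaround is distinct wrapper types; a minimal sketch (names are illustrative):

// Distinct wrapper types make the unsafe-to-safe conversion explicit.
final class UnsafeString {
    final String value;
    UnsafeString(String value) { this.value = value; }
}

final class SafeString {
    final String value;
    private SafeString(String value) { this.value = value; }

    // The only way to obtain a SafeString is through the sanitizer.
    static SafeString sanitize(UnsafeString raw) {
        return new SafeString(raw.value.replace("<", "&lt;").replace(">", "&gt;"));
    }
}

class Page {
    // Accepting only SafeString makes passing raw input a compile error.
    void write(SafeString s) { System.out.println(s.value); }
}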
In general, you cannot provide any kind of security on the source level - if you want to protect against evil code, you must do that on the binary level (e.g. Java bytecode). That's why private/protected can't be used as a security mechanism in C++: it is possible to bypass that with pointer manipulations.
My understanding is that C# and Java differ with respect to generics in some ways, one of which is that generic type parameters are available at runtime in C#/.NET but not in Java. Why did the Java language designers do it this way?
To allow binary compatibility with pre-generics bytecode, therefore allowing new code to interface with old code.
From the Type Erasure page of The Java Tutorials:
Type erasure enables Java applications that use generics to maintain binary compatibility with Java libraries and applications that were created before generics.
[...]
Type erasure exists so that new code may continue to interface with legacy code.
For a related question, take a look at C# vs Java generics.
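To see the erasure in action, a minimal sketch: both parameterizations share a single runtime class.

import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<String>();
        List<Integer> ints = new ArrayList<Integer>();
        // After erasure, both are plain ArrayList at runtime.
        System.out.println(strings.getClass() == ints.getClass()); // true
    }
}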
I remember reading something about this in the book Hardcore Java:
The problem with checking elements in a collection at runtime is that it is extremely expensive; the order of efficiency is only O(n). If you have only 10 addresses in your collection, checking elements is easy. However, if the collection contains 15,000 addresses, then you would incur a significant overhead whenever someone calls the setter.

On the other hand, if you can prevent users from placing anything other than an address in your collection at compile time, then you wouldn't have to check the types at runtime. If they try to give you something that isn't an address, then the compiler will reject the attempt. This is exactly what parameterized types do.
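A sketch of the contrast the book describes (class names are illustrative):

import java.util.List;

class Address { }

class AddressBook {
    // Pre-generics style: every element must be checked at runtime, O(n).
    void setEntriesChecked(List entries) {
        for (Object e : entries) {
            if (!(e instanceof Address)) {
                throw new IllegalArgumentException("not an Address");
            }
        }
    }

    // With generics the compiler rejects wrong element types; no runtime scan needed.
    void setEntries(List<Address> entries) { /* store them */ }
}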
However, "why" questions can never really be satisfactorily answered, because there are just too many variables involved: people, time, place, and politics. I remember reading somewhere else that the decision had a lot to do with maintaining compatibility with the way things were already being done in Java bytecode. Here is another quote from the same book:
After the compiler has resolved the type safety introduced by generics, it erases the parameterization from the type. Therefore, the information is not available at runtime. The purpose of erasure, as stated by Sun, is to allow class libraries built with an older version of the JDK to be able to run on the JDK 1.5 virtual machine.
I'm curious, what are the advantages offered by runtime generics?