How do generics work at compile time vs runtime in Java?

What actually happens in a generic class when we compile the file, and what happens at runtime? How does T behave at compile time vs at runtime? And what was the main purpose of introducing generics, since we can seemingly do the same thing with the Object class? I am very confused and have spent 3 months trying to understand this generics topic. Kindly, anyone here, explain it in detail. Thanks
// demo class; basically, what is happening here?
class Generic<T> {
    Generic() {
        T[] arr = (T[]) new Object[5];
    }

    public static void main(String[] args) {
        new Generic();
    }
} // class
// another demo class; let's say I have a Student class
class AnotherGeneric<T extends Student> {
    T fun() {
        T data = (T) new Object();
        return data;
    }

    public static void main(String[] args) {
        Student std = new AnotherGeneric<Student>().fun();
    }
} // class

Mostly, generics just disappear entirely at runtime. Generics is, in essence, "compiler checked documentation". It's a way to get both of these things at the same time:
You have a method that returns, say, a List. You'd like to document that the list only contains strings.
You'd like for the compiler to be aware of this and tell users who treat that list as if it contains something other than strings: "Hey there - hang on. I don't think you understand how this method works, given that you appear to be treating it as if it has non-strings in it, which it won't, as the documentation says that it won't." Or vice versa: "Hey there - hang on. You documented that the list you return only contains strings, but you appear to be attempting to stuff a number in there. That doesn't make sense. I shall not compile this inconsistency until you fix it."
And very much in last place, generics makes your code very slightly shorter, as the compiler will insert casts for you silently. When a method is documented to return a list of strings and you call .get(0) on the result, the compiler "pre-casts" it to a String for you.
That's it. It doesn't change anything at runtime. Those casts are even generated by the compiler.
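For example, a minimal sketch of that pre-cast (the class and method names are mine):

import java.util.List;

class PreCastDemo {
    static List<String> names() { return List.of("a", "b"); }

    public static void main(String[] args) {
        String first = names().get(0); // no cast written here...
        System.out.println(first);
    }
}

The marked line is compiled as if you had written (String) names().get(0): the class file contains a checkcast instruction even though the source does not.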
So, how does it work:
In signatures, generics are compiled into the class file, but the JVM treats these as effectively 'a comment' - the JVM completely ignores them. The point of this is solely for the benefit of javac, which can read these comments and act accordingly. In other words, the fact that ArrayList has generics needs to be known by javac in order to properly compile the line new ArrayList<String>() - and how does javac know? By checking the class file that contains the ArrayList code (a reflection sketch follows the list below). Signatures are:
The name of a class.
The extends and implements clauses of a class.
The type of every field, and the name of every field.
The return type of every method, and the name of every method.
The type (not name) of every parameter of a method.
The throws clause of a method.
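You can observe this retained signature information via reflection (a minimal sketch; the class and field names are mine):

import java.lang.reflect.Field;
import java.util.List;

class SignatureDemo {
    List<String> names; // the generic type is recorded in this field's signature

    public static void main(String[] args) throws Exception {
        Field f = SignatureDemo.class.getDeclaredField("names");
        System.out.println(f.getGenericType()); // prints: java.util.List<java.lang.String>
    }
}

The JVM never acts on that List<String>; it merely stores it so javac (and reflection) can read it back.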
Everywhere else, generics just disappear. So, if you write inside a method: List<String> list = new ArrayList<String>();, the code you end up with is JUST new ArrayList() in the class file. That <String> is just gone. It also explains why, given a List<?> x;, there is simply no way to ask this list: what is your component type? Because it is no longer available at runtime.
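A quick way to see that, as a sketch:

import java.util.ArrayList;
import java.util.List;

class ComponentTypeGone {
    public static void main(String[] args) {
        List<String> a = new ArrayList<>();
        List<Integer> b = new ArrayList<>();
        // At runtime both are plain ArrayList; the component types are gone.
        System.out.println(a.getClass() == b.getClass()); // true
    }
}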
Javac uses this information to figure out what to do.
For the purposes of compilation, ALL generics-typed stuff is compiled as if it were its upper bound (its erasure): <T extends Student> is compiled as if T were Student, and a plain <T> as if it were Object.
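Applied to the question's demo classes, this is roughly (a sketch, not the literal output; main methods omitted) what javac compiles them to:

class Generic {                           // plain <T>: T erases to Object
    Generic() {
        Object[] arr = new Object[5];     // the (T[]) cast erases to a no-op
    }
}

class AnotherGeneric {                    // <T extends Student>: T erases to Student
    Student fun() {
        Student data = (Student) new Object(); // checkcast Student: fails at runtime
        return data;
    }
}

Which also answers the second snippet: new AnotherGeneric<Student>().fun() compiles fine but throws a ClassCastException when run, because new Object() is not a Student.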
What about generic casts?
The closest other Java language feature to a generic cast is @SuppressWarnings. A generic cast does literally nothing. It's just you telling the compiler: shut up, I know what I'm doing (hence, you best really know what you are doing to use them!!).
For example, given:
void foo(List<?> x) {
    List<String> y = (List<String>) x;
}
The compiler does nothing here: there is no way for the compiler to generate code that actually checks whether x really is a List<String>. The above code cannot throw an exception, even if there are non-strings in that list. As I said before, generics also cause the compiler to inject casts. So, if you later write:
y.get(0).toLowerCase();
That will compile (there is no need to cast y.get(0) to String; however, it is compiled that way!) - and if you pass a list to this method that has a non-string object as its first item, that line throws a ClassCastException even though you didn't write any casts on that line. That's because the compiler inserted a cast for you.
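Putting those two facts together, a sketch (the list contents are mine):

import java.util.ArrayList;
import java.util.List;

class GenericCastDemo {
    static void foo(List<?> x) {
        @SuppressWarnings("unchecked")
        List<String> y = (List<String>) x;          // no check happens here
        System.out.println(y.get(0).toLowerCase()); // compiler-inserted checkcast fails here
    }

    public static void main(String[] args) {
        List<Object> list = new ArrayList<>();
        list.add(42); // an Integer sneaks in
        foo(list);    // ClassCastException at the get(0) line, not at the cast
    }
}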
Think about it like this: Generics are for linking types in signatures.
Imagine you want to write a method that logs its argument and then just returns it. That's all it does.
You want to now 'link' the type of the argument to the return type: you want to tell the compiler and all users of this method: whatever type you feed into this method is identical to the type that rolls out of it.
In normal java you cannot do this:
Object log(Object o) {
    log.mark("Logged: {}", o); // some logger call; the details don't matter here
    return o;
}
The above works fine but is annoying to use. I can't do this:
String y = scanner.next();
new URL(log(y)).openConnection();
The reason I can't do that, is the log(y) expression is of type Object, and the URL constructor requires a String. Us humans can clearly see that log(y) is obviously going to return a string, but the signature of the log method doesn't indicate this at all. We have to look at the implementation of log to know this, and perhaps tomorrow this implementation changes. The log method does not indicate that any future updates will continue to just 'return the parameter' like this. So javac does not let you write this code.
But now we add generics:
public <T> T log(T o) {
    log.mark("Logged: {}", o);
    return o;
}
And now it works fine. We've told the compiler that there exists a link between the 2 places we used T in this code: The caller gets to choose what T ends up being, and the compiler ensures that no matter what the caller chose, your code works.
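A usage sketch - the argument's type fixes T, with no casts anywhere:

String s  = log("hello"); // T inferred as String
Integer n = log(42);      // T inferred as Integer

The earlier new URL(log(y)).openConnection() example now compiles, because log(y) has type String.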
Hence, if you define a type parameter and use it exactly 0 or 1 times, it's virtually always either a bug or a weird hack. The point is to link things and '0 or 1 times' is obviously not linking things.
Generics goes much further than this; your question's scope is far too broad. If you want to know every detail, read the Java Language Specification, which gets into hopeless amounts of detail that will take you 6 months to even understand. There's no real point to this. You don't need to know the chemical composition of brake fluid to drive a car, either.

This is the way I was taught the importance of generics.
Imagine that you were blindfolded, then told to do some basic task, such as move boxes from one side of the room to the other. Now also imagine that the room is full of other blindfolded people doing exactly the same thing as you.
Programming without generics would be to tell all of these people to do their tasks, and then run the risk of them accidentally crashing into each other and damaging the boxes.
Programming with generics would be to sit down with each blindfolded person and give all of them a very specific plan beforehand. For example, tell one of them to go forward 10 feet, grab the box on the floor in front of them, turn 180 degrees, then go 10 feet, then put the box down. Then (and this is the important part) you draw a map of all of the plans and make sure that each of the blindfolded people's paths CANNOT cross each other. That is what generics give you. If you can prove that none of the paths cross each other, then it doesn't matter that they are blindfolded - they cannot bump into each other - by design!
Once you can prove that they cannot bump into each other, you can start doing something more complex, like telling one blindfolded person to hand a box to another blindfolded person. And if you get really good at it, you can have paths that actually do cross, but only one person is crossing the intersection at the time.
That is the power of generics in Java - you can perform unsafe actions safely by planning it all ahead of time - at compile time! Then, when you are at runtime, it doesn't matter that you are blind - you know exactly what to do, and you have proven that you cannot crash into anyone else. As a result, when you actually do the task, you don't slowly shuffle forwards, putting your hands in front of you, constantly checking in fear that you will bump into someone else. You sprint headfirst forwards, blindly, but confident that you cannot fail, because the entire path has been mapped out for you.
Now, I should mention, the only way Java generics work is by ensuring none of the paths cross. You do this by turning on warnings when you compile your java code. If you get warnings about unchecked or raw, then that means your code is not safe, and you need to fix your plan. Once you compile with no warnings related to generics, you can be certain that your types are safe and will not crash into each other unexpectedly.
And finally, generics are powerful, but they do not play well with nulls. If you let nulls sneak into your code, that is a blindspot which generics cannot protect you from. Be very certain to limit, if not remove, the nulls in your code, otherwise your generics may not be bulletproof. If you avoid nulls and compile without warnings, you can guarantee that your code will never run into a type error unexpectedly.
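A small sketch of that null blindspot:

List<String> names = new ArrayList<>();
names.add(null);             // compiles: null is assignable to every reference type
String first = names.get(0); // no ClassCastException...
int len = first.length();    // ...but a NullPointerException here

Generics checked every type along the way, yet the code still blows up: null slipped through the map.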

Related

Why does the functional API in Java not handle checked exceptions?

I have seen many times that using the functional API in Java is really verbose and error-prone when we have to deal with checked exceptions.
E.g. it's really convenient to write (and easier to read) code like
    var obj = Objects.requireNonNullElseGet(something, Other::get);
Indeed, it also avoids improper multiple invocations of getters, as in
    var obj = something.get() != null ? something.get() : other.get();
    //        ^^^^ first ^^^^           ^^^^ second ^^^^
BUT everything becomes a jungle when you have to deal with checked exceptions, and I have sometimes seen this really ugly code style:
try {
    Objects.requireNonNullElseGet(obj, () -> {
        try {
            return invokeMethodWhichThrows();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    });
} catch (RuntimeException r) {
    Throwable cause = r.getCause();
    if (cause == null)
        throw r;
    else
        throw cause;
}
whose only intent is to handle checked exceptions the way you would when writing code without lambdas. Now, I know that those cases can be better expressed with the ternary operator and a variable to hold the result of something.get(), but that's also the case for Objects.requireNonNullElse(a, b), which is there, in the java.util package of the JDK.
The same can be said for logging frameworks' methods which take Suppliers as parameters and evaluate them only if needed; BUT if you need to handle checked exceptions in those suppliers, you have to invoke them and explicitly check the log level:
if (LOGGER.isDebugEnabled())
    LOGGER.debug("request from " + resolveIPOrThrow());
Similar reasoning applies to Futures, but let me move on.
My question is: why does the functional API in Java not handle checked exceptions?
For example, having something like the ThrowingSupplier interface below could fit the need of dealing with checked exceptions, guarantee type consistency, and improve code readability.
interface ThrowingSupplier<O, T extends Exception> {
    O get() throws T;
}
Then we would need to duplicate the methods that use Suppliers to add overloads that use ThrowingSuppliers and throw exceptions. But we as Java developers are used to this kind of duplication (as with Stream, IntStream, LongStream, or methods with overloads to handle int[], char[], long[], byte[], ...), so it's nothing too strange for us.
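To make that concrete, here is a sketch of one such hypothetical overload (ObjectsEx is a made-up holder class, building on the ThrowingSupplier above; it is not a JDK API):

final class ObjectsEx {
    static <O, T extends Exception> O requireNonNullElseGet(O obj, ThrowingSupplier<? extends O, T> supplier) throws T {
        return obj != null ? obj : java.util.Objects.requireNonNull(supplier.get());
    }
}

A caller could then write ObjectsEx.requireNonNullElseGet(obj, () -> invokeMethodWhichThrows()) and handle the checked exception normally, without the wrap/unwrap dance above.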
I would really appreciate it if someone who has deep knowledge of the JDK could explain why checked exceptions were excluded from the functional API, and whether there was a way to incorporate them.
This question can be interpreted as 'why did those who made this decision decide it this way', which is asking: "Please summarize 5 years of serious debate - specifically what Brian Goetz and co thought about it" - which is impossible, unless your name is Brian Goetz. He does not answer questions on SO as far as I know. You can go spelunking in the archives of the lambda-dev mailing list if you want.
One could make an informed guess, though.
In-scope vs Beyond-scope
There are 3 transparencies that lambdas do not have:
Control flow.
Checked exceptions.
Mutable local variables.
Control flow transparency
Take this code, as an example:
private Map<String, PhoneNumber> phonebook = ...;

public PhoneNumber findPhoneNumberOf(String personName) {
    phonebook.entrySet().stream().forEach(entry -> {
        if (entry.getKey().equals(personName)) return entry.getValue();
    });
    return null;
}
This code is silly (why not just do a .get, or, if we must stream through the thing, why not use .filter and .findFirst? - a working version is sketched below), but even if you look past that, it doesn't work: you cannot return from the method within that lambda. That return statement returns from the lambda (and is thus a compiler error: the lambda you pass to forEach returns void). You can't continue or break a loop that is outside the lambda from inside it, either.
Contrast to a for loop that can do it just fine:
for (var entry : phonebook.entrySet()) {
    if (entry.getKey().equals(personName)) return entry.getValue();
}
return null;
does exactly what you think, and works fine.
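(As an aside, the .filter/.findFirst version alluded to above does compile; a sketch, assuming java.util.Map is imported:

public PhoneNumber findPhoneNumberOf(String personName) {
    return phonebook.entrySet().stream()
            .filter(entry -> entry.getKey().equals(personName))
            .map(Map.Entry::getValue)
            .findFirst()
            .orElse(null);
}

It works because nothing tries to jump out of a lambda; each lambda just returns its own value.)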
Checked exception transparency
This is the one you are complaining about. This doesn't compile:
public void printFiles(Path... files) throws IOException {
    Arrays.stream(files).forEach(p -> System.out.println(Files.readString(p)));
}
The fact that the context allows you to throw IOExceptions doesn't help: The above does not compile, because 'can throw IOExceptions' as a status doesn't 'transfer' to the inside of the lambda.
There's a theme here: Rewrite it to a normal for loop and it compiles and works precisely the way you want to. So why, exactly, can't we make lambdas work the same way?
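For the record, the loop version compiles as-is (assuming java.io.IOException, java.nio.file.Files and java.nio.file.Path are imported):

public void printFiles(Path... files) throws IOException {
    for (Path p : files) {
        System.out.println(Files.readString(p)); // covered by the method's own throws clause
    }
}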
Mutable local variable transparency
This doesn't work:
int x = 0;
someList.stream().forEach(k -> x++);
System.out.println("Count: " + x);
You can neither modify local variables declared outside the lambda, nor even read them unless they are (effectively) final. Why not?
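(Common workarounds, not part of the argument here: let the stream do the counting, or mutate an effectively-final holder object such as java.util.concurrent.atomic.AtomicInteger:

long count = someList.stream().count();

AtomicInteger x = new AtomicInteger(); // the variable is final; the object it points at mutates
someList.forEach(k -> x.incrementAndGet());
System.out.println("Count: " + x.get());

The variable x itself is never reassigned, which is why the compiler allows it.)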
These are all GOOD things... depending on scope layering
So far it seems really stupid that lambdas aren't transparent in these 3 regards. But it turns into a good thing in a slightly different context. Imagine instead of .stream().forEach something a little bit different:
class DoubleNullException extends Exception {} // checked!

public class Example {
    private TreeSet<String> words;

    public Example() throws DoubleNullException {
        int comparisonCount = 0;
        this.words = new TreeSet<String>((a, b) -> {
            comparisonCount++; // would-be mutation of an outer local
            if (a == null && b == null) throw new DoubleNullException(); // would-be checked throw
            return a.compareTo(b);
        });
        System.out.println("Comparisons performed: " + comparisonCount);
    }
}
Let's imagine the 3 transparencies did work. The above code makes use of two of them (it tries to mutate comparisonCount, and it tries to throw DoubleNullException from inside to outside).
The above code makes absolutely no sense. The compiler errors are very much desired. That comparator is not going to run until perhaps next week, in a completely different thread. It runs whenever you add the second element to the set, which is a field, so who knows who is going to do that and which thread will do it. The constructor has long since ceased running - local vars are 'on the stack' and thus the local var has disappeared. Never mind that the printing would always print 'Comparisons performed: 0' here; the statement 'comparisonCount++;' would be trying to increment a memory position that no longer holds that variable at all.
Even if we 'fix' this (the compiler realizes that a local is used in a lambda and hoists it onto the heap, which is what most other languages do), the code still makes no sense as a concept: that print statement wouldn't print what you expect. Also, that comparator can be called from multiple threads, so... do we now allow volatile on our local vars? Quite the can of worms! In current Java, a local variable cannot possibly suffer from thread-concurrency synchronization issues, because it is not possible to share the variable itself with another thread (you can share the object the variable points at, not the variable).
The reason you ARE allowed to mess with (effectively) final locals is because you can just make a copy, and that's what the compiler does for you. Copies are fine - if nobody changes anything.
The exception similarly doesn't work: it's the code that calls thatSet.add(someElement) that would get the DoubleNullException. Imagine somebody wrote:
Example ex;
try {
    ex = new Example();
} catch (DoubleNullException e) {
    throw new WrappedEx(e);
}
ex.add(null); // assume Example delegates add() to its words set
ex.add(null); // BOOM
The line with the remark (BOOM) would throw the DoubleNullException. It 'breaks' the checked exception rules: that line would compile (Set.add doesn't throw DoubleNullException), but it isn't in a context where throwing DoubleNullException is allowed. The catch block in the snippet above cannot ever run.
See how it all falls apart, and nothing makes sense?
The key clue is: What happens to the lambda? Is it 'transported'?
For some situations, you hand a lambda straight to a method, and that method has a 'use it and lose it' mentality: That method you handed the lambda to will run it 0, 1, or many times, but the key is: It runs it right then and there and once the method you handed the lambda to returns, that lambda is gone. The thing you handed the lambda to did not store it in a field or hand it to other code that stores it in a field, nor did that method transport the lambda to another thread.
In such cases (the method is use-it-then-lose-it), the transparencies would certainly be handy and wouldn't "break" anything.
But when the method you hand the lambda to does transport it to a field (such as the constructor of TreeSet which stores the passed comparator in a field, so that future .add calls can call it), the transparencies break down and make no sense.
Lambdas in java are for both and therefore the lack of transparency (in all 3 regards) actually makes sense. It's just annoying when you have a use-it-then-lose-it situation.
POTENTIAL FUTURE JAVA FIX: I've championed it before, but so far it has fallen on mostly deaf ears. Next time I see Brian I might bring it up again. Imagine an annotation or other marker you can stick on the parameter of a method that says: "I shall use it or lose it". The compiler will then ensure you do not transport it (the only thing the compiler will let you do with that param is call .invoke() on it; you can't call anything else, nor can you assign it or hand it to anything else, unless you hand it to a method that also marked that parameter as @UseItOrLoseIt). Then the compiler can make the transparency happen with some tactical wrapping for control flow, and for checked exception flow, just by not complaining (checked exceptions are a figment of javac's imagination; the runtime does not have checked exceptions, which is why Scala, Kotlin, and other runs-on-the-JVM languages can do it).
Actually THEY CAN!
As your question ends with - you can actually write O get() throws T. So why do the various functional interfaces, such as Supplier, not do this?
Mostly because it's a pain. I'm honestly not sure why e.g. List's forEach is not defined as:
public <T extends Throwable> void forEach(ThrowingConsumer<? super E, T> consumer) throws T {
    for (E elem : this) consumer.consume(elem);
}
Which would work fine and compile (with ThrowingConsumer having the obvious declaration, sketched below). The same goes for declaring the Consumer we have today with that <O, T extends Exception> part.
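The 'obvious impl' would be a sketch like:

@FunctionalInterface
interface ThrowingConsumer<E, T extends Throwable> {
    void consume(E elem) throws T;
}

One wrinkle worth noting: when the lambda throws nothing checked, T has nothing to constrain it and is liable to be inferred as the bound itself (Throwable), saddling innocent callers with a throws Throwable - part of the 'pain' described next.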
It's a bit of a hassle. The way lambdas 'work' is that the compiler has to infer from context which functional interface you are implementing, which notably includes having to bind all the generics. Adding exception binding to this mix makes it even harder. IDEs tend to get a little confused if you're in the middle of writing code in a 'throwing lambda' and start red-underlining rather a lot, and auto-complete and the like are no help, because the IDE can't be useful in that context until it knows what you are implementing.
Lambdas as a system were also designed to backwards compatibly replace any existing usages of the concept, such as swing's ActionListener. Such listeners couldn't throw either, so having the interfaces in the java.util.function package be similar would be more familiar and slightly more java idiomatic, possibly.
The throws T solution would help but isn't a panacea. It solves, to an extent, the lack of checked exception transparency, but does nothing to solve either mutable local var transparency or control flow transparency. Perhaps the conclusion is simply: The benefits of doing it are more limited than you think, the costs are higher than you think. The cost/benefit analysis says: Bad idea, so it wasn't done.

Eclipse does not suggest methods in Lambda expression

I have an ArrayList of Strings, and I am calling the following to sort it:
list.sort(Comparator.comparing(x -> x.length()));
When I type x and press Ctrl+Space, Eclipse does not suggest the methods of the String class; it only shows the methods of the Object class.
Please help me configure Eclipse to show the exact method suggestions in this case. In regular cases, Eclipse's suggestions are exact.
This is a two-fold issue, one with eclipse, and one with java semantics.
Java Semantics
A quick example:
public static void main(String[] args) {
    List<String> myList = new ArrayList<>();
    myList.sort(Comparator.comparing(x -> x.|));
}
Assume you press Ctrl+Space at the | (cursor) position. Eclipse then has to infer a lot of information to know that x is in fact an element of type String. First, the list's generic type, String, must be known (it is; Eclipse can deduce this). Then the Comparator.comparing method needs to know that it must return a Comparator which compares Strings, which Eclipse could deduce, but here is the first issue: the Comparator could be one that compares not just Strings, but any other kind of Object. What if you wanted to pass a method to myList.sort that is more general than Comparator<String>? To be more precise: the List.sort method can take (in your case) any Comparator of type Comparator<? super String>. And ? super String can be either Object or String.
So in your example, the type of x could just be Object; Eclipse cannot ultimately decide. However, you can write your lambda expression differently to make it clear:
myList.sort(Comparator.comparing((String x) -> x.|));
In this case, the completion suggestion could be more helpful (depending on the version of eclipse).
Eclipse AST issues with incomplete lambdas
An incomplete lambda expression is more often than not such an upset to the syntax of the entire file that Eclipse cannot determine the syntax tree at that position correctly. That means Eclipse cannot deduce that the code you are writing is supposed to be a lambda expression where x is the parameter of the lambda function, and that you want to complete on it. This issue could be addressed if the tokenizer and AST-parser of Eclipse were adapted accordingly (which might have already been tried). Whether this is possible at all, I cannot answer. I only know it helps to write a "full" lambda, with a method block, and convert it to a "slim" lambda later on:
myList.sort(Comparator.comparing((String x) -> { return x.| }));
For the above case, the completion should work (IF you specify String as the explicit type of the Comparator, as I have done in the example).
Issues like this stem from the question of how to interpret the characters and thereby deduce what the programmer might intend to write (the process of auto-completion and completion suggestion).
Eclipse is very strong at resolving a reference to a named entity in regular code, like a method block, a for loop, or any other construct. That is why it works well there; the syntax tree is usually easy to process then.
With lambdas, however, Eclipse (and any other IDE, for that matter) has a harder time. This is due to the fact that lambdas work by inferring a lot of implicit information which would otherwise need to be written explicitly (for example, in an explicit implementation of the interface).
If everything else fails, you can create the explicit interface at that position and then convert to a lambda after completing it.

Casting and Generics, Any performance difference?

I have been coding for Android a lot lately. Though I am comfortable in Java, I am missing some ideas about the core concepts being used there.
I am interested to know whether there is any performance difference between these two pieces of code.
First method:
// Specified as a member variable.
ArrayList<String> myList = new ArrayList<String>();
and used as String temp = myList.get(1);
Second method:
ArrayList myList = new ArrayList(); // Specified as a member variable.
and used as
String temp1 = myList.get(1).toString();
I know it's about casting. Does the first method have a great advantage over the second? Most of the time in real coding I have to use the second method, because the ArrayList can hold different data types, so I end up specifying
ArrayList<Object> myList = new ArrayList<Object>();
or something similarly generic.
In short, there's no performance difference worth worrying about, if it exists at all. Generic information isn't stored at runtime anyway, so there's not really anything else happening to slow things down - and as pointed out by other answers it may even be faster (though even if it hypothetically were slightly slower, I'd still advocate using generics.) It's probably good to get into the habit of not thinking about performance so much on this level. Readability and code quality are generally much more important than micro-optimisations!
In short, generics would be the preferred option since they guarantee type safety and make your code cleaner to read.
In terms of the fact you're storing completely different object types (i.e. not related from some inheritance hierarchy you're using) in an arraylist, that's almost definitely a flaw with your design! I can count the times I've done this on one hand, and it was always a temporary bodge.
Generics aren't reified, which means they go away at runtime. Using generics is preferred for several reasons:
It makes your code clearer as to which classes are interacting
It keeps it type safe: you can't accidentally add, say, an Integer to a List<String> (see the sketch after this list)
It's faster: casting requires the JVM to test type castability at runtime, in case it needs to throw a ClassCastException. With Generics, the compiler knows what types things must be, and so it doesn't need to check them.
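A sketch of the type-safety point (the list contents are mine):

import java.util.ArrayList;
import java.util.List;

class RawVsGeneric {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("Alice");
        // names.add(42);        // with generics: compile-time error

        List raw = names;        // raw type: compiles with an 'unchecked' warning
        raw.add(42);             // the wrong type sneaks in silently...
        String s = names.get(1); // ...and fails here with a ClassCastException
    }
}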
There is a performance difference in that code: the second method is actually slower.
The reason why: generics don't require a cast or conversion (your code uses a conversion method, not a cast); the type is already correct. So the toString() call is an extra method call with extra operations that are unnecessary in the generic version.
There wouldn't be a problem with casting, as you are using the toString() method. But you could accidentally add an incorrect object (such as an array of Strings). The toString() method would work properly and not throw an exception, but you would get odd results.
As Android runs on mobile and handheld devices where resources are limited, you have to be careful while coding.
Casting can add overhead if you are storing Strings in an ArrayList.
So in my opinion you should use the first method and be specific.
There is no runtime performance difference because of "type erasure".
But if you are using Java 1.5 or above, you SHOULD use generics and not the weakly typed counterparts.
Advantages of generics --
* The flexibility of dynamic binding, with the advantage of static type-checking. Compiler-detected errors are less expensive to repair than those detected at runtime.
* There is less ambiguity between containers, so code reviews are simpler.
* Using fewer casts makes code cleaner.

Is there any runtime cost for Casting in Java?

Would there be any performance differences between these two chunks?
public void doSomething(Supertype input)
{
    Subtype foo = (Subtype) input;
    foo.methodA();
    foo.methodB();
}
vs.
public void doSomething(Supertype input)
{
    ((Subtype) input).methodA();
    ((Subtype) input).methodB();
}
Any other considerations or recommendations between these two?
Well, the compiled code probably includes the cast twice in the second case - so in theory it's doing the same work twice. However, it's very possible that a smart JIT will work out that you're doing the same cast on the same value, so it can cache the result. But it is having to do work at least once - after all, it needs to make a decision as to whether to allow the cast to succeed, or throw an exception.
As ever, you should test and profile your code if you care about the performance - but I'd personally use the first form anyway, just because it looks more readable to me.
Yes. Checks must be done with each cast along with the actual mechanism of casting, so casting multiple times will cost more than casting once. However, that's the type of thing that the compiler would likely optimize away. It can clearly see that input hasn't changed its type since the last cast and should be able to avoid multiple casts - or at least avoid some of the casting checks.
In any case, if you're really that worried about efficiency, I'd wonder whether Java is the language that you should be using.
Personally, I'd say to use the first one. Not only is it more readable, but it makes it easier to change the type later. You'll only have to change it in one place instead of every time that you call a function on that variable.
I agree with Jon's comment - do the cast once. But for what it's worth, on the general question of "is casting expensive", from what I remember: Java 1.4 improved this noticeably, and Java 5 made casts extremely inexpensive. Unless you are writing a game engine, I don't know if it's something to fret about anymore. I'd worry more about auto-boxing/unboxing and hidden object creation instead.
According to this article, there is a cost associated with casting.
Please note that the article is from 1999 and it is up to the reader to decide if the information is still trustworthy!
In the first case:
    Subtype foo = (Subtype)input;
the cast is performed once and its result is reused via the foo variable.
In the second case:
    ((Subtype)input).methodA();
the cast is written twice, so the JVM may check twice whether input can be converted to a reference of Subtype (and throw a ClassCastException if not). Note that the check itself always happens at runtime in both versions; the first form just pays for it once, so the second can carry some extra cost.

Does Java casting introduce overhead? Why?

Is there any overhead when we cast objects of one type to another? Or the compiler just resolves everything and there is no cost at run time?
Is this a general thing, or are there different cases?
For example, suppose we have an array of Object[], where each element might have a different type. But we always know for sure that, say, element 0 is a Double, element 1 is a String. (I know this is a wrong design, but let's just assume I had to do this.)
Is Java's type information still kept around at run time? Or is everything forgotten after compilation, so that if we do (Double)elements[0], we just follow the pointer and interpret those 8 bytes as a Double, whatever that is?
I'm very unclear about how types are handled in Java. If you have any recommendations for books or articles, thanks too.
There are 2 types of casting:
Implicit casting, when you cast from a type to a wider type, which is done automatically and there is no overhead:
String s = "Cast";
Object o = s; // implicit casting
Explicit casting, when you go from a wider type to a narrower one. For this case, you must explicitly use a cast, like this:
Object o = someObject;
String s = (String) o; // explicit casting
In this second case, there is overhead at runtime, because the type must be checked and, in case the cast is not feasible, the JVM must throw a ClassCastException.
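A common way to keep that exception path out of the picture is to test before casting (a sketch; someObject is assumed from the snippet above):

Object o = someObject;
if (o instanceof String) {
    String s = (String) o; // cannot fail: the instanceof already proved the type
    System.out.println(s.length());
}

Modern JITs typically recognize this pattern and fold the instanceof test and the cast check into one.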
Taken from JavaWorld: The cost of casting

Casting is used to convert between types -- between reference types in particular, for the type of casting operation in which we're interested here.

Upcast operations (also called widening conversions in the Java Language Specification) convert a subclass reference to an ancestor class reference. This casting operation is normally automatic, since it's always safe and can be implemented directly by the compiler.

Downcast operations (also called narrowing conversions in the Java Language Specification) convert an ancestor class reference to a subclass reference. This casting operation creates execution overhead, since Java requires that the cast be checked at runtime to make sure that it's valid. If the referenced object is not an instance of either the target type for the cast or a subclass of that type, the attempted cast is not permitted and must throw a java.lang.ClassCastException.
For a reasonable implementation of Java:
Each object has a header containing, amongst other things, a pointer to the runtime type (for instance Double or String, but never CharSequence or AbstractList). Assuming the runtime compiler (generally HotSpot, in Sun's case) cannot determine the type statically, some checking needs to be performed by the generated machine code.
First that pointer to the runtime type needs to be read. This is necessary for calling a virtual method in a similar situation anyway.
For casting to a class type, it is known exactly how many superclasses there are until you hit java.lang.Object, so the type can be read at a constant offset from the type pointer (actually the first eight in HotSpot). Again this is analogous to reading a method pointer for a virtual method.
Then the read value just needs a comparison against the expected static type of the cast. Depending upon the instruction set architecture, another instruction will need to branch (or fault) if the comparison fails. ISAs such as 32-bit ARM have conditional instructions and may be able to have the sad path pass through the happy path.
Interfaces are more difficult due to the multiple inheritance of interfaces. Generally the last two casts to interfaces are cached in the runtime type. In the very early days (over a decade ago), interfaces were a bit slow, but that is no longer relevant.
Hopefully you can see that this sort of thing is largely irrelevant to performance. Your source code is more important. In terms of performance, the biggest hit in your scenario is liable to be cache misses from chasing object pointers all over the place (the type information will of course be common).
For example, suppose we have an array of Object[], where each element might have a different type. But we always know for sure that, say, element 0 is a Double, element 1 is a String. (I know this is a wrong design, but let's just assume I had to do this.)
The compiler does not note the types of the individual elements of an array. It simply checks that the type of each element expression is assignable to the array element type.
Is Java's type information still kept around at run time? Or everything is forgotten after compilation, and if we do (Double)elements[0], we'll just follow the pointer and interpret those 8 bytes as a double, whatever that is?
Some information is kept around at run time, but not the static types of the individual elements. You can tell this from looking at the class file format.
It is theoretically possible that the JIT compiler could use "escape analysis" to eliminate unnecessary type checks in some assignments. However, doing this to the degree you are suggesting would be beyond the bounds of realistic optimization. The payoff of analysing the types of individual elements would be too small.
Besides, people should not write application code like that anyway.
The byte code instruction for performing casting at runtime is called checkcast. You can disassemble Java code using javap to see what instructions are generated.
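For instance (a sketch; the exact constant-pool index will differ):

class CastDemo {
    Double first(Object[] elements) {
        return (Double) elements[0];
    }
}

Running javap -c CastDemo shows the method body compiled roughly as:

    aload_1
    iconst_0
    aaload
    checkcast #7 // class java/lang/Double
    areturn

The checkcast is the runtime type test; everything else just loads the element.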
For arrays, Java keeps type information at runtime. Most of the time, the compiler will catch type errors for you, but there are cases where you will run into an ArrayStoreException when trying to store an object in an array, but the type does not match (and the compiler didn't catch it). The Java language spec gives the following example:
class Point { int x, y; }
class ColoredPoint extends Point { int color; }

class Test {
    public static void main(String[] args) {
        ColoredPoint[] cpa = new ColoredPoint[10];
        Point[] pa = cpa;
        System.out.println(pa[1] == null);
        try {
            pa[0] = new Point();
        } catch (ArrayStoreException e) {
            System.out.println(e);
        }
    }
}
Point[] pa = cpa is valid since ColoredPoint is a subclass of Point, but pa[0] = new Point() is not valid.
This is opposed to generic types, where there is no type information kept at runtime. The compiler inserts checkcast instructions where necessary.
This difference in typing between generic types and arrays often makes mixing arrays and generic types unsuitable.
In theory, there is overhead introduced.
However, modern JVMs are smart.
Each implementation is different, but it is not unreasonable to assume that there could exist an implementation whose JIT optimizes away casting checks when it can guarantee that there will never be a conflict.
As for which specific JVMs offer this, I couldn't tell you. I must admit I'd like to know the specifics of JIT optimization myself, but these are for JVM engineers to worry about.
The moral of the story is to write understandable code first. If you're experiencing slowdowns, profile and identify your problem.
Odds are good that it won't be due to casting.
Never sacrifice clean, safe code in an attempt to optimize it UNTIL YOU KNOW YOU NEED TO.
