How costly is Object Typecasting in terms of performance?
Should I try to avoid Typecasting when possible?
It is cheap enough that it falls into the category of premature optimization. Don't waste time even thinking or asking questions about it unless you have profiled your application and determined that it's a problem, and most importantly: don't compromise your design to avoid it.
JavaWorld: The cost of casting
Casting is used to convert between
types -- between reference types in
particular, for the type of casting
operation in which we're interested
here.
Upcast operations (also called
widening conversions in the Java
Language Specification) convert a
subclass reference to an ancestor
class reference. This casting
operation is normally automatic, since
it's always safe and can be
implemented directly by the compiler.
Downcast operations (also called
narrowing conversions in the Java
Language Specification) convert an
ancestor class reference to a subclass
reference. This casting operation
creates execution overhead, since Java
requires that the cast be checked at
runtime to make sure that it's valid.
If the referenced object is not an
instance of either the target type for
the cast or a subclass of that type,
the attempted cast is not permitted
and must throw a
java.lang.ClassCastException.
Depending on what you mean by typecasting. There is "upcasting" which costs you nothing and there is "downcasting" which costs you a lot. The answer to the second also begins with "it depends". Usually I avoid downcasting in my code because, from my expierience, if it is overused in your code, it means that the design is bad. Which on the other hand does not necessarily have to mean that it should not be used at all.
Typecasting will have a cost because the runtime type information has to be checked to ensure the cast will work. Compared to everything else, I doubt this will be significant, but you could try and measure it.
More generally, typecasting is (IMHO) a sign that something is not right in the design. Sure, sometimes you can't avoid it (working with legacy collections, for example), but I would definitely see if I could remove it.
No it shouldn't affect performance significantly enough to matter.
Related
The error I get from the compiler is "The left hand side of an assignment must be a variable". My use case is deep copying, but is not really relevant.
In C++, one can assign to *this.
The question is not how to circumvent assignment to this. It's very simple, but rather what rationale is there behind the decision not to make this a variable.
Are the reasons technical or conceptual?
My guess so far - the possibility of rebuilding an Object in a random method is error-prone (conceptual), but technically possible.
Please restrain from variations of "because java specs say so". I would like to know the reason for the decision.
In C++, one can assign to *this
Yes, but you can't do this = something in C++, which I actually believe is a closer match for what you're asking about on the Java side here.
[...] what rationale is there behind the decision not to make this a variable.
I would say clarity / readability.
this was chosen to be a reserved word, probably since it's not passed as an explicit argument to a method. Using it as an ordinary parameter and being able to reassign a new value to it, would mess up readability severely.
In fact, many people argue that you shouldn't change argument-variables at all, for this very reason.
Are the reasons technical or conceptual?
Mostly conceptual I would presume. A few technical quirks would arise though. If you could reassign a value to this, you could completely hide instance variables behind local variables for example.
My guess so far - the possibility of rebuilding an Object in a random method is error-prone (conceptual), but technically possible.
I'm not sure I understand this statement fully, but yes, error prone is probably the primary reason behind the decision to make it a keyword and not a variable.
because this is final,
this is keyword, not a variable. and you can't assign something to keyword. now for a min consider if it were a reference variable in design spec..and see the example below
and it holds implicit reference to the object calling method. and it is used for reference purpose only, now consider you assign something to this so won't it break everything ?
Example
consider the following code from String class (Note: below code contains compilation error it is just to demonstrate OP the situation)
public CharSequence subSequence(int beginIndex, int endIndex) {
//if you assign something here
this = "XYZ" ;
// you can imagine the zoombie situation here
return this.substring(beginIndex, endIndex);
}
Are the reasons technical or conceptual?
IMO, conceptual.
The this keyword is a short hand for "the reference to the object whose method you are currently executing". You can't change what that object is. It simply makes no sense in the Java execution model.
Since it makes no sense for this to change, there is no sense in making it a variable.
(Note that in C++ you are assigning to *this, not this. And in Java there is no * operator and no real equivalent to it.)
If you take the view that you could change the target object for a method in mid flight, then here are some counter questions.
What is the use of doing this? What problems would this (hypothetical) linguistic feature help you solve ... that can't be solved in a more easy-to-understand way?
How would you deal with mutexes? For instance, what would happen if you assign to this in the middle of a synchronized method ... and does the proposed semantic make sense? (The problem is that you either end up executing in synchronized method on an object that you don't have a lock on ... or you have to unlock the old this and lock the new this with the complications that that entails. And besides, how does this make sense in terms of what mutexes are designed to achieve?)
How would you make sense of something like this:
class Animal {
foo(Animal other) {
this = other;
// At this point we could be executing the overridden
// Animal version of the foo method ... on a Llama.
}
}
class Llama {
foo(Animal other) {
}
}
Sure you can ascribe a semantic to this but:
you've broken encapsulation of the subclass in a way that is hard to understand, and
you've not actually achieved anything particularly useful.
If you try seriously to answer these questions, I expect you'll come to the conclusion that it would have been a bad idea to implement this. (But if you do have satisfactory answers, I'd encourage you to write them up and post them as your own Answer to your Question!)
But in reality, I doubt that the Java designers even gave this idea more than a moment's consideration. (And rightly so, IMO)
The *this = ... form of C++ is really just a shorthand for a sequence of assignments of the the attributes of the current object. We can already do that in Java ... with a sequence of normal assignments. There is certainly no need for new syntax to support this. (How often does a class reinitialize itself from the state of another class?)
I note that you commented thus:
I wonder what the semantics of this = xy; should be. What do you think it should do? – JimmyB Nov 2 '11 at 12:18
Provided xy is of the right type, the reference of this would be set to xy, making the "original" object gc-eligible - kostja Nov 2 '11 at 12:24
That won't work.
The value of this is (effectively) passed by value to the method when the method is invoked. The callee doesn't know where the this reference came from.
Even if it did, that's only one place where the reference is held. Unless null is assigned in all places, the object cannot be eligible of garbage collection.
Ignoring the fact that this is technically impossible, I do not think that your idea would be useful OR conducive to writing readable / maintainable code. Consider this:
public class MyClass {
public void kill(MyClass other) {
this = other;
}
}
MyClass mine = new MyClass();
....
mine.kill(new MyClass());
// 'mine' is now null!
Why would you want to do that? Supposing that the method name was something innocuous rather than kill, would you expect the method to be able to zap the value of mine?
I don't. In fact, I think that this would be a misfeature: useless and dangerous.
Even without these nasty "make it unreachable" semantics, I don't actually see any good use-cases for modifying this.
this isn't even a variable. It's a keyword, as defined in the Java Language Specification:
When used as a primary expression, the keyword this denotes a value that is a reference to the object for which the instance method was invoked (§15.12), or to the object being constructed
So, it's not possible as it's not possible to assign a value to while.
The this in Java is a part of the language, a key word, not a simple variable. It was made for accessing an object from one of its methods, not another object. Assigning another object to it would cause a mess. If you want to save another objects reference in your object, just create a new variable.
The reason is just conceptual. this was made for accessing an Object itself, for example to return it in a method. Like I said, it would cause a mess if you would assign another reference to it. Tell me a reason why altering this would make sense.
Assigning to (*this) in C++ performs a copy operation -- treating the object as a value-type.
Java does not use the concept of a value-type for classes. Object assignment is always by-reference.
To copy an object as if it were a value-type: How do I copy an object in Java?
The terminology used for Java is confusing though: Is Java “pass-by-reference” or “pass-by-value”
Answer: Java passes references by value. (from here)
In other words, because Java never treats non-primitives as value-types, every class-type variable is a reference (effectively a pointer).
So when I say, "object assignment is always by-reference", it might be more technically accurate to phrase that as "object assignment is always by the value of the reference".
The practical implication of the distinction drawn by Java always being pass-by-value is embodied in the question "How do I make my swap function in java?", and its answer: You can't. Languages such as C and C++ are able to provide swap functions because they, unlike Java, allow you to assign from any variable by using a reference to that variable -- thus allowing you to change its value (if non-const) without changing the contents of the object that it previously referenced.
It could make your head spin to try to think this all the way through, but here goes nothing...
Java class-type variables are always "references" which are effectively pointers.
Java pointers are primitive types.
Java assignment is always by the value of the underlying primitive (the pointer in this case).
Java simply has no mechanism equivalent to C/C++ pass-by-reference that would allow you to indirectly modify a free-standing primitive type, which may be a "pointer" such as this.
Additionally, it is interesting to note that C++ actually has two different syntaxes for pass-by-reference. One is based on explicit pointers, and was inherited from the C language. The other is based on the C++ reference-type operator &. [There is also the C++ smart pointer form of reference management, but that is more akin to Java-like semantics -- where the references themselves are passed by value.]
Note: In the above discussion assign-by and pass-by are generally interchangeable terminology. Underlying any assignment, is a conceptual operator function that performs the assignment based on the right-hand-side object being passed in.
So coming back to the original question: If you could assign to this in Java, that would imply changing the value of the reference held by this. That is actually equivalent to assigning directly to this in C++, which is not legal in that language either.
In both Java and C++, this is effectively a pointer that cannot be modified. Java seems different because it uses the . operator to dereference the pointer -- which, if you're used to C++ syntax, gives you the impression that it isn't one.
You can, of course, write something in Java that is similar to a C++ copy constructor, but unlike with C++, there is no way of getting around the fact that the implementation will need to be supplied in terms of an explicit member-wise initialization. [In C++ you can avoid this, ultimately, only because the compiler will provide a member-wise implementation of the assignment operator for you.]
The Java limitation that you can't copy to this as a whole is sort-of artificial though. You can achieve exactly the same result by writing it out member-wise, but the language just doesn't have a natural way of specifying such an operation to be performed on a this -- the C++ syntax, (*this) doesn't have an analogue in Java.
And, in fact, there is no built-in operation in Java that reassigns the contents of any existing object -- even if it's not referred to as this. [Such an operation is probably more important for stack-based objects such as are common in C++.]
Regarding the use-case of performing a deep copy: It's complicated in Java.
For C++, a value-type-oriented language. The semantic intention of assignment is generally obvious. If I say a=b, I typically want a to become and independent clone of b, containing an equal value. C++ does this automatically for assignment, and there are plans to automate the process, also, for the comparison.
For Java, and other reference-oriented languages, copying an object, in a generic sense, has ambiguous meaning. Primitives aside, Java doesn't differentiate between value-types and reference-types, so copying an object has to consider every nested class-type member (including those of the parent) and decide, on a case-by-case basis, if that member object should be copied or just referenced. If left to default implementations, there is a very good chance that result would not be what you want.
Comparing objects for equality in Java suffers from the same ambiguities.
Based on all of this, the answer to the underlying question: why can't I copy an object by some simple, automatically generated, operation on this, is that fundamentally, Java doesn't have a clear notion of what it means to copy an object.
One last point, to answer the literal question:
What rationale is there behind the decision not to make this a variable?
It would simply be pointless to do so. The value of this is just a pointer that has been passed to a function, and if you were able to change the value of this, it could not directly affect whatever object, or reference, was used to invoke that method. After all, Java is pass-by-value.
Assigning to *this in C++ isn't equivalent to assigning this in Java. Assigning this is, and it isn't legal in either language.
I am coding in Android a lot lately, Though I am comfortable in JAVA, but missing some
ideas about core concepts being used there.
I am interested to know whether any performance difference is there between these 2 codes.
First Method:
//Specified as member variable.
ArrayList <String> myList = new ArrayList <String>();
and using as String temp = myList.get(1);
2nd Method:
ArrayList myList = new ArrayList(); //Specified as member variable.
and using
String temp1 = myList.get(1).toString();
I know its about casting. Does the first method has great advantage over the second,
Most of the time in real coding I have to use second method because arraylist can take different data types, I end up specifying
ArrayList <Object> = new ArrayList <Object>();
or more generic way.
In short, there's no performance difference worth worrying about, if it exists at all. Generic information isn't stored at runtime anyway, so there's not really anything else happening to slow things down - and as pointed out by other answers it may even be faster (though even if it hypothetically were slightly slower, I'd still advocate using generics.) It's probably good to get into the habit of not thinking about performance so much on this level. Readability and code quality are generally much more important than micro-optimisations!
In short, generics would be the preferred option since they guarantee type safety and make your code cleaner to read.
In terms of the fact you're storing completely different object types (i.e. not related from some inheritance hierarchy you're using) in an arraylist, that's almost definitely a flaw with your design! I can count the times I've done this on one hand, and it was always a temporary bodge.
Generics aren't reified, which means they go away at runtime. Using generics is preferred for several reasons:
It makes your code clearer, as to which classes are interacting
It keeps it type safe: you can't accidentally add a List to a List
It's faster: casting requires the JVM to test type castability at runtime, in case it needs to throw a ClassCastException. With Generics, the compiler knows what types things must be, and so it doesn't need to check them.
There is a performance difference in that code:
The second method is actually slower.
The reason why:
Generics don't require casting/conversion (your code uses a conversion method, not a cast), the type is already correct. So when you call the toString() method, it is an extra call with extra operations that are unnecessary when using the method with generics.
There wouldn't be a problem with casting, as you are using the toString() method. But you could accidentally add an incorrect object (such as an array of Strings). The toString() method would work properly and not throw an exception, but you would get odd results.
As android is used for Mobiles and handheld devices where resources are limited you have to be careful using while coding.
Casting can be overhead if you are using String data type to store in ArrayList.
So in my opinion you should use first method of being specific.
There is no runtime performance difference because of "type erasure".
But if you are using Java 1.5 or above, you SHOULD use generics and not the weakly typed counterparts.
Advantages of generics --
* The flexibility of dynamic binding, with the advantage of static type-checking. Compiler-detected errors are less expensive to repair than those detected at runtime.
* There is less ambiguity between containers, so code reviews are simpler.
* Using fewer casts makes code cleaner.
Would there be any performance differences between these two chunks?
public void doSomething(Supertype input)
{
Subtype foo = (Subtype)input;
foo.methodA();
foo.methodB();
}
vs.
public void doSomething(Supertype input)
{
((Subtype)input).methodA();
((Subtype)input).methodB();
}
Any other considerations or recommendations between these two?
Well, the compiled code probably includes the cast twice in the second case - so in theory it's doing the same work twice. However, it's very possible that a smart JIT will work out that you're doing the same cast on the same value, so it can cache the result. But it is having to do work at least once - after all, it needs to make a decision as to whether to allow the cast to succeed, or throw an exception.
As ever, you should test and profile your code if you care about the performance - but I'd personally use the first form anyway, just because it looks more readable to me.
Yes. Checks must be done with each cast along with the actual mechanism of casting, so casting multiple times will cost more than casting once. However, that's the type of thing that the compiler would likely optimize away. It can clearly see that input hasn't changed its type since the last cast and should be able to avoid multiple casts - or at least avoid some of the casting checks.
In any case, if you're really that worried about efficiency, I'd wonder whether Java is the language that you should be using.
Personally, I'd say to use the first one. Not only is it more readable, but it makes it easier to change the type later. You'll only have to change it in one place instead of every time that you call a function on that variable.
I agree with Jon's comment, do it once, but for what it's worth in the general question of "is casting expensive", from what I remember: Java 1.4 improved this noticeably with Java 5 making casts extremely inexpensive. Unless you are writing a game engine, I don't know if it's something to fret about anymore. I'd worry more about auto-boxing/unboxing and hidden object creation instead.
Acording to this article, there is a cost associated with casting.
Please note that the article is from 1999 and it is up to the reader to decide if the information is still trustworthy!
In the first case :
Subtype foo = (Subtype)input;
it is determined at compile time, so no cost at runtime.
In the second case :
((Subtype)input).methodA();
it is determined at run time because compiler will not know. The jvm has to check if it can converted to a reference of Subtype and if not throw ClassCastException etc. So there will be some cost.
Is there any overhead when we cast objects of one type to another? Or the compiler just resolves everything and there is no cost at run time?
Is this a general things, or there are different cases?
For example, suppose we have an array of Object[], where each element might have a different type. But we always know for sure that, say, element 0 is a Double, element 1 is a String. (I know this is a wrong design, but let's just assume I had to do this.)
Is Java's type information still kept around at run time? Or everything is forgotten after compilation, and if we do (Double)elements[0], we'll just follow the pointer and interpret those 8 bytes as a double, whatever that is?
I'm very unclear about how types are done in Java. If you have any reccommendation on books or article then thanks, too.
There are 2 types of casting:
Implicit casting, when you cast from a type to a wider type, which is done automatically and there is no overhead:
String s = "Cast";
Object o = s; // implicit casting
Explicit casting, when you go from a wider type to a more narrow one. For this case, you must explicitly use casting like that:
Object o = someObject;
String s = (String) o; // explicit casting
In this second case, there is overhead in runtime, because the two types must be checked and in case that casting is not feasible, JVM must throw a ClassCastException.
Taken from JavaWorld: The cost of casting
Casting is used to convert between
types -- between reference types in
particular, for the type of casting
operation in which we're interested
here.
Upcast operations (also called
widening conversions in the Java
Language Specification) convert a
subclass reference to an ancestor
class reference. This casting
operation is normally automatic, since
it's always safe and can be
implemented directly by the compiler.
Downcast operations (also called
narrowing conversions in the Java
Language Specification) convert an
ancestor class reference to a subclass
reference. This casting operation
creates execution overhead, since Java
requires that the cast be checked at
runtime to make sure that it's valid.
If the referenced object is not an
instance of either the target type for
the cast or a subclass of that type,
the attempted cast is not permitted
and must throw a
java.lang.ClassCastException.
For a reasonable implementation of Java:
Each object has a header containing, amongst other things, a pointer to the runtime type (for instance Double or String, but it could never be CharSequence or AbstractList). Assuming the runtime compiler (generally HotSpot in Sun's case) cannot determine the type statically a some checking needs to be performed by the generated machine code.
First that pointer to the runtime type needs to be read. This is necessary for calling a virtual method in a similar situation anyway.
For casting to a class type, it is known exactly how many superclasses there are until you hit java.lang.Object, so the type can be read at a constant offset from the type pointer (actually the first eight in HotSpot). Again this is analogous to reading a method pointer for a virtual method.
Then the read value just needs a comparison to the expected static type of the cast. Depending upon instruction set architecture, another instruction will need to branch (or fault) on an incorrect branch. ISAs such as 32-bit ARM have conditional instruction and may be able to have the sad path pass through the happy path.
Interfaces are more difficult due to multiple inheritance of interface. Generally the last two casts to interfaces are cached in the runtime type. IN the very early days (over a decade ago), interfaces were a bit slow, but that is no longer relevant.
Hopefully you can see that this sort of thing is largely irrelevant to performance. Your source code is more important. In terms of performance, the biggest hit in your scenario is liable to be cache misses from chasing object pointers all over the place (the type information will of course be common).
For example, suppose we have an array of Object[], where each element might have a different type. But we always know for sure that, say, element 0 is a Double, element 1 is a String. (I know this is a wrong design, but let's just assume I had to do this.)
The compiler does not note the types of the individual elements of an array. It simply checks that the type of each element expression is assignable to the array element type.
Is Java's type information still kept around at run time? Or everything is forgotten after compilation, and if we do (Double)elements[0], we'll just follow the pointer and interpret those 8 bytes as a double, whatever that is?
Some information is kept around at run time, but not the static types of the individual elements. You can tell this from looking at the class file format.
It is theoretically possible that the JIT compiler could use "escape analysis" to eliminate unnecessary type checks in some assignments. However, doing this to the degree you are suggesting would be beyond the bounds of realistic optimization. The payoff of analysing the types of individual elements would be too small.
Besides, people should not write application code like that anyway.
The byte code instruction for performing casting at runtime is called checkcast. You can disassemble Java code using javap to see what instructions are generated.
For arrays, Java keeps type information at runtime. Most of the time, the compiler will catch type errors for you, but there are cases where you will run into an ArrayStoreException when trying to store an object in an array, but the type does not match (and the compiler didn't catch it). The Java language spec gives the following example:
class Point { int x, y; }
class ColoredPoint extends Point { int color; }
class Test {
public static void main(String[] args) {
ColoredPoint[] cpa = new ColoredPoint[10];
Point[] pa = cpa;
System.out.println(pa[1] == null);
try {
pa[0] = new Point();
} catch (ArrayStoreException e) {
System.out.println(e);
}
}
}
Point[] pa = cpa is valid since ColoredPoint is a subclass of Point, but pa[0] = new Point() is not valid.
This is opposed to generic types, where there is no type information kept at runtime. The compiler inserts checkcast instructions where necessary.
This difference in typing for generic types and arrays makes it often unsuitable to mix arrays and generic types.
In theory, there is overhead introduced.
However, modern JVMs are smart.
Each implementation is different, but it is not unreasonable to assume that there could exist an implementation that JIT optimized away casting checks when it could guarantee that there would never be a conflict.
As for which specific JVMs offer this, I couldn't tell you. I must admit I'd like to know the specifics of JIT optimization myself, but these are for JVM engineers to worry about.
The moral of the story is to write understandable code first. If you're experiencing slowdowns, profile and identify your problem.
Odds are good that it won't be due to casting.
Never sacrifice clean, safe code in an attempt to optimize it UNTIL YOU KNOW YOU NEED TO.
In Beyond Java(Section 2.2.9), Brute Tate claims that "typing model" is one of the problems of C++. What does that mean?
What he means is that objects in C++ don't intrinsically have types. While you might write
struct Dog {
char* name;
int breed;
};
Dog ralph("Ralph", POODLE);
in truth ralph doesn't have a type; it's just a bunch of bits, and the CPU doesn't give a damn about the fact that you call that collection of bits a Dog. For example, the following is valid:
struct Cat {
int color;
char* country_of_origin;
};
Cat ralph_is_that_you = * (Cat*) &ralph;
Watch in wonder as Professor C performs trans-species mutations between Dogs and Cats! The point here is that since ralph is just a sequence of bits, you can just claim that that sequence of bits is really a Cat and nothing would go wrong... except that the "Cat"'s color would be some random large integer, and you better not try to read it's country of origin. The fundamental problems is that while a variable (as in, the name, not the object it represents) has a type, the underlying object does not.
Compare this with JAVA, where not only types, but objects have intrinsic types. This may be due partly to the fact that there are no pointers and thus no access to memory, but the fact nonetheless exists that if you cast a Dog to an Object, you can't cast it back down to a Cat, because the object knows that deep down, it's actually a Dog, not a Cat.
The weak typing present in C++ is rather detrimental, because it makes the compiler static type checks nearly useless if you want to truly abuse-proof your application, and also makes secure and robust software hard to write. For example, you need to be very careful whenever you access a "pointer" because it could really be any random bit pattern.
EDIT 1: The comments had very good points, and I'd like to add them here.
kts points out that Sun's JAVA does indeed have pointers if you look deeply enough. Thanks! I hadn't known, and that's rather cool. However, the fundamental point is that JAVA objects are typed and C types aren't. Yes, you can circumvent this, but this is the same as the difference between opt-in and opt-out spam: yes, you could abuse the JAVA pointers, but the default is that no abuse is possible. You'd have to opt in.
martin-york points out that the example I showed is a purely C phenomenon. This is true, but
C++ mostly contains C as a subset (the differences are usually too minor to list).
C++ includes reinterpret_cast<T> specifically to allow hacks like this.
Just because it's discouraged doesn't mean it isn't pervasive or dangerous. Basically, even if JAVA has opt-in pointers (as I'll call them), the fact is that the person using them has probably thought of the consequences. C's casts are so easy that they're at times done without thinking (To quote Stroustroup, "But the new syntax was made deliberately ugly, because casting is still an ugly and often unsafe operation."). There's also the fact that the work needed to circumvent JAVA's type system is far more than what would make for a clever hack, while circumventing the C type system (and, yes, the C++ type system) is easy enough that I've seen it done just for a minor performance boost.
Anyway, discouraging something doesn't make it not happen. I discourage bad coding, but I haven't seen it get me anywhere...
As for the feature being useful, it admittedly is (just look up "fast inverse square root" on Google or Wikipedia) but it is dangerous enough that, following Stroustroup's maxim that ugly operations should be ugly, the difficulty threshold should be significantly higher.
That it is hard to type C++ code. :-p
Seriously though, they are probably referring to the fact that C++ has a weak static type system, that can be easily circumvented. Some examples: typedefs are not real types, enumerated types are just ints, booleans and integers are equivalent in many cases, and so on.