How expensive is reflection or introspection in Java?

I want to use reflection or introspection in Java to replace "if-else" statements. How expensive is reflection compared with "if-else" statements, and which is more efficient inside a loop of approximately 700,000 iterations?

See Effective Java, Item 53: Prefer interfaces to reflection
Performance suffers. Reflective method invocation is much slower than
normal method invocation. Exactly how much slower is hard to say, because
there are so many factors at work. On my machine, the speed difference can be
as small as a factor of two or as large as a factor of fifty.

I found this documentation, and I understand the risk.

The problem with conditions is, as you might know, that they undermine the prediction features of processors.
This means that a prefetch actually turns into a disadvantage if it is wrong.
This is what happens in today's processors when they don't know what the result of a comparison will be, so they bet.
In an if-else statement whose condition is unpredictable, the processor has roughly a 50% probability of correctly predicting the next instruction to be executed.
If we are lucky we execute more instructions per cycle; if we are not, we pay a penalty averaging ~16 cycles while the processor recovers from the wrong prefetch.
That said, let's go on to reflection.
Reflection is mean if performance is important to you. In most frameworks the cost of a reflective call versus a direct (address) call is C_reflection = 3*C_address_call. In Java it is even worse, and I don't really have official numbers about it. The biggest problem is name/address/availability resolution.
That said, let's go to the real world and see some numbers, as we can now understand, justify and even predict the results.
We test 2 classes, 3 tests. For a total of 10M calls/test | 5M calls per class:
Conditional
Reflection
Common Interface
And these are the numbers in seconds:
Conditions - 0.022333
Reflection - 3.02087
Interface - 0.012547
So common interface is the winner in your case, second comes conditional call with almost double the execution time (for the reasons specified above). And last one with an extremely notable difference comes reflection.
Here is the code of the test * :
import java.lang.reflect.InvocationTargetException;
public class JavaReflectionTest {
public interface ClassInterface {
public void execute();
public String getCount();
}
public static class ClassOne implements ClassInterface {
int count = 0;
public String getCount(){
return String.valueOf(count);
}
public void execute(){
count++;
}
}
public static class ClassTwo implements ClassInterface {
int count = 0;
public String getCount(){
return String.valueOf(count);
}
public void execute(){
count++;
}
}
public static void main(String[] args) throws SecurityException, NoSuchMethodException,
IllegalArgumentException, IllegalAccessException,
InvocationTargetException, ClassNotFoundException, InterruptedException {
ClassOne one = new ClassOne();
ClassTwo two = new ClassTwo();
ClassInterface ione = new ClassOne();
ClassInterface itwo = new ClassTwo();
long stopT;
long startT;
int i;
int mod;
//Warm up
for(i=0;i<350000;i++){
one.execute();
two.execute();
ione.execute();
itwo.execute();
one.getClass().getMethod("execute").invoke(one,null);
two.getClass().getMethod("execute").invoke(two,null);
}
//Test conditional call
one = new ClassOne();
two = new ClassTwo();
Thread.sleep(1000);
startT=System.nanoTime();
for(i=0;i<10000000;i++){
mod=i%2;
if(mod==0)
one.execute();
else
two.execute();
}
stopT=System.nanoTime();
System.out.println("Conditions - " + ((stopT-startT)/1000000000.0f)+ " Calls 1: " + one.getCount() + " Calls 2: " + two.getCount());
//Test reflection
one = new ClassOne();
two = new ClassTwo();
Thread.sleep(1000);
startT = System.nanoTime();
for(i=0;i<5000000;i++){
mod=i%2;
one.getClass().getMethod("execute").invoke(one,null);
two.getClass().getMethod("execute").invoke(two,null);
mod=i%2;
}
stopT=System.nanoTime();
System.out.println("Reflection - " + ((stopT-startT)/1000000000.0f)+ " Calls 1: " + one.getCount() + " Calls 2: " + two.getCount());
//Test common interface
ione = new ClassOne();
itwo = new ClassTwo();
Thread.sleep(1000);
startT = System.nanoTime();
for(i=0;i<5000000;i++){
mod=i%2;
ione.execute();
itwo.execute();
mod=i%2;
}
stopT=System.nanoTime();
System.out.println("Interface - " + ((stopT-startT)/1000000000.0f) + " Calls 1: " + ione.getCount() + " Calls 2: " + itwo.getCount());
}
}
* Before creating performance tests we need to make sure that the compiler optimizes/modifies our code as little as possible; this explains some of the statements in the test that you might not find intuitive.
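Note that the reflection loop in the test resolves the method by name on every iteration. As a rough sketch (the class and method names here are made up for illustration, and no timings are claimed), the three variants below do the same work: a lookup per call, a single lookup whose Method object is reused, and a plain call through a common interface. Hoisting the lookup out of the loop is usually the first thing to try before giving up on reflection entirely.

```java
import java.lang.reflect.Method;

// Hypothetical names throughout; only java.lang.reflect.Method is real API.
final class ReflectionLookupSketch {

    interface Action {
        void execute();
    }

    static final class Counter implements Action {
        int count = 0;

        @Override
        public void execute() {
            count++;
        }
    }

    // Resolves "execute" by name on every iteration, like the benchmark loop.
    static void lookupEveryTime(Counter target, int calls) {
        try {
            for (int i = 0; i < calls; i++) {
                target.getClass().getMethod("execute").invoke(target);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Resolves "execute" once, then reuses the Method object in the loop.
    static void lookupOnce(Counter target, int calls) {
        try {
            Method execute = target.getClass().getMethod("execute");
            for (int i = 0; i < calls; i++) {
                execute.invoke(target);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // No reflection: an ordinary call through the common interface.
    static void viaInterface(Action target, int calls) {
        for (int i = 0; i < calls; i++) {
            target.execute();
        }
    }

    public static void main(String[] args) {
        Counter a = new Counter();
        Counter b = new Counter();
        Counter c = new Counter();
        lookupEveryTime(a, 1000);
        lookupOnce(b, 1000);
        viaInterface(c, 1000);
        // All three counters end up with the same value.
        System.out.println(a.count + " " + b.count + " " + c.count);
    }
}
```

To compare the variants on your own machine, time each method separately after a warm-up phase, as the test above does.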

Related

Benchmarking in Java (comparing two classes)

I want to compare the speed of two classes (StringBuilder and StringBuffer) using the append method.
And I wrote a very simple program:
import java.io.IOException;
public class Test {
public static void main(String[] args) {
try {
test(new StringBuffer("")); // StringBuffer: 35117ms.
test(new StringBuilder("")); // StringBuilder: 3358ms.
} catch (IOException e) {
System.err.println(e.getMessage());
}
}
private static void test(Appendable obj) throws IOException {
long before = System.currentTimeMillis();
for (int i = 0; i++ < 1e9; ) {
obj.append("");
}
long after = System.currentTimeMillis();
System.out.println(obj.getClass().getSimpleName() + ": " +
(after - before) + "ms.");
}
}
But I know that this is a bad way to benchmark. I want to put annotations on the method or class, set the number of iterations, tests and different conditions, and get accurate results as output.
Please recommend a good library or standard Java tools to solve this problem. Additionally, if it is not difficult, write a good benchmark.
Thanks in advance!
JMH, the Java Microbenchmark Harness, lets you run correct microbenchmarks. It uses annotations to express benchmark parameters.
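JMH has to be added to the project as a dependency, so it is not shown here. As a dependency-free illustration of the kind of work such a harness takes off your hands, the sketch below does untimed warm-up runs, repeats the timed measurement several times, and writes every result to a volatile field so the JIT cannot drop the measured code. The class and method names are hypothetical, and this is a teaching sketch, not a replacement for JMH.

```java
import java.util.Arrays;
import java.util.function.IntSupplier;

// Hypothetical helper, not JMH: shows warm-up, repeated rounds and a result sink.
final class ManualBenchSketch {

    // Written after every run so the JIT cannot discard the measured work.
    static volatile int sink;

    static int appendWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append('x');
        return sb.length();
    }

    static int appendWithBuffer(int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) sb.append('x');
        return sb.length();
    }

    // Runs the task untimed `warmups` times, then times each of `rounds` runs in ns.
    static long[] measure(IntSupplier task, int warmups, int rounds) {
        for (int i = 0; i < warmups; i++) sink = task.getAsInt();
        long[] nanos = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            sink = task.getAsInt();
            nanos[r] = System.nanoTime() - start;
        }
        return nanos;
    }

    // Middle element of the sorted timings (upper middle for even lengths).
    static long median(long[] values) {
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        long builder = median(measure(() -> appendWithBuilder(n), 10, 20));
        long buffer = median(measure(() -> appendWithBuffer(n), 10, 20));
        System.out.println("StringBuilder median ns: " + builder);
        System.out.println("StringBuffer  median ns: " + buffer);
    }
}
```

Reporting the median of several rounds rather than a single run makes one-off pauses (GC, JIT compilation) less likely to dominate the number.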

Java optimization nitpick: is it faster to cast something and let it throw an exception than to call instanceof before the cast?

Before anyone says anything I'm asking this out of curiosity only; I'm not planning to do any premature optimization based off of this answer.
My question is about speed in using reflection and casting. The standard saying is 'reflection is slow'. My question is what part exactly is slow, and why; particularly in comparing if something is a parent of another instance.
I'm pretty confident that just comparing the class of an object to another Class object is about as fast as any comparison, presumably just a direct comparison of singleton objects that are already stored in the Object's state; but what if one class is a parent of the other?
I usually think of instanceof as being about as fast as a regular class check, but today I thought about it and it seems that some reflection has to happen behind the scenes for instanceof to work. I checked online and found a few places where someone said instanceof is slow, presumably due to the reflection required to compare against the parent of an object?
This led to the next question: what about just casting? If I cast something to a type it isn't, I get a ClassCastException. But this doesn't happen if I cast an object to a parent of itself. Essentially I'm doing an instanceof call, or logic to that effect, when I do the cast at run time, am I not? I've never heard anyone hint that casting an object could be slow. Admittedly not all casts are to a parent of the provided object, but plenty of casts are to parent classes. Yet no one has ever hinted that this could be slow.
So which is it? Is instanceof really not that slow? Are both instanceof and casting to a parent class kind of slow? Or is there some reason a cast can be done faster than an instanceof call?
As always try it and see in your particular situations, but:
-Exceptions are expensive, remarkably so.
-Using exceptions for code flow is almost always a bad idea
Edit:
Ok I was interested so I wrote a quick test system
public class Test{
public Test(){
B b=new B();
C c=new C();
for(int i=0;i<10000;i++){
testUsingInstanceOf(b);
testUsingInstanceOf(c);
testUsingException(b);
testUsingException(c);
}
}
public static void main(String[] args){
Test test=new Test();
}
public static boolean testUsingInstanceOf(A possiblyB){
if (possiblyB instanceof B){
return true;
}else{
return false;
}
}
public static boolean testUsingException(A possiblyB){
try{
B b=(B)possiblyB;
return true;
}catch(Exception e){
return false;
}
}
private class A{
}
private class B extends A{
}
private class C extends A{
}
}
Profile results:
by InstanceOf: 4.43 ms
by Exception: 79.4 ms
as I say, remarkably expensive
And even when it's always going to be a B (simulating the case where you're 99% sure it's a B and just need to be certain), it's still no faster:
Profile results when always B:
by InstanceOf: 4.48 ms
by Exception: 4.51 ms
There is a general answer and a particular answer.
General case
if (/* guard against exception */) {
/* do something that would throw an exception */
} else {
/* recover */
}
// versus
try {
/* do something that would throw an exception */
} catch (TheException ex) {
/* recover */
}
It is a fact that creating/throwing/catching an exception is expensive, and it is likely to be significantly more expensive than doing the test. However, that doesn't mean that the "test first" version is always faster. This is because in the "test first" version, the tests may actually be performed twice: once in the if, and a second time in the code that would throw the exception.
When you take that into account, it is clear that if the cost of the (extra) test is large enough and the relative frequency of exceptions is small enough, "test first" will actually be slower. For example, in:
if (file.exists() && file.canRead()) {
is = new FileInputStream(file);
} else {
System.err.println("missing file");
}
versus
try {
is = new FileInputStream(file);
} catch (IOException ex) {
System.err.println("missing file");
}
the "test first" approach performs 2 extra system calls, and system calls are expensive. If the "missing file" scenario is also unusual ....
The second confounding factor is that the most recent HotSpot JIT compilers do some significant optimization of exceptions. In particular, if the JIT compiler can figure out that the state of the exception object is not used, it may turn the exception create/throw/catch into a simple jump instruction.
The specific case of instanceof
In this case we are most likely comparing these two:
if (o instanceof Foo) {
Foo f = (Foo) o;
/* ... */
}
// versus
try {
Foo f = (Foo) o;
} catch (ClassCastException ex) {
/* */
}
Here a second optimization occurs. An instanceof followed by a type cast is a common pattern. A HotSpot JIT compiler can often eliminate the dynamic type check that is performed by the type cast ... since this is repeating a test that just succeeded. When you factor this in, the "test first" version cannot be slower than the "exception" version ... even if the latter is optimized to a jump.
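The two forms compared above can be put side by side in a small runnable sketch. The types Animal, Dog and Cat are invented for illustration. Besides cost, there is one behavioural difference worth knowing: null fails an instanceof test but passes a cast, so the exception-based version needs separate null handling.

```java
// Hypothetical types; the point is the two checking styles, not the classes.
final class TypeCheckSketch {

    static class Animal { }

    static class Dog extends Animal {
        String bark() {
            return "woof";
        }
    }

    static class Cat extends Animal { }

    // "Test first": instanceof guards the cast, so the cast cannot fail.
    static String viaInstanceOf(Animal a) {
        if (a instanceof Dog) {
            Dog d = (Dog) a;
            return d.bark();
        }
        return "not a dog";
    }

    // "Exception": cast unconditionally and recover from ClassCastException.
    static String viaException(Animal a) {
        try {
            return ((Dog) a).bark();
        } catch (ClassCastException e) {
            return "not a dog";
        }
    }

    public static void main(String[] args) {
        System.out.println(viaInstanceOf(new Dog()) + " / " + viaException(new Dog()));
        System.out.println(viaInstanceOf(new Cat()) + " / " + viaException(new Cat()));
        // null is not an instance of anything, so this takes the fallback path.
        System.out.println(viaInstanceOf(null));
        // viaException(null) would throw NullPointerException instead:
        // the cast of null succeeds and the call on it fails.
    }
}
```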
Unfortunately, Richard's code can't be run directly to produce timings. I've modified the question slightly, assuming that you really want "if A is a B, do something on B", rather than just asking the question "is A a B?". Here is the test code.
UPDATED FROM ORIGINAL
It was rather challenging to write trivial code that the HotSpot compiler didn't reduce to nothing, but I think the following is good:
package net.redpoint.utils;
public class Scratch {
public long counter = 0;
public class A {
public void inc() { counter++; }
}
public class B extends A {
public void inc() { counter++; }
}
public class C extends A {
public void inc() { counter++; }
}
public A[] a = new A[3];
public void test() {
a[0] = new A();
a[1] = new B();
a[2] = new C();
int iter = 100000000;
long start = System.nanoTime();
for(int i = iter; i > 0; i--) {
testUsingInstanceOf(a[i%3]);
}
long end = System.nanoTime();
System.out.println("instanceof: " + iter / ((end - start) / 1000000000.0) + " per second");
start = System.nanoTime();
for(int i = iter; i > 0; i--) {
testUsingException(a[i%3]);
}
end = System.nanoTime();
System.out.println("try{}: " + iter / ((end - start) / 1000000000.0) + " per second");
start = System.nanoTime();
for(int i = iter; i > 0; i--) {
testUsingClassName(a[i%3]);
}
end = System.nanoTime();
System.out.println("classname: " + iter / ((end - start) / 1000000000.0) + " per second");
}
public static void main(String[] args) {
Scratch s = new Scratch();
s.test();
}
public void testUsingInstanceOf(A possiblyB){
if (possiblyB instanceof B){
((B)possiblyB).inc();
}
}
public void testUsingException(A possiblyB){
try{
((B)possiblyB).inc();
} catch(Exception e){
}
}
public void testUsingClassName(A possiblyB){
if (possiblyB.getClass().getName().equals("net.redpoint.utils.Scratch$B")){
((B)possiblyB).inc();
}
}
}
The resulting output:
instanceof: 4.573174070960945E8 per second
try{}: 3.926650051387284E8 per second
classname: 7.689439655530204E7 per second
Test was performed using Oracle JRE7 SE on Windows 8 x64, with Intel i7 sandy bridge CPU.
If the cast can throw an exception, it means that it's doing implicitly what instanceof would do. So, in both cases you are implicitly using reflection, probably in exactly the same way.
The difference is that if the result of instanceof comes back false, no more reflection is happening. If a cast fails and an exception is thrown, you'll have unwinding of the execution stack and, quite possibly, more reflection (for the runtime to determine the correct catch block, based on whether the thrown exception object is an instance of the type to be caught).
The above logic tells me that an instanceof check should be faster. Of course, benchmarking for your particular case would give you the definitive answer.
I don't think you would find one being clearly better than the other.
For instanceof, the work done uses memory and CPU time. Creating an exception uses memory and CPU time as well. Which one uses less of each is something only a well-designed benchmark can tell you.
Coding-wise, I would prefer to see an instanceof rather than a cast that has to manage exceptions.

Why are interfaces slower than abstract classes [duplicate]

This question came to mind while I was looking for the differences between abstract classes and interfaces.
In this post I learned that interfaces are slow because they require extra indirection.
But I don't understand what kind of indirection is required by an interface and not by an abstract class or a concrete class. Please clarify.
Thanks in advance
There are many performance myths, and some were probably true several years ago, and some might still be true on VMs that don't have a JIT.
The Android documentation (remember that Android doesn't have a JVM, it has the Dalvik VM) used to say that invoking a method on an interface was slower than invoking it on a class, so it was contributing to spreading the myth (it's also possible that it was slower on the Dalvik VM before the JIT was turned on). The documentation now says:
Performance Myths
Previous versions of this document made various misleading claims. We
address some of them here.
On devices without a JIT, it is true that invoking methods via a
variable with an exact type rather than an interface is slightly more
efficient. (So, for example, it was cheaper to invoke methods on a
HashMap map than a Map map, even though in both cases the map was a
HashMap.) It was not the case that this was 2x slower; the actual
difference was more like 6% slower. Furthermore, the JIT makes the two
effectively indistinguishable.
Source: Designing for performance on Android
The same thing is probably true for the JIT in the JVM, it would be very odd otherwise.
If in doubt, measure it. My results showed no significant difference. When run, the following program produced:
7421714 (abstract)
5840702 (interface)
7621523 (abstract)
5929049 (interface)
But when I switched the places of the two loops:
7887080 (interface)
5573605 (abstract)
7986213 (interface)
5609046 (abstract)
It appears that abstract classes are slightly (~6%) faster, but that should not be noticeable. These are nanoseconds: 7887080 nanoseconds are ~7.9 milliseconds. That makes it a difference of 0.1 millis per 40k invocations (Java version: 1.6.20).
Here's the code:
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class ClassTest {
    public static void main(String[] args) {
        Random random = new Random();
        List<Foo> foos = new ArrayList<Foo>(40000);
        List<Bar> bars = new ArrayList<Bar>(40000);
        // Mix two implementations at random so each call site sees both types
        for (int i = 0; i < 40000; i++) {
            foos.add(random.nextBoolean() ? new Foo1Impl() : new Foo2Impl());
            bars.add(random.nextBoolean() ? new Bar1Impl() : new Bar2Impl());
        }
        // Calls through the abstract class
        long start = System.nanoTime();
        for (Foo foo : foos) {
            foo.foo();
        }
        System.out.println(System.nanoTime() - start);
        // Calls through the interface
        start = System.nanoTime();
        for (Bar bar : bars) {
            bar.bar();
        }
        System.out.println(System.nanoTime() - start);
    }

    abstract static class Foo {
        public abstract int foo();
    }

    static interface Bar {
        int bar();
    }

    static class Foo1Impl extends Foo {
        @Override
        public int foo() {
            int i = 10;
            i++;
            return i;
        }
    }

    static class Foo2Impl extends Foo {
        @Override
        public int foo() {
            int i = 10;
            i++;
            return i;
        }
    }

    static class Bar1Impl implements Bar {
        @Override
        public int bar() {
            int i = 10;
            i++;
            return i;
        }
    }

    static class Bar2Impl implements Bar {
        @Override
        public int bar() {
            int i = 10;
            i++;
            return i;
        }
    }
}
An object has a "vtable pointer" of some kind which points to a "vtable" (method pointer table) for its class ("vtable" might be the wrong terminology, but that's not important). The vtable has pointers to all the method implementations; each method has an index which corresponds to a table entry. So, to call a class method, you just look up the corresponding method (using its index) in the vtable. If one class extends another, it just has a longer vtable with more entries; calling a method from the base class still uses the same procedure: that is, look up the method by its index.
However, in calling a method from an interface via an interface reference, there must be some alternative mechanism to find the method implementation pointer. Because a class can implement multiple interfaces, it's not possible for the method to always have the same index in the vtable (for instance). There are various possible ways to resolve this, but no way that is quite as efficient as simple vtable dispatch.
However, as mentioned in the comments, it probably won't make much difference with a modern Java VM implementation.
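To see the two dispatch mechanisms at the bytecode level, here is a minimal sketch with made-up types (Shape, Base, Triangle). The Java compiler emits invokeinterface for a call whose receiver is typed as an interface and invokevirtual when it is typed as a class, even though the same method body runs in every case. Compiling the file and running javap -c on the class should show the different instructions; what the JIT later does with them is a separate matter.

```java
// Hypothetical types; all three helpers end up running Triangle.sides().
final class DispatchSketch {

    interface Shape {
        int sides();
    }

    abstract static class Base implements Shape { }

    static final class Triangle extends Base {
        @Override
        public int sides() {
            return 3;
        }
    }

    // Receiver typed as an interface: compiled to invokeinterface.
    static int viaInterface(Shape s) {
        return s.sides();
    }

    // Receiver typed as an abstract class: compiled to invokevirtual.
    static int viaAbstract(Base b) {
        return b.sides();
    }

    // Receiver typed as the concrete class: also invokevirtual.
    static int viaConcrete(Triangle t) {
        return t.sides();
    }

    public static void main(String[] args) {
        Triangle t = new Triangle();
        System.out.println(viaInterface(t) + " " + viaAbstract(t) + " " + viaConcrete(t));
    }
}
```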
This is a variation on Bozho's example. It runs longer and re-uses the same objects so the cache size doesn't matter so much. I also use an array so there is no overhead from the iterator.
public static void main(String[] args) {
Random random = new Random();
int testLength = 200 * 1000 * 1000;
Foo[] foos = new Foo[testLength];
Bar[] bars = new Bar[testLength];
Foo1Impl foo1 = new Foo1Impl();
Foo2Impl foo2 = new Foo2Impl();
Bar1Impl bar1 = new Bar1Impl();
Bar2Impl bar2 = new Bar2Impl();
for (int i = 0; i < testLength; i++) {
boolean flip = random.nextBoolean();
foos[i] = flip ? foo1 : foo2;
bars[i] = flip ? bar1 : bar2;
}
long start;
start = System.nanoTime();
for (Foo foo : foos) {
foo.foo();
}
System.out.printf("The average abstract method call was %.1f ns%n", (double) (System.nanoTime() - start) / testLength);
start = System.nanoTime();
for (Bar bar : bars) {
bar.bar();
}
System.out.printf("The average interface method call was %.1f ns%n", (double) (System.nanoTime() - start) / testLength);
}
prints
The average abstract method call was 4.2 ns
The average interface method call was 4.1 ns
if you swap the order the tests are run you get
The average interface method call was 4.2 ns
The average abstract method call was 4.1 ns
There is more difference in how you run the test than which one you chose.
I got the same result with Java 6 update 26 and OpenJDK 7.
BTW: If you add a loop which only calls the same object each time, you get
The direct method call was 2.2 ns
I tried to write a test that would quantify all of the various ways methods might be invoked. My findings show that it is not whether a method is an interface method or not that matters, but rather the type of the reference through which you are calling it. Calling an interface method through a class reference is much faster (relative to the number of calls) than calling the same method on the same class via an interface reference.
The results for 1,000,000 calls are...
interface method via interface reference: (nanos, millis) 5172161.0, 5.0
interface method via abstract reference: (nanos, millis) 1893732.0, 1.8
interface method via toplevel derived reference: (nanos, millis) 1841659.0, 1.8
Concrete method via concrete class reference: (nanos, millis) 1822885.0, 1.8
Note that the first two lines of the results are calls to the exact same method, but via different references.
And here is the code...
package interfacetest;
/**
*
* @author rpbarbat
*/
public class InterfaceTest
{
static public interface ITest
{
public int getFirstValue();
public int getSecondValue();
}
static abstract public class ATest implements ITest
{
int first = 0;
@Override
public int getFirstValue()
{
return first++;
}
}
static public class TestImpl extends ATest
{
int second = 0;
@Override
public int getSecondValue()
{
return second++;
}
}
static public class Test
{
int value = 0;
public int getConcreteValue()
{
return value++;
}
}
static int loops = 1000000;
/**
* @param args the command line arguments
*/
public static void main(String[] args)
{
// Get some various pointers to the test classes
// To Interface
ITest iTest = new TestImpl();
// To abstract base
ATest aTest = new TestImpl();
// To impl
TestImpl testImpl = new TestImpl();
// To concrete
Test test = new Test();
System.out.println("Method call timings - " + loops + " loops");
StopWatch stopWatch = new StopWatch();
// Call interface method via interface reference
stopWatch.start();
for (int i = 0; i < loops; i++)
{
iTest.getFirstValue();
}
stopWatch.stop();
System.out.println("interface method via interface reference: (nanos, millis)" + stopWatch.getElapsedNanos() + ", " + stopWatch.getElapsedMillis());
// Call interface method via abstract reference
stopWatch.start();
for (int i = 0; i < loops; i++)
{
aTest.getFirstValue();
}
stopWatch.stop();
System.out.println("interface method via abstract reference: (nanos, millis)" + stopWatch.getElapsedNanos() + ", " + stopWatch.getElapsedMillis());
// Call derived interface via derived reference
stopWatch.start();
for (int i = 0; i < loops; i++)
{
testImpl.getSecondValue();
}
stopWatch.stop();
System.out.println("interface via toplevel derived reference: (nanos, millis)" + stopWatch.getElapsedNanos() + ", " + stopWatch.getElapsedMillis());
// Call concrete method in concrete class
stopWatch.start();
for (int i = 0; i < loops; i++)
{
test.getConcreteValue();
}
stopWatch.stop();
System.out.println("Concrete method via concrete class reference: (nanos, millis)" + stopWatch.getElapsedNanos() + ", " + stopWatch.getElapsedMillis());
}
}
package interfacetest;
/**
*
* @author rpbarbat
*/
public class StopWatch
{
private long start;
private long stop;
public StopWatch()
{
start = 0;
stop = 0;
}
public void start()
{
stop = 0;
start = System.nanoTime();
}
public void stop()
{
stop = System.nanoTime();
}
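public float getElapsedNanos()
{
return (stop - start);
}
public float getElapsedMillis()
{
// 1 ms = 1,000,000 ns; divide as float to keep the fractional part
return (stop - start) / 1000000.0f;
}
public float getElapsedSeconds()
{
// 1 s = 1,000,000,000 ns; integer division would truncate to whole seconds
return (stop - start) / 1000000000.0f;
}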
}
This was using Oracle's JDK 1.6_24. Hope this helps put this question to bed...
Regards,
Rodney Barbati
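The results above are reported in both nanoseconds and milliseconds, and converting between the two by hand is easy to get wrong. As a small sketch (the class and method names are invented), the standard TimeUnit enum does the whole-unit conversion, while a floating-point division keeps the fraction.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper showing two ways to convert a nanosecond interval.
final class ElapsedSketch {

    // Whole milliseconds; the fractional part is truncated.
    static long wholeMillis(long nanos) {
        return TimeUnit.NANOSECONDS.toMillis(nanos);
    }

    // Milliseconds with the fractional part preserved.
    static double fractionalMillis(long nanos) {
        return nanos / 1_000_000.0;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += i;
        }
        long elapsed = System.nanoTime() - start;
        System.out.println("sum=" + sum
                + " wholeMillis=" + wholeMillis(elapsed)
                + " fractionalMillis=" + fractionalMillis(elapsed));
    }
}
```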
Interfaces are slower than abstract classes because the run-time decision about which method to invoke adds a small time penalty.
However, once the JIT comes into the picture it takes care of repeated calls to the same method, so you may see the performance lag only on the first call, and even that is minimal.
As of Java 8, abstract classes were made almost useless by the addition of default and static methods to interfaces.

Why are interface method invocations slower than concrete invocations?

This is question comes in mind when I finding difference between abstract class and interface.
In this post I came to know that interfaces are slow as they required extra indirection.
But I am not getting what type of indirection required by the interface and not by the abstract class or concrete class.Please clarify on it.
Thanks in advance
There are many performance myths, and some were probably true several years ago, and some might still be true on VMs that don't have a JIT.
The Android documentation (remember that Android don't have a JVM, they have Dalvik VM) used to say that invoking a method on an interfaces was slower than invoking it on a class, so they were contributing to spreading the myth (it's also possible that it was slower on the Dalvik VM before they turned on the JIT). The documentation does now say:
Performance Myths
Previous versions of this document made various misleading claims. We
address some of them here.
On devices without a JIT, it is true that invoking methods via a
variable with an exact type rather than an interface is slightly more
efficient. (So, for example, it was cheaper to invoke methods on a
HashMap map than a Map map, even though in both cases the map was a
HashMap.) It was not the case that this was 2x slower; the actual
difference was more like 6% slower. Furthermore, the JIT makes the two
effectively indistinguishable.
Source: Designing for performance on Android
The same thing is probably true for the JIT in the JVM, it would be very odd otherwise.
If in doubt, measure it. My results showed no significant difference. When run, the following program produced:
7421714 (abstract)
5840702 (interface)
7621523 (abstract)
5929049 (interface)
But when I switched the places of the two loops:
7887080 (interface)
5573605 (abstract)
7986213 (interface)
5609046 (abstract)
It appears that abstract classes are slightly (~6%) faster, but that should not be noticeable; These are nanoseconds. 7887080 nanoseconds are ~7 milliseconds. That makes it a difference of 0.1 millis per 40k invocations (Java version: 1.6.20)
Here's the code:
public class ClassTest {
public static void main(String[] args) {
Random random = new Random();
List<Foo> foos = new ArrayList<Foo>(40000);
List<Bar> bars = new ArrayList<Bar>(40000);
for (int i = 0; i < 40000; i++) {
foos.add(random.nextBoolean() ? new Foo1Impl() : new Foo2Impl());
bars.add(random.nextBoolean() ? new Bar1Impl() : new Bar2Impl());
}
long start = System.nanoTime();
for (Foo foo : foos) {
foo.foo();
}
System.out.println(System.nanoTime() - start);
start = System.nanoTime();
for (Bar bar : bars) {
bar.bar();
}
System.out.println(System.nanoTime() - start);
}
abstract static class Foo {
public abstract int foo();
}
static interface Bar {
int bar();
}
static class Foo1Impl extends Foo {
#Override
public int foo() {
int i = 10;
i++;
return i;
}
}
static class Foo2Impl extends Foo {
#Override
public int foo() {
int i = 10;
i++;
return i;
}
}
static class Bar1Impl implements Bar {
#Override
public int bar() {
int i = 10;
i++;
return i;
}
}
static class Bar2Impl implements Bar {
#Override
public int bar() {
int i = 10;
i++;
return i;
}
}
}
An object has a "vtable pointer" of some kind which points to a "vtable" (method pointer table) for its class ("vtable" might be the wrong terminology, but that's not important). The vtable has pointers to all the method implementations; each method has an index which corresponds to a table entry. So, to call a class method, you just look up the corresponding method (using its index) in the vtable. If one class extends another, it just has a longer vtable with more entries; calling a method from the base class still uses the same procedure: that is, look up the method by its index.
However, in calling a method from an interface via an interface reference, there must be some alternative mechanism to find the method implementation pointer. Because a class can implement multiple interfaces, it's not possible for the method to always have the same index in the vtable (for instance). There are various possible ways to resolve this, but no way that is quite as efficient as simple vtable dispatch.
However, as mentioned in the comments, it probably won't make much difference with a modern Java VM implementation.
This is variation on Bozho example. It runs longer and re-uses the same objects so the cache size doesn't matter so much. I also use an array so there is no overhead from the iterator.
public static void main(String[] args) {
Random random = new Random();
int testLength = 200 * 1000 * 1000;
Foo[] foos = new Foo[testLength];
Bar[] bars = new Bar[testLength];
Foo1Impl foo1 = new Foo1Impl();
Foo2Impl foo2 = new Foo2Impl();
Bar1Impl bar1 = new Bar1Impl();
Bar2Impl bar2 = new Bar2Impl();
for (int i = 0; i < testLength; i++) {
boolean flip = random.nextBoolean();
foos[i] = flip ? foo1 : foo2;
bars[i] = flip ? bar1 : bar2;
}
long start;
start = System.nanoTime();
for (Foo foo : foos) {
foo.foo();
}
System.out.printf("The average abstract method call was %.1f ns%n", (double) (System.nanoTime() - start) / testLength);
start = System.nanoTime();
for (Bar bar : bars) {
bar.bar();
}
System.out.printf("The average interface method call was %.1f ns%n", (double) (System.nanoTime() - start) / testLength);
}
prints
The average abstract method call was 4.2 ns
The average interface method call was 4.1 ns
if you swap the order the tests are run you get
The average interface method call was 4.2 ns
The average abstract method call was 4.1 ns
There is more difference in how you run the test than which one you chose.
I got the same result with Java 6 update 26 and OpenJDK 7.
BTW: If you add a loop which only call the same object each time, you get
The direct method call was 2.2 ns
I tried to write a test that would quantify all of the various ways methods might be invoked. My findings show that it is not whether a method is an interface method or not that matters, but rather the type of the reference through which you are calling it. Calling an interface method through a class reference is much faster (relative to the number of calls) than calling the same method on the same class via an interface reference.
The results for 1,000,000 calls are...
interface method via interface reference: (nanos, millis) 5172161.0, 5.0
interface method via abstract reference: (nanos, millis) 1893732.0, 1.8
interface method via toplevel derived reference: (nanos, millis) 1841659.0, 1.8
Concrete method via concrete class reference: (nanos, millis) 1822885.0, 1.8
Note that the first two lines of the results are calls to the exact same method, but via different references.
And here is the code...
package interfacetest;
/**
*
* #author rpbarbat
*/
public class InterfaceTest
{
static public interface ITest
{
public int getFirstValue();
public int getSecondValue();
}
static abstract public class ATest implements ITest
{
int first = 0;
#Override
public int getFirstValue()
{
return first++;
}
}
static public class TestImpl extends ATest
{
int second = 0;
#Override
public int getSecondValue()
{
return second++;
}
}
static public class Test
{
int value = 0;
public int getConcreteValue()
{
return value++;
}
}
static int loops = 1000000;
/**
* #param args the command line arguments
*/
public static void main(String[] args)
{
// Get some various pointers to the test classes
// To Interface
ITest iTest = new TestImpl();
// To abstract base
ATest aTest = new TestImpl();
// To impl
TestImpl testImpl = new TestImpl();
// To concrete
Test test = new Test();
System.out.println("Method call timings - " + loops + " loops");
StopWatch stopWatch = new StopWatch();
// Call interface method via interface reference
stopWatch.start();
for (int i = 0; i < loops; i++)
{
iTest.getFirstValue();
}
stopWatch.stop();
System.out.println("interface method via interface reference: (nanos, millis)" + stopWatch.getElapsedNanos() + ", " + stopWatch.getElapsedMillis());
// Call interface method via abstract reference
stopWatch.start();
for (int i = 0; i < loops; i++)
{
aTest.getFirstValue();
}
stopWatch.stop();
System.out.println("interface method via abstract reference: (nanos, millis) " + stopWatch.getElapsedNanos() + ", " + stopWatch.getElapsedMillis());
// Call derived interface via derived reference
stopWatch.start();
for (int i = 0; i < loops; i++)
{
testImpl.getSecondValue();
}
stopWatch.stop();
System.out.println("interface method via toplevel derived reference: (nanos, millis) " + stopWatch.getElapsedNanos() + ", " + stopWatch.getElapsedMillis());
// Call concrete method in concrete class
stopWatch.start();
for (int i = 0; i < loops; i++)
{
test.getConcreteValue();
}
stopWatch.stop();
System.out.println("Concrete method via concrete class reference: (nanos, millis) " + stopWatch.getElapsedNanos() + ", " + stopWatch.getElapsedMillis());
}
}
package interfacetest;
/**
*
* @author rpbarbat
*/
public class StopWatch
{
private long start;
private long stop;
public StopWatch()
{
start = 0;
stop = 0;
}
public void start()
{
stop = 0;
start = System.nanoTime();
}
public void stop()
{
stop = System.nanoTime();
}
public float getElapsedNanos()
{
return (stop - start);
}
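public float getElapsedMillis()
{
// 1 millisecond = 1,000,000 nanoseconds; float division keeps the fraction
return (stop - start) / 1000000f;
}
public float getElapsedSeconds()
{
// float division, so sub-second durations do not truncate to 0
return (stop - start) / 1000000000f;
}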
}
This was using Oracle's JDK 1.6_24. Hope this helps put this question to bed...
Regards,
Rodney Barbati
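Timings like the ones above can depend on the order in which the loops run, because the JVM compiles hot code while the program is running, so the first loop measured may include interpretation and compilation time. A minimal sketch (all names here are hypothetical and not part of the test above) that repeats the same measurement several times so early and late passes can be compared:

```java
// Hypothetical sketch: WarmupSketch, Counter and SimpleCounter are illustrative names.
final class WarmupSketch {
    interface Counter {
        int next();
    }

    static final class SimpleCounter implements Counter {
        private int value = 0;

        @Override
        public int next() {
            return value++;
        }
    }

    /** Calls next() `loops` times and returns the elapsed time in nanoseconds. */
    static long timeCalls(Counter counter, int loops) {
        long start = System.nanoTime();
        int sink = 0;
        for (int i = 0; i < loops; i++) {
            sink += counter.next();
        }
        long elapsed = System.nanoTime() - start;
        // Use the accumulated value so the loop body is not trivially dead code.
        if (sink == 42) {
            System.out.print("");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        Counter counter = new SimpleCounter();
        // Same measurement, five times in a row; only the pass number differs.
        for (int pass = 1; pass <= 5; pass++) {
            long nanos = timeCalls(counter, 1000000);
            System.out.println("pass " + pass + ": " + nanos + " ns");
        }
    }
}
```

If the first pass comes out much slower than the later ones, the difference is more likely warm-up than the kind of reference used for the call.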
Interfaces are slower than abstract classes, because the run-time decision about which method to invoke adds a small time penalty.
However, once the JIT comes into the picture it takes care of repeated calls to the same method, so you may see the performance lag only on the first call, and even that is minimal.
As of Java 8, abstract classes have been made almost useless by the addition of default and static methods to interfaces.
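For readers who have not used them, here is a small sketch of the Java 8 features mentioned above. The names Shape and Square are made up for illustration:

```java
// Illustrative names only: Shape, Square and DefaultMethodDemo are hypothetical.
interface Shape {
    // Abstract method: every implementor must supply it.
    double area();

    // Default method (Java 8+): implementors inherit this body unless they override it.
    default String describe() {
        return "shape with area " + area();
    }

    // Static method (Java 8+): called on the interface itself.
    static Shape unitSquare() {
        return new Square(1.0);
    }
}

final class Square implements Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}

final class DefaultMethodDemo {
    public static void main(String[] args) {
        System.out.println(Shape.unitSquare().describe()); // prints "shape with area 1.0"
    }
}
```

Interfaces still cannot declare instance fields or constructors, which is one thing an abstract class such as ATest above (with its first counter) can do.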

Null-free "maps": Is a callback solution slower than tryGet()?

In comments to "How to implement List, Set, and Map in null free design?", Steven Sudit and I got into a discussion about using a callback, with handlers for "found" and "not found" situations, vs. a tryGet() method, taking an out parameter and returning a boolean indicating whether the out parameter had been populated. Steven maintained that the callback approach was more complex and almost certain to be slower; I maintained that the complexity was no greater and the performance at worst the same.
But code speaks louder than words, so I thought I'd implement both and see what I got. The original question was fairly theoretical with regard to language ("And for argument sake, let's say this language don't even have null") -- I've used Java here because that's what I've got handy. Java doesn't have out parameters, but it doesn't have first-class functions either, so style-wise, it should suck equally for both approaches.
(Digression: As far as complexity goes: I like the callback design because it inherently forces the user of the API to handle both cases, whereas the tryGet() design requires callers to perform their own boilerplate conditional check, which they could forget or get wrong. But having now implemented both, I can see why the tryGet() design looks simpler, at least in the short term.)
First, the callback example:
class CallbackMap<K, V> {
private final Map<K, V> backingMap;
public CallbackMap(Map<K, V> backingMap) {
this.backingMap = backingMap;
}
void lookup(K key, Callback<K, V> handler) {
V val = backingMap.get(key);
if (val == null) {
handler.handleMissing(key);
} else {
handler.handleFound(key, val);
}
}
}
interface Callback<K, V> {
void handleFound(K key, V value);
void handleMissing(K key);
}
class CallbackExample {
private final Map<String, String> map;
private final List<String> found;
private final List<String> missing;
private Callback<String, String> handler;
public CallbackExample(Map<String, String> map) {
this.map = map;
found = new ArrayList<String>(map.size());
missing = new ArrayList<String>(map.size());
handler = new Callback<String, String>() {
public void handleFound(String key, String value) {
found.add(key + ": " + value);
}
public void handleMissing(String key) {
missing.add(key);
}
};
}
void test() {
CallbackMap<String, String> cbMap = new CallbackMap<String, String>(map);
for (int i = 0, count = map.size(); i < count; i++) {
String key = "key" + i;
cbMap.lookup(key, handler);
}
System.out.println(found.size() + " found");
System.out.println(missing.size() + " missing");
}
}
Now, the tryGet() example -- as best I understand the pattern (and I might well be wrong):
class TryGetMap<K, V> {
private final Map<K, V> backingMap;
public TryGetMap(Map<K, V> backingMap) {
this.backingMap = backingMap;
}
boolean tryGet(K key, OutParameter<V> valueParam) {
V val = backingMap.get(key);
if (val == null) {
return false;
}
valueParam.value = val;
return true;
}
}
class OutParameter<V> {
V value;
}
class TryGetExample {
private final Map<String, String> map;
private final List<String> found;
private final List<String> missing;
private final OutParameter<String> out = new OutParameter<String>();
public TryGetExample(Map<String, String> map) {
this.map = map;
found = new ArrayList<String>(map.size());
missing = new ArrayList<String>(map.size());
}
void test() {
TryGetMap<String, String> tgMap = new TryGetMap<String, String>(map);
for (int i = 0, count = map.size(); i < count; i++) {
String key = "key" + i;
if (tgMap.tryGet(key, out)) {
found.add(key + ": " + out.value);
} else {
missing.add(key);
}
}
System.out.println(found.size() + " found");
System.out.println(missing.size() + " missing");
}
}
And finally, the performance test code:
public static void main(String[] args) {
int size = 200000;
Map<String, String> map = new HashMap<String, String>();
for (int i = 0; i < size; i++) {
String val = (i % 5 == 0) ? null : "value" + i;
map.put("key" + i, val);
}
long totalCallback = 0;
long totalTryGet = 0;
int iterations = 20;
for (int i = 0; i < iterations; i++) {
{
TryGetExample tryGet = new TryGetExample(map);
long tryGetStart = System.currentTimeMillis();
tryGet.test();
totalTryGet += (System.currentTimeMillis() - tryGetStart);
}
System.gc();
{
CallbackExample callback = new CallbackExample(map);
long callbackStart = System.currentTimeMillis();
callback.test();
totalCallback += (System.currentTimeMillis() - callbackStart);
}
System.gc();
}
System.out.println("Avg. callback: " + (totalCallback / iterations));
System.out.println("Avg. tryGet(): " + (totalTryGet / iterations));
}
On my first attempt, I got 50% worse performance for callback than for tryGet(), which really surprised me. But, on a hunch, I added some garbage collection, and the performance penalty vanished.
This fits with my instinct, which is that we're basically talking about taking the same number of method calls, conditional checks, etc. and rearranging them. But then, I wrote the code, so I might well have written a suboptimal or subconsciously penalized tryGet() implementation. Thoughts?
Updated: Per comment from Michael Aaron Safyan, fixed TryGetExample to reuse OutParameter.
I would say that neither design makes sense in practice, regardless of the performance. I would argue that both mechanisms are overly complicated and, more importantly, don't take into account actual usage.
Actual Usage
If a user looks up a value in a map and it isn't there, most likely the user wants one of the following:
To insert some value with that key into the map
To get back some default value
To be informed that the value isn't there
Thus I would argue that a better, null-free API would be:
has(key) which indicates if the key is present (if one only wishes to check for the key's existence).
get(key) which reports the value if the key is present; otherwise, throws NoSuchElementException.
get(key,defaultval) which reports the value for the key, or defaultval if the key isn't present.
setdefault(key,defaultval) which inserts (key,defaultval) if key isn't present, and returns the value associated with key (which is defaultval if there is no previous mapping, otherwise prev mapping).
The only way to get back null is if you explicitly ask for it, as in get(key,null). This API is incredibly simple, and yet is able to handle the most common map-related tasks (in most use cases that I have encountered).
I should also add that in Java, has() would be called containsKey() while setdefault() would be called putIfAbsent(). Because get() signals an object's absence via a NoSuchElementException, it is then possible to associate a key with null and treat it as a legitimate association: if get() returns null, it means the key has been associated with the value null, not that the key is absent. (You can also define your API to disallow null values if you so choose, in which case you would throw an IllegalArgumentException from the functions that add associations when the value given is null.) Another advantage of this API is that setdefault() only needs to perform the lookup procedure once instead of twice, which would be the case if you used if( ! dict.has(key) ){ dict.set(key,val); }. A further advantage is that you do not surprise developers who write something like dict.get(key).doSomething() and assume that get() will always return a non-null object (because they have never inserted a null value into the dictionary). Instead, they get a NoSuchElementException if there is no value for that key, which is more consistent with the rest of the error checking in Java and is also much easier to understand and debug than a NullPointerException.
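The API described above could be sketched roughly as follows. This is only an illustration: NullFreeMap is a hypothetical wrapper class, the method names follow the wording of this answer rather than java.util.Map, and set() is the plain insert used in the if( ! dict.has(key) ) example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical wrapper illustrating the proposed API; not a standard class.
final class NullFreeMap<K, V> {
    private final Map<K, V> backing = new HashMap<K, V>();

    /** True if the key is present. */
    boolean has(K key) {
        return backing.containsKey(key);
    }

    /** Returns the value for the key, or throws if the key is absent. */
    V get(K key) {
        if (!backing.containsKey(key)) {
            throw new NoSuchElementException("No mapping for key: " + key);
        }
        return backing.get(key);
    }

    /** Returns the value for the key, or defaultVal if the key is absent. */
    V get(K key, V defaultVal) {
        return backing.containsKey(key) ? backing.get(key) : defaultVal;
    }

    /**
     * Inserts (key, defaultVal) if the key is absent and returns the value now
     * associated with the key. For brevity this sketch does two lookups on the
     * backing map; a real implementation could do it in one.
     */
    V setdefault(K key, V defaultVal) {
        if (!backing.containsKey(key)) {
            backing.put(key, defaultVal);
            return defaultVal;
        }
        return backing.get(key);
    }

    /** Unconditionally associates the key with the value. */
    void set(K key, V value) {
        backing.put(key, value);
    }

    public static void main(String[] args) {
        NullFreeMap<String, Integer> counts = new NullFreeMap<String, Integer>();
        counts.setdefault("apples", 0);
        counts.set("apples", counts.get("apples") + 1);
        System.out.println(counts.get("apples"));    // prints 1
        System.out.println(counts.get("pears", -1)); // prints -1
        System.out.println(counts.has("pears"));     // prints false
    }
}
```

Calling counts.get("pears") in this sketch would throw NoSuchElementException rather than return null.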
Answer To Question
To answer the original question: yes, you are unfairly penalizing the tryGet version. In your callback-based mechanism you construct the callback object only once and use it in all subsequent calls, whereas in your tryGet example you construct your out parameter object in every single iteration. Try taking the line:
OutParameter out = new OutParameter();
out of the for-loop and see if that improves the performance of the tryGet example. In other words, place that line above the for-loop and reuse the out parameter in each iteration.
David, thanks for taking the time to write this up. I'm a C# programmer, so my Java skills are a bit vague these days. Because of this, I decided to port your code over and test it myself. I found some interesting differences and similarities, which are pretty much worth the price of admission as far as I'm concerned. Among the major differences are:
I didn't have to implement TryGet because it's built into Dictionary.
In order to use the native TryGet, instead of inserting nulls to simulate misses, I simply omitted those values. This still means that v = map[k] would have set v to null, so I think it's a proper porting. In hindsight, I could have inserted the nulls and changed (_map.TryGetValue(key, out value)) to (_map.TryGetValue(key, out value) && value != null), but I'm glad I didn't.
I want to be exceedingly fair. So, to keep the code as compact and maintainable as possible, I used lambda expressions, which let me define the callbacks painlessly. This hides much of the complexity of setting up anonymous delegates, and allows me to use closures seamlessly. Ironically, the implementation of Lookup uses TryGet internally.
Instead of declaring a new type of Dictionary, I used an extension method to graft Lookup onto the standard dictionary, much simplifying the code.
With apologies for the less-than-professional quality of the code, here it is:
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApplication1
{
static class CallbackDictionary
{
public static void Lookup<K, V>(this Dictionary<K, V> map, K key, Action<K, V> found, Action<K> missed)
{
V v;
if (map.TryGetValue(key, out v))
found(key, v);
else
missed(key);
}
}
class TryGetExample
{
private Dictionary<string, string> _map;
private List<string> _found;
private List<string> _missing;
public TryGetExample(Dictionary<string, string> map)
{
_map = map;
_found = new List<string>(_map.Count);
_missing = new List<string>(_map.Count);
}
public void TestTryGet()
{
for (int i = 0; i < _map.Count; i++)
{
string key = "key" + i;
string value;
if (_map.TryGetValue(key, out value))
_found.Add(key + ": " + value);
else
_missing.Add(key);
}
Console.WriteLine(_found.Count() + " found");
Console.WriteLine(_missing.Count() + " missing");
}
public void TestCallback()
{
for (int i = 0; i < _map.Count; i++)
_map.Lookup("key" + i, (k, v) => _found.Add(k + ": " + v), k => _missing.Add(k));
Console.WriteLine(_found.Count() + " found");
Console.WriteLine(_missing.Count() + " missing");
}
}
class Program
{
static void Main(string[] args)
{
int size = 2000000;
var map = new Dictionary<string, string>(size);
for (int i = 0; i < size; i++)
if (i % 5 != 0)
map.Add("key" + i, "value" + i);
long totalCallback = 0;
long totalTryGet = 0;
int iterations = 20;
TryGetExample tryGet;
for (int i = 0; i < iterations; i++)
{
tryGet = new TryGetExample(map);
long tryGetStart = DateTime.UtcNow.Ticks;
tryGet.TestTryGet();
totalTryGet += (DateTime.UtcNow.Ticks - tryGetStart);
GC.Collect();
tryGet = new TryGetExample(map);
long callbackStart = DateTime.UtcNow.Ticks;
tryGet.TestCallback();
totalCallback += (DateTime.UtcNow.Ticks - callbackStart);
GC.Collect();
}
Console.WriteLine("Avg. callback: " + (totalCallback / iterations));
Console.WriteLine("Avg. tryGet(): " + (totalTryGet / iterations));
}
}
}
My performance expectations, as I said in the article that inspired this one, would be that neither one is much faster or slower than the other. After all, most of the work is in the searching and adding, not in the simple logic that structures it. In fact, it varied a bit among runs, but I was unable to detect any consistent advantage.
Part of the problem is that I used a low-precision timer and the test was short, so I increased the count by 10x to 2000000 and that helped. Now callbacks are about 3% slower, which I do not consider significant. On my fairly slow machine, callbacks took an average of 17773437 ticks per iteration while TryGet took 17234375.
Now, as for code complexity, it's a bit unfair because TryGet is native, so let's just ignore the fact that I had to add a callback interface. At the calling spot, lambda notation did a great job of hiding the complexity. If anything, it's actually shorter than the if/then/else used in the TryGet version, although I suppose I could have used a ternary operator to make it equally compact.
On the whole, I found the C# to be more elegant, and only some of that is due to my bias as a C# programmer. Mainly, I didn't have to define and implement interfaces, which cut down on the plumbing overhead. I also used pretty standard .NET conventions, which seem to be a bit more streamlined than the sort of style favored in Java.
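For comparison, with Java 8 or later the callback version can be written without declaring a Callback interface, using the standard java.util.function types and lambdas. A rough sketch (LambdaLookup and its lookup helper are hypothetical names, and the map here omits missing keys the way the C# port does, rather than storing nulls):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Hypothetical helper class; requires Java 8 or later.
final class LambdaLookup {
    /** Calls found(key, value) if the key is present, missed(key) otherwise. */
    static <K, V> void lookup(Map<K, V> map, K key, BiConsumer<K, V> found, Consumer<K> missed) {
        if (map.containsKey(key)) {
            found.accept(key, map.get(key));
        } else {
            missed.accept(key);
        }
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        for (int i = 0; i < 10; i++) {
            if (i % 5 != 0) {
                map.put("key" + i, "value" + i); // key0 and key5 are left out
            }
        }
        List<String> found = new ArrayList<>();
        List<String> missing = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            lookup(map, "key" + i, (k, v) -> found.add(k + ": " + v), k -> missing.add(k));
        }
        System.out.println(found.size() + " found");     // prints "8 found"
        System.out.println(missing.size() + " missing"); // prints "2 missing"
    }
}
```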
