Java Reflection Performance - java

Does creating an object using reflection rather than calling the class constructor result in any significant performance differences?

Yes - absolutely. Looking up a class via reflection is an order of magnitude more expensive.
Quoting Java's documentation on reflection:
Because reflection involves types that are dynamically resolved, certain Java virtual machine optimizations can not be performed. Consequently, reflective operations have slower performance than their non-reflective counterparts, and should be avoided in sections of code which are called frequently in performance-sensitive applications.
Here's a simple test I hacked up in 5 minutes on my machine, running Sun JRE 6u10:
public class Main {
public static void main(String[] args) throws Exception
{
doRegular();
doReflection();
}
public static void doRegular() throws Exception
{
long start = System.currentTimeMillis();
for (int i=0; i<1000000; i++)
{
A a = new A();
a.doSomeThing();
}
System.out.println(System.currentTimeMillis() - start);
}
public static void doReflection() throws Exception
{
long start = System.currentTimeMillis();
for (int i=0; i<1000000; i++)
{
A a = (A) Class.forName("misc.A").newInstance();
a.doSomeThing();
}
System.out.println(System.currentTimeMillis() - start);
}
}
With these results:
35 // no reflection
465 // using reflection
Bear in mind the lookup and the instantiation are done together, and in some cases the lookup can be refactored away, but this is just a basic example.
Even if you just instantiate, you still get a performance hit:
30 // no reflection
47 // reflection using one lookup, only instantiating
Again, YMMV.
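To illustrate the "one lookup, only instantiating" case, here is a minimal sketch that resolves the class and its constructor once and reuses them inside the loop. The names CachedConstructorDemo and Widget are made up for this example (Widget stands in for A), and this is not the code that produced the timings above.

```java
import java.lang.reflect.Constructor;

// Hypothetical demo; Widget stands in for the class A used in the test above.
final class CachedConstructorDemo {
    static final class Widget {
        Widget() { }
        int doSomeThing() { return 42; }
    }

    // Do the expensive part (class and constructor lookup) once, outside the loop.
    static Constructor<?> lookupOnce(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot resolve " + className, e);
        }
    }

    // Only the instantiation itself stays reflective inside the loop.
    static int instantiateMany(Constructor<?> ctor, int runs) {
        int sum = 0;
        try {
            for (int i = 0; i < runs; i++) {
                Widget w = (Widget) ctor.newInstance();
                sum += w.doSomeThing();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        Constructor<?> ctor = lookupOnce(Widget.class.getName());
        System.out.println(instantiateMany(ctor, 1_000)); // prints 42000
    }
}
```

Whether this closes most of the gap on a given JVM is something to measure rather than assume.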

Yes, it's slower.
But remember the damn #1 rule--PREMATURE OPTIMIZATION IS THE ROOT OF ALL EVIL
(Well, may be tied with #1 for DRY)
I swear, if someone came up to me at work and asked me this I'd be very watchful over their code for the next few months.
Never optimize until you are sure you need to; until then, just write good, readable code.
Oh, and I don't mean write stupid code either. Just be thinking about the cleanest way you can possibly do it--no copy and paste, etc. (Still be wary of stuff like inner loops and using the collection that best fits your need--Ignoring these isn't "unoptimized" programming, it's "bad" programming)
It freaks me out when I hear questions like this, but then I forget that everyone has to go through learning all the rules themselves before they really get it. You'll get it after you've spent a man-month debugging something someone "Optimized".
EDIT:
An interesting thing happened in this thread. Check the #1 answer, it's an example of how powerful the compiler is at optimizing things. The test is completely invalid because the non-reflective instantiation can be completely factored out.
Lesson? Don't EVER optimize until you've written a clean, neatly coded solution and proven it to be too slow.

You may find that A a = new A() is being optimised out by the JVM.
If you put the objects into an array, they don't perform so well. ;)
The following prints...
new A(), 141 ns
A.class.newInstance(), 266 ns
new A(), 103 ns
A.class.newInstance(), 261 ns
public class Run {
private static final int RUNS = 3000000;
public static class A {
}
public static void main(String[] args) throws Exception {
doRegular();
doReflection();
doRegular();
doReflection();
}
public static void doRegular() throws Exception {
A[] as = new A[RUNS];
long start = System.nanoTime();
for (int i = 0; i < RUNS; i++) {
as[i] = new A();
}
System.out.printf("new A(), %,d ns%n", (System.nanoTime() - start)/RUNS);
}
public static void doReflection() throws Exception {
A[] as = new A[RUNS];
long start = System.nanoTime();
for (int i = 0; i < RUNS; i++) {
as[i] = A.class.newInstance();
}
System.out.printf("A.class.newInstance(), %,d ns%n", (System.nanoTime() - start)/RUNS);
}
}
This suggests the difference is about 150 ns on my machine.

If there really is need for something faster than reflection, and it's not just a premature optimization, then bytecode generation with ASM or a higher level library is an option. Generating the bytecode the first time is slower than just using reflection, but once the bytecode has been generated, it is as fast as normal Java code and will be optimized by the JIT compiler.
Some examples of applications which use code generation:
Invoking methods on proxies generated by CGLIB is slightly faster than Java's dynamic proxies, because CGLIB generates bytecode for its proxies, but dynamic proxies use only reflection (I measured CGLIB to be about 10x faster in method calls, but creating the proxies was slower).
JSerial generates bytecode for reading/writing the fields of serialized objects, instead of using reflection. There are some benchmarks on JSerial's site.
I'm not 100% sure (and I don't feel like reading the source now), but I think Guice generates bytecode to do dependency injection. Correct me if I'm wrong.

"Significant" is entirely dependent on context.
If you're using reflection to create a single handler object based on some configuration file, and then spending the rest of your time running database queries, then it's insignificant. If you're creating large numbers of objects via reflection in a tight loop, then yes, it's significant.
In general, design flexibility (where needed!) should drive your use of reflection, not performance. However, to determine whether performance is an issue, you need to profile rather than get arbitrary responses from a discussion forum.
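As a rough illustration of the "single handler object from a configuration file" case, here is a sketch in which reflection is used once to create the handler and every later call is an ordinary interface call. Handler, UpperCaseHandler and HandlerLoader are hypothetical names invented for this example; in a real program the class name would come from the configuration.

```java
import java.util.Locale;

// Hypothetical sketch: reflection is paid once at startup, not per request.
final class HandlerLoader {
    interface Handler {
        String handle(String request);
    }

    static final class UpperCaseHandler implements Handler {
        UpperCaseHandler() { }

        @Override
        public String handle(String request) {
            return request.toUpperCase(Locale.ROOT);
        }
    }

    // className would normally be read from a configuration file.
    static Handler load(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(Handler.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot create handler " + className, e);
        }
    }

    public static void main(String[] args) {
        Handler handler = load(UpperCaseHandler.class.getName());
        // From here on, calls are plain (non-reflective) interface calls.
        System.out.println(handler.handle("query")); // prints QUERY
    }
}
```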

There is some overhead with reflection, but it's a lot smaller on modern VMs than it used to be.
If you're using reflection to create every simple object in your program then something is wrong. Using it occasionally, when you have good reason, shouldn't be a problem at all.

Yes, there is a performance hit when using reflection, but one possible optimization is to cache the Method object instead of looking it up on every call:
Method md = null; // Call while looking up the method at each iteration.
millis = System.currentTimeMillis( );
for (idx = 0; idx < CALL_AMOUNT; idx++) {
md = ri.getClass( ).getMethod("getValue", null);
md.invoke(ri, null);
}
System.out.println("Calling method " + CALL_AMOUNT+ " times reflexively with lookup took " + (System.currentTimeMillis( ) - millis) + " millis");
// Call using a cache of the method.
md = ri.getClass( ).getMethod("getValue", null);
millis = System.currentTimeMillis( );
for (idx = 0; idx < CALL_AMOUNT; idx++) {
md.invoke(ri, null);
}
System.out.println("Calling method " + CALL_AMOUNT + " times reflexively with cache took " + (System.currentTimeMillis( ) - millis) + " millis");
will result in:
[java] Calling method 1000000 times reflexively with lookup took 5618 millis
[java] Calling method 1000000 times reflexively with cache took 270 millis
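If the same methods are looked up from many places, the caching can be centralized. The following is only a sketch: MethodCache is a hypothetical helper, it handles public zero-argument methods only, and keying by class name assumes a single class loader.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: caches public zero-argument methods by class name and method name.
final class MethodCache {
    private static final Map<String, Method> CACHE = new ConcurrentHashMap<>();

    // The reflective lookup runs only on the first request for a given key.
    static Method lookup(Class<?> type, String name) {
        return CACHE.computeIfAbsent(type.getName() + '#' + name, key -> {
            try {
                return type.getMethod(name);
            } catch (NoSuchMethodException e) {
                throw new IllegalArgumentException(e);
            }
        });
    }

    // Invokes the cached method; only the invoke() cost remains per call.
    static Object call(Object target, String name) {
        try {
            return lookup(target.getClass(), name).invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(call("reflection", "length")); // prints 10
    }
}
```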

Interestingly enough, calling setAccessible(true), which skips the access checks, reduces the cost of the reflective call by roughly 25% in the runs below (214 ns to 159 ns, and 229 ns to 171 ns).
Without setAccessible(true)
new A(), 70 ns
A.class.newInstance(), 214 ns
new A(), 84 ns
A.class.newInstance(), 229 ns
With setAccessible(true)
new A(), 69 ns
A.class.newInstance(), 159 ns
new A(), 85 ns
A.class.newInstance(), 171 ns
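The post does not show its code, so the following is only a guess at the shape of it: setAccessible lives on Constructor (and Method/Field), not on Class, so the constructor has to be obtained first. AccessibleConstructorDemo is a made-up name, and the private constructor is there just to show that access checks are suppressed.

```java
import java.lang.reflect.Constructor;

// Hypothetical sketch of instantiating through a Constructor with access checks suppressed.
final class AccessibleConstructorDemo {
    static final class A {
        private A() { }
    }

    // Look the constructor up once and switch off the per-call access checks.
    static Constructor<A> accessibleConstructor() {
        try {
            Constructor<A> ctor = A.class.getDeclaredConstructor();
            ctor.setAccessible(true);
            return ctor;
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    static A create(Constructor<A> ctor) {
        try {
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Constructor<A> ctor = accessibleConstructor();
        System.out.println(create(ctor) != null); // prints true
    }
}
```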

Reflection is slow, though object allocation is not as hopeless as other aspects of reflection. Achieving equivalent performance with reflection-based instantiation requires you to write your code so the JIT can tell which class is being instantiated. If the identity of the class can't be determined, then the allocation code can't be inlined. Worse, escape analysis fails, and the object can't be stack-allocated. If you're lucky, the JVM's run-time profiling may come to the rescue if this code gets hot, and may determine dynamically which class predominates and may optimize for that one.
Be aware the microbenchmarks in this thread are deeply flawed, so take them with a grain of salt. The least flawed by far is Peter Lawrey's: it does warmup runs to get the methods jitted, and it (consciously) defeats escape analysis to ensure the allocations are actually occurring. Even that one has its problems, though: for example, the tremendous number of array stores can be expected to defeat caches and store buffers, so this will wind up being mostly a memory benchmark if your allocations are very fast. (Kudos to Peter on getting the conclusion right though: that the difference is "150ns" rather than "2.5x". I suspect he does this kind of thing for a living.)

Yes, it is significantly slower. We were running some code that did that, and while I don't have the metrics available at the moment, the end result was that we had to refactor that code to not use reflection. If you know what the class is, just call the constructor directly.

In doReflection(), the overhead is probably due to Class.forName("misc.A") (which requires a class lookup, potentially scanning the class path on the filesystem), rather than to the newInstance() call on the class. I wonder what the stats would look like if Class.forName("misc.A") were done only once, outside the for loop; it doesn't really have to be done on every iteration.

Yes, creating an object by reflection will be slower, because the JVM cannot apply some of its optimizations to types that are resolved dynamically. See the Sun/Java Reflection tutorials for more details.
See this simple test:
public class TestSpeed {
public static void main(String[] args) {
long startTime = System.nanoTime();
Object instance = new TestSpeed();
long endTime = System.nanoTime();
System.out.println(endTime - startTime + "ns");
startTime = System.nanoTime();
try {
Object reflectionInstance = Class.forName("TestSpeed").newInstance();
} catch (InstantiationException e) {
e.printStackTrace();
} catch (IllegalAccessException e) {
e.printStackTrace();
} catch (ClassNotFoundException e) {
e.printStackTrace();
}
endTime = System.nanoTime();
System.out.println(endTime - startTime + "ns");
}
}

Often you can use Apache Commons BeanUtils or PropertyUtils, which use introspection (basically, they cache the metadata about the classes so they don't always need to use reflection).

I think it depends on how light or heavy the target method is. If the target method is very light (e.g. a getter/setter), the reflective call could be 1 to 3 times slower. If the target method takes about 1 millisecond or more, the performance will be very close. Here is the test I did with Java 8 and ReflectASM:
public class ReflectionTest extends TestCase {
@Test
public void test_perf() {
Profiler.run(3, 100000, 3, "m_01 by refelct", () -> Reflection.on(X.class)._new().invoke("m_01")).printResult();
Profiler.run(3, 100000, 3, "m_01 direct call", () -> new X().m_01()).printResult();
Profiler.run(3, 100000, 3, "m_02 by refelct", () -> Reflection.on(X.class)._new().invoke("m_02")).printResult();
Profiler.run(3, 100000, 3, "m_02 direct call", () -> new X().m_02()).printResult();
Profiler.run(3, 100000, 3, "m_11 by refelct", () -> Reflection.on(X.class)._new().invoke("m_11")).printResult();
Profiler.run(3, 100000, 3, "m_11 direct call", () -> X.m_11()).printResult();
Profiler.run(3, 100000, 3, "m_12 by refelct", () -> Reflection.on(X.class)._new().invoke("m_12")).printResult();
Profiler.run(3, 100000, 3, "m_12 direct call", () -> X.m_12()).printResult();
}
public static class X {
public long m_01() {
return m_11();
}
public long m_02() {
return m_12();
}
public static long m_11() {
long sum = IntStream.range(0, 10).sum();
assertEquals(45, sum);
return sum;
}
public static long m_12() {
long sum = IntStream.range(0, 10000).sum();
assertEquals(49995000, sum);
return sum;
}
}
}
The complete test code is available at GitHub:ReflectionTest.java

Related

Why does running multiple lambdas in loops suddenly slow down?

Consider the following code:
public class Playground {
private static final int MAX = 100_000_000;
public static void main(String... args) {
execute(() -> {});
execute(() -> {});
execute(() -> {});
execute(() -> {});
}
public static void execute(Runnable task) {
Stopwatch stopwatch = Stopwatch.createStarted();
for (int i = 0; i < MAX; i++) {
task.run();
}
System.out.println(stopwatch);
}
}
This currently prints the following on my Intel MBP on Temurin 17:
3.675 ms
1.948 ms
216.9 ms
243.3 ms
Notice the roughly 100x slowdown for the third (and any subsequent) execution. Now, obviously, this is NOT how to write benchmarks in Java. The loop code doesn't do anything, so I'd expect it to be eliminated for all repetitions. Also, I could not reproduce this effect using JMH, which tells me the reason is tricky and fragile.
So, why does this happen? Why would there suddenly be such a catastrophic slowdown, what's going on under the hood? An assumption is that C2 bails on us, but which limitation are we bumping into?
Things that don't change the behavior:
using anonymous inner classes instead of lambdas,
using 3+ different nested classes instead of lambdas.
Things that "fix" the behavior. Actually the third invocation and all subsequent appear to be much faster, hinting that compilation correctly eliminated the loops completely:
using 1-2 nested classes instead of lambdas,
using 1-2 lambda instances instead of 4 different ones,
not calling task.run() lambdas inside the loop,
inlining the execute() method, still maintaining 4 different lambdas.
You can actually replicate this with JMH SingleShot mode:
@BenchmarkMode(Mode.SingleShotTime)
@Warmup(iterations = 0)
@Measurement(iterations = 1)
@Fork(1)
public class Lambdas {
@Benchmark
public static void doOne() {
execute(() -> {});
}
@Benchmark
public static void doFour() {
execute(() -> {});
execute(() -> {});
execute(() -> {});
execute(() -> {});
}
public static void execute(Runnable task) {
for (int i = 0; i < 100_000_000; i++) {
task.run();
}
}
}
Benchmark Mode Cnt Score Error Units
Lambdas.doFour ss 0.446 s/op
Lambdas.doOne ss 0.006 s/op
If you look at the -prof perfasm profile for the doFour test, you get a fat clue:
....[Hottest Methods (after inlining)]..............................................................
32.19% c2, level 4 org.openjdk.Lambdas$$Lambda$44.0x0000000800c258b8::run, version 664
26.16% c2, level 4 org.openjdk.Lambdas$$Lambda$43.0x0000000800c25698::run, version 658
There are at least two hot lambdas, and those are represented by different classes. So what you are seeing is likely a monomorphic (one target), then bimorphic (two targets), then polymorphic virtual call at task.run.
A virtual call has to choose which class to call the implementation from. The more classes you have, the worse it gets for the optimizer. The JVM tries to adapt, but it gets worse and worse as the run progresses. Roughly like this:
execute(() -> {}); // compiles with single target, fast
execute(() -> {}); // recompiles with two targets, a bit slower
execute(() -> {}); // recompiles with three targets, slow
execute(() -> {}); // continues to be slow
Now, the elimination of the loop requires seeing through task.run(). In the monomorphic and bimorphic cases it is easy: one or both targets are inlined, their empty bodies are discovered, done. In both cases, you still have to do type checks, which means it is not completely free, with the bimorphic case costing a bit extra. When you hit a polymorphic call, there is no such luck at all: it is an opaque call.
You can add two more benchmarks in the mix to see it:
@Benchmark
public static void doFour_Same() {
Runnable l = () -> {};
execute(l);
execute(l);
execute(l);
execute(l);
}
@Benchmark
public static void doFour_Pair() {
Runnable l1 = () -> {};
Runnable l2 = () -> {};
execute(l1);
execute(l1);
execute(l2);
execute(l2);
}
Which would then yield:
Benchmark Mode Cnt Score Error Units
Lambdas.doFour ss 0.445 s/op ; polymorphic
Lambdas.doFour_Pair ss 0.016 s/op ; bimorphic
Lambdas.doFour_Same ss 0.008 s/op ; monomorphic
Lambdas.doOne ss 0.006 s/op
This also explains why your "fixes" help:
using 1-2 nested classes instead of lambdas,
Bimorphic inlining.
using 1-2 lambda instances instead of 4 different ones,
Bimorphic inlining.
not calling task.run() lambdas inside the loop,
Avoids polymorphic (opaque) call in the loop, allows loop elimination.
inlining the execute() method, still maintaining 4 different lambdas.
Avoids a single call site that experiences multiple call targets. In other words, turns a single polymorphic call site into a series of monomorphic call sites each with its own target.
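The last fix is the only one not shown in code above, so here is a structural sketch of it. CallSiteShapes is a hypothetical class; the counters exist only so the methods return something checkable (the original loops are empty), and the sketch makes no timing claims of its own.

```java
// Hypothetical sketch: the same four empty lambdas, dispatched in two different ways.
final class CallSiteShapes {
    // One shared call site: task.run() here sees every Runnable class ever passed in.
    static int runShared(Runnable task, int iterations) {
        int calls = 0;
        for (int i = 0; i < iterations; i++) {
            task.run();
            calls++;
        }
        return calls;
    }

    // execute() inlined by hand: four loops, four call sites, one receiver class each.
    static int runInlined(int iterations) {
        Runnable a = () -> {};
        Runnable b = () -> {};
        Runnable c = () -> {};
        Runnable d = () -> {};
        int calls = 0;
        for (int i = 0; i < iterations; i++) { a.run(); calls++; }
        for (int i = 0; i < iterations; i++) { b.run(); calls++; }
        for (int i = 0; i < iterations; i++) { c.run(); calls++; }
        for (int i = 0; i < iterations; i++) { d.run(); calls++; }
        return calls;
    }

    public static void main(String[] args) {
        int shared = runShared(() -> {}, 1_000) + runShared(() -> {}, 1_000)
                + runShared(() -> {}, 1_000) + runShared(() -> {}, 1_000);
        System.out.println(shared);            // prints 4000
        System.out.println(runInlined(1_000)); // prints 4000
    }
}
```

Both variants do the same work; the difference is only in how many receiver classes each task.run() call site gets to observe.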

Why does Guava Enums.ifPresent use synchronized under the hood?

Guava's Enums.getIfPresent(Class, String) (called Enums.ifPresent in this question) calls Enums.getEnumConstants under the hood:
@GwtIncompatible // java.lang.ref.WeakReference
static <T extends Enum<T>> Map<String, WeakReference<? extends Enum<?>>> getEnumConstants(
Class<T> enumClass) {
synchronized (enumConstantCache) {
Map<String, WeakReference<? extends Enum<?>>> constants = enumConstantCache.get(enumClass);
if (constants == null) {
constants = populateCache(enumClass);
}
return constants;
}
}
Why does it need a synchronized block? Wouldn't that incur a heavy performance penalty? Java's Enum.valueOf(Class, String) does not appear to need one. Furthermore, if synchronization is indeed necessary, why do it so inefficiently? One would hope that if the enum is present in the cache, it can be retrieved without locking, and that the lock is taken only if the cache needs to be populated.
For Reference: Maven Dependency
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>23.2-jre</version>
</dependency>
Edit: By locking I'm referring to a double checking lock.
I've accepted @maaartinus's answer, but wanted to write a separate "answer" about the circumstances behind the question and the interesting rabbit hole it led me to.
tl;dr - Use Java's Enum.valueOf which is thread safe and does not sync unlike Guava's Enums.ifPresent. Also in majority of cases it probably doesn't matter.
Long story:
I'm working on a codebase that utilizes lightweight Java threads (Quasar Fibers). In order to harness the power of Fibers, the code they run should be primarily async and non-blocking, because Fibers are multiplexed onto Java/OS threads. It becomes very important that individual Fibers do not "block" the underlying thread. If the underlying thread is blocked, it will block all Fibers running on it and performance degrades considerably. Guava's Enums.ifPresent is one of those blockers and I'm certain it can be avoided.
Initially, I started using Guava's Enums.ifPresent because it returns an absent Optional on invalid enum values, unlike Java's Enum.valueOf, which throws IllegalArgumentException (which, to my taste, is less preferable than a null value).
Here is a crude benchmark comparing various methods of converting to enums:
Java's Enum.valueOf with catching IllegalArgumentException to return null
Guava's Enums.ifPresent
Apache Commons Lang EnumUtils.getEnum
Apache Commons Lang 3 EnumUtils.getEnum
My Own Custom Immutable Map Lookup
Notes:
Apache Commons Lang 3 uses Java's Enum.valueOf under the hood, so the two behave almost identically
Earlier versions of Apache Commons Lang use a very similar WeakHashMap solution to Guava but do not use synchronization. They favor cheap reads and more expensive writes (my knee-jerk reaction says that's how Guava should have done it)
Java's decision to throw IllegalArgumentException is likely to have a small cost associated with it when dealing with invalid enum values. Throwing/catching exceptions isn't free.
Guava is the only method here that uses synchronization
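For reference, the first approach in the list (Enum.valueOf with the exception turned into null) can be sketched as below. EnumLookup, valueOfOrNull and the Color enum are names invented for this example; the explicit null check is there because Enum.valueOf throws NullPointerException for a null name.

```java
// Hypothetical helper wrapping Enum.valueOf so that invalid names yield null.
final class EnumLookup {
    enum Color { RED, GREEN }

    static <T extends Enum<T>> T valueOfOrNull(Class<T> type, String name) {
        if (name == null) {
            return null; // Enum.valueOf would throw NullPointerException here
        }
        try {
            return Enum.valueOf(type, name);
        } catch (IllegalArgumentException e) {
            return null; // unknown constant name; this is the costly path in the benchmark
        }
    }

    public static void main(String[] args) {
        System.out.println(valueOfOrNull(Color.class, "RED"));    // prints RED
        System.out.println(valueOfOrNull(Color.class, "PURPLE")); // prints null
    }
}
```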
Benchmark Setup:
uses an ExecutorService with a fixed thread pool of 10 threads
submits 100K Runnable tasks to convert enums
each Runnable task converts 100 enums
each method of converting enums will convert 10 million strings (100K x 100)
Benchmark Results from a run:
Convert valid enum string value:
JAVA -> 222 ms
GUAVA -> 964 ms
APACHE_COMMONS_LANG -> 138 ms
APACHE_COMMONS_LANG3 -> 149 ms
MY_OWN_CUSTOM_LOOKUP -> 160 ms
Try to convert INVALID enum string value:
JAVA -> 6009 ms
GUAVA -> 734 ms
APACHE_COMMONS_LANG -> 65 ms
APACHE_COMMONS_LANG3 -> 5558 ms
MY_OWN_CUSTOM_LOOKUP -> 92 ms
These numbers should be taken with a heavy grain of salt and will change depending on other factors. But they were good enough for me to conclude to go with Java's solution for the codebase using Fibers.
Benchmark Code:
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import com.google.common.base.Enums;
import com.google.common.collect.ImmutableMap;
import com.google.common.collect.ImmutableMap.Builder;
public class BenchmarkEnumValueOf {
enum Strategy {
JAVA,
GUAVA,
APACHE_COMMONS_LANG,
APACHE_COMMONS_LANG3,
MY_OWN_CUSTOM_LOOKUP;
private final static ImmutableMap<String, Strategy> lookup;
static {
Builder<String, Strategy> immutableMapBuilder = ImmutableMap.builder();
for (Strategy strategy : Strategy.values()) {
immutableMapBuilder.put(strategy.name(), strategy);
}
lookup = immutableMapBuilder.build();
}
static Strategy toEnum(String name) {
return name != null ? lookup.get(name) : null;
}
}
public static void main(String[] args) {
final int BENCHMARKS_TO_RUN = 1;
System.out.println("Convert valid enum string value:");
for (int i = 0; i < BENCHMARKS_TO_RUN; i++) {
for (Strategy strategy : Strategy.values()) {
runBenchmark(strategy, "JAVA", 100_000);
}
}
System.out.println("\nTry to convert INVALID enum string value:");
for (int i = 0; i < BENCHMARKS_TO_RUN; i++) {
for (Strategy strategy : Strategy.values()) {
runBenchmark(strategy, "INVALID_ENUM", 100_000);
}
}
}
static void runBenchmark(Strategy strategy, String enumStringValue, int iterations) {
ExecutorService executorService = Executors.newFixedThreadPool(10);
long timeStart = System.currentTimeMillis();
for (int i = 0; i < iterations; i++) {
executorService.submit(new EnumValueOfRunnable(strategy, enumStringValue));
}
executorService.shutdown();
try {
executorService.awaitTermination(1000, TimeUnit.SECONDS);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
long timeDuration = System.currentTimeMillis() - timeStart;
System.out.println("\t" + strategy.name() + " -> " + timeDuration + " ms");
}
static class EnumValueOfRunnable implements Runnable {
Strategy strategy;
String enumStringValue;
EnumValueOfRunnable(Strategy strategy, String enumStringValue) {
this.strategy = strategy;
this.enumStringValue = enumStringValue;
}
@Override
public void run() {
for (int i = 0; i < 100; i++) {
switch (strategy) {
case JAVA:
try {
Enum.valueOf(Strategy.class, enumStringValue);
} catch (IllegalArgumentException e) {}
break;
case GUAVA:
Enums.getIfPresent(Strategy.class, enumStringValue);
break;
case APACHE_COMMONS_LANG:
org.apache.commons.lang.enums.EnumUtils.getEnum(Strategy.class, enumStringValue);
break;
case APACHE_COMMONS_LANG3:
org.apache.commons.lang3.EnumUtils.getEnum(Strategy.class, enumStringValue);
break;
case MY_OWN_CUSTOM_LOOKUP:
Strategy.toEnum(enumStringValue);
break;
}
}
}
}
}
I guess, the reason is simply that enumConstantCache is a WeakHashMap, which is not thread-safe.
Two threads writing to the cache at the same time could end up with an endless loop or alike (at least such thing happened with HashMap as I tried it years ago).
I guess, you could use DCL, but it mayn't be worth it (as stated in a comment).
Further on if synchronization is indeed necessary, why do it so inefficiently? One would hope if enum is present in cache, it can be retrieved without locking. Only lock if cache needs to be populated.
This may get too tricky. For visibility using volatile, you need a volatile read paired with a volatile write. You could get the volatile read easily by declaring enumConstantCache to be volatile instead of final. The volatile write is trickier. Something like
enumConstantCache = enumConstantCache;
may work, but I'm not sure about that.
10 threads, each one having to convert String values to Enums and then perform some task
The Map access is usually way faster than anything you do with the obtained value, so I guess, you'd need much more threads to get a problem.
Unlike HashMap, the WeakHashMap needs to perform some cleanup (called expungeStaleEntries). This cleanup gets performed even in get (via getTable). So get is a modifying operation and you really don't want to execute it concurrently.
Note that reading WeakHashMap without synchronization means performing mutation without locking and it's plain wrong and that's not only theory.
You'd need an own version of WeakHashMap performing no mutations in get (which is simple) and guaranteeing some sane behavior when written during read by a different thread (which may or may not be possible).
I guess something like SoftReference<ImmutableMap<String, Enum<?>>> with some re-loading logic could work well.
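One way to get reads without a lock is to build an immutable name-to-constant map once per enum class. The sketch below is not Guava's implementation and not the SoftReference idea above: it swaps in the JDK's java.lang.ClassValue as the per-class cache and holds the constants strongly. EnumConstantIndex and its Strategy enum are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: per-enum-class immutable index, read without synchronization.
final class EnumConstantIndex {
    enum Strategy { JAVA, GUAVA }

    // ClassValue computes the map lazily per class and caches it; reads take no lock here.
    private static final ClassValue<Map<String, Enum<?>>> INDEX =
            new ClassValue<Map<String, Enum<?>>>() {
                @Override
                protected Map<String, Enum<?>> computeValue(Class<?> type) {
                    Map<String, Enum<?>> byName = new HashMap<>();
                    for (Object constant : type.getEnumConstants()) {
                        Enum<?> e = (Enum<?>) constant;
                        byName.put(e.name(), e);
                    }
                    return Collections.unmodifiableMap(byName);
                }
            };

    // Returns the constant with the given name, or an empty Optional if there is none.
    static <T extends Enum<T>> Optional<T> find(Class<T> type, String name) {
        return Optional.ofNullable(type.cast(INDEX.get(type).get(name)));
    }

    public static void main(String[] args) {
        System.out.println(find(Strategy.class, "JAVA").isPresent());         // prints true
        System.out.println(find(Strategy.class, "INVALID_ENUM").isPresent()); // prints false
    }
}
```

The trade-off compared with a weak or soft cache is memory: the map stays reachable for as long as the ClassValue entry does.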

Java padding performance busting

Hi guys, I have this piece of code:
public class Padding {
static class Pair {
volatile long c1;
// Uncomment this line and see how performance is boosted about 2x
// long q1; //Magic dodo thingy
volatile long c2;
}
static Pair p = new Pair();
static class Worker implements Runnable {
private static final int INT = Integer.MAX_VALUE/8;
private boolean b;
private double res; // accumulator so the Math.random() calls are not dead code
Worker(boolean b) {
this.b = b;
}
public void run() {
long start = System.currentTimeMillis();
if (b) {
for (int i = 0; i < INT; i++) {
p.c1++;
res += Math.random();
}
} else {
for (int i = 0; i < INT; i++) {
p.c2++;
res += Math.random();
}
}
long end = System.currentTimeMillis();
System.out.println("took: " + (end-start) + " Result:" + (p.c1+p.c2));
}
}
public static void main(String[] args) {
System.out.println("Starting....");
Thread t1 = new Thread(new Worker(true));
Thread t2 = new Thread(new Worker(false));
t1.start();
t2.start();
}
}
If I run it as is, it takes about 11 seconds, but if I uncomment q1 it runs in 3 seconds. I tried to find something on the internet, but nothing informative enough came up. As I understand it, it has something to do with JVM optimization, and long q1 probably makes the memory (or cache) layout somehow better. Anyway, my question is: does anyone know where I can read more about it?
Thanks
Performance in your example is degraded by false sharing: c1 and c2 are placed in the same cache line, so the threads need to flush and reload values at every increment of the different fields to maintain cache coherency after the mutual invalidation of their cache line copies.
In your case it is enough to declare one more long field, q1, right after c1 to make c2 go to another cache line (a cache line is only 64 bytes on the x86 family). After that, cache management becomes far more efficient: the threads can use different cache lines and no longer invalidate each other's copies.
There are many articles devoted to the hardware nature of this issue (and to software ways of avoiding it). Dealing with false sharing by "footprint padding" (like your solution) has long been tricky: the Java platform doesn't guarantee that field order and cache line padding at runtime will be exactly what you expect from the class declaration. Even a "minor" platform update or a switch to another VM implementation can break such a solution (because fields, especially unused dummies, are subject to optimization).
That's why JEP 142 was introduced and the @Contended annotation was implemented in Java 8. This annotation allows you to configure which fields of the class should be placed on different cache lines.
But for now it's just a VM hint without any absolute guarantee that false sharing is avoided in all situations, so you should look at your code carefully and verify its behaviour (if your application is sensitive to the false sharing issue, of course).
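A manual-padding variant of the Pair class might look like the sketch below. It assumes 64-byte cache lines and, as noted above, the JVM is free to reorder or drop the padding fields, so treat it as an experiment to measure rather than a guarantee. PaddedCounters and runBoth are hypothetical names.

```java
// Hypothetical sketch of "footprint padding"; the field layout is NOT guaranteed by the JVM.
final class PaddedCounters {
    static final class Pair {
        volatile long c1;
        // Seven unused longs (56 bytes), intended to push c2 onto a different 64-byte line.
        long p1, p2, p3, p4, p5, p6, p7;
        volatile long c2;
    }

    // Each thread increments its own field, so every field has exactly one writer.
    static Pair runBoth(int increments) {
        Pair pair = new Pair();
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < increments; i++) pair.c1++;
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < increments; i++) pair.c2++;
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return pair;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        Pair pair = runBoth(50_000_000);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("took: " + millis + " ms, result: " + (pair.c1 + pair.c2));
    }
}
```

Running it with and without the padding line gives a quick feel for whether false sharing is in play on a particular machine and JVM.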

Is this an efficient way to obfuscate code and is it performance friendly?

So I recently thought of a way that would make code de-obfuscation a lot more difficult. But I am not quite sure if this affects the performance of my code negatively: So my idea is to turn this code:
public class SpeedTest1 {
public static void main(String[] args){
long start = System.currentTimeMillis();
String toEncode = "fhsdakjfhasdkfhdsajkhfkshfv ksahyfvkawksefhkfhskfhkjsfhsdkfhjfhskjhafjkhskjadfhksdfhkjsdhfksfhksdhfsdyfieyt893489ygfudhgiueryriohetyuieyiuweatiuewytiueaytuiwfytwuiediuvnhuighsiudghfjdkghfsdkjghdiugfdkghdkjghdfkghfdghdigyeuriyeibuityeuirtuireytiuerythgfdkgiuegduigkghfdjkghjgkdfhgjfdhgjfdghfdkjghfdjkghfdkjghfdjkgfdjkghfdkjghfdjkgfdkjghfdkjgheriytretyretrityreiutyeriuhslfjlflkfflksdjflkjflks";
String str = Base64.encode(toEncode.getBytes());
try {
System.out.println(new String(Base64.decode(str), "UTF-8"));
} catch (Base64DecodingException e) {
e.printStackTrace();
} catch (UnsupportedEncodingException e) {
e.printStackTrace();
}
long end = System.currentTimeMillis();
System.out.println("Time: " + (end - start));
}
}
Into this code:
public class SpeedTest2 {
public static void main(String[] args){
long start = System.currentTimeMillis();
System.out.println(y(x(z(a().getBytes()))));
long end = System.currentTimeMillis();
System.out.println("Time: " + (end - start));
}
private static String y(byte[] b){
try {
return new String(b, "UTF-8");
} catch (UnsupportedEncodingException e) {
e.printStackTrace();
}
return null;
}
private static byte[] x(String s){
try {
return Base64.decode(s);
} catch (Base64DecodingException e) {
e.printStackTrace();
}
return null;
}
private static String z(byte[] b){
return Base64.encode(b);
}
private static String a(){
return "fhsdakjfhasdkfhdsajkhfkshfv ksahyfvkawksefhkfhskfhkjsfhsdkfhjfhskjhafjkhskjadfhksdfhkjsdhfksfhksdhfsdyfieyt893489ygfudhgiueryriohetyuieyiuweatiuewytiueaytuiwfytwuiediuvnhuighsiudghfjdkghfsdkjghdiugfdkghdkjghdfkghfdghdigyeuriyeibuityeuirtuireytiuerythgfdkgiuegduigkghfdjkghjgkdfhgjfdhgjfdghfdkjghfdjkghfdkjghfdjkgfdjkghfdkjghfdjkgfdkjghfdkjgheriytretyretrityreiutyeriuhslfjlflkfflksdjflkjflks";
}
}
Now the second piece makes it more difficult to figure out what the code does. But I am worried about the performance of my program, so I added the timing lines to see whether one of the two is faster than the other. Most of the time they both print 0 as the time, but sometimes one of the two prints something like 15, and it is never the same one. I did find the answer "java how expensive is a method call", which states that Java itself optimizes the code at run time, so that would mean it doesn't matter which of the two examples I use, right? They would both be equal at the time of execution. So I now want to know: is this a good way to obfuscate code, or does it affect code efficiency negatively?
Regarding obfuscation in general:
1) Against what do you want to protect?
1.1) A Java beginner understanding your code: Your approach may be ok
1.2) A Java expert understanding your code: Your approach won't really work and may not even slow the expert down by more than a factor of 2
1.3) A competitor understanding your code: Your approach won't really work, the competitor may put considerable resources on the task
2) Do you have to maintain the code later on?
2.1) Yes: You need something which translates your original sources to the obfuscated variants (see below)
2.2) No: You may be ok with it, however, if you need to fix a bug later on, you might not understand what you have created some time ago (it happens to experts regularly even with documented code...)
Obfuscation tools:
Having said that, you may want to look at ProGuard ( http://proguard.sourceforge.net ). It is a Java obfuscator and it even improves performance on low resources platforms (mostly by shortening class and package names and reducing the size of the class files to push around).
There are encrypting class loaders. It raises the difficulty for your opponent, but you won't be safe - see this article.
Regarding performance:
You have to run performance tests with and without your change. And you need to do that on the platform on which the software shall run eventually. Some code is certainly slower and some certainly faster - however, eventually you need to test it.

How to estimate execution time of method when it is invoked using Java Reflection

Hi, can anyone help me with this issue?
I am invoking the method addMe() of class CustomeMath through reflection, and I want to know the execution time taken by addMe() only (that is, excluding the time taken by reflection's invoke() call itself). I am posting generic code for this problem.
import java.lang.reflect.*;
public class CustomeMath{
public int addMe(int a, int b)
{
return a + b;
}
public static void main(String args[])
{
try {
Class cls = Class.forName("CustomeMath");
Class partypes[] = new Class[2];
partypes[0] = Integer.TYPE;
partypes[1] = Integer.TYPE;
Method meth = cls.getMethod(
"addMe", partypes);
CustomeMath methobj = new CustomeMath();
Object arglist[] = new Object[2];
arglist[0] = new Integer(37);
arglist[1] = new Integer(47);
Object retobj
= meth.invoke(methobj, arglist);
Integer retval = (Integer)retobj;
System.out.println(retval.intValue());
}
catch (Throwable e) {
System.err.println(e);
}
}
}
Now I want to get the execution time taken by the method addMe(), and this time should not include the time taken by the invoke() method. How do I get this time? (Remember, I don't want to use System.currentTimeMillis().)
If you want to measure times with some precision, you must use System.nanoTime(). Invoke it as the very first line of addMe() and again as the last line, then compute the difference and print it; remember, the values are in nanoseconds.
If you want to measure the execution time from outside (in the main) without including the invoke() overhead, you simply can't.
System.currentTimeMillis() is very fast to execute compared to System.nanoTime(), but has poor precision (several milliseconds), so be careful if addMe() executes quickly, as the measured time may not be meaningful. When you want to measure the execution time of a method that is very fast, you normally execute the method thousands of times in a loop and measure the total execution time with only two calls to System.nanoTime(), as the overhead of the time measurement could be larger than the execution time of the method itself.
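The loop-and-divide approach described above could be sketched like this. LoopTimer, averageNanos and the sink field are hypothetical names; the untimed first pass is a crude warm-up, and a proper harness such as JMH would be more trustworthy for real measurements.

```java
import java.util.function.IntBinaryOperator;

// Hypothetical helper: average cost per call, using only two System.nanoTime() calls.
final class LoopTimer {
    // Results are written here so the JIT cannot discard the measured calls as dead code.
    static volatile long sink;

    static double averageNanos(IntBinaryOperator op, int runs) {
        long acc = 0;
        // Untimed warm-up pass so the method has a chance to be JIT-compiled first.
        for (int i = 0; i < runs; i++) {
            acc += op.applyAsInt(i, 10);
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            acc += op.applyAsInt(i, 10);
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return (double) elapsed / runs;
    }

    public static void main(String[] args) {
        double avg = averageNanos((a, b) -> a + b, 10_000_000);
        System.out.printf("The average call time was %.1f ns%n", avg);
    }
}
```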
To check performance issues you would usually use a profiler, then make several variations of your algorithm and compare their performance in the profiler. This is one of the main purposes of profilers, and I would say it is a much better bet than trying to come up with a home-grown solution which may or may not give you reliable results. There are quite a few of them; I have very good experience with YourKit.
And - I think a profiler is a must have tool for any serious developer regardless the programming language anyway...
The only way to get the time without reflections is to call the method without reflections. This is because the method is so trivial, that attempting to monitor it will create more overhead than the actual method and give you how fast you can monitor the method rather than the speed of the method.
public static int addMe(int a, int b) {
return a + b;
}
public static void main(String args[]) throws InvocationTargetException, IllegalAccessException, NoSuchMethodException {
Method addMe = CustomeMath.class.getMethod("addMe", int.class, int.class);
int result = (Integer) addMe.invoke(null, 37, 47);
System.out.println(result);
int runs = 10000000;
{
long start = System.nanoTime();
for (int i = 0; i < runs; i++)
addMe(i, 10);
long time = System.nanoTime() - start;
System.out.printf("The average call time was %.1f ns%n", (double) time / runs);
}
{
long start = System.nanoTime();
for (int i = 0; i < runs; i++)
addMe.invoke(null, i, 10);
long time = System.nanoTime() - start;
System.out.printf("The average call time using reflections was %.1f ns%n", (double) time / runs);
}
}
prints
84
The average call time was 1.4 ns
The average call time using reflections was 670.1 ns
Some homework for you;
Can you speculate why there is such a difference?
Why might you get a different result in a more realistic example?
