Do more methods, even if they are not called, have an effect on the performance of a particular class...
By performance, I mean anything: does it take longer to create the object, does it take longer to actually execute a method, etc.
Now my understanding is that the code will only be compiled by the JIT compiler if it reaches a code block/method that it has not reached before...which would lead me to think that I am not affecting anything by adding default methods. Yes it will add to the "size" of the (byte) code but doesn't actually affect performance?
Am I right or wrong?
Here is the example:
public interface MyInterface {
    void someMethod();

    public default void myDefaultMethod() {
        System.out.println("hey");
    }
}
public class MyClass implements MyInterface {
    public void someMethod() {
    }
}
public static void main(String[] args) {
    MyClass c = new MyClass();
    c.someMethod();
    c.myDefaultMethod();
}
If I then change MyInterface and add LOTS of default methods, even if they are never called, will it have an effect:
public interface MyInterface {
    void someMethod();

    public default void myDefaultMethod() {
        System.out.println("hey");
    }

    public default void myDefaultMethod1() {
        System.out.println("hey1");
    }

    public default void myDefaultMethod2() {
        System.out.println("hey2");
    }

    // ...

    public default void myDefaultMethod100() {
        System.out.println("hey100");
    }
}
You're right in one sense. There's some nuance, though.
Do more methods, even if they are not called, have an effect on the performance of a particular class...
"Performance" usually refers to speed of execution of the program. Code that is never executed will never (directly) consume any CPU time. Thus, code that is never executed cannot (directly) affect execution time. So in that sense, you are correct.
By performance, I mean anything, like does it take longer to create the object, does it take longer to actually execute a method...etc...
No, and no. There's no reason having extra methods lying around would affect object creation time, as that's a function of object size, and at least in Java objects don't directly contain their methods, if memory serves. Having extra methods definitely won't (directly) affect execution of unrelated methods.
Now my understanding is that the code will only be compiled by the JIT compiler if it reaches a code block/method that it has not reached before...
This isn't totally right. The JITC can revisit the same section of code over and over again if it determines that doing so would be beneficial.
... which would lead me to think that I am not affecting anything by adding default methods. Yes it will add to the "size" of the (byte) code but doesn't actually affect performance?
You're right that the bytecode file would be larger. You're wrong that it wouldn't make a difference.
Code size can have a significant impact on performance. Small programs that can fit mostly/entirely in cache will have a significant advantage over larger programs, as pieces of code don't have to be loaded from RAM or the HDD/SSD, which are much slower than the cache.
The amount of extra code needed before this matters might be pretty large, though, so for a method or two it probably wouldn't matter much. I'm not sure at what point code size in Java becomes a problem; I've never tried to find out.
If you never call those methods, it might be possible that the bits of code that make up those methods are never loaded, which removes their cache-related performance penalty. I'm not certain if splitting the program code like this is possible, though.
So in the end, it probably wouldn't be harmful, as long as you don't have an excessive number of methods. Having methods around that are never called, though, can be a problem for code maintainability, which is always a factor you should consider.
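If you want to convince yourself empirically, here is a rough sketch (not a rigorous benchmark; JIT warm-up and GC noise will dominate unless you use a harness such as JMH) comparing construction cost of a class with no extra methods against one carrying unused methods. The class names here are made up for illustration.

```java
// Rough sketch, not a rigorous benchmark: compares construction cost of a
// class with no extra methods vs. one carrying unused methods. Expect noisy
// numbers; a real comparison should use a benchmark harness such as JMH.
public class CreationTiming {
    static class Small { int x; }

    static class Large { // same field as Small, plus unused methods
        int x;
        void m1() {} void m2() {} void m3() {} void m4() {} void m5() {}
        void m6() {} void m7() {} void m8() {} void m9() {} void m10() {}
    }

    static long timeCreations(Runnable create, int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) create.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // warm-up so the JIT compiles both paths before we measure
        timeCreations(Small::new, n);
        timeCreations(Large::new, n);
        System.out.println("small: " + timeCreations(Small::new, n) + " ns");
        System.out.println("large: " + timeCreations(Large::new, n) + " ns");
    }
}
```

On a typical JVM the two numbers come out roughly equal, consistent with the answer above: instances don't carry their methods, so unused methods don't slow down construction.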
For example:
class Main {
    public boolean hasBeenUpdated = false;

    public void updateMain() {
        this.hasBeenUpdated = true;
        /*
        alternative:
        if (!hasBeenUpdated) {
            this.hasBeenUpdated = true;
        }
        */
    }

    public void persistUpdate() {
        this.hasBeenUpdated = false;
    }
}
Main instance = new Main();
instance.updateMain();
instance.updateMain();
instance.updateMain();
Does instance.hasBeenUpdated get updated 3 times in memory?
The reason I ask this is because I hoped to use a boolean("hasBeenUpdated") as a flag, and this could theoretically be "changed" many, many times, before I call "instance.persistUpdate()".
Does the JVM's JIT see this and perform an optimization?
JIT will collapse redundant statements only when it can PROVE that removing the code will not change the behavior. For example, if you did this:
int i;
i = 1;
i = 1;
i = 1;
The first two assignments are provably redundant, and the JIT could eliminate them. If instead it's
int i;
i = someMethodReturningInt();
i = someMethodReturningInt();
i = someMethodReturningInt();
the JIT has no way of knowing what someMethodReturningInt() does, including whether it has any side effects, so it must invoke the method 3 times. Whether or not it actually stores any but the final value is immaterial, as the code would behave the same either way. (Declaring volatile int i; instead would force it to store each value.)
Of course, if you're doing other things in between the method invocations, then it will be forced to perform the assignments.
The whole topic is part of the more general "happens-before" and "happens-after" concepts documented in the language and JVM specifications.
Optimization is NEVER supposed to change the behavior of a program, except possibly to reduce its runtime. There have been instances where bugs in the optimizer inadvertently did introduce errors, but these have been few and far between. In general you don't need to worry about whether optimization will break your code.
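To make the "provable redundancy" point concrete, here is a small sketch: because the method below has a visible side effect (it bumps a counter), the JIT must execute all three calls even though only the last assignment to i is ever read.

```java
public class SideEffects {
    static int calls = 0;

    // Not pure: every invocation has an observable side effect.
    static int next() {
        calls++;
        return 1;
    }

    public static void main(String[] args) {
        int i;
        i = next();
        i = next();
        i = next();
        // All three invocations must happen, even if the JIT decides
        // not to store the first two values of i anywhere.
        System.out.println(calls); // prints 3
        System.out.println(i);     // prints 1
    }
}
```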
It can perform an optimization, yes.
As a matter of fact, it can issue a single write, or a single call to updateMain. All three calls would be collapsed into one.
But for that to happen, JIT has to prove that nothing else breaks, or more specifically that code does not break the JMM rules. In this specific case, as far as I understand it, it does not.
Given the choice is between JVM code that implements
move new value to variable
and
compare new value with current value of variable
if not the same
move new value to variable
the JVM would have to be fairly nutty to implement it the latter way. That's a pessimization, not an optimization.
The JVM to a large extent relies on the real machine to do simple operations, and real machines store values in memory when you tell them to store values in memory.
Similar to "Can JIT be prevented from optimising away method calls?": I'm attempting to track memory usage of long-lived data store objects. However, I'm finding that if I initialize a store, log the system memory, then initialize another store, sometimes the compiler (presumably the JIT) is smart enough to notice that these objects are no longer needed.
public class MemTest {
    public static void main(String[] args) {
        logMemory("Initial State");
        MemoryHog mh = new MemoryHog();
        logMemory("Built MemoryHog");
        MemoryHog mh2 = new MemoryHog();
        logMemory("Built Second MemoryHog"); // by here, mh may be GCed
    }
}
Now the suggestion in the linked thread is to keep a pointer to these objects, but the GC appears to be smart enough to tell that the objects aren't used by main() anymore. I could add a call to these objects after the last logMemory() call, but that's a rather manual solution - every time I test an object, I have to do some sort of side-effect triggering call after the final logMemory() call, or I may get inconsistent results.
I'm looking for general case solutions; I understand that adding a call like System.out.println(mh.hashCode()+mh2.hashCode()) at the end of the main() method would be sufficient, but I dislike this for several reasons. First, it introduces an external dependency on the testing above - if the SOUT call is removed, the behavior of the JVM during the memory logging calls may change. Second, it's prone to user-error; if the objects being tested above change, or new ones are added, the user must remember to manually update this SOUT call as well, or they'll introduce difficult to detect inconsistencies in their test. Finally, I dislike that this solution prints at all - it seems like an unnecessary hack that I can avoid with a better understanding of the JIT's optimizations. To the last point, Patricia Shanahan's answer offers a reasonable solution (explicitly print that the output is for memory sanity purposes) but I'd still like to avoid it if possible.
So my initial solution is to store these objects in a static list, and then iterate over them in the main class's finalize method*, like so:
public class MemTest {
    private static ArrayList<Object> objectHolder = new ArrayList<>();

    public static void main(String[] args) {
        logMemory("Initial State", null);
        MemoryHog mh = new MemoryHog();
        logMemory("Built MemoryHog", mh); // adds mh to objectHolder
        MemoryHog mh2 = new MemoryHog();
        logMemory("Built Second MemoryHog", mh2); // adds mh2 to objectHolder
    }

    protected void finalize() throws Throwable {
        for (Object o : objectHolder) {
            o.hashCode();
        }
    }
}
But now I've only offloaded the problem one step - what if the JIT optimizes away the loop in the finalize method, and decides these objects don't need to be saved? Admittedly, maybe simply holding the objects in the main class is enough for Java 7, but unless it's documented that the finalize method can't be optimized away, there's still nothing theoretically preventing the JIT/GC from getting rid of these objects early, since there are no side effects in the contents of my finalize method.
One possibility would be to change the finalize method to:
protected void finalize() throws Throwable {
    int codes = 0;
    for (Object o : objectHolder) {
        codes += o.hashCode();
    }
    System.out.println(codes);
}
As I understand it (and I could be wrong here), calling System.out.println() will prevent the JIT from getting rid of this code, since it's a method with external side effects, so even though it doesn't impact the program, it can't be removed. This is promising, but I don't really want some sort of gibberish being output if I can help it. The fact that the JIT can't (or shouldn't!) optimize away System.out.println() calls suggests to me that the JIT has a notion of side effects, and if I can tell it this finalize block has such side effects, it should never optimize it away.
So my questions:
Is holding a list of objects in the main class enough to prevent them from ever being GCed?
Is looping over those objects and calling something trivial like .hashCode() in the finalize method enough?
Is computing and printing some result in this method enough?
Are there other methods (like System.out.println) the JIT is aware of that cannot be optimized away, or even better, is there some way to tell the JIT not to optimize away a method call / code block?
*Some quick testing confirms, as I suspected, that the JVM doesn't generally run the main class's finalize method; it abruptly exits. The JIT/GC may still not be smart enough to GC my objects simply because the finalize method exists, even if it doesn't get run, but I'm not confident that's always the case. If it's not documented behavior, I can't reasonably trust it will remain true, even if it's true now.
Here's a plan that may be overkill, but should be safe and reasonably simple:
Keep a List of references to the objects.
At the end, iterate over the list summing the hashCode() results.
Print the sum of the hash codes.
Printing the sum ensures that the final loop cannot be optimized out. The only thing you need to do for each object creation is put it in a List add call.
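A minimal sketch of that plan (logMemory and MemoryHog here are hypothetical stand-ins for the asker's own code):

```java
import java.util.ArrayList;
import java.util.List;

public class MemLog {
    // Hypothetical stand-in for the asker's MemoryHog.
    static class MemoryHog { byte[] data = new byte[1 << 20]; }

    // Strongly-reachable list keeps every logged object alive.
    static final List<Object> live = new ArrayList<>();

    static void logMemory(String label) {
        Runtime rt = Runtime.getRuntime();
        System.out.println(label + ": "
                + (rt.totalMemory() - rt.freeMemory()) + " bytes in use");
    }

    public static void main(String[] args) {
        logMemory("Initial State");
        live.add(new MemoryHog());
        logMemory("Built MemoryHog");
        live.add(new MemoryHog());
        logMemory("Built Second MemoryHog");
        // Summing and printing the hash codes makes the objects' liveness
        // observable, so neither the list nor its contents can be collected
        // before this point.
        int sum = 0;
        for (Object o : live) sum += o.hashCode();
        System.out.println("hash sum (ignore): " + sum);
    }
}
```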
Yes, it would be legal for mh1 to be garbage collected at that point. At that point, there is no code that could possibly use the variable. If the JVM could detect this, then the corresponding MemoryHog object will be treated as unreachable ... if the GC were to run at that point.
A later call like System.out.println(mh1) would be sufficient to inhibit collection of the object. So would using it in a "computation"; e.g.
if (mh1 == mh2) { System.out.println("the sky is falling!"); }
Is holding a list of objects in the main class enough to prevent them from ever being GCed?
It depends on where the list is declared. If the list was a local variable, and it became unreachable before mh1, then putting the object into the list will make no difference.
Is looping over those objects and calling something trivial like .hashCode() in the finalize method enough?
By the time the finalize method is called, the GC has already decided that the object is unreachable. The only way that the finalize method could prevent the object being deleted would be to add it to some other (reachable) data structure or assign it to a (reachable) variable.
Are there other methods (like System.out.println) the JIT is aware of that cannot be optimized away,
Yea ... anything that makes the object reachable.
or even better, is there some way to tell the JIT not to optimize away a method call / code block?
No way to do that ... apart from making sure that the method call or code block does something that contributes to the computation being performed.
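For what it's worth, JDK 9 (newer than the Java 7 the question targets) later added an API aimed at exactly this situation: java.lang.ref.Reference.reachabilityFence(obj) guarantees the object stays strongly reachable up to the call, whatever the JIT's liveness analysis concludes. A sketch, with MemoryHog again a hypothetical stand-in:

```java
import java.lang.ref.Reference;

public class FenceDemo {
    // Hypothetical stand-in for the asker's MemoryHog.
    static class MemoryHog { byte[] data = new byte[1 << 20]; }

    public static void main(String[] args) {
        MemoryHog mh = new MemoryHog();
        // ... take memory measurements here ...
        // mh cannot be collected before this line, even though it is
        // never otherwise used again (requires Java 9+).
        Reference.reachabilityFence(mh);
    }
}
```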
UPDATE
First, what is going on here is not really JIT optimization. Rather, the JIT is emitting some kind of "map" that the GC is using to determine when local variables (i.e. variables on the stack) are dead ... depending on the program counter (PC).
Your examples to inhibit collection all involve blocking the JIT via SOUT; I'd like to avoid that somewhat hacky solution.
Hey ... ANYTHING that depends on the exact timing of when things are garbage collected is a hack. You are not supposed to do that in a properly engineered application.
I updated my code to make it clear that the list holding my objects is a static variable of the main class, but it seems that if the JIT is smart enough, it could still theoretically GC these values once it knows the main method doesn't need them.
I disagree. In practice, the JIT cannot determine that a static will never be referenced. Consider these cases:
Before the JIT runs, it appears that nothing will use static s again. After the JIT has run, the application loads a new class that refers to s. If the JIT "optimized" the s variable away, the GC would treat it as unreachable, and either null it or create a dangling reference. When the dynamically loaded class then looked at s, it would see the wrong value ... or worse.
If the application ... or any libraries used by the application ... uses reflection, then it can refer to the value of any static variable without this being detectable by the JIT.
So while it is theoretically possible to do this optimization in a small number of cases:
in the vast majority of cases, you can't, and
in the few cases that you can, the pay-off (in terms of performance improvement) is most likely negligible.
I similarly updated my code to clarify that I'm talking about the finalize method of the main class.
The finalize method of the main class is irrelevant because:
you are not creating an instance of the main class, and
the finalize method CANNOT refer to the local variables of another method (e.g. the main method).
... its existence prevents the JIT from nuking my static list.
Not true. The static list can't be nuked anyway; see above.
As I understand it, there's something special about SOUT that the JIT is aware of that prevents it from optimizing such calls away.
There is nothing special about sout. It is just something that we KNOW that influences the results of the computation and that we therefore KNOW that the JIT cannot legally optimize away.
In Java, I've done things like the following without thinking much about it:
public class Main {
    public void run() {
        // ...
    }

    public static void main(String[] args) {
        new Main().run();
    }
}
However, recently I've become unsure as to whether doing that is safe. After all, there is no reference to the Main object after it's created (well, there is the this reference, but does that count?), so it looks like there's a danger that the garbage collector might delete the object while it's in the middle of executing something. So perhaps the main method should look like this:
public static void main(String[] args) {
    Main m = new Main();
    m.run();
}
Now, I'm pretty sure that the first version works and I've never had any problems with it, but I'd like to know if it's safe to do in all cases (not only in a specific JVM, but preferably according to what the language specification says about such cases).
If an object method is being executed, it means someone is in possession of that reference. So no, an object can't be GC'd while a method is being executed.
For the most part garbage collection is transparent. It's there to remove the unnecessary complication of manual memory management. So, it will appear not to be collected, but what actually happens is more subtle.
Trivially, a compiler may completely elide the construction of the object. (By compiler, I mean a lower-level compiler than javac; the bytecodes will be a literal transliteration of the source.) More obscurely, garbage collection typically runs in separate threads and may actually remove an otherwise-unreachable object while a method on it is being run.
How can this be observed? The usual suspect is a finaliser. It may run concurrently with a method running on the object. Typically you would get around this problem with synchronized blocks in both the finaliser and the normal methods, which introduces the necessary happens-before relationship.
m is just a variable that stores the reference; the programmer uses it to refer to the same object later when writing further logic against it.
During execution, the program is translated into opcodes/instructions, and those instructions hold the reference to the object (it is a memory location, after all).
If m is present, the object's location is accessed via an indirect reference; if m is absent, the reference is direct.
So the object is in use by the CPU registers regardless of whether the reference variable is used, and it remains reachable while the flow of execution is inside main().
Further, the GC only removes objects from memory once it is sure that the object will not be used any further.
Every object is given a number of chances to survive (depending on the situation and the algorithm); only once those chances are used up is it garbage collected.
In simpler words, recently used objects are given a chance to stay in memory, while old, unreachable objects are removed.
So given your code:
public class Main {
    public void run() {
        // ...
    }

    public static void main(String[] args) {
        new Main().run();
    }
}
the object will not be garbage collected.
Also, for examples, look at anonymous class usage, or at event handling in AWT/Swing.
There, you will find a lot of usage like this.
The accepted answer is not correct. Whether the object can be GCed or not depends on whether your public void run() {// ...} method references the class instance (this). Try:
public class FinalizeThis {
    private String a = "a";

    protected void finalize() {
        System.out.println("finalized!");
    }

    void loop() {
        System.out.println("loop() called");
        for (int i = 0; i < 1_000_000_000; i++) {
            if (i % 1_000_000 == 0)
                System.gc();
        }
        // System.out.println(a);
        System.out.println("loop() returns");
    }

    public static void main(String[] args) {
        new FinalizeThis().loop();
    }
}
The above program always outputs
loop() called
finalized!
loop() returns
in Java 8. If you, however, uncomment System.out.println(a), the output changes to
loop() called
a
loop() returns
There is no GC this time because the method called references the instance variable (this.a).
You can take a look at this answer.
Does the java compiler (the default javac that comes in JDK1.6.0_21) optimize code to prevent the same method from being called with the same arguments over and over? If I wrote this code:
public class FooBar {
    public static void main(String[] args) {
        foo(bar);
        foo(bar);
        foo(bar);
    }
}
Would the method foo(bar) only run once? If so, is there any way to prevent this optimization? (I'm trying to compare runtime for two algos, one iterative and one comparative, and I want to call them a bunch of times to get a representative sample)
Any insight would be much appreciated; I took this problem to the point of insanity (I thought my computer was insanely fast for a little while, so I kept adding method calls until I got the "code too large" error at 43671 lines).
The optimization you are observing is probably nothing to do with repeated calls ... because that would be an invalid optimization. More likely, the optimizer has figured out that the method calls have no observable effect on the computation.
The cure is to change the method so that it does affect the result of computation ...
It doesn't; that would cause a big problem if foo is non-pure (changes the global state of the program). For example:
public class FooBar {
    private static int i = 0;

    private static int foo() {
        return ++i;
    }

    public static void main(String[] args) {
        foo();
        foo();
        foo();
        System.out.println(i);
    }
}
You haven't provided enough information to allow for any definitive answers, but the jvm runtime optimizer is extremely powerful and does all sorts of inlining, runtime dataflow and escape analysis, and all manner of cache tricks.
The end result is to make the sort of micro-benchmarks you are trying to perform all but useless in practice; and extremely difficult to get right even when they are potentially useful.
Definitely read http://www.ibm.com/developerworks/java/library/j-benchmark1.html for a fuller discussion on the problems you face. At the very least you need to ensure:
foo is called in a loop that runs thousands of times
foo() returns a result, and
that result is used
The following is the minimum starting point, assuming foo() is non-trivial and therefore is unlikely to be inlined. Note: You still have to expect loop-unrolling and other cache level optimizations. Also watch out for the hotspot compile breakpoint (I believe this is ~5000 calls on -server IIRC), which can completely stuff up your measurements if you try to re-run the measurements in the same JVM.
public class FooBar {
    public static void main(String[] args) {
        int sum = 0;
        int ITERATIONS = 10000;
        for (int i = 0; i < ITERATIONS; i++) {
            sum += foo(i); // foo is the method under test
        }
        System.out.printf("%d iterations returned %d sum%n", ITERATIONS, sum);
    }
}
Seriously, you need to do some reading before you can make any meaningful progress towards writing benchmarks on a modern JVM. The same optimizations that allows modern Java code to match or even sometimes beat C++ make benchmarking really difficult.
The Java compiler is not allowed to perform such optimizations because method calls are very likely to cause side effects, for example IO actions, changes to any fields they can reach, or calls to other methods that do so.
In functional languages, where each function call is guaranteed to return the same result when called with the same arguments (changes to state are forbidden), a compiler might indeed optimize away multiple calls by memoizing the result.
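Java's compilers won't memoize for you, but you can opt in explicitly when you know a function is pure; a minimal sketch:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

public class Memo {
    static int evaluations = 0;

    // Wraps a pure int -> int function with a result cache, so repeated
    // calls with the same argument evaluate the function only once.
    static IntUnaryOperator memoize(IntUnaryOperator f) {
        Map<Integer, Integer> cache = new HashMap<>();
        return n -> cache.computeIfAbsent(n, f::applyAsInt);
    }

    public static void main(String[] args) {
        IntUnaryOperator square = memoize(n -> {
            evaluations++;
            return n * n;
        });
        square.applyAsInt(7);
        square.applyAsInt(7);
        square.applyAsInt(7);
        System.out.println(evaluations); // prints 1: the body ran only once
    }
}
```

This is only valid for pure functions; memoizing a side-effecting method would change program behavior, which is exactly why the compiler can't do it on its own.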
If you feel your algorithms are too fast, try to give them some large or complicated problem sets. There are only a few algorithms which are always quite fast.
What is the best way to test value visibility between threads?
class X {
    private volatile Object ref;

    public Object getRef() {
        return ref;
    }

    public void setRef(Object newRef) {
        this.ref = newRef;
    }
}
The class X exposes a reference to the ref object. If concurrent threads read and write the object reference, every thread has to see the latest object that was set. The volatile modifier should do that. The implementation here is an example; it could also be a synchronized or lock-based implementation.
Now I'm looking for a way to write a test that informs me when the value visibility is not as specified (older values were read).
It's okay if the test does burn some cpu cycles.
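For concreteness, the simplest harness looks something like the sketch below: a reader thread spins until the writer's update becomes visible. With volatile it is guaranteed to terminate; without it, it may or may not hang depending on the JVM and hardware - which is exactly why a passing run proves so little.

```java
public class VisibilityDemo {
    static class X {
        private volatile Object ref;
        Object getRef() { return ref; }
        void setRef(Object newRef) { this.ref = newRef; }
    }

    public static void main(String[] args) throws InterruptedException {
        X x = new X();
        Thread reader = new Thread(() -> {
            // Spin until the writer's update becomes visible. The volatile
            // read guarantees this loop eventually observes the write; with
            // a plain field it might spin forever on some JVMs/hardware.
            while (x.getRef() == null) { }
            System.out.println("reader observed the write");
        });
        reader.start();
        Thread.sleep(50);
        x.setRef(new Object());
        reader.join();
    }
}
```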
The JLS says what you are supposed to do in order to get guaranteed consistent execution in an application involving "inter-thread actions". If you don't do these things, the execution may be inconsistent. But whether it actually will be inconsistent depends on the JVM you are using, the hardware that you are using, the application, the input data, and ... whatever else might be happening on the machine when you run the application.
I cannot see what your proposed test would tell you. If the test shows inconsistent executions, it would confirm the wisdom of doing synchronization properly. But if running the test a few times shows only (apparently) consistent executions, this doesn't tell you that executions are always going to be consistent.
Example:
Suppose that you run your tests on (say) JDK 1.4.2 (rev 12) / Linux / 32bit with the 'client' JVM and options x, y, z running on a single processor machine. And that after running the test 1000 times, you observe that it does not seem to make any difference if you leave out the volatile. What have you actually learned in that scenario?
You have NOT learned that it really makes no difference. If you change the test to use more threads, etc., you may get a different answer. If you run the test a few more thousand or million or billion times, you might get a different answer.
You have NOT learned anything about what might happen on other platforms; e.g. different Java version, different hardware, or different system load conditions.
You have NOT learned if it is safe to leave out the volatile keyword.
You only learn something if the test shows a difference. And the only thing that you learn is that synchronization is important ... which is what all of the text books, etc have been telling you all along :-)
Bottom line: this is the worst kind of black box testing. It gives you no real insight as to what is going on inside the box. To get that insight you need to 1) understand the Memory Model and 2) deeply analyze the native code emitted by the JIT compiler (on multiple platforms ...)
If I understand correctly, then you want a test-case that passes if the variable is defined as volatile and fails if not.
However, I think there is no reliable way to do this. Depending on the implementation of the JVM, concurrent access may work correctly even without volatile.
So a unit test will work correctly when volatile is specified but it still might work correctly without volatile.
Wow, that's much tougher than I initially thought.
I might be completely off, but how about this?
class Wrapper {
    private X x = new X();
    private volatile Object volatileRef;
    private final Object setterLock = new Object();
    private final Object getterLock = new Object();

    public Object getRef() {
        synchronized (getterLock) {
            Object refFromX = x.getRef();
            if (refFromX != volatileRef) {
                // FAILURE CASE!
            }
            return refFromX;
        }
    }

    public void setRef(Object ref) {
        synchronized (setterLock) {
            volatileRef = ref;
            x.setRef(ref);
        }
    }
}
Could this help?
Of course, you will have to create many Threads to hit this wrapper, hoping for the bad case to appear.
How about this?

public class XTest {
    @Test
    public void testRefIsVolatile() {
        Field field = null;
        try {
            field = X.class.getDeclaredField("ref");
        } catch (SecurityException e) {
            e.printStackTrace();
            Assert.fail(e.getMessage());
        } catch (NoSuchFieldException e) {
            e.printStackTrace();
            Assert.fail(e.getMessage());
        }
        Assert.assertNotNull("Ref field", field);
        Assert.assertTrue("Is Volatile", Modifier.isVolatile(field.getModifiers()));
    }
}
So basically you want this scenario: one thread writes the variable while another reads it at the same time, and you want to ensure that the read sees the correct value, right?
Well, I don't think you can use unit testing for that, because you can't ensure the right environment. That is done by the JVM, by how it schedules instructions. Here's what I would do. Use a debugger. Start one thread to write the data and put a breakpoint on the line that does this. Start the second thread and have it read the data, also stopping at that point. Now, step the first thread to execute the code that writes, and then read with the second one. In your example, you won't achieve anything with this, because read and write are single instructions. But usually if these operations are more complex, you can alternate the execution of the two threads and see if everything is consistent.
This will take some time, because it's not automated. But I wouldn't write a unit test that tries reading and writing a lot of times, hoping to catch the case where it fails, because you wouldn't achieve anything. The role of a unit test is to assure you that the code you wrote works as expected. But in this case, even if the test passes, you're not assured of anything. Maybe it was just lucky and the conflict didn't appear on this run. And that defeats the purpose.