Some CPU intensive routines get dramatically slower when run through a debugger. Why is this?
Currently I'm just using IntelliJ to step through code running in JBoss. When I start JBoss, I use these options:
set JAVA_OPTS=-Xms512m -Xmx1024m -XX:MaxPermSize=256m -Xdebug -Xrunjdwp:transport=dt_socket,address=5005,server=y,suspend=n %JAVA_OPTS%
Is there a way to speed up the execution? Or to speed up certain method executions that I don't need to step through?
Update: It seems that if I don't step over/into the CPU intensive routines (i.e. just run until a breakpoint set right after the routine), then the execution time is the same as when not running in a debugger.
Some CPU intensive routines get dramatically slower when run through a debugger. Why is this?
Because the JITter won't optimize code as much (often, not at all) when debugging is enabled.
It also depends on the "breakpoint style". For example, setting watchpoints on variables, or putting breakpoints at the interface level (the debugger will stop on every method implementation when it is executed), often slows execution down dramatically.
When debugging, in addition to running your application, you are also running a debugger.
The code is compiled in debug mode with metadata symbols about local variables and other source-level information. The debugger reads these symbols to know which line of source code corresponds to the current instruction. This process is called symbolic debugging. The stored symbols increase code size, and interpreting them increases execution time.
Some debuggers actually interpret the code on the fly, which is almost always a major performance hit.
More information about Java's debug compilation mode, which is performed by javac and includes debug information in class files: Java Language Compiler Options.
For example: -g generates all debugging information, including local variables.
You do need to consider that another program -- the debugger -- is hooked into your program and is watching it for things like exceptions. It's also monitoring the current line in order to react to breakpoints or user requested interruptions (like a pause request or watch condition).
Top Tip: in IDEA you can use ALT+F9 to run to where you have the cursor placed rather than set an extra breakpoint.
I have found anecdotally that debugging becomes very slow in IDEA if you are walking through code where there is a lot of data accessible from the stack. Don't forget, IDEA collects this data (anything currently in the lexical scope) and presents it up to you as a tree of objects to browse whether you are "watching" or not and does this at every subsequent step (maybe it re-creates the tree each time?).
This is especially apparent when, for example, there is a large collection as an instance variable of the "current" object.
Debugging the optimized code produced by the JIT would be very hard, because there isn't a direct relationship between a range of native instructions and a line of Java code like there is a relationship between a range of Java bytecode and a line of Java code.
So breaking into a function in the debugger forces the JVM to deoptimize the method you are stepping through. For that method, HotSpot doesn't generate native code at all and just interprets the bytecode.
Prior to JDK 1.4.1, starting with debugging enabled forced the JVM to use only the interpreter: http://java.sun.com/products/hotspot/docs/whitepaper/Java_Hotspot_v1.4.1/Java_HSpot_WP_v1.4.1_1002_3.html#full
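One way to check how much this costs on your own setup is to time the same routine twice: once with the debugger stepping through it and once just running to a breakpoint placed after it. The following is only a minimal sketch; the class name DebugTimingDemo and the routine cpuIntensive are made up for illustration and stand in for whatever code you are actually debugging.

```java
import java.util.concurrent.TimeUnit;

final class DebugTimingDemo {
    // Hypothetical CPU-bound routine (an assumption for this sketch):
    // sums (i * i) % 7 for i in [0, n).
    static long cpuIntensive(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += ((long) i * i) % 7;
        }
        return sum;
    }

    // Runs the task once and returns the wall-clock time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        long[] result = new long[1];
        long elapsed = timeMillis(() -> result[0] = cpuIntensive(50_000_000));
        // Printing the result keeps the JIT from discarding the computation.
        System.out.println("result=" + result[0] + " elapsed=" + elapsed + " ms");
    }
}
```

Run it plainly, then with the debug agent enabled, then while stepping inside cpuIntensive, and compare the printed times. A single run is a rough indication only, not a proper benchmark.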
If you use Java 5 or later, the parameter for debugging is:
-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=
and before Java 5
-Xdebug -Xrunjdwp:transport=dt_socket,address=5005,server=y,suspend=n
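If you are unsure which of these options a running JVM was actually started with, the JVM can report its own startup arguments. A small sketch, assuming you can run code inside the target JVM; the class name JdwpCheck and the helper hasJdwpAgent are invented for this example, while ManagementFactory.getRuntimeMXBean().getInputArguments() is the standard management API.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class JdwpCheck {
    // Returns true if any JVM argument requests the JDWP debug agent,
    // in either the newer (-agentlib:jdwp) or the older (-Xrunjdwp) form.
    static boolean hasJdwpAgent(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith("-agentlib:jdwp") || arg.startsWith("-Xrunjdwp")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the application's main method.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JDWP agent requested: " + hasJdwpAgent(jvmArgs));
    }
}
```

This only tells you that the agent was requested at startup; it does not tell you whether a debugger is currently attached.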
Related
I am hoping to do a profiling analysis on my Java project. To get the results I want to add a "hook" to the JVM so that every time a heap access occurs, the "hook" is called and does some tracing. I have been looking into JVMTI but this does not seem to give me what I expect.
I have several questions:
Is it possible to add such a hook?
If possible, what are the correct tools/interfaces that I should use?
If there is no existing tools that do this, can I achieve this by modifying the JVM codebase?
Thanks.
I want to add a "hook" to the JVM so that every time a heap access occurs
You can't really do this in Java, as the hook itself would access the heap and call itself. Even if you work around this, it would make the program impossibly slow.
What you can do is use the debugging interface to breakpoint after each instruction, inspect the instruction and see if it accessed the heap or not. This would be perhaps 10,000x slower than normal.
An alternative is to translate the bytecode using Instrumentation to trace each memory access. This might be only a few hundred times slower.
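To give an idea of where such a translation would plug in, here is a skeleton of a Java agent built on java.lang.instrument. It is only a sketch: the names TraceAgent and CountingTransformer are hypothetical, and the transformer merely counts the classes it is offered instead of rewriting any bytecode. Inserting a trace call before each field or array access would need a bytecode library, which is outside this sketch.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicLong;

final class TraceAgent {
    // Number of classes offered to the transformer so far.
    static final AtomicLong classesSeen = new AtomicLong();

    static final class CountingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            classesSeen.incrementAndGet();
            // A real tracer would return a rewritten copy of classfileBuffer here.
            // Returning null tells the JVM to keep the class bytes unchanged.
            return null;
        }
    }

    // Called by the JVM before main() when started with -javaagent:<agent jar>,
    // provided the jar's manifest names this class in Premain-Class.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new CountingTransformer());
    }

    // Standalone smoke check that exercises the transformer without -javaagent.
    public static void main(String[] args) {
        new CountingTransformer().transform(null, "demo/Example", null, null, new byte[0]);
        System.out.println("classes seen: " + classesSeen.get());
    }
}
```

Every class loaded after premain runs passes through transform, which is why the overhead of a real tracer built this way scales with the number of instrumented accesses.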
To do what you propose efficiently, you could use the processor's hardware performance counters, for example through https://software.intel.com/en-us/articles/intel-performance-counter-monitor; such counters are also what tools such as perf on Linux rely on. This requires in-depth knowledge of the processor you are using.
I have a situation where I want to instrument Java code to add function calls. The functions I add calls to might affect the status of objects in the system, thus changing the state of the program. I am looking for a way to insert those calls but leave the program state unchanged.
I am looking for a method to store the state (an image?) of the heap and come back to it later, that is, at the end of my instrumentation code. I tried tackling it with the idea of copying the current JVM, maybe executing the instrumented code inside the copy (with the exact state of the program) and coming back to the original JVM when the instrumentation is done. I couldn't find any documentation on such a scenario, so I am wondering if there is a better approach.
The state of Java program is not only the Heap. It also includes running threads, loaded classes, constant pools, caches and many other VM structures.
Saving state of a Java program is roughly the same as saving state of an arbitrary process in OS. fork is probably the closest way to achieve this, but it's still not an easy solution.
Imagine you have a command-line application that takes an input file and does something with it. Now imagine you want to sample/profile this application. If it were Visual Studio, you would just select a profiling method (sampling/instrumentation) and VS would run the application for you and collect data until the program completes. But as far as I can see there is no similar functionality in VisualVM. You have to run your application, then select it in VisualVM and then explicitly start sampling/profiling. The problem is that sometimes the program, with certain input data, finishes in less time than it takes to set up VisualVM. Also, with such an approach there is no possibility of batch profiling the application. Someone has suggested starting the application in debug mode from Eclipse and setting a breakpoint somewhere at the beginning of the main() method, then setting up VisualVM and continuing execution. But I suspect that running in Debug vs Release mode has performance implications of its own.
Suggestions?
There is a new Startup Profiler plugin for VisualVM 1.3.6, which allows you to profile your application from its startup. See this article for additional information.
If the program does I/O, the Visual Studio sampler will not see the I/O because it is a "CPU Sampler" (even if nearly all of the time is spent waiting for I/O).
If you use Instrumentation, you won't see any line-level information because it only summarizes at the function level.
I use this technique.
If the program runs too quickly to sample, just put a temporary outer loop around it of, say, 100 or 1000 iterations.
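As a rough illustration of that temporary outer loop, here is a sketch in Java. The class name RepeatForSampling and the method workload are placeholders invented for this example; in practice workload would be replaced by a call into the program's real entry point.

```java
final class RepeatForSampling {
    // Hypothetical stand-in for the program's real work.
    static int workload(int seed) {
        int h = seed;
        for (int i = 0; i < 1_000_000; i++) {
            h = 31 * h + i;
        }
        return h;
    }

    // Temporary outer loop: repeats the work so a sampler has time to collect samples.
    static int repeat(int iterations) {
        int sink = 0;
        for (int i = 0; i < iterations; i++) {
            sink ^= workload(i); // combine results so the work is not optimized away
        }
        return sink;
    }

    public static void main(String[] args) {
        // Iteration count is taken from the command line, defaulting to 1000.
        int iterations = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        System.out.println("sink=" + repeat(iterations));
    }
}
```

The loop is meant to be thrown away once the samples have been taken; it changes how long the program runs, not which lines the time is spent in.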
The difference between Debug and Release mode will be next to nothing unless you are spending a good fraction of time in tight loops, in your code, where the loops do not contain any function calls, OR if you are doing data structure operations that do a lot of validation in the libraries.
If you are, then your samples will show that you are, and you will know that Release will make a speed difference.
As far as batch profiling is concerned, I don't. I just keep an eye on the program's overall throughput rate. If there is some input that seems to make it take too long, then I do the sampling procedure on the program with that input, see what the problem is, and fix it.
I ran some simple function calls and string operations in a loop; the Java program runs much faster from the command line than when launched (Run as...) from Eclipse...
6 lines of output were printed, each line is around 120 characters.
Each line is a performance result ranging from 50 ms to 300 ms.
The total time is a little more than 2 seconds.
"much slower" here means, for certain operations ( function call ), I see 20ms vs 300 ms.
After running on console once, the speed on eclipse catches up!
After I change and build the code in eclipse, the speed on CL will drop if I don't rebuild it with command line.
Looks like some hotspot information is only generated with CL...
Maybe it is just the eclipse console that is slower than your operating systems console?
Plus, at a total runtime of ~2 seconds, your benchmark probably is just super inaccurate.
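To make such a measurement a little less noisy, a common approach is to run the operation a number of times unmeasured first and then average over many measured runs. The sketch below assumes a string-building operation similar in spirit to the one described in the question; WarmupBenchmark, buildString and averageNanos are names invented for this example, and a dedicated harness such as JMH would be the more rigorous choice.

```java
final class WarmupBenchmark {
    // Hypothetical operation under test: builds a string of n digits.
    static String buildString(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i % 10);
        }
        return sb.toString();
    }

    // Runs the operation warmupRuns times unmeasured, then measuredRuns times
    // measured, and returns the average time per measured run in nanoseconds.
    static double averageNanos(int warmupRuns, int measuredRuns, int n) {
        long sink = 0;
        for (int i = 0; i < warmupRuns; i++) {
            sink += buildString(n).length();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            sink += buildString(n).length();
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            System.out.println(sink); // never true here; keeps the results in use
        }
        return (double) elapsed / measuredRuns;
    }

    public static void main(String[] args) {
        double avg = averageNanos(2_000, 2_000, 10_000);
        System.out.println("average per run: " + avg + " ns");
    }
}
```

Running this both from the command line and from Eclipse gives a fairer comparison than a single 2 second run, because console output and start-up costs are kept out of the timed section.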
Most likely the culprit is memory usage as a result of Eclipse loading, with the possibility that Eclipse is also doing something additional to the executable like swapping class loaders, or starting the java debugger.
I would say the most likely answer however is simply: Eclipse uses a lot of resources, especially memory, and is starving the system a bit, leading to swapping, and decreased performance. YMMV, and there's no guarantee I'm right without seeing your system, it's just my best guess.
I do agree with other comments that Eclipse is doing something when running the application and printing the console.
Eclipse has its own compiler (usually referred to as Eclipse JDT) which supports incremental compilation. There is a possibility that the binary compiled by Eclipse is not as optimized as one compiled by javac.
These two compilers serve different purposes: JDT mainly enables Eclipse to provide state-of-the-art refactoring and auto-completion, and javac spends a lot of effort doing optimization.
I would say it's understandable that the application would run slower with all the Eclipse baggage underneath. Eclipse spawns the JVM process as a child and I am sure still does its own 'magic'.
Is there any Java profiler that allows profiling short-lived applications? The profilers I have found so far seem to work with applications that keep running until the user terminates them. However, I want to profile applications that work like command-line utilities: they run and exit immediately. Tools like VisualVM or NetBeans Profiler do not even recognize that the application was run.
I am looking for something similar to Python's cProfile, in that the profiler result is returned when the application exits.
You can profile your application using the JVM builtin HPROF.
It provides two methods:
sampling the active methods on the stack
timing method execution times using injected bytecode (BCI, bytecode injection)
Sampling
This method reveals how often methods were found on top of the stack.
java -agentlib:hprof=cpu=samples,file=profile.txt ...
Timing
This method counts the actual invocations of a method. The instrumenting code has been injected by the JVM beforehand.
java -agentlib:hprof=cpu=times,file=profile.txt ...
Note: this method will slow down the execution time drastically.
For both methods, the default filename is java.hprof.txt if the file= option is not present.
Full help can be obtained using java -agentlib:hprof=help or can be found in Oracle's documentation.
Sun Java 6 has the java -Xprof switch that'll give you some profiling data.
-Xprof output cpu profiling data
A program running for 30 seconds is not short-lived. What you want is a profiler which can start your program, instead of you having to attach to a running system. I believe most profilers can do that, but you would most likely prefer one integrated in an IDE. Have a look at NetBeans.
Profiling a short running Java applications has a couple of technical difficulties:
Profiling tools typically work by sampling the processor's SP or PC register periodically to see where the application is currently executing. If your application is short-lived, insufficient samples may be taken to get an accurate picture.
You can address this by modifying the application to run a number of times in a loop, as suggested by @Mike. You'll have problems if your app calls System.exit(), but the main problem is ...
The performance characteristics of a short-lived Java application are likely to be distorted by JVM warm-up effects. A lot of time will be spent in loading the classes required by your app. Then your code (and library code) will be interpreted for a bit, until the JIT compiler has figured out what needs to be compiled to native code. Finally, the JIT compiler will spend time doing its work.
I don't know if profilers attempt to compensate for JVM warmup effects. But even if they do, these effects influence your application's real behavior, and there is not a great deal that the application developer can do to mitigate them.
Returning to my previous point ... if you run a short lived app in a loop you are actually doing something that modifies its normal execution pattern and removes the JVM warmup component. So when you optimize the method that takes (say) 50% of the execution time in the modified app, that is really 50% of the time excluding JVM warmup. If JVM warmup is using (say) 80% of the execution time when the app is executed normally, you are actually optimizing 50% of 20% ... and that is not worth the effort.
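The arithmetic in that argument can be written down in a few lines. This is just a worked version of the 50% and 80% figures used above, which are illustrative numbers and not measurements; the class and method names are invented for the example.

```java
final class WarmupShare {
    // Share of the total run time taken by a method, given its share of the
    // time excluding warmup and the share of the total run spent in warmup.
    static double shareOfTotal(double methodShareExcludingWarmup, double warmupShare) {
        return methodShareExcludingWarmup * (1.0 - warmupShare);
    }

    public static void main(String[] args) {
        // 50% of the non-warmup time, with warmup using 80% of the normal run.
        double share = shareOfTotal(0.5, 0.8);
        System.out.printf("share of a normal run: %.1f%%%n", share * 100);
    }
}
```

With these figures the method accounts for about 10% of a normal run, so even removing it entirely would shorten the run by only about a tenth.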
If it doesn't take long enough, just wrap a loop around it, an infinite loop if you like. That will have no effect on the inclusive time percentages spent either in functions or in lines of code. Then, given that it's taking plenty of time, I just rely on this technique. That tells which lines of code, whether they are function calls or not, are costing the highest percentage of time and would therefore gain the most if they could be avoided.
Start your application with profiling turned on, so that it waits for a profiler to attach. Any profiler that conforms to the Java profiling architecture should work. I've tried this with the NetBeans profiler.
Basically, when your application starts, it waits for a profiler to be attached before executing. So, technically, every line of code executed can be profiled.
With this approach, you can profile all kinds of things: threads, memory, CPU, method/class invocation times and durations...
http://profiler.netbeans.org/
The SD Java Profiler can capture statement block execution-count data no matter how short your run is. Relative execution counts will tell you where the time is spent.
You can use a measurement (metering) recording: http://www.jinspired.com/site/case-study-scala-compiler-part-9
You can also inspect the resulting snapshots: http://www.jinspired.com/site/case-study-scala-compiler-part-10
Disclaimer: I am the architect of JXInsight/OpenCore.
I suggest you try yourkit. It can profile from the start and dump the results when the program finishes. You have to pay for it but you can get an eval license or use the EAP version without one. (Time limited)
YourKit can take a snapshot of a profile session, which can later be analyzed in the YourKit GUI. I use this feature to profile a short-lived command-line application I work on. See my answer to this question for details.