I ran a Java program and measured the time taken for a piece of code to run. Over multiple executions, I saw that the time taken keeps changing for the same piece of code. Could someone explain why? I used the Eclipse IDE.
Note the times in the consoles in the images below.
The JVM is working really hard for you, don't put her on the clock!
Also, it's nanoseconds, not milliseconds; 459 seconds would really be bad.
If you really want to measure meaningful time, do it on a scale where the background operations (the ones that grant you the privilege of coding without having to know what is happening on your machine, or on which machine the code will be executed) won't weigh as much. One second would be good enough.
I am having a very strange problem with Java that I have been unable to make any progress in fixing. I have designed a small application for generating and viewing simple fractals; it is part of a coursework assignment. When I run the code from Eclipse it runs very quickly, generally using less than 5% of my processor and staying very responsive.
However, when I run this same program from the Windows command line (after compiling it, of course), it runs much more slowly. It uses about 20% of the processor and, if I were to put a figure on it, I would say it runs about 10x slower and is generally unusable. I have done a good bit of research on this problem and it hasn't been easy to come across relevant information; it certainly doesn't seem to be a common bug. The times it has been asked before differ from my situation in that people were writing an excessive amount to the console, which caused the slowdown. I am not printing anything to the console.
The process started by Eclipse when the code is run uses up to about 700MB of RAM. The code run from the console uses up to about 70MB. I tried running it with a greater heap size, which did increase the RAM the process uses but did virtually nothing to improve the performance.
I would really appreciate it if anybody could help with this issue, I am tearing my hair out here.
Thanks very much!
Imagine you have a command-line application that takes an input file and does something with it. Now imagine you want to sample/profile this application. If it were Visual Studio, you would just select the profiling method (sampling/instrumentation) and VS would run the application for you and collect data while the program completes. But as far as I can see there is no similar functionality in VisualVM. You have to run your application, then select it in VisualVM, and then explicitly start sampling/profiling. The problem is that sometimes the execution of the program with certain input data takes less time than it takes to set up VisualVM. Also, with such an approach there is no way to batch-profile the application. Someone has suggested starting the application in debug mode from Eclipse and setting a breakpoint somewhere at the beginning of the main() method, then setting up VisualVM and continuing execution. But I suspect that running in Debug vs Release mode has performance implications of its own.
Suggestions?
There is a new Startup Profiler plugin for VisualVM 1.3.6, which allows you to profile your application from its startup. See this article for additional information.
If the program does I/O, the Visual Studio sampler will not see the I/O because it is a "CPU Sampler" (even if nearly all of the time is spent waiting for I/O).
If you use Instrumentation, you won't see any line-level information because it only summarizes at the function level.
I use this technique.
If the program runs too quickly to sample, just put a temporary outer loop around it of, say, 100 or 1000 iterations.
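For example, a throwaway harness along these lines (runOnce and the iteration count are placeholders for whatever the application actually does):

```java
public class ProfileHarness {
    public static void main(String[] args) throws Exception {
        // Temporary outer loop: repeat the real work enough times that the
        // sampler has a long-enough run to attach to and collect from.
        for (int i = 0; i < 1000; i++) {
            runOnce(args);
        }
    }

    // Placeholder for the application's real entry point.
    private static void runOnce(String[] args) throws Exception {
        // ... the code you actually want to profile ...
    }
}
```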
The difference between Debug and Release mode will be next to nothing unless you are spending a good fraction of time in tight loops, in your code, where the loops do not contain any function calls, OR if you are doing data structure operations that do a lot of validation in the libraries.
If you are, then your samples will show that you are, and you will know that Release will make a speed difference.
As far as batch profiling is concerned, I don't. I just keep an eye on the program's overall throughput rate. If there is some input that seems to make it take too long, then I do the sampling procedure on the program with that input, see what the problem is, and fix it.
I have to write a program that is supposed to run 'forever', meaning that it won't terminate regularly. Up until now I have always written programs that would run and be terminated at the end of the day. The program has to do some synchronizations, pause for n minutes, and then sync again.
AFAIK there should be no problem with my current implementation and it should theoretically run just fine, but I'm lacking any real-world experience.
So are there any 'patterns' or best practices for writing very robust and resource efficient java programs that have a very long runtime? What could be possible problems after for example a month/year of runtime?
Some background:
Java: 1.7 but compiled down to 1.5
OS: Windows (exact version is not certain yet)
Thanks in advance
Just a brain dump of all the things I've had to keep in mind when writing this kind of app.
Avoid Memory Leaks
I had an app that ran once at midday, every day, and in it I had a FileWriter. I wasn't closing it properly, and then we started wondering why our virtual machine was going into meltdown after a few weeks. Memory leaks can come in the form of anything, really, with one of the most common examples being that you don't de-reference an object appropriately. For example, using a class's field as a means of temporary storage. Often the class persists, and so does the reference. This leaves you with objects sitting in memory and doing nothing.
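The fix in that case was simply making sure the writer is always closed; a minimal sketch using try-with-resources (Java 7+ syntax, the class and method names are made up here; with a 1.5 target you would close the writer in a finally block instead):

```java
import java.io.FileWriter;
import java.io.IOException;

public class DailyReport {
    void writeReport(String path, String contents) throws IOException {
        // try-with-resources closes the FileWriter automatically, even if
        // write() throws, so the file handle and buffers are always released.
        try (FileWriter writer = new FileWriter(path)) {
            writer.write(contents);
        }
    }
}
```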
Use the right kind of Scheduler
I used a java Timer in that app, and later I learnt that it's better to use a ScheduledThreadPoolExecutor when another app was changing the system clock. So if you plan on keeping it completely Java based, I would strongly recommend using that over a Timer, for all of the reasons detailed in this question.
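A minimal sketch of the replacement (the task body and the 15-minute interval are just placeholders for your own sync):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SyncScheduler {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        // Scheduled on relative time, so a change to the system clock does not
        // skew when the next run happens the way java.util.Timer can be skewed.
        scheduler.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                syncOnce();
            }
        }, 0, 15, TimeUnit.MINUTES);
    }

    private static void syncOnce() {
        // ... do the synchronization work ...
    }
}
```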
Be mindful of memory usage and your environment
If your app is loading large amounts of data each and every day, and you have other apps running on the same server, you may want to be careful about the timing. For example, if three of the apps run their scheduled operation at midday, running yours at any other time would probably be a smart move. Be mindful of the environment in which you're executing your code.
Error handling
You probably want to configure your app to let you know if something has gone wrong, without the app breaking down. If it's running at a certain time every few hours, that means people are probably depending on it, so I would have a function in your Java code that sends out an email to you, detailing the nature of the exception.
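Something along these lines at the top level of the scheduled work; sendAlertEmail here is a placeholder, wire it up to JavaMail or whatever notification channel you already have:

```java
public class SyncJob implements Runnable {
    @Override
    public void run() {
        try {
            doSync();
        } catch (Exception e) {
            // The app stays up; a human gets told what broke and why.
            sendAlertEmail("Sync failed: " + e, e);
        }
    }

    private void doSync() {
        // ... the real synchronization work ...
    }

    // Placeholder: replace with real email/alert delivery.
    private void sendAlertEmail(String subject, Exception cause) {
        System.err.println(subject);
        cause.printStackTrace();
    }
}
```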
Make it configurable
Again, if it needs to run at various points in the day, you don't want to have to pull the thing down for a few hours to work out some minor changes to your code. Instead, pull those values out into a Java Properties file, or into an XML config (or really, whatever). The advantage of this is that you can update your program and get it up and running before anyone really notices the difference.
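For example, with a Properties file (the file name and keys here are invented just to illustrate the pattern):

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

public class AppConfig {
    private final Properties props = new Properties();

    public AppConfig(String path) throws IOException {
        try (FileInputStream in = new FileInputStream(path)) {
            props.load(in);
        }
    }

    // Values changed in the file take effect the next time the config is
    // loaded, without touching the compiled code.
    public int syncIntervalMinutes() {
        return Integer.parseInt(props.getProperty("sync.interval.minutes", "15"));
    }
}
```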
Be afraid of the static keyword
That bad boy will make objects persist, even when you destroy their parent reference. It is the mother of all memory leaks if you are not careful with it. It's fine for constants, and things that you know don't need to change and need to exist within the project to run well, but if you're using it for random values inside a project, you're going to quickly wonder why your app is crashing every few hours rather than syncing.
Props to #X86 for reminding me of that one.
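A minimal illustration of the kind of leak meant here (the class and method are made up for the example):

```java
import java.util.ArrayList;
import java.util.List;

public class SyncHistory {
    // The class itself holds this list for the life of the JVM, and the list
    // holds every result ever added, so nothing here is ever collected.
    private static final List<String> RESULTS = new ArrayList<String>();

    public static void record(String result) {
        RESULTS.add(result);   // grows on every sync, forever
    }
}
```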
Memory leaks are likely to be the biggest problem. Ensure that there are no long-term references held after an iteration of your logic. Even a relatively small object being referenced forever will exhaust the memory eventually (and worse, it's going to be harder to detect during testing if the growth rate is 1GB/month). One approach that may help is using the snapshot functionality of profilers: take a snapshot during the pause, let the sync run a few times, and take another snapshot. Comparing these should show the delta between the synchronizations, which should hopefully be zero.
Cache maintenance is another issue. The overall size of a cache needs to be strictly limited (whereas you can often get away without a limit in short-running programs, because everything seen will be small enough not to cause problems). Equally, it's more important to do cache invalidation properly: broadly speaking, everything that gets cached will become stale at some point while your program is still running, and you need to be able to detect this and take appropriate action. This can be tricky depending on where the golden source of the cached data is.
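If you want to stay in plain Java, one common way to enforce a hard size limit is LinkedHashMap's removeEldestEntry hook (this only handles the size bound; staleness and invalidation still have to be dealt with separately):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A size-bounded LRU cache: once maxEntries is exceeded, the
// least-recently-accessed entry is evicted automatically.
public class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedCache(int maxEntries) {
        super(16, 0.75f, true);   // accessOrder = true gives LRU ordering
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```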
The last thing I'll mention is exception-handling. For short-running processes, it's often enough to simply let the process die when an exception is encountered, so the issue can be dealt with and the app rerun. With a long-running process you'll likely need to be more defensive than this. Consider running parts of your program in threads, which can be restarted* if/when they fail. You may need a supervisor-type module, which checks that everything else is still sending heartbeats and reboots it if not. If appropriate to your structure, this is anecdotally a lot easier to achieve with actor-style libraries than with Java's standard executors. And if it's at all possible, you may want to have hooks (perhaps exposed over JMX/MBeans) that let you modify the behaviour somewhat, to allow a short-term hack/workaround to be effected without having to bring the process down. Though this requires quite some amount of foresight to predict exactly what's going to go wrong in several months...
*or rather, the job can be restarted in another thread
My program is fairly large, and because it tends to carry out processes randomly, it sometimes gets stuck and loops forever. If I forcefully stop and restart the program manually, usually (around 85% of the time) the program completes all commands and terminates.
Is there a way to make a Java program restart itself after say 20 seconds, if it gets stuck? I tried using the system time to solve the issue, but the problem with this is that if my program gets stuck in a for loop, it does not update the system time until the next iteration.
This isn't the right way to approach this problem! You need to figure out why your program is getting stuck in an infinite loop, and then fix it. "Okay let's try this again" is not the right way to solve a bug - you have no idea what other effects this bug could be having. You might very well be getting incorrect output as well. Debug the program, don't work around the flaw.
You could use some external program that launches the java program and kills it after 20 seconds when it gets stuck, and then launches it again, but again, that is not the right way to solve the problem.
Considering that solving this problem would mean solving the Halting problem, we can be fairly sure that the whole approach is doomed ;)
You could obviously use timers to kill the program after some specified time and whatnot, but really - find the bugs in your program.
Your program shouldn't get stuck and loop forever; fix that. But if that isn't possible and you still want to "restart" the program after a forever-loop, I propose a solution like this:
Create a main program that acts as a director. The director creates a thread, and the thread runs the main algorithm, which can take a lot of time. The director waits for some amount of time, given as a configurable parameter. This parameter lets the director recognise whether the thread is in a forever-loop (it is taking too long). When a forever-loop is recognised, the director terminates the thread and starts a new one (a restart).
Have a look at ExecutorService to get a mechanism that allows you to invoke a piece of code and receive a timeout if it doesn't finish within the expected time. You can then act on it as you see fit.
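A rough sketch of that idea, using the 20-second budget from the question (doWork is a placeholder for the program's real work):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutRunner {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<?> run = executor.submit(new Runnable() {
            @Override
            public void run() {
                doWork();   // the work that sometimes loops forever
            }
        });
        try {
            run.get(20, TimeUnit.SECONDS);   // wait at most 20 seconds
        } catch (TimeoutException e) {
            run.cancel(true);                // interrupts the worker thread
            // Note: a loop that never checks Thread.interrupted() won't actually
            // stop; decide here whether to resubmit the work or give up.
        } finally {
            executor.shutdownNow();
        }
    }

    private static void doWork() {
        // ... the program's real work ...
    }
}
```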
Another nice tool is to use jvisualvm in the JDK to attach to the program when looping. You can then ask for a thread dump and use it to figure out what it is doing.
Does anyone ever use stopwatch benchmarking, or should a performance tool always be used? Are there any good free tools available for Java? What tools do you use?
To clarify my concerns, stopwatch benchmarking is subject to error due to operating system scheduling. On a given run of your program the OS might schedule another process (or several) in the middle of the function you're timing. In Java, things are even a little bit worse if you're trying to time a threaded application, as the JVM scheduler throws even a little bit more randomness into the mix.
How do you address operating system scheduling when benchmarking?
Stopwatch benchmarking is fine, provided you measure enough iterations to be meaningful. Typically, I require a total elapsed time of some number of single digit seconds. Otherwise, your results are easily significantly skewed by scheduling, and other O/S interruptions to your process.
For this I use a little set of static methods I built a long time ago, which are based on System.currentTimeMillis().
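Something along these lines (a bare-bones sketch, not the actual methods referred to, and not thread-safe):

```java
public final class Stopwatch {
    private static long start;

    public static void start() {
        start = System.currentTimeMillis();
    }

    public static long elapsedMillis() {
        return System.currentTimeMillis() - start;
    }

    private Stopwatch() { }
}
```

Call Stopwatch.start() before the loop, run enough iterations to reach a few seconds of elapsed time, then divide Stopwatch.elapsedMillis() by the iteration count.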
For the profiling work I have used JProfiler for a number of years and have found it very good. I have recently looked over YourKit, which seems great from the website, but I've not used it at all, personally.
To answer the question on scheduling interruptions, I find that doing repeated runs until consistency is achieved/observed works in practice to weed out anomalous results from process scheduling. I also find that thread scheduling has no practical impact for runs of between 5 and 30 seconds. Lastly, after you pass the few seconds threshold scheduling has, in my experience, negligible impact on the results - I find that a 5 second run consistently averages out the same as a 5 minute run for time/iteration.
You may also want to consider prerunning the tested code about 10,000 times to "warm up" the JIT, depending on the number of times you expect the tested code to run over time in real life.
It's totally valid as long as you measure large enough intervals of time. I would execute 20-30 runs of what you intend to test so that the total elapsed time is over 1 second. I've noticed that time calculations based off System.currentTimeMillis() tend to be either 0ms or ~30ms; I don't think you can get anything more precise than that. You may want to try out System.nanoTime() if you really need to measure a small time interval:
documentation: http://java.sun.com/javase/6/docs/api/java/lang/System.html#nanoTime()
SO question about measuring small time spans, since System.nanoTime() has some issues, too: How can I measure time with microsecond precision in Java?
Stopwatch is actually the best benchmark!
The real end to end user response time is the time that actually matters.
It is not always possible to obtain this time using the available tools. For instance, most testing tools do not include the time it takes for a browser to render a page, so an overcomplex page with badly written CSS will show sub-second response times to the testing tools but a five-second-plus response time to the user.
The tools are great for automated testing and for problem determination, but don't lose sight of what you really want to measure.
A profiler gives you more detailed information, which can help to diagnose and fix performance problems.
In terms of actual measurement, stopwatch time is what users notice so if you want to validate that things are within acceptable limits, stopwatch time is fine.
When you want to actually fix problems, however, a profiler can be really helpful.
You need to test a realistic number of iterations as you will get different answers depending on how you test the timing. If you only perform an operation once, it could be misleading to take the average of many iterations. If you want to know the time it takes after the JVM has warmed up you might run many (e.g. 10,000) iterations which are not included in the timings.
I also suggest you use System.nanoTime() as it's much more accurate. If your test time is around 10 microseconds or less, you don't want to call it too often or it can change your result. (e.g. if I am testing for, say, 5 seconds and want to know when that time is up, I only get the nanoTime every 1000 iterations, provided I know an iteration is very quick.)
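A sketch of what that looks like in practice (the 10,000-iteration warm-up, the 5-second budget, and workUnderTest are all placeholders to adjust for your own case):

```java
public class MicroBenchmark {
    public static void main(String[] args) {
        // Warm up first so the JIT has compiled the code under test.
        for (int i = 0; i < 10000; i++) {
            workUnderTest();
        }

        long budgetNanos = 5_000_000_000L;   // run for roughly 5 seconds
        long start = System.nanoTime();
        long iterations = 0;
        // Check the clock only every 1000 iterations so the nanoTime() calls
        // themselves don't distort a very short operation.
        while (System.nanoTime() - start < budgetNanos) {
            for (int i = 0; i < 1000; i++) {
                workUnderTest();
            }
            iterations += 1000;
        }
        long elapsed = System.nanoTime() - start;
        System.out.printf("%.1f ns per iteration%n", (double) elapsed / iterations);
    }

    private static void workUnderTest() {
        // ... the operation being measured ...
    }
}
```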
How do you address operating system scheduling when benchmarking?
Benchmark for long enough on a system which is representative of the machine you will be using. If your OS slows down your application, then that should be part of the result.
There is no point in saying, my program would be faster, if only I didn't have an OS.
If you are using Linux, you can use tools such as numactl, chrt and taskset to control how CPUs are used and the scheduling.
Profilers can get in the way of timings, so I would use a combination of stopwatch timing to identify overall performance problems, then use the profiler to work out where the time is being spent. Repeat the process as required.
After all, it's probably the second most popular form of benchmarking, right after "no-watch benchmarking" - where we say "this activity seems slow, that one seems fast."
Usually what's most important to optimize is whatever interferes with the user experience - which is most often a function of how frequently you perform the action, and whatever else is going on at the same time. Other forms of benchmarking often just help zero in on these.
I think a key question is the complexity and length of time of the operation.
I sometimes even use physical stopwatch measurements to see if something takes minutes, hours, days, or even weeks to compute (I am working with an application where run times on the orders of several days are not unheard of, even if seconds and minutes are the most common time spans).
However, the automation afforded by calls to any kind of clock system on the computer, like the java millis call referred to in the linked article, is clearly superior to manually seeing how long something runs.
Profilers are nice, when they work, but I have had problems applying them to our application, which usually involves dynamic code generation, dynamic loading of DLLs, and work performed in the two built-in just-in-time-compiled scripting languages of my application. They are quite often limited to assuming a single source language and other unrealistic expectations for complex software.
I ran a program today that searched through and collected information from a bunch of dBase files, it took just over an hour to run. I took a look at the code, made an educated guess at what the bottleneck was, made a minor improvement to the algorithm, and re-ran the program, this time it completed in 2.5 minutes.
I didn't need any fancy profiling tools or benchmark suites to tell me the new version was a significant improvement. If I needed to further optimize the running time I probably would have done some more sophisticated analysis but this wasn't necessary. I find that this sort of "stopwatch benchmarking" is an acceptable solution in quite a number of cases and resorting to more advanced tools would actually be more time-consuming in these cases.
I don't think stopwatch benchmarking is too horrible, but if you can get onto a Solaris or OS X machine you should check out DTrace. I've used it to get some great information about timing in my applications.
I always use stopwatch benchmarking as it is so much easier. The results don't need to be very accurate for me though. If you need accurate results then you shouldn't use stopwatch benchmarking.
I do it all the time. I'd much rather use a profiler, but the vendor of the domain-specific language I'm working with doesn't provide one.