I am trying out the VisualVM program that comes with the new JDKs. I am doing profiling on it and trying to profile CPU on only methods in a particular package.
I put the following in the "Profile Only Classes:"
jig.*
Where jig is the package I want to instrument. Unfortunately I get back results on other methods that are not in that package or any subpackages.
The only way I can reproduce your problem is if I leave the "Profile new Runnables" box checked. When I leave that checked, the profiler picks up code started as new threads, even if that code does not meet the filtering criteria. I guess this is unclear functionality.
You should make sure you uncheck that box before you do your profiling activity. Just be aware that with it unchecked, that probably means you won't see profile information of any of your own code that happens to be started as a separate Thread. (But I figure there's a good chance you're not doing that, so you have nothing to be concerned about.)
Actually, there's an open bug about that:
https://java.net/jira/browse/VISUALVM-546
I totally agree with the submitter (and with your disappointment about the "strange" behavior of VisualVM). Even with "Profile new Runnables" checked, the filter should be honored in my opinion.
Profiling is an important task, especially with large projects typically deployed on application servers, where it is common (and correct) to use threads for background tasks and for serving user requests.
I invite everybody to vote for the issue so it gets attention from the VisualVM developers.
You can enter a filtering criterion in the text field at the bottom of the "Profiling Results" list; that should do the trick.
When I receive a request on my API, I want to do a series of steps, each being a check or an enrichment. Each step could either succeed or fail. On Success, the next step should be carried out. On Failure, an end-step should be executed, and the flow is done. For that I have considered Spring State Machine, as it seems to fit the bill.
I have read up on the documentation and played around with it, but some things elude me:
Should there be a 1-to-1 relationship between a request and a State Machine, meaning that for every request, I make a new State Machine instance? Or should I somehow reuse a completed State Machine by resetting the machine for the next request?
What about cleanup of completed State Machines? There doesn't seem to be a way to destroy and clean a State Machine instance. If I create 1 per request, I've effectively introduced a memory leak, unless the framework somehow handles resources.
There is no absolutely correct answer to your question, so I'll just leave some comments here. A state machine as a concept is so loose that it gives you many different ways to do things.
The whole concept of steps running one after another kind of relates to how the tasks recipe was implemented. It executes a DAG of tasks, and if a parent task fails the machine enters an error state, giving the user a chance to fix things and request the machine to continue (see statemachine-recipes-tasks and statemachine-examples-tasks). This kind of use case might be a good candidate for a new recipe, as it is pretty generic.
The framework should clear things after a machine has been stopped, and eventually the JVM should collect the garbage. If you find something abnormal, please file a GitHub issue and we'll fix it.
We have a sample, statemachine-examples-eventservice, which reuses machines, but I'm currently re-implementing that sample (it works but should be implemented better), as I was told by our head chef that what I did there is dumb (SPR-15042). Machines cannot be used with a session scope, and things go south if a rich object (which an ssm is) is serialised.
It is relatively easy to do a combination of states and choices which would do your step flow. It's only a question of how much you want this to be re-usable (thus a generic recipe would be a good thing, PRs welcome :) ).
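For illustration, here is a minimal sketch using Spring State Machine's builder API of a two-step flow where each step either advances on success or jumps to a failure end state. The enum names and the use of plain OK/FAIL events (rather than choice pseudostates with guards) are just assumptions for the example:

import org.springframework.statemachine.StateMachine;
import org.springframework.statemachine.config.StateMachineBuilder;

public class StepFlowFactory {

    // Hypothetical states and events for a two-step check/enrich flow
    public enum States { STEP1, STEP2, DONE, FAILED }
    public enum Events { OK, FAIL }

    public static StateMachine<States, Events> newMachine() throws Exception {
        StateMachineBuilder.Builder<States, Events> builder = StateMachineBuilder.builder();

        builder.configureStates()
            .withStates()
                .initial(States.STEP1)
                .state(States.STEP2)
                .end(States.DONE)
                .end(States.FAILED);

        builder.configureTransitions()
            .withExternal().source(States.STEP1).target(States.STEP2).event(Events.OK)
                .and()
            .withExternal().source(States.STEP1).target(States.FAILED).event(Events.FAIL)
                .and()
            .withExternal().source(States.STEP2).target(States.DONE).event(Events.OK)
                .and()
            .withExternal().source(States.STEP2).target(States.FAILED).event(Events.FAIL);

        return builder.build();
    }
}

Each request would then start (or reset) a machine and send OK/FAIL events as its steps complete.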
As for error handling, what I presented in a statechart in gh-240 is also something to consider.
There have been some questions about whether ssm could work as a more generic flow engine, but that's probably something it's never going to be, as it would be a completely new project. Though most of these flows could be handled as separate recipes.
We have recently installed a SonarQube instance to check our source code.
The codebase is pretty large, with more than 1 million lines of code.
We run sonar-runner automatically via Jenkins.
Now, I understand that the UI only gets updated after sonar-runner stores its results in the database.
But it sometimes really takes ages, up to an hour after sonar-runner succeeds, before we are able to see anything appear in the UI.
So I have a couple of related questions:
Is there a way to see analyses that are still 'in the pipes'?
Where can I see whether the conversion from database to the UI has failed?
Is there a way to speed up the process?
So, to summarize: how can I reduce the latency between sonar-runner finishing and the results appearing in the SonarQube UI?
I went through all the docs but couldn't find much about this yet.
Thanks for the info,
Is there a way to see analyses that are still 'in the pipes'?
Yes, log in as admin and go to Settings > System > Analysis report
Where can I see whether the conversion from database to the UI has failed?
Have a look at the content of the "Current Activity" and "Past Reports" tabs.
Is there a way to speed up the process?
This is a very broad question which implies tons of different answers. It all depends on where the time is spent. You may be CPU bound, memory bound, database bound, ...
Having a look at the queue of report processing might give you a hint.
1 MLoc is not so huge. I run SonarQube through sonar-runner + Jenkins, and when Jenkins indicates in the log that the analysis has been successful, I am able to see it in SonarQube's dashboard. So I would say your 'latency' is not normal.
Could you please give details about your environment? Physical/virtual? OS? DB? SQ release? etc.
After loads of searching around, I realized that for some reason SonarQube didn't correctly handle the fact that I was running several sonar-runner analyses right after each other.
After the 'Store results in database' message, there is a window of a couple of seconds during which starting a new analysis will cause the SonarQube GUI not to see the analysis.
Running analyses with a bit more time between them reduced the latency a great deal.
Since Seb gave a lot of insight about SonarQube itself, I will accept his answer. It is also probably a better fit for a general audience and less specific to my situation.
I have to write a program that is meant to run 'forever', meaning that it won't terminate regularly. Up until now I have always written programs that run and are terminated at the end of the day. The program has to do some synchronizations, pause for n minutes and then sync again.
AFAIK there should be no problem with my current implementation and it should theoretically run just fine, but I'm lacking any real-world experience.
So are there any 'patterns' or best practices for writing very robust and resource efficient java programs that have a very long runtime? What could be possible problems after for example a month/year of runtime?
Some background:
Java: 1.7 but compiled down to 1.5
OS: Windows (exact version is not certain yet)
Thanks in advance
Just a brain dump of all the things I've had to keep in mind when writing this kind of app.
Avoid Memory Leaks
I had an app that ran once at midday, every day, and in it I had a FileWriter. I wasn't closing it properly, and then we started wondering why our virtual machine was going into meltdown after a few weeks. Memory leaks can come in any form really, with one of the most common examples being that you don't de-reference an object appropriately, for example using a class's field as temporary storage. Often the class persists, and so does the reference, leaving you with objects sitting in memory and doing nothing.
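For example, a minimal sketch of the fix for that FileWriter case, closing the writer in a finally block (on Java 7+ source levels, try-with-resources does the same more concisely):

import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

public class ReportWriter {
    public static void writeReport(String path, String content) throws IOException {
        Writer out = new FileWriter(path);
        try {
            out.write(content);
        } finally {
            out.close();   // always release the handle, even if write() throws
        }
    }
}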
Use the right kind of Scheduler
I used a java.util.Timer in that app, and later learnt that it's better to use a ScheduledThreadPoolExecutor, after another app changing the system clock broke the schedule. So if you plan on keeping it completely Java based, I would strongly recommend using that over a Timer, for all of the reasons detailed in this question.
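A minimal sketch of what that looks like (the interval and task body are placeholders), using the java.util.concurrent scheduler instead of a Timer:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SyncScheduler {
    public static void main(String[] args) {
        // A ScheduledThreadPoolExecutor behind the ScheduledExecutorService interface
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        scheduler.scheduleWithFixedDelay(new Runnable() {
            public void run() {
                // placeholder: one synchronization pass
            }
        }, 0, 15, TimeUnit.MINUTES);
    }
}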
Be mindful of memory usage and your environment
If your app loads large amounts of data each and every day, and you have other apps running on the same server, you may want to be careful about the timing. For example, if three other apps run their scheduled operations at midday, running yours at any other time would probably be a smart move. Be mindful of the environment in which your code is executing.
Error handling
You probably want to configure your app to let you know if something has gone wrong, without the app breaking down. If it's running at a certain time every few hours, that means people are probably depending on it, so I would have a function in your Java code that sends out an email to you, detailing the nature of the exception.
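As a rough sketch of that idea (the SMTP host and addresses are placeholders, and it assumes the JavaMail library is available on the classpath):

import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Properties;
import javax.mail.Message;
import javax.mail.Session;
import javax.mail.Transport;
import javax.mail.internet.InternetAddress;
import javax.mail.internet.MimeMessage;

public class FailureNotifier {
    public static void notifyFailure(Exception e) {
        try {
            StringWriter trace = new StringWriter();
            e.printStackTrace(new PrintWriter(trace));

            Properties props = new Properties();
            props.put("mail.smtp.host", "smtp.example.com");   // placeholder host
            Session session = Session.getInstance(props);

            MimeMessage message = new MimeMessage(session);
            message.setFrom(new InternetAddress("sync-job@example.com"));
            message.setRecipient(Message.RecipientType.TO,
                    new InternetAddress("ops@example.com"));
            message.setSubject("Sync job failed: " + e.getMessage());
            message.setText(trace.toString());
            Transport.send(message);
        } catch (Exception mailError) {
            mailError.printStackTrace();   // never let the notifier itself kill the job
        }
    }
}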
Make it configurable
Again, if it needs to run at various points in the day, you don't want to have to pull the thing down for a few hours to work out some minor changes to your code. Instead, move the settings into a Java Properties file, or into an XML config (or really, whatever). The advantage of this is that you can update your program's behaviour and have it back up and running before anyone really notices the difference.
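For instance, a minimal sketch (the file name and key are made up) of reading the sync interval from a properties file so it can be changed without a redeploy:

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class AppConfig {
    public static int syncIntervalMinutes() throws IOException {
        Properties props = new Properties();
        InputStream in = new FileInputStream("sync.properties");   // hypothetical config file
        try {
            props.load(in);
        } finally {
            in.close();
        }
        // fall back to 15 minutes if the key is missing
        return Integer.parseInt(props.getProperty("sync.interval.minutes", "15"));
    }
}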
Be afraid of the static keyword
That bad boy will make objects persist, even when you destroy their parent reference. It is the mother of all memory leaks if you are not careful with it. It's fine for constants, and things that you know don't need to change and need to exist within the project to run well, but if you're using it for random values inside a project, you're going to quickly wonder why your app is crashing every few hours rather than syncing.
Props to #X86 for reminding me of that one.
Memory leaks are likely to be the biggest problem. Ensure that there are no long-term references held after an iteration of your logic. Even a relatively small object being referenced forever will eventually exhaust the memory (and worse, it will be harder to detect during testing if the growth rate is 1 GB/month). One approach that may help is using the snapshot functionality of profilers: take a snapshot during the pause, let the sync run a few times, and take another snapshot. Comparing these should show the delta between the synchronizations, which should hopefully be zero.
Cache maintenance is another issue. The overall size of a cache needs to be strictly limited (whereas you can often get away without a limit in short-running programs, because everything seen will be small enough not to cause problems). Equally, it's more important to do cache invalidation properly: broadly speaking, everything that gets cached will become stale at some point while your program is still running, and you need to be able to detect this and take appropriate action. This can be tricky depending on where the golden source of the cached data is.
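One simple way to enforce a hard size limit, as a sketch, is a LinkedHashMap with removeEldestEntry overridden (not thread-safe; synchronize or wrap it as needed):

import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedCache(int maxEntries) {
        super(16, 0.75f, true);   // access-order, so the least-recently-used entry is evicted
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}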
The last thing I'll mention is exception handling. For short-running processes, it's often enough to simply let the process die when an exception is encountered, so the issue can be dealt with and the app rerun. With a long-running process you'll likely need to be more defensive than this. Consider running parts of your program in threads, which can be restarted* if/when they fail. You may need a supervisor-type module, which checks that everything else is still heartbeating and reboots it if not. If appropriate to your structure, this is anecdotally a lot easier to achieve with actor-style libraries than with Java's standard executors. And if at all possible, you may want to have hooks (perhaps exposed over JMX/MBeans) that let you modify the behaviour somewhat, to allow a short-term hack/workaround to be put in place without having to bring the process down. Though this requires quite some foresight to predict exactly what's going to go wrong in several months...
*or rather, the job can be restarted in another thread
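To make the supervisor idea concrete, a minimal sketch (the worker body is a placeholder) of a loop that restarts the sync worker in a fresh thread whenever it dies:

public class Supervisor {
    public static void main(String[] args) throws InterruptedException {
        while (true) {
            Thread worker = new Thread(new Runnable() {
                public void run() {
                    runSyncLoop();   // placeholder for the long-running sync logic
                }
            }, "sync-worker");
            worker.start();
            worker.join();   // returns when the worker dies, normally or via an uncaught exception
            System.err.println("sync-worker terminated; restarting in 30 seconds");
            Thread.sleep(30000);
        }
    }

    private static void runSyncLoop() {
        // placeholder for the real work
    }
}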
We created a Java agent which checks our application suite to see if, for instance, the parent/child structure is still correct. To do that it needs to check 8000+ documents across several applications.
The check itself is very fast. We use a navigator to retrieve data from views and only read data from those entries. The problem is in our logging mechanism. Whenever we report a log entry with level SEVERE (i.e., a really big issue), the backend document is updated immediately, because we don't want to lose any info about these issues.
In our test runs we see that everything runs smoothly, but as soon as we 'create' a lot of severe issues the performance drops enormously because of all the writes. I would like to hear from any Notes developers facing the same challenge: how could we speed up the writing without losing any data?
-- added more info after comment from Simon --
It's a scheduled agent which runs every night to check for inconsistencies. The goal is of course to find inconsistencies, fix the cause, and eventually have no inconsistencies reported at all.
It's a scheduled agent which runs every night to check for inconsistencies.
OK. So there are a number of factors to take into account.
Are there any embedded JARs? When an agent has embedded JARs, the server has to detach them from the agent to disk before it can run the code. This is done every time the agent executes and can be a performance hit. If your agent runs a number of times, remove the embedded JARs and put them into the lib\ext folder on the server instead (requires a server restart).
You mention it runs at night. By default, general housekeeping processes run at night. Check the notes.ini for scheduled server tasks and assess what impact they have on the server/agent while it is running. For example:
ServerTasksAt1=Catalog,Design
ServerTasksAt2=Updall
ServerTasksAt5=Statlog
In this case, if the agent runs between 2 and 5, then Updall could have an impact on it. Also check Program documents for scheduled executions.
In what way are you writing? If you are creating a document for each incident and the document content is small, then the write time should be reasonable. What is liable to hurt performance is one of the following:
If you are multi threading those writes.
Pulling a log document, appending a line, saving and then repeating.
One last thing to think about: if you are getting 3000 errors, isn't there a point where X errors means there is no point continuing, and you should instead alert the admin via SNMP/email/etc.? It might be worth coding that in as well.
Other than that, you should probably post some sample code relating to the write.
Hmm, a difficult, or rather general, question.
As far as I understand, you update the documents in the view you are walking through. I would set view.AutoUpdate to false. This ensures that the view is not reloaded while you are running your code. This should speed up your code.
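For reference, a minimal sketch of what that looks like in a Java agent (the view name is made up):

import lotus.domino.AgentBase;
import lotus.domino.Database;
import lotus.domino.Session;
import lotus.domino.View;
import lotus.domino.ViewEntry;
import lotus.domino.ViewNavigator;

public class ConsistencyCheckAgent extends AgentBase {
    public void NotesMain() {
        try {
            Session session = getSession();
            Database db = session.getCurrentDatabase();
            View view = db.getView("DocumentsByParent");   // hypothetical view
            view.setAutoUpdate(false);                      // avoid view reloads while iterating
            ViewNavigator nav = view.createViewNav();
            ViewEntry entry = nav.getFirst();
            while (entry != null) {
                // ... read column values and run the consistency check ...
                ViewEntry next = nav.getNext(entry);
                entry.recycle();                            // free the Notes backend handle
                entry = next;
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}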
This is an extract from the Designer help:
Avoid automatically updating the parent view by explicitly setting AutoUpdate to False. Automatic updates degrade performance and may invalidate entries in the navigator ("Entry not found in index"). You can update the view as needed with Refresh.
Hope that helps.
If that does not help you might want to post a code fragment or more details.
Create separate documents for each error rather than one huge document (see the sketch below).
or
Write to a text file directly rather than to the database, and then pull it into a document later if necessary. This should speed things up considerably.
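If you go the separate-documents route (the first suggestion), here is a minimal sketch of a small, independent write per issue; the form and field names are made up:

import lotus.domino.Database;
import lotus.domino.Document;

public class IssueLogger {
    private final Database logDb;

    public IssueLogger(Database logDb) {
        this.logDb = logDb;
    }

    public void logSevere(String docUnid, String message) throws Exception {
        Document log = logDb.createDocument();
        log.replaceItemValue("Form", "LogEntry");
        log.replaceItemValue("Severity", "SEVERE");
        log.replaceItemValue("SourceUNID", docUnid);
        log.replaceItemValue("Message", message);
        log.save(true, false);   // one small, independent write per issue
        log.recycle();
    }
}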
I am trying to profile an application with jvisualvm. The application consists of a loop, in which data is loaded from a database and then some complex calculations are performed on the data. When a set of data is processed, the next set is loaded and calculated.
When I start my application and attach jvisualvm, I set up a filter on the CPU profiling page ("Start profiling from classes" and "Do not profile classes"), since I am not interested in anything that relates to the database access and other input/output related stuff.
The filter works - almost. My problem is, that the profiler reports most of the time is spent in sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(), even though sun.* is entered into the "Do not profile classes" filter. This is the only method in sun.* appearing in my profiling results.
Has anyone seen this before, and does anyone know how to get rid of it? The problem is that all other methods show up only with tiny amounts (<1%) in the "Self Time" column; most are displayed with 0%.
The jvisualvm version used is 1.3.2.
Thanks in advance,
Axel
Sounds like most of the time is spent waiting for the database. If you want to profile the rest of the stuff, you can either
stub the database so that it returns quickly (thus making the rest of your code take most of the time), or
use a better profiler such as YourKit or JProfiler (paid, and they definitely support what you want) or TPTP (free, but I'm not sure how powerful it is)
Uncheck 'Profile new Runnables' on the CPU profiling page.
To answer your other question about "Self Time": you need to take a CPU snapshot of the profiled data. The snapshot contains total method time info.