I need to time/performance-check a piece of code in production.
The code is a Java stack. It most probably has log4j integrated. It interacts with JMS: it sends a request on it and picks a response from it. I need to prove that, from the user event (i.e. a click on the front end) to the point where it goes and waits on JMS, it is relatively fast. I need to prove (know) that most of the time taken in the round trip is spent waiting for a message from JMS.
I am currently looking at http://perf4j.codehaus.org/devguide.html. However, I would like to poll the group for suggestions. A few restrictions that I need to work with are:
I need something that can be run in production. It needs to be something that I can switch on and off relatively easily.
It cannot be too heavy in terms of memory or CPU usage.
It needs to be something that I can put into the existing code base with the least amount of change.
So, does anyone have any suggestions apart from http://perf4j.codehaus.org/devguide.html?
Aspects, plus JVM system arguments for enabling/disabling (though that requires a restart), or JMX if you need real-time on/off.
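For illustration, a minimal sketch of what such a timing aspect could look like, assuming Spring AOP/AspectJ and log4j are already on the classpath; the package in the pointcut and the system property name are made up:

```java
import org.apache.log4j.Logger;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

// Hypothetical timing aspect; the pointcut package and property name are assumptions.
@Aspect
public class TimingAspect {

    // Toggle with -Dtiming.enabled=true at startup (restart required),
    // or call setEnabled() from a JMX MBean for runtime on/off.
    private static volatile boolean enabled = Boolean.getBoolean("timing.enabled");

    public static void setEnabled(boolean value) {
        enabled = value;
    }

    @Around("execution(* com.example.jms..*(..))") // hypothetical pointcut
    public Object time(ProceedingJoinPoint pjp) throws Throwable {
        if (!enabled) {
            return pjp.proceed();
        }
        long start = System.nanoTime();
        try {
            return pjp.proceed();
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1000000L;
            // Log through the existing log4j setup so no new dependency is needed.
            Logger.getLogger(TimingAspect.class)
                  .info(pjp.getSignature().toShortString() + " took " + elapsedMs + " ms");
        }
    }
}
```

Pointing one advice at the front-end entry point and another at the method that blocks on the JMS response would let you compare the two timings and show how much of the round trip is spent waiting on JMS.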
When I receive a request on my API, I want to do a series of steps, each being a check or an enrichment. Each step could either succeed or fail. On Success, the next step should be carried out. On Failure, an end-step should be executed, and the flow is done. For that I have considered Spring State Machine, as it seems to fit the bill.
I have read up on the documentation and played around with it, but some things elude me:
Should there be a 1-to-1 relationship between a request and a State Machine, meaning that for every request, I make a new State Machine instance? Or should I somehow reuse a completed State Machine by resetting the machine for the next request?
What about cleanup of completed State Machines? There doesn't seem to be a way to destroy and clean a State Machine instance. If I create 1 per request, I've effectively introduced a memory leak, unless the framework somehow handles resources.
There is no absolutely correct answer to your question, so I'll just leave some comments here. A state machine as a concept is loose enough that it gives you many different ways to do things.
The whole concept of steps executed one after another relates closely to how the tasks recipe was implemented. It executes a DAG of tasks, and if a parent task fails, the machine enters an error state, giving the user a chance to fix things and request the machine to continue. See statemachine-recipes-tasks and statemachine-examples-tasks. This kind of use case might be a good candidate for a new recipe, as it is pretty generic.
The framework should clean things up after a machine has been stopped, and the JVM should eventually collect the garbage. If you find something abnormal, please file a GitHub issue and we'll fix it.
We have a sample, statemachine-examples-eventservice, which reuses machines, but I'm currently re-implementing that sample (it works but should be implemented better), as I was told by our head chef that what I did there is dumb (SPR-15042). Machines cannot be used with a session scope, and things go south if a rich object (which a state machine is) gets serialised.
It is relatively easy to combine states and choices into your step flow; the only question is how much you want this to be reusable (thus a generic recipe would be a good thing, PRs welcome :)).
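For what it's worth, a rough configuration sketch of that kind of step flow might look like the following; the states, events and two-step flow are made-up placeholders:

```java
import org.springframework.context.annotation.Configuration;
import org.springframework.statemachine.config.EnableStateMachineFactory;
import org.springframework.statemachine.config.EnumStateMachineConfigurerAdapter;
import org.springframework.statemachine.config.builders.StateMachineStateConfigurer;
import org.springframework.statemachine.config.builders.StateMachineTransitionConfigurer;

// Hypothetical states/events for a two-step check-and-enrich flow.
enum Steps { STEP1, STEP2, DONE, FAILED }
enum Events { SUCCESS, FAILURE }

@Configuration
@EnableStateMachineFactory
class StepFlowConfig extends EnumStateMachineConfigurerAdapter<Steps, Events> {

    @Override
    public void configure(StateMachineStateConfigurer<Steps, Events> states) throws Exception {
        states.withStates()
              .initial(Steps.STEP1)
              .state(Steps.STEP2)
              .end(Steps.DONE)
              .end(Steps.FAILED);
    }

    @Override
    public void configure(StateMachineTransitionConfigurer<Steps, Events> transitions) throws Exception {
        transitions
            // Each step either advances on SUCCESS or drops to the end-step on FAILURE.
            .withExternal().source(Steps.STEP1).target(Steps.STEP2).event(Events.SUCCESS)
            .and()
            .withExternal().source(Steps.STEP1).target(Steps.FAILED).event(Events.FAILURE)
            .and()
            .withExternal().source(Steps.STEP2).target(Steps.DONE).event(Events.SUCCESS)
            .and()
            .withExternal().source(Steps.STEP2).target(Steps.FAILED).event(Events.FAILURE);
    }
}
```

Getting a fresh machine from the injected StateMachineFactory per request is one common way to handle the 1-to-1 question; machines you stop and drop should then be eligible for normal garbage collection.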
As for error handling, the statechart I presented in gh-240 is also something to consider.
There have been some questions about whether SSM could work as a more generic flow engine, but that's probably something it's never going to be, as it would be a completely new project. Though most of these flows could be handled as separate recipes.
Alright, so I have a Spring application that takes in a Network Representation and boots up virtual machines to represent the network that was passed in.
It uses a low-level API to bring up the VMs; there is no database involved.
What I need to figure out is how to handle the situation where a user submits a 10-node (or any other size) network model and the application goes through and builds up the network (starting VMs): if a node fails to start up, I want to be able to react to that. I would like to be able to roll back my changes (i.e. destroy all nodes that were created).
I've been told that I need to look into "Transactions" but I am unsure whether or not that applies to this scenario when I'm not using a database.
As a side note, I do have logic to take down nodes if a user sends in that request.
My question is -- how do I handle this?
Also, is this the best stack overflow for this question?
It does seem that you are looking for transactional behavior, and specifically, for atomicity ("all or nothing"). But usually "transaction" connotes certain guarantees (particularly around ACID properties) that will be difficult or impossible to achieve where human-level timescales on the order of minutes are involved.
Probably "workflow with compensation for errors" is more what you would be looking for here.
I would implement this manually, perhaps with tool support (e.g. workflow engines). Kick off a process to spawn your network, and keep track of the current progress, such as VMs created, VMs in progress, etc. If there are errors that demand a rollback, have another process perform a cleanup. The cleanup itself could fail, so it might retry its various steps a couple of times before generating a report that says "this cleanup step failed".
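To make the compensation idea concrete, here is a rough sketch in plain Java; VmApi, Vm and NodeSpec are hypothetical stand-ins for your low-level API and network model:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical abstractions over the low-level VM API and the network model.
interface Vm { }
interface NodeSpec { }
interface VmApi {
    Vm startVm(NodeSpec node);   // assumed to throw a RuntimeException on failure
    void destroyVm(Vm vm);
}

class NetworkBuilder {
    private final VmApi vmApi;

    NetworkBuilder(VmApi vmApi) {
        this.vmApi = vmApi;
    }

    void buildNetwork(List<NodeSpec> nodes) {
        Deque<Vm> created = new ArrayDeque<Vm>();
        try {
            for (NodeSpec node : nodes) {
                created.push(vmApi.startVm(node));
            }
        } catch (RuntimeException startupFailure) {
            rollback(created);      // compensation: undo the partially built network
            throw startupFailure;   // then surface the original failure
        }
    }

    // Tear down everything created so far, in reverse order of creation.
    private void rollback(Deque<Vm> created) {
        while (!created.isEmpty()) {
            Vm vm = created.pop();
            try {
                vmApi.destroyVm(vm);
            } catch (RuntimeException cleanupFailure) {
                // Cleanup can itself fail; real code would retry a few times
                // and then report "this cleanup step failed".
            }
        }
    }
}
```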
If there are shared resources involved then you would need to implement some kind of isolation mechanism as well. Sometimes this is easy enough--e.g., DHCP helps you avoid duplicate IPs. If you're updating a DNS zone file then you'd want to synchronize access to that to avoid concurrent writes. Etc.
I have to write a program that is intended to run 'forever', meaning that it won't terminate regularly. Up until now I have always written programs that would run and be terminated at the end of the day. The program has to do some synchronizations, pause for n minutes and then sync again.
AFAIK there should be no problem with my current implementation and it should theoretically run just fine, but I'm lacking any real-world experience.
So are there any 'patterns' or best practices for writing very robust and resource efficient java programs that have a very long runtime? What could be possible problems after for example a month/year of runtime?
Some background:
Java : 1.7 but compiled down to 1.5
OS : Windows (exact version is not certain yet)
Thanks in advance
Just a brain dump of all the things I've had to keep in mind when writing this kind of app.
Avoid Memory Leaks
I had an app that ran once at midday, every day, and in it I had a FileWriter. I wasn't closing it properly, and then we started wondering why our virtual machine was going into meltdown after a few weeks. Memory leaks can come in the form of anything really, one of the most common examples being that you don't de-reference an object appropriately, for example using a class's field for temporary storage. Often the class persists, and so does the reference, leaving you with objects sitting in memory and doing nothing.
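For example, the fix for that FileWriter problem is simply to guarantee the close; a minimal sketch with a made-up file name:

```java
import java.io.FileWriter;
import java.io.IOException;

// Always close the writer in a finally block (or with try-with-resources on
// Java 7+ source levels) so file handles and buffers are released.
public class DailyReportWriter {
    public static void appendLine(String line) throws IOException {
        FileWriter writer = new FileWriter("daily-report.log", true);
        try {
            writer.write(line + System.getProperty("line.separator"));
        } finally {
            writer.close(); // the missing close() was the source of the leak described above
        }
    }
}
```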
Use the right kind of Scheduler
I used a java.util.Timer in that app, and I later learnt that it's better to use a ScheduledThreadPoolExecutor, after another app changing the system clock caused problems. So if you plan on keeping it completely Java based, I would strongly recommend using that over a Timer, for all of the reasons detailed in this question.
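A minimal sketch of the ScheduledThreadPoolExecutor approach, with a made-up interval and a placeholder syncAll() method:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Run the sync every n minutes with a ScheduledExecutorService instead of java.util.Timer.
public class SyncScheduler {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(new Runnable() {
            public void run() {
                syncAll();
            }
        }, 0, 15, TimeUnit.MINUTES); // start immediately, then wait 15 minutes between runs
    }

    private static void syncAll() {
        // ... synchronization work goes here ...
    }
}
```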
Be mindful of memory usage and your environment
If your app is loading large amounts of data each and every day, and you have other apps running on the same server, you may want to be careful about the timing. For example, if three of the other apps run their scheduled operations at midday, running yours at any other time would probably be a smart move. Be mindful of the environment in which you're executing your code.
Error handling
You probably want to configure your app to let you know if something has gone wrong, without the app breaking down. If it's running at a certain time every few hours, that means people are probably depending on it, so I would have a function in your Java code that sends out an email to you, detailing the nature of the exception.
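A rough sketch of such a notification method, assuming JavaMail is available; the SMTP host and addresses are placeholders:

```java
import java.util.Properties;
import javax.mail.Message;
import javax.mail.MessagingException;
import javax.mail.Session;
import javax.mail.Transport;
import javax.mail.internet.InternetAddress;
import javax.mail.internet.MimeMessage;

// Sends a simple alert email describing the exception; host and addresses are made up.
public class FailureNotifier {
    public static void notifyFailure(Exception problem) {
        Properties props = new Properties();
        props.put("mail.smtp.host", "smtp.example.com"); // hypothetical SMTP relay
        Session session = Session.getInstance(props);
        try {
            MimeMessage message = new MimeMessage(session);
            message.setFrom(new InternetAddress("sync-app@example.com"));
            message.addRecipient(Message.RecipientType.TO, new InternetAddress("ops@example.com"));
            message.setSubject("Sync job failed: " + problem.getClass().getSimpleName());
            message.setText(problem.toString()); // a real version would include the stack trace
            Transport.send(message);
        } catch (MessagingException e) {
            // Last resort: at least log locally if the alert itself cannot be sent.
            e.printStackTrace();
        }
    }
}
```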
Make it configurable
Again, if it needs to run at various points in the day, you don't want to have to pull the thing down for a few hours to work in some minor changes to your code. Instead, move those values into a Java Properties file or an XML config (or really, whatever). The advantage is that you can update your program and get it up and running again before anyone really notices the difference.
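For example, loading the tunables from a properties file could look like this (the file name and keys are made up):

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Loads tunable values from an external file so they can change without a redeploy.
public class AppConfig {
    public static Properties load() throws IOException {
        Properties props = new Properties();
        InputStream in = new FileInputStream("app-config.properties");
        try {
            props.load(in);
        } finally {
            in.close();
        }
        return props;
    }
}
```

Then something like AppConfig.load().getProperty("sync.interval.minutes", "15") gives you the value without touching the code.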
Be afraid of the static keyword
That bad boy will make objects persist even when you destroy their parent reference. It is the mother of all memory leaks if you are not careful with it. It's fine for constants and for things that you know don't need to change and need to exist for the project to run well, but if you're using it for random values all over a project, you're going to quickly wonder why your app is crashing every few hours rather than syncing.
Props to #X86 for reminding me of that one.
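A contrived example of the kind of leak being described:

```java
import java.util.ArrayList;
import java.util.List;

// The static list outlives every Worker instance, so anything added to it
// is never eligible for garbage collection, even when the Worker is gone.
public class Worker {
    private static final List<byte[]> SCRATCH = new ArrayList<byte[]>();

    public void process() {
        SCRATCH.add(new byte[1024 * 1024]); // "temporary" data that is never removed
    }
}
```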
Memory leaks are likely to be the biggest problem. Ensure that there are no long-term references held after an iteration of your logic. Even a relatively small object being referenced forever will exhaust the memory eventually (and worse, it will be harder to detect during testing if the growth rate is only 1GB/month). One approach that may help is using the snapshot functionality of profilers: take a snapshot during the pause, let the sync run a few times, and take another snapshot. Comparing these should show the delta between the synchronizations, which should hopefully be zero.
Cache maintenance is another issue. The overall size of a cache needs to be strictly limited (whereas in short-running programs you can often get away without a limit, because everything seen will be small enough not to cause problems). It's equally important to do cache invalidation properly: broadly speaking, everything that gets cached will become stale at some point while your program is still running, and you need to be able to detect this and take appropriate action. This can be tricky depending on where the golden source of the cached data is.
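A minimal sketch of a strictly size-limited cache, using LinkedHashMap's eviction hook (the capacity is whatever suits your data):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LinkedHashMap in access order evicts the least-recently-used entry once
// the cache grows past maxEntries, keeping memory usage bounded.
public class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order, needed for LRU behaviour
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```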
The last thing I'll mention is exception handling. For short-running processes, it's often enough to simply let the process die when an exception is encountered, so the issue can be dealt with and the app rerun. With a long-running process you'll likely need to be more defensive than this. Consider running parts of your program in threads, which can be restarted* if/when they fail. You may need a supervisor-type module which checks that everything else is still heartbeating and reboots it if not. If appropriate to your structure, this is anecdotally a lot easier to achieve with actor-style libraries than with Java's standard executors. And if it's at all possible, you may want to have hooks (perhaps exposed over JMX/MBeans) that let you modify the behaviour somewhat, to allow a short-term hack/workaround to be effected without having to bring the process down. Though this requires quite some amount of foresight to predict exactly what's going to go wrong in several months...
*or rather, the job can be restarted in another thread
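As a very rough illustration of the supervisor idea (the names and the heartbeat interval are made up):

```java
// If the worker thread dies for any reason, the supervisor starts a fresh one.
public class Supervisor {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = startWorker();
        while (true) {
            Thread.sleep(30000);         // heartbeat check every 30 seconds
            if (!worker.isAlive()) {
                worker = startWorker();  // "reboot" the failed part in a new thread
            }
        }
    }

    private static Thread startWorker() {
        Thread t = new Thread(new Runnable() {
            public void run() {
                // ... the long-running work loop goes here ...
            }
        }, "sync-worker");
        t.start();
        return t;
    }
}
```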
Edit: Of the several extremely generous and helpful responses this question has already received, it is obvious to me that I didn't make an important part of this question clear when I asked it earlier this morning. The answers I've received so far are more about optimizing applications & removing bottlenecks at the code level. I am aware that this is way more important than trying to get an extra 3- or 5% out of your JVM!
This question assumes we've already done just about everything we could to optimize our application architecture at the code level. Now we want more, and the next place to look is at the JVM level and garbage collection; I've changed the question title accordingly. Thanks again!
We've got a "pipeline" style backend architecture where messages pass from one component to the next, with each component performing different processes at each step of the way.
Components live inside of WAR files deployed on Tomcat servers. Altogether we have about 20 components in the pipeline, living on 5 different Tomcat servers (I didn't choose the architecture or the distribution of WARs for each server). We use Apache Camel to create all the routes between the components, effectively forming the "connective tissue" of the pipeline.
I've been asked to optimize the GC and general performance of each server running a JVM (5 in all). I've spent several days now reading up on GC and performance tuning, and have a pretty good handle on what each of the different JVM options do, how the heap is organized, and how most of the options affect the overall performance of the JVM.
My thinking is that the best way to optimize each JVM is not to optimize it as a standalone. I "feel" (that's about as far as I can justify it!) that trying to optimize each JVM locally without considering how it will interact with the other JVMs on other servers (both upstream and downstream) will not produce a globally-optimized solution.
To me it makes sense to optimize the entire pipeline as a whole. So my first question is: does SO agree, and if not, why?
To do this, I was thinking about creating a LoadTester that would generate input and feed it to the first endpoint in the pipeline. This LoadTester might also have a separate "Monitor Thread" that would check the last endpoint for throughput. I could then do all sorts of processing where we check for average end-to-end travel time for messages, maximum throughput before faulting, etc.
The LoadTester would generate the same pattern of input messages over and over again. The variable in this experiment would be the JVM options passed to each Tomcat server's startup options. I have a list of about 20 different options I'd like to pass the JVMs, and figured I could just keep tweaking their values until I found near-optimal performance.
This may not be the absolute best way to do this, but it's the best way I could design with what time I've been given for this project (about a week).
Second question: what does SO think about this setup? How would SO create an "optimizing solution" any differently?
Last but not least, I'm curious as to what sort of metrics I could use as a basis of measure and comparison. I can really only think of:
Find the JVM option config that produces the fastest average end-to-end travel time for messages
Find the JVM option config that produces the largest volume throughput without crashing any of the servers
Any others? Any reasons why those 2 are bad?
Having reviewed the post, I can see how this might be construed as a monolithic question, but really what I'm asking is how SO would optimize JVMs running along a pipeline; feel free to cut and dice my solution however you like.
Thanks in advance!
Let me go up a level and say I did something similar in a large C app many years ago.
It consisted of a number of processes exchanging messages across interconnected hardware.
I came up with a two-step approach.
Step 1. Within each process, I used this technique to get rid of any wasteful activities.
That took a few days of sampling, revising code, and repeating.
The idea is that there is a chain, and the first thing to do is remove inefficiencies from the links.
Step 2. This part is laborious but effective: Generate time-stamped logs of message traffic.
Merge them together into a common timeline.
Look carefully at specific message sequences.
What you're looking for is
Was the message necessary, or was it a retransmission resulting from a timeout or other avoidable reason?
When was the message sent, received, and acted upon? If there is a significant delay between being received and acted upon, what is the reason for that delay? Was it just a matter of being "in line" behind another process that was doing I/O, for example? Could it have been fixed with different process priorities?
This activity took me about a day to generate logs, combine them, find a speedup opportunity, and revise code.
At this rate, after about 10 working days, I had found/fixed a number of problems, and improved the speed dramatically.
What is common about these two steps is I'm not measuring or trying to get "statistics".
If something is spending too much time, that very fact exposes it to a diligent programmer taking a close, meticulous look at what is happening.
I would start by finding the optimum recommended JVM values for your hardware/software mix, or just start with what is already out there.
Next I would make sure that I have monitoring in place to measure business throughput and SLAs.
I would not try to tweak just the GC if there is no reason to.
First you will need to find the major bottlenecks in your application: whether it is I/O bound, SQL bound, etc.
The key here is to MEASURE, IDENTIFY the TOP bottlenecks, FIX them, and run another iteration with a repeatable load.
HTH...
The biggest trick I am aware of when running multiple JVMs on the same machine is limiting the number of cores the GC will use. Otherwise, when one JVM does a full GC, it can attempt to grab every core, impacting the performance of all the JVMs even though they are not performing a GC. One suggestion is to limit the number of GC threads (e.g. via -XX:ParallelGCThreads) to 5/8 of the cores or less. (I can't remember where that is written.)
I think you should test the system as a whole to ensure you have realistic interaction between the services. However, I would assume you may need to tune each service differently.
Changing command-line options is useful if you cannot change the code. However, if you profile and optimise the code, you can make far more difference than by tuning the GC parameters (in which case you would likely need to tune them again afterwards anyway).
For this reason, I would only change the command-line parameters as a last resort, once there is little improvement left to be made in the code of the application.
I have a Java servlet that's getting overloaded by client requests during peak hours. Some clients spawn concurrent requests. Sometimes the number of requests per second is just too great.
Should I implement application logic to restrict the number of requests a client can send per second? Does this need to be done at the application level?
The two most common ways of handling this are to turn away requests when the server is too busy, or handle each request slower.
Turning away requests is easy; just run a fixed number of instances. The OS may or may not queue up a few connection requests, but in general the users will simply fail to connect. A more graceful way of doing it is to have the service return an error code indicating the client should try again later.
Handling requests more slowly is a bit more work, because it requires separating the servlet that handles the requests from the class that does the work, running in a different thread. You can have a larger number of servlets than worker bees. When a request comes in, the servlet accepts it, waits for a worker bee, grabs it and uses it, frees it, then returns the results.
The two can communicate through one of the classes in java.util.concurrent, like LinkedBlockingQueue or ThreadPoolExecutor. If you want to get really fancy, you can use something like a PriorityBlockingQueue to serve some customers before others.
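A rough sketch of that servlet/worker-bee split using a ThreadPoolExecutor with a bounded queue; the pool size, queue capacity and the 503 mapping are assumptions:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// A fixed pool of workers behind a bounded queue: excess requests queue up,
// and once the queue is full new work is rejected outright.
public class WorkerPool {
    private final ThreadPoolExecutor workers = new ThreadPoolExecutor(
            10, 10,                                 // fixed number of worker threads
            0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<Runnable>(100),  // requests wait here while workers are busy
            new ThreadPoolExecutor.AbortPolicy());  // throw when the queue is full

    public void submit(Runnable requestWork) {
        try {
            workers.execute(requestWork);
        } catch (RejectedExecutionException overloaded) {
            // Server is too busy: map this to an HTTP 503 "try again later" response.
        }
    }
}
```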
Me, I would throw more hardware at it like Anon said ;)
Some solid answers here. I think more hardware is the way to go. Having too many clients or traffic is usually a good problem to have.
However, if you absolutely must throttle clients, there are some options.
The most scalable solutions that I've seen revolve around a distributed caching system, like Memcached, and using integers to keep counts.
Figure out a rate at which your system can handle traffic. Either overall, or per client. Then put a count into memcached that represents that rate. Each time you get a request, decrement the value. Periodically increment the counter to allow more traffic through.
For example, if you can handle 10 requests/second, put a count of 50 in every 5 seconds, up to a maximum of 50. That way you aren't refilling it all the time, but you can also handle a bit of bursting limited to a window. You will need to experiment to find a good refresh rate. The key for this counter can either be a global key, or based on user id if you need to restrict that way.
The nice thing about this system is that it works across an entire cluster AND the mechanism that refills the counters need not be in one of your current servers. You can dedicate a separate process for it. The loaded servers only need to check it and decrement it.
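Here's a single-JVM illustration of that counter scheme, using the 10 requests/second numbers from above; in the real setup the AtomicLong would be a counter in memcached, decremented by each server and refilled by the separate process:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Decrement on every request; a separate periodic task tops the budget back up.
public class RequestBudget {
    private final AtomicLong tokens = new AtomicLong(50);

    public RequestBudget() {
        // Refill: reset the budget to 50 every 5 seconds (roughly 10 requests/second).
        Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(new Runnable() {
            public void run() {
                tokens.set(50);
            }
        }, 5, 5, TimeUnit.SECONDS);
    }

    public boolean tryAcquire() {
        return tokens.decrementAndGet() >= 0; // below zero means over budget: throttle
    }
}
```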
All that being said, I'd investigate other options first. Throttling your customers is usually a good way to annoy them. Most probably NOT the best idea. :)
I'm assuming you're not in a position to increase capacity (either via hardware or software), and you really just need to limit the externally-imposed load on your server.
Dealing with this from within your application should be avoided unless you have very special needs that are not met by the existing solutions out there, which operate at HTTP server level. A lot of thought has gone into this problem, so it's worth looking at existing solutions rather than implementing one yourself.
If you're using Tomcat, you can configure the maximum number of simultaneous requests allowed via the maxThreads and acceptCount settings. Read the introduction at http://tomcat.apache.org/tomcat-6.0-doc/config/http.html for more info on these.
For more advanced controls (like per-user restrictions), if you're proxying through Apache, you can use a variety of modules to help deal with the situation. A few modules to google for are limitipconn, mod_bw, and mod_cband. These are quite a bit harder to set up and understand than the basic controls that are probably offered by your appserver, so you may just want to stick with those.