I have a SOAP Web Service written in Java and using Spring-ws.
I need to know that if this can handle 2 million requests per day, and how its performance would be.
-To what extend the massive usage performance is related with my java code and architecture, anything I can improve on it?
-And which extend this is related with the Application Server I use, which app server should I use, what are the limitations, or settings..how can I set and test this performance?
Thanks
With SOAP having an architectural underpinning in the HTTP protocol there are literally dozens of commercial and open source tools which you can use to perform your load and scalability tests.
What you will want to do is make sure that whatever tool you select meets the technical needs of exercising your interface correctly, is a good match your your in house skills and contains the reporting that you need to determine success or failure in your testing efforts. If you get a tool which is a mismatch on any of the three fronts then you may as well have purchased the most expensive tool on the market and hired the most expensive consulting form to use them....sort or like driving nails with the butt end of a screwdriver instead of using the proper tool (a hammer).
Most of the commercial tools on the market today have lease/rental options for short term use, and then there are the open source varieties as well. Each tool has an engineered efficiency associated with core tasks such as script construction, test construction, test execution and analysis which is distinct to the tool. The commercial tools tend to have a more balanced weight across the tasks while the open source ones tend to have higher LsOE required on the edge tasks of script creation and analysis.
Given that you are going to be working with at least a couple of million samples (assuming you will want to run for at least 24 hours) then you need to make sure that the analysis tools have a demonstrable track record with large data sets. The long standing commercial performance test tools all have demonstrable track records at this level, the open source ones are hit and miss and in some cases analysis becomes a roll-your-own proposition against logged response time data. You can see where you can burn lots of time building your own analysis engine here!
What you want to do is technically possible. You may want to re-examine your performance requirements, here is why. I happen to be working with an organization that today uses a web services interface to service the needs of clients around the world. Their backend archive of transactional data is approaching 250TB of data from over ten years worth of work. On an hourly basis over the past year the high water mark was around 60K requests per hour. Projected over a 24 hour basis this still works out to less than your 2 million requests per day. If you test to this level and you find issues are you finding genuine issues or are you finding engineering ghosts, things that would never occur in production due to the differences in production versus test volumes. Modeling your load properly is always difficult, but the time spent nailing the load model for your mix of transactions and proper volume will be time well spent in not using your developer skills to chase performance ghosts and burning budget while doing so.
Good luck!
You could also use the StresStimulus which is a plugin for Fiddler
Keep in mind that 1 day (24 hours) is a large sample. I suspect that 2,000,000 hits won't be evenly spread across the 24 hours. Try to understand what the traffic pattern will look like. If 1/2 of the traffic comes between the hours of 1am and 4am (for example) you'll want to test for 1M hits in 3 hours as well as 2M in 24.
SOAPUI will be able to help you in load testing your web service bu computing the soap message from the wsdl:
http://www.soapui.org/
Apache JMeter, which is a performance/load test tool will help you load test it with either soap sampler or regular HTTP Sampler :
http://jmeter.apache.org/
Related
I have a legacy product in financial domain.Using tomcat 6. We get millions of request 10k of request in hour. I am wondering at high level
should i go for ditributed application where my mvc component is on one system and service/dao on another box(can use spring remote/EJB).
The reason i am planning to go in this direction so that load is distribute and get better performance With this it becomes scalable also.
I only see the positive side of it but somehow not able to figure out what can be the negative aspect of it?
If some expert can help
what is the criteria i should consider to go for distributed model and pros/cons of it? I also tried googling where i could get some stats
like how much load a given webserver (tomcat in my case)handle efiiciently with given hardware(16 gb ram, windows 7, processor ).
Yes i am going
to do POC where i will be measuring performance with distributed model vs without bit high level input will be highly appreciated?
It is impossible to answer this questions without more details - how long does it take to reply to one request on the current server? How many resources are allocated for one request?
having 10k requests per hour means ~3 requests per second. If performing the necessary operations and replying to a request, using 1 CPU takes ~300ms - one simple machine is totally fine. This is simple math, and doesn't always work. I guess you still have peaks within those 10k requests per hour and they aren't gradually distributed.
If we assume, one reply can take up to 1 second, than you can handle as many replies per second as your system has CPUs (given that a CPU would be the bottle neck) If the CPU isn't the bottle neck for your application server, there's probably something wrong. You should set up the database(s) on a different machine and only perform computation tasks on the application server machine.
Especially in the financial sector with a legacy software, I wouldn't try splitting a running product. How old is the current server? I believe that a new Server should be cheaper than rewriting an application. Unless you expect 50-100k requests per hour very soon, I don't think, splitting up such small parts makes sense.
Instead - run it on an up to date server hardware, split application server and data storage and you should be fine.
I am wondering at high level if should i go for ditributed application where my mvc component is on one system and service/dao on another box(can use spring remote/EJB).
I'm not sure what you mean for "system" in this context, but if it means that you are planning to run your application in two servers,
one dedicated to presentation and other dedicated to business layer, take in mind that a simpler approach (and probably more suitable for your app)
is build a co-located architecture.
Basically, the idea is to replicate your app in several servers (at least two) and put in front of them a load balancer that routes the incoming requests among the available servers.
All servers share the same database instance. This will give you vertical scalability and also will improve the availability of your system.
I only see the positive side of it but somehow not able to figure out what can be the negative aspect of it?
Distributing your business logic will probably involve a refactor of your application code, if the system is working well you will add some bugs for sure.
The necessary remote calls will add latency and the fact that you execute your business logic in several servers doesn't resolve the performance problems on the presentation tier.
In Expert One-on-One J2EE Development Without EJB (pag. 65), you can find a good reading about why not distribute your business logic.
It's difficult to find all bottlenecks, deadlocks, and memory leaks in a Java application using unit tests alone.
I'd like to add some level of stress testing for my application. I want to test the limits of the application and determine how it reacts under high load.
I'd like to gauge the following:
Availablity under high load
Performance under high load
Memory / CPU / Disk Usage under high load
Does it crash under high load or react gracefully
It would also be interesting to measure and contrast such characteristics under normal load.
Are their well known, standard techniques to address stress testing.
I am looking for help / direction in setting up such an environment.
Ideally, I would like to run these tests regularly, so that wecan determine if recent deliveries impact performance.
I am a big fan of JMeter. You can set up calls directly against the server just as users would access it. You can control the number of user (concurrent threads) and accesses. It can follow a workflow, scraping pertinent information page to page. It takes 1 to 2 days to learn it well enough to be productive. (You can do the basics within an hour of downloading!)
As for seeing how all that affects the server, that is a tougher question. I have used professional tools from CA and IBM. (I am drawing a blank on specific tool names - maybe due to PTSD!) I have used out-of-the-box JVM profilers. I have used native linux and windows tools. If you are not too concerned about profiling what parts of your application causes issues, then you can just use the native tools for your OS to monitor CPU/Memory/IO.
One of our standard techniques is running stepped-ramp load tests to measure scalability.
There are mainly two approaches for performance on an application:
Performance test and System Test
How do they differ? Well it's easy, it's based on their scope, Performance tests' scope is limited and are highly unrealistic. Example: Test the IncomingMessage handler on some App X, for this you would setup a test which sends meesages to this handler on a X,Y,Z basis. This approach will help you pin down problems and measure performance of individual and limited zones on your application.
So this should now take you to the question, so am I to benchmark and performance test each one of the components in my app individually? Yes if you believe the component's behavior is critical and changes on newer versions are likely to induce performance penalties. But, if you want to get a feel on your application as a whole, the bunch of components interacting with each other and see how performance comes out, then you need a system test.
A system test will always, try to replicate as close as possible any customer production environment. Here you can observe what a real world feel of your app's performance is like and act accordingly to correct it.
So as conclusion,setup a system test on your app and measure what you were saying you wanted to measure. Then Stress the system as a whole and see how it reacts, you will be surprised on the outcome.
Finally, Performance test individually any critical components you have identified or would like to keep tracked on your app.
As a general guideline, when doing performance you should always:
1.- Get a baseline for the system on an idle state.
2.- Get a baseline for the system under normal expected load.
3.- Get a baseline for the system under stress conditions.
Keep in mind that Normal load results should be extrapolated to stress conditions, and a nice system will always be that one which scales linearly.
Hope this helps.
P.S. Tests, envirnoment setup and even data collection should be as fully automated as possible, this will help you run this on a basis and spend time diagnosing performance problems and not setting up the test.
As mentioned by others; tools like JMeter (Commercial tools like LoadRunner and more) can help you generate concurrent test load.
Many monitoring tools (some provided within JDK like MissionControl, some other open source/ free tools like java Melody and many commercial one's) can help you do generic monitoring of various system (memory, CPU, network bandwidth) and JVM resources (Heap, CPU, GC overheads etc).
But to really identify bottlenecks within your code as well as other dependencies of your applications (like external services invoked, DB queries/updates etc) in a very quick and easy way; I recommend considering a good APM i.e. Application Performance Monitoring Tools like AppDynamics/ DynaTrace and more. They can help you pinpoint bottlenecks for specific request level, highlight slower parts of apps, generate percentile metrics at individual service end point or component / method level etc. They can be immensely useful , if one is dealing with very high concurrent users and stringent response time NFR's. They help uncover many bottlenecks across the layers of your application. Many even configure these tools in production (expected to cause 2-3% overheads; but worth it per me for the benefits they provide) - as production logging is not at debug level by default; so once some errors or slowness is observed; it's often extremely difficult to reproduce in lower environments or debug in absence of debug level logs from specific past duration.
There's no one tool to tackle this as far as I know. So build you own environment
Load Injecting & Scripting: JMeter, SOAP UI, LoadUI
Scheduling Tests & Automation: Jenkins, Rundeck
Analytics on transaction data, resources, application performance logs: AppDynamics, ElasticSearch, Splunk
Profiling: AppDynamics, YouKit, Java Mission Control, VisualVm
Edit: Of the several extremely generous and helpful responses this question has already received, it is obvious to me that I didn't make an important part of this question clear when I asked it earlier this morning. The answers I've received so far are more about optimizing applications & removing bottlenecks at the code level. I am aware that this is way more important than trying to get an extra 3- or 5% out of your JVM!
This question assumes we've already done just about everything we could to optimize our application architecture at the code level. Now we want more, and the next place to look is at the JVM level and garbage collection; I've changed the question title accordingly. Thanks again!
We've got a "pipeline" style backend architecture where messages pass from one component to the next, with each component performing different processes at each step of the way.
Components live inside of WAR files deployed on Tomcat servers. Altogether we have about 20 components in the pipeline, living on 5 different Tomcat servers (I didn't choose the architecture or the distribution of WARs for each server). We use Apache Camel to create all the routes between the components, effectively forming the "connective tissue" of the pipeline.
I've been asked to optimize the GC and general performance of each server running a JVM (5 in all). I've spent several days now reading up on GC and performance tuning, and have a pretty good handle on what each of the different JVM options do, how the heap is organized, and how most of the options affect the overall performance of the JVM.
My thinking is that the best way to optimize each JVM is not to optimize it as a standalone. I "feel" (that's about as far as I can justify it!) that trying to optimize each JVM locally without considering how it will interact with the other JVMs on other servers (both upstream and downstream) will not produce a globally-optimized solution.
To me it makes sense to optimize the entire pipeline as a whole. So my first question is: does SO agree, and if not, why?
To do this, I was thinking about creating a LoadTester that would generate input and feed it to the first endpoint in the pipeline. This LoadTester might also have a separate "Monitor Thread" that would check the last endpoint for throughput. I could then do all sorts of processing where we check for average end-to-end travel time for messages, maximum throughput before faulting, etc.
The LoadTester would generate the same pattern of input messages over and over again. The variable in this experiment would be the JVM options passed to each Tomcat server's startup options. I have a list of about 20 different options I'd like to pass the JVMs, and figured I could just keep tweaking their values until I found near-optimal performance.
This may not be the absolute best way to do this, but it's the best way I could design with what time I've been given for this project (about a week).
Second question: what does SO think about this setup? How would SO create an "optimizing solution" any differently?
Last but not least, I'm curious as to what sort of metrics I could use as a basis of measure and comparison. I can really only think of:
Find the JVM option config that produces the fastest average end-to-end travel time for messages
Find the JVM option config that produces the largest volume throughput without crashing any of the servers
Any others? Any reasons why those 2 are bad?
After reviewing the play I could see how this might be construed as a monolithic question, but really what I'm asking is how SO would optimize JVMs running along a pipeline, and to feel free to cut-and-dice my solution however you like it.
Thanks in advance!
Let me go up a level and say I did something similar in a large C app many years ago.
It consisted of a number of processes exchanging messages across interconnected hardware.
I came up with a two-step approach.
Step 1. Within each process, I used this technique to get rid of any wasteful activities.
That took a few days of sampling, revising code, and repeating.
The idea is there is a chain, and the first thing to do is remove inefficiences from the links.
Step 2. This part is laborious but effective: Generate time-stamped logs of message traffic.
Merge them together into a common timeline.
Look carefully at specific message sequences.
What you're looking for is
Was the message necessary, or was it a retransmission resulting from a timeout or other avoidable reason?
When was the message sent, received, and acted upon? If there is a significant delay between being received and acted upon, what is the reason for that delay? Was it just a matter of being "in line" behind another process that was doing I/O, for example? Could it have been fixed with different process priorities?
This activity took me about a day to generate logs, combine them, find a speedup opportunity, and revise code.
At this rate, after about 10 working days, I had found/fixed a number of problems, and improved the speed dramatically.
What is common about these two steps is I'm not measuring or trying to get "statistics".
If something is spending too much time, that very fact exposes it to a dilligent programmer taking a close meticulous look at what is happening.
I would start with finding the optimum recommended jvm values specified for your hardware/software mix OR just start with what is already out there.
Next I would make sure that I have monitoring in place to measure Business throughputs and SLAs
I would not try to tweak just the GC if there is no reason to.
First you will need to find what are the major bottlenecks in your application. If it is I/O bound, SQL bound etc.
Key here is to MEASURE, IDENTIFY TOP bottlenecks, FIX them and conduct another iteration with a repeatable load.
HTH...
The biggest trick I am aware of when running multiple JVMs on the same machine is limiting the number of core the GC will use. Otherwise what can happen when one JVM does a full GC is it will attempt to grab every core, impacting the performance of all the JVMs even though they are not performing a GC. One suggestion is to limit the number of gc threads to 5/8 or less. (I can't remember where it is written)
I think you should test the system as a whole to ensure you have realistic interaction between the services. However, I would assume you may need to tune each service differently.
Changing command line options is useful if you cannot change the code. However if you profile and optimise the code you can make far for difference than tuning the GC parameters (in which cause you need to change them again)
For this reason, I would only change the command line parameters as a last resort, after you there is little improvement which can be made in the code of the application.
I want to gain more insight regarding the scale of workload a single-server Java Web application deployed to a single Tomcat instance can handle. In particular, let's pretend that I am developing a Wiki application that has a similar usage pattern like Wikipedia. How many simultaneous requests can my server handle reliably before going out of memory or show signs of excess stress if I deploy it on a machine with the following configuration:
4-Core high-end Intel Xeon CPU
8GB RAM
2 HDDs in RAID-1 (No SSDs, no PCIe based Solid State storages)
RedHat or Centos Linux (64-bit)
Java 6 (64-bit)
MySQL 5.1 / InnoDB
Also let's assume that the MySQL DB is installed on the same machine as Tomcat and that all the Wiki data are stored inside the DB. Furthermore, let's pretend that the Java application is built on top of the following stack:
SpringMVC for the front-end
Hibernate/JPA for persistence
Spring for DI and Security, etc.
If you haven't used the exact configuration but have experience in evaluating the scalability of a similar architecture, I would be very interested in hearing about that as well.
Thanks in advance.
EDIT: I think I have not articulated my question properly. I mark the answer with the most up votes as the best answer and I'll rewrite my question in the community wiki area. In short, I just wanted to learn about your experiences on the scale of workload your Java application has been able to handle on one physical server as well as some description regarding the type and architecture of the application itself.
You will need to use group of tools :
Loadtesting Tool - JMeter can be used.
Monitoring Tool - This tool will be used to monitor various numbers of resources load. There are Lot paid as well as free ones. Jprofiler,visualvm,etc
Collection and reporting tool. (Not used any tool)
With above tools you can find optimal value. I would approach it in following way.
will get to know what should be ratio of pages being accessed. What are background processes and their frequency.
Configure my JMeter accordingly (for ratios) , and monitor performance for load applied ( time to serve page ...can be done in JMeter), monitor other resources using Monitor tool. Also check count of error ratio. (NOTE: you need to decide upon what error ratio is not acceptable.)
Keep increasing Load step by step and keep writting various numbers of interest till server fails completely.
You can decide upon optimal value based on many criterias, Low error rate, Max serving time etc.
JMeter supports lot of ways to apply load.
To be honest, it's almost impossible to say. There's probably about 3 ways (of the top of my head to build such a system) and each would have fairly different performance characteristics. You best bet is to build and test.
Firstly try to get some idea of what the estimated volumes you'll have and the latency constraints that you'll need to meet.
Come up with a basic architecture and implement a thin slice end to end through the system (ideally the most common use case). Use a load testing tool like (Grinder or Apache JMeter) to inject load and start measuring the performance. If the performance is acceptable - be conservative your simple implementation will likely include less functionality and be faster than the full system - continue building the system and testing to make sure you don't introduce a major performance bottleneck. If not come up with a different design.
If your code is reasonable the bottleneck will likely be the database and somewhere in the region 100s of db ops per second. If that is insufficient then you may need to think about caching.
Definitely take a look at Spring Insight for performance monitoring and analysis.
English Wikipedia has 14GB data. A 8GB mem cache would have very high hit/miss ratio, and I think harddisk read would be well within its capacity. Therefore, the app is most likely network bound.
English Wikipedia has about 3000 page views per second. It is possible that tomcat can handle the load by careful tuning, and the network has enough throughput to server the traffic.
So the entire wikipedia site can be hosted on one moderate machine? Probably not. Just an idea.
-
http://stats.wikimedia.org/EN/TablesWikipediaEN.htm
http://stats.wikimedia.org/EN/TablesPageViewsMonthly.htm
Tomcat doesn't allow for spreading over multiple machines. If you really are concerned about scalability, you must consider what to do when your application outgrows a single machine.
I am on a LAMP stack for a website I am managing. There is a need to roll up usage statistics (a variety of things related to our desktop product).
I initially tackled the problem with PHP (being that I had a bunch of classes to work with the data already). All worked well on my dev box which was using 5.3.
Long story short, 5.1 memory management seems to suck a lot worse, and I've had to do a lot of fooling to get the long-term roll up scripts to run in a fixed memory space. Our server guys are unwilling to upgrade PHP at this time. I've since moved my dev server back to 5.1 so I don't run into this problem again.
For mining of MySQL databases to roll up statistics for different periods and resolutions, potentially running a process that does this all the time in the future (as opposed to on a cron schedule), what language choice do you recommend? I was looking at Python (I know it more or less), Java (don't know it that well), or sticking it out with PHP (know it quite well).
Edit: design clarification for commenter
Resolutions: The way the rollup script works currently, is I have some classes for defining resolutions and buckets. I have year, month, week, day -- given a "bucket number" each class gives a start and end timestamp that defines the time range for that bucket -- this is based on arbitrary epoch date. The system maintains "complete" records, ie it will complete its rolled up dataset for each resolution since the last time it was run, currently.
SQL Strat: The base stats are located in many dissimilar schemas and tables. I do individual queries for each rolled up stat for the most part, then fill one record for insert. Your are suggesting nested subqueries such as:
INSERT into rolled_up_stats (someval, someval, someval, ... ) VALUES (SELECT SUM(somestat) from someschema, SELECT AVG(somestat2) from someschema2)
Those subqueries will generate temporary tables, right? My experience is that had been slow as molasses in the past. Is it a better approach?
Edit 2: Adding some inline responses to the question
Language was a bottleneck in the case of 5.1 php -- I was essentially told I made the wrong language choice (though the scripts worked fine on 5.3). You mention python, which I am checking out for this task. To be clear, what I am doing is providing a management tool for usage statistics of a desktop product (the logs are actually written by an EJB server to mysql tables). I do apache log file analysis, as well as more custom web reporting on the web side, but this project is separate. The approach I've taken so far is aggregate tables. I'm not sure what these message queue products could do for me, I'll take a look.
To go a bit further -- the data is being used to chart activity over time at the service and the customer level, to allow management to understand how the product is being used. You might select a time period (April 1 to April 10) and retrieve a graph of total minutes of usage of a certain feature at different granularities (hours, days, months etc) depending on the time period selected. Its essentially an after-the-fact analysis of usage. The need seems to be tending towards real-time, however (look at the last hour of usage)
There are a lot of different approaches to this problem, some of which are mentioned here, but what you're doing with the data post-rollup is unclear...?
If you want to utilize this data to provide digg-like 'X diggs' buttons on your site, or summary graphs or something like that which needs to be available on some kind of ongoing basis, you can actually utilize memcache for this, and have your code keep the cache key for the particular statistic up to date by incrementing it at the appropriate times.
You could also keep aggregation tables in the database, which can work well for more complex reporting. In this case, depending on how much data you have and what your needs are, you might be able to get away with having an hourly table, and then just creating views based on that base table to represent days, weeks, etc.
If you have tons and tons of data, and you need aggregate tables, you should look into offloading statistics collection (and perhaps the database queries themselves) to a queue like RabbitMQ or ActiveMQ. On the other side of the queue put a consumer daemon that just sits and runs all the time, updating things in the database (and perhaps the cache) as needed.
One thing you might also consider is your web server's logs. I've seen instances where I was able to get a somewhat large portion of the required statistics from the web server logs themselves after just minor tweaks to the log format rules in the config. You can roll the logs every , and then start processing them offline, recording the results in a reporting database.
I've done all of these things with Python (I released loghetti for dealing with Apache combined format logs, specifically), though I don't think language is a limiting factor or bottleneck here. Ruby, Perl, Java, Scala, or even awk (in some instances) would work.
I have worked on a project to do a similar thing in the past, so I have actual experience with performance. You would be hard pressed to beat the performance of "INSERT ... SELECT" (not "INSERT...VALUES (SELECT ...)". Please see http://dev.mysql.com/doc/refman/5.1/en/insert-select.html
The advantage is that if you do that, especially if you keep the roll-up code in MySQL procedures, is that all you need from the outside is just a cron-job to poke the DB into performing the right roll-ups at the right times -- as simple as a shell-script with 'mysql <correct DB arguments etc.> "CALL RollupProcedure"'
This way, you are guaranteeing yourself zero memory allocation bugs, as well as having decent performance when the MySQL DB is on a separate machine (no moving of data across machine boundary...)
EDIT: Hourly resolution is fine -- just run an hourly cron-job...
If you are running mostly SQL commands, why not just use MySQL etc on the command line? You could create a simple table that lists aggregate data then run a command like mysql -u[user] -p[pass] < commands.sql to pass SQL in from a file.
Or, split the work into smaller chunks and run them sequentially (as PHP files if that's easiest).
If you really need it to be a continual long-running process then a programming language like python or java would be better, since you can create a loop and keep it running indefinitely. PHP is not suited for that kind of thing. It would be pretty easy to convert any PHP classes to Java.