I'm new here and not very knowledgeable about CPU usage and multithreading, but I was wondering why my web app is consuming so much CPU. What my program does is update values in the background so that users don't have to wait for the data to be processed and only need to fetch it on request. The updates are scheduled tasks, built on the executor library, that fire off 8 threads every 5 seconds to update my data.
So why is my application consuming so much CPU? Is it because of bad code, or because of a low-spec server? (2 cores, with 2 databases and 1 other major application running alongside my web app)
Thank you very much for your help.
You need to profile your application to find out where the CPU is actually being consumed. Java has some basic profiling support built in; for example, if your environment permits it, you could run the built-in "hprof" profiling agent:
java -Xrunhprof ...
(In reality, you probably want to set some extra options: Google "hprof" for more details.)
The latter is easier in principle, but I mention the possibility of adding your own profiling routine because it's more flexible and you can do it e.g. in a Servlet environment where running another profiler is more cumbersome.
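If you do add your own profiling routine, something as simple as timing the suspected hot sections and logging the result can already narrow things down. A minimal sketch (the class, label and workload below are made up for illustration):

public class UpdateTimer {
    // Runs a task and prints how long it took; a crude stand-in for a profiler.
    public static void timed(String label, Runnable task) {
        long start = System.nanoTime();
        try {
            task.run();
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(label + " took " + elapsedMs + " ms");
        }
    }

    public static void main(String[] args) {
        // Dummy workload standing in for the real background update job.
        timed("background update", () -> {
            double sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += Math.sqrt(i);
            }
            System.out.println("checksum: " + sum);
        });
    }
}

In a servlet you would wrap the code you suspect in a call like timed("update", this::updateData) (a hypothetical method name) and watch the log.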
Paulo,
It is not possible for someone here to say whether the problem is that your code is inefficient or the server is under spec. It could be either or both of those, or something else.
You are going to need to do some research of your own:
Profile the code. This will allow you to identify where your webapp is spending most of its time.
Look at the OS-level stats that are available to you. This might tell you that the real problem is memory usage or disk I/O.
Look at the performance of the back-end database. Is it using a lot of CPU?
Once you have identified the area(s) where the CPU is being used, you need to figure out what the real cause of the problem is and work out how to fix it. And once you've implemented a potential fix, you can rerun your profiling, etc. to see whether it has helped.
I have a project where I have to send emails using the Amazon SES REST API. Amazon allows a number of concurrent connections based on the account; in my case it allows 50 connections at the same time, which means I can send 50 emails/sec. To achieve this, I am currently using Java Executor threads, where I throttle the rate to 50/sec. I have also implemented this with the Hibernate framework, because I need to execute some SQL queries before sending the emails.
This Java program runs continuously in the background (it's a jar file). It takes around 512 MB of RAM, so my question is: can I use some other framework or a better threading approach to make it lighter? The only SQL query I execute is a select; update/delete/create queries are not used.
I am not good at Java, so maybe this sounds stupid.
I guess the smallest possible framework to use would be plain JDBC.
This would limit your libraries to those in the JRE, plus the DB driver and maybe libraries for AWS / email. Depending on what else you need, selecting a compact profile might be worth investigating.
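A plain-JDBC version of the select could look roughly like this (the connection URL, table and column names are made up; only the JDBC calls themselves are the point):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PlainJdbcSelect {
    public static void main(String[] args) throws Exception {
        // Placeholder URL and credentials; the driver jar must be on the classpath.
        String url = "jdbc:mysql://localhost:3306/mydb";
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = con.prepareStatement(
                     "SELECT address FROM email_queue WHERE status = ?")) {
            ps.setString(1, "PENDING");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    String address = rs.getString("address");
                    // hand the address over to the SES sending code here
                    System.out.println("would send to " + address);
                }
            }
        }
    }
}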
Also check your memory settings:
If you set -Xms512m it's really not surprising your app uses 512m, is it?
Edit due to rephrased question
At your level of parallelism, most of your memory is consumed by objects, not by threads (well, threads are objects too, but small ones). Threads are fine the way they are in Java; you can run hundreds of them without them consuming 500 MB of heap or more, as you claim.
So the issue of 50 threads consuming 512 MB of your memory is more likely rooted in your code and your objects, not (only) in your threads.
In order to reduce the memory footprint, try the following:
Remove Hibernate. As you say, you only have a simple SQL select, so you don't need the overhead and the additional libraries.
Take a memory dump of your running app and analyse it. (MAT - Eclipse Memory Analyser tool comes to mind)
Check other objects and how you use them. When you say "sending emails" - how large are your emails? Might there be duplicate buffers due to a bad coding choice? Share your code for how you do it, then we can have a look.
Try running without any memory options and see how the program runs on defaults.
Add garbage-collector output and check it (see the example below).
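For example, running with default heap settings and basic GC logging might look roughly like this (the jar name is a placeholder; -verbose:gc works on all recent JVMs, and Java 9+ offers more detail via -Xlog:gc):

java -verbose:gc -jar your-app.jar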
In order to improve the execution speed of a Java program running on Google App Engine, can I create additional Java threads at runtime to make use of idle machines in the data center?
I've found conflicting data thus far.
If your primary concern is to improve the execution time, take a look at Memcache and Tasks. They can be used to reduce or avoid the latency of reading from or writing to the Datastore or other storage options, fetching URLs, sending emails, etc. If you do a lot of heavy computations that can run in parallel, look at the MapReduce API.
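For instance, a typical Memcache pattern looks roughly like this (the cache key and the expensive computation are made up; this only runs inside the App Engine runtime):

import com.google.appengine.api.memcache.MemcacheService;
import com.google.appengine.api.memcache.MemcacheServiceFactory;

public class CachedLookup {
    public static Object expensiveResult() {
        MemcacheService cache = MemcacheServiceFactory.getMemcacheService();
        String key = "expensive-result";        // placeholder cache key
        Object value = cache.get(key);
        if (value == null) {
            value = computeExpensiveResult();   // placeholder for the slow work
            cache.put(key, value);              // cache it for later requests
        }
        return value;
    }

    private static Object computeExpensiveResult() {
        return "computed";                      // stand-in for the real computation
    }
}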
Once you remove all the delays from your program, there will be no reason to use multiple threads within a single request.
Note that App Engine instances can use multithreading to execute multiple requests at the same time, so they tend to use allocated resources efficiently. To enable it, see:
https://developers.google.com/appengine/docs/java/config/appconfig#Java_appengine_web_xml_Using_concurrent_requests
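In short (assuming the standard appengine-web.xml deployment descriptor), concurrent requests are switched on with the threadsafe element:

<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <threadsafe>true</threadsafe>
</appengine-web-app>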
If you have a problem that calls for a multithreaded solution, you can use threads (as described on the link that you included in your question).
However, based on your reasoning ("to make use of idle machines in the data center"), it seems like you're misguided. You should not use threads for that reason. You use the machine hours that you pay for and no more. The only time you will have an idle machine is if you tell App Engine to keep an extra idle instance around so that it doesn't have to start up a new one when your app gets a big usage spike.
Most of the time, unless you are truly doing parallel computation, you won't need to use multiple threads in App Engine. For instance, the datastore has an asynchronous API so that you can do multiple datastore operations in parallel without having to deal with threads yourself.
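As a rough illustration of the asynchronous API (the entity kinds and properties are made up), both operations below are in flight at the same time and you only block when you need the results:

import com.google.appengine.api.datastore.AsyncDatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;
import java.util.concurrent.Future;

public class AsyncDatastoreExample {
    public static void runInParallel() throws Exception {
        AsyncDatastoreService ds = DatastoreServiceFactory.getAsyncDatastoreService();

        Entity order = new Entity("Order");   // placeholder kind
        order.setProperty("total", 42L);

        Future<Key> putFuture = ds.put(order);                                    // starts immediately
        Future<Entity> getFuture = ds.get(KeyFactory.createKey("Customer", 1L));  // runs concurrently

        Key newKey = putFuture.get();         // block only when the results are needed
        Entity customer = getFuture.get();
        System.out.println("stored " + newKey + ", loaded " + customer.getKey());
    }
}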
Does that make sense?
I have a requirement where I need to process a large amount of data with some MySQL operations, and there are multiple runs of a similar kind. A single run takes around 2 hours.
When I ran each run in a separate Java thread, there was no major time saving. As per my understanding, Java threads are not multi-process, i.e. they are only a way to obtain parallelism, not to improve CPU utilization.
If there is any way I can make use of multiple processors on the same machine through Java, I guess that could save some time across all the runs.
Please let me know if the problem is clear, and whether you have any ideas for a solution.
Thanks,
Ashish
I think that your problem is in your application or in MySQL.
Java does support multi-threading and your application should benefit automatically from multiple cores.
Probably there is a common resource that needs to be synchronized.
From what you say ("process large data"), I bet the common resource is the database file and memory.
If a single run takes a minute or more (in your case 120 minutes), then you're better off with multiple processes anyway, as the overhead of JVM startup is negligible.
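As a sanity check that the JVM really does spread work across cores, here is a minimal sketch using an ExecutorService sized to the number of processors (the task names and workload are made up); if this scales but your real runs don't, the bottleneck is a shared resource rather than Java's threading:

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelRuns {
    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            // Each "run" is a placeholder for the real MySQL processing job.
            List<Future<String>> results = pool.invokeAll(Arrays.asList(
                    task("run-1"), task("run-2"), task("run-3"), task("run-4")));
            for (Future<String> f : results) {
                System.out.println(f.get());
            }
        } finally {
            pool.shutdown();
        }
    }

    private static Callable<String> task(String name) {
        return () -> {
            // Simulated CPU-bound work; replace with the real processing.
            long sum = 0;
            for (long i = 0; i < 50_000_000L; i++) {
                sum += i % 7;
            }
            return name + " finished (checksum " + sum + ")";
        };
    }
}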
We have a 64-bit Linux machine and we make multiple HTTP connections to other services; the Drools Guvnor website (a rule engine, if you don't know it) is one of them. In Drools, we create a knowledge base per rule being fired, and the creation of a knowledge base makes an HTTP connection to the Guvnor website.
All other threads are blocked and CPU utilization goes up to ~100%, eventually resulting in an OOM error. We could change things so that the rules are compiled only every 15-20 minutes, but I want to be sure of the problem in case someone has already faced it.
I checked "cat /proc/sys/kernel/threads-max" and it shows 27000 threads. Can that be a reason?
I have a couple of questions:
When do we know that we are running over capacity?
How many threads can be spawned internally (any rough estimate or formula relating diff parameters will work)?
Has anyone else seen similar issues with Drools? Concurrent access to the Guvnor website is basically what is causing the issue.
Thanks,
I am basing my answer on the assumption that you are creating a knowledge base for each request, and that this knowledge-base creation includes downloading the latest rule sources from Guvnor; please correct me if I am mistaken.
I suspect that the build/compilation of packages is taking time and hogging your system.
Instead of compiling packages on each and every request, you can download pre-built packages from Guvnor, and you can also cache those packages locally if your rules do not change much. The only restriction is that you need to use the same version of Drools both on Guvnor and in your application.
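A minimal sketch of the caching idea (the class is a placeholder and the actual download/compilation call depends on your Drools version, so it is left as a stub):

public class KnowledgeBaseCache {
    // Lazily built, shared knowledge base; volatile + synchronized gives a simple
    // thread-safe cache so concurrent requests don't each rebuild from Guvnor.
    private static volatile Object cachedKnowledgeBase;

    public static Object getKnowledgeBase() {
        Object kb = cachedKnowledgeBase;
        if (kb == null) {
            synchronized (KnowledgeBaseCache.class) {
                kb = cachedKnowledgeBase;
                if (kb == null) {
                    kb = buildKnowledgeBase();   // stub for the Guvnor download/compile
                    cachedKnowledgeBase = kb;
                }
            }
        }
        return kb;
    }

    private static Object buildKnowledgeBase() {
        // Replace with the real Drools/Guvnor package loading for your version.
        return new Object();
    }
}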
I checked for "cat /proc/sys/kernel/threads-max" and it shows 27000
threads, Can it be a reason?
That number does look large, but we don't know whether the majority of those threads belong to your Java app. Create a Java thread dump (e.g. with jstack) to confirm this. Your thread dump will also show the CPU time taken by each thread.
When do we know that we are running over capacity?
You have 100% CPU and an OOM error. You are over capacity :) Jokes aside, you should monitor your HTTP connection queue to determine what you are doing wrong. Your post says nothing about how you are handling the HTTP connections (presumably through some sort of pooling mechanism backed by a queue?). I've seen containers and programs queue requests indefinitely, causing them to crash with a big bang. Plot the following graphs to isolate your problem:
The number of blocking threads over time
Time taken for each thread
Number of threads per thread pool and how they increase / decrease with time (pool size)
How many threads can be spawned internally (any rough estimate or formula relating different parameters will work)?
Only a load test can answer this question. Load your server and determine the number of concurrent users it can support at 60-70% capacity. Note the number of threads spawned internally at this point. That is your peak capacity (allowing room for unexpected traffic).
Has anyone else seen similar issues with Drools? Concurrent access to the Guvnor website is basically causing the issue.
I can't help there since I've not accessed Drools this way. Sorry.
How can we increase the performance of an application? My application is written using Java, Hibernate, and Servlets, and WSDL is used for the web services. I have executed some of the tests on a Linux machine so that I can get a proper TPS figure for the execution.
But still, I am not satisfied with the performance.
So what steps should I try in order to increase the performance?
Adding to the above, I have run code coverage and used FindBugs on the code for each and every test and every service I have written.
Individual suggestions are invited.
Thanks.
Profile your application, and remove all of your bottlenecks.
In addition, or better yet beforehand, take a day or two and read as much of the Java Performance Tuning newsletters as you can absorb.
You should monitor your application with a tool like VisualVM, JProfiler etc. to determine the performance bottleneck(s). It is pointless to tune the application without knowing where the actual performance problems are located.
In a professional environment, I suggest dynaTrace that can show you performance bottlenecks along the execution path. The tool can show you exactly where the application spends its time.
Is the performance related to disk I/O or network I/O? In a high-throughput system (from a DB point of view) Hibernate might not be the best way to go. If you have a lot of writes, I would recommend a different mechanism for writing to the database -- perhaps simply switching to plain JDBC might speed it up?
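If you do go down the plain-JDBC route for the writes, batching the inserts and committing once per batch usually makes a big difference. A rough sketch (URL, table and column names are made up):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Arrays;
import java.util.List;

public class BatchInsertExample {
    public static void main(String[] args) throws Exception {
        // Placeholder URL and credentials; the driver jar must be on the classpath.
        String url = "jdbc:mysql://localhost:3306/mydb";
        List<String> names = Arrays.asList("alpha", "beta", "gamma");
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO events (name) VALUES (?)")) {
            con.setAutoCommit(false);          // one commit for the whole batch
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();
            }
            ps.executeBatch();
            con.commit();
        }
    }
}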
Secondly, is it the case that your web services are taking too long to return results? SOAP is not the fastest protocol, really -- have you looked at something like REST, maybe coupled with JSON?