We have an issue on our server at work and I'm trying to understand what is happening. It's a Java application running on a Linux server. The application receives information from a TCP socket, analyses it, and then writes the results to the database.
Sometimes there are too many packets and the Java application needs to write to the database many times per second (around 100 to 500 times).
I tried to reproduce the issue on my own computer and watched how the application behaves with JProfiler.
The memory always seems to be going up. Is it a memory leak? (Sorry, I'm not a Java programmer, I'm a C++ programmer.)
After 133 minutes:
After 158 minutes:
I have many locked threads. Does that mean the application is not programmed correctly?
Are there too many connections to the database? (The application uses the BasicDataSource class to provide a connection pool.)
The program doesn't have a FIFO to manage database writes for the information continually arriving on the TCP port. My questions are (remember that I'm not a Java programmer, and I don't know whether this is how a Java application should work or whether the program could be written more efficiently):
Do you think something is wrong with the code, so that it doesn't manage reads, writes, and updates on the database correctly and consumes too much memory and CPU time? Or is this just how the BasicDataSource class works?
How do you think I can improve this (if you think it's an issue)? By creating a FIFO and removing the part of the code that creates too many threads? Or are these threads not the application's own threads but BasicDataSource's threads?
There are several areas to dig into, but first I would try and find what is actually blocking the threads in question. I'll assume everything before the app is being looked at as well, so this is from the app down.
I know the graphs show free memory, but they are just points in time, so I can't see a trend. GC logging is available, though I haven't used JProfiler much, so I am not sure how to point you to it in that tool. I know that in DynaTrace I can see GC events and their duration, as well as any other blocking events and their root cause. If this isn't available, there are command line switches to log GC activity so you can see its duration and frequency. That is one area that could block.
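As a rough illustration of the GC point above: the standard switches are -verbose:gc (or -Xlog:gc* on JDK 9 and later), and the same counters can also be read from inside the application through the java.lang.management API. The class name GcSnapshot is made up for this sketch, and it only reports cumulative counts and times, not individual pauses.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the application in the question.
final class GcSnapshot {
    /** One line per collector: name, collections so far, accumulated collection time in ms. */
    static List<String> collect() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both counters are cumulative since JVM start; -1 means "not available".
            lines.add(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        collect().forEach(System.out::println);
    }
}
```

Logging this every few seconds and diffing consecutive samples gives a crude view of how often collections run and how much time they take while the load is high.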
I would also look at how many connections you have in your pool. If there are 100-500 requests/second trying to write and they are stacking up because you don't have enough connections to serve them, then that could be a problem as well. The image shows all transactions but doesn't speak to the pool size. Transactions blocked with nowhere to go could lead to your memory jumps as well.
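One way to keep writes from stacking up is the FIFO idea from the question: a bounded queue between the socket readers and a single writer that drains it in batches. The following is only a sketch under assumptions. QueuedWriter and batchSink are hypothetical names, records are plain strings, and the sink stands in for whatever actually performs the database write (for example one JDBC batch insert per call).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: bounded FIFO in front of a single database writer thread.
final class QueuedWriter implements AutoCloseable {
    private final BlockingQueue<String> queue;
    private final Consumer<List<String>> batchSink; // assumption: performs the real write
    private final int maxBatch;
    private final Thread worker;
    private volatile boolean running = true;

    QueuedWriter(int capacity, int maxBatch, Consumer<List<String>> batchSink) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.maxBatch = maxBatch;
        this.batchSink = batchSink;
        this.worker = new Thread(this::drainLoop, "db-writer");
        this.worker.start();
    }

    /** Called by the socket-reading code; blocks when the queue is full (back-pressure). */
    void submit(String record) {
        try {
            queue.put(record);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while queueing", e);
        }
    }

    private void drainLoop() {
        List<String> batch = new ArrayList<>(maxBatch);
        try {
            // Keep going until close() was called and everything queued has been written.
            while (running || !queue.isEmpty()) {
                String first = queue.poll(100, TimeUnit.MILLISECONDS);
                if (first == null) continue;
                batch.add(first);
                queue.drainTo(batch, maxBatch - 1); // take whatever else is already waiting
                batchSink.accept(new ArrayList<>(batch));
                batch.clear();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stops accepting work and waits until the queue has been flushed. */
    @Override
    public void close() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With a bounded queue the memory used by pending writes has a hard ceiling, and a slow database shows up as slower intake rather than as an ever-growing heap. Whether one writer thread is enough depends on the database, so treat the sizes as things to measure.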
There is also the flip side that your database can't handle the traffic and is pegged, and that is what is blocking the connections as well so you would want to monitor that end of things and see if that is a possible cause of the blocking.
There is also the chance that the blocking is occurring from the SQL being run as well, waiting for page locks to be released, etc.
Lots of areas to look at, but I would address and verify one layer at a time starting with the app and working down.
A while back I tried implementing a crawler in Java and left the project for a while (made a lot of progress since). Basically I have implemented a crawler with circa 200-400 threads, each thread connects and downloads the content of one page (simplified for clarity, but that's basically it):
// we're in a run() method of a truly generic Runnable.
// _url is a member passed to the Runnable object beforehand.
Connection c = Jsoup.connect(_url).timeout(10000);
c.execute();
Document d = c.response().parse();
// use Jsoup to get the links, add them to the backbone of the crawler
// to be checked and maybe later passed to the crawling queue.
This works. The problem is that I only use a very small fraction of my internet bandwidth. Although I am able to download at >6MB/s, I've identified (using NetLimiter and my own calculations) that I only use about 1MB/s at best when downloading page sources.
I've done a lot of statistics and analyses and it is somewhat reasonable - if the computer cannot efficiently support over ~400 threads (I don't know about that also, but a larger number of threads seems to be ineffective) and each connection takes about 4 seconds to complete, then I'm supposed to download 100 pages per second which is indeed what happens. The bizarre thing is that many times while I run this program, the internet connection is completely clogged - neither I nor anyone else on my wifi connection can access the web normally (when I'm only using 16%! which does not happen when downloading other files, say movies).
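The arithmetic in the paragraph above can be written down explicitly. It is just Little's law (throughput = requests in flight / time per request) applied to the numbers from the post. The 10 KB average page size is an assumption, picked only because it is what 100 pages/s at roughly 1 MB/s would imply.

```java
// Hypothetical helper for the back-of-the-envelope calculation.
final class CrawlThroughput {
    /** Little's law: steady-state throughput = requests in flight / time per request. */
    static double pagesPerSecond(int concurrentRequests, double secondsPerRequest) {
        return concurrentRequests / secondsPerRequest;
    }

    /** Bandwidth implied by a page rate and an average page size (1 MB = 1000 KB here). */
    static double megabytesPerSecond(double pagesPerSecond, double averagePageKilobytes) {
        return pagesPerSecond * averagePageKilobytes / 1000.0;
    }

    public static void main(String[] args) {
        double rate = pagesPerSecond(400, 4.0); // 400 threads, about 4 s per page
        System.out.println(rate + " pages/s");  // prints "100.0 pages/s"
        // 10 KB per page is an assumed average, not a measured value.
        System.out.println(megabytesPerSecond(rate, 10.0) + " MB/s");
    }
}
```

Seen this way, the download rate is capped by latency per page times the thread count, not by the link speed, unless the pages are large.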
I've spent literally weeks calculating, analyzing and collecting various statistics (making sure all threads are operating with a VM monitor, calculating mean run time for threads, Excel charts...) before coming here, but I've run out of answers. I wonder if this behavior could be explained. I realize there are a lot of "ifs" in this question, but it's the best I can do without it turning into an essay.
My computer specs are an i5 4460 with 8GB DDR3-1600 and a 100Mb/s (effectively around 8MB/s) internet connection, connected directly via LAN to the crawler. I'm looking for general directions on where else I should look (I mean obvious stuff that is clear to experienced developers but not to me) in order to either:
Improve the download speed (maybe not Jsoup? different number of threads? I've already tried using selectors instead of threads and it was slower), or:
Free up the internet when I'm running this program.
I've thought about the router itself (Netgear N600) limiting the number of outgoing connections (seems odd), so I'm saturating the number of connections, and not the bandwidth, but couldn't figure out if that's even possible.
Any general direction / advice would be warmly welcomed :) Feel free to point out newbie mistakes, that's how I learn.
Amir.
The issue was not DNS resolutions, as creating the connections with an IP address (I stored all addresses in advance then used those) resulted in the exact same response times and bandwidth use. Nor was it the threads issue.
I now suspect it was the NetLimiter program's "fault". I've directly measured the number of bytes received and written these to disk (I had done this before, but apparently I had made some changes in the program). It seems I'm really saturating the bandwidth. Also, when switching to HttpUrlConnection objects instead of Jsoup, NetLimiter does show a much larger bandwidth usage. Perhaps it has some issue with Jsoup.
I'm not sure this was the problem, but empirically, the program downloads a lot of data. So I hope this helps anyone who might encounter a similar issue in the future.
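For anyone who wants to measure received bytes directly rather than trust an external monitor, a counting wrapper around the response stream is one simple option. This is a minimal sketch, not the code I actually used. CountingInputStream and countBytes are names invented here, and the shared AtomicLong is assumed to be read periodically by some reporting thread.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical wrapper that counts every byte read through it.
final class CountingInputStream extends FilterInputStream {
    private final AtomicLong total; // may be shared across all download threads

    CountingInputStream(InputStream in, AtomicLong total) {
        super(in);
        this.total = total;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) total.incrementAndGet();
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) total.addAndGet(n);
        return n;
    }

    /** Reads the stream to the end and returns how many bytes passed through. */
    static long countBytes(InputStream source) {
        AtomicLong total = new AtomicLong();
        byte[] buf = new byte[8192];
        try (InputStream in = new CountingInputStream(source, total)) {
            while (in.read(buf) != -1) {
                // discard the data; only the count matters in this demo
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total.get();
    }
}
```

In a crawler you would wrap the stream returned by HttpURLConnection.getInputStream() and divide the counter's growth by the elapsed time. Note that this counts decoded body bytes as seen by the application, so headers and TCP overhead are not included.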
I run multiple game servers and I want to develop a custom application to manage them. Basically all the game servers will connect to the application to exchange data. I don't want any of this data getting lost so I think it would be best to use TCP. I have looked into networking and understand how it works however I have a question about cpu usage. More servers are being added and in the next few months it could potentially reach around 100 - 200 and will continue to grow as needed. Will new threads for each server use a lot of cpu and is it a good idea to do this? Does anyone have any suggestions on how to go about this? Thanks.
You should have a look at non-blocking I/O. With blocking I/O, each socket will consume one thread, and the number of threads in a system is limited. And even if you can create 1000+, it is a questionable approach.
With non-blocking I/O, you can serve multiple sockets with a single thread. This is a more scalable approach, and you control how many threads are running at any given moment.
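To make the single-thread idea concrete, here is a minimal sketch of a selector loop that accepts connections and echoes back whatever each one sends. It is not a production design: SelectorEchoServer is a made-up name, the echo protocol is a placeholder for the real data exchange, and the roundTrip method is only a blocking test client.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical sketch: one thread, many sockets.
final class SelectorEchoServer implements Runnable, AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;

    SelectorEchoServer() {
        try {
            selector = Selector.open();
            server = ServerSocketChannel.open();
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: the OS picks one
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int port() {
        return server.socket().getLocalPort();
    }

    /** One thread serves every connection: accept new sockets, echo whatever arrives. */
    @Override
    public void run() {
        ByteBuffer buf = ByteBuffer.allocate(4096);
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until a channel is ready or close() is called
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        echo(key, buf);
                    }
                }
            }
        } catch (IOException | ClosedSelectorException | CancelledKeyException e) {
            // selector was closed or the listening socket failed: stop serving
        }
    }

    private void echo(SelectionKey key, ByteBuffer buf) {
        SocketChannel client = (SocketChannel) key.channel();
        try {
            buf.clear();
            if (client.read(buf) < 0) { // peer closed the connection
                client.close();
                return;
            }
            buf.flip();
            while (buf.hasRemaining()) {
                client.write(buf); // spins if the peer is slow; acceptable only in a sketch
            }
        } catch (IOException e) {
            key.cancel(); // drop only this connection and keep serving the others
            try { client.close(); } catch (IOException ignored) { }
        }
    }

    @Override
    public void close() {
        try {
            selector.close();
            server.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Blocking test client: sends one message and returns what the server echoed back. */
    static String roundTrip(int port, String message) {
        byte[] out = message.getBytes(StandardCharsets.UTF_8);
        try (Socket s = new Socket("127.0.0.1", port)) {
            s.getOutputStream().write(out);
            s.getOutputStream().flush();
            byte[] in = new byte[out.length];
            new DataInputStream(s.getInputStream()).readFully(in);
            return new String(in, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Even this toy version has to deal with partial reads, write back-pressure, and per-connection cleanup by hand, which is where most of the extra complexity of NIO comes from. A framework built on top of NIO hides much of it.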
More servers are being added and in the next few months it could potentially reach around 100 - 200 and will continue to grow as needed. Will new threads for each server use a lot of cpu and is it a good idea to do this?
It is a standard answer to caution away from 100s of threads and to the NIO solution. However, it is important to note that the NIO approach has a significantly more complex implementation. Isolating the interaction with a server connection to a single thread has its advantages from a code standpoint.
Modern operating systems can create thousands of threads with little overhead aside from the stack memory. If you are sure of your scaling factors (i.e. you're not going to reach 10k connections or something) and you have the memory, then I would say that a thread per TCP connection could work very well. I've very successfully run applications with thousands of threads and have not seen performance fall off due to context switching, which used to be the case with earlier processors and kernels.
I have a web application running across multiple locations, and I can see many connections piling up by running this command on Linux:
ps -ef|grep LOCAL
It shows me the count of active Oracle connections with their process IDs, and the connection count has been growing by 5-7 every hour. After a few hours, the application slows down and eventually the Tomcat server needs to be restarted.
Since I am able to see the connections growing, is there any way to get the source of these connections, to find out which classes or objects have created these lingering connections?
I am not using Tomcat connection pooling. I tried generating thread dumps by issuing kill -3 on the Tomcat PID, but they were of no use to me, as I am not able to understand them. I even tried thread analyzers.
Is there any simple way to get the originating classes associated with these lingering connections, to get a small hint, using some Tomcat feature or any other means?
In JProfiler, you could use the JDBC probe to get the stack trace that opened a connection. You would select the connection in the timeline and jump to the events view, where you can select the "Connection opened" event. In the lower pane, the associated stack trace is shown.
Disclaimer: My company develops JProfiler
You could search for uses of javax.sql.DataSource.getConnection() using your IDE.
If you start tomcat in debug mode, you can look for instances of the connection class (and see them increasing). Also, putting a breakpoint on the constructor will catch them in the act of being created.
But really you should be using a connection pool. That is the easiest solution to your problems.
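If a profiler or debugger is not an option, a similar hint can be had in code by wrapping the DataSource so that each getConnection() call records the caller's stack trace until the connection is closed. This is a sketch using java.lang.reflect.Proxy. ConnectionTracker is a hypothetical class, and it only helps when the application obtains its connections through a DataSource that you can wrap.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import javax.sql.DataSource;

// Hypothetical leak tracker: remembers who opened each connection that is still open.
final class ConnectionTracker {
    private final Map<Long, Throwable> open = new ConcurrentHashMap<>();
    private final AtomicLong ids = new AtomicLong();

    /** Wraps a DataSource so every connection it hands out is tracked. */
    DataSource track(DataSource real) {
        return (DataSource) Proxy.newProxyInstance(
                ConnectionTracker.class.getClassLoader(),
                new Class<?>[] {DataSource.class},
                (proxy, method, args) -> {
                    Object result = forward(real, method, args);
                    if (result instanceof Connection) {
                        // The Throwable is never thrown; it only captures the call stack.
                        return wrap((Connection) result, new Throwable("connection opened here"));
                    }
                    return result;
                });
    }

    private Connection wrap(Connection real, Throwable origin) {
        long id = ids.incrementAndGet();
        open.put(id, origin);
        return (Connection) Proxy.newProxyInstance(
                ConnectionTracker.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                (proxy, method, args) -> {
                    if (method.getName().equals("close")) {
                        open.remove(id); // closed properly, so no longer a leak candidate
                    }
                    return forward(real, method, args);
                });
    }

    private static Object forward(Object target, Method method, Object[] args) throws Throwable {
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the driver's own exception unchanged
        }
    }

    /** Stack traces of connections that were opened and have not been closed yet. */
    Collection<Throwable> leaks() {
        return new ArrayList<>(open.values());
    }
}
```

Printing the stack traces from leaks() after the application has run for an hour should point at the code paths that open connections and never close them. Those are usually the places that need a try-with-resources block.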
Perhaps these two tools can help you determine what is slowing down your server application's performance.
jmeter
ab benchmarking tool
Performance might also have slowed due to some simple implementation issues. You might want to use NIO (buffer-oriented, non-blocking I/O) instead of classic blocking I/O for web applications. You might also be doing a lot of string concatenation; if so, use StringBuffer, or StringBuilder when the buffer is not shared between threads.
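As a small illustration of the string concatenation point: building a string inside a loop with a single buffer avoids creating a new intermediate String on every iteration. The sketch uses StringBuilder, the unsynchronized sibling of StringBuffer, on the assumption that the buffer stays local to one thread. ConcatDemo is a made-up name.

```java
// Hypothetical example of building a string in a loop with one buffer.
final class ConcatDemo {
    /** Builds a comma-separated line with a single growing buffer. */
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }
}
```

This only matters for concatenation in hot loops. Whether it is relevant here is something a profiler should confirm first.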
We have a 64-bit Linux machine and we make multiple HTTP connections to other services. The Drools Guvnor website (a rule engine, if you don't know it) is one of them. In Drools, we create a knowledge base for each rule being fired, and creating a knowledge base makes an HTTP connection to the Guvnor website.
All other threads are blocked and CPU utilization goes up to ~100%, resulting in an OOM error. We can change things so that the rules are compiled only every 15-20 minutes, but I want to be sure of the problem first, in case someone has already faced it.
I checked for "cat /proc/sys/kernel/threads-max" and it shows 27000 threads, Can it be a reason?
I have a couple of questions:
When do we know that we are running over capacity?
How many threads can be spawned internally (any rough estimate or formula relating diff parameters will work)?
Has anyone else seen similar issues with Drools? Concurrent access to Guvnor website is basically causing the issue.
Thanks,
I am basing my answer on the assumption that you are creating a knowledge base for each request, and that this knowledge base creation includes downloading the latest rule sources from Guvnor. Please correct me if I am mistaken.
I suspect that the build/compilation of packages is taking time and hogging your system.
Instead of compiling packages on each and every request, you can download pre-built packages from Guvnor, and you can also cache these packages locally if your rules do not change much. The only restriction is that you need to use the same version of Drools both on Guvnor and in your application.
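The caching idea can be sketched without any Drools API at all: keep the expensive result in a map and build it at most once per key. In the sketch below, BuildOnceCache and expensiveBuilder are hypothetical. The builder function stands in for whatever downloads and compiles a package, and invalidation is left to the caller (for example on a timer, or when the rules are republished).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache: build an expensive object once per key and reuse it across requests.
final class BuildOnceCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> expensiveBuilder; // assumption: download + compile happens here

    BuildOnceCache(Function<K, V> expensiveBuilder) {
        this.expensiveBuilder = expensiveBuilder;
    }

    /** Returns the cached value, building it at most once per key even under concurrency. */
    V get(K key) {
        return cache.computeIfAbsent(key, expensiveBuilder);
    }

    /** Drops an entry so that the next request rebuilds it, e.g. after the rules changed. */
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

With something like this in front of the knowledge base creation, concurrent requests for the same rules wait for one build instead of each starting their own HTTP download and compilation.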
I checked for "cat /proc/sys/kernel/threads-max" and it shows 27000 threads, Can it be a reason?
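That number does look large, but note that threads-max is the system-wide limit on the number of threads, not a count of running threads, and we don't know how many threads actually belong to your Java app. Create a Java thread dump to confirm this. Your thread dump will also show the CPU time taken by each thread.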
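Besides an external thread dump (jstack or kill -3 on the Java process), the same information can be pulled from inside the JVM through ThreadMXBean. The helper below is a sketch with a made-up name, ThreadCpuReport. It reports -1 for CPU time where the JVM does not support or has not enabled the measurement.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: list the JVM's own threads with state and CPU time.
final class ThreadCpuReport {
    /** One line per live thread: name, state, and CPU time in ms (-1 if unavailable). */
    static List<String> snapshot() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuOk = threads.isThreadCpuTimeSupported() && threads.isThreadCpuTimeEnabled();
        List<String> lines = new ArrayList<>();
        for (long id : threads.getAllThreadIds()) {
            ThreadInfo info = threads.getThreadInfo(id);
            if (info == null) continue; // the thread ended between the two calls
            long cpuNanos = cpuOk ? threads.getThreadCpuTime(id) : -1;
            long cpuMs = cpuNanos < 0 ? -1 : cpuNanos / 1_000_000;
            lines.add(info.getThreadName() + " [" + info.getThreadState() + "] cpuMs=" + cpuMs);
        }
        return lines;
    }

    public static void main(String[] args) {
        snapshot().forEach(System.out::println);
    }
}
```

The number of lines is the thread count of this JVM only, which is the figure to compare against the system-wide limit.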
When do we know that we are running over capacity?
You have 100% CPU and an OOM error. You are over capacity :) Jokes aside, you should monitor your HTTP connection queue to determine what you are doing wrong. Your post says nothing about how you are handling the HTTP connections (presumably through some sort of pooling mechanism backed by a queue?). I've seen containers and programs queue requests without limit, causing them to crash with a big bang. Plot the following graphs to isolate your problem:
The number of blocking threads over time
Time taken for each thread
Number of threads per thread pool and how they increase / decrease with time (pool size)
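For the first graph in the list above, a crude data source is to sample the thread states periodically and log the counts. This is only a sketch. ThreadStateSampler is a hypothetical name, and a monitoring tool will give richer data, but it is enough to see whether the number of BLOCKED threads climbs over time.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sampler: how many live threads are in each state right now.
final class ThreadStateSampler {
    /** Counts live threads by state; call this periodically and plot the BLOCKED count. */
    static Map<Thread.State, Integer> sample() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(sample());
    }
}
```

Logging one sample every few seconds with a timestamp gives a time series that can be charted next to CPU and heap usage.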
How many threads can be spawned internally (any rough estimate or formula relating diff parameters will work)?
Only a load test can answer this question. Load your server and determine the number of concurrent users it can support at 60-70% capacity. Note the number of threads spawned internally at this point. That is your peak capacity (allowing room for unexpected traffic)
Has anyone else seen similar issues with Drools? Concurrent access to Guvnor website is basically causing the issue
I can't help there since I've not accessed Drools this way. Sorry.
I'm new here and I'm not very experienced with CPU consumption and multithreading. I was wondering why my web app is consuming so much CPU. My program updates values in the background so that users don't have to wait for the data to be processed and only need to fetch it upon request. The update processes are scheduled tasks using the executor library, which fires off 8 threads every 5 seconds to update my data.
Now I'm wondering why my application is consuming so much CPU. Is it because of bad code, or is it because of a low-spec server? (2 cores, with 2 databases and 1 major application running alongside my web app.)
Thank you very much for your help.
You need to profile your application to find out where the CPU is actually being consumed. Java has some basic profiling methods built in, or, if your environment permits it, you could run the built-in "hprof" profiler (available on JVMs before JDK 9, where the hprof agent was removed):
java -Xrunhprof ...
(In reality, you probably want to set some extra options: Google "hprof" for more details.)
The latter is easier in principle, but I mention the possibility of adding your own profiling routine because it's more flexible and you can do it e.g. in a Servlet environment where running another profiler is more cumbersome.
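A home-made profiling routine can be as simple as wrapping each scheduled task and accumulating how long it runs. The sketch below is one possible shape, with TaskTimer as a hypothetical class. It measures wall-clock time with System.nanoTime(), so time spent waiting on the database counts as well as time spent on the CPU.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical do-it-yourself profiler for scheduled tasks.
final class TaskTimer {
    private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> calls = new ConcurrentHashMap<>();

    /** Wraps a task so each run adds its wall-clock duration to a named counter. */
    Runnable timed(String name, Runnable task) {
        return () -> {
            long start = System.nanoTime();
            try {
                task.run();
            } finally {
                long elapsed = System.nanoTime() - start;
                totalNanos.computeIfAbsent(name, k -> new LongAdder()).add(elapsed);
                calls.computeIfAbsent(name, k -> new LongAdder()).increment();
            }
        };
    }

    /** How many times the named task has finished (normally or with an exception). */
    long callCount(String name) {
        LongAdder c = calls.get(name);
        return c == null ? 0 : c.sum();
    }

    /** Total time spent in the named task so far, in milliseconds. */
    long totalMillis(String name) {
        LongAdder t = totalNanos.get(name);
        return t == null ? 0 : t.sum() / 1_000_000;
    }
}
```

You would pass timer.timed("update", task) to the executor's scheduleAtFixedRate call instead of the bare task and log the totals periodically. If one update regularly takes longer than the 5-second period, that alone could explain sustained CPU load.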
Paulo,
It is not possible for someone here to say whether the problem is that your code is inefficient or the server is under spec. It could be either or both of those, or something else.
You are going to need to do some research of your own:
Profile the code. This will allow you to identify where your webapp is spending most of its time.
Look at the OS-level stats that are available to you. This might tell you that the real problem is memory usage or disk I/O.
Look at the performance of the back-end database. Is it using a lot of CPU?
Once you have identified the area(s) where the CPU is being used, you need to figure out what the real cause of the problem is and work out how to fix it. And once you've got a potential fix implemented, you can rerun your profiling, etc. to see whether it has helped.