I'm running into a problem with Java threads, and I'm not sure whether it's related to my approach or whether thread pooling will resolve what I'm trying to achieve.
for (int i = 0; i < 100; i++) {
    verifier[i] = new Thread(new VerifierTask()); // VerifierTask: the Runnable running the code below (name assumed)
    verifier[i].start();
}
I initialize 100 threads and start them. The code that each thread executes is just:
con = (HttpURLConnection) website.openConnection();
// gets only the header
con.setRequestMethod("HEAD");
con.setConnectTimeout(2000); // set timeout to 2 seconds
Each thread repeats the process above over a long list of URLs/data.
The first 50 threads execute almost instantly, then they just stop for 60 seconds or so, then there is another spike of execution in which 20 or so of them finish at the same time, and so on. The same stall occurs even when there are only 4 of them.
My first guess was a deadlock. I am not sure how to resolve the issue and maintain a constant execution pace without deadlocks and stops.
I am looking for an explanation of why this occurs and how it can be resolved.
By deadlock I refer to the Java Virtual Machine and how it handles threads, not a deadlock caused by my own threads.
(Screenshot of thread execution omitted.)
It looks like the threads are dying for no reason, and I don't know why!
It could be that the operating system's configurable limit on TCP/IP connections is being hit, which causes the JVM to block waiting for a new TCP/IP connection to be created; that will only happen once an already-used connection gets closed.
This could help to find out what is going on:
Profile the run with VisualVM, which ships with the JDK itself (run jvisualvm on the command line). It should show how many threads are created, why they are blocked, deadlocks, etc.
Wait for it to block and take thread dumps of the JVM process using jstack or VisualVM to check for deadlocks in the thread stack traces; search for the deadlock keyword.
Check the state of your TCP connections with netstat -nao to see if the operating system limit is being hit, i.e. whether there are many connections in CLOSE_WAIT at the times the blocking occurs.
If you are behind a corporate proxy/firewall, you could be hitting some other sort of security limit that prevents you from opening more TCP connections, not necessarily an operating system limit.
If none of this helps you can always edit the question with further findings, but based on the description of the code, other limits are being hit that at first glance don't seem related to JVM thread deadlocks. Hope this helps.
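If the cause does turn out to be connections piling up, a common mitigation is to bound concurrency with a small thread pool and make sure every exchange is completed and disconnected. A rough sketch under those assumptions (the class name, URL list and pool size are placeholders, not from your code):

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HeadChecker {
    public static void main(String[] args) {
        List<String> urls = Arrays.asList("http://example.com"); // placeholder list
        ExecutorService pool = Executors.newFixedThreadPool(8);  // bounded concurrency instead of 100 raw threads
        for (String u : urls) {
            pool.submit(() -> {
                HttpURLConnection con = null;
                try {
                    con = (HttpURLConnection) new URL(u).openConnection();
                    con.setRequestMethod("HEAD");
                    con.setConnectTimeout(2000);
                    con.setReadTimeout(2000); // bound the read as well as the connect
                    int code = con.getResponseCode(); // completes the exchange so the socket can be freed
                    System.out.println(u + " -> " + code);
                } catch (Exception e) {
                    System.out.println(u + " -> " + e);
                } finally {
                    if (con != null) con.disconnect(); // release the underlying socket
                }
            });
        }
        pool.shutdown();
    }
}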
I'm writing a program in Java, where I use some error-printing statements for debugging.
My program generates about 2000 threads. The program runs fine until the moment a large number of threads hit this statement:
System.err.println("Some error message");
When this happens, one of my threads successfully gets access to the println function, while the other threads have this status:
State in JVM: Waiting for synchronized block
Digging deeper into the debugging statement, I noticed that the thread which managed to access the println function is stopped in this function:
private native void writeBytes(byte b[], int off, int len, boolean append) throws IOException;
and it has the following stack trace:
java.io.FileOutputStream.write(FileOutputStream.java:327)
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
java.io.PrintStream.write(PrintStream.java:482)
sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)
java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:185)
java.io.PrintStream.write(PrintStream.java:527)
java.io.PrintStream.print(PrintStream.java:669)
java.io.PrintStream.println(PrintStream.java:806)
fetcher.responseHandler.ExtendedResponseHandler500.handleResponse(ExtendedResponseHandler500.java:20)
fetcher.FetchWorker.run(FetchWorker.java:79)
java.lang.Thread.run(Thread.java:745)
While the other threads are stopped at the first line of the println function (inside the Java core code):
synchronized(this)
Is this problem caused by me, or is this error related to the JVM? Can I do anything about this issue?
The most likely cause is that the output stream of the process isn't being consumed by the parent process, so the stdout buffer fills up and then the next call to System.err.println just hangs forever.
This is common when one process is used to launch another, but doesn't set up "flushing" threads to drain the child's stdout and stderr streams.
Note that this doesn't have anything in particular to do with "threading" - but launching many threads can certainly increase the rate at which errors are generated (and perhaps cause more total errors if something else fails due to contention downstream) which means your output buffer fills up faster and hangs earlier.
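If the launching side is under your control, the usual fix is to start "gobbler" threads that drain the child's stdout and stderr so its buffers never fill. A minimal sketch (the class name and usage are illustrative, not from the question):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class StreamGobbler extends Thread {
    private final InputStream in;

    public StreamGobbler(InputStream in) { this.in = in; }

    @Override
    public void run() {
        try (BufferedReader r = new BufferedReader(new InputStreamReader(in))) {
            String line;
            while ((line = r.readLine()) != null) {
                // discard (or log) each line so the child's buffer never fills up
            }
        } catch (IOException ignored) {
            // stream closed; nothing to do
        }
    }
}

// When launching the child process:
// Process p = new ProcessBuilder("java", "-jar", "child.jar").start();
// new StreamGobbler(p.getInputStream()).start();
// new StreamGobbler(p.getErrorStream()).start();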
It is perfectly normal for 2000 threads to be waiting to acquire a lock on the println call.
Your stack trace shows that you are getting some HTTP 500 errors. Probably the majority of your threads got this error and are now all in line to report it on standard error. What you are seeing is a consequence of your problem, not the cause.
2000 threads is an insane number; it will not improve performance in just about any reasonable scenario and will most likely degrade it. Start with something like 4 and see if incrementally doubling that value gives you any improvement. The JVM can handle this number of threads (so this is NOT the source of your problem), but it is just useless. Using more threads will not fix the problem (which is probably a simple HTTP 500 and/or network timeouts).
Also check the server side logs.
If you need to maximize performance (in a real high-concurrency scenario), consider asynchronous I/O (NIO), but normal blocking I/O is extremely good for all common cases and seems fine here. In practice this means handing the jobs to a small fixed pool instead of spawning 2000 threads, as in the sketch below.
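Something along these lines (pool size and job list are assumptions):

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class JobRunner {
    public static void runAll(List<Runnable> jobs) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4); // start small; double and measure
        for (Runnable job : jobs) {
            pool.submit(job); // e.g. the FetchWorker instances from your stack trace
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.DAYS); // wait for all jobs to finish
    }
}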
Update:
My hypothesis is this:
a thread does a remote call
it gets a 500 error
it joins the line to report the error behind the other 2000 threads
it succeeds and goes back to step 1
Step 3 may take a lot of time, so even taking multiple thread dumps you will see the very same thread apparently always locked (I mean that you would need to be very lucky to catch Thread-1352 during steps 1 or 2). I'm assuming that you checked the thread name and that the locked thread we are discussing is always the same.
Do you see any logs while the program is "frozen" (it does freeze, right?), or is everything still? How many thread dumps did you take, and how much time in between?
I encountered a critical problem with the c3p0 library (version 0.9.5.2) that I use in my Java SE application.
My application uses a thread pool to parallelize tasks by executing jobs.
Each job uses the database to read, update or delete data at least once, and up to (in very rare cases, but it can happen) more than 10,000 times.
I therefore included the c3p0 library in my project to have a connection pool to the database, so that all workers in my thread pool can interact with it simultaneously.
I do not have any problems when running my application in my development environment (OS X 10.11), but when I run it in production (Linux Debian 8) I encounter a big problem: it freezes...
At first it was a deadlock, with the following stack trace:
[WARNING] com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@479d237b -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
[WARNING] com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@479d237b -- APPARENT DEADLOCK!!! Complete Status:
Managed Threads: 3
Active Threads: 3
Active Tasks:
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@264fb34f
on thread: C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1adv4kd1qtfdi6|659f3099]-HelperThread-#2
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@39a5576b
on thread: C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1adv4kd1qtfdi6|659f3099]-HelperThread-#1
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@5e676544
on thread: C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1adv4kd1qtfdi6|659f3099]-HelperThread-#0
Pending Tasks:
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@6848208c
Pool thread stack traces:
Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1adv4kd1qtfdi6|659f3099]-HelperThread-#2,5,main]
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
com.microsoft.sqlserver.jdbc.SocketFinder.findSocketUsingJavaNIO(IOBuffer.java:2438)
com.microsoft.sqlserver.jdbc.SocketFinder.findSocket(IOBuffer.java:2290)
com.microsoft.sqlserver.jdbc.TDSChannel.open(IOBuffer.java:551)
com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:1962)
com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:1627)
com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:1458)
com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:772)
com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:1168)
com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:175)
com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:220)
com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:206)
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:203)
com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1138)
com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1125)
com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1870)
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)
Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1adv4kd1qtfdi6|659f3099]-HelperThread-#1,5,main]
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
com.microsoft.sqlserver.jdbc.SocketFinder.findSocketUsingJavaNIO(IOBuffer.java:2438)
com.microsoft.sqlserver.jdbc.SocketFinder.findSocket(IOBuffer.java:2290)
com.microsoft.sqlserver.jdbc.TDSChannel.open(IOBuffer.java:551)
com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:1962)
com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:1627)
com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:1458)
com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:772)
com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:1168)
com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:175)
com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:220)
com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:206)
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:203)
com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1138)
com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1125)
com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1870)
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)
Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1adv4kd1qtfdi6|659f3099]-HelperThread-#0,5,main]
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
com.microsoft.sqlserver.jdbc.SocketFinder.findSocketUsingJavaNIO(IOBuffer.java:2438)
com.microsoft.sqlserver.jdbc.SocketFinder.findSocket(IOBuffer.java:2290)
com.microsoft.sqlserver.jdbc.TDSChannel.open(IOBuffer.java:551)
com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:1962)
com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:1627)
com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:1458)
com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:772)
com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:1168)
com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:175)
com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:220)
com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:206)
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:203)
com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1138)
com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1125)
com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1870)
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)
Subsequently I made some changes following the advice on different websites:
System.setProperty("com.mchange.v2.log.MLog", "com.mchange.v2.log.FallbackMLog");
System.setProperty("com.mchange.v2.log.FallbackMLog.DEFAULT_CUTOFF_LEVEL", "WARNING");
// Create db pool
final ComboPooledDataSource cpds = new ComboPooledDataSource() ;
// Driver
cpds.setDriverClass( "com.microsoft.sqlserver.jdbc.SQLServerDriver" ); // loads the jdbc driver
// Url
cpds.setJdbcUrl( "jdbc:xxxx://xxxxx:xxxx;database=xxxxx;" );
// Username / Password
cpds.setUser( "xxxx" ) ;
cpds.setPassword( "xxxx" ) ;
// Start size of db pool
cpds.setInitialPoolSize( 8 );
// Min and max db pool size
cpds.setMinPoolSize( 8 ) ;
cpds.setMaxPoolSize( 10 ) ;
// ????
cpds.setNumHelperThreads( 5 ) ;
// Max allowed time to execute statement for a connection
// #See http://stackoverflow.com/questions/14730379/apparent-deadlock-creating-emergency-threads-for-unassigned-pending-tasks
cpds.setMaxAdministrativeTaskTime( 60 ) ;
// ?????
cpds.setMaxStatements( 180 ) ;
cpds.setMaxStatementsPerConnection( 180 ) ;
// ?????
cpds.setUnreturnedConnectionTimeout( 60 ) ;
// ?????
cpds.setStatementCacheNumDeferredCloseThreads(1);
// We make a test : open and close opened connection
cpds.getConnection().close() ;
After these changes, and after the execution of some jobs, the application freezes for several tens of seconds and then displays this error message:
[WARNING] A task has exceeded the maximum allowable task time. Will interrupt() thread [Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1ao3z0x88z7oi|4dd889bd]-HelperThread-#4,5,main]], with current task: com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@4128b402
[WARNING] Thread [Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1ao3z0x88z7oi|4dd889bd]-HelperThread-#4,5,main]] interrupted.
[WARNING] A task has exceeded the maximum allowable task time. Will interrupt() thread [Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1ao3z0x88z7oi|4dd889bd]-HelperThread-#3,5,main]], with current task: com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@5d6aab6d
[WARNING] Thread [Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1ao3z0x88z7oi|4dd889bd]-HelperThread-#3,5,main]] interrupted.
[WARNING] A task has exceeded the maximum allowable task time. Will interrupt() thread [Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1ao3z0x88z7oi|4dd889bd]-HelperThread-#0,5,main]], with current task: com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@70a3328f
[WARNING] Thread [Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1ao3z0x88z7oi|4dd889bd]-HelperThread-#0,5,main]] interrupted.
My questions are :
Why does the application work perfectly in a development environment and encounters these problems during production?
Above all, how to remedy it?
When a connection reaches the maximum number of statements defined with setMaxStatements and setMaxStatementsPerConnection, what happens to it? Is the connection closed, with another taking over while a new one is created?
I did not quite understand the impact that the setStatementCacheNumDeferredCloseThreads function has on my application.
Thank you very much ! Have a good day.
OK. So. Your basic problem is simple. In your production environment, Connection acquisition attempts are eventually freezing; that is, they are neither succeeding nor failing with an Exception, they are simply hanging. Ultimately, this is what you have to debug: why is it that when c3p0 tries to connect to your production database, sometimes those calls to Driver.connect() hang? Whatever is causing that is outside of c3p0's control.

You could be hitting limits on total connections at the DBMS side (not from this application, your maxPoolSize is quite modest, but perhaps your production server is overextended). If you are running on an older JVM, there was a known problem with hangs to SQLServer, see e.g. "JDBC connection hangs with no response from SQL Server 2008 r2" and "Driver.getConnection hangs using SQLServer driver and Java 1.6.0_29", but I doubt you are running Java 6 at this point, and I don't know of more recent issues.

In any case, it's quite clear from your logs that this is what is happening: c3p0 is trying to acquire Connections from the DBMS, the DBMS is hanging indefinitely, and eventually all of c3p0's helper threads get saturated by hung tasks and you see an APPARENT DEADLOCK. To resolve the issue, you have to debug why attempts by your JDBC driver to connect to your DBMS sometimes hang.
Most of the things you did after scrounging random troubleshooting posts were not very relevant to this issue. The thing that did cause your logs to change was this setting:
cpds.setMaxAdministrativeTaskTime( 60 );
That works around the problem in an ugly way. If a task hangs for a long period of time, that setting causes c3p0 to interrupt() the Thread on which it is running and abandon it. That prevents the deadlocks, but doesn't address their cause.
There is a surprising change, though, between the two logs. The replacement of APPARENT DEADLOCK spew with reports that a "task has exceeded the maximum allowable task time" was to be expected. But interestingly, in your second log, the tasks that get interrupt()ed are not Connection acquisition attempts but Connection destruction attempts. I don't know why that has changed, but the core issue is the same: attempts by your JDBC driver to interact with your DBMS are freezing indefinitely, neither succeeding nor failing promptly with an Exception. That is what you need to debug.
If you can't resolve the problem, you may be able to work around it. It's very ugly, but if you reduce maxAdministrativeTaskTime (say to 30) and increase numHelperThreads (say to 20), you may be able to largely eliminate the application pauses, as long as the freezes are infrequent. Increasing numHelperThreads increases the number of frozen tasks c3p0's Thread pool can tolerate before being completely blocked; reducing maxAdministrativeTaskTime reduces the lifetime of blockages. Obviously, the right thing to do is to debug the problem between the JDBC driver and the DBMS. But if that proves impossible, sometimes a workaround is the best you can do.
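Concretely, that workaround would look something like this (the values are the ones suggested above, not tuned for your workload):

cpds.setMaxAdministrativeTaskTime( 30 ); // abandon hung tasks sooner
cpds.setNumHelperThreads( 20 );          // tolerate more hung tasks before the helper pool saturates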
I would eliminate (at least for now) these three settings:
// ?????
cpds.setMaxStatements( 180 ) ;
cpds.setMaxStatementsPerConnection( 180 ) ;
// ?????
cpds.setStatementCacheNumDeferredCloseThreads(1);
The first two turn Statement caching on, which may or may not be desirable from a performance perspective for your application. But they increase the complexity of c3p0's interaction with the DBMS. SQLServer (among several databases) is very fragile with respect to multithreaded use of a Connection (which, at least per early versions of the JDBC spec, should be legal, but too bad).

Setting statementCacheNumDeferredCloseThreads to 1 ensures that the Statement cache doesn't try to close an expiring Statement while the Connection is otherwise in use, and so prevents freezes and APPARENT DEADLOCKs that usually show up as hung Statement close tasks, not your issue. If you turn the Statement cache on, by all means keep statementCacheNumDeferredCloseThreads set to 1 to avoid freezes.

But the safest, sanest thing is to avoid all the complexity of the Statement cache until you have your main issue debugged. You can restore these settings later to test whether they improve your application's performance. (If you do turn the Statement cache back on, my suggestion is that you just set maxStatementsPerConnection and do not set a global maxStatements; or if you set both, set the per-Connection limit to a value much smaller than the global limit. But again, for now, just turn all this stuff off.)
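For reference, c3p0 leaves the Statement cache off by default, so simply removing those three calls (or explicitly zeroing the limits, as below) disables it:

cpds.setMaxStatements( 0 );              // 0 is the default; disables the global statement cache
cpds.setMaxStatementsPerConnection( 0 ); // 0 is the default; disables the per-Connection cache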
To get to your specific questions:
Why does the application work perfectly in a development environment and encounters these problems during production?
That's an important clue that you want to use in debugging the hang between your JDBC driver and your DBMS. Something about your production environment leads to a hang that does not show up in your development environment. That may just be a matter of the relatively low load on your development server and the high load on your production server. But there may be other differences in settings that provide clues about the hangs.
Above all, how to remedy it?
Debug the hangs. If you cannot debug the hangs, try working around the issue with a shorter maxAdministrativeTaskTime and a larger numHelperThreads.
When a connection reaches the maximum number of statements defined with setMaxStatements and setMaxStatementsPerConnection, what happens to it? The connection is closed then another takes over while another one is created?
A Connection doesn't reach any of those things. These are parameters that describe the Statement cache. When the total number of cached Statements hits maxStatements, the least-recently-used cached Statement is closed (just the Statement, not its Connection). When a Connection's maxStatementsPerConnection is hit, that Connection's least-recently-used cached Statement is closed (but the Connection itself remains open and active).
I did not quite understand the impact that the setStatementCacheNumDeferredCloseThreads function has on my application.
If you are using the Statement cache (again, I recommend you turn it off for now), this setting ensures that expired Statements (see above) are not close()ed while their parent Connection is in use by some other Thread. The setting creates a dedicated Thread (or Threads) whose sole purpose is to wait until Connections are no longer in use and only then close the expired Statements (thus, statement cache deferred close threads).
I hope this helps!
Update: The bug that you are experiencing does look very much like the Java 6 bug. If you are running Java 6, you are in luck; the fix is probably just to update your production JVM to the most recent version of Java 6.
I encountered several stuck JDBC connections in my code due to poor network health. I am planning to use the java.sql.Connection.setNetworkTimeout library function. As per the docs:
Sets the maximum period a Connection or objects created from the Connection will wait for the database to reply to any one request
Now, what exactly is a "request" here? My query takes a really long time to respond and an even longer time to process (I am using a JDBC interface to a big-data DB). So do I need to keep this timeout bigger than the expected query execution time (to prevent false triggers), or are keep-alive messages exchanged to keep track of the network connection, in which case I would keep it really low?
If your NetworkTimeout is smaller than the QueryTimeout, the query will be terminated on your side: the thread that waits for the DB to reply (notice that setNetworkTimeout has an Executor executor parameter) will be interrupted. Depending on the underlying implementation, NetworkTimeout may cancel the query on the DB side as well.
If NetworkTimeout > QueryTimeout and the query completes within QueryTimeout, then nothing bad should happen. If the problems you experience are exactly in this case, you should try to work on the OS-level settings for keeping TCP connections alive, so that no firewall terminates them too soon.
When it comes to keeping TCP connections alive, it is usually more a matter of OS-level settings than of the application itself. You can read more about it (for Linux) here.
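Putting the two timeouts together, here is a sketch of the pattern described above (the URL, credentials, query and timeout values are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TimeoutDemo {
    public static void main(String[] args) throws SQLException {
        Connection con = DriverManager.getConnection("jdbc:...", "user", "pass"); // placeholders
        ExecutorService executor = Executors.newSingleThreadExecutor(); // runs the abort when the timeout fires
        try (Statement st = con.createStatement()) {
            con.setNetworkTimeout(executor, 10 * 60 * 1000); // safety net in ms, larger than any expected query
            st.setQueryTimeout(5 * 60); // per-query limit in seconds, below the network timeout
            try (ResultSet rs = st.executeQuery("SELECT ...")) {
                while (rs.next()) {
                    // process rows
                }
            }
        } finally {
            con.close();
            executor.shutdown();
        }
    }
}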
I have a server-side application that opens a socket thread for each connected client. Each thread has a DataInputStream, which calls read(byte[] array) to read data. I also set the socket timeout to a few minutes. The main code is something like this:
while (dataInputStream.read(array) != -1) { do something... }
However, after several hours of running, in JConsole with the TopThreads plugin I can see several client threads using 20% CPU each. If I click on one, the call stack shows the thread is blocked on the above line, in the read() function.
I know the read() function will normally block to wait for data, and while blocked it should consume few CPU cycles. Yet each of these threads is using 20%ish, and my server runs slower and slower as more threads develop the same problem. My server gets about 5 connection requests per second, and this happens really rarely: in several hours only 5 threads show the problem.
I am really confused. Can someone help me?
When the JVM is waiting to read data from a socket, there are a lot more activities the system needs to do constantly.
I don't have the exact technique used, but this link should give some idea.
Why don't you try using a BufferedInputStream or one of the stream readers? These classes would help with performance.
You could also try using classes from the java.util.concurrent package to improve thread handling (creating a thread pool would help reduce the total memory consumed, thereby helping overall system performance). Not sure if you are doing this already.
while (dataInputStream.read(array) != -1) { do something... }
This code is wrong anyway. You need to store the return value of read() in a variable so you know how many bytes were returned; the rest of your application can't possibly be working reliably without that, so worrying about timing at this stage is premature.
However, unless the array is exceptionally small, I doubt you are really using 20% CPU here. More likely 20% of elapsed time is spent here. Blocking on a network read doesn't use any CPU.
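For completeness, a corrected version of the loop that keeps the byte count (variable names taken from the question):

int count;
while ((count = dataInputStream.read(array)) != -1) {
    // process exactly 'count' bytes: array[0] .. array[count - 1]
}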
Reviewing a stack trace of a non-responsive web-app, I realized that some of the data did not match how-I-thought-tomcat-works.
Context
The application is getting hit in rapid succession on a slow URL. These requests seem to pile up, i.e. form a traffic jam in the app server. Logging shows that the number of threads/HTTP connectors has maxed out (the number of busy threads has crept up to the maxThreads value of 120).
JBoss 4.2.2
Uses a variation of Tomcat 6.0 called 'jboss-web'
The question
Many of the threads are "doing something", i.e. reading from the database, writing to the output stream, etc. Yet over 50 of the threads are "waiting on the connector to provide a new socket" (from the comments).
What exactly does this mean to the non-socket programmer?
My prior assumptions: wrong
I had assumed that each HTTP thread would "do its own work": get the request, do some work, and write the response, without needing to wait for anything.
so...
What's going on? Could someone clarify the socket-ish stuff?
What implications does such a bottleneck have for the Tomcat settings? (i.e. increase this setting, decrease that one, etc.)
Stack Trace
"http-0.0.0.0-80-90" daemon prio=6 tid=0x695e1400 nid=0x24c in Object.wait() [0x6e8cf000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x09c34480> (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
at java.lang.Object.wait(Object.java:485)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.await(JIoEndpoint.java:416)
- locked <0x09c34480> (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:442)
at java.lang.Thread.run(Thread.java:619)
Locked ownable synchronizers:
- None
Code fragment from Tomcat's org.apache.tomcat.util.net.JIoEndpoint:
/**
 * Process an incoming TCP/IP connection on the specified socket. Any
 * exception that occurs during processing must be logged and swallowed.
 * <b>NOTE</b>: This method is called from our Connector's thread. We
 * must assign it to our own thread so that multiple simultaneous
 * requests can be handled.
 *
 * @param socket TCP socket to process
 */
synchronized void assign(Socket socket) {
    // Wait for the Processor to get the previous Socket
    while (available) {
        try {
            wait();
        } catch (InterruptedException e) {
        }
    }
    // Store the newly available Socket and notify our thread
    this.socket = socket;
    available = true;
    notifyAll();
}
thanks
The maxThreads setting doesn't affect the servlet container's performance other than requiring a bigger heap and more CPU cycles (if your threads get activated). However, whenever you raise that setting to a number bigger than 150, you should suspect a bottleneck in your application.
The web server is not designed to handle more than about 100 simultaneous requests. If you do find yourself in such a situation, consider clustering. I see you are using jboss-web, and there is a really nice article here:
http://refcardz.dzone.com/refcardz/getting-started-jboss
However, as I don't think you have more than 100 simultaneous requests, I think it is a bottleneck in your application. Things to check are your JDBC driver, the version of the JDK you use, and the Tomcat version (in your case 6.0). Requests to your application should finish in less than 1 second minus network latency (and even that is way too long a delay); if you find they take more, the cause is probably somewhere in your code. Do you manually open/close your database connections? Do you use efficient threading in the background? Do you use JMS? Those are the usual things to look at. The other possibility is a bug in your particular servlet container version.
P.S. If you do decide to use a higher number of max threads, it might make sense to decrease/increase the thread stack size and see how it affects performance. If you have long-lived threads (which you should not), you might want to increase the stack size. If you have short-lived threads, try decreasing the stack size to conserve a bit of memory.
-Xss is the flag (for example, java -Xss256k ... on the command line).
Also, I just saw the JBoss AS version you are using. Check that as well. Now that I look at your symptoms, I believe your problem is somewhere in the configuration files.