SQLRecoverableException: I/O Exception: Connection reset - java

Yesterday evening I left the office with a Java program of mine still running. It was supposed to insert a lot of records into our company database (Oracle) over a JDBC connection. This morning when I came back to work I saw this error (caught by a try-catch):
java.sql.SQLRecoverableException: I/O Exception: Connection reset
The program wrote almost all of the records before hitting this problem, but what if it happens early (just minutes after I leave the office in the evening)? I cannot understand what happened. I contacted my database admin and he said there was no particular issue on the database.
Any idea what happened and what I can do to avoid it?

The error occurs on some RedHat distributions. The only thing you need to do is to run your application with parameter java.security.egd=file:///dev/urandom:
java -Djava.security.egd=file:///dev/urandom [your command]

I want to add a complementary answer to nacho-soriano's solution ...
I recently had to solve a problem where an application written in Java (a Talend© ELT job, in fact) connects to an Oracle database (11g and later) and randomly fails. The OS is both RedHat Enterprise and CentOS. The job runs very quickly (no more than half a minute) and very often (approximately one run every 5 minutes).
Sometimes, at night as well as during working hours, under heavy database load as well as light load, in a word randomly, the connection fails with this message:
Exception in component tOracleConnection_1
java.sql.SQLRecoverableException: Io exception: Connection reset
at oracle.jdbc.driver.SQLStateMapping.newSQLException(SQLStateMapping.java:101)
at oracle.jdbc.driver.DatabaseError.newSQLException(DatabaseError.java:112)
at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:173)
at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:229)
at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:458)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:411)
at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:490)
at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:202)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:33)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:465)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
and StackTrace follow ...
##Problem explanation:##
As detailed here
An Oracle connection needs some random numbers to ensure a good level of security. The Linux random number generator produces numbers based on keyboard and mouse activity (among other sources) and places them in a pool. You will grant me that on a server there is not much of such activity. So it can happen that software consumes more random numbers than the generator can produce.
When the pool is empty, reads from /dev/random block until additional environmental noise is gathered, and the Oracle connection times out (60 seconds by default).
##Solution 1 - Specific for one app solution##
The solution is to pass two parameters to the JVM at startup:
-Djava.security.egd=file:/dev/./urandom
-Dsecurerandom.source=file:/dev/./urandom
Note: the '/./' is important, do not drop it!
So the launch command line could be:
java -Djava.security.egd=file:/dev/./urandom -Dsecurerandom.source=file:/dev/./urandom -cp <classpath directives> appMainClass <app options and parameters>
One drawback of this solution is that the generated numbers are a little less secure, as randomness is impacted. If you don't work in a military or secrecy-sensitive industry, this solution may suit you.
##Solution 2 - General Java JVM solution##
As explained here
Both directives given in solution 1 can be put in the Java security settings file.
Take a look at $JAVA_HOME/jre/lib/security/java.security
Change the line
securerandom.source=file:/dev/random
to
securerandom.source=file:/dev/urandom
The change takes effect immediately for newly started applications.
As with solution #1, one drawback is that the generated numbers are a little less secure, as randomness is impacted. This time the impact is global to the JVM. As with solution #1, if you don't work in a military or secrecy-sensitive industry, this solution may suit you.
Ideally, use "file:/dev/./urandom" on Java 5 and later, because there the plain /dev/urandom path is again mapped to /dev/random.
Reported Bug : https://bugs.openjdk.java.net/browse/JDK-6202721
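To check which entropy source a given JVM actually picked up after applying solution 1 or 2, a small diagnostic along these lines may help. It is only a sketch: the class name EntropySourceCheck and the method configuredSource are made up for illustration, and it assumes the java.security.egd system property takes precedence over the securerandom.source entry in java.security.

```java
import java.security.SecureRandom;
import java.security.Security;

// Diagnostic sketch (hypothetical class name): reports the configured entropy source.
class EntropySourceCheck {

    // Returns the -Djava.security.egd value if set, otherwise the
    // securerandom.source entry from java.security, otherwise a placeholder.
    static String configuredSource() {
        String egd = System.getProperty("java.security.egd");
        if (egd != null && !egd.isEmpty()) {
            return egd;
        }
        String fromFile = Security.getProperty("securerandom.source");
        return fromFile != null ? fromFile : "(not set)";
    }

    public static void main(String[] args) {
        System.out.println("entropy source: " + configuredSource());

        SecureRandom random = new SecureRandom();
        System.out.println("algorithm: " + random.getAlgorithm());

        // A long delay here suggests the JVM is waiting for entropy.
        long start = System.nanoTime();
        random.nextBytes(new byte[32]);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("32 random bytes took " + millis + " ms");
    }
}
```

Run it with the same JVM and the same -D options as the failing job, so that the output reflects that job's configuration.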
##Solution 3 - Hardware solution##
Disclaimer: I'm not affiliated with any hardware vendor or product ...
If you need a high level of randomness quality, you can replace the Linux software random number generator with a piece of hardware.
Some information is available here.
Regards
Thomas

This simply means that something in the backend (DBMS) decided to stop working, for example because resources were unavailable.
It has nothing to do with your code or the number of inserts.
You can read more about similar problems here:
http://kr.forums.oracle.com/forums/thread.jspa?threadID=941911
http://forums.oracle.com/forums/thread.jspa?messageID=3800354
This may not answer your question, but you will get an idea of why it might be happening. You could further discuss with your DBA and see if there is something specific in your case.

Solution
Change the setup for your application so that you pass this parameter [-Djava.security.egd=file:/dev/../dev/urandom] to the java command:
java -Djava.security.egd=file:/dev/../dev/urandom [your command]
Ref :- https://community.oracle.com/thread/943911

We experienced these errors intermittently after upgrading from 11g to 12c while our Java was on 1.6.
Our first attempted fix was to upgrade Java and JDBC from 6 to 7:
export JAVA_HOME='/usr/java1.7'
export CLASSPATH=/u01/app/oracle/product/12.1.0/dbhome_1/jdbc/lib/ojdbc7.jar:$CLASSPATH
Several days later, we still had intermittent connection resets.
We ended up removing all of the Java 7 changes above. Java 6 was fine. The problem was fixed by adding the line below to our user's bash_profile.
Our Groovy scripts that were experiencing the error were using /dev/random on our batch VM server. The line below forces Java and Groovy to use /dev/urandom.
export JAVA_OPTS=" $JAVA_OPTS -Djava.security.egd=file:///dev/urandom "

Your exception says it all: "Connection reset".
The connection between your Java process and the DB server was lost, which could have happened for almost any reason (like network issues). The SQLRecoverableException just means that it's recoverable, but the root cause is the connection reset.
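Since the exception is flagged as recoverable, one way to keep an overnight job alive is to retry the failed unit of work. The following is a minimal sketch, not a definitive implementation: RetryOnReset, SqlAction and withRetry are hypothetical names, and it assumes each attempt opens a fresh connection and that the work inside one attempt is transactional, so that a retry cannot insert the same records twice.

```java
import java.sql.SQLException;
import java.sql.SQLRecoverableException;

// Sketch of a retry helper (all names here are hypothetical).
class RetryOnReset {

    // A unit of database work that may fail with an SQLException.
    @FunctionalInterface
    interface SqlAction<T> {
        T run() throws SQLException;
    }

    // Runs the action, retrying only on SQLRecoverableException.
    // Other SQLExceptions propagate immediately.
    static <T> T withRetry(SqlAction<T> action, int maxAttempts, long backoffMillis)
            throws SQLException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SQLRecoverableException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.run();
            } catch (SQLRecoverableException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Linear backoff before the next attempt.
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        throw last;
    }
}
```

In the original program, the action passed to withRetry would open the connection, insert one batch and commit; the already committed batches are then not affected by a later reset.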

I had a similar situation when reading from Oracle in a Spark job. This connection reset error was caused by an incompatibility between the Oracle server and the JDBC driver used. Worth checking it.

Add the java.security.egd property to your run command:
java -jar -Djava.security.egd="file:///dev/urandom" yourjarfilename.jar

Related

Java server application slow after period of idleness (Windows)

I'm having trouble with a Jetty 9 server application that seems to go into some kind of resting state after a longer period of idleness. Normally the memory usage of the Java process is ~500 MB, but after being idle for some time it seems to drop down to less than 50 MB. The first request that comes in takes up to several seconds to respond, whereas requests are normally on the scale of tens of milliseconds. But after one or two requests it seems like the application is back to its normal responsive state.
I'm running on the 32-bit Oracle Java 8 JVM. My JVM configuration is very basic:
java -server -jar start.jar
I was hoping that this issue might be solvable through JVM configuration. Does anyone know if there's any particular parameter to disable this type of behavior?
edit: Based on the comment from Ivan, I was able to identify the source of the issue. Turns out Windows was swapping parts of the Java process out to disk. See my own answer below for a description of my solution.
Based on the comment from Ivan, I was able to identify the source of the issue. Turns out Windows was swapping parts of the Java process out to disk. This was clearly visible when comparing the private working set to the commit size in the task manager.
My solution to this was two-fold. First, I made a simple scheduled job inside my server app that runs every minute and does a simple test run to make sure that the important services never go inactive for long periods. I'm hoping this should ensure that Windows doesn't regard the related pages as inactive.
Afterwards, I also noticed that the process was executing with "Below normal" priority. So I changed the script that starts the server to ensure that it's running with "High" priority going forward. This seems likely to affect swapping behavior and may very well also have been enough to resolve the issue on its own, but I only found it after already deploying my first solution, so that remains unclear. In any case, everything seems to be working as it should now.
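The scheduled keep-alive job described above could look roughly like this. It is a sketch under assumptions: KeepWarm and start are hypothetical names, and the warmUp task stands in for whatever "simple test run" touches the important services.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of a periodic warm-up job (hypothetical names).
class KeepWarm {

    // Schedules warmUp at a fixed rate on a daemon thread and returns the
    // scheduler so the caller can shut it down with the server.
    static ScheduledExecutorService start(Runnable warmUp, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "keep-warm");
                    t.setDaemon(true);
                    return t;
                });
        scheduler.scheduleAtFixedRate(() -> {
            try {
                warmUp.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so log and continue.
                System.err.println("warm-up failed: " + e);
            }
        }, period, period, unit);
        return scheduler;
    }
}
```

For the setup in this answer the call would be something like KeepWarm.start(task, 1, TimeUnit.MINUTES), where task exercises the services that must stay responsive.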

How to define host of some strange java-process?

Good day everybody!
I'm developing a Java GWT web application. Yesterday it was working fine: the task manager showed the NetBeans process and ONE java process, which was definitely Tomcat. But today I see the NetBeans process, Tomcat's java process and some unknown java process that causes a Java heap space error. This strange process eats a lot of memory, and its memory consumption grows dramatically over time.
Possibly useful information: the only thing I changed in my app was dropping the database and recreating it from a backup. I suspected the JDBC driver couldn't connect to the DB because of incorrect user privileges, but that is not the problem: queries execute successfully, yet the strange java process still exists.
Question: How do I identify the owner of this unknown java process? Which application (NetBeans, Tomcat or something else) creates it?
On a Unix platform, ps has several options that show more than just the process name ("java") - e.g. on Linux try ps ax | grep java and you'll see the whole command line that was used to start the java process. It's easy to determine from there what process is running and what they're supposed to do.
On Windows you'll have to find an equivalent - if you're lucky the user executing the process will help you as well - e.g. if it's you or SYSTEM (for services), but the full commandline definitely beats it.
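On Java 9 and later (which postdates this question), the ProcessHandle API offers a platform-independent way to get similar information from inside a JVM. The sketch below lists processes whose executable path contains "java", together with their parent PID, which hints at who started them. JavaProcessLister and describeJavaProcesses are hypothetical names, and on some platforms the full command line is not exposed, in which case only the executable path is shown.

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch (hypothetical names): lists visible java processes with their parent PID.
class JavaProcessLister {

    static List<String> describeJavaProcesses() {
        return ProcessHandle.allProcesses()
                // Keep only processes whose executable path mentions "java".
                .filter(p -> p.info().command()
                        .map(c -> c.toLowerCase().contains("java"))
                        .orElse(false))
                .map(p -> String.format("pid=%d parent=%s cmd=%s",
                        p.pid(),
                        p.parent().map(pp -> String.valueOf(pp.pid())).orElse("?"),
                        // Fall back to the executable path if the full command line is hidden.
                        p.info().commandLine().orElse(p.info().command().orElse("?"))))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        describeJavaProcesses().forEach(System.out::println);
    }
}
```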
OK, I found the reason: I was selecting a lot of data from my DB. It seems the JDBC driver kept loading incoming data until memory ran out.

Invalid memory access of location in Java

I've been working on a Java project for a year. My code had been working fine for months. A few days ago I upgraded the Java SDK to the newest version 1.6.0_26 on my Mac (Snow Leopard 10.6.8). After the upgrade, something very weird happens. When I run some of the classes, I get this error:
Invalid memory access of location 0x202 rip=0x202
But, if I run them with -Xint (interpreted) they work, slow but work fine. I get that problem in classes where I use bitwise operators (bitboards for the game Othello). I can't put any code here because I don't get an error, exception or something similar. I just get that annoying message.
Is it normal that the code doesn't run without -Xint but it works with it? What should I do?
Thanks in advance
When a JVM starts crashing like that, it is a sign that something has broken the JVM's execution model.
Does your application include any native code? Does it use any 3rd-party libraries with native code components? If neither is true, then the chances are that this is a bug in the Apple port of the JVM. It could be a JIT compiler bug, or a bug in some JVM native code library.
What can you do about a bug like that?
Not a lot.
Reduce your application by progressively chopping out bits until you have a small testcase that exhibits the problem.
Based on the testcase, see if there's some empirical way to avoid the problem.
Submit a bug report to Apple with the testcase.
I just came across this situation and it turned out to be related to a piece of code that was serializing a JSON object with a cyclic reference to itself. I removed the cycle and the error went away. I suspect this is related to a memory overflow error that is now handled differently by newer JVMs on Mac OSX. In this case, I was running Mac OSX 10.7.
For completeness the errors I was receiving were:
Invalid access of stack red zone 0x10e586d30 rip=0x10daabba6
Bus error: 10
And:
Invalid memory access of location 0x10b655890 rip=0x10a8baba6
Segmentation fault: 11
Also verify that you are building the GUI on the event dispatch thread and never updating a GUI component from any other thread.
Related errors are notoriously hard to reproduce, but the change associated with altered timing is suggestive.
Please check whether /etc/hosts is empty and verify that it includes these entries:
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
fe80::1%lo0 localhost

Java Multithreaded Socket Server hangs after getting ~ 50 simultaneous Connections

So basically the problem is described in the title.
The server works in the following way:
Listens to a new connection
Once connection is requested - adds the request to the Q,
Continues listening to a new connection
Separate process takes care of a Q and spawns a new thread to deal with the clients' requests.
The server code is similar to this tutorial (everything is in try / catch; unfortunately I can't show the source code because of company policy).
It seems to work very well until the number of clients exceeds ~50. Then it just hangs with no exceptions / warnings / etc. There is a CPU thread limit of 32k and no limit on the number of open files / open sockets / etc. OS = CentOS 5.5 (the same seems to happen on Ubuntu, though). The server logs data to MySQL using ODBC. Separate stress tests of both showed that I can have up to 32k java processes (limited by /proc/sys/kernel/threads-max) and MySQL can perform up to 20k simple operations / second, so I'm assuming the problem is with the sockets.
So the question really is:
What is the limiting factor in socket connections and how can I make it bigger?
OR am I looking in the wrong place?
The chances are that you have induced a deadlock somewhere in the code. The key indicator here is whether, by 'hang', you mean that the server's CPU usage drops to nothing and no further activity is seen in the server.
When the server hangs, run the JDK tool jstack against its process. This should show you what is waiting on what lock. The toolkit also includes jvisualvm, and on a Unix box a simple kill -3 pid will make the JVM write a thread dump to its standard output.
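If attaching external tools to the production box is awkward, the same deadlock check can be done from inside the JVM with the standard java.lang.management API, for example from a watchdog thread or an admin endpoint. This is only a sketch; DeadlockCheck and findDeadlocks are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Sketch (hypothetical names): reports threads that are deadlocked on
// monitors or ownable synchronizers such as ReentrantLock.
class DeadlockCheck {

    static List<String> findDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when there is no deadlock
        List<String> report = new ArrayList<>();
        if (ids == null) {
            return report;
        }
        for (ThreadInfo info : threads.getThreadInfo(ids, true, true)) {
            if (info == null) {
                continue; // thread ended between the two calls
            }
            report.add(info.getThreadName() + " waits for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> deadlocks = findDeadlocks();
        System.out.println(deadlocks.isEmpty() ? "no deadlock detected" : deadlocks);
    }
}
```

An empty result does not rule out other kinds of hangs, such as every worker thread blocking on a socket read or on an exhausted connection pool; a full thread dump is still needed for those.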
Without the code, or at least a reproducible sample, I'm afraid I can't help much more. One thing you might want to look at is using Jetty as your embedded server instead of a hand-rolled one; they have already been through the deadlock/threading pain so you don't have to.
I don't know whether this will help, or whether you are already using it, but try running your socket server with the java switch "-server", which selects the Java HotSpot Server VM. The -server switch turns on the optimizing JIT along with a few other "server-class" settings. Generally you get the best performance out of this setting. The default VM is -client.
Also check your other parameters, so that your socket server doesn't run with minimal resources.
Have a nice day

Stored proc running 30% slower through Java versus running directly on database

I'm using Java 1.6, JTDS 1.2.2 (also just tried 1.2.4 to no avail) and SQL Server 2005 to create a CallableStatement to run a stored procedure (with no parameters). I am seeing the Java wrapper running the same stored procedure 30% slower than using SQL Server Management Studio. I've run the MS SQL profiler and there is little difference in I/O between the two processes, so I don't think it's related to query plan caching.
The stored proc takes no arguments and returns no data. It uses a server-side cursor to calculate the values that are needed to populate a table.
I can't see how the calling a stored proc from Java should add a 30% overhead, surely it's just a pipe to the database that SQL is sent down and then the database executes it....Could the database be giving the Java app a different query plan??
I've posted to both the MSDN forums, and the sourceforge JTDS forums (topic: "stored proc slower in JTDS than direct in DB") I was wondering if anyone has any suggestions as to why this might be happening?
Thanks in advance,
-James
(N.B. Fear not, I will collate any answers I get in other forums together here once I find the solution)
Java code snippet:
sLogger.info("Preparing call...");
stmt = mCon.prepareCall("SP_WB200_POPULATE_TABLE_limited_rows");
sLogger.info("Call prepared. Executing procedure...");
stmt.executeQuery();
sLogger.info("Procedure complete.");
I have run sql profiler, and found the following:
Java app :
CPU: 466,514 Reads: 142,478,387 Writes: 284,078 Duration: 983,796
SSMS :
CPU: 466,973 Reads: 142,440,401 Writes: 280,244 Duration: 769,851
(Both with DBCC DROPCLEANBUFFERS run prior to profiling, and both produce the correct number of rows)
So my conclusion is that they both execute the same reads and writes, it's just that the way they are doing it is different, what do you guys think?
It turns out that the query plans are significantly different for the different clients (the Java client is updating an index during an insert that isn't in the faster SQL client, also, the way it is executing joins is different (nested loops Vs. gather streams, nested loops Vs index scans, argh!)). Quite why this is, I don't know yet (I'll re-post when I do get to the bottom of it)
Epilogue
I couldn't get this to work properly. I tried homogenising the connection properties (arithabort, ansi_nulls etc) between the Java and Mgmt studio clients. It ended up the two different clients had very similar query/execution plans (but still with different actual plan_ids). I posted a summary of what I found to the MSDN SQL Server forums as I found differing performance not just between a JDBC client and management studio, but also between Microsoft's own command line client, SQLCMD, I also checked some more radical things like network traffic too, or wrapping the stored proc inside another stored proc, just for grins.
I have a feeling the problem lies somewhere in the way the cursor was being executed, and it was somehow giving rise to the Java process being suspended, but why a different client should give rise to this different locking/waiting behaviour when nothing else is running and the same execution plan is in operation is a little beyond my skills (I'm no DBA!).
As a result, I have decided that 4 days is enough of anyone's time to waste on something like this, so I will grudgingly code around it (if I'm honest, the stored procedure needed re-coding to be more incremental instead of re-calculating all data each week anyway), and chalk this one down to experience. I'll leave the question open. Big thanks to everyone who put their hat in the ring, it was all useful, and if anyone comes up with anything further, I'd love to hear some more options. And if anyone finds this post as a result of seeing this behaviour in their own environments, then hopefully there are some pointers here that you can try yourself, and hopefully you will see further than we did.
I'm ready for my weekend now!
-James
You can attach the Profiler and monitor for the events SQL:BatchCompleted and SP:Completed, with a filter on duration > 1000. Run the procedure from your Java client and from SSMS. Compare the Reads and the Writes of the two events (Java vs. SSMS). Are they significantly different? This would indicate considerably different execution paths or plans, with significant difference in I/O.
Also try to capture the Showplan XML event of the two and compare the plans (save the event as a .sqlplan file and open it in SSMS for easy analysis). Do they have similar plans? Are there wild differences in Estimate vs. Actual (rows, rewinds, rebinds)? Do they have the same degree of parallelism? The plans can also be retrieved from the sys.dm_exec_requests view.
Are there any warning events raised, like Missing Column Statistics, Sort Warnings, Hash Warning, Execution Warnings, Blocked Process?
The point is that you have at your disposal a whole arsenal of investigation tools. Once you find the root cause of the difference, you can trace it down to what is different between your Java environment settings and the SSMS environment (ADO.Net SqlClient): things like the default transaction isolation level, ANSI settings and so on.
Checking: Is your problem that two applications (SSMS, Java) are making the exact same identical call to SQL Server, and SQL Server is acting differently for each? If so, I hit things like this every year or two, and they hurt my brain for days.
Once, I ultimately isolated each process call and logged everything for the entire process in Profiler. I eventually noticed that the Login event (under TextData) showed a host of information, like so:
-- network protocol: TCP/IP
set quoted_identifier on
set arithabort off
set numeric_roundabort off
set ansi_warnings on
set ansi_padding on
set ansi_nulls on
set concat_null_yields_null on
set cursor_close_on_commit off
set implicit_transactions off
set language us_english
set dateformat mdy
set datefirst 7
set transaction isolation level read committed
The "Existing Connection" event will show this information as well--but, sometimes immediately subsequent calls (batches, RPCs, I disremember just now) are sent [ISQL or OSQL did this, I think] to immediately reset some of these -- Arithabort and Quoted_Identifier seem to be favorites, and other SET options also get modified depending on the settings or requirements of whatever connectivity protocols your application's database interface is using.
Another one: some settings are kept as attributes of a procedure at "create" time, and others are factored in at compile time. On the one hand, your connection's SET values may be being overwritten by the configuration saved at the time the procedure was created; on the other hand, your two connections may differ so much that two execution plans are generated for one procedure. (All of this information is, after sufficient research, available in the sys. tables and DMVs.)
In short, it seems to me that SQL obscurities are messing you up. To this day, I loathe all these goombah settings. Things below my notice keep messing around with them [I mean, really, what fool would set implicit_transaction for a connection pool on? But once they did...] and it's hard to build structures when the ground (rules) keep changing out from underneath you. After all, remember what the guy said about building castles in a swamp...
I recall having a similar issue a while ago, because JTDS was silently converting a string parameter to Unicode or something similar. As a result of that conversion, SQL Server was unable to use the index which it was using when we ran the stored proc from SSMS.
HIH
Does the Java case include transmission of the results to the Java server (network overhead) plus some Java processing? A 12 minute query might produce quite a large amount of data.
If you are looking at the profiler and there is no difference between the executions then the difference must be with the client systems.
4 mins does seem like too long just to prepare a statement to send, so the 12 min wait must be caused by some other effect. I have no idea what it is.
I am not sure if this post is still relevant. We faced a similar problem in our application.
One key difference between running a stored procedure in SQL Management studio and one running from JDBC is that of transaction context. If you are using an ORM in Java, by default the stored procedure runs in a transaction context. When you run a stored procedure directly in SQL management studio the transaction is off. There is a substantial performance difference.
Sorry, I've not found a correct answer to this, so I don't want to allocate any of these as correct, so I am going to mark this answer as correct, and wish anyone luck who comes across anything similar!
Did you know that Microsoft ship JDBC drivers for their databases?
These may be more performant.
Obviously, you may have resolved the problem by now.
