In many try-with-resources examples I have found, Statement and ResultSet are declared separately. As the Java documentation mentions, the close methods of resources are called in the opposite order of their creation.
try (Statement stmt = con.createStatement();
     ResultSet rs = stmt.executeQuery(sql)) {
} catch (Exception e) {
}
But now I have multiple queries in my function.
Can I create the Statement and ResultSet in just one line? My code looks like:
try (ResultSet rs = con.createStatement().executeQuery(sql);
     ResultSet rs2 = con.createStatement().executeQuery(sql2);
     ResultSet rs3 = con.createStatement().executeQuery(sql3)) {
} catch (Exception e) {
}
If I declare them in one line like this, does it still close both the ResultSet and the Statement?
If you have a careful look, you will see that the concept is called try-with-resources.
Note the plural! The whole idea is that you can declare one or more resources in that single statement and the JVM guarantees proper handling.
In other words: when resources belong together semantically, it is good practice to declare them together.
Yes, multiple resources separated by semicolons work exactly as you put it in your question. One caveat, though: try-with-resources only closes the resources you declare in the header. In your code only the three ResultSets are declared, and closing a ResultSet does not close the Statement that produced it, so the inline-created Statements are not released until the Connection itself is closed. If you want deterministic cleanup, declare each Statement alongside its ResultSet.
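The "only declared resources are closed" point can be demonstrated without a database. The sketch below uses a hypothetical Res class standing in for Statement/ResultSet; only the resource variable named in the try header gets closed, while the intermediate object created inline does not:

```java
import java.util.ArrayList;
import java.util.List;

public class DeclaredResourcesDemo {
    static final List<String> closed = new ArrayList<>();

    // Hypothetical resource standing in for Statement/ResultSet.
    static class Res implements AutoCloseable {
        final String name;
        Res(String name) { this.name = name; }
        // Like con.createStatement().executeQuery(...): derives a new resource.
        Res derive(String childName) { return new Res(childName); }
        @Override public void close() { closed.add(name); }
    }

    public static void main(String[] args) {
        // Only 'rs' is a declared resource; the intermediate "stmt" Res is never closed.
        try (Res rs = new Res("stmt").derive("rs")) {
            // use rs
        }
        System.out.println(closed); // prints [rs] -- the inline "stmt" was not auto-closed
    }
}
```

The same mechanics apply to JDBC: the inline Statement is only reclaimed later, when the Connection closes.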
You may declare one or more resources in a try-with-resources statement. The following example retrieves the names of the files packaged in the zip file zipFileName and creates a text file that contains the names of these files:
try (
    java.util.zip.ZipFile zf =
        new java.util.zip.ZipFile(zipFileName);
    java.io.BufferedWriter writer =
        java.nio.file.Files.newBufferedWriter(outputFilePath, charset)
) {
    // Enumerate each entry
    for (java.util.Enumeration entries =
            zf.entries(); entries.hasMoreElements();) {
        // Get the entry name and write it to the output file
        String newLine = System.getProperty("line.separator");
        String zipEntryName =
            ((java.util.zip.ZipEntry)entries.nextElement()).getName() +
            newLine;
        writer.write(zipEntryName, 0, zipEntryName.length());
    }
}
https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html
ResultSet implements AutoCloseable, which means try-with-resources will also close it when the block finishes.
https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html
Related
I am looking at examples of try-with-resources in Java and I understand the following one:
try (Connection conn = DriverManager.getConnection(url, user, pwd);
     Statement stmt = conn.createStatement();
     ResultSet rs = stmt.executeQuery(query)) {
    ...
}
So, the order of closing is:
rs.close();
stmt.close();
conn.close();
which is perfect because a connection has a statement and a statement has a result set.
However, in the following examples, I think the closing order is the reverse of what is expected:
Example 1:
try (FileReader fr = new FileReader(file);
BufferedReader br = new BufferedReader(fr)) {
...
}
The order of closing is:
br.close();
fr.close();
Example 2:
try (FileOutputStream fos = new FileOutputStream("testSer.ser");
     ObjectOutputStream oos = new ObjectOutputStream(fos)) {
    ...
}
The order of closing is:
oos.close();
fos.close();
Are these examples correct? I think the closing order in those examples should be different because:
In example 1, a BufferedReader has a FileReader.
In example 2, an ObjectOutputStream has a FileOutputStream.
The ordering is the same: it is always the reverse of the order in which the resources are specified. From the JLS:
Resources are closed in the reverse order from that in which they were initialized.
However, if the later-specified resources themselves invoke the close() method of the earlier-specified resources (as is the case with BufferedReader and ObjectOutputStream), it may look like they are not happening in the expected order (and that close() will be invoked multiple times).
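The reverse closing order is easy to observe with a tracking resource. This is a minimal sketch (the Tracked class is illustrative, not part of any library): the outer resource plays the role of FileReader and the inner one the BufferedReader wrapping it.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOrderDemo {
    static final List<String> closeOrder = new ArrayList<>();

    // Hypothetical resource that records when it is closed.
    static class Tracked implements AutoCloseable {
        final String name;
        Tracked(String name) { this.name = name; }
        @Override public void close() { closeOrder.add(name); }
    }

    public static void main(String[] args) {
        try (Tracked outer = new Tracked("outer");   // e.g. the FileReader
             Tracked inner = new Tracked("inner")) { // e.g. the BufferedReader
            // use the resources
        }
        System.out.println(closeOrder); // prints [inner, outer]: reverse of declaration order
    }
}
```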
I think I wasn't right when I said "a connection has a statement and a statement has a result set". Maybe it's the opposite "a result set has a statement and a statement has a connection" or at least "a result set was created by a statement and a statement was created by a connection".
So I think:
try (Parent parent = new Parent();
     Child child = parent.createChild()) {
    ...
}
Is equivalent to:
try (Parent parent = new Parent();
     Child child = new Child(parent)) {
    ...
}
I have a few tables with a large amount of data (about 100 million records). I can't hold this data in memory, but I would like to stream the result set using the java.util.stream API and pass the stream to another class. I read about the Stream.of and Stream.Builder operators, but they buffer the stream in memory. Is there any way to resolve this?
UPDATE #1
Okay, I googled and found the jOOQ library. I'm not sure, but it looks like it could be applicable to my use case. To summarize: I have a few tables with large amounts of data, and I would like to stream my result set and pass the stream to another method. Something like this:
// why return Stream<String>? Because my result set has String type
private Stream<Record> writeTableToStream(DataSource dataSource, String table) {
    Stream<Record> record = null;
    try (Connection connection = dataSource.getConnection()) {
        String sql = "select * from " + table;
        try (PreparedStatement pSt = connection.prepareStatement(sql)) {
            connection.setAutoCommit(false);
            pSt.setFetchSize(5000);
            ResultSet resultSet = pSt.executeQuery();
            record = DSL.using(connection)
                        .fetch(resultSet).stream();
        }
    } catch (SQLException sqlEx) {
        logger.error(sqlEx);
    }
    return record;
}
Could someone please advise whether I am on the right track? Thanks.
UPDATE #2
I experimented with jOOQ and can now say that the above approach is not suitable for me. The code record = DSL.using(connection).fetch(resultSet).stream(); takes too much time.
The first thing you have to understand is that code like
try (Connection connection = dataSource.getConnection()) {
    …
    try (PreparedStatement pSt = connection.prepareStatement(sql)) {
        …
        return stream;
    }
}
does not work: by the time you leave the try blocks, the resources are already closed, while the processing of the Stream hasn't even started.
The resource management construct “try with resources” works for resources used within a block scope inside a method but you are creating a factory method returning a resource. Therefore you have to ensure that the closing of the returned stream will close the resources and the caller is responsible for closing the Stream.
Further, you need a function which produces an item out of a single line from the ResultSet. Supposing, you have a method like
Record createRecord(ResultSet rs) {
…
}
you may create a Stream<Record> basically like
Stream<Record> stream = StreamSupport.stream(new Spliterators.AbstractSpliterator<Record>(
        Long.MAX_VALUE, Spliterator.ORDERED) {
    @Override
    public boolean tryAdvance(Consumer<? super Record> action) {
        if (!resultSet.next()) return false;
        action.accept(createRecord(resultSet));
        return true;
    }
}, false);
But to do it correctly you have to incorporate the exception handling and closing of resources. You can use Stream.onClose to register an action that will be performed when the Stream gets closed, but it has to be a Runnable which can not throw checked exceptions. Similarly the tryAdvance method is not allowed to throw checked exceptions. And since we can’t simply nest try(…) blocks here, the program logic of suppression exceptions thrown in close, when there is already a pending exception, doesn’t come for free.
To help us here, we introduce a new type which can wrap closing operations which may throw checked exceptions and deliver them wrapped in an unchecked exception. By implementing AutoCloseable itself, it can utilize the try(…) construct to chain close operations safely:
interface UncheckedCloseable extends Runnable, AutoCloseable {
    default void run() {
        try { close(); } catch (Exception ex) { throw new RuntimeException(ex); }
    }
    static UncheckedCloseable wrap(AutoCloseable c) {
        return c::close;
    }
    default UncheckedCloseable nest(AutoCloseable c) {
        return () -> { try (UncheckedCloseable c1 = this) { c.close(); } };
    }
}
With this, the entire operation becomes:
private Stream<Record> tableAsStream(DataSource dataSource, String table)
        throws SQLException {
    UncheckedCloseable close = null;
    try {
        Connection connection = dataSource.getConnection();
        close = UncheckedCloseable.wrap(connection);
        String sql = "select * from " + table;
        PreparedStatement pSt = connection.prepareStatement(sql);
        close = close.nest(pSt);
        connection.setAutoCommit(false);
        pSt.setFetchSize(5000);
        ResultSet resultSet = pSt.executeQuery();
        close = close.nest(resultSet);
        return StreamSupport.stream(new Spliterators.AbstractSpliterator<Record>(
                Long.MAX_VALUE, Spliterator.ORDERED) {
            @Override
            public boolean tryAdvance(Consumer<? super Record> action) {
                try {
                    if (!resultSet.next()) return false;
                    action.accept(createRecord(resultSet));
                    return true;
                } catch (SQLException ex) {
                    throw new RuntimeException(ex);
                }
            }
        }, false).onClose(close);
    } catch (SQLException sqlEx) {
        if (close != null)
            try { close.close(); } catch (Exception ex) { sqlEx.addSuppressed(ex); }
        throw sqlEx;
    }
}
This method wraps the necessary close operation for all resources, Connection, Statement and ResultSet within one instance of the utility class described above. If an exception happens during the initialization, the close operation is performed immediately and the exception is delivered to the caller. If the stream construction succeeds, the close operation is registered via onClose.
Therefore the caller has to ensure proper closing like
try(Stream<Record> s=tableAsStream(dataSource, table)) {
// stream operation
}
Note also that delivery of an SQLException wrapped in a RuntimeException has been added to the tryAdvance method. Therefore you may now add throws SQLException to the createRecord method without problems.
jOOQ
I'm going to answer the jOOQ part of your question. As of jOOQ 3.8, there have now been quite a few additional features related to combining jOOQ with Stream. Other usages are also documented on this jOOQ page.
Your suggested usage:
You tried this:
Stream<Record> stream = DSL.using(connection).fetch(resultSet).stream();
Indeed, this doesn't work well for large result sets because fetch(ResultSet) fetches the entire result set into memory and then calls Collection.stream() on it.
Better (lazy) usage:
Instead, you could write this:
try (Stream<Record> stream = DSL.using(connection).fetchStream(resultSet)) {
...
}
... which is essentially convenience for this:
try (Cursor<Record> cursor = DSL.using(connection).fetchLazy(resultSet)) {
Stream<Record> stream = cursor.stream();
...
}
See also DSLContext.fetchStream(ResultSet)
Of course, you could also let jOOQ execute your SQL string, rather than wrestling with JDBC:
try (Stream<Record> stream =
DSL.using(dataSource)
.resultQuery("select * from {0}", DSL.name(table)) // Prevent SQL injection
.fetchSize(5000)
.fetchStream()) {
...
}
The dreaded SELECT *
As was criticised in the comments, their jOOQ usage seemed slow because of how jOOQ eagerly fetches LOB data into memory despite using fetchLazy(). The word "lazy" corresponds to fetching records lazily (one by one), not fetching column data lazily. A record is completely fetched in one go, assuming you actually want to project the entire row.
If you don't need some heavy columns, don't project them! SELECT * is almost always a bad idea in SQL. Drawbacks:
It causes a lot more I/O and memory overhead in the database server, the network, and the client.
It prevents covering index usage.
It prevents join elimination transformations.
More info in this blog post here.
On try-with-resources usage
Do note that a Stream produced by jOOQ is "resourceful", i.e. it contains a reference to an open ResultSet (and PreparedStatement). So, if you really want to return that stream outside of your method, make sure it is closed properly!
I'm not aware of any well-known library that will do it for you.
That said, this article shows how to wrap the resultset with an Iterator (ResultSetIterator) and pass it as the first parameter to Spliterators.spliteratorUnknownSize() in order to create a Spliterator.
The Spliterator can then be used by StreamSupport in order to create a Stream on top of it.
Their suggested implementation of ResultSetIterator class:
public class ResultSetIterator implements Iterator<Tuple> {

    private ResultSet rs;
    private PreparedStatement ps;
    private Connection connection;
    private String sql;

    public ResultSetIterator(Connection connection, String sql) {
        assert connection != null;
        assert sql != null;
        this.connection = connection;
        this.sql = sql;
    }

    public void init() {
        try {
            ps = connection.prepareStatement(sql);
            rs = ps.executeQuery();
        } catch (SQLException e) {
            close();
            throw new DataAccessException(e);
        }
    }

    @Override
    public boolean hasNext() {
        if (ps == null) {
            init();
        }
        try {
            boolean hasMore = rs.next();
            if (!hasMore) {
                close();
            }
            return hasMore;
        } catch (SQLException e) {
            close();
            throw new DataAccessException(e);
        }
    }

    private void close() {
        try {
            if (rs != null) rs.close();
        } catch (SQLException e) {
            // nothing we can do here
        }
        try {
            if (ps != null) ps.close();
        } catch (SQLException e) {
            // nothing we can do here
        }
    }

    @Override
    public Tuple next() {
        try {
            return SQL.rowAsTuple(sql, rs);
        } catch (DataAccessException e) {
            close();
            throw e;
        }
    }
}
and then:
public static Stream<Tuple> stream(final Connection connection,
                                   final String sql,
                                   final Object... parms) {
    return StreamSupport
            .stream(Spliterators.spliteratorUnknownSize(
                    new ResultSetIterator(connection, sql), 0), false);
}
Here is the simplest sample by abacus-jdbc.
final DataSource ds = JdbcUtil.createDataSource(url, user, password);
final SQLExecutor sqlExecutor = new SQLExecutor(ds);
sqlExecutor.stream(sql, parameters).filter(...).map(...).collect(...) // lazy execution&loading and auto-close Statement/Connection
Or:
JdbcUtil.prepareQuery(ds, sql)
.stream(ResultRecord.class) // or RowMapper.MAP/...
.filter(...).map(...).collect(...) // lazy execution&loading and auto-close Statement/Connection
This is totally lazy loading with auto-closure. The records will be loaded from the DB in batches of the fetch size (a default if not specified), and the Statement and Connection will be closed automatically after the results/records are collected.
Disclosure: I'm the developer of AbacusUtil.
Using my library it would be done like this:
attach maven dependency:
<dependency>
<groupId>com.github.buckelieg</groupId>
<artifactId>db-fn</artifactId>
<version>0.3.4</version>
</dependency>
use library in code:
Function<Stream<I>, O> processor = stream -> ...; // process the input stream
try (DB db = new DB("jdbc:postgresql://host:port/database?user=user&password=pass")) {
    processor.apply(
            db.select("SELECT * FROM my_table t1 JOIN my_table t2 ON t1.id = t2.id")
              .fetchSize(5000)
              .execute(rs -> /* ResultSet mapper */)
    );
}
See more here
The common Tools module of the Ujorm framework offers a simple solution using the RowIterator class.
Example of use:
PreparedStatement ps = dbConnection.prepareStatement("SELECT * FROM myTable");
new RowIterator(ps).toStream().forEach((RsConsumer)(resultSet) -> {
    int value = resultSet.getInt(1);
});
Maven dependency on the Tools library (50KB):
<dependency>
<groupId>org.ujorm</groupId>
<artifactId>ujo-tools</artifactId>
<version>1.93</version>
</dependency>
See jUnit test for more information.
I just wrote a summary providing a real example of how to stream a ResultSet and run a simple SQL query without using a 3rd-party library.
click here for detail
Quote from the article: Java 8 provided the Stream family and easy operations on it. The pipeline style of usage makes the code clear and smart.
However, ResultSet still has to be processed in a very legacy way. Given how ResultSet is actually used, it would be really helpful if it could be converted to a Stream.
....
StreamUtils.uncheckedConsumer is required to convert the SQLException to a RuntimeException to keep the lambda clean.
I currently am working on a project that does a lot of work with Database.
One core idiom that I have reused many, many times in my code is the following.
My question is, is there a better way to handle the exceptions at each step of the getTransformedResults method? Is this a proper way of handling the SQLExceptions, or is there a better, more concise way of doing this?
Thanks for your input!
public ResultType handleResultSet(ResultSet rs);
public ResultType getTransformedResults(String query) throws SQLException {
    ResultType resultObj = new ResultType();

    Connection connection = null;
    try {
        connection = dataSource.getConnection();
    } catch (SQLException sqle) {
        // cleanup
        throw sqle;
    }

    Statement stmt = null;
    try {
        stmt = connection.createStatement();
    } catch (SQLException sqle) {
        try { connection.close(); } catch (SQLException dontCare) {}
        // cleanup
        throw sqle;
    }

    ResultSet rs = null;
    try {
        rs = stmt.executeQuery(query);
        resultObj = handleResultSet(rs);
    } catch (SQLException sqle) {
        // cleanup
        throw sqle;
    } finally {
        if (rs != null) try { rs.close(); } catch (SQLException dontCare) {}
        try { stmt.close(); } catch (SQLException dontCare) {}
        try { connection.close(); } catch (SQLException dontCare) {}
    }
    return resultObj;
}
Java 7 has some constructs you might appreciate; in particular, you can use try/finally without a catch (which mimics your catch-and-rethrow).
Also, since you've caught and handled the SQLException, perhaps you should re-throw it as something else, such as a runtime exception; this makes it easier to catch all runtime exceptions at a primary entry point rather than having to deal with checked exceptions every single time you access the DB.
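The rethrow-as-runtime idea can be sketched with a small unchecked wrapper (the class name DataAccessException is illustrative, not a reference to any particular library):

```java
import java.sql.SQLException;

// A hypothetical unchecked wrapper, so callers need not declare SQLException everywhere.
class DataAccessException extends RuntimeException {
    DataAccessException(SQLException cause) {
        super(cause);
    }
}

// Usage sketch inside a data-access method:
//     catch (SQLException sqle) {
//         throw new DataAccessException(sqle); // original exception preserved as the cause
//     }
```

A single catch for DataAccessException at the application's entry point then covers every DB call.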
Personally I might handle this by passing in an interface implementation rather than subclassing.
Ultimately, if you're only handling the exceptions in that method, and not polluting the mainline code, what else can you really do, and what would be the point of doing it? You might make each step a bit more granular so it's not all in one method, but other than that...
You might consider an application-specific exception, which may make testing and configuration cleaner, but that depends on context.
Clarification of interface idea
Instead of subclassing you'd have an interface that implemented the handling of result sets and query string retrieval, so two methods--one for the query, one for the results.
You'd pass an implementation to an instance of mostly what you have now, but it takes the interface instead of a query string. The rest of the code is essentially identical, but it gets the query string from the interface impl, and calls the interface impl's result handling method, saving the result until the cleanup.
It's essentially the same as you have now, but IMO cleaner since any class could implement the interface, including anonymous classes, or other classes in your domain.
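The two-method interface described above might look like this (names are illustrative; adapt them to your domain):

```java
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical interface: one method supplies the query, the other handles the results.
interface QueryHandler<T> {
    String query();
    T handle(ResultSet rs) throws SQLException;
}

// The template method then takes the interface instead of a query string (sketch):
//     public <T> T getTransformedResults(QueryHandler<T> handler) throws SQLException {
//         ...
//         ResultSet rs = stmt.executeQuery(handler.query());
//         return handler.handle(rs);
//         ... // cleanup in finally, as before
//     }
```

Any class, including an anonymous one, can implement QueryHandler, which keeps the JDBC plumbing in one place.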
You may be interested in using Apache Commons DbUtils which is aimed exactly at such purposes.
It has some drawbacks when trying to use more sophisticated JDBC but for regular usage it should be more than enough.
Besides that, your code contains too many try/catch blocks and can be simplified to something like the following:
public interface ResultSetHandler<ResultType> {
ResultType handleResultSet(ResultSet rs);
}
public <ResultType> ResultType getTransformedResults(String query, ResultSetHandler<ResultType> rsh) throws SQLException {
    Connection connection = null;
    Statement stmt = null;
    try {
        connection = dataSource.getConnection();
        stmt = connection.createStatement();
        ResultSet rs = stmt.executeQuery(query);
        return rsh.handleResultSet(rs);
    } catch (SQLException sqle) {
        // cleanup
        throw sqle;
    } finally {
        if (stmt != null) {
            stmt.close(); // also closes the ResultSet
        }
        if (connection != null) {
            connection.close();
        }
    }
}
Though Apache Commons DbUtils library does exactly the same under the hood.
org.springframework.jdbc.core.JdbcTemplate - "...simplifies the use of JDBC and helps to avoid common errors."
Connection c = null;
Statement s = null;
ResultSet r = null;
try {
    c = datasource.getConnection();
    s = c.createStatement();
    r = s.executeQuery(sql);
    rsh.handleResultSet(r);
}
finally {
    DbUtils.closeQuietly(r);
    DbUtils.closeQuietly(s);
    DbUtils.closeQuietly(c);
}
Note that DbUtils is Apache Commons DbUtils, and closeQuietly is equivalent to:
try {
c.close();
}
catch (SQLException e) {
}
That all being said, I'd recommend using Spring's JDBC features:
JdbcTemplate template = new JdbcTemplate(dataSource);
List data = template.query(sql, new RowMapper() { ... });
The RowMapper is an interface whose implementation has the job of converting the current position in the resultset to an object. So by simply giving it the logic of what to do with one row, you automatically collect the list of the objects for all rows in these two lines of code plus whatever it takes to map the row. There's other methods which let you work with the ResultSet in different ways, but this is a pretty standard way in which people use it.
All the connection and statement management is done for you, and you don't have to worry about resource management at all.
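The per-row callback idea behind RowMapper can be sketched without Spring. In this toy version (names are illustrative, not Spring's API), a List stands in for the ResultSet cursor so the sketch stays runnable; the point is that the caller supplies only the row conversion while iteration and collection are centralized:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for Spring's RowMapper: convert one row plus its index to an object.
interface RowMapper<R, T> {
    T mapRow(R row, int rowNum);
}

class MiniTemplate {
    // Iterates the "cursor" and applies the mapper to each row; the caller
    // never touches iteration or resource management.
    static <R, T> List<T> query(List<R> rows, RowMapper<R, T> mapper) {
        List<T> results = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            results.add(mapper.mapRow(rows.get(i), i));
        }
        return results;
    }
}
```

With the real JdbcTemplate the mechanics are the same, except the template also opens and closes the Connection, Statement, and ResultSet around the loop.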
I am currently writing a Java program which loops through a folder of around 4000 XML files.
Using a for loop, it extracts the XML from each file, assigns it to a String 'xmlContent', and uses the PreparedStatement method setString(2,xmlContent) to insert the String into a table stored in my SQL Server.
The column '2' is a column called 'Data' of type XML.
The process works, but it is slow. It inserts about 50 rows into the table every 7 seconds.
Does anyone have any ideas as to how I could speed up this process?
Code:
{ ...declaration, connection etc etc
    PreparedStatement ps = con.prepareStatement("INSERT INTO Table(ID,Data) VALUES(?,?)");
    for (File current : folder.listFiles()){
        if (current.isFile()){
            xmlContent = fileRead(current.getAbsoluteFile());
            ps.setString(1, current.getAbsolutePath());
            ps.setString(2, xmlContent);
            ps.addBatch();
            if (++count % batchSize == 0){
                ps.executeBatch();
            }
        }
    }
    ps.executeBatch(); // performs insertion of leftover rows
    ps.close();
}
private static String fileRead(File file) throws IOException {
    StringBuilder xmlContent = new StringBuilder();
    try (BufferedReader br = new BufferedReader(new FileReader(file))) {
        br.readLine(); // skips the encoding line; we don't need it and it causes problems
        String strLine;
        while ((strLine = br.readLine()) != null){
            xmlContent.append(strLine);
        }
    }
    return xmlContent.toString();
}
Just from a little reading and a quick test - it looks like you can get a decent speedup by turning off autoCommit on your connection. All of the batch query tutorials I see recommend it as well. Such as http://www.tutorialspoint.com/jdbc/jdbc-batch-processing.htm
Turn it off - and then drop an explicit commit where you want (at the end of each batch, at the end of the whole function, etc).
conn.setAutoCommit(false);
PreparedStatement ps = // ... rest of your code

// inside your for loop
if (++count % batchSize == 0) {
    try {
        ps.executeBatch();
        conn.commit();
    } catch (SQLException e) {
        // .. whatever you want to do
        conn.rollback();
    }
}
It's best to make the reads and writes parallel.
Use one thread to read the files and store in a buffer.
Use another thread to read from the buffer and execute queries on database.
You can use more than one thread to write to the database in parallel. That should give you even better performance.
I would suggest you follow this MemoryStreamMultiplexer approach where you can read the XML files in one thread and store in a buffer and then use one or more thread to read from the buffer and execute against database.
http://www.codeproject.com/Articles/345105/Memory-Stream-Multiplexer-write-and-read-from-many
It is a C# implementation, but you get the idea.
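The same producer/consumer pipeline can be sketched in plain Java with a BlockingQueue. The file reading and the database insert are simulated below (class and method names are illustrative); in real code the reader thread would call fileRead and the writer thread would fill and execute the batched PreparedStatement:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PipelineDemo {
    private static final String POISON = "__DONE__"; // end-of-input marker

    public static List<String> run(List<String> files) throws InterruptedException {
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(100);
        List<String> written = Collections.synchronizedList(new ArrayList<>());

        // Producer: reads files into the buffer.
        Thread reader = new Thread(() -> {
            try {
                for (String f : files) buffer.put("content-of-" + f); // simulated file read
                buffer.put(POISON); // signal end of input
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        // Consumer: drains the buffer; stands in for the batched INSERT.
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    String item = buffer.take();
                    if (item.equals(POISON)) break;
                    written.add(item);
                }
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        reader.start(); writer.start();
        reader.join(); writer.join();
        return written;
    }
}
```

With more than one writer thread, each writer needs its own Connection/PreparedStatement, and one poison pill per writer must be enqueued.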
My application has a memory leak resulting from my usage of JDBC. I have verified this by looking at a visual dump of the heap and seeing thousands of instances of ResultSet and associated objects. My question, then, is how do I appropriately manage resources used by JDBC so they can be garbage collected? Do I need to call ".close()" for every statement that is used? Do I need to call ".close()" on the ResultSets themselves?
How would you free the memory used by the call:
ResultSet rs = connection.createStatement().executeQuery("some sql query");
??
I see that there are other, very similar, questions. Apologies if this is redundant, but either I don't quite follow the answers or they don't seem to apply universally. I am trying to achieve an authoritative answer on how to manage memory when using JDBC.
::EDIT:: Adding some code samples
I have a class that is basically a JDBC helper that I use to simplify database interactions, the main two methods are for executing an insert or update, and for executing select statements.
This one for executing insert or update statements:
public int executeCommand(String sqlCommand) throws SQLException {
    if (connection == null || connection.isClosed()) {
        sqlConnect();
    }
    Statement st = connection.createStatement();
    int ret = st.executeUpdate(sqlCommand);
    st.close();
    return ret;
}
And this one for returning ResultSets from a select:
public ResultSet executeSelect(String select) throws SQLException {
    if (connection == null || connection.isClosed()) {
        sqlConnect();
    }
    ResultSet rs = connection.createStatement().executeQuery(select);
    return rs;
}
After using the executeSelect() method, I always call resultset.getStatement().close()
Examining a heap dump with object allocation tracing on shows statements still being held onto from both of those methods...
You should close the Statement if you are not going to reuse it. It is usually good form to first close the ResultSet as some implementations did not close the ResultSet automatically (even if they should).
If you are repeating the same queries you should probably use a PreparedStatement to reduce parsing overhead. And if you add parameters to your query, you really should use a PreparedStatement to avoid the risk of SQL injection.
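Why parameterization matters can be shown without a database. This sketch (table and column names are made up) demonstrates how string concatenation lets user input rewrite the statement, whereas a PreparedStatement placeholder keeps the SQL text fixed:

```java
public class InjectionDemo {
    // Concatenation lets user input change the statement text itself:
    static String concatenated(String userInput) {
        return "SELECT * FROM users WHERE name = '" + userInput + "'";
    }

    public static void main(String[] args) {
        String evil = "x' OR '1'='1";
        System.out.println(concatenated(evil));
        // prints: SELECT * FROM users WHERE name = 'x' OR '1'='1'
        // The condition is now always true -- the input altered the query logic.

        // With a PreparedStatement the SQL text stays fixed and the value is bound:
        //     PreparedStatement ps = conn.prepareStatement(
        //             "SELECT * FROM users WHERE name = ?");
        //     ps.setString(1, evil); // sent as data, never parsed as SQL
    }
}
```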
Yes, ResultSets and Statements should always be closed in a finally block. Using JDBC wrappers such as Spring's JdbcTemplate helps making the code less verbose and close everything for you.
I copied this from a project I have been working on. I am in the process of refactoring it to use Hibernate (from the code it should be clear why!). Using an ORM tool like Hibernate is one way to resolve your issue. Otherwise, here is the way I used normal DAOs to access the data. There is no memory leak in our code, so this may help as a template. Hope it helps; memory leaks are terrible!
@Override
public List<CampaignsDTO> getCampaign(String key) {
    ResultSet resultSet = null;
    PreparedStatement statement = null;
    try {
        statement = connection.prepareStatement(getSQL("CampaignsDAOImpl.getPendingCampaigns"));
        statement.setString(1, key);
        resultSet = statement.executeQuery();
        List<CampaignsDTO> list = new ArrayList<CampaignsDTO>();
        while (resultSet.next()) {
            list.add(new CampaignsDTO(
                    resultSet.getTimestamp(resultSet.findColumn("cmp_name")),
                    ...));
        }
        return list;
    } catch (SQLException e) {
        logger.fatal(LoggerCodes.DATABASE_ERROR, e);
        throw new RuntimeException(e);
    } finally {
        close(statement);
    }
}
The close() method looks like this:
public void close(PreparedStatement statement) {
    try {
        if (statement != null && !statement.isClosed())
            statement.close();
    } catch (SQLException e) {
        logger.debug(LoggerCodes.TRACE, "Warning! PreparedStatement could not be closed.");
    }
}
You should close JDBC statements when you are done. ResultSets should be released when associated statements are closed - but you can do it explicitly if you want.
You need to make sure that you also close all JDBC resources in exception cases.
Use a try-catch-finally block, e.g.:
try {
    conn = dataSource.getConnection();
    stmt = conn.createStatement();
    rs = stmt.executeQuery("select * from sometable");
    stmt.close();
    conn.close();
} catch (Throwable t) {
    // do error handling
} finally {
    try {
        if (stmt != null) {
            stmt.close();
        }
        if (conn != null) {
            conn.close();
        }
    } catch (Exception e) {
    }
}