We all know that it is better to reuse a JDBC PreparedStatement than to create a new instance on every iteration of a loop.
But how to deal with PreparedStatement reuse between different method invocations?
Does the reuse-"rule" still count?
Should I really consider using a field for the PreparedStatement or should I close and re-create the prepared statement in every invocation (keep it local)?
(Of course an instance of such a class would be bound to a Connection which might be a disadvantage in some architectures)
I am aware that the ideal answer might be "it depends".
But I am looking for a best practice that lets less experienced developers make the right choice in most cases.
Of course an instance of such a class would be bound to a Connection which might be a disadvantage
Might be? It would be a huge disadvantage. You'd either need to synchronize access to it, which would kill your multi-user performance stone-dead, or create multiple instances and keep them in a pool. Major pain in the ass.
Statement pooling is the job of the JDBC driver, and most, if not all, of the current crop of drivers do this for you. When you call prepareStatement or prepareCall, the driver will handle re-use of existing resource and pre-compiled statements.
Statement objects are tied to a connection, and connections should be used and returned to the pool as quickly as possible.
In short, the standard practice of obtaining a PreparedStatement at the start of the method, using it repeatedly within a loop, then closing it at the end of the method, is best practice.
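To make that concrete, here is a minimal sketch of the pattern, assuming a pooled DataSource is available. The class name, the method name, and the person table are made up for illustration and come from no particular code base.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;
import javax.sql.DataSource;

final class LocalStatementExample {
    // Hypothetical table and column, used only for illustration.
    static final String INSERT_SQL = "INSERT INTO person (name) VALUES (?)";

    /** Prepares once per call, reuses the statement inside the loop, closes at the end. */
    static int insertNames(DataSource dataSource, List<String> names) throws SQLException {
        int inserted = 0;
        // try-with-resources closes the statement first, then the connection;
        // with a pooled DataSource, closing the connection hands it back to the pool.
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (String name : names) {
                ps.setString(1, name);
                inserted += ps.executeUpdate();
            }
        }
        return inserted;
    }
}
```

Nothing is kept in a field here, so the connection is only held for the duration of the call. Whether the next prepareStatement() is cheap is then up to whatever statement caching the driver or pool provides.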
Many database workloads are CPU-bound, not IO-bound. This means that the database ends up spending more time doing work such as parsing SQL queries and figuring out how to handle them (doing the 'execution plan'), than it spends accessing the disk. This is more true of 'transactional' workloads than 'reporting' workloads, but in both cases the time spent preparing the plan may be more than you expect.
Thus caching PreparedStatements 'between method invocations' is a good idea if the statement is going to be executed frequently and the hassle of making (correct) arrangements for the cache is worth your developer time. As always with performance, measurement is key, but if you can do it cheaply enough, cache your PreparedStatement out of habit.
Some JDBC drivers and/or connection pools offer transparent 'prepared statement caching', so that you don't have to do it yourself. So long as you understand the behaviour of your particular chosen transparent caching strategy, it's fine to let it keep track of things ... what you really want to avoid is the hit on the database.
Yes, it can be reused, but I believe this only counts if the same Connection object is being used. If you are using a database connection pool (from within a web application, for example), then the Connection object will potentially be a different one each time.
I always recreate the PreparedStatement before each use within a Web Application for this reason.
If you aren't using a Connection Pool then you are golden!
I don't see the difference: if I execute the same statement repeatedly against the same connection, why not reuse the PreparedStatement anyway? If multiple methods execute the same statement, then maybe that statement needs to be encapsulated in its own method (or even its own class). That way you wouldn't need to pass around a PreparedStatement.
Let's say we have a class that writes a log message to a database. This class is called from different parts of the code and executes the same INSERT statement again and again. This seems to call for a PreparedStatement.
However, I am wondering what the right usage is. Do I still get the benefit, such as the DBMS reusing the same execution plan each time, even if I create a new PreparedStatement each time the method is called? Or should I keep a PreparedStatement as a class member and never close it, in order to reuse it?
Now, if the only way to benefit from the PreparedStatement in this scenario is to keep it open as a class member, can the same connection have several PreparedStatements (with different queries) open at the same time? What happens when two of these PreparedStatements are executed at the same time? Does the JDBC driver queue their execution?
Thanks in advance,
Dani.
As far as I know and have experienced, statements don't run in parallel on one connection. And as you correctly observed, PreparedStatements are bound to the Connection they were created on.
As you probably don't want to synchronize your logging call (one insert at a time plus locking overhead), you'd have to keep the connections reserved for this logging statement.
But having a dedicated pool for only one statement seems very wasteful, so you don't want to do that either.
So what options are left?
Prepare the statement for every insert. As you'll have I/O operations to send data to the db anyway, the overhead of preparing is relatively small.
Prepare the statement inside your pool when a new connection is created, and build a Map<Connection, PreparedStatement> to look it up later. This makes creating new connections a bit slower but allows you to recycle the statement.
Use some async way to queue your logs (JMS) and do the insert as a batch inside a message-driven bean or similar.
Probably some more options - but that's all I could think of right now.
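To give an idea of the queueing option, here is a rough sketch. Note that it swaps JMS and the message-driven bean for a plain in-process BlockingQueue, which is only an approximation of what is described above. QueuedLogWriter, the app_log table, and its column are hypothetical names.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import javax.sql.DataSource;

final class QueuedLogWriter {
    // Hypothetical table and column names.
    private static final String INSERT_SQL = "INSERT INTO app_log (message) VALUES (?)";

    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final DataSource dataSource;

    QueuedLogWriter(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Safe to call from any thread; it only enqueues and never touches the database. */
    void log(String message) {
        pending.add(message);
    }

    /** Meant to be called by one writer thread; returns how many messages were sent. */
    int flush() throws SQLException {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch);
        if (batch.isEmpty()) {
            return 0;
        }
        // One connection and one prepared statement for the whole batch.
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (String message : batch) {
                ps.setString(1, message);
                ps.addBatch();
            }
            ps.executeBatch();
        }
        return batch.size();
    }
}
```

The callers never wait on the database, and the statement is prepared once per batch rather than once per message. This sketch simply drops the drained messages if the insert fails; a real version would have to decide what to do with them.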
Good luck with that.
I'm pretty green to DBMS and I'm required to write a Java program using JDBC to interact with an Access database file. I'm wondering if it's better practice, or even possible, to initialize the Connection in main and pass it to each method as needed (closing it after the program has run) or to open and close a new connection in each individual method.
Sorry if this is a repeat, but none of the questions/answers I've found on this have been conclusive.
Opening a connection takes quite a long time. You should use the same connection through your program if there is no special reason to close it.
There is even a special technique called connection pooling, which allows re-using open connections in large applications, which improves the performance.
I think creating a single connection object is the best way, as you decrease the overhead for the JVM of creating and garbage collecting objects.
(Use try-with-resources. It will take care of closing the connection object automatically.)
I've been researching all around the web for the most efficient way to design a connection pool and tried to analyze the available libraries (HikariCP, BoneCP, etc.) in detail.
Our application is a heavy-load consumer webapp and most of the time the users are working on similar business objects (thus the underlying SQL queries executed are often the same, but there are still many of them).
It is designed to work with different DBMS (Oracle and MS SQL Server especially).
So a simplified use case would be :
User goes on a particular JSP page (e.g. Enterprise).
A corresponding Bean is created.
Each time it performs an action (e.g. getEmployees(), computeTurnover()), the Bean asks the pool for a connection and returns it when done.
If we want to take advantage of the Prepared Statement caching of the underlying JDBC driver (as PStatements are attached to a connection - jTDS doc.), from what I understand an optimal way of doing it would be :
Analyze what kind of SQL query a particular Bean wants to execute before providing it with an available connection from the pool.
Find a connection where the same prepared statement has already been executed if possible.
Serve the connection accordingly (and use the benefits of the cache/precompiled statement).
Return the connection to the pool and start over.
Am I missing an important point here (like JDBC drivers capable of reusing cached statements regardless of the connection) or is my analysis correct?
The different sources I found state it is not possible, but why?
For your scheme to work, you'd need to be able to get the connection that already has that statement prepared.
This falls foul on two points:
In JDBC you obtain the connection first,
Cached prepared statements (if a driver or connection pool even supports that) aren't exposed in a standardized way (if at all) nor would you be able to introspect them.
The performance overhead of finding the right connection (and the subsequent contention on the few connections that already have it prepared) would probably undo any benefit of reusing the prepared statement.
Also note that some database systems also have a serverside cache for prepared statements (meaning that it already has the plan etc available), limiting the overhead from a new prepare from the client.
If you really think the performance benefit is big enough, you should consider using a data source specific for this functionality (so it is almost guaranteed that the connection will have the statement in its cache).
A solution could be for a connection pool implementation to delay retrieving the connection from the pool until the Connection.prepareStatement() is called. At that time a connection pool would look up available connections by the SQL statement text and then play forward all the calls made before Connection.prepareStatement(). This way it would be possible to get a connection with a ready PreparedStatement without the issues other guys suggested.
In other words, when you request a connection from the pool, it would return a wrapper that logs everything until the first operation requiring DB access (such as prepareStatement()) is requested.
You'd need to ask a vendor of your connection pool functionality to add this feature.
I've logged this request with C3P0:
https://github.com/swaldman/c3p0/issues/55
Hope this helps.
I have been unable to find an exact answer to this question. I'm using C3P0's ComboPooledDataSource. Which of these methodologies is better practice:
dataSource = connectionClass.getDataSource();
conn = dataSource.getConnection();
executeQuery(query1, conn);
executeQuery(query2, conn);
...
executeQuery(finalQuery, conn);
conn.close();
OR
executeQuery(query1);
executeQuery(query2);
...
executeQuery(finalQuery);
where executeQuery:
conn = dataSource.getConnection();
st = conn.createStatement();
rs = st.executeQuery(query);
// ... read rs here, before the connection is closed ...
conn.close();
In short, I have to run a decent number of queries every so often. Is it better to go with the first design, which gets the connection once for each batch and passes it as an argument, or with the second approach, which just gets a connection each time I call my executeQuery method? If I were using DriverManager I would obviously choose the first (only get the connection once), but with the C3P0 package I am not sure whether that is the right way to go. Or does it not matter with such a package?
With a connection pool, the difference is negligible, because even if you use the second approach, getting a pooled connection back takes little time. Still, the first approach is the better way to go, because
It avoids the additional (little) overhead of getting a connection from the pool.
If you later need to introduce transactions (do all of your changes or, in case of an error, conveniently and securely roll back your changes), then the first approach is your only option.
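A minimal sketch of the transaction point, assuming the first approach (one connection for the whole batch). SingleConnectionBatch and runAll are hypothetical names, and the SQL strings are whatever updates the batch needs.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;
import javax.sql.DataSource;

final class SingleConnectionBatch {
    /** Runs every update on one connection; either all of them commit or none do. */
    static void runAll(DataSource dataSource, List<String> updates) throws SQLException {
        try (Connection conn = dataSource.getConnection()) {
            conn.setAutoCommit(false); // start an explicit transaction
            try (Statement st = conn.createStatement()) {
                for (String sql : updates) {
                    st.executeUpdate(sql);
                }
                conn.commit();
            } catch (SQLException e) {
                // The statement is already closed here, the connection is still open.
                conn.rollback();
                throw e;
            }
        }
    }
}
```

With the second approach each query would run on its own connection, so there would be no single transaction to roll back.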
Some comments/suggestions
If your application is single threaded (which I assume, since you don't mention otherwise), it does not matter. It does not even matter whether you use a connection pool or not. Just use a single connection and pass it to each function that needs it.
Connection pools are useful when the use case involves multiple database connections simultaneously.
Since your application is a batch and single threaded, it does not warrant use of connection pool.
Regarding your application, both approaches are equivalent. When you call connection.close() on a connection from a pooled data source, it's not actually closed but returned to the pool.
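To illustrate what "returned to the pool" means, here is a toy pool built on java.lang.reflect.Proxy. It is only a sketch of the general idea and is not how C3P0 is actually implemented; ToyPool and its methods are invented for this example.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicBoolean;

final class ToyPool {
    private final Deque<Connection> idle = new ArrayDeque<>();

    ToyPool(Iterable<Connection> physicalConnections) {
        for (Connection c : physicalConnections) {
            idle.push(c);
        }
    }

    synchronized int idleCount() {
        return idle.size();
    }

    /** Hands out a wrapper; throws NoSuchElementException if no connection is idle. */
    synchronized Connection borrow() {
        Connection physical = idle.pop();
        AtomicBoolean returned = new AtomicBoolean(false);
        return (Connection) Proxy.newProxyInstance(
            Connection.class.getClassLoader(),
            new Class<?>[] {Connection.class},
            (proxy, method, args) -> {
                if (method.getName().equals("close")) {
                    // Do not close the physical connection, just hand it back (once).
                    if (returned.compareAndSet(false, true)) {
                        giveBack(physical);
                    }
                    return null;
                }
                try {
                    // Everything else is delegated to the physical connection.
                    return method.invoke(physical, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause();
                }
            });
    }

    private synchronized void giveBack(Connection physical) {
        idle.push(physical);
    }
}
```

Calling close() on the wrapper therefore costs little more than a method call, which is why getting and closing a pooled connection per query is cheap compared with opening a real one.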
What is the fastest option to issue stored procedures in a threaded environment in Java? According to http://dev.mysql.com/doc/refman/5.1/en/connector-j-usagenotes-basic.html#connector-j-examples-preparecall Connection.prepareCall() is an expensive method. So what's the alternative to calling it in every thread, when synchronized access to a single CallableStatement is not an option?
Most JDBC drivers use only a single socket per connection. I think MySQL also uses a single socket. That is why sharing one connection between multiple threads is a bad idea for performance.
If you use a separate connection per thread, then you need a CallableStatement for every connection, that is, a CallableStatement pool per connection. The simplest way to pool in this case is to wrap the connection class and delegate all calls to the original class. Such a wrapper can be generated very quickly with Eclipse. In the wrapped prepareCall() method you can add a simple pool. You also need a wrapper class for the CallableStatement, whose close() method returns the CallableStatement to the pool.
But first you should check whether the call is really expensive, because many drivers already have such a pool built in. Write a loop of prepareCall() and close() and measure the time.
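Such a measurement loop could look roughly like this. PrepareCallTiming is a hypothetical helper, and the call string passed in is whatever procedure you actually use.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

final class PrepareCallTiming {
    /** Returns the average wall-clock nanoseconds for one prepareCall() plus close(). */
    static long averagePrepareNanos(Connection conn, String callSql, int iterations)
            throws SQLException {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            try (CallableStatement cs = conn.prepareCall(callSql)) {
                // Intentionally empty: only prepare and close are being timed.
            }
        }
        return (System.nanoTime() - start) / iterations;
    }
}
```

This is a crude wall-clock measurement, so run it with enough iterations and a warm-up pass before trusting the number. If the average is tiny compared with the time the procedure itself takes, the driver is probably caching already and the wrapper is not worth writing.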
Connection is not thread safe, so you can't share it across threads.
When you prepareCall, the JDBC driver (may) be telling the RDBMS system to do a lot of work that is stored on the server side. You may be guilty of premature optimization here.
After giving this a little thought it seems that if you are having issues with this infrastructure code then your problems are elsewhere. Most applications do not take an inordinate amount of time doing this stuff.
Make sure you are using a DataSource; most do connection caching and some even cache statements.
Also, for this to be a performance bottleneck, it would imply that you are doing many queries one after the other, or that your pool of connections is too small. Maybe you should do some benchmarking on your code to see how much time the stored proc is taking vs how much time the JDBC code is taking.
Of course I would follow the MySQL recommendation of using CallableStatement, I am sure they have benchmarked this. Most apps do not share anything between Threads and it is rarely an issue.