Spring NamedParameterJDBCTemplate reuse of Prepared Statements gets slow performance


Requirement:
I have to insert rows into a database table many times (perhaps 50,000 times in a batch job).
Database used: MS SQL Server
Approach taken:
Used NamedParameterJdbcTemplate and PreparedStatement to implement the above.
For example,
String query = "insert into table_Name (Column1, Column2, column3) values (?,?,?)";
Then, using a PreparedStatement, I assigned dynamic values to the ? placeholders in the insert query and called the executeUpdate function of NamedParameterJdbcTemplate to execute it.
This process is repeated many times, each time creating a new insert query with different values for the ? placeholders using a PreparedStatement.
Issue:
Performance of the executeUpdate function of NamedParameterJdbcTemplate is very slow.
After some research, I found the explanation below:
"The cost based optimizer (that's what we're talking about, not) makes its
choices based on the availability of indexes (among other objects), and the
distribution of values in the indexes (how selective the index will be for a
given value). Obviously, when working with bind variables, the suitability
of the index from a distribution point of view is harder to determine. The
optimizer has no way to determine beforehand to what value matches will be
sought. This might (should) lead to another execution plan. No surprise
here, as far as I am concerned."
https://bytes.com/topic/oracle/answers/65559-jdbc-oracle-beware-bind-variables
A PreparedStatement has two advantages over a regular Statement:
We add parameters to the SQL using methods instead of embedding them in the SQL query itself. This avoids SQL injection attacks and also lets the driver do type conversions for us.
The same PreparedStatement can be called with different parameters, and the database engine can reuse the query execution plan.
It seems that NamedParameterJdbcTemplate helps us with the first advantage, but does nothing for the latter.
Query:
If NamedParameterJdbcTemplate does not help us with the second advantage, what is the alternative to NamedParameterJdbcTemplate?
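One direction often tried for this kind of workload, sketched here with plain JDBC rather than NamedParameterJdbcTemplate, is to prepare the insert once and send the rows in batches with addBatch and executeBatch. The class name BatchInsertSketch, the method insertAll, the batch size parameter, and the use of Object[] for a row are assumptions made for illustration; the table and column names are the ones from the question. Whether this is actually faster depends on the driver and its settings, so it should be measured.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BatchInsertSketch {

    // Same statement shape as in the question; only the bound values change per row.
    static final String SQL =
        "insert into table_Name (Column1, Column2, column3) values (?,?,?)";

    /**
     * Prepares the statement once, binds every row to it, and flushes the
     * batch every batchSize rows. Returns the number of executeBatch() calls.
     */
    static int insertAll(Connection connection, List<Object[]> rows, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int flushes = 0;
        try (PreparedStatement ps = connection.prepareStatement(SQL)) {
            int pending = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]); // JDBC parameter indexes are 1-based
                }
                ps.addBatch();
                pending++;
                if (pending == batchSize) {
                    ps.executeBatch();
                    flushes++;
                    pending = 0;
                }
            }
            if (pending > 0) { // send the last, partially filled batch
                ps.executeBatch();
                flushes++;
            }
        }
        return flushes;
    }
}
```

The point of the sketch is that prepareStatement is called once for all rows, so the SQL text sent to the server never changes between executions.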

Related

How prepared statement improves performance [duplicate]

I came across below statement that tells about the performance improvement that we get with JDBC PreparedStatement class.
If you submit a new, full SQL statement for every query or update to
the database, the database has to parse the SQL and for queries create
a query plan. By reusing an existing PreparedStatement you can reuse
both the SQL parsing and query plan for subsequent queries. This
speeds up query execution, by decreasing the parsing and query
planning overhead of each execution.
Let's say I am creating the statement and providing different values while running the queries like this :
String sql = "update people set firstname=? , lastname=? where id=?";
PreparedStatement preparedStatement = connection.prepareStatement(sql);
// First execution
preparedStatement.setString(1, "Gary");
preparedStatement.setString(2, "Larson");
preparedStatement.setLong(3, 123);
int rowsAffected = preparedStatement.executeUpdate();
// Second execution reuses the same statement with new values
preparedStatement.setString(1, "Stan");
preparedStatement.setString(2, "Lee");
preparedStatement.setLong(3, 456);
rowsAffected = preparedStatement.executeUpdate();
Will I still get the performance benefit, given that I am setting different values, so the final query generated changes with the values?
Can you please explain exactly when we get the performance benefit? Do the values also have to be the same?

When you use a prepared statement (i.e. a pre-compiled statement), the DB compiles the statement as soon as it receives it and caches the result, so that it can reuse the compiled statement for successive calls of the same statement.
You generally use a prepared statement with bind variables, whose values you provide at run time. For successive executions of the prepared statement, you can provide values that differ from those of previous calls. From the DB's point of view, it does not have to compile the statement every time; it just inserts the bind variables at run time, so execution becomes faster.
Another advantage of prepared statements is protection against SQL injection attacks.
So the values do not have to be the same.

Although it is not obvious, SQL is not a scripting language but a "compiled" one. This compilation (a.k.a. optimization, a.k.a. hard parse) is a very expensive task. Oracle has a lot of work to do: it must parse the query, resolve table names, validate access privileges, perform some algebraic transformations, and then find an efficient execution plan. Oracle (and other databases too) can join only TWO tables at a time, not more. This means that when you join several tables in SQL, Oracle has to join them one by one, i.e. if you join n tables in a query there can be n! possible join orders to consider. By default Oracle is limited to 8000 permutations when searching for an "optimal" (not the best) execution plan.
So the compilation (hard parse) might be more expensive than the query execution itself. To spare resources, Oracle shares execution plans between sessions in a memory structure called the library cache. Here another problem might occur: too much parsing requires exclusive access to a shared resource.
So if you do too many hard parses, your application cannot scale: sessions block each other.
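To make the growth of n! concrete, here is a small sketch that counts the orderings of n tables. The class and method names are invented for illustration, and the count ignores join methods and access paths, which enlarge the real plan space further.

```java
import java.math.BigInteger;

class JoinOrderCount {

    // Number of ways to order n tables for pairwise joining: n! = 1 * 2 * ... * n.
    static BigInteger joinOrders(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }
}
```

For example, joinOrders(5) is 120 and joinOrders(8) is 40320, so an eight-table join already has more orderings than the 8000-permutation limit quoted in this answer.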
On the other hand, there are situations where bind variables are NOT helpful.
Imagine such a query:
update people set firstname=? , lastname=? where group=? and deleted='N'
Since the column deleted is indexed and Oracle knows that 98% of the values are 'Y' and only 2% are 'N', it will decide to use an index on the column deleted. If you used a bind variable for the condition on the deleted column, Oracle could not find an efficient execution plan, because the plan also depends on the input, which is unknown at compilation time.
(PS: since 11g it is more complicated with bind variable peeking)

Should I use preparedStatement in a repetitive query in which where clause predicates change often causing change of plan chosen

I have a Java application that executes queries on a PostgreSQL 9.3 server using JDBC. In my application, I have to execute the same query many times (thousands) with different arguments in the WHERE clause predicates only. I have been using the Statement class until now. I recently read about the PreparedStatement class and am wondering whether I should use it to speed up processing. My doubt is this: since my query executes each time with different values in the WHERE clause predicates, the selectivity will change, and hence the plan chosen by the database server will change. In that case, will using PreparedStatement speed up processing? Is the plan chosen when the PreparedStatement is created, or only when execute is called on the PreparedStatement object? If the plan is chosen when the PreparedStatement is created, how is that done, given that the optimizer chooses plans based on selectivity calculated from actual predicate values?
My Query is a complex one involving many tables. Template is like,
select something from tables where predicate1 and predicate2 and price < X and date < Y;
where X and Y varies for each query.

From PostgreSQL doc :
PREPARE creates a prepared statement. A prepared statement is a
server-side object that can be used to optimize performance. When the
PREPARE statement is executed, the specified statement is parsed,
analyzed, and rewritten. When an EXECUTE command is subsequently
issued, the prepared statement is planned and executed. This division
of labor avoids repetitive parse analysis work, while allowing the
execution plan to depend on the specific parameter values supplied.
moe was right: preparing a query only removes the overhead of reparsing it again and again. The planning is done only when you execute the prepared query with its parameters.

In 9.3, it uses a heuristic. It does something like planning the query with the specific bind values the first 5 times the prepared statement is executed. If none of those plans turns out to be substantially better than the generic plan, it stops the individual planning and just uses the generic plan from then on.
But there is another wrinkle: just because your code told the driver to use a prepared statement doesn't mean the driver is actually doing so. A lot of drivers do weird things.
The real answer is test, test, test.

Memcache implementation design

I am trying to implement memcache in my web application and wanted to get suggestions on whether what I am doing is right in terms of design.
I have a class SimpleDataAccessor which runs all my insert, update and select SQL queries. So any query that has to be performed is executed inside a method of this class.
Inside the method where I have my select query implementation, I call a method which stores the result set in memcache like this:
storeinMC(resultset.getJSON(),sqlquery);
the sqlquery here is my key.
Also, before running the select query, I check in memcache whether I already have a result set for that query.
String res = getRSFromMC(sqlquery);
if (res == null)
So I've tried to keep it plain and simple.
Do you see any issues with this?

As rai.skumar rightfully pointed out your SQL statements could be constructed differently (e.g. WHERE clause could contain same conditions in diff order, etc.)
So to overcome above mentioned issues, you need to parse your SQL and get all the relevant pieces from it. Then you can combine these pieces into a cache key.
You can take a look at SQL parsers: ZQL, JSqlParser, General SQL Parser for Java that return you java classes out of your SQL.
Another option would be to use JPA instead of straight JDBC. For example, Hibernate has great JPA support and is fully capable of caching your queries.
If you feel closer to JDBC, you could use MyBatis, which has a very JDBC-like syntax and caching support.

Consider below queries:
String k1 = "Select * from table"; //Query1
String k2 = "Select * from TABLE"; // Query2 ; notice TABLE is in caps
Both of the above SQL queries are the same and will fetch the same data. But if these queries are used as keys in Memcached, they will be stored in different places (as k1.equals(k2) will return false).
Also, even if you can somehow ensure that there are no typos or extra spaces, it won't be very efficient, as keys/queries could be very big.
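One way to soften both points, short of a full SQL parser, is to normalize the query text and hash it into a fixed-length key (memcached keys are limited to 250 bytes). The following is a naive sketch: the class SqlCacheKey, its methods, and the "sql:" prefix are hypothetical names. It lowercases and collapses whitespace only outside quoted sections, so string literals are left untouched, and it does not handle comments or predicates written in a different order.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class SqlCacheKey {

    // Lowercases and collapses whitespace outside of '...' and "..." sections.
    static String normalize(String sql) {
        StringBuilder out = new StringBuilder(sql.length());
        char quote = 0;               // 0 outside quotes, otherwise the opening quote character
        boolean pendingSpace = false; // a whitespace run was seen and not yet emitted
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (quote != 0) {
                out.append(c);        // keep quoted text exactly as written
                if (c == quote) {
                    quote = 0;
                }
            } else if (Character.isWhitespace(c)) {
                pendingSpace = out.length() > 0; // drop leading whitespace
            } else {
                if (pendingSpace) {
                    out.append(' ');
                    pendingSpace = false;
                }
                if (c == '\'' || c == '"') {
                    quote = c;
                }
                out.append(Character.toLowerCase(c));
            }
        }
        return out.toString();
    }

    // Returns "sql:" followed by the SHA-256 of the normalized text as 64 hex characters.
    static String keyFor(String sql) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(normalize(sql).getBytes(StandardCharsets.UTF_8));
            StringBuilder key = new StringBuilder("sql:");
            for (byte b : hash) {
                key.append(String.format("%02x", b));
            }
            return key.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

With this sketch, keyFor(k1) and keyFor(k2) from the example above produce the same key, while queries that differ only inside a string literal still get different keys.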

Merge multiple PreparedStatements together

I'm trying to set up a while loop that inserts multiple rows into a MySQL table using the jdbc drivers in Java. The idea is that I end up with a statement along the lines of:
INSERT INTO table (column1, column2) VALUES (column1, column2), (column1, column2);
I want to set up this statement using a java.sql.PreparedStatement, but I'd like to prepare small bits of the statement, one row at a time - mainly because the number of entries will be dynamic, and this seems like the best way to create one big query.
This requires the small parts to be 'merged' together every time another chunk is generated. How do I merge these together? Or would you suggest to forget about this idea, and simply execute thousands of INSERT statements at once?
Thank you,
Patrick

It sort of depends on how often you plan to run this loop to execute thousands of statements, but that is one of the exact purposes of prepared statements and stored procedures - since the query does not have to be recompiled on each execution, you get potentially massive performance gains when querying in a loop over a simple SQL statement execution, which must be compiled and executed on every loop iteration.
Those gains may still not match the performance of a prepared statement built up into a long multi-insert in a loop as you're asking, but will be simpler to code. I would recommend staying with an execution loop unless the performance becomes problematic.
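For comparison, the multi-row variant the question asks about does not require merging PreparedStatement objects: the SQL text can be generated for a known number of rows and then prepared once. A minimal sketch, with the class MultiRowInsert and the method buildSql as illustrative names; the table and column names must come from trusted code, since they are concatenated into the SQL.

```java
import java.util.Collections;
import java.util.List;

class MultiRowInsert {

    // Builds "INSERT INTO <table> (<columns>) VALUES (?, ?), (?, ?), ..." for rowCount rows.
    static String buildSql(String table, List<String> columns, int rowCount) {
        if (columns.isEmpty() || rowCount < 1) {
            throw new IllegalArgumentException("need at least one column and one row");
        }
        // One "(?, ?, ...)" group with a placeholder per column.
        String rowPlaceholders =
            "(" + String.join(", ", Collections.nCopies(columns.size(), "?")) + ")";
        return "INSERT INTO " + table
            + " (" + String.join(", ", columns) + ") VALUES "
            + String.join(", ", Collections.nCopies(rowCount, rowPlaceholders));
    }
}
```

When binding values to the resulting statement, the parameter index for column c (0-based) of row r (0-based) is r * columns.size() + c + 1. Servers and drivers cap statement size or parameter count, so a very large number of rows would normally be split into chunks.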

Better to prepare one PreparedStatement and reuse it as much as you want:
INSERT INTO table (column1, column2) VALUES (?, ?)

What's the correct way to use PreparedStatement in java?

Specifically if I have three queries should I do
PreparedStatement singleQuery ...
and "share" the one object. Or should I do
PreparedStatement query1 ...
PreparedStatement query2 ...
PreparedStatement query3 ...

It depends on how different the three queries are. If they are the same query but with different arguments then use a single PreparedStatement and set the arguments each time. If they are essentially three different queries (e.g. a select followed by an update) then you'll need three different PreparedStatements.
For example, if the SQL for all three is of the form SELECT * FROM table WHERE id = something then a single statement is fine.
If the first query is SELECT name FROM customers WHERE id = ? and the second is SELECT price FROM products WHERE id = ? then you're gonna need different objects.

If the three queries use the same SQL, reuse the same object.
If not, have three separate objects.
Do not share the same object across multiple threads.

If the queries are different, you probably need separate PreparedStatements, but if the different queries can be handled by one parameterized query, you should probably go that route.

Use a single parameterized query and call it 3 times, each time binding the new parameter values. If your db supports caching of prepared statements, you'll get better performance because the actual query will only need to be compiled once by the RDBMS.

Generally you would use a separate PreparedStatement object for each query; that way you could keep them around to re-use, potentially saving preparation overhead.
If you aren't planning to re-use your statements, though, it probably doesn't matter.

You can do it either way; there is no performance issue as long as you open and close the PreparedStatements appropriately.
If you use a single PreparedStatement, make sure that when you reuse it for the second query, the required parameters are set appropriately.
