JDBC batch query for high performance - java

I want to batch queries to the DB for high performance. Example SQL to query based on different customer_ids:
select order_id, cost
from customer c
join order o using (id)
where c.id = ...
order by ...
I'm not sure how to do this using a JDBC statement. I know I can use a stored procedure for this purpose, but it would be much better if I could just write the SQL in the Java app instead of in an SP.
I'm using DBCP for my Java client and a MySQL DB.

The JDBC Specification 4.0 describes a mechanism for batch updates. As such, the batch features in JDBC can be used for insert or update purposes. This is described in chapter 14 of the specification.
AFAIK there is no mechanism for select batches, probably because there is no apparent need for one: as others have recommended, you can simply retrieve all the rows you want at once by constructing your query appropriately.
int[] ids = { 1, 2, 3, 4 };
StringBuilder sql = new StringBuilder();
sql.append("select jedi_name from jedi where id in(");
for (int i = 0; i < ids.length; i++) {
    sql.append("?");
    if (i + 1 < ids.length) {
        sql.append(",");
    }
}
sql.append(")");
System.out.println(sql.toString());

try (Connection con = DriverManager.getConnection(...);
     PreparedStatement stm = con.prepareStatement(sql.toString())) {
    for (int i = 0; i < ids.length; i++) {
        stm.setInt(i + 1, ids[i]); // JDBC parameter indexes are 1-based
    }
    try (ResultSet rs = stm.executeQuery()) {
        while (rs.next()) {
            System.out.println(rs.getString("jedi_name"));
        }
    }
} catch (SQLException e) {
    e.printStackTrace();
}
Output
select jedi_name from jedi where id in(?,?,?,?)
Luke, Obiwan, Yoda, Mace Windu
Is there any reason why you would consider that you need a thing like a batch-select statement?

It really does not matter what your SQL statement is (you can use as many nested joins as your DB can handle). Below is a basic Java example (not DBCP). The DBCP version is pretty similar; you can check out their example, and a minimal sketch follows after this snippet.
Connection connect = DriverManager.getConnection(YOUR_CONNECTION_STRING);
// Statements allow us to issue SQL queries to the database
Statement statement = connect.createStatement();
ResultSet resultSet = statement.executeQuery(
        "select order_id, cost "
        + "from customer c "
        + "join order o using (id) "
        + "where c.id = ... "
        + "order by ...");

Related

Insert performance tuning

Currently we are selecting data from one database and inserting it into a backup database (SQL Server).
This data always contains more than 15K records in one select.
We are using Enumeration to iterate over the data selected.
We are using JDBC PreparedStatement to insert data as:
Enumeration values = ht.elements(); // ht is a Hashtable containing the selected data
while (values.hasMoreElements())
{
    Object value = values.nextElement(); // advance, or the loop never ends
    pstmt = conn.prepareStatement("insert query");
    pstmt.executeUpdate();
}
I am not sure if this is a correct or efficient way to do faster inserts.
Inserting 10k rows takes about 30 minutes or more.
Is there any efficient way to make it fast?
Note: Not using any indexes on the table.
Use a batch insert, but commit after every few entries; don't try to send all 10K at once. Try experimenting to find the best size; it's a trade-off between memory and network round trips.
Connection connection = getConnection(); // however you obtain your connection
Statement statement = connection.createStatement();
int i = 0;
for (String query : queries) {
    statement.addBatch(query);
    if (++i % 500 == 0) {
        // Execute now and again; don't send too many at once
        statement.executeBatch();
    }
}
statement.executeBatch(); // flush whatever is left
statement.close();
connection.close();
Also, from your code I'm not sure what you are doing, but use parameterised queries rather than sending 10K insert statements as text. Something like:
String q = "INSERT INTO data_table (id) VALUES (?)";
Connection connection = getConnection(); // however you obtain your connection
PreparedStatement ps = connection.prepareStatement(q);
for (Data d : data) {
    ps.setString(1, d.getId());
    ps.addBatch();
}
ps.executeBatch();
ps.close();
connection.close();
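Combining the two suggestions, batching plus a commit every few hundred rows, might look like the sketch below; getConnection(), Data, and data_table are carried over from the snippets above:
String q = "INSERT INTO data_table (id) VALUES (?)";
try (Connection connection = getConnection();
     PreparedStatement ps = connection.prepareStatement(q)) {
    connection.setAutoCommit(false); // group rows into explicit transactions
    int i = 0;
    for (Data d : data) {
        ps.setString(1, d.getId());
        ps.addBatch();
        if (++i % 500 == 0) {
            ps.executeBatch();   // push a chunk over the network
            connection.commit(); // and make it durable
        }
    }
    ps.executeBatch(); // flush the remainder
    connection.commit();
}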
You can insert all the values in one sql command:
INSERT INTO Table1 ( Column1, Column2 ) VALUES
( V1, V2 ), ( V3, V4 ), .......
You may also insert the values in bulks of 500 records, for example, if the query would otherwise become very big. It is not efficient at all to insert one row per statement remotely (over a connection). Another solution is to do the inserts using a stored procedure; you just pass the values to it as parameters.
Here is how you can do it using the INSERT command above:
Enumeration values = ht.elements(); // ht is a Hashtable containing the selected data
int i = 0;
StringBuilder sql = new StringBuilder();
while (values.hasMoreElements())
{
    if (sql.length() > 0)
        sql.append(" , ");
    sql.append("(").append(values.nextElement()).append(")");
    i++;
    if (i % 500 == 0) {
        pstmt = conn.prepareStatement("insert query " + sql);
        pstmt.executeUpdate();
        sql.setLength(0); // start the next bulk
    }
}
if (sql.length() > 0) { // don't forget the last partial bulk
    pstmt = conn.prepareStatement("insert query " + sql);
    pstmt.executeUpdate();
}

Passing an array to SELECT * FROM table WHERE country IN (?...)

Is there a way to pass an array of strings for a "WHERE country IN (...)" query?
something like this:
String[] countries = {"France", "Switzerland"};
PreparedStatement pstmt = con.prepareStatement("SELECT * FROM table WHERE country IN (?...)");
pstmt.setStringArray(1, countries);
pstmt.executeQuery();
an ugly workaround would be to create the query based on the size of the array
String[] countries = {"France", "Switzerland"};
if (countries.length == 0) { return null; }
String query = "SELECT * FROM table WHERE country IN (?";
for (int i = 1; i < countries.length; i++) { query += ", ?"; }
query += ")"; // close the IN list
PreparedStatement pstmt = con.prepareStatement(query);
for (int i = 0; i < countries.length; i++) { pstmt.setString(1 + i, countries[i]); }
pstmt.executeQuery();
but this looks really ugly.
any idea?
No, it's not possible with plain JDBC; you must do it yourself. ORMs like Hibernate and wrapper APIs like Spring JDBC do allow it.
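For illustration, a sketch of the Spring JDBC route: NamedParameterJdbcTemplate expands a collection bound to a named parameter into the right number of placeholders (dataSource is assumed to be configured elsewhere):
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import org.springframework.jdbc.core.namedparam.MapSqlParameterSource;
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;

NamedParameterJdbcTemplate template = new NamedParameterJdbcTemplate(dataSource);
MapSqlParameterSource params = new MapSqlParameterSource(
        "countries", Arrays.asList("France", "Switzerland"));
// Spring rewrites IN (:countries) to IN (?, ?) and binds each element.
List<Map<String, Object>> rows = template.queryForList(
        "SELECT * FROM table WHERE country IN (:countries)", params);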
I think the workaround would be formulating the entire query string at runtime and using a Statement object instead of a PreparedStatement, as sketched below.
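A sketch of that workaround; since the values are inlined into the SQL text, they must be quoted and escaped by hand, so only do this with trusted input:
String[] countries = {"France", "Switzerland"};
StringBuilder query = new StringBuilder("SELECT * FROM table WHERE country IN (");
for (int i = 0; i < countries.length; i++) {
    if (i > 0) {
        query.append(", ");
    }
    // Escape embedded single quotes by doubling them.
    query.append("'").append(countries[i].replace("'", "''")).append("'");
}
query.append(")");
Statement stmt = con.createStatement();
ResultSet rs = stmt.executeQuery(query.toString());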
There is no standard way to do this; see the other answers for workarounds. There is one exception, though: if you use an Oracle database, the Oracle JDBC driver supports binding arrays.
If your database engine supports IN (subquery), you can create a view or memory table to do it.
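A sketch of that idea using a MySQL temporary table; the table and column names are made up for the example:
// Stage the values, then let the database do the matching via IN (subquery).
try (Statement st = con.createStatement()) {
    st.executeUpdate("CREATE TEMPORARY TABLE tmp_countries (name VARCHAR(64))");
}
try (PreparedStatement ins = con.prepareStatement("INSERT INTO tmp_countries VALUES (?)")) {
    for (String country : countries) {
        ins.setString(1, country);
        ins.addBatch();
    }
    ins.executeBatch();
}
try (Statement st = con.createStatement();
     ResultSet rs = st.executeQuery(
         "SELECT * FROM table WHERE country IN (SELECT name FROM tmp_countries)")) {
    while (rs.next()) {
        // process each matching row
    }
}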

Inserting an Array of Characters into DB

I want to insert a 2-dimensional array into a DB table. Is there any way to insert these values into the DB with a single INSERT statement rather than using multiple INSERT statements? Multiple statements tend to cause DB connection pool issues and can add latency to the application.
String[][] a = new String[10][2];
for (int i = 0; i < 10; i++)
{
    st.executeUpdate("Insert into sap_details VALUES ('" + a[i][0] + "','" + a[i][1] + "')");
}
What happens here is that effectively 10 INSERT statements are issued, one per row. I don't want that; it should happen with only one INSERT statement.
Is there any way to do that?
Use JDBC Batch Updates? Using prepared statements should also help.
Example
String[][] a = new String[10][2];
PreparedStatement pst = con.prepareStatement("INSERT INTO sap_details VALUES (?,?)");
for (int i = 0; i < 10; i++) {
    pst.setString(1, a[i][0]);
    pst.setString(2, a[i][1]);
    pst.addBatch();
}
int[] results = pst.executeBatch();
With MySQL, something like this should do the trick perfectly fine; Oracle, however, won't like it. Multi-row VALUES lists are supported by DB2, SQL Server (since version 10.0, i.e. 2008), PostgreSQL (since version 8.2), MySQL, and H2.
String[][] a = new String[10][2];
StringBuilder sb = new StringBuilder("Insert into sap_details (a,b) VALUES ");
for (int i = 0; i < a.length; i++) {
    sb.append("('").append(a[i][0]).append("','").append(a[i][1]).append("')");
    if (i < a.length - 1)
        sb.append(",");
}
st.executeUpdate(sb.toString());
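If the inserted values come from outside the program, the same multi-row statement can be built with placeholders instead of concatenated strings; a sketch:
String[][] a = new String[10][2];
StringBuilder sb = new StringBuilder("INSERT INTO sap_details (a,b) VALUES ");
for (int i = 0; i < a.length; i++) {
    sb.append("(?,?)");
    if (i < a.length - 1) {
        sb.append(",");
    }
}
try (PreparedStatement ps = con.prepareStatement(sb.toString())) {
    int idx = 1; // placeholders are numbered left to right across all rows
    for (String[] row : a) {
        ps.setString(idx++, row[0]);
        ps.setString(idx++, row[1]);
    }
    ps.executeUpdate();
}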

java.lang.OutOfMemoryError: Java heap space error when performing millions of queries

In my application, I need to perform millions of queries against a MySQL database. The code looks as follows:
for (int i = 0; i < num_rows; i++) {
    String query2 = "select id from mytable where x='" + y.get(i) + "'";
    Statement stmt2 = Con0.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
    ResultSet rs2 = stmt2.executeQuery(query2);
    ... // process result in rs2
    rs2.close();
}
where num_rows is around 2 million. After 600k loops, Java reports an error and exits:
java.lang.OutOfMemoryError: Java heap space
What's wrong with my code? How can I avoid such an error?
Thanks in advance!
Close your statements as well.
A plain Statement is not a good solution here. Try the following code:
PreparedStatement pre = Con0.prepareStatement("select id from mytable where x=?");
for (int i = 0; i < num_rows; i++) {
    pre.setString(1, y.get(i));
    ResultSet rs2 = pre.executeQuery();
    ... // process result in rs2
    rs2.close();
    pre.clearParameters();
}
pre.close();
I don't know if the answer you accepted has solved your problem, since it doesn't change anything that could cause it.
The problem is the ResultSet caching all the rows returned by the query; these can either be stored as you iterate through the set, or prefetched. I've had a similar problem with the PostgreSQL JDBC driver, which ignored the cursor fetch size when running in non-transactional mode.
The JDBC driver should use cursors for such queries, so you should check the driver's documentation for the fetchSize parameter. As an alternative, you can manage cursors yourself by executing SQL commands to create a cursor and fetch the next X rows.
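For MySQL's Connector/J specifically, the documented way to stream a result set row by row is a forward-only, read-only statement with a fetch size of Integer.MIN_VALUE; a sketch:
Statement stmt = Con0.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                      ResultSet.CONCUR_READ_ONLY);
// Connector/J treats this special value as "stream rows, don't buffer the whole result".
stmt.setFetchSize(Integer.MIN_VALUE);
try (ResultSet rs = stmt.executeQuery("select id from mytable")) {
    while (rs.next()) {
        // process one row at a time with constant memory
    }
}
stmt.close();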
Using a PreparedStatement declared outside the loop should help, since only the value of x changes in each iteration. You're also, at least in the shown code, not closing the statement used, which doesn't help the garbage collector free the used memory.
Assuming that you are using a single connection for all your queries, and assuming your code is more complicated than what you show us, it is critical that you ensure that each Statement and each ResultSet is closed when you are finished with it. This means that you need a try/finally block like this:
for (int i = 0; i < num_rows; i++) {
    String query2 = "select id from mytable where x='" + y.get(i) + "'";
    Statement stmt2 = Con0.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
    ResultSet rs2 = null;
    try {
        rs2 = stmt2.executeQuery(query2);
        ... // process result in rs2
    } finally {
        try {
            if (rs2 != null) { rs2.close(); }
        } catch (SQLException sqle) {
            // complain to logs
        }
        try {
            stmt2.close();
        } catch (SQLException sqle) {
            // complain to logs
        }
    }
}
If you do not aggressively and deterministically close all result set and statement objects, and if you do requests quickly enough, you will run out of memory.
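On Java 7 and later, try-with-resources gives the same guarantee with much less ceremony; a sketch of the loop above:
for (int i = 0; i < num_rows; i++) {
    String query2 = "select id from mytable where x='" + y.get(i) + "'";
    try (Statement stmt2 = Con0.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                                ResultSet.CONCUR_READ_ONLY);
         ResultSet rs2 = stmt2.executeQuery(query2)) {
        // process result in rs2; both resources are closed automatically,
        // even if processing throws
    }
}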

using JDBC preparedStatement in a batch

I'm using Statement batches to query my database.
I've done some research and I want to rewrite my application to use PreparedStatement instead, but I'm having a hard time figuring out how to add queries to a PreparedStatement batch.
This is what I'm doing now:
private void addToBatch(String sql) throws SQLException {
    sttmnt.addBatch(sql);
    batchSize++;
    if (batchSize == elementsPerExecute) {
        executeBatches();
    }
}
where sttmnt is a class member of type Statement.
What I want to do is use the PreparedStatement's setString(int, String) method to set some dynamic data and then add it to the batch.
Unfortunately, I don't fully understand how it works, or how I can apply setString(int, String) to a specific SQL statement in the batch, OR create a new PreparedStatement for every SQL statement I have and then join them all into one batch.
Is it possible to do that? Or am I really missing something in my understanding of PreparedStatement?
Read section 6.1.2 of this document for examples. Basically you use the same statement object and invoke the batch method after all the placeholders are set. There is also an IBM DB2 example, which should work for any JDBC implementation. From the second site:
try {
    con.setAutoCommit(false);
    PreparedStatement prepStmt = con.prepareStatement(
        "UPDATE DEPT SET MGRNO=? WHERE DEPTNO=?");
    prepStmt.setString(1, mgrnum1);
    prepStmt.setString(2, deptnum1);
    prepStmt.addBatch();
    prepStmt.setString(1, mgrnum2);
    prepStmt.setString(2, deptnum2);
    prepStmt.addBatch();
    int[] numUpdates = prepStmt.executeBatch();
    for (int i = 0; i < numUpdates.length; i++) {
        if (numUpdates[i] == -2)
            System.out.println("Execution " + i +
                ": unknown number of rows updated");
        else
            System.out.println("Execution " + i +
                " successful: " + numUpdates[i] + " rows updated");
    }
    con.commit();
} catch (BatchUpdateException b) {
    // process BatchUpdateException
}
With PreparedStatements you have wildcards (placeholders), for example:
String query = "INSERT INTO users (id, user_name, password) VALUES (?,?,?)";
PreparedStatement statement = connection.prepareStatement(query);
for (User user : userList) {
    statement.setString(1, user.getId()); // 1 is the first ? (1-based counting)
    statement.setString(2, user.getUserName());
    statement.setString(3, user.getPassword());
    statement.addBatch();
}
This will create one PreparedStatement with the query shown above. You can loop through the list when you want to insert, or whatever your intention is. When you want to execute it:
statement.executeBatch();
statement.clearBatch(); // if you want to add more afterwards
                        // (so you don't do the same thing twice)
I'm adding an extra answer here specifically for MySQL.
I found that the time to do a batch of inserts was similar to the length of time to do individual inserts, even with the single transaction around the batch.
I added the parameter rewriteBatchedStatements=true to my jdbc url, and saw a dramatic improvement - in my case, a batch of 200 inserts went from 125 msec. without the parameter to about 10 to 15 msec. with the parameter.
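For reference, the parameter is appended to the JDBC URL; the host, port, and schema below are placeholders:
String url = "jdbc:mysql://localhost:3306/mydb?rewriteBatchedStatements=true";
Connection con = DriverManager.getConnection(url, "user", "password");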
See MySQL and JDBC with rewriteBatchedStatements=true
