Insert data into Database - What is the best way to do it - java

I have to insert anywhere between 50 and 500 contacts' information into the database. I have four ArrayLists that hold the image, name, number, and a boolean flag, respectively.
Each row in the table is a combination of the four ArrayLists plus a SNO. Please refer to the image below.
My question is: say I have 500 contacts retrieved from the user's contacts list. Is it a good idea to have a function that inserts one row at a time into the table and call it 500 times? Or is there another way? An alternative idea would be to combine all the ArrayLists, pass them to the function, unpack the data there, and repeat the insert statement 500 times.
What is better in terms of performance?
for (int i = 0; i < 500; i++)
{
    dbObj.insert_row(par1, par2, par3, par4, ...);
}
OR
insert_row(combined ArrayLists)
{
    for (int i = 0; i < 500; i++)
    {
        db.execSQL(/* insert statement */);
    }
}

I suggest you create your own object to represent your table, where the properties of the object correspond to the columns in the table, e.g.
public class Contact {
    private String name;
    private String number;
    private String image;
    private boolean conn;
    // getters and setters
}
Your current approach sounds like "spaghetti code": there is no need for four parallel ArrayLists, and that design is neither efficient nor correct.
Instead, you will have one ArrayList of 500 Contact objects.
What is the best way to insert?
Definitely wrap your inserts in a single TRANSACTION. That speeds up the inserts dramatically, and, more importantly, it makes working with the database much safer: you don't need to worry about losing database integrity.
Example of a transaction (one method from a personal example project):
public boolean insertTransaction(int count) throws SQLException {
    boolean result = false;
    try {
        db = openWrite(DataSource.getInstance(mContext));
        ContentValues values = new ContentValues();
        if (db != null) {
            db.beginTransaction();
            for (int i = 0; i < count; i++) {
                values.put(SQLConstants.KEY_TYPE, "type" + i);
                values.put(SQLConstants.KEY_DATE, new Date().toString());
                db.insertOrThrow(SQLConstants.TEST_TABLE_NAME, SQLConstants.KEY_TYPE, values);
                values.clear();
            }
            db.setTransactionSuccessful();
            result = true;
        }
        return result;
    } finally {
        if (db != null) {
            db.endTransaction();
        }
        close(db);
    }
}

If you are going to insert 500 records into the database, you should use a transaction:
database.beginTransaction();
try {
    // perform inserts
    database.setTransactionSuccessful();
} finally {
    database.endTransaction();
}

As mentioned before, create your own class to represent one row, or use the ContentValues class.
SQLite doesn't provide a way to insert many rows in one query like MySQL does, but there is a way around it, which you can read about here.
If you decide to use the method described in that link, it is better to write a function that generates the whole query and run it just once, as in the sketch below. Otherwise, as others have already mentioned, you can use a transaction to improve the performance of many inserts.
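A minimal, hedged sketch of that workaround, assuming the Contact columns from the first answer and an illustrative contacts table. Older SQLite versions (before 3.7.11, which added multi-row VALUES) accept a multi-row insert only in this SELECT ... UNION ALL form:

// Builds one INSERT covering rowCount rows via SELECT ... UNION ALL.
// Table and column names are illustrative assumptions; chunk rowCount to
// stay under SQLite's default limits (500 compound terms, 999 bound variables).
static String buildBulkInsertSql(int rowCount) {
    StringBuilder sql = new StringBuilder("INSERT INTO contacts (name, number, image, conn) ");
    for (int i = 0; i < rowCount; i++) {
        sql.append(i == 0 ? "SELECT ?, ?, ?, ?" : " UNION ALL SELECT ?, ?, ?, ?");
    }
    return sql.toString();
}

You would then bind the 4 × rowCount parameters in order and execute the statement once per chunk.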

Converting the four arrays into one array of objects makes the code better, but you can get the same performance benefit without doing that, like this:
Prepare the SQL statement with bind variables (? or :vars), then execute the statement multiple times in a loop, setting the bind variables for each row.
String sql = "insert into..... values (?,?,?,?)";
// Get connection etc.
PreparedStatement stmt = conn.prepareStatement(sql);
for (int i = 0; i < 500; i++) {
    stmt.setString(1, name.get(i));
    stmt.setString(2, number.get(i)); // JDBC has no setNumber; use setString/setInt as appropriate
    ...
    stmt.executeUpdate();
}
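If the driver supports batching, the same loop can also collect the rows into a single JDBC batch instead of issuing one executeUpdate() per row. A hedged variant; the table name and the boolean list are assumptions following the question's setup:

PreparedStatement stmt = conn.prepareStatement(
        "insert into contacts (name, number, image, conn) values (?,?,?,?)");
for (int i = 0; i < name.size(); i++) {
    stmt.setString(1, name.get(i));
    stmt.setString(2, number.get(i));
    stmt.setString(3, image.get(i));     // assuming the image is stored as a path/URI string
    stmt.setBoolean(4, connFlag.get(i)); // hypothetical ArrayList<Boolean> from the question
    stmt.addBatch();
}
stmt.executeBatch(); // one round trip for the whole batch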

Related

Can I execute batch of queries using jpa/hibernate in a single shot?

Here is some sample code.
sourceFileItemNo and entityDplStatus are arrays of values.
Query query;
for (int i = 0; (sourceFileItemNo != null && i < sourceFileItemNo.length); i++) {
    Object[] parameters = { entityDplStatus[i], "1", sourceFileItemNo[i] };
    String queryString = "UPDATE GtcEntityDetailsValue c SET c.dplStatus=?1 where c.referenceNo=?2 and c.itemNo=?3";
    query = manager.createQuery(queryString);
    query = setQueryParameters(query, parameters);
    query.executeUpdate();
}
In this case we run an update against the database in every iteration. Does JPA provide a way to add the queries to a list and execute them all in a single shot, like the old Connection/Statement executeBatch?
Yes, you can do it easily using Hibernate: pass the list of objects with the updated values that you want to persist, then just do:
private void updateRecord(List<Records> records) {
    int batchSize = 10;
    for (int i = 0; i < records.size(); i++) {
        getSession().saveOrUpdate(records.get(i));
        if ((i + 1) % batchSize == 0) {
            // flush a chunk of updates and clear the session to free memory
            getSession().flush();
            getSession().clear();
        }
    }
}
The list of records you feed to this method should contain the updated values you want to persist.
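One caveat, hedged: the flush()/clear() pattern above bounds session memory, but Hibernate only groups the statements into real JDBC batches when its batch size is configured. A minimal sketch of setting it while building the SessionFactory (the hibernate.cfg.xml it reads is assumed to exist):

// Match the property value to the batchSize used in the loop above.
SessionFactory sessionFactory = new Configuration()
        .configure() // reads hibernate.cfg.xml (assumed present)
        .setProperty("hibernate.jdbc.batch_size", "10")
        .buildSessionFactory();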

Spring jdbc template executing batch updates has bad performance [duplicate]

I'm trying to find the fastest way to do batch inserts.
I tried inserting several batches with jdbcTemplate.update(String sql), where the
sql was built with a StringBuilder and looked like:
INSERT INTO TABLE(x, y, i) VALUES(1,2,3), (1,2,3), ... , (1,2,3)
The batch size was exactly 1000, and I inserted nearly 100 batches.
I checked the time using StopWatch and found the insert time was:
min[38ms], avg[50ms], max[190ms] per batch
I was glad, but I wanted to make my code better.
After that, I tried to use jdbcTemplate.batchUpdate in way like:
jdbcTemplate.batchUpdate(sql, new BatchPreparedStatementSetter() {
    @Override
    public void setValues(PreparedStatement ps, int i) throws SQLException {
        // ...
    }

    @Override
    public int getBatchSize() {
        return 1000;
    }
});
where the sql looked like:
INSERT INTO TABLE(x, y, i) VALUES(1,2,3);
and I was disappointed! jdbcTemplate executed every single insert of the 1000-line batch separately. I looked at the mysql log and found a thousand individual inserts there.
I checked the time using StopWatch and found the insert time was:
min[900ms], avg[1100ms], max[2000ms] per batch
So, can anybody explain why jdbcTemplate does separate inserts with this method? Why is the method named batchUpdate?
Or maybe I am using the method the wrong way?
These parameters in the JDBC connection URL can make a big difference in the speed of batched statements; in my experience, they speed things up:
?useServerPrepStmts=false&rewriteBatchedStatements=true
See: JDBC batch insert performance
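For concreteness, a hedged example of where those flags go; host, port, database name, and credentials are placeholders. rewriteBatchedStatements=true lets MySQL Connector/J rewrite a statement batch into multi-row inserts on the wire:

String url = "jdbc:mysql://localhost:3306/mydb"
        + "?useServerPrepStmts=false&rewriteBatchedStatements=true";
Connection conn = DriverManager.getConnection(url, "user", "password");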
I found a major improvement by setting the argTypes array in the call.
In my case, with Spring 4.1.4 and Oracle 12c, for insertion of 5000 rows with 35 fields:
jdbcTemplate.batchUpdate(insert, parameters); // Takes 7 seconds
jdbcTemplate.batchUpdate(insert, parameters, argTypes); // Takes 0.08 seconds!!!
The argTypes param is an int array where you set the SQL type of each field, in this way:
int[] argTypes = new int[35];
argTypes[0] = Types.VARCHAR;
argTypes[1] = Types.VARCHAR;
argTypes[2] = Types.VARCHAR;
argTypes[3] = Types.DECIMAL;
argTypes[4] = Types.TIMESTAMP;
.....
I debugged org\springframework\jdbc\core\JdbcTemplate.java and found that most of the time was spent trying to determine the type of each field, and this was done for every record.
Hope this helps!
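A minimal, hedged sketch of the overload being described, batchUpdate(String, List<Object[]>, int[]), using an illustrative two-column table:

List<Object[]> parameters = new ArrayList<>();
parameters.add(new Object[] { "Alice", "New York" });
parameters.add(new Object[] { "Bob", "Boston" });

// One java.sql.Types constant per placeholder, so Spring skips the
// per-row parameter-metadata lookup described above.
int[] argTypes = { Types.VARCHAR, Types.VARCHAR };

jdbcTemplate.batchUpdate("insert into person (name, city) values (?, ?)",
        parameters, argTypes);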
I also faced this issue with the Spring JDBC template. Probably with Spring Batch, the statement was executed and committed on every insert, or in chunks, which slowed things down.
I replaced the jdbcTemplate.batchUpdate() code with original JDBC batch insertion code and found a major performance improvement.
DataSource ds = jdbcTemplate.getDataSource();
Connection connection = ds.getConnection();
connection.setAutoCommit(false);
String sql = "insert into employee (name, city, phone) values (?, ?, ?)";
PreparedStatement ps = connection.prepareStatement(sql);
final int batchSize = 1000;
int count = 0;
for (Employee employee : employees) {
    ps.setString(1, employee.getName());
    ps.setString(2, employee.getCity());
    ps.setString(3, employee.getPhone());
    ps.addBatch();
    ++count;
    if (count % batchSize == 0 || count == employees.size()) {
        ps.executeBatch(); // flush each full batch, plus the final partial one
        ps.clearBatch();
    }
}
connection.commit();
ps.close();
Check this link as well
JDBC batch insert performance
Simply use a transaction: add @Transactional on the method.
Be sure to declare the correct TX manager if you are using several datasources, @Transactional("dsTxManager"). I have a case inserting 60000 records; it takes about 15s, with no other tweaks:
@Transactional("myDataSourceTxManager")
public void save(...) {
    ...
    jdbcTemplate.batchUpdate(query, new BatchPreparedStatementSetter() {
        @Override
        public void setValues(PreparedStatement ps, int i) throws SQLException {
            ...
        }

        @Override
        public int getBatchSize() {
            if (data == null) {
                return 0;
            }
            return data.size();
        }
    });
}
Change your SQL insert to a single-row statement, INSERT INTO TABLE(x, y, i) VALUES(?,?,?); the framework creates the loop for you.
For example:
public void insertBatch(final List<Customer> customers) {
    String sql = "INSERT INTO CUSTOMER " +
            "(CUST_ID, NAME, AGE) VALUES (?, ?, ?)";
    getJdbcTemplate().batchUpdate(sql, new BatchPreparedStatementSetter() {
        @Override
        public void setValues(PreparedStatement ps, int i) throws SQLException {
            Customer customer = customers.get(i);
            ps.setLong(1, customer.getCustId());
            ps.setString(2, customer.getName());
            ps.setInt(3, customer.getAge());
        }

        @Override
        public int getBatchSize() {
            return customers.size();
        }
    });
}
If you have something like this, Spring will do something like:
for(int i = 0; i < getBatchSize(); i++){
execute the prepared statement with the parameters for the current iteration
}
The framework first creates a PreparedStatement from the query (the sql variable), then the setValues method is called and the statement is executed. That is repeated as many times as you specify in the getBatchSize() method. So the right way to write the insert statement is with only one VALUES clause.
You can take a look at http://docs.spring.io/spring/docs/3.0.x/reference/jdbc.html
I also had a bad time with the Spring JDBC batch template. In my case it would have been insane to use pure JDBC, so instead I used NamedParameterJdbcTemplate. It was a must-have in my project, but it was way too slow for inserting hundreds of thousands of rows into the database.
To see what was going on, I sampled it with VisualVM during the batch update, and voilà:
What was slowing the process down was that, while setting the parameters, Spring JDBC was querying the database for the metadata of each parameter, and it seemed to do this for every parameter of every row, every time. So I just taught Spring to ignore the parameter types (as warned about in the Spring documentation on batch operations with a list of objects):
@Bean(name = "named-jdbc-tenant")
public synchronized NamedParameterJdbcTemplate getNamedJdbcTemplate(@Autowired TenantRoutingDataSource tenantDataSource) {
    System.setProperty("spring.jdbc.getParameterType.ignore", "true");
    return new NamedParameterJdbcTemplate(tenantDataSource);
}
Note: the system property must be set before the JDBC template object is created. It might be possible to set it in application.properties instead, but this solved it for me and I've never touched it again.
I don't know if this will work for you, but here's a Spring-free way that I ended up using. It was significantly faster than the various Spring methods I tried. I even tried the JDBC template batchUpdate method another answer describes, but even that was slower than I wanted. I'm not sure why, and the Internet didn't have many answers either; I suspected it had to do with how commits were being handled.
This approach is straight JDBC using the java.sql packages and PreparedStatement's batch interface. It was the fastest way I could get 24M records into a MySQL DB.
I more or less just built up collections of "record" objects and then called the code below in a method that batch-inserted all the records. The loop that built the collections was responsible for managing the batch size.
I was trying to insert 24M records into a MySQL DB, and it was going ~200 records per second using Spring Batch. When I switched to this method, it went up to ~2500 records per second, so my 24M-record load went from a theoretical 1.5 days to about 2.5 hours.
First create a connection...
Connection conn = null;
try {
    Class.forName("com.mysql.jdbc.Driver");
    conn = DriverManager.getConnection(connectionUrl, username, password);
} catch (SQLException e) {
} catch (ClassNotFoundException e) {
}
Then create a prepared statement and load it with batches of values for insert, and then execute as a single batch insert...
PreparedStatement ps = null;
try {
    conn.setAutoCommit(false);
    ps = conn.prepareStatement(sql); // INSERT INTO TABLE(x, y, i) VALUES(?,?,?)
    for (MyRecord record : records) {
        try {
            ps.setString(1, record.getX());
            ps.setString(2, record.getY());
            ps.setString(3, record.getI());
            ps.addBatch();
        } catch (Exception e) {
            ps.clearParameters();
            logger.warn("Skipping record...", e);
        }
    }
    ps.executeBatch();
    conn.commit();
} catch (SQLException e) {
} finally {
    if (null != ps) {
        try { ps.close(); } catch (SQLException e) {}
    }
}
Obviously I've removed error handling, and the query and Record object are notional.
Edit:
Since your original question was comparing the insert into foobar values (?,?,?), (?,?,?)...(?,?,?) method to Spring Batch, here's a more direct response to that:
It looks like your original method is likely the fastest way to do bulk data loads into MySQL short of using something like the LOAD DATA INFILE approach. A quote from the MySQL docs (http://dev.mysql.com/doc/refman/5.0/en/insert-speed.html):
If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time. This is considerably faster (many times faster in some cases) than using separate single-row INSERT statements.
You could modify the Spring JDBC Template batchUpdate method to do an insert with multiple VALUES lists specified per setValues call, but you'd have to manually keep track of the index values as you iterate over the set of things being inserted. And you'd run into a nasty edge case at the end, when the total number of things being inserted isn't a multiple of the number of VALUES lists in your prepared statement.
If you use the approach I outline, you could do the same thing (use a prepared statement with multiple VALUES lists), and when you hit that edge case at the end it's a little easier to deal with, because you can build and execute one last statement with exactly the right number of VALUES lists. It's a bit hacky, but most optimized things are. A sketch of this chunking follows.
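A hedged sketch of that chunking, reusing the notional MyRecord from above: one prepared statement covers the full chunks, and one final statement is built with exactly the number of VALUES lists the remainder needs. CHUNK and the table/column names are assumptions:

static final int CHUNK = 100; // rows per multi-VALUES statement (illustrative)

static String multiValuesSql(int rows) {
    StringBuilder sb = new StringBuilder("INSERT INTO foobar (x, y, i) VALUES ");
    for (int r = 0; r < rows; r++) {
        sb.append(r == 0 ? "(?,?,?)" : ", (?,?,?)");
    }
    return sb.toString();
}

static void bulkInsert(Connection conn, List<MyRecord> records) throws SQLException {
    int fullChunks = records.size() / CHUNK;
    try (PreparedStatement ps = conn.prepareStatement(multiValuesSql(CHUNK))) {
        for (int c = 0; c < fullChunks; c++) {
            int p = 1; // JDBC parameter indexes are 1-based
            for (MyRecord record : records.subList(c * CHUNK, (c + 1) * CHUNK)) {
                ps.setString(p++, record.getX());
                ps.setString(p++, record.getY());
                ps.setString(p++, record.getI());
            }
            ps.executeUpdate();
        }
    }
    int remainder = records.size() % CHUNK; // the edge case at the end
    if (remainder > 0) {
        try (PreparedStatement ps = conn.prepareStatement(multiValuesSql(remainder))) {
            int p = 1;
            for (MyRecord record : records.subList(fullChunks * CHUNK, records.size())) {
                ps.setString(p++, record.getX());
                ps.setString(p++, record.getY());
                ps.setString(p++, record.getI());
            }
            ps.executeUpdate();
        }
    }
}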
The solution given by @Rakesh worked for me.
Significant improvement in performance: it took 8 min before, and less than 2 min with this solution.
DataSource ds = jdbcTemplate.getDataSource();
Connection connection = ds.getConnection();
connection.setAutoCommit(false);
String sql = "insert into employee (name, city, phone) values (?, ?, ?)";
PreparedStatement ps = connection.prepareStatement(sql);
final int batchSize = 1000;
int count = 0;
for (Employee employee : employees) {
    ps.setString(1, employee.getName());
    ps.setString(2, employee.getCity());
    ps.setString(3, employee.getPhone());
    ps.addBatch();
    ++count;
    if (count % batchSize == 0 || count == employees.size()) {
        ps.executeBatch();
        ps.clearBatch();
    }
}
connection.commit();
ps.close();
I encountered a serious performance issue with JdbcBatchItemWriter.write() (link) from Spring Batch and found that the write logic eventually delegates to JdbcTemplate.batchUpdate().
Adding the Java system property spring.jdbc.getParameterType.ignore=true fixed the performance issue entirely (from 200 records per second to ~5000).
The fix was tested to work on both PostgreSQL and MS SQL (it might not be dialect specific).
... and ironically, Spring documents this behaviour under a "note" section (link):
In such a scenario, with automatic setting of values on an underlying PreparedStatement, the corresponding JDBC type for each value needs to be derived from the given Java type. While this usually works well, there is a potential for issues (for example, with Map-contained null values). Spring, by default, calls ParameterMetaData.getParameterType in such a case, which can be expensive with your JDBC driver. You should use a recent driver version and consider setting the spring.jdbc.getParameterType.ignore property to true (as a JVM system property or in a spring.properties file in the root of your classpath) if you encounter a performance issue — for example, as reported on Oracle 12c (SPR-16139).
Alternatively, you might consider specifying the corresponding JDBC types explicitly, either through a 'BatchPreparedStatementSetter' (as shown earlier), through an explicit type array given to a 'List<Object[]>' based call, through 'registerSqlType' calls on a custom 'MapSqlParameterSource' instance, or through a 'BeanPropertySqlParameterSource' that derives the SQL type from the Java-declared property type even for a null value.
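Of those alternatives, here is a minimal, hedged sketch of the registerSqlType route on a MapSqlParameterSource; the table and column names are illustrative:

MapSqlParameterSource params = new MapSqlParameterSource();
params.addValue("name", "Alice");
params.addValue("hiredAt", null); // a null value: the type can't be derived from it
params.registerSqlType("name", Types.VARCHAR);
params.registerSqlType("hiredAt", Types.TIMESTAMP); // explicit type, even for null

namedParameterJdbcTemplate.update(
        "insert into employee (name, hired_at) values (:name, :hiredAt)", params);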

Is it possible to have a generic bulk insert statement?

I have a database import process with a list of files that correspond to tables. The files are simple CSV text files filled with data.
I would like to speed up the import by using bulk inserts, so I have the following method:
private void bulkInsert(String tableName) {
    String sql = "INSERT INTO " + tableName + " VALUES (?,?,?);";
    SQLiteStatement statement = sampleDB.compileStatement(sql);
    sampleDB.beginTransaction();
    try {
        for (int i = 0; i < 100; i++) {
            statement.clearBindings();
            statement.bindLong(1, i);
            statement.bindLong(2, i);
            statement.bindLong(3, i * i);
            statement.execute();
        }
        sampleDB.setTransactionSuccessful();
    } finally {
        // end the transaction even if execute() throws
        sampleDB.endTransaction();
    }
}
The above is not the exact method I'm currently using; in that method I have 3 columns of type long.
However, since I have different tables with columns of different data types, is it possible to generalize the above method? Maybe iterate through the columns and get their types?
EDIT
Would bindAllArgsAsStrings work? The data stored in the columns might be text, integer, or real. I might also have NULL values in the data.
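A hedged sketch of one way to generalize the method: since SQLite columns use type affinity, CSV fields can be bound as strings (or NULL) and SQLite coerces them to each column's declared affinity, so per-column type dispatch may be unnecessary. The method signature and CSV handling are assumptions:

private void bulkInsert(String tableName, int columnCount, List<String[]> csvRows) {
    // Build "INSERT INTO <table> VALUES (?,?,...,?)" for columnCount columns.
    StringBuilder placeholders = new StringBuilder();
    for (int c = 0; c < columnCount; c++) {
        placeholders.append(c == 0 ? "?" : ",?");
    }
    String sql = "INSERT INTO " + tableName + " VALUES (" + placeholders + ");";
    SQLiteStatement statement = sampleDB.compileStatement(sql);
    sampleDB.beginTransaction();
    try {
        for (String[] row : csvRows) {
            statement.clearBindings();
            for (int c = 0; c < columnCount; c++) {
                if (row[c] == null) {
                    statement.bindNull(c + 1);           // bind indexes are 1-based
                } else {
                    statement.bindString(c + 1, row[c]); // affinity coerces text/integer/real
                }
            }
            statement.execute();
        }
        sampleDB.setTransactionSuccessful();
    } finally {
        sampleDB.endTransaction();
    }
}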

Database doesn't like reading values from a loop

I have a Java database successfully connected to my Java code. That's all fine; it works.
When I store a single result from the database into a variable, it works perfectly.
Now, as I have to do this 8 times, I used a loop and an array; however, the try/catch reports an error: java.lang.NullPointerException
Further investigation suggests that, strangely, it doesn't like the loop.
public String Title[]; // in class but out of any methods

public void gettinginfo()
{
    try
    {
        int AB = 0; // array index starts from 0
                    // ID in database starts from 1
        for (int i = 1; i <= 8; i++)
        {
            String query = "SELECT * FROM students WHERE ID = " + i;
            Rs = St.executeQuery(query);
            while (Rs.next())
            {
                Title[AB] = Rs.getString("StudentName");
                AB++;
            }
        }
    }
    catch (Exception ex)
    {
        System.out.println("Error is: " + ex);
    }
}
What line is your NullPointerException occurring on? Likely your Title array has not been initialized. If you know how many rows the query will return, you can say:
Title = new String[numRows];
But if you don't, you'll need to either run a SELECT count(*) ... query or use an ArrayList or other resizable list, instead of an array.
Your code is also very poorly structured, which is no small part of why you're having trouble debugging this. I've cleaned up your code below, with comments explaining my changes:
public class YourClass
{
    private static final int MAX_ID = 8; // or however you want to set the size
    private String[] title; // [] after the type is easier to read, lower-case variables
    private Connection conn; // I'm assuming the class will be provided a DB connection

    // Note the Statement and ResultSet objects are not defined in the class, to
    // minimize their scope.

    public void queryInfo() throws SQLException // name suggests a query (potentially expensive) will be run
    {
        title = new String[MAX_ID]; // **initialize title**
        // We use a try-with-resources block to ensure the statement is safely closed;
        // even better would be to use a PreparedStatement here
        try (Statement st = conn.createStatement())
        {
            // You're executing 8 separate queries here, where one will do
            String query = "SELECT * FROM students WHERE ID >= 1 AND ID <= " + MAX_ID;
            // Again, we need to close the result set when we're done
            try (ResultSet rs = st.executeQuery(query))
            {
                int i = 0;
                while (rs.next())
                {
                    title[i++] = rs.getString("StudentName");
                }
            } // close our ResultSet
        } // close our Statement
    }

    // provide a separate getter method, rather than making the array public
    public String[] getTitles()
    {
        return title;
    }
}
There's still more that could be improved - using an array seems like a poor design, as does calling a method which populates a class variable rather than simply having queryInfo() return a new array. You can also look into using PreparedStatement. Hopefully these suggestions help.
Make sure that the Title array and the Statement St object are initialized and not null. These are the only two causes I suspect. Post the FULL stack trace if that doesn't help.
The Title array is NULL. "new" this array to a size equal to the number of rows. If you don't know the number of rows, fire a count(*) query first, find out the number of rows, and then instantiate the Title array, or use an ArrayList<String> instead of a String array.
I am assuming that you have not initialized your Title array; you have to set it equal to something, or it will just be null, which will cause a NullPointerException. But as others have stated, there is no way to be sure, since you haven't given us a full stack trace or even the line number of the exception. In this case the exception should be handled as such:
try {
    // your code here
} catch (Exception ex) {
    ex.printStackTrace();
}
This code will give you the full stack trace making it much easier to track down the issue.
Also you may want to consider using an ArrayList instead of an array:
List<String> Title = new ArrayList<String>();
Then to add to it:
Title.add(Rs.getString("StudentName"));
If you need it as an array later then:
String[] title = Title.toArray(new String[Title.size()]);
You can read more about ArrayLists here.

Total Number of Row Resultset getRow Method

Read the following code:
public class selectTable {
    public static ResultSet rSet;
    public static int total = 0;

    public static ResultSet onLoad_Opetations(Connection Conn, int rownum, String sql)
    {
        int rowNum = rownum;
        int totalrec = 0;
        try
        {
            Conn = ConnectionODBC.getConnection();
            Statement stmt = Conn.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
            String sqlStmt = sql;
            rSet = stmt.executeQuery(sqlStmt);
            total = rSet.getRow();
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
        System.out.println("Total Number of Records=" + totalrec);
        return rSet;
    }
}
The following code doesn't show the actual total:
total = rSet.getRow();
My jTable displays 4 records, but total = 0; when I evaluate it in the debugger, it shows:
total = (int) 0
rather than total = (int) 4
And if I call
rSet.last(); just above the line total = rSet.getRow();
then total shows the accurate value (4), but rSet returns nothing and the jTable is empty.
Update me!
BalusC's answer is right! But to spell it out for your instance variables:
rSet.last();
total = rSet.getRow();
and then, the call you are missing:
rSet.beforeFirst();
The remaining code stays the same, and you will get your desired result.
You need to call ResultSet#beforeFirst() to put the cursor back to before the first row before you return the ResultSet object. This way the user will be able to use next() the usual way.
resultSet.last();
rows = resultSet.getRow();
resultSet.beforeFirst();
return resultSet;
However, you have bigger problems with the code given so far. It's leaking DB resources, and it is also not a proper OOP approach. Look up the DAO pattern. Ultimately you'd like to end up with:
public List<Operations> list() throws SQLException {
// Declare Connection, Statement, ResultSet, List<Operation>.
try {
// Use Connection, Statement, ResultSet.
while (resultSet.next()) {
// Add new Operation to list.
}
} finally {
// Close ResultSet, Statement, Connection.
}
return list;
}
This way the caller just has to use List#size() to know the number of records.
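A filled-in, hedged version of that skeleton, assuming an Operation bean with a single name property; all names are illustrative:

public List<Operation> list() throws SQLException {
    List<Operation> operations = new ArrayList<>();
    // try-with-resources closes the ResultSet, Statement, and Connection in reverse order
    try (Connection connection = dataSource.getConnection();
         PreparedStatement statement = connection.prepareStatement("SELECT name FROM operations");
         ResultSet resultSet = statement.executeQuery()) {
        while (resultSet.next()) {
            Operation operation = new Operation();
            operation.setName(resultSet.getString("name"));
            operations.add(operation);
        }
    }
    return operations;
}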
The getRow() method retrieves the current row number, not the number of rows. So before starting to iterate over the ResultSet, getRow() returns 0.
To get the actual number of rows returned after executing your query, there is no free method: you are supposed to iterate over it.
Yet, if you really need to retrieve the total number of rows before processing them, you can:
1. Call ResultSet.last()
2. Call ResultSet.getRow() to get the total number of rows
3. Call ResultSet.beforeFirst()
4. Process the ResultSet normally
As others have answered, there is no way to get the count of rows without iterating to the end. You could do that, but you may not want to; note the following points:
For many RDBMS systems, ResultSet is a streaming API: it does not load (or maybe even fetch) all the rows from the database server. See this question on SO. By iterating to the end of the ResultSet, you may add significantly to the execution time in certain cases.
A default ResultSet object is not updatable and has a cursor that moves forward only. This means that unless you execute the query with ResultSet.TYPE_SCROLL_INSENSITIVE, rSet.beforeFirst() will throw an SQLException. The reason is that a scrollable cursor has a cost; according to the documentation, it may throw SQLFeatureNotSupportedException even if you create a scrollable cursor.
Populating and returning a List<Operations> means that you will also need extra memory; for very large result sets this will not work at all.
So the big question is: which RDBMS? All in all, I would suggest not logging the number of records.
A better way would be to use the SQL SELECT COUNT statement: when you need the number of rows returned, execute another query that returns exactly that count.
try {
    Conn = ConnectionODBC.getConnection();
    Statement stmt = Conn.createStatement();
    String sqlRow = "SELECT COUNT(*) FROM (" + sql + ") rowquery";
    ResultSet total = stmt.executeQuery(sqlRow);
    total.next(); // move to the single row holding the count
    int rowcount = total.getInt(1);
} catch (Exception e) {
    System.out.println(e.getMessage());
}
The getRow() method will always yield 0 after a query:
ResultSet.getRow()
Retrieves the current row number.
Second, you output totalrec but never assign anything to it.
You can't get the number of rows returned in a ResultSet without iterating through it. And why would you return a ResultSet without iterating through it? There'd be no point in executing the query in the first place.
A better solution would be to separate persistence from view. Create a separate Data Access Object that handles all the database queries for you. Let it get the values to be displayed in the JTable, load them into a data structure, and then return it to the UI for display. The UI will have all the information it needs then.
I have solved that problem. All I do is:
private int num_rows;
And then, in the method that uses the ResultSet, put this code:
while (this.rs.next())
{
    this.num_rows++;
}
That's all
The best way to get the number of rows from a ResultSet is to run a COUNT query against the database and then read the count with rs.getInt(1).
From my code, look at it:
String query = "SELECT COUNT(*) FROM table";
ResultSet rs = new DatabaseConnection().selectData(query);
rs.next(); // advance to the single row holding the count
int numberOfRows = rs.getInt(1);
This returns the number of rows fetched from the database as an int.
Here DatabaseConnection().selectData() is my own code for accessing the database.
I was also stuck here, but then solved it...
