How to insert records faster - Java

I have to read records from a CSV file and store them in a MySQL database.
I know about "LOAD DATA INFILE", but in my case I have to take each record from the file, check whether it is in a valid format/length etc., and then store it in the database.
// list to store records from CSV file
ArrayList<String> list = new ArrayList<String>();
// Read one line at a time
while ((nextLine = reader.readNext()) != null)
{
    for (String number : nextLine)
    {
        if (number.length() > 12 && number.startsWith("88"))
        {
            list.add(number);
        }
        else if (number.length() > 9 && number.startsWith("54"))
        {
            list.add(number);
        }
        else if (number.length() > 8 && number.startsWith("99"))
        {
            list.add(number);
        }
        else
        {
            // ....
        }
        // method to insert data in database
        insertInToDatabase(list);
    }
}
And the method to insert records into the database (taken from here):
private void insertInToDatabase(ArrayList<String> list)
{
    try
    {
        String query = "INSERT INTO mytable(numbers) VALUES(?)";
        prepStm = conn.prepareStatement(query);
        for (String test : list)
        {
            prepStm.setString(1, test);
            prepStm.addBatch(); // add to batch
            prepStm.clearParameters();
        }
        prepStm.executeBatch();
    }
    catch (SQLException e)
    {
        e.printStackTrace();
    }
}
This is working, but the rate at which the records are inserted is very slow.
Is there any way I can insert records faster?

You would need to use "rewriteBatchedStatements=true": it is a MySQL optimization that attempts to reduce round trips to the server by consolidating the inserts or updates into as few packets as possible.
Please refer to:
https://anonymousbi.wordpress.com/2014/02/11/increase-mysql-output-to-80k-rowssecond-in-pentaho-data-integration/
There are other optimizations in that article as well. Hope this speeds up the batching.
EDIT 1:
There is a lucid explanation of this parameter here as well: MySQL and JDBC with rewriteBatchedStatements=true
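For reference, a minimal sketch of turning the flag on via the JDBC URL and batching against it; the host, database name, and credentials are placeholders, not taken from the question:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

String url = "jdbc:mysql://localhost:3306/mydb?rewriteBatchedStatements=true"; // placeholder host/db
Connection conn = DriverManager.getConnection(url, "user", "password");        // placeholder credentials
conn.setAutoCommit(false);                       // commit once per batch, not once per row
PreparedStatement prepStm = conn.prepareStatement("INSERT INTO mytable(numbers) VALUES(?)");
for (String number : list) {
    prepStm.setString(1, number);
    prepStm.addBatch();
}
prepStm.executeBatch();                          // Connector/J rewrites the batch into multi-row INSERTs
conn.commit();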

@Khanna111's answer is good.
I don't know if it helps, but try checking the table's engine type. I once had a problem where records were inserted very slowly; I changed the engine from InnoDB to MyISAM and insertion became very fast.

I think a better approach is to process the CSV file with the rules you defined and write the valid records out to another CSV; once that output CSV is prepared, run LOAD DATA INFILE on it (a rough sketch follows).
It'll be pretty quick.
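A rough sketch of that approach, assuming the opencsv reader from the question, a connection opened with allowLoadLocalInfile=true, and a hypothetical isValid() helper that applies the length/prefix rules:
Path validCsv = Files.createTempFile("valid-numbers", ".csv");
try (BufferedWriter out = Files.newBufferedWriter(validCsv);
     CSVReader reader = new CSVReader(new FileReader("input.csv"))) { // input.csv is a placeholder
    String[] nextLine;
    while ((nextLine = reader.readNext()) != null) {
        for (String number : nextLine) {
            if (isValid(number)) {      // the same length/prefix checks as in the question
                out.write(number);
                out.newLine();
            }
        }
    }
}
try (Statement stmt = conn.createStatement()) {
    // one value per line, so the default line terminator is enough
    stmt.execute("LOAD DATA LOCAL INFILE '" + validCsv.toAbsolutePath().toString().replace('\\', '/')
            + "' INTO TABLE mytable (numbers)");
}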

If you want to insert through your own application, build a multi-row insert like this and execute it against the MySQL server:
String query = "INSERT INTO mytable(numbers) "
             + "VALUES (0), "
             + "       (1), "
             + "       (2), "
             + "       (3)";
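If the values come from the validated list rather than literals, a hedged way to build the same multi-row statement with placeholders (the chunk size of 1000 is an arbitrary choice, not from the answer above):
int chunkSize = 1000; // arbitrary; tune for your row count
for (int start = 0; start < list.size(); start += chunkSize) {
    List<String> chunk = list.subList(start, Math.min(start + chunkSize, list.size()));
    StringBuilder sql = new StringBuilder("INSERT INTO mytable(numbers) VALUES ");
    for (int i = 0; i < chunk.size(); i++) {
        sql.append(i == 0 ? "(?)" : ",(?)");
    }
    try (PreparedStatement ps = conn.prepareStatement(sql.toString())) {
        for (int i = 0; i < chunk.size(); i++) {
            ps.setString(i + 1, chunk.get(i));
        }
        ps.executeUpdate();
    }
}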

Related

How to utilize Pageable when running a custom delete query in Spring JPA for mongodb?

I am working on a tool that lets admins purge data from the database. One of our collections has millions of records, which makes deletes seize up the system. Originally I was just running a query that returns a Page and dropping that into the standard delete. Ideally I'd prefer to run the query and delete in one go.
@Query(value = "{ 'timestamp' : {$gte : ?0, $lte: ?1 }}")
public Page deleteByTimestampBetween(Date from, Date to, Pageable pageable);
Is this possible? Using the above code the system behaves the same: the program doesn't continue past the delete call and the data isn't removed from Mongo. Or is there a better approach?
I don't think it is possible using the Pageable/@Query approach. You can use bulk writes to process the deletes in batches.
Something like:
int count = 0;
int batch = 100; // send 100 delete requests at a time
BulkOperations bulkOps = mongoTemplate.bulkOps(BulkOperations.BulkMode.UNORDERED, YourPojo.class);
List<DateRange> dateRanges = generateDateRanges(from, to, step); // helper that splits [from, to] into ranges of the given step
for (DateRange dateRange : dateRanges) {
    Query query = new Query();
    Criteria criteria = new Criteria().andOperator(
            Criteria.where("timestamp").gte(dateRange.from),
            Criteria.where("timestamp").lte(dateRange.to));
    query.addCriteria(criteria);
    bulkOps.remove(query);
    count++;
    if (count == batch) {
        bulkOps.execute();
        // start a fresh bulk for the next batch rather than reusing the executed one
        bulkOps = mongoTemplate.bulkOps(BulkOperations.BulkMode.UNORDERED, YourPojo.class);
        count = 0;
    }
}
if (count > 0) {
    bulkOps.execute();
}

Out of memory when inserting record batches through JDBC

I want to copy a table (10 million records) from the origin DB (SQLite 3) into another database called targetDB.
My method works as follows:
read data from the origin table into a ResultSet, then generate a corresponding insert SQL statement for every record and commit to batch-insert when the record count reaches 10000. The code is as follows:
public void transfer() throws IOException, SQLException {
    targetDBOperate.setCommit(false); // batch insert
    int count = 0;
    String[] cols = parser(propertyPath); // get fields of data table
    String query = "select * from " + originTable;
    ResultSet rs = originDBOperate.executeQuery(query); // get origin table
    String base = "insert into " + targetTable;
    while (rs.next()) {
        count++;
        String insertSql = buildInsertSql(base, rs, cols); // corresponding insert sql
        targetDBOperate.executeSql(insertSql);
        if (count % 10000 == 0) {
            targetDBOperate.commit(); // batch insert
        }
    }
    targetDBOperate.closeConnection();
}
The following picture shows the trend of memory usage (the vertical axis represents memory used): it keeps growing until the process runs out of memory. Stack Overflow has some relevant questions, such as Out of memory when inserting records in SQLite, FireDac, Delphi, but I haven't solved my problem because we use a different implementation. My hypothesis is that while the record count has not yet reached 10000, the corresponding insert SQL statements are cached in memory and are not released when commit executes by default. Is that right? Any advice will be appreciated.
When moving a large number of rows into SQLite or any other relational database you should follow some basic principles:
1) set autoCommit to false, i.e. do not commit each insert
2) use batch updates, i.e. do not make a round trip for each row
3) use prepared statements, i.e. do not parse each insert.
Putting this together you get the following code:
cn is the source connection, cn2 is the target connection.
For each inserted row you call addBatch, but only once per batchSize do you call executeBatch, which initiates a round trip.
Do not forget a last executeBatch at the end of the loop and the final commit.
cn2.setAutoCommit(false)
String SEL_STMT = "select id, col1, col2 from tab1"
String INS_STMT = "insert into tab2(id, col1, col2) values(?,?,?)"
def batchSize = 10000
def i = 0
def stmt = cn.prepareStatement(SEL_STMT)
def stmtIns = cn2.prepareStatement(INS_STMT)
rs = stmt.executeQuery()
while (rs.next())
{
    stmtIns.setLong(1, rs.getLong(1))
    stmtIns.setString(2, rs.getString(2))
    stmtIns.setTimestamp(3, rs.getTimestamp(3))
    stmtIns.addBatch()
    i += 1
    if (i == batchSize) {
        def insRec = stmtIns.executeBatch()
        i = 0
    }
}
rs.close()
stmt.close()
def insRec = stmtIns.executeBatch()
stmtIns.close()
cn2.commit()
A sample test at your data size with sqlite-jdbc-3.23.1:
inserted rows: 10000000
total time taken to insert the batch = 46848 ms
I do not observe any memory issues or problems with a large transaction
You are trying to fetch 10M records in one go by doing the following, which will definitely eat up your memory:
String query = "select * from " + originTable;
ResultSet rs = originDBOperate.executeQuery(query);//get origin table
Use paginated queries to read the data in batches and do the batch inserts accordingly (a sketch follows at the end of this answer).
You are not even doing a batch update; you are simply firing 10K individual queries one after the other with the following code:
String insertSql = buildInsertSql(base,rs,cols);//corresponding insert sql
targetDBOperate.executeSql(insertSql);
if(count%10000==0) {
targetDBOperate.commit();// This simply means that you are commiting after 10K records
}
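A minimal sketch of the paginated-read idea, assuming the origin table has a numeric id column to page on and using plain JDBC connections (originConn, targetConn) instead of the question's wrapper classes; the table and column names are placeholders:
int pageSize = 10000;   // rows per page and per batch; arbitrary choice
long lastId = 0;
targetConn.setAutoCommit(false);
PreparedStatement select = originConn.prepareStatement(
        "select id, col1 from origin_table where id > ? order by id limit ?");
PreparedStatement insert = targetConn.prepareStatement(
        "insert into target_table(id, col1) values (?, ?)");
while (true) {
    select.setLong(1, lastId);
    select.setInt(2, pageSize);
    int fetched = 0;
    try (ResultSet rs = select.executeQuery()) {
        while (rs.next()) {
            lastId = rs.getLong(1);
            insert.setLong(1, lastId);
            insert.setString(2, rs.getString(2));
            insert.addBatch();
            fetched++;
        }
    }
    if (fetched == 0) {
        break;                      // no more rows to copy
    }
    insert.executeBatch();          // one round trip per page
    targetConn.commit();
}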

Spring queryForList not working

I have the following query and the following piece of code to get the results.
List<Map<String, Object>> rows = this.getBemsConnection().queryForList(ItemWorkflowDetails.BEMS_CREATION_DATE_QUERY, new Object[]{itemName});
if (rows != null && !rows.isEmpty()) {
    for (Map<String, Object> row : rows) {
        itemSetupObj.setBemsCreation((String) row.get("BEMS_CREATION"));
        LOGGER.info("Bems Creation Date: {}", itemSetupObj.getBemsCreation());
    }
}
String BEMS_CREATION_DATE_QUERY = "SELECT creation_date bems_creation FROM xxref_cg1_o.mtl_system_items_b WHERE segment1 = ? AND organization_id = 1";
I get data for this query when I run it directly against the backend database, but nothing happens when I execute it from Java. Am I missing something?
Figured out the issue: the input from Java was going in lower case whereas the data was stored in upper case in the table, and the query I ran in the DB had the value in upper case. So indeed, I was not getting data because it could not make a match.
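For illustration, two hedged ways to make the match case-insensitive (the column and query are the ones from the question; note that wrapping an indexed column in UPPER() can prevent index use):
// Option 1: compare case-insensitively in SQL
String BEMS_CREATION_DATE_QUERY = "SELECT creation_date bems_creation FROM xxref_cg1_o.mtl_system_items_b "
        + "WHERE UPPER(segment1) = UPPER(?) AND organization_id = 1";

// Option 2: keep the original query and normalise the Java-side input instead
List<Map<String, Object>> rows = this.getBemsConnection().queryForList(
        ItemWorkflowDetails.BEMS_CREATION_DATE_QUERY, new Object[]{itemName.toUpperCase()});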

MySQL JDBC - not entering ResultSet + incorrect no. of rows

I came across a problem going through a ResultSet generated from my MySQL DB. My query should return at most one row per table (I'm looping through several tables, searching by employee number). I've entered data in some of the tables, but my test output says that the result set contains 0 rows and the code never goes through the ResultSet at all; the output line it's supposed to print never appears. It was in a while loop before I realised it would return at most one row, at which point I just swapped the while(rs.next()) for an if(rs.first()). Still no luck. Any suggestions?
My code looks like this:
try
{
    rsTablesList = stmt.executeQuery("show tables;");
    while (rsTablesList.next())
    {
        String tableName = rsTablesList.getString(1);
        // checking if that table is a non-event table; loop is skipped in such a case
        if (tableName.equalsIgnoreCase("emp"))
        {
            System.out.println("NOT IN EMP");
            continue;
        }
        System.out.println("i'm in " + tableName); // tells us which table we're in
        int checkEmpno = Integer.parseInt(empNoLbl.getText()); // search key
        Statement s = con.createStatement();
        query = "select 'eventname','lastrenewaldate', 'expdate' from " + tableName + " where 'empno'=" + checkEmpno + ";"; // eventname,
        System.out.println("query is \n\t" + query + "");
        rsEventDetails = s.executeQuery(query);
        System.out.println("query executed\n");
        // next two lines for the number of rows
        rsEventDetails.last();
        System.out.println("no. of rows is " + rsEventDetails.getRow() + "\n\n");
        if (rsEventDetails.first())
        {
            System.out.println("inside the if");
            // i will add the row now
            System.out.println("i will add the row now");
            // cdTableModel.addRow(new Object[] {evtname,lastRenewalDate,expiryDate});
        }
    }
}
My output looks like this:
I'm in crm
query is
select 'eventname','lastrenewaldate', 'expdate' from crm where 'empno'=17;
query executed
no. of rows is 0
I'm in dgr
query is
select 'eventname','lastrenewaldate', 'expdate' from dgr where 'empno'=17;
query executed
no. of rows is 0
NOT IN EMP
I'm in eng_prof
query is
select 'eventname','lastrenewaldate', 'expdate' from eng_prof where 'empno'=17;
query executed
no. of rows is 0
I'm in frtol
query is
select 'eventname','lastrenewaldate', 'expdate' from frtol where 'empno'=17;
query executed
no. of rows is 0
(and so on, upto 17 tables.)
The '17' in the query is the empno that I've pulled from the user.
The thing is that I've already entered data in the first two tables, crm and dgr. The same query in the command line interface works; this morning, I tried the program out and it returned data for the one table that had data in it (crm). The next time onwards, nothing.
Context: I'm working on a school project to create some software for my dad's office; it'll help them organise the training etc. schedules for the employees (a little like Google Calendar, I guess). I'm using NetBeans and MySQL on Linux Mint. There are about 17 tables in the database. The user selects an employee name and the program searches for all entries in the database that correspond to an 'event' (my generic name for a test/training/other required event) and puts them into a JTable.
The single quotes around the column names and table name when building the query caused the problem. On changing them to backticks, retrieval works fine and the data comes in as expected (a corrected query is sketched below).
Thank you, @juergend (especially for the nice explanation) and @nailgun!
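For illustration, the corrected query with backticks, plus a hedged PreparedStatement variant that binds the employee number instead of concatenating it (the table name still has to be concatenated, since it comes from SHOW TABLES):
// backticks (or no quoting) make these identifiers; single quotes made them string literals
query = "select `eventname`, `lastrenewaldate`, `expdate` from " + tableName
        + " where `empno` = " + checkEmpno;

// PreparedStatement variant for the empno value
PreparedStatement ps = con.prepareStatement(
        "select `eventname`, `lastrenewaldate`, `expdate` from " + tableName + " where `empno` = ?");
ps.setInt(1, checkEmpno);
rsEventDetails = ps.executeQuery();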

cursor.getCount() returning wrong count using rawQuery

I have used rawQuery to fetch records from a DB table. I have checked the query from the log and it executes perfectly in SQLite, but cursor.getCount() returns the wrong row count and the cursor contains the wrong result set. I have used the following code:
Cursor productCursor = dataHelper.rawQuery(query_str, null);
int list_count = productCursor.getCount();
Log.d("list_count", "" + list_count);
productCursor.moveToFirst();
while (productCursor.isAfterLast() == false) {
    ......
}
There are actually 4 records but the cursor contains only 3. I tested the same query in SQLite and got the correct result set.
It would be helpful if anyone could point out my mistake.
Try this (the moveToFirst() check avoids running the loop body on an empty cursor):
if (productCursor.moveToFirst()) {
    do {
        // your code..
    } while (productCursor.moveToNext());
}
