How to use multithreading to fetch data in MySQL - Java

Hi, I am trying to fetch 50K+ rows from one of the tables in a MySQL DB. It is taking more than 20 minutes to retrieve all the data and write it to a text file. Can I use multithreading to reduce the fetching time and make the code more efficient? Any help will be appreciated.
I have used a plain JDBC connection and ResultSetMetaData to fetch rows from the table.
String row = "";
stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("select * from employee_details");
ResultSetMetaData rsmd = rs.getMetaData();
int columnCount = rsmd.getColumnCount();
while (rs.next()) {
    for (int i = 1; i <= columnCount; i++) { // columns are 1-based; <= keeps the last column
        row = row + rs.getObject(i) + "|";
    }
    row = row + "\r\n";
}
And I am writing the fetched values to a text file as below.
BufferedWriter writer = new BufferedWriter(
        new FileWriter("C:/Users/430398/Desktop/file/abcd.txt"));
writer.write(row);
writer.close();

Remember that rs.next() fetches results from the DB in batches of n rows, where n is a number defined by the JDBC implementation. I assume it is 10 right now. So after every 10 rows it will query the DB again, and hence incur network overhead - even if the DB is on the very same machine.
Just increasing that number will result in a faster loading time.
edit:
adding this
stmt.setFetchSize(50000);
might be it.
Be aware that this results in heavy memory consumption.
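Putting that together with the question's code - a minimal sketch that combines a larger fetch size with a StringBuilder and writes each row out immediately, instead of accumulating one huge String (stmt and conn as in the question; fetch size illustrative):
stmt = conn.createStatement();
stmt.setFetchSize(10000); // larger batches, fewer round trips to the DB
ResultSet rs = stmt.executeQuery("select * from employee_details");
int columnCount = rs.getMetaData().getColumnCount();
BufferedWriter writer = new BufferedWriter(
        new FileWriter("C:/Users/430398/Desktop/file/abcd.txt"));
StringBuilder row = new StringBuilder();
while (rs.next()) {
    row.setLength(0); // reuse the buffer for each row
    for (int i = 1; i <= columnCount; i++) {
        row.append(rs.getObject(i)).append('|');
    }
    row.append("\r\n");
    writer.write(row.toString()); // stream to disk, nothing accumulates in memory
}
writer.close();
(Note: with MySQL Connector/J specifically, the driver reads the entire result set into memory by default regardless of a positive fetch size; stmt.setFetchSize(Integer.MIN_VALUE) switches it to row-by-row streaming, and useCursorFetch=true on the connection enables server-side cursor batches.)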

First you need to identify where the bottleneck is. Is it the SQL query? Or the fetching of the rows via the ResultSet? Or the building of the huge string? Or perhaps the writing of the file?
You need to measure the duration of each of the individual parts mentioned above and tell us the results. Without this knowledge it is not possible to say how to speed up the algorithm.
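A rough way to take those measurements - a sketch using System.nanoTime() around each phase, with stmt and writer as in the question's snippets:
long t0 = System.nanoTime();
ResultSet rs = stmt.executeQuery("select * from employee_details");
long t1 = System.nanoTime(); // query time
int columnCount = rs.getMetaData().getColumnCount();
StringBuilder sb = new StringBuilder();
while (rs.next()) {
    for (int i = 1; i <= columnCount; i++) {
        sb.append(rs.getObject(i)).append('|');
    }
    sb.append("\r\n");
}
long t2 = System.nanoTime(); // fetch + string-building time
writer.write(sb.toString());
writer.close();
long t3 = System.nanoTime(); // file-writing time
System.out.println("query: " + (t1 - t0) / 1000000 + " ms, fetch/build: "
        + (t2 - t1) / 1000000 + " ms, write: " + (t3 - t2) / 1000000 + " ms");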

Related

fastest way to load multiple count queries

I have developed an MIS report in my Swing application: a table in which the first column contains dates, followed by columns of invoice status such as pending for payment, paid, pending for parking, posted, etc.
I am trying to show an Excel-pivot-like table report in Swing, and I have implemented it successfully.
But the problem is that the row count increases day by day, and I have developed a query like
select count(invoice_No)
from MyTable
where ClaimStatus='columnHeader'
AND date='row1stColumn'
which loops through all rows and columns to get the desired counts and displays them in the JTable.
But as I said, the row count grows day by day, and the SQL Server table data is also increasing a lot.
So my count queries take a lot of time to populate that table.
Is there any way to make the above query faster?
If anyone wants to see my code I will provide it.
I have implemented my application, but because of the huge amount of data it takes a long time to show the MIS report.
Please see the picture I have attached, which is the output of my code.
And the code is:
1) For the distinct dates in the first column:
q="Select distinct(Inward_Date_Short) from Inward_Master";
PreparedStatement ps=con.prepareStatement(q);
ResultSet rs=ps.executeQuery();
while(rs.next())
{
inwardDateList.add(rs.getString(1));
}
2) The static columns of the JTable:
headers.add("Pending For Digitization");
headers.add("Pending For Claim Creation");
headers.add("Resolution - Pending For Claim Creation");
headers.add("Pending For Approval");
headers.add("Pending For Parking");
headers.add("Pending For Posting");
headers.add("Objection");
headers.add("Pending For Payment");
headers.add("Paid");
headers.add("Rejected");
headers.add("Outward");
3) Now this is the most important code, which I want to make faster:
for (int i = 0; i < inwardDateList.size(); i++) {
    Vector varsha = new Vector();
    varsha.add(inwardDateList.get(i).toString()); // first cell of the row: the date
    for (int c = 1; c < headers.size(); c++) {
        try (Connection con = dbConnection.dbConnector()) {
            String q = headers.get(c).toString();
            PreparedStatement ps = con.prepareStatement(
                    "Select COUNT_BIG(Inward_No) from Inward_Master where Inward_Date_Short='"
                    + inwardDateList.get(i).toString() + "' AND Claim_Status='" + q + "'");
            //PreparedStatement ps = con.prepareStatement("Select count(Inward_No) from (Select Inward_No from Inward_Master where Inward_Date_Short='" + inwardDateList.get(i).toString() + "' AND Claim_Status='" + q + "') X");
            ResultSet rs = ps.executeQuery();
            rs.next();
            varsha.add(rs.getInt(1)); // append this cell's count to the row vector
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    rowdata.add(varsha); // one complete row per date
}
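For what it's worth, the per-cell queries can usually be collapsed into one aggregated query that SQL Server evaluates in a single pass - a sketch, assuming the same Inward_Master columns and dbConnection helper as in the question:
// One GROUP BY query computes every (date, status) count at once,
// instead of one COUNT query per JTable cell.
String q = "Select Inward_Date_Short, Claim_Status, COUNT_BIG(Inward_No) "
        + "from Inward_Master group by Inward_Date_Short, Claim_Status";
Map<String, Map<String, Long>> counts = new HashMap<String, Map<String, Long>>();
try (Connection con = dbConnection.dbConnector()) {
    PreparedStatement ps = con.prepareStatement(q);
    ResultSet rs = ps.executeQuery();
    while (rs.next()) {
        String date = rs.getString(1);
        Map<String, Long> byStatus = counts.get(date);
        if (byStatus == null) {
            byStatus = new HashMap<String, Long>();
            counts.put(date, byStatus);
        }
        byStatus.put(rs.getString(2), rs.getLong(3));
    }
} catch (Exception e) {
    e.printStackTrace();
}
// Each JTable row can then be filled from the map; a missing entry means a count of 0.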

Java Streaming JDBC resultset to read rows from a table keeping resultset open and without firing SQL again

We have a scenario where we are continuously reading data from a table while another application is inserting records into that table. As per our current implementation, we poll for new records (executing SQL against that table) after a certain period of time, using Spring's JdbcTemplate.
I am thinking of some way to implement a streaming ResultSet from which I can keep reading table data without executing the SQL again and again.
EDIT: Adding a code snippet from our current implementation:
String SQL = "SELECT * FROM ACTIONSTATUS WHERE STATUS = 'FAILED' and ALERTID > :alertId";

public static void fetchActionStatus(NamedParameterJdbcTemplate template,
        HashMap<String, String> paramMap, String fetchSql) throws Exception {
    int i = 0;
    List<Map<String, Object>> results = null;
    while (true) {
        if (results == null || results.size() == i) {
            results = template.queryForList(SQL, paramMap);
            i = 0;
        } else {
            Map<String, Object> row = results.get(i);
            // SOME DATA PROCESSING WHICH MAY TAKE XX MILLISEC.
            System.out.println("ALERTID: " + row.get("ALERTID").toString()
                    + " ACTIONID: " + row.get("ACTIONID").toString()
                    + " STATUS: " + row.get("STATUS").toString());
            Thread.sleep(100);
            paramMap.put("alertId", row.get("ALERTID").toString());
            i++;
        }
    }
}
Some other application is inserting new records into the ACTIONSTATUS table at random intervals, and we need to monitor the table continuously to process all records. Our current implementation is working fine without any issues. We are looking for some other solution to optimize this approach, to somehow stream the ResultSet without executing the SQL again and again.
I don't think there is a way to do this with JDBC.
From an SQL perspective, it is potentially problematic because of the issue of transaction isolation.

Exporting large data into CSV from SQL Server using Java

I have 9 million records in SQL Server. I am trying to export them into CSV files so that I can load the data into MongoDB. I have written Java code for the SQL-to-CSV export, but I have two issues:
If I read all the data into a list and then try to write it to the CSV, I get an OutOfMemoryError.
If I read line by line and write every line to the CSV, it takes a very long time to export the data.
My code is something like:
List list = new ArrayList();
try {
    Class.forName(driver).newInstance();
    conn = DriverManager.getConnection(url, databaseUserName, databasePassword);
    stmt = conn.prepareStatement("select OptimisationId from SubReports");
    result = stmt.executeQuery();
    result.setFetchSize(1000);
    while (result.next()) {
        SubReportsBean bean = new SubReportsBean();
        bean.setOptimisationId(result.getLong("OptimisationId"));
        list.add(bean);
        generateExcel(list);
    }
    conn.close();
}
Can there be a faster approach to export all the data quickly? Or, even better, can it be exported directly to Mongo instead of CSV?
Maybe you should paginate your data by reading only a little at a time, using OFFSET and FETCH NEXT (note that SQL Server requires an ORDER BY with these clauses):
select OptimisationId from SubReports order by OptimisationId OFFSET 0 ROWS FETCH NEXT 1000 ROWS ONLY;
select OptimisationId from SubReports order by OptimisationId OFFSET 1000 ROWS FETCH NEXT 1000 ROWS ONLY;
select OptimisationId from SubReports order by OptimisationId OFFSET 2000 ROWS FETCH NEXT 1000 ROWS ONLY;
...
Just keep a counter of the offset.
If you use this solution then you'll need to modify your code to append to the end of the Excel file - don't keep all your results in memory, otherwise you'll still run into the OutOfMemoryError.
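A sketch of that paging loop in Java (appendToCsv is a hypothetical helper that appends one value to the output file):
int pageSize = 1000;
for (int offset = 0; ; offset += pageSize) {
    PreparedStatement ps = conn.prepareStatement(
            "select OptimisationId from SubReports order by OptimisationId"
            + " OFFSET ? ROWS FETCH NEXT ? ROWS ONLY");
    ps.setInt(1, offset);
    ps.setInt(2, pageSize);
    ResultSet rs = ps.executeQuery();
    int rows = 0;
    while (rs.next()) {
        rows++;
        appendToCsv(rs.getLong("OptimisationId")); // hypothetical append-only writer
    }
    rs.close();
    ps.close();
    if (rows < pageSize) {
        break; // last page reached
    }
}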
Definitely, when dealing with so many records, collecting all the data in a list before dumping it to CSV is bound to fail.
So your solution 2 is the way to go.
Your code seems to correspond to this solution, but I think you've just forgotten to move your list declaration, or to empty your list inside the loop. You could do:
try {
    Class.forName(driver).newInstance();
    conn = DriverManager.getConnection(url, databaseUserName, databasePassword);
    stmt = conn.prepareStatement("select OptimisationId from SubReports");
    result = stmt.executeQuery();
    result.setFetchSize(1000);
    while (result.next()) {
        SubReportsBean bean = new SubReportsBean();
        bean.setOptimisationId(result.getLong("OptimisationId"));
        List list = new ArrayList(); // a fresh list per row, so nothing accumulates
        list.add(bean);
        generateExcel(list);
    }
    conn.close();
}
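If writing one row at a time then turns out to be too slow, a middle ground is to flush in batches - a sketch, assuming generateExcel appends to the file rather than overwriting it:
List<SubReportsBean> batch = new ArrayList<SubReportsBean>();
while (result.next()) {
    SubReportsBean bean = new SubReportsBean();
    bean.setOptimisationId(result.getLong("OptimisationId"));
    batch.add(bean);
    if (batch.size() == 1000) { // write every 1000 rows instead of every row
        generateExcel(batch);   // assumed to append, not overwrite
        batch.clear();
    }
}
if (!batch.isEmpty()) {
    generateExcel(batch); // flush the final partial batch
}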

java: retrieve keys after executeBatch() in H2

I am trying to retrieve the generated keys from an executeBatch() transaction, but I only get the last key that was added.
This is my code:
PreparedStatement ps_insert = conn.prepareStatement(insertQuery, PreparedStatement.RETURN_GENERATED_KEYS);
for (int i = 0; i < adding_dates.length; i++) {
    ps_insert.setInt(1, Integer.parseInt(consultant_id));
    ps_insert.setDate(2, adding_dates[i]);
    ps_insert.setInt(3, Integer.parseInt(room_id));
    ps_insert.addBatch();
}
ps_insert.executeBatch();
ResultSet rs = ps_insert.getGeneratedKeys(); // <-- only the last key retrieved
conn.commit();
What am I doing wrong?
EDIT: Apologies for not mentioning that I am using the H2 database (http://www.h2database.com/html/main.html) in embedded mode.
According to the H2 JDBC driver javadocs, this is the normal behaviour:
Return a result set that contains the last generated auto-increment
key for this connection, if there was one. If no key was generated by
the last modification statement, then an empty result set is returned.
The returned result set only contains the data for the very last row.
You must iterate the ResultSet to retrieve the keys.
PreparedStatement ps_insert = conn.prepareStatement(insertQuery, PreparedStatement.RETURN_GENERATED_KEYS);
for (int i = 0; i < adding_dates.length; i++) {
    ps_insert.setInt(1, Integer.parseInt(consultant_id));
    ps_insert.setDate(2, adding_dates[i]);
    ps_insert.setInt(3, Integer.parseInt(room_id));
    ps_insert.addBatch();
}
ps_insert.executeBatch();
ResultSet rs = ps_insert.getGeneratedKeys();
if (rs.next()) {
    ResultSetMetaData rsmd = rs.getMetaData();
    int colCount = rsmd.getColumnCount();
    do {
        for (int i = 1; i <= colCount; i++) {
            String key = rs.getString(i);
            System.out.println("key " + i + " is " + key);
        }
    } while (rs.next());
}
conn.commit();
This is a limitation of the H2 implementation; it is a known issue.
For now, use inserts/updates without batching, or query the generated keys afterwards through a select.
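A sketch of the non-batched workaround, executing each insert individually so every key can be read back (same statement setup as in the question):
List<Integer> generatedKeys = new ArrayList<Integer>();
PreparedStatement ps_insert = conn.prepareStatement(insertQuery, PreparedStatement.RETURN_GENERATED_KEYS);
for (int i = 0; i < adding_dates.length; i++) {
    ps_insert.setInt(1, Integer.parseInt(consultant_id));
    ps_insert.setDate(2, adding_dates[i]);
    ps_insert.setInt(3, Integer.parseInt(room_id));
    ps_insert.executeUpdate(); // one statement at a time instead of addBatch()
    ResultSet keys = ps_insert.getGeneratedKeys();
    if (keys.next()) {
        generatedKeys.add(keys.getInt(1)); // each call now sees its own key
    }
    keys.close();
}
conn.commit();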
If you are sharing a session/connection between two threads, and those threads try to execute statements at the same time, then you might see this kind of problem.
You probably need to either (a) use a connection pool or (b) synchronise your entire access to the DB.
For instance, for option (b), you can make the method synchronized so that access is thread-safe.
Just a thought, as I don't know your complete execution context.
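A minimal sketch of option (b), with class, table and method names illustrative:
public class BookingDao {
    private final Connection conn; // one shared connection, guarded below

    public BookingDao(Connection conn) {
        this.conn = conn;
    }

    // synchronized: only one thread at a time may use the shared connection
    public synchronized int insertBooking(int consultantId, java.sql.Date date, int roomId)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "insert into booking(consultant_id, booking_date, room_id) values (?, ?, ?)",
                PreparedStatement.RETURN_GENERATED_KEYS)) {
            ps.setInt(1, consultantId);
            ps.setDate(2, date);
            ps.setInt(3, roomId);
            ps.executeUpdate();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                return keys.next() ? keys.getInt(1) : -1;
            }
        }
    }
}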

(OutOfMemoryError: Java heap space) when iterating through Oracle records

Hello, fellow Java developers.
I'm having a bit of an issue here. I have code that gets a ResultSet from an Oracle database, prints each row to a file, then gets the next row, and continues until the end of the ResultSet.
Only this isn't what happens. What happens is that it gets the ResultSet and starts iterating through the rows, printing to the file as it goes, until it runs out of memory, claiming it needs more space on the Java heap.
The app is currently running with 2 GB of memory on the heap, and the code breaks at about the 150,000th row.
I'm using ojdbc6.jar and Java 6.
Here is an idea of what my code is doing:
Connection conn = DriverManager.getConnection(url, "name", "pwd");
conn.setAutoCommit(false);
Statement stmt = conn.createStatement();
ResultSet rset = stmt.executeQuery(strSql);
BufferedWriter out = new BufferedWriter(new FileWriter("output.txt")); // file name illustrative
String strVar_1 = null;
long lCount = 0;
while (rset.next()) {
    lCount++;
    if (lCount % 100000 == 0) {
        System.out.println(lCount + " rows completed");
    }
    strVar_1 = rset.getString("StringID"); // breaks here!!!!!!!!!
    if (strVar_1 == null) {
        strVar_1 = "";
    }
    if (!strVar_1.equals("")) { // was strQuery_1; strVar_1 is clearly intended
        out.write(strVar_1 + "\n");
    }
}
out.close();
Try this:
Statement stmt = conn.createStatement();
stmt.setFetchSize(someInt);
ResultSet rset = stmt.executeQuery(strSql);
This will control how many records are fetched at a time.
Well, maybe another way of dealing with such large data is to keep writing each row to the file as you retrieve it. This way the string buffer does not keep growing, and you should be able to write all the records to the file system and then read them later.
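A sketch of that pattern - a moderate fetch size plus buffered, periodically flushed writes, so neither the driver nor the writer holds the whole result (file name and sizes illustrative):
Statement stmt = conn.createStatement();
stmt.setFetchSize(5000); // let the driver pull rows in moderate chunks
ResultSet rset = stmt.executeQuery(strSql);
BufferedWriter out = new BufferedWriter(new FileWriter("records.txt"));
long count = 0;
while (rset.next()) {
    String value = rset.getString("StringID");
    out.write(value == null ? "" : value);
    out.newLine();
    if (++count % 100000 == 0) {
        out.flush(); // bound the writer's buffer and report progress
        System.out.println(count + " rows written");
    }
}
out.close();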
