Insert, Select and Update Query Slows Down the Entire Server - Java

I have an application that handles more than 10,000,000 rows of data; the MainTable alone contains more than 10,000,000 rows.
I am inserting data into a SubTable from the MainTable like this:
INSERT INTO SubTable(Value1,Value2)
SELECT Value1,Value2 FROM MainTable
GROUP BY Value1_ID;
After performing some processing on the SubTable, I update the new values back into the MainTable:
UPDATE MainTable inf, SubTable sub
SET inf.Value1 = sub.Value1, inf.Value2 = sub.Value2
WHERE inf.Value1_ID = sub.Value1_ID;
While this query runs, the entire server becomes very slow and blocks all other transactions. I am using a JDBC DriverManager connection here. How can I avoid this? How can I solve this problem?

If it's something that you only have to do once in a while, then instead of updating the whole table in a single statement you can set up a small script that updates a batch of rows every few seconds or minutes. The other processes will have their queries executed freely between two batches.
For example, by updating a batch of 100,000 rows every minute, and provided your tables have the right indexes, the whole job would take one to two hours, but with a far smaller impact on performance.
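A minimal sketch of that batching idea, assuming a numeric Value1_ID you can range over; con, maxId, the batch size, and the one-minute pause are all placeholders to adapt (the enclosing method is assumed to declare throws for the checked exceptions):

String sql = "UPDATE MainTable inf JOIN SubTable sub ON inf.Value1_ID = sub.Value1_ID"
        + " SET inf.Value1 = sub.Value1, inf.Value2 = sub.Value2"
        + " WHERE inf.Value1_ID BETWEEN ? AND ?";
try (PreparedStatement ps = con.prepareStatement(sql)) {
    long batchSize = 100_000L;
    for (long start = 1; start <= maxId; start += batchSize) {
        ps.setLong(1, start);                    // lower bound of this batch
        ps.setLong(2, start + batchSize - 1);    // upper bound of this batch
        ps.executeUpdate();                      // with autocommit on, each batch commits on its own
        Thread.sleep(60_000);                    // leave a window for the other transactions
    }
}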
The other solution would be to run the update when server activity is at its lowest (maybe during the weekend?); that way you won't impact the other processes as much.

Related

Updating row while iterating through ResultSet takes a lot of time

I am trying to improve a data transfer program that I wrote. I am looking for suggestions on how to make it quicker.
My program extracts data from a database (usually Oracle 11g) by filling a ResultSet and writing the result into a file. The program periodically looks into the tables and checks whether a special column has changed. Such a query could look like this:
select columnA, columnB from scheme.table where changeColumn = '1'
Now comes the critical part. After extracting the data I need to set this changeColumn back to '0'. Since I have just used the ResultSet for exporting the data into a file, I have to rewind it, so the code looks like this:
extractedData.beforeFirst();
while (extractedData.next()) {
    extractedData.updateString("changeColumn", "0");
    extractedData.updateRow();
}
Now if this ResultSet is bigger (let's say more than 100,000 entries), this loop can take hours. Does anyone have suggestions on how to improve its performance?
I have heard of setting the fetch size to a bigger value, but usually the ResultSet contains less than a dozen entries. Is there a way to set the fetch size dynamically?
Use a JDBC batch update. For each row that needs updating, take its primary key, add it to a batch update (SQL query), and execute the batch.
A good example from Mkyong shows how to do a JDBC batch update with a PreparedStatement.
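A hedged sketch of that approach against the question's table, assuming the export query also selects a primary key column named id (adjust the table and column names to your schema):

String sql = "UPDATE scheme.table SET changeColumn = '0' WHERE id = ?";
try (PreparedStatement ps = con.prepareStatement(sql)) {
    extractedData.beforeFirst();
    while (extractedData.next()) {
        ps.setLong(1, extractedData.getLong("id"));
        ps.addBatch();   // queue the update instead of one round trip per row
    }
    ps.executeBatch();   // send every queued update in one go
    con.commit();
}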

Is Select Before Update a good approach or vice versa?

I am developing an application using a plain JDBC connection. The application is built with Java/Java EE Spring MVC 3.0, with SQL Server 2008 as the database. I am required to update a table based on a non-primary-key column.
Now, before updating the table we had to decide on an approach, as the table may contain a huge amount of data. The update query will be executed in a batch, and we are required to design the application so that it doesn't hog system resources.
We had to decide between one of two approaches:
1. SELECT DATA BEFORE YOU UPDATE or
2. UPDATE DATA AND THEN SELECT MISSING DATA.
Select-before-update is only beneficial if the chances of failure are high, i.e. if out of a batch of 100 update queries only 20 rows would be updated successfully, then this approach should be taken.
Update-and-then-check-missing-data is beneficial only when far fewer records fail. With this approach one database select call can be avoided: after a batch update, take the count of records updated and execute the select query only if that count does not match the number of queries issued.
We are totally unaware of what the production environment will look like, but we want to account for all possibilities and want a fast system. I need your input on which is the better approach.
Since there is a 50:50 chance of successful updates versus faster selects, it's hard to tell from the scenario described. You would probably want a feedback-driven approach: keep measuring how many updates succeed over time, and decide on the basis of that data whether to update before selecting or select before updating.
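A minimal sketch of that feedback idea, assuming the updates go through a JDBC PreparedStatement batch held in ps (the 0.5 threshold is purely illustrative):

int[] counts = ps.executeBatch();  // one update count per queued statement
int updatedRows = 0;
for (int c : counts) {
    if (c > 0) updatedRows += c;   // drivers may also return Statement.SUCCESS_NO_INFO
}
double successRate = (double) updatedRows / counts.length;
// Track successRate over time: if it stays low, select the matching rows
// first and update only those; otherwise keep updating blindly and select
// the missing rows only when the counts mismatch.
boolean selectFirstNextTime = successRate < 0.5;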

Possibility of inserting many rows into a table with a single database hit

I came across a scenario where I have to insert some 100 rows into a table from the Java application that I support.
I do not want my application to hit the DB with a separate insert query for each row.
Can you suggest a way to insert all 100 rows into the table with a single DB hit?
You can use JDBC batch processing for this purpose.
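A hedged sketch using a JDBC batch, assuming a hypothetical table my_table(col1, col2), a Connection con, and the 100 records held in a String[][] rows; with MySQL you can additionally set rewriteBatchedStatements=true on the JDBC URL so the driver collapses the batch into a single multi-row INSERT:

String sql = "INSERT INTO my_table (col1, col2) VALUES (?, ?)";
try (PreparedStatement ps = con.prepareStatement(sql)) {
    for (String[] row : rows) {  // rows holds the 100 records to insert
        ps.setString(1, row[0]);
        ps.setString(2, row[1]);
        ps.addBatch();
    }
    ps.executeBatch();           // one batch execution instead of 100 separate statements
}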

Database Loading batch job Java

I have a batch job written in Java which truncates and then reloads a certain table in an Oracle database every few minutes. Reports are generated on web pages based on the data in that table. I am wondering about a good way to keep the data load from affecting the report queries, so that users don't end up with partial or no data.
If you process all your SQL statements inside a single transaction, there will always be a valid state visible from the outside. Beware that TRUNCATE does not work inside transactions (it is DDL and commits implicitly), so you have to use DELETE. While this guarantees that your table always holds reasonable data, it needs a bigger rollback segment and will be considerably slower.
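A minimal sketch of the transactional variant (con is an assumed JDBC Connection, and the table name is a placeholder):

con.setAutoCommit(false);
try (Statement stmt = con.createStatement()) {
    stmt.executeUpdate("DELETE FROM report_table");  // DELETE, not TRUNCATE, so it can roll back
    // ... insert the fresh rows here, e.g. with a batched PreparedStatement ...
    con.commit();    // readers see either the old data or the new data, never an empty table
} catch (SQLException e) {
    con.rollback();  // on failure the old data stays in place
    throw e;
}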
You could have two tables plus a meta table that tracks which of the two is currently the main table used for querying. Your batch job truncates and loads the standby table, and you switch the main table once the loading is completed. That way the query application always sees recent data, and you can load into the other table in the meantime.
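In Oracle, one way to implement the switch is to point the reports at a synonym and repoint it after each load; a sketch with made-up names (report_data is the synonym, report_data_a and report_data_b are the real tables):

try (Statement stmt = con.createStatement()) {
    stmt.executeUpdate("TRUNCATE TABLE report_data_b");
    // ... load report_data_b here ...
    stmt.executeUpdate("CREATE OR REPLACE SYNONYM report_data FOR report_data_b");
    // the next run loads report_data_a and repoints the synonym back
}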
What I would do is set a flag in a DB table to indicate that the update is in progress, and have the reports look for that flag, display an appropriate message, and wait for the update to finish. Once the update is complete, clear the flag.
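A rough sketch of the flag idea, with a hypothetical one-row load_status table:

try (Statement stmt = con.createStatement()) {
    stmt.executeUpdate("UPDATE load_status SET loading = 1");
    // ... truncate and reload the report table ...
    stmt.executeUpdate("UPDATE load_status SET loading = 0");
}
// The reports first run: SELECT loading FROM load_status
// and show a "data is being refreshed" message while it is 1.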

Fastest way to iterate through large table using JDBC

I'm trying to create a Java program to clean up and merge rows in my table. The table is large, about 500k rows, and my current solution is running very slowly. The first thing I want to do is simply get an in-memory array of objects representing all the rows of my table. Here is what I'm doing:
pick an increment of say 1000 rows at a time
use JDBC to fetch a ResultSet for the following SQL query
SELECT * FROM TABLE WHERE ID > 0 AND ID < 1000
add the resulting data to an in-memory array
continue querying all the way up to 500,000 in increments of 1000, each time adding results.
This is taking way too long. In fact, it's not even getting past the second increment, from 1000 to 2000. The query takes forever to finish (although when I run the same thing directly through a MySQL browser it's decently fast). It's been a while since I've used JDBC directly. Is there a faster alternative?
First of all, are you sure you need the whole table in memory? Maybe you should consider (if possible) selecting only the rows that you want to update/merge. If you really must have the whole table, you could consider using a scrollable ResultSet. You can create it like this:
// make sure autocommit is off (required by PostgreSQL for cursor-based fetching)
con.setAutoCommit(false);
Statement stmt = con.createStatement(
        ResultSet.TYPE_SCROLL_INSENSITIVE, // or ResultSet.TYPE_FORWARD_ONLY
        ResultSet.CONCUR_READ_ONLY);
ResultSet srs = stmt.executeQuery("select * from ...");
It enables you to move to any row you want by using 'absolute' and 'relative' methods.
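For example, continuing the snippet above:

srs.absolute(1000); // jump straight to row 1000
srs.relative(-10);  // then step back 10 rows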
One thing that helped me was Statement.setFetchSize(Integer.MIN_VALUE). I got the idea from Jason's blog. It cut execution time down by more than half, and memory consumption went down dramatically, since only one row is read at a time.
This trick doesn't work for PreparedStatement, though.
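For context, a sketch of how the streaming variant looks with MySQL Connector/J (Integer.MIN_VALUE as a fetch size is a MySQL-specific signal, not portable JDBC):

Statement stmt = con.createStatement(
        ResultSet.TYPE_FORWARD_ONLY,
        ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE); // tells Connector/J to stream rows one at a time
ResultSet rs = stmt.executeQuery("SELECT * FROM TABLE");
while (rs.next()) {
    // process a single row; the driver never buffers the full result set
}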
Although it's probably not optimal, your solution seems like it ought to be fine for a one-off database cleanup routine. It shouldn't take that long to run a query like that and get the results (I'm assuming that, since it's a one-off, a couple of seconds would be fine). Possible problems:
Is your network (or at least your connection to MySQL) very slow? If so, you could try running the process locally on the MySQL box, or somewhere better connected.
Is there something in the table structure that's causing it? Pulling down 10k of data for every row? 200 fields? Calculating the ID values from a non-indexed column? You could try finding a more DB-friendly way of pulling the data (e.g. selecting just the columns you need, having the DB aggregate values, etc.).
If you're not getting through the second increment, something is really wrong: efficient or not, you shouldn't have any problem dumping 2,000 or 20,000 rows into memory on a running JVM. Maybe you're storing the data redundantly or extremely inefficiently?
