Best way to optimize DB inserts using Java - java

The program is written in Java. 25 threads process 1 million tasks. Each task saves data to the DB, so 1 million DB inserts happen. To optimize this, we tried the following approach:
Tasks save data into a ConcurrentLinkedDeque.
A thread polls the deque at a periodic interval and takes all the objects available at that point in time.
Once the count of available objects reaches a threshold (say 100K), a thread is created to save them.
But this approach does not improve overall performance.
I would like to reduce the number of DB inserts (currently 1 million) to improve performance. Is there an alternative solution, such as a high-performance implementation with multiple concurrent publishers and a single concurrent subscriber?

Reduce the overhead of row-by-row processing by batching commands. Many APIs include ways to batch commands, or you can combine them yourself with a statement like this:
INSERT INTO products (product_no, name, price)
SELECT 1, 'Cheese', 9.99 FROM dual UNION ALL
SELECT 2, 'Bread' , 1.99 FROM dual UNION ALL
SELECT 3, 'Milk' , 2.99 FROM dual;
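On the Java side, the API-level batching mentioned above is usually done with JDBC's PreparedStatement.addBatch() and executeBatch(). The following is only a minimal sketch under stated assumptions: the Product class, the ProductBatchWriter name and the batch size are hypothetical, and the table layout is taken from the SQL example above.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical row type matching the products table in the SQL example. */
final class Product {
    final int productNo;
    final String name;
    final BigDecimal price;

    Product(int productNo, String name, BigDecimal price) {
        this.productNo = productNo;
        this.name = name;
        this.price = price;
    }
}

final class ProductBatchWriter {
    /** Splits rows into consecutive chunks of at most batchSize elements. */
    static <T> List<List<T>> chunk(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += batchSize) {
            chunks.add(rows.subList(i, Math.min(i + batchSize, rows.size())));
        }
        return chunks;
    }

    /** Sends each chunk with one executeBatch() call and commits once at the end. */
    static void insertAll(Connection conn, List<Product> rows, int batchSize) throws SQLException {
        String sql = "INSERT INTO products (product_no, name, price) VALUES (?, ?, ?)";
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // avoid one commit per row
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (List<Product> batch : chunk(rows, batchSize)) {
                for (Product p : batch) {
                    ps.setInt(1, p.productNo);
                    ps.setString(2, p.name);
                    ps.setBigDecimal(3, p.price);
                    ps.addBatch();
                }
                ps.executeBatch();
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

With 1 million rows and a batch size of, say, 1000, this issues 1000 batch executions instead of 1 million single-row inserts. How much that helps depends on the driver and the database, so it is worth measuring.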

Related

Multithread program to update different parts of table with different cursors

I have a table of one million records. On one column of each record I have to do a math operation in Java that takes a fair amount of time. To improve the total time, I created threads. If I create 10 threads, each thread queries 100,000 records using a SQL OFFSET and FETCH FIRST, so together the threads cover the whole one million.
It does improve the processing. I have 12 free processors. But I noticed that 4 threads are much better than one thread (half the total time), while with 8 threads the time is almost the same as with 4 threads.
On each thread, I am using something like
Connection conn = DriverManager.getConnection(...);
Statement stmt = conn.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_UPDATABLE);
ResultSet rs = stmt.executeQuery("select COL from DATA OFFSET X FETCH FIRST Y");
while (rs.next()) {
    // do math stuff on col
    rs.updateString("COL", new_math_stuff);
    rs.updateRow();
}
I was wondering whether having different cursors on different connections, each on a different part of the table, can still cause too much IO on the table, so that adding more threads does not help, or whether the problem might be somewhere else. (I have very limited profiling capabilities because I am working on a remote computer, and to compile the code I have to send it to someone who approves and builds it.)

How to process large log file in java

I have 4 files of 200 MB each. I created 4 threads that run in parallel; each thread processes one file and adds the results to an ArrayBlockingQueue.
Another thread takes items from the ArrayBlockingQueue, processes them and adds them to a batch. The batch size is 5000; each batch is then executed to insert the records into the database. But it still takes around 6 minutes to complete all 4 files.
How can I increase performance in this case?
1) Make sure you have enough memory for queue+processor buffers+db buffers.
2) A batch size of 5k is a bit more than needed; in general you get up to speed at around 100, not that it makes much difference here though.
3) You can push data into Oracle from multiple threads. If you fetch the sequence values for populating the ID fields ahead of time, you will be able to insert into one table in parallel, provided it does not have many indexes. Otherwise consider disabling and rebuilding the indexes, or inserting into a temporary table and then moving everything into the main one.
4) Take a look at the Oracle settings with your DB admin. Things like extent size/increase can change performance.
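The queue-and-batch arrangement discussed here can be sketched as a single consumer that groups queued items into batches. This is only an illustration under stated assumptions: QueueBatcher, the end-marker object and the 100 ms idle timeout are hypothetical, and the sink stands in for whatever performs the actual batched database insert.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Single consumer that groups items from a shared queue into batches. */
final class QueueBatcher<T> implements Runnable {
    private final BlockingQueue<T> queue;
    private final int batchSize;
    private final Consumer<List<T>> sink; // e.g. a method doing one JDBC executeBatch()
    private final T endMarker;            // compared by identity; enqueue it once all producers are done

    QueueBatcher(BlockingQueue<T> queue, int batchSize, Consumer<List<T>> sink, T endMarker) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.queue = queue;
        this.batchSize = batchSize;
        this.sink = sink;
        this.endMarker = endMarker;
    }

    @Override
    public void run() {
        List<T> batch = new ArrayList<>(batchSize);
        try {
            while (true) {
                T item = queue.poll(100, TimeUnit.MILLISECONDS);
                if (item == null) {
                    flush(batch);           // queue is idle: do not hold rows back
                } else if (item == endMarker) {
                    flush(batch);           // write the final partial batch and stop
                    return;
                } else {
                    batch.add(item);
                    if (batch.size() >= batchSize) {
                        flush(batch);
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }

    /** Hands a copy of the current batch to the sink and empties the buffer. */
    private void flush(List<T> batch) {
        if (batch.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(batch));
        batch.clear();
    }
}
```

The producer threads call queue.put(row) on a bounded queue such as an ArrayBlockingQueue, which also gives back-pressure when the database is slower than the parsers. Whoever coordinates the producers enqueues the end marker after the last producer has finished.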

Reading from Database through multiple threads in java

I am reading data from a Vertica database using multiple threads in Java.
I have around 20 million records, and I open 5 different threads that run select queries like this:
start = threadnum;
while (start * 20000 <= totalRecords) {
    select * from tableName order by colname limit 20000 offset start*20000.
    start += 5;
}
The above query assigns 20K distinct records to each thread to read from the DB.
For example, the first thread reads the first 20K records, then the 20K records starting from position 100,000, and so on.
But I am not getting any performance improvement. In fact, if a single thread takes x seconds to read 20 million records, then each of the threads also takes almost x seconds to read from the database.
Shouldn't there be some improvement over x seconds (I was expecting x/5 seconds)?
Can anybody pinpoint what is going wrong?
Your database presumably lies on a single disk; that disk is connected to a motherboard using a single data cable; if the database server is on a network, then it is connected to that network using a single network cable; so, there is just one path that all that data has to pass through before it can arrive at your different threads and be processed.
The result is, of course, bad performance.
The lesson to take home is this:
Massive I/O from the same device can never be improved by multithreading.
To put it in different terms: parallelism never increases performance when the bottleneck is the transferring of the data, and all the data come from a single sequential source.
If you had 5 different databases stored on 5 different disks, that would work better.
If transferring the data was only taking half of the total time, and the other half of the time was consumed in doing computations with the data, then you would be able to halve the total time by desynchronizing the transferring from the processing, but that would require only 2 threads. (And halving the total time would be the best that you could achieve: more threads would not increase performance.)
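The arithmetic behind that claim can be written as a rough model. The symbols below are introduced here for illustration only, and the model assumes that transfer and computation overlap perfectly once the data is handled in small chunks:

```latex
T_{\text{sequential}} = T_{\text{io}} + T_{\text{cpu}},
\qquad
T_{\text{overlapped}} \approx \max\left(T_{\text{io}},\, T_{\text{cpu}}\right)

% With T_io = T_cpu = T/2:
T_{\text{sequential}} = \tfrac{T}{2} + \tfrac{T}{2} = T,
\qquad
T_{\text{overlapped}} \approx \max\left(\tfrac{T}{2},\, \tfrac{T}{2}\right) = \tfrac{T}{2}
```

Adding more threads can at best shrink the computation term; the total stays bounded below by the transfer term as long as the transfer is a single sequential stream.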
As for why reading 20 thousand records appears to perform almost as badly as reading 20 million records, I am not sure why this is happening, but it could be due to a silly implementation of the database system that you are using.
What may be happening is that your database system is implementing the offset and limit clauses on the database driver, meaning that it implements them on the client instead of on the server. If this is in fact what is happening, then all 20 million records are being sent from the server to the client each time, and then the offset and limit clauses on the client throw most of them away and only give you the 20 thousand that you asked for.
You might think that you should be able to trick the system to work correctly by turning the query into a subquery nested inside another query, but my experience when I tried this a long time ago with some database system that I do not remember anymore is that it would result in an error saying that offset and limit cannot appear in a subquery, they must always appear in a top-level query. (Precisely because the database driver needed to be able to do its incredibly counter-productive filtering on the client.)
Another approach would be to assign an incrementing unique integer id to each row which has no gaps in the numbering, so that you can select ... where unique_id >= start and unique_id < (start + 20000) which will definitely be executed on the server rather than on the client. (The upper bound is exclusive so that each range holds exactly 20000 ids and adjacent ranges do not overlap.)
However, as I wrote above, this will probably not allow you to achieve any increase in performance by parallelizing things, because you will still have to wait for a total of 20 million rows to be transmitted from the server to the client, and it does not matter whether this is done in one go or in 1000 gos of 20 thousand rows each. You cannot have two streams of rows simultaneously flying down a single wire.
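For the id-range approach mentioned above, the ranges can be computed up front so that they neither overlap nor leave gaps. A minimal sketch, assuming a gap-free unique_id column as described; the IdRangePartitioner class is hypothetical, and the query text only mirrors the names used in the question:

```java
import java.util.ArrayList;
import java.util.List;

final class IdRangePartitioner {
    /** Hypothetical range query; both bounds are bound as parameters. */
    static final String RANGE_QUERY =
            "select * from tableName where unique_id >= ? and unique_id < ?";

    /**
     * Returns half-open ranges {fromInclusive, toExclusive} that together
     * cover every id from minId to maxId exactly once.
     */
    static List<long[]> split(long minId, long maxId, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long from = minId; from <= maxId; from += chunkSize) {
            long to = Math.min(from + chunkSize, maxId + 1);
            ranges.add(new long[] {from, to});
        }
        return ranges;
    }
}
```

Each range would then be bound to RANGE_QUERY with PreparedStatement.setLong(1, range[0]) and setLong(2, range[1]). As the answer points out, this makes the filtering happen on the server but does not by itself remove a transfer bottleneck.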
I will not repeat what Mike Nakis says, as it is true and well explained:
I/O from a physical disk cannot be improved by multithreading
Nevertheless I would like to add something.
When you execute a query like this:
select * from tableName order by colname limit 20000 offset start*20000.
on the client side you can handle the result of the query, and that handling is something you could speed up by using multiple threads.
But on the database side you have no control over how the query is processed, and the Vertica database is probably designed to execute your query by running parallel tasks according to what the machine allows.
So whether you split the execution of your query on the client side into one, two or three parallel threads should not change much in the end, as a professional database is designed to optimize response time according to the number of requests it receives and the capabilities of the machine.
No, you shouldn't get x/5 seconds. You are not thinking about the fact that you are getting 5 times the number of records in the same amount of time. It's about throughput, not about time.
In my opinion, the following is a good solution. It has worked for us to stream and process millions of records without much of a memory and processing overhead.
PreparedStatement pstmt = conn.prepareStatement(sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
pstmt.setFetchSize(Integer.MIN_VALUE);
ResultSet rs = pstmt.executeQuery();
while (rs.next()) {
    // Do the thing
}
Using OFFSET x LIMIT 20000 will result in the same query being executed again and again. For 20 million records and 20K records per execution, the query will get executed 1000 times.
OFFSET 0 LIMIT 20000 will perform well, but OFFSET 19980000 LIMIT 20000 will by itself take a lot of time, because the query is executed fully and then, starting from the top, has to skip 19980000 records to return the last 20000.
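By contrast, using the ResultSet.TYPE_FORWARD_ONLY and ResultSet.CONCUR_READ_ONLY options and setting the fetch size to Integer.MIN_VALUE results in the query being executed only ONCE; the records are then streamed to the client and can be processed in a single thread. Note that Integer.MIN_VALUE as a fetch size is a MySQL Connector/J convention; other JDBC drivers generally expect a non-negative fetch size.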
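To put a number on the OFFSET cost described above, here is a small back-of-the-envelope sketch. It assumes, as the answer does, that the server reads and discards every row covered by the OFFSET on each execution; the OffsetPagingCost class is hypothetical.

```java
final class OffsetPagingCost {
    /** Number of LIMIT/OFFSET queries needed to cover totalRows with pages of pageSize rows. */
    static long queryCount(long totalRows, long pageSize) {
        return (totalRows + pageSize - 1) / pageSize;
    }

    /** Total rows skipped over all pages, assuming each query discards its OFFSET rows. */
    static long totalRowsSkipped(long totalRows, long pageSize) {
        long skipped = 0;
        for (long offset = 0; offset < totalRows; offset += pageSize) {
            skipped += offset;
        }
        return skipped;
    }

    public static void main(String[] args) {
        System.out.println(queryCount(20_000_000L, 20_000L));       // prints 1000
        System.out.println(totalRowsSkipped(20_000_000L, 20_000L)); // prints 9990000000
    }
}
```

Under that assumption, paging through 20 million rows makes the server step over roughly 10 billion rows in total, compared with a single pass of 20 million rows when the result is streamed once.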

Java - DB2 Performance Improvements

We have a SELECT statement which will take approx. 3 secs to execute. We are calling this DB2 query inside a nested While loop.
Ex:
while (hashmap1.hasNext()) {
    while (hashmap2.hasNext()) {
        // SQL query
    }
}
The problem is that the outer while loop executes approx. 1200 times and the inner while loop executes 200 times, which means the SQL is called 1200*200 = 240,000 times. Each iteration of the outer while loop takes approx. 150 secs, so 1200 * 150 secs = 50 hrs.
We can afford only around 12-15 hrs before we kick off the next process.
Is there any way to do this process more quickly? Is there any new technology that can help us fetch these records faster from DB2?
Any help would be highly appreciated.
Note: We have already looked into all possible ways to cut down the number of iterations.
Sounds to me like you're trying to use the middle tier for something that the database itself is better suited for. It's a classic "N+1" query problem.
I'd rewrite this logic to execute entirely on the database as a properly indexed JOIN. That'll not only cut down on all that network back and forth, but it'll bring the database optimizer to bear and save you the expense of bringing all that data to the middle tier for processing.
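If a full rewrite as a JOIN is not immediately possible, a smaller step in the same direction is to collapse the inner loop into one query per outer iteration by passing all inner keys in an IN list. This is a different and weaker technique than the JOIN suggested above, shown only as a sketch: the table name some_table, its columns and the InListQuery class are hypothetical, since the original schema is not given.

```java
import java.util.Collections;

final class InListQuery {
    /** Builds "?, ?, ..., ?" with the given number of parameter markers. */
    static String placeholders(int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        return String.join(", ", Collections.nCopies(count, "?"));
    }

    /** One query per outer key that fetches the rows for all inner keys at once. */
    static String build(int innerKeyCount) {
        return "SELECT outer_key, inner_key, payload FROM some_table "
                + "WHERE outer_key = ? AND inner_key IN (" + placeholders(innerKeyCount) + ")";
    }
}
```

With 200 inner keys per outer iteration, this would reduce the 240,000 round trips to 1,200. Whether each combined query stays fast depends on indexing, which is why the database-side JOIN remains the more thorough fix.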

Optimizing the data access layer

I have a web service (JAX-RS/Spring) that generates SQL queries which run against a temp table in Oracle. The data is then archived to another table (through 3 MERGE statements). The execution of each "job" (querying and merging) is done in the background through a JMS broker (ActiveMQ). The sequence of operations of each job is something like:
insert/update into table Q (select from table F) -- done between 4 and 20 times
merge into table P (select from table Q) -- twice
merge into table P (select from table F)
merge into table P (select from table F)
create a view V as select from table P
(table Q is a temp table).
When I dispatch two or three jobs like that, it takes around 6-7 minutes for each job to execute. But when I dispatch up to 15 running at the same time, the duration stretches out way longer.
Is this happening because all these processes are trying to insert/update into the temp table Q, and are thus fighting for the resource? What techniques should I be looking at to optimize this? For example, I thought of making 5 or 6 duplicates of table Q and "load balancing" the data access object queries against them.
Thanks
"When I dispatch two or three jobs like that, it takes around 6-7 minutes for each job to execute. But when I dispatch up to 15 running at the same time, the duration stretches out way longer."
There's any number of resources your processes could be contending for, not just the temporary table.
For starters, how many processors (CPUs/cores) does your database have? There's a pretty good rule of thumb that we shouldn't run more than 1.8 background jobs per processor. So there's no point in worrying about cloning your temporary table if you don't have enough processors to support a high degree of parallelism.
The key thing with tuning is: don't guess. Unlike some other DBMS products, Oracle has lots of instrumentation we can use to find out exactly where the time goes. It's called the Wait Interface. It's not perfect but it's a lot better than blindly re-designing your database schemas.
If Q is really a temp table (as in a GLOBAL TEMPORARY TABLE) then each session will have a separate physical object, so they won't contend for locks or at the data level.
You are more likely to get contention on the permanent table P, or on server resources (memory and disk).
