I want to get all table definitions for all my tables.
And I want to do it fast (it is part of a script that I'm running a lot).
I am using Oracle 11g, and I have 700 tables. With plain JDBC code it takes 4 minutes and does:
PreparedStatement s = con.prepareStatement(
    "select dbms_metadata.get_ddl(object_type, object_name) " +
    "from user_objects where object_type = 'TABLE'");
s.execute();
ResultSet rs = s.getResultSet();
while (rs.next()) {
    rs.getString(1);
}
So I want to optimize this code and get it down to around 20 seconds.
I have already reached 40-50 seconds by creating 14 threads that each open a connection to the database and read a part of the information, using mod on the ROWNUM.
But this is not enough.
I am thinking in these directions:
http://docs.oracle.com/cd/B10501_01/java.920/a96654/connpoca.htm#1063660 - connection caching. Can it help speed things up if I replace my 14 connections with connection caching?
Is it possible to keep the tables accessed by this function in the KEEP buffer cache area?
Any way of indexing some of the information here?
Any other suggestions will be greatly appreciated.
Thank you
Is it required to always get the DDL even if the tables haven't been changed? Otherwise only get the DDL of those tables where ALL_OBJECTS.LAST_DDL_TIME has changed since you last retrieved it.
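A minimal sketch of that incremental approach, assuming the script persists a lastRun timestamp between runs (the variable name is hypothetical) and con is the same open connection as in the question:

String sql =
    "select object_name, dbms_metadata.get_ddl('TABLE', object_name) " +
    "from user_objects " +
    "where object_type = 'TABLE' and last_ddl_time > ?";
try (PreparedStatement ps = con.prepareStatement(sql)) {
    ps.setTimestamp(1, lastRun);                 // timestamp saved from the previous run
    try (ResultSet rs = ps.executeQuery()) {
        while (rs.next()) {
            String table = rs.getString(1);
            String ddl = rs.getString(2);        // refresh only this table's cached DDL
        }
    }
}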
Another option would be to write your own GET_DDL in a way that is able to get more than one table at once.
I'm afraid there is no easy way to make it faster. The whole GET_DDL thing is implemented in Java and uses an XSLT transformation as part of the generation process.
Maybe you will find this faster:
http://metacpan.org/pod/DDL::Oracle
I would firstly go for HAL's suggestion of only capturing changes, but I'd also look at eliminating any options that I do not need -- STORAGE clauses, for example?
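A sketch of how such options can be switched off for the session from JDBC before calling GET_DDL (which transform parameters you disable depends on what you want to drop; these two cover the storage details):

try (CallableStatement cs = con.prepareCall(
        "begin " +
        "  dbms_metadata.set_transform_param(dbms_metadata.session_transform, 'STORAGE', false); " +
        "  dbms_metadata.set_transform_param(dbms_metadata.session_transform, 'SEGMENT_ATTRIBUTES', false); " +
        "end;")) {
    cs.execute();   // applies to every GET_DDL call on this session from now on
}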
I was recently in an interview and was asked the following question.
We have a table employee(id, name). In our Java code, we are writing logic to fetch data from this table and display it in the UI. The query is
Select id,name from employee
The question was: during debugging, we found that this JDBC call to fire the query and get the output takes, say, 20 seconds, and we want to reduce that to, say, 5 seconds or to the optimal time. How can we do that, or how would you tackle this problem?
As there is no WHERE clause in the query, I didn't suggest indexing the column.
As this logic takes 20 seconds every time, some other code holding a lock on this table is also out of the question.
I suggested that limiting the number of records fetched from the table should help, but the interviewer didn't look convinced.
Is there anything else we can do as developers to optimize the call? I guess a DBA might tune database settings to improve the performance of this query, but is there any other way?
OK, so this is an interview question, so both the problem and the solutions are hypothetical. The interviewer is asking for possible optimizations and / or approaches. Here are some that are most likely to help:
Modify the query to page the data rather than fetching the whole lot. This looks applicable for the example query. Note that this is not just "limiting the number of rows selected from the table" ... which is probably why the interviewer looked doubtful when you said that! (A sketch follows this list.)
If you do need to display the entire selected record set but in a reduced form (e.g. summed, averaged, sorted, collated etc), do the reduction in the query rather than by fetching the records and doing it in the client.
Tune the fetchSize() as suggested by Ivan.
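A rough sketch of the paging idea, assuming con is an open java.sql.Connection; the LIMIT/OFFSET syntax is MySQL/PostgreSQL-style (Oracle and SQL Server use different constructs), and render() is just a placeholder for whatever displays the rows:

int pageSize = 500;                               // rows per page, tune as needed
int pageNumber = 0;                               // supplied by the UI in practice
String sql = "SELECT id, name FROM employee ORDER BY id LIMIT ? OFFSET ?";
try (PreparedStatement ps = con.prepareStatement(sql)) {
    ps.setInt(1, pageSize);
    ps.setInt(2, pageNumber * pageSize);
    try (ResultSet rs = ps.executeQuery()) {
        while (rs.next()) {
            render(rs.getInt("id"), rs.getString("name"));   // hypothetical UI call
        }
    }
}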
Here are some other ideas that are less likely to help and / or will require extensive reworking.
Look at the network configuration. For example, you may be able to get better throughput by tuning TCP buffers at the OS level, or by optimizing physical or virtual network paths.
Run the query on the database server itself (to eliminate network overheads)
Use an in-memory table
Query a secondary database server; e.g. a readonly snapshot or a slave
You can try increasing fetchSize() on the Statement/PreparedStatement to decrease the number of network round trips between the application server/desktop and the database server.
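A minimal sketch of that, assuming con is an open Connection; the value 500 is just a starting point to tune empirically (many drivers default to a small fetch size, Oracle's being 10):

try (PreparedStatement ps = con.prepareStatement("SELECT id, name FROM employee")) {
    ps.setFetchSize(500);                 // rows fetched per network round trip
    try (ResultSet rs = ps.executeQuery()) {
        while (rs.next()) {
            // process rs.getInt(1) and rs.getString(2)
        }
    }
}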
You can also start several threads that each query a piece of the data, and then merge the results from all the threads.
EDIT: doesn't apply to this situation because id and name are the only columns on this table, but still useful for other readers to note.
If you create an index covering both id and name, then the database can use that index to read the data faster, since it won't even have to read the table.
See this link for a more thorough explanation.
if the index contains all the columns you’re requesting it doesn’t even need to look in the table. That concept is known as index coverage.
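For readers whose tables have more columns than they select, a hedged sketch of creating such a covering index (the index name and the use of plain JDBC here are just for illustration):

try (Statement st = connection.createStatement()) {
    // an index listing every selected column lets the engine answer the
    // query from the index alone, without touching the table blocks
    st.execute("CREATE INDEX idx_employee_id_name ON employee (id, name)");
}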
I was curious about how fast we can execute MySQL queries through a loop in Java, and it is taking an extremely long time with the code I wrote. I am sure there is a faster way to do it. Here is the piece of code that executes the query.
PreparedStatement ps = connection.prepareStatement("INSERT INTO books.author VALUES (?,?,?)");
for (int i = 0; i < 100000; i++) {
    ps.setString(1, test[i][0]);
    ps.setString(2, test[i][1]);
    ps.setString(3, test[i][2]);
    ps.addBatch();
}
int[] p = ps.executeBatch();
Any help would be much appreciated. Thank you
Your basic approach is correct. Since you are using MySQL you may want to add rewriteBatchedStatements=true to your connection URL as discussed in this answer. It has been known to significantly speed up batch INSERT operations to MySQL databases.
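A sketch of what that connection URL change looks like (host, port, schema, and credentials are placeholders):

String url = "jdbc:mysql://localhost:3306/books?rewriteBatchedStatements=true";
Connection connection = DriverManager.getConnection(url, "user", "password");
// with this flag the driver can rewrite the batched single-row INSERTs into
// multi-row INSERT statements, cutting the number of network round trips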
There are a number of other things that kick in when you have a huge batch, so going "too" big will slow down the rows/second. This depends on a few settings, the specific table schema, and other things. I have seen a too-big batch run twice as slow as a more civilized batch.
Think of the overhead of parsing the query as being equivalent to inserting an extra 10 rows. Based on that Rule of Thumb, a batch insert of 1000 rows is within 1% of the theoretical maximum speed.
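Applied to the code in the question, that suggests flushing the batch in chunks instead of accumulating all 100,000 rows before a single executeBatch(); a hedged sketch, using the 1000-row chunk size suggested above:

final int BATCH_SIZE = 1000;              // rule-of-thumb chunk size, tune for your schema
try (PreparedStatement ps =
         connection.prepareStatement("INSERT INTO books.author VALUES (?,?,?)")) {
    for (int i = 0; i < test.length; i++) {
        ps.setString(1, test[i][0]);
        ps.setString(2, test[i][1]);
        ps.setString(3, test[i][2]);
        ps.addBatch();
        if ((i + 1) % BATCH_SIZE == 0) {
            ps.executeBatch();            // send this chunk to the server
        }
    }
    ps.executeBatch();                    // flush the final partial chunk
}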
It may be faster to create a file with all the rows and do LOAD DATA. But, again, this depends on various settings, etc.
I have a complex SQL statement that takes a long time to execute. This is going to be a problem as more users start using the system simultaneously.
Are there any options for sorting the results in advance and then assigning them to Java POJOs using Hibernate? That way the processed information is already sitting in the MySQL DB waiting for retrieval, without doing the work at execution time...
I've looked into DB views + Hibernate but didn't find much...
I think you should look at indexing. I don't think it is possible to prefetch the results of SQL queries. If the query cannot be optimized and it is really, REALLY important, then maybe you can use some parallel implementation for processing the query.
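On the "DB views" idea from the question: one common pattern is to put the expensive, pre-sorted query behind a database view and map a read-only Hibernate entity onto it. A hedged sketch, with every name hypothetical:

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;
import org.hibernate.annotations.Immutable;

// assumes a view created as: CREATE VIEW report_summary AS <the complex, pre-sorted query>
@Entity
@Table(name = "report_summary")   // the view, not a base table
@Immutable                        // Hibernate treats the mapping as read-only
public class ReportSummary {
    @Id
    private Long id;
    private String label;
    private Long total;
    // getters/setters omitted
}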
I need some help from you guys regarding JDBC performance optimization. One of our POJOs is using JDBC to connect to an Oracle database and retrieve records. Basically the records are email addresses, based on which emails will be sent to the users. The problem here is the performance. This process happens every weekend and the records are very large in number, around 100k.
The performance is very slow and it worries us a lot. Only 1000 records seem to be fetched from the database every hour, which means it will take 100 hours for this process to complete (which is very bad). Please help me on this.
The database server and the Java process are on two different remote servers. We have used rs_email.setFetchSize(1000); hoping it would make a difference, but there is no change at all.
The same query executed on the server takes 0.35 seconds to complete. Any quick suggestion would be of great help to us.
Thanks,
Aamer.
First look at your queries. Analyze them. See if the SQL could be made more efficient (i.e., ask the database for what you want, not for what you don't want -- it makes a big difference). Also check whether there are indexes on the fields in your WHERE and JOIN clauses. Indexes make a big difference, but they can't be just any indexes. They have to be good indexes (i.e., the fields that make up the index provide enough uniqueness for the database to retrieve things efficiently). Work with your DBA on this. Look for either high run time against the DB or queries with high CPU usage (even if the queries run sub-second). These are the things that can kill your database.
Also, from a code perspective, check whether you are opening and closing your connections or re-using them. That can make a big difference too.
It would help to post your code, queries, table layouts, and any indexes you have.
Use log4jdbc to get the real SQL used to fetch a single record. Then check the speed and the execution plan for that SQL. You may need a proper index or even DB defragmentation.
Not sure about the Oracle driver, but I do know that the MySQL driver supports two different result retrieval methods: "stream" and "wait until you've got it all".
The streaming method lets you start processing the results the moment you've got the first row returned from the query, whereas the other method retrieves the entire result set before you can start working on it. In cases where you deal with huge record sets, the latter often leads to memory exceptions, or slow performance because Java hits the "memory roof" and the garbage collector can't throw away "used" records the way it can in streaming mode.
The streaming mode doesn't let you navigate/scroll the result set the way the "normal"/"wait until you've got it all" mode does, though...
Anyway, not sure if this is of any help but it might be worth checking out.
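For reference, a sketch of how the streaming mode is enabled with the MySQL Connector/J driver (the table name and sendMail() method are placeholders; whether the Oracle driver offers an equivalent is a separate question):

Statement stmt = connection.createStatement(
        ResultSet.TYPE_FORWARD_ONLY,
        ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);     // tells Connector/J to stream rows one at a time
try (ResultSet rs = stmt.executeQuery("SELECT email FROM recipients")) {
    while (rs.next()) {
        sendMail(rs.getString(1));        // hypothetical per-row processing
    }
}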
My answer to your question, in summary, is:
1. Check network
2. Check SQL
3. Check Java code.
It sounds very slow. The first thing to check would be whether you have a slow network. You can do this pretty quickly by just pinging the database server, or by running the database server on the same machine as your JVM. If it is not the network, get an explain plan for your SQL and ensure you are not doing table scans when you don't need to be. If it is not the network or the SQL, then it's time to check your Java code. Are you doing anything like blocking when you shouldn't be?
I'm trying to create a Java program to clean up and merge rows in my table. The table is large, about 500k rows, and my current solution is running very slowly. The first thing I want to do is simply get an in-memory array of objects representing all the rows of my table. Here is what I'm doing:
pick an increment of say 1000 rows at a time
use JDBC to fetch a resultset on the following SQL query
SELECT * FROM TABLE WHERE ID > 0 AND ID < 1000
add the resulting data to an in-memory array
continue querying all the way up to 500,000 in increments of 1000, each time adding results.
This is taking way too long. In fact it's not even getting past the second increment, from 1000 to 2000. The query takes forever to finish (although when I run the same thing directly through a MySQL browser it's decently fast). It's been a while since I've used JDBC directly. Is there a faster alternative?
First of all, are you sure you need the whole table in memory? Maybe you should consider (if possible) selecting only the rows that you want to update/merge/etc. If you really have to have the whole table, you could consider using a scrollable ResultSet. You can create it like this:
// make sure autocommit is off (postgres)
con.setAutoCommit(false);
Statement stmt = con.createStatement(
ResultSet.TYPE_SCROLL_INSENSITIVE, //or ResultSet.TYPE_FORWARD_ONLY
ResultSet.CONCUR_READ_ONLY);
ResultSet srs = stmt.executeQuery("select * from ...");
It enables you to move to any row you want by using the absolute() and relative() methods.
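For example, continuing from the snippet above:

srs.absolute(1000);              // jump straight to row 1000
String firstColumn = srs.getString(1);
srs.relative(-500);              // move back 500 rows from the current position
srs.beforeFirst();               // rewind to before the first row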
One thing that helped me was Statement.setFetchSize(Integer.MIN_VALUE). I got this idea from Jason's blog. This cut execution time by more than half, and memory consumption went down dramatically (as only one row is read at a time).
This trick doesn't work for PreparedStatement, though.
Although it's probably not optimal, your solution seems like it ought to be fine for a one-off database cleanup routine. It shouldn't take that long to run a query like that and get the results (I'm assuming that since it's a one-off, a couple of seconds would be fine). Possible problems -
Is your network (or at least your connection to MySQL) very slow? If so, you could try running the process locally on the MySQL box, or on something better connected.
Is there something in the table structure that's causing it? Pulling down 10k of data for every row? 200 fields? Calculating the id values to fetch based on a non-indexed column? You could try finding a more DB-friendly way of pulling the data (e.g. just the columns you need, having the DB aggregate values, etc.).
If you're not getting through the second increment, something is really wrong -- efficient or not, you shouldn't have any problem dumping 2,000 or 20,000 rows into memory on a running JVM. Maybe you're storing the data redundantly or extremely inefficiently?