I have a complex SQL statement that takes a long time to execute. This is going to be a problem as more users start using the system simultaneously.
Are there any options for computing and sorting the results in advance and then mapping them to Java POJOs using Hibernate? That way the processed information is already sitting in the MySQL DB waiting for retrieval, instead of being computed when the query executes...
I've looked into DB Views + Hibernate but didn't find much...
I think you should look at indexing. I don't think it is possible to prefetch the results of SQL queries. If the query cannot be optimized and it is really REALLY important, then you can maybe use some parallel implementation for processing the query.
Related
I am working on a Java service (Hibernate) and I am sequentially calling a count query and a query that fetches the corresponding records (both native queries). There are cases where the count differs from the number of records actually returned by the query that retrieves the data.
I would like to ensure that both queries operate on the same dataset.
Any ideas on this?
I guess it is not a good idea to rely on counts.
Think about what the primary key of a record stands for, or which other fields identify the records you need.
The dataset retrieved by the client reflects what was in the DB at the time you ran your query.
It is possible to lock the table or the records until your transaction is committed, but that is dangerous and I do not recommend it if the DB is used by multiple services, clients or threads in parallel. I guess you have such a system, since the counts change while your queries run.
Locks need very careful handling, and they can easily slow down or hang other threads.
The following query generated by hibernate takes 13+ seconds and locks the table:
SELECT COUNT(auditentit0_.audit_id) AS col_0_0_ FROM Audit auditentit0_ WHERE 1=1;
The growing Microsoft SQL server database table contains 90+ million rows.
For Microsoft SQL server, I have found an accurate meta data way of getting the same information very quickly.
However, I would rather not write custom code for Microsoft sql server and oracle (the next database) if hibernate has a way of getting this information.
Here is an example meta data query for Microsoft sql server that is accurate and almost instant:
SELECT SUM (row_count) FROM sys.dm_db_partition_stats WHERE object_id=OBJECT_ID('huge_audit_table') AND (index_id=0 or index_id=1);
Is there a way to have hibernate issue a similar query for a table row count?
One posted answer has indicated that a view could be of use. I'm investigating this post to see if it can solve the issue:
https://vladmihalcea.com/map-jpa-entity-to-view-or-sql-query-with-hibernate/
In hibernate you should use projections like in the link you provided in order to guarantee that it works on multiple dbms:
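import org.hibernate.Criteria;
import org.hibernate.criterion.DetachedCriteria;
import org.hibernate.criterion.Projections;
// Portable row count; getSession() is assumed to be provided by the surrounding DAO class.
protected Long countByCriteria(DetachedCriteria criteria) {
    Criteria crit = criteria.getExecutableCriteria(getSession());
    crit.setProjection(Projections.rowCount());
    return (Long) crit.uniqueResult();
}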
What engine are you using in mysql? I never had a blocking problem with row count in MySql or Oracle. Maybe the following link will help you: Any way to select without causing locking in MySQL?
Also, after some quick reading i see that Sql Server does indeed block on count.
Maybe you could use a stored procedure or some other mechanism to pass the problem to the dbms.
Edit:
Projections in Hibernate are used to select the columns to fetch, the columns to group elements by, and to use built-in aggregate functions (sum, count, avg, max, min, countDistinct).
It helps you keep your application database-agnostic. Remember that Hibernate supports around 30 databases.
In your case you have a specific problem with MSSQL, as the count blocks the table to prioritize accuracy. Using the system views is really quick because you get an estimate, but it isn't standard.
You could encapsulate the problem in a DBMS-dependent view or stored procedure. Or maybe you could try a NOLOCK hint or READ UNCOMMITTED in Hibernate (for a count on an audit table that should be acceptable).
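One way to keep the DBMS-dependent part in a single place is to pick the count query by database product. This is only a sketch under assumptions: `RowCountSql` and `forProduct` are made-up names, the SQL Server branch reuses the metadata query from the question, and every other database falls back to a plain COUNT.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Chooses a row-count query based on the database product name (hypothetical helper). */
final class RowCountSql {
    // Only plain identifiers are accepted, because the table name is concatenated into the SQL.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    private RowCountSql() {}

    static String forProduct(String productName, String table) {
        if (!IDENTIFIER.matcher(table).matches()) {
            throw new IllegalArgumentException("Unexpected table name: " + table);
        }
        String product = productName == null ? "" : productName.toLowerCase(Locale.ROOT);
        if (product.contains("microsoft sql server")) {
            // Metadata query from the question: reads partition statistics instead of scanning the table.
            return "SELECT SUM(row_count) FROM sys.dm_db_partition_stats"
                 + " WHERE object_id = OBJECT_ID('" + table + "')"
                 + " AND (index_id = 0 OR index_id = 1)";
        }
        // Portable fallback for every other database.
        return "SELECT COUNT(*) FROM " + table;
    }
}
```

The product name would typically come from `connection.getMetaData().getDatabaseProductName()`; the exact string depends on the driver, so check what yours returns. The resulting SQL can then be run as a native query.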
To solve this particular problem we stepped back and changed how the UI functions. Through a collaborative effort between UIX and UI developers we agreed that unfiltered queries will NOT ask for total counts. The initial screen load will show only a page full of data. No "page 1 of 60,000" controls will exist. Only when the user enters specific criteria will the total count come into play. Those queries should be very fast. Now... it is still possible for the user to set up a query that is just as bad as the original problem. It should be the exception rather than the norm.
So there really is not a solid answer for the OP. If you are faced with this type of problem and you have control of the UI and API, then it is time to rethink the solution. Think of how Google handles paging from a UI perspective. The days of showing a "page 1 of (XX)" are gone IMHO.
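For what it's worth, one common way to page without a total count (not necessarily what we built) is to ask the database for one row more than the page size and use that extra row only to decide whether a "next" link is shown. A minimal sketch, with `PageSlice` as a hypothetical name:

```java
import java.util.ArrayList;
import java.util.List;

/** One page of results plus a flag telling the UI whether a "next" link makes sense. */
final class PageSlice<T> {
    final List<T> rows;
    final boolean hasNext;

    private PageSlice(List<T> rows, boolean hasNext) {
        this.rows = rows;
        this.hasNext = hasNext;
    }

    /**
     * @param fetched  rows returned by a query limited to pageSize + 1 rows
     * @param pageSize number of rows shown per page
     */
    static <T> PageSlice<T> of(List<T> fetched, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        boolean hasNext = fetched.size() > pageSize;
        // Drop the extra look-ahead row before handing the page to the UI.
        int visible = Math.min(pageSize, fetched.size());
        return new PageSlice<>(new ArrayList<>(fetched.subList(0, visible)), hasNext);
    }
}
```

The query behind `fetched` never needs a COUNT; it only needs a row limit of `pageSize + 1`.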
I am using JDBC with MySQL. Let's assume there is a table in my DB called Test with 700k rows. Fetching all rows takes a huge amount of time. I am using a PreparedStatement, but I want to use multithreading, with say 10 threads: e.g. the 1st thread fetches 70k rows, the 2nd fetches the next 70k, and so on. How do I implement this?
Forgive me if this is too obvious and you tried it or it won't work in your situation, but caching might be very helpful here.
Regarding actually doing it with multi-threading, it might make sense to have some procedure you run (you might need a new column in your table to do this) that assigns ids you can query, something like "WHERE id BETWEEN value1 AND value2". Each thread would query a different range. This would be faster than using ORDER BY, since it avoids the need for the database to sort.
If you do want to go the order by route though, consider indexing your database so that that ordering doesn't take extra time.
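The range idea can be sketched like this, assuming a numeric `id` column whose minimum and maximum values are known (`IdRanges` is a hypothetical helper, not part of JDBC):

```java
import java.util.ArrayList;
import java.util.List;

/** Splits an inclusive id range into contiguous chunks, one per worker thread. */
final class IdRanges {
    private IdRanges() {}

    /** Returns at most {@code parts} arrays of {from, to}, both bounds inclusive. */
    static List<long[]> split(long minId, long maxId, int parts) {
        if (parts < 1 || maxId < minId) {
            throw new IllegalArgumentException("need parts >= 1 and maxId >= minId");
        }
        long total = maxId - minId + 1;
        long chunk = (total + parts - 1) / parts; // ceiling division
        List<long[]> ranges = new ArrayList<>();
        for (long from = minId; from <= maxId; from += chunk) {
            long to = Math.min(from + chunk - 1, maxId);
            ranges.add(new long[] {from, to});
        }
        return ranges;
    }
}
```

With ids 1 to 700,000 and 10 parts, the first chunk is 1..70,000 and the last is 630,001..700,000. Each worker would bind its pair to a statement such as `SELECT ... FROM Test WHERE id BETWEEN ? AND ?` on its own connection. If the ids have large gaps, the chunks will hold uneven numbers of rows.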
I need some help with JDBC performance optimization. One of our POJOs uses JDBC to connect to an Oracle database and retrieve records. The records are email addresses, based on which emails will be sent to the users. The problem here is the performance. This process happens every weekend and the number of records is very large, around 100k.
The performance is very slow and it worries us a lot. Only 1000 records seem to be fetched from the database every 1 hour, which means that it will take 100 hours for this process to complete (which is very bad). Please help me on this.
The database server and the java process are in two different remote servers. We have used rs_email.setFetchSize(1000); hoping that it would make any difference but no change at all.
The same query executed on the server takes 0.35 seconds to complete. Any quick suggestion would be of great help to us.
Thanks,
Aamer.
First look at your queries. Analyze them. See if the SQL could be made more efficient (i.e., ask the database for what you want, not for what you don't want; it makes a big difference). Also check whether there are indexes on the fields in your where and join clauses. Indexes make a big difference. But it can't be just any indexes. They have to be good indexes (i.e., the fields that make up the index must be selective enough for the database to retrieve rows efficiently). Work with your DBA on this. Look for queries with a high run time against the DB, or for queries with high CPU usage (even if they run sub-second). These are the things that can kill your database.
Also from a code perspective, check to see if you are opening and closing your connections or if you are re-using them. Can make a big difference too.
It would help to post your code, queries, table layouts, and any indexes you have.
Use log4jdbc to get the real SQL used for fetching a single record. Then check the speed and the execution plan for that SQL. You may need a proper index or even DB defragmentation.
Not sure about the Oracle driver, but I do know that the MySQL driver supports two different results retrieval methods: "stream" and "wait until you've got it all".
The streaming method lets you start processing the results the moment the first row is returned from the query, whereas the other method retrieves the entire result set before you can start working on it. When you deal with huge record sets, the latter often leads to out-of-memory errors, or to slow performance because Java hits the "memory roof" and the garbage collector can't throw away "used" records like it can in streaming mode.
The streaming mode doesn't let you navigate/scroll the result set the way the "normal"/"wait until you've got it all" mode does...
Anyway, not sure if this is of any help but it might be worth checking out.
My answer to your question, in summary is:
1. Check network
2. Check SQL
3. Check Java code.
It sounds very slow. The first thing to check is whether you have a slow network. You can do this pretty quickly by just pinging the database server. Or run the database server on the same machine as your JVM. If it is not the network, get an explain plan for your SQL and ensure you are not doing table scans when you don't need to. If it is neither the network nor the SQL, then it's time to check your Java code. Are you doing anything like blocking when you shouldn't be?
I've been looking around trying to determine some Hibernate behavior that I'm unsure about. In a scenario where Hibernate batching is properly set up, will it only ever use multiple insert statements when a batch is sent? Is it not possible to use a DB independent multi-insert statement?
I guess I'm trying to determine if I actually have the batching set up correctly. I see the multiple insert statements but then I also see the line "Executing batch size: 25."
There's a lot of code I could post but I'm trying to keep this general. So, my questions are:
1) What can you read in the logs to be certain that batching is being used?
2) Is it possible to make Hibernate use a multi-row insert versus multiple insert statements?
Hibernate uses multiple insert statements (one per entity to insert), but sends them to the database in batch mode (using Statement.addBatch() and Statement.executeBatch()). This is the reason you're seeing multiple insert statements in the log, but also "Executing batch size: 25".
The use of batched statements greatly reduces the number of roundtrips to the database, and I would be surprised if it were less efficient than executing a single statement with multiple inserts. Moreover, it also allows mixing updates and inserts, for example, in a single database call.
I'm pretty sure it's not possible to make Hibernate use multi-row inserts, but I'm also pretty sure it would be useless.
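To make the mechanism concrete, here is roughly what batching looks like in plain JDBC. It is a sketch of the addBatch/executeBatch pattern, not Hibernate's actual code; the `person` table and the `JdbcBatchInsert` class are invented for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Plain-JDBC batching: many single-row INSERTs, few round trips. */
final class JdbcBatchInsert {
    private JdbcBatchInsert() {}

    /**
     * Inserts every name into the hypothetical table person(name).
     *
     * @return how many times a batch was sent to the database
     */
    static int insertNames(Connection connection, List<String> names, int batchSize)
            throws SQLException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batchesSent = 0;
        String sql = "INSERT INTO person (name) VALUES (?)"; // hypothetical table
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            int pending = 0;
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();              // queue the row on the client side
                if (++pending == batchSize) {
                    ps.executeBatch();      // send the queued rows together
                    batchesSent++;
                    pending = 0;
                }
            }
            if (pending > 0) {              // flush the final partial batch
                ps.executeBatch();
                batchesSent++;
            }
        }
        return batchesSent;
    }
}
```

With 60 names and a batch size of 25, `executeBatch()` runs three times (25, 25 and 10 rows), while the statement text stays a single-row INSERT. That matches the log pattern described in the question.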
I know that this is an old question, but I had the same problem: I thought that Hibernate batching means that Hibernate combines multiple inserts into one statement, which it doesn't seem to do.
After some testing I found this answer, which says that a batch of multiple inserts is just as good as a multi-row insert. I did a test inserting 1000 rows, once using Hibernate batching and once without. Both tests took about 20s, so there was no performance gain from using Hibernate batching.
To be sure, I tried the rewriteBatchedStatements option of MySQL Connector/J, which actually combines multiple inserts into one statement. It reduced the time to insert 1000 records to 3s.
So after all, Hibernate batching seems to be useless and a real multi-row insert seems to be much better. Am I doing something wrong, or what causes my test results?
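For comparison, this is the shape of statement a multi-row insert uses, as opposed to one single-row INSERT per entity. The helper below only builds the SQL text; `MultiRowInsertSql` is a hypothetical name, and how Connector/J rewrites batches internally may differ from this.

```java
import java.util.Collections;

/** Builds the text of a multi-row INSERT with JDBC placeholders (illustration only). */
final class MultiRowInsertSql {
    private MultiRowInsertSql() {}

    /** For two columns and rowCount = 2: INSERT INTO t (a, b) VALUES (?, ?), (?, ?) */
    static String build(String table, String[] columns, int rowCount) {
        if (columns.length == 0 || rowCount < 1) {
            throw new IllegalArgumentException("need at least one column and one row");
        }
        // One "(?, ?, ...)" group per row, with one placeholder per column.
        String row = "(" + String.join(", ", Collections.nCopies(columns.length, "?")) + ")";
        return "INSERT INTO " + table
             + " (" + String.join(", ", columns) + ") VALUES "
             + String.join(", ", Collections.nCopies(rowCount, row));
    }
}
```

With Connector/J the rewrite is switched on by adding `rewriteBatchedStatements=true` to the JDBC URL, so the application code keeps using ordinary batches.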
Oracle bulk insert collects an array of entities and passes it to the DB in a single block, associating it with a single looping insert/update/delete.
It is the only way to improve network throughput.
Oracle suggests doing this by calling a stored procedure from Hibernate and passing it an array of data.
http://biemond.blogspot.it/2012/03/oracle-bulk-insert-or-select-from-java.html?m=1
This is not only a software problem but also an infrastructure one!
The problem is network data-flow optimization and TCP fragmentation.
MySQL has a function for this.
You have to do something like what is described in this article.
The solution is to transfer the right volume of data per network round trip.
You also have to check the network MTU and the Oracle SDU/TDU settings against the amount of data transferred between the application and the database.