I have a query which has two 'in' clauses. The first in clause takes around 125 values and the second takes around 21000 values. It's implemented using JPA CriteriaBuilder.
The query itself executes very fast and returns results within seconds. The only problem is that entityManager.createQuery(CriteriaQuery) takes around 12-13 minutes to return.
I searched all over SO, but all the threads are about the performance of Query.getResultList. None of them discuss the performance of entityManager.createQuery(CriteriaQuery). If you have seen such behavior before, please let me know how to resolve it.
My JDK version is 1.7. The javaee-api dependency version is 6.0. The application is deployed on JBoss EAP 6.4, but that's not the concern for now, as I am testing my code with JUnit using an EntityManager connected to an actual Oracle database. If you require more information, kindly let me know.
A hybrid approach is to dynamically create a query and then save it as a named query in the entity manager factory.
At that point it becomes just like any other named query that may have been declared statically in metadata. While this may seem like a good compromise, it turns out to be useful in only a few specific cases. The main advantage it offers is if there are queries that are not known until runtime, but then reissued repeatedly. Once the dynamic query becomes a named query it will only bear the cost of processing once.
It is implementation-specific whether that cost is paid when the query is registered as a named query, or deferred until the first time it is executed.
A dynamic query can be turned into a named query by using the EntityManagerFactory addNamedQuery() method.
Keep us informed of the result, and good luck.
I observed that a single query with 21 IN clauses (each with 1000 expressions), all combined with OR, ran slower. I tried another approach: executing each IN clause as part of a separate query. These 21 individual queries performed better overall.
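A minimal sketch of the batching described above, assuming the values are held in a List and each batch is bound to the IN clause of its own query. The class name InClauseBatches, the method partition, and the batch size of 1000 are illustrative assumptions; running the query per batch is left as a comment because it depends on the entity model.

```java
import java.util.ArrayList;
import java.util.List;

final class InClauseBatches {

    // Splits values into consecutive sublists of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> values, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<List<T>>();
        for (int start = 0; start < values.size(); start += batchSize) {
            int end = Math.min(start + batchSize, values.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<T>(values.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<Integer>();
        for (int i = 0; i < 21000; i++) {
            ids.add(i);
        }
        List<List<Integer>> batches = partition(ids, 1000);
        // Hypothetical next step: run one query per batch, binding the batch
        // to a single IN clause, and append each result list to a combined list.
        System.out.println(batches.size() + " batches");
    }
}
```

Each batch stays within the 1000-expression size mentioned above, and the per-batch results can be concatenated in application code.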
Another issue I observed was that a query built with CriteriaBuilder was slow when the result set was huge (around 20K rows). I solved this by adding a query hint to my typed query:
TypedQuery.setHint("org.hibernate.fetchSize", 5000);
Hope it will help others.
The code in Hibernate is not designed for binding large numbers of parameters:
for ( ImplicitParameterBinding implicitParameterBinding : parameterMetadata.implicitParameterBindings() ) {
    implicitParameterBinding.bind( jpaqlQuery );
}
Unfortunately, you need to find a different approach if you want to do something similar.
I'm running Hibernate 4.1 with Javassist runtime instrumentation running on top of Hikari pool, on top of Oracle 12c. JDK is 1.7.
I have a query that runs pretty fast on the database and fetches about 500 entities in Hibernate. The query runtime, according to JProfiler is quite small, about 11 ms, but in total Query.list runs about 7 seconds.
I've tried removing all filters, and it shows that most of the time is spent in Javassist and other Hibernate-related reflection calls (like AbstractProxyHandler and such). I read that the reflection overhead should be pretty small, but here it seems to be far too large.
Could you please advise what could be the bottleneck here?
Make sure the object you are retrieving does not have sub-objects that are being fetched lazily as a SELECT instead of eagerly as a JOIN. This can result in a behavior known as SELECT N + 1, where Hibernate ends up running a query to get the 500 objects from their respective table, then an additional query for each object to get the child object. If you have 4 or 5 relationships that are being fetched as SELECT statements for each record, and you have 500 records, suddenly you're running around 2000 queries in order to get the List.
I would recommend turning on the SQL logging for Hibernate to see which queries it's running. If it dumps out a really long list of SELECT queries when you're fetching your list, look at your mapping file to see how your relationships are set up. Try adjusting them to be a fetch="join" and see if those queries go away and if the performance improves.
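The arithmetic behind SELECT N + 1 can be sketched as follows. This is only an illustrative count, not Hibernate code: the class and method names are hypothetical, and it assumes every relationship fetched as a SELECT costs one extra query per parent row, with no caching or batch fetching, and that the join-fetched case can be served by a single statement.

```java
final class SelectNPlusOne {

    // One query for the parent rows, plus one query per parent row for
    // each relationship that is fetched with a separate SELECT.
    static long queriesWithSelectFetch(int parentRows, int selectFetchedRelations) {
        return 1L + (long) parentRows * selectFetchedRelations;
    }

    // With join fetching, parents and children are assumed to arrive together.
    static long queriesWithJoinFetch() {
        return 1L;
    }

    public static void main(String[] args) {
        System.out.println("select fetch: " + queriesWithSelectFetch(500, 4));
        System.out.println("join fetch:   " + queriesWithJoinFetch());
    }
}
```

With 500 records and 4 relationships this gives the roughly 2000 queries mentioned above, which is the pattern to look for in the SQL log.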
Here are some possibly related Stack Overflow questions that may be able to provide more specific details.
Hibernate FetchMode SELECT vs JOIN
What is N+1 SELECT query issue?
Something else to note about profilers and other tools of that nature. Often when tracking down a performance issue, a particular block of code will show up as where the most time is being spent. The common conclusion people tend to jump to is that the block of code in question is slow. In your case, you seem to be observing Hibernate's reflective code as being where the time is spent. It is very important to consider that this code itself might not actually be slow, but that it is where the time is being spent because it is being called frequently due to algorithmic complexity. I have found that in many problems serialization or reflection appears to be slow, when the reality is that the code is used to communicate with an external resource, and that resource is being called 1000s of times when it should only be called a handful of times. Making 1000s of queries to fetch your list will result in your sampling showing that a lot of time is being spent in the code that processes those queries. Be careful not to mistake code that is called often due to a design/configuration issue for code that is slow. The problem very likely does not lie in Hibernate's use of reflection, since reflection generally isn't slow on the order of seconds.
I have a question I would like some help with.
I have the following query running from Java:
SELECT DISTINCT field1, field1
from tblTableA WITH (NOLOCK)
WHERE criteriaField='CONSTANT TEXT'
I run it with JPA:
Query qry = entMgr.createNativeQuery(myQry) ;
List sqlResult = qry.getResultList() ;
Now, qry.getResultList() takes too much time to run: 75 seconds or more. Yes, it returns close to 700 000 records, but the same query run on WebLogic 10 using EJB2 finishes in less than 5 seconds.
Can anyone help resolve this issue? It seems like there may be a configuration I am missing, or a technique I am not following.
I came across something about using jbosscmp-jdbc.xml.
I don't have that in my setup, but I found out that there is a lazy-loading feature that we can configure. Now, I am not sure how to configure the query I am running in the XML file.
Also, can this be done with annotations instead of an XML file?
I would try to run this query inside of a non-transactional method:
@TransactionAttribute(TransactionAttributeType.NOT_SUPPORTED)
List getResults(..){
    // runs outside any transaction
    Query qry = entMgr.createNativeQuery(myQry);
    return qry.getResultList();
}
This is sometimes not allowed, depending on the environment, and is mainly used to optimize queries that are expected to return large result sets which would later be managed by the PersistenceContext (so basically when you would use HQL instead of native SQL).
But I would give it a try.
You are performing this select query within a transaction scope. I found an old JIRA ticket on JBoss's site. As the ticket suggests, there is a potential problem around the flush. If you perform a query with EJB3, a flush is performed or attempted automatically for all the objects you retrieve with your native query. The idea seems to be to avoid getting stale objects from the database, but in your case that is not applicable. Set the flush mode to COMMIT and see if the performance improves.
query.setFlushMode( FlushModeType.COMMIT );
Also turn off the Hibernate logging and see if that makes any difference.
I've been using ORM frameworks for a while, but I am rather new to Hibernate.
Suppose you have a query (whether it is a Query or a Criteria does not matter) that retrieves a large result set, and that you want to paginate through it. Would you rather use the setMaxResults() and setFirstResult() combo, or a ScrollableResults?
Which is the best approach regarding the performances (execution time AND memory consumption)?
If you are implementing a Web application that serves separate pages of results in separate request-response cycles, then there's no way you can use ScrollableResults to any advantage. Use setFirstResult/setMaxResults. However, this can be a real performance killer, depending on the exact query and the total size of the result, especially if the poor db must sort the whole result set every time so it can calculate which are the 100-110th records.
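A minimal sketch of the offset arithmetic behind setFirstResult/setMaxResults paging, assuming 1-based page numbers. The class PageWindow and its methods are hypothetical helpers, not part of Hibernate or JPA; the values they return are what one would pass to setFirstResult and setMaxResults.

```java
final class PageWindow {

    // Zero-based index of the first row on the given 1-based page.
    static int firstResult(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        return (pageNumber - 1) * pageSize;
    }

    // Number of pages needed to show totalRows rows (rounds up).
    static int pageCount(long totalRows, int pageSize) {
        if (totalRows < 0 || pageSize < 1) {
            throw new IllegalArgumentException("totalRows must be >= 0 and pageSize >= 1");
        }
        return (int) ((totalRows + pageSize - 1) / pageSize);
    }
}
```

For page 11 with a page size of 10, firstResult(11, 10) is 100, so the query would be given setFirstResult(100) and setMaxResults(10). The database may still have to sort every row before that window on each request, which is the cost described above.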
We had the same question the other day, and settled for setMaxResults(..) and setFirstResult(..). There are two problems with ScrollableResults:
ScrollableResults may execute one query for each call to next() if your JDBC driver or database is not handling it properly. This was the case for us (MySQL).
It is Hibernate-specific, rather than JPA standard.
I have the following configuration:
SQL Server 2008
Java as backend technology - Spring + Hibernate
Basically what I want to do is a select with a where clause on a table. The problem is the table has about 700M entries and the query takes a really long time.
Can you please give me some pointers on where to optimize the query, or what sort of techniques I can use to improve performance?
Thanks.
Using indexes is the standard technique used to deal with this problem. As requested, here are some pointers that should get you started:
http://odetocode.com/articles/70.aspx
http://www.simple-talk.com/sql/learn-sql-server/sql-server-index-basics/
http://www.petri.co.il/introduction-to-sql-server-indexes.htm
The first thing I do in this case is isolate whether it is the amount of data I am returning that is the problem or not (an i/o issue). A simple non-scientific way to do this is change your query to just return the count:
select count(*) --just return a count, no data!
from MyTable
inner join MyOtherTable on ...
where ...
If this runs very quickly, it tells you your indexes are in order (assuming no sub-selects in your WHERE clause). If not, then you need to work on indexes, the WHERE clause, or your query construction itself (JOINs being done, etc).
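If the query text is built in Java, the count-only check described above can be sketched as a small helper that wraps an existing SELECT. The class and method names are hypothetical, and the sketch assumes the statement has no trailing semicolon, no ORDER BY, and no duplicate column names, since the database may reject those inside a derived table.

```java
final class CountProbe {

    // Wraps a SELECT statement so that only the number of rows is returned.
    // Assumption: the statement is valid as a derived table (no ORDER BY,
    // no trailing semicolon, unique column names).
    static String toCountQuery(String selectSql) {
        String trimmed = selectSql.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("query must not be empty");
        }
        return "select count(*) from (" + trimmed + ") counted";
    }
}
```

Timing the wrapped statement against the original gives a rough idea of how much of the cost is finding the rows and how much is returning them.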
Once that is satisfactory, add back in your SELECT clause. If it is slow, you are going to have to look at your data access pattern:
Can you return fewer columns?
Can you return fewer rows at once?
Is there caching you can do in the application layer?
Is this query a candidate for partitioned/materialized views (if your database supports those)?
I would run Profiler to find the exact query that is being generated. ORMs can create less than optimal queries. Once you know the query, you can run it in SSMS and see the execution plan. This will give you clues as to where you have performance problems.
Several things that can cause performance problems:
Lack of correct indexing (foreign keys should be indexed if you have joins, as well as the criteria in the where clause)
Lack of sargability in the where clause, forcing the query to not use existing indexes
Returning more columns than are needed
Correlated subqueries and scalar functions that cause row-by-agonizing-row operations
Returning too much data (will anybody really be looking at 1 million records returned? You only want to return the amount you show on a page, not the whole possible recordset)
Locking and blocking
There's more (after all, whole very long books are written on this subject), but that should be enough to get you started on where to look.
You should provide indexes for the columns you often use to restrict the result. The other thing is pagination of the result set.
Regardless of the specific DB, I would do the following:
run an explain analyze
make sure you have an index for the columns that are part of your where clause
If indexes are ok, it's very likely that you are fetching a lot of records from disk, which is very slow: if you really cannot refine your query so that you fetch fewer records, consider clustering your table, to improve disk locality of your records.
I have Java code that uses Spring to connect to an Oracle DB and execute SQL. I have a query that takes a long time to execute (20 minutes or sometimes more). I have an ExecutorService with a thread that executes the query and processes the results. If I put a timeout on the DB and Spring, the system times out correctly but returns nothing before that. If I run the query from SQL*Plus, it returns values. The timeout is set to 3 times what the query takes to execute in SQL Developer.
Any ideas!?
Assuming that your Spring query is using bind variables, are you using bind variables when you execute the query in SQL*Plus/ SQL Developer? Or are you using literals?
What version of Oracle are you using?
Have you checked to see whether the query plans for the two environments are different?
20 minutes for a query in Oracle? I'll bet you don't have appropriate indexes on the columns in your WHERE clause.
The dead giveaway is to do an EXPLAIN PLAN on the query. If you see a TABLE SCAN, take appropriate measures.
If you can run the same query in SQL*Plus and see it return in a reasonable time, then I'm incorrect and the problem is due to something else that you did in Java code.
I don't see why you need a separate thread for a query. I'd run the code straight, without a thread, and see how it behaves. If you aren't indexed properly, add some; if the query brings back too much data, add WHERE clauses to restrict it. You've taken extraordinary measures without really understanding what the root cause is.
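If the worker thread is kept, the setup described in the question (a thread running the query with a time limit) can be sketched with the JDK's ExecutorService and Future.get with a timeout. This is an illustration under assumptions: the query is represented by an arbitrary Callable, and the class and method names are hypothetical. Cancelling the Future only interrupts the Java thread; whether the database stops working on the statement depends on the driver and on any statement-level timeout.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedQueryRunner {

    // Runs the task on a worker thread and waits at most timeoutMillis.
    // Returns fallback if the task does not finish in time.
    static <T> T runWithTimeout(Callable<T> task, long timeoutMillis, T fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<T> future = executor.submit(task);
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupts the worker thread
                return fallback;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                throw new IllegalStateException("interrupted while waiting", e);
            } catch (ExecutionException e) {
                // The task itself failed; surface its cause.
                throw new IllegalStateException(e.getCause());
            }
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Running the same Callable directly on the calling thread, without the executor, is a quick way to check whether the threading is part of the problem, as suggested above.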