setParameterList list with huge data - java

I have an Integer ArrayList with 8000 items in it.
I pass that list to an HQL query using the setParameterList method.
An example query:
return (Integer) sessionFactory.getCurrentSession().createQuery("update data where Id in (:list)").setParameterList("list", arrayList).executeUpdate();
But after executing the query I got this error:
java.lang.StackOverflowError
at org.hibernate.hql.ast.QueryTranslatorImpl$JavaConstantConverter.visit(QueryTranslatorImpl.java:585)
at org.hibernate.hql.ast.util.NodeTraverser.visitDepthFirst(NodeTraverser.java:64)
at org.hibernate.hql.ast.util.NodeTraverser.visitDepthFirst(NodeTraverser.java:65)
at org.hibernate.hql.ast.util.NodeTraverser.visitDepthFirst(NodeTraverser.java:66)
at org.hibernate.hql.ast.util.NodeTraverser.visitDepthFirst(NodeTraverser.java:66)
at org.hibernate.hql.ast.util.NodeTraverser.visitDepthFirst(NodeTraverser.java:66)
at org.hibernate.hql.ast.util.NodeTraverser.visitDepthFirst(NodeTraverser.java:66)
at org.hibernate.hql.ast.util.NodeTraverser.visitDepthFirst(NodeTraverser.java:66)
at org.hibernate.hql.ast.util.NodeTraverser.visitDepthFirst(NodeTraverser.java:66)
Is there any way to solve this issue in Hibernate? Maybe this would work with a plain SQL query, but I want to know whether there is another way in HQL.

If your list comes from another SQL query, try using WHERE EXISTS instead.
Otherwise, you might have to update each element independently inside a loop.
An IN clause with thousands of items is usually not handled well by databases.

This issue seems to be covered by the documentation. The authors recommend doing this kind of operation in batches.
http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/batch.html

Your query isn't a valid HQL query. You missed a set clause in the query:
update Data set foo = 7 where id in (:list)
That said, 8000 IDs in an in clause is a lot. Databases have limits on the size of a query. Oracle for example doesn't accept more than 1000 elements in an IN clause.
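A minimal sketch of the batching idea, assuming a chunk size of 1000 to stay within the Oracle limit mentioned above. InClauseBatcher, partition and updateInChunks are hypothetical names, not part of Hibernate; the ToIntFunction stands in for whatever runs the update for one chunk, for example a lambda that calls createQuery(...).setParameterList("list", chunk).executeUpdate().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical helper: splits a large id list into chunks so that each
// IN (:list) clause stays small.
final class InClauseBatcher {

    // Splits items into consecutive chunks of at most chunkSize elements.
    static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < items.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, items.size());
            chunks.add(new ArrayList<>(items.subList(from, to)));
        }
        return chunks;
    }

    // Runs updateChunk once per chunk and sums the reported row counts.
    // updateChunk is an assumption: it is expected to execute the HQL update
    // for the ids it receives and return the number of affected rows.
    static <T> int updateInChunks(List<T> ids, int chunkSize,
                                  ToIntFunction<List<T>> updateChunk) {
        int total = 0;
        for (List<T> chunk : partition(ids, chunkSize)) {
            total += updateChunk.applyAsInt(chunk);
        }
        return total;
    }
}
```

With 8000 ids and a chunk size of 1000 this issues eight updates instead of one. All chunks should run inside the same transaction if the update has to be atomic.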

Related

Is there any possibility for a workaround to PostgreSQL 32760 bind parameters limitation?

I have a JPA method in my repository that finds entities with a where clause. The problem is that I have a huge data set, and when I try to send more than 32k elements in the IN list, I receive an error. I found that this is a PostgreSQL driver limitation, but I can't find a workaround.
I tried a Pageable request, but it is hard to send only 30k at a time for 8 million records. Is there any way to send more than 30k objects in my IN list where clause?
List<Object> findAllByIdIn(List<Long> ids)
No, you don't want to do it especially if you plan to send 8 million identifiers. Working around the IN statement or bind parameter limit is inefficient. Consider the following:
Thousands of bind parameters will result in megabytes of SQL. It will take considerable time to send the SQL text to the database. In fact the database might take longer to read the SQL text than execute the query as per Tom's answer to "Limit and conversion very long IN list: WHERE x IN ( ,,, ...)" question.
SQL parsing will be inefficient. Not only do the megabytes of SQL text take time to read, but with a growing bind parameter count each query will usually have a distinct number of bound parameters. This distinct bound parameter count is going to result in each query being parsed and planned separately (see this article which explains it).
There is a hard limit of bound parameters in a SQL statement. You just discovered it, 32760.
For those types of queries it's usually better to create temporary tables. Create a new temporary table before your query, insert all the identifiers into it and join it with the entity table. This join will be equivalent to IN condition except SQL text will be short.
It's important to understand where these 8 million identifiers are loaded from. If you are pulling them from the database in a previous query just to pass them back to the next query, you most likely want to write a stored procedure. There is possibly a flaw in your current approach; JPA is not always the right tool for the job.
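The temporary table approach described above could be sketched with plain JDBC roughly as follows. This is only an illustration under assumptions: the names tmp_ids and my_entity are hypothetical, the SQL uses PostgreSQL's CREATE TEMPORARY TABLE ... ON COMMIT DROP, and the connection is assumed to have auto-commit turned off so the table lives until the end of the transaction.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

// Hypothetical helper: loads identifiers into a temporary table so the main
// query can join against it instead of using a huge IN list.
final class TempTableIds {

    // Table and column names are assumptions made for this sketch.
    static final String CREATE_SQL =
            "CREATE TEMPORARY TABLE tmp_ids (id BIGINT PRIMARY KEY) ON COMMIT DROP";
    static final String INSERT_SQL = "INSERT INTO tmp_ids (id) VALUES (?)";
    static final String SELECT_SQL =
            "SELECT e.* FROM my_entity e JOIN tmp_ids t ON t.id = e.id";

    // Creates the temporary table and fills it using JDBC batches.
    // Assumes conn.getAutoCommit() is false; with auto-commit on, the table
    // would be dropped as soon as the CREATE statement commits.
    static void loadIds(Connection conn, List<Long> ids, int batchSize) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute(CREATE_SQL);
        }
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (Long id : ids) {
                ps.setLong(1, id);
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch(); // flush a full batch
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // flush the remainder
            }
        }
    }
}
```

After loadIds returns, running SELECT_SQL on the same connection and in the same transaction returns the matching rows. The statement carries a single bind parameter per insert, so the 32760 limit no longer applies.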

Is there a way to make query return a ResultSet?

I have the following query:
@Select("SELECT * FROM "+MyData.TABLE_NAME+" where data_date = #{refDate}")
public List<MyData> getMyData(@Param("refDate") Date refDate);
This table data is HUGE! Loading so many rows in memory is not the best way!
Is it possible to have this same query return a ResultSet so that I can iterate over one item at a time?
edit:
I tried adding:
@ResultType(java.sql.ResultSet.class)
public ResultSet getMyData(@Param("refDate") Date refDate);
but it gives me:
nested exception is org.apache.ibatis.reflection.ReflectionException: Error instantiating interface java.sql.ResultSet with invalid types () or values (). Cause: java.lang.NoSuchMethodException: java.sql.ResultSet.<init>()
I'd suggest you use limit in your query. The limit X, Y syntax should work for you. Try it.
If the table is huge, the query will become slower and slower. The best way to iterate is then to filter on id and use limit,
such as
select * from table where id>0 limit 100 and then
select * from table where id>100 limit 100 etc
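The id-based iteration above can be sketched as a loop that always continues from the last id it saw. This is a minimal sketch under assumptions: KeysetPager and PageFetcher are hypothetical names, rows are reduced to their ids for brevity, ids are assumed to be positive, and the query behind fetchIdsAfter is assumed to include an order by id so that "last id of the page" is well defined.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of iterating a large table page by page, keyed on id.
final class KeysetPager {

    // Stand-in for a mapper method that would run something like:
    //   select id from table where id > #{afterId} order by id limit #{limit}
    interface PageFetcher {
        List<Long> fetchIdsAfter(long afterId, int limit);
    }

    // Calls action for every id, fetching pageSize ids at a time.
    // Returns the number of non-empty pages that were fetched.
    static int forEachId(PageFetcher fetcher, int pageSize, Consumer<Long> action) {
        long lastId = 0; // assumes all ids are greater than 0
        int pages = 0;
        while (true) {
            List<Long> page = fetcher.fetchIdsAfter(lastId, pageSize);
            if (page.isEmpty()) {
                break; // nothing left
            }
            pages++;
            for (Long id : page) {
                action.accept(id);
            }
            lastId = page.get(page.size() - 1); // continue after the last id seen
            if (page.size() < pageSize) {
                break; // a short page means this was the last one
            }
        }
        return pages;
    }
}
```

Using the last id seen rather than a fixed step of 100 keeps the loop correct when ids have gaps, and only one page of rows is held in memory at a time.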
You have multiple options:
Use pagination on the database side
I will assume the database is Oracle, although other database vendors would work too. In Oracle you have rownum, with which you can limit the number of records returned. To return the desired number of records, you build a where clause on rownum. The question is then how to supply a dynamic rownum to the query. This is where MyBatis dynamic SQL comes in: you can pass the rownum values in a parameter map and reference them in the query in the mapper XML using the #{} syntax. With this approach you filter the records at the database level and only build Java objects for the records needed on the current page.
Use pagination on the MyBatis side
The select methods on SqlSession accept a RowBounds parameter. Populate it as needed and it will return only that number of records. Here you are limiting the number of records on the MyBatis side, whereas in the first approach the same was done on the database side, which performs better.
Use a result handler
MyBatis lets you handle the rows as they are read from the JDBC result set, so you can process the results one by one right there.
See this blog entry for more details.

Using Hibernate setFirstResult without any ordering

I am new to Hibernate. I am confused by Criteria's setFirstResult method.
From the documentation it seems Hibernate returns rows starting from the given number.
Since a SQL query does not guarantee the ordering of rows without an order by clause,
how does setFirstResult work in this case (without an order by clause)?
Does Hibernate read index information from the database?
If I execute the same SQL query multiple times, the ordering might change; how does setFirstResult work in that case?
Hibernate cannot do anything on its own unless it is supported by the underlying database, because Hibernate queries are ultimately translated to SQL.
That said, it uses the underlying database's capabilities; for PostgreSQL and MySQL, for example, it will generate a query like limit ? offset ? .
You can add custom order using addOrder
.addOrder( Order.asc("name") )
It is your task to add an order by; as you noticed, the function won't guarantee the same results if executed several times over the same set of data.
setFirstResult is typically used in pagination.

Large SQL dataset query using java

I have the following configuration:
SQL Server 2008
Java as backend technology - Spring + Hibernate
Basically what I want to do is a select with a where clause on a table. The problem is the table has about 700M entries and the query takes a really long time.
Can you please give me some pointers on where to optimize the query, or what sort of techniques I can use to improve performance?
Thanks.
Using indexes is the standard technique used to deal with this problem. As requested, here are some pointers that should get you started:
http://odetocode.com/articles/70.aspx
http://www.simple-talk.com/sql/learn-sql-server/sql-server-index-basics/
http://www.petri.co.il/introduction-to-sql-server-indexes.htm
The first thing I do in this case is isolate whether it is the amount of data I am returning that is the problem or not (an i/o issue). A simple non-scientific way to do this is change your query to just return the count:
select count(*) --just return a count, no data!
from MyTable
inner join MyOtherTable on ...
where ...
If this runs very quickly, it tells you your indexes are in order (assuming no sub-selects in your WHERE clause). If not, then you need to work on indexes, the WHERE clause, or your query construction itself (JOINs being done, etc).
Once that is satisfactory, add back in your SELECT clause. If it is slow, you are going to have to look at your data access pattern:
Can you return fewer columns?
Can you return fewer rows at once?
Is there caching you can do in the application layer?
Is this query a candidate for partitioned/materialized views (if your database supports those)?
I would run Profiler to find the exact query that is being generated. ORMs can create less than optimal queries. Once you know the query, you can run it in SSMS and see the execution plan. This will give you clues as to where you have performance problems.
Several things that can cause performance problems:
Lack of correct indexing (foreign keys should be indexed if you have joins, as well as the criteria in the where clause)
Lack of sargability in the where clause, forcing the query to not use existing indexes
Returning more columns than are needed
Correlated subqueries and scalar functions that cause row-by-agonizing-row operations
Returning too much data (will anybody really be looking at 1 million records returned? You only want to return the amount you show on the page, not the whole possible recordset)
Locking and blocking
There's more (after all, whole very long books are written on this subject), but that should be enough to get you started on where to look.
You should provide indexes for the columns you often use to restrict the result. The other thing is pagination of the result set.
Regardless of the specific DB, I would do the following:
run an explain analyze
make sure you have an index for the columns that are part of your where clause
If indexes are ok, it's very likely that you are fetching a lot of records from disk, which is very slow: if you really cannot refine your query so that you fetch fewer records, consider clustering your table, to improve disk locality of your records.

Hibernate Hql find result size for paginator

I need to add a paginator to my Hibernate application. I applied it to some of my database operations which I perform using Criteria by setting Projection.count(). This is working fine.
But when I use HQL to query, I can't seem to find an efficient method to get the result count.
If I do query.list().size() it takes a lot of time, and I think Hibernate loads all the objects into memory.
Can anyone please suggest an efficient method to retrieve the result count when using hql?
You'll have to use another query and Query#iterate(). See section 14.16. Tips & Tricks:
You can count the number of query results without returning them:
( (Integer) session.createQuery("select count(*) from ....").iterate().next() ).intValue()
I've done similar things in the past. If you want to get a total count of records for the paginator, then I'd suggest a separate query that you do first. This query should just do a count and return the total.
As you suspect, in order to count the records of your main query, Hibernate does have to load all the records, although it will do its best not to load all the data for each record. Still, this takes time.
Even a count query can take time if your where clauses are inefficient. So if you can get away with it, just check whether you got a full page of records back, and put up an indicator of some sort to show that there could be more results on the next page. That's the fastest method, because you only query for each page as you need it.
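The "full page means there may be more" check from the last paragraph can be sketched like this. PageView is a hypothetical class, not part of Hibernate; rows is assumed to be the result of a query limited with setFirstResult and setMaxResults(pageSize).

```java
import java.util.List;

// Hypothetical holder for one page of results plus a "maybe more" flag.
final class PageView<T> {
    final List<T> rows;
    final boolean mayHaveMore;

    // rows is assumed to come from a query limited to pageSize results.
    PageView(List<T> rows, int pageSize) {
        this.rows = rows;
        // A full page suggests another page could exist; a short page is the last one.
        this.mayHaveMore = rows.size() == pageSize;
    }
}
```

The flag can be wrong in one case: when the total number of rows is an exact multiple of the page size, the last page is full and the next page turns out to be empty. Fetching pageSize + 1 rows and showing only pageSize is a common variation that avoids this.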
