Count found rows on LIMIT affected queries using JDBI - java

I'm using JDBI and I have allowMultiQueries set to true for my JDBC connection.
For the good old pagination issue, I'd like to get the number of rows the query would have returned if no LIMIT had been set.
I've tried the following:
Running SELECT SQL_CALC_FOUND_ROWS * FROM table LIMIT 0,100; SELECT FOUND_ROWS() as myCount, but I always get 1. This is not a problem with the MySQL server, nor are the statistics stale: I can run the query in MySQL Workbench and the results are correct.
Using the same JDBI handle for both queries (the main query and the count query afterwards) without closing it. It returns 1 again.
If I do it in pure JDBC I can iterate through the ResultSet but this is not an option, since I rely heavily on JDBI
Note: I'd like to stay away from the debate over whether to use SQL_CALC_FOUND_ROWS. I need to use it in some cases, only when suitable.
Thanks in advance!

Related

Is there a way to make query return a ResultSet?

I have the following query:
@Select("SELECT * FROM "+MyData.TABLE_NAME+" where data_date = #{refDate}")
public List<MyData> getMyData(@Param("refDate") Date refDate);
This table is HUGE! Loading so many rows into memory is not the best way!
Is it possible to have this same query return a result set so that I can iterate over one item at a time?
edit:
I tried adding:
@ResultType(java.sql.ResultSet.class)
public ResultSet getMyData(@Param("refDate") Date refDate);
but it gives me:
nested exception is org.apache.ibatis.reflection.ReflectionException: Error instantiating interface java.sql.ResultSet with invalid types () or values (). Cause: java.lang.NoSuchMethodException: java.sql.ResultSet.<init>()
I'd suggest you use limit in your query. The limit X, Y syntax should work for you.
If the table is huge, the query will become slower and slower as the offset grows. The best way to iterate is then to filter on id and use limit,
such as
select * from table where id>0 limit 100 and then
select * from table where id>100 limit 100 etc
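The id-based approach above can be sketched in Java as follows. This is a minimal illustration, not tied to MyBatis: the `fetchPage` function is a hypothetical stand-in for a query such as `select id from table where id > ? order by id limit ?`, and `KeysetPager` and `inMemoryTable` are names invented for this example. It assumes ids are positive and that each page is returned in ascending id order, and it continues from the last id actually seen rather than assuming ids are contiguous.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;
import java.util.stream.Collectors;

final class KeysetPager {

    // Visits every id page by page; returns the number of ids visited.
    // fetchPage.apply(afterId, limit) must return at most `limit` ids greater
    // than `afterId`, sorted ascending (assumption of this sketch).
    static int forEachId(BiFunction<Long, Integer, List<Long>> fetchPage,
                         int pageSize,
                         Consumer<Long> action) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be >= 1");
        }
        long lastId = 0L; // assumes all ids are positive
        int seen = 0;
        while (true) {
            List<Long> page = fetchPage.apply(lastId, pageSize);
            for (Long id : page) {
                action.accept(id);
                seen++;
            }
            if (page.size() < pageSize) {
                return seen; // a short (or empty) page means the table is exhausted
            }
            lastId = page.get(page.size() - 1); // continue after the last id seen
        }
    }

    // In-memory stand-in for the database query, for demonstration only.
    static BiFunction<Long, Integer, List<Long>> inMemoryTable(List<Long> idsSortedAscending) {
        return (afterId, limit) -> idsSortedAscending.stream()
                .filter(id -> id > afterId)
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> table = Arrays.asList(1L, 2L, 5L, 9L, 10L, 42L, 100L);
        List<Long> visited = new ArrayList<>();
        int count = forEachId(inMemoryTable(table), 3, visited::add);
        System.out.println(count + " ids visited: " + visited);
    }
}
```

Only one page is held in memory at a time, and each query can use the index on id instead of skipping over an ever-growing offset.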
You have several options:
Use pagination on database side
I will assume the database is Oracle, but the idea carries over to other vendors. Oracle has a ROWNUM pseudocolumn that you can use to limit the number of records returned: you build a WHERE clause on it. The question is then how to supply the ROWNUM bounds dynamically, and this is where MyBatis dynamic SQL comes in. Pass the bounds in a parameter map and reference them in the mapper XML with the #{} syntax. With this approach the records are filtered at the database level, and Java objects are created only for the rows in the current page.
Use pagination on mybatis side
The select method on SqlSession accepts a RowBounds argument. Populate it as needed and MyBatis returns only that range of records. Here the limiting happens on the MyBatis side, whereas the first approach limits on the database side, which performs better.
Use a result handler
A ResultHandler lets you process rows one at a time as MyBatis reads them from the JDBC result set, so you can iterate over the results there without building the whole list.
See this blog entry for more details.
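For the first option, the row bounds passed in the parameter map could be computed along these lines. This is only a sketch: the class name `PageParams` and the keys `startRow` and `endRow` are hypothetical, chosen for this example, and the mapper XML would have to reference them as `#{startRow}` and `#{endRow}`. Pages are assumed to be numbered from 1 and the bounds are inclusive.

```java
import java.util.HashMap;
import java.util.Map;

final class PageParams {

    // Builds a parameter map holding 1-based, inclusive row bounds for a page.
    // The key names are assumptions of this sketch, not a MyBatis convention.
    static Map<String, Object> forPage(int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be >= 1");
        }
        int startRow = (page - 1) * pageSize + 1; // first row of the page
        int endRow = page * pageSize;             // last row of the page
        Map<String, Object> params = new HashMap<>();
        params.put("startRow", startRow);
        params.put("endRow", endRow);
        return params;
    }

    public static void main(String[] args) {
        // Bounds for the third page of 20 rows each.
        System.out.println(forPage(3, 20));
    }
}
```

Note that in Oracle a lower bound on ROWNUM only works when the ROWNUM value is first captured in a nested subquery, so the mapper query typically wraps the ordered query before filtering on both bounds.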

Using Hibernate setFirstResult without any ordering

I am new to Hibernate and confused by the Criteria setFirstResult method.
From the documentation it seems Hibernate returns rows starting from the given number.
Since a SQL query does not guarantee the ordering of rows without an ORDER BY clause,
how does setFirstResult work in this case (without an ORDER BY clause)?
Does Hibernate read index information from the database?
If I execute the same SQL query multiple times, the ordering might change. How does setFirstResult work in that case?
Hibernate cannot do anything on its own that the underlying database does not support, because Hibernate queries are ultimately translated to SQL.
That said, it uses the underlying database's capabilities: for PostgreSQL and MySQL it generates a query ending in limit ? offset ? .
You can add custom order using addOrder
.addOrder( Order.asc("name") )
It is your job to add an ORDER BY. As you noticed, without one the function won't guarantee the same results when executed several times over the same data.
setFirstResult is typically used in pagination.
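To see why the ordering matters, here is an in-memory analogue of setFirstResult and setMaxResults. It is a sketch using plain Java streams, not Hibernate's API; `OffsetPaging` and `page` are hypothetical names. The first-result index is zero-based, as in Hibernate. Because the rows are sorted with an explicit comparator before skipping, the same page comes back no matter what order the rows arrive in.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class OffsetPaging {

    // Sorts first (the ORDER BY), then skips firstResult rows and keeps maxResults rows.
    static <T> List<T> page(List<T> rows, Comparator<? super T> order,
                            int firstResult, int maxResults) {
        return rows.stream()
                .sorted(order)       // deterministic order, like ORDER BY
                .skip(firstResult)   // like setFirstResult (zero-based)
                .limit(maxResults)   // like setMaxResults
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("dave", "alice", "carol", "bob", "erin");
        List<String> b = Arrays.asList("erin", "bob", "alice", "dave", "carol");
        // Same rows in a different arrival order yield the same page once sorted.
        System.out.println(page(a, Comparator.<String>naturalOrder(), 2, 2));
        System.out.println(page(b, Comparator.<String>naturalOrder(), 2, 2));
    }
}
```

Without the sort step, skip and limit would simply follow the arrival order, which is the situation the question describes for a query with no ORDER BY.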

How to limit resultset globally (or by session) in Oracle?

I'm trying to limit the size of result sets without having to change each SQL statement in the app (at least for now). So adding a ROWNUM clause to every query is not an option.
I'm looking for some kind of global parameter, or at least a session parameter (like SET ROWCOUNT in SQL Server).
The app uses a JDBC resource pool to connect to Oracle, so I can set a session-wide parameter in the pool's init SQL.
Have you tried using the Statement.setMaxRows method? That's a standard JDBC approach that will silently drop rows after the specified maximum number have been fetched.
From a performance standpoint, however, you are likely better served by actually modifying the queries that are sent to the database. If you do as @MK suggests, you'll be telling the optimizer that you're only going to fetch N rows. That potentially allows the optimizer to pick a more efficient plan.
I suspect there is no way to do it. You could create an AOP advice which will modify each executeQuery() call by wrapping the query with
"select * from ( " + origQuery + " ) where ROWNUM <= 5"
but I would hope this is a temporary or test thing we are talking about.
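The wrapping itself could look like the helper below, which an interceptor or AOP advice might call before executing a query. It is a sketch only: `RowLimitWrapper` and `wrap` are hypothetical names, the trailing-semicolon handling is an assumption about how the app writes its SQL, and the helper does no parsing, so it suits plain SELECT statements only.

```java
final class RowLimitWrapper {

    // Wraps a SELECT in an outer query that keeps at most maxRows rows (Oracle ROWNUM).
    static String wrap(String origQuery, int maxRows) {
        if (maxRows < 0) {
            throw new IllegalArgumentException("maxRows must be >= 0");
        }
        String trimmed = origQuery.trim();
        // A trailing semicolon would be invalid inside the subquery, so drop it.
        if (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1).trim();
        }
        return "select * from ( " + trimmed + " ) where ROWNUM <= " + maxRows;
    }

    public static void main(String[] args) {
        System.out.println(wrap("select id from orders order by id", 5));
    }
}
```

Because the limit is part of the SQL text, the optimizer sees it, unlike with Statement.setMaxRows.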

JDBC setMaxRows database usage

I am trying to write a database-independent application with JDBC. I now need a way to fetch the top N entries out of some table. I saw there is a setMaxRows method in JDBC, but I don't feel comfortable using it, because I am afraid the database will push out all results and only the JDBC driver will reduce the result. If I need the top 5 results in a table with a billion rows, this will break my neck (the table has a usable index).
Writing special SQL-statements for every kind of database isn't very nice, but will let the database do clever query planning and stop fetching more results than necessary.
Can I rely on setMaxRows to tell the database not to work too much?
I guess in the worst case I can't rely on this working in the hoped way. I'm mostly interested in Postgres 9.1 and Oracle 11.2, so if someone has experience with these databases, please step forward.
"will let the database do clever query planning and stop fetching more results than necessary."
If you use
PostgreSQL:
SELECT * FROM tbl ORDER BY col1 LIMIT 10; -- slow without index
Or:
SELECT * FROM tbl LIMIT 10; -- fast even without index
Oracle:
SELECT *
FROM (SELECT * FROM tbl ORDER BY col1 DESC)
WHERE ROWNUM <= 10;
.. then only 10 rows will be returned. But if you sort your rows before picking the top 10, basically all qualifying rows will be read before they can be sorted.
Matching indexes can prevent this overhead!
If you are unsure what JDBC actually sends to the database server, run a test and have the database engine log the statements received. In PostgreSQL you can set in postgresql.conf:
log_statement = all
(and reload) to log all statements sent to the server. Be sure to reset that setting after the test or your log files may grow huge.
The thing which could/may kill you with billion(s) of rows is the (highly likely) ORDER BY clause in your query. If this order cannot be established using an index then . . . it'll break your neck :)
I would not depend on the jdbc driver here. As a previous comment suggests it's unclear what it really does (looking at different rdbms).
If you are concerned regarding speed of your query you can use a LIMIT clause as well. If you use LIMIT you can at least be sure that it's passed on to the DB server.
Edit: Sorry, I was not aware that Oracle doesn't support LIMIT.
In direct answer to your question regarding PostgreSQL 9.1: Yes, the JDBC driver will tell the server to stop generating rows beyond what you set.
As others have pointed out, depending on indexes and the plan chosen, the server might scan a very large number of rows to find the five you want. Proper server configuration can help accurately model the costs to prevent this, but if the value distribution is unusual you may need to introduce an optimization barrier (such as a CTE) to coerce the planner into producing a good plan.

Hibernate Criteria Limit mechanism?

Hibernate Criteria support provides a setMaxResults() method to limit the results returned from the db.
I can't find any answer to this in their documentation: how is this implemented? Does it query for the entire result set and then return only the requested number of rows? Or does it truly limit the query on the database end (think of the LIMIT keyword in MySQL)?
This is important because if a query could potentially return a great many results, I really need to know whether setMaxResults() will still query for all the rows in the database (which would be bad).
Also, if it truly limits the number of rows on the database end, how does it achieve this across databases (since I don't think every RDBMS supports a LIMIT clause like MySQL does)?
Hibernate asks the database to limit the results returned by the query. It does this via the dialect, which uses whatever database-specific mechanism there is to do this (so for SQL Server it will do something like "select top n * from table", Oracle will do "select * from table where rownum <= n", MySQL will do "select * from table limit n" etc). Then it just returns what the database returns.
The class org.hibernate.dialect.Dialect contains a method called supportsLimit(). If dialect subclasses override this method, they can implement row limit handling in a fashion native to their database flavor. You can see where this code is called from in the class org.hibernate.loader.Loader which has a method titled prepareQueryStatement, just search for the word limit.
However, if the dialect does not support this feature, there is a hard check in place against the ResultSet iterator that ensures Java object (entity) results stop being constructed once the limit is reached. This code is also located in Loader.
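The two paths described above, a native limit when the dialect supports one and a hard stop while reading rows otherwise, can be modelled roughly as follows. This is a simplified sketch and not Hibernate's actual code: `LimitSketch`, `LimitDialect`, `prepare` and `read` are hypothetical names, and the limit is inlined into the SQL text here, whereas Hibernate binds it as a parameter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class LimitSketch {

    // Hypothetical, much reduced stand-in for a dialect's limit support.
    interface LimitDialect {
        boolean supportsLimit();
        String applyLimit(String sql, int maxRows);
    }

    static final LimitDialect MYSQL_LIKE = new LimitDialect() {
        @Override public boolean supportsLimit() { return true; }
        @Override public String applyLimit(String sql, int maxRows) {
            return sql + " limit " + maxRows;
        }
    };

    static final LimitDialect NO_NATIVE_LIMIT = new LimitDialect() {
        @Override public boolean supportsLimit() { return false; }
        @Override public String applyLimit(String sql, int maxRows) { return sql; }
    };

    // Path 1: let the database limit the rows when the dialect can express it.
    static String prepare(LimitDialect dialect, String sql, int maxRows) {
        return dialect.supportsLimit() ? dialect.applyLimit(sql, maxRows) : sql;
    }

    // Path 2: safety net that stops building results once maxRows is reached.
    static <T> List<T> read(Iterator<T> rows, int maxRows) {
        List<T> results = new ArrayList<>();
        while (results.size() < maxRows && rows.hasNext()) {
            results.add(rows.next());
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(prepare(MYSQL_LIKE, "select * from t", 3));
        System.out.println(prepare(NO_NATIVE_LIMIT, "select * from t", 3));
        System.out.println(read(Arrays.asList(1, 2, 3, 4, 5).iterator(), 3));
    }
}
```

With the second path alone the database may still produce every row, which is why the native limit matters for large tables.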
I use both Hibernate and Hibernate Search and without looking at the underlying implementation I can tell you that they definitely do not return all results. I have implemented the same query returning all results and then changed it to set the first result and max results (to implement pagination) and the performance gains were massive.
They likely use dialect specific SQL for this, e.g. LIMIT in MySQL, ROWNUM in Oracle. Your entity manager is aware of the dialect that you are using so this is simple.
Lastly if you really want to check what SQL Hibernate is producing for this query, just set the "show_sql" property to true when you create your entity manager / factory and it spits out all the SQL it is running to the console.
HQL does not support a limit clause inside the query as SQL does, only the setMaxResults() method that you also found.
To find out whether it transforms setMaxResults() into a LIMIT query, you can turn on SQL logging.
I know the question is a bit old, but yes, setMaxResults() truly limits the number of rows on the database end.
If you really look into your Hibernate SQL output, you can find the following SQL statement has been appended to your query.
limit ?
