How to get the first result from a group (Jooq) - java

My requirement is to take a list of identifiers, each of which could refer to multiple records, and return the newest record per identifier.
This would seem to be doable with a combination of orderBy(date, desc) and fetchGroups() on the identifier column. I then use values() to get the Result objects.
At this point, I want the first value in each result object. I can do get(0) to get the first value in the list, but that seems like cheating. Is there a better way to get that first result from a Result object?

What you want is a top-1-per-category query, which is a special case of a top-n-per-category query. Most SQL syntaxes that produce this behaviour are supported by jOOQ as well. You shouldn't group in the client, because that transfers all the excess rows (everything except the first row of each group) from the server to the client.
Some examples:
Standard SQL (when window functions are supported)
// T.ID stands for the identifier column; its name is assumed here.
Field<Integer> rn = rowNumber().over(partitionBy(T.ID).orderBy(T.DATE.desc())).as("rn");
var subquery = table(
    select(T.fields())
    .select(rn)
    .from(T)
).as("subquery");
var results =
ctx.select(subquery.fields(T.fields()))
   .from(subquery)
   .where(subquery.field(rn).eq(1))
   .fetch();
Teradata and H2 (we might emulate this soon)
// T.ID stands for the identifier column; its name is assumed here.
var results =
ctx.select(T.fields())
   .from(T)
   .qualify(rowNumber().over(partitionBy(T.ID).orderBy(T.DATE.desc())).eq(1))
   .fetch();
PostgreSQL
// T.ID stands for the identifier column; its name is assumed here.
var results =
ctx.select(T.fields())
   .distinctOn(T.ID)
   .from(T)
   .orderBy(T.ID, T.DATE.desc())
   .fetch();
Oracle
// T.ID stands for the identifier column; its name is assumed here.
var results =
ctx.select(
       T.ID,
       max(T.DATE).as(T.DATE),
       max(T.COL1).keepDenseRankFirstOrderBy(T.DATE.desc()).as(T.COL1),
       max(T.COL2).keepDenseRankFirstOrderBy(T.DATE.desc()).as(T.COL2),
       ...
       max(T.COLN).keepDenseRankFirstOrderBy(T.DATE.desc()).as(T.COLN))
   .from(T)
   .groupBy(T.ID)
   .fetch();
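To make the semantics of a top-1-per-category query concrete, here is a minimal sketch in plain Java over in-memory data. It only illustrates what "newest row per identifier" means and is not a recommendation to group in the client. The Row class, its fields, and the newestPerId method are hypothetical names invented for this illustration.

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

final class NewestPerId {

    // Hypothetical stand-in for a row of table T: identifier, date, one payload column.
    static final class Row {
        final String id;
        final LocalDate date;
        final String col1;

        Row(String id, LocalDate date, String col1) {
            this.id = id;
            this.date = date;
            this.col1 = col1;
        }
    }

    // Keeps, for every id, only the row with the greatest date.
    static Map<String, Row> newestPerId(List<Row> rows) {
        return rows.stream().collect(Collectors.toMap(
            r -> r.id,
            r -> r,
            BinaryOperator.maxBy(Comparator.comparing((Row r) -> r.date))));
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
            new Row("a", LocalDate.of(2020, 1, 1), "old a"),
            new Row("a", LocalDate.of(2020, 6, 1), "new a"),
            new Row("b", LocalDate.of(2020, 3, 1), "only b"));

        // Prints one line per id, each showing the payload of the newest row.
        newestPerId(rows).forEach((id, row) -> System.out.println(id + " -> " + row.col1));
    }
}
```

Every row still has to reach the client before this code can run, which is exactly the overhead the SQL variants avoid.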

Related

Introducing a named parameter breaks jOOQ query

To query a PostgreSQL 10.11 database, I am using jOOQ 3.12.4, which comes bundled with Spring Boot 2.2.
Let's assume I have built a query using jOOQ like this:
final String[] ids = ...;
final var query = dslContext.selectFrom(MY_TABLE).where(MY_TABLE.ID.in(ids));
final Map<String, List<MyTable>> changeDomains = query.fetch().intoGroups(MY_TABLE.ID, MyTable.class);
This code runs fine and produces the expected results. But when I refactor my query and introduce a named parameter (to reuse the query in multiple parts of my code), like this:
final String[] ids = ...;
final var query = dslContext.selectFrom(MY_TABLE).where(MY_TABLE.ID.in(param("ids")));
final Map<String, List<MyTable>> changeDomains = query.bind("ids", ids).fetch().intoGroups(MY_TABLE.ID, MyTable.class);
I suddenly start to get the following error:
org.springframework.jdbc.BadSqlGrammarException: jOOQ; bad SQL grammar ...; nested exception is org.postgresql.util.PSQLException: ERROR: operator does not exist: text = character varying[]
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
Edit: I get the same error when I use
MY_TABLE.ID.in(param("ids", String[].class))
instead.
How can I solve or work around this problem?
A better solution to your code reuse approach
You wrote: "But when I refactor my query and introduce a named parameter (to reuse the query in multiple parts of my code)"
While you could use jOOQ this way (be careful when mutating and reusing jOOQ queries in a non-thread-safe way!), it is generally recommended to use jOOQ in a more functional way; see e.g.:
https://blog.jooq.org/2017/01/16/a-functional-programming-approach-to-dynamic-sql-with-jooq/
https://www.jooq.org/doc/latest/manual/sql-building/dynamic-sql/
You don't gain much by reusing a jOOQ query. In particular, there's hardly any performance gain.
So, instead of this:
final var query = dslContext.selectFrom(MY_TABLE)
.where(MY_TABLE.ID.in(param("ids")));
final Map<String, List<MyTable>> changeDomains = query
.bind("ids", ids).fetch().intoGroups(MY_TABLE.ID, MyTable.class);
Write this:
public ResultQuery<MyTableRecord> query(String[] ids) {
    return dslContext.selectFrom(MY_TABLE).where(MY_TABLE.ID.in(ids));
}
// And then:
final Map<String, List<MyTable>> changeDomains = query(ids)
.fetch().intoGroups(MY_TABLE.ID, MyTable.class);
The actual problem you ran into:
jOOQ, JDBC, and SQL don't support single bind value IN lists. While it seems useful to write this:
SELECT * FROM t WHERE c IN (:bind_value)
and to pass an array or list as a single bind value, this is not supported in SQL. Some APIs might pretend that it is supported (but behind the scenes they replace the single bind value with multiple placeholders: ?, ?, ..., ?).
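The placeholder expansion that such APIs perform behind the scenes can be sketched in plain Java. This is only an illustration of the idea, not the implementation of any particular library. The class and method names are hypothetical, and rendering an empty list as an always-false predicate is a choice made for this sketch.

```java
import java.util.Collections;
import java.util.List;

final class InListExpansion {

    // Renders "column IN (?, ?, ..., ?)" with one placeholder per value.
    // "IN ()" is not valid SQL, so an empty list becomes an always-false predicate.
    static String inPredicate(String column, List<?> values) {
        if (values.isEmpty()) {
            return "1 = 0";
        }
        String placeholders = String.join(", ", Collections.nCopies(values.size(), "?"));
        return column + " IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        // prints: SELECT * FROM t WHERE c IN (?, ?, ?)
        System.out.println("SELECT * FROM t WHERE " + inPredicate("c", List.of("a", "b", "c")));
    }
}
```

Each value is then bound to its own placeholder, so the SQL string changes with the number of values.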
PostgreSQL supports the = ANY (:bind_value) operator with arrays:
SELECT * FROM t WHERE c = ANY (:bind_value)
You could use it in jOOQ using
dslContext.selectFrom(MY_TABLE).where(MY_TABLE.ID.eq(any(ids)));
That way, you could call the bind() method to replace the array prior to execution. However, I still recommend you write functions returning queries dynamically.

jOOQ Postgres PERCENTILE_CONT & MEDIAN Issue with Type Casting

Coercion of data types does not seem to work within median() or percentileCont(). Data type coercion works just fine with other aggregate functions like max() and min(). The Postgres queries that are produced as a result show that type casting is not applied in the final result. Below are the snippets from jOOQ and Postgres for reference. As of now, I have no work-around or knowledge of an open ticket for this issue.
Any direction would be much appreciated!
MEDIAN
jOOQ Snippet
selectFields.add(
median(
field(String.format("%s.%s", a.getDataSourceName(), a.getField()))
.coerce(Double.class)) // Seems to not successfully coerce data types
.as(
String.format(
"%s.%s.%s", a.getDataSourceName(), a.getField(), "median")));
SQL Output
select
tableA.columnA,
percentile_cont(0.5) within group (order by tableA.columnA) as "tableA.columnA.median"
from tableA
group by tableA.columnA
limit 100;
ERROR: function percentile_cont(numeric, text) does not exist
PERCENTILE_CONT
jOOQ Snippet
selectFields.add(
percentileCont(a.getPercentileValue())
.withinGroupOrderBy(
field(String.format("%s.%s", a.getDataSourceName(), a.getField()))
.coerce(Double.class)) // Seems to not successfully coerce data types
.as(
String.format(
"%s.%s.%s", a.getDataSourceName(), a.getField(), "percentile_" + Math.round(a.getPercentileValue() * 100))));
SQL Output
select
tableA.columnA,
percentile_cont(0.0) within group (order by tableA.columnA) as "tableA.columnA.percentile_0"
from tableA.columnA
group by tableA.columnA
limit 100;
ERROR: function percentile_cont(numeric, text) does not exist
POSTGRES -- This works due to type casting
select
percentile_cont(0.5)
within group (
order by tableA.columnA::INTEGER
)
as "tableA.columnA.median"
from tableA.columnA
group by (select 1)
https://www.jooq.org/javadoc/latest/org.jooq/module-summary.html
You're not looking for coercion, which in jOOQ-speak means changing a data type only in the client, without letting the server know. This is mostly useful for fetching data as some type (e.g. Integer) when jOOQ would otherwise produce some other data type (e.g. BigInteger). See the Javadoc on Field.coerce():
Unlike with casting, coercing doesn't affect the way the database sees a Field's type.
// This binds an int value to a JDBC PreparedStatement
DSL.val(1).coerce(String.class);
// This binds an int value to a JDBC PreparedStatement
// and casts it to VARCHAR in SQL
DSL.val(1).cast(String.class);
Clearly, you want Field.cast() instead, i.e. .cast(Double.class) in place of .coerce(Double.class), just like in your example where you actually used a cast: tableA.columnA::INTEGER.
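The difference can be illustrated with a toy model in plain Java. This is not jOOQ's implementation. The Expr class and its methods are hypothetical and only mimic the idea that coercing changes the client-side type while casting changes the generated SQL.

```java
final class CoerceVsCast {

    // Toy expression: a SQL fragment plus the Java type the client expects to read.
    static final class Expr<T> {
        final String sql;
        final Class<T> type;

        Expr(String sql, Class<T> type) {
            this.sql = sql;
            this.type = type;
        }

        // Client side only: the expected Java type changes, the SQL stays the same.
        <U> Expr<U> coerce(Class<U> newType) {
            return new Expr<>(sql, newType);
        }

        // Server side: the SQL fragment is wrapped in an explicit CAST.
        <U> Expr<U> cast(Class<U> newType, String sqlTypeName) {
            return new Expr<>("cast(" + sql + " as " + sqlTypeName + ")", newType);
        }
    }

    public static void main(String[] args) {
        Expr<String> column = new Expr<>("tableA.columnA", String.class);
        // prints: tableA.columnA
        System.out.println(column.coerce(Double.class).sql);
        // prints: cast(tableA.columnA as double precision)
        System.out.println(column.cast(Double.class, "double precision").sql);
    }
}
```

In this model, as in the SQL output shown in the question, the coerced column reaches the database unchanged, so percentile_cont still sees a text argument.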

How to use JOOQ's parser to extract table names from SQL statements [duplicate]

Using the jOOQ parser API, I'm able to parse the following query and get the parameters map from the resulting Query object. From this, I can tell that there is one parameter, and its name is "something".
However, I haven't been able to figure out how to determine that the parameter "something" is assigned to a column named "BAZ" and that column is part of the table "BAR".
Does the parser API have a way to get the table/column metadata associated to each parameter?
String sql = "SELECT A.FOO FROM BAR A WHERE A.BAZ = :something";
DSLContext context = DSL.using...
Parser parser = context.parser();
Query query = parser.parseQuery(sql);
Map<String, Param<?>> params = query.getParams();
Starting from jOOQ 3.16
jOOQ 3.16 introduced a new, experimental (as of 3.16) query object model API, which can be traversed; see the manual and a blog post about traversing jOOQ expression trees.
Specifically, you can write:
List<QueryPart> parts = query.$traverse(
Traversers.findingAll(q -> q instanceof Param)
);
Or, to conveniently produce exactly the type you wanted:
Map<String, Param<?>> params = query.$traverse(Traversers.collecting(
Collectors.filtering(q -> q instanceof Param,
Collectors.toMap(
q -> ((Param<?>) q).getParamName(),
q -> (Param<?>) q
)
)
));
The Collectors.toMap() call could include a mergeFunction, in case you have the same param name twice.
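The effect of such a mergeFunction can be shown with the JDK alone. In this minimal sketch, NamedParam is a hypothetical stand-in for a named bind parameter, and keeping the first occurrence is just one possible merge policy.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class MergeDuplicateNames {

    // Hypothetical stand-in for a named bind parameter.
    static final class NamedParam {
        final String name;
        final Object value;

        NamedParam(String name, Object value) {
            this.name = name;
            this.value = value;
        }
    }

    // Collects parameters by name; when a name occurs more than once, the first one wins.
    // Without the third argument, toMap() throws IllegalStateException on a duplicate key.
    static Map<String, NamedParam> byName(List<NamedParam> params) {
        return params.stream().collect(Collectors.toMap(
            p -> p.name,
            p -> p,
            (first, second) -> first));
    }

    public static void main(String[] args) {
        List<NamedParam> params = List.of(
            new NamedParam("something", 1),
            new NamedParam("other", 2),
            new NamedParam("something", 3));

        // The map has two entries; "something" maps to the parameter with value 1.
        byName(params).forEach((name, p) -> System.out.println(name + " = " + p.value));
    }
}
```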
Pre jOOQ 3.16
As of jOOQ 3.11, the SPI that can be used to access the internal expression tree is the VisitListener SPI, which you have to attach to your context.configuration() prior to parsing. It will then be invoked whenever you traverse that expression tree, e.g. on your query.getParams() call.
However, there's quite a bit of manual plumbing that needs to be done. For example, the VisitListener will only see A.BAZ as a column reference without knowing directly that A is the renamed table BAR. You will have to keep track of such renaming yourself when you visit the BAR A expression.

Jpa returns more than one result although only one row is in database

I am running different JPA queries in the form of
Float getDepositVolumeByDepositIdDepositAndSpeciesIdSpeciesAndRangeIdRangeAndSubRangeIdSubrange(Long idDeposit, Long idSpecies, Long idRange, Long idSubRange); // this is one of the methods that fail
@Query("select s.depositVolume from Stock s where s.deposit.idDeposit = ?1 and s.species.idSpecies = ?2 and s.range.idRange = ?3 and s.subrange.idSubrange = ?4")
Float getVolumeByDepositIdDepositAndSpeciesIdSpeciesAndRangeIdRangeAndSubRangeIdSubrange(Long idDeposit, Long idSpecies, Long idRange, Long idSubRange); // this one is just for illustrative purposes and throws the exact same error
These two queries are just some of those that should return one row. Although the database has only one row matching the data passed to the query, Hibernate throws the following error message:
Result returns more than one elements
I have turned on hibernate query log and the query generated is the following for the first method:
select stock0_volume_stock as col_0_0_ from public.stock stock0_ where stock0_.id_deposit = ? and stock0_.id_species = ? and stock0_.id_range = ? and stock0_.id_sub_range = ?
with the correctly bound parameters. I ran the query on PostgreSQL and it returns only one row with a float.
It is worth mentioning that my class declaration is:
public interface StockRepository extends QueryDslPredicateExecutor<Stock>, JpaRepository<Stock, Long>
What I ended up doing is changing those methods into
List<Stock> findFirstByDepositIdDepositAndSpeciesIdSpeciesAndRangeIdRangeAndSubRangeIdSubrange(Long idDeposit, Long idSpecies, Long idRange, Long idSubRange); // now it justly returns only one row, the first one
I assumed, and this was certainly the behavior I observed on previous projects, that the first method would map the single result fetched from the database to the expected Float.
I would very much like to know the explanation for this behaviour.
If you do SELECT some-column FROM some-table WHERE ..., there is no way for JPA/Hibernate to know that only one row matches the condition. On the contrary, it must assume that many rows are returned. My guess is that it always uses the same logic for a query like this (assuming multiple rows), and that the error message is misleading here.
To get one row you would need a query with an aggregate function like SUM or COUNT as the only element in the SELECT clause. Maybe for fun you could try to use SUM in your original query and see if it returns the expected result.
Maybe this is more suitable as a comment than an answer, but it felt too long for a comment.

How to stream Select results (not String query results) from CassandraOperations?

Spring Data Cassandra 1.5.0 comes with a streaming API in CassandraTemplate. I'm using spring-data-cassandra 1.5.1. I have code like:
String tableName = cassandraTemplate.getTableName(MyEntity.class).toCql();
Select select = QueryBuilder.select()
.all()
.from(tableName);
// In real world, WHERE statement is much more complex
select.where(eq(ENTITY_FIELD_NAME, expectedField));
List<MyEntity> result = cassandraTemplate.select(select, MyEntity.class);
and I want to replace this code with an Iterable or a Java 8 Stream, to avoid loading a big list of results into memory at once.
What I'm looking for is a method signature like CassandraOperations.stream(Select query, Class<T> entityClass), but it is not available.
The only available method in CassandraOperations accepts a query string: stream(String query, Class<T> entityClass). I tried passing it the string generated by Select, like this:
cassandraTemplate.stream(select.getQueryString(), MyEntity.class)
But that fails with InvalidQueryException: Invalid amount of bind variables, because getQueryString() returns the query with question-mark placeholders instead of the values.
I see 3 options to get what I want, but every option looks bad:
1. Use the Spring query creation mechanism with a Stream/Iterator return type (good only for simple queries): http://docs.spring.io/spring-data/cassandra/docs/current/reference/html/#repositories.query-methods.query-creation
2. Use a raw CQL query instead of QueryBuilder.
3. Call select.getQueryString() and then substitute the parameters again via a BoundStatement.
Is there any better way to stream selection results?
Thanks.
So, as of now, the answer to my question is to wait until a stable version of spring-data-cassandra 2.0.0 comes out:
https://github.com/spring-projects/spring-data-cassandra/blob/2.0.x/spring-data-cassandra/src/main/java/org/springframework/data/cassandra/core/CassandraTemplate.java#L208
