I have an issue.
I have an SQL statement to which I need to append different types of "restrictions", or even add a join, depending on the user's search criteria.
This SQL involves different tables, as it can search across one-to-many relationships, so Hibernate ORM can't support my requirement.
May I know if there is a design pattern to help construct such SQL statements?
The design pattern that fits the problem of representing a language statement is the Interpreter pattern. But before you start coding your SQL parser, take a look at ANTLR.
More importantly, ask yourself two questions:
Does the number of different SQL statements justify the effort of developing a general SQL interpreter solution, instead of just programming my 5-10 different queries with if-else statements?
Have I reviewed in detail the Hibernate reference manual?
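For a handful of queries, the plain if-else approach from the first question could look roughly like the sketch below. The `SearchCriteria` fields and the `orders` / `customers` tables and columns are hypothetical, made up only for illustration; values are collected as bind parameters rather than concatenated into the SQL.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical search form: a null field means "the user did not fill this in".
class SearchCriteria {
    String customerName;
    Integer minTotal;
}

// Holds the generated SQL text plus the values to bind to its "?" placeholders.
class OrderSearchSql {
    final StringBuilder sql = new StringBuilder("select o.id from orders o");
    final List<Object> params = new ArrayList<>();

    static OrderSearchSql build(SearchCriteria c) {
        OrderSearchSql q = new OrderSearchSql();
        List<String> where = new ArrayList<>();
        if (c.customerName != null) {
            // Join only when a criterion actually needs the other table.
            q.sql.append(" join customers c on c.id = o.customer_id");
            where.add("c.name = ?");
            q.params.add(c.customerName);
        }
        if (c.minTotal != null) {
            where.add("o.total >= ?");
            q.params.add(c.minTotal);
        }
        if (!where.isEmpty()) {
            q.sql.append(" where ").append(String.join(" and ", where));
        }
        return q;
    }
}
```

The resulting `sql` and `params` can be handed to a JDBC `PreparedStatement` or to a Hibernate native query. Once the number of branches grows well beyond this, a dedicated builder starts to pay off.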
I have a very similar requirement: I have a context-free language to define the search criteria, which is parsed into ParseEntry objects in a ParseTree; these are analogous to the Restrictions. I use a SQLQueryGeneratorVisitor to visit the parse tree and generate the SQL query, and similarly a HibernateCriteriaGeneratorVisitor if the criteria need to be generated for a single entity. So I essentially used the Visitor pattern, making the parse tree and its entries visitable so that different types of criteria can be generated (SQL, Hibernate, or something else in the future).
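A minimal sketch of that Visitor idea, assuming a tiny criteria tree with only "equals" and "and" nodes. The class names below are invented for illustration and are not the ParseEntry / SQLQueryGeneratorVisitor classes mentioned above; column names are assumed to come from trusted code, not from user input.

```java
import java.util.ArrayList;
import java.util.List;

// Visitor over a tiny criteria tree; R is whatever the visitor produces.
interface CriteriaVisitor<R> {
    R visitEquals(EqualsNode node);
    R visitAnd(AndNode node);
}

interface CriteriaNode {
    <R> R accept(CriteriaVisitor<R> visitor);
}

class EqualsNode implements CriteriaNode {
    final String column;
    final Object value;
    EqualsNode(String column, Object value) { this.column = column; this.value = value; }
    public <R> R accept(CriteriaVisitor<R> visitor) { return visitor.visitEquals(this); }
}

class AndNode implements CriteriaNode {
    final CriteriaNode left;
    final CriteriaNode right;
    AndNode(CriteriaNode left, CriteriaNode right) { this.left = left; this.right = right; }
    public <R> R accept(CriteriaVisitor<R> visitor) { return visitor.visitAnd(this); }
}

// Renders a WHERE fragment and collects bind values in placeholder order.
// A second visitor could build Hibernate Criteria from the same tree.
class SqlWhereVisitor implements CriteriaVisitor<String> {
    final List<Object> params = new ArrayList<>();

    public String visitEquals(EqualsNode node) {
        params.add(node.value);
        return node.column + " = ?";
    }

    public String visitAnd(AndNode node) {
        return "(" + node.left.accept(this) + " and " + node.right.accept(this) + ")";
    }
}
```

Adding a new output format then means adding a new visitor, while the tree classes stay untouched.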
I'm trying to parse a SELECT statement in Java. I'm familiar with JOOQ, and was hoping to use that. I know it's not explicitly designed as an SQL parser—it's actually a lot more than that, so I was thinking there might be a way to use its internal parsers to parse SELECT queries.
I saw some information on how to access some of JOOQ's internals using the Visitor pattern, but I need to navigate inside the query using a tree-like structure that will allow access to each part of the query individually. I don't want to use the Visitor pattern for all use cases.
Is this possible? How would I go about doing it?
Yes, you can. jOOQ has a parser that can be used:
Programmatically
As a CLI
Online, as a SQL dialect translator
As of jOOQ 3.17, there's an experimental model API which can be used to traverse your expression tree externally, e.g. using pattern matching, or internally using the new Traverser API. It is also still possible to traverse the expression tree using a VisitListener when rendering the expression tree back to SQL.
A full-fledged SQL parser is available from DSLContext.parser() and from DSLContext.parsingConnection() (see the manual's section about parsing connections for the latter).
The SQL Parsing API page gives this trivial example:
ResultQuery<?> query =
DSL.using(configuration)
.parser()
.parseResultQuery("SELECT * FROM (VALUES (1, 'a'), (2, 'b')) t(a, b)");
parseResultQuery is the method you need for a single SELECT query; use parse(String) if you may have multiple queries.
In the Hibernate 6.0 Roadmap (https://github.com/hibernate/hibernate-orm/wiki/Roadmap6.0) SQM is mentioned as upcoming.
What is SQM?
In this roadmap the following short words describe it:
SQM integration: Improved performance for SQL generation and execution (smaller SQL, position-based extraction of results rather than name(alias)-based); Unified approach for HQL, JPQL and Criteria queries.
This is all I've found about SQM. Could someone please explain it in a little more detail? What exactly is it, what will it look like when it comes to coding, and what benefits will it have?
SQM stands for Semantic Query Model, and it is the new entity query parser that addresses both JPQL and Criteria API.
The new parser is much more flexible, and it provides better SQL translation of entity queries.
From a user perspective, SQM provides more features like Window Functions, CTE (Common Table Expressions), Lateral Joins, etc.
SQM provides better performance as well since Criteria API is parsed directly to SQL.
Good Afternoon,
We are building a web application and, as part of it, a search functionality. We have a design question about the "Search Functionality".
The field names on the UI and in the DB are different, i.e. a field called "Number" on the UI is called Text10 in the DB. These are the two issues:
How do we generate the SQL when the user gives the UI field names? We have a table in the DB where we maintain the configuration (UI name to DB name).
The user selects the columns he wants to search, say for example the fields "Number, Description, Price" are selected. Once the SQL is generated, how do we know what data corresponds to what column? Do we have to maintain an index capturing the position, or a bean?
What is the better way to gather the data from the result set?
Thanks
A solution that promotes commonality between UI and database column names would be nice but probably not feasible.
Some sort of mapping table that captures the following will work:
META-DB-TABLE-NAME
META-DB-COLUMN-NAME
META-UI-COLUMN-NAME
Personally I would prefer to keep this mapping meta-data as close to the database as possible.
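As a rough sketch of how such a mapping could drive SQL generation: the class below keeps the UI-to-DB names in memory (in practice they would be loaded from the mapping table) and builds the SELECT list in the order the user requested. Only the "Number" to Text10 pair comes from the question; the class name, the `ITEMS` table and the Text11 column used in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class UiColumnMapping {
    // UI field name -> DB column name.
    private final Map<String, String> uiToDb = new LinkedHashMap<>();

    void register(String uiName, String dbColumn) {
        uiToDb.put(uiName, dbColumn);
    }

    // Builds the SELECT in the order of requestedUiFields, so the i-th requested
    // UI field (0-based) is result column i + 1 in JDBC.
    String buildSelect(String table, List<String> requestedUiFields) {
        if (requestedUiFields.isEmpty()) {
            throw new IllegalArgumentException("No fields requested");
        }
        List<String> columns = new ArrayList<>();
        for (String ui : requestedUiFields) {
            String db = uiToDb.get(ui);
            if (db == null) {
                // Unknown names are rejected, so user input never reaches the SQL text.
                throw new IllegalArgumentException("Unknown UI field: " + ui);
            }
            columns.add(db);
        }
        return "select " + String.join(", ", columns) + " from " + table;
    }
}
```

With "Number" registered as Text10 and "Description" as Text11, requesting "Description, Number" from `ITEMS` yields `select Text11, Text10 from ITEMS`. Because the column order mirrors the request order, the requested list itself serves as the position index when reading the result set.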
User-defined meta data is nicely described here from an Oracle perspective:
http://docs.oracle.com/cd/B28359_01/appdev.111/b28369/xdb_repos_meta.htm
Do some research on this and keep us informed with what you find. Very interesting question!
In such a dynamic SQL scenario, query builders like jOOQ really shine. See for example the jOOQ manual section about dynamic SQL.
In your specific case, assuming you're using generated code in jOOQ (which isn't a must, but certainly recommended), you'll be maintaining some sort of lookup between UI fields and SQL fields, such as:
Map<UIField, Field<?>> lookup = ...
lookup.put(UI.NUMBER, TABLE.NUMBER);
lookup.put(UI.DESCRIPTION, TABLE.DESCRIPTION);
lookup.put(UI.PRICE, TABLE.PRICE);
You can then construct your query dynamically according to user needs:
List<UIField> userRequestedFields = ...
List<Field<?>> queryFields = userRequestedFields
.stream()
.map(lookup::get)
.toList();
And then:
ctx.select(queryFields)
.from(TABLE)
.where(...)
.fetch();
There are other query builders, even JPA has the criteria API for these purposes. You could also roll your own, though you'll be re-inventing a lot of wheels.
Disclaimer: I work for the company behind jOOQ.
I want to get the index of the main where clause of a database query in Java.
How can I handle this with regex?
For example, in this query I want to get the second where clause:
select u.id, (select x.id from XEntity x where x.id = 200)
from UserEntity u
**where** u.id in (select g.id from AnotherEntity g where g.id = 100)
I think the main where is the one after which the number of "(" characters equals the number of ")" characters,
but I don't know how to get this with regex.
With Best Regards
What Toote and David Brabant said is absolutely correct. Parsing SQL, especially complex SQL, using only regex is a very hard problem.
In terms of parsing SQL in Java, which seems to be the thrust of your question, there's a very good (if apparently un-maintained) library called JSQLParser. A more up-to-date version of this library can be found on Github here (disclaimer: I made a very small contribution to this). The main page shows an example of visitor designed to consume the output of the AST here.
There is also a grammar for ANTLR available in its grammar list. Or, if you're feeling adventurous, the H2 database supports a rather extensive range of SQL, including some proprietary features of, e.g., MySQL. You could modify its Parser to generate an appropriate structure for extracting the information you need.
Regular expressions are not very good at recognizing structures as complex as SQL queries, mainly because SQL is not a regular language: it allows arbitrarily nested subqueries. That is exactly the issue you are running into: the WHERE can appear in a lot of places, and you want one in particular that depends on the overall structure of the query.
What you will need is an appropriate parser. The only JavaScript SQL parser I could find is not too complete, but you can always help develop it by making sure it fits your needs.
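For the narrow task in the question, the parenthesis-counting idea can be sketched as a small hand-written scanner instead of a regex. This is not a parser: it assumes balanced parentheses, handles only single-quoted string literals, and ignores comments and quoted identifiers. The class and method names are made up for illustration.

```java
class TopLevelWhere {
    // Returns the index of the first WHERE keyword outside all parentheses
    // and outside single-quoted literals, or -1 if there is none.
    static int indexOf(String sql) {
        int depth = 0;
        boolean inString = false;
        for (int i = 0; i < sql.length(); i++) {
            char ch = sql.charAt(i);
            if (ch == '\'') {
                // A doubled quote ('') toggles twice, so the state stays correct.
                inString = !inString;
                continue;
            }
            if (inString) {
                continue;
            }
            if (ch == '(') {
                depth++;
            } else if (ch == ')') {
                depth--;
            } else if (depth == 0
                    && sql.regionMatches(true, i, "where", 0, 5)
                    && isBoundary(sql, i - 1)
                    && isBoundary(sql, i + 5)) {
                return i;
            }
        }
        return -1;
    }

    // True when position i is outside the string or is not part of an identifier.
    private static boolean isBoundary(String s, int i) {
        if (i < 0 || i >= s.length()) {
            return true;
        }
        char ch = s.charAt(i);
        return !(Character.isLetterOrDigit(ch) || ch == '_');
    }
}
```

Applied to the query from the question, the scanner skips the `where` inside the first subquery (depth 1) and stops at the `where` that follows `from UserEntity u`. Anything beyond this simple case is better served by one of the real parsers mentioned above.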
The implementing-result-paging-in-hibernate-getting-total-number-of-rows question triggers another question for me, about an implementation concern:
Now that you know you have to reuse part of the HQL query to do the count, how do you reuse it efficiently?
The differences between the two HQL queries are:
the selection is count(?), instead of the pojo or property (or list of)
the fetches should not happen, so some tables should not be joined
the order by should disappear
Are there other differences?
Do you have coding best-practices to achieve this reuse efficiently (concerns: effort, clarity, performance)?
Example for a simple HQL query:
select a from A a join fetch a.b b where a.id=66 order by a.name
select count(a.id) from A a where a.id=66
UPDATED
I received answers on:
using Criteria (but we use HQL mostly)
manipulating the String query (but everybody agrees it seems complicated and not very safe)
wrapping the query, relying on database optimization (but there is a feeling that this is not safe)
I was hoping someone would give options along another path, more related to String concatenation.
Could we build both HQL queries using common parts?
Have you tried making your intentions clear to Hibernate by setting a projection on your (SQL?)Criteria?
I've mostly been using Criteria, so I'm not sure how applicable this is to your case, but I've been using
getSession().createCriteria(persistentClass).
setProjection(Projections.rowCount()).uniqueResult()
and letting Hibernate figure out the caching / reusing / smart stuff by itself. I'm not really sure how much smart stuff it actually does, though. Anyone care to comment on this?
Well, I'm not sure this is a best practice, but it is my practice :)
If I have a query like:
select A.f1,A.f2,A.f3 from A, B where A.f2=B.f2 order by A.f1, B.f3
and I just want to know how many results I will get, I execute:
select count(*) from ( select A.f1, ... order by A.f1, B.f3 )
And then I get the result as an Integer, without mapping the results to a POJO.
Parsing your query to remove some parts, like 'order by', is very complicated. A good RDBMS will optimize your query for you.
Good question.
Nice question. Here's what I've done in the past (many things you've mentioned already):
1. Check whether SELECT clause is present.
1.1. If it's not, add select count(*)
1.2. Otherwise check whether it has DISTINCT or aggregate functions in it. If you're using ANTLR to parse your query, it's possible to work around those but it's quite involved. You're likely better off just wrapping the whole thing with select count(*) from ().
2. Remove fetch all properties
3. Remove fetch from joins if you're parsing HQL as string. If you're truly parsing the query with ANTLR you can remove left join entirely; it's rather messy to check all possible references.
4. Remove order by
5. Depending on what you've done in 1.2 you'll need to remove / adjust group by / having.
The above applies to HQL, naturally. For Criteria queries you're quite limited with what you can do because it doesn't lend itself to manipulation easily. If you're using some sort of a wrapper layer on top of Criteria, you will end up with equivalent of (limited) subset of ANTLR parsing results and could apply most of the above in that case.
Since you'd normally hold on to the offset of your current page and the total count, I usually run the actual query with the given limit / offset first, and only run the count(*) query if the number of results returned is greater than or equal to the limit AND the offset is zero (in all other cases I've either run the count(*) before or I've got all the results back anyway). This is an optimistic approach with regard to concurrent modifications, of course.
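The decision described above can be sketched as follows. The names are hypothetical: `knownTotal` stands for the count remembered from an earlier page (null if there is none), and `countQuery` stands for whatever runs the count(*) query.

```java
import java.util.function.LongSupplier;

class PagedTotal {
    // Returns the total row count, running countQuery only when it cannot be avoided.
    static long resolveTotal(int offset, int limit, int pageSize,
                             Long knownTotal, LongSupplier countQuery) {
        if (offset == 0 && pageSize < limit) {
            // The first page was not full, so it already holds every row.
            return pageSize;
        }
        if (offset > 0 && knownTotal != null) {
            // The count was already run when the first page was loaded.
            return knownTotal;
        }
        // First page is full, or nothing was remembered: the count query is needed.
        return countQuery.getAsLong();
    }
}
```

The fallback for a later page without a remembered total is an addition of this sketch, covering a caller that jumps straight to page N.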
Update (on hand-assembling HQL)
I don't particularly like that approach. When mapped as named query, HQL has the advantage of build-time error checking (well, run-time technically, because SessionFactory has to be built although that's usually done during integration testing anyway). When generated at runtime it fails at runtime :-) Doing performance optimizations isn't exactly easy either.
Same reasoning applies to Criteria, of course, but it's a bit harder to screw up due to well-defined API as opposed to string concatenation. Building two HQL queries in parallel (paged one and "global count" one) also leads to code duplication (and potentially more bugs) or forces you to write some kind of wrapper layer on top to do it for you. Both ways are far from ideal. And if you need to do this from client code (as in over API), the problem gets even worse.
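For reference, the kind of wrapper layer meant here could look roughly like the sketch below, which assembles the paged query and the count query from the same parts. The class is hypothetical and makes strong assumptions: the entity's identifier property is called `id`, and the where clause does not reference aliases introduced by the fetch joins.

```java
class PagedHql {
    private final String entity;
    private final String alias;
    private final String fetchJoins; // e.g. "join fetch a.b b", may be null
    private final String where;      // e.g. "where a.id=66", may be null
    private final String orderBy;    // e.g. "order by a.name", may be null

    PagedHql(String entity, String alias, String fetchJoins, String where, String orderBy) {
        this.entity = entity;
        this.alias = alias;
        this.fetchJoins = fetchJoins;
        this.where = where;
        this.orderBy = orderBy;
    }

    // Full query: selection, fetch joins, restriction and ordering.
    String pageQuery() {
        return "select " + alias + " from " + entity + " " + alias
                + part(fetchJoins) + part(where) + part(orderBy);
    }

    // Count query: same root and restriction, but no fetch joins and no ordering.
    String countQuery() {
        return "select count(" + alias + ".id) from " + entity + " " + alias + part(where);
    }

    private static String part(String s) {
        return (s == null || s.isEmpty()) ? "" : " " + s;
    }
}
```

Constructed as `new PagedHql("A", "a", "join fetch a.b b", "where a.id=66", "order by a.name")`, it reproduces the two example queries from the question. It also shows the limitation: as soon as the restriction depends on a joined alias, the wrapper has to know which joins to keep.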
I've actually pondered quite a bit on this issue. Search API from Hibernate-Generic-DAO seems like a reasonable compromise; there are more details in my answer to the above linked question.
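In a freehand HQL situation I would use something like this, but it is not reusable, as it is quite specific to the given entities:
// count(*) comes back as a Long in current Hibernate versions, so go through Number instead of casting to Integer
long count = ((Number) session.createQuery("select count(*) from ....").uniqueResult()).longValue();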
Do this once and adjust starting number accordingly till you page through.
For criteria though I use a sample like this
final Criteria criteria = session.createCriteria(clazz);
// Apply every filter restriction requested by the caller.
List<Criterion> restrictions = factory.assemble(command.getFilter());
for (Criterion restriction : restrictions)
    criteria.add(restriction);
criteria.add(Restrictions.conjunction());
if (this.projections != null)
    criteria.setProjection(factory.loadProjections(this.projections));
criteria.addOrder(command.getDir().equals("ASC") ? Order.asc(command.getSort()) : Order.desc(command.getSort()));
ScrollableResults scrollable = criteria.scroll(ScrollMode.SCROLL_INSENSITIVE);
if (scrollable.last()) { // returns true if there is a resultset
    // Row numbers are zero-based, so the last row's number + 1 is the total count.
    genericDTO.setTotalCount(scrollable.getRowNumber() + 1);
    criteria.setFirstResult(command.getStart())
            .setMaxResults(command.getLimit());
    genericDTO.setLineItems(Collections.unmodifiableList(criteria.list()));
}
scrollable.close();
return genericDTO;
But this does the count every time, by calling ScrollableResults.last().