Can I search by multiple fields using the Elasticsearch Java API?

Example:
| studentname | maths | computers |
+-------------+-------+-----------+
| s1          | 78    | 90        |
| s2          | 56    | 75        |
| s3          | 45    | 50        |
The table above represents the data stored in Elasticsearch.
Suppose the user enters "60 and above". Elasticsearch should then return only s1, because s1 is the only student who scored 60 or more in both subjects.
How do I do this with the Java API?
NOTE: I was able to query an individual subject with:
QueryBuilder query = QueryBuilders.boolQuery()
    .should(
        QueryBuilders.rangeQuery("maths") // the field name is passed as a string
            .from(50)
            .to(100)
    );

You can have multiple .should() clauses in the bool query. So in your case:
QueryBuilder query = QueryBuilders.boolQuery()
    .should(
        QueryBuilders.rangeQuery("maths")
            .from(61)
            .to(100)
    )
    .should(
        QueryBuilders.rangeQuery("computers")
            .from(61)
            .to(100)
    );
Note:
RangeQueryBuilder (returned from QueryBuilders.rangeQuery()) also has the method gte().
According to the docs: "In a boolean query with no must or filter clauses, one or more should clauses must match a document." So if you have no filter or must clause, a document that matches only one of the ranges is still returned, which is probably not the behavior you want. The answer above assumes you are using some other clauses. The point is that you can combine clauses in the bool query; with the Java API this just means calling the clause method repeatedly.
You might want to use filter instead of should: every filter clause has to match, and filters execute faster and can be cached (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html)
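To see why the choice of clause matters for the data in the question, here is a minimal sketch in plain Java (no Elasticsearch dependency) that imitates the two behaviours: every condition required, as with filter or must, versus any one condition being enough, as with should clauses on their own. The Student class and the method names are assumptions made for illustration; they are not part of the Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BoolQuerySemantics {

    // Hypothetical stand-in for an indexed document.
    static final class Student {
        final String name;
        final int maths;
        final int computers;

        Student(String name, int maths, int computers) {
            this.name = name;
            this.maths = maths;
            this.computers = computers;
        }
    }

    // The three rows from the table in the question.
    static final List<Student> STUDENTS = Arrays.asList(
            new Student("s1", 78, 90),
            new Student("s2", 56, 75),
            new Student("s3", 45, 50));

    // Imitates filter/must: every range condition has to match (AND).
    static List<String> allMatch(int min) {
        List<String> names = new ArrayList<>();
        for (Student s : STUDENTS) {
            if (s.maths >= min && s.computers >= min) {
                names.add(s.name);
            }
        }
        return names;
    }

    // Imitates should clauses on their own: one matching condition is enough (OR).
    static List<String> anyMatch(int min) {
        List<String> names = new ArrayList<>();
        for (Student s : STUDENTS) {
            if (s.maths >= min || s.computers >= min) {
                names.add(s.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("all conditions: " + allMatch(60)); // [s1]
        System.out.println("any condition:  " + anyMatch(60)); // [s1, s2]
    }
}
```

With the Java API, the first behaviour corresponds to adding each range query through filter(...) or must(...) on the bool query instead of should(...).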

Related

Hibernate Search Lucene query parser with Special Characters

FIRST QUESTION:
Can somebody explain to me how the Lucene query in Hibernate Search handles special characters? I read the Hibernate Search documentation and also the Lucene regexp syntax, but somehow they don't add up with the generated queries and results.
Let's assume I have the following database entries:
name
Will
Will Smith
Will - Smith
Will-Smith
and I am using the following query:
Query query = queryBuilder
.keyword()
.onField("firstName")
.matching(input)
.createQuery();
Now I search for the following inputs:
Will -> returns all 4 entries, with the following generated query: FullTextQueryImpl(firstName:will)
Will Smith -> also returns all 4 entries, with the following generated query: FullTextQueryImpl(firstName:will firstName:smith)
Will - Smith -> also returns all 4 entries, with the following generated query: FullTextQueryImpl(firstName:will firstName:smith). Where did the "-" go? Shouldn't it exclude everything after the "-" according to the Lucene query syntax?
Will-Smith -> same here
Will\-Smith -> here I tried to escape the "-" with a backslash, but same result
Will -Smith -> same here
SECOND QUESTION: Let's assume I have the following database entries, where the entry without a numerical suffix always exists and the ones with a numerical suffix may or may not be in the database.
What would a Lucene query for this look like?
name
Will
Will1
Will2
You can play around with Lucene analyzers and see what happens behind the scenes. Here is a tutorial: https://www.baeldung.com/lucene-analyzers
The tokenizer is pluggable, so you can change how special characters are treated.
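As a rough illustration of what analysis does to the inputs in the question, the sketch below imitates a typical default analyzer in plain Java: lowercase the text, then split on anything that is not a letter or a digit. This is a simplification and an assumption, not Lucene's actual analyzer, and the class and method names are made up. It does suggest why the "-" never shows up in the generated query: as far as I can tell, keyword() runs the input through the field's analyzer rather than parsing it as Lucene query syntax, so punctuation is dropped before any term is matched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SimpleAnalyzerSketch {

    // Lowercases the input and splits it on runs of non-letter, non-digit characters.
    static List<String> analyze(String input) {
        List<String> tokens = new ArrayList<>();
        for (String part : input.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) { // a leading separator produces an empty first element
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String[] inputs = {"Will", "Will Smith", "Will - Smith", "Will-Smith", "Will\\-Smith", "Will1"};
        for (String input : inputs) {
            System.out.println(input + " -> " + analyze(input));
        }
    }
}
```

Under this simplified model all the "Will ... Smith" variants reduce to the same two terms, while Will1 stays a single term that differs from will.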

jOOQ Dynamic Number of WITH Clauses

I am playing around with jOOQ and nesting queries. I have a JSON payload which may contain many subqueries. I want to treat these subqueries as a variable number of Common Table Expressions (i.e. CTEs in the WITH clause). Currently I have this working example, but it is static in terms of the number of CTEs it includes. How would I accomplish a variable number of CTEs within the WITH clause?
/*
+-----------------+
|sum_of_everything|
+-----------------+
| 100|
+-----------------+
*/
Supplier<Stream<Map<String, Object>>> resultsWith =
() ->
dslContext
.with("s1")
.as(query)
.select(sum(field("1", Integer.class)).as("sum_of_everything"))
.from(table(name("s1")))
.fetchStream()
.map(Record::intoMap);
Ultimately I will need to deserialize the JSON payload to ensure that the CTE reference hierarchy works properly, and I will need to see whether jOOQ supports referencing one CTE in another CTE before selecting the final result. I will need to accomplish something like this:
/*
+-----------------+
|sum_of_everything|
+-----------------+
| 100|
+-----------------+
*/
Supplier<Stream<Map<String, Object>>> resultsWith =
() ->
dslContext
.with("s1").as(query1)
.with("s2").as(query2) // should be able to reference "s1" i.e. query1
...
.with("sNMinus1").as(queryNMinus1)
.with("sN").as(queryN) // should be able to reference any upstream CTE
.select(sum(field("1", Integer.class)).as("sum_of_everything"))
.from(table(name("sN")))
.fetchStream()
.map(Record::intoMap);
You can create a CommonTableExpression instance starting from a Name, using Name.as(Select), e.g.
CommonTableExpression<?> s1 = name("s1").as(query1);
CommonTableExpression<?> s2 = name("s2").as(query2);
// And then (or, of course, use a Collection)
dslContext.with(s1, s2, ..., sN)

Postgresql Array Functions with QueryDSL

I use Vlad Mihalcea's library to map SQL arrays (PostgreSQL in my case) to JPA. Let's imagine I have an entity, e.g.
@TypeDefs({
    @TypeDef(name = "string-array", typeClass = StringArrayType.class)
})
@Entity
public class Entity {
    @Type(type = "string-array")
    @Column(columnDefinition = "text[]")
    private String[] tags;
}
The appropriate SQL is:
CREATE TABLE entity (
tags text[]
);
Using QueryDSL, I'd like to fetch the rows whose tags contain all of the given ones. The raw SQL could be:
SELECT * FROM entity WHERE tags @> '{"someTag","anotherTag"}'::text[];
(taken from: https://www.postgresql.org/docs/9.1/static/functions-array.html)
Is it possible to do this with QueryDSL? Something like the code below?
predicate.and(entity.tags.eqAll(<whatever>));
The 1st step is to write the proper SQL: WHERE tags @> '{"someTag","anotherTag"}'::text[];
The 2nd step is described by coladict (thanks a lot!): figure out which functions are called. @> is arraycontains, and string_to_array can build the text[] value from a delimited string.
The 3rd step is to call them properly. After hours of debugging I figured out that HQL doesn't treat functions as functions unless I add a comparison (in my case: ... = true), so the final solution looks like this:
predicate.and(
    Expressions.booleanTemplate("arraycontains({0}, string_to_array({1}, ',')) = true",
        entity.tags,
        tagsStr)
);
where tagsStr is a String with the values separated by commas.
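Building tagsStr from a collection of tags could look like the sketch below. The class and method names are hypothetical. Because string_to_array splits on the comma, a tag that itself contains a comma would end up as two array elements, so this helper rejects such tags instead of silently producing a different filter.

```java
import java.util.Arrays;
import java.util.List;

public class TagsParam {

    // Joins tags into the comma-separated form expected by string_to_array({1}, ',').
    static String toTagsStr(List<String> tags) {
        for (String tag : tags) {
            if (tag.contains(",")) {
                throw new IllegalArgumentException("tag must not contain a comma: " + tag);
            }
        }
        return String.join(",", tags);
    }

    public static void main(String[] args) {
        System.out.println(toTagsStr(Arrays.asList("someTag", "anotherTag"))); // someTag,anotherTag
    }
}
```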
Since you can't use custom operators, you will have to use their functional equivalents. You can look them up in the psql console with \doS+. For \doS+ @> we get several results, but this is the one you want:
                                        List of operators
   Schema   | Name | Left arg type | Right arg type | Result type |   Function    | Description
------------+------+---------------+----------------+-------------+---------------+-------------
 pg_catalog | @>   | anyarray      | anyarray       | boolean     | arraycontains | contains
It tells us the function used is called arraycontains, so now we look up that function to see its parameters using \df arraycontains
                               List of functions
   Schema   |     Name      | Result data type | Argument data types |  Type
------------+---------------+------------------+---------------------+--------
 pg_catalog | arraycontains | boolean          | anyarray, anyarray  | normal
From here, we transform the target query you're aiming for into:
SELECT * FROM entity WHERE arraycontains(tags, '{"someTag","anotherTag"}'::text[]);
You should then be able to use the criteria builder's function call to create this condition.
// root is assumed to be the Root<Entity> of your criteria query
ParameterExpression<String[]> tags = cb.parameter(String[].class);
Expression<Boolean> tagcheck = cb.function("arraycontains", Boolean.class, root.get(Entity_.tags), tags);
Though I use a different array solution (might publish soon), I believe it should work, unless there are bugs in the underlying implementation.
An alternative to this method would be to build the escaped string form of the array and pass it on as the second parameter. It's easier to print if you don't treat the double quotes as optional. In that case, you have to replace String[] with String in the ParameterExpression line above.
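One way to build that escaped string form is sketched below. The helper name is made up. It always double-quotes each element and backslash-escapes embedded backslashes and double quotes, which is how PostgreSQL array literals quote elements; it does not handle null elements.

```java
import java.util.Arrays;
import java.util.List;

public class PgArrayLiteral {

    // Builds a literal such as {"someTag","anotherTag"} from the given values.
    static String toArrayLiteral(List<String> values) {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            // Escape backslashes first, then double quotes.
            String escaped = values.get(i).replace("\\", "\\\\").replace("\"", "\\\"");
            sb.append('"').append(escaped).append('"');
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        System.out.println(toArrayLiteral(Arrays.asList("someTag", "anotherTag")));
    }
}
```

The resulting string would then be bound as a single String parameter and cast to text[] on the database side.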
For EclipseLink I created a function
CREATE OR REPLACE FUNCTION check_array(array_val text[], string_comma character varying ) RETURNS bool AS $$
BEGIN
RETURN arraycontains(array_val, string_to_array(string_comma, ','));
END;
$$ LANGUAGE plpgsql;
As pointed out by Serhii, you can then use Expressions.booleanTemplate("FUNCTION('check_array', {0}, {1}) = true", entity.tags, tagsStr)

Hibernate DetachedCriteria multiple results in java

This is the SQL statement that I have.
SELECT USER_PROFILE.FIRST_NAME, USER_PROFILE.LAST_NAME, USER_PROFILE.USER_TYPE
FROM USER_PROFILE
INNER JOIN USER_LOGIN_STATUS
ON USER_PROFILE.USER_ID=USER_LOGIN_STATUS.USER_ID
ORDER BY USER_PROFILE.FIRST_NAME
I'm trying to execute the code below, which I thought was the equivalent Hibernate DetachedCriteria, and I expected only two rows as a result.
DetachedCriteria dc = getDetachedCriteria();
DetachedCriteria userLoginCriteria = DetachedCriteria.forClass(UserLoginStatus.class);
userLoginCriteria.setProjection(Projections.distinct(Projections.property("userId")));
dc.add(Subqueries.propertyIn(UserField.id.name(), userLoginCriteria));
DetachedCriteria profileCriteria = dc.createCriteria("profile");
profileCriteria.addOrder(Order.asc("firstName"));
return getAll(dc, pageSetting);
Unfortunately the result is not what I expected: I get duplicate rows.
Name | Type |
Ben Jones | User |
Ben Jones | User |
Tom Homer | Guest |
Tom Homer | Guest |
Does anyone know the exact equivalent DetachedCriteria, or a solution for this?
First of all, your SQL looks incorrect. It returns multiple rows because you're joining against the USER_LOGIN_STATUS table, which may have multiple rows per USER_PROFILE. Because you are not selecting any fields from USER_LOGIN_STATUS, you cannot see why there are multiple rows. Why are you joining on this table in the first place?
Secondly, the detached criteria you are building is not equivalent to the SQL you provided, because it uses a sub-query that does not appear in the SQL.
You don't need this sub-select. Since I don't understand why you are doing the join, I will make some assumptions to give you the following example:
DetachedCriteria dc = getDetachedCriteria();
dc.createAlias("userLoginStatus", "uls");
// Projections are set with setProjection(); add() only accepts restrictions (Criterion)
dc.setProjection(Projections.projectionList()
    .add(Projections.property("firstName"))
    .add(Projections.property("lastName"))
    .add(Projections.property("userType")));
dc.addOrder(Order.asc("firstName"));
return getAll(dc, pageSetting);
This is now roughly equivalent, but I am assuming that:
You have the correct mappings for the relationship between UserField and UserLoginStatus.
getDetachedCriteria() is effectively returning DetachedCriteria.forClass(UserField.class).
You can also now refer to a field in UserLoginStatus through the alias, for example as a projection:
Projections.property("uls.my_user_login_field")
Also, if you get your query sorted out and it still returns duplicate entities, then dinukadev's answer comes into play:
dc.setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY);
I suspect the reason this isn't working for you is your sub-select.
Sorry I cannot help you more.
Please try to set the result transformer on your root detached criteria as follows. This will eliminate duplicates.
dc.setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY);
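To make the duplication concrete, here is a small plain-Java sketch of what an inner join does when one profile has several login status rows, and what a distinct step then removes. The data is hypothetical (two status rows per user, chosen to reproduce the output in the question) and the class and method names are made up; no Hibernate is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

public class JoinDuplicatesSketch {

    // Stand-in for USER_PROFILE: user id -> name (hypothetical data).
    static final Map<Integer, String> PROFILES = new LinkedHashMap<>();
    static {
        PROFILES.put(1, "Ben Jones");
        PROFILES.put(2, "Tom Homer");
    }

    // Stand-in for USER_LOGIN_STATUS: one USER_ID per row (assumed: two rows per user).
    static final List<Integer> LOGIN_STATUS_USER_IDS = Arrays.asList(1, 1, 2, 2);

    // Inner join: one output row per matching (profile, status) pair.
    static List<String> innerJoin() {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<Integer, String> profile : PROFILES.entrySet()) {
            for (Integer userId : LOGIN_STATUS_USER_IDS) {
                if (profile.getKey().equals(userId)) {
                    rows.add(profile.getValue());
                }
            }
        }
        return rows;
    }

    // Collapses repeated rows while keeping the original order.
    static List<String> distinct(List<String> rows) {
        return new ArrayList<>(new LinkedHashSet<>(rows));
    }

    public static void main(String[] args) {
        System.out.println(innerJoin());           // [Ben Jones, Ben Jones, Tom Homer, Tom Homer]
        System.out.println(distinct(innerJoin())); // [Ben Jones, Tom Homer]
    }
}
```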

Antlr AST Tree Approach To Complex Grammar

I have written a complex grammar. The grammar can be seen below:
grammar i;
options {
output=AST;
}
@header {
package com.data;
}
operatorLogic : 'AND' | 'OR';
value : STRING;
query : (select)*;
select : 'SELECT'^ functions 'FROM table' filters?';';
operator : '=' | '!=' | '<' | '>' | '<=' | '>=';
filters : 'WHERE'^ conditions;
conditions : (members (operatorLogic members)*);
members : STRING operator value;
functions : '*';
STRING : ('a'..'z'|'A'..'Z')+;
WS : (' '|'\t'|'\f'|'\n'|'\r')+ {skip();}; // handle white space between keywords
The output is an AST. The above is only a small sample; I am developing a much bigger grammar and need advice on how to approach this.
For example, according to the above grammar the following can be parsed:
SELECT * FROM table;
SELECT * FROM table WHERE name = i AND name = j;
These queries could get more complex. I have implemented the AST handling in the Java code and can get the tree back. I wanted to separate the grammar from the logic so that each stays cohesive, so an AST seemed the best approach.
The user will enter a query as a String, and my code needs to handle the query in the best way possible. As you can see, the functions rule currently accepts only *, which means select all. In the future this could expand to include other things.
How can my code handle this? What's the best approach?
I could do something like this:
String input = "SELECT * FROM table;";
if (input.startsWith("SELECT")) {
    select();
}
As you can see, this approach gets complicated, because I also need to handle * and the optional filters. The operatorLogic, which is AND or OR, needs to be handled too.
What is the best way? I have looked online but couldn't find any example of how to handle this.
Are you able to give any examples?
EDIT:
String input = "SELECT * FROM table;";
if(input.startsWith("SELECT")) {
select();
}
else if(input.startsWith("SELECT *")) {
findAll();
}
The easiest way to handle multiple starting rules ("SELECT ...", "UPDATE...", etc) is to let the ANTLR grammar do the work for you at a single, top-level starting rule. You pretty much have that already, so it's just a matter of updating what you have.
Currently your grammar is limited to one command-type of input ("SELECT...") because that's all you've defined:
query : (select)*; //query only handles "select" because that's all there is.
select : 'SELECT'^ functions 'FROM table' filters?';';
If query is your starting rule, then accepting additional top-level input is a matter of defining query to accept more than select:
query : (select | update)*; //query now handles any number of "select" or "update" rules, in any order.
select : 'SELECT'^ functions 'FROM table' filters?';';
update : 'UPDATE'^ ';'; //simple example of an update rule
Now the query rule can handle input such as SELECT * FROM table;, UPDATE;, or SELECT * FROM table; UPDATE;. When a new top-level rule is added, just update query to test for that new rule. This way your Java code doesn't need to test the input, it just calls the query rule and lets the parser handle the rest.
If you only want one type of input to be processed from the input, define query like this:
query : select* //read any number of selects, but no updates
| update* //read any number of updates, but no selects
;
The rule query still handles SELECT * FROM table; and UPDATE;, but not a mix of commands, like SELECT * FROM table; UPDATE;.
Once you get your query_return AST tree from calling query, you now have something meaningful that your Java code can process, instead of a string. That tree represents all the input that the parser processed.
You can walk through the children of the tree like so:
iParser.query_return r = parser.query();
CommonTree t = (CommonTree) r.getTree();
for (int i = 0, count = t.getChildCount(); i < count; ++i) {
    CommonTree child = (CommonTree) t.getChild(i);
    System.out.println("child type: " + child.getType());
    System.out.println("child text: " + child.getText());
    System.out.println("------");
}
Walking through the entire AST tree is a matter of recursively calling getChild(...) on all parent nodes (my example above looks at the top-level children only).
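A recursive walk could be sketched as follows. To keep the example self-contained, it uses a small hypothetical Node class as a stand-in for CommonTree, and the tree built in main is only a guess at the shape the grammar would produce for a SELECT with a WHERE clause. With the real tree you would use getChildCount() and getChild(i), as in the loop above, instead of the children list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TreeWalkSketch {

    // Hypothetical stand-in for ANTLR's CommonTree: a text plus child nodes.
    static final class Node {
        final String text;
        final List<Node> children;

        Node(String text, Node... children) {
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    // Depth-first walk: record this node, then recurse into each child.
    static void walk(Node node, int depth, List<String> out) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  "); // two spaces per nesting level
        }
        out.add(line.append(node.text).toString());
        for (Node child : node.children) {
            walk(child, depth + 1, out);
        }
    }

    // Convenience overload that collects the indented lines for a whole tree.
    static List<String> walk(Node root) {
        List<String> out = new ArrayList<>();
        walk(root, 0, out);
        return out;
    }

    public static void main(String[] args) {
        // Assumed tree shape, for illustration only.
        Node select = new Node("SELECT",
                new Node("*"),
                new Node("FROM table"),
                new Node("WHERE", new Node("name"), new Node("="), new Node("i")),
                new Node(";"));
        for (String line : walk(select)) {
            System.out.println(line);
        }
    }
}
```

In a real walker, the place where this sketch records the node text is where you would dispatch on child.getType() to run the logic for each kind of node.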
Handling alternatives to * is no different than any other alternatives you've defined: just define the alternatives in the rule you want to expand. If you want functions to accept more than *, define functions to accept more than *. ;)
Here's an example:
functions: '*' //"all"
| STRING //some id
;
Now the parser can accept SELECT * FROM table; and SELECT foobar FROM table;.
Remember that your Java code has no reason to examine the input string. Whenever you're tempted to do that, look for a way to make your grammar do the examining instead. Your Java code will then look at the AST tree output for whatever it wants.
