Hibernate search - '%like%' type query - java

I'm using hibernate-search in my Spring MVC project and I would like to accomplish something but I'm not sure if it's possible. Here is the problem:
I'm using the NGramFilterFactory class for this and have configured minGramSize=3 and maxGramSize=3.
Let's say my search term is "Keyword"
If I type anything like this:
"ywo", "key", "ord", "blablaordblabla"
the query will return "Keyword". This is fine and I understand how it works. However, when I type something like:
"bkey", "blablaordblabla"
I don't want "Keyword" to be returned. "Keyword" should be returned only when the search term is something like:
"key", "ord", "ywo", "eywo", "word" etc...
So, I guess I'm looking for a '%like%' type query. How can I accomplish this with hibernate-search?

I don't know if this is what you are looking for, but maybe you need what are called "wildcard queries".
Try to have a look at this link as reference.
Also have a look at this stackoverflow topic

If you analyze your input with NGrams, you won't be able to perform exact "Like%" queries.
You probably want a SimpleAnalyzer or something similar that doesn't break your keywords into smaller pieces, or you might want to skip analysis for this field and index it as-is.
You then combine this with a wildcard query; note how the example in the reference docs uses the keyword element to build the query, which inherently disables the analyzer on the input. (Make sure you scroll down to the Wildcard queries section in the docs.)
I assume you're using NGrams because you need them for another use case. Remember you can use the @Fields annotation to index the same property in several different ways, so you could index it with NGrams and also in another form better suited to wildcard queries.
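To make the difference concrete, here is a minimal plain-Java sketch (not Hibernate Search or Lucene code) contrasting the two matching styles. The class and method names are made up for illustration, and it assumes that an n-gram analyzed query counts as a hit as soon as one gram is shared, which is the behaviour described in the question.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class NGramVsLike {
    // Split a term into lowercase n-grams of a fixed size,
    // e.g. "Keyword" with size 3 -> key, eyw, ywo, wor, ord.
    static Set<String> ngrams(String term, int size) {
        String t = term.toLowerCase(Locale.ROOT);
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + size <= t.length(); i++) {
            grams.add(t.substring(i, i + size));
        }
        return grams;
    }

    // N-gram style match (assumed OR semantics): true if the query
    // shares at least one gram with the indexed term.
    static boolean ngramMatch(String indexed, String query, int size) {
        Set<String> shared = ngrams(indexed, size);
        shared.retainAll(ngrams(query, size));
        return !shared.isEmpty();
    }

    // '%like%' style match: true only if the whole query occurs
    // inside the indexed term, ignoring case.
    static boolean likeMatch(String indexed, String query) {
        return indexed.toLowerCase(Locale.ROOT)
                .contains(query.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        for (String q : new String[] {"key", "eywo", "bkey", "blablaordblabla"}) {
            System.out.println(q + " ngram=" + ngramMatch("Keyword", q, 3)
                    + " like=" + likeMatch("Keyword", q));
        }
    }
}
```

With these assumptions, "bkey" and "blablaordblabla" both pass the n-gram check (they contain the grams "key" and "ord") but fail the '%like%' check, which is why a second, non-NGram field queried with wildcards is the better fit here.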

Related

Jersey REST api filtering and ordering correctly

I have a Jersey REST API and I would like to add ordering by column, filtering by column, base, offset and others. But I can't find a concrete answer on how it should be done, or whether there is some best practice to follow. Should it be a header param or a query param? And should it be under one param like Order = "name:asc" or two like order_by = "name" and order_order_how = "asc"? Or is it completely up to me how I do it?
Generally this information is placed in query parameters. There are a few patterns I've seen, but the one that seems the most intuitive to me is as follows:
/resource?sort=-firstname[,+lastname]
The [] denotes optional additional criteria. The + and - denote the order.
The reason I like the above pattern rather than something like
/resource?sort=firstname&order=asc
is that with the latter pattern, separating the sort field from the order makes it difficult to ensure correctness with multiple criteria. The parsing algorithm may become error-prone and dependent on the client making careful requests.
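As a rough illustration of the single-parameter pattern, the value of sort could be parsed along these lines. This is only a sketch: the SortParam and Criterion names are hypothetical, and it assumes that "-" means descending while "+" or no prefix means ascending. Tokens are trimmed because an unencoded "+" in a query string is commonly decoded as a space.

```java
import java.util.ArrayList;
import java.util.List;

class SortParam {
    // One sort criterion: a field name and a direction.
    static final class Criterion {
        final String field;
        final boolean ascending;

        Criterion(String field, boolean ascending) {
            this.field = field;
            this.ascending = ascending;
        }

        @Override
        public String toString() {
            return field + (ascending ? " ASC" : " DESC");
        }
    }

    // Parse e.g. "-firstname,+lastname" into criteria, keeping their order.
    static List<Criterion> parse(String sort) {
        List<Criterion> result = new ArrayList<>();
        if (sort == null) {
            return result;
        }
        for (String raw : sort.split(",")) {
            String token = raw.trim();
            if (token.isEmpty()) {
                continue; // skip empty entries such as a trailing comma
            }
            boolean ascending = true; // assumed default when no prefix is given
            if (token.startsWith("-")) {
                ascending = false;
                token = token.substring(1);
            } else if (token.startsWith("+")) {
                token = token.substring(1);
            }
            if (token.isEmpty()) {
                throw new IllegalArgumentException("Missing field name in sort: " + sort);
            }
            result.add(new Criterion(token, ascending));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("-firstname,+lastname"));
    }
}
```

Because each field carries its own direction, there is no second parameter whose entries have to be kept aligned with the first.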

Query that returns only certain fields?

I've got a very large, structured document(s) stored in MongoDB, and am using Morphia to query and model it in Java. I'd like to write a query that only returns a handful of the fields in that document, rather than returning the entire thing. I've looked in the documentation on the Morphia site, but couldn't find anything that explains how to do this. Is it possible to write a query like this with Morphia? In pseudocode it would be something like
GET doc.propertyA, doc.propertyB, doc.propertyX FROM doc WHERE doc.someOtherProperty = 'Foo'
Thoughts? Or is Morphia not designed to operate in this manner? Is there something better I could try?
Take a look at this: https://rawgithub.com/wiki/mongodb/morphia/javadoc/0.103/apidocs/com/google/code/morphia/query/Query.html#retrievedFields%28boolean,%20java.lang.String...%29
You'll still get back your entity objects but they'll only contain the fields listed.
An example is better than words.
This query returns only the "_id" field:
datastore.createQuery(entityClazz.class).retrievedFields(true, Mapper.ID_KEY);

How to use clause in order in Hibernate

I would like to get objects from the database via Hibernate in a specific order, like this:
first, the objects whose column first_column (for example) is not null,
second, the objects whose column second_column is not null,
last, the objects whose third_column is the id of another object/table, where that other object has a field with a specific value, for example "something".
I have created criteria in this way:
criteria.addOrder(Order.asc("firstColumn"));
criteria.addOrder(Order.asc("secondColumn"));
but how can I meet the last requirement?
With the restriction I can do something like that:
criteria.createAlias("thirdColumn", "t");
criteria.add(Restrictions.eq("t.field", "something"));
But I have to use ordering rather than restrictions with three separate Criteria results, because I am also using setFirstResult() and setMaxResults() of the Criteria to implement pagination in my frontend.
If you can write the statement in SQL then you can probably get away with the approach mentioned in this post which is to create a custom subclass of Order.
I think you can simply use the "." separator and write your code as follows:
criteria.createAlias("thirdColumn", "t");
criteria.addOrder(Order.asc("t.field"));

How to get over limitations of the Hibernate Criteria and Example APIs?

I'm in a position where our company has a database search service that is highly configurable, for which it's very useful to configure queries in a programmatic fashion. The Criteria API is powerful but when one of our developers refactors one of the data objects, the criteria restrictions won't signal that they're broken until we run our unit tests, or worse, are live and on our production environment. Recently, we had a refactoring project essentially double in working time unexpectedly due to this problem, a gap in project planning that, had we known how long it would really take, we probably would have taken an alternative approach.
I'd like to use the Example API to solve this problem. The Java compiler can loudly indicate that our queries are borked if we are specifying 'where' conditions on real POJO properties. However, there's only so much functionality in the Example API and it's limiting in many ways. Take the following example
Product product = new Product();
product.setName("P%");
Example prdExample = Example.create(product);
prdExample.excludeProperty("price");
prdExample.enableLike();
prdExample.ignoreCase();
Here, the property "name" is being queried against (where name like 'P%'), and if I were to remove or rename the field "name", we would know instantly. But what about the property "price"? It's being excluded because the Product object has some default value for it, so we're passing the "price" property name to an exclusion filter. Now if "price" got removed, this query would be syntactically invalid and you wouldn't know until runtime. LAME.
Another problem - what if we added a second where clause:
product.setPromo("Discounts up to 10%");
Because of the call to enableLike(), this example will match on the promo text "Discounts up to 10%", but also "Discounts up to 10,000,000 dollars" or anything else that matches. In general, the Example object's query-wide modifications, such as enableLike() or ignoreCase() aren't always going to be applicable to every property being checked against.
Here's a third, and major, issue - what about other special criteria? There's no way to get every product with a price greater than $10 using the standard example framework. There's no way to order results by promo, descending. If the Product object joined on some Manufacturer, there's no way to add a criterion on the related Manufacturer object either. There's no way to safely specify the FetchMode on the criteria for the Manufacturer either (although this is a problem with the Criteria API in general - invalid fetched relationships fail silently, even more of a time bomb)
For all of the above examples, you would need to go back to the Criteria API and use string representations of properties to make the query - again, eliminating the biggest benefit of Example queries.
What alternatives exist to the Example API that can get the kind of compile-time advice we need?
My company gives developers days when we can experiment and work on pet projects (a la Google) and I spent some time working on a framework to use Example queries while getting around the limitations described above. I've come up with something that could be useful to other people interested in Example queries too. Here is a sample of the framework using the Product example.
Criteria criteriaQuery = session.createCriteria(Product.class);
Restrictions<Product> restrictions = Restrictions.create(Product.class);
Product example = restrictions.getQueryObject();
example.setName(restrictions.like("N%"));
example.setPromo("Discounts up to 10%");
restrictions.addRestrictions(criteriaQuery);
Here's an attempt to fix the issues in the code example from the question - the problem of the default value for the "price" field no longer exists, because this framework requires that criteria be explicitly set. The second problem of having a query-wide enableLike() is gone - the matcher is only on the "name" field.
The other problems mentioned in the question are also gone in this framework. Here are example implementations.
product.setPrice(restrictions.gt(10)); // price > 10
product.setPromo(restrictions.order(false)); // order by promo desc
Restrictions<Manufacturer> manufacturerRestrictions
    = Restrictions.create(Manufacturer.class);
// configure manufacturer restrictions in the same manner...
product.setManufacturer(restrictions.join(manufacturerRestrictions));
/* there are also joinSet() and joinList() methods
   for one-to-many relationships */
Even more sophisticated restrictions are available.
product.setPrice(restrictions.between(45,55));
product.setManufacturer(restrictions.fetch(FetchMode.JOIN));
product.setName(restrictions.or("Foo", "Bar"));
After showing the framework to a coworker, he mentioned that many data mapped objects have private setters, making this kind of criteria setting difficult as well (a different problem with the Example API!). So, I've accounted for that too. Instead of using setters, getters are also queryable.
restrictions.is(product.getName()).eq("Foo");
restrictions.is(product.getPrice()).gt(10);
restrictions.is(product.getPromo()).order(false);
I've also added some extra checking on the objects to ensure better type safety - for example, the relative criteria (gt, ge, le, lt) all require a value of type ? extends Comparable for the parameter. Also, if you use a getter in the style specified above, and there's a @Transient annotation present on the getter, it will throw a runtime error.
But wait, there's more!
If you like that Hibernate's built-in Restrictions utility can be statically imported, so that you can do things like criteria.add(eq("name", "foo")) without making your code really verbose, there's an option for that too.
Restrictions<Product> restrictions = new Restrictions<Product>(){
    public void query(Product queryObject){
        queryObject.setPrice(gt(10));
        queryObject.setPromo(order(false));
        //gt() and order() inherited from Restrictions
    }
};
That's it for now - thank you very much in advance for any feedback! We've posted the code on Sourceforge for those that are interested. http://sourceforge.net/projects/hqbe2/
The API looks great!
Restrictions.order(boolean) smells like control coupling. It's a little unclear what the values of the boolean argument represent.
I suggest replacing or supplementing with orderAscending() and orderDescending().
Have a look at Querydsl. Their JPA/Hibernate module requires code generation. Their Java collections module uses proxies but cannot be used with JPA/Hibernate at the moment.

In-memory data structure that supports boolean querying

I need to store data in memory where I map one or more key strings to an object, as follows:
"green", "blue" -> object1
"red", "yellow" -> object2
So, in Java the datastructure might implement:
Map<Set<String>, V>
I need to be able to efficiently receive a list of objects, where the strings match some boolean criteria, such as:
("red" OR "green") AND NOT "blue"
I'm working in Java, so the ideal solution would be an off-the-shelf Java library. I am, however, willing to implement something from scratch if necessary.
Anyone have any ideas? I'd rather avoid the overhead of an in-memory database if possible, I'm hoping for something comparable in speed to a HashMap (or at least the same order of magnitude).
Actually, I liked the problem so I implemented a full solution in the spirit of my earlier answer:
http://pastebin.com/6iazSKG9
A simple solution, not thread safe or anything, but fun and a good starting point, I guess.
Edit: Some elaboration, as requested
See the unit test for usage.
There are two interfaces, DataStructure<K,V> and Query<V>. DataStructure behaves somewhat like a map (and in my implementation it actually works with an internal map), but it also provides reusable and immutable query objects which can be combined like this:
Query<String> combinedQuery =
structure.and(
structure.or(
structure.search("blue"),
structure.search("red")
),
structure.not(
structure.search("green")
)
);
(A query that searches for objects that are tagged as (blue OR red) AND NOT green.) This query is reusable, which means that its results will change whenever the backing map is changed (kind of like an iTunes smart playlist).
The query objects are already thread safe, but the backing map is not, so there is some room for improvement here. Also, the queries could cache their results, but that would probably mean that the interface would have to be extended to provide for a purge method (kind of like the detach method in Wicket's models), which wouldn't be pretty.
As for licensing: if anybody wants this code I'll be happy to put it on SourceForge etc. ...
Sean
Would the criteria be amenable to bitmap indexing: http://en.wikipedia.org/wiki/Bitmap_index ?
I would say that the easiest way is simply to do recursive filtering, and to be clever about it, for instance by short-circuiting X AND Y when X has already been evaluated to the empty set.
The mapping however, needs to be from tags (such as "red" or "blue") to sets of objects.
The base case (resolving the atomic tags) of the recursion, would then be a simple lookup in this map. AND would be implemented using intersection, OR using union, and so on.
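The recursive approach described above could be sketched in plain Java roughly as follows. The TagIndex and Query names are invented for this example, and NOT is assumed to mean "every stored object except the matches", which requires the index to remember all objects it has seen.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class TagIndex<V> {
    // A query evaluates itself against an index and returns a fresh, mutable result set.
    interface Query<V> {
        Set<V> eval(TagIndex<V> index);
    }

    private final Map<String, Set<V>> byTag = new HashMap<>(); // tag -> objects carrying it
    private final Set<V> all = new HashSet<>();                // universe, needed for NOT

    void put(V value, String... tags) {
        all.add(value);
        for (String tag : tags) {
            byTag.computeIfAbsent(tag, k -> new HashSet<>()).add(value);
        }
    }

    // Base case: a simple map lookup, copied so later operations may mutate it.
    static <V> Query<V> tag(String name) {
        return index -> new HashSet<>(index.byTag.getOrDefault(name, Collections.emptySet()));
    }

    // AND is intersection; skip the right side if the left is already empty.
    static <V> Query<V> and(Query<V> left, Query<V> right) {
        return index -> {
            Set<V> result = left.eval(index);
            if (result.isEmpty()) {
                return result;
            }
            result.retainAll(right.eval(index));
            return result;
        };
    }

    // OR is union.
    static <V> Query<V> or(Query<V> left, Query<V> right) {
        return index -> {
            Set<V> result = left.eval(index);
            result.addAll(right.eval(index));
            return result;
        };
    }

    // NOT is the complement relative to all stored objects.
    static <V> Query<V> not(Query<V> inner) {
        return index -> {
            Set<V> result = new HashSet<>(index.all);
            result.removeAll(inner.eval(index));
            return result;
        };
    }

    public static void main(String[] args) {
        TagIndex<String> index = new TagIndex<>();
        index.put("object1", "green", "blue");
        index.put("object2", "red", "yellow");
        Query<String> q = and(or(tag("red"), tag("green")), not(tag("blue")));
        System.out.println(q.eval(index)); // prints [object2]
    }
}
```

Each atomic lookup costs about as much as a HashMap access; the set operations add time proportional to the sizes of the intermediate results.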
Check out the Apache Commons - Collections project. They have tons of great stuff that you will be able to use, particularly the CollectionUtils class for performing strong collection-based logic.
For instance, if your values were stored in a HashMap (as suggested by another answer) as follows:
myMap["green"] -> obj1
myMap["blue"] -> obj1
myMap["red"] -> obj2
myMap["yellow"] -> obj2
Then to retrieve results that match ("red" or "green") and not "blue", assuming each map value is a collection of objects, you might do this:
CollectionUtils.subtract(CollectionUtils.union(myMap.get("red"), myMap.get("green")), myMap.get("blue"))
You could map string keys to a binary constant, then use bit shifting to produce an appropriate mask.
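A minimal sketch of that idea, under the assumption that there are at most 64 distinct tags so that one long per object is enough. The TagMasks class and its methods are hypothetical, and objects are represented by plain strings to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TagMasks {
    private final Map<String, Long> bitOf = new HashMap<>();  // tag -> its single bit
    private final Map<String, Long> maskOf = new HashMap<>(); // object -> OR of its tag bits

    // Return the bit for a tag, assigning the next free bit on first use.
    long bit(String tag) {
        Long existing = bitOf.get(tag);
        if (existing != null) {
            return existing;
        }
        if (bitOf.size() >= 64) {
            throw new IllegalStateException("more than 64 distinct tags");
        }
        long b = 1L << bitOf.size();
        bitOf.put(tag, b);
        return b;
    }

    void put(String object, String... tags) {
        long mask = 0L;
        for (String tag : tags) {
            mask |= bit(tag);
        }
        maskOf.put(object, mask);
    }

    // Objects that have at least one bit of anyOf and no bit of noneOf.
    List<String> find(long anyOf, long noneOf) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Long> e : maskOf.entrySet()) {
            long m = e.getValue();
            if ((m & anyOf) != 0 && (m & noneOf) == 0) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        TagMasks t = new TagMasks();
        t.put("object1", "green", "blue");
        t.put("object2", "red", "yellow");
        // ("red" OR "green") AND NOT "blue"
        System.out.println(t.find(t.bit("red") | t.bit("green"), t.bit("blue"))); // prints [object2]
    }
}
```

Note that find() scans every object, so each test is cheap but the query as a whole is linear in the number of objects rather than comparable to a single HashMap lookup.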
I truly think some type of database solution is your best bet. SQL easily supports querying data by
(X and Y) and not Z
This would have worked too: reusable condition/expression classes.
The Google Collections SetMultimap looks like an easy way to get the underlying structure; you can then combine that with the static filter methods in Maps to get the querying behavior you want.
Construction would go something like
smmInstance.put(from1,to1);
smmInstance.put(from1,to2);
smmInstance.put(from2,to3);
smmInstance.put(from3,to1);
smmInstance.put(from1,to3);
//...
Queries would then look like
valueFilter = //...build predicate
Set<FromType> result = Maps.filterValues(smmInstance.asMap(),valueFilter).keySet();
You can do any amount of fancy building the predicate, but Predicates has a few methods that would probably be enough to do contains/doesn't contain style queries.
I wasn't able to find a satisfactory solution, so I decided to cook up my own and release it as an open source (LGPL) project; find it here.
