Using Hibernate Search 5.11.3 with programmatic API (no annotations), is there a way to facet on dynamic fields added in a class or field bridge? I don't see any 'facet' config available in FieldMetadataBuilder when using MetadataProvidingFieldBridge.
I have tried various combinations of luceneOptions.addSortedDocValuesFieldToDocument() and luceneOptions.addFieldToDocument() in the set() method. This successfully updates the index, but I cannot perform facet queries.
I am trying to do a basic attribute facet/filter where I have a generic table of attributes with id/name and attribute values associated with products. For various reasons I am using the programmatic API, and especially for attributes I can't make use of the @Facet annotation. So for a product, I added this class bridge to Product.class:
public class ProductClassTagValuesBridge implements FieldBridge
{
    @Override
    public void set(String name, Object value, Document document, LuceneOptions luceneOptions)
    {
        Product product = (Product) value;
        for (TagValue v : product.getTagValues())
        {
            Tag tag = v.getTag();
            String tagName = "tag-" + tag.getId();
            String tagValue = v.getId().toString();
            // not sure if this line is required? Have tried with and without
            luceneOptions.addFieldToDocument(tagName, tagValue, document);
            luceneOptions.addSortedDocValuesFieldToDocument(tagName, tagValue, document);
        }
    }
}
Then I build my (test) faceting request to search tag-56 (which I confirmed is in the index using Luke):
FacetParameterContext context = queryBuilder.facet()
.name("tag-56")
.onField("tag-56")
.discrete();
FacetingRequest facetingRequest = context.createFacetingRequest();
Which when used in the search/FacetManager gives me the error:
org.hibernate.search.exception.SearchException: HSEARCH000268: Facet request 'TAG_56' tries to facet on field 'tag-56' which either does not exist or is not configured for faceting (via @Facet). Check your configuration.
I have also tried the custom config solution from this post: Hibernate Search: configure Facet for custom FieldBridge
For the custom field I added a field bridge to tagValues on my product. The same error occurs.
mapping.entity(Product.class).indexed()
.property("tagValues", ElementType.FIELD).field()
.analyze(Analyze.NO).store(Store.YES)
.bridge(ProductTagValuesFieldBridge.class)
Short answer: Hibernate Search does not allow that... yet.
Long answer:
Hibernate Search 5 allows dynamic fields, but does not allow faceting on fields declared in custom bridges.
That is to say, you can add arbitrary values to your index that don't fit a pre-defined schema, but you cannot use faceting on those fields.
Hibernate Search 6 allows faceting (now called "aggregations") on fields declared in custom bridges (just declare them as .aggregable(Aggregable.YES)), but does not allow dynamic fields yet.
EDIT: Starting with 6.0.0.Beta7, dynamic fields are supported thanks to field templates. So the rest of my message is not useful anymore.
See this section of the documentation for more information about field templates. It's totally possible to declare an aggregable, dynamic field in your bridge.
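For example, in a recent 6.x version the bridge could look roughly like the sketch below. This is a minimal, non-authoritative sketch: Product and TagValue come from the question, ProductTagsBinder and the template name are made up, and the generic TypeBridge<Product>/bridge(Class, TypeBridge) signatures are the 6.1+ flavor (6.0 uses non-generic equivalents).
import org.hibernate.search.engine.backend.document.DocumentElement;
import org.hibernate.search.engine.backend.types.Aggregable;
import org.hibernate.search.mapper.pojo.bridge.TypeBridge;
import org.hibernate.search.mapper.pojo.bridge.binding.TypeBindingContext;
import org.hibernate.search.mapper.pojo.bridge.mapping.programmatic.TypeBinder;
import org.hibernate.search.mapper.pojo.bridge.runtime.TypeBridgeWriteContext;

public class ProductTagsBinder implements TypeBinder {

    @Override
    public void bind(TypeBindingContext context) {
        // Reindex the product whenever its tag values change.
        context.dependencies().use("tagValues");

        // Field template: any field whose path matches "tag-*" is a string field
        // that can be aggregated (faceted) on.
        context.indexSchemaElement()
                .fieldTemplate("productTag", f -> f.asString().aggregable(Aggregable.YES))
                .matchingPathGlob("tag-*");

        context.bridge(Product.class, new Bridge());
    }

    private static class Bridge implements TypeBridge<Product> {
        @Override
        public void write(DocumentElement target, Product product, TypeBridgeWriteContext context) {
            // Dynamic fields: one per tag, matched against the template declared above.
            for (TagValue v : product.getTagValues()) {
                target.addValue("tag-" + v.getTag().getId(), v.getId().toString());
            }
        }
    }
}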
Original message about ways to work without dynamic fields (obsolete):
That is to say, if you know the list of tags upon startup, are able to list them all, and are certain they won't change while your application is up, you could declare the fields upfront and use faceting on them. But if you don't know the list of tags upon startup, none of this is possible (yet).
Until dynamic fields are added to Hibernate Search 6, the only solution is to use Hibernate Search 5 and to re-implement faceting yourself. As you can expect, this will be complex and you will have to get your hands dirty with Lucene. You will have to:
Add fields of type SortedSetDocValuesFacetField to your document in your custom bridge.
Ensure Hibernate Search calls FacetsConfig.build on your documents after they are populated. One way to do that (through a hack) would be to declare a dummy @Facet field on your entity, even if you don't use it.
Completely ignore Hibernate Search's query feature and perform faceting yourself from an IndexReader. You can get an IndexReader from Hibernate Search as explained here. There's an example of how to perform faceting in org.hibernate.search.query.engine.impl.QueryHits#updateStringFacets; a rough sketch follows below.
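To give an idea of step 3, here is a rough sketch of counting facets yourself with plain Lucene, assuming the fields were indexed as SortedSetDocValuesFacetField and FacetsConfig.build was applied as described above. The fullTextSession variable and the match-all query are placeholders; the rest is the standard Lucene facet API (FacetsCollector, DefaultSortedSetDocValuesReaderState, SortedSetDocValuesFacetCounts).
// Obtain a Lucene IndexReader from Hibernate Search 5.
IndexReaderAccessor readerAccessor = fullTextSession.getSearchFactory().getIndexReaderAccessor();
IndexReader reader = readerAccessor.open(Product.class);
try {
    IndexSearcher searcher = new IndexSearcher(reader);

    // Collect the documents to facet on (here: all of them).
    FacetsCollector facetsCollector = new FacetsCollector();
    FacetsCollector.search(searcher, new MatchAllDocsQuery(), 10, facetsCollector);

    // Count facet values stored in the sorted-set doc values facet field.
    SortedSetDocValuesReaderState state = new DefaultSortedSetDocValuesReaderState(reader);
    Facets facets = new SortedSetDocValuesFacetCounts(state, facetsCollector);
    FacetResult tagCounts = facets.getTopChildren(10, "tag-56");
    // tagCounts.labelValues now holds (value, count) pairs for the "tag-56" dimension.
} finally {
    readerAccessor.close(reader);
}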
Related
When you search Stack Overflow or the Internet for LuceneAnalysisDefinitionProvider, you'll find hundreds of pages, each with the same code copied from another page, without any decent explanation or further usage examples.
So I tried to do it by myself and failed. Here is my code:
public class CustomLuceneAnalysisDefinitionProvider
        implements LuceneAnalysisDefinitionProvider {

    @Override
    public void register(final LuceneAnalysisDefinitionRegistryBuilder builder) {
        builder
            .analyzer("customAnalyzer")
                .tokenizer(StandardTokenizerFactory.class)
                .charFilter(MappingCharFilterFactory.class)
                    .param("mapping",
                            "org/hibernate/search/test/analyzer/mapping-chars.properties")
                .tokenFilter(ASCIIFoldingFilterFactory.class)
                .tokenFilter(LowerCaseFilterFactory.class)
                .tokenFilter(StopFilterFactory.class)
                    // WRONG! It's not "mapping"!
                    // .param("mapping",
                    //         "org/hibernate/search/test/analyzer/stoplist.properties")
                    .param("words", "classpath:/stoplist.properties")
                    .param("ignoreCase", "true");
    }
}
Now we have CustomLuceneAnalysisDefinitionProvider and what's next?
Where to put and how to address mapping-chars.properties when adding it as a parameter to MappingCharFilterFactory?
What are the contents of mapping-chars.properties, and how do I create my own or modify an existing one?
Where to put stoplist.properties and how to address it when adding it as a parameter to StopFilterFactory?
How to add the previously defined customAnalyzer to the single @Field mentioned below?
@Field(
    index = Index.YES,
    analyze = Analyze.YES,
    store = Store.NO,
    bridge = @FieldBridge(impl = LocalizedFieldBridge.class)
)
private LocalizedField description;
On some pages I found the option to put this definition into application.properties:
hibernate.search.lucene.analysis_definition_provider = com.thevegcat.app.search.CustomAnalysisDefinitionProvider
But I don't want to replace the original analyzer; I just want to use a custom analyzer for a few specific properties.
EDIT#1
Looking into org.apache.lucene.analysis.core.StopFilterFactory line 86, one can notice that it takes words as the key, not mapping.
EDIT#2
If you put your stop words file in src/main/resources, then you have to address it:
.param("words", "classpath:/stoplist.properties")
you'll find hundreds of pages, each with the same code copied from another page, without any decent explanation or further usage examples.
Hibernate Search 5 had its problems, one of which was lack of documentation in some areas. Now that it's in maintenance mode, those problems are unlikely to get addressed.
There is some documentation for that feature in the Hibernate Search 5 documentation: https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#section-programmatic-analyzer-definition
You'll get better documentation of that feature by migrating to Hibernate Search 6+.
That being said, most of your questions relate to Lucene features, so you probably won't find answers in Hibernate Search's documentation. You could find them in Lucene's documentation. How to find such documentation is explained in the Hibernate Search 6 documentation:
To know more about the behavior of these character filters, tokenizers and token filters, either browse the Lucene Javadoc or read the corresponding section on the Solr Wiki (you don’t need Solr to use these analyzers, it’s just that there is no documentation page for Lucene proper).
Where to put and how to address mapping-chars.properties when adding it as a parameter to MappingCharFilterFactory?
In your classpath.
What are the contents of mapping-chars.properties, and how do I create my own or modify an existing one?
That's the kind of thing that Lucene doesn't document, at least not clearly. Solr's documentation is better: https://solr.apache.org/guide/6_6/charfilterfactories.html#CharFilterFactories-solr.MappingCharFilterFactory
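As a rough illustration, based on the Solr documentation linked above, a mapping file contains one "source" => "target" rule per line; the actual mappings below are just examples:
# Lines starting with '#' are comments.
"à" => "a"
"é" => "e"
"œ" => "oe"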
Where to put stoplist.properties and how to address it when adding as mapping parameter to StopFilterFactory?
Put it in the classpath, and pass the path to that file from the root of your classpath.
How to add the previously defined customAnalyzer to the single @Field mentioned below?
Well that is documented, at least: https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#_referencing_named_analyzers
@Field(analyzer = @Analyzer(definition = "customAnalyzer"))
On some pages I found the option to put this definition into application.properties:
hibernate.search.lucene.analysis_definition_provider = com.thevegcat.app.search.CustomAnalysisDefinitionProvider
But I don't want to replace the original analyzer; I just want to use a custom analyzer for a few specific properties.
You won't replace an "analyzer"; you will register an analysis definition provider, which adds analyzer definitions to Hibernate Search that can then be referenced from @Field. Setting an analysis definition provider does not, in itself, change your mapping in any way.
We're in the process of converting our java application from Hibernate Search 5 to 6 with an Elasticsearch backend.
For some good background info, see How to do highlighting within HibernateSearch over Elasticsearch for a question we had when upgrading our highlighting code from a Lucene to Elasticsearch backend and how it was resolved.
Hibernate Search 6 seems to support using 2 backends at the same time, Lucene and Elasticsearch, so we'd like to use Elasticsearch for all our queries and Lucene for the highlighting, if that's possible.
Here is basically what we're trying to do:
public boolean matchPhoneNumbers() {
String phoneNumber1 = "603-436-1234";
String phoneNumber2 = "603-436-1234";
LuceneBackend luceneBackend =
Search.mapping(entityManager.getEntityManagerFactory())
.backend().unwrap(LuceneBackend.class);
Analyzer analyzer = luceneBackend.analyzer("phoneNumberKeywordAnalyzer").get();
//... builds a Lucene Query using the analyzer and phoneNumber1 term
Query phoneNumberQuery = buildQuery(analyzer, phoneNumber1, ...);
return isMatch("phoneNumberField", phoneNumber2, phoneNumberQuery, analyzer);
}
private boolean isMatch(String field, String target, Query sourceQ, Analyzer analyzer) {
Highlighter highlighter = new Highlighter(new QueryScorer(sourceQ, field));
highlighter.setTextFragmenter(new NullFragmenter());
try {
String result = highlighter.getBestFragment(analyzer, field, target);
return StringUtils.hasText(result);
} catch (IOException e) {
...
}
}
What I've attempted so far is to configure two separate backends in the configuration properties, per the documentation, like this:
properties.setProperty("hibernate.search.backends.elasticsearch.analysis.configurer", "com.bt.demo.search.AnalysisConfigurer");
properties.setProperty("hibernate.search.backends.lucene.analysis.configurer", "com.bt.demo.search.CustomLuceneAnalysisConfigurer");
properties.setProperty("hibernate.search.backends.elasticsearch.type", "elasticsearch");
properties.setProperty("hibernate.search.backends.lucene.type", "lucene");
properties.setProperty("hibernate.search.backends.elasticsearch.uris", "http://127.0.0.1:9200");
The AnalysisConfigurer class implements ElasticsearchAnalysisConfigurer and
CustomLuceneAnalysisConfigurer implements LuceneAnalysisConfigurer.
Analyzers are defined twice, once in the Elasticsearch configurer and again in the Lucene configurer.
I don't know why both hibernate.search.backends.elasticsearch.type and hibernate.search.backends.lucene.type are necessary but if I don't include the lucene.type, I get Ambiguous backend type: configuration property 'hibernate.search.backends.lucene.type' is not set.
But if I do have both backend types set, I get
HSEARCH000575: No default backend. Check that at least one entity is configured to target the default backend, when attempting to retrieve the Lucene backend, like:
Search.mapping(entityManager.getEntityManagerFactory())
.backend().unwrap(LuceneBackend.class);
And the same error when trying to retrieve the Elasticsearch backend.
I've also added @Indexed(..., backend = "elasticsearch") to my entities since I wish to have them saved into Elasticsearch and don't need them in Lucene. I also tried adding a fake entity with @Indexed(..., backend = "lucene") but it made no difference.
What have I got configured wrong?
I don't know why both hibernate.search.backends.elasticsearch.type and hibernate.search.backends.lucene.type are necessary but if I don't include the lucene.type, I get Ambiguous backend type: configuration property 'hibernate.search.backends.lucene.type' is not set.
That's because the backend name is just that: a name. Hibernate Search doesn't infer particular information from it, even if you name your backend "lucene" or "elasticsearch". You could have multiple Elasticsearch backends for all it knows :)
But if I do have both backend properties types set, I get HSEARCH000575: No default backend. Check that at least one entity is configured to target the default backend, when attempting to retrieve the Lucene backend, like:
Search.mapping(entityManager.getEntityManagerFactory())
.backend().unwrap(LuceneBackend.class);
You called .backend(), which retrieves the default backend, i.e. the backend that doesn't have a name and is configured through hibernate.search.backend.* instead of hibernate.search.backends.<somename>.* (see https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#configuration-structure ).
But you are apparently mapping all your entities to named backends, one named elasticsearch and one named lucene, so the default backend just doesn't exist.
You should call this:
Search.mapping(entityManager.getEntityManagerFactory())
.backend("lucene").unwrap(LuceneBackend.class);
I've also added @Indexed(..., backend = "elasticsearch") to my entities since I wish to have them saved into Elasticsearch
Since you obviously only want to use one backend for indexing, I would recommend reverting that change (keeping @Indexed without setting @Indexed.backend) and simply using the default backend.
In short, remove the @Indexed.backend and replace this:
properties.setProperty("hibernate.search.backends.elasticsearch.analysis.configurer", "com.bt.demo.search.AnalysisConfigurer");
properties.setProperty("hibernate.search.backends.lucene.analysis.configurer", "com.bt.demo.search.CustomLuceneAnalysisConfigurer");
properties.setProperty("hibernate.search.backends.elasticsearch.type", "elasticsearch");
properties.setProperty("hibernate.search.backends.lucene.type", "lucene");
properties.setProperty("hibernate.search.backends.elasticsearch.uris", "http://127.0.0.1:9200");
With this:
properties.setProperty("hibernate.search.backend.analysis.configurer", "com.bt.demo.search.AnalysisConfigurer");
properties.setProperty("hibernate.search.backends.lucene.analysis.configurer", "com.bt.demo.search.CustomLuceneAnalysisConfigurer");
properties.setProperty("hibernate.search.backend.type", "elasticsearch");
properties.setProperty("hibernate.search.backends.lucene.type", "lucene");
properties.setProperty("hibernate.search.backend.uris", "http://127.0.0.1:9200");
You don't technically have to do that, but I think it will be simpler in the long term. It keeps the Lucene backend as a separate hack that doesn't affect your whole application.
I also tried adding a fake entity with @Indexed(..., backend = "lucene")
I confirm you will need that fake entity mapped to the "lucene" backend, otherwise Hibernate Search will not create the "lucene" backend.
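Such a fake entity can be as small as the sketch below (the class name is made up; it exists only so that Hibernate Search creates the "lucene" backend):
@Entity
@Indexed(backend = "lucene")
public class LuceneBackendStubEntity {

    @Id
    private Long id;
}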
I'm trying to get all the ADTs (Application Display Templates) of a site knowing its groupId. However, it seems like the ADTs are mixed up with all the other templates of the site in the DDMTemplate table in the database.
DDMTemplates are a general portal concept for managing templates for different types of portal assets (not only ADTs). DDMTemplateLocalService is the service to use to list the ADTs for a certain asset type. You first need to fetch the ClassNameId for the desired asset type to render. For example:
com.liferay.portal.kernel.service.ClassNameLocalService.getClassNameId(AssetEntry.class)
for AssetPublisher entries (or BlogsEntry and so on - for all other types of interest).
Having this id, you can query the ADTs of a site (groupId) using:
com.liferay.dynamic.data.mapping.service.DDMTemplateLocalService.getTemplates(long groupId, long classNameId)
Instead of using the *LocalServiceUtil static functions, you could also inject the service using the @Reference annotation, for example @Reference private DDMTemplateLocalService ddmTemplateService; (see the sketch below).
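Putting it together, a minimal sketch with injected services could look like this (the method name is made up; getTemplates(groupId, classNameId) is the overload mentioned above):
@Reference
private ClassNameLocalService classNameLocalService;

@Reference
private DDMTemplateLocalService ddmTemplateLocalService;

public List<DDMTemplate> getSiteAdts(long groupId) {
    // ADTs for Asset Publisher entries; use BlogsEntry.class etc. for other asset types.
    long classNameId = classNameLocalService.getClassNameId(AssetEntry.class);
    return ddmTemplateLocalService.getTemplates(groupId, classNameId);
}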
Where are the tables that Liferay generates in the database through service.xml? I don't see them in my Postgres database. There are so many tables; I tried to find them but couldn't. Can anyone help me? Thanks.
Unless you explicitly specify the table name in the entities that you declare in service.xml, the table names are constructed with the namespace and entity name.
<service-builder package-path="com.liferay.docs.guestbook">
<namespace>GB</namespace>
<entity name="Guestbook" local-service="true" uuid="true">
...
would generate GB_Guestbook as table name.
From the very well documented DTD:
<namespace>
The namespace element must be a unique namespace for this component. Table names will be prepended with this namespace. Generated JSON JavaScript will be scoped to this namespace as well (i.e., Liferay.Service.Test.* if the namespace is Test).
<entity> Child of service-builder
An entity usually represents a business facade and a table in the database. If an entity does not have any columns, then it only represents a business facade. The Service Builder will always generate an empty business facade POJO if it does not exist. Upon subsequent generations, the Service Builder will check to see if the business facade already exists. If it exists and has additional methods, then the Service Builder will also update the SOAP wrappers.
If an entity does have columns, then the value object, the POJO class that is mapped to the database, and other persistence utilities are also generated based on the order and finder elements.
...
(and you'll find more hints, e.g. explicit table names, in that document)
Notes:
If you declare that the entities are stored in an external (non-Liferay) datasource, no tables will be created.
Also, some versions of Liferay automatically updated the database structure on deployment of a new plugin version (with updated persistence layers), while others don't do this automatically (it's a developer feature anyway, not well suited to large, production-sized amounts of data).
I'm just getting to grips with JPA in a simple Java web app running on Glassfish 3 (Persistence provider is EclipseLink). So far, I'm really liking it (bugs in netbeans/glassfish interaction aside) but there's a thing that I want to be able to do that I'm not sure how to do.
I've got an entity class (Article) that's mapped to a database table (article). I'm trying to do a query on the database that returns a calculated column, but I can't figure out how to set up a property of the Article class so that the property gets filled by the column value when I call the query.
If I do a regular "select id,title,body from article" query, I get a list of Article objects fine, with the id, title and body properties filled. This works fine.
However, if I do the below:
Query q = em.createNativeQuery("select id,title,shorttitle,datestamp,body,true as published, ts_headline(body,q,'ShortWord=0') as headline, type from articles,to_tsquery('english',?) as q where idxfti @@ q order by ts_rank(idxfti,q) desc",Article.class);
(this is a fulltext search using tsearch2 on Postgres - it's a db-specific function, so I'm using a NativeQuery)
You can see I'm fetching a calculated column, called headline. How do I add a headline property to my Article class so that it gets populated by this query?
So far, I've tried setting it to be @Transient, but that just ends up with it being null all the time.
There is probably no good way to do it other than manually:
Object[] r = (Object[]) em.createNativeQuery(
        "select id,title,shorttitle,datestamp,body,true as published, ts_headline(body,q,'ShortWord=0') as headline, type from articles,to_tsquery('english',?) as q where idxfti @@ q order by ts_rank(idxfti,q) desc",
        "ArticleWithHeadline")
        .setParameter(...).getSingleResult();
Article a = (Article) r[0];
a.setHeadline((String) r[1]);
@Entity
@SqlResultSetMapping(
        name = "ArticleWithHeadline",
        entities = @EntityResult(entityClass = Article.class),
        columns = @ColumnResult(name = "HEADLINE"))
public class Article {
    @Transient
    private String headline;
    ...
}
AFAIK, JPA doesn't offer standardized support for calculated attributes. With Hibernate, one would use a @Formula, but EclipseLink doesn't have a direct equivalent. James Sutherland made some suggestions in Re: Virtual columns (@Formula of Hibernate) though:
There is no direct equivalent (please log an enhancement), but depending on what you want to do, there are ways to accomplish the same thing.
EclipseLink defines a TransformationMapping which can map a computed value from multiple field values, or access the database.
You can override the SQL for any CRUD operation for a class using its descriptor's DescriptorQueryManager.
You could define a VIEW on your database that performs the function and map your Entity to the view instead of the table.
You can also perform minor translations using Converters or property get/set methods.
Also have a look at the enhancement request that has a solution using a DescriptorEventListener in the comments.
All of this is non-standard JPA, of course.
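As an illustration, the "define a VIEW and map your entity to it" option from the list above could look roughly like this (a non-authoritative sketch; the view name article_with_headline and its columns are hypothetical):
@Entity
@Table(name = "article_with_headline") // a database view exposing ts_headline(...) as a column
public class ArticleWithHeadline {

    @Id
    private Long id;

    private String title;

    // Computed by the view, so read-only from JPA's point of view.
    @Column(insertable = false, updatable = false)
    private String headline;

    // getters/setters omitted
}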