Having a subfield searched under '_all' in elasticsearch

Having a subfield searched under '_all' in elasticsearch - java

I have a field that has a subdivision, like:
//name
.startObject(IndexConstants.FIRST_NAME)
.field("type").value("string")
.startObject("fields")
.startObject("folded")
.field("type").value("string")
.field("analyzer").value("folding")
.endObject()
.endObject()
The _all field only searches on firstname, not firstname.folded. If I specifically query on .folded it works, however it is a catch all query so I would not like to have to specify folded.
I have tried the "include_in_all" true for it but no change.
Thanks

As indicated in the official documentation, it makes no sense to use include_in_all in multi-fields:
The original field value is added to the _all field, not the terms produced by a field’s analyzer. For this reason, it makes no sense to set include_in_all to true on multi-fields, as each multi-field has exactly the same value as its parent.
Using copy_to could be an option with versions <2.x. However, using copy_to with multi-fields will be ignored as of 2.0 and even throw an exception as of 2.0.1 and 2.1.
You're better off matching directly on firstname.folded, if it is really important for you to query that sub-field, simply use its folding analyzer on the main firstname field and get rid of the sub-field.

Related

Morphia update document with undefined number of fields

I'm working on a small project using morphia for MongoDB. I want to know what is the best way to update a document without knowing first hand what field to update. Say for example, I have a form, after having saved to database I might want to come back and change some fields but I haven't decided yet what to change.
My current solution is doing an update for all fields whether it is changed or not and of course it throws null exception. Morphia complains on null value update.
My code looks like this:
Query<Project> q = datastore.find(Project.class, "_id", projectToUpdate.getId());
UpdateOperations<Project> update
= datastore.createUpdateOperations(Project.class)
.set("name", updatedProject.getName())
.set("deadline", updatedProject.getDeadline())
.set("priority", updatedProject.getPriority())
.set("completion", updatedProject.getCompletion())
.set("description", updatedProject.getDescription())
.set("projectManager", updatedProject.getPM())
.set("collaborators", updatedProject.getAllCollaborators())
.set("teams", updatedProject.getAllTeams())
.set("userStories", updatedProject.getUserStories())
.set("log", updatedProject.getLog());
datastore.findAndModify(q, update);
Exception
Exception in thread "main" org.mongodb.morphia.query.QueryException: Value cannot be null.
at org.mongodb.morphia.query.UpdateOpsImpl.set(UpdateOpsImpl.java:220)
at controllers.QueryProjects.updateProject(QueryProjects.java:78)
at controllers.DBConnection.TestMongo(DBConnection.java:152)
at penelope.Main.main(Main.java:12)
I was thinking about using delegate/event handler to update each field individually but I'm afraid that might degrade the performance.

You should just use datastore.save(). See an example at http://mongodb.github.io/morphia/1.3/getting-started/quick-tour/

using $addToset with java morphia aggregation

I have mongodb aggregation query and it works perfectly in shell.
How can i rewrite this query to use with morphia ?
org.mongodb.morphia.aggregation.Group.addToSet(String field) accepts only one field name but i need to add object to the set.
Query:
......aggregate([
{$group:
{"_id":"$subjectHash",
"authors":{$addToSet:"$fromAddress.address"},
---->> "messageDataSet":{$addToSet:{"sentDate":"$sentDate","messageId":"$_id"}},
"messageCount":{$sum:1}}},
{$sort:{....}},
{$limit:10},
{$skip:0}
])
Java code:
AggregationPipeline aggregationPipeline = myDatastore.createAggregation(Message.class)
.group("subjectHash",
grouping("authors", addToSet("fromAddress.address")),
--------??????------>> grouping("messageDataSet", ???????),
grouping("messageCount", new Accumulator("$sum", 1))
).sort(...)).limit(...).skip(...);

That's currently not supported but if you'll file an issue I'd be happy to include that in an upcoming release.

Thanks for your answer, I can guess that according to source code. :(
I don't want to use spring-data or java-driver directly (for this project) so I changed my document representation.
Added messageDataSet object which contains sentDate and messageId (and some other nested objects) (these values become duplicated in a document which is a bad design).
Aggregation becomes : "messageDataSet":{$addToSet:"$messageDataSet"},
and Java code is: grouping("messageDataSet", addToSet("messageDataSet")),
This works with moprhia. Thanks.

Cannot find document when searching for field with Id using Java API in ElasticSearch

I have a field which contains forward slashes. I'm trying to execute this Query:
QueryBuildres.termQuery("id", QueryParser.escape("/my/field/val"))
and I cannot get any results. When I'm looking for 'val' only, then I get the proper results. Any ideas why is that happening? Of course without escaping it also doesn't return the results.
UPDATE
so QP.escape parses string properly, but when request goes to elasticsearch it's double escaped
[2015-07-10 01:53:00,063][WARN ][index.search.slowlog.query] [Aaa AA] [index_name][4] took[420.8micros], took_millis[0], types[page], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"query":{"term":{"pageId":"\\/path\\/and\\/testestest"}}}], extra_source[],
UPDATE 2: It works when I'm using querystring, but I wouldn't like to user that and type everything by hand.

You might have to use _id instead of Id

So the reason why I didn't get any results is the default index which I had created.
I didn't specified mapping for my field, so ElasticSearch didn't treated my field.
In ElasticSearch documentation I read, that during the analysis process, elastic search splits the string into words, lower-case them and do some other stuff.
In my case my "/path/in/my/field" was splitted into four fields:
path
in
my
field
So when I was searching for "pageId:/path/in/my/field" I didn't get any results because pageId in fact didn't contained it.
To solve the issue I had to add proper mapping to pageId field, which didn't do any preprocessing (instead of four words, now I have one "/path/in/my/field")
Links to docs:
https://www.elastic.co/guide/en/elasticsearch/guide/current/analysis-intro.html
https://www.elastic.co/guide/en/elasticsearch/guide/current/mapping-intro.html

Empty field in native facet script

I have a native facet script that checks if a specific field (mapped to type long) in the document is empty, this is how I do it:
Object fieldValue = doc().get("fieldName");
return fieldValue == ScriptDocValues.EMPTY;
However, for some of the documents this returns false even when the field is empty (I've checked this with the exists filter). This behavior is inconsistent and it usually returns the correct result. Furthermore, the same document in a different host with the same mapping, same version and same code - returns the correct result.
Is there a better way to check if a field is empty?
I'm using ElasticSearch 0.90.5 with facet script 1.1.2 and java 1.7u17.

The correct way to check for an empty field is:
ScriptDocValues value = (ScriptDocValues) doc().get("field2");
return value.isEmpty();

ElasticSearch - Using FilterBuilders

I am new to ElasticSearch and Couchbase. I am building a sample Java application to learn more about ElasticSearch and Couchbase.
Reading the ElasticSearch Java API, Filters are better used in cases where sort on score is not necessary and for caching.
I still haven't figured out how to use FilterBuilders and have following questions:
Can FilterBuilders be used alone to search?
Or Do they always have to be used with a Query? ( If true, can someone please list an example? )
Going through a documentation, if I want to perform a search based on field values and want to use FilterBuilders, how can I accomplish that? (using AndFilterBuilder or TermFilterBuilder or InFilterBuilder? I am not clear about the differences between them.)
For the 3rd question, I actually tested it with search using queries and using filters as shown below.
I got empty result (no rows) when I tried search using FilterBuilders. I am not sure what am I doing wrong.
Any examples will be helpful. I have had a tough time going through documentation which I found sparse and even searching led to various unreliable user forums.
private void processQuery() {
SearchRequestBuilder srb = getSearchRequestBuilder(BUCKET);
QueryBuilder qb = QueryBuilders.fieldQuery("doc.address.state", "TX");
srb.setQuery(qb);
SearchResponse resp = srb.execute().actionGet();
System.out.println("response :" + resp);
}
private void searchWithFilters(){
SearchRequestBuilder srb = getSearchRequestBuilder(BUCKET);
srb.setFilter(FilterBuilders.termFilter("doc.address.state", "tx"));
//AndFilterBuilder andFb = FilterBuilders.andFilter();
//andFb.add(FilterBuilders.termFilter("doc.address.state", "TX"));
//srb.setFilter(andFb);
SearchResponse resp = srb.execute().actionGet();
System.out.println("response :" + resp);
}
--UPDATE--
As suggested in the answer, changing to lowercase "tx" works. With this question resolved. I still have following questions:
In what scenario(s), are filters used with query? What purpose will this serve?
Difference between InFilter, TermFilter and MatchAllFilter. Any illustration will help.

Right, you should use filters to exclude documents from being even considered when executing the query. Filters are faster since they don't involve any scoring, and cacheable as well.
That said, it's pretty obvious that you have to use a filter with the search api, which does execute a query and accepts an optional filter. If you only have a filter you can just use the match_all query together with your filter. A filter can be a simple one, or a compund one in order to combine multiple filters together.
Regarding the Java API, the names used are the names of the filters available, no big difference. Have a look at this search example for instance. In your code I don't see where you do setFilter on your SearchRequestBuilder object. You don't seem to need the and filter either, since you are using a single filter. Furthermore, it might be that you are indexing using the default mappings, thus the term "TX" is lowercased. That's why when you search using the term filter you don't find any match. Try searching for "tx" lowercased.
You can either change your mapping if you want to keep the "TX" term as it is while indexing, probably setting the field as not_analyzed if it should only be a single token. Otherwise you can change filter, you might want to have a look at a query that is analyzed, so that your query wil be analyzed the same way the content was indexed.
Have a look at the query DSL documentation for more information regarding queries and filters:
MatchAllFilter: matches all your document, not that useful I'd say
TermFilter: Filters documents that have fields that contain a term (not analyzed)
AndFilter: compound filter used to put in and two or more filters
Don't know what you mean by InFilterBuilder, couldn't find any filter with this name.
The query usually contains what the user types in through the text search box. Filters are more way to refine the search, for example clicking on facet entries. That's why you would still have the query plus one or more filters.

To append to what #javanna said:
A lot of confusion can come from the fact that filters can be defined in several ways:
standalone (with a required query, for instance match_all if all you need is the filters) (http://www.elasticsearch.org/guide/reference/api/search/filter/)
or as part of a filtered query (http://www.elasticsearch.org/guide/reference/query-dsl/filtered-query/)
What's the difference you might ask. And indeed you can construct exactly the same logic in both ways.
The difference is that a query operates on BOTH the resultset as well as any facets you have defined. Whereas, a Filter (when defined standalone) only operates on the resultset and NOT on any facets you may have defined (explained here: http://www.elasticsearch.org/guide/reference/api/search/filter/)

To add to the other answers, InFilter is only used with FilterBuilders. The definition is, InFilter: A filter for a field based on several terms matching on any of them.
The query Java API uses FilterBuilders, which is a factory for filter builders that can dynamically create a query from Java code. We do this using a form and we build our query based on user selections from it with checkboxes, options, and dropdowns.
Here is some Example code for FilterBuilders and there is a snippet from that link that uses InFilter as shown below:
FilterBuilder filterBuilder;
User user = (User) auth.getPrincipal();
if (user.getGroups() != null && !user.getGroups().isEmpty()) {
filterBuilder = FilterBuilders.boolFilter()
.should(FilterBuilders.nestedFilter("userRoles", FilterBuilders.termFilter("userRoles.key", auth.getName())))
.should(FilterBuilders.nestedFilter("groupRoles", FilterBuilders.inFilter("groupRoles.key", user.getGroups().toArray())));
} else {
filterBuilder = FilterBuilders.nestedFilter("userRoles", FilterBuilders.termFilter("userRoles.key", auth.getName()));
}
...

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.