I tried a few example codes for the suggester feature of ElasticSearch that I found on the net, but I couldn't solve my autocomplete problem.
my index:
client.prepareIndex("kodcucom", "article", "1")
.setSource(putJsonDocument("ElasticSearch: Java",
"ElasticSeach provides Java API, thus it executes all operations " +
"asynchronously by using client object..",
new Date(),
new String[]{"elasticsearch"},
"Hüseyin Akdoğan")).execute().actionGet();
and I used SuggestBuilder to obtain the keyword and then scan through the "content" field, and here is where the NullPointerException occurs, because there is no result:
CompletionSuggestionBuilder skillNameSuggest = new CompletionSuggestionBuilder("skillNameSuggest");
skillNameSuggest.text("lien");
skillNameSuggest.field("content");
SuggestRequestBuilder suggestRequestBuilder = client.prepareSuggest("kodcucom").addSuggestion(skillNameSuggest);
SuggestResponse suggestResponse = suggestRequestBuilder.execute().actionGet();
Iterator<? extends Suggest.Suggestion.Entry.Option> iterator =
suggestResponse.getSuggest().getSuggestion("skillNameSuggest").iterator().next().getOptions().iterator();
Am I missing some filter or input criterion needed to get a result? Any result would be fine, whether an autocomplete suggestion or a found record.
EDIT 1:
This is where I get the NPE; in debug mode I can see that no result at all is returned in suggestResponse:
Iterator<? extends Suggest.Suggestion.Entry.Option> iterator =
suggestResponse.getSuggest().getSuggestion("skillNameSuggest").iterator().next().getOptions().iterator();
EDIT 2:
I am using version 2.1.1 of the ElasticSearch Java API.
EDIT 3:
I tried splitting the iterator line up into several statements; the NPE occurs at the last line, when obtaining the iterator from the suggestion, but that didn't help much:
Suggest tempSuggest = suggestResponse.getSuggest();
Suggestion tempSuggestion = tempSuggest.getSuggestion("skillNameSuggest");
Iterator tempIterator = tempSuggestion.iterator();
I can see that the code:
SuggestRequestBuilder suggestRequestBuilder = client.prepareSuggest("kodcucom").addSuggestion(skillNameSuggest);
SuggestResponse suggestResponse = suggestRequestBuilder.execute().actionGet();
already yields an empty array/dataset. Am I using the suggest request builder incorrectly?
In order to use the completion feature, you need to dedicate one field, which will be called completion, and you have to specify a special mapping for it.
For example:
"mappings": {
"article": {
"properties": {
"content": {
"type": "string"
},
"completion_suggest": {
"type": "completion"}
}
}
}
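If you create the index from the Java client instead of curl, the same mapping can be passed as a JSON string; a minimal sketch against the 2.x Java API, assuming the kodcucom index does not exist yet:

// Sketch: create the index and register the completion mapping (2.x Java API).
// The mapping string mirrors the JSON example above.
String mapping = "{\"article\":{\"properties\":{"
        + "\"content\":{\"type\":\"string\"},"
        + "\"completion_suggest\":{\"type\":\"completion\"}}}}";
client.admin().indices().prepareCreate("kodcucom")
        .addMapping("article", mapping)
        .execute().actionGet();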
The completion_suggest field is the field we will use for the autocomplete function in the above code sample. After this mapping definition, the data must be indexed as follows:
curl -XPOST localhost:9200/kodcucom/article/1 -d '{
"content": "elasticsearch",
"completion_suggest": {
"input": [ "es", "elastic", "elasticsearch" ],
"output": "ElasticSearch"
}
}'
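The same document can also be indexed from the Java client; a sketch using XContentBuilder, equivalent to the curl call above:

// Sketch: index a document whose completion_suggest field feeds the suggester.
XContentBuilder doc = XContentFactory.jsonBuilder()
        .startObject()
            .field("content", "elasticsearch")
            .startObject("completion_suggest")
                .array("input", "es", "elastic", "elasticsearch")
                .field("output", "ElasticSearch")
            .endObject()
        .endObject();
client.prepareIndex("kodcucom", "article", "1").setSource(doc).execute().actionGet();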
Then the Java API can be used as follows to get suggestions:
CompletionSuggestionBuilder skillNameSuggest = new CompletionSuggestionBuilder("complete");
skillNameSuggest.text("es");
skillNameSuggest.field("completion_suggest");
SearchResponse searchResponse = client.prepareSearch("kodcucom")
.setTypes("article")
.setQuery(QueryBuilders.matchAllQuery())
.addSuggestion(skillNameSuggest)
.execute().actionGet();
CompletionSuggestion compSuggestion = searchResponse.getSuggest().getSuggestion("complete");
List<CompletionSuggestion.Entry> entryList = compSuggestion.getEntries();
if(entryList != null) {
CompletionSuggestion.Entry entry = entryList.get(0);
List<CompletionSuggestion.Entry.Option> options = entry.getOptions();
if(options != null) {
CompletionSuggestion.Entry.Option option = options.get(0);
System.out.println(option.getText().string());
}
}
The following link provides the details of how to create a suggester index: https://www.elastic.co/blog/you-complete-me
Now, I use the asynchronous SuggestionBuilder Java API to generate suggestions based on terms.
SearchRequestBuilder suggestionsExtractor = elasticsearchService.suggestionsExtractor("yourIndexName", "yourIndexType" /* not necessary */, "name_suggest", term);
System.out.println(suggestionsExtractor);
Map<String,Object> suggestionMap = new HashMap<>();
suggestionsExtractor.execute(new ActionListener<SearchResponse>() {
@Override
public void onResponse(SearchResponse searchResponse) {
if(searchResponse.status().equals(RestStatus.OK)) {
searchResponse.getSuggest().getSuggestion("productsearch").getEntries().forEach(e -> {
e.getOptions().forEach(s -> {
ArrayList<Object> contents = new ArrayList<>();
suggestionMap.put(s.getText().string(), s.getScore());
});
});
}
}
@Override
public void onFailure(Exception e) {
Helper.sendErrorResponse(routingContext,new JsonObject().put("details","internal server error"));
e.printStackTrace();
}
});
The following is how the suggestion builder is created:
public SearchRequestBuilder suggestionsExtractor(String indexName, String typeName, String field, String term) {
CompletionSuggestionBuilder csb = SuggestBuilders.completionSuggestion(field).text(term);
SearchRequestBuilder suggestBuilder = client.prepareSearch()
.suggest(new SuggestBuilder().addSuggestion(indexName, csb));
return suggestBuilder;
}
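For a synchronous call, the same builder can be executed directly. Note that addSuggestion(indexName, csb) above registers the suggestion under the index name, so it has to be fetched by that same name; a hypothetical usage sketch, reusing the names assumed above:

// Hypothetical usage; the index and field names are assumptions carried over from above.
SearchRequestBuilder builder = elasticsearchService.suggestionsExtractor("yourIndexName", null, "name_suggest", "es");
SearchResponse response = builder.execute().actionGet();
response.getSuggest().getSuggestion("yourIndexName").getEntries()
        .forEach(e -> e.getOptions()
                .forEach(o -> System.out.println(o.getText().string() + " " + o.getScore())));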
Related
I have a spring-data-mongodb application in Java or Kotlin, and I need to create a text search request to MongoDB through the Spring template.
In the mongo shell it looks like this:
db.stores.find(
{ $text: { $search: "java coffee shop" } },
{ score: { $meta: "textScore" } }
).sort( { score: { $meta: "textScore" } } )
I have already tried to do something, but it is not exactly what I need:
override fun getSearchedFiles(searchQuery: String, pageNumber: Long, pageSize: Long, direction: Sort.Direction, sortColumn: String): MutableList<SystemFile> {
val matching = TextCriteria.forDefaultLanguage().matching(searchQuery)
val match = MatchOperation(matching)
val sort = SortOperation(Sort(direction, sortColumn))
val skip = SkipOperation((pageNumber * pageSize))
val limit = LimitOperation(pageSize)
val aggregation = Aggregation
.newAggregation(match, skip, limit)
.withOptions(Aggregation.newAggregationOptions().allowDiskUse(true).build())
val mappedResults = template.aggregate(aggregation, "files", SystemFile::class.java).mappedResults
return mappedResults
}
Maybe someone has already worked with text search on MongoDB in Java; please share your knowledge with us :)
Setup Text indexes
First you need to set up text indexes on the fields on which you want to perform your text query.
If you are using Spring Data Mongo to insert your documents into your database, you can use the @TextIndexed annotation and the indexes will be built while inserting your documents.
@Document
class MyObject{
    @TextIndexed(weight=3) String title;
    @TextIndexed String description;
}
If your documents are already inserted in your database, you need to build your text indexes manually:
TextIndexDefinition textIndex = new TextIndexDefinitionBuilder()
.onField("title", 3)
.onField("description")
.build();
After the build and configuration of your mongoTemplate, you can pass in your text indexes:
template.indexOps(MyObject.class).ensureIndex(textIndex);
Building your text query
List<MyObject> getSearchedFiles(String textQuery){
    TextQuery query = TextQuery.queryText(new TextCriteria().matchingAny(textQuery)).sortByScore();
    List<MyObject> result = mongoTemplate.find(query, MyObject.class, "myCollection");
    return result;
}
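If you also need the paging and score sorting from the original Kotlin snippet, TextQuery is a regular Query, so skip and limit can be chained onto it. A minimal sketch, assuming the same pageNumber/pageSize parameters as the question (the method name is hypothetical):

// Sketch: text search sorted by textScore, with manual paging.
List<MyObject> getSearchedFilesPaged(String text, long pageNumber, long pageSize){
    TextQuery query = TextQuery.queryText(TextCriteria.forDefaultLanguage().matching(text))
            .sortByScore();
    query.skip((int) (pageNumber * pageSize)).limit((int) pageSize);
    return mongoTemplate.find(query, MyObject.class, "myCollection");
}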
I'm on RavenDB 3.5.35183. I have a type:
import com.mysema.query.annotations.QueryEntity;
@QueryEntity
public class CountryLayerCount
{
public String countryName;
public int layerCount;
}
and the following query:
private int getCountryLayerCount(String countryName, IDocumentSession currentSession)
{
QCountryLayerCount countryLayerCountSurrogate = QCountryLayerCount.countryLayerCount;
IRavenQueryable<CountryLayerCount> levelDepthQuery = currentSession.query(CountryLayerCount.class, "CountryLayerCount/ByName").where(countryLayerCountSurrogate.countryName.eq(countryName));
CountryLayerCount countryLayerCount = new CountryLayerCount();
try (CloseableIterator<StreamResult<CountryLayerCount>> results = currentSession.advanced().stream(levelDepthQuery))
{
while(results.hasNext())
{
StreamResult<CountryLayerCount> srclc = results.next();
System.out.println(srclc.getKey());
CountryLayerCount clc = srclc.getDocument();
countryLayerCount = clc;
break;
}
}
catch(Exception e)
{
}
return countryLayerCount.layerCount;
}
The query executes successfully, and shows the correct ID for the document I'm retrieving (e.g. "CountryLayerCount/123"), but its data members are both null. The where clause also works fine, the country name is used to retrieve individual countries. This is so simple, but I can't see where I've gone wrong. The StreamResult contains the correct key, but getDocument() doesn't work - or, rather, it doesn't contain an object. The collection has string IDs.
In the db logger, I can see the request coming in:
Receive Request # 29: GET - geodata - http://localhost:8888/databases/geodata/streams/query/CountryLayerCount/ByName?&query=CountryName:Germany
Request # 29: GET - 22 ms - geodata - 200 - http://localhost:8888/databases/geodata/streams/query/CountryLayerCount/ByName?&query=CountryName:Germany
which, when plugged into the browser, correctly gives me:
{"Results":[{"countryName":"Germany","layerCount":5,"#metadata":{"Raven-Entity-Name":"CountryLayerCounts","Raven-Clr-Type":"DbUtilityFunctions.CountryLayerCount, DbUtilityFunctions","#id":"CountryLayerCounts/212","Temp-Index-Score":0.0,"Last-Modified":"2018-02-03T09:41:36.3165473Z","Raven-Last-Modified":"2018-02-03T09:41:36.3165473","#etag":"01000000-0000-008B-0000-0000000000D7","SerializedSizeOnDisk":164}}
]}
The index definition:
from country in docs.CountryLayerCounts
select new {
CountryName = country.countryName
}
AFAIK, one doesn't have to index all the fields of the object to retrieve it in its entirety, right? In other words, I just need to index the field(s) used to find the object, not all the fields I want to retrieve; at least that was my understanding...
Thanks!
The problem is related to incorrect casing.
For example:
try (IDocumentSession session = store.openSession()) {
    CountryLayerCount c1 = new CountryLayerCount();
    c1.layerCount = 5;
    c1.countryName = "Germany";
    session.store(c1);
    session.saveChanges();
}
Is saved as:
{
"LayerCount": 5,
"CountryName": "Germany"
}
Please notice that we use upper-case letters in the JSON for property names (this only applies to 3.x versions).
So in order to make it work, please update the JSON property names and edit your index:
from country in docs.CountryLayerCounts
select new {
CountryName = country.CountryName
}
Btw, if you have a per-country aggregation, then you can simply query using:
QCountryLayerCount countryLayerCountSurrogate =
QCountryLayerCount.countryLayerCount;
CountryLayerCount levelDepthQuery = currentSession
.query(CountryLayerCount.class, "CountryLayerCount/ByName")
.where(countryLayerCountSurrogate.countryName.eq(countryName))
.single();
Data in mongo (screenshot omitted): each document has a contents field plus comments and comments2 arrays whose items each have a numeric score. I run this aggregation:
db.test2.aggregate([
{
"$project" : {
"contents" : 1,
"comments" : {
"$filter" : {
"input" : "$comments",
"as" : "item",
"cond" : {"$gt" : ['$$item.score', 2]}
},
},
"comments2" : {
"$filter" : {
"input" : "$comments2",
"as" : "item",
"cond" : {"$gt" : ["$$item.score", 5]}
}
}
}
},
{
"$project" : {
"content" : 1,
"commentsTotal" : {
"$reduce" : {
"input" : "$comments",
"initialValue" : 0,
"in" : {"$add" : ["$$value", "$$this.score"]}
}
},
"comments2Total" : {
"$reduce" : {
"input" : "$comments2",
"initialValue" : 0,
"in" : {"$add" : ["$$value", "$$this.score"]}
}
}
}
},
{$skip : 0},
{$limit: 3}
]);
So you can see, this does the following:
1. Filters comments down to the items whose score is greater than 2, and comments2 down to the items whose score is greater than 5.
2. Computes the total of the scores in each comment array.
And I write the aggregation query in Spring like this:
AggregationExpression reduce = ArithmeticOperators.Add.valueOf("$$value").add("$$this.score");
Aggregation aggregation = Aggregation.newAggregation(
Aggregation.project().andExclude("_id")
.andInclude("content")
.and("comments").filter("item", ComparisonOperators.Gt.valueOf("item.score").greaterThanValue(3)).as("comments")
.and("comments2").filter("item", ComparisonOperators.Gt.valueOf("item.score").greaterThanValue(3)).as("comments2"),
Aggregation.project("comments", "comments2")
.and(ArrayOperators.Reduce.arrayOf("comments").withInitialValue("0").reduce(reduce)).as("commentsTotal")
);
When I run this, it throws an exception:
java.lang.IllegalArgumentException: Invalid reference '$$value'!
You can try the aggregation below, which wraps the $filter inside the $reduce operation. Something like this:
AggregationExpression reduce1 = new AggregationExpression() {
@Override
public DBObject toDbObject(AggregationOperationContext aggregationOperationContext) {
DBObject filter = new BasicDBObject("$filter", new BasicDBObject("input", "$comments").append("as", "item").append("cond",
new BasicDBObject("$gt", Arrays.<Object>asList("$$item.score", 2))));
DBObject reduce = new BasicDBObject("input", filter).append("initialValue", 0).append("in", new BasicDBObject("$add", Arrays.asList("$$value", "$$this.score")));
return new BasicDBObject("$reduce", reduce);
}
};
Aggregation aggregation = newAggregation(
Aggregation.project().andExclude("_id")
.andInclude("content")
.and(reduce1).as("commentsTotal")
);
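Whichever way the expression is built, running it is the same; a sketch, assuming the test2 collection from the shell example and raw BasicDBObject results:

// Sketch: execute the pipeline and print the raw mapped results.
AggregationResults<BasicDBObject> results =
        mongoTemplate.aggregate(aggregation, "test2", BasicDBObject.class);
results.getMappedResults().forEach(System.out::println);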
This is an old question, but in case some one winds up here like me, here's how I was able to solve it.
You cannot access "$$this" and "$$value" variables directly like this in spring.
AggregationExpression reduce = ArithmeticOperators.Add.valueOf("$$value").add("$$this.socre");
To do this, we have to use the Reduce variable enum, like this:
AggregationExpression reduce = ArithmeticOperators.Add.valueOf(ArrayOperators.Reduce.Variable.VALUE.getTarget()).add(ArrayOperators.Reduce.Variable.THIS.referringTo("score").getTarget());
Hope this helps!
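Putting it together with the $filter stage from the question, here is a sketch of computing the first total using only the typed builders (field names taken from the question):

AggregationExpression scoreSum = ArithmeticOperators.Add
        .valueOf(ArrayOperators.Reduce.Variable.VALUE.getTarget())
        .add(ArrayOperators.Reduce.Variable.THIS.referringTo("score").getTarget());
Aggregation aggregation = Aggregation.newAggregation(
        Aggregation.project("contents")
                .and("comments")
                .filter("item", ComparisonOperators.Gt.valueOf("item.score").greaterThanValue(2))
                .as("comments"),
        Aggregation.project("contents")
                .and(ArrayOperators.Reduce.arrayOf("comments")
                        .withInitialValue(0)
                        .reduce(scoreSum))
                .as("commentsTotal"));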
I had to solve the following task and couldn't find any solutions, so I hope my answer will help somebody.
A user has a list of rights plus a list of roles, and each role has its own list of rights; I needed to find the user's full list of rights (the user structure and role structure screenshots are omitted here).
First, I $lookup the roles into roleDto (for example), then I collect the rights from the roles into one list:
ArrayOperators.Reduce reduce = ArrayOperators.Reduce.arrayOf("$roleDto.rights")
.withInitialValue(new ArrayList<>())
.reduce(ArrayOperators.ConcatArrays.arrayOf("$$value").concat("$$this"));
As a result, the reduce gives me this single list of rights collected from the roles.
After that I do:
SetOperators.SetUnion.arrayAsSet(reduce).union("$rights")
using the previous result. The result type is AggregationExpression, because AbstractAggregationExpression implements AggregationExpression.
So, finally, I get something like this (sorry for the messy code):
private static AggregationExpression getAllRightsForUser() {
// concat rights from list of roles (each role have list of rights) - list of list to list
ArrayOperators.Reduce reduce = ArrayOperators.Reduce.arrayOf("$roleDto.rights")
.withInitialValue(new ArrayList<>())
.reduce(ArrayOperators.ConcatArrays.arrayOf("$$value").concat("$$this"));
// union result with user.rights
return SetOperators.SetUnion.arrayAsSet(reduce).union("$rights");
}
The result of this operation can finally be used somewhere like this ;):
public static AggregationOperation addFieldOperation(AggregationExpression aggregationExpression, String fieldName) {
return aoc -> new Document("$addFields", new Document(fieldName, aggregationExpression.toDocument(aoc)));
}
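A hypothetical wiring of these helpers into a pipeline; the lookup arguments ("roles", "roleIds") and the target field name ("allRights") are assumptions for illustration:

// Hypothetical pipeline: join roles, then add the union of all rights as a field.
Aggregation aggregation = Aggregation.newAggregation(
        Aggregation.lookup("roles", "roleIds", "_id", "roleDto"),
        addFieldOperation(getAllRightsForUser(), "allRights"));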
I had the same issue; one of the solutions is to create a custom reduce function. Here's a union example:
public class SetUnionReduceExpression implements AggregationExpression {
@Override
public Document toDocument(AggregationOperationContext context) {
return new Document("$setUnion", ImmutableList.of("$$value", "$$this"));
}
}
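A sketch of plugging this custom expression into the $reduce from the previous answer:

// Sketch: reduce the role rights into one set using the custom expression above.
AggregationExpression allRights = ArrayOperators.Reduce.arrayOf("$roleDto.rights")
        .withInitialValue(new ArrayList<>())
        .reduce(new SetUnionReduceExpression());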
I'm trying to use json.simple to get things from this json file:
"Main": {
"Part1":{
"Length": 2,
"Flags": 2,
"Sequence": 4
},
"Part2":{
"Length": 2,
"Type":2,
"Main_Dest":4,
"Main_Source":4,
"Sequence":4,
"Data": {
"1":12,
"2":24
},
"Blank": 8
}
}
Basically, I want to reach the "Type" value in Part2, and on the way add up all the values; meaning in the end I want to have the sum 10 (Length+Flags+Sequence+Length) and the number 2 for the "Type" value. My main problem here is that I have to do it generically, so I can't just collect the values by name, because they might change or additional values could be added. Only the "Type" value will always be called exactly that.
What I've done so far is this:
private static void parseJson() {
String path = "...config.json";
boolean count = false;
int sum = 0;
try {
FileReader reader = new FileReader(path);
JSONParser jsonParser = new JSONParser();
JSONObject jsonObject = (JSONObject) jsonParser.parse(reader);
jsonObject.entrySet();
JSONObject main = (JSONObject) jsonObject.get("Main");
for (Iterator iterator = main.keySet().iterator(); iterator.hasNext();){
String key = (String) iterator.next();
//this is where I'm stumped. Do I keep going into the JSONObject until I get to a value?
if (count){
sum += (int) main.get(key);
}
if (key.equals("Type")){
count = true;
}
}
System.out.println(sum);
} catch (Exception e) {
}
}
Obviously I don't really know what I'm doing here. How do I iterate over the lowest level of the JSON file?
As a little side question: which JSON parser library should I use if I might sell my software? In other words, which one doesn't cause licensing issues?
You can iterate over the keys recursively, but you can't reliably calculate the sum; the result will be unpredictable, because jsonObject.keySet() does not guarantee that the keys are returned in the same order as they appear in the file.
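Finding the "Type" value itself is order-independent, though, so a recursive lookup is safe with json.simple (JSONObject is just a Map); a minimal sketch with a hypothetical findKey helper:

// Hypothetical helper: depth-first search for a key anywhere in the parsed tree.
static Object findKey(Object node, String key) {
    if (node instanceof JSONObject) {
        JSONObject obj = (JSONObject) node;
        if (obj.containsKey(key)) {
            return obj.get(key);
        }
        for (Object child : obj.values()) {
            Object found = findKey(child, key);
            if (found != null) {
                return found;
            }
        }
    }
    return null;
}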
For the order-dependent sum, use the streaming API for JSON instead.
I have added the missing curly braces to fix your input.
{
"Main": {
"Part1":{
"Length": 2,
"Flags": 2,
"Sequence": 4
},
"Part2":{
"Length": 2,
"Type":2,
"Main_Dest":4,
"Main_Source":4,
"Sequence":4,
"Data": {
"1":12,
"2":24
},
"Blank": 8
}
}
}
This example shows how to use the streaming API.
// -*- compile-command: "javac -cp javax.json-1.0.jar q43737601.java && java -cp .:javax.json-1.0.jar q43737601"; -*-
import java.io.FileReader;
import javax.json.Json;
import javax.json.stream.JsonParser;
class q43737601
{
public static void main (String argv[]) throws Exception
{
String path = "config.json";
int sum = 0;
JsonParser p = Json.createParser (new FileReader (path));
while (p.hasNext()) {
JsonParser.Event e = p.next();
switch (e) {
case VALUE_NUMBER:
sum += Integer.parseInt(p.getString());
break;
case KEY_NAME:
if ("Type".equals(p.getString()))
System.out.println(sum);
break;
}
}
}
}
If you run it, it displays 10. The example sums up all numbers up to a key called "Type".
I tried the above example with OpenJDK. It was necessary to follow the steps explained in this answer. I had to set the class path (-cp) in the compile command.
I use the spring-data-elasticsearch framework to get query results from an Elasticsearch server; the Java code looks like this:
public void testQuery() {
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withFields("createDate","updateDate").withQuery(matchAllQuery()).withPageable(new PageRequest(0,Integer.MAX_VALUE)).build();
List<Entity> list = template.queryForList(searchQuery, Entity.class);
for (Entity e : list) {
System.out.println(e.getCreateDate());
System.out.println(e.getUpdateDate());
}
}
I can see the raw query in the server log:
{"from":0,"size":10,"query":{"match_all":{}},"fields":["createDate","updateDate"]}
As per the query log, spring-data-elasticsearch adds a size limit to the query ("from":0, "size":10). How can I stop it from adding this size limit?
You don't want to do this. You could use the findAll functionality on a repository that returns an Iterable, but I think the best way to obtain all items is to use the scan/scroll functionality. Maybe the following code block can put you in the right direction:
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(QueryBuilders.matchAllQuery())
.withIndices("customer")
.withTypes("customermodel")
.withSearchType(SearchType.SCAN)
.withPageable(new PageRequest(0, NUM_ITEMS_PER_SCROLL))
.build();
String scrollId = elasticsearchTemplate.scan(searchQuery, SCROLL_TIME_IN_MILLIS, false);
boolean hasRecords = true;
while (hasRecords) {
Page<CustomerModel> page = elasticsearchTemplate.scroll(scrollId, SCROLL_TIME_IN_MILLIS, CustomerModel.class);
if (page != null) {
// DO something with the records
hasRecords = (page.getContent().size() == NUM_ITEMS_PER_SCROLL);
} else {
hasRecords = false;
}
}
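When the loop finishes, it is good practice to free the server-side scroll context; a one-line sketch, assuming this version of ElasticsearchTemplate exposes clearScroll:

// Release the scroll context once all pages have been consumed.
elasticsearchTemplate.clearScroll(scrollId);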