How to assign constant boost value for single field in elasticsearch? - java

There are a lot of scoring/boosting options in Elasticsearch, but I haven't found a way to add a constant boost value for a particular field. If such an option exists, what should the mapping look like? Or maybe there is a way to calculate the score for the entire document depending on which field is hit?

Here is the solution: the "custom_boost_factor" wrapper query, which multiplies the score of the embedded query, whatever its type:
curl -XPOST 'http://localhost:9200/test/entry/_search?pretty=true' -d '{
  "query": {
    "custom_boost_factor": {
      "query": {
        "text_phrase_prefix": {
          "title": "test"
        }
      },
      "boost_factor": 2.0
    }
  }
}'
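The same query can be built with the Java client. A minimal sketch, assuming a 0.90-era transport client where customBoostFactorQuery is still available (custom_boost_factor was later replaced by function_score) and an already initialized Client named client:
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;

// Sketch: text_phrase_prefix is the older name of match_phrase_prefix;
// customBoostFactorQuery multiplies the embedded query's score by 2.0.
SearchResponse response = client.prepareSearch("test")
        .setTypes("entry")
        .setQuery(QueryBuilders.customBoostFactorQuery(
                        QueryBuilders.matchPhrasePrefixQuery("title", "test"))
                .boostFactor(2.0f))
        .execute()
        .actionGet();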

Related

How to access key value in array of objects - java

I'm getting a result from a Cloudant DB, and the response type is a Document object.
This is my query:
FindResult queryResult = cloudantConfig.clientBuilder()
        .postFind(findOptions)
        .execute()
        .getResult();
This is my result from the Cloudant DB:
{
  "bookmark": "Tq2MT8lPzkzJBYqLOZaWZOQXZVYllmTm58UHpSamxLukloFUc8BU41GXBQAtfh51",
  "docs": [
    {
      "sports": [
        {
          "name": "CRICKET",
          "player_access": [
            "All"
          ]
        }
      ]
    }
  ]
}
I'd like to access 'name' and 'player_access', but I can only go as far as 'sports'; I can't get to 'name' or 'player_access'. This is how I attempted to obtain 'name':
queryResult.getDocs().get(0).get("sports").get(0).get("name");
With the above I'm getting an error: The method get(int) is undefined for the type Object.
I do receive values when I go as far as 'sports'.
This is how I obtain sports:
queryResult.getDocs().get(0).get("sports");
When I print the sports object above, I get the result below.
[{name=CRICKET, player_access=[All]}]
So, how do I access 'name' and 'player_access' here? Can somebody help me with this?
I've dealt with JSON values recently, but ended up just using a regex and splitting/matching from there.
You can regex everything from "name" (up to, but not including, the last comma) and do the same for player_access.
Be aware that this is just a workaround and not the best option, but JSON objects in Java can sometimes be tricky.
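Alternatively, the printed form [{name=CRICKET, player_access=[All]}] is exactly what a List of Maps looks like, so plain casts should work without any regex. A minimal sketch, assuming the SDK deserializes nested JSON objects as java.util.List and java.util.Map:
import java.util.List;
import java.util.Map;

// Assumption: Document.get("sports") returns the nested JSON array
// deserialized as a List of Maps, as the printed output suggests.
@SuppressWarnings("unchecked")
List<Map<String, Object>> sports =
        (List<Map<String, Object>>) queryResult.getDocs().get(0).get("sports");

String name = (String) sports.get(0).get("name"); // "CRICKET"
@SuppressWarnings("unchecked")
List<String> playerAccess =
        (List<String>) sports.get(0).get("player_access"); // ["All"]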

Retrieve data from Elasticsearch using aggregations where the values contains hyphen

I have been working with Elasticsearch for quite some time now, and I have been facing a problem recently.
I want to group by a particular column in an Elasticsearch index. The values of that particular column contain hyphens and other special characters.
SearchResponse res1 = client.prepareSearch("my_index")
        .setTypes("data")
        .setSearchType(SearchType.QUERY_AND_FETCH)
        .setQuery(QueryBuilders.rangeQuery("timestamp").gte(from).lte(to))
        .addAggregation(AggregationBuilders.terms("cat_agg").field("category").size(10))
        .setSize(0)
        .execute()
        .actionGet();
Terms termAgg = res1.getAggregations().get("cat_agg");
for (Bucket item : termAgg.getBuckets()) {
    cat_number = item.getKey();
    System.out.println(cat_number + " " + item.getDocCount());
}
This is the query I have written in order to get the data grouped by the "category" column in "my_index".
The output I expected after running the code is:
category-1 10
category-2 9
category-3 7
But the output I am getting is:
category 10
1 10
category 9
2 9
category 7
3 7
I have already gone through some questions like this one, but couldn't solve my issue with those answers.
That's because your category field has the default string mapping and is analyzed, hence category-1 gets tokenized into two tokens, namely category and 1, which explains the results you're getting.
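You can confirm the tokenization with the analyze API. A minimal sketch with the Java admin client (index name as in the question):
import org.elasticsearch.action.admin.indices.analyze.AnalyzeResponse;

// Run "category-1" through the index's analyzer; with the default
// (analyzed) string mapping this prints "category" and "1".
AnalyzeResponse analyzed = client.admin().indices()
        .prepareAnalyze("my_index", "category-1")
        .execute()
        .actionGet();
for (AnalyzeResponse.AnalyzeToken token : analyzed.getTokens()) {
    System.out.println(token.getTerm());
}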
In order to prevent this, you can update your mapping to include a sub-field category.raw, which will be not_analyzed, with the following command:
curl -XPUT localhost:9200/my_index/data/_mapping -d '{
  "properties": {
    "category": {
      "type": "string",
      "fields": {
        "raw": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}'
After that, you need to re-index your data, and your aggregation will work and return what you expect.
Just make sure to change the following line in your Java code to target the new sub-field (note the added .raw):
.addAggregation(AggregationBuilders.terms("cat_agg").field("category.raw").size(10))
When you index "category-1" you will get (by default) two terms, "category" and "1"; therefore, when you aggregate, you will get back two results for it.
If you want it to be considered a single term, then you need to change the analyzer used on that field at index time: set it to the keyword analyzer.
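A sketch of that suggestion via the Java admin client, assuming a 1.x-era client where setSource accepts a raw JSON mapping string (re-indexing is still required afterwards):
// Sketch: map "category" with the keyword analyzer so that
// "category-1" is indexed as a single term.
client.admin().indices().preparePutMapping("my_index")
        .setType("data")
        .setSource("{\"properties\": {\"category\": "
                + "{\"type\": \"string\", \"analyzer\": \"keyword\"}}}")
        .execute()
        .actionGet();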

ElasticSearch index deletion

I'm using the Java API to delete old indices from Elasticsearch.
Client client = searchConnection.client;
DeleteIndexResponse delete = client.admin().indices()
        .delete(new DeleteIndexRequest("location")).actionGet();
During the deletion the cluster goes red for a minute and stops indexing new data (reason: "missing indices/replicas", etc.).
How can I tell Elasticsearch that I'm going to delete these indices, to prevent the red state?
You could use aliases in order to abstract from the real indices underneath. The idea would be to read from an alias and write to an alias instead. That way you can create a new index, swap the write alias to the new index (so that the indexing process is not disrupted) and then delete the old index. Process-wise, it would go like this:
Context: Your current location index has the location_active alias and the indexing process writes to the location_active alias instead of directly to the location index.
Step 1: Create the new location_112015 index
curl -XPUT localhost:9200/location_112015
Step 2: Swap the location_active alias from the "old" location index to the "new" one created in step 1
curl -XPOST 'http://localhost:9200/_aliases' -d '{
  "actions": [
    { "remove": { "index": "location", "alias": "location_active" } },
    { "add": { "index": "location_112015", "alias": "location_active" } }
  ]
}'
Note that this operation is atomic, so if the indexing process keeps sending new documents to location_active, the swap is transparent to it: no docs will be lost and no errors will be raised.
Step 3: Remove the old index
curl -XDELETE localhost:9200/location
Step 4: Rinse and repeat as often as needed
Note: these operations can easily be performed with the Java client library as well.
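For instance, a minimal sketch of steps 1 through 3 with the Java admin client (same index and alias names as above, client already initialized):
// Step 1: create the new index
client.admin().indices().prepareCreate("location_112015")
        .execute().actionGet();

// Step 2: atomically swap the alias from the old index to the new one
client.admin().indices().prepareAliases()
        .removeAlias("location", "location_active")
        .addAlias("location_112015", "location_active")
        .execute().actionGet();

// Step 3: delete the old index; writes to location_active are unaffected
client.admin().indices().prepareDelete("location")
        .execute().actionGet();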

How to search in elasticsearch similar to MySQL LIKE? For example, I have to search on name. Can anyone tell me the complete request?

curl localhost:9200/posts/tweet/_search?pretty -d '{
  "query": {
    "fuzzy_like_this": {
      "like_text": "ajma"
    }
  }
}'
I am not understanding how this is executing.
EDIT
I am getting some irrelevant results: 3 names come back, ["jolly ajmani", "mahammad jama", "bashir ahmad"], and I don't understand why "bashir ahmad" is in the result.
I don't know if you are asking why it isn't executing, or whether this query will behave like MySQL's LIKE. Quoted from Elasticsearch: "Fuzzy like this query finds documents that are 'like' the provided text by running it against one or more fields."
Have a look at the docs regarding this type of query for more information.
Try this -
{
  "query": {
    "fuzzy_like_this": {
      "like_text": "ajmani",
      "max_query_terms": 12
    }
  }
}
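The equivalent with the Java client would look roughly like this. A sketch, assuming a 1.x-era client (fuzzy_like_this and its builder were removed in later versions); restricting the query to the name field keeps unrelated fields from matching:
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;

// Sketch: fuzzy_like_this against the "name" field only (1.x-era API).
SearchResponse response = client.prepareSearch("posts")
        .setTypes("tweet")
        .setQuery(QueryBuilders.fuzzyLikeThisQuery("name")
                .likeText("ajmani")
                .maxQueryTerms(12))
        .execute()
        .actionGet();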

elasticsearch match/term query not returning exact match

I am using Elasticsearch in my Java project, with a document mapping like:
/index/type/_mapping
{
  "my_id": "string"
}
Now, suppose the my_id values are
A01, A02, A01.A1, A012.AB0
For the query,
{
  "query": {
    "term": {
      "my_id": "a01"
    }
  }
}
Observed: the documents returned are those for A01, A01.A1, and A012.AB0.
Expected: I need the A01 document only.
I looked for a solution and found that I would have to use a custom analyzer for the my_id field, but I do not want to change the document's mapping.
I also tried "index": "not_analyzed" in the query, but there was no change in the output.
Yes, you could map the field as 'not_analyzed', but also try using a term filter instead of a term query.
Also check the current mapping of the document.
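A sketch of that suggestion with a 1.x-era Java client ("index" is a placeholder name). Note that a term filter does not analyze its input, so an exact match on A01 still requires my_id to be mapped not_analyzed; otherwise the indexed tokens are already lowercased and split:
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.FilterBuilders;
import org.elasticsearch.index.query.QueryBuilders;

// Sketch: term filter wrapped in a filtered query (1.x-era API).
SearchResponse response = client.prepareSearch("index")
        .setQuery(QueryBuilders.filteredQuery(
                QueryBuilders.matchAllQuery(),
                FilterBuilders.termFilter("my_id", "A01")))
        .execute()
        .actionGet();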
