Elasticsearch similar documents in Java

I'm building an auction website in Java. One page shows the product being auctioned, and I want to show 10 similar products alongside it.
For search I'm using Elasticsearch through its Java implementation (dadoonet).
One requirement is that the 10 similar documents must all have a date later than now (date > now).
I read the Elasticsearch documentation and found the "More Like This" query, but I can't get it to work with:
new MoreLikeThisRequest("auction").searchSize(size).id(productId + "").fields(new String[] { "name", "description", "brand" }).type("string");
It always fails with this error:
org.elasticsearch.index.engine.DocumentMissingException: [_na][_na] [string][2]: document missing
I also can't find a way to filter by date.
Can someone point me in the right direction?
Thanks.

My best bet is that you have the wrong id, and the type also looks wrong (the error shows a lookup in type "string"). To use More Like This, you have to identify the source document, which is defined by the combination of index, type, and id. If that combination is not right, Elasticsearch cannot find the document, and that is most probably why you get the "document missing" message.
In Java I would do something like this:
// Keep only documents whose date is later than now (the requirement stated in the question).
FilteredQueryBuilder queryBuilder =
    new FilteredQueryBuilder(
        QueryBuilders.matchAllQuery(),
        FilterBuilders.rangeFilter("datefield").gt("now")
    );
SearchSourceBuilder query = SearchSourceBuilder.searchSource().query(queryBuilder);
// Index, type, and id together identify the source document.
client.prepareMoreLikeThis("index", "type", "id")
    .setField("field1", "field2")
    .setSearchSource(query)
    .execute().actionGet();

After struggling a bit, I found someone with the same problem. Their suggestion was to set min_term_freq to 1.
The code now looks like this:
// Note: lt("now") keeps dates before now; the requirement stated above (date > now) would need gt("now").
FilteredQueryBuilder queryBuilder = new FilteredQueryBuilder(
    QueryBuilders.matchAllQuery(),
    FilterBuilders.rangeFilter("finish_date").lt("now"));
SearchSourceBuilder query = SearchSourceBuilder.searchSource().query(queryBuilder);
SearchResponse response = esClient.prepareMoreLikeThis("auction", "product", productId + "")
    .setField("name.name", "description", "brand")
    .setPercentTermsToMatch(0.3f)
    .setMinTermFreq(1)
    .setSearchSource(query)
    .execute().actionGet();
But I don't know what MinTermFreq does, or whether 1 is the right value. Does anyone know what this setting is?
Thanks for all the help, and sorry for all the trouble!
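
On the MinTermFreq question: as far as I know, min_term_freq is the minimum number of times a term has to occur in the source document before More Like This will use it, and it defaults to 2. Short fields such as a product name or brand rarely repeat a word, so with the default there may be no terms left to search with, which would explain why setting it to 1 helps here. The sketch below is plain Java and only illustrates that filtering idea. The class and method names are made up, the product name is hypothetical, and the tokenizing is far simpler than what Elasticsearch's analyzers do.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Illustration only: NOT Elasticsearch's implementation, just the idea behind min_term_freq.
final class MinTermFreqSketch {

    // Counts how often each lowercase word occurs in the text.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Keeps only the terms that occur at least minTermFreq times in the source text.
    static Map<String, Integer> candidateTerms(String text, int minTermFreq) {
        Map<String, Integer> kept = new LinkedHashMap<>(termFrequencies(text));
        kept.values().removeIf(count -> count < minTermFreq);
        return kept;
    }

    public static void main(String[] args) {
        String name = "Vintage leather camera bag"; // hypothetical product name
        System.out.println(candidateTerms(name, 2)); // no word repeats, so nothing is kept
        System.out.println(candidateTerms(name, 1)); // every word is kept
    }
}
```

Whether 1 is the "right" value depends on the data: for short fields it is usually what you need, while for long descriptions a higher value filters out words that appear only once.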

Related

Spring Data MongoDb elemMatch criteria matching all search strings

I'm having an issue with custom Spring Data queries with MongoDb and Java. I'm attempting to implement a flexible search functionality against most of the fields of the document.
This document represents a person, and it contains a set of addresses embedded in it; the address has a field that is a set of strings that are the 'street address lines'.
I started with Query By Example, and this works for the single-valued fields, but it doesn't work for other types, such as this set of strings. For these, I'm building custom criteria.
The search criteria include a set of street lines that I would like to match against the document's lines. If every line in the search is found in the document, the criteria should be considered matching.
I've tried using elemMatch, but this doesn't quite work like I want:
addressCriteriaList.add(Criteria.where("streetAddressLines").elemMatch(new Criteria().in(addressSearch.getStreetAddressLines())));
This seems to match if only ONE line in the document matches the search. If I have the following document:
"streetAddressLines": [ "123 Main Street", "Apt 1" ]
and the search looks like this:
"streetAddressLines": [ "123 Main Street", "Apt 2" ]
the elemMatch succeeds, but that's not what I want.
I've also tried looping through each of the search lines, trying an elemMatch to see if each is in the document:
var addressLinesCriteriaList = new ArrayList<Criteria>();
var streetAddressLines = address.getStreetAddressLines();
streetAddressLines.forEach(l -> addressLinesCriteriaList.add(Criteria.where("streetAddressLines").elemMatch(new Criteria().is(l))));
var matchCriteria = new Criteria().andOperator(addressLinesCriteriaList.toArray(new Criteria[0]));
This doesn't seem to work either. After some experimenting, I suspect the part that fails is new Criteria().is(l) inside elemMatch.
The following DOES seem to work, but I would think it's really inefficient to create a collection for each search line:
streetAddressLines.forEach(l ->
{
    var list = new ArrayList<String>();
    list.add(l);
    addressCriteriaList.add(Criteria.where("streetAddressLines").elemMatch(new Criteria().in(list)));
});
So I don't know exactly what's going on - does anyone have any ideas of what I'm doing wrong? Thanks in advance.

You need to use the $all operator, which is the all method of the Criteria class. Something along these lines:
addressCriteriaList.add(Criteria.where("streetAddressLines").all(addressSearch.getStreetAddressLines()));
If addressSearch.getStreetAddressLines() returns a list, try this:
addressCriteriaList.add(Criteria.where("streetAddressLines").all(addressSearch.getStreetAddressLines().toArray(new String[0])));
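
To see why the elemMatch/in attempt matched the "Apt 2" search while all does not: $in is satisfied as soon as any one array element is among the searched values, whereas $all requires every searched value to be present in the array. The following is a plain-Java analogy of those two rules, not Spring Data or MongoDB code; the class and method names are made up for illustration.

```java
import java.util.List;

// Plain-Java analogy of the two matching rules; it does not talk to MongoDB.
final class AllVersusInSketch {

    // "$in"-like: true when at least one stored line equals one of the searched lines.
    static boolean matchesAny(List<String> stored, List<String> searched) {
        return stored.stream().anyMatch(searched::contains);
    }

    // "$all"-like: true only when every searched line is present among the stored lines.
    static boolean matchesAll(List<String> stored, List<String> searched) {
        return stored.containsAll(searched);
    }

    public static void main(String[] args) {
        List<String> stored = List.of("123 Main Street", "Apt 1");
        List<String> searched = List.of("123 Main Street", "Apt 2");
        System.out.println(matchesAny(stored, searched)); // true
        System.out.println(matchesAll(stored, searched)); // false
    }
}
```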

Script fields in hibernate elasticsearch

I'm using hibernate-search-elasticsearch 5.8.2.Final and I can't figure out how to get script fields:
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-request-script-fields.html
Is there any way to accomplish this functionality?

This is not possible in Hibernate Search 5.8.
In Hibernate Search 5.10 you could get direct access to the REST client, send a REST request to Elasticsearch and get the result as a JSON string that you would have to parse yourself, but it is very low-level and you would not benefit from the Hibernate Search search APIs at all (no query DSL, no managed entity loading, no direct translation entity type => index name, ...).
If you want better support for this feature, don't hesitate to open a ticket on our JIRA, describing in detail what you are trying to achieve and how you would have expected to be able to do it. We are currently working on Search 6.0, which brings a lot of improvements, in particular when it comes to using native features of Elasticsearch, so it just might be something we could slip into our backlog.
EDIT: I forgot to mention that, while you cannot use server-side scripts, you can still get the full source from your documents, and do some parsing in your application to achieve a similar result. This will work even in Search 5.8:
FullTextEntityManager fullTextEm = Search.getFullTextEntityManager(entityManager);
QueryBuilder qb = fullTextEm.getSearchFactory().buildQueryBuilder().forEntity(VideoGame.class).get();
FullTextQuery query = fullTextEm.createFullTextQuery(
        qb.keyword()
            .onField( "tags" )
            .matching( "round-based" )
            .createQuery(),
        VideoGame.class
    )
    .setProjection( ElasticsearchProjectionConstants.SCORE, ElasticsearchProjectionConstants.SOURCE );
// Each result is an Object[] holding the projected values in the order requested above.
List<Object[]> results = query.getResultList();
for (Object[] projection : results) {
    float score = (Float) projection[0];
    String source = (String) projection[1]; // raw _source JSON, to be parsed by the application
}
See this section of the documentation.

MongoCollection : How to get value of nested key

I have some mongo data that looks like this
{
"_id": {
"$oid": "5984cfb276c912dd03c1b052"
},
"idkey": "123",
"objects": [{
"key1": "481334",
"key2": {
"key3":"val3",
"key4": "val4"
}
}]
}
I want to know what the value of key4 is. I also need to filter the results by idkey and key1. So I tried
doc = mongoCollection.find(and(eq("idKey", 123),eq("objects.key1", 481334))).first();
and this works. But I want to check the value of key4 without having to unwrap the entire object. Is there some query I can perform that gives me just the value of key4? Note that I can update the value of key4 with
mongoCollection.updateOne(and(eq("idKey", 123), eq("objects.key1", 481334)),Updates.set("objects.$.key2.key4", "someVal"));
Is there a similar query I can run just to get the value of key4?
Update
Thanks a lot @dnickless for your help. I tried both of your suggestions but I am getting null. Here is what I tried:
existingDoc = mongoCollection.find(and(eq("idkey", 123), eq("objects.key1", 481334))).first();
This gives me:
Document{{_id=598b13ca324fb0717c509e2d, idkey="2323", objects=[Document{{key1="481334", key2=Document{{key3=val3, key4=val4}}}}]}}
So far so good. Next I tried:
mongoCollection.updateOne(and(eq("idkey", "123"), eq("objects.key1", "481334")),Updates.set("objects.$.key2.key4", "newVal"));
Then I tried to get the updated document with:
updatedDoc = mongoCollection.find(and(eq("idkey", "123"),eq("objects.key1","481334"))).projection(Projections.fields(Projections.excludeId(), Projections.include("key4", "$objects.key2.key4"))).first();
For this I got:
Document{{}}
And finally I tried:
updatedDoc = mongoCollection.aggregate(Arrays.asList(Aggregates.match(and(eq("idkey", "123"), eq("objects.key1", "481334"))),
Aggregates.unwind("$objects"), Aggregates.project(Projections.fields(Projections.excludeId(), Projections.computed("key4", "$objects.key2.key4")))))
.first();
and for this I got:
Document{{key4="newVal"}}
So I'm happy :) But can you think of a reason why the first approach did not work?
Final answer
Thanks for the update, @dnickless:
document = collection.find(and(eq("idkey", "123"), eq("objects.key1", "481334"))).projection(fields(excludeId(), include("key4", "objects.key2.key4"))).first();

Your data sample contains a lowercase "idkey" whereas your query uses "idKey". In my examples below, I use the lowercase version. Also you are querying for integers 123 and 481334 as opposed to strings which would be correct looking at your sample data. I'm going for the string version with my below code in order to make it work against the provided sample data.
You have two options:
Either you simply limit your result set but keep the same structure using a simple find + projection:
document = collection.find(and(eq("idkey", "123"), eq("objects.key1", "481334"))).projection(fields(excludeId(), include("objects.key2.key4"))).first();
Or, probably nicer in terms of output (not necessarily speed, though), you use the aggregation framework in order to really just get what you want:
document = collection.aggregate(Arrays.asList(match(and(eq("idkey", "123"), eq("objects.key1", "481334"))), unwind("$objects"), project(fields(excludeId(), computed("key4", "$objects.key2.key4"))))).first();
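
With the first option the result keeps the original nesting, so the value still has to be dug out of objects[0].key2.key4 on the client side. Here is a minimal sketch of that walk. It uses plain java.util Map and List values as stand-ins for the driver's Document (which also implements Map<String, Object>), and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;

// Stand-in sketch: plain Maps and Lists mimic the shape of the projected result.
final class NestedKeySketch {

    // Walks objects[0].key2.key4 and returns null if any level is missing.
    static String key4Of(Map<String, Object> document) {
        if (!(document.get("objects") instanceof List)) {
            return null;
        }
        List<?> objects = (List<?>) document.get("objects");
        if (objects.isEmpty() || !(objects.get(0) instanceof Map)) {
            return null;
        }
        Object key2 = ((Map<?, ?>) objects.get(0)).get("key2");
        if (!(key2 instanceof Map)) {
            return null;
        }
        Object key4 = ((Map<?, ?>) key2).get("key4");
        return key4 instanceof String ? (String) key4 : null;
    }

    public static void main(String[] args) {
        Map<String, Object> projected = Map.of(
                "objects", List.of(Map.of("key2", Map.of("key4", "val4"))));
        System.out.println(key4Of(projected)); // val4
    }
}
```

The aggregation variant avoids this walk because it already flattens the value into a top-level key4 field.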

Java QueryDsl for "update myTable where myColumn in ('interesting', 'values')"?

I'm trying to translate this query in QueryDsl:
update myThings set firstColumn = 'newValue' where secondColumn in ('interesting', 'stuff')
I spent hours looking for documentation, but the Java fu is just not strong enough in this one... :( I can find all kinds of QueryDsl examples, but I can't find any for this. I will probably need SimpleExpression.eqAny(CollectionExpression), but I can't figure out how to build such a CollectionExpression around my simple list of strings.
List<String> interestingValues = Arrays.asList("interesting", "stuff");
queryFactory.update(myThings)
.set(myThings.firstColumn, "newValue")
    // .where(myThings.secondColumn.in(interestingValues))
    // 'in' will probably try to look in table "interestingValues"?
    // .where(myThings.secondColumn.eqAny(interestingValues))
    // 'eqAny' seems interesting, but doesn't accept a list
.execute();
All I can find is API definitions, but then I get lost in generics and other "new" Java concepts which I still have trouble understanding. An example would be very much appreciated.

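You were close: in accepts a plain java.util.Collection, so it compares the column against the values in your list and does not look for a table called interestingValues. You can build the update with new JPAUpdateClause(session, myThings), where session is your EntityManager:
JPAUpdateClause update = new JPAUpdateClause(session, myThings);
update.set(myThings.firstColumn, "newValue")
    .where(myThings.secondColumn.in(interestingValues))
    .execute();
If you are using Hibernate, use HibernateUpdateClause instead.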

How to get facet value for all fields using Java API

I'm new to Elasticsearch and have been querying some sample documents. I issued the following query using the Java API, and it returned the correct result: the names of all categories. Now I want the count for each category name. Could you explain how to do that? I'm sorry for my bad English.
SearchResponse sr = client.prepareSearch()
.addField("Category")
.setQuery(QueryBuilders.matchAllQuery())
.addFacet(FacetBuilders.termsFacet("f")
.field("Category"))
.execute()
.actionGet();

Look at the Count API if you do not need the result set (the matched documents) but only the count.
If you do want the result set, the response to every query or filter request also includes the total number of hits.
