Score of each hit with spring SearchQuery ElasticSearch - java

I'm trying to read and use the individual _score of each hit when searching with a SearchQuery. Among other things, I want to know what range of scores my searches produce. But apart from setting a minimum score with searchQuery.withMinScore(float), I can't find any method for handling the scores of a search.
@Override
public Page<Website> listsearch(SearchBody searchBody, int size, int page) {
BoolQueryBuilder qb = QueryBuilders.boolQuery();
for(SearchUnit unit:searchBody.getSearchBody()){
if(unit.isPriority()) {
qb.must(matchQuery("_all", unit.getWord()).operator(MatchQueryBuilder.Operator.AND)
.fuzziness(Fuzziness.AUTO));
}else {
qb.should(termQuery("_all", unit.getWord())
.boost(unit.getWeight()));
}
}
for(SearchUnit ExUnit:searchBody.getExcludeBody()){
qb.mustNot(matchPhraseQuery("_all",ExUnit.getWord()));
}
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withIndices("websites_v1")
.withTypes("website")
.withQuery(qb)
.withMinScore(0.05F)//Magical minscore
.withPageable(new PageRequest(page, size))
.build();
Page<Website> search = searchRepository.search(searchQuery);
return search;
}
The search function used is from org.springframework.data.elasticsearch.repository; defined as
Page<T> search(SearchQuery var1);
So my question is: is there any way I can access the score of each returned object in the Page? Or do I need to switch my query method to something else to achieve that?

This is not possible with the Spring Data ElasticSearch repositories.
You need to autowire an EntityMapper and an ElasticsearchTemplate and extract the score yourself. Something like this should work:
final Pageable pageRequest = new PageRequest(0, 10);
Page<Website> result = elasticSearchTemplate.query(searchQuery, new ResultsExtractor<Page<Website>>() {
    @Override
    public Page<Website> extract(SearchResponse response) {
        List<Website> content = new ArrayList<>();
        for (SearchHit hit : response.getHits().getHits()) {
            try {
                // EntityMapper maps the JSON source of the hit, not the SearchHit itself
                Website website = entityMapper.mapToObject(hit.getSourceAsString(), Website.class);
                content.add(website);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            float documentScore = hit.getScore(); // <---- score of a hit
        }
        return new PageImpl<Website>(content, pageRequest, response.getHits().getTotalHits());
    }
});
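The extractor above reads each hit's score but then discards it. One way to keep it, sketched here on the assumption that a small holder class is acceptable, is to pair every mapped entity with its score and summarise the range afterwards. Scored and scoreRange are hypothetical names, not part of Spring Data Elasticsearch, and the strings with made-up scores stand in for Website objects and hit.getScore() values:

```java
import java.util.ArrayList;
import java.util.DoubleSummaryStatistics;
import java.util.List;

public class ScoredHits {

    /** Hypothetical holder pairing a mapped entity with the score of its hit. */
    static final class Scored<T> {
        final T value;
        final float score;

        Scored(T value, float score) {
            this.value = value;
            this.score = score;
        }
    }

    /** Summarises the scores (min, max, average, count) of the collected hits. */
    static <T> DoubleSummaryStatistics scoreRange(List<Scored<T>> hits) {
        return hits.stream().mapToDouble(h -> h.score).summaryStatistics();
    }

    public static void main(String[] args) {
        // Inside the extractor this would be: new Scored<>(website, hit.getScore())
        List<Scored<String>> hits = new ArrayList<>();
        hits.add(new Scored<>("site-a", 1.5f));
        hits.add(new Scored<>("site-b", 0.25f));
        hits.add(new Scored<>("site-c", 0.75f));

        DoubleSummaryStatistics range = scoreRange(hits);
        System.out.println("min=" + range.getMin() + " max=" + range.getMax());
    }
}
```

Returning a Page of such holders instead of a Page<Website> would let the caller see both the entity and its score, at the cost of a wrapper type in the method signature.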

Related

How to return one random element by Query

I'm trying to return random element in Spring using Query.
I have this:
@Override
public List<AdventureHolidays> findRandomTrekking() {
Query query = new Query();
query.addCriteria(Criteria.where("typeOfAdventureHolidays").is("trekking"));
return mongoTemplate.find(query, AdventureHolidays.class);
}
But this returns all elements that match my criteria.
I tried:
return mongoTemplate.findOne(query, AdventureHolidays.class);
but then I get a type error: required List, provided AdventureHolidays.
I also tried the following, but this way elements sometimes appear twice:
@Aggregation(pipeline = {"{'$match':{'typeOfAdventureHolidays':'trekking'}}", "{$sample:{size:1}}"})
So I found a way with this Query, but it lists all documents, while I want just one random document from the collection.
After some discussion this is what OP asked for:
private static Queue<AdventureHolidays> elementsToReturn = new LinkedList<>();
public AdventureHolidays findRandomTrekking() {
if (elementsToReturn.size() == 0) { //fetch data from db
Query query = new Query();
query.addCriteria(Criteria.where("typeOfAdventureHolidays")
.is("trekking"));
List<AdventureHolidays> newData = mongoTemplate.find(query, AdventureHolidays.class);
Collections.shuffle(newData);
elementsToReturn.addAll(newData);
}
return elementsToReturn.poll(); //returns null if no document matches the query
}
Original answer.
You need to change return type of a method:
public AdventureHolidays findRandomTrekking() {
Query query = new Query();
query.addCriteria(Criteria.where("typeOfAdventureHolidays").is("trekking"));
return mongoTemplate.findOne(query, AdventureHolidays.class);
}
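The shuffle-and-drain idea from the first snippet of this answer can be isolated from MongoDB. The following is a minimal sketch, assuming the query can be passed in as a Supplier. ShuffledCycle and loader are hypothetical names, and the hard-coded list stands in for the mongoTemplate.find(...) call. It returns an Optional, so an empty result is explicit rather than a null:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Queue;
import java.util.function.Supplier;

public class ShuffledCycle<T> {
    // Stand-in for the database query; must not return null elements
    private final Supplier<List<T>> loader;
    private final Queue<T> pending = new ArrayDeque<>();

    public ShuffledCycle(Supplier<List<T>> loader) {
        this.loader = loader;
    }

    /** Returns the next element in random order; empty if the loader yields nothing. */
    public synchronized Optional<T> next() {
        if (pending.isEmpty()) {
            // Copy before shuffling so the loader's list is never modified
            List<T> fresh = new ArrayList<>(loader.get());
            Collections.shuffle(fresh);
            pending.addAll(fresh);
        }
        return Optional.ofNullable(pending.poll());
    }

    public static void main(String[] args) {
        ShuffledCycle<String> treks =
                new ShuffledCycle<>(() -> Arrays.asList("Alps", "Andes", "Atlas"));
        // Six calls go through the three names twice, each time in a fresh random order
        for (int i = 0; i < 6; i++) {
            System.out.println(treks.next().orElse("nothing found"));
        }
    }
}
```

Within one pass no element repeats. The same element can still appear back to back across two passes, when it ends one shuffle and starts the next.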

ReactiveMongoTemplate does not return number of documents removed

I have this implementation to remove documents based on some id using ReactiveMongoTemplate. I'm trying to get the size of the list of impacted documents, but it always returns 0, and since it is reactive I'm not sure how to get the number of records deleted.
@Override
public int deleteMongoDataForGivenId(Long id) {
int deletedRecords = 0;
Query query = new Query();
query.addCriteria(where("id").is(id));
Flux<Object> deletedDocs = reactiveMongoTemplate.findAllAndRemove(query, Object.class, "SomeCollection");
if(!deletedDocs.collectList().block().isEmpty()) {
List<Object> listOfRecords = deletedDocs.collectList().block();
deletedRecords = listOfRecords.size();
}
return deletedRecords;
}
Doing it the reactive way would be to return a Mono<Long> instead of blocking and unpacking the Mono into a long or int:
public Mono<Long> deleteMongoDataForGivenId(Long id) {
Query query = new Query();
query.addCriteria(where("id").is(id));
return reactiveMongoTemplate
.findAllAndRemove(query, MyDocument.class, "SomeCollection")
.count();
}
Having to use a blocking method defeats the purpose of reactive programming, but if you don't really have a choice, you can do the following:
public Long deleteMongoDataForGivenId(Long id) {
Query query = new Query();
query.addCriteria(where("id").is(id));
return reactiveMongoTemplate
.findAllAndRemove(query, MyDocument.class, "SomeCollection")
.count()
// Please don't do this!!!
.share().block();
}

Getting aggregated value using ElasticsearchTemplate and aggregators

I am having problems extracting the aggregated value.
The configuration is Spring with spring-boot-starter-data-elasticsearch.
The user document is indexed multiple times in the database.
I want to return the sum of the 'commentsCnt' field.
@Autowired
ElasticsearchTemplate elasticsearchTemplate;
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withIndices("comment")
.withQuery(matchQuery("user", userName))
.addAggregation(AggregationBuilders.sum("sum_of_comments").field("commentsCnt"))
.build();
Aggregations aggregations = elasticsearchTemplate.query(searchQuery,
new ResultsExtractor<Aggregations>() {
@Override
public Aggregations extract(SearchResponse response) {
return response.getAggregations();
}
});
Aggregation ret = aggregations.get("sum_of_comments");
How to extract the value? Maybe there is a better approach?
for (Aggregation aggs : aggregations) {
Sum sum = (Sum) aggs;
double sumValue = sum.getValue();
System.out.println("sumValue=" + sumValue);
}

how to disable page query in Spring-data-elasticsearch

I use the spring-data-elasticsearch framework to get query results from an Elasticsearch server. The Java code looks like this:
public void testQuery() {
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withFields("createDate","updateDate").withQuery(matchAllQuery()).withPageable(new PageRequest(0,Integer.MAX_VALUE)).build();
List<Entity> list = template.queryForList(searchQuery, Entity.class);
for (Entity e : list) {
System.out.println(e.getCreateDate());
System.out.println(e.getUpdateDate());
}
}
The raw query logged on the server looks like this:
{"from":0,"size":10,"query":{"match_all":{}},"fields":["createDate","updateDate"]}
As the query log shows, spring-data-elasticsearch adds a size limit to the query ("from":0, "size":10). How can I stop it from adding the size limit?
You don't want to do this. You could use the findAll functionality on a repository, which returns an Iterable, but I think the best way to obtain all items is to use the scan/scroll functionality. Maybe the following code block can point you in the right direction:
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(QueryBuilders.matchAllQuery())
.withIndices("customer")
.withTypes("customermodel")
.withSearchType(SearchType.SCAN)
.withPageable(new PageRequest(0, NUM_ITEMS_PER_SCROLL))
.build();
String scrollId = elasticsearchTemplate.scan(searchQuery, SCROLL_TIME_IN_MILLIS, false);
boolean hasRecords = true;
while (hasRecords) {
Page<CustomerModel> page = elasticsearchTemplate.scroll(scrollId, SCROLL_TIME_IN_MILLIS, CustomerModel.class);
if (page != null) {
// DO something with the records
hasRecords = (page.getContent().size() == NUM_ITEMS_PER_SCROLL);
} else {
hasRecords = false;
}
}
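The loop above stops as soon as a page is not exactly NUM_ITEMS_PER_SCROLL long. A variation that avoids depending on the batch size is to keep fetching until a batch comes back empty. Below is a minimal sketch of that loop in plain Java. drain and fetchBatch are hypothetical names, and the lambda slicing an in-memory list stands in for the elasticsearchTemplate.scroll(...) call:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntFunction;

public class BatchDrain {

    /**
     * Calls fetchBatch with 0, 1, 2, ... until it returns an empty (or null) batch
     * and collects everything returned before that.
     */
    static <T> List<T> drain(IntFunction<List<T>> fetchBatch) {
        List<T> all = new ArrayList<>();
        for (int batchNo = 0; ; batchNo++) {
            List<T> batch = fetchBatch.apply(batchNo);
            if (batch == null || batch.isEmpty()) {
                break; // source exhausted
            }
            all.addAll(batch);
        }
        return all;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c", "d", "e");
        int batchSize = 2;
        // Hypothetical source: slices an in-memory list instead of scrolling an index
        List<String> result = drain(batchNo -> {
            int from = Math.min(batchNo * batchSize, data.size());
            int to = Math.min(from + batchSize, data.size());
            return data.subList(from, to);
        });
        System.out.println(result);
    }
}
```

Collecting everything into one list only makes sense when the full result fits in memory. For large indices it is usually better to process each batch inside the loop and discard it.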

Twitter4J: Get more than 1 tweet from a user

I'm trying to get a list of recent statuses from each user on a person's list of followers. I've got the following to get the users:
IDs list = twitter.getFriendsIDs(0);
for(long ID : list.getIDs()){
twitter4j.User TW_user = twitter.showUser(ID);
}
All I can get from this is getStatus(), which is their most recent status. getHomeTimeline() is also insufficient, as I need a list of recent tweets from each user. Is there any way I can achieve this using Twitter4J?
I was just trying to find this answer myself. I had decent success using the getUserTimeline method. Looks like you're trying to look up a list of friend IDs, so this method below should take the long[] and spit out all the user statuses. lookupUsers also accepts a String[] of screen names if you want to look users up that way instead.
public static void lookupUsers(long[] usersList) {
try {
Twitter twitter = new TwitterFactory().getInstance();
ResponseList<User> users = twitter.lookupUsers(usersList);
Paging paging = new Paging(1, 100);
List<Status> statuses;
for (User user : users) {
statuses = twitter.getUserTimeline(user.getScreenName(), paging);
System.out.println("\nUser: @" + user.getScreenName());
for (Status s : statuses) {
System.out.println(s.getText());
}
}
} catch (TwitterException e) {
e.printStackTrace();
}
}
Alex's answer is close, but will only get you 100 tweets per user. The following will get you all (or at least the API's max limit):
IDs list = twitter.getFriendsIDs(0);
for(long ID : list.getIDs()) {
Status[] tweets = getAllTweets(twitter, ID);
System.out.println(ID + ": " + tweets.length);
}
Status[] getAllTweets(Twitter twitter, long userId) {
    int pageno = 1;
    List<Status> statuses = new ArrayList<>();
    while (true) {
        try {
            int size = statuses.size();
            Paging page = new Paging(pageno++, 100);
            statuses.addAll(twitter.getUserTimeline(userId, page));
            if (statuses.size() == size) {
                break; // an empty page means there is nothing more to fetch
            }
        } catch (TwitterException e) {
            e.printStackTrace();
            break; // stop instead of looping forever if the error persists
        }
    }
    return statuses.toArray(new Status[0]);
}
