Explanation of the "Rank" part of the Retrieve & Rank service in Java

Has anyone ever used the Retrieve & Rank service with the Java SDK (the Rank service in particular)?
I want to understand how it works, because some points do not seem logical to me:
What is the difference between the Java approach, where we must execute a search query with Apache Solr and then call the rank method, and the cURL approach, where we just have to run a single query?
Why does the rank method take a CSV file containing the results of the search query, when we apparently cannot get the result of a search query as CSV?
I did not find my answers in this documentation or in this example.
Thanks for your time.

I have never used Retrieve and Rank before, but from reading the documentation, here are my thoughts.
I do not think there is any difference between the Java approach and the cURL one. From what I understand, search-and-rank with cURL uses this command:
curl -u "{username}":"{password}" "https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/solr_clusters/sc1ca23733_faa8_49ce_b3b6_dc3e193264c6/solr/example_collection/fcselect?ranker_id=B2E325-rank-67&q=what%20is%20the%20basic%20mechanism%20of%20the%20transonic%20aileron%20buzz&wt=json&fl=id,title"
while in Java it looks like this:
RetrieveAndRank service = new RetrieveAndRank();
service.setUsernameAndPassword("{username}","{password}");
HttpSolrClient solrClient = getSolrClient(service.getSolrUrl("scfaaf8903_02c1_4297_84c6_76b79537d849"), "{username}", "{password}");
SolrQuery query = new SolrQuery("what is the basic mechanism of the transonic aileron buzz");
QueryResponse response = solrClient.query("example_collection", query);
Ranking ranking = service.rank("B2E325-rank-67", response);
System.out.println(ranking);
I think what the cURL command does is, at the back end, fire a search in Solr using the specified query and, once the results are returned, rank them.
In Java this is done explicitly: instead of a single queryAndRank method you have two, one that runs the query against Solr and gets the results, and one that forwards those results to the ranking system.
The search in Solr can return CSV: the CSVResponseWriter can write the list of documents in a response in CSV format (see http://wiki.apache.org/solr/CSVResponseWriter), as in the sketch below.
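To make the CSV part concrete, here is a minimal sketch building on the Java snippet above (not taken from the Watson documentation): it asks Solr's CSVResponseWriter for CSV output via wt=csv and hands that CSV to the ranker. The exact signature of rank(...) (assumed here to take an InputStream of CSV answers) and the Basic-Auth handling should be checked against the SDK's Javadoc.
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.Base64;

// Reuses the 'service' object from the Java snippet above.
String solrBase = service.getSolrUrl("scfaaf8903_02c1_4297_84c6_76b79537d849");
String url = solrBase + "/example_collection/select"
        + "?q=" + URLEncoder.encode("what is the basic mechanism of the transonic aileron buzz", "UTF-8")
        + "&fl=id,title,score"
        + "&wt=csv"; // wt=csv selects Solr's CSVResponseWriter

// Plain HTTP GET with Basic Auth against the Solr cluster
HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
String credentials = Base64.getEncoder().encodeToString("{username}:{password}".getBytes("UTF-8"));
conn.setRequestProperty("Authorization", "Basic " + credentials);

try (InputStream csvResults = conn.getInputStream()) {
    // Assumption: rank(...) accepts the CSV results as an InputStream; check the SDK's Javadoc.
    Ranking ranking = service.rank("B2E325-rank-67", csvResults);
    System.out.println(ranking);
}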

Related

Script fields in hibernate elasticsearch

I'm using hibernate-search-elasticsearch 5.8.2.Final and I can't figure out how to get script fields:
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-request-script-fields.html
Is there any way to accomplish this functionality?
This is not possible in Hibernate Search 5.8.
In Hibernate Search 5.10 you could get direct access to the REST client, send a REST request to Elasticsearch and get the result as a JSON string that you would have to parse yourself, but it is very low-level and you would not benefit from the Hibernate Search search APIs at all (no query DSL, no managed entity loading, no direct translation entity type => index name, ...).
If you want better support for this feature, don't hesitate to open a ticket on our JIRA, describing in detail what you are trying to achieve and how you would have expected to be able to do it. We are currently working on Search 6.0, which brings a lot of improvements, in particular when it comes to using native features of Elasticsearch, so it just might be something we could slip into our backlog.
EDIT: I forgot to mention that, while you cannot use server-side scripts, you can still get the full source from your documents, and do some parsing in your application to achieve a similar result. This will work even in Search 5.8:
FullTextEntityManager fullTextEm = Search.getFullTextEntityManager(entityManager);
FullTextQuery query = fullTextEm.createFullTextQuery(
        qb.keyword()
                .onField( "tags" )
                .matching( "round-based" )
                .createQuery(),
        VideoGame.class
)
        .setProjection( ElasticsearchProjectionConstants.SCORE, ElasticsearchProjectionConstants.SOURCE );

List<Object[]> results = query.getResultList();
for (Object[] projection : results) {
    float score = (float) projection[0];
    String source = (String) projection[1];
}
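If you need a specific field out of that source (the kind of value a server-side script would have computed), any JSON library can parse it inside the loop above; a minimal sketch using Jackson (my choice, nothing Hibernate Search mandates), with "tags" reused from the query:
// Inside the loop above: pull a field out of the raw _source JSON returned by the SOURCE projection.
ObjectMapper mapper = new ObjectMapper(); // com.fasterxml.jackson.databind.ObjectMapper
JsonNode sourceJson = mapper.readTree(source);
String tags = sourceJson.path("tags").asText();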
See this section of the documentation.

Google Cloud BigQuery UDF limitations

I am facing a problem in Google BigQuery. I have a complex computation to do and need to save the result in BigQuery. So we are doing that complex computation in Java and saving the result in BigQuery with the help of Google Cloud Dataflow.
But this complex calculation takes around 28 minutes to complete in Java, while the customer requirement is to do it within 20 seconds.
So we switched to the Google BigQuery UDF option. One option is BigQuery legacy UDFs. Legacy UDFs have the limitation that they process rows one by one, so we ruled this option out, as we need multiple rows to compute the results.
The second option is scalar UDFs. BigQuery scalar UDFs can only be called from the web UI or command line and cannot be triggered from the Java client.
If anyone has any idea, please provide direction on how to proceed.
You can use scalar UDFs with standard SQL from any client API, as long as the CREATE TEMPORARY FUNCTION statements are passed in the query attribute of the request. For example,
QueryRequest queryRequest =
QueryRequest
.newBuilder(
"CREATE TEMP FUNCTION GetWord() AS ('fire');\n"
+ "SELECT COUNT(DISTINCT corpus) as works_with_fire\n"
+ "FROM `bigquery-public-data.samples.shakespeare`\n"
+ "WHERE word = GetWord();")
// Use standard SQL syntax for queries.
// See: https://cloud.google.com/bigquery/sql-reference/
.setUseLegacySql(false)
.build();
QueryResponse response = bigquery.query(queryRequest);
BigQuery scalar UDFs can only be called from the web UI or command line and cannot be triggered from the Java client.
This is not accurate. Standard SQL supports scalar UDFs through the CREATE TEMPORARY FUNCTION statement, which can be used from any application and any client - it is simply part of the SQL query:
https://cloud.google.com/bigquery/docs/reference/standard-sql/user-defined-functions
To learn how to enable Standard SQL, see this documentation: https://cloud.google.com/bigquery/docs/reference/standard-sql/enabling-standard-sql
In particular, the simplest thing is to add the #standardSQL tag at the beginning of the SQL query.

How to stream Select results (not String query results) from CassandraOperations?

Spring Data Cassandra 1.5.0 comes with a streaming API in CassandraTemplate. I'm using spring-data-cassandra 1.5.1. I have code like:
String tableName = cassandraTemplate.getTableName(MyEntity.class).toCql();
Select select = QueryBuilder.select()
.all()
.from(tableName);
// In real world, WHERE statement is much more complex
select.where(eq(ENTITY_FIELD_NAME, expectedField));
List<MyEntity> result = cassandraTemplate.select(select, MyEntity.class);
and I want to replace this code with an Iterable or a Java 8 Stream in order to avoid fetching a big list of results into memory at once.
What I'm looking for is a method signature like CassandraOperations.stream(Select query, Class<T> entityClass), but it is not available.
The only available method in CassandraOperations accepts a query string: stream(String query, Class<T> entityClass). I tried to pass it the string generated by the Select, like
cassandraTemplate.stream(select.getQueryString(), MyEntity.class)
But that fails with InvalidQueryException: Invalid amount of bind variables, because getQueryString() returns the query with question-mark placeholders instead of the bound values.
I see 3 options to get what I want, but every option looks bad:
Use Spring Query creation mechanism with Stream/Iterator expected return type (good only for simple queries) http://docs.spring.io/spring-data/cassandra/docs/current/reference/html/#repositories.query-methods.query-creation
Use a raw CQL query and not use QueryBuilder
Call select.getQueryString() and then substitute parameters again via BoundStatement
Is there any better way to stream selection results?
Thanks.
So, as of now the answer to my question is to wait until a stable version of spring-data-cassandra 2.0.0 comes out:
https://github.com/spring-projects/spring-data-cassandra/blob/2.0.x/spring-data-cassandra/src/main/java/org/springframework/data/cassandra/core/CassandraTemplate.java#L208
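For reference, an unverified sketch of what this should look like on the 2.0.x line (the stream overload is taken from the linked CassandraTemplate source, so treat it as subject to change until 2.0.0 is stable):
// Unverified sketch against spring-data-cassandra 2.0.x: the Select built with
// QueryBuilder is a Statement, so it can be handed to the stream(...) overload
// linked above and consumed lazily instead of as a fully materialized List.
Select select = QueryBuilder.select()
        .all()
        .from(tableName);
select.where(eq(ENTITY_FIELD_NAME, expectedField));

try (Stream<MyEntity> stream = cassandraTemplate.stream(select, MyEntity.class)) {
    stream.forEach(entity -> {
        // process each row; pages are fetched from Cassandra as the stream is consumed
    });
}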

How to get facet value for all fields using Java API

I'm new to Elasticsearch and tried to query some sample documents. I issued the following query using the Java API. This query fetched the correct result: it returned the names of all categories. Now I want the document count for each name in a category. Could you explain how to do that? I'm sorry for my bad English.
SearchResponse sr = client.prepareSearch()
.addField("Category")
.setQuery(QueryBuilders.matchAllQuery())
.addFacet(FacetBuilders.termsFacet("f")
.field("Category"))
.execute()
.actionGet();
Look at the Count API if you do not want to get the result set (the matched documents) but only the count.
If you do want the result set, the response to every filter or query request also contains the total result size.
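To get the per-category counts, the terms facet defined in the request above already carries them; and for a bare count, the Count API avoids fetching the hits. A sketch against the same (pre-2.0, facet-based) Java API as the question; the index name is a placeholder:
// Per-term document counts from the terms facet named "f" in the request above
TermsFacet facet = (TermsFacet) sr.getFacets().facetsAsMap().get("f");
for (TermsFacet.Entry entry : facet.getEntries()) {
    System.out.println(entry.getTerm() + " : " + entry.getCount());
}

// Only the total number of matching documents, via the Count API
long total = client.prepareCount("my_index") // index name is a placeholder
        .setQuery(QueryBuilders.matchAllQuery())
        .execute()
        .actionGet()
        .getCount();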

ElasticSearch - Using FilterBuilders

I am new to ElasticSearch and Couchbase. I am building a sample Java application to learn more about ElasticSearch and Couchbase.
Reading the ElasticSearch Java API docs, filters are better used in cases where sorting on score is not necessary, and for caching.
I still haven't figured out how to use FilterBuilders and have the following questions:
Can FilterBuilders be used alone to search?
Or do they always have to be used with a query? (If so, can someone please give an example?)
Going through the documentation, if I want to perform a search based on field values and want to use FilterBuilders, how can I accomplish that? (Using AndFilterBuilder, TermFilterBuilder or InFilterBuilder? I am not clear about the differences between them.)
For the 3rd question, I actually tested searching using queries and using filters, as shown below.
I got an empty result (no rows) when I tried the search using FilterBuilders. I am not sure what I am doing wrong.
Any examples would be helpful. I have had a tough time going through the documentation, which I found sparse, and even searching led to various unreliable user forums.
private void processQuery() {
SearchRequestBuilder srb = getSearchRequestBuilder(BUCKET);
QueryBuilder qb = QueryBuilders.fieldQuery("doc.address.state", "TX");
srb.setQuery(qb);
SearchResponse resp = srb.execute().actionGet();
System.out.println("response :" + resp);
}
private void searchWithFilters(){
SearchRequestBuilder srb = getSearchRequestBuilder(BUCKET);
srb.setFilter(FilterBuilders.termFilter("doc.address.state", "tx"));
//AndFilterBuilder andFb = FilterBuilders.andFilter();
//andFb.add(FilterBuilders.termFilter("doc.address.state", "TX"));
//srb.setFilter(andFb);
SearchResponse resp = srb.execute().actionGet();
System.out.println("response :" + resp);
}
--UPDATE--
As suggested in the answer, changing to lowercase "tx" works. With that resolved, I still have the following questions:
In what scenario(s) are filters used with a query? What purpose does this serve?
Difference between InFilter, TermFilter and MatchAllFilter. Any illustration will help.
Right, you should use filters to exclude documents from even being considered when executing the query. Filters are faster since they don't involve any scoring, and are cacheable as well.
That said, it's pretty obvious that you have to use a filter with the search API, which does execute a query and accepts an optional filter. If you only have a filter, you can just use the match_all query together with your filter. A filter can be a simple one, or a compound one that combines multiple filters together.
Regarding the Java API, the names used are the names of the filters available, no big difference. Have a look at this search example for instance. In your code I don't see where you do setFilter on your SearchRequestBuilder object. You don't seem to need the and filter either, since you are using a single filter. Furthermore, it might be that you are indexing using the default mappings, thus the term "TX" is lowercased. That's why when you search using the term filter you don't find any match. Try searching for "tx" lowercased.
You can either change your mapping if you want to keep the "TX" term as it is while indexing, probably setting the field as not_analyzed if it should only be a single token. Otherwise you can change the filter: have a look at a query that is analyzed, so that your query will be analyzed the same way the content was indexed.
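For example, with the same (pre-1.x) Java API as in the question, the match_all query and the term filter can be combined into a single filtered query; a sketch using the lowercased term as discussed:
// Sketch: wrap the term filter in a filtered query instead of (or in addition to) setFilter(...)
SearchRequestBuilder srb = getSearchRequestBuilder(BUCKET);
SearchResponse resp = srb
        .setQuery(QueryBuilders.filteredQuery(
                QueryBuilders.matchAllQuery(),
                FilterBuilders.termFilter("doc.address.state", "tx"))) // lowercased, matching the indexed token
        .execute()
        .actionGet();
System.out.println("response :" + resp);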
Have a look at the query DSL documentation for more information regarding queries and filters:
MatchAllFilter: matches all your documents; not that useful, I'd say
TermFilter: filters documents that have fields containing a term (not analyzed)
AndFilter: a compound filter used to combine two or more filters with an and
Don't know what you mean by InFilterBuilder, couldn't find any filter with this name.
The query usually contains what the user types in through the text search box. Filters are more a way to refine the search, for example by clicking on facet entries. That's why you would still have the query plus one or more filters.
To append to what @javanna said:
A lot of confusion can come from the fact that filters can be defined in several ways:
standalone (with a required query, for instance match_all if all you need is the filters) (http://www.elasticsearch.org/guide/reference/api/search/filter/)
or as part of a filtered query (http://www.elasticsearch.org/guide/reference/query-dsl/filtered-query/)
What's the difference, you might ask? Indeed, you can construct exactly the same logic in both ways.
The difference is that a query operates on BOTH the result set and any facets you have defined, whereas a filter (when defined standalone) only operates on the result set and NOT on any facets you may have defined (explained here: http://www.elasticsearch.org/guide/reference/api/search/filter/).
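To see that difference with the same Java API as above (a sketch; the facet is a placeholder added for illustration): the first request uses a standalone (top-level) filter, so the facet counts are computed on all documents; the second wraps the same filter in a filtered query, so the facet only sees the filtered set.
// Standalone (top-level) filter: facet counts ignore the filter
SearchRequestBuilder standalone = getSearchRequestBuilder(BUCKET)
        .setQuery(QueryBuilders.matchAllQuery())
        .setFilter(FilterBuilders.termFilter("doc.address.state", "tx"))
        .addFacet(FacetBuilders.termsFacet("states").field("doc.address.state"));

// Filtered query: the facet only counts documents that pass the filter
SearchRequestBuilder filtered = getSearchRequestBuilder(BUCKET)
        .setQuery(QueryBuilders.filteredQuery(
                QueryBuilders.matchAllQuery(),
                FilterBuilders.termFilter("doc.address.state", "tx")))
        .addFacet(FacetBuilders.termsFacet("states").field("doc.address.state"));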
To add to the other answers, InFilter is only used with FilterBuilders. The definition is: InFilter - a filter for a field based on several terms, matching on any of them.
The Java query API uses FilterBuilders, a factory for filter builders that can dynamically create a query from Java code. We do this using a form, building our query based on the user's selections in checkboxes, options, and dropdowns.
Here is some example code for FilterBuilders; the snippet from that link below uses InFilter:
FilterBuilder filterBuilder;
User user = (User) auth.getPrincipal();
if (user.getGroups() != null && !user.getGroups().isEmpty()) {
filterBuilder = FilterBuilders.boolFilter()
.should(FilterBuilders.nestedFilter("userRoles", FilterBuilders.termFilter("userRoles.key", auth.getName())))
.should(FilterBuilders.nestedFilter("groupRoles", FilterBuilders.inFilter("groupRoles.key", user.getGroups().toArray())));
} else {
filterBuilder = FilterBuilders.nestedFilter("userRoles", FilterBuilders.termFilter("userRoles.key", auth.getName()));
}
...
