Get only partially matching records in Solr - java

Is there a way to get records that only partially match a query in Solr?
For &q="java enterprise" and the records below:
{
"name": "java",
"case": "enterprise"
},
{
"name": "java enterprise",
"case": "enterprise"
}
I want to fetch only those records that have java and enterprise in separate fields and not together, i.e. only the record below should appear in my result.
{
"name": "java",
"case": "enterprise"
}
Is there a way to search for only those records and exclude documents that match the exact phrase?

You don't need an exact phrase match. Instead, you can use a boolean query in that case:
(name:"java" AND case:"enterprise" ) OR (name:"enterprise" AND case:"java" )
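As a minimal sketch, the boolean query above could be assembled in Java as follows. The class and method names are hypothetical (they are not part of SolrJ), and the helper only builds the query string. It assumes name and case are string-type fields matched as whole values; if name were a tokenized text field, name:"java" would also match "java enterprise", so an extra exclusion clause might be needed.

```java
final class CrossFieldQuery {
    // Quotes a term as a Solr phrase, escaping backslashes and embedded double quotes.
    static String quote(String term) {
        return "\"" + term.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds: (fieldA:"termA" AND fieldB:"termB") OR (fieldA:"termB" AND fieldB:"termA")
    static String build(String fieldA, String fieldB, String termA, String termB) {
        String a = quote(termA);
        String b = quote(termB);
        return "(" + fieldA + ":" + a + " AND " + fieldB + ":" + b + ") OR ("
             + fieldA + ":" + b + " AND " + fieldB + ":" + a + ")";
    }
}
```

Calling CrossFieldQuery.build("name", "case", "java", "enterprise") yields a string that can be passed as the q parameter.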

Related

How to search in firebase database

I'm trying to filter the data from my database using this code:
fdb.orderByChild("title").startAt(searchquery).endAt(searchquery+"\uf8ff").addValueEventListener(valuelistener2);
My database is like this:
"g12" : {
"Books" : {
"-Mi_He4vHXOuKHNL7yeU" : {
"title" : "Technical Sciences P1"
},
"-Mi_He50tUPTN9XDiVow" : {
"title" : "Life Sciences"
},
"-Mi_He51dhQfl3RAjysQ" : {
"title" : "Technical Sciences P2"
}}
While the code works, it only returns the first value that matches the query and doesn't fetch the rest of the data even though it matches.
If I put a "T" as my search query, I just get the first title "Technical Sciences P1 " and don't get the other one with P2
(Sorry for the vague and common question title, it's just I've been looking for a solution for so long)
"While the code works, it only returns the first value that matches the query"
That's the expected behavior since Firebase Realtime Database does not support native indexing or search for text fields in database properties.
When you are using the following query:
fdb.orderByChild("title").startAt(searchquery).endAt(searchquery+"\uf8ff")
It means that you are trying to get all elements that start with searchquery. For example, if you have a title called "Don Quixote" and you search for "Don", your query will return the correct results. However, searching for "Quix" will yield no results.
You might consider downloading the entire node and searching the fields client-side, but this solution isn't practical for large data sets. To enable full-text search of your Firebase Realtime Database data, I recommend using a third-party search service like Algolia or Elasticsearch.
If you consider trying Cloud Firestore at some point, please see the following example:
Is it possible to use Algolia query in FirestoreRecyclerOptions?
It shows how this works with Cloud Firestore, but you can use it in the same way with Firebase Realtime Database.
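To illustrate the difference between the prefix range query and a client-side substring search, here is a small sketch in plain Java. It does not call the Firebase SDK: inPrefixRange only emulates the startAt(q).endAt(q + "\uf8ff") range with ordinary string comparison, and the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class TitleSearch {
    // Emulates startAt(q).endAt(q + "\uf8ff"): keeps values in the range [q, q + "\uf8ff"].
    static boolean inPrefixRange(String title, String q) {
        return title.compareTo(q) >= 0 && title.compareTo(q + "\uf8ff") <= 0;
    }

    // Client-side alternative: case-insensitive substring match over already downloaded titles.
    static List<String> containing(List<String> titles, String q) {
        String needle = q.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String t : titles) {
            if (t.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(t);
            }
        }
        return hits;
    }
}
```

With this emulation, "Don Quixote" falls inside the range for "Don" but not for "Quix", while the substring search finds it for both.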

How do I query a user's info in Firestore if all my document IDs are auto-generated?

All of the examples I'm finding online have very simple document IDs, but what do you do if you're auto-generating all your IDs (as the docs say you should)? For example, I want to query the date when the user was created. The document ID for this is below, but I've just copy-pasted it from the Firestore console. How would I know the document ID so that I can query any user's info? Note that I will have users, groups, usergroups, etc. There will be quite a few collections, each using the auto-ID feature. I would need to be able to update any row in any collection.
val docRef = db.collection("users").document("9AjpkmJdAdFScZaeo8FV45DT54E")
docRef.get()
    .addOnSuccessListener { document ->
        if (document != null) {
            Log.e("Query", "Data: ${document.data}")
        } else {
            Log.e("Query", "Document is null")
        }
    }
    .addOnFailureListener { exception ->
        // Pass the exception along so the cause shows up in the log
        Log.e("Query", "Failure", exception)
    }
If you have data to query, that should all be stored as fields in the documents. Don't put that data in the ID of the documents - use field values.
You can filter documents in a collection using "where" clauses as shown in the documentation. What you're showing here isn't enough to go with in to make specific recommendations. But you definitely want to think about your queries ahead of time, and model your data to suit those queries.
If you need to update a document, you must first query for it, then update what you find from the query. This is extremely common, as Firestore does not provide any SQL-like "update where" queries that both locate and update data in the same command.
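The "query first, then update" pattern can be sketched with a plain in-memory stand-in. This is not the Firestore API: InMemoryCollection and its methods are hypothetical names used only to show the two steps with auto-generated IDs, where the caller never needs to know an ID up front.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

final class InMemoryCollection {
    private final Map<String, Map<String, Object>> docs = new LinkedHashMap<>();

    // Stores a document under a generated ID and returns that ID.
    String add(Map<String, Object> fields) {
        String id = UUID.randomUUID().toString();
        docs.put(id, new HashMap<>(fields));
        return id;
    }

    // Step 1: find the IDs of all documents whose field equals the given value.
    List<String> whereEqualTo(String field, Object value) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, Map<String, Object>> e : docs.entrySet()) {
            if (value.equals(e.getValue().get(field))) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    // Step 2: update a document that the query located.
    void update(String id, String field, Object value) {
        docs.get(id).put(field, value);
    }

    Object get(String id, String field) {
        return docs.get(id).get(field);
    }
}
```

The data to search on (for example an email address) lives in a field, so the generated ID only matters once the query has returned it.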

Neo4j slow cypher query in embedded mode

I have a huge graph database with authors, which are connected to papers, and papers are connected to nodes that contain meta information about the paper.
I tried to select authors that match a specific pattern, so I executed the following Cypher statement in Java.
String query = "MATCH (n:AUTHOR) WHERE n.name =~ '(?i).*jim.*' RETURN n";
db.execute(query);
I get a Result with all matching "authors" back, but the execution is very slow. Is it because Neo4j writes the result into memory?
If I try to find nodes with the Java API, it is much faster. Of course, I am only able to search for the exact name, as in the following code example, but it is about 4 seconds faster than the query above. I tested it on a small database with about 50 nodes, of which only 6 are authors. The six authors are also in the index.
db.findNodes(NodeLabel.AUTHOR, NodeProperties.NAME, "jim knopf" );
Is there a way to speed up the Cypher query? Or a possibility to get all nodes that match a given pattern via the Java API and the findNodes() method?
Just for information, I created the index for the name of the author in Java with graph.schema().indexFor(NodeLabel.AUTHOR).on("name").create();
Perhaps somebody could help. Thanks in advance.
EDIT:
I ran some tests today. If I execute the query PROFILE MATCH (n:AUTHOR) WHERE n.name = 'jim seroka' RETURN n; in the browser interface, the only operator is NodeByLabelScan. It seems to me that Neo4j does not automatically use the index (the index for name is online). If I specify the index and execute the query PROFILE MATCH (n:AUTHOR) USING INDEX n:AUTHOR(name) WHERE n.name = 'jim seroka' RETURN n; the index is used. Normally Neo4j should pick the correct index automatically. Is there any configuration to set?
I also did some more testing in embedded mode to check the performance of the query there. I tried to select the author "jim seroka" with db.findNode(NodeLabel.AUTHOR, "name", "jim seroka");. It works, and it seems to me that the index is used, because the execution time is ~0.05 seconds.
But if I run the same query that I executed in the browser interface, with the index specified, it takes ~4.9 seconds. Why? I'm a bit at a loss. The database is local and there are only 6 authors. Is the connector slow, or is the connection created incorrectly? OK, findNode() returns just a node while execute() returns a whole Result, but a four-second difference?
The following source code shows how the database is created and how the query is executed.
public static GraphDatabaseService getNeo4jDB() {
    ....
    return new GraphDatabaseFactory().newEmbeddedDatabase(STORE_DIR);
}

private Result findAuthorNode(String searchValue) {
    db = getNeo4jDB();
    String query = "MATCH (n:AUTHOR) USING INDEX n:AUTHOR(name) WHERE n.name = 'jim seroka' RETURN n";
    return db.execute(query);
}
Your query uses a regular expression and therefore is not able to use an index:
MATCH (n:AUTHOR) WHERE n.name =~ '(?i).*jim.*' RETURN n
Neo4j 2.3 introduced the index-backed STARTS WITH string operator, so this query should perform much better:
MATCH (n:AUTHOR) WHERE n.name STARTS WITH 'jim' RETURN n
It is not quite the same as the regular expression (it is case-sensitive and only matches names that begin with 'jim'), but it will have better performance.
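To make the semantic difference between the two predicates concrete, here is a small sketch in plain Java. It does not touch Neo4j; it only mirrors the two conditions on ordinary strings, relying on the fact that Cypher's =~ uses Java regular expressions and must match the whole value. The class and method names are made up for this example.

```java
import java.util.regex.Pattern;

final class NameMatch {
    private static final Pattern REGEX = Pattern.compile("(?i).*jim.*");

    // Mirrors n.name =~ '(?i).*jim.*': the whole string must match the pattern.
    static boolean regexMatch(String name) {
        return REGEX.matcher(name).matches();
    }

    // Mirrors n.name STARTS WITH 'jim': a case-sensitive prefix test.
    static boolean startsWithMatch(String name) {
        return name.startsWith("jim");
    }
}
```

A name such as "Big Jim" satisfies the regular expression but not the prefix test, so switching operators can change the result set, not just the speed.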

Query of distinct in elastic search

SQL query: SELECT DISTINCT column FROM table_name WHERE [condition]
We want to run the same kind of query in Elasticsearch, so that I can find the distinct values of a column in the search result.
For example, we have an index of users (userindex) with a field in which the school name or company name of the user is indexed.
Suppose several users have the same school name. I want all the distinct school names from the index.
As keety stated in the comments, one possibility is to use a terms aggregation like the following:
curl -H 'Content-Type: application/json' 'localhost:9200/users/_search?pretty=1' -d '
{
  "aggs": {
    "schools": {
      "terms": {
        "field": "schoolname"
      }
    }
  }
}'
Depending on your use case this might already be enough. But keep in mind that the buckets ES returns for the aggregation are limited (by default only the top 10 terms, adjustable with the size parameter), that there is no real pagination over them, and that the document counts can be approximate.
See https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html
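Conceptually, a terms aggregation groups documents by field value, counts each group, and returns the largest groups. The following in-memory sketch in plain Java shows that idea only; it is not the Elasticsearch client API, the names are hypothetical, and unlike a distributed index its counts are exact.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class TermsBuckets {
    // Returns up to `size` distinct values with their counts, most frequent first.
    static List<Map.Entry<String, Integer>> topTerms(List<String> values, int size) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> buckets = new ArrayList<>(counts.entrySet());
        // Stable sort by descending count; ties keep the TreeMap's alphabetical order.
        buckets.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return buckets.subList(0, Math.min(size, buckets.size()));
    }
}
```

The keys of the returned buckets are the distinct values, which is what the SELECT DISTINCT query asks for; the size argument plays the role of the aggregation's size parameter.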

About solr query facet

In my Solr index, the documents look like this:
{
"createTime":"2013-09-10",
"reason":"reason1",
"postId":"postId_1",
"_version_":1445959401549594624 },
{
"createTime":"2013-09-11",
"reason":"reason2",
"postId":"postId_1",
"_version_":1445959401549594624 },
{
"createTime":"2013-09-12",
"reason":"reason3",
"postId":"postId_1",
"_version_":1445959401549594624 },
{
"createTime":"2013-09-13",
"reason":"reason4",
"postId":"postId_2",
"_version_":1445959401549594624 }
Now I need to use a Solr facet query to select data like this:
1. postId_1, 3 records, the last createTime is "2013-09-12"
2. postId_2, 1 record, the last createTime is "2013-09-13", reason is reason4
How can I do this using a Solr facet query?
You can use the Field Collapsing (result grouping) feature, which lets you group the results.
If you group on postId, you get the results per post ID.
You get the count for each post ID (numFound), which gives you the "3 records" part.
You can order the results within each group by date descending and return a single result (group.limit=1), which gives you the last date.
You can pick up the reason from that record.
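As a rough sketch, the request parameters for such a grouped query could be put together in Java like this. The class and method names are hypothetical, the code only builds the query string (it does not send a request), and it assumes createTime is stored in a sortable form. The group, group.field, group.sort and group.limit parameters are Solr's result grouping parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class GroupingParams {
    // Parameters for "one group per postId, newest document first, one document per group".
    static Map<String, String> latestPerPost() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "*:*");
        p.put("group", "true");
        p.put("group.field", "postId");
        p.put("group.sort", "createTime desc"); // order of documents inside each group
        p.put("group.limit", "1");              // return only the newest document per group
        return p;
    }

    // Joins the parameters into a URL-encoded query string.
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }
}
```

The resulting string can be appended to the select handler URL of the core; each group in the response then carries its own numFound and the single newest document.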
