I do not understand why the query does not work.
I need to find a document by two fields, both of which are IDs. A document should be returned only if both values match: ID1 AND ID2.
But I get an empty result.
query = MultiFieldQueryParser.parse(new String[]{id1, id2},
        new String[]{"ID1", "ID2"},
        new SimpleAnalyzer());
TopDocs topDocs = searcher.search(query, 1);
Document doc = searcher.doc(topDocs.scoreDocs[0].doc);
The index works 100%. Verified by other requests.
Thanks for the help.
Since you only want an AND intersection between two separate queries, and not really a MultiFieldQuery (where you search for the same value in multiple fields), a slightly modified version of what is shown in "Lucene OR search using Boolean Query" should work:
BooleanQuery bothQuery = new BooleanQuery();
// Term(field, value); id1 and id2 are the variables from the question
TermQuery idQuery1 = new TermQuery(new Term("ID1", id1));
TermQuery idQuery2 = new TermQuery(new Term("ID2", id2));
// MUST on both clauses means a document has to match both terms
bothQuery.add(new BooleanClause(idQuery1, BooleanClause.Occur.MUST));
bothQuery.add(new BooleanClause(idQuery2, BooleanClause.Occur.MUST));
TopDocs topDocs = searcher.search(bothQuery, 1);
Document doc = searcher.doc(topDocs.scoreDocs[0].doc);
Thanks to MatsLindh for the above answer. It helped me solve a similar problem for a school assignment.
Bear in mind that the sample code is outdated; for Lucene 8.9 (my case), you should do this instead:
Query query = new BooleanQuery.Builder()
        .add(query1, BooleanClause.Occur.MUST)
        .add(query2, BooleanClause.Occur.MUST)
        .build();
TopDocs topDocs = searcher.search(query, 1);
Document doc = searcher.doc(topDocs.scoreDocs[0].doc);
Any Query object, including a TermQuery, can be used for query1 and query2 in the code above.
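To make the semantics concrete: when both clauses are marked MUST, a document is returned only if it matches both queries, which amounts to intersecting the two sets of matching documents. The sketch below is a toy model in plain Java, not Lucene code. The class name, the `postings` map, and the sample data are all made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

public class MustClauseSketch {
    /** Returns the ids of documents that contain every given "field:value" key. */
    public static SortedSet<Integer> matchAll(Map<String, SortedSet<Integer>> postings, String... keys) {
        SortedSet<Integer> result = null;
        for (String key : keys) {
            SortedSet<Integer> docs = postings.getOrDefault(key, Collections.emptySortedSet());
            if (result == null) {
                result = new TreeSet<>(docs);   // first clause: start from its matches
            } else {
                result.retainAll(docs);         // later clauses: keep only the common documents
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    /** Hypothetical sample data: docs 0 and 1 have ID1 = "a", only doc 1 has ID2 = "b". */
    public static Map<String, SortedSet<Integer>> sampleIndex() {
        Map<String, SortedSet<Integer>> postings = new HashMap<>();
        postings.put("ID1:a", new TreeSet<>(Arrays.asList(0, 1)));
        postings.put("ID2:b", new TreeSet<>(Collections.singletonList(1)));
        return postings;
    }
}
```

Because the intersection can be empty, it is worth checking `topDocs.scoreDocs.length` before reading `scoreDocs[0]` in the Lucene code; otherwise a query with no hits ends in an ArrayIndexOutOfBoundsException.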
Related
Lucene version: 7.5.0
With two given inputs (userTitle & userQuestion), I want to perform a search only within the indexed documents whose title matches with userTitle, but I am struggling to make this happen.
Below is what I have so far, but this is returning documents with other titles as well.
"title" and "body" are both TextFields.
Any advice would be greatly appreciated.
Query queryTitle = new TermQuery(new Term("title", userTitle));
Analyzer analyzer = new StandardAnalyzer();
QueryParser qpBody = new QueryParser("body", analyzer);
Query queryBody = qpBody.parse(QueryParser.escape(userQuestion));
BooleanQuery query = new BooleanQuery.Builder()
        .add(queryTitle, BooleanClause.Occur.MUST)
        .add(queryBody, BooleanClause.Occur.SHOULD)
        .build();
I want to apologize first for my poor English. I'm new to Lucene and I didn't really understand the query documentation. I indexed some docs and wrote this query code, but it's not working:
Term t = new Term("description", "history");
Query q = new FuzzyQuery(t, 2);
int hitsPerPage = 100;
Path indexPath = Paths.get("C:\\Users\\Win 7\\Desktop\\projet_ri\\index");
Directory directory = FSDirectory.open(indexPath);
DirectoryReader reader = DirectoryReader.open(directory);
IndexSearcher iSearcher = new IndexSearcher(reader);
TopDocs topdocs = iSearcher.search(q, hitsPerPage);
ScoreDoc[] resultsList = topdocs.scoreDocs;
System.out.println("Tab size: "+resultsList.length); // This prints Tab size: 0
for (int i = 0; i < resultsList.length; i++) {
    Document book = iSearcher.doc(resultsList[i].doc);
    String description = book.getField("description").stringValue();
    System.out.println(description);
}
The program isn't even entering the loop. I checked the resultsList array and its size is zero.
Can someone help me correct my code or give me an example query?
You actually missed using a QueryParser for your query.
This QueryParser needs the same Analyzer that you used for indexing. This is really important; otherwise the result set may differ from what you expect. Your sequence should be something like this:
open Index
create IndexSearcher
create QueryParser with the same Analyzer that was used for indexing
create Query with given search terms
parse Query with QueryParser
search
close everything!
See basic lucene tutorial: https://www.tutorialspoint.com/lucene/lucene_search_operation.htm
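Why the analyzer has to match can be shown with a toy model. The sketch below is plain Java, not Lucene: the `analyze` method is a stand-in that lowercases and splits on non-alphanumeric characters (roughly what common analyzers do), and the class, method names, and sample texts are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class AnalyzerMismatchSketch {
    /** Toy analyzer: lowercase, then split on anything that is not a letter or digit. */
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Index time: every document is analyzed before its terms are stored. */
    public static Map<String, Set<Integer>> index(List<String> docs) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String token : analyze(docs.get(id))) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
            }
        }
        return postings;
    }

    /** Lookup of the raw user input, with no analysis applied. */
    public static Set<Integer> rawLookup(Map<String, Set<Integer>> postings, String term) {
        return postings.getOrDefault(term, Collections.emptySet());
    }

    /** Lookup after running the user input through the same analyzer as at index time. */
    public static Set<Integer> analyzedLookup(Map<String, Set<Integer>> postings, String userInput) {
        Set<Integer> hits = new TreeSet<>();
        for (String token : analyze(userInput)) {
            hits.addAll(rawLookup(postings, token));
        }
        return hits;
    }
}
```

In this model, a raw lookup of "History" misses a document containing "A History of Rome" because only the lowercased token "history" was stored, while the analyzed lookup finds it. A QueryParser built with the indexing Analyzer plays the role of `analyzedLookup` here.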
I have observed an odd behaviour but I don't see what I am doing wrong.
I created via multiple BooleanQueries the following query:
+(-(Request.zipCode:18055 Request.zipCode:33333 Request.zipCode:99999) +Request.zipCode:[* TO *]) *:*
...this is what I get via toString
Update: this is how I created the part of the BooleanQuery that is responsible for the snippet +Request.zipCode:[* TO *]
Query fieldOccursQuery = new TermQuery(new Term(queryFieldName, "[* TO *]"));
I have created exactly the same query (as far as I understand) via QueryParser like this:
String querystr = "+(-(Request.zipCode:18055 Request.zipCode:33333 Request.zipCode:99999) +Request.zipCode:[* TO *]) *:*";
Query query = new QueryParser(Version.LUCENE_46, "title", LuceneServiceI.analyzer).parse(querystr);
I processed both of them the same way like this:
IndexReader reader = DirectoryReader.open(directory);
IndexSearcher searcher = new IndexSearcher(reader);
int max = reader.maxDoc();
TopScoreDocCollector collector = TopScoreDocCollector.create(max > 0 ? max : 1, true);
searcher.search(query, collector);
....
ScoreDoc[] hits = collector.topDocs().scoreDocs;
Map<Integer, Document> docMap = new TreeMap<Integer, Document>();
for (int i = 0; i < hits.length; i++) {
    // use the searcher declared above
    docMap.put(hits[i].doc, searcher.doc(hits[i].doc));
}
Different results
On an index like: stored,indexed,tokenized,omitNorms,indexOptions=DOCS_ONLY<Request.zipCode:04103>
The query via QueryParser delivers one document, as expected
The query via BooleanQuery does not deliver the one expected document
Questions
Is it possible for two seemingly identical queries to deliver different results? Do I need to set certain attributes on my BooleanQuery, for example?
How can I get the same desired result with the BooleanQuery?
I could not find anything about differences, except with regard to performance (http://www.gossamer-threads.com/lists/lucene/java-user/144374)
I found the solution to my problem.
Instead of creating this for the BooleanQuery:
Query fieldOccursQuery = new TermQuery(new Term(queryFieldName, "[* TO *]"));
I used this:
ConstantScoreQuery constantScoreQuery = new ConstantScoreQuery(new FieldValueFilter(queryFieldName));
query.add(constantScoreQuery, Occur.MUST);
Now my query looks different, but I only get documents that contain the field queryFieldName.
The issue seems to be the leading wildcard in my first solution:
Find all Lucene documents having a certain field
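One way to see the difference: a TermQuery does not parse query syntax. It looks up the term text exactly as given, so "[* TO *]" is treated as a literal term rather than an open range, and no document contains that literal term. The sketch below models this in plain Java (it is not Lucene code); documents are simple maps, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FieldExistsSketch {
    /** Conceptually what a TermQuery does: exact comparison with the indexed term text. */
    public static List<Integer> literalTermMatch(List<Map<String, String>> docs, String field, String termText) {
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            if (termText.equals(docs.get(id).get(field))) {
                hits.add(id);
            }
        }
        return hits;
    }

    /** The intended query: the field must exist and its value must not be an excluded one. */
    public static List<Integer> hasFieldButNot(List<Map<String, String>> docs, String field, Set<String> excluded) {
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            String value = docs.get(id).get(field);
            if (value != null && !excluded.contains(value)) {
                hits.add(id);
            }
        }
        return hits;
    }
}
```

With a document whose Request.zipCode is 04103, `literalTermMatch(docs, "Request.zipCode", "[* TO *]")` finds nothing, while `hasFieldButNot` with the three excluded zip codes returns that document, which mirrors the behaviour described above.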
I am using Lucene to search a contacts directory with general contact information for a database of people, such as first name, last name, phone number, address etc. This question pertains specifically to searching by first and last name. Here is how I am indexing the names.
document.add(new Field("firstName", contact.getFirstName(), Field.Store.NO, Field.Index.NOT_ANALYZED));
document.add(new Field("lastName", contact.getLastName(), Field.Store.NO, Field.Index.NOT_ANALYZED));
I am searching the index like this:
IndexReader indexReader = IndexReader.open(FSDirectory.open(directory));
IndexSearcher indexSearcher = new IndexSearcher(indexReader);
int hitsPerPage = indexSearcher.maxDoc();
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
String[] fields = {"id", "firstName", "lastName", "phoneNumber", "email", "address", "website"};
BooleanQuery booleanQuery = new BooleanQuery();
String[] terms = queryString.split(" ");
for (String term : terms) {
    for (String field : fields) {
        booleanQuery.add(new FuzzyQuery(new Term(field, term)), BooleanClause.Occur.SHOULD);
    }
}
TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
indexSearcher.search(booleanQuery, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;
I am using a boolean query as opposed to a MultiFieldQuery because it allows me to get results when a field is not an exact match. Basically I split the query string by whitespace and then add terms for each of those keywords on each field in the index. I'm new to Lucene so I really have no idea if this is the optimal way to do this, but so far it's been working OK for me.
The only hiccup I'm having is that when searching by full name it is not returning the results in the right order.
Index has 2 records, John Doe and John Smith.
When I search for John Doe my results will look like:
1) John Smith
2) John Doe
If I type John Smith it will reverse and display John Doe first. Why is it not returning the exact match as the first result?
If you are going to search for all terms across all fields, why not index the entire text as part of another field? And then you can issue a query like
// \" escapes the double quotes inside the Java string
String searchCriteria = "all:\"John Doe\"^3 OR all:(John Doe)";
IndexSearcher indexSearcher = new IndexSearcher(indexDirectory);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
QueryParser parser = new QueryParser(Version.LUCENE_35, "all", analyzer);
Query query = parser.parse(searchCriteria);
TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
indexSearcher.search(query, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;
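The query string combines a boosted exact phrase with a looser match on the individual words, so a document containing the full name in order should outrank one that matches only one of the words. A small helper along these lines could build that string from user input. This is only a sketch: the class and method names are hypothetical, and it assumes the input consists of plain words and spaces (arbitrary input would need escaping first, for example with QueryParser.escape).

```java
public class BoostedPhraseQuerySketch {
    /**
     * Builds: field:"input"^boost OR field:(input)
     * Assumption: input contains only plain words and spaces, so no escaping is done here.
     */
    public static String build(String field, String input, int boost) {
        String phraseClause = field + ":\"" + input + "\"^" + boost;  // exact phrase, boosted
        String termsClause = field + ":(" + input + ")";              // any of the words
        return phraseClause + " OR " + termsClause;
    }
}
```

For example, `BoostedPhraseQuerySketch.build("all", "John Doe", 3)` produces the same string as the hard-coded searchCriteria above.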
However, if you want to continue with your current design, you can try http://lucene.apache.org/java/3_5_0/api/all/org/apache/lucene/search/IndexSearcher.html#explain(org.apache.lucene.search.Query, int) to find out why one document is being scored higher than another.
Using boolean queries and a for loop turned out to be a proper way of searching the index in my situation. The results were being reversed due to the way I was parsing and displaying them on the client side so it was a completely unrelated issue.
I'm using Lucene 3.1 to index some documents.
When I use IndexSearcher.search(), I successfully get back results for queries.
However, when I use IndexSearcher.docFreq(), I get back 0 for a term. Can anyone offer some insight?
Also, why is there both an IndexSearcher.docFreq() and IndexReader.docFreq()? I have tried both, and both give me 0.
Here is my code:
IndexReader indexReader = IndexReader.open(dir);
IndexSearcher searcher = new IndexSearcher(indexReader);
...
String searchTermString = "foobar";
String field = "body";
Term term = new Term(field, searchTermString);
int numDocs = searcher.docFreq(term);
and then I get numDocs=0, even though when I use IndexSearcher.search() with the same search term string, I get back hits.
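Try converting your term completely to lower case letters. Analyzers such as StandardAnalyzer lowercase tokens at index time, while docFreq() looks up the term exactly as you pass it.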
Create a TermQuery from the same Term you pass to searcher.docFreq(term). Use this TermQuery for searching and check whether it yields any results. It should. If this TermQuery doesn't give any results, something is amiss in how the query is created for the search step in the question.
Are you adding your Fields with the Field.TermVector.YES option enabled?
Document doc = new Document();
doc.add(new Field("value", documentContents, Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.YES));
Use TermEnum:
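Term term = new Term(field, searchTermString);
// "enum" is a reserved word in Java, so the variable needs another name
TermEnum termEnum = indexReader.terms(term);
// terms(term) positions the enum at the first term >= the given one, so check for an exact match
int numDocs = term.equals(termEnum.term()) ? termEnum.docFreq() : 0;
termEnum.close();
And you don't need the IndexSearcher.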