String Parser in Lucene

I am new to Lucene search.
I combined a DrillDownQuery and a BooleanQuery like this:
BooleanQuery inner = new BooleanQuery();
inner.add(NumericRangeQuery.newIntRange("year", null, null, false, false), Occur.MUST);
DrillDownQuery q = new DrillDownQuery(config);
q.add("book", "lucene in action", "lissa");
inner.add(q, Occur.MUST);
DrillDownQuery q1 = new DrillDownQuery(config);
q1.add("book", "java programming", "hills publications");
inner.add(q1, Occur.MUST);
I want to store the above query in a database as a string so that I can reuse it later:
String str = inner.toString();
The str variable then contains a value like this:
++ConstantScore($facets:booklucene in actionlissa)^0.0 ++ConstantScore($facets:bookjava programminghills publications)^0.0 +year:{* TO *}
Later, whenever a user clicks a link, I want to fetch this query string from the database and parse it back into a Lucene Query.
I have tried several parsers for this, but none of them could parse this kind of query. One of the parsers I tried:
QueryParser qs = new QueryParser(null, new WhitespaceAnalyzer());
Query q = qs.parse(str);
I have also tried ComplexPhraseQueryParser and SimpleParser, but I have not gotten a proper result yet.
Any help would be greatly appreciated.
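
Note: the output of Query.toString() is generally not guaranteed to parse back into the same query, and the string above has already lost the separators between the facet path components. One possible workaround, sketched here under assumptions, is to store the inputs of the query instead of its string form and rebuild the query when it is needed. FacetSpecCodec is a hypothetical helper, not part of Lucene; it uses only the JDK (Java 10 or later) and assumes that every facet value is non-empty.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Lucene): stores the arguments of each
// DrillDownQuery.add(dim, path...) call as one string and restores them.
final class FacetSpecCodec {

    // Each String[] is {dim, pathComponent1, pathComponent2, ...}.
    static String encode(List<String[]> facets) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < facets.size(); i++) {
            if (i > 0) sb.append(';');          // separates facets
            String[] parts = facets.get(i);
            for (int j = 0; j < parts.length; j++) {
                if (j > 0) sb.append('/');      // separates dim and path components
                // URL-encoding escapes ';' and '/' that occur inside a value.
                sb.append(URLEncoder.encode(parts[j], StandardCharsets.UTF_8));
            }
        }
        return sb.toString();
    }

    // Reverses encode(); assumes non-empty values.
    static List<String[]> decode(String stored) {
        List<String[]> facets = new ArrayList<>();
        if (stored.isEmpty()) return facets;
        for (String facet : stored.split(";")) {
            String[] parts = facet.split("/");
            for (int j = 0; j < parts.length; j++) {
                parts[j] = URLDecoder.decode(parts[j], StandardCharsets.UTF_8);
            }
            facets.add(parts);
        }
        return facets;
    }
}
```

After decoding, each array could be fed back into DrillDownQuery.add(dim, path...) as in the question, with the bounds of the year range kept in separate database columns.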

Related

How to only get from MongoDB documents that match a specific filter using Java

My MongoDB Documents are like this:
"_id":"C\Users...\1.html"{
"data": Object[
gambia:1
estonia:1
...etc
]
}
I am not sure I wrote the structure correctly.
My problem is that I want to get from the DB all documents that match a given word. For example, if the user enters "car", I need all the documents whose data contains the word "car". But I couldn't find out how to do it.
What have I tried? I tried to get the first document that matches the word "gambia", which I know exists in the first document:
MongoClient mongoClient = MongoClients.create();
MongoDatabase database = mongoClient.getDatabase("motor");
MongoCollection<Document> collection = database.getCollection("dictionary");
String word = "gambia";
Document myDoc = collection.find(Filters.eq("data", word)).first();
System.out.println(myDoc);
But I'm getting a null pointer exception.

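You can achieve this with MongoDB's $exists operator.
It tests whether a given field is present in a document. Assuming each word is a key inside data, query for the field "data.<word>".
For example:
MongoCollection<Document> collection = database.getCollection("dictionary");
String word = "gambia";
// BasicDBObject implements Bson, so it can be passed to find()
BasicDBObject query = new BasicDBObject("data." + word, new BasicDBObject("$exists", true));
// find() returns a FindIterable; first() returns the first match, or null if there is none
Document myDoc = collection.find(query).first();
System.out.println(myDoc);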

How to perform a query search only on documents with one of its Fields matching a specific value in Lucene 7.5.0?

Lucene version: 7.5.0
With two given inputs (userTitle & userQuestion), I want to perform a search only within the indexed documents whose title matches with userTitle, but I am struggling to make this happen.
Below is what I have so far, but this is returning documents with other titles as well.
"title" and "body" are both TextFields.
Any advice would be greatly appreciated.
Query queryTitle = new TermQuery(new Term("title", userTitle));
Analyzer analyzer = new StandardAnalyzer();
QueryParser qpBody = new QueryParser("body", analyzer);
Query queryBody = qpBody.parse(QueryParser.escape(userQuestion));
BooleanQuery query = new BooleanQuery.Builder()
.add(queryTitle, BooleanClause.Occur.MUST)
.add(queryBody, BooleanClause.Occur.SHOULD)
.build();

Lucene query fuzzyquery

First, I apologize for my poor English. I'm new to Lucene and I didn't really understand the query documentation. I indexed some docs and wrote this query code, but it's not working:
Term t = new Term("description", "history");
Query q = new FuzzyQuery(t, 2);
int hitsPerPage = 100;
Path indexPath = Paths.get("C:\\Users\\Win 7\\Desktop\\projet_ri\\index");
Directory directory = FSDirectory.open(indexPath);
DirectoryReader reader = DirectoryReader.open(directory);
IndexSearcher iSearcher = new IndexSearcher(reader);
TopDocs topdocs = iSearcher.search(q, hitsPerPage);
ScoreDoc[] resultsList = topdocs.scoreDocs;
System.out.println("Tab size: "+resultsList.length); // This prints Tab size: 0
for(int i = 0; i<resultsList.length; i++){
Document book = iSearcher.doc(resultsList[i].doc);
String description = book.getField("description").stringValue();
System.out.println(description);
}
The program isn't even entering the loop. I checked the resultsList array and it prints that the size is zero.
Can someone help me correct my code or give me example query code?

You are missing a QueryParser for your query.
The QueryParser needs the same Analyzer that you used for indexing. This is really important; otherwise the result set may differ from what you expect. Your sequence should be something like this:
open the index
create an IndexSearcher
create a QueryParser with the Analyzer that was used for indexing
create the query string from the given search terms
parse the query string with the QueryParser
search
close everything!
See this basic Lucene tutorial: https://www.tutorialspoint.com/lucene/lucene_search_operation.htm
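
Why the analyzer has to match can be illustrated with a toy model. This is a sketch, not Lucene's API: ToyAnalyzerDemo and its methods are hypothetical, and the "analyzer" only lowercases and splits on non-alphanumeric characters. Whether this mismatch is the actual cause in the question above is not established.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model (not Lucene): shows why query terms must go through the
// same analysis as the indexed text.
final class ToyAnalyzerDemo {

    // "Analyzer": lowercase, then split on anything that is not a letter or digit.
    static Set<String> analyze(String text) {
        Set<String> terms = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) terms.add(token);
        }
        return terms;
    }

    // Looks up the raw term as typed, similar to building a term query by hand.
    static boolean matchesRaw(Set<String> indexedTerms, String rawTerm) {
        return indexedTerms.contains(rawTerm);
    }

    // Analyzes the user input first, similar to what a query parser does.
    static boolean matchesAnalyzed(Set<String> indexedTerms, String userInput) {
        Set<String> queryTerms = analyze(userInput);
        return !queryTerms.isEmpty() && indexedTerms.containsAll(queryTerms);
    }
}
```

With an index built from analyze("A Brief History of Time"), a raw lookup of "History" finds nothing, because only the lowercased term was indexed, while the analyzed lookup matches.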

Java Lucene - different results for BooleanQuery and QueryParser Query for same Lucene Query Language

I have observed an odd behaviour but I don't see what I am doing wrong.
I created via multiple BooleanQueries the following query:
+(-(Request.zipCode:18055 Request.zipCode:33333 Request.zipCode:99999) +Request.zipCode:[* TO *]) *:*
...this is what I get via toString
Update: this is how I created the part of the BooleanQuery that is responsible for the snippet +Request.zipCode:[* TO *]:
Query fieldOccursQuery = new TermQuery(new Term(queryFieldName, "[* TO *]"));
I created exactly the same query (as far as I understand) via QueryParser like this:
String querystr = "+(-(Request.zipCode:18055 Request.zipCode:33333 Request.zipCode:99999) +Request.zipCode:[* TO *]) *:*";
Query query = new QueryParser(Version.LUCENE_46, "title", LuceneServiceI.analyzer).parse(querystr);
I processed both of them the same way like this:
IndexReader reader = DirectoryReader.open(directory);
IndexSearcher searcher = new IndexSearcher(reader);
int max = reader.maxDoc();
TopScoreDocCollector collector = TopScoreDocCollector.create(max > 0 ? max : 1, true);
searcher.search(query, collector);
....
ScoreDoc[] hits = collector.topDocs().scoreDocs;
Map<Integer, Document> docMap = new TreeMap<Integer, Document>();
for (int i = 0; i < hits.length; i++) {
docMap.put(hits[i].doc, searcher.doc(hits[i].doc));
}
Different results
On a index like: stored,indexed,tokenized,omitNorms,indexOptions=DOCS_ONLY<Request.zipCode:04103>
The query built via QueryParser returns one document, as expected.
The query built via BooleanQuery does not return the one expected document.
Questions
Is it possible for two identical queries to return different results? Do I need to set certain attributes on my BooleanQuery?
How can I get the same result with the BooleanQuery?
I could not find anything about differences, except with regard to performance (http://www.gossamer-threads.com/lists/lucene/java-user/144374)

I found the solution to my problem.
Instead of creating this for the BooleanQuery:
Query fieldOccursQuery = new TermQuery(new Term(queryFieldName, "[* TO *]"));
I used this:
ConstantScoreQuery constantScoreQuery = new ConstantScoreQuery(new FieldValueFilter(queryFieldName));
query.add(constantScoreQuery, Occur.MUST);
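Now my query looks different, but I only get documents that have the field queryFieldName.
The issue seems to be the leading wildcard in my first solution. Note that a TermQuery does not parse query syntax, so "[* TO *]" was looked up as a literal term rather than as a range. See:
Find all Lucene documents having a certain field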

How can I retrieve DBObjects that contains a substring of a search-word?

I am using MongoDB with the Java driver (http://tinyurl.com/dyjxz8k). In my application I want to return results that contain a substring of the user's search term. The method looks like this:
*searchlabel = the name of a field
*searchTerm = the user's search word
private void dbSearch(String searchlabel, String searchTerm){
if(searchTerm != null && (searchTerm.length() > 0)){
DBCollection coll = db.getCollection("MediaCollection");
BasicDBObject query = new BasicDBObject(searchlabel, searchTerm);
DBCursor cursor = coll.find(query);
try {
while(cursor.hasNext()) {
System.out.println(cursor.next());
//view.showResult(cursor.next());
}
} finally {
cursor.close();
}
}
}
Does anybody have an idea how I can solve this? Thanks in advance =) And a small additional question: how can I present the DBObjects in the view (in a JLabel)?

For text-searching in Mongo, there are two options:
$regex operator - however unless you have a simple prefix regexp, queries won't use an index, and will result in a full scan, which usually is slow
In Mongo 2.4, a new text index has been introduced. A text query will split your query into words, and do an or-search for documents including any of the words. Text indexes also eliminate some stop-words and have simple stemming for some languages (see the docs).
If you are looking for a more advanced full-text search engine, with more powerful tokenising, stemming, autocomplete etc., maybe a better fit would be e.g. ElasticSearch.
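
For the $regex option, the search term should be escaped so that characters such as "." or "*" are matched literally. A minimal sketch using only the JDK follows; SubstringPattern is a hypothetical helper name, and the resulting query is not anchored, so the full-scan caveat above still applies.

```java
import java.util.regex.Pattern;

// Hypothetical helper: builds a pattern that matches searchTerm anywhere
// in a string, ignoring case.
final class SubstringPattern {

    static Pattern containing(String searchTerm) {
        // Pattern.quote wraps the term so regex metacharacters are taken literally.
        return Pattern.compile(Pattern.quote(searchTerm), Pattern.CASE_INSENSITIVE);
    }
}
```

As far as I know, the legacy driver used in the question accepts a java.util.regex.Pattern as a query value, so new BasicDBObject(searchlabel, SubstringPattern.containing(searchTerm)) would be the place to plug it in; treat that as an assumption to verify against your driver version.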

I use this method in the mongo console to search with a regular expression in JavaScript:
// My name to search for
var searchWord = "alex";
// Construct a query with a simple /^alex$/i regex
var query = {};
query.animalName = new RegExp("^"+searchWord+"$","i");
// Perform find operation
var lionsNamedAlex = db.lions.find(query);
