I have a system where people enter some words, and based on those words I have to search a database of products. Each product belongs to one category and has attributes such as brand, price, and condition (new, old, used, ...).
Does anyone know how to sort a list of results by best match, i.e. so that the products matching the words entered by the user appear first?
Maybe you could use Zend Lucene; you'll find a quick intro on this Symfony framework page.
Edit: as you are using Java, try the original Lucene library (Zend Lucene is actually a port to PHP).
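To make "best match first" concrete, here is a minimal plain-Java sketch that does not use Lucene at all. It assumes each product is represented by a single text string (a hypothetical simplification) and scores a product by how many distinct query words it contains. Lucene's own relevance scoring is considerably more refined, so treat this only as an illustration of the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: ranks products by word overlap with the query.
final class BestMatchRanker {

    // Split text into lowercase word tokens; anything that is not a letter or digit separates words.
    static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Score = number of distinct query words that occur in the product text.
    static int score(String query, String productText) {
        Set<String> common = tokenize(query);
        common.retainAll(tokenize(productText));
        return common.size();
    }

    // Return a copy sorted by descending score; ties keep their input order (List.sort is stable).
    static List<String> rank(String query, List<String> products) {
        List<String> sorted = new ArrayList<>(products);
        sorted.sort(Comparator.comparingInt((String p) -> score(query, p)).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> products = Arrays.asList(
                "Acme phone, used",
                "Acme laptop, new",
                "Globex laptop, used");
        System.out.println(rank("new acme laptop", products));
    }
}
```

A real search library additionally weights rare words more heavily and normalizes for text length, which is one reason to prefer Lucene over a hand-rolled scorer once the catalogue grows.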
I'm developing a Java web application where users enter their requests through a text box. I need to analyze the user's text input (customer requests), compare it with the web application's database, and show suitable suggestions or results to the customer. Is this possible with OpenNLP? Please give me some advice.
This sounds like a "More Like This" kind of use case rather than an NLP use case, but it depends on some details . . .
If you need to extract specific product names from the customer request and then look them up, you could train a Named Entity Recognition (NER) model on your data using OpenNLP's name finder. That may be overkill for this use case, though: unless you have a ton of data with a ton of product names, you could probably just use a regex match against a solid list of product names.
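The regex approach could look roughly like the sketch below, which uses only java.util.regex. The class name and the sample product names are made up for illustration, and the list of names is assumed to come from your own database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds known product names inside free-text customer requests.
final class ProductNameMatcher {

    // Build one case-insensitive alternation from the known names.
    // Pattern.quote keeps characters such as '+' or '.' literal.
    // Note: "\\b" is only a boundary next to letters, digits or '_', so names that
    // start or end with punctuation would need different handling.
    static Pattern compile(List<String> productNames) {
        if (productNames.isEmpty()) {
            throw new IllegalArgumentException("at least one product name is required");
        }
        List<String> sorted = new ArrayList<>(productNames);
        // Longer names first, so that "Galaxy S10 Plus" wins over "Galaxy S10".
        sorted.sort((a, b) -> Integer.compare(b.length(), a.length()));
        StringBuilder alternation = new StringBuilder();
        for (String name : sorted) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(name));
        }
        return Pattern.compile("\\b(?:" + alternation + ")\\b",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    // Return every product mention, spelled as it appears in the request.
    static List<String> findProducts(Pattern pattern, String request) {
        List<String> found = new ArrayList<>();
        Matcher m = pattern.matcher(request);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }
}
```

Compile the pattern once at startup and reuse it for every request. If the list grows to many thousands of names, or customers misspell names frequently, that is the point where NER or a search index starts to pay off.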
If you need to "fuzzy match" the whole customer request against other customer requests, product descriptions or similar, you would likely be better off using something like ElasticSearch to index your database entries, then passing the customer request to the "more like this" function, which returns the N best matches (scored) on the fly. In fact, I would recommend trying this approach first, since it requires none of the model maintenance, training data or feature extraction that comes with NER.
HTH
Here's a link to the ElasticSearch MLT function
On my server I have a parser and a searcher for Lucene queries that search over XML, and I have an Android application that uses this service.
Until now the Android application has used the searcher in a simple way: writing something in a text widget and clicking a button runs a search like:
title: something
This finds all the files that have "something" in the title.
But the service also lets me run searches like:
mediatype:audio AND mtime:[45dayago TO now] AND metadata_count:[04 TO 99]
More info on Lucene query syntax is here.
For the user it's really difficult to know which terms are valid or how to write an advanced query, but being able to search the archive is really important. I would like to build a valid Lucene query for the user in an easy way, so that they can benefit from the advanced search.
Any ideas would be appreciated.
If it were me, I'd start by constructing a view that has an arbitrary number of EditText boxes (the user can select how many, possibly via a spinner or an "add more" button). Then add a selector for whether these terms are to be "ANDed" or "ORed" together.
Once you have that working, enhance each entry to be more than just an EditText, with options for entering numbers (via spinners?), ranges (via a slider?) or dates/times (via a calendar picker?).
The key here is to take an iterative approach. Implement one additional capability, and make sure it works, before moving on to the next one.
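On the code side, the form values could be assembled into a query string along these lines. This is a sketch in plain Java with no Android or Lucene dependency; the class and method names are hypothetical, and it assumes the field names come from a fixed list in the UI rather than from free text typed by the user.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns form entries into a Lucene classic-syntax query string.
final class LuceneQueryBuilder {
    private final List<String> clauses = new ArrayList<>();

    // Backslash-escape characters that have a special meaning in Lucene's classic query syntax.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if ("+-&|!(){}[]^\"~*?:\\/".indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Adds field:value, quoting the value as a phrase when it contains a space.
    LuceneQueryBuilder term(String field, String value) {
        String v = escape(value.trim());
        clauses.add(field + ":" + (v.indexOf(' ') >= 0 ? "\"" + v + "\"" : v));
        return this;
    }

    // Adds field:[from TO to], an inclusive range.
    LuceneQueryBuilder range(String field, String from, String to) {
        clauses.add(field + ":[" + escape(from) + " TO " + escape(to) + "]");
        return this;
    }

    // Joins all clauses with the operator picked in the UI ("AND" or "OR").
    String build(String operator) {
        return String.join(" " + operator + " ", clauses);
    }
}
```

For example, `new LuceneQueryBuilder().term("mediatype", "audio").range("mtime", "45dayago", "now").range("metadata_count", "04", "99").build("AND")` reproduces the advanced query from the question. Each new widget type (number, range, date) then only needs one more method on the builder, which fits the iterative approach.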
I want to build a keyword search. I have looked at the Google App Engine API and the Lucene API. My problem is this: I have some articles, let's say 5000, and each article has a unique ID. If a user searches with a keyword, the program should return the IDs of all articles that contain this keyword.
Second, if a user searches with a keyword, e.g. dress, it should return the articles that contain the words dress, dressing, dressed, etc.
This is what the Search API is designed for.
While it has some limitations, for your basic use case it should suffice. If you want to use Lucene, you will need to run it on another platform (or heavily customise it) because it uses the file system.
For your requirement to find similar words, you can read about stemmed queries here
Use Lucene, which is a high-performance, full-featured text search engine library. Index each article as a separate Lucene document with a unique field article_id, and index the article text in a field article_text. Apply StopFilter, PorterStemFilter, etc. to the article_text field. After indexing you are ready to search for keywords.
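To show what stop-word removal and stemming buy you, here is a toy inverted index in plain Java. It is not Lucene code: the suffix stripper is a deliberately naive stand-in for the Porter stemmer, the stop-word list is a tiny made-up sample, and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration of "index stemmed terms, then look up the stemmed keyword".
final class TinyArticleIndex {
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "in", "is", "of", "the"));

    // stemmed term -> IDs of the articles that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Naive suffix stripping (NOT the Porter algorithm): dressing, dressed, dresses -> dress.
    static String stem(String word) {
        for (String suffix : new String[] {"ing", "ed", "es", "s"}) {
            if (word.endsWith(suffix) && !word.endsWith("ss")
                    && word.length() - suffix.length() >= 3) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    // Tokenize, drop stop words, stem, and record the article ID under each term.
    void add(int articleId, String articleText) {
        for (String token : articleText.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            postings.computeIfAbsent(stem(token), k -> new TreeSet<>()).add(articleId);
        }
    }

    // The keyword goes through the same stemming as the indexed text.
    Set<Integer> search(String keyword) {
        String term = stem(keyword.toLowerCase(Locale.ROOT).trim());
        return postings.getOrDefault(term, Collections.emptySet());
    }
}
```

The important point is that the same analysis runs at index time and at query time, so "dress", "dressing" and "dressed" all collapse to one term. In Lucene you get that behaviour by using the same analyzer for indexing and for parsing the query.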
I have been working on information extraction and was able to run StandAloneAnnie.java:
http://gate.ac.uk/wiki/code-repository/src/sheffield/examples/StandAloneAnnie.java
My question is: how can I use GATE ANNIE to get similar words, e.g. if I input (dine) I get results like (food, eat, dinner, restaurant)?
More Information:
I am doing a project where I was assigned to develop a simple web page that takes user input and passes it to GATE components, which will tokenize the query and return a semantic grouping for each phrase in order to make recommendations.
For example, the user would enter "I want to have dinner in Kuala Lumpur" and the system would break it down to (Search for: dinner - Required: restaurant, dinner, eat, food - Location: Kuala Lumpur).
ANNIE by default has about 15 annotations; see the demo:
http://services.gate.ac.uk/annie/
I have already implemented everything as in the demo, but my question is: can I do that using GATE ANNIE? I mean, is it possible to find synonyms of words or to group words based on their type (nouns, verbs)?
Plain vanilla ANNIE doesn't support this kind of thing, but there are third-party plugins such as Phil Gooch's WordNet Suggester that might help. Or, if your domain is fairly restricted, you might get better results with less effort by simply creating your own gazetteer lists of related terms and a few simple JAPE rules. You may find the training materials available on the GATE Wiki useful if you haven't done much of this before.
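The "own lists of related terms" idea boils down to a lookup table. The sketch below shows only that lookup in plain Java; it does not use the GATE API, the gazetteer file format or JAPE, and the class name and sample terms are made up for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for hand-written lists of related terms.
final class RelatedTerms {
    // every term points at the whole group it belongs to
    private final Map<String, Set<String>> groupByTerm = new HashMap<>();

    // Register one hand-written list; a term that appears in two lists keeps the later one.
    void addGroup(String... terms) {
        Set<String> group = new LinkedHashSet<>();
        for (String t : terms) {
            group.add(t.toLowerCase(Locale.ROOT));
        }
        for (String t : group) {
            groupByTerm.put(t, group);
        }
    }

    // Terms related to the word, excluding the word itself; empty when the word is unknown.
    Set<String> relatedTo(String word) {
        String key = word.toLowerCase(Locale.ROOT);
        Set<String> group = groupByTerm.get(key);
        if (group == null) {
            return Collections.emptySet();
        }
        Set<String> related = new LinkedHashSet<>(group);
        related.remove(key);
        return related;
    }
}
```

After `addGroup("dine", "food", "eat", "dinner", "restaurant")`, a lookup for any of those words returns the other four. Inside GATE, the gazetteer would do the matching against the text and JAPE rules would turn the matches into annotations.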
Currently, I am using Lucene version 3.0.2 to create a search application that is similar to a dictionary. One of the things I want to display is a sort of "example": Lucene would look for a word in a book and then display the sentences in which the word is used.
I've been reading the Lucene in Action book and it mentions something like this, but looking through it I can't find other mentions. Is this something you can do with Lucene? If so, how can you do it?
I believe what you are looking for is a Highlighter.
One possibility is to use the org.apache.lucene.search.highlight package, specifically the Highlighter class.
Another option is to use the org.apache.lucene.search.vectorhighlight package, specifically the FastVectorHighlighter class.
Both classes search a text document, choose relevant snippets and display them with the matching terms highlighted. I have only used the first one, which worked fine for my use-case. If you can pre-divide the book into shorter parts, it would make highlighting faster.
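If you want to see the underlying idea without any Lucene dependency, the following plain-Java sketch splits a text into sentences with java.text.BreakIterator and returns the sentences that contain the word, with each occurrence wrapped in `<b>` tags. It is only an illustration: it is not how either Lucene highlighter is implemented, and the class name is hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds example sentences for a word and marks the matches.
final class ExampleSentences {

    static List<String> find(String text, String word) {
        // Whole-word, case-insensitive match; Pattern.quote keeps the word literal.
        Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        List<String> hits = new ArrayList<>();

        // Walk the sentence boundaries reported by the JDK for English text.
        BreakIterator sentences = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        sentences.setText(text);
        int start = sentences.first();
        for (int end = sentences.next(); end != BreakIterator.DONE;
                start = end, end = sentences.next()) {
            String sentence = text.substring(start, end).trim();
            Matcher m = pattern.matcher(sentence);
            if (m.find()) {
                // replaceAll resets the matcher, so every occurrence gets wrapped.
                hits.add(m.replaceAll("<b>$0</b>"));
            }
        }
        return hits;
    }
}
```

Unlike this sketch, the Lucene highlighters work from the parsed query and the analyzer, so they also highlight stemmed or otherwise analyzed variants of the search term.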