I'm trying to create a business card reader.
I have a block of text, for example:
Name
Head - Business Development
Company Name
# 2/324, ll Floor, Some Road,
Street, City-Zip, State, Country.
Tel : +987654321
Mobile: +123456789
Email : mail@company.com
Website : www.company.com
I want to parse details such as the name, company name, designation and address out of this text. I was able to parse the phone number, email address and website. Can anyone help me with the rest? I don't want to use any web services; I want it to be done offline.
"I was able to parse number, email address and website"
How are you doing it? What problems are you facing in parsing other parts?
If the sequence of contents is not going to change, simply read line by line and parse accordingly.
You can look at StringTokenizer.
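If the card layout really is fixed (name on the first line, designation on the second, company on the third, and so on), a minimal line-by-line sketch could look like the following; the layout assumption and the sample text are taken from the question, and the contact lines are skipped since you already parse those separately:

import java.util.ArrayList;
import java.util.List;

public class BusinessCardParser {
    public static void main(String[] args) {
        // Sample block of text from the question; assumes a fixed layout:
        // line 0 = name, line 1 = designation, line 2 = company,
        // Tel/Mobile/Email/Website lines are handled elsewhere,
        // everything else is treated as the address.
        String text = "Name\n"
                + "Head - Business Development\n"
                + "Company Name\n"
                + "# 2/324, ll Floor, Some Road,\n"
                + "Street, City-Zip, State, Country.\n"
                + "Tel : +987654321\n"
                + "Mobile: +123456789\n"
                + "Email : mail@company.com\n"
                + "Website : www.company.com";

        String name = null, designation = null, company = null;
        List<String> addressLines = new ArrayList<>();

        String[] lines = text.split("\\r?\\n");
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i].trim();
            if (line.matches("(?i)^(tel|mobile|email|website)\\s*:.*")) {
                continue; // phone, email and website are parsed separately
            } else if (i == 0) {
                name = line;
            } else if (i == 1) {
                designation = line;
            } else if (i == 2) {
                company = line;
            } else {
                addressLines.add(line);
            }
        }

        System.out.println("Name        : " + name);
        System.out.println("Designation : " + designation);
        System.out.println("Company     : " + company);
        System.out.println("Address     : " + String.join(" ", addressLines));
    }
}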
Can anyone help me find a solution for an API whose path changes based on the email id (account details)?
For example:
/api/getRooms/test@gmail.com
(the last part, the email id, changes for other accounts).
Put all the email ids in a CSV file with a header (header name: Email).
Then use "CSV Data Set Config" to provide them as inputs to your request, and pass the header as a variable to the API, like /api/getRooms/${Email}.
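A minimal sketch of how that could look (the file name emails.csv and the second address are made up for illustration):

emails.csv
    Email
    test@gmail.com
    another.account@gmail.com

CSV Data Set Config
    Filename:       emails.csv
    Variable Names: (leave empty so the header row is used, or simply enter Email)

HTTP Request path
    /api/getRooms/${Email}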
In my Java web application (JSP + Servlet + Hibernate), users can request books. The request goes to the database as text. After that I tokenize the text using Apache OpenNLP. Then I need to compare this tokenized text with the books table (the books table has Book ID, Book Name, Author, Description) and give the most related suggestions to the user. Mostly I need to compare it with the book name column and the book description column. Is this possible?
import opennlp.tools.tokenize.SimpleTokenizer;

public class SimpleTokenizerExample {
    public static void main(String[] args) {
        String sentence = "Hello Guys , I like to read horror stories. If you have any horror story books please share with us. Also my favorite author is Stephen King";

        // Instantiating the SimpleTokenizer class
        SimpleTokenizer simpleTokenizer = SimpleTokenizer.INSTANCE;

        // Tokenizing the given sentence
        String[] tokens = simpleTokenizer.tokenize(sentence);

        // Printing the tokens
        for (String token : tokens) {
            System.out.println(token);
        }
    }
}
Apache OpenNLP can do Natural Language Processing, but the task you describe is Information Retrieval. Take a look at http://lucene.apache.org/solr/.
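For example, plain Lucene (without Solr) can index the book name and description and return results ranked by relevance. A minimal sketch, assuming a recent Lucene version (5.x or later); the field names and sample data are made up:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class BookSearchSketch {
    public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        Directory index = new RAMDirectory(); // in-memory index, just for the sketch

        // Index the books (in the real application, loop over the books table)
        IndexWriter writer = new IndexWriter(index, new IndexWriterConfig(analyzer));
        Document doc = new Document();
        doc.add(new StringField("id", "1", Field.Store.YES));
        doc.add(new TextField("name", "It", Field.Store.YES));
        doc.add(new TextField("description", "A horror novel by Stephen King", Field.Store.YES));
        writer.addDocument(doc);
        writer.close();

        // Search the description field with the user's request text
        IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(index));
        Query query = new QueryParser("description", analyzer)
                .parse("horror stories by Stephen King");
        TopDocs results = searcher.search(query, 10);

        // Results come back ranked by relevance score
        for (ScoreDoc hit : results.scoreDocs) {
            Document found = searcher.doc(hit.doc);
            System.out.println(found.get("id") + " : " + found.get("name") + " (" + hit.score + ")");
        }
    }
}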
If you really need to use the DB only, you can try making a query for each token using the SQL LIKE keyword, e.g.:
SELECT DISTINCT id FROM mytable WHERE description LIKE '%token%';
and rank higher the rows that match more tokens.
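A rough JDBC sketch of that idea; the connection URL and the column names (book_id, book_name, description) are made up and need to match your schema:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.HashMap;
import java.util.Map;

public class LikeMatchSketch {
    public static void main(String[] args) throws Exception {
        String[] tokens = {"horror", "stories", "Stephen", "King"};
        Map<Integer, Integer> matchCounts = new HashMap<>(); // book_id -> number of tokens matched

        try (Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/library", "user", "password")) {
            String sql = "SELECT book_id FROM mytable WHERE book_name LIKE ? OR description LIKE ?";
            try (PreparedStatement ps = con.prepareStatement(sql)) {
                for (String token : tokens) {
                    ps.setString(1, "%" + token + "%");
                    ps.setString(2, "%" + token + "%");
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            matchCounts.merge(rs.getInt("book_id"), 1, Integer::sum);
                        }
                    }
                }
            }
        }

        // Books that matched more tokens rank higher
        matchCounts.entrySet().stream()
                .sorted(Map.Entry.<Integer, Integer>comparingByValue().reversed())
                .forEach(e -> System.out.println("book " + e.getKey() + " matched " + e.getValue() + " tokens"));
    }
}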
How can OpenNLP help you?
You can use the OpenNLP Stemmer. In that case you can stem the book description and title before adding them to the columns in the database. You also need to stem the query. This will help you with inflections: "car" will match both "car" and "cars".
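For example, OpenNLP ships a Porter stemmer (opennlp.tools.stemmer.PorterStemmer in recent versions, 1.6+ as far as I know); a small sketch:

import opennlp.tools.stemmer.PorterStemmer;

public class StemmerSketch {
    public static void main(String[] args) {
        PorterStemmer stemmer = new PorterStemmer();
        // Stem both the stored title/description tokens and the query tokens,
        // so inflected forms end up with the same stem.
        String[] words = {"car", "cars", "story", "stories"};
        for (String word : words) {
            System.out.println(word + " -> " + stemmer.stem(word));
        }
    }
}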
You can accomplish the same with the OpenNLP Lemmatizer, but you need a trained model, which is not currently available for that module.
Just to add to what @wcolen says, some out-of-the-box stemmers for various languages exist in Lucene as well.
Another thing OpenNLP could help with is recognizing book authors' names (e.g. Stephen King) via the NameFinder tool, so that your code can create a phrase query for such entities instead of a plain keyword-based query. That way you won't get results containing just Stephen or just King, but only results containing Stephen King.
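A sketch of that name-finder idea; it assumes you have downloaded the pre-trained en-ner-person.bin model from the OpenNLP models page:

import java.io.FileInputStream;
import java.io.InputStream;

import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.tokenize.SimpleTokenizer;
import opennlp.tools.util.Span;

public class AuthorNameSketch {
    public static void main(String[] args) throws Exception {
        // en-ner-person.bin is the pre-trained English person-name model
        try (InputStream modelIn = new FileInputStream("en-ner-person.bin")) {
            NameFinderME nameFinder = new NameFinderME(new TokenNameFinderModel(modelIn));

            String[] tokens = SimpleTokenizer.INSTANCE.tokenize(
                    "Also my favorite author is Stephen King");
            Span[] nameSpans = nameFinder.find(tokens);

            // Each recovered span covers a full name, e.g. "Stephen King",
            // which you could turn into a phrase query instead of two separate keywords.
            for (String name : Span.spansToStrings(nameSpans, tokens)) {
                System.out.println(name);
            }
        }
    }
}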
I'm trying to use the BaaS Parse for my Android application.
I've followed the quickstart guide with success, so I have successfully registered a TestObject into my Parse database.
Now I am trying to adapt Parse to my needs and to register a User into the Parse database with this code:
Parse.initialize(this, "NkThBbZ4gcXQf3s59UGTozpCjKQbECVP5SmuXCkY", "n7a65t3o8fZNAXTOygWIg2L9Kui316yepfgoSdhf");
ParseObject userParse = new ParseObject("User");
userParse.put("username", user.getPseudonyme());
userParse.put("password", user.getPassword());
userParse.saveInBackground();
But it doesn't insert my user into the User table; instead it creates a new table and inserts my user into that new User table.
The first User table is a default table created by Parse, as far as I understand, but I don't understand why it is impossible to use it.
The result is that I have two User tables.
Thank you very much for your answers; has anyone had the same problem?
Sebastien
The pre-made User table actually has an underscore before it. It is the _User table, not the User table. The underscore is to signify it's a special class that Parse gives you. Use _User instead of User.
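Note that rather than building a ParseObject by hand, the Parse Android SDK also has a ParseUser class that targets that built-in _User class; a sketch of the usual sign-up flow (method name and callback handling simplified for illustration):

import com.parse.ParseException;
import com.parse.ParseUser;
import com.parse.SignUpCallback;

public class SignUpExample {
    public static void signUp(String pseudonyme, String password) {
        ParseUser user = new ParseUser();
        user.setUsername(pseudonyme);
        user.setPassword(password);

        // signUpInBackground stores the user in the built-in _User class
        user.signUpInBackground(new SignUpCallback() {
            @Override
            public void done(ParseException e) {
                if (e == null) {
                    // Sign up succeeded: the user now appears in the _User table
                } else {
                    // Sign up failed: inspect e.getMessage()
                }
            }
        });
    }
}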
I am doing address validation. My Address table has "Street Address", "City", "State", "Postal Code" and "Country". I am using the Google Maps API to validate my address.
I gave my address like this:
Street Address -- "kajhfkjdhfkjdsh"
City -- ksjfdlsjflsdjflk
State -- AP
PostalCode -- 500087
Country -- India
In this example only State, PostalCode and Country are valid and the remaining fields are invalid. But when I use the Google Maps API it says this is a valid address, even though the street address and city are invalid.
So, as per my observation, the address validation is done based on only three fields (State, Postal Code, Country). How can I validate the street and city along with the remaining fields with the Google Maps API?
Or is there any other way or API to validate all the fields in my address table?
Can anyone help me with this? I am stuck here.
When trying this address with the Google geocoding service http://maps.googleapis.com/maps/api/geocode/xml?sensor=false&address=kajhfkjdhfkjdsh,ksjfdlsjflsdjflk,AP,500087,India I see that the result is of type postal_code and partial_match is set to true.
Trying a real address, I get the result type street_address and the partial_match flag is missing.
I suggest you run some tests and check the content of the service result. This is a geocoding service, so it aims to give you a location to point your map at, not to validate addresses.
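For example, a quick way to experiment from Java is to fetch the XML response and check for the partial_match flag and the result type; this sketch uses crude string checks on the raw XML, just for testing:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class GeocodeCheck {
    public static void main(String[] args) throws Exception {
        String address = "kajhfkjdhfkjdsh, ksjfdlsjflsdjflk, AP, 500087, India";
        String url = "http://maps.googleapis.com/maps/api/geocode/xml?sensor=false&address="
                + URLEncoder.encode(address, "UTF-8");

        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
        StringBuilder xml = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                xml.append(line);
            }
        }

        // A full street-level match comes back with type "street_address"
        // and without the partial_match element.
        boolean partialMatch = xml.indexOf("<partial_match>") >= 0;
        boolean streetLevel = xml.indexOf("<type>street_address</type>") >= 0;
        System.out.println("partial_match present : " + partialMatch);
        System.out.println("street-level result   : " + streetLevel);
    }
}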
I'm using the Yahoo Finance Streaming API to get stock quotes, and I want to save these into a DB table for historical reference.
I'm looking for something which can easily parse various strings whose format varies, like the examples below:
<script>try{parent.yfs_mktmcb({"unixtime":1310957222});}catch(e){}</script>
<script>try{parent.yfs_u1f({"ASX.AX":{c10:"-0.06"}});}catch(e){}</script>
<script>try{parent.yfs_u1f({"AWC.AX":{l10:"2.16",c10:"+0.01",p20:"+0.47"}});}catch(e){}</script>
<script>try{parent.yfs_u1f({"ALZ.AX":{l10:"2.6900",c10:"-0.1200",p20:"-4.27"}});}catch(e){}</script>
I want to parse these strings into a MySQL database, and I was thinking the easiest way would be to use Java for the parsing. Basically these entries sit line by line in a text file. I want to extract the time, the stock code, the price and the change values into a simple table.
The table looks like: StockCode | Date | Time | Price | ChangeDol | ChangePer
Are there any tools or frameworks which would make this process easy?
Thanks!
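Since all the sample lines share the parent.yfs_u1f({...}) shape, one option before reaching for a framework is a plain regular expression. A minimal sketch; the meaning of the field codes (l10 = last price, c10 = dollar change, p20 = percentage change) is inferred from the samples above:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class YfsLineParser {
    // Matches e.g. parent.yfs_u1f({"AWC.AX":{l10:"2.16",c10:"+0.01",p20:"+0.47"}});
    private static final Pattern QUOTE =
            Pattern.compile("yfs_u1f\\(\\{\"([^\"]+)\":\\{([^}]*)\\}\\}\\)");
    private static final Pattern FIELD = Pattern.compile("(\\w+):\"([^\"]*)\"");

    public static void main(String[] args) {
        String line = "<script>try{parent.yfs_u1f({\"AWC.AX\":{l10:\"2.16\",c10:\"+0.01\",p20:\"+0.47\"}});}catch(e){}</script>";

        Matcher m = QUOTE.matcher(line);
        if (m.find()) {
            System.out.println("StockCode: " + m.group(1));
            Matcher f = FIELD.matcher(m.group(2));
            while (f.find()) {
                // f.group(1) is the field code (l10, c10, p20, ...),
                // f.group(2) is its value; map these to Price/ChangeDol/ChangePer
                // before inserting the row into MySQL. The yfs_mktmcb unixtime
                // lines can be handled with a similar pattern.
                System.out.println(f.group(1) + " = " + f.group(2));
            }
        }
    }
}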
I don't know how you get your quotes, but if you could use YQL, any XML parser would do:
YQL
<quote symbol="YHOO">
<Ask>14.76</Ask>
<AverageDailyVolume>28463800</AverageDailyVolume>
<Bid>14.51</Bid>
<AskRealtime>14.76</AskRealtime>
<BidRealtime>14.51</BidRealtime>
<BookValue>9.826</BookValue>
<Change_PercentChange>0.00 - 0.00%</Change_PercentChange>
....
</quote>
List of XML Parsers for Java
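For instance, with the JDK's built-in DOM parser (no extra library needed), pulling a couple of fields out of such a quote element could look like this; the XML string is a shortened version of the response shown above:

import java.io.ByteArrayInputStream;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class QuoteXmlSketch {
    public static void main(String[] args) throws Exception {
        // Shortened version of the YQL quote element
        String xml = "<quote symbol=\"YHOO\"><Ask>14.76</Ask><Bid>14.51</Bid></quote>";

        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));

        Element quote = doc.getDocumentElement();
        String symbol = quote.getAttribute("symbol");
        String ask = quote.getElementsByTagName("Ask").item(0).getTextContent();
        String bid = quote.getElementsByTagName("Bid").item(0).getTextContent();

        System.out.println(symbol + " ask=" + ask + " bid=" + bid);
    }
}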
You could have a look here:
http://www.wikijava.org/wiki/Downloading_stock_market_quotes_from_Yahoo!_finance
They get financial data as CSV from Yahoo.