Java postal address parser [closed] - java

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
Somewhat related to this question, but in the absence of any answer about QuickBooks specifically, does anyone know of an address parser for Java? Something that can take unstructured address information and parse out address line 1, address line 2, city, state, postal code and country?

I do know that the Google Maps web service is great at doing this. So, if you want to use that, you could save a lot of effort.
The real issue here is that you need a worldwide database of city/country/province names to effectively parse UNSTRUCTURED addresses.
Here is how I build a URL for use by the Google Maps API in C#:
string url = "http://maps.google.com/maps/geo?key=" + HttpUtility.UrlEncode(this.apiKey) + "&sensor=false&output=xml&oe=utf8&q=" + HttpUtility.UrlEncode(location);
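Since the question is about Java, here is a rough Java equivalent of the C# line above. It is only a sketch: the endpoint and query parameters are copied from that line as-is and are not verified to be current, and the class and method names (GeocodeUrlBuilder, buildUrl) are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class GeocodeUrlBuilder {

    // Builds the same query URL as the C# snippet, URL-encoding the key and the address.
    // The endpoint and parameters are taken verbatim from the C# line (assumption: unchanged).
    static String buildUrl(String apiKey, String location) {
        try {
            return "http://maps.google.com/maps/geo?key=" + URLEncoder.encode(apiKey, "UTF-8")
                    + "&sensor=false&output=xml&oe=utf8&q=" + URLEncoder.encode(location, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8, so this should not happen.
            throw new IllegalStateException("UTF-8 is not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("MY_KEY", "1600 Amphitheatre Parkway, Mountain View, CA"));
    }
}
```

URLEncoder produces form encoding, so spaces become "+" and commas become "%2C", which is normally fine inside a query string.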

The SourceForge JGeocoder has an address parser that you may find useful. See http://jgeocoder.sourceforge.net/parser.html.

Might want to read this Stack Overflow question:
"Parse usable Street Address, City, State, Zip from a string". No actual Java code to do the job (just some VB), but there is some discussion of the problem and more info on the alternative John Gietzen mentions, of using a web service to interpret it for you.

The Mural project has an address parser: https://mural.dev.java.net/. I haven't figured out how to extract it from the larger Mural engine, but it does work based on some very limited tests.

See www.address-parser.com; they offer a web service for parsing international addresses.

Related

Is there a library to take a string, and classify it into a category based on whether it matches a group of strings? [closed]

Closed 1 year ago.
So, I have four lists of strings, each of which corresponds to a specific category. Each string is a job title, such as "web-developer", which corresponds to the category "IT".
The input string is going to be another job title, and the idea is to sort that job title into the appropriate category based on how well it matches the strings in each list.
Does anyone know a good library to accomplish this? Sadly, I do not have enough source material to properly train a machine learning system... All the libraries I've found so far seem to be based on machine learning.
Alternatively, if no such library exists, does anyone have any suggestions on how to accomplish this? My best idea so far has been to just... search through all the strings, do a string.contains(searchString) and match it like that. I don't know how to handle multiple matches though...
Ideally the library should be in Java, but this is not a necessity.
You could use an algorithm like Levenshtein string distance to achieve this. The algorithm gives you the number of single-character edits (insertions, deletions, substitutions) needed to change one string into another: the fewer edits needed, the more similar the strings are.
There is an implementation in the StringUtils class of Apache Commons Lang (StringUtils.getLevenshteinDistance).
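To make the idea concrete, here is a minimal sketch that picks the category whose closest known title has the smallest Levenshtein distance to the input. The distance is hand-rolled so the example has no dependencies; the class name, the method names and the sample titles are all invented for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JobTitleClassifier {

    // Dynamic-programming Levenshtein distance, keeping only two rows of the table.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the category containing the known title closest to the input.
    // On a tie the category seen first wins; returns null if there are no titles at all.
    static String classify(String title, Map<String, List<String>> categories) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        String needle = title.toLowerCase();
        for (Map.Entry<String, List<String>> entry : categories.entrySet()) {
            for (String known : entry.getValue()) {
                int d = levenshtein(needle, known.toLowerCase());
                if (d < bestDistance) {
                    bestDistance = d;
                    best = entry.getKey();
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, List<String>> categories = new LinkedHashMap<>();
        categories.put("IT", Arrays.asList("web-developer", "system administrator"));
        categories.put("Finance", Arrays.asList("accountant", "financial analyst"));
        System.out.println(classify("web developer", categories)); // prints "IT"
    }
}
```

Taking the minimum over all known titles is one way to deal with multiple matches. A cut-off on the best distance (say, reject anything above half the input length) would be a sensible addition so that unrelated titles are not forced into a category.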

Fully persistent linked list [closed]

Closed 7 years ago.
Why isn't there any implementation (in C, C++, Java or even Python...) of a fully persistent (not necessarily functional) linked list that has a constant time/space overhead in the number of modifications?
The data structure I have in mind is the one described in this paper:
http://www.cs.cmu.edu/~sleator/papers/Persistence.htm
After a long search on Google I was unable to find even a partially persistent linked list implementation with the overhead cited above.
PS: The definitions of persistence I am speaking about are those described in the following Wikipedia page:
http://en.wikipedia.org/wiki/Persistent_data_structure
EDIT(after the question was put on hold):
I don't think the reason mentioned applies to my question. I am not exactly asking for a recommendation among different available libraries, so there can't be "opinionated answers and spam". My question expresses astonishment that a data structure that is supposed to be great in theory has not been implemented in any of the well-known languages. So before I implement it myself I asked this question to see if there is an answer like: "It is normal, the data structure X dominates the one you're looking for and that's why it has not been implemented despite its simplicity". Another answer could be "It is not as good as you think since there is a big hidden constant" or "it doesn't do well with the way caches are built nowadays"... I am sorry if my question was not clear enough. I have reworded my question to make my request more explicit.
Have you tried the Functional Java library? It has some persistent data structures:
http://www.functionaljava.org/features.html
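For readers unsure what "persistent" means here, below is a minimal sketch of the simplest functional flavour: an immutable singly linked list where each new version shares its tail with the old one. This is not the node-splitting structure from the paper linked in the question and it is not Functional Java's API; the class PersistentList and its methods are hypothetical names for illustration only.

```java
import java.util.NoSuchElementException;

final class PersistentList<T> {

    // Single shared sentinel that represents the empty list.
    private static final PersistentList<?> EMPTY = new PersistentList<Object>(null, null);

    private final T head;
    private final PersistentList<T> tail;

    private PersistentList(T head, PersistentList<T> tail) {
        this.head = head;
        this.tail = tail;
    }

    @SuppressWarnings("unchecked")
    static <T> PersistentList<T> empty() {
        return (PersistentList<T>) EMPTY;
    }

    boolean isEmpty() {
        return this == EMPTY;
    }

    // O(1) time and space: allocates one node and reuses the existing list as the tail.
    PersistentList<T> prepend(T value) {
        return new PersistentList<>(value, this);
    }

    T head() {
        if (isEmpty()) throw new NoSuchElementException("empty list");
        return head;
    }

    PersistentList<T> tail() {
        if (isEmpty()) throw new NoSuchElementException("empty list");
        return tail;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("[");
        for (PersistentList<T> n = this; !n.isEmpty(); n = n.tail) {
            if (n != this) sb.append(", ");
            sb.append(n.head);
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        PersistentList<String> v1 = PersistentList.<String>empty().prepend("b");
        PersistentList<String> v2 = v1.prepend("a"); // new version, v1 is untouched
        PersistentList<String> v3 = v1.prepend("x"); // branch off the old version
        System.out.println(v1 + " " + v2 + " " + v3); // prints "[b] [a, b] [x, b]"
    }
}
```

Every old version stays readable and can itself be extended, but only operations at the head are O(1); changing an element in the middle would require copying the prefix, which is exactly the cost the structure in the paper is designed to avoid.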

How to find word inflected forms in a large String? [closed]

Closed 9 months ago.
I have arbitrary text content in a String variable. I want to find all inflected forms of a word that the user specifies.
Example: If the user is looking for the word "assist" then it should grab all "assist, assists, assisted, assisting" occurrences in the String.
Is there a Java library available to detect such inflections automatically in the specified String?
Note: I have seen a Java library called WolframAlpha that claims to do this, and here is its web interface, but I can't get this library to work, and no guide is available for using it.
First of all, it is not a Java library; it is the Wolfram Language, previously known as Mathematica. It does have JLink and can be called from Java, but you must have a Wolfram Kernel running to execute the code.
This is called Natural Language Processing and it's a huge, complex field. I have fiddled about with a few problems, and all I can say is that it is harder than it looks if you want a reliable solution.
Something you might want to take a look at is Stanford NLP.
It is called word stemming. First you need to derive the stem (for a specific language):
assisting -> assist by stripping suffixes such as -ance, -ing, -ly, -s, -ed, etc.
sought -> seek using an exception list
Then do a search, maybe with a regular expression (Matcher.find). Patterns:
"\\bassist\\p{L}*"
"\\b(seek|sought)\\p{L}*"
For prefixes such as un-, dis- and inter- the case would be more complicated still, but in general inflections are word endings in English. Then there is synonym searching.
Dictionaries out there are often called corpora. A search for "free English corpus" will yield results.
\\b = word boundary
\\p{L}* = 0 or more (*) letters
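A minimal sketch of the search step, assuming the stem has already been derived. The class and method names (InflectionFinder, findInflections) are invented for illustration; only java.util.regex is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InflectionFinder {

    // Finds every word that starts with the given stem, ignoring case.
    // Pattern.quote keeps regex metacharacters in the stem from being interpreted.
    static List<String> findInflections(String stem, String text) {
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(stem) + "\\p{L}*", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        List<String> hits = new ArrayList<>();
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "He assists, she assisted; unassisted work";
        System.out.println(findInflections("assist", text)); // prints "[assists, assisted]"
    }
}
```

Note that a pure prefix match over-generates: "assistant" and "assistance" would also be returned. Filtering against a list of allowed suffixes, or using a real stemmer, would be needed to exclude derived words.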
Check this out.
I don't know how big your requirement is, but you could always use Wiktionary and parse its data.
Check this question; it may be of help.

Address Parser in Java [closed]

Closed 4 years ago.
I have thousands of pieces of address data and I want to parse them so I can separate street from country from postal code and so on.
Is there any way to do that in Java?
I know that Google open-sourced their international address and phone number parsing library. I'd suggest you check their presentation here and the javadoc.
If you simply have addresses from all over the world in the form in which they appear on letters, and you later want to send letters there, you had better leave them in this format (maybe after splitting off the country, which usually comes last).
The internal formats differ considerably between individual countries (even when comparing only Germany, Great Britain and Russia), and a database holding the individual components requires country-specific logic to put them together again.
(I once had an application which took the individual fields as input and later created an address list from them (the "German way to do this"), and I kept receiving complaints from British users that I formatted their addresses in the wrong order. So in a later version I simply created a multi-line "address" input field, which I then output without any change.)
You could probably use regular expressions if you don't want to add 3rd party dependencies.
See: http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
and http://download.oracle.com/javase/6/docs/api/java/util/regex/Matcher.html
Usage is basically:
private static final Pattern PAT_NAME = Pattern.compile("my\\sregex");
...
Matcher matcher = PAT_NAME.matcher("my address");
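To show what a real pattern might look like, here is a sketch that splits one narrow format: a single-line, US-style address of the form "street, city, ST 12345". That format is an assumption chosen for the example; addresses in any other layout will simply not match. The class name and the group names are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UsAddressRegexDemo {

    // street, city, two-letter state, ZIP or ZIP+4; named groups need Java 7 or later.
    private static final Pattern US_ADDRESS = Pattern.compile(
            "^\\s*(?<street>[^,]+),\\s*(?<city>[^,]+),\\s*(?<state>[A-Z]{2})\\s+(?<zip>\\d{5}(?:-\\d{4})?)\\s*$");

    // Returns the parts keyed by name, or an empty map if the input does not match.
    static Map<String, String> parse(String address) {
        Map<String, String> parts = new LinkedHashMap<>();
        Matcher matcher = US_ADDRESS.matcher(address);
        if (matcher.matches()) {
            parts.put("street", matcher.group("street").trim());
            parts.put("city", matcher.group("city").trim());
            parts.put("state", matcher.group("state"));
            parts.put("zip", matcher.group("zip"));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(parse("123 Main St, Springfield, IL 62704"));
        System.out.println(parse("somewhere in Berlin")); // no match, prints "{}"
    }
}
```

One pattern per country (or per known input source) is about as far as plain regular expressions scale; for genuinely mixed international data a dedicated parser or web service is the more realistic route.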
There is an older library here: http://jgeocoder.sourceforge.net/parser.html, but it works for most cases.
If you want to use an API, I've used SmartyStreets in the past and they work decently well (https://smartystreets.com/).

binary decision diagram [closed]

Closed 6 years ago.
In Java, I have a set of expressions like cond1 AND (cond2 OR cond3) AND (cond4 OR cond5). I would like to convert each one into a tree and then evaluate the final boolean answer. I searched a lot for a Java BDD library but was not able to find one. Any suggestions, with sample code?
A 5-second Google search returned some reasonable-looking results:
JavaBDD
Java Decision Diagram Libraries
What is the best Binary Decision Diagram library for Java?
Is this not what you're looking for?
He means Binary Decision Diagrams.
I've been tinkering with JavaBDD and JBDD/JDD. Both are based on BuDDy (a C library); JBDD actually uses the C DLLs for a marginal performance boost.
It looks to me like JavaBDD is more fully featured (e.g. it supports composing BDDs, which is what I need). But there is also no tutorial for it, and while the class docs aren't terrible, frankly I can't figure out how to use it for even the most basic boolean operations (like the problem you pose).
JBDD/JDD requires you to do manual garbage collection, and does weird things like storing BDD objects in Java integers, clearly carry-overs from C. But it has a set of tutorials.
If you want to run your own parser, check out JavaCC.
Here is a nice tutorial to get you started. A bit older, but still valid:
http://www.javaworld.com/jw-12-2000/jw-1229-cooltools.html
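For expressions as small as the one in the question, a hand-written recursive-descent parser may be enough. The sketch below is neither a BDD nor JavaCC output: it is a plain parser that builds a tree of evaluator nodes, with AND binding tighter than OR. All names (BooleanExpression, Node, parse) are invented for illustration, and NOT is left out to keep it short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BooleanExpression {

    // A node of the expression tree; leaves look up a variable, inner nodes combine children.
    interface Node {
        boolean eval(Map<String, Boolean> vars);
    }

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private BooleanExpression(String input) {
        // Tokens are parentheses and words (AND, OR, variable names); other characters are skipped.
        Matcher m = Pattern.compile("\\(|\\)|\\w+").matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
    }

    static Node parse(String input) {
        BooleanExpression parser = new BooleanExpression(input);
        Node root = parser.parseOr();
        if (parser.pos != parser.tokens.size()) {
            throw new IllegalArgumentException("Unexpected token: " + parser.tokens.get(parser.pos));
        }
        return root;
    }

    private boolean accept(String expected) {
        if (pos < tokens.size() && tokens.get(pos).equalsIgnoreCase(expected)) {
            pos++;
            return true;
        }
        return false;
    }

    // or := and (OR and)*
    private Node parseOr() {
        Node left = parseAnd();
        while (accept("OR")) {
            final Node l = left;
            final Node r = parseAnd();
            left = vars -> l.eval(vars) || r.eval(vars);
        }
        return left;
    }

    // and := primary (AND primary)*
    private Node parseAnd() {
        Node left = parsePrimary();
        while (accept("AND")) {
            final Node l = left;
            final Node r = parsePrimary();
            left = vars -> l.eval(vars) && r.eval(vars);
        }
        return left;
    }

    // primary := '(' or ')' | variable
    private Node parsePrimary() {
        if (accept("(")) {
            Node inner = parseOr();
            if (!accept(")")) throw new IllegalArgumentException("Missing ')'");
            return inner;
        }
        if (pos >= tokens.size()) throw new IllegalArgumentException("Unexpected end of input");
        final String name = tokens.get(pos++);
        if (name.equals(")") || name.equalsIgnoreCase("AND") || name.equalsIgnoreCase("OR")) {
            throw new IllegalArgumentException("Expected a variable but found: " + name);
        }
        return vars -> {
            Boolean value = vars.get(name);
            if (value == null) throw new IllegalArgumentException("No value for " + name);
            return value;
        };
    }

    public static void main(String[] args) {
        Node tree = parse("cond1 AND (cond2 OR cond3) AND (cond4 OR cond5)");
        Map<String, Boolean> vars = new HashMap<>();
        vars.put("cond1", true);
        vars.put("cond2", false);
        vars.put("cond3", true);
        vars.put("cond4", false);
        vars.put("cond5", true);
        System.out.println(tree.eval(vars)); // prints "true"
    }
}
```

The parsed tree can be evaluated repeatedly against different variable maps. A BDD library only becomes worthwhile when you need to compare, simplify or combine large numbers of such expressions.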
