I'm writing an application that needs to make use of geo-referenced data, and I'd like to use MongoDB + Morphia. The application is in Scala, but if a portion needs to be Java, that's ok (yay compatibility!)
I have a class to represent the Latitude and Longitude of events:
import scala.beans.BeanProperty

class LatLon
{
  @BeanProperty
  var latDegrees : Double = _
  @BeanProperty
  var lonDegrees : Double = _
}
It's not a very exciting class, but it is useful in this context.
Now, I have an event that I record at a location:
class ObservedEvent
{
  @BeanProperty
  var observation : String = _
  @BeanProperty
  var location : LatLon = _
}
Now, I have a ton of observed events and I want to store them in MongoDB with Morphia. The 'location' should be stored as a GeoJSON Point so I can index the collection, etc. I have tried making SimpleValueConverters, adapters, and a few other hacks, but I haven't been able to figure out how to make this work. It seems like such a common use case that it would be built in. Hopefully the answer here is "It's built in! Look [here]". If it is, I haven't found it :(
Thanks!
The GeoJSON format is not strictly required; MongoDB can index any key, so you can simply index the latDegrees and lonDegrees keys. GeoJSON is also verbose and will increase the size of your database. We have been doing this for about a year and it has caused no problems with mapping.
You do need GeoJSON if you want to use the geospatial queries such as $near and $within, or the $geoNear aggregation. If you don't plan on using them, then don't bother, IMHO.
Now, I have a ton of observed events and I want to store them in
MongoDB with Morphia. The 'location' should be stored as a GeoJson
Point so I can index the collection, etc.
When and if you need to convert your geo data from your latDegrees, lonDegrees format to the expected geojson format run this shell script. This is best as a one-time operation.
db.COLLECTION.find().forEach(function(doc) {
  // Create a GeoJSON point using the lon and lat found in each doc.
  var location = {"type": "Point", "coordinates": [doc.lonDegrees, doc.latDegrees]};
  print("Location is " + tojson(location));
  // Update the document with the GeoJSON point and unset the old fields.
  db.COLLECTION.update({"_id": doc._id}, {$set: {"loc": location}, $unset: {"lonDegrees": "", "latDegrees": ""}});
});
where COLLECTION is the name of your 'table' within the DB.
Note that geojson needs data in the lon, lat sequence versus the common lat, lon convention.
BE CAREFUL. The script above sets the GeoJSON field and unsets your old fields in one decisive operation. If you would rather keep the original data, write to a new collection instead by replacing the update with an insert followed by an update on that new collection. The following retains your data and creates a new collection in the same DB.
db.NEW_COLLECTION_NAME.insert(doc);
db.NEW_COLLECTION_NAME.update({"_id": doc._id}, {$set: {"loc": location}, $unset: {"lonDegrees": "", "latDegrees": ""}});
Run the script on the command line using the mongo shell.
mongo DATABASE filename.js
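If the conversion happens on the application side instead of in the shell, the lon, lat ordering noted above is the easiest thing to get wrong. The following is a minimal Java sketch, not Morphia's or the driver's API: the class and method names are made up for illustration, and it only builds the JSON text of a Point.

```java
/** Sketch: builds the JSON text of a GeoJSON Point from latitude/longitude in degrees. */
final class GeoJsonPointSketch {

    // GeoJSON positions are [longitude, latitude], the reverse of the usual "lat, lon" order.
    static String toGeoJsonPoint(double latDegrees, double lonDegrees) {
        if (Double.isNaN(latDegrees) || Double.isNaN(lonDegrees)
                || latDegrees < -90 || latDegrees > 90
                || lonDegrees < -180 || lonDegrees > 180) {
            throw new IllegalArgumentException("coordinate out of range");
        }
        // String concatenation of doubles uses Double.toString, which is locale independent.
        return "{\"type\":\"Point\",\"coordinates\":[" + lonDegrees + "," + latDegrees + "]}";
    }

    public static void main(String[] args) {
        // Latitude first in the call, longitude first in the output.
        System.out.println(toGeoJsonPoint(47.5, -122.25));
    }
}
```

The method takes latitude first because that matches the LatLon class in the question; only the serialized form is reversed.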
I cannot help you with the core problem. Nevertheless, I would suggest focusing on a BSON handler for the Reactive or Rogue Mongo driver, as opposed to extending your private code. It's cleaner, and then others can use your work. Yes, this is a common issue and should be handled in the Mongo drivers.
For example, looking at this Scala sample you see the use of BSONDocument. One could create a BSONLatLonDocument that serializes a two-float array into the needed GeoJSON format.
I wanted to generate simple GeoJSON from Java and found GeoJSON-POJO.
Fast to implement and very simple.
Add GeoJSON-POJO to your pom as a new dependency.
Build java objects with your data.
Send the objects off to Jackson for serialization.
I'm sure this solution could be optimized in many ways. But I doubt the feature could be completed with fewer man-hours. I went from random data on a server to GeoJSON displaying on a client browser map (leafletjs) in about 10 minutes.
Sorry if that doesn't help you with MongoDB or Morphia...
For what it's worth, Morphia 0.110 will have greatly improved geo support. There are snapshot builds in the Sonatype snapshot repository if you want to give it a go.
You can use Morphia's GeoJson class to generate a Point, which will be persisted with type Point in MongoDB.
I am new to the GIS area and I need to validate a geometry in WKT format in Java, to check whether a simple polygon is a closed loop, i.e. the start and end points of the vertices should be the same. I am currently using the JGeometry class of Oracle Spatial (com.oracle.sdoapi), getting the first and last vertices and comparing them. I am also using the getType() method to check whether it is a simple polygon or not. The following is the piece of code that I am using:
// wktText holds the WKT string to validate (variable name assumed for illustration)
WKT wkt = new WKT();
JGeometry geometry = wkt.toJGeometry(wktText.getBytes());
double[] d1 = geometry.getFirstPoint();
double[] d2 = geometry.getLastPoint();
if (geometry.getType() != JGeometry.GTYPE_POLYGON) {
    // error message for other geometries
}
if (!java.util.Arrays.equals(d1, d2)) {
    // error message: the ring is not closed
}
Is there any simpler way of doing this, or is there an API available? I don't want to reinvent the wheel if it is already done and simple to use. Thanks!
The Java Topology Suite contains a WKTReader class that will suit your purposes. See http://tsusiatsoftware.net/jts/javadoc/com/vividsolutions/jts/io/WKTReader.html. You can use WKTReader to parse the WKT and look for ParseExceptions, which indicate invalid WKT.
If the WKT parses, you can then use the instanceof operator or Geometry.getGeometryType() on the parsed Geometry to determine its type, and see if it's one of the Geometry types with closed shells (Polygon or MultiPolygon).
I know there are web services out there that have this information; however, they can be limited to a number of requests per day. I have about 114,000 records I need zip codes for. I have a database full of zip codes with their lats and longs. However, I am not sure how to match a given lat and long against the zip code lats and longs.
Basically I need to cross-reference the given address lat and long against the supplied zip code lat and long. I can use PHP, Java, a MySQL procedure, or just a calculation.
I can give you a stepping stone, but that's about it in this case.
$distance = 10;
$latitude = 37.295092;
$longitude = -121.896490;
// The 60*1.1515 factor converts degrees of arc to statute miles.
$sql = "SELECT loc.*, (((acos(sin(($latitude*pi()/180)) * sin((`latitude`*pi()/180))+cos(($latitude*pi()/180)) * cos((`latitude`*pi()/180)) * cos((($longitude - `longitude`)*pi()/180))))*180/pi())*60*1.1515) AS `distance` FROM table_with_lonlat_ref loc HAVING distance < $distance";
If you create a query that JOINs your two tables and reduce the distance to 1 or 2, you could in principle match just about all the lat/lon combinations you need. Alternatively, you could find a DB of all US zip codes that also has lat/lon, then query over one table to insert into another based on the matched zip codes. I have such a zip code DB somewhere.
I would also suggest http://www.maxmind.com/app/geolite. It is never complete unless you pay for it, and it changes every so often. From it you can get nearly every lat/lon combination to use as a reference point based on a visitor's IP (it is a little off in some cases, as the IP may stem from a hub a town or two away). But it's better than nothing, the only limits are what your server can handle, and there are no API restrictions to worry about beyond MaxMind's usage terms.
All in all, I've been using this combination for a while on a number of sites and have yet to run into many problems. I know it's not a direct answer to your question, but I hope it leads you to a solution.
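Since Java was listed as an option, the same nearest-match idea can be sketched outside the database. This is only an illustration under assumptions: ZipCentroid is a hypothetical stand-in for a row of the zip code table, the zip values are placeholders, and the distance uses the haversine formula rather than the spherical law of cosines from the SQL above (the two agree closely, and haversine behaves better for very small distances).

```java
import java.util.Arrays;
import java.util.List;

final class NearestZipSketch {

    /** Hypothetical row of the zip code table: a zip code and its lat/lon in degrees. */
    static final class ZipCentroid {
        final String zip;
        final double lat;
        final double lon;

        ZipCentroid(String zip, double lat, double lon) {
            this.zip = zip;
            this.lat = lat;
            this.lon = lon;
        }
    }

    static final double EARTH_RADIUS_MILES = 3958.8;

    // Haversine great-circle distance in miles between two points given in degrees.
    static double distanceMiles(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a));
    }

    // Linear scan for the closest centroid; returns null when the list is empty.
    static ZipCentroid nearest(double lat, double lon, List<ZipCentroid> zips) {
        ZipCentroid best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (ZipCentroid z : zips) {
            double d = distanceMiles(lat, lon, z.lat, z.lon);
            if (d < bestDist) {
                bestDist = d;
                best = z;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<ZipCentroid> zips = Arrays.asList(
                new ZipCentroid("ZIP-A", 37.0, -121.0),   // placeholder rows
                new ZipCentroid("ZIP-B", 40.0, -100.0));
        System.out.println(nearest(37.295092, -121.896490, zips).zip);
    }
}
```

For 114,000 records against a full zip table, a plain scan per record is a lot of work; filtering candidates by a small lat/lon window first would cut it down considerably.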
Since you already have a database of lats and longs, I'm going to assume it describes a set of rectangular regions from (latA, lonA) to (latB, lonB), each with an associated zip code. I'll also assume (or recommend) that you've indexed those four fields.
Your query can fairly easily match whether a coordinate (coordA, coordB) fits within the two ranges describing that rectangle.
update coords set zip_code=(
select zip_code from zip_codes
where coords.coordA >= zip_codes.latA
and coords.coordA <= zip_codes.latB
and coords.coordB >= zip_codes.lonA
and coords.coordB <= zip_codes.lonB
limit 1
)
Caveat: You should verify whether latA < latB and lonA < lonB in your zip code database, and verify that you're using the same coordinate system in both tables. You may need to make adjustments, either through a conversion or by changing the operators appropriately.
Any clever ideas on how to generate random coordinates (latitude / longitude) of places on Earth? Latitude / Longitude. Precision to 5 points and avoid bodies of water.
double minLat = -90.00;
double maxLat = 90.00;
double latitude = minLat + Math.random() * (maxLat - minLat);
double minLon = -180.00;
double maxLon = 180.00;
double longitude = minLon + Math.random() * (maxLon - minLon);
DecimalFormat df = new DecimalFormat("#.#####");
log.info("latitude:longitude --> " + df.format(latitude) + "," + df.format(longitude));
Maybe I'm living in a dream world and the water topic is unavoidable... but hopefully there's a nicer, cleaner and more efficient way to do this?
EDIT
Some fantastic answers/ideas -- however, at scale, let's say I need to generate 25,000 coordinates. Going to an external service provider may not be the best option due to latency, cost and a few other factors.
To deal with the body of water problem is going to be largely a data issue, e.g. do you just want to miss the oceans or do you need to also miss small streams. Either you need to use a service with the quality of data that you need, or, you need to obtain the data yourself and run it locally. From your edit, it sounds like you want to go the local data route, so I'll focus on a way to do that.
One method is to obtain a shapefile for either land areas or water areas. You can then generate a random point and determine if it intersects a land area (or alternatively, does not intersect a water area).
To get started, you might get some low resolution data here and then get higher resolution data here for when you want to get better answers on coast lines or with lakes/rivers/etc. You mentioned that you want precision in your points to 5 decimal places, which is a little over 1m. Do be aware that if you get data to match that precision, you will have one giant data set. And, if you want really good data, be prepared to pay for it.
Once you have your shape data, you need some tools to help you determine the intersection of your random points. Geotools is a great place to start and probably will work for your needs. You will also end up looking at opengis code (docs under geotools site - not sure if they consumed them or what) and JTS for the geometry handling. Using this you can quickly open the shapefile and start doing some intersection queries.
File f = new File ( "world.shp" );
ShapefileDataStore dataStore = new ShapefileDataStore ( f.toURI ().toURL () );
FeatureSource<SimpleFeatureType, SimpleFeature> featureSource =
dataStore.getFeatureSource ();
String geomAttrName = featureSource.getSchema ()
.getGeometryDescriptor ().getLocalName ();
ResourceInfo resourceInfo = featureSource.getInfo ();
CoordinateReferenceSystem crs = resourceInfo.getCRS ();
Hints hints = GeoTools.getDefaultHints ();
hints.put ( Hints.JTS_SRID, 4326 );
hints.put ( Hints.CRS, crs );
FilterFactory2 ff = CommonFactoryFinder.getFilterFactory2 ( hints );
GeometryFactory gf = JTSFactoryFinder.getGeometryFactory ( hints );
Coordinate land = new Coordinate ( -122.0087, 47.54650 );
Point pointLand = gf.createPoint ( land );
Coordinate water = new Coordinate ( 0, 0 );
Point pointWater = gf.createPoint ( water );
Intersects filter = ff.intersects ( ff.property ( geomAttrName ),
ff.literal ( pointLand ) );
FeatureCollection<SimpleFeatureType, SimpleFeature> features = featureSource
.getFeatures ( filter );
filter = ff.intersects ( ff.property ( geomAttrName ),
ff.literal ( pointWater ) );
features = featureSource.getFeatures ( filter );
Quick explanations:
This assumes the shapefile you got is polygon data. Intersection on lines or points isn't going to give you what you want.
First section opens the shapefile - nothing interesting
you have to fetch the geometry property name for the given file
coordinate system stuff - you specified lat/long in your post but GIS can be quite a bit more complicated. In general, the data I pointed you at is geographic, wgs84, and, that is what I setup here. However, if this is not the case for you then you need to be sure you are dealing with your data in the correct coordinate system. If that all sounds like gibberish, google around for a tutorial on GIS/coordinate systems/datum/ellipsoid.
generating the coordinate geometries and the filters are pretty self-explanatory. The resulting set of features will either be empty, meaning the coordinate is in the water if your data is land cover, or not empty, meaning the opposite.
Note: if you do this with a really random set of points, you are going to hit water pretty often and it could take you a while to get to 25k points. You may want to try to scope your point generation better than truly random (like remove big chunks of the Atlantic/Pacific/Indian oceans).
Also, you may find that your intersection queries are too slow. If so, you may want to look into creating a quadtree index (qix) with a tool like GDAL. I don't recall which index types are supported by geotools, though.
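The overall generate-and-test loop described above can be sketched independently of GeoTools. In this hedged example the isLand predicate is a placeholder for whatever lookup you end up with (for instance the shapefile intersection query), and the maxAttempts parameter is an addition of mine so the loop cannot spin forever when the predicate rejects everything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.BiPredicate;

final class LandSamplerSketch {

    // Draws random (lat, lon) pairs in degrees and keeps only those accepted by isLand.
    // Stops early if maxAttempts draws were made without collecting enough points.
    static List<double[]> sample(int count, int maxAttempts, Random rng,
                                 BiPredicate<Double, Double> isLand) {
        List<double[]> points = new ArrayList<>(count);
        for (int attempt = 0; attempt < maxAttempts && points.size() < count; attempt++) {
            double lat = -90.0 + 180.0 * rng.nextDouble();
            double lon = -180.0 + 360.0 * rng.nextDouble();
            if (isLand.test(lat, lon)) {
                points.add(new double[] { lat, lon });
            }
        }
        return points;
    }

    public static void main(String[] args) {
        // Toy predicate standing in for a real land lookup: accept one quadrant only.
        List<double[]> pts = sample(5, 10_000, new Random(), (lat, lon) -> lat > 0 && lon > 0);
        for (double[] p : pts) {
            System.out.printf("%.5f,%.5f%n", p[0], p[1]);
        }
    }
}
```

Note that drawing latitude uniformly in degrees oversamples the polar regions per unit area, which may or may not matter for your use.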
This was asked a long time ago, and I now have a similar need. There are two possibilities I am looking into:
1. Define the surface ranges for the random generator.
Here it's important to identify the level of precision you are going for. The easiest way is a very relaxed, approximate approach. In this case you can divide the world map into "boxes":
Each box has its own range of lat/lon. You first randomise to pick a box, then randomise again to get a random lat and a random lon within the boundaries of that box.
Precision is of course not the best here, though it depends :) If you do your homework well and define a lot of boxes covering the most complex surface shapes, you might be quite OK with the precision.
2. Check each generated point against an API.
Use some API to return the continent name from coordinates, OR address OR country OR district: something that WATER doesn't have. The Google Maps APIs can help here. I didn't research this one deeper, but I think it's possible, though you will have to run the check on each generated pair of coordinates and rerun IF it's wrong. So you can get a bit stuck if the random generator keeps throwing you in the ocean.
Also - some water does belong to countries, districts...so yeah, not very precise.
For my needs - I am going with "boxes" because I also want to control exact areas from which the random coordinates are taken and don't mind if it lands on a lake or river, just not open ocean:)
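A minimal Java sketch of the "boxes" approach follows. The Box type and the coordinates in main are made up for illustration and are not real land boundaries; you would supply your own boxes.

```java
import java.util.Random;

final class BoxSamplerSketch {

    /** A lat/lon rectangle in degrees (hypothetical helper type). */
    static final class Box {
        final double minLat, maxLat, minLon, maxLon;

        Box(double minLat, double maxLat, double minLon, double maxLon) {
            this.minLat = minLat;
            this.maxLat = maxLat;
            this.minLon = minLon;
            this.maxLon = maxLon;
        }
    }

    // First pick a box at random, then pick a random point inside that box.
    static double[] randomPoint(Box[] boxes, Random rng) {
        Box b = boxes[rng.nextInt(boxes.length)];
        double lat = b.minLat + (b.maxLat - b.minLat) * rng.nextDouble();
        double lon = b.minLon + (b.maxLon - b.minLon) * rng.nextDouble();
        return new double[] { lat, lon };
    }

    public static void main(String[] args) {
        Box[] boxes = {
            new Box(40.0, 45.0, -100.0, -90.0),  // placeholder box, not a surveyed land area
            new Box(50.0, 55.0, 30.0, 40.0)      // placeholder box
        };
        double[] p = randomPoint(boxes, new Random());
        System.out.printf("%.5f,%.5f%n", p[0], p[1]);
    }
}
```

Every box is equally likely here, so small boxes end up with a denser scatter of points than large ones; weighting the choice by box area would even that out.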
Download a truckload of KML files containing land-only locations.
Extract all coordinates from them this might help here.
Pick them at random.
Definitely you should have a map as a resource. You can take it here: http://www.naturalearthdata.com/
Then I would prepare a 1-bit black and white bitmap resource with 1s marking land and 0s marking water.
The size of the bitmap depends on your required precision. If you need 5 degrees, then your bitmap will be 360/5 x 180/5 = 72x36 pixels = 2592 bits.
Then I would load this bitmap in Java, generate a random integer within the range above, read the bit, and regenerate if it was zero.
P.S. Also you can dig here http://geotools.org/ for some ready made solutions.
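The bitmap idea can be sketched with java.util.BitSet. This is an illustration under assumptions: the grid uses the 5 degree cells from the example above, and the land bits are set by hand through markLand, whereas in practice they would be filled from a map image such as the Natural Earth data.

```java
import java.util.BitSet;
import java.util.Random;

final class LandBitmapSketch {
    static final int CELL_DEGREES = 5;
    static final int COLS = 360 / CELL_DEGREES; // 72
    static final int ROWS = 180 / CELL_DEGREES; // 36

    // One bit per cell: set means land, clear means water.
    private final BitSet land = new BitSet(COLS * ROWS);

    // Maps a lat/lon in degrees to its cell index; the max values are clamped into the last cell.
    static int cellIndex(double lat, double lon) {
        int col = Math.min(COLS - 1, (int) ((lon + 180.0) / CELL_DEGREES));
        int row = Math.min(ROWS - 1, (int) ((lat + 90.0) / CELL_DEGREES));
        return row * COLS + col;
    }

    void markLand(double lat, double lon) {
        land.set(cellIndex(lat, lon));
    }

    boolean isLand(double lat, double lon) {
        return land.get(cellIndex(lat, lon));
    }

    // Regenerates a random cell index until a land bit is hit, then picks a point inside that cell.
    double[] randomLandPoint(Random rng) {
        if (land.isEmpty()) {
            throw new IllegalStateException("no land cells marked");
        }
        int idx;
        do {
            idx = rng.nextInt(COLS * ROWS);
        } while (!land.get(idx));
        double lat = -90.0 + (idx / COLS + rng.nextDouble()) * CELL_DEGREES;
        double lon = -180.0 + (idx % COLS + rng.nextDouble()) * CELL_DEGREES;
        return new double[] { lat, lon };
    }
}
```

With cells this coarse a "land" cell can still contain plenty of water, so the precision of the result is only as good as the cell size.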
To get a nice even distribution over latitudes and longitudes you should do something like this to get the right angles:
double longitude = Math.random() * Math.PI * 2;     // radians, 0 to 2*pi
double latitude = Math.acos(Math.random() * 2 - 1); // radians, 0 to pi (an angle measured from the pole)
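The two lines above produce angles in radians. A variant that returns conventional latitude/longitude in degrees might look like the sketch below; the class and method names are mine. It relies on the fact that a point is uniform on the sphere when the sine of its latitude is uniform in [-1, 1].

```java
import java.util.Random;

final class UniformSphereSketch {

    // Returns {latitude, longitude} in degrees, uniformly distributed over the sphere's surface.
    static double[] randomLatLon(Random rng) {
        double lon = -180.0 + 360.0 * rng.nextDouble();
        // Uniform sin(latitude) gives equal probability per unit of surface area.
        double lat = Math.toDegrees(Math.asin(2.0 * rng.nextDouble() - 1.0));
        return new double[] { lat, lon };
    }

    public static void main(String[] args) {
        double[] p = randomLatLon(new Random());
        System.out.printf("%.5f,%.5f%n", p[0], p[1]);
    }
}
```

A quick sanity check on the distribution: about half of the generated points should fall between 30 degrees south and 30 degrees north, since sin(30°) = 0.5.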
As for avoiding bodies of water, do you have the data for where water is already? Well, just resample until you get a hit! If you don't have this data already then it seems some other people have some better suggestions than I would for that...
Hope this helps, cheers.
There is another way to approach this using the Google Earth API. I know it is JavaScript, but I thought it was a novel way to solve the problem.
Anyhow, I have put together a full working solution here (notice it works for rivers too): http://www.msa.mmu.ac.uk/~fraser/ge/coord/
The basic idea is to use the hitTest method of the GEView object in the Google Earth API.
Take a look at the following hitTest example from Google.
http://earth-api-samples.googlecode.com/svn/trunk/examples/hittest.html
The hitTest method is supplied a random point on the screen in (pixel coordinates) for which it returns a GEHitTestResult object that contains information about the geographic location corresponding to the point. If one uses the GEPlugin.HIT_TEST_TERRAIN mode with the method one can limit results only to land (terrain) as long as we screen the results to points with an altitude > 1m
This is the function I use that implements the hitTest:
var hitTestTerrain = function()
{
var x = getRandomInt(0, 200); // same pixel size as the map3d div width
var y = getRandomInt(0, 200); // ditto for height
var result = ge.getView().hitTest(x, ge.UNITS_PIXELS, y, ge.UNITS_PIXELS, ge.HIT_TEST_TERRAIN);
var success = result && (result.getAltitude() > 1);
return { success: success, result: result };
};
Obviously you also want to have random results from anywhere on the globe (not just random points visible from a single viewpoint). To do this I move the earth view after each successful hitTestTerrain call. This is achieved using a small helper function.
var flyTo = function(lat, lng, rng)
{
lookAt.setLatitude(lat);
lookAt.setLongitude(lng);
lookAt.setRange(rng);
ge.getView().setAbstractView(lookAt);
};
Finally here is a stripped down version of the main code block that calls these two methods.
var getRandomLandCoordinates = function()
{
var test = hitTestTerrain();
if (test.success)
{
coords[coords.length] = { lat: test.result.getLatitude(), lng: test.result.getLongitude() };
}
if (coords.length < number)
{
getRandomLandCoordinates();
}
else
{
displayResults();
}
};
So, the earth moves randomly to a position, a hit test is run there, and this repeats until enough coordinates have been collected.
The other functions in there are just helpers to generate the random x,y and random lat,lng numbers, to output the results and also to toggle the controls etc.
I have tested the code quite a bit and the results are not 100% perfect. Tweaking the altitude threshold to something higher, like 50m, solves this, but obviously it diminishes the area from which coordinates can be selected.
Obviously you could adapt the idea to suit your needs, maybe running the code multiple times to populate a database or something.
As a plan B, maybe you can pick a random country and then pick a random coordinate inside of this country. To be fair when picking a country, you can use its area as weight.
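Picking a country with its area as the weight could be sketched as follows. The weights in main are arbitrary placeholders rather than real country areas, and the class name is mine.

```java
import java.util.Random;

final class WeightedPickSketch {

    // Returns index i with probability weights[i] / sum(weights).
    static int pickIndex(double[] weights, Random rng) {
        double total = 0.0;
        int lastPositive = -1;
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] < 0) {
                throw new IllegalArgumentException("negative weight at index " + i);
            }
            if (weights[i] > 0) {
                lastPositive = i;
            }
            total += weights[i];
        }
        if (lastPositive < 0) {
            throw new IllegalArgumentException("all weights are zero");
        }
        // Walk the cumulative weights until the random threshold is crossed.
        double r = rng.nextDouble() * total;
        for (int i = 0; i < weights.length; i++) {
            r -= weights[i];
            if (r < 0) {
                return i;
            }
        }
        return lastPositive; // only reached through floating-point rounding
    }

    public static void main(String[] args) {
        double[] areas = { 1.0, 3.0 }; // placeholder areas: index 1 is picked about 75% of the time
        System.out.println(pickIndex(areas, new Random()));
    }
}
```

Once a country index is chosen, a random coordinate inside it still needs one of the other techniques on this page, such as a bounding box plus a containment test.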
There is a library here and you can use its .random() method to get a random coordinate. Then you can use GeoNames WebServices to determine whether it is on land or not. They have a list of webservices and you'll just have to use the right one. GeoNames is free and reliable.
Go there http://wiki.openstreetmap.org/
Try to use API: http://wiki.openstreetmap.org/wiki/Databases_and_data_access_APIs
I guess you could use a world map, define a few points on it to delimit most of water bodies as you say and use a polygon.contains method to validate the coordinates.
A faster algorithm would be to use this map, take some random point and check the color beneath, if it's blue, then water... when you have the coordinates, you convert them to lat/long.
You might also do the blue/green thing and then store all the green points for later lookup. This has the benefit of being refinable step by step: as you figure out a better way to make your list of points, you can just point your random grabber at a more and more accurate group of points.
Maybe a service provider has an answer to your question already: e.g. https://www.google.com/enterprise/marketplace/viewListing?productListingId=3030+17310026046429031496&pli=1
Elevation api? http://code.google.com/apis/maps/documentation/elevation/ above sea level or below? (no dutch points for you!)
Generating is easy; the problem is that the points should not be on water. I would import OpenStreetMap data, for example from http://ftp.ecki-netz.de/osm/, into a database (the data structure is very easy). I would suggest PostgreSQL; it comes with some geo functions: http://www.postgresql.org/docs/8.2/static/functions-geometry.html. For that you have to save the points in a "polygon" column; then you can check with the "&&" operator whether a point is in a water polygon. For the attributes of an OpenStreetMap way entry, have a look at http://wiki.openstreetmap.org/wiki/Category:En:Keys
Supplementary to what bsimic said about digging into GeoNames' Webservices, here is a shortcut:
they have a dedicated WebService for requesting an ocean name.
(I am aware of the OP's constraint of not using public web services due to the number of requests. Nevertheless, I stumbled upon this with the same basic question and consider it helpful.)
Go to http://www.geonames.org/export/web-services.html#astergdem and have a look at "Ocean / reverse geocoding". It is available as XML and JSON. Create a free user account to prevent daily limits on the demo account.
Request example on ocean area (Baltic Sea, JSON-URL):
http://api.geonames.org/oceanJSON?lat=54.049889&lng=10.851388&username=demo
results in
{
"ocean": {
"distance": "0",
"name": "Baltic Sea"
}
}
while some coordinates on land result in
{
"status": {
"message": "we are afraid we could not find an ocean for latitude and longitude :53.0,9.0",
"value": 15
}
}
Do the random points have to be uniformly distributed all over the world? If you could settle for a seemingly uniform distribution, you can do this:
Open your favorite map service, draw a rectangle inside the United States, Russia, China, Western Europe and definitely the northern part of Africa - making sure there are no big lakes or Caspian seas inside the rectangles. Take the corner coordinates of each rectangle, and then select coordinates at random inside those rectangles.
You are guaranteed none of these points will be on any sea or lake. You might find an occasional river, but I'm not sure how many geoservices are going to be accurate enough for that anyway.
This is an extremely interesting question, from both a theoretical and practical perspective. The most suitable solution will largely depend on your exact requirements. Do you need to account for every body of water, or just the major seas and oceans? How critical are accuracy and correctness? Will identifying sea as land or vice versa be a catastrophic failure?
I think machine learning techniques would be an excellent solution to this problem, provided that you don't mind the (hopefully small) probability that a point of water is incorrectly classified as land. If that's not an issue, then this approach should have a number of advantages against other techniques.
Using a bitmap is a nice solution, simple and elegant. It can be produced to a specified accuracy and the classification is guaranteed to be correct (or at least as correct as you made the bitmap). But its practicality is dependent on how accurate you need the solution to be. You mention that you want the coordinate accuracy to 5 decimal places (which would be equivalent to mapping the whole surface of the planet to about the nearest metre). Using 1 bit per element, the bitmap would weigh in at ~73.6 terabytes!
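The size estimate can be checked with a few lines of arithmetic. This sketch assumes the bitmap is a plain latitude/longitude grid with one cell per 0.00001 degree in each direction and that "terabytes" means binary terabytes (2^40 bytes); both are my reading of the figure, not something stated above.

```java
final class BitmapSizeSketch {

    // Number of one-bit cells in a global lat/lon grid with the given decimal places per degree.
    static long gridBits(int decimalPlaces) {
        long cellsPerDegree = 1;
        for (int i = 0; i < decimalPlaces; i++) {
            cellsPerDegree *= 10;
        }
        return (360L * cellsPerDegree) * (180L * cellsPerDegree);
    }

    // Converts a bit count to binary terabytes (2^40 bytes each).
    static double tebibytes(long bits) {
        return bits / 8.0 / Math.pow(1024, 4);
    }

    public static void main(String[] args) {
        long bits = gridBits(5); // 36,000,000 x 18,000,000 cells
        System.out.printf("%d bits, %.1f TiB%n", bits, tebibytes(bits));
    }
}
```

At 5 decimal places that is 6.48 x 10^14 bits, or roughly 73.7 binary terabytes, in line with the figure quoted.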
We don't need to store all of this data though; we only need to know where the coastlines are. Just by knowing where a point is in relation to the coast, we can determine whether it is on land or sea. As a rough estimate, the CIA world factbook reports that there are 22498km of coastline on Earth. If we were to store coordinates for every metre of coastline, using a 32 bit word for each latitude and longitude, this would take less than 1.35GB to store. It's still a lot if this is for a trivial application, but a few orders of magnitude less than using a bitmap. If having such a high degree of accuracy isn't necessary though, these numbers would drop considerably. Reducing the mapping to only the nearest kilometre cuts the number of cells by a factor of a million, making the bitmap just ~75MB, and the coordinates for the world's coastline could fit on a floppy disk.
What I propose is to use a clustering algorithm to decide whether a point is on land or not. We would first need a suitably large number of coordinates that we already know to be on either land or sea. Existing GIS databases would be suitable for this. Then we can analyse the points to determine clusters of land and sea. The decision boundary between the clusters should fall on the coastlines, and all points not determining the decision boundary can be removed. This process can be iterated to give a progressively more accurate boundary.
Only the points determining the decision boundary/the coastline need to be stored, and by using a simple distance metric we can quickly and easily decide if a set of coordinates are on land or sea. A large amount of resources would be required to train the system, but once complete the classifier would require very little space or time.
Assuming Atlantis isn't in the database, you could randomly select cities. This also provides a more realistic distribution of points if you intend to mimic human activity:
https://simplemaps.com/data/world-cities
There are only 7,300 cities in the free version.
What I am trying to do: the user selects a start and a destination on a map, and then from their coordinates I want to show the closest location from a list of locations on the map. I have a simple SQLite database containing the longitude, latitude and name of the possible locations.
I did some research and this is what I found:
http://www.scribd.com/doc/2569355/Geo-Distance-Search-with-MySQL
But this is meant for MySQL with some kind of spatial search extension.
Is there a possibility I can do something similar using the Android API or external libs?
public Point dialogFindClosestLocationToPoint(geometry.Point aStartPoint) {
    List<PointWithDistance> helperList = new ArrayList<PointWithDistance>();
    try {
        openDataBase();
        Cursor c = getCursorQueryWithAllTheData();
        if (c.moveToFirst()) {
            do {
                PointWithDistance helper = new PointWithDistance(c.getDouble(1), c.getDouble(2), c.getString(3));
                int distance = returnDistanceBetween2Points(aStartPoint, helper);
                // keep only the candidates within the search radius
                if (distance < MAX_SEARCH_DISTANCE) {
                    helper.setDistance(distance);
                    Log.i("values", helper.name);
                    helperList.add(helper);
                }
            } while (c.moveToNext());
        }
        c.close();
        Collections.sort(helperList, new PointComparator());
        // helperList is never null here, so check for emptiness instead
        if (!helperList.isEmpty())
            return helperList.get(0);
        else return null;
    } catch (SQLException sqle) {
        throw sqle;
    } finally {
        close();
    }
}
this is the code in the PointComparator() class:
public int compare(PointWithDistance o1, PointWithDistance o2) {
return (o1.getDistance()<o2.getDistance() ? -1 : (o1.getDistance()==o2.getDistance() ? 0 : 1));
}
where PointWithDistance is an object that contains: lat, long, distance, name
However, this solution doesn't return the right info, and I realize that it is not scalable at all and very slow. I need a solution that will execute fast on a database with a maximum of 1000 rows.
Edit: there was a mistake in the sorting in this code; I have now changed it (it should be < instead of >).
This kind of thing is done most efficiently using an R-Tree. The JSI library provides a Java implementation that I have used successfully with an index of 80.000 locations, processing thousands of lookups per second. However, it may not run on Android.
I was looking for something very similar some time ago:
Android sqlite sort on calculated column (co-ordinates distance)
I was using a MySQL lookup on my server. MySQL allows you to create a virtual column that performs the calculation, sort by distance, and then set the max results returned or the max distance. It works very well:
Select Lat, Lon, acos(sin($lat)*sin(radians(Lat)) + cos($lat)*cos(radians(Lat))*cos(radians(Lon)-$lon))*$R As dist From MyTable ORDER BY dist ASC
I wanted to perform the same operation in my app: pull all the points in order of distance from the user's location, allowing me to show the closest ones. I ended up going with a solution along the lines of the one suggested in the link above. I realise it's probably not the optimal solution, but it works for the purpose I wanted.
I haven't tried running your code, but it seems like it would work; it's just not efficient. You don't actually need to sort, you only need to extract the minimum.
You can restrict your query to just the square of size (2*MAX_SEARCH_DISTANCE)^2 with your point in the middle.
This way you localize your query, so it returns fewer results to compute distances for.
Of course this will not help if all your locations are inside that square (maybe unlikely?).
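Both suggestions, the square pre-filter and taking the minimum in a single pass instead of sorting, can be sketched in plain Java. The Place type and the maxDegrees parameter are assumptions for illustration; on Android the same square could be pushed into the SQLite query as a WHERE clause on lat and lon so fewer rows are read at all.

```java
import java.util.Arrays;
import java.util.List;

final class ClosestPointSketch {

    /** Hypothetical location row: a name plus lat/lon in degrees. */
    static final class Place {
        final String name;
        final double lat;
        final double lon;

        Place(String name, double lat, double lon) {
            this.name = name;
            this.lat = lat;
            this.lon = lon;
        }
    }

    // Squared flat-earth distance in degrees; good enough for ranking nearby points.
    static double roughDistanceSq(double lat0, double lon0, double lat1, double lon1) {
        double dLat = lat1 - lat0;
        // Longitude degrees shrink with latitude, so scale them by cos(mean latitude).
        double dLon = (lon1 - lon0) * Math.cos(Math.toRadians((lat0 + lat1) / 2));
        return dLat * dLat + dLon * dLon;
    }

    // Single pass: skip anything outside the square, track the minimum, no sorting.
    static Place closest(double lat, double lon, double maxDegrees, List<Place> places) {
        Place best = null;
        double bestSq = Double.POSITIVE_INFINITY;
        for (Place p : places) {
            if (Math.abs(p.lat - lat) > maxDegrees || Math.abs(p.lon - lon) > maxDegrees) {
                continue; // outside the search square
            }
            double d = roughDistanceSq(lat, lon, p.lat, p.lon);
            if (d < bestSq) {
                bestSq = d;
                best = p;
            }
        }
        return best; // null when nothing lies inside the square
    }

    public static void main(String[] args) {
        List<Place> places = Arrays.asList(
                new Place("A", 10.0, 10.0), new Place("B", 10.01, 10.01), new Place("C", 50.0, 50.0));
        System.out.println(closest(10.009, 10.009, 0.5, places).name);
    }
}
```

For a table of at most 1000 rows this linear scan should be cheap; an index structure only starts to pay off with far more points.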
Also, I suppose you could use Manhattan distance instead of Euclidean.
Euclidean distance = sqrt((lat0 - lat1)^2 + (lon0 - lon1)^2)
Manhattan distance = |lat0 - lat1| + |lon0 - lon1|