Is there an algorithm that can tell you which points to connect to form triangles, given a set of points? None of the connecting lines can intersect; however, triangles can be inside other triangles.
Given a general set of points in R^d the Delaunay triangulation is often an optimal choice for tessellation.
Specifically, the Delaunay triangulation will tessellate the convex hull of the point set into a set of non-overlapping elements while ensuring that the radius of the largest circumsphere is minimised. This means the triangulation is optimal in terms of its "compactness"; in other words, it generates elements with good aspect ratio.
Efficient algorithms to construct Delaunay triangulations are not trivial, but there are a number of good libraries out there. I can recommend Triangle, CGAL, or Qhull (for high-dimensional problems); JDT is apparently an implementation in Java, though I've never used it.
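To give a flavour of what these libraries do internally, the core geometric predicate in 2D is the empty-circumcircle test: no point may lie inside the circumcircle of any triangle. A minimal Java sketch of that predicate (the method name and point representation are illustrative, and real libraries use robust arithmetic for this determinant):

```java
// In-circle predicate for 2D Delaunay triangulation.
// Returns true if point d lies strictly inside the circumcircle of
// triangle (a, b, c), which must be given in counter-clockwise order.
static boolean inCircumcircle(double[] a, double[] b, double[] c, double[] d) {
    double ax = a[0] - d[0], ay = a[1] - d[1];
    double bx = b[0] - d[0], by = b[1] - d[1];
    double cx = c[0] - d[0], cy = c[1] - d[1];
    double det =
          (ax * ax + ay * ay) * (bx * cy - cx * by)
        - (bx * bx + by * by) * (ax * cy - cx * ay)
        + (cx * cx + cy * cy) * (ax * by - bx * ay);
    return det > 0;
}
```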
I am not sure it is exactly what you are looking for, but it may be of some help: Graph Theory
I am also attempting to solve this problem. This is a link to the GitHub branch of someone who works on this for the game Ingress, which is why I'm interested in the solution. However, to my knowledge the optimal solution is found through brute force (I may be wrong on this), and it has other factors it maximizes and minimizes. It also does things such as taking an E6 latitude/longitude and projecting it onto a gnomonic projection to determine shortest routes, but I think this can be discounted when going through the code. I don't think your solution is in this code, but it might be a good jumping-off point for you, me, and anyone else looking into this problem.
Given a set of (x, y, z) points in 3D space, I want to be able to estimate the z for a new (x, y) pair.
For example: I am given a height map of a geographical feature, for example some hills in the countryside. That means that for some latitudes and longitudes, I know the elevation of the ground at that point. I would like to estimate the elevation of a person standing at (latitude, longitude) that is most likely not in the sample set.
How can I do that in Java?
I have already researched splines but am struggling to make any progress that way. I also just tried graphhopper's ElevationInterpolator, but it gives clearly wrong results: it is only accurate when the provided (lat, long) is in the sample set; if the point is just slightly offset it gives a wildly different elevation, and it returns the same elevation for all positions that aren't in the sample set.
In case you have elevations surrounding the point in question, the best way I can see is to find the closest enclosing triangle or quad and interpolate linearly between its corners. I don't think you can get much better than that.
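For the triangle case, the linear interpolation can be done with barycentric coordinates. A minimal Java sketch, assuming the query point lies inside (or on) the triangle; the method name and the point representation are just illustrative:

```java
// Linear interpolation of z inside the triangle (p1, p2, p3) using
// barycentric coordinates. Each point is {x, y, z}; the query (x, y)
// is assumed to lie inside or on the triangle.
static double interpolateZ(double x, double y,
                           double[] p1, double[] p2, double[] p3) {
    double denom = (p2[1] - p3[1]) * (p1[0] - p3[0])
                 + (p3[0] - p2[0]) * (p1[1] - p3[1]);
    double w1 = ((p2[1] - p3[1]) * (x - p3[0]) + (p3[0] - p2[0]) * (y - p3[1])) / denom;
    double w2 = ((p3[1] - p1[1]) * (x - p3[0]) + (p1[0] - p3[0]) * (y - p3[1])) / denom;
    double w3 = 1.0 - w1 - w2;
    // the weights sum to 1, so this is an exact linear interpolation over the triangle
    return w1 * p1[2] + w2 * p2[2] + w3 * p3[2];
}
```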
In case you only have elevations to one side of the point, all you can do is assume the terrain is relatively flat, or maybe try calculating some sort of gradient from the points you have; but that will basically be a wild guess.
Depending on how big an area you’re covering, there’s also a geoid that you might want to take into account.
As for your "intuitive" formula: it uses too much data around the point in question, so it will definitely produce wrong results. The fact of the matter is that distant points have nothing to do with the elevation; all you need for the estimate are the few closest ones, and since you don't know anything about the surface anyway, it doesn't really matter what the other points are. Well, maybe unless you're into ML; then you might get something out of them...
I found a MicrosphereInterpolator in Apache Commons Math.
I guess this is actually a really hard problem with no standard solution. This interpolator seems to be based on William Dudziak's 2007 MS thesis.
The MicrosphereInterpolator works well for me at the moment.
Other solutions I have tried, for example the intuitive formula, don't give intuitive results.
The only downside is that the MicrosphereInterpolator is quite slow.
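For reference, here is a minimal usage sketch, assuming Apache Commons Math 3.x; the sample coordinates and elevations are made up for illustration:

```java
import org.apache.commons.math3.analysis.MultivariateFunction;
import org.apache.commons.math3.analysis.interpolation.MicrosphereInterpolator;

public class ElevationExample {
    public static void main(String[] args) {
        // known samples: each row is {latitude, longitude}, with the matching elevation below
        double[][] xval = {{51.00, 7.00}, {51.00, 7.01}, {51.01, 7.00}, {51.01, 7.01}};
        double[] yval = {120.0, 125.0, 130.0, 128.0};   // elevations in metres (made up)

        MultivariateFunction elevation =
                new MicrosphereInterpolator().interpolate(xval, yval);

        // estimate the elevation at a point that is not in the sample set
        System.out.println(elevation.value(new double[]{51.005, 7.005}));
    }
}
```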
I'm trying to find if a scanned pdf form contains a signature (like making sure a check is signed).
The problem domain:
I will be receiving document packages (multi-page PDFs with multiple forms). I have already put together document-package classifiers that check the package for all documents and scale the images to a common size. After that I know where the signatures should be and can scan that area of the document specifically. What I'm looking for is the best approach to making sure a signature is present. I've considered just checking for a base threshold of dark pixels, but that seems clumsy. The trouble with signatures is that they are not really writing, more of a personal mark.
The only thing I can come up with is a machine learning method to look for "loopiness"? But I'm not all that familiar with machine learning and don't even know where to start with something like that. Any suggestions for practical approaches would be very much appreciated.
I'm coding this in Java if that's helpful at all
What you asked was very broad so there isn't a lot of information that we can give you. However, I can point you to some helpful links:
http://java-ml.sourceforge.net/ -- a library you can download that has lots of useful algorithms and other code to include in your program
https://www.youtube.com/playlist?list=PLiaHhY2iBX9hdHaRr6b7XevZtgZRa1PoU -- a series that explains neural networks (something you might want to look into for your machine learning)
So a big tip I have for your algorithm: instead of looking at exactly how long all of the loops and strokes are, look at their relative distances.
"Relative distances from what?" you say. Well, this is where the next tip comes in handy: instead of keeping track of the lines, keep track of the tips of the loops and the order of those points, then take the distances between all of them (relative distances, of course, which means scaling everything so that one of the lengths becomes the unit). Along with the distances, you should also keep track of the angles. You can calculate the angle ABC from the distances between (A,B), (B,C), and (A,C) (A, B, and C being coordinates on the xy plane): the three points form a triangle, which lets you use trigonometry to get the angle.
(I am assuming for all of this that you also want to detect whose signature it is, because it doesn't really complicate things much.) When trying to match the detected signature against the stored signatures to see if they are the "same", don't require the distances and angles to be exact. Give a margin of error (for example a percentage range above and below). Here is a tip: make the margin of error rather large, so that a poorly written signature will still be detected. This raises the chances of more than one stored signature being matched. Luckily, there is a simple solution: have the program run the algorithm again on the signatures that were matched, but with a smaller margin of error, and continue decreasing the margin until only one signature remains.
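As a concrete sketch of the angle calculation described above, the law of cosines gives the angle at B directly from the three pairwise distances (Java, since that's the target language; the point representation is illustrative):

```java
// Angle at vertex B of the triangle formed by points A, B, C, each given as {x, y}.
static double angleAtB(double[] a, double[] b, double[] c) {
    double ab = Math.hypot(a[0] - b[0], a[1] - b[1]);
    double bc = Math.hypot(b[0] - c[0], b[1] - c[1]);
    double ac = Math.hypot(a[0] - c[0], a[1] - c[1]);
    // law of cosines: ac^2 = ab^2 + bc^2 - 2 * ab * bc * cos(angleB)
    return Math.acos((ab * ab + bc * bc - ac * ac) / (2 * ab * bc));
}
```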
I am hoping you already have ideas for detecting where the actual signature is, but do check the difference in darkness of the pixels and make sure the stroke is fairly continuous. Also note that signatures are commonly signed in black or blue, and sometimes in red or other fancy colors.
I'm trying to compare multiple algorithms that are used to smooth GPS data. I'm wondering what should be the standard way to compare the results to see which one provides better smoothing.
I was thinking of a machine learning approach: create a car model based on a classifier and check which tracks show better behaviour.
For the guys who have more experience on this stuff, is this a good approach? Are there other ways to do this?
Generally, there is no universally valid way for comparing two datasets, since it completely depends on the applied/required quality criterion.
For your approach ("create a car model based on a classifier and check which tracks show better behaviour"), this means that you will need to define the term "better behaviour" mathematically.
One possible quality criterion for your application is as follows (it consists of two parts that express opposing quality aspects):
First part (deviation from raw data): compute the RMSE (root mean squared error) between the smoothed data and the raw data. This gives you a measure of the deviation of your smoothed track from the given raw coordinates: the error (RMSE) increases if you smooth more and decreases if you smooth less.
Second part (track smoothness): compute the mean absolute lateral acceleration that the car would experience along the track (the second derivative). This decreases if you smooth more and increases if you smooth less, i.e., it behaves contrary to the RMSE.
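A small Java sketch of both metrics, under the simplifying assumptions that both tracks are already projected to metres, sampled at a fixed interval dt, and aligned point by point; the magnitude of the second difference is used as a simple stand-in for the lateral acceleration (all names are illustrative):

```java
// RMSE between matched raw and smoothed positions; each row is {x, y} in metres.
static double rmse(double[][] raw, double[][] smooth) {
    double sum = 0;
    for (int i = 0; i < raw.length; i++) {
        double dx = raw[i][0] - smooth[i][0];
        double dy = raw[i][1] - smooth[i][1];
        sum += dx * dx + dy * dy;
    }
    return Math.sqrt(sum / raw.length);
}

// Mean absolute acceleration of a track sampled every dt seconds,
// approximated by second differences of the positions.
static double meanAbsAcceleration(double[][] track, double dt) {
    double sum = 0;
    for (int i = 1; i < track.length - 1; i++) {
        double ax = (track[i + 1][0] - 2 * track[i][0] + track[i - 1][0]) / (dt * dt);
        double ay = (track[i + 1][1] - 2 * track[i][1] + track[i - 1][1]) / (dt * dt);
        sum += Math.hypot(ax, ay);
    }
    return sum / (track.length - 2);
}
```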
Result evaluation:
(1) Find a sequence of your data where you know that the underlying GPS track is a straight line, or where the tracked object is not moving. Note that for such tracks the (lateral) acceleration is zero by definition(!).
For these, compute the RMSE and the mean absolute lateral acceleration.
The RMSE of approaches that produce (almost) zero acceleration is explained purely by measurement inaccuracies!
(2) Plot the results in a coordinate system with the RMSE on the x axis and the mean acceleration on the y axis.
(3) Pick all approaches that have an RMSE similar to what you found in step (1).
(4) From those approaches, pick the one(s) with the smallest acceleration. Those give you the smoothest track with an error explained through measurement inaccuracies!
(5) You're done :)
I have no experience on this topic, but I have a few things in mind that may help you.
You know it is a car. The data is generated by a car, so you can define a set of properties of a car. For example, if a car is moving at a speed above 50 km/h, then the angle of a corner should be at least 110 degrees. I am absolutely guessing at the values, but if you do a little research I am sure you will be able to define such properties. The next thing to do is test how well each approximation fits the car properties and choose the best one.
Raw data. I assume you are testing all methods on a part of a given road. You can generate a "raw GPS track", a track that best fits the movement of the car; Google Maps may help you generate such a track, or some GPS device with higher accuracy. Then you measure the distance between each approximation and your generated track, and the one with the minimum distance wins.
I think you can easily match the coordinates after converting them to addresses, because an address has a street, area, and city, so you can easily match at different radii.
Try this link.
Take a look at this paper that discusses comparing machine learning algorithms:
"Choosing between two learning algorithms based on calibrated tests", available at: http://www.cs.waikato.ac.nz/ml/publications/2003/bouckaert-calibrated-tests.pdf
Also check out this paper:
"Bayesian Comparison of Machine Learning Algorithms on Single and Multiple Datasets", available at: http://www.jmlr.org/proceedings/papers/v22/lacoste12/lacoste12.pdf
Note: I take it from the question that you are looking for the best way to compare the results of such algorithms, not for additional machine learning algorithms that might implement this feature.
Machine learning is not a well-suited approach for this task; you would have to define what good smoothing is...
In principle, your task cannot be solved by an algorithm that gives a general answer, because every smoothing destroys the original data by some amount and adds invented positions, and different systems/humans that use the smoothed data react differently to that changed data.
The question is: What do you want to achieve with smoothing?
Why do you need smoothing? (Have you forgotten to implement or enable a stand-still filter, which eliminates the jumping locations that GPS produces while the vehicle is standing still?)
The GPS chip already has a (best possible?) real-time smoothing built in, using a Kalman filter; on the one hand it has more information than a post-processing smoothing algorithm, on the other hand it has less.
So next you have to ask yourself: are you comparing post-processing smoothing algorithms or real-time algorithms? (Probably post-processing.) Comparing a real-time smoothing algorithm with a post-processing one is not fair.
Again: what do you expect from smoothed data? That it looks somewhat fine, but unrealistic, like photoshopped models in TV advertisements?
What is good smoothing? A track near the real vehicle position, which nobody ever knows, or a curve with low acceleration?
I would prefer a smoothing algorithm that produces the curve closest to the real (usually unknown) vehicle trajectory.
Or you might just think it should somehow look beautiful: in that case, overlay the curves in different colors, display them on a satellite image map, and let a team of humans (experts who at least own and drive their own car) decide what looks good and realistic.
We humans have the best multi purpose pattern matching algorithm built in.
Again, why smooth? For display on a map, to please the humans who look at that map?
Or to feed the smoothed tracks to other algorithms that have problems with the original data?
To please humans, I have given an answer above.
To please other algorithms:
What do they need? More accurate positions, or better course/direction values between points?
Which attributes do you want to smooth: only the latitude/longitude coordinates, or also the speed and course values?
I have a lot of professional experience with GPS tracks, and I recommend simply removing every location recorded below 7 km/h and keeping the rest as it is. In most cases there is no need for further smoothing.
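A minimal Java sketch of that stand-still filter, assuming each fix carries a reported speed in m/s (the Fix class and its fields are illustrative, not from any particular library):

```java
import java.util.List;
import java.util.stream.Collectors;

class StandStillFilter {
    static class Fix {
        final double lat, lon, speedMps;   // position plus reported speed in m/s
        Fix(double lat, double lon, double speedMps) {
            this.lat = lat; this.lon = lon; this.speedMps = speedMps;
        }
    }

    // Drop every fix recorded below 7 km/h and keep the rest unchanged.
    static List<Fix> removeStandStill(List<Fix> track) {
        double threshold = 7.0 / 3.6;      // 7 km/h expressed in m/s
        return track.stream()
                    .filter(f -> f.speedMps >= threshold)
                    .collect(Collectors.toList());
    }
}
```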
Otherwise it gets expensive:
A possible solution:
1) You get a roughly 2000€ reference GPS receiver delivered with a magnetic vehicle roof antenna (e.g. a receiver from the company Hemisphere GPS) and use that as the reference.
2) You use a consumer GPS of the kind usually used for your task (a smartphone, etc.).
With both mounted inside the car, drive some test tracks: some in good conditions (highways), but more in very bad ones, with strong curves combined with big houses left and right, and through tunnels (a straight one and a curved one, if you have them).
3) Apply the smoothing algorithms to the consumer GPS tracks.
4) Compare the smoothed tracks to the reference track by matching pairs of positions and finally calculating the RMSE (root mean squared error).
Difficulties
Matching two positions: hopefully the timestamps can be matched exactly, which is usually not the case (an offset of 0.5 s is possible).
Think about what you will do when there is a GPS outage.
Consider first displaying a raw track and identifying what kind of unsmoothed data is not suitable or not nice looking. (Probably posting the pictures here later.)
What about using the good old Kalman filter?
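If you go that route, here is a rough sketch of a 1D constant-velocity Kalman filter that could be run once per coordinate after projecting lat/lon to metres. The process and measurement noise values are assumptions that must be tuned, and this is of course not the exact filter running inside the GPS chip:

```java
// 1D constant-velocity Kalman filter (state: position and velocity).
// Run one instance per coordinate; q is the process noise (acceleration
// variance) and r the measurement noise (position variance), both to be tuned.
class Kalman1D {
    private double pos, vel;                              // state estimate
    private double p00 = 1, p01 = 0, p10 = 0, p11 = 1;    // covariance matrix
    private final double q, r;

    Kalman1D(double initialPos, double q, double r) {
        this.pos = initialPos;
        this.q = q;
        this.r = r;
    }

    /** Feed one position measurement taken dt seconds after the previous one. */
    double update(double measuredPos, double dt) {
        // predict: x = F x with F = [[1, dt], [0, 1]]
        pos += vel * dt;
        // P = F P F^T + Q (Q from a piecewise white-acceleration model)
        double n00 = p00 + dt * (p01 + p10) + dt * dt * p11 + 0.25 * dt * dt * dt * dt * q;
        double n01 = p01 + dt * p11 + 0.5 * dt * dt * dt * q;
        double n10 = p10 + dt * p11 + 0.5 * dt * dt * dt * q;
        double n11 = p11 + dt * dt * q;
        p00 = n00; p01 = n01; p10 = n10; p11 = n11;

        // update with the measured position (H = [1, 0])
        double innovation = measuredPos - pos;
        double s = p00 + r;                        // innovation variance
        double k0 = p00 / s, k1 = p10 / s;         // Kalman gain
        pos += k0 * innovation;
        vel += k1 * innovation;
        double u00 = (1 - k0) * p00, u01 = (1 - k0) * p01;
        double u10 = p10 - k1 * p00, u11 = p11 - k1 * p01;
        p00 = u00; p01 = u01; p10 = u10; p11 = u11;
        return pos;                                // smoothed position estimate
    }
}
```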
I'd like to find an implementation of an approximate algorithm for the Minimum Feedback Arc Set in Java but I did not find anything so far. Does anyone have something in mind?
It appears that the simplest approximate algorithm one can implement (but with no minimality guarantees) is the one of this paper:
A fast and effective heuristic for the feedback arc set problem, by P. Eades, X. Lin, W.F. Smyth.
It is very easy to implement and works quite fast for large graphs (I tried it on a graph of 2.5 million edges and around 100 thousand nodes and broke all cycles in less than a minute).
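For reference, here is a short Java sketch of that heuristic (often called the greedy/GR feedback arc set heuristic), assuming the graph is given as an out-adjacency map in which every vertex appears as a key; the feedback arc set is then simply every edge that points backwards in the returned ordering. The class and method names are my own:

```java
import java.util.*;

class GreedyFas {
    // Returns a vertex ordering; edges going from a later vertex to an earlier
    // one form the (approximate) feedback arc set.
    static List<Integer> ordering(Map<Integer, Set<Integer>> out) {
        Map<Integer, Set<Integer>> outAdj = new HashMap<>();
        Map<Integer, Set<Integer>> inAdj = new HashMap<>();
        for (int v : out.keySet()) { outAdj.put(v, new HashSet<>(out.get(v))); inAdj.put(v, new HashSet<>()); }
        for (int u : out.keySet()) for (int v : out.get(u)) inAdj.get(v).add(u);

        Deque<Integer> s1 = new ArrayDeque<>();   // grows at the tail
        Deque<Integer> s2 = new ArrayDeque<>();   // grows at the head
        Set<Integer> remaining = new HashSet<>(out.keySet());

        while (!remaining.isEmpty()) {
            boolean changed = true;
            while (changed) {
                changed = false;
                // repeatedly strip sinks (prepended to s2) and sources (appended to s1)
                for (Iterator<Integer> it = remaining.iterator(); it.hasNext();) {
                    int v = it.next();
                    if (outAdj.get(v).isEmpty()) { s2.addFirst(v); detach(v, outAdj, inAdj); it.remove(); changed = true; }
                }
                for (Iterator<Integer> it = remaining.iterator(); it.hasNext();) {
                    int v = it.next();
                    if (inAdj.get(v).isEmpty()) { s1.addLast(v); detach(v, outAdj, inAdj); it.remove(); changed = true; }
                }
            }
            if (remaining.isEmpty()) break;
            // otherwise remove the vertex maximising outdegree - indegree
            int best = -1, bestDelta = Integer.MIN_VALUE;
            for (int v : remaining) {
                int delta = outAdj.get(v).size() - inAdj.get(v).size();
                if (delta > bestDelta) { bestDelta = delta; best = v; }
            }
            s1.addLast(best); detach(best, outAdj, inAdj); remaining.remove(best);
        }
        List<Integer> order = new ArrayList<>(s1);
        order.addAll(s2);
        return order;
    }

    // Remove all edges incident to v from the working adjacency maps.
    private static void detach(int v, Map<Integer, Set<Integer>> outAdj, Map<Integer, Set<Integer>> inAdj) {
        for (int w : outAdj.get(v)) inAdj.get(w).remove(v);
        for (int u : inAdj.get(v)) outAdj.get(u).remove(v);
        outAdj.get(v).clear();
        inAdj.get(v).clear();
    }
}
```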
What would be the best algorithm in terms of speed for locating an object in a field?
The field consists of 18 by 18 squares with side length 30.48 cm. The robot is placed in the square (0,0) and its job is to reach the light source while avoiding obstacles along the way. To locate the light source, the robot does a 360 degree turn to find the angle with the highest light reading and then travels towards the source. It can reliably detect a light source from 100 cm.
The way I'm implementing this at present is by storing the information about each tile in a 2D array. The possible values of a tile are unexplored (the default), blocked (there's an obstacle), and empty (there's nothing there). I'm thinking of using DFS, where the children of (i, j) are at positions (i+3, j) or (i, j+3). However, considering that I will be doing a rotation at each child to find the angle with the highest light reading, I think there may be an algorithm that can locate the light source faster than DFS. Also, I will only be travelling in the x and y directions, since the robot uses the grid lines on the floor to correct its x and y positions.
I would appreciate it if a fast and reliable algorithm could be suggested to accomplish this task.
This is a really broad question, and I'm not an expert so my answer is based on "first principles" thinking rather than experience in the field.
(I'm assuming that your robot has generally unobstructed line of sight and movement; i.e. it is an open area with scattered obstacles, not in a maze.)
The problem is interpreting the information that you get back from a 360 degree scan.
If the robot sees the light source, then traversing a route to the light source is either trivial, or a "simple" maze walking task.
The difficulty is when you don't see the source. It might mean that the source is not within the circle of visibility. But it could also mean that the light is behind an obstacle. And unfortunately, a simple sensor like you are describing cannot distinguish these two cases.
If your sensor system allowed you to see the obstacles, you could plot the locations of the "shadow" regions (regions behind obstacles), and use that to keep track of the places that are left to search. So your strategy would be to visit a small number of locations and do a scan at each, then methodically "tidy up" a small number of areas that were in shadow.
But since you cannot easily tell where the shadow areas are, you need an algorithm that (ultimately) searches everywhere. DFS is a general strategy that searches everywhere, but it does so by (in effect) looking in the nooks and crannies first. A better strategy is to do a breadth-first search, and only visit the nooks and crannies if the wide-scale scans didn't find the light source.
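As a concrete building block, here is a minimal breadth-first search over the tile grid described in the question (18x18, with unexplored/blocked/empty states) that plans a shortest obstacle-avoiding route to the next location you want to scan from. The grid representation and constants are assumptions for illustration:

```java
import java.util.*;

class GridBfs {
    static final int N = 18;   // field is 18 by 18 tiles

    // Shortest sequence of axis-aligned steps from start to goal, avoiding tiles
    // whose value equals 'blocked'; returns null if no route is currently known.
    static List<int[]> shortestPath(int[][] grid, int[] start, int[] goal, int blocked) {
        int[][] dirs = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};   // x/y moves only
        int[][][] prev = new int[N][N][];
        boolean[][] seen = new boolean[N][N];
        Deque<int[]> queue = new ArrayDeque<>();
        queue.add(start);
        seen[start[0]][start[1]] = true;
        while (!queue.isEmpty()) {
            int[] cur = queue.poll();
            if (cur[0] == goal[0] && cur[1] == goal[1]) {
                // rebuild the path by walking the predecessor links backwards
                LinkedList<int[]> path = new LinkedList<>();
                for (int[] p = cur; p != null; p = prev[p[0]][p[1]]) path.addFirst(p);
                return path;
            }
            for (int[] d : dirs) {
                int nx = cur[0] + d[0], ny = cur[1] + d[1];
                if (nx < 0 || ny < 0 || nx >= N || ny >= N) continue;
                if (seen[nx][ny] || grid[nx][ny] == blocked) continue;
                seen[nx][ny] = true;
                prev[nx][ny] = cur;
                queue.add(new int[]{nx, ny});
            }
        }
        return null;   // goal not reachable with the currently known obstacles
    }
}
```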
I would appreciate it if a fast and reliable algorithm could be suggested to accomplish this task.
I think you are going to need to develop one yourself. (Isn't this the point of the problem / task / competition?)
Although it may not look like it, this is more of a maze-following problem than anything else. I suppose this is some kind of challenge or contest situation where there's always a path from start to target, but suppose for a moment there isn't. One of the successful results for a robot navigating towards a beacon fully surrounded by obstacles would be a report describing a closed path of obstacles surrounding the signal. If there is no such closed path, then you can find a hole somewhere; this is why it looks like maze following.
So the basic algorithm I'd choose is to start with a spiraling-inward traversal, sweeping out a path narrow enough that you're sure to see the beacon if one is present. If there are no obstacles (a degenerate case), this finds the target in minimal time. (Hint: each turn reduces the number of cells your sensor can cover per step.)
Take the spiral traversal to be counter-clockwise. What you have then is related to the rule for solving mazes by keeping your right hand on the wall and following the generated path. In this case, you have the complication that, while the start of the maze is on the boundary, the end may not be. It's possible for the right-hand-touching path to fail in such a situation. Detecting this requires looking for "cavities" in the region swept out by adjacency to the wall.