I was building a prototype of a vehicle routing application using Google Maps and OptaPlanner. I changed the distance-based scoring to duration-based scoring, where the duration value was calculated as distance / average speed of the vehicle.
Now I want to add a traffic jam variable to my application. The traffic jam variable is implemented as an additional duration value from the current location to another location (I use a map of location and double, just like the distance variable in the RoadLocation class). When I tried to run it, the result was always the same as before. Here is the result from the first run:
I drew some red lines to represent the traffic jams and then re-ran the solving phase. Here is the second result:
The result was the same as the previous one. My question is: what is the best method to apply a traffic jam variable to a vehicle routing problem? Does anyone have experience adding this variable? Any comments and suggestions will be appreciated.
Thanks and regards.
This paragraph is just an introduction. If you wanna skip it, do it. ;-)
I have implemented a similar approach with traffic jams, but it was not a real-time system. The solution runs every X minutes, which is absolutely fine.
That gave me the benefit of pre-calculating the paths and routes for the complete road network before the actual OptaPlanner calculation starts.
This saves time during the actual OptaPlanner calculation.
The network consists of vertices and arcs. For each arc you'll have a weight.
Here starts the real deal for you.
Let's assume you implement a Dijkstra or A* algorithm in the precalculation step for all places and how to get there. These pathfinding algorithms select the arc with the lowest "travelling" cost. For each arc/road that is blocked, we assume a distance of Double.MAX_VALUE. This value can be interpreted as "not driveable", or, put drastically: this connection between two vertices simply doesn't exist for the current solution-finding process. So the pathfinding algorithm will simply skip this road. For every driveable road, we calculate the real cost, e.g. the distance, or take an approximation based on experience.
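To illustrate the idea (this is not OptaPlanner API, just a hypothetical adjacency-map road network), here is a minimal Dijkstra sketch that treats Double.MAX_VALUE arcs as non-existent:

```java
import java.util.*;

public class BlockedArcDijkstra {

    /** Arc weights per origin vertex; a blocked road is stored as Double.MAX_VALUE. */
    private final Map<Long, Map<Long, Double>> arcWeights;

    public BlockedArcDijkstra(Map<Long, Map<Long, Double>> arcWeights) {
        this.arcWeights = arcWeights;
    }

    /** Lowest travel cost from source to every reachable vertex, skipping blocked arcs. */
    public Map<Long, Double> shortestCostsFrom(long source) {
        Map<Long, Double> best = new HashMap<>();
        // Queue entries: [cost, vertexId], ordered by cost.
        PriorityQueue<double[]> queue =
                new PriorityQueue<>(Comparator.comparingDouble((double[] e) -> e[0]));
        queue.add(new double[]{0.0, source});
        while (!queue.isEmpty()) {
            double[] entry = queue.poll();
            double cost = entry[0];
            long vertex = (long) entry[1];
            if (best.containsKey(vertex)) {
                continue; // already settled with a lower or equal cost
            }
            best.put(vertex, cost);
            for (Map.Entry<Long, Double> arc
                    : arcWeights.getOrDefault(vertex, Map.of()).entrySet()) {
                double weight = arc.getValue();
                if (weight == Double.MAX_VALUE) {
                    continue; // blocked road: treat as non-existent for this run
                }
                if (!best.containsKey(arc.getKey())) {
                    queue.add(new double[]{cost + weight, arc.getKey()});
                }
            }
        }
        return best;
    }
}
```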
The OptaPlanner process itself just uses the precalculated pathfinding results, e.g. it compares the calculated distances for getting from place A to place B.
For setting the distance variable to Double.MAX_VALUE, you can decide between user-based information, information from other providers such as Google, or admin-based rules. Since my experience covers both user-based content and admin-based actions, I can recommend both ways.
Let's discuss the user-based action: The user can have the same set of GUI actions as the admin for flagging a road as "jammed". For the next OptaPlanner iteration, the flag is taken into account. If you have GPS data from your users, you can derive their approximate velocity. For each interval of the GPS measurement you can calculate that velocity. If the velocity is measured on a road (not at a crossing) and is below a defined minimum velocity (let's say 1 mph or 2 km/h), then you can ask the user via a popup whether it's a traffic jam, OR block that road automatically without asking. If you choose the popup dialog, then a number of different users have to vote "yes" within a defined time slot, e.g. half an hour, before the road gets blocked. You can resolve the traffic jam when enough users drive the road again and send GPS coordinates from that road.
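A minimal sketch of the velocity check described above, assuming two consecutive GPS fixes per measurement interval; the 2 km/h threshold follows the numbers in the text, everything else (names, haversine helper) is my own assumption:

```java
public class JamDetector {

    private static final double MIN_VELOCITY_KMH = 2.0;       // below this we suspect a jam
    private static final double EARTH_RADIUS_M = 6_371_000.0;

    /** Returns true if the average speed between two fixes suggests a traffic jam. */
    public static boolean looksJammed(double lat1, double lon1, long time1Millis,
                                      double lat2, double lon2, long time2Millis) {
        double seconds = (time2Millis - time1Millis) / 1000.0;
        if (seconds <= 0) {
            return false; // invalid or duplicate timestamps, cannot judge
        }
        double speedKmh = haversineMeters(lat1, lon1, lat2, lon2) / seconds * 3.6;
        return speedKmh < MIN_VELOCITY_KMH;
    }

    /** Great-circle distance between two lat/lon pairs in metres. */
    private static double haversineMeters(double lat1, double lon1,
                                          double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }
}
```

The result of looksJammed() would then feed either the popup/vote logic or the automatic blocking, depending on which variant you choose.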
The main advantage of the automatic approach is that you get a system-based approach with a low error rate.
If you take the manual approach via admin, then you have to take care of implementing a GUI for displaying the roads and enabling/disabling the blocked attribute for a road.
I'm developing an app that tracks walks, bicycle rides, car rides, etc. I need precise info, so I would basically like to use only GPS. Still, most sources I found recommend using Google's fused API, e.g. for power-saving reasons, so I went with the fused API.
Now once in a while (once or twice a month) I get one freak value among thousands of good ones. A few of them I got near railway stations, where the freak value is at another railway station several kilometers away, so I assume it is a wrongly interpreted WiFi-based position.
Here's one example, where I ride my bicycle from the river towards the main railway station located east of the river. Once I arrive at the main station, I get no position for 126 s (I asked for one every 10 s, so I probably lost the GPS signal), and then suddenly I get a freak GPS value at another railway station 3450 m away on the other side of the river. The reported accuracy for the freak value is 20 m.
The problem is that I cannot easily identify and filter these freak values.
Calling currentLocation.getProvider() always returns "fused", which is not very helpful.
Also, Location.getAccuracy() typically returns values below 100 m.
So today I filter based on evaluating speed combined with unrealistic changes in bearing, but I'm afraid I might also discard good samples in the process.
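For reference, a minimal sketch of that kind of speed/bearing plausibility filter with the Android Location API; the thresholds are made-up placeholders that would need tuning per activity (walking vs. driving), not recommendations:

```java
import android.location.Location;

public class FreakValueFilter {

    private static final float MAX_PLAUSIBLE_SPEED_MPS = 15f;  // placeholder, ~54 km/h
    private static final float MAX_PLAUSIBLE_TURN_DEG = 120f;  // bearing change between fixes

    private Location previous;
    private float previousBearing = Float.NaN;

    /** Returns true if the new fix should be kept, false if it looks like a freak value. */
    public boolean accept(Location current) {
        if (previous == null) {
            previous = current;
            return true;
        }
        float seconds = (current.getTime() - previous.getTime()) / 1000f;
        if (seconds <= 0f) {
            return false;
        }
        float speed = previous.distanceTo(current) / seconds;          // m/s
        float bearing = previous.bearingTo(current);                   // degrees
        boolean tooFast = speed > MAX_PLAUSIBLE_SPEED_MPS;
        boolean impossibleTurn = !Float.isNaN(previousBearing)
                && Math.abs(angleDiff(bearing, previousBearing)) > MAX_PLAUSIBLE_TURN_DEG
                && speed > 3f;                                         // ignore turns while slow
        if (tooFast || impossibleTurn) {
            return false;                                              // discard the sample
        }
        previous = current;
        previousBearing = bearing;
        return true;
    }

    /** Difference between two bearings, normalised to [-180, 180). */
    private static float angleDiff(float a, float b) {
        return (a - b + 540f) % 360f - 180f;
    }
}
```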
I scanned a lot of Stack Overflow, but strangely enough I haven't found any relevant answers yet.
I now feel like moving to the old framework Location API and using GPS-based data only. But is that really necessary, or does anybody have an idea how to avoid getting the freak values, or alternatively how to easily identify and discard all WiFi-based positions?
And will using the framework Location API result in worse battery life?
I am currently building an application similar to the OptaPlanner Vehicle Routing example. The difference is that it is web based, and the visualization & distance calculation will use the GWT Google Maps V3 Directions service, just like the OptaPlanner blog post here: Visualizing Vehicle Routing with Leaflet and Google Maps.
I am actually a little bit confused about calculating the distance between each pair of locations: should I do it in real time? What I mean by real time is: first load the locations (about 350 locations) and then calculate the distance between each pair (which will result in 350 x 350 = 122,500 direction requests) before starting the solving phase.
The other way I can think of is to calculate the distances between all locations and store them in a database, then load the data before starting the solving phase. But if I choose this way, how do I handle location changes, i.e. a new location being added or an existing location being deleted?
I have also read about the Google Maps API limitation; it states that the service only allows 2,500 requests per 24 hours. How do I work around this limitation?
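For what it's worth, here is a minimal sketch of the second approach: a persisted distance matrix that is only extended for pairs it has never seen, so the daily request quota is spent only on new locations. The DAO and direction-service interfaces are hypothetical placeholders, not a real Google Maps client:

```java
import java.util.List;

public class DistanceMatrixCache {

    /** Hypothetical persistence layer: load/save one distance per (from, to) pair. */
    public interface DistanceDao {
        Double find(long fromId, long toId);
        void save(long fromId, long toId, double meters);
    }

    /** Hypothetical wrapper around the directions/distance web service. */
    public interface DirectionService {
        double requestDistanceMeters(long fromId, long toId);
    }

    private final DistanceDao dao;
    private final DirectionService service;

    public DistanceMatrixCache(DistanceDao dao, DirectionService service) {
        this.dao = dao;
        this.service = service;
    }

    /** Returns the cached distance, requesting and storing it only if missing. */
    public double distance(long fromId, long toId) {
        Double cached = dao.find(fromId, toId);
        if (cached != null) {
            return cached;
        }
        double meters = service.requestDistanceMeters(fromId, toId);
        dao.save(fromId, toId, meters);
        return meters;
    }

    /** When a location is added, only its row and column need new requests. */
    public void addLocation(long newId, List<Long> existingIds) {
        for (long other : existingIds) {
            distance(newId, other);
            distance(other, newId); // directions are not necessarily symmetric
        }
    }
}
```

Deleting a location then just means removing its rows from the table, and adding one only costs 2 x N new requests instead of recomputing the whole matrix.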
Any comments and answers will be appreciated. Thanks and regards.
I have successfully used MapPoint together with MPMileage and CDXZipStream to maintain a database of locations (address + coordinates) using MapPoint and CDXZipStream. Drive times between two points were maintained using MPMileage and MapPoint. MapPoint is no longer being sold by Microsoft, but you may be able to find a copy on eBay or find an alternative. MPMileage and CDX just made my job easier. I was able to interrogate MapPoint as much as I wanted - it provided travel time for about 8 trips per second - no limit but your time. My database now holds over 600,000 trips and 15,000 locations. Also, these products require expenditure of some $. I spent about $300 for the three products I mentioned, a lot less than a Google commercial license. Maptitude can be a replacement for MapPoint, but you may not be able to control street speeds as well with Maptitude.
Prior to running a solution, I had a query make certain that the required coordinates were geocoded and that the potential legs (travel between two points) were in the database. If not, I'd fill in or update the values using the tools I mentioned. My particular method is not conducive to on demand work, but you can probably program such a process yourself.
I limited search space by imposing some reasonable assumptions, e.g. no trips over 13 miles in my case. You may be able to impose similar constraints to limit your search space. At any one time, I probably only use about 60,000 of the travel times as only the needed times are loaded - and the rest remain in the database in case I need them in the future. Within the OptaPlanner solution, these are facts, not entities or variables. These facts provide the travel time between the two points.
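As an illustration only (the class and field names are hypothetical, not taken from the OptaPlanner examples), precomputed legs can be loaded as plain problem facts and looked up during score calculation:

```java
import java.util.HashMap;
import java.util.Map;

/** A precomputed leg between two stops, loaded from the database as a problem fact. */
public class TravelTimeFact {

    private final long fromLocationId;
    private final long toLocationId;
    private final long travelTimeSeconds;

    public TravelTimeFact(long fromLocationId, long toLocationId, long travelTimeSeconds) {
        this.fromLocationId = fromLocationId;
        this.toLocationId = toLocationId;
        this.travelTimeSeconds = travelTimeSeconds;
    }

    public long getFromLocationId() { return fromLocationId; }
    public long getToLocationId() { return toLocationId; }
    public long getTravelTimeSeconds() { return travelTimeSeconds; }
}

/** Simple lookup used by the score calculation; facts never change during solving. */
class TravelTimeLookup {

    private final Map<String, Long> secondsByLeg = new HashMap<>();

    public void add(TravelTimeFact fact) {
        secondsByLeg.put(fact.getFromLocationId() + "->" + fact.getToLocationId(),
                fact.getTravelTimeSeconds());
    }

    public long travelTimeSeconds(long fromId, long toId) {
        Long seconds = secondsByLeg.get(fromId + "->" + toId);
        if (seconds == null) {
            throw new IllegalStateException("No precomputed leg for " + fromId + " -> " + toId);
        }
        return seconds;
    }
}
```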
Hope this helps.
I'm trying to compare multiple algorithms that are used to smooth GPS data. I'm wondering what should be the standard way to compare the results to see which one provides better smoothing.
I was thinking of a machine learning approach: to create a car model based on a classifier and check which approach provides better behaviour on the tracks.
For those of you who have more experience with this stuff: is this a good approach? Are there other ways to do this?
Generally, there is no universally valid way for comparing two datasets, since it completely depends on the applied/required quality criterion.
For your approach
"I was thinking of a machine learning approach: to create a car model based on a classifier and check which approach provides better behaviour on the tracks."
this means that you will need to define your term "better behaviour" mathematically.
One possible quality criterion for your application is as follows (it consists of two parts that express opposing quality aspects):
First part (deviation from raw data): Compute the RMSE (root mean squared error) between the smoothed data and the raw data. This gives you a measure of the deviation of your smoothed track from the given raw coordinates. This means that the error (RMSE) increases if you smooth more, and it decreases if you smooth less.
Second part (track smoothness): Compute the mean absolute lateral acceleration that the car would experience along the track (the second derivative). This will decrease if you smooth more and increase if you smooth less, i.e. it behaves contrary to the RMSE.
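A minimal sketch of both measures, assuming the raw and smoothed tracks are already projected to local x/y metres, sampled at a fixed interval and paired index by index (these simplifications are mine, not part of the criterion above):

```java
public class SmoothingMetrics {

    /** Root mean squared error between smoothed and raw positions (paired by index). */
    public static double rmse(double[][] raw, double[][] smoothed) {
        double sum = 0.0;
        for (int i = 0; i < raw.length; i++) {
            double dx = smoothed[i][0] - raw[i][0];
            double dy = smoothed[i][1] - raw[i][1];
            sum += dx * dx + dy * dy;
        }
        return Math.sqrt(sum / raw.length);
    }

    /**
     * Mean absolute lateral (centripetal) acceleration along the smoothed track,
     * approximated as speed times heading change rate per sample.
     */
    public static double meanAbsLateralAcceleration(double[][] track, double dtSeconds) {
        double sum = 0.0;
        int count = 0;
        for (int i = 1; i < track.length - 1; i++) {
            double vx = (track[i + 1][0] - track[i - 1][0]) / (2 * dtSeconds);
            double vy = (track[i + 1][1] - track[i - 1][1]) / (2 * dtSeconds);
            double speed = Math.hypot(vx, vy);
            double headingIn = Math.atan2(track[i][1] - track[i - 1][1],
                                          track[i][0] - track[i - 1][0]);
            double headingOut = Math.atan2(track[i + 1][1] - track[i][1],
                                           track[i + 1][0] - track[i][0]);
            double dHeading = normalize(headingOut - headingIn);   // radians
            sum += Math.abs(speed * dHeading / dtSeconds);         // a_lat ~ v * dpsi/dt
            count++;
        }
        return count == 0 ? 0.0 : sum / count;
    }

    /** Wrap an angle difference into [-pi, pi]. */
    private static double normalize(double angle) {
        while (angle > Math.PI) angle -= 2 * Math.PI;
        while (angle < -Math.PI) angle += 2 * Math.PI;
        return angle;
    }
}
```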
Result evaluation:
(1) Find a sequence of your data where you know that the underlying GPS track is a straight line or where the tracked object is not moving. Note that for those tracks, the (lateral) acceleration is zero by definition(!).
For these, compute RMSE and mean absolute lateral acceleration.
The RMSE of approaches that have (almost) zero acceleration results from measurement inaccuracies!
(2) Plot the results in a coordinate system with the RMSE on the x axis and the mean acceleration on the y axis.
(3) Pick all approaches that have an RMSE similar to what you found in step (1).
(4) From those approaches, pick the one(s) with the smallest acceleration. Those give you the smoothest track with an error explained through measurement inaccuracies!
(5) You're done :)
I have no experience on this topic, but I have a few things in mind that may help you.
You know it is a car. You know the data is generated by a car, so you can define a set of properties of a car. For example, if a car is moving at a speed above 50 km/h, then the angle of a corner should be at least 110 degrees. I am absolutely guessing at the values, but if you do a little research I am sure you will be able to define such properties. The next thing you can do is test how well each approximation fits the car properties and choose the best one.
Raw data. I assume you are testing all methods on a given part of a road. You can generate a "raw GPS track" - a track that best fits the movement of the car. Google Maps may help you generate such a track, or some GPS device with higher accuracy. Then you measure the distance between each approximation and your generated track - the one with the minimum distance wins.
I think you can easily match the coordinates after the address conversion,
because an address has a street, area and city, so you can easily match within different radii.
Try this link
Take a look at this paper that discusses comparing machine learning algorithms:
"Choosing between two learning algorithms
based on calibrated tests" available at:
http://www.cs.waikato.ac.nz/ml/publications/2003/bouckaert-calibrated-tests.pdf
Also check out this paper:
"Bayesian Comparison of Machine Learning Algorithms on Single and
Multiple Datasets" available at:
http://www.jmlr.org/proceedings/papers/v22/lacoste12/lacoste12.pdf
Note: From the question, it seems you are looking for the best way to compare the results of machine learning algorithms, not for additional machine learning algorithms that might implement this feature.
Machine learning is not a well-suited approach for this task; you would have to define what good smoothing is...
In principle, your task cannot be solved by an algorithm that gives a general answer, because every smoothing destroys the original data by some amount and adds invented positions, and different systems/humans that use the smoothed data react differently to that changed data.
The question is: What do you want to achieve with smoothing?
Why do you need smoothing? (Have you forgotten to implement or enable a stand-still filter that eliminates movement while the vehicle is standing still? With GPS, standing still introduces jumping locations.)
The GPS chip already has (best possible?) real-time smoothing built in, using a Kalman filter; on the one hand it has more information than a post-processing smoothing algorithm, on the other hand it has less.
So next you have to ask yourself: are you comparing post-processing smoothing algorithms or real-time algorithms? (Probably post-processing.) Comparing a real-time smoothing algorithm with a post-processing smoothing algorithm is not fair.
Again: what do you expect from smoothed data? That it looks somewhat fine, but unrealistic, like photoshopped models in TV advertisements?
What is good smoothing? A track near the real vehicle position, which nobody ever knows, or a curve with low acceleration?
I would prefer a smoothing algorithm that produces the curve closest to the real (usually unknown) vehicle trajectory.
Or you might just think it should somehow look beautiful: in that case, overlay the curves in different colors, display them on a satellite image map, and let a team of humans (experts, at least owning and driving their own car) decide what looks good and realistic.
We humans have the best multi purpose pattern matching algorithm built in.
Again, why smooth? For display on a map, to please the humans who look at that map?
Or to use the smoothed tracks to feed other algorithms that have problems with the original data?
To please humans, I have given an answer above.
To please other algorithms:
What do they need? Nearer positions, or better course/direction values between points?
Which attributes do you want to smooth: only the latitude/longitude coordinates, or also the speed and course values?
I have a lot of professional experience with GPS tracks, and I recommend just removing every location under 7 km/h and keeping the rest as it is. In most cases there is no need for further smoothing.
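A minimal sketch of such a stand-still filter, assuming android.location.Location fixes, with a distance/time fallback when the receiver does not report a speed:

```java
import android.location.Location;
import java.util.ArrayList;
import java.util.List;

public class StandStillFilter {

    private static final float MIN_SPEED_MPS = 7f / 3.6f; // 7 km/h, as suggested above

    /** Keeps only fixes where the vehicle was actually moving; everything else is dropped. */
    public static List<Location> removeStandStill(List<Location> track) {
        List<Location> result = new ArrayList<>();
        Location previous = null;
        for (Location fix : track) {
            float speed;
            if (fix.hasSpeed()) {
                speed = fix.getSpeed();                     // m/s, as delivered by the receiver
            } else if (previous != null && fix.getTime() > previous.getTime()) {
                float seconds = (fix.getTime() - previous.getTime()) / 1000f;
                speed = previous.distanceTo(fix) / seconds; // fallback: distance over time
            } else {
                speed = 0f;                                 // first fix or bad timestamps
            }
            if (speed >= MIN_SPEED_MPS) {
                result.add(fix);
            }
            previous = fix;
        }
        return result;
    }
}
```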
Otherwise it gets expensive:
A possible solution:
1) You arrange a €2000 reference GPS receiver delivered with a magnetic vehicle roof antenna (e.g. from the company Hemisphere) and use that as the reference.
2) You use a consumer GPS device of the kind usually used for your task (a smartphone, etc.).
Both are mounted inside the car: drive some test tracks in good conditions (highways), but more tracks in very bad ones: strong curves combined with big houses to the left and right, and through tunnels, a straight one and a curved one, if you have them.
3) Apply the smoothing algorithms to the consumer GPS tracks.
4) Compare the smoothed tracks to the reference track by matching pairs of positions and finally calculating the RMSE (root mean squared error).
Difficulties
Matching two positions: hopefully the timestamps can be matched exactly, which is usually not the case (an offset of 0.5 s is possible).
Think about what you do when there is a GPS outage.
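One way to deal with the time offset is to interpolate the reference track at each consumer timestamp before computing the RMSE; here is a minimal sketch under that assumption (the Fix type and the metre-based coordinates are my own simplification):

```java
public class ReferenceComparison {

    /** One fix: timestamp in seconds plus a position in local metres. */
    public record Fix(double timeSeconds, double x, double y) {}

    /** RMSE of the smoothed consumer track against the reference track. */
    public static double rmseAgainstReference(Fix[] reference, Fix[] smoothed) {
        double sum = 0.0;
        int count = 0;
        for (Fix fix : smoothed) {
            Fix ref = interpolate(reference, fix.timeSeconds());
            if (ref == null) {
                continue; // outside the reference time range (e.g. GPS outage): skip
            }
            double dx = fix.x() - ref.x();
            double dy = fix.y() - ref.y();
            sum += dx * dx + dy * dy;
            count++;
        }
        return count == 0 ? Double.NaN : Math.sqrt(sum / count);
    }

    /** Linear interpolation of the reference position at the requested time. */
    private static Fix interpolate(Fix[] reference, double t) {
        for (int i = 1; i < reference.length; i++) {
            Fix a = reference[i - 1];
            Fix b = reference[i];
            if (a.timeSeconds() <= t && t <= b.timeSeconds()) {
                double span = b.timeSeconds() - a.timeSeconds();
                double f = span == 0 ? 0 : (t - a.timeSeconds()) / span;
                return new Fix(t, a.x() + f * (b.x() - a.x()), a.y() + f * (b.y() - a.y()));
            }
        }
        return null;
    }
}
```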
Consider first displaying a raw track and identifying what kind of unsmoothed data is not suitable or nice looking. (I will probably post the pics here later.)
What about using the good old Kalman filter!
In my naive beginner Android mind, I thought the way to do this would be to loop through each of the objects, checking whether its proximity falls within X range, and if so, include the object. This is being done with Google Maps and GeoPoints.
That said, I know this is probably the slowest way possible. I searched for Android proximity algorithms and did not find much. What I am looking for are the best options for doing this more efficiently.
Are there any libraries I have not been able to find?
If not, should I load these Location objects into SQL and go from there, or keep them in a JSONArray?
Once I establish the best data structure, what is the best method to find all Locations within X miles of the user?
I am not asking for cut-and-paste code, rather the best method to do this efficiently. Then, I can stumble through the code :)
My first gut feeling is to group the Locations by regions but I'm not exactly sure how to do this.
I could potentially have tens of thousands of datapoints.
Any help in simply heading in the right direction is greatly appreciated.
As a side note, I reached this juncture after discovering that a remote API I had been using was.. well.. just PLAIN WRONG and omitting datapoints from my proximity search. I also realized that if I just placed the datapoints on the phone, then I could allow the user to run the app without an internet connection, using only GPS, and this would be a HUGE plus. So, with all setbacks come opportunities!
The answer depends on the representation of the GeoPoints: if they are not sorted, you need to scan all of them (this is done in linear time; sorting with respect to distance, or clustering, would be more expensive). Use Location.distanceTo(Location) or Location.distanceBetween(double, double, double, double, float[]) to calculate the distances.
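A minimal sketch of that linear scan using Location.distanceBetween (the lat/lon-array point type and the distance limit are placeholders):

```java
import android.location.Location;
import java.util.ArrayList;
import java.util.List;

public class ProximityScan {

    /** Returns all points within limitMeters of the user's position; runs in linear time. */
    public static List<double[]> pointsWithin(double userLat, double userLon,
                                              List<double[]> latLonPoints,
                                              float limitMeters) {
        List<double[]> result = new ArrayList<>();
        float[] distance = new float[1];
        for (double[] point : latLonPoints) {
            Location.distanceBetween(userLat, userLon, point[0], point[1], distance);
            if (distance[0] <= limitMeters) {
                result.add(point);
            }
        }
        return result;
    }
}
```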
If the GeoPoints were sorted with respect to the distance to your position, this task could be done much more efficiently, but since the supplier does not know your position, I assume this cannot be done.
If the GeoPoints are clustered, i.e. if you have a set of clusters, each with a center and a radius, select every cluster whose center lies within the limit plus its radius of your position. For these clusters you need to check each GeoPoint contained in the cluster (some of them may be farther away from your position than the limit allows). Alternatively, you might accept the error and include all points of the cluster (if the radius is relatively small, I would recommend this).
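And a minimal sketch of the cluster variant (the Cluster type and its fields are hypothetical):

```java
import android.location.Location;
import java.util.ArrayList;
import java.util.List;

public class ClusterProximity {

    /** Hypothetical cluster: a center plus the radius that covers all contained points. */
    public static class Cluster {
        double centerLat, centerLon;
        double radiusMeters;
        List<double[]> points = new ArrayList<>(); // lat/lon pairs
    }

    public static List<double[]> pointsWithin(double userLat, double userLon,
                                              List<Cluster> clusters, float limitMeters) {
        List<double[]> result = new ArrayList<>();
        float[] d = new float[1];
        for (Cluster cluster : clusters) {
            Location.distanceBetween(userLat, userLon,
                    cluster.centerLat, cluster.centerLon, d);
            if (d[0] > limitMeters + cluster.radiusMeters) {
                continue; // the whole cluster is out of range, skip all its points
            }
            for (double[] point : cluster.points) {
                Location.distanceBetween(userLat, userLon, point[0], point[1], d);
                if (d[0] <= limitMeters) {
                    result.add(point); // or accept the whole cluster if its radius is small
                }
            }
        }
        return result;
    }
}
```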
Please consider this scenario:
An app knows which of a few routes the phone is on, thanks to GPS. That means it knows the only two directions the device could be traveling in.
Am I right in thinking that the best way to determine which direction the phone is moving in (it will almost certainly not be pointing the right way, so the compass is not an option) is to poll the GPS until it starts moving and find the direction in which the coordinates are moving?
How regularly, and for how long, do you suggest the polling should run?
Thanks in advance!
This is a science in itself; read up on Kalman filtering. Basically, the difference between the last two points given by the GPS is the direction you are moving in. Then errors come into the equation, and you need to start learning about good ways to filter the data and get better results.
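A minimal sketch of getting that direction from the last two fixes with the Android Location API; whether it matches one of the two known route directions is then a simple angle comparison (the tolerance is a placeholder):

```java
import android.location.Location;

public class DirectionFromFixes {

    /** Bearing of travel in degrees east of true north, derived from two consecutive fixes. */
    public static float travelBearing(Location previous, Location current) {
        return previous.bearingTo(current); // a value between -180 and 180 degrees
    }

    /** True if the measured bearing is within toleranceDeg of the expected route direction. */
    public static boolean matchesDirection(float measuredDeg, float expectedDeg,
                                           float toleranceDeg) {
        float diff = Math.abs((measuredDeg - expectedDeg + 540f) % 360f - 180f);
        return diff <= toleranceDeg;
    }
}
```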
Attempt at explaining Kalman filtering:
A Kalman filter uses a model to predict new values for the quantity being estimated. It makes an assumption like "things usually move in the direction of their velocity, so if it was here a second ago, it will be there now". It then uses this model to predict the next point, and when it can actually measure the next point, it uses that data to update the model and assess the accuracy of the prediction. Then it starts giving you estimates based on a combination of real data and predictions, weighted according to its measurements of the prediction accuracy. So if the model is normally very accurate but there is suddenly a jump in the data, it will assume the jump is a fluke and not let it affect the value too much. If the data is very jumpy, it will trust the data more and the model less.
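For illustration, a minimal one-dimensional constant-velocity Kalman filter along the lines of that description (run one instance per coordinate; the noise values are arbitrary placeholders that would need tuning):

```java
public class SimpleKalman1D {

    // State: position and velocity, plus the 2x2 state covariance.
    private double position;
    private double velocity;
    private double p00 = 1, p01 = 0, p10 = 0, p11 = 1;

    private final double processNoise;      // how much we distrust the constant-velocity model
    private final double measurementNoise;  // how much we distrust the GPS measurement

    public SimpleKalman1D(double initialPosition, double processNoise, double measurementNoise) {
        this.position = initialPosition;
        this.processNoise = processNoise;
        this.measurementNoise = measurementNoise;
    }

    /** Predict with the motion model, then correct with the measured position. */
    public double update(double measuredPosition, double dt) {
        // Predict: "it was here a moment ago and moving, so it should be there now".
        position += velocity * dt;
        double n00 = p00 + dt * (p10 + p01) + dt * dt * p11 + processNoise;
        double n01 = p01 + dt * p11;
        double n10 = p10 + dt * p11;
        double n11 = p11 + processNoise;
        p00 = n00; p01 = n01; p10 = n10; p11 = n11;

        // Update: weight the measurement by how much we currently trust prediction vs. data.
        double innovation = measuredPosition - position;
        double s = p00 + measurementNoise;
        double k0 = p00 / s;
        double k1 = p10 / s;
        position += k0 * innovation;
        velocity += k1 * innovation;
        p11 -= k1 * p01;
        p10 -= k1 * p00;
        p01 -= k0 * p01;
        p00 -= k0 * p00;
        return position;
    }
}
```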