I am reading about Received Signal Strength Indicator(RSSI). This could be used for our particular case, a rough estimation of distance between devices. But maybe there could be something to combine in order to improve the accuracy.
There are too many variables at play to accurately determine such data.
It's completely impractical to obtain a reliable distance figure just looking at the RSSI. It may give you an order of magnitude but nothing remotely accurate.
Most notable is the variety of devices and underlying hardware used. But take into account this simple example: you have 2 pairs of devices, 1 pair has good signal for a range of 30m without obstacles within and another pair has the same value for what would be considered a good signal but they are within 1m but with an obstacle which causes heavy interefernce. Any interpretation from empirical data would be awfully inaccurate.
The best change you have is to pinpoint each device's location by GPS and communicate&compare coordinates. But again, the underlying hardware comes at play. Also the length of time it takes to pinpoint with a better accuracy. We're talking about 5-50 meters. So it's basically mostly noise.
Go beyond that and you go out of wi-fi direct / peer-to-peer range.
As such, trying to estimate distances begtween 2 android devices at this time would be in my view an utter waste of time. (unless you are building a huge network of interconnected wi-fi direct devices in which case you could come up with some cool applications, or at least visualisation stuff - that's a bit of a stretch considering one-to-many limitations).
Related
Server is receiving a certain rate(12 per minute) of monitoring data for some process via external source(web services, etc). Now process may run for a minute(or less than) or for an hour or a day. At the end of the process, I may be having 5 or 720 or 17280 data points. This data is being gathered for more than 40 parameters and stored into the database for future display via web. Imagine more than 1000 processes are running and the amount of data generated. I have to stick to RDBMS(MySQL specifically). Therefore, I want to process the data and decrease the amount the data by selecting only statistically significant points before storing the data to the database. The ultimate objective is to plot these data points over a graph where Y-axis will be time and X-axis will be represented by some parameter(part of data point).
I do not want to miss any significant fluctuation or nature but at the same time I cannot manage to plot all of the data points(in case the number is huge > 100).
Please note that I am aware of basic statistical terms like mean, standard deviation, etc.
If this is a constant process, you could plot the mean (should be a flat line) and any points that exceeded a certain threshold. Three standard deviations might be a good threshold to start with, then see whether it gives you the information you need.
If it's not a constant process, you need to figure out how it should be varying with time and do a similar thing: plot the points that substantially vary from your expectation at that point in time.
That should give you a pretty clean graph while still communicating the important information.
If you expect your process to be noisy, then doing some smoothing through a spline can help you reduce noise and compress your data (since to draw a spline you need only a few points, where "few" is arbitrary picked by you, depending on how much detail you want to get rid of).
However, if your process is not noisy, then outliers are very important, since they may represent errors or exceptional conditions. In this case, you are better off getting rid of the points that are close to the average (say less than 1 standard deviation), and keeping those that are far.
A little note: the term "statistically significant", describes a high enough level of certainty to discard the null hypothesis. I don't think it applies to your problem.
I'm trying to compare multiple algorithms that are used to smooth GPS data. I'm wondering what should be the standard way to compare the results to see which one provides better smoothing.
I was thinking on a machine learning approach. To crate a car model based on a classifier and check on which tracks provides better behaviour.
For the guys who have more experience on this stuff, is this a good approach? Are there other ways to do this?
Generally, there is no universally valid way for comparing two datasets, since it completely depends on the applied/required quality criterion.
For your appoach
I was thinking on a machine learning approach. To crate a car model
based on a classifier and check on which tracks provides better
behaviour.
this means that you will need to define your term "better behavior" mathematically.
One possible quality criterion for your application is as follows (it consists of two parts that express opposing quality aspects):
First part (deviation from raw data): Compute the RMSE (root mean squared error) between the smoothed data and the raw data. This gives you a measure for the deviation of your smoothed track from the given raw coordinates. This means, that the error (RMSE) increases, if you are smoothing more. And it decreases if you are smoothing less.
Second part (track smoothness): Compute the mean absolute lateral acceleration that the car will experience along the track (second deviation). This will decrease if you are smoothing more, and it will increase if you are smoothing less. I.e., it behaves in contrary to the RMSE.
Result evaluation:
(1) Find a sequence of your data where you know that the underlying GPS track is a straight line or where the tracked object is not moving. Note, that for those tracks, the (lateral) acceleration is zero by definition(!).
For these, compute RMSE and mean absolute lateral acceleration.
The RMSE of appoaches that have (almost) zero acceleration results from measurement inaccuracies!
(2) Plot the results in a coordinate system with the RMSE on the x axis and the mean acceleration on the y axis.
(3) Pick all approaches that have an RMSE similar to what you found in step (1).
(4) From those approaches, pick the one(s) with the smallest acceleration. Those give you the smoothest track with an error explained through measurement inaccuracies!
(5) You're done :)
I have no experience on this topic but I have few things in mind that may help you.
You know it is a car. You know that the data is generated from a car so you can define a set of properties of a car. For example if a car is moving with speed above 50km than the angle of the corner should be at least 110 degrees. I am absolutely guessing with the values but if you do a little research i am sure you will be able to define such properties. Next thing you can do is to test how each approximation fits the car properties and choose the best one.
Raw data. I assume you are testing all methods on a part of given road. You can generate a "raw gps track" - a track that best fits the movement of a car. Google maps may help you to generate such track os some gps devise with higher accuracy. Than you measure the distance between each approximation and your generated track - the one with the min distance wins.
i think you easily match the coordinates after the address conversion.
because address have street,area and city. so you can easily match the different radius.
let try this link
Take a look at this paper that discusses comparing machine learning algorithms:
"Choosing between two learning algorithms
based on calibrated tests" available at:
http://www.cs.waikato.ac.nz/ml/publications/2003/bouckaert-calibrated-tests.pdf
Also check out this paper:
"Bayesian Comparison of Machine Learning Algorithms on Single and
Multiple Datasets" available at:
http://www.jmlr.org/proceedings/papers/v22/lacoste12/lacoste12.pdf
Note: It is noted from the question that you are looking into the best way to compare the results for machine learning algorithms and are not looking for additional machine learning algorithms that may implement this feature.
Machine Learning is not an well suited approach for that task, you would have to define what is good smoothing...
Principially your task cannot be solved by an algorithm that gives an general answer because every smoothing destroy the original data by some amount and adds invented positions, and different systems/humans that use the smoothed data react differently on that changed data.
The question is: What do you want to achieve with smoothing?
Why do you need smoothing? (have you forgotten to implement or enable a stand still filter that eliminates movement while the vehicle is standing still, which in GPS introduces jumping location during stand still?)
The GPS chip has already built in a (best possible?) real time smoothing using a Kalman filter, having on the one side more information than a post processed smotthing algo, on the other side it has less.
So next you have to ask yourself: do you compare post processing smooting algos or real time algos? (probably post processing) Comparing a real time smoothing algorithm with a post process smoothing algorithm is not fair.
Again: What do you expect from smoothed data: That they look somewhat fine, but unrealistic like photoshopped models for tv-advertisments?
What is good smoothing? near to real vehicle postion which nobody ever knows, or a curve whith low acceleration?
I would prefer an smoothing algorithm that produces the curve most near to the real (usually unknown) vehicle trajectory.
Or you might just think it should somehow look beautifull: In that case overlay the curves with different colors, display it on a satelitte image map, and let a team of humans (experts at least owning and driving an own car) decide what looks good and realistic.
We humans have the best multi purpose pattern matching algorithm built in.
Again why smooth?: for display in a map to please humans that look at that map?
or to use the smoothed tracks to feed other algorithms that have problems with the original data?
To please humans I have given an answer above.
To please other algorithms:
What they need? nearer positions? or better course value / direction between points.
What attributes do you want to smooth: only the latitude, longitude coordinates, or also the speed value, and course value?
I have much professional experience with GPS tracks, and recommend, to just remove every location under 7km/h and keep the rest as it is. In most cases there is no need for further smoothing.
Otherwise it gets expensive:
A possible solution:
1) You arrange a 2000€ Reference GPS receiver delivered with a magnetic vehicle roof antenna (E.g Company hemisphere 2000 GPS receiver) and use that as reference
2) You use a comnsumer GPS usually used for your task (smartphone, etc.)
Both mounted inside the car: drive some test tracks, in good conditions (highways) but more tracks at very bad: strong curves combined with big houses left and right. And through tunnel, a struight and a curved one, if you have one.
3) apply the smoothing algoritms to the consumer GPS tracks
4) compare the smoothed to the reference track, by matching two positions and finally calulate the (RMSE Root mean squared error)
Difficulties
matching two positions: Hopefully the time can be exactly matched which is usually not the case (0,5s offset possible).
Think what do you do when having an GPS outage.
Consider first to display a raw track and identify what kind of unsmoothed data is not suitable/ nice looking. (Probably later posting the pics here)
what about using the good old Kalman Filter!
I am trying to detect peaks in the accelerometer data so I can find the number of steps. The speed I have it polling on it is game. I think that should be a good speed to give me data but not to give me too many data points. Are there any algorithms you recommend to figure out the peak? I currently have the data in and excel and I tried graphing it out but there are way too many little jumps up and down.
I once used peak detection algorithm for some computer vision application. There are many sophisticated techniques, but I wrote very raw approach myself:
I used windowed averaging filter which smoothed all the local ups and downs (if your peak is narrow used smaller window size).
Then I took discrete derivative and find all points where derivative sign-ess changed from +ve to -ve. Then finally I took average of values around all those points and applied threshold around 1/3 of average.
It's not best approach but worked out for me well. You can play around with different discrete filters either in matlab or python. There is a very good plugin in python called scipy it can make ur life easy.
My goal is to recognize simple gestures from accelerometers mounted on a sun spot. A gesture could be as simple as rotating the device or moving the device in several different motions. The device currently only has accelerometers but we are considering adding gyroscopes if it would make it easier/more accurate.
Does anyone have recommendations for how to do this? Any available libraries in Java? Sample projects you recommend I check out? Papers you recommend?
The sun spot is a Java platform to help you make quick prototypes of systems. It is programmed using Java and can relay commands back to a base station attached to a computer. If I need to explain how the hardware works more leave a comment.
The accelerometers will be registering a constant acceleration due to gravity, plus any acceleration the device is subjected to by the user, plus noise.
You will need to low pass filter the samples to get rid of as much irrelevant noise as you can. The worst of the noise will generally be higher frequency than any possible human-induced acceleration.
Realise that when the device is not being accelerated by the user, the only force is due to gravity, and therefore you can deduce its attitude in space. Moreover, when the total acceleration varies greatly from 1g, it must be due to the user accelerating the device; by subtracting last known estimate of gravity, you can roughly estimate in what direction and by how much the user is accelerating the device, and so obtain data you can begin to match against a list of known gestures.
With a single three-axis accelerometer you can detect the current pitch and roll, and also acceleration of the device in a straight line. Integrating acceleration minus gravity will give you an estimate of current velocity, but the estimate will rapidly drift away from reality due to noise; you will have to make assumptions about the user's behaviour before / between / during gestures, and guide them through your UI, to provide points where the device is not being accelerated and you can reset your estimates and reliably estimate the direction of gravity. Integrating again to find position is unlikely to provide usable results over any useful length of time at all.
If you have two three-axis accelerometers some distance apart, or one and some gyros, you can also detect rotation of the device (by comparing the acceleration vectors, or from the gyros directly); integrating angular momentum over a couple of seconds will give you an estimate of current yaw relative to that when you started integrating, but again this will drift out of true rapidly.
Since no one seems to have mentioned existing libraries, as requested by OP, here goes:
http://www.wiigee.org/
Meant for use with the Wiimote, wiigee is an open-source Java based implementation for pattern matching based on accelerometer readings. It accomplishes this using Hidden Markov Models[1].
It was apparently used to great effect by a company, Thorn Technologies, and they've mentioned their experience here : http://www.thorntech.com/2013/07/mobile-device-3d-accelerometer-based-gesture-recognition/
Alternatively, you could consider FastDTW (https://code.google.com/p/fastdtw/). It's less accurate than regular DTW[2], but also computationally less expensive, which is a big deal when it comes to embedded systems or mobile devices.
[1] https://en.wikipedia.org/wiki/Hidden_Markov_model
[2] https://en.wikipedia.org/wiki/Dynamic_time_warping
EDIT: The OP has mentioned in one of the comments that he completed his project, with 90% accuracy in the field and a sub-millisecond compute time, using a variant of $1 Recognizer. He also mentions that rotation was not a criteria in his project.
What hasn't been mentioned yet is the actual gesture recognition. This is the hard part. After you have cleaned up your data (low pass filtered, normalized, etc) you still have most of the work to do.
Have a look at Hidden Markov Models. This seems to be the most popular approach, but using them isn't trivial. There is usually a preprocessing step. First doing STFT and clustering the resultant vector into a dictionary, then feeding that into a HMM. Have a look at jahmm in google code for a java lib.
Adding to moonshadow's point about having to reset your baseline for gravity and rotation...
Unless the device is expected to have stable moments of rest (where the only force acting on it is gravity) to reset its measurement baseline, your system will eventually develop an equivalent of vertigo.
Suppose I am sampling a number of signals at a fixed rate (say once per second) and extracting some metrics from the signals such as, ratio of one to the other, the rate of change, relative rate of change, etc.
I've heard that Neural Networks can be of use in discovering relationships. Is this true?
If so, what books/internet resources can I use to learn more about how do do this.
The processing is being done is Java, so a Java slant on all your answers would be most appreciated.
Thanks
Most likely you would need to determine some sort of a "window", like maybe the last 10 samples. You would normalize your signal into an array of 10 "doubles" normalized between -1 and 1. This would form the "input" into your neural network. So you would have 10 input neurons. Then you have to decide what you want the output to be. Maybe you have 100 different classifications that you may want to classify the signals into. If this is the case you would have 100 different output neurons that would each be trained to produce a higher output than the other output neurons when they recognize a specific signal.
Between the input and output layers neural networks usually have one or more hidden layers. These just provide additional capability to the neural network.
For Java neural network programming, you might try the Encog project. There is also a DotNet version of Encog as well.
That is true. You can discover relationships with NN. The problems is that it's very hard to interpret the weightings after you make the calibration.. so they are a bit of a black box (more so than other data mining algorithms).
I would actually recommend exploring the Neural Net algorithm that comes with MS Analysis Services. It's a good way to learn about NNets before you start programming anything (and since it's a server service you can call it from java).