How to do Gesture Recognition using Accelerometers

How to do Gesture Recognition using Accelerometers - java

My goal is to recognize simple gestures from accelerometers mounted on a sun spot. A gesture could be as simple as rotating the device or moving the device in several different motions. The device currently only has accelerometers but we are considering adding gyroscopes if it would make it easier/more accurate.
Does anyone have recommendations for how to do this? Any available libraries in Java? Sample projects you recommend I check out? Papers you recommend?
The sun spot is a Java platform to help you make quick prototypes of systems. It is programmed using Java and can relay commands back to a base station attached to a computer. If I need to explain how the hardware works more leave a comment.

The accelerometers will be registering a constant acceleration due to gravity, plus any acceleration the device is subjected to by the user, plus noise.
You will need to low pass filter the samples to get rid of as much irrelevant noise as you can. The worst of the noise will generally be higher frequency than any possible human-induced acceleration.
Realise that when the device is not being accelerated by the user, the only force is due to gravity, and therefore you can deduce its attitude in space. Moreover, when the total acceleration varies greatly from 1g, it must be due to the user accelerating the device; by subtracting last known estimate of gravity, you can roughly estimate in what direction and by how much the user is accelerating the device, and so obtain data you can begin to match against a list of known gestures.
With a single three-axis accelerometer you can detect the current pitch and roll, and also acceleration of the device in a straight line. Integrating acceleration minus gravity will give you an estimate of current velocity, but the estimate will rapidly drift away from reality due to noise; you will have to make assumptions about the user's behaviour before / between / during gestures, and guide them through your UI, to provide points where the device is not being accelerated and you can reset your estimates and reliably estimate the direction of gravity. Integrating again to find position is unlikely to provide usable results over any useful length of time at all.
If you have two three-axis accelerometers some distance apart, or one and some gyros, you can also detect rotation of the device (by comparing the acceleration vectors, or from the gyros directly); integrating angular momentum over a couple of seconds will give you an estimate of current yaw relative to that when you started integrating, but again this will drift out of true rapidly.

Since no one seems to have mentioned existing libraries, as requested by OP, here goes:
http://www.wiigee.org/
Meant for use with the Wiimote, wiigee is an open-source Java based implementation for pattern matching based on accelerometer readings. It accomplishes this using Hidden Markov Models[1].
It was apparently used to great effect by a company, Thorn Technologies, and they've mentioned their experience here : http://www.thorntech.com/2013/07/mobile-device-3d-accelerometer-based-gesture-recognition/
Alternatively, you could consider FastDTW (https://code.google.com/p/fastdtw/). It's less accurate than regular DTW[2], but also computationally less expensive, which is a big deal when it comes to embedded systems or mobile devices.
[1] https://en.wikipedia.org/wiki/Hidden_Markov_model
[2] https://en.wikipedia.org/wiki/Dynamic_time_warping
EDIT: The OP has mentioned in one of the comments that he completed his project, with 90% accuracy in the field and a sub-millisecond compute time, using a variant of $1 Recognizer. He also mentions that rotation was not a criteria in his project.

What hasn't been mentioned yet is the actual gesture recognition. This is the hard part. After you have cleaned up your data (low pass filtered, normalized, etc) you still have most of the work to do.
Have a look at Hidden Markov Models. This seems to be the most popular approach, but using them isn't trivial. There is usually a preprocessing step. First doing STFT and clustering the resultant vector into a dictionary, then feeding that into a HMM. Have a look at jahmm in google code for a java lib.

Adding to moonshadow's point about having to reset your baseline for gravity and rotation...
Unless the device is expected to have stable moments of rest (where the only force acting on it is gravity) to reset its measurement baseline, your system will eventually develop an equivalent of vertigo.

Related

Comparing between two skeleton tracking on processing 3

I am currently doing my dissertation which would involve in having 2 people a professional athlete and an amateur. First with the image processing skeletonization I would like to record the professional athlete while performing the squat exercise , then when the amateur performs the exercise I want to be able to compare the professional skeleton with that of the amateur to see if it is properly formed.
Please I m open for any suggestions and opinions , Would gladly appreciate some help

Here lies your question:
properly formed.
What does properly performed actually mean ? How can this be quantified ?
Bare in mind I'm not an athletic/experienced in this field.
If I were given the task I would counter-intuitively go in the opposite direction:
moving away form Processing 3/kinect/computer. I would instead:
find a professional athlete
find a skilled with trainer with functional mobility training.
find an amateur (probably easiest)
Item 2 will be trickier.For example FMS seems to put a lot of emphasis on correct exercising and mobility (to enhance performance and reduce risk of injuries). I'm not sure if that's the only approach or the best. You might want to check opinions on Physical Fitness, consult with people studying/teaching exercise science, etc. Do check credentials as it feels like a field where everyone has an opinion/preference.
The idea is to understand how a professional educated trainer asses correct movement. Take note of how that works in the real world and try to systemise it.
What are the cues for a correct execution ?
is the key poses
the motion in between
how the skeletal and muscular system work together/ the weights/forces applied/etc.
Having a better understanding of how this works in the real world should lead you to things you can start quantifying/comparing numerically on a computer.
Try to make a checklist/score system manually using a pen and paper based on the information you gather. If this works you already have a system you can start programming.
The next step is acquiring the data.
This is probably where the kinect comes, but bare in mind:
the second version of the kinect is more precise than the first
there is a Kinect2 SDK wrapper for Processing 3: use that if you can (windows only). There is a way you can get libfreenect2 working with OpenNI on osx/linux and therefore with SimpleOpenNI in Processing, but it's not straight forward and you won't have the same precision on the skeleton tracking algorithm
use data that is as precise as possible:
you can get the accuracy of a tracked skeleton joint
use an environment that doesn't contain a complex background (makes it easy to segment users and detect/track skeletons with little change of mistaking it for something else). prefer artificial non-incandescent light (less of a problem with kinect v2, but still you want as little IR interference as possible).
comparing orientation matrices or joints on single poses might not be enough to get the full picture: how do you capture/quantify motion taking into account the things that the kinect can't easily see: muscles flexing/forces applied/moving centre of gravity/etc.
try to use a grid system that will make it simple to pair the digital values with real world measurements. Check out how people used to study motion in the past, for example Étienne-Jules Marey or Eadweard Muybridge
Motion capture by Étienne-Jules Marey
Motion study by Eadweard Muybridge (notice the grid)
It's a pretty full on project to get right involving bits of anatomy/physics/kinematics/etc.
Start with the research first:
how did people study this in the past ?
what are the current developments ?
how does it work in the real world (without computers) ?
Take your constraints into account:
what resources (people/gear/etc.) can you use ?
how much time do you have available ?
Given the above, what topic/section of the project can be realistically be tackled to get useful results.
Overall probably something along these lines:
background research
real world studies
comparison system has feature which can be measured both with kinect and by a person
record data (real world data + mobility comparison evalutation and kinect data + mobility comparison)
compare data
write evaluation of findings (how effective is the system? what are limitations ? what could be improved (future work) ? etc.)
In short be aware of the kinect limitations: skeleton tracking is probability based: it's not 100% accurate. use data that's as clean/correct as possible to begin with (make it easy to acquire good data if you can control the capture environment). From what a real trainer would track, what could you track with a kinect ? do a comparison of the intersecting measurements.

How to estimate distance between two android devices through wifi-direct?

I am reading about Received Signal Strength Indicator(RSSI). This could be used for our particular case, a rough estimation of distance between devices. But maybe there could be something to combine in order to improve the accuracy.

There are too many variables at play to accurately determine such data.
It's completely impractical to obtain a reliable distance figure just looking at the RSSI. It may give you an order of magnitude but nothing remotely accurate.
Most notable is the variety of devices and underlying hardware used. But take into account this simple example: you have 2 pairs of devices, 1 pair has good signal for a range of 30m without obstacles within and another pair has the same value for what would be considered a good signal but they are within 1m but with an obstacle which causes heavy interefernce. Any interpretation from empirical data would be awfully inaccurate.
The best change you have is to pinpoint each device's location by GPS and communicate&compare coordinates. But again, the underlying hardware comes at play. Also the length of time it takes to pinpoint with a better accuracy. We're talking about 5-50 meters. So it's basically mostly noise.
Go beyond that and you go out of wi-fi direct / peer-to-peer range.
As such, trying to estimate distances begtween 2 android devices at this time would be in my view an utter waste of time. (unless you are building a huge network of interconnected wi-fi direct devices in which case you could come up with some cool applications, or at least visualisation stuff - that's a bit of a stretch considering one-to-many limitations).

GPS data comparison after smoothing

I'm trying to compare multiple algorithms that are used to smooth GPS data. I'm wondering what should be the standard way to compare the results to see which one provides better smoothing.
I was thinking on a machine learning approach. To crate a car model based on a classifier and check on which tracks provides better behaviour.
For the guys who have more experience on this stuff, is this a good approach? Are there other ways to do this?

Generally, there is no universally valid way for comparing two datasets, since it completely depends on the applied/required quality criterion.
For your appoach
I was thinking on a machine learning approach. To crate a car model
based on a classifier and check on which tracks provides better
behaviour.
this means that you will need to define your term "better behavior" mathematically.
One possible quality criterion for your application is as follows (it consists of two parts that express opposing quality aspects):
First part (deviation from raw data): Compute the RMSE (root mean squared error) between the smoothed data and the raw data. This gives you a measure for the deviation of your smoothed track from the given raw coordinates. This means, that the error (RMSE) increases, if you are smoothing more. And it decreases if you are smoothing less.
Second part (track smoothness): Compute the mean absolute lateral acceleration that the car will experience along the track (second deviation). This will decrease if you are smoothing more, and it will increase if you are smoothing less. I.e., it behaves in contrary to the RMSE.
Result evaluation:
(1) Find a sequence of your data where you know that the underlying GPS track is a straight line or where the tracked object is not moving. Note, that for those tracks, the (lateral) acceleration is zero by definition(!).
For these, compute RMSE and mean absolute lateral acceleration.
The RMSE of appoaches that have (almost) zero acceleration results from measurement inaccuracies!
(2) Plot the results in a coordinate system with the RMSE on the x axis and the mean acceleration on the y axis.
(3) Pick all approaches that have an RMSE similar to what you found in step (1).
(4) From those approaches, pick the one(s) with the smallest acceleration. Those give you the smoothest track with an error explained through measurement inaccuracies!
(5) You're done :)

I have no experience on this topic but I have few things in mind that may help you.
You know it is a car. You know that the data is generated from a car so you can define a set of properties of a car. For example if a car is moving with speed above 50km than the angle of the corner should be at least 110 degrees. I am absolutely guessing with the values but if you do a little research i am sure you will be able to define such properties. Next thing you can do is to test how each approximation fits the car properties and choose the best one.
Raw data. I assume you are testing all methods on a part of given road. You can generate a "raw gps track" - a track that best fits the movement of a car. Google maps may help you to generate such track os some gps devise with higher accuracy. Than you measure the distance between each approximation and your generated track - the one with the min distance wins.

i think you easily match the coordinates after the address conversion.
because address have street,area and city. so you can easily match the different radius.
let try this link

Take a look at this paper that discusses comparing machine learning algorithms:
"Choosing between two learning algorithms
based on calibrated tests" available at:
http://www.cs.waikato.ac.nz/ml/publications/2003/bouckaert-calibrated-tests.pdf
Also check out this paper:
"Bayesian Comparison of Machine Learning Algorithms on Single and
Multiple Datasets" available at:
http://www.jmlr.org/proceedings/papers/v22/lacoste12/lacoste12.pdf
Note: It is noted from the question that you are looking into the best way to compare the results for machine learning algorithms and are not looking for additional machine learning algorithms that may implement this feature.

Machine Learning is not an well suited approach for that task, you would have to define what is good smoothing...
Principially your task cannot be solved by an algorithm that gives an general answer because every smoothing destroy the original data by some amount and adds invented positions, and different systems/humans that use the smoothed data react differently on that changed data.
The question is: What do you want to achieve with smoothing?
Why do you need smoothing? (have you forgotten to implement or enable a stand still filter that eliminates movement while the vehicle is standing still, which in GPS introduces jumping location during stand still?)
The GPS chip has already built in a (best possible?) real time smoothing using a Kalman filter, having on the one side more information than a post processed smotthing algo, on the other side it has less.
So next you have to ask yourself: do you compare post processing smooting algos or real time algos? (probably post processing) Comparing a real time smoothing algorithm with a post process smoothing algorithm is not fair.
Again: What do you expect from smoothed data: That they look somewhat fine, but unrealistic like photoshopped models for tv-advertisments?
What is good smoothing? near to real vehicle postion which nobody ever knows, or a curve whith low acceleration?
I would prefer an smoothing algorithm that produces the curve most near to the real (usually unknown) vehicle trajectory.
Or you might just think it should somehow look beautifull: In that case overlay the curves with different colors, display it on a satelitte image map, and let a team of humans (experts at least owning and driving an own car) decide what looks good and realistic.
We humans have the best multi purpose pattern matching algorithm built in.
Again why smooth?: for display in a map to please humans that look at that map?
or to use the smoothed tracks to feed other algorithms that have problems with the original data?
To please humans I have given an answer above.
To please other algorithms:
What they need? nearer positions? or better course value / direction between points.
What attributes do you want to smooth: only the latitude, longitude coordinates, or also the speed value, and course value?
I have much professional experience with GPS tracks, and recommend, to just remove every location under 7km/h and keep the rest as it is. In most cases there is no need for further smoothing.
Otherwise it gets expensive:
A possible solution:
1) You arrange a 2000€ Reference GPS receiver delivered with a magnetic vehicle roof antenna (E.g Company hemisphere 2000 GPS receiver) and use that as reference
2) You use a comnsumer GPS usually used for your task (smartphone, etc.)
Both mounted inside the car: drive some test tracks, in good conditions (highways) but more tracks at very bad: strong curves combined with big houses left and right. And through tunnel, a struight and a curved one, if you have one.
3) apply the smoothing algoritms to the consumer GPS tracks
4) compare the smoothed to the reference track, by matching two positions and finally calulate the (RMSE Root mean squared error)
Difficulties
matching two positions: Hopefully the time can be exactly matched which is usually not the case (0,5s offset possible).
Think what do you do when having an GPS outage.
Consider first to display a raw track and identify what kind of unsmoothed data is not suitable/ nice looking. (Probably later posting the pics here)

what about using the good old Kalman Filter!

android accelometer detect jumping?

Hi!
I'm just starting out with Android development and i was wondering if there is any built in machinery in Android phones that are able to detect if a person is squatting (if a person holds his phone in his hand and keeps his hand steady, the phone changes it's altitude by approximately 1 m) or jumping. The user is holding his phone in his hand in both cases. If there is, what kind of packages should one look into to use those built in detectors?
Thank you very much for your help!

In both cases you'll want to make use of the inbuilt accelerometer. Squatting will be very difficult to detect reliably though. Jumping could be quite easy. For example, when jumping, there will be an upward acceleration significantly greater than gravity, followed by free-fall, followed by another upward acceleration upwards. (Landing).

Can you programmatically detect white noise?

The Dell Streak has been discovered to have an FM radio which has very crude controls. 'Scanning' is unavailable by default, so my question is does anyone know how, using Java on Android, one might 'listen' to the FM radio as we iterate up through the frequency range detecting white noise (or a good signal) so as to act much like a normal radio's seek function?

I have done some practical work on this specific area, i would recommend (if you have a little time for it) to try just a little experimentation before resorting to fft'ing. The pcm stream can be interpreted very complexely and subtly (as per high quality filtering and resampling) but can also be practically treated for many purposes as the path of a wiggly line.
White noise is unpredictable shaking of the line, which is never-the-less quite continuous in intensity (rms, absolute mean..) Acoustic content is recurrent wiggling and occasional surprises (jumps, leaps) :]
Non-noise like content of a signal may be estimated by performing quick calculations on a running window of the pcm stream.
For example, noise will strongly tend to have a higher value for the absolute integral of its derivative, than non-noise. I think that is the academic way of saying this:
loop(n+1 to n.length)
{ sumd0+= abs(pcm[n]);
sumd1+= abs(pcm[n]-pcm[n-1]);
}
wNoiseRatio = ?0.8; //quite easily discovered, bit tricky to calculate.
if((sumd1/sumd0)<wNoiseRatio)
{ /*not like noise*/ }
Also, the running absolute average over ~16 to ~30 samples of white noise will tend to vary less, over white noise than acoustic signal:
loop(n+24 to n.length-16)
{ runAbsAve1 += abs(pcm[n]) - abs(pcm[n-24]); }
loop(n+24+16 to n.length)
{ runAbsAve2 += abs(pcm[n]) - abs(pcm[n-24]); }
unusualDif= 5; //a factor. tighter values for longer measures.
if(abs(runAbsAve1-runAbsAve2)>(runAbsAve1+runAbsAve2)/(2*unusualDif))
{ /*not like noise*/ }
This concerns how white noise tends to be non-sporadic over large enough span to average out its entropy. Acoustic content is sporadic (localised power) and recurrent (repetitive power).
The simple test reacts to acoustic content with lower frequencies and could be drowned out by high frequency content. There are simple to apply lowpass filters which could help (and no doubt other adaptions).
Also, the root mean square can be divided by the mean absolute sum providing another ratio which should be particular to white noise, though i cant figure what it is right now. The ratio will also differ for the signals derivatives as well.
I think of these as being simple formulaic signatures of noise. I'm sure there are more..
Sorry to not be more specific, it is fuzzy and imprecise advice, but so is performing simple tests on the output of an fft. For better explaination and more ideas perhaps check out statistical and stochastic(?) measurements of entropy and randomness on wikipedia etc.

Use a Fast Fourier Transform.
This is what you can use a Fast Fourier Transform for. It analyzes the signal and determines the strength of the signal at various frequencies. If there's a spike in the FFT curve at all, it should indicate that the signal is not simply white noise.
Here is a library which supports FFT's. Also, here is a blog with source code in case you want to learn about what the FFT does.

If you don't have FFT tools available, just a wild suggestion:
Try to compress a few milliseconds of audio.
A typical feature of noise is that it compresses much less than clear signal.

As far as I know there is no API or even drivers for the FM Radio in the Android SDK and unless Dell releases one you will have to roll your own. It's actually even worse than that. All(?) new chipsets has FM Radio but not all phones has an FM Radio application.
The old Windows Mobile had the same problem.

For white noise detection you need to do FFT and see that it has more or less continious spectrum. But recording from FM might be a problem.

Just high pass filtering it will give a good idea, and has sometimes been used for squelch on fm radios.
Note that this is comparable to what the derivative suggestion was getting at - taking the derivative is a simple form of high pass filter, and taking the absolute value of that a crude way of measuring power.

Do you have a subscription to the IEEE Xplore library? There are countless papers (one picked at random) on this very topic.
A very simplistic method would be to observe the "flatness" of the power spectral density. One could take this by using a Fast Fourier Transform of the signal in the time domain and find the standard deviation of the spectral density. If it is below some threshold, you have your white noise.

The main question here is: what type of signal do you have access to?
I bet you don't have direct access to the analog EM signal directly. So no use of FFT on this signal possible. You can't also try to build a phased-lock loop, which is the way your standard old radio tuner works ("Scanning" in your case).
Your only option is indeed to pick one frequency and listen too it (and try do detect when it's noise with FFT on sound). You might even only have access to the FFTed signal.
Problem here: If you want to detect a potential frequency using white noise you will pick up signals too easily.
Anyway, here is what I would try to do with this strategy:
Double integrate the autocorrelation of the spectral density over a fraction of a second of audio. And this for each frequency.
Then look for a FM frequency where this number is maxed.
Little explanation here:
Spectral density gives you a signal which most used frequencies are maxed.
If a bit of time later if the same frequencies are maxed then you have some supposedly clear audio. You get this by integrating the autocorrelation the spectral density for one audio frequency for a fraction of a second (using some function that grows larger than linear might also work)
You then just have to integrate this for all audio frequencies
Also be careful to normalize the integrals: a loud white noise signal should not get a higher score than a clear but low audio signal.

Several people have mentioned the FFT, which you'll want to do, but to then detect white noise you need to make sure that the magnitude is relatively constant over the range of audio frequencies. You'll want to look at magnitudes only, you can throw away the phases. You can compute an average and standard deviation for the magnitudes in O(N) time. For white noise, you should find the standard deviation to be a relatively small fraction of the average. If I remember my statistics right, it should be about (1/sqrt(N)) of the average.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.