I have an Android application that captures gesture coordinates on three axes (x, y, z). I need to compare them with coordinates stored in my DB and determine whether they represent the same gesture.
I also need some tolerance, since the accelerometer (the device that captures the gestures) is very sensitive. That alone would be easy, but I also want a "big circle" drawn in the air to count as the same gesture as a "small circle" drawn in the air. The values would differ, but the structure of the graph would be the same, right?
I have heard about translating the graph values into bits and then comparing those. Is that the right approach? Is there a library for this kind of comparison?
So far I have hard-coded it, which covers all my requirements except the last one (big circle vs. small circle).
My code now:
private int checkWhetherGestureMatches(byte[] values, String[] refValues) throws IOException {
    final int valuesSize = 32;      // number of (x, y, z) samples per gesture
    final int ignorePositions = 4;  // leading entries that are not samples
    final int tolerance = 20;       // allowed deviation per axis

    int incorrectPoints = 0;
    for (int i = 0; i < valuesSize; i++) {
        int position = i * 3 + ignorePositions;
        // A point counts as correct only if all three axes are within the tolerance.
        boolean matches = true;
        for (int axis = 0; axis < 3; axis++) {
            double reference = Double.parseDouble(refValues[position + axis]);
            if (!(Math.abs(values[position + axis] - reference) < tolerance)) {
                matches = false;
            }
        }
        if (!matches) {
            incorrectPoints++;
        }
    }
    return incorrectPoints;
}
EDIT:
I found JGraphT, it might work. If you know anything about that already, let me know.
EDIT2:
See these images. They show the same gesture, but one was performed more slowly than the other.
Faster one:
Slower one:
I haven't captured images of the same gesture where one would be smaller than another, might add that later.
If your list of gestures is complex, I would suggest training a neural network which can classify the gestures based on the graph value bits you mentioned. The task is very similar to classification of handwritten numerical digits, for which lots of resources are there on the net.
The other approach would be to mathematically guess the shape of the gesture, but I doubt it will be useful considering the tolerance of the accelerometer and the fact that users won't draw accurate shapes.
(a) Convert your 3D coordinates into a 2D planar figure using matrix transformations.
(b) Normalize the scale of your gesture, again with matrix transformations.
(c) Normalize the number of points, or use interpolation in the next step.
(d) Calculate the difference between your stored (s) gesture and the current (c) gesture as
Sum((Xs[i] - Xc[i])^2 + (Ys[i] - Yc[i])^2), where i = 0 .. number of points - 1
If the difference is below your predefined precision, the gestures are equal.
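Steps (b) and (d) can be sketched for a single axis as follows. This is only a minimal illustration, not a complete recognizer: the class and method names are hypothetical, the normalization (zero mean, unit RMS) is one possible choice rather than the matrix transformation mentioned above, and both gestures are assumed to have already been resampled to the same number of points as in step (c).

```java
/** Hypothetical helper: offset- and scale-invariant comparison of one gesture axis. */
final class GestureCompare {

    /** Returns a copy of the samples shifted to zero mean and scaled to unit RMS. */
    static double[] normalize(double[] samples) {
        double mean = 0.0;
        for (double s : samples) {
            mean += s;
        }
        mean = samples.length == 0 ? 0.0 : mean / samples.length;

        double sumSq = 0.0;
        for (double s : samples) {
            sumSq += (s - mean) * (s - mean);
        }
        double rms = samples.length == 0 ? 0.0 : Math.sqrt(sumSq / samples.length);

        double[] out = new double[samples.length];
        for (int i = 0; i < samples.length; i++) {
            // A flat signal has no shape to compare, so it maps to all zeros.
            out[i] = rms == 0.0 ? 0.0 : (samples[i] - mean) / rms;
        }
        return out;
    }

    /** Sum of squared differences between the two normalized sequences. */
    static double distance(double[] stored, double[] current) {
        if (stored.length != current.length) {
            throw new IllegalArgumentException("Resample both gestures to the same length first");
        }
        double[] s = normalize(stored);
        double[] c = normalize(current);
        double sum = 0.0;
        for (int i = 0; i < s.length; i++) {
            double d = s[i] - c[i];
            sum += d * d;
        }
        return sum;
    }
}
```

With this normalization a "big" and a "small" version of the same curve give a distance close to zero, while a differently shaped curve does not. Normalizing each axis independently can distort the proportions between axes, so a shared scale factor for x, y and z may be preferable in practice.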
I have used a Java implementation of the Dynamic Time Warping algorithm. The library is called fastDTW.
Unfortunately, from what I understood, it is no longer maintained, though I did find a use for it.
https://code.google.com/p/fastdtw/
I can't recall now, but I think I used this one and compiled it myself:
https://github.com/cscotta/fastdtw/tree/master/src/main/java/com/fastdtw/dtw
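To show why Dynamic Time Warping handles the "same gesture, different speed" case, here is a minimal sketch of the classic quadratic-time algorithm for one axis. It is not the FastDTW approximation and does not use the linked library's API; the class name `SimpleDtw` is made up for this example.

```java
/** Hypothetical, minimal O(n*m) Dynamic Time Warping for two 1D sequences. */
final class SimpleDtw {

    /** Returns the accumulated cost of the cheapest alignment of a and b. */
    static double distance(double[] a, double[] b) {
        int n = a.length;
        int m = b.length;
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) {
            java.util.Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        cost[0][0] = 0.0;

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double d = Math.abs(a[i - 1] - b[j - 1]);
                // Extend the cheapest of: match, repeat a sample of b, repeat a sample of a.
                double best = Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = d + best;
            }
        }
        return cost[n][m];
    }
}
```

Because one sample may be aligned with several samples of the other sequence, a slow and a fast recording of the same movement can align with a low cost. For three axes, one option is to sum the per-axis distances or to use the Euclidean distance between (x, y, z) samples as `d`.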
I need to validate whether a pedestrian has crossed an intersection, using GPS readings and findNearestIntersectionOSM calls to get the nearest intersections.
For each response from GeoNames, I check whether the distance between the two points is below a certain threshold. Using the sine function, I also check whether the bearing from the pedestrian's current location to the intersection (GeoPoint.bearingTo) flips its sign:
Sin(previous location reading) * Sin(Current location read) < 0
Unfortunately, this is insufficient, and I sometimes get false positives.
Is there a better approach, or is there anything I'm missing?
Just to be clear, I'm not planning to dive into the field of image processing; I simply want to use some of OSM's functionality (if possible).
private void OnClosestIntersectionPoint(GeoPoint gPtIntersection) {
int iDistance = mGeoLastKnownPosition.distanceTo(gPtIntersection);
double dbCurrentBearing = mGeoLastKnownPosition.bearingTo(gPtIntersection);
if(mDbLastKnownBearing == null) {
mDbLastKnownBearing = new Double(dbCurrentBearing);
return;
}
boolean bFlippedSignByCrossing = Math.sin(mDbLastKnownBearing) * Math.sin(dbCurrentBearing) < 0;
mDbLastKnownBearing = dbCurrentBearing; // update bearing regardless to what's going to happen
if(bFlippedSignByCrossing && iDistance <= 10 && !HasntMarkIntersectionAsCrossed(gPtIntersection))
MarkAsIntersectionCrossed(mGeoLastKnownIntersection);
}
I need a function for my Java program that checks for polygon collisions, but the point-in-polygon algorithms I tried did not fit my needs; the degenerate cases are a problem for me.
This is what I am trying to achieve: I have two polygons and want to place them as close together as possible. I want to place them at their vertices and rotate them along an edge to find the optimal fit. For that I need collision detection that tells me whether they intersect.
My biggest problem is that the polygon edges can lie on the same point. The algorithms I researched decide whether such a point is in polygon a or b (mostly by its y-value).
What I use
Polygon with double coordinates for x and y
standard Java
no external libraries
My required rules:
polygons may share edges and vertices (they can lie on the same boundary, but must not overlap completely)
the edges are not allowed to intersect
one polygon must not be completely surrounded by another polygon (a hole)
(an optional very small epsilon in the algorithm would be good, because rotating with doubles is not very exact)
I also tried the built-in classes such as Path2D.Double with contains, without success.
The last algorithm I tried (of at least 8) was this:
wiki.cizmar.org/doku.php?id=physics:point-in-polygon_problem_with_simulation_of_simplicity
This is C Code of the linked algorithm (last one I tried)
int i, j, c = 0;
for (i = 0, j = number_of_vertices-1; i < number_of_vertices; j = i++) {
if ( ((vertices[i].y>p.y) != (vertices[j].y>p.y)) &&
(p.x < (vertices[j].x-vertices[i].x) * (p.y-vertices[i].y) / (vertices[j].y-vertices[i].y) + vertices[i].x) )
c = !c;
}
return c;
My adapted JAVA code (Punkt=Point, Form.getCoords = List of Coordinates with x,y)
private boolean testPointInsidePolygon3c(Punkt p, Form f){
int number_of_vertices = f.getCoords().size();
int i, j = 0;
boolean odd = false;
for (i = 0, j = number_of_vertices-1; i < number_of_vertices; j = i++) {
if ( ((f.getCoords().get(i).getY() >p.getY()) != (f.getCoords().get(j).getY() >p.getY())) &&
( p.getX() < (f.getCoords().get(j).getX() -f.getCoords().get(i).getX())
* (p.getY() -f.getCoords().get(i).getY())
/ (f.getCoords().get(j).getY() -f.getCoords().get(i).getY())
+ f.getCoords().get(i).getX())
){
odd = !odd;
}
}
return odd;
}
To show the problem, here are pictures with two polygons. The blue vertices are the troublesome ones.
Problem Example #1 example from another source
I hope you have some ideas, links, algorithms, or anything else for me. I have been stuck on this problem for too long ;-)
What a pity: I could not write a completely correct algorithm that solves my problem.
That is why I now use the JTS library!
With overlaps and covers/within, all of my test cases pass.
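For readers who cannot add a dependency, the rule "edges may touch or coincide but must not cross" can be sketched with an orientation test and a small epsilon. This is only an illustration under stated assumptions: the class name, the method names and the absolute tolerance `EPS` are invented for this example, and the sketch covers edge crossings only. Containment of one polygon in another and partial overlap along shared edges would need additional checks, which is what a library such as JTS takes care of.

```java
/** Hypothetical sketch: do two segments cross in their interiors (touching does not count)? */
final class SegmentCross {

    /** Assumed absolute tolerance; a value scaled to the coordinate range may work better. */
    static final double EPS = 1e-9;

    /** +1 if c lies left of the directed line a->b, -1 if right, 0 if within EPS of it. */
    static int orient(double ax, double ay, double bx, double by, double cx, double cy) {
        double cross = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
        if (cross > EPS) {
            return 1;
        }
        if (cross < -EPS) {
            return -1;
        }
        return 0;
    }

    /** True only if segment (x1,y1)-(x2,y2) and segment (x3,y3)-(x4,y4) cross at an interior point. */
    static boolean properlyCross(double x1, double y1, double x2, double y2,
                                 double x3, double y3, double x4, double y4) {
        int o1 = orient(x1, y1, x2, y2, x3, y3);
        int o2 = orient(x1, y1, x2, y2, x4, y4);
        int o3 = orient(x3, y3, x4, y4, x1, y1);
        int o4 = orient(x3, y3, x4, y4, x2, y2);
        // Each segment's endpoints must lie strictly on opposite sides of the other segment.
        return o1 * o2 < 0 && o3 * o4 < 0;
    }
}
```

Running this test over every pair of edges from the two polygons rejects crossings while still allowing shared vertices and collinear shared edges, because any endpoint that lies on the other segment's line yields an orientation of 0.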
I am writing my own audio format as part of a game console project. Part of the project requires me to write an emulator so I know exactly how to implement its functions in hardware. I am currently writing the DSP portion, but I am having trouble writing a decoding algorithm. Before I go further, I'll explain my format.
DST (Dingo Sound Track) Audio format
The audio format records only two pieces of data per sample: the amplitude and the number of frames since the last sample. I'll explain. When converting an audio file (WAV for example), the encoder compares the current sample with the previous one. If it detects that the current sample switches amplitude direction in relation to the previous sample, it records the previous sample and the number of frames since the last record. It keeps going until the end of the file. Here is a diagram to explain further:
What I need to do
I need my "DSP" to figure out the data between each sample, as accurately as possible using only the given information. I don't think it's my encoding algorithm, because when I play the file in Audacity, I can sort of make out the original song. But when I try to play it with my decoding algorithm, I get scattered clicks. I am able to play WAV files directly with a few mods to the algorithm with almost no quality drop, so I know it's definitely the algorithm and not the rest of the DSP.
The Code
Now that the basic info is out of the way, here is my code (only the important parts).
Encoding algorithm:
FileInputStream s = null;
BufferedWriter bw;
try {
int bytes;
int previous = 0;
int unsigned;
int frames = 0;
int size;
int cursor = 0;
boolean dir = true;
int bytes2;
int previous2 = 0;
int unsigned2;
int frames2 = 0;
boolean dir2 = true;
s = new FileInputStream(selectedFile);
size = (int)s.getChannel().size();
File f = new File(Directory.getPath() + "\\" + (selectedFile.getName().replace(".wav", ".dts")));
System.out.println(f.getPath());
if(!f.exists()){
f.createNewFile();
}
bw = new BufferedWriter(new FileWriter(f));
try (BufferedInputStream b = new BufferedInputStream(s)) {
byte[] data = new byte[128];
b.skip(44);
System.out.println("Loading...");
while ((bytes = b.read(data)) > 0) {
// do something
for(int i=1; i<bytes; i += 4) {
unsigned = data[i] & 0xFF;
if (dir) {
if (unsigned < previous) {
bw.write(previous);
bw.write(frames);
dir = !dir;
frames = 0;
}else{
frames ++;
}
} else {
if (unsigned > previous) {
bw.write(previous);
bw.write(frames);
dir = !dir;
frames = 0;
}else{
frames ++;
}
}
previous = unsigned;
cursor ++;
unsigned2 = data[i + 2] & 0xFF;
if (dir2) {
if (unsigned2 < previous2) {
bw.write(previous2);
bw.write(frames2);
dir2 = !dir2;
frames2 = 0;
}else{
frames2 ++;
}
} else {
if (unsigned2 > previous2) {
bw.write(previous2);
bw.write(frames2);
dir2 = !dir2;
frames2 = 0;
}else{
frames2 ++;
}
}
previous2 = unsigned2;
cursor ++;
progress.setValue((int)(((float) cursor / size) * 100)); // cast before dividing, otherwise integer division yields 0
}
}
b.read(data);
}
bw.flush();
bw.close();
System.out.println("Done");
convert.setEnabled(true);
status.setText("finished");
} catch (Exception ex) {
status.setText("An error has occured");
ex.printStackTrace();
convert.setEnabled(true);
}
finally {
try {
s.close();
} catch (Exception ex) {
status.setText("An error has occured");
ex.printStackTrace();
convert.setEnabled(true);
}
}
The progress and status objects can be ignored for they are part of the GUI of my converter tool. This algorithm converts WAV files to my format (DST).
Decoding algorithm:
int start = bufferSize * (bufferNumber - 1);
short current;
short frames;
short count = 1;
short count2 = 1;
float jump;
for (int i = 0; i < bufferSize; i ++) {
current = RAM.read(start + i);
i++;
frames = RAM.read(start + i);
if (frames == 0) {
buffer[count - 1] = current;
count ++;
} else {
jump = current / frames;
for (int i2 = 1; i2 < frames; i2++) {
buffer[(2 * i2) - 1] = (short) (jump * i2);
count ++;
}
}
i++;
current = RAM.read(start + i);
i++;
frames = RAM.read(start + i);
if (frames == 0) {
buffer[count2] = current;
count2 ++;
} else {
jump = current / frames;
for (int i2 = 1; i2 < frames; i2++) {
buffer[2 * i2] = (short) (jump * i2);
count2 ++;
}
}
}
bufferNumber ++;
if(bufferNumber > maxBuffer){
bufferNumber = 1;
}
The RAM object is just a byte array. bufferNumber and maxBuffer refer to the amount of processing buffers the DSP core uses. buffer is the object that the resulting audio is written to. This algorithm set is designed to convert stereo tracks, which works the same way in my format but each sample will contain two sets of data, one for each track.
The Question
How do I figure out the missing audio between each sample, as accurately as possible, and how accurate will the approach be? I would love to simply use the WAV format, but my console is limited on memory (RAM). This format halves the RAM space required to process audio. I am also planning on implementing this algorithm in an ARM microcontroller, which will be the console's real DSP. The algorithm should also be fast, but accuracy is more important. If I need to clarify or explain anything further, let me know since this is my first BIG question and I am sure I forgot something. Code samples would be nice, but aren't needed that much.
EDIT:
I managed to get the DSP to output a song, but it's sped up and filled with static. The sped up part is due to a glitch in it not splitting the track into stereo (I think). And the static is due to the initial increment being too steep. Here is a picture of what I'm getting:
Here is the new code used in the DSP:
if (frames == 0) {
buffer[i - 1] = current;
//System.out.println(current);
} else {
for (int i2 = 1; i2 < frames + 1; i2++) {
jump = (float)(previous + ((float)(current - previous) / (frames - i2 + 1)));
//System.out.println((short)jump);
buffer[(2 * i2) - 1] = (short)(jump);
}
}
previous = current;
I need a way to smooth out those initial increments, and I'd prefer not to use complex arithmetic because performance will be limited when I port this to hardware (preferably something that can run on a 100 MHz ARM controller while keeping up with a 44.1 kHz sample rate). Edit: the resulting wave should actually be backwards. Sorry.
Second Edit:
I got the DSP to output in stereo, but unfortunately that didn't fix anything else as I had hoped. I also fixed some bugs in the encoder, so it now takes 8-bit unsigned audio. This has become more of a math issue, so I think I'll post a similar question on Mathematics Stack Exchange. Well, that was a waste of time. It got put on hold almost instantly.
You basically have a record of the signal's local extrema and want to reconstruct the signal from it. The most straightforward way would be to use some monotonic interpolation scheme. You can try whether this fits your needs. But I guess the result would be very inaccurate, because the characteristics of the signal are ignored.
I am not an audio engineer, so my assumptions could be wrong. But maybe, you get somewhere with these thoughts.
The signal is basically a mixture of sines. Calculating a sine function for any segment between two key frames is quite easy. The period is given by twice their distance. The amplitude is given by half the amplitude difference. This will give you a sine that hits the two key samples exactly. Furthermore, it will give you a C1-continuous signal because the derivatives at the connection points are zero. For a nice signal, you probably need even more smoothness. So you could start to interpolate the two sines around a key frame with an appropriate window function. I would start with a simple triangle window but others may give better results. This procedure will preserve the extrema.
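The sine segment described above can be sketched as a half-cosine between two recorded extrema. This is a minimal illustration with invented names (`ExtremaInterpolator`, `halfCosine`); it assumes that `frames` is the number of output samples from just after the first extremum up to and including the second one, which may differ by one from the convention used in the question's encoder. The windowed blending of neighbouring segments is not included.

```java
/** Hypothetical sketch: half-cosine interpolation between two local extrema. */
final class ExtremaInterpolator {

    /**
     * Returns `frames` samples that move from the extremum `from` (not included)
     * to the extremum `to` (included as the last sample).
     */
    static double[] halfCosine(double from, double to, int frames) {
        if (frames < 1) {
            throw new IllegalArgumentException("frames must be >= 1");
        }
        double mid = (from + to) / 2.0;   // centre line of the sine segment
        double amp = (from - to) / 2.0;   // half the amplitude difference
        double[] out = new double[frames];
        for (int k = 1; k <= frames; k++) {
            // Half a period spans `frames` steps, so a full period is twice the distance.
            out[k - 1] = mid + amp * Math.cos(Math.PI * k / frames);
        }
        return out;
    }
}
```

The slope is zero at both ends, so consecutive segments join without a kink at the recorded extrema. On a small microcontroller the cosine could be replaced by a precomputed lookup table, which is a common substitution rather than something stated in the answer.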
It is probably easier to tackle this problem visually (with a plot of the signal), so you can see the results.
If it's all about size, then maybe you want to look into established audio compression methods. They usually give much better compression ratio than 1:2. Also, I don't understand why this method saves RAM because you'll have to calculate all samples when decoding. Of course, this assumes that not the complete data are loaded into RAM but streamed in pieces.
I'm looking for Java code (or a library) that calculates the earth mover's distance (EMD) between two histograms. This could be directly or indirectly (e.g. using the Hungarian algorithm). I found several implementations of this in C/C++ (e.g. "Fast and Robust Earth Mover's Distances"), but I'm wondering whether a Java version is readily available.
I will be using the EMD calculation to evaluate the approach given by this paper in the context of a science project I'm working on.
Update
Using a variety of resources I estimate that the code below should do the trick. determineMinCostAssignment is the calculation of the optimal assignment as determined by the Hungarian algorithm. For this I will be using the code from http://konstantinosnedas.com/dev/soft/munkres.htm
My main concern is the calculated flow: I am not sure if this is correct. Is there someone who can verify that this is correct or not?
/**
* Determines the Earth Mover's Distance between two histograms, assuming an equal distance between two buckets of a histogram. The distance between
* two buckets is equal to the difference in the indexes of the buckets.
*
* @param threshold
* The maximum distance to use between two buckets.
*/
public static double determineEarthMoversDistance(double[] histogram1, double[] histogram2, int threshold) {
if (histogram1.length != histogram2.length)
throw new InvalidParameterException("Each histogram must have the same number of elements");
double[][] groundDistances = new double[histogram1.length][histogram2.length];
for (int i = 0; i < histogram1.length; ++i) {
for (int j = 0; j < histogram2.length; ++j) {
int abs_diff = Math.abs(i - j);
groundDistances[i][j] = Math.min(abs_diff, threshold);
}
}
int[][] assignment = determineMinCostAssignment(groundDistances);
double costSum = 0, flowSum = 0;
for (int i = 0; i < assignment.length; i++) {
double cost = groundDistances[assignment[i][0]][assignment[i][1]];
double flow = histogram2[assignment[i][1]];
costSum += cost * flow;
flowSum += flow;
}
return costSum / flowSum;
}
Here's a pure Java port of the FastEMD algorithm, that I just released:
https://github.com/telmomenezes/JFastEMD
The website "Fast and Robust Earth Mover's Distances" has a Java wrapper for the C/C++ code with compiled binary for Linux and Windows.
This is what I use for Java/Scala:
import org.apache.commons.math3.ml.distance.EarthMoversDistance
new EarthMoversDistance().compute(observed, expected)
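For the special case of one-dimensional histograms with equal total mass and ground distance |i - j|, the EMD reduces to the sum of the absolute running differences, so no assignment solver is needed. The sketch below illustrates that special case only; the class name `Emd1D` is made up, and it does not apply the distance threshold used in the question's code.

```java
/** Hypothetical sketch: EMD of two 1D histograms with equal total mass and unit bin spacing. */
final class Emd1D {

    static double distance(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("Histograms must have the same number of bins");
        }
        double carried = 0.0; // mass that still has to be moved past the current bin
        double total = 0.0;   // accumulated work (mass times distance)
        for (int i = 0; i < p.length; i++) {
            carried += p[i] - q[i];
            total += Math.abs(carried);
        }
        return total;
    }
}
```

For example, moving one unit of mass from bin 0 to bin 2 costs 2, because the surplus is carried across two bin boundaries.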
https://github.com/wihoho/VideoRecognition
This project adapts the author's C implementation for use from a Python module through a file interface.
The modified C code is in the folder EarthMoverDistance SourceCode.
I am pretty sure that you can do the same thing with Java: just add a file interface to connect the C implementation of EMD to your Java code.
Since I don't want to write it myself, I am looking for a good FFT implementation for Java. First I used this one, FFT Princeton, but it uses objects, and my profiler told me that it is not really fast because of that. So I googled again and found this one, FFT Columbia, which is faster. Does anyone know of another FFT implementation? I'd like to have the "best" one, because my app has to process a huge amount of sound data, and users don't like waiting... ;-)
Regards.
FFTW is the 'fastest fourier transform in the west', and has some Java wrappers:
http://www.fftw.org/download.html
Hope that helps!
Late to the party, but here is a pure Java solution for cases where JNI is not an option: JTransforms
I wrote a function for the FFT in Java: http://www.wikijava.org/wiki/The_Fast_Fourier_Transform_in_Java_%28part_1%29
I've released it in the Public Domain so you can use those functions everywhere (for personal or business projects too). Just cite me in the credits and send me just a link to your work, and you're ok.
It is completely reliable. I've checked its output against Mathematica's FFT, and the results always agreed to the 15th decimal digit. I think it's a very good FFT implementation for Java. I wrote it on J2SE 1.6 and tested it on J2SE 1.5 and 1.6.
If you count the number of instructions (which is much simpler than estimating an exact computational complexity function), you can clearly see that this version is great even though it's not optimized at all. I'm planning to publish the optimized version if there are enough requests.
Let me know if it was useful, and tell me any comments you like.
I share the same code right here:
/**
* @author Orlando Selenu
* Originally written in the Summer of 2008
* Based on the algorithms originally published by E. Oran Brigham "The Fast Fourier Transform" 1973, in ALGOL60 and FORTRAN
*/
public class FFTbase {
/**
* The Fast Fourier Transform (generic version, with NO optimizations).
*
* @param inputReal
* an array of length n, the real part
* @param inputImag
* an array of length n, the imaginary part
* @param DIRECT
* TRUE = direct transform, FALSE = inverse transform
* @return a new array of length 2n
*/
public static double[] fft(final double[] inputReal, double[] inputImag,
boolean DIRECT) {
// - n is the dimension of the problem
// - nu is its logarithm in base 2
int n = inputReal.length;
// If n is a power of 2, then ld is an integer (_without_ decimals)
double ld = Math.log(n) / Math.log(2.0);
// Here I check if n is a power of 2. If exist decimals in ld, I quit
// from the function returning null.
if (((int) ld) - ld != 0) {
System.out.println("The number of elements is not a power of 2.");
return null;
}
// Declaration and initialization of the variables
// ld should be an integer, actually, so I don't lose any information in
// the cast
int nu = (int) ld;
int n2 = n / 2;
int nu1 = nu - 1;
double[] xReal = new double[n];
double[] xImag = new double[n];
double tReal, tImag, p, arg, c, s;
// Here I check if I'm going to do the direct transform or the inverse
// transform.
double constant;
if (DIRECT)
constant = -2 * Math.PI;
else
constant = 2 * Math.PI;
// I don't want to overwrite the input arrays, so here I copy them. This
// choice adds \Theta(2n) to the complexity.
for (int i = 0; i < n; i++) {
xReal[i] = inputReal[i];
xImag[i] = inputImag[i];
}
// First phase - calculation
int k = 0;
for (int l = 1; l <= nu; l++) {
while (k < n) {
for (int i = 1; i <= n2; i++) {
p = bitreverseReference(k >> nu1, nu);
// direct FFT or inverse FFT
arg = constant * p / n;
c = Math.cos(arg);
s = Math.sin(arg);
tReal = xReal[k + n2] * c + xImag[k + n2] * s;
tImag = xImag[k + n2] * c - xReal[k + n2] * s;
xReal[k + n2] = xReal[k] - tReal;
xImag[k + n2] = xImag[k] - tImag;
xReal[k] += tReal;
xImag[k] += tImag;
k++;
}
k += n2;
}
k = 0;
nu1--;
n2 /= 2;
}
// Second phase - recombination
k = 0;
int r;
while (k < n) {
r = bitreverseReference(k, nu);
if (r > k) {
tReal = xReal[k];
tImag = xImag[k];
xReal[k] = xReal[r];
xImag[k] = xImag[r];
xReal[r] = tReal;
xImag[r] = tImag;
}
k++;
}
// Here I have to mix xReal and xImag to have an array (yes, it should
// be possible to do this stuff in the earlier parts of the code, but
// it's here for readability).
double[] newArray = new double[xReal.length * 2];
double radice = 1 / Math.sqrt(n);
for (int i = 0; i < newArray.length; i += 2) {
int i2 = i / 2;
// I used Stephen Wolfram's Mathematica as a reference so I'm going
// to normalize the output while I'm copying the elements.
newArray[i] = xReal[i2] * radice;
newArray[i + 1] = xImag[i2] * radice;
}
return newArray;
}
/**
* The reference bit reverse function.
*/
private static int bitreverseReference(int j, int nu) {
int j2;
int j1 = j;
int k = 0;
for (int i = 1; i <= nu; i++) {
j2 = j1 / 2;
k = 2 * k + j1 - 2 * j2;
j1 = j2;
}
return k;
}
}
EDIT: 5th of May, 2022. Well... after more than 10 years I'm publishing the code on Github to avoid losing it: https://github.com/hedoluna/fft
Feel free to contribute and send me your opinions :) Thanks!
I guess it depends on what you are processing. If you are calculating the FFT over a long duration, it may take a while, depending on how many frequency points you want. However, audio is in most cases considered non-stationary (that is, the signal's mean and variance change too much over time), so taking one large FFT (periodogram PSD estimate) is not an accurate representation. Alternatively, you could use the short-time Fourier transform, whereby you break the signal up into smaller frames and calculate the FFT of each. The frame size depends on how quickly the statistics change: for speech it is usually 20-40 ms; for music I assume it is slightly higher.
This method is good if you are sampling from the microphone, because it allows you to buffer one frame at a time, calculate the FFT, and give the user what feels like "real-time" interaction, since we can't really perceive a time difference as small as 20 ms.
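The framing step of the short-time approach can be sketched as below. The names `Framer`, `frameSize` and `hop` are invented for this example, no window function is applied, and trailing samples that do not fill a whole frame are simply dropped; each returned frame would then be passed to whichever FFT implementation you choose.

```java
/** Hypothetical sketch: split a signal into (possibly overlapping) fixed-size frames. */
final class Framer {

    /**
     * @param frameSize samples per frame, e.g. 320 for 20 ms at 16 kHz
     * @param hop       samples between frame starts; hop < frameSize gives overlapping frames
     */
    static double[][] frames(double[] signal, int frameSize, int hop) {
        if (frameSize <= 0 || hop <= 0) {
            throw new IllegalArgumentException("frameSize and hop must be positive");
        }
        if (signal.length < frameSize) {
            return new double[0][];
        }
        int count = 1 + (signal.length - frameSize) / hop;
        double[][] out = new double[count][];
        for (int f = 0; f < count; f++) {
            int start = f * hop;
            out[f] = java.util.Arrays.copyOfRange(signal, start, start + frameSize);
        }
        return out;
    }
}
```

For a radix-2 FFT the frame size is usually chosen as a power of two (or the frame is zero-padded to one), and a hop of half the frame size is a common choice for overlapping frames.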
I developed a small benchmark to test the difference between the FFTW and KissFFT C libraries on a speech signal. Yes, FFTW is highly optimised, but when you are taking only short frames, updating the data for the user, and using only a small FFT size, the two perform very similarly. Here is an example of how to use the KissFFT libraries in Android using LibGdx by badlogic games. I implemented this library using overlapping frames in an Android app I developed a few months ago called Speech Enhancement for Android.
I'm looking into using SSTJ for FFTs in Java. It can redirect via JNI to FFTW if the library is available or will use a pure Java implementation if not.