Actual performance benefits of distance squared vs distance - java

When calculating the distance between two 3D points in Java, I can compute the distance, or the distance squared between them, avoiding a call to Math.sqrt.
Anecdotally, I've read that sqrt is only about a quarter of the speed of multiplication, which would make the inconvenience of using the distance squared not worthwhile.
In Java, what is the absolute performance difference between multiplication and calculating a square root?

I initially wanted to add this as a comment, but it started to get too big, so here goes:
Try it yourself. Make a loop with 10,000 iterations where you simply calculate a*a + b*b, and another separate loop where you calculate Math.sqrt(a*a + b*b). Time them and you'll know. Calculating a square root is an iterative process in itself, where the computed square root converges toward the real square root of the given number until it is sufficiently close (i.e. as soon as the difference between iterations is less than some very small value). There are multiple algorithms out there besides the one the Math library uses, and their speed depends on the input and on how the algorithm is designed. In my opinion, stick with Math.sqrt(...): you can't go wrong, and it's been tested by a LOT of people.
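For instance, a rough micro-benchmark along those lines might look like the sketch below. Treat the numbers with caution: System.nanoTime timings are noisy, and a harness like JMH gives far more reliable results. Accumulating into sink stops the JIT from eliminating the loops entirely.

    double a = 3.0, b = 4.0, sink = 0;
    long t0 = System.nanoTime();
    for (int i = 0; i < 10_000; i++) {
        sink += a * a + b * b;            // squared distance only
    }
    long t1 = System.nanoTime();
    for (int i = 0; i < 10_000; i++) {
        sink += Math.sqrt(a * a + b * b); // with the square root
    }
    long t2 = System.nanoTime();
    System.out.println("squared: " + (t1 - t0) + " ns, sqrt: " + (t2 - t1) + " ns (" + sink + ")");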
Although this can be done very fast for one square root, there's a definite observable time difference.
On a side note: I cannot think of a reason to calculate the square root more than once, usually at the end. If you want to know the distance between points, just use the squared distance as your default and do comparisons/summations/subtractions or whatever you want based on that.
PS: Provide more code if you want a more "practical" answer

Related

How do I apply FFT onto an audio recording to get a frequency?

This is supposed to be for an Android app, so the language in question is obviously Java.
I'm trying to record some audio and get the dominant frequency. This is for a very specific purpose, and the frequencies I need to be detected are pure sounds made by another device. I have the recording part done, so the only thing that I need to do is calculate the frequency from the buffer it generates.
I know I'm supposed to use something called FFT, so I put these into my project: http://introcs.cs.princeton.edu/java/97data/FFT.java, and http://introcs.cs.princeton.edu/java/97data/Complex.java.html
I know there are many questions about this, but none of them give an answer that I can understand. Others have broken links.
Anyone know how to do this, and explain in a relatively simple manner?
Generally a DFT (FFT included) implementation will take N time-domain samples (your recording) and produce N/2 complex values in the frequency domain. The angle of the complex value represents the phase and the absolute value of it represents the amplitude. Usually the values output will be ordered from lowest frequency to highest frequency.
Some implementations may output N complex values, but the extra values are redundant unless your input contains complex values. It should not in your case. This is why many implementations input real values and output N/2 complex values, as this is the most common use of FFT.
So, you will want to calculate the absolute value of the output, since the amplitude is what you are interested in. The absolute value of a complex number is the square root of the sum of the square of its real component and the square of its imaginary component.
The exact frequencies of each value will depend on the number of input samples and the interval between the samples. The frequency of the value at position i (assuming i goes from 0 to N/2 - 1) will be i * (sampling frequency) / N.
This assumes your N is even; rather than trying to explain the case of N being odd, I'll recommend you keep N even for simplicity. For an FFT, N will always be a power of two, so N will always be even anyway.
If you're looking for a tone over a minimum time T then I'd also recommend processing the input in blocks of T/2 size.
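Putting the pieces above together, here is a minimal sketch using the Princeton classes linked in the question, which expose a static Complex[] FFT.fft(Complex[]) and a Complex.abs() method. Note that the buffer length must be a power of two for that implementation.

    // returns the dominant frequency in Hz; `samples` is the recorded buffer
    static double dominantFrequency(short[] samples, double sampleRate) {
        int n = samples.length;                    // must be a power of two
        Complex[] input = new Complex[n];
        for (int i = 0; i < n; i++) {
            input[i] = new Complex(samples[i], 0); // real input, zero imaginary part
        }
        Complex[] spectrum = FFT.fft(input);
        int best = 1;                              // skip bin 0 (the DC offset)
        for (int i = 1; i < n / 2; i++) {          // only the first N/2 bins matter
            if (spectrum[i].abs() > spectrum[best].abs()) best = i;
        }
        return best * sampleRate / n;              // frequency of bin i is i * fs / N
    }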
Fourier transforms are a mathematical technique that lets you go back and forth between time and frequency domains for time-dependent signals.
FFT is a computer algorithm for calculating discrete transforms quickly and efficiently.
You'll take a sample of your time signal and apply FFT to it to get the amplitude versus frequency for the sample.
It's not an easy topic if you don't have the mathematical background. It assumes a good knowledge of trigonometry (sines and cosines), functions, and calculus. If you don't have that, it'll be difficult to read and understand any reference you can find.
If you don't have that background, do your best to treat a library FFT function as a black box and use what it gives back.

Algorithm for clustering Tweets based on geo radius

I want to cluster tweets based on a specified geo-radius, e.g. 10 meters. If I specify a 10-meter radius, then I want all tweets that are within 10 meters of each other to be in one cluster.
A simple algorithm could be to calculate the distance between each tweet and every other tweet, but that would be very computationally expensive. Are there better algorithms for this?
You can organize your tweets in a quadtree. This makes it quite easy to find nearby tweets without looking at all tweets and their locations.
The quadtree does not directly give you the distance (because it is based on a Manhattan distance), but it gives you nearby tweets, for which you can calculate the precise distance afterwards.
If your problem is only in computation of distances:
remember: you should never compute distances if you need them only for comparison. Use their squares instead.
Do not compare
sqrt((x1-x2)^2 + (y1-y2)^2) against 10;
compare instead
(x1-x2)^2 + (y1-y2)^2 against 100.
It takes far less time.
The other improvement can be had by comparing coordinates before comparing squared distances: if abs(x1-x2) > 10 (your radius), you don't need that pair anymore. (This is the Manhattan distance MrSmith is speaking about.)
I don't know how you work with your points, but if their set is stable, you could make two arrays of them and sort each one by one of the coordinates. After that you only need to check the points that are close to the source point in both arrays.
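In Java, the two checks described above might be combined like this (a sketch only; the names are illustrative):

    static boolean withinRadius(double x1, double y1, double x2, double y2, double radius) {
        double dx = Math.abs(x1 - x2), dy = Math.abs(y1 - y2);
        if (dx > radius || dy > radius) return false; // cheap coordinate rejection first
        return dx * dx + dy * dy <= radius * radius;  // squared comparison, no sqrt
    }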

Bezier curve approximation for large amount of points

I have about a hundred points that I want to approximate with a Bezier curve, but if there are more than 25 points (or thereabouts), the factorials in the binomial-coefficient computation cause numeric overflow.
Is there a way of approximating that many points in a Bezier-like way (a smooth curve that does not pass through all the points, except the first and last)?
Or do I need to choose another approximation algorithm with the same effect?
I'm using the default Swing drawing tools.
P.S. English is not my native language, so I've probably used the wrong math terms somewhere.
Do you want one Bezier curve that fits best through all 100 points? If that is the case, Jim Herold has a very detailed explanation of how to do it. A further optimisation could be to reduce the number of points using the Douglas-Peucker algorithm.
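Regarding the overflow itself: if you evaluate the curve with De Casteljau's algorithm instead of the Bernstein polynomial form, only repeated linear interpolation is involved, so no factorials or binomial coefficients appear and a hundred control points are no problem. A minimal sketch, using java.awt.geom.Point2D to fit Swing:

    import java.awt.geom.Point2D;

    static Point2D.Double deCasteljau(Point2D.Double[] control, double t) {
        int n = control.length;
        double[] x = new double[n], y = new double[n];
        for (int i = 0; i < n; i++) { x[i] = control[i].x; y[i] = control[i].y; }
        // repeatedly interpolate between neighbouring points until one point remains
        for (int level = n - 1; level > 0; level--) {
            for (int i = 0; i < level; i++) {
                x[i] = (1 - t) * x[i] + t * x[i + 1];
                y[i] = (1 - t) * y[i] + t * y[i + 1];
            }
        }
        return new Point2D.Double(x[0], y[0]); // the curve point at parameter t
    }

Sampling t from 0 to 1 in small steps gives points you can connect with the Swing drawing tools you already use.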

Fast multi-body gravity algorithm?

I am writing a program to simulate an n-body gravity system, whose precision is arbitrarily good depending on how small a step of "time" I take between steps. Right now, it runs very quickly for up to 500 bodies, but after that it gets very slow, since it has to run through an algorithm determining the force applied between each pair of bodies on every iteration. This is n(n-1)/2 pairs, i.e. O(n^2), so it's not surprising that it gets very bad very quickly. I guess the most costly operation is that I determine the distance between each pair by taking a square root. So, in pseudocode, this is how my algorithm currently runs:
for (i = 1 to number of bodies - 1) {
    for (j = i + 1 to number of bodies) {
        (determine the force between objects i and j,
         whose most costly operation is a square root)
    }
}
So, is there any way I can optimize this? Any fancy algorithms to reuse the distances from past iterations, with fast updates? Are there any lossy ways to reduce this problem, perhaps by ignoring the relationships between objects whose x or y coordinates (it's in 2 dimensions) differ by more than a certain amount, as determined by the product of their masses? Sorry if it sounds like I'm rambling, but is there anything I could do to make this faster? I would prefer to keep it arbitrarily precise, but if there are solutions that can reduce the complexity at the cost of a bit of precision, I'd be interested to hear them.
Thanks.
Take a look at this question. You can divide your objects into a grid, and use the fact that many faraway objects can be treated as a single object for a good approximation. The mass of a cell is equal to the sum of the masses of the objects it contains. The centre of mass of a cell can be treated as the centre of the cell itself, or more accurately the barycenter of the objects it contains. In the average case, I think this gives you O(n log n) performance, rather than O(n^2), because you still need to calculate the force of gravity on each of n objects, but each object only interacts individually with those nearby.
Assuming you're calculating the distance with r^2 = x^2 + y^2, and then calculating the force with F = G*m1*m2 / r^2, you don't need to perform a square root at all. If you do need the actual distance, you can use a fast inverse square root. You could also use fixed-point arithmetic.
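A sketch of both ideas follows. The force magnitude really does need only the squared distance; the fast inverse square root is the classic bit-twiddling approximation, which trades some accuracy for speed and is useful when a direction vector must be normalised.

    // force magnitude straight from the squared distance, no square root at all
    static double forceMagnitude(double g, double m1, double m2, double r2) {
        return g * m1 * m2 / r2;
    }

    // classic fast inverse square root; roughly 0.2% error after one
    // Newton-Raphson step
    static float fastInvSqrt(float x) {
        float half = 0.5f * x;
        int bits = Float.floatToIntBits(x);
        bits = 0x5f3759df - (bits >> 1);      // bit-level initial guess
        float y = Float.intBitsToFloat(bits);
        return y * (1.5f - half * y * y);     // one refinement step
    }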
One good lossy approach would be to run a clustering algorithm to cluster the bodies together.
There are some clustering algorithms that are fairly fast, and the trick will be to not run the clustering algorithm every tick. Instead run it every C ticks (C>1).
Then, within each cluster, calculate the forces between all bodies, and then calculate the forces between the clusters themselves.
This will be lossy but I think it is a good approach.
You'll have to fiddle with:
which clustering algorithm to use: Some are faster, some are more accurate. Some are deterministic, some are not.
how often to run the clustering algorithm: running it less will be faster, running it more will be more accurate.
how small/large to make the clusters: most clustering algorithms allow you some input on the size of the clusters. The larger you allow the clusters to be, the faster but less accurate the output will be.
So it's going to be a game of speed vs accuracy, but at least this way you will be able to sacrifice a bit of accuracy for some speed gains - with your current approach there's nothing you can really tweak at all.
You may want to try a less precise version of square root; you probably don't need full double precision. Especially if the order of magnitude of your coordinate system is normally the same, you can use a truncated Taylor series to estimate the square root quite quickly without giving up too much accuracy.
There is a very good approximation to the n-body problem that is much faster (O(n log n) vs O(n^2) for the naive algorithm) called Barnes-Hut. Space is subdivided into a hierarchical grid, and when computing the force contribution of distant masses, several masses can be treated as one. There is an accuracy parameter that can be tweaked depending on how much accuracy you're willing to sacrifice for computation speed.
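A minimal 2D Barnes-Hut sketch follows, assuming all bodies fit inside a known square of half-width `half` around (cx, cy). The names are illustrative, there is no softening, and coincident bodies are merely skipped; real code has to handle both.

    class Body {
        double x, y, mass;
        Body(double x, double y, double mass) { this.x = x; this.y = y; this.mass = mass; }
    }

    class QuadNode {
        final double cx, cy, half;   // centre and half-width of this square cell
        double mass, comX, comY;     // aggregate mass and centre of mass
        Body occupant;               // the single body, while this is a leaf
        QuadNode[] children;         // null until the cell is subdivided

        QuadNode(double cx, double cy, double half) {
            this.cx = cx; this.cy = cy; this.half = half;
        }

        void insert(Body b) {
            // keep the aggregate mass and centre of mass up to date
            double m = mass + b.mass;
            comX = (comX * mass + b.x * b.mass) / m;
            comY = (comY * mass + b.y * b.mass) / m;
            mass = m;
            if (children == null) {
                if (occupant == null) { occupant = b; return; }
                subdivide();                        // push the existing body down
                Body old = occupant; occupant = null;
                childFor(old.x, old.y).insert(old);
            }
            childFor(b.x, b.y).insert(b);
        }

        void subdivide() {
            double h = half / 2;
            children = new QuadNode[] {
                new QuadNode(cx - h, cy - h, h), new QuadNode(cx + h, cy - h, h),
                new QuadNode(cx - h, cy + h, h), new QuadNode(cx + h, cy + h, h) };
        }

        QuadNode childFor(double x, double y) {
            return children[(x < cx ? 0 : 1) + (y < cy ? 0 : 2)];
        }

        // accumulate the acceleration this cell exerts on body b into acc[0..1]
        void addAcceleration(Body b, double theta, double g, double[] acc) {
            if (mass == 0 || (children == null && occupant == b)) return;
            double dx = comX - b.x, dy = comY - b.y;
            double r2 = dx * dx + dy * dy;
            if (r2 == 0) return;                    // coincident: skip (needs softening)
            double width = 2 * half;
            if (children == null || width * width < theta * theta * r2) {
                // cell is far enough away: treat it as a single point mass
                double invR3 = 1.0 / (r2 * Math.sqrt(r2));
                acc[0] += g * mass * dx * invR3;
                acc[1] += g * mass * dy * invR3;
            } else {
                for (QuadNode c : children) c.addAcceleration(b, theta, g, acc);
            }
        }
    }

Each tick you rebuild the tree, insert every body, and call addAcceleration once per body, with theta around 0.5: larger theta is faster, smaller theta is more accurate.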

Java: Calculate distance between a large number of locations and performance

I'm creating an application that will tell a user how far away a large number of points are from their current position.
Each point has a longitude and latitude.
I've read over this article
http://www.movable-type.co.uk/scripts/latlong.html
and seen this post
Calculate distance in meters when you know longitude and latitude in java
There are a number of calculations (50-200) that need to be carried out.
If speed is more important than the accuracy of these calculations, which one is best?
This is O(n).
Don't worry about performance unless every single calculation takes too long (which it doesn't).
As Imre said, this is O(n), or linear, meaning that no matter how the values differ or how many times you run it, the calculations in the algorithm take the same amount of time per iteration. That said, the Spherical Law of Cosines involves fewer variables and operations, meaning fewer resources are used per calculation, so I would choose that one; the only thing that will differ speed-wise is the computer resources available. (Note: the difference will be barely noticeable unless you're on a really old/slow machine.)
Verdict based on opinion: Spherical Law of Cosines
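For reference, the Spherical Law of Cosines from the linked page translates to Java roughly as follows (a sketch, assuming coordinates in degrees and a mean Earth radius of 6371 km):

    static double distanceMeters(double lat1, double lon1, double lat2, double lon2) {
        double p1 = Math.toRadians(lat1), p2 = Math.toRadians(lat2);
        double dLon = Math.toRadians(lon2 - lon1);
        double central = Math.acos(Math.sin(p1) * Math.sin(p2)
                + Math.cos(p1) * Math.cos(p2) * Math.cos(dLon));
        return 6_371_000 * central;   // central angle times Earth's radius, in metres
    }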
The two links that you posted use the same spherical-geometry formula to calculate the distances, so I would not expect a significant difference in their running speed. Also, they are not really computationally expensive, so I would not expect this to be a problem on modern hardware, even at a few hundred iterations.
