I have a situation where I might need to apply a multiplier to a value in order to get the correct results. This involves computing the value using floating point division.
I'm thinking it would be a good idea to check the values before I perform floating point logic on them, to save processor time. However, I'm not sure how efficient it will be at run-time either way.
I'm assuming that the if check is 1 or 2 instructions (been a while since assembly class), and that the floating point operation is going to be many more than that.
//Check
if (a != 10) { //1 or 2 instructions?
b *= (float) a / 10; //Many instructions?
}
Value a is going to be '10' most of the time, however there are a few instances where it won't be. Is the floating point division going to take very many cycles even if a is equal to the divisor?
Will the previous code with the if statement execute more efficiently than simply the next one without?
//Don't check
b *= (float) a / 10; //Many instructions?
Granted, there won't be any noticeable difference either way; however, I'm curious about the behavior of the floating point division when the divisor is equal to the dividend, in case things get processor heavy.
Assuming this is in some incredibly tight loop, executed billions of times, so the difference of 1-2 instructions matters, since otherwise you should probably not bother --
Yes you are right to weigh the cost of the additional check each time, versus the savings when the check is true. But my guess is that it has to be true a lot to overcome not only the extra overhead, but the fact that you're introducing a branch, which will ultimately do more to slow you down via a pipeline stall in the CPU in the JIT-compiled code than you'll gain otherwise.
If a == 10 a whole lot, I'd imagine there's a better and faster way to take advantage of that somehow, earlier in the code.
IIRC, floating-point multiplication is much less expensive than division, so this might be faster than both:
b *= (a * 0.1);
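To make the three options concrete, here is a minimal sketch. The class and method names are made up for illustration, and it says nothing about which variant is faster; that still has to be measured. Note that a * 0.1 is computed in double and is not guaranteed to be bit-identical to (float) a / 10.

```java
// Hypothetical helper class; the names are illustrative, not from the question.
final class Scale {
    // Variant 1: skip the arithmetic when a is the common value 10.
    static float checked(float b, int a) {
        if (a != 10) {
            b *= (float) a / 10;
        }
        return b;
    }

    // Variant 2: always perform the float division.
    static float divide(float b, int a) {
        b *= (float) a / 10;
        return b;
    }

    // Variant 3: multiply by 0.1 instead of dividing.
    // The product is computed in double and narrowed back to float by *=.
    static float multiply(float b, int a) {
        b *= (a * 0.1);
        return b;
    }
}
```

All three agree when a is 10; for other values the multiply variant may differ from the other two in the last bits.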
If you do end up needing to optimize this code I would recommend using Caliper to do micro benchmarks of your inner loops. It's very hard to predict accurately what sort of effect these small modifications will have, especially in Java, where how the VM behaves is a bit of an unknown since in theory it can optimize the code on the fly. Best to try several strategies and see what works.
http://code.google.com/p/caliper/
Considering this code which calculates a power of a double x:
public static double F1 (double x, int k){
if (k==1) { return x; } // O(1)
return x * F1(x, k-1); // O(k)
}
I have concluded that
the nr of operations in if (k==1) { return x; } : 2 operations, the if-statement and the return-statement. Thus, T(1) = 2
the nr of operations in return x * F1(x, k-1); : 4 operations, the return-statement = 1, the *-operator = 1, and F1(x, k-1); = 2. So the first part of the equation = 4
We have one recursive call in x * F1(x, k-1), so x = 1.
We reduce the problem by 1 in each recursive call, so y = k-1. So the second part of the equation = T(k-1)
Putting this all together, we get:
4 + T(k-1), T(1) = 2
But how do I proceed from here to find the exact runtime?
I tried to look at this question for an explanation, but it focused on how to calculate the Big-O notation, and not the exact time complexity. How do I proceed to find the exact time-complexity?
The answer here should be:
Exact: 4k-2
Tilde: 4k
Big-O: O(k)
But I don't know what they did to arrive at this.
But how do I proceed from here to find the exact runtime?
You toss everything you did so far in the garbage and fire up JMH instead, see later for more on that.
It is completely impossible to determine exact runtime based on such academic analysis. Exact runtime depends on which song is playing in your music player, whether your OS is busy doing some disk cleanup, sending a ping to the network time server, which pages so happen to be on the on-die caches, which CPU core your code ends up being run on, and the phase of the moon.
Let me say this as clearly as I can: something like 4k - 2 is utterly irrelevant and misguided; that's just not how computers work. You can't say that an algorithm with 'exact runtime' 4k - 2 will be faster than a 6k + 2 algorithm. It is equally likely to be slower: it holds zero predictive power. It's a completely pointless 'calculation'. It means nothing. There's a reason big-O notation exists: it does mean something regardless of the vagaries of hardware. Given 2 algorithms such that one has a 'better' big-O notation than the other, there exists some input size beyond which the better algorithm WILL be faster, regardless of hardware concerns. It might be a really big number, and big-O does nothing whatsoever to tell you at what number this occurs.
The point of big-O notation is that it dictates with mathematical certainty what will eventually happen if you change the size of the input to your algorithm, in very broad strokes. It is why you remove all constants and everything but the largest factor when showing a big-O notation.
Take a graph; on the X-axis, there's 'input size', which is the 'k' in O(k). On the Y-axis, there's execution time (or if you prefer, max. memory load). Then, make up some input size and run your algorithm a few times. Average the result, and place a dot on that graph. For example, if you are running your algorithm on an input of k=5, and it takes 27ms on average, put a dot on x=5, y=27.
Keep going. Lots of dots. Eventually those dots form a graph. The graph will, near the x=0 point, be all over the place. As if a drunk with a penchant for randomness is tossing darts at a board.
But, eventually (and when 'eventually' kicks in is impossible to determine, as, again, it depends on so many OS things, don't bother attempting to predict such things), it'll start looking like a recognizable shape. We define these shapes in terms of simplistic formulas. For example, if it eventually (far enough to the right) coalesces into something that looks like what you'd get if you graph y=x^2, then we call that O(x^2).
Now, y=5x^2 looks exactly like y=x^2. For that matter, y=158*x^2 + 25000x + 2134931239, if you look far enough to the right on that curve, looks exactly like y=x^2. Hence why O(158x^2+20x) is completely missing the point, and therefore incorrect. The point of O is merely to tell you what it'll look like 'far enough to the right'.
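A small sketch of the 'far enough to the right' idea, using the polynomial from the paragraph above. The class name is made up for illustration. It divides the polynomial by plain x^2 and shows that the ratio settles toward a constant, which is why only the x^2 term survives in the O notation.

```java
// Hypothetical demo class; only the polynomial itself comes from the text.
final class Growth {
    // The polynomial from the text: 158x^2 + 25000x + 2134931239.
    static double f(double x) {
        return 158 * x * x + 25000 * x + 2134931239.0;
    }

    // Ratio against plain x^2; it approaches the constant 158 as x grows.
    static double ratio(double x) {
        return f(x) / (x * x);
    }

    public static void main(String[] args) {
        // For small x the constant term dominates; for large x it becomes irrelevant.
        for (double x = 10; x <= 1e9; x *= 100) {
            System.out.println(x + " -> " + ratio(x));
        }
    }
}
```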
This leaves us with precisely 2 useful performance metrics:
O(k) notation. Which you correctly determined here: This algorithm has an O(k) runtime.
A timing report. There is no point trying to figure this out by looking at the code, you need to run the code. Repeatedly, with all sorts of guards around it to ensure that hotspot optimization isn't eliminating your code completely, re-running lots of times to get a good average, and ensuring that we're past the JVM's JIT step. You use JMH to do this, and note that the result of JMH, naturally, depends on the hardware you run it on, and that's because programs can have wildly different performance characteristics depending on hardware.
For the first k-1 steps you execute:
the comparison k==1
the subtraction k-1
the product x * ...
the return instruction
In the last step you execute:
the comparison k==1
the return instruction
So you have 4*(k-1)+2 = 4k-2 overall instructions.
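The count above can be checked mechanically. The following is a sketch that instruments F1 with a counter, under the same accounting as the list above (one unit each for the comparison, the subtraction, the product and the return). The counter and the class name are additions for illustration; the result is an abstract operation count, not a measurement of time.

```java
// Hypothetical instrumented copy of F1; the counter is not part of the original code.
final class OpCount {
    static long ops;

    static double f1(double x, int k) {
        ops++;                 // the comparison k == 1
        if (k == 1) {
            ops++;             // the return instruction
            return x;
        }
        ops += 3;              // the subtraction k - 1, the product, the return
        return x * f1(x, k - 1);
    }

    // Runs f1 for the given k and reports how many operations were counted.
    static long count(int k) {
        ops = 0;
        f1(2.0, k);
        return ops;
    }
}
```

For every k >= 1, count(k) comes out as 4k - 2, matching the unrolled recurrence T(k) = 4 + T(k-1) with T(1) = 2.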
EDIT: As @rzwitserloot correctly pointed out, the quantity you are looking for is not very meaningful, since it depends on how the code is compiled and executed. Above I've just tried to figure out what your teacher meant by "exact time-complexity".
A few weeks ago I wrote an exam. The first task was to find the right approximation of a function with the given properties: Properties of the function
I had to check every approximation with the tests I wrote for the properties. Properties 2, 3 and 4 were no problem. But I don't know how to check for property 1 using a JUnit test written in Java. My approach was to do it this way:
@Test
void test1() {
Double x = 0.01;
Double res = underTest.apply(x);
for(int i = 0; i < 100000; ++i) {
x = rnd(0, x);
Double lastRes = res;
res = underTest.apply(x);
assertTrue(res <= lastRes);
}
}
Where rnd(0, x) is a function call that generates a random number within (0,x].
But this can't be the right way, because it only checks that as x gets smaller, the result is no larger than the previous one. For example, the test would also succeed if the first res is -5 and each subsequent result is only a little smaller than the previous one for all 100000 iterations, so that the result after the last iteration is -5.5 (or something similar). That means the test would also succeed for a function whose right-hand limit at 0 is -6. Is there a way to check for property 1?
Is there a way to check for property 1?
Trivially? No, of course not. Calculus is all about what happens at infinity / at infinitely close to any particular point – where basic arithmetic gets stuck trying to divide a number that grows ever tinier by a range that is ever smaller, and basic arithmetic can't do '0 divided by 0'.
Computers are like basic arithmetic here, or at least, you are applying basic arithmetic in the code you have pasted, and making the computer do actual calculus is considerably more complicated than what you do here. As in 'requires years of programming experience' more complicated; I very much doubt you were supposed to write a few hundred thousand lines of code to build an entire computerized math platform just to write this test.
More generally, you seem to be labouring under the notion that a test proves anything.
They do not. A test cannot prove correctness, ever. Just like science can never prove anything, they can only disprove (and can build up ever more solid foundations and confirmations which any rational person usually considers more than enough so that they will e.g. take as given that the Law of Gravity will hold, even if there is no proof of this and there never will be).
A test can only prove that you have a bug. It can never prove that you don't.
Therefore, anything you can do here is merely a shades of gray scenario: You can make this test catch more scenarios where you know the underTest algorithm is incorrect, but it is impossible to catch all of them.
What you have pasted is already useful: this test checks that, as you feed in double values that get ever closer to 0, the output of the function grows ever smaller. That's worth something. You seem to think this particular shade of gray isn't dark enough for you. You want to show that it gets fairly close to negative infinity.
That's... your choice. You cannot PROVE that it gets to infinity, because computers are like basic arithmetic with some approximation sprinkled in.
There isn't a heck of a lot you can do here. Yes, you can test if the final res value is, for example, 'less than -1000000'. You can even test if it is literally negative infinity, but there is no guarantee that this is even correct; a function that is defined as 'as the input goes to 0 from the positive end, the output will tend towards negative infinity' is free to do so only for an input so incredibly tiny that double cannot represent it at all. Computers are not magical; doubles take 64 bits, which means there are at most 2^64 unique numbers that double can even represent. 2^64 is a very large number, but it is nothing compared to the doubly-dimensioned infinity that is the concept of 'all numbers imaginable' (there are an infinite amount of numbers between 0 and 1, and an infinite amount of such numbers across the whole number line, after all). Thus, there are plenty of very tiny numbers that double just cannot represent. At all.
For your own sanity, using randomness in a unit test is a bad idea, and for some definitions of the word 'unit test', literally broken (some labour under the notion that unit tests must be reliable or they cannot be considered a unit test, and this is not such a crazy notion if you look at what 'unit test' pragmatically speaking ends up being used for: To have test environments automatically run the unit tests, repeatedly and near continuously, in order to flag down ASAP when someone breaks one. If the CI server runs a unit test 1819 times a day, it would get quite annoying if by sheer random chance, one in 20k times it fails; it would then assume the most recent commit is to blame and no framework out there I know of repeats a unit test a few more times. In the end programming works best if you move away from the notion of proofs and cold hard definitions, and move towards 'do what the community thinks things mean'. For unit tests, that means: Don't use randomness).
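As a sketch of what a slightly 'darker shade of gray' could look like without randomness: walk a fixed sequence x, x/2, x/4, ... toward 0, require the output never to increase, and require the final output to be below a chosen threshold. The class name, the halving sequence and the threshold are all assumptions made for illustration; as explained above, passing this check still proves nothing about the actual limit.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical helper; not part of JUnit or of the code under test.
final class LimitCheck {
    // True if f never increases along start, start/2, start/4, ...
    // and the last value seen is below the given threshold.
    static boolean tendsBelow(DoubleUnaryOperator f, double start, int steps, double threshold) {
        double x = start;
        double last = f.applyAsDouble(x);
        for (int i = 0; i < steps; i++) {
            x /= 2;                          // deterministic sequence, no random numbers
            double res = f.applyAsDouble(x);
            if (res > last) {
                return false;                // output went up while x went down
            }
            last = res;
        }
        return last < threshold;
    }
}
```

Inside a test this could be used as assertTrue(LimitCheck.tendsBelow(underTest::apply, 0.01, 1000, -600.0)), assuming underTest.apply accepts and returns a double. A function that levels off near -5 passes the monotonicity part but fails the threshold part, which is the case the original test could not catch.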
Firstly: you cannot reliably test ANY of the properties
function f could break one of the properties at a point x which is not even representable as a double due to limited precision
there are too many points to test; realistically you need to pick a subset of the domain
Secondly:
your definition of a limit is wrong. You check that the function is monotonically decreasing. This is not required by the definition of a limit: the function could fluctuate while approaching the limit. In the general case, I would probably follow the Weierstrass (epsilon-delta) definition
But:
By eyeballing the conditions you can quickly notice that a logarithmic function (with any base) meets the criteria. (so the function is indeed monotonically decreasing).
Let's pick natural logarithm, and check its value at smallest x that can be represented as a double:
System.out.println(Double.MIN_VALUE);
System.out.println(Math.log(Double.MIN_VALUE));
System.out.println(Math.log(Double.MIN_VALUE) < -1000);
// prints
4.9E-324
-744.4400719213812
false
As you can see, the value is circa -744, which is really far from minus infinity. You cannot get any closer with a 64-bit double.
I want to have a fast log1p function for Java. Java has Math.log1p, but it is apparently too slow for my needs.
I have found this code for log1p here:
http://golang.org/src/pkg/math/log1p.go
for the GO language.
Is it the same as the one in Java, or is it a faster one (assuming I translate it to Java)?
Is anyone aware of some other fast implementation of log1p?
Thanks.
In "What Every Computer Scientist Should Know About Floating-Point Arithmetic" (https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) there is a short algorithm to compute log1p within 5 epsilon for 0<=x<3/4 and given certain requirements on the arithmetic
double xp1 = 1+x;
if(xp1==1)
return x;
else
return x * log(xp1) / (xp1-1);
Maybe this performs better on your system than the builtin log1p implementation. However, use it with care (see the paper for things that could go wrong e.g. in extended-base systems) and have some tests ready.
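A runnable Java version of the snippet above, as a sketch. The class name is made up, and log is taken to be Math.log. Whether it is actually faster than Math.log1p on a given JVM is an open question that only a benchmark can answer.

```java
// Hypothetical wrapper around the algorithm quoted from Goldberg's paper.
final class Log1pSketch {
    static double log1p(double x) {
        double xp1 = 1.0 + x;
        if (xp1 == 1.0) {
            // x is so small that 1 + x rounds to 1; log(1 + x) is then x to working precision.
            return x;
        }
        // The factor x / (xp1 - 1) compensates for the rounding error made in 1 + x.
        return x * Math.log(xp1) / (xp1 - 1.0);
    }
}
```

A reasonable first test is to compare it against Math.log1p for a spread of inputs in the stated range 0 <= x < 3/4, including very small ones.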
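Since log1p(x) is mathematically log(1 + x), finding a fast natural log algorithm is sufficient for what you need (bear in mind that computing it literally as Math.log(x+1) loses accuracy when x is very small).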
Fast Natural Logarithm in Java
I have found the following approximation here, and there is not much information about it except that it is called “Borchardt’s Algorithm” and it is from the book “Dead Reckoning: Calculating Without Instruments”. The approximation is not very good (some might say very bad…), it gets worse the larger the values are. But the approximation is also a monotonic, slowly increasing function, which is good enough for my use case.
public static double log(double x) {
    return 6 * (x - 1) / (x + 1 + 4 * (Math.sqrt(x)));
}
This approximation is 11.7 times faster than Math.log().
See this site. Also, a performance comparison for math libraries in java.
But probably what you need is to link to c++ compiled stuff, detailed here.
Ok so I'm trying to use the Apache Commons Math library to compute a double integral, but both integrals run from negative infinity (to around 1) and it's taking ages to compute. Are there any other ways of doing such operations in Java? Or should it run "faster" (I mean I could actually see the result some day before I die) and I'm doing something wrong?
EDIT: Ok, thanks for the answers. As for what I've been trying to compute it's the Gaussian Copula:
So we have a standard bivariate normal cumulative distribution function which takes as arguments two inverse standard normal cumulative distribution functions and I need integrals to compute that (I know there's an Apache Commons Math function for the standard normal cumulative distribution but I failed to find the inverse and bivariate versions).
EDIT2: as my friend once said "ahhh yes the beauty of Java, no matter what you want to do, someone has already done it" I found everything I needed here http://www.iro.umontreal.ca/~simardr/ssj/ very nice library for probability etc.
There are two problems with infinite integrals: convergence and value-of-convergence. That is, does the integral even converge? If so, to what value does it converge? There are integrals which are guaranteed to converge, but whose value it is not possible to determine exactly (try the integral from 1 to infinity of e^(-x^2)). If it can't be exactly returned, then an exact answer is not possible mathematically, which leaves only approximation. Apache Commons uses several different approximation schemes, but all require the use of finite bounds for correctness.
The best way to get an appropriate answer is to repeatedly evaluate finite integrals, with ever increasing bounds, and compare the results. In pseudo-code, it would look something like this:
double DELTA = 1e-6; // your error threshold here
double STEP_SIZE = 10.0;
double oldValue = Double.MAX_VALUE;
double newValue = oldValue;
double lowerBound = -10; // or whatever you want to start with; for (-infinity, 1), I'd
                         // start with something like -10
double upperBound = 1;
do {
    oldValue = newValue;
    lowerBound -= STEP_SIZE;
    newValue = integrate(lowerBound, upperBound); // perform your integration methods here
} while (Math.abs(newValue - oldValue) > DELTA);
Eventually, if the integral converges, then you will get enough of the important stuff in that widening the bounds further will not produce meaningful information.
A word to the wise though: this kind of thing can be explosively bad if the integral doesn't converge. In that case, one of two situations can occur: Either your termination condition is never satisfied and you fall into an infinite loop, or the value of the integral oscillates indefinitely around a value, which may cause your termination condition to be incorrectly satisfied (giving incorrect results).
To avoid the first, the best way is to put in some maximum number of steps to take before returning--doing this should stop the potentially infinite loop that can result.
To avoid the second, hope it doesn't happen or prove that the integral must converge (three cheers for Calculus 2, anyone? ;-)).
To answer your question formally, no, there are no other such ways to perform your computation in Java. In fact, there are no guaranteed ways of doing it in any language, with any algorithm--the mathematics just don't work out the way we want them to. However, in practice, a lot (though by no means all!) of the practical integrals do converge; it's been my experience that about 20 iterations will give you an approximation of reasonable accuracy, and Apache should be fast enough to handle that without taking absurdly long.
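Here is a self-contained sketch of the widening-bounds loop with the maximum-step guard suggested above. To keep it runnable without any library, it uses a plain composite Simpson's rule as the finite integrator instead of Apache Commons Math; the class name, method names and the choice of about 100 intervals per unit of width are all assumptions made for illustration.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical helper; Simpson's rule stands in for whatever integrator you actually use.
final class ImproperIntegral {
    // Composite Simpson's rule on [a, b] with an even number of intervals n.
    static double simpson(DoubleUnaryOperator f, double a, double b, int n) {
        double h = (b - a) / n;
        double sum = f.applyAsDouble(a) + f.applyAsDouble(b);
        for (int i = 1; i < n; i++) {
            sum += (i % 2 == 0 ? 2 : 4) * f.applyAsDouble(a + i * h);
        }
        return sum * h / 3;
    }

    // Finite integral with roughly 100 intervals per unit of width (always an even count).
    static double estimate(DoubleUnaryOperator f, double lower, double upper) {
        int n = 2 * (int) Math.ceil(50 * (upper - lower));
        return simpson(f, lower, upper, n);
    }

    // Pushes the lower bound outward until two successive estimates agree within delta.
    static double fromMinusInfinity(DoubleUnaryOperator f, double upper, double delta, int maxSteps) {
        final double stepSize = 10.0;
        double lower = upper - stepSize;
        double oldValue = estimate(f, lower, upper);
        for (int step = 0; step < maxSteps; step++) {
            lower -= stepSize;
            double newValue = estimate(f, lower, upper);
            if (Math.abs(newValue - oldValue) <= delta) {
                return newValue;
            }
            oldValue = newValue;
        }
        // Guard against the endless loop described above.
        throw new ArithmeticException("no convergence within " + maxSteps + " steps");
    }
}
```

For example, passing the standard normal density x -> Math.exp(-x * x / 2) / Math.sqrt(2 * Math.PI) with an upper bound of 1 gives an approximation of the standard normal CDF at 1. The guard only protects against the endless loop; it does not protect against an oscillating integral that happens to satisfy the stopping condition.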
Suppose you are integrating f(x) over -infinity to 1; then substitute x = 2 - 1/(1-t), and evaluate over the range 0 .. 1. Note: check a maths text for how to do the substitution, I'm a little rusty and it's too late here.
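For reference, here is that substitution worked out, as a sketch that assumes f is integrable on (-infinity, 1]. With x = 2 - 1/(1-t), t = 0 maps to x = 1 and t -> 1 maps to x -> -infinity, so the limits swap and the minus sign from dx cancels:

```latex
x = 2 - \frac{1}{1-t}, \qquad
\frac{dx}{dt} = -\frac{1}{(1-t)^{2}}

\int_{-\infty}^{1} f(x)\,dx
  = \int_{t=1}^{t=0} f\!\left(2 - \frac{1}{1-t}\right)\left(-\frac{1}{(1-t)^{2}}\right) dt
  = \int_{0}^{1} \frac{f\!\left(2 - \frac{1}{1-t}\right)}{(1-t)^{2}}\,dt
```

The transformed integrand has to be evaluated with care near t = 1, where the denominator goes to 0; for a rapidly decaying f the quotient still tends to 0, but a numerical routine should not sample t = 1 itself.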
The result of a numerical integration where one of the bounds is infinity has a good chance to be infinity as well. And it will take infinite time to prove it ;)
So you either find an equivalent formula (using real math) that can be computed, or you replace the lower bound with a reasonably large negative value and see if you can get a good estimate for the integral.
If Apache Commons Math could do numerical integration for integrals with infinite bounds in finite time, they wouldn't give it away for free ;-)
Maybe it's your algorithm.
If you're doing something naive like Simpson's rule it's likely to take a very long time.
If you're using Gaussian or log quadrature you might have better luck.
What's the function you're trying to integrate, and what's the algorithm you're using?
I need to calculate Math.exp() from java very frequently, is it possible to get a native version to run faster than java's Math.exp()??
I tried just jni + C, but it's slower than just plain java.
This has already been requested several times (see e.g. here). Here is an approximation to Math.exp(), copied from this blog posting:
public static double exp(double val) {
final long tmp = (long) (1512775 * val + (1072693248 - 60801));
return Double.longBitsToDouble(tmp << 32);
}
It is basically the same as a lookup table with 2048 entries and linear interpolation between the entries, but all this with IEEE floating point tricks. It's 5 times faster than Math.exp() on my machine, but this can vary drastically if you run with -server.
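Before using an approximation like this, it is worth checking how far off it is on the inputs you care about. The following sketch copies the method unchanged and measures its largest relative error against Math.exp on an evenly spaced grid; the class name, the helper method and the grid are assumptions for illustration.

```java
// Hypothetical error-measurement harness around the approximation quoted above.
final class ExpError {
    // The approximation from the answer, unchanged.
    static double fastExp(double val) {
        final long tmp = (long) (1512775 * val + (1072693248 - 60801));
        return Double.longBitsToDouble(tmp << 32);
    }

    // Largest relative error against Math.exp on an evenly spaced grid over [lo, hi].
    static double maxRelativeError(double lo, double hi, int samples) {
        double worst = 0.0;
        for (int i = 0; i <= samples; i++) {
            double x = lo + (hi - lo) * i / samples;
            double exact = Math.exp(x);
            worst = Math.max(worst, Math.abs(fastExp(x) - exact) / exact);
        }
        return worst;
    }
}
```

On a moderate range such as [-5, 5] the error is on the order of a few percent, so this trick only fits use cases that tolerate that. The bit manipulation also breaks down for inputs large enough to overflow the exponent field, so the range should be checked as well.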
+1 to writing your own exp() implementation. That is, if this is really a bottle-neck in your application. If you can deal with a little inaccuracy, there are a number of extremely efficient exponent estimation algorithms out there, some of them dating back centuries. As I understand it, Java's exp() implementation is fairly slow, even for algorithms which must return "exact" results.
Oh, and don't be afraid to write that exp() implementation in pure-Java. JNI has a lot of overhead, and the JVM is able to optimize bytecode at runtime sometimes even beyond what C/C++ is able to achieve.
Use Java's.
Also, cache results of the exp and then you can look up the answer faster than calculating them again.
You'd want to wrap whatever loop's calling Math.exp() in C as well. Otherwise, the overhead of marshalling between Java and C will overwhelm any performance advantage.
You might be able to get it to run faster if you do them in batches. Making a JNI call adds overhead, so you don't want to do it for each exp() you need to calculate. I'd try passing an array of 100 values and getting the results to see if it helps performance.
The real question is, has this become a bottle neck for you? Have you profiled your application and found this to be a major cause of slow down? If not, I would recommend using Java's version. Try not to pre-optimize as this will just cause development slow down. You may spend an extended amount of time on a problem that may not be a problem.
That being said, I think your test gave you your answer. If jni + C is slower, use java's version.
Commons Math3 ships with an optimized version: FastMath.exp(double x). It did speed up my code significantly.
Fabien ran some tests and found out that it was almost twice as fast as Math.exp():
0.75s for Math.exp sum=1.7182816693332244E7
0.40s for FastMath.exp sum=1.7182816693332244E7
Here is the javadoc:
Computes exp(x), function result is nearly rounded. It will be correctly rounded to the theoretical value for 99.9% of input values, otherwise it will have a 1 ULP error.
Method:
Lookup intVal = exp(int(x))
Lookup fracVal = exp(int(x-int(x) / 1024.0) * 1024.0 );
Compute z as the exponential of the remaining bits by a polynomial minus one
exp(x) = intVal * fracVal * (1 + z)
Accuracy: Calculation is done with 63 bits of precision, so result should be correctly rounded for 99.9% of input values, with less than 1 ULP error otherwise.
Since the Java code will get compiled to native code with the just-in-time (JIT) compiler, there's really no reason to use JNI to call native code.
Also, you shouldn't cache the results of a method where the input parameters are floating-point real numbers. The gains obtained in time will be very much lost in amount of space used.
The problem with using JNI is the overhead involved in making the call to JNI. The Java virtual machine is pretty optimized these days, and calls to the built-in Math.exp() are automatically optimized to call straight through to the C exp() function, and they might even be optimized into straight x87 floating-point assembly instructions.
There's simply an overhead associated with using the JNI, see also:
http://java.sun.com/docs/books/performance/1st_edition/html/JPNativeCode.fm.html
So as others have suggested try to collate operations that would involve using the JNI.
Write your own, tailored to your needs.
For instance, if all your exponents are of the power of two, you can use bit-shifting. If you work with a limited range or set of values, you can use look-up tables. If you don't need pin-point precision, you use an imprecise, but faster, algorithm.
There is a cost associated with calling across the JNI boundary.
If you could move the loop that calls exp() into the native code as well, so that there is just one native call, then you might get better results, but I doubt it will be significantly faster than the pure Java solution.
I don't know the details of your application, but if you have a fairly limited set of possible arguments for the call, you could use a pre-computed look-up table to make your Java code faster.
There are faster algorithms for exp depending on what you're trying to accomplish. Is the problem space restricted to a certain range? Do you only need a certain resolution, precision, or accuracy?
If you define your problem very well, you may find that you can use a table with interpolation, for instance, which will blow nearly any other algorithm out of the water.
What constraints can you apply to exp to gain that performance trade-off?
-Adam
I run a fitting algorithm and the minimum error of the fitting result is way larger than the precision of the Math.exp().
Transcendental functions are always much slower than addition or multiplication and are a well-known bottleneck. If you know that your values are in a narrow range, you can simply build a lookup table (two sorted arrays: one input, one output). Use Arrays.binarySearch to find the correct index and interpolate the value between the elements at [index] and [index+1].
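A minimal sketch of such a table, assuming an evenly spaced grid over a known range. The class name, the constructor parameters and the fallback to Math.exp outside the range are illustrative choices, not part of the suggestion above.

```java
import java.util.Arrays;

// Hypothetical lookup table for exp with linear interpolation between entries.
final class ExpTable {
    private final double[] xs; // sorted inputs
    private final double[] ys; // exp of each input

    ExpTable(double min, double max, int size) {
        xs = new double[size];
        ys = new double[size];
        double step = (max - min) / (size - 1);
        for (int i = 0; i < size; i++) {
            xs[i] = min + i * step;
            ys[i] = Math.exp(xs[i]);
        }
    }

    double exp(double x) {
        // Outside the table (or NaN): fall back to the exact function.
        if (!(x >= xs[0] && x <= xs[xs.length - 1])) {
            return Math.exp(x);
        }
        int idx = Arrays.binarySearch(xs, x);
        if (idx >= 0) {
            return ys[idx];                  // exact hit on a table entry
        }
        int hi = -idx - 1;                   // binarySearch returns -(insertion point) - 1
        int lo = hi - 1;
        double t = (x - xs[lo]) / (xs[hi] - xs[lo]);
        return ys[lo] + t * (ys[hi] - ys[lo]);
    }
}
```

With an evenly spaced grid the index could also be computed directly from x, which avoids the binary search; the search is kept here because it is what the text describes and it also works for unevenly spaced tables.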
Another method is to split the number. Let's take e.g. 3.81 and split it into 3 + 0.81.
Now you multiply e = 2.718 by itself three times and get 20.08.
Now to 0.81. All values between 0 and 1 converge fast with the well-known exponential series
1+x+x^2/2+x^3/6+x^4/24.... etc.
Take as many terms as you need for precision; unfortunately it converges more slowly as x approaches 1. Let's say you go to x^4, then you get 2.2445 instead of the correct 2.2479.
Then multiply the result 2.718^3 = 20.08 with 2.718^0.81 = 2.2445 and you have the result
45.07 with an error of roughly 0.2% (correct: 45.15).
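The splitting method can be sketched as follows. The class and parameter names are made up; terms counts the series terms including the leading 1, so terms = 5 stops at x^4/24 as in the worked example for 3.81. Handling negative inputs through 1/exp(-x) is an addition of this sketch.

```java
// Hypothetical implementation of the split-and-series idea described above.
final class SplitExp {
    static double exp(double x, int terms) {
        if (x < 0) {
            return 1.0 / exp(-x, terms);     // assumption: mirror negative inputs
        }
        int whole = (int) x;                 // integer part, e.g. 3 for 3.81
        double frac = x - whole;             // fractional part in [0, 1)

        // e^whole by repeated multiplication.
        double intPart = 1.0;
        for (int i = 0; i < whole; i++) {
            intPart *= Math.E;
        }

        // e^frac from the series 1 + x + x^2/2 + x^3/6 + ...
        double term = 1.0;
        double sum = 1.0;
        for (int n = 1; n < terms; n++) {
            term *= frac / n;                // builds frac^n / n! incrementally
            sum += term;
        }
        return intPart * sum;
    }
}
```

The repeated multiplication makes the cost grow with the size of the input, so this only pays off when inputs are small and few series terms are enough.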
It might not be relevant any more, but just so you know, in the newest releases of the OpenJDK (see here), Math.exp should be made an intrinsic (if you don't know what that is, check here).
This will make performance unbeatable on most architectures, because it means the Hotspot VM will replace the call to Math.exp by a processor-specific implementation of exp at runtime. You can never beat these calls, as they are optimized for the architecture...