Ok so I'm trying to use Apache Commons Math library to compute a double integral, but they are both from negative infinity (to around 1) and it's taking ages to compute. Are there any other ways of doing such operations in java? Or should it run "faster" (I mean I could actually see the result some day before I die) and I'm doing something wrong?
EDIT: Ok, thanks for the answers. As for what I've been trying to compute it's the Gaussian Copula:
So we have a standard bivariate normal cumulative distribution function which takes as arguments two inverse standard normal cumulative distribution functions and I need integers to compute that (I know there's a Apache Commons Math function for standard normal cumulative distribution but I failed to find the inverse and bivariate versions).
EDIT2: as my friend once said "ahhh yes the beauty of Java, no matter what you want to do, someone has already done it" I found everything I needed here http://www.iro.umontreal.ca/~simardr/ssj/ very nice library for probability etc.
There are two problems with infinite integrals: convergence and value-of-convergence. That is, does the integral even converge? If so, to what value does it converge? There are integrals which are guaranteed to converge, but whose value it is not possible to determine exactly (try the integral from 1 to infinity of e^(-x^2)). If it can't be exactly returned, then an exact answer is not possible mathematically, which leaves only approximation. Apache Commons uses several different approximation schemes, but all require the use of finite bounds for correctness.
The best way to get an appropriate answer is to repeatedly evaluate finite integrals, with ever increasing bounds, and compare the results. In pseudo-code, it would look something like this:
double DELTA = 10^-6//your error threshold here
double STEP_SIZE = 10.0;
double oldValue=Double.MAX_VALUE;
double newValue=oldValue;
double lowerBound=-10; //or whatever you want to start with--for (-infinity,1), I'd
//start with something like -10
double upperBound=1;
do{
oldValue = newValue;
lowerBound-= STEP_SIZE;
newValue = integrate(lowerBound,upperBound); //perform your integration methods here
}while(Math.abs(newValue-oldValue)>DELTA);
Eventually, if the integral converges, then you will get enough of the important stuff in that widening the bounds further will not produce meaningful information.
A word to the wise though: this kind of thing can be explosively bad if the integral doesn't converge. In that case, one of two situations can occur: Either your termination condition is never satisfied and you fall into an infinite loop, or the value of the integral oscillates indefinitely around a value, which may cause your termination condition to be incorrectly satisfied (giving incorrect results).
To avoid the first, the best way is to put in some maximum number of steps to take before returning--doing this should stop the potentially infinite loop that can result.
To avoid the second, hope it doesn't happen or prove that the integral must converge (three cheers for Calculus 2, anyone? ;-)).
To answer your question formally, no, there are no other such ways to perform your computation in java. In fact, there are no guaranteed ways of doing it in any language, with any algorithm--the mathematics just don't work out the way we want them to. However, in practice, a lot (though by no means all!) of the practical integrals do converge; its been my experience that only about ~20 iterations will give you an approximation of reasonable accuracy, and Apache should be fast enough to handle that without taking absurdly long.
Suppose you are integrating f(x) over -infinity to 1, then substitute x = 2 - 1/(1-t), and evaluate over the range 0 .. 1. Note check a maths text for how to do the substition, I'm a little rusty and its too late here.
The result of a numerical integration where one of the bounds is infinity has a good chance to be infinity as well. And it will take infinite time to prove it ;)
So you either find an equivalent formula (using real math) that can be computed or your replace the lower bound with a reasonable big negative value and look, if you can get a good estimation for the integral.
If Apache Commons Math could do numerical integration for integrals with infinite bounds in finite time, they wouldn't give it away for free ;-)
Maybe it's your algorithm.
If you're doing something naive like Simpson's rule it's likely to take a very long time.
If you're using Gaussian or log quadrature you might have better luck.
What's the function you're trying to integrate, and what's the algorithm you're using?
Related
A few weeks ago I wrote an exam. The first task was to find the right approximation of a function with the given properties: Properties of the function
I had to check every approximation with the tests i write for the properties. The properties 2, 3 and 4 were no problem. But I don't know how to check for property 1 using a JUnit test written in Java. My approach was to do it this way:
#Test
void test1() {
Double x = 0.01;
Double res = underTest.apply(x);
for(int i = 0; i < 100000; ++i) {
x = rnd(0, x);
Double lastRes = res;
res = underTest.apply(x);
assertTrue(res <= lastRes);
}
}
Where rnd(0, x) is a function call that generates a random number within (0,x].
But this can't be the right way, because it is only checking for the x's getting smaller, the result is smaller than the previous one. I.e. the test would also succeed if the first res is equal to -5 and the next a result little smaller than the previous for all 100000 iterations. So it could be that the result after the 100000 iteration is -5.5 (or something else). That means the test would also succeed for a function where the right limit to 0 is -6. Is there a way to check for property 1?
Is there a way to check for property 1?
Trivially? No, of course not. Calculus is all about what happens at infinity / at infinitely close to any particular point – where basic arithmetic gets stuck trying to divide a number that grows ever tinier by a range that is ever smaller, and basic arithmetic can't do '0 divided by 0'.
Computers are like basic arithmetic here, or at least, you are applying basic arithmetic in the code you have pasted, and making the computer do actual calculus is considerably more complicated than what you do here. As in, years of programming experience required more complicated; I very much doubt you were supposed to write a few hundred thousand lines of code to build an entire computerized math platform to write this test, surely.
More generally, you seem to be labouring under the notion that a test proves anything.
They do not. A test cannot prove correctness, ever. Just like science can never prove anything, they can only disprove (and can build up ever more solid foundations and confirmations which any rational person usually considers more than enough so that they will e.g. take as given that the Law of Gravity will hold, even if there is no proof of this and there never will be).
A test can only prove that you have a bug. It can never prove that you don't.
Therefore, anything you can do here is merely a shades of gray scenario: You can make this test catch more scenarios where you know the underTest algorithm is incorrect, but it is impossible to catch all of them.
What you have pasted is already useful: This test proves that as you use double values that get ever closer to 0, your test ensures that the output of the function grows ever smaller. That's worth something. You seem to think this particular shade of gray isn't dark enough for you. You want to show that it gets fairly close to infinity.
That's... your choice. You cannot PROVE that it gets to infinity, because computers are like basic arithmetic with some approximation sprinkled in.
There isn't a heck of a lot you can do here: Yes, you can test if the final res value is for example 'less than -1000000'. You can even test if it is literally negative infinity, but there is no guarantee that this is even correct; a function that is defined as 'as the input goes to 0 from the positive end, the output will tend towards negative infinity' is free to do so only for an input so incredibly tiny, that double cannot represent it at all (computers are not magical; doubles take 64 bit, that means there are at most 2^64 unique numbers that double can even represent. 2^64 is a very large number, but it is nothing compared to the doubly-dimensioned infinity that is the concept of 'all numbers imaginable' (there are an infinite amount of numbers between 0 and 1, and an infinite amount of such numbers across the whole number line, after all). Thus, there are plenty of very tiny numbers that double just cannot represent. At all.
For your own sanity, using randomness in a unit test is a bad idea, and for some definitions of the word 'unit test', literally broken (some labour under the notion that unit tests must be reliable or they cannot be considered a unit test, and this is not such a crazy notion if you look at what 'unit test' pragmatically speaking ends up being used for: To have test environments automatically run the unit tests, repeatedly and near continuously, in order to flag down ASAP when someone breaks one. If the CI server runs a unit test 1819 times a day, it would get quite annoying if by sheer random chance, one in 20k times it fails; it would then assume the most recent commit is to blame and no framework out there I know of repeats a unit test a few more times. In the end programming works best if you move away from the notion of proofs and cold hard definitions, and move towards 'do what the community thinks things mean'. For unit tests, that means: Don't use randomness).
Firstly: you cannot reliably test ANY of the properties
function f could break one of the properties in a point x which is not even representable as a double due to limited precision
there is too much points too test, realistically you need to pick a subset of the domain
Secondly:
your definition of a limit is wrong. You check that the function is monotonically decreasing. This is not required by limit definition - the function could fluctuate when approaching the limit. In the general case, I would probably follow Weierstrass definition
But:
By eyeballing the conditions you can quickly notice that a logarithmic function (with any base) meets the criteria. (so the function is indeed monotonically decreasing).
Let's pick natural logarithm, and check its value at smallest x that can be represented as a double:
System.out.println(Double.MIN_VALUE);
System.out.println(Math.log(Double.MIN_VALUE));
System.out.println(Math.log(Double.MIN_VALUE) < -1000);
// prints
4.9E-324
-744.4400719213812
false
As you can see, the value is circa -744, which is really far from minus infinity. You cannot get any closer on double represented on 64 bits.
I was wondering how to replace common trigonometric values in an expression. To put this into more context, I am making a calculator that needs to be able to evaluate user inputs such as "sin(Math.PI)", or "sin(6 * math.PI/2)". The problem is that floating point values aren't accurate and when I input sin(Math.PI), the calculator ends up with:
1.2245457991473532E-16
But I want it to return 0. I know I could try replacing in the expression all sin(Math.PI) and other common expressions with 0, 1, etc., except I have to check all multiples of Math.PI/2. Can any of you give me some guidance on how to return the user the proper values?
You're running into the problem that it's not quite possible to express a number like pi in a fixed number of bits, so with the available machine precision the computation gives you a small but non-zero number. Math.PI in any case is only an approximation of PI, which is an irrational number. To clean up your answer for display purposes, one possibility is to use rounding. You could instead try adding +1 and -1 to it which may well round the answer to zero.
This question here may help you further:
Java Strange Behavior with Sin and ToRadians
Your problem is that 1.2245457991473532E-16 is in fact zero for many purposes. What about simply rounding the result yielded by sin? With enough rounding, you may achieve what you want and even get 0.5, -0.5 and other important sin values relatively easily.
If you really want to replace those functions as your title suggests, then you can't do that in Java. Your best bet would be to create an SPI specification for common functions that could either fall back to the standard Java implementation or use your own implementation, which replaces the Java one.
Then users of your solution would need to retrieve one of the implementations using dependency injection of explicit references to a factory method.
I need to evaluate a logarithm of any base, it does not matter, to some precision. Is there an algorithm for this? I program in Java, so I'm fine with Java code.
How to find a binary logarithm very fast? (O(1) at best) might be able to answer my question, but I don't understand it. Can it be clarified?
Use this identity:
logb(n) = loge(n) / loge(b)
Where log can be a logarithm function in any base, n is the number and b is the base. For example, in Java this will find the base-2 logarithm of 256:
Math.log(256) / Math.log(2)
=> 8.0
Math.log() uses base e, by the way. And there's also Math.log10(), which uses base 10.
I know this is extremely late, but this may come to be useful for some since the matter here is precision. One way of doing this is essentially implementing a root-finding algorithm that uses, from its base, the high precision types you might want to be using, consisting of simple +-x/ operations.
I would recommend implementing Newton's method since it demands relatively few iterations and has great convergence. For this sort of application, specifically, I believe it's fair to say it will always provide the correct result provided good input validation is implemented.
Considering a simple constant "a" where
Where a is sought to be solved for such that it obeys, then
We can use the Newton method iteratively to find "a" within any specified tolerance, where each a-ith iteration can be computed by
and the denominator is
,
because that's the first derivative of the function, as necessary for the Newton method. Once this is solved for, "a" is the direct answer for the "a = log,b(x)" problem, obtainable by simple +-x/ operations, so you're already good to go. "Wait, but there's a power there?". Yes. If you can rely on your power function as being accurate enough, then there are no issues with going ahead and using it there. Otherwise, you can further break down the power operation into a series of other +-x/ operations by using these methods to simplify whatever decimal number that is on the power into two integer power operations that can be computed easily with a series of multiplication operations. This process will eventually leave you with nth-roots to solve for, which you can also find with the Newton method. If you do go down that road, you can use this for the newton method
which, as you can see, has to be solved for recursively until you reach b = 1.
Phew, but yeah, that's it. This is the way you can solve the problem by making sure you use high precision types along the whole way with only +-x/ operations. Below is a quick implementation I did in Excel to solve for log,2(3), compared with the solution given by the software's original function. As you can see, I can just keep refining "a" until I reach the tolerance I want by monitoring what the optimization function gives me. In this, I used a=2 as the initial guess, which you can use and should be fine for most cases.
I have a situation where I might need to apply a multiplier to a value in order to get the correct results. This involves computing the value using floating point division.
I'm thinking it would be a good idea to check the values before I perform floating point logic on them to save processor time, however I'm not sure how efficient it will be at run-time either way.
I'm assuming that the if check is 1 or 2 instructions (been a while since assembly class), and that the floating point operation is going to be many more than that.
//Check
if (a != 10) { //1 or 2 instructions?
b *= (float) a / 10; //Many instructions?
}
Value a is going to be '10' most of the time, however there are a few instances where it wont be. Is the floating point division going to take very many cycles even if a is equal to the divisor?
Will the previous code with the if statement execute more efficiently than simply the next one without?
//Don't check
b *= (float) a / 10; //Many instructions?
Granted there wont be any noticable difference either way, however I'm curious as to the behavior of the floating point multiplication when the divisor is equal to the dividend in case things get processor heavy.
Assuming this is in some incredibly tight loop, executed billions of times, so the difference of 1-2 instructions matters, since otherwise you should probably not bother --
Yes you are right to weigh the cost of the additional check each time, versus the savings when the check is true. But my guess is that it has to be true a lot to overcome not only the extra overhead, but the fact that you're introducing a branch, which will ultimately do more to slow you down via a pipeline stall in the CPU in the JIT-compiled code than you'll gain otherwise.
If a == 10 a whole lot, I'd imagine there's a better and faster way to take advantage of that somehow, earlier in the code.
IIRC, floating-point multiplication is much less expensive than division, so this might be faster than both:
b *= (a * 0.1);
If you do end up needing to optimize this code I would recommend using Caliper to do micro benchmarks of your inner loops. It's very hard to predict accurately what sort of effect these small modifications will have. Especially in Java where how the VM behaves is bit of an unknown since in theory it can optimize the code on the fly. Best to try several strategies and see what works.
http://code.google.com/p/caliper/
Im facing a scenario where in ill have to compute some huge math expressions. The expressions in themselves are simple, ie have just the conventional BODMAS fundamental but the numbers that occur as operands are very large, to the tune of 1000 digit numbers. I do know of the BigInteger class of the java.math module but am looking for a different way so that the computation can occur also in a speedy manner. Im a guy still finding his feet in Java, so any pointers or advice in gthis regard would be of great help.
Regards
p1nG
Try it with BigInteger, profile the results with some test calculations, and see if it will work for you, before you look for something more optimized.
Since you say you are new to Java, I would have to suggest you use BigInteger and BigDecimal unless you want to write your own arbitrarily large number handlers. BigInteger and BigDecimal are fast enough for most uses of them. The only time I've had speed issues with them is when dealing with numbers on the order of a million digits.
That is unless you have a specific need for not using BigInteger.
First write the program correctly (using BigFoo) and then determine if optimization is appropriate.
BigInteger/BigFloat will be the most optimized implementation of generalized math that you will possibly get.
If you want it faster, you MIGHT be able to write assembly to use bit-shifting patters for specialized math (well, like divide by 2 tends to be a simple right shift), but if you are doing more than a few different types of equations, that will be very impractical.
BigInteger is only slow in comparison to int, but it will probably be the best you are going to possibly get for operations on numbers of more than 64 bits or so without going to another language--and even then you probably won't get much of an improvement unless that other language is assembly...
I am surprised that equations with 1000 digits have a practical application (except perhaps encryption)
Could you could explain what you are doing and what your speed requirements are?