So, I came across a problem today in my construction of a restricted Boltzmann machine that should be trivial, but seems to be troublingly difficult. Basically I'm initializing 2k values to random doubles between 0 and 1.
What I would like to do is calculate the geometric mean of this data set. The problem I'm running into is that since the data set is so long, multiplying everything together will always underflow to zero, and taking the proper root at every step will just rail the result to 1.
I could potentially chunk the list up, but I think that's really gross. Any ideas on how to do this in an elegant way?
In theory I would like to extend my current RBM code to have closer to 15k+ entries, and to be able to run the RBM across multiple threads. Sadly this rules out Apache Commons Math (its geometric mean method is not synchronized) as well as plain longs.
Wow, using a big decimal type is way overkill!
Just take the logarithm of everything, find the arithmetic mean, and then exponentiate.
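In Java that is only a few lines. A minimal sketch (the helper name is mine; assumes every input is strictly positive):

static double geometricMean(double[] xs) {
    double sumLog = 0.0;
    for (double x : xs) sumLog += Math.log(x); // log of the product = sum of the logs
    return Math.exp(sumLog / xs.length);       // exp of the arithmetic mean of the logs
}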
Mehrdad's logarithm solution certainly works. You can do it faster (and possibly more accurately), though:
1. Compute the sum of the exponents of the numbers; call it S.
2. Slam all of the exponents to zero so that each number is between 1/2 and 1.
3. Group the numbers into bunches of at most 1000.
4. For each group, compute the product of the numbers. This will not underflow.
5. Add the exponent of each product to S and slam the product's exponent to zero.
You now have about 1/1000 as many numbers. Repeat steps 3 to 5 until you only have one number.
Call the one remaining number T. The geometric mean is T^(1/N) * 2^(S/N), where N is the size of the input.
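Here is a sketch of that scheme in Java, using Math.getExponent and Math.scalb to pull the binary exponents off the doubles (the helper name is mine; assumes normal, strictly positive inputs):

static double geometricMeanScaled(double[] xs) {
    long s = 0;          // S: running sum of binary exponents
    double t = 1.0;      // running product of mantissas
    int pending = 0;
    for (double x : xs) {
        int e = Math.getExponent(x) + 1; // shift so the mantissa lands in [1/2, 1)
        s += e;
        t *= Math.scalb(x, -e);
        if (++pending == 1000) {         // flush well before the product can underflow
            int pe = Math.getExponent(t);
            s += pe;
            t = Math.scalb(t, -pe);
            pending = 0;
        }
    }
    int n = xs.length;
    return Math.pow(t, 1.0 / n) * Math.pow(2.0, (double) s / n); // T^(1/N) * 2^(S/N)
}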
It looks like after a sufficient number of multiplications, a double's precision is no longer sufficient. Too many leading zeros, if you will.
The Wikipedia page on arbitrary-precision arithmetic shows a few ways to deal with the problem. In Java, BigDecimal seems the way to go, though at the expense of speed.
I have a program in which I need to generate random numbers that determine various outputs (to explain the exact reason would take too long). In theory a high number (let's say 100,000) is a valid output for my program, but it's most likely (though not certain) to end up being useless output.
I'd like to generate random numbers that are weighted to be around a "normalized" number.
For example, I'd pick a number (10), and the majority of numbers that are randomly generated would be near 10. But there's a small chance the random number could be any integer. I currently just use a range when generating the numbers, but this bothers me, since numbers outside this range could potentially be valid and useful output.
Is there an easy way to do this without introducing too much overhead or having to map percentage chances to individual integers?
For positive integers, the geometric, negative binomial, and Poisson distributions are all possibilities. Java implementations are readily available for all of these.
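For instance, a Poisson sampler takes only a few lines with Knuth's classic algorithm. A sketch (not one of the ready-made implementations):

static int samplePoisson(double lambda, java.util.Random rng) {
    // Knuth's method: multiply uniforms until the product drops below e^-lambda.
    // Fine for moderate lambda (e.g. 10); e^-lambda underflows for very large lambda.
    double limit = Math.exp(-lambda);
    double p = 1.0;
    int k = 0;
    do {
        k++;
        p *= rng.nextDouble();
    } while (p > limit);
    return k - 1; // clusters around lambda, but any value >= 0 is possible
}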
I would consider this more of a statistics problem than a programming one. I think you want a logarithmic distribution. Here's an example Java implementation.
I have a question in regard to dealing with small probability values in machine learning models.
The standard way to avoid underflow problems that result from multiplying small floating-point numbers is to use log(x) instead of x.
Suppose that x = 0.50, the log of which is log(x) = -0.301029996.
To recover x later, though, exp(log(x)) != x; that is,
0.740055574 != 0.50
So how is using the logarithm useful for dealing with underflow?
This has nothing to do with overflow. You computed the logarithm in base 10 instead of the natural logarithm. You can do either of the following:
raise 10 to the power log(x) to get back x, or use the natural logarithm throughout.
(Not at all sure I remember correctly, so please correct me if I'm wrong.)
This is not really about overflow or underflow, but about floating point precision.
The idea is that if you have many very small numbers, multiplying them will produce an extremely small number. Say, you have ten probabilities of 1%, or 0.01 each. Multiply them, and the result is 1e-20. In those regions, floating point precision is not very good, which can introduce errors. In the worst case, the number could be 'rounded' to zero, which would break the entire calculation.
The trick with logarithms is that after conversion to logarithms:
1. the values will generally be on a much smaller scale (in the sense of having a smaller exponent),
2. instead of multiplying the values, you just have to add them, so very small (or very big) numbers do not get even smaller (or bigger) as fast, and
3. once you've taken the log of all the candidate probabilities, all the other calculations are just additions, not multiplications, which should also be a bit faster.
Example (using Python, because I'm too lazy to fire up Eclipse, but the same works for Java):
>>> from math import log, exp
>>> x, y, z = 0.01, 0.02, 0.03
>>> x*y*z
6.0000000000000002e-06
>>> log(x)+log(y)+log(z)
-12.023751088736219
>>> exp(log(x)+log(y)+log(z))
6.0000000000000002e-06
Also, as pointed out in the other answer, the problem with your particular calculation is that you used a base-10 logarithm (Math.log10 in Java), whose inverse function is not exp(x) but 10^x. Note, however, that in most languages / math libraries, log is in fact the natural logarithm.
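A quick Java check of the two inverses (a trivial sketch):

double x = 0.5;
System.out.println(Math.exp(Math.log(x)));       // 0.5: Math.exp inverts the natural log
System.out.println(Math.pow(10, Math.log10(x))); // 0.5: 10^y inverts the base-10 log
System.out.println(Math.exp(Math.log10(x)));     // 0.7400555...: the mixed-base bug above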
I have an assignment (I think a pretty common one) where the goal is to develop a LargeInteger class that can do calculations with very large integers.
I am obviously not allowed to use the java.math.BigInteger class at all.
Right off the top I am stuck. I need to take 2 Strings from the user (the long digits) and then I will be using these strings to perform the various calculation methods (add, divide, multiply, etc.).
Can anyone explain to me the theory behind how this is supposed to work? After I take the string from the user (since it is too large to store in an int), am I supposed to break it up, maybe into 10-digit blocks of long numbers? (I think 10 digits is the max for a long, or maybe 9?)
Any help is appreciated.
First off, think about what a convenient data structure to store the number would be. Think about how you would store an N digit number into an int[] array.
Now let's take addition for example. How would you go about adding two N digit numbers?
Using our grade-school addition, first we look at the least significant digit (in standard notation, this would be the right-most digit) of both numbers. Then add them up.
So if the right-most digits were 7 and 8, we would obtain 15. Take the right-most digit of this result (5) and that's the least significant digit of the answer. The 1 is carried over to the next calculation. So now we look at the 2nd least significant digit and add those together along with the carry (if there is no carry, it is 0). And repeat until there are no digits left to add.
The basic idea is to translate how you add, multiply, etc., by hand into code, when the numbers are stored in some data structure.
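For example, here is a sketch of that grade-school addition on base-10 digit arrays stored least-significant digit first (the layout is my own choice, not prescribed by the answer):

static int[] add(int[] a, int[] b) {
    int n = Math.max(a.length, b.length);
    int[] sum = new int[n + 1]; // one extra slot for a possible final carry
    int carry = 0;
    for (int i = 0; i < n; i++) {
        int d = carry
              + (i < a.length ? a[i] : 0)
              + (i < b.length ? b[i] : 0);
        sum[i] = d % 10; // keep the low digit
        carry = d / 10;  // carry the rest to the next position
    }
    sum[n] = carry;      // may leave a leading zero digit
    return sum;
}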
I'll give you a few pointers as to what I might do with a similar task, but let you figure out the details.
Look at how addition is done in simple electronic adder circuits. Specifically, they combine small blocks of addition together. These principles will help: you can add the blocks independently, just remember to carry over from one block to the next.
Your idea of breaking the number up into smaller blocks is an excellent one. Just remember to do the correct conversions. I suspect 9 digits is just about right, for the purpose of carry-overs, etc.
These tips will help you with addition and subtraction. Multiplication and division are a bit trickier, but again, a few tips.
Multiplication is the easier of the two tasks: just remember to multiply each block of one number with each block of the other, and carry the zeros (there's a sketch after this answer).
Integer division could basically be approached like long division, only using whole blocks at a time.
I've never actually built such a class, so hopefully there will be something in here you can use.
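A sketch of that block-by-block multiplication, shown here on the same least-significant-first digit-array layout as the addition sketch above (again, my own layout):

static int[] multiply(int[] a, int[] b) {
    int[] prod = new int[a.length + b.length]; // enough room for every digit
    for (int i = 0; i < a.length; i++) {
        int carry = 0;
        for (int j = 0; j < b.length; j++) {
            int d = prod[i + j] + a[i] * b[j] + carry;
            prod[i + j] = d % 10;
            carry = d / 10;
        }
        prod[i + b.length] += carry; // "carry the zeros": offset i shifts the place value
    }
    return prod; // may contain leading zero digits
}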
Look at the source code for MPI 1.8.6 by Michael Bromberger (a C library). It uses a simple data structure for bignums and simple algorithms. It's C, not Java, but straightforward.
Its division performs poorly (and results in slow conversion of very large bignums to text), but you can follow the code.
There is a function mpi_read_radix to read a number in an arbitrary radix (up to base 36, where the letter Z is 35) with an optional leading +/- sign, and produce a bignum.
I recently chose that code for a programming language interpreter because although it is not the fastest performer out there, nor the most complete, it is very hackable. I've been able to rewrite the square root myself to a faster version, fix some coding bugs affecting a port to 64 bit digits, and add some missing operations that I needed. Plus the licensing is BSD compatible.
I'm writing a calculator without using decimals (it supports only rational numbers), but I'd like to be able to do a version of square root.
When the square root function is pressed for (say) the number 12, I'd like to simplify/"reduce" the square root and return 2*sqrt(3): factor 12 into (2*2) * 3 and extract the sqrt(2*2) as 2.
I'm using BigInteger, which has a very nice gcd() method and a pow() method that is restricted to positive parameters (which makes sense, unless you are trying to do exactly what I'm trying to do).
I could come up with a few iterative ways to do this but they may take a while with numbers in the hundreds-of-digits range.
I'm hoping there is some cute, simple, non-iterative trick I haven't been exposed to.
Just to clarify: I have the intent to add imaginary numbers so I'm planning on results like this:
17 + 4i √3
-----------
9
Without long streams of decimals.
What you're asking, in essence, is to find all repeated prime factors. Since you're dealing with numbers in the hundreds-of-digits range, I'm going to venture a guess here that there are no good ways to do this in general. Otherwise public-key cryptography would all of a sudden be on somewhat shaky ground.
There are a number of methods of computing the square root. With those, you can express the result as an integer plus a remainder less than 1.
Maybe try finding the highest perfect square that is less than your number. That will give you part of the equation, then you would only need to handle the remainder part which is the difference between your number and the perfect square you found. This would degrade as numbers get large as well, but perhaps not as fast.
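Either way, the core primitive is an integer square root, which lets you write sqrt(n) as isqrt(n) plus a remainder. Newer JDKs ship BigInteger.sqrt() built in; a hand-rolled Newton-iteration sketch might look like this:

static java.math.BigInteger isqrt(java.math.BigInteger n) {
    if (n.signum() < 0) throw new ArithmeticException("negative argument");
    if (n.signum() == 0) return java.math.BigInteger.ZERO;
    // Start above sqrt(n); the iteration then decreases monotonically.
    java.math.BigInteger x = java.math.BigInteger.ONE.shiftLeft(n.bitLength() / 2 + 1);
    while (true) {
        java.math.BigInteger y = x.add(n.divide(x)).shiftRight(1); // (x + n/x) / 2
        if (y.compareTo(x) >= 0) return x; // stopped decreasing: x is floor(sqrt(n))
        x = y;
    }
}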
OK, so I'm trying to use the Apache Commons Math library to compute a double integral, but both integrals run from negative infinity (to around 1) and it's taking ages to compute. Are there any other ways of doing such operations in Java? Or should it run "faster" (I mean, fast enough that I could actually see the result some day before I die) and I'm doing something wrong?
EDIT: Ok, thanks for the answers. As for what I've been trying to compute it's the Gaussian Copula:
So we have a standard bivariate normal cumulative distribution function which takes as arguments two inverse standard normal cumulative distribution functions, and I need integrals to compute those (I know there's an Apache Commons Math function for the standard normal cumulative distribution, but I failed to find the inverse and bivariate versions).
EDIT2: As my friend once said, "ahhh yes, the beauty of Java: no matter what you want to do, someone has already done it." I found everything I needed here: http://www.iro.umontreal.ca/~simardr/ssj/ is a very nice library for probability etc.
There are two problems with infinite integrals: convergence and value of convergence. That is, does the integral even converge, and if so, to what value? There are integrals which are guaranteed to converge but whose value cannot be determined exactly (try the integral from 1 to infinity of e^(-x^2)). If the value can't be expressed exactly, then an exact answer is not possible mathematically, which leaves only approximation. Apache Commons Math uses several different approximation schemes, but all require finite bounds for correctness.
The best way to get an appropriate answer is to repeatedly evaluate finite integrals, with ever increasing bounds, and compare the results. In pseudo-code, it would look something like this:
double DELTA = 1e-6;     // your error threshold here (note: 10^-6 in Java is XOR, not a power)
double STEP_SIZE = 10.0;
double oldValue = Double.MAX_VALUE;
double newValue = oldValue;
double lowerBound = -10; // or whatever you want to start with;
                         // for (-infinity, 1), I'd start with something like -10
double upperBound = 1;
do {
    oldValue = newValue;
    lowerBound -= STEP_SIZE;                      // widen the interval each pass
    newValue = integrate(lowerBound, upperBound); // perform your integration method here
} while (Math.abs(newValue - oldValue) > DELTA);
Eventually, if the integral converges, you will have captured enough of the important mass that widening the bounds further will not produce meaningful changes in the result.
A word to the wise though: this kind of thing can be explosively bad if the integral doesn't converge. In that case, one of two situations can occur: Either your termination condition is never satisfied and you fall into an infinite loop, or the value of the integral oscillates indefinitely around a value, which may cause your termination condition to be incorrectly satisfied (giving incorrect results).
To avoid the first, the best way is to put in some maximum number of steps to take before returning--doing this should stop the potentially infinite loop that can result.
To avoid the second, hope it doesn't happen or prove that the integral must converge (three cheers for Calculus 2, anyone? ;-)).
To answer your question formally: no, there are no other such ways to perform your computation in Java. In fact, there is no guaranteed way of doing it in any language, with any algorithm; the mathematics just don't work out the way we want them to. However, in practice a lot (though by no means all!) of practical integrals do converge; it's been my experience that only about ~20 iterations will give an approximation of reasonable accuracy, and Apache Commons Math should be fast enough to handle that without taking absurdly long.
Suppose you are integrating f(x) over -infinity to 1; then substitute x = 2 - 1/(1-t) and evaluate over the range 0..1. Note: check a maths text for how to do the substitution properly; I'm a little rusty and it's too late here.
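(Working that substitution out, assuming I have the bookkeeping right: dx = -dt/(1-t)^2, t = 0 gives x = 1, and t -> 1 gives x -> -infinity, so the integral of f(x) over (-infinity, 1] becomes the integral over [0, 1) of f(2 - 1/(1-t)) / (1-t)^2 dt, now with finite bounds.)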
The result of a numerical integration where one of the bounds is infinity has a good chance of being infinite as well. And it will take infinite time to prove it ;)
So either you find an equivalent formula (using real math) that can be computed, or you replace the lower bound with a reasonably big negative value and see if you can get a good estimate for the integral.
If Apache Commons Math could do numerical integration for integrals with infinite bounds in finite time, they wouldn't give it away for free ;-)
Maybe it's your algorithm.
If you're doing something naive like Simpson's rule it's likely to take a very long time.
If you're using Gaussian or log quadrature you might have better luck.
What's the function you're trying to integrate, and what's the algorithm you're using?