I'm trying to sum 561 logs.
They look like these:
-7.314254939475686
-7.656004233197743
-4.816276208120333
-8.426112454893817
-4.771824445549499
-9.34240318676797
So they're not big numbers. However, when I proceed with summing them I get this:
-2668.179647264475
-2674.7747795369874
-2679.18920466334
-2683.9724816026214
-2690.3342661536453
-Infinity
-Infinity
The code that does it is:
double probspam=0;
for(int j=0;j<words.size();j++)
{
probspam+= Math.log(spam.getClassProbability(words.get(j)));
}
Do you have any idea of how to get around the -Infinity issue and why it happens? Thank you
For some values, spam.getClassProbability() returns 0.0. See the docs for Math.log:
If the argument is positive zero or negative zero, then the result is negative infinity.
The Javadoc for Math explains why you get -Infinity as a result:
If the argument is positive zero or negative zero, then the result is negative infinity.
You should check your values for zeros, or filter them out prior to applying the log function.
Most likely the value of spam.getClassProbability(words.get(j)) is zero at some point.
Math.log(0.0) returns negative infinity (as the API documentation says).
If the class probability for one word is zero, then you add -Infinity to your sum.
One of your spam candidates is getting a zero from getClassProbability:
System.out.println(Math.log(0));
Output:
-Infinity
This is a special reserved double value, and adding any finite number to it still gives -Infinity, so once the loop hits the zero, your summing variable will stay -Infinity.
To "fix" it, do this:
double wordProbSpam = spam.getClassProbability(words.get(j));
probspam += wordProbSpam > 0 ? Math.log(wordProbSpam) : 0;
Frankly, I think your approach is flawed. I would simply sum the result of getClassProbability() rather than its log, because for numbers between 0 and 1 the log is negative, which will do weird things to the sum.
I think the general point has already been made: you are taking the log of 0.0. Even if your getClassProbability() is perfect, numerical underflow may still mean it returns zero when, mathematically speaking, the result was non-zero.
One option is to replace all zeros with the value of Math.ulp(0.0), which is the same as Double.MIN_VALUE. This is the smallest non-zero value Java can represent (4.9e-324) and has a log of around -744.44. This avoids the game-breaking concept of a zero probability. After all, spammers are very clever, so the probability will never truly be zero.
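The clamping idea above can be sketched roughly as follows. This is only an illustration: `sumLogs` and the `probabilities` array are hypothetical stand-ins for the loop over spam.getClassProbability(...), and the values are made up.

```java
class LogSumSketch {
    // Hypothetical helper: sums log-probabilities, replacing zero
    // probabilities with the smallest positive double so that
    // Math.log never returns -Infinity.
    static double sumLogs(double[] probabilities) {
        double sum = 0.0;
        for (double p : probabilities) {
            double clamped = Math.max(p, Double.MIN_VALUE);
            sum += Math.log(clamped);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] probabilities = {0.5, 0.0, 0.25}; // made-up values
        System.out.println(Math.log(0.0));          // -Infinity
        // Finite, but dominated by the clamped zero (about -744.44).
        System.out.println(sumLogs(probabilities));
    }
}
```

A clamped zero still contributes a very large penalty to the sum, so whether this floor is appropriate depends on how the probabilities are estimated.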
Math.pow(-27.0, 1.0/3) should be equivalent to cbrt(-27) which does return -3. Why does pow return NaN?
It is not integer division and, as far as I can tell, it is not a mistake on my part. I cannot think of a reason this should happen.
According to Oracle, one of the special cases for Math.pow(double,double) method is:
If the first argument is finite and less than zero and the second
argument is finite and not an integer, then the result is NaN.
And Math.cbrt(double) works because:
For positive finite x, cbrt(-x) == -cbrt(x); that is, the cube root of
a negative value is the negative of the cube root of that value's
magnitude
It doesn't return an error. I highly recommend you read through the rules given in the Math.pow(double, double) Javadoc. I'll help: if the first argument is finite and less than zero and the second argument is finite and not an integer, then the result is NaN.
Math.pow cannot raise a negative base such as -27 to a non-integer exponent such as 1.0/3, which explains why it returns NaN. Math.cbrt(-27) returns -3 because it works a little differently: it takes the magnitude of the value (in this case, |-27| = 27), calculates the cube root, and then reapplies the negative sign, giving you -3.
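A minimal sketch of the difference, assuming you want a real odd root of a negative number. `signedRoot` is a hypothetical helper that mimics the magnitude-then-sign idea described above; it is not how Math.cbrt is actually implemented.

```java
class CubeRootSketch {
    // Hypothetical helper: real root of odd degree, computed on the
    // magnitude and then given the sign of the original value.
    static double signedRoot(double value, int oddDegree) {
        if (oddDegree % 2 == 0) {
            throw new IllegalArgumentException("degree must be odd");
        }
        double magnitudeRoot = Math.pow(Math.abs(value), 1.0 / oddDegree);
        return Math.copySign(magnitudeRoot, value);
    }

    public static void main(String[] args) {
        // Negative base with a non-integer exponent: NaN
        System.out.println(Math.pow(-27.0, 1.0 / 3));
        // Dedicated cube root handles negative arguments
        System.out.println(Math.cbrt(-27.0));
        // Same idea for any odd degree, up to rounding error
        System.out.println(signedRoot(-27.0, 3));
    }
}
```

For cube roots specifically, Math.cbrt is the simpler choice; the helper only matters for other odd degrees.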
In my Android app I am trying to evaluate the expression (-2^(x)), but I can't seem to get the Math.pow() method from the Java library to work. I am able to evaluate (2^(x)), but not the same expression with a negative base.
Here is a look at the Logs. The y values are all returned as NaN.
To evaluate, I am using the following statements:
double result = Math.pow(x,exponent);
result = coefficient * result;
I don't know what the problem might be. Perhaps it is the way the negative base is set up.
Thanks for any advice.
return multiplier * Math.pow(base,result);
Assuming that you are trying to compute (-2) to the power x, where x is one of the x values shown in the image: this does not work unless x is an integer. The reason is that the answer is not a real number. (For example, what is (-1)^0.5? That's the square root of -1, which is i, an imaginary number, not a real number.) The x values shown in the image are all non-integers (there appear to be some that are very close to integers but still aren't--there's a non-zero in the last decimal place). Thus, the results all come out as NaN.
This is explicit in the javadoc for Math.pow:
If the first argument is finite and less than zero:
if the second argument is a finite even integer, the result is equal to the result of raising the absolute value of the first argument to the power of the second argument
if the second argument is a finite odd integer, the result is equal to the negative of the result of raising the absolute value of the first argument to the power of the second argument
if the second argument is finite and not an integer, then the result is NaN.
If what you're doing is something other than (-2)^x, then your question is confusing and needs clarification.
A negative base with a fractional exponent is a complex number with a real and imaginary part. The Math.pow function is not equipped to return a complex number; a double return value can't represent or refer to a complex value.
The problem happens because of the way all languages represent floating point numbers. You can no more represent 0.1 exactly as a binary number than you can 1/3 using base 10.
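To make the two points above concrete, here is a small sketch. It assumes that exponents which are only a rounding error away from an integer should be treated as that integer; `powSnapped` and `TOLERANCE` are hypothetical names and the tolerance value is arbitrary. It also shows that -(2^x) is always real, unlike (-2)^x.

```java
class NegativeBaseSketch {
    // Assumed tolerance for treating an exponent as an integer (arbitrary).
    static final double TOLERANCE = 1e-9;

    // Hypothetical helper: snaps a near-integer exponent to the nearest
    // integer before calling Math.pow, so a negative base does not
    // produce NaN because of a tiny rounding error in the exponent.
    static double powSnapped(double base, double exponent) {
        double nearest = Math.rint(exponent);
        if (Math.abs(exponent - nearest) < TOLERANCE) {
            exponent = nearest;
        }
        return Math.pow(base, exponent);
    }

    public static void main(String[] args) {
        double x = Math.nextUp(2.0); // smallest double greater than 2
        System.out.println(Math.pow(-2.0, x));   // NaN: exponent is not an integer
        System.out.println(powSnapped(-2.0, x)); // treated as (-2)^2
        System.out.println(powSnapped(-2.0, 0.5)); // NaN: genuinely fractional
        System.out.println(-Math.pow(2.0, 0.5));   // -(2^0.5) is a real number
    }
}
```

Snapping changes the function being plotted, so it is only reasonable if the x values were meant to be integers in the first place.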
I am confused about using expm1 function in java
The Oracle java doc for Math.expm1 says:
Returns exp(x) -1. Note that for values of x near 0, the exact sum of
expm1(x) + 1 is much closer to the true result of e^x than exp(x).
but this page says:
However, for negative values of x, roughly -4 and lower, the algorithm
used to calculate Math.exp() is relatively ill-behaved and subject to
round-off error. It's more accurate to calculate e^x - 1 with a
different algorithm and then add 1 to the final result.
Should we use expm1(x) for negative x values or for values near 0?
The implementation of double at the bit level means that you can store doubles near 0 with much more precision than doubles near 1. That's why expm1 can give you much more accuracy for near-zero powers than exp can, because double doesn't have enough precision to store very accurate numbers very close to 1.
I don't believe the article you're citing is correct, as far as the accuracy of Math.exp goes (modulo the limitations of double). The Math.exp specification guarantees that the result is within 1 ulp of the exact value, which means -- to oversimplify a bit -- a relative error of at most 2^-52, ish.
You use expm1(x) for anything close to 0. Positive or negative.
The reason is that exp(x) for anything close to 0 will be very close to 1. Therefore exp(x) - 1 will suffer from destructive cancellation when x is close to 0.
expm1(x) is properly optimized to avoid this destructive cancellation.
From the mathematical side: If exp is implemented using its Taylor Series, then expm1(x) can be done by simply omitting the first +1.
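A quick way to see the cancellation is to compare both approaches for a tiny x. In this sketch, `naive` and `reference` are hypothetical helpers; the reference value uses the first two Taylor terms, x + x^2/2, which is accurate to well below double precision for x around 1e-10.

```java
class Expm1Sketch {
    // Naive form: exp(x) is rounded to a double near 1 before the
    // subtraction, so most significant digits of the result are lost.
    static double naive(double x) {
        return Math.exp(x) - 1.0;
    }

    // Reference for tiny x: first two Taylor terms of e^x - 1.
    static double reference(double x) {
        return x + x * x / 2;
    }

    static double relativeError(double value, double x) {
        return Math.abs(value - reference(x)) / reference(x);
    }

    public static void main(String[] args) {
        double x = 1e-10;
        System.out.println("naive : " + naive(x));
        System.out.println("expm1 : " + Math.expm1(x));
        System.out.println("naive relative error: " + relativeError(naive(x), x));
        System.out.println("expm1 relative error: " + relativeError(Math.expm1(x), x));
    }
}
```

The same comparison with x = -1e-10 behaves the same way, which matches the advice that the sign of x does not matter, only its closeness to 0.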
Why is Math.floor(Double.MIN_VALUE) == 0?
Can anyone send me the Java algorithm for the floor function, or at least explain this result, please?
Double.MIN_VALUE doesn't mean what you think it means. It means "the smallest positive double value" - so naturally when you take the "floor" of it (largest integer less than or equal to the value), you'll get 0. Documentation:
A constant holding the smallest positive nonzero value of type double, 2^-1074. It is equal to the hexadecimal floating-point literal 0x0.0000000000001P-1022 and also equal to Double.longBitsToDouble(0x1L).
I agree that the name is confusing, but it's always worth checking the documentation as soon as you see confusing behaviour.
If you want to get the "lowest" finite double, just use -Double.MAX_VALUE.
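A short sketch of the behaviour described above. The `lowestFinite` helper is a hypothetical name added only for illustration.

```java
class MinValueSketch {
    // Hypothetical helper: the most negative finite double.
    static double lowestFinite() {
        return -Double.MAX_VALUE;
    }

    public static void main(String[] args) {
        // MIN_VALUE is a tiny positive number, not a large negative one.
        System.out.println(Double.MIN_VALUE > 0);           // true
        // Largest integer <= a tiny positive value is 0.
        System.out.println(Math.floor(Double.MIN_VALUE));   // 0.0
        // Largest integer <= a tiny negative value is -1.
        System.out.println(Math.floor(-Double.MIN_VALUE));  // -1.0
        // Stepping below the lowest finite double gives -Infinity.
        System.out.println(Math.nextDown(lowestFinite()));  // -Infinity
    }
}
```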
How can we use them in our code, and what causes NaN (not a number)?
Positive infinity means going to infinity in the positive direction -- going into values that are larger and larger in magnitude in the positive direction.
Negative infinity means going to infinity in the negative direction -- going into values that are larger and larger in magnitude in the negative direction.
Not-a-number (NaN) is something that is undefined, such as the result of 0/0.
And the constants from the specification of the Float class:
Float.NEGATIVE_INFINITY
Float.POSITIVE_INFINITY
Float.NaN
More information can be found in the IEEE-754 page in Wikipedia.
Here's a little program to illustrate the three constants:
System.out.println(0f / 0f);
System.out.println(1f / 0f);
System.out.println(-1f / 0f);
Output:
NaN
Infinity
-Infinity
This may be a good reference if you want to learn more about floating point numbers in Java.
Positive Infinity is a positive number so large that it can't be represented normally. Negative Infinity is a negative number so large that it cannot be represented normally. NaN means "Not a Number" and results from a mathematical operation that doesn't yield a number, like dividing 0 by 0.
In Java, the Double and Float classes both have constants to represent all three cases. They are POSITIVE_INFINITY, NEGATIVE_INFINITY, and NaN.
Plus consider this:
double a = Math.pow(10, 600) - Math.pow(10, 600); //==NaN
Mathematically, everybody can see it is 0. But for the machine, it is an "Infinity" - "Infinity" (of same Rank), which is indeed NaN.
1.0/0 will result in positive infinity (the operands must be floating point; the integer expression 1/0 throws an ArithmeticException).
0.0/0 will result in NaN. You can use NaN like any other number, e.g. NaN+NaN=NaN, NaN+2.0=NaN.
-1.0/0 will result in negative infinity.
Infinity (in Java) means that the result of an operation will be such an extremely large positive or negative number that it cannot be represented normally.
The idea is to represent special numbers which can arise naturally from operations on "normal" numbers. You could see infinity (both positive and negative) as "overflow" of the floating point representation, the idea being that in at least some conditions, having such a value returned by a function still gives meaningful result. They still have some ordering properties, for example (so they won't screw sorting operations, for example).
NaN is very particular: if x is NaN, x == x is false (that's actually one way to test for NaN, at least in C; Java also provides Double.isNaN). This can be quite confusing if you are not used to floating point peculiarities. Unless you do scientific computation, I would say that having NaN returned by an operation is a bug, at least in most cases that come to mind. NaN can come from various operations: 0/0, inf - inf, inf/inf, 0 * inf. NaN does not have any ordering property, either.
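The properties listed above can be checked directly in Java. This is a small sketch; `isNaNBySelfCompare` is a hypothetical helper that shows the x != x trick, while Double.isNaN is the standard way to do the same test.

```java
class NaNSketch {
    // Hypothetical helper: NaN is the only double not equal to itself.
    static boolean isNaNBySelfCompare(double x) {
        return x != x;
    }

    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY;
        double nan = 0.0 / 0.0;

        System.out.println(isNaNBySelfCompare(nan)); // true
        System.out.println(Double.isNaN(nan));       // true

        // The other sources of NaN mentioned above.
        System.out.println(Double.isNaN(inf - inf)); // true
        System.out.println(Double.isNaN(inf / inf)); // true
        System.out.println(Double.isNaN(0.0 * inf)); // true

        // Infinities are ordered; comparisons with NaN are always false.
        System.out.println(inf > Double.MAX_VALUE);  // true
        System.out.println(nan < 1.0);               // false
        System.out.println(nan > 1.0);               // false
    }
}
```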
You can use them as any other number:
e.g:
float min = Float.NEGATIVE_INFINITY;
float max = Float.POSITIVE_INFINITY;
float nan = Float.NaN;
Positive Infinity is a positive number so large that it can't be
represented normally. Negative Infinity is a negative number so large
that it cannot be represented normally. NaN means "Not a Number" and
results from a mathematical operation that doesn't yield a number-
like dividing 0 by 0.
This is not a complete answer (or not clarified enough). Consider this:
double a = Math.pow(10,600) - Math.pow(10,600); //==NaN
Mathematically, everybody can see it is 0. But for the machine it is "Infinity" - "Infinity" (of the same order), which is indeed NaN.
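One way to catch this kind of silent overflow is to check the operands before combining them. The following is only a sketch; `checkedDifference` is a hypothetical helper, and throwing an exception is just one possible policy.

```java
class OverflowSketch {
    // Hypothetical helper: refuses to subtract values that have already
    // overflowed to an infinity or become NaN.
    static double checkedDifference(double a, double b) {
        if (!Double.isFinite(a) || !Double.isFinite(b)) {
            throw new ArithmeticException("operand is not finite");
        }
        return a - b;
    }

    public static void main(String[] args) {
        double big = Math.pow(10, 600); // too large for a double
        System.out.println(big);        // Infinity
        System.out.println(big - big);  // NaN
        try {
            checkedDifference(big, big);
        } catch (ArithmeticException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```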