Are there any fully compliant IEEE 754r implementations available for Java that support all the features Java chose to omit (or rather, that high-level languages in general like to omit):
Traps
Sticky flags
Directed rounding modes
Extended/long double
Quad precision
DPD (densely packed decimals)
Clarification before anyone gets it wrong: I'm not looking for the JVM to offer any support for the above, just for some classes that implement the types and operations in software, basically something in the style of the already existing primitive wrapper classes Float/Double.
No, there is no fully compliant IEEE 754R implementation, not only in Java but in any currently available language (status: July 2012).
EDIT: The poster asked for IEEE 754R support, which is identical to IEEE 754-2008. Listing all the reasons why no such thing exists would make this very long.
Traps: No, calling your own routines on OVERFLOW, UNDERFLOW, INEXACT etc. via SIGFPE is not a trap. See IEEE 754 (the old one), p. 21, for what constitutes a trap.
Signaling NaNs. NaN payload access. Flag access. Try enumerating the languages that can do any of that.
Rounding modes: The new standard defines roundTiesToAway (p. 16) as a new rounding mode. Unfortunately, there are AFAIK no processors that support this mode, and no software emulation either.
Quad precision: Only supported by very few compilers, and by even fewer compilers that are not broken.
Densely packed decimals: Will probably only be supported in languages that use decimals, e.g. COBOL.
Intersection of all sets: Empty set. None. Nothing.
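For what it's worth, the closest thing the JDK itself offers to directed rounding is decimal, not binary: BigDecimal takes an explicit RoundingMode. A minimal sketch follows; this is decimal arithmetic, not IEEE 754 binary arithmetic with directed rounding, so it is an analogue rather than a substitute:

import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class DirectedRoundingSketch {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1");
        BigDecimal b = new BigDecimal("3");
        // Round toward +infinity and toward -infinity at 20 significant digits.
        System.out.println(a.divide(b, new MathContext(20, RoundingMode.CEILING))); // 0.33333333333333333334
        System.out.println(a.divide(b, new MathContext(20, RoundingMode.FLOOR)));   // 0.33333333333333333333
    }
}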
There is, however, source code implementing the functions below:
double nextAfter(double x, double y) - returns the double adjacent to x in the direction of y
double scalb(double x, int e) - computes x*2^e quickly
boolean unordered(double c1, double c2) - returns true iff the two cannot be compared numerically (one or both is NaN)
int fpclassify(double value) - classifies a floating-point value into one of five types:
FP_NAN: "not any number", typically the result of illegal operations like 0/0
FP_INFINITY: represents one end of the real line, available by 1/0 or POSITIVE_INFINITY
FP_ZERO: positive or negative zero; they are different, but not so much that it comes up much
FP_SUBNORMAL: a class of numbers very near zero; further explanation would require a detailed examination of the floating-point binary representation
FP_NORMAL: most values you encounter are "normal"
double copySign(double value, double sign) - returns value, possibly with its sign flipped, to match "sign"
double logb754(double value) - extracts the exponent of the value, to compute log2
double logb854(double value) - like logb754(value), but with an IEEE854-compliant variant for subnormal numbers
double logbn(double value) - also computing log2(value), but with a normalizing correction for the subnormals; this is the best log routine
double raise(double x) - not actually an IEEE754 routine, this is an optimized version of nextAfter(x,POSITIVE_INFINITY)
double lower(double x) - not actually an IEEE754 routine, this is an optimized version of nextAfter(x,NEGATIVE_INFINITY)
"All of these routines also have float variants, differing only in argument and return types. The class is org.dosereality.util.IEEE754"
Sun bug reference 2003
Related
I was reading about floating-point NaN values in the Java Language Specification (I'm boring). A 32-bit float has this bit format:
seee eeee emmm mmmm mmmm mmmm mmmm mmmm
s is the sign bit, e are the exponent bits, and m are the mantissa bits. A NaN value is encoded as an exponent of all 1s, and the mantissa bits are not all 0 (which would be +/- infinity). This means that there are lots of different possible NaN values (having different s and m bit values).
On this, JLS §4.2.3 says:
IEEE 754 allows multiple distinct NaN values for each of its single and double floating-point formats. While each hardware architecture returns a particular bit pattern for NaN when a new NaN is generated, a programmer can also create NaNs with different bit patterns to encode, for example, retrospective diagnostic information.
The text in the JLS seems to imply that the result of, for example, 0.0/0.0 has a hardware-dependent bit pattern, and that depending on whether that expression was computed as a compile-time constant, the hardware it depends on might be the machine the Java program was compiled on or the machine it runs on. This all seems very flaky if true.
I ran the following test:
System.out.println(Integer.toHexString(Float.floatToRawIntBits(0.0f/0.0f)));
System.out.println(Integer.toHexString(Float.floatToRawIntBits(Float.NaN)));
System.out.println(Long.toHexString(Double.doubleToRawLongBits(0.0d/0.0d)));
System.out.println(Long.toHexString(Double.doubleToRawLongBits(Double.NaN)));
The output on my machine is:
7fc00000
7fc00000
7ff8000000000000
7ff8000000000000
The output shows nothing out of the expected. The exponent bits are all 1. The upper bit of the mantissa is also 1, which for NaNs apparently indicates a "quiet NaN" as opposed to a "signalling NaN" (https://en.wikipedia.org/wiki/NaN#Floating_point). The sign bit and the rest of the mantissa bits are 0. The output also shows that there was no difference between the NaNs generated on my machine and the constant NaNs from the Float and Double classes.
My question is, is that output guaranteed in Java, regardless of the CPU of the compiler or VM, or is it all genuinely unpredictable? The JLS is mysterious about this.
If that output is guaranteed for 0.0/0.0, are there any arithmetic ways of producing NaNs that do have other (possibly hardware-dependent?) bit patterns? (I know intBitsToFloat/longBitsToDouble can encode other NaNs, but I'd like to know if other values can occur from normal arithmetic.)
A followup point: I've noticed that Float.NaN and Double.NaN specify their exact bit pattern, but in the source (Float, Double) they are generated by 0.0/0.0. If the result of that division is really dependent on the hardware of the compiler, it seems like there is a flaw there in either the spec or the implementation.
This is what §2.3.2 of the JVM 7 spec has to say about it:
The elements of the double value set are exactly the values that can be represented
using the double floating-point format defined in the IEEE 754 standard, except
that there is only one NaN value (IEEE 754 specifies 2^53 - 2 distinct NaN values).
and §2.8.1:
The Java Virtual Machine has no signaling NaN value.
So technically there is only one NaN. But §4.2.3 of the JLS also says (right after your quote):
For the most part, the Java SE platform treats NaN values of a given type as though collapsed into a single canonical value, and hence this specification normally refers to an arbitrary NaN as though to a canonical value.
However, version 1.3 of the Java SE platform introduced methods enabling the programmer to distinguish between NaN values: the Float.floatToRawIntBits and Double.doubleToRawLongBits methods. The interested reader is referred to the specifications for the Float and Double classes for more information.
Which I take to mean exactly what you and CandiedOrange propose: It is dependent on the underlying processor, but Java treats them all the same.
But it gets better: Apparently, it is entirely possible that your NaN values are silently converted to different NaNs, as described in Double.longBitsToDouble():
Note that this method may not be able to return a double NaN with exactly same bit pattern as the long argument. IEEE 754 distinguishes between two kinds of NaNs, quiet NaNs and signaling NaNs. The differences between the two kinds of NaN are generally not visible in Java. Arithmetic operations on signaling NaNs turn them into quiet NaNs with a different, but often similar, bit pattern. However, on some processors merely copying a signaling NaN also performs that conversion. In particular, copying a signaling NaN to return it to the calling method may perform this conversion. So longBitsToDouble may not be able to return a double with a signaling NaN bit pattern. Consequently, for some long values, doubleToRawLongBits(longBitsToDouble(start)) may not equal start. Moreover, which particular bit patterns represent signaling NaNs is platform dependent; although all NaN bit patterns, quiet or signaling, must be in the NaN range identified above.
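You can probe this on your own machine. On common x86 JVMs the round trip below happens to preserve the pattern, but per the javadoc just quoted it is not guaranteed to; note that treating this particular pattern as a signalling NaN is an x86 assumption:

long start = 0x7ff0000000000001L; // quiet bit of the fraction is clear: signalling on x86
double d = Double.longBitsToDouble(start);
long end = Double.doubleToRawLongBits(d);
System.out.println(Long.toHexString(end)); // may or may not print 7ff0000000000001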
For reference, there is a table of the hardware-dependent NaNs here. In summary:
- x86:
quiet: Sign=0 Exp=0x7ff Frac=0x80000
signalling: Sign=0 Exp=0x7ff Frac=0x40000
- PA-RISC:
quiet: Sign=0 Exp=0x7ff Frac=0x40000
signalling: Sign=0 Exp=0x7ff Frac=0x80000
- Power:
quiet: Sign=0 Exp=0x7ff Frac=0x80000
signalling: Sign=0 Exp=0x7ff Frac=0x5555555500055555
- Alpha:
quiet: Sign=0 Exp=0 Frac=0xfff8000000000000
signalling: Sign=1 Exp=0x2aa Frac=0x7ff5555500055555
So, to verify this you would really need one of these processors and go try it out. Also any insights on how to interpret the longer values for the Power and Alpha architectures are welcome.
The way I read the JLS here, the exact bit value of a NaN depends on who/what made it, and since the JVM didn't make it, don't ask the JVM. You might as well ask what an "Error code 4" string means.
The hardware produces different bit patterns meant to represent different kinds of NaNs. Unfortunately, different kinds of hardware produce different bit patterns for the same kinds of NaNs. Fortunately, there is a standard pattern Java can use to at least tell that it is some kind of NaN.
It's like Java looked at the "Error code 4" string and said, "We don't know what 'code 4' means on your hardware, but there was the word 'error' in that string, so we think it's an error."
The JLS tries to give you a chance to figure it out on your own though:
"However, version 1.3 of the Java SE platform introduced methods enabling the programmer to distinguish between NaN values: the Float.floatToRawIntBits and Double.doubleToRawLongBits methods. The interested reader is referred to the specifications for the Float and Double classes for more information."
Which looks to me like a C++ reinterpret_cast. It's Java giving you a chance to analyze the NaN yourself in case you happen to know how its signal was encoded. If you want to track down the hardware specs so you can predict what different events should produce which NaN bit patterns you are free to do so but you are outside the uniformity the JVM was meant to give us. So expect it might change from hardware to hardware.
When testing whether a number is NaN, we check whether it's equal to itself, since NaN is the only value that isn't equal to itself. This isn't to say that the bits are different. Before comparing the bits, the JVM tests for the many bit patterns that say a value is a NaN. If either operand matches any of those patterns, it reports not-equal, even if the bits of the two operands really are the same (and even if they're different).
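That self-comparison is, in fact, how the library itself detects NaN; in OpenJDK, Double.isNaN(v) is implemented as just (v != v):

double nan = 0.0 / 0.0;
System.out.println(nan != nan);        // true: NaN is the only value unequal to itself
System.out.println(Double.isNaN(nan)); // true: the same test under the hood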
Back in 1964, when pressed to give an exact definition of pornography, U.S. Supreme Court Justice Stewart famously said, "I know it when I see it". I think of Java as doing the same thing with NaNs. Java can't tell you anything that a "signaling" NaN might be signaling, because it doesn't know how that signal was encoded. But it can look at the bits and tell it's a NaN of some kind, since that pattern follows one standard.
If you happen to be on hardware that encodes all NaNs with uniform bits, you'll never prove that Java is doing anything to make NaNs have uniform bits. Again, the way I read the JLS, they are outright saying you are on your own here.
I can see why this feels flaky. It is flaky. It's just not Java's fault. I'll lay odds that somewhere out there some enterprising hardware manufacturers came up with wonderfully expressive signaling-NaN bit patterns, but they failed to get them adopted as a standard widely enough for Java to be able to count on them. That's what's flaky. We have all these bits reserved for signaling what kind of NaN we have, and we can't use them because we don't agree on what they mean. Having Java beat NaNs into a uniform value after the hardware makes them would only destroy that information and harm performance, and the only payoff would be to not seem flaky. Given the situation, I'm glad they realized they could cheat their way out of the problem and define NaN as not being equal to anything.
Here is a program demonstrating different NaN bit patterns:
public class Test {
    public static void main(String[] args) {
        double myNaN = Double.longBitsToDouble(0x7ff1234512345678L);
        System.out.println(Double.isNaN(myNaN));
        System.out.println(Long.toHexString(Double.doubleToRawLongBits(myNaN)));

        final double zeroDivNaN = 0.0 / 0.0;
        System.out.println(Double.isNaN(zeroDivNaN));
        System.out.println(Long.toHexString(Double.doubleToRawLongBits(zeroDivNaN)));
    }
}
output:
true
7ff1234512345678
true
7ff8000000000000
Regardless of what the hardware does, the program can itself create NaNs that may not be the same as e.g. 0.0/0.0, and may have some meaning in the program.
The only other NaN value that I could generate with normal arithmetic operations so far is the same but with the sign changed:
public static void main(String[] args) {
    Double tentative1 = 0d / 0d;
    Double tentative2 = Math.sqrt(-1d);
    System.out.println(tentative1);
    System.out.println(tentative2);
    System.out.println(Long.toHexString(Double.doubleToRawLongBits(tentative1)));
    System.out.println(Long.toHexString(Double.doubleToRawLongBits(tentative2)));
    System.out.println(tentative1 == tentative2);
    System.out.println(tentative1.equals(tentative2));
}
Output:
NaN
NaN
7ff8000000000000
fff8000000000000
false
true
I understand that the conversion in strictfp mode is used for portability, not for accuracy, as noted in this question. However, The Java Language Specification, Java SE 8 Edition says that
A widening primitive conversion from float to double that is not strictfp may lose information about the overall magnitude of the converted value.
which sounds to me as though a widening primitive conversion that is strictfp is intended for accuracy. Furthermore, I suspect that double can represent literally all the values that float can take, in which case I see no reason why a conversion from float to double is an issue here.
EDIT:
The wording "... may lose information about ..." in the spec gave me the feeling that the conversion in non-strictfp mode lacks some kind of accuracy compared to the one in strictfp mode. That did not make sense to me, because the conversion in non-strictfp mode possibly makes use of intermediate values of higher precision. This question was first written based on that understanding, and might not read the way you expected.
Intel's IA-64 architecture uses a fixed floating-point register format: one sign bit, 17 exponent bits, and 64 significand bits. When a floating-point number is stored from one of those registers into a 32- or 64-bit variable, it has to be converted.
Java aims for consistent results, so it is undesirable for the value of an expression to change depending on whether an intermediate result was held in a register or as an in-memory float or double.
Originally, Java simply insisted on all calculations being done as though all intermediate results were stored. That turned out to give poor performance due to the difficulty of forcing the exponent into the right range on each calculation. The solution was to give the programmer the choice between a fully consistent strictfp mode, and a more relaxed mode in which an exponent could go outside the range for the expression type without the value being forced to a zero or infinity.
Suppose, in relaxed mode, an in-register float has an exponent outside the double exponent range, and is being converted to an in-memory double. That conversion will force the value to a zero or infinity, losing its magnitude. That is an exception to the general rule that widening arithmetic conversions preserve the overall magnitude.
If the same calculation were done in strictfp mode, the float would not have been allowed to have an exponent outside the float exponent range. Whatever calculation generated it would have forced the value to a zero or infinity. Every float value is exactly representable in double, so the conversion does not change the value at all, let alone losing the overall magnitude.
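That last claim is mechanically checkable. The brute-force sketch below pushes every one of the 2^32 float bit patterns through the widening conversion and never finds a mismatch (it takes a while to run):

public class FloatToDoubleExact {
    public static void main(String[] args) {
        for (long bits = 0; bits <= 0xFFFFFFFFL; bits++) {
            float f = Float.intBitsToFloat((int) bits);
            if (Float.isNaN(f)) continue;  // NaN != NaN, so skip those patterns
            double d = f;                  // the widening conversion under test
            if ((float) d != f) {
                System.out.println("Mismatch at 0x" + Long.toHexString(bits));
                return;
            }
        }
        System.out.println("Every float value round-trips exactly.");
    }
}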
I currently need an epsilon of type double (constants from Java's libraries are preferred over my own implementations/definitions).
As far as I can see Double has MIN_VALUE and MAX_VALUE as static members.
Why is there no EPSILON?
What would an epsilon<double> be?
Are there any differences from std::numeric_limits<double>::epsilon()?
Epsilon: The difference between 1 and the smallest value greater than 1 that is representable for the data type.
I'm presuming you mean epsilon in the sense of the error in the value. I.e this.
If so then in Java it's referred to as ULP (unit in last place). You can find it by using the java.lang.Math package and the Math.ulp() method. See javadocs here.
The value isn't stored as a static member because it will be different depending on the double you are concerned with.
EDIT: By the OP's definition of epsilon now in the question, the ULP of a double of value 1.0 is 2.220446049250313E-16 expressed as a double. (I.e. the return value of Math.ulp(1.0).)
By the edit of the question, explaining what is meant by EPSILON, the question is now clear, but it might be good to point out the following:
I believe that the original question was triggered by the fact that in C there is a constant DBL_EPSILON, defined in the standard header file float.h, which captures what the question refers to. The same standard header file contains definitions of constants DBL_MIN and DBL_MAX, which clearly correspond to Double.MIN_VALUE and Double.MAX_VALUE, respectively, in Java. Therefore it would be natural to assume that Java, by analogy, should also contain a definition of something like Double.EPSILON with the same meaning as DBL_EPSILON in C. Strangely, however, it does not. Even more strangely, C# does contain a definition double.EPSILON, but it has a different meaning, namely the one that is covered in C by the constant DBL_MIN and in Java by Double.MIN_VALUE. Certainly a situation that can lead to some confusion, as it makes the term EPSILON ambiguous.
Without using the Math package:
Double.longBitsToDouble(971L << 52)
That's 2^-52 (971 = 1023 (the double exponent bias) - 52; the shift by 52 is because the mantissa occupies the low 52 bits).
It's a little quicker than Math.ulp(1.0);
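A quick sanity check that the bit trick and Math.ulp(1.0) agree:

double eps = Double.longBitsToDouble(971L << 52);
System.out.println(eps);                  // 2.220446049250313E-16
System.out.println(eps == Math.ulp(1.0)); // true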
Also, if you need this to compare double values, there's a really helpful article: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
double: The double data type is a double-precision 64-bit IEEE 754 floating point. Its range of values is beyond the scope of this discussion, but is specified in the Floating-Point Types, Formats, and Values section of the Java Language Specification. For decimal values, this data type is generally the default choice. As mentioned above, this data type should never be used for precise values, such as currency.
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
Looking up IEEE 754 you'll find the precision and the machine epsilon...
http://en.wikipedia.org/wiki/IEEE_floating_point
binary64:
Base(b)=2
precision(p)=53
machineEpsilon (e), as half the ULP of 1: (b^-(p-1))/2 = 2^-53 ≈ 1.11e-16
machineEpsilon (e), as the ULP of 1: b^-(p-1) = 2^-52 ≈ 2.22e-16
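If you'd rather derive the value at runtime than hard-code it, the classic halving loop arrives at the same number:

double eps = 1.0;
while (1.0 + eps / 2 > 1.0) {
    eps /= 2;  // stop once adding half of eps no longer changes 1.0
}
System.out.println(eps); // 2.220446049250313E-16, i.e. 2^-52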
Marking a class as strictfp means that any method code in the class will conform to the IEEE 754 standard rules for floating points.
What does this mean? I really don't get it.
Some processors have capabilities for slightly more accurate arithmetic - e.g. 80 bits instead of 64 bits. Likewise on some processors it may be faster to use double arithmetic even where logically float arithmetic is used. So for example:
float foo(float x, float y)
{
    x = x * 1.2345f;
    y = y * 2.3456f;
    return x * y;
}
Here the intermediate operations could potentially be optimized to use 80-bit arithmetic throughout, only falling back to a 32-bit value when it's returned... even though the operation described in source code for each multiplication is theoretically a 32-bit multiplication.
When you use strictfp, you turn off those potential optimizations. It's rarely necessary, but it means that you're guaranteed that the exact same set of arithmetic operations - when given the exact same set of inputs - will give the exact same set of results regardless of implementation.
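For reference, usage is just a modifier on a class or method declaration; the class below is an illustrative sketch, not a real API. (Note that as of Java 17, JEP 306 made strict semantics the default everywhere, so the keyword is now redundant.)

public strictfp class StableMath {
    // Every float/double operation in this class uses strict IEEE 754
    // semantics: no extended-precision intermediates are permitted.
    public double scaledProduct(float x, float y) {
        return (x * 1.2345f) * (y * 2.3456f);
    }
}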
It's all about precision.
IEEE 754 is a technical standard for floating-point computation (from wiki).
If you don't use strictfp, the JVM implementation is free to use extra precision where available.
For some applications, a programmer might need every platform to have precisely the same floating-point behaviour, even on platforms that could handle greater precision. However, if that exact reproducibility is not necessary, the VM may by default use extra precision for intermediate results.
http://en.wikipedia.org/wiki/Strictfp
Why the inconsistency?
There is no inconsistency: the methods are simply designed to follow different specifications.
long round(double a)
Returns the closest long to the argument.
double floor(double a)
Returns the largest (closest to positive infinity) double value that is less than or equal to the argument and is equal to a mathematical integer.
Compare with double ceil(double a)
double rint(double a)
Returns the double value that is closest in value to the argument and is equal to a mathematical integer
So by design round rounds to a long and rint rounds to a double. This has always been the case since JDK 1.0.
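A side-by-side run makes both differences (return type and tie handling) visible:

System.out.println(Math.round(2.5));  // 3    (long; ties round toward positive infinity)
System.out.println(Math.round(-2.5)); // -2
System.out.println(Math.rint(2.5));   // 2.0  (double; ties round to the even integer)
System.out.println(Math.rint(3.5));   // 4.0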
Other methods were added in JDK 1.2 (e.g. toRadians, toDegrees); others were added in 1.5 (e.g. log10, ulp, signum, etc), and yet some more were added in 1.6 (e.g. copySign, getExponent, nextUp, etc) (look for the Since: metadata in the documentation); but round and rint have always had each other the way they are now since the beginning.
Arguably, perhaps instead of long round and double rint, it'd be more "consistent" to name them double round and long rlong, but that is debatable. That said, if you insist on categorically calling this an "inconsistency", then the reason may be as unsatisfying as "because it's inevitable".
Here's a quote from Effective Java 2nd Edition, Item 40: Design method signatures carefully:
When in doubt, look to the Java library APIs for guidance. While there are plenty of inconsistencies -- inevitable, given the size and scope of these libraries -- there is also a fair amount of consensus.
Distantly related questions
Why does int num = Integer.getInteger("123") throw NullPointerException?
Most awkward/misleading method in Java Base API ?
Most Astonishing Violation of the Principle of Least Astonishment
floor would have been chosen to match the standard c routine in math.h (rint, mentioned in another answer, is also present in that library, and returns a double, as in java).
but round was not a standard function in c at that time (it's not mentioned in C89 - c identifiers and standards; c99 does define round and it returns a double, as you would expect). it's normal for language designers to "borrow" ideas, so maybe it comes from some other language? fortran 77 doesn't have a function of that name and i am not sure what else would have been used back then as a reference. perhaps vb - that does have Round but, unfortunately for this theory, it returns a double (php too). interestingly, perl deliberately avoids defining round.
[update: hmmm. looks like smalltalk returns integers. i don't know enough about smalltalk to know if that is correct and/or general, and the method is called rounded, but it might be the source. smalltalk did influence java in some ways (although more conceptually than in details).]
if it's not smalltalk, then we're left with the hypothesis that someone simply chose poorly (given the implicit conversions possible in java it seems to me that returning a double would have been more useful, since then it can be used both while converting types and when doing floating point calculations).
in other words: functions common to java and c tend to be consistent with the c library standard at the time; the rest seem to be arbitrary, but this particular wrinkle may have come from smalltalk.
I agree that it is odd that Math.round(double) returns long. If large double values are cast to long (which is what Math.round implicitly does), Long.MAX_VALUE is returned. An alternative is using Math.rint() in order to avoid that. However, Math.rint() has a somewhat strange rounding behavior: ties are settled by rounding to the even integer (i.e. 4.5 is rounded down to 4.0, but 5.5 is rounded up to 6.0). Another alternative is to use Math.floor(x + 0.5). But be aware that 1.5 is rounded to 2 while -1.5 is rounded to -1, not -2. Yet another alternative is to use Math.round, but only if the number is in the range between Long.MIN_VALUE and Long.MAX_VALUE. Double-precision floating-point values outside this range are integers anyhow.
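In particular, the asymmetry of the floor-based trick around negative ties is easy to demonstrate:

System.out.println(Math.floor(1.5 + 0.5));  // 2.0
System.out.println(Math.floor(-1.5 + 0.5)); // -1.0, not -2.0
System.out.println(Math.rint(4.5));         // 4.0 (ties to even)
System.out.println(Math.rint(5.5));         // 6.0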
Unfortunately, why Math.round() returns long is unknown. Somebody made that decision, and he probably never gave an interview to tell us why. My guess is that Math.round was designed to provide a better way (i.e., with rounding) of converting doubles to longs.
Like everyone else here I also don't know the answer, but thought someone might find this useful. I noticed that if you want to round a double to an int without casting, you can use the two round implementations long round(double) and int round(float) together:
double d = something;
int i = Math.round(Math.round(d));
This works because the inner call returns a long, which widens to float, so the outer call resolves to the int round(float) overload. Note that the long-to-float widening can lose precision, so this is only sensible when d is of modest magnitude anyway.