How to read a binary 32-bit fixed-point number in Java

I need to read Adobe's signed 32-bit fixed-point number, with 8 bits for the integer part followed by 24 bits for the fractional part. This is a "path point", as defined in Adobe Photoshop File Formats Specification.
This is how I'd do it in Ruby, but I need to do it in Java.
read(1).unpack('c*')[0].to_f +
(read(3).unpack('B*')[0].to_i(2).to_f / (2 ** 24)).to_f

Based on some discussion in the comments of David Wallace's answer, there is a significantly faster way to compute the correct double.
Probably the very fastest way to convert it is like this:
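double intFixedPoint24ToDouble(int bits) {
    // The shift must be parenthesized: '/' binds tighter than '<<',
    // and a double cannot be used as a shift operand.
    return ((double) bits) / (1 << 24);
}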
This is faster because of the way double-precision floating point arithmetic works. In this case, the above sequence can be converted to some extremely simple additions and bit shifting. When this gets run, the actual steps it takes look like this:
Convert an int (bits) to a double (done on FPU, usually). This is quite fast.
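Subtract 0x01800000 from the upper 32 bits of that result, which lowers the binary exponent by 24 (this does not apply to zero, which has no exponent to adjust). This is extremely fast.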
A very similar optimization can be applied whenever you multiply or divide any floating point number by any compile-time constant integer that is a power of two.
This compiler optimization does not apply if you are instead dividing by a double, or if you divide by a non-compile-time-constant expression (any expression involving anything other than final compile-time-constant variables, literal numbers, or operators). In that case, it must be performed as a double-precision floating-point division, which is probably the slowest single operation, except for block transfers and advanced mathematical functions.
However, as you can see, 1<<24 is a compile-time constant power of two, and so the optimization does apply in this case.
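The exponent adjustment described in the steps above can be checked directly in Java. The following is only a sketch for illustration: the class and method names are made up, and it does not show what any particular JIT compiler actually emits. It compares a plain division by 2^24 with manually subtracting 24 from the exponent field of the IEEE 754 bit pattern.

```java
// Illustrative sketch; ExponentShiftDemo and its methods are hypothetical names.
class ExponentShiftDemo {

    // Plain version: divide by the constant 2^24. The result is exact for every int.
    static double divide(int bits) {
        return bits / 16777216.0; // 16777216 == 1 << 24
    }

    // Same value obtained by subtracting 24 from the exponent field,
    // which occupies bits 52..62 of the 64-bit pattern.
    // Zero has an all-zero exponent field, so it is handled separately.
    static double shiftExponent(int bits) {
        if (bits == 0) {
            return 0.0;
        }
        long raw = Double.doubleToLongBits((double) bits);
        return Double.longBitsToDouble(raw - (24L << 52));
    }

    public static void main(String[] args) {
        int[] samples = {0x01800000, -1, 1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int s : samples) {
            System.out.println(s + " -> " + divide(s) + " vs " + shiftExponent(s));
        }
    }
}
```

Both methods should agree for every int, because dividing a normal double by a power of two only changes its exponent.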

Read your bytes into an int (which is always 32 bits in Java), then use this. You need double, not float, because single precision floating point won't necessarily be long enough to hold your 32 bit fixed point number.
double toFixedPoint(int bytes) {
    // bytes holds the raw 32-bit value; dividing by 2^24 moves the binary point
    return bytes / Math.pow(2, 24);
}
If speed is a concern, then work out Math.pow(2,24) outside of this method and store it.
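For the "read your bytes into an int" step, here is a minimal sketch. It assumes the four bytes are stored big-endian (most significant byte first), which is also ByteBuffer's default byte order; check that against the file format specification. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

// Illustrative sketch; PathPointReader is a hypothetical name.
class PathPointReader {

    // 2^24, computed once and stored as suggested above.
    private static final double SCALE = 1 << 24;

    // Interprets a signed 8.24 fixed-point value held in an int.
    static double toDouble(int bits) {
        return bits / SCALE;
    }

    // Reads the next four bytes from the buffer as a big-endian int (assumption)
    // and converts them.
    static double read(ByteBuffer buffer) {
        return toDouble(buffer.getInt());
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0x01, (byte) 0x80, 0x00, 0x00}; // integer part 1, fraction 0.5
        System.out.println(read(ByteBuffer.wrap(raw)));
    }
}
```

When reading from a stream instead of a buffer, DataInputStream.readInt() also reads a big-endian int and could replace the ByteBuffer call.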

Related

Get nth digit after decimal point

What would be the most efficient way to grab the, say, 207th decimal place of a number? Would it just be x * Math.pow(10,207) % 10?
How's this for python
int((x*(10**n)))%10
What you want is impossible.
The only things in java that work with Math.pow and basic operators are the primitives. The only floating point primitives are float and double. These are IEEE754 floating point numbers; doubles are 64 bits and floats are 32 bits.
A simple principle applies: If you have 64 bits, then you can only represent 2^64 different numbers (it's actually a little less). So, of all the numbers in existence, at most 18446744073709551616 actually exist as far as the computer is concerned for doubles. All other numbers do not exist.
So what happens if a mathematical operation (say, 0.1 + 0.2) ends up being a number that doesn't exist? Well, java (this is predicated by the IEEE754 standard; most languages and chips do it this way) will return you the nearest number amongst all the 18446744073709551616 numbers that do exist.
The problem with wanting the 207th digit is that obviously, given that only 18446744073709551616 numbers exist, none of those 18446744073709551616 numbers have that kind of precision. Asking for the 207th digit is therefore completely random. It says nothing about the input number... whatsoever.
Let me repeat that: There are no double values that have a significant 207th digit AT ALL.
If you want 'perfect' representation, with no rounding whatsoever, you want BigDecimal, but note that demanding perfection is tricky. Imagine in basic decimal math (computers are binary, but let's stick with decimal as we're all much more familiar with it, what with our 10 fingers and all), I ask you to only give me perfect answers, and then I ask you to divide 1 by 3.
BigDecimal won't let you do that either: operations whose result cannot be represented exactly throw an exception unless you tell BigDecimal in what ways it is allowed to be imprecise.
If you've set it all up exactly how you wanted it, and you really have a BigDecimal with a 207th digit after the comma, you can use the scale feature, or just the power-of-10 feature to get what you want.
Note BigDecimal is not primitive and therefore does not support the +, %, etc operators.
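As a rough sketch of the BigDecimal approach described above: shift the decimal point right by n places, drop the fraction, and take the last digit. The class and method names are hypothetical, and the 210-digit MathContext is just an arbitrary precision chosen so that 1/3 has a 207th digit at all.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

// Illustrative sketch; NthDigit is a hypothetical name.
class NthDigit {

    // Returns the n-th digit after the decimal point (n >= 1), ignoring the sign.
    static int digitAfterPoint(BigDecimal x, int n) {
        return x.abs()
                .movePointRight(n)      // the wanted digit becomes the units digit
                .toBigInteger()         // discard everything after the point
                .mod(BigInteger.TEN)    // keep only the units digit
                .intValue();
    }

    public static void main(String[] args) {
        System.out.println(digitAfterPoint(new BigDecimal("2.987654321"), 4));

        // 1/3 has no exact decimal form, so a precision must be supplied.
        BigDecimal third = BigDecimal.ONE.divide(BigDecimal.valueOf(3), new MathContext(210));
        System.out.println(digitAfterPoint(third, 207));
    }
}
```

If the value has fewer than n digits after the point, the method returns 0.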
***Special Note: There is no: "This will handle all situations" answer here as the arbitrary value such as 207 could take the calculations way outside the bounds of possible precision of the variable types involved. My answer as such will only work within the bounds of variable type precision for which 207 is really not possible...
To get the specific digit an arbitrary number (like 207) of places after the decimal point... if you just multiply by factor of 10.. and then take mod 10, the answer (in java) is still a floating point type... not a single digit...
To get a specific digit an arbitrary number (n) of places after the decimal point, without converting to string:
Math.floor(x*Math.pow(10,n)) % 10;
to get 4th digit after 2.987654321
x*Math.pow(10, 4) = 29876.54321
Math.floor(29876.54321) = 29876
29876 % 10 = 6

How accurate is "double-precision floating-point format"?

Let's say, using java, I type
double number;
If I need to use very big or very small values, how accurate can they be?
I tried to read how doubles and floats work, but I don't really get it.
For my term project in intro to programming, I might need to use different numbers with big ranges of value (many orders of magnitude).
Let's say I create a while loop,
while (number[i-1] - number[i] > ERROR) {
//does stuff
}
Does the limitation of ERROR depend on the size of number[i]? If so, how can I determine how small can ERROR be in order to quit the loop?
I know my teacher explained it at some point, but I can't seem to find it in my notes.
Does the limitation of ERROR depend on the size of number[i]?
Yes.
If so, how can I determine how small can ERROR be in order to quit the loop?
You can get the "next largest" double using Math.nextUp (or the "next smallest" using Math.nextDown), e.g.
double nextLargest = Math.nextUp(number[i-1]);
double difference = nextLargest - number[i-1];
As Radiodef points out, you can also get the difference directly using Math.ulp:
double difference = Math.ulp(number[i-1]);
(but I don't think there's an equivalent method for "next smallest")
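To make the dependence on magnitude concrete, here is a small sketch that prints Math.ulp at two different scales and uses it for a tolerance that grows with the operands. The helper name withinUlps and the choice of 4 ulps are assumptions made for illustration, not a recommendation for every calculation.

```java
// Illustrative sketch; UlpDemo and withinUlps are hypothetical names.
class UlpDemo {

    // True if a and b differ by at most 'ulps' units in the last place,
    // measured at the magnitude of the larger operand.
    static boolean withinUlps(double a, double b, int ulps) {
        double scale = Math.max(Math.abs(a), Math.abs(b));
        return Math.abs(a - b) <= ulps * Math.ulp(scale);
    }

    public static void main(String[] args) {
        System.out.println(Math.ulp(1.0));   // spacing between doubles near 1
        System.out.println(Math.ulp(1e10));  // much coarser spacing near 10^10
        System.out.println(withinUlps(0.1 + 0.2, 0.3, 4));
        System.out.println(withinUlps(1.0, 1.0 + 1e-9, 4));
    }
}
```

An ERROR smaller than Math.ulp of the values being compared can never be reached, so a loop waiting for it may not terminate.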
If you don't tell us what you want to use it for, then we cannot answer anything more than what is standard knowledge: a double in Java has about 16 significant digits (that's digits of the decimal numbering system), and the smallest possible positive value is 4.9 x 10^-324. That's in all likelihood far higher precision than you will need.
The epsilon value (what you call "ERROR") in your question varies depending on your calculations, so there is no standard answer for it, but if you are using doubles for simple stuff as opposed to highly demanding scientific stuff, just use something like 1 x 10^-9 and you will be fine.
Both the float and double primitive types are limited in terms of the amount of data they can store. However, if you want to know the maximum values of the two types, then run the code below with your favourite IDE.
System.out.println(Float.MAX_VALUE);
System.out.println(Double.MAX_VALUE);
double data type is a double-precision 64-bit IEEE 754 floating point (digits of precision could be between 15 to 17 decimal digits).
float data type is a single-precision 32-bit IEEE 754 floating point (digits of precision could be between 6 to 9 decimal digits).
After running the code above, if you're not satisfied with their ranges then I would recommend using BigDecimal, as this type doesn't have a fixed limit (rather, your RAM is the limit).

Wrong Output Dollar Amount To Coins [duplicate]

double r = 11.631;
double theta = 21.4;
In the debugger, these are shown as 11.631000000000000 and 21.399999618530273.
How can I avoid this?
These accuracy problems are due to the internal representation of floating point numbers and there's not much you can do to avoid it.
By the way, printing these values at run-time often still leads to the correct results, at least using modern C++ compilers. For most operations, this isn't much of an issue.
I liked Joel's explanation, which deals with a similar binary floating point precision issue in Excel 2007:
See how there's a lot of 0110 0110 0110 there at the end? That's because 0.1 has no exact representation in binary... it's a repeating binary number. It's sort of like how 1/3 has no representation in decimal. 1/3 is 0.33333333 and you have to keep writing 3's forever. If you lose patience, you get something inexact.
So you can imagine how, in decimal, if you tried to do 3*1/3, and you didn't have time to write 3's forever, the result you would get would be 0.99999999, not 1, and people would get angry with you for being wrong.
If you have a value like:
double theta = 21.4;
And you want to do:
if (theta == 21.4)
{
}
You have to be a bit clever, you will need to check if the value of theta is really close to 21.4, but not necessarily that value.
if (fabs(theta - 21.4) <= 1e-6)
{
}
This is partly platform-specific - and we don't know what platform you're using.
It's also partly a case of knowing what you actually want to see. The debugger is showing you - to some extent, anyway - the precise value stored in your variable. In my article on binary floating point numbers in .NET, there's a C# class which lets you see the absolutely exact number stored in a double. The online version isn't working at the moment - I'll try to put one up on another site.
Given that the debugger sees the "actual" value, it's got to make a judgement call about what to display - it could show you the value rounded to a few decimal places, or a more precise value. Some debuggers do a better job than others at reading developers' minds, but it's a fundamental problem with binary floating point numbers.
Use the fixed-point decimal type if you want stability at the limits of precision. There are overheads, and you must explicitly cast if you wish to convert to floating point. If you do convert to floating point you will reintroduce the instabilities that seem to bother you.
Alternately you can get over it and learn to work with the limited precision of floating point arithmetic. For example you can use rounding to make values converge, or you can use epsilon comparisons to describe a tolerance. "Epsilon" is a constant you set up that defines a tolerance. For example, you may choose to regard two values as being equal if they are within 0.0001 of each other.
It occurs to me that you could use operator overloading to make epsilon comparisons transparent. That would be very cool.
For mantissa-exponent representations EPSILON must be computed to remain within the representable precision. For a number N, Epsilon = N / 10E+14
System.Double.Epsilon is the smallest representable positive value for the Double type. It is too small for our purpose. Read Microsoft's advice on equality testing
I've come across this before (on my blog) - I think the surprise tends to be that the 'irrational' numbers are different.
By 'irrational' here I'm just referring to the fact that they can't be accurately represented in this format. Real irrational numbers (like π - pi) can't be accurately represented at all.
Most people are familiar with 1/3 not working in decimal: 0.3333333333333...
The odd thing is that 1.1 doesn't work in floats. People expect decimal values to work in floating point numbers because of how they think of them:
1.1 is 11 x 10^-1
When actually they're in base-2
1.1 is 154811237190861 x 2^-47
You can't avoid it, you just have to get used to the fact that some floats are 'irrational', in the same way that 1/3 is.
One way you can avoid this is to use a library that uses an alternative method of representing decimal numbers, such as BCD
If you are using Java and you need accuracy, use the BigDecimal class for floating point calculations. It is slower but safer.
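A minimal sketch of the difference, assuming the values are available as decimal strings (constructing a BigDecimal from a double would carry the binary rounding error along). The class and method names are hypothetical.

```java
import java.math.BigDecimal;

// Illustrative sketch; DecimalSum is a hypothetical name.
class DecimalSum {

    // Adds two decimal literals exactly.
    static BigDecimal sum(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b));
    }

    public static void main(String[] args) {
        // Binary floating point: the sum is not exactly 0.3.
        System.out.println(0.1 + 0.2 == 0.3);

        // Decimal arithmetic: the sum compares equal to 0.3.
        System.out.println(sum("0.1", "0.2").compareTo(new BigDecimal("0.3")) == 0);
    }
}
```

compareTo is used instead of equals because BigDecimal.equals also compares the scale, so 0.30 and 0.3 would not be equal.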
Seems to me that 21.399999618530273 is the single precision (float) representation of 21.4. Looks like the debugger is casting down from double to float somewhere.
You can't avoid this as you're using floating point numbers with a fixed number of bytes. There's simply no isomorphism possible between the real numbers and their limited notation.
But most of the time you can simply ignore it. 21.4==21.4 would still be true because it is still the same numbers with the same error. But 21.4f==21.4 may not be true because the error for float and double are different.
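The float versus double comparison above can be tried out in Java, where the f suffix has the same meaning. This is only a sketch with a hypothetical class name; it prints the comparisons rather than asserting a particular debugger display.

```java
// Illustrative sketch; FloatVsDouble is a hypothetical name.
class FloatVsDouble {

    // The float is widened to double before the comparison,
    // so its (larger) rounding error becomes visible.
    static boolean sameAsDouble(float f, double d) {
        return f == d;
    }

    public static void main(String[] args) {
        System.out.println(21.4 == 21.4);              // same value, same error
        System.out.println(sameAsDouble(21.4f, 21.4)); // different errors
        System.out.println((double) 21.4f);            // the float's value, shown as a double
    }
}
```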
If you need fixed precision, perhaps you should try fixed point numbers. Or even integers. I for example often use int(1000*x) for passing to debug pager.
Dangers of computer arithmetic
If it bothers you, you can customize the way some values are displayed during debug. Use it with care :-)
Enhancing Debugging with the Debugger Display Attributes
Refer to General Decimal Arithmetic
Also take note when comparing floats, see this answer for more information.
According to the Java Language Specification
"If at least one of the operands to a numerical operator is of type double, then the
operation is carried out using 64-bit floating-point arithmetic, and the result of the
numerical operator is a value of type double. If the other operand is not a double, it is
first widened (§5.1.5) to type double by numeric promotion (§5.6)."
Here is the Source
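A short sketch of what the quoted promotion rule means in practice; the class and method names are hypothetical.

```java
// Illustrative sketch; PromotionDemo is a hypothetical name.
class PromotionDemo {

    // Both operands are int, so this is integer division.
    static int intHalf(int n) {
        return n / 2;
    }

    // One operand is a double, so n is widened to double first.
    static double doubleHalf(int n) {
        return n / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(intHalf(7));
        System.out.println(doubleHalf(7));
    }
}
```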

can someone please explain why a double is called a double in Java? [duplicate]

I'm extremely new to Java and just wanted to confirm what Double is? Is it similar to Float or Int? Any help would be appreciated. I also sometimes see the uppercase Double and other times the lower case double. If someone could clarify what this means that'd be great!
Double is a wrapper class,
The Double class wraps a value of the primitive type double in an object. An object of type Double contains a single field whose type is double.
In addition, this class provides several methods for converting a double to a String and a String to a double, as well as other constants and methods useful when dealing with a double.
The double data type,
The double data type is a double-precision 64-bit IEEE 754 floating point. Its range of values is 4.94065645841246544e-324d to 1.79769313486231570e+308d (positive or negative). For decimal values, this data type is generally the default choice. As mentioned above, this data type should never be used for precise values, such as currency.
Check each datatype with their ranges : Java's Primitive Data Types.
Important Note: If you're thinking of using double for precise values, you need to re-think before using it. Java Traps: double
In a comment on @paxdiablo's answer, you asked:
"So basically, is it better to use Double than Float?"
That is a complicated question. I will deal with it in two parts
Deciding between double versus float
On the one hand, a double occupies 8 bytes versus 4 bytes for a float. If you have many of them, this may be significant, though it may also have no impact. (Consider the case where the values are in fields or local variables on a 64bit machine, and the JVM aligns them on 64 bit boundaries.) Additionally, floating point arithmetic with double values is typically slower than with float values ... though once again this is hardware dependent.
On the other hand, a double can represent larger (and smaller) numbers than a float and can represent them with more than twice the precision. For the details, refer to Wikipedia.
The tricky question is knowing whether you actually need the extra range and precision of a double. In some cases it is obvious that you need it. In others it is not so obvious. For instance if you are doing calculations such as inverting a matrix or calculating a standard deviation, the extra precision may be critical. On the other hand, in some cases not even double is going to give you enough precision. (And beware of the trap of expecting float and double to give you an exact representation. They won't and they can't!)
There is a branch of mathematics called Numerical Analysis that deals with the effects of rounding error, etc in practical numerical calculations. It used to be a standard part of computer science courses ... back in the 1970's.
Deciding between Double versus Float
For the Double versus Float case, the issues of precision and range are the same as for double versus float, but the relative performance measures will be slightly different.
A Double (on a 32 bit machine) typically takes 16 bytes + 4 bytes for the reference, compared with 12 + 4 bytes for a Float. Compare this to 8 bytes versus 4 bytes for the double versus float case. So the ratio is 5 to 4 versus 2 to 1.
Arithmetic involving Double and Float typically involves dereferencing the pointer and creating a new object to hold the result (depending on the circumstances). These extra overheads also affect the ratios in favor of the Double case.
Correctness
Having said all that, the most important thing is correctness, and this typically means getting the most accurate answer. And even if accuracy is not critical, it is usually not wrong to be "too accurate". So, the simple "rule of thumb" is to use double in preference to float, UNLESS there is an overriding performance requirement, AND you have solid evidence that using float will make a difference with respect to that requirement.
A double is an IEEE754 double-precision floating point number, similar to a float but with a larger range and precision.
IEEE754 single precision numbers have 32 bits (1 sign, 8 exponent and 23 mantissa bits) while double precision numbers have 64 bits (1 sign, 11 exponent and 52 mantissa bits).
A Double in Java is the class version of the double basic type - you can use doubles but, if you want to do something with them that requires them to be an object (such as put them in a collection), you'll need to box them up in a Double object.
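A minimal sketch of the boxing described above: a collection can only hold Double objects, and the compiler converts between double and Double automatically. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; BoxingDemo is a hypothetical name.
class BoxingDemo {

    // Each Double element is unboxed to a primitive double in the loop.
    static double sum(List<Double> values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Double> values = new ArrayList<>(); // List<double> would not compile
        values.add(1.5);   // autoboxed to a Double
        values.add(2.25);
        System.out.println(sum(values));

        // The wrapper class also supplies conversion helpers.
        double parsed = Double.parseDouble("3.5");
        System.out.println(Double.toString(parsed));
    }
}
```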
