Guava's UnsignedLong: Why does it XOR Long.MIN_VALUE

I was reading Unsigned arithmetic in Java which nicely explained how to do unsigned longs using the following method
public static boolean isLessThanUnsigned(long n1, long n2) {
return (n1 < n2) ^ ((n1 < 0) != (n2 < 0));
}
However I'm confused by Guava's implementation. I'm hoping someone can shed some light on it.
/**
 * A (self-inverse) bijection which converts the ordering on unsigned longs to the ordering on
 * longs, that is, {@code a <= b} as unsigned longs if and only if {@code flip(a) <= flip(b)} as
 * signed longs.
 */
private static long flip(long a) {
  return a ^ Long.MIN_VALUE;
}
/**
 * Compares the two specified {@code long} values, treating them as unsigned values between
 * {@code 0} and {@code 2^64 - 1} inclusive.
 *
 * @param a the first unsigned {@code long} to compare
 * @param b the second unsigned {@code long} to compare
 * @return a negative value if {@code a} is less than {@code b}; a positive value if {@code a} is
 *     greater than {@code b}; or zero if they are equal
 */
public static int compare(long a, long b) {
  return Longs.compare(flip(a), flip(b));
}

Perhaps some diagrams help. I'll use 8 bit numbers to keep the constants short, it generalizes to ints and longs in the obvious way.
Absolute view:
Unsigned number line:
                [ 0 .. 0x7F ][ 0x80 .. 0xFF]
Signed number line:
[ 0x80 .. 0xFF ][ 0 .. 0x7F]
Relative view:
Unsigned number line:
[ 0 .. 0x7F ][ 0x80 .. 0xFF]
Signed number line:
[ 0x80 .. 0xFF ][ 0 .. 0x7F]
So signed and unsigned numbers largely have the same relative order, except that the two ranges with the sign bit set and the sign bit not set are swapped in order. Inverting that bit of course swaps the order.
x ^ Long.MIN_VALUE inverts the sign bit for a long.
This trick is applicable for any operation that depends only on the relative order, for example comparisons and directly related operations such as min and max. It does not work for operations that depend on the absolute magnitude of the numbers, such as division.
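The ordering argument above can be checked directly. The following is a minimal sketch, assuming Java 8 or later so that the JDK's Long.compareUnsigned is available as a reference; the class name FlipDemo and the helper compareUnsigned are illustrative names, not Guava's API.

```java
class FlipDemo {
    // Same idea as Guava's private flip(): invert only the sign bit.
    static long flip(long a) {
        return a ^ Long.MIN_VALUE;
    }

    // Unsigned comparison expressed as a signed comparison of the flipped values.
    static int compareUnsigned(long a, long b) {
        return Long.compare(flip(a), flip(b));
    }

    public static void main(String[] args) {
        long small = 1L;   // unsigned value 1
        long large = -1L;  // bit pattern 0xFFFF...FF, unsigned value 2^64 - 1
        // As signed longs, 1 is greater than -1.
        System.out.println(Long.compare(small, large) > 0);
        // As unsigned longs, 1 is less than 2^64 - 1.
        System.out.println(compareUnsigned(small, large) < 0);
        // The sign of the result agrees with the JDK's own unsigned comparison.
        System.out.println(Integer.signum(compareUnsigned(small, large))
                == Integer.signum(Long.compareUnsigned(small, large)));
    }
}
```

All three lines print true. Since flip is its own inverse, applying it twice returns the original value.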

Consider the bits that make up a long type. Performing ^ Long.MIN_VALUE converts a regular two's complement signed representation that holds [-2^63, 2^63-1] values into an unsigned representation that holds [0, 2^64-1] values.
You can see the process by taking the smallest long value, adding one and "flipping" while inspecting the bits (e.g. with Long.toBinaryString()):
Long.MIN_VALUE ^ Long.MIN_VALUE is 00..00 (all 64 bits unset)
(Long.MIN_VALUE + 1) ^ Long.MIN_VALUE is 00..01
(Long.MIN_VALUE + 2) ^ Long.MIN_VALUE is 00..10
(Long.MIN_VALUE + 3) ^ Long.MIN_VALUE is 00..11
and so on until:
Long.MAX_VALUE ^ Long.MIN_VALUE is 11..11 (all 64 bits set)
The "flip" is done because Longs.compare() compares its arguments as signed longs, whereas the unsigned compare() must order them as unsigned [0, 2^64-1] values, as per the method javadoc in your example:
/**
 * Compares the two specified {@code long} values, treating them as unsigned values between
 * {@code 0} and {@code 2^64 - 1} inclusive.
*

Related

Make a binary addition behave like a (packed-) decimal addition

I'm currently working on a restrictive environment where the only types allowed are :
byte, byte[], short, short[].
I am almost certain that I can't import external libraries, since I'm working on a JavaCard and have already tried such things, which didn't turn out well.
So, here I have to manage a byte array with a size of 6 bytes, which represents the balance of the card (in euros); the last byte holds the cents, but this is not important now.
Given that I don't have access to integers, I don't know how I can add two bytes in the way I want.
Let's have an example :
User puts in (Add) 0x00 0x00 0x00 0x00 0x00 0x57, which, to the user, means add 57 cents. Let's now say that the balance is 0x00 ... 0x26.
I want to be able to create a method that could modify the balance array (with carries), in a way that after adding, the cents are 83, and represented 0x83.
I have to handle subtractions as well, but I guess I can figure that out for myself afterwards.
My first guess was mask out each digit from each byte, and work separately at first, but that got me nowhere.
I'm obviously not asking for a full solution, because I believe my problem is almost impossible, but if you have any thoughts on how to approach this, I'd be very grateful.
So how can I add two arrays containing binary coded decimals to each other on Java Card?
EDIT 1: A common array would look like this :
{ 0x00 , 0x00 , 0x01, 0x52, 0x45, 0x52}
and would represent 15 254€ and 52 cents in a big-endian BCD encoded integer.
EDIT 2 : Well, as I suspected, my card doesn't support the package framework.math, so I can't use BCDUtil or BigNumbers, which would've been useful.

The below implementation goes through the BCD byte-by-byte and digit by digit. This allows it to use 8 bit registers that are efficient on most smart card processors. It explicitly allows for the carry to be handled correctly and returns the carry in case of overflow.
/**
 * Adds two values to each other and stores the result in the location of the first value.
 * The values are represented by big endian, packed BCD encoding with a static size.
 * No validation is performed to check that the arrays do indeed contain packed BCD;
 * the result of the calculation is indeterminate if the arrays contain anything other than packed BCD.
 * This calculation should be constant time;
 * it should only leak information about the values if one of the basic byte calculations leaks timing information.
 *
 * @param x the first buffer containing the packed BCD
 * @param xOff the offset in the first buffer of the packed BCD
 * @param y the second buffer containing the packed BCD
 * @param yOff the offset in the second buffer of the packed BCD
 * @param packedBytes the number of bytes that contain two BCD digits in both buffers
 * @return zero or one depending on whether the full calculation generates a carry, i.e. overflows
 * @throws ArrayIndexOutOfBoundsException if a packed BCD value is out of bounds
 */
public static byte addPackedBCD(byte[] x, short xOff, byte[] y, short yOff, short packedBytes) {
    // declare temporary variables, we'll handle bytes only
    byte xd, yd, zd, z;
    // set the initial carry to zero, c will only be 0 or 1
    byte c = 0;
    // go through the bytes backwards (least significant bytes first)
    // as we need to take the carry into account
    for (short i = (short) (packedBytes - 1); i >= 0; i--) {
        // retrieve the least significant digit of the current byte in each array
        xd = (byte) (x[xOff + i] & 0b00001111);
        yd = (byte) (y[yOff + i] & 0b00001111);
        // zd is the sum of the two lower BCD digits plus the carry
        zd = (byte) (xd + yd + c);
        // c is set to 1 if the sum is 10 or larger, otherwise c is set to zero
        // i.e. the value is at least 16 or the value is at least 8 + 4 or 8 + 2
        c = (byte) (((zd & 0b10000) >> 4)
                | (((zd & 0b01000) >> 3)
                        & (((zd & 0b00100) >> 2) | ((zd & 0b00010) >> 1))));
        // subtract 10 if there is a carry and then assign the value to z
        z = (byte) (zd - c * 10);
        // retrieve the most significant digit of the current byte in each array
        xd = (byte) ((x[xOff + i] >>> 4) & 0b00001111);
        yd = (byte) ((y[yOff + i] >>> 4) & 0b00001111);
        // zd is the sum of the two higher BCD digits plus the carry
        zd = (byte) (xd + yd + c);
        // c is set to 1 if the sum is 10 or larger, otherwise c is set to zero
        // i.e. the value is at least 16 or the value is at least 8 + 4 or 8 + 2
        c = (byte) (((zd & 0b10000) >> 4)
                | (((zd & 0b01000) >> 3)
                        & (((zd & 0b00100) >> 2) | ((zd & 0b00010) >> 1))));
        // subtract 10 if there is a carry and then assign the value to the 4 most significant bits of z
        z |= (zd - c * 10) << 4;
        // assign z to the first byte array
        x[xOff + i] = z;
    }
    // finally, return the last carry
    return c;
}
Note that I have only tested this for two arrays containing a single byte / two BCD digits. However, the carry works and as all 65536 combinations have been tested the approach must be valid.
To top it off, you may want to test the correctness of the packed BCD encoding before performing any operation. The same approach could be integrated into the for loop of the addition for higher efficiency. Tested against all single byte values as in the previous block of code.
/**
 * Checks if the buffer contains a valid packed BCD representation.
 * The values are represented by packed BCD encoding with a static size.
 * This calculation should be constant time;
 * it should only leak information about the values if one of the basic byte calculations leaks timing information.
 *
 * @param x the buffer containing the packed BCD
 * @param xOff the offset in the buffer of the packed BCD
 * @param packedBytes the number of bytes of packed BCD in the buffer
 * @return true if and only if the value is valid, packed BCD
 * @throws ArrayIndexOutOfBoundsException if the packed BCD value is out of bounds
 */
public static boolean validPackedBCD(byte[] x, short xOff, short packedBytes) {
    // declare temporary variable, we'll handle bytes only
    byte xdd;
    // c is the correctness of the digits; it will be non-zero if invalid encoding is encountered
    byte c = 0;
    short end = (short) (xOff + packedBytes);
    // go through the bytes, reusing xOff for efficiency
    for (; xOff < end; xOff++) {
        xdd = x[xOff];
        // c will be set to non-zero if the high bit (8) of an encoded digit is set ...
        // and either of the next two bits (4 or 2) of that digit is set, as that would indicate a value of 10 or higher
        // i.e. only values 8 + 4 or 8 + 2 are 10 or higher if you look at the bits in the digits
        c |= ((xdd & 0b1000_1000) >> 2) & (((xdd & 0b0100_0100) >> 1) | (xdd & 0b0010_0010));
    }
    // finally, return the result - c is zero in case all bytes encode two packed BCD values
    return c == 0;
}
Note that this one is also implemented in BCDUtil in Java Card. I do however dislike that class design and I don't think it is that well documented, so I decided for a different tack on it. It's also in javacardx which means that it could theoretically throw an exception if not implemented.
The answer of EJP isn't applicable, other than to indicate that the used encoding is that of packed BCD. The addition that Jones proposes is fast, but it doesn't show how to handle the carry between the 32 bit words:
Note that the most significant digit of the sum will exceed 9 if there should have been a carry out of that position. Furthermore, there is no easy way to detect this carry!
This is of course required for Java Card, as it only has 16-bit signed shorts as its base integer type. For that reason the method that Jones proposes is not directly applicable; any answer that utilizes the approach of Jones should indicate how to handle the carry between the bytes or shorts used in Java Card.

This is not really hex, it is packed decimal, one of the forms of BCD.
You can do packed-decimal addition and subtraction a byte at a time, with internal carry. There is a trick of adding 6 to force a carry into the MS digit, if necessary, and then masking and shifting it out again if it carried, to correct the LS digit. It's too broad to explain here.
See Jones on BCD arithmetic which shows how to efficiently use bit operands on 32 bit words to implement packed decimal arithmetic.
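The add-6 correction mentioned above can be sketched one digit at a time. This is only an illustration under stated assumptions: both arrays have the same length and contain valid big-endian packed BCD, the class and method names (BcdAddSketch, addPackedBcd) are hypothetical, and the branches mean it is not constant time. It is not the word-parallel method from Jones.

```java
class BcdAddSketch {
    // Adds packed BCD y into x (big endian, equal length); returns the final carry (0 or 1).
    static byte addPackedBcd(byte[] x, byte[] y) {
        short carry = 0;
        // walk from the least significant byte to the most significant byte
        for (short i = (short) (x.length - 1); i >= 0; i--) {
            short a = (short) (x[i] & 0xFF);
            short b = (short) (y[i] & 0xFF);
            // low digits plus incoming carry; a sum above 9 is not a valid BCD digit
            short lo = (short) ((a & 0x0F) + (b & 0x0F) + carry);
            if (lo > 9) {
                lo += 6; // adding 6 pushes the excess out as a carry into bit 4
            }
            // high digits plus the carry out of the low digit
            short hi = (short) ((a >> 4) + (b >> 4) + (lo >> 4));
            if (hi > 9) {
                hi += 6;
            }
            carry = (short) (hi >> 4);
            x[i] = (byte) (((hi & 0x0F) << 4) | (lo & 0x0F));
        }
        return (byte) carry;
    }

    public static void main(String[] args) {
        byte[] balance = {0x00, 0x00, 0x00, 0x00, 0x00, 0x26};
        byte[] amount = {0x00, 0x00, 0x00, 0x00, 0x00, 0x57};
        byte carry = addPackedBcd(balance, amount);
        // the last byte now holds 0x83, i.e. 83 cents, and no carry is left over
        System.out.println(Integer.toHexString(balance[5] & 0xFF) + " carry=" + carry);
    }
}
```

With the balance 0x26 and the amount 0x57 from the question, the low digits give 6 + 7 = 13, the correction turns that into 0x13 (digit 3, carry 1), and the high digits give 2 + 5 + 1 = 8, so the byte becomes 0x83.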

How to create x-bit binary number with number of leftmost bits set

I would like to create an x-bit binary number and specify the number of leftmost bits set.
I.e. to create an 8-bit number with the 6 leftmost bits all set, as in 1111 1100
Similarly, to create a 16-bit number with the 8 leftmost bits all set, resulting in 1111 1111 0000 0000
Need to be able to do this for large numbers (128 bits). Is there an existing way to do this using the core libs?
Thanks

Consider using a BitSet, like this:
import java.util.BitSet;
/**
 * Creates a new BitSet of the specified length
 * with the {@code len} leftmost bits set to {@code true}.
 *
 * @param totalBits The length of the resulting {@link BitSet}.
 * @param len The number of leftmost bits to set.
 * @return A BitSet in which bits 0 to len - 1 are set (index 0 is treated as the leftmost bit).
 * @throws IllegalArgumentException If {@code len > totalBits} or if any of the arguments is negative
 */
public static BitSet leftmostBits(int totalBits, int len)
{
    if (len > totalBits)
        throw new IllegalArgumentException("len must be smaller than or equal to totalBits");
    if (len < 0 || totalBits < 0)
        throw new IllegalArgumentException("len and totalBits must both be non-negative");
    BitSet bitSet = new BitSet(totalBits);
    bitSet.set(0, len);
    return bitSet;
}
Here are some unit tests
Then, you can use that BitSet using its public API (here Java 8 is shown):
BitSet has been designed for this (precise bit manipulation), and it provides you with an arbitrary length as well (does not limit you to 64 bits, like long for example would).
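One way to read such a set back is sketched here, treating index 0 as the leftmost bit, which is the convention set(0, len) relies on. The names LeftmostBitsDemo, toBitString and leftmostBitsValue are hypothetical, and the BigInteger variant is an alternative that is not part of the answer above, for cases where a numeric value is needed.

```java
import java.math.BigInteger;
import java.util.BitSet;

class LeftmostBitsDemo {
    // Renders a BitSet as text, treating index 0 as the leftmost bit.
    static String toBitString(BitSet bits, int totalBits) {
        StringBuilder sb = new StringBuilder(totalBits);
        for (int i = 0; i < totalBits; i++) {
            sb.append(bits.get(i) ? '1' : '0');
        }
        return sb.toString();
    }

    // Alternative: a non-negative BigInteger whose top len bits (out of totalBits) are set.
    static BigInteger leftmostBitsValue(int totalBits, int len) {
        BigInteger ones = BigInteger.ONE.shiftLeft(len).subtract(BigInteger.ONE);
        return ones.shiftLeft(totalBits - len);
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet(8);
        bits.set(0, 6);
        System.out.println(toBitString(bits, 8));                  // 11111100
        System.out.println(leftmostBitsValue(16, 8).toString(2));  // 1111111100000000
    }
}
```

Both helpers work for 128 bits as well, since neither BitSet nor BigInteger is limited to the width of a long.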

You can use two loops. One for all the 1's and another for all the 0's.
Or using Java 8 you can do
IntStream.range(0, ones).forEach(i -> System.out.print(1));
IntStream.range(ones, bits).forEach(i -> System.out.print(0));

Bitwise left shift behaviour

Today I was learning about the left shift bit operator (<<). As I understand it, the left shift operator moves bits to the left by the specified number of positions, and I also know that each shift multiplies the value by 2. But I am confused: what exactly is the meaning of "shifting bits", and why does the output differ when the value is assigned to a different type?
When I call the function below, it gives output as System.out.println("b="+b); //Output: 0
And my question is: how does b become 0 and why is b typecasted?
public void leftshiftDemo()
{
byte a=64,b;
int i;
i=a << 2;
b=(byte)(a<<2);
System.out.println("i="+i); //Output: 256 i.e 64*2^2
System.out.println("b="+b); //Output: 0 how & why b is typecasted
}
Update (new doubt):
what does it mean "If you shift a 1 bit into high-order position (Bit 31 or 63), the value will become negative". eg.
public void leftshifHighOrder()
{
int i;
int num=0xFFFFFFE;
for(i=0;i<4;i++)
{
num=num<<1;
System.out.println(num);
/*
* Output:
* 536870908
* 1073741816
* 2147483632
* -32 //how this is -ve?
*/
}
}

When integers are cast to bytes in Java, only the lowest order bits are kept:
A narrowing conversion of a signed integer to an integral type T
simply discards all but the n lowest order bits, where n is the number
of bits used to represent type T. In addition to a possible loss of
information about the magnitude of the numeric value, this may cause
the sign of the resulting value to differ from the sign of the input
value.
In this case the byte 64 has the following binary representation:
01000000
The shift operator promotes the value to int:
00000000000000000000000001000000
then left shifts it by 2 bits:
00000000000000000000000100000000
We then cast it back into a byte, so we discard all but the last 8 bits:
00000000
Thus the final byte value is 0. However, your integer keeps all the bits, so its final value is indeed 256.
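The promote, shift and narrow steps above can be reproduced with a short sketch. The class and method names (ShiftCastDemo, shiftAsInt, shiftAsByte) are illustrative only.

```java
class ShiftCastDemo {
    // The byte operand is promoted to int before the shift, so no bits are lost here.
    static int shiftAsInt(byte a, int n) {
        return a << n;
    }

    // The cast keeps only the lowest 8 bits of the int result.
    static byte shiftAsByte(byte a, int n) {
        return (byte) (a << n);
    }

    public static void main(String[] args) {
        byte a = 64;
        System.out.println(Integer.toBinaryString(shiftAsInt(a, 2))); // 100000000
        System.out.println(shiftAsInt(a, 2));                         // 256
        System.out.println(shiftAsByte(a, 2));                        // 0
        // Shifting by one instead moves the 1 into bit 7, the sign bit of a byte.
        System.out.println(shiftAsByte(a, 1));                        // -128
    }
}
```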

In Java, ints are signed, and two's complement is used to represent them. In this representation, any number that has its high-order bit (bit 31 for an int) set to 1 is negative by definition.
Therefore, when a left shift moves a 1 from bit 30 (the one just below the sign bit) into bit 31, the value becomes negative.
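A small sketch using the last two values from the question's loop shows the sign bit being set by the final shift. The helper name signBitSet is hypothetical.

```java
class SignBitDemo {
    // True when bit 31, the sign bit of an int, is set.
    static boolean signBitSet(int v) {
        return (v & 0x80000000) != 0;
    }

    public static void main(String[] args) {
        int num = 2147483632;          // 0x7FFFFFF0, bit 30 is set and bit 31 is clear
        int shifted = num << 1;        // 0xFFFFFFE0, the 1 from bit 30 lands in bit 31
        System.out.println(signBitSet(num));     // false
        System.out.println(signBitSet(shifted)); // true
        System.out.println(shifted);             // -32
    }
}
```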

i = a << 2;
in memory:
load a (8 bits) into register R1 (32 bits)
shift register R1 to the left by two positions
assign register R1 (32 bits) to variable i (32 bits).
b = (byte)(a << 2);
in memory:
load a (8 bits) into register R1 (32 bits)
shift register R1 to the left by two positions
assign register R1 (32 bits) to variable b (8 bits). <- this is why the cast (byte) is necessary and why b gets only the last 8 bits of the shift operation

The exact meaning of shifting bits is exactly what it sounds like. :-) You shift them to the left.
0011 = 3
0011 << 1 = 0110
0110 = 6

You should read about different data types and their ranges in Java.
Let me explain in easy terms.
byte a=64,b;
int i;
i=a << 2;
b=(byte)(a<<2);
'byte' in Java is a signed two's complement integer. It can store values from -128 to 127, both inclusive. When you do this,
i = a << 2;
you are left shifting 'a' by 2 bits and the value is supposed to be 64*2*2 = 256. 'i' is of type 'int' and 'int' in Java can represent that value.
When you again left shift and typecast,
b=(byte)(a<<2);
you keep your lower 8 bits and hence the value is 0.
You can read this for different primitive types in Java.
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html

C/C++ equivalent to Java's doubleToRawLongBits()

In Java Double.doubleToLongBits() is useful for implementing hashCode() methods.
I'm trying to do the same in C++ and write my own doubleToRawLongBits() method, as after trawling through Google I can't find a suitable implementation.
I can get the signif and exponent from std::frexp(numbr,&exp) and can determine the sign but can't figure out the use of the bitwise operators to get the Java equivalent.
For example, Java's Double.doubleToLongBits() returns the following for the double 3.94:
4616054510065937285
Thanks for any help.
Graham
Below is the documentation copied and pasted from Double.doubleToRawLongBits()
===Java Double.doubleToRawLongBits() description===
/**
* Returns a representation of the specified floating-point value
* according to the IEEE 754 floating-point "double
* format" bit layout, preserving Not-a-Number (NaN) values.
* <p>
* Bit 63 (the bit that is selected by the mask
* <code>0x8000000000000000L</code>) represents the sign of the
* floating-point number. Bits
* 62-52 (the bits that are selected by the mask
* <code>0x7ff0000000000000L</code>) represent the exponent. Bits 51-0
* (the bits that are selected by the mask
* <code>0x000fffffffffffffL</code>) represent the significand
* (sometimes called the mantissa) of the floating-point number.
* <p>
* If the argument is positive infinity, the result is
* <code>0x7ff0000000000000L</code>.
* <p>
* If the argument is negative infinity, the result is
* <code>0xfff0000000000000L</code>.
* <p>
* If the argument is NaN, the result is the <code>long</code>
* integer representing the actual NaN value. Unlike the
* <code>doubleToLongBits</code> method,
* <code>doubleToRawLongBits</code> does not collapse all the bit
* patterns encoding a NaN to a single "canonical" NaN
* value.
* <p>
* In all cases, the result is a <code>long</code> integer that,
* when given to the {@link #longBitsToDouble(long)} method, will
* produce a floating-point value the same as the argument to
* <code>doubleToRawLongBits</code>.
*
* @param value a <code>double</code> precision floating-point number.
* @return the bits that represent the floating-point number.
* @since 1.3
*/
public static native long doubleToRawLongBits(double value);
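The three masks named in the javadoc can be applied on the Java side to see which fields a C++ port would have to reproduce. This is a minimal sketch; the class DoubleBitsDemo and its helper names are illustrative and not part of the JDK.

```java
class DoubleBitsDemo {
    static final long SIGN_MASK = 0x8000000000000000L;
    static final long EXPONENT_MASK = 0x7ff0000000000000L;
    static final long SIGNIFICAND_MASK = 0x000fffffffffffffL;

    // Bit 63: 0 for positive values, 1 for negative values.
    static int sign(double d) {
        return (int) ((Double.doubleToRawLongBits(d) & SIGN_MASK) >>> 63);
    }

    // Bits 62-52: the exponent as stored, i.e. still carrying the bias of 1023.
    static int biasedExponent(double d) {
        return (int) ((Double.doubleToRawLongBits(d) & EXPONENT_MASK) >>> 52);
    }

    // Bits 51-0: the stored fraction bits of the significand.
    static long significand(double d) {
        return Double.doubleToRawLongBits(d) & SIGNIFICAND_MASK;
    }

    public static void main(String[] args) {
        double d = 3.94;
        System.out.println(Double.doubleToRawLongBits(d)); // 4616054510065937285
        System.out.println(sign(d));                       // 0
        System.out.println(biasedExponent(d));             // 1024
        System.out.println(Long.toHexString(significand(d)));
    }
}
```

For 3.94 the stored exponent is 1024, which is 1 after subtracting the bias, matching 3.94 = 1.97 * 2^1.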

A simple cast will do:
double d = 0.5;
const unsigned char * buf = reinterpret_cast<const unsigned char *>(&d);
for (unsigned int i = 0; i != sizeof(double); ++i)
std::printf("The byte at position %u is 0x%02X.\n", i, buf[i]);
Where the sign bit and exponent bits are depends on your platform and the endianness. If your floats are IEEE 754, if the sign and exponent are at the front (big-endian byte order) and if CHAR_BIT == 8, you can try this:
const bool sign = buf[0] & 0x80;
const int exponent = ((buf[0] & 0x7F) << 4) + (buf[1] >> 4) - 1023;
(In C, say (const unsigned char *)(&d) for the cast.)
Update: To create an integer with the same bits, you have to make the integer first and then copy:
unsigned long long int u;
unsigned char * pu = reinterpret_cast<unsigned char *>(&u);
std::copy(buf, buf + sizeof(double), pu);
For this you have to bear several things in mind: the size of the integer has to be sufficient (a static assertion for sizeof(double) <= sizeof(unsigned long long int) should do the trick), and if the integer is in fact larger, then you're only copying into parts of it. I'm sure you'll figure that out, though :-) (You could use some template magic to create an integer of the correct size, if you really wanted.)

#include <stdint.h>
#include <string.h>
static inline uint64_t doubleToRawBits(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

I like unions for these kinds of things.
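union double_and_buffer {
    double d;
    unsigned char byte_buff[ sizeof(double) ];
} dab;
dab.d = 1.0;
for ( size_t i = 0; i < sizeof(dab.byte_buff); ++i )
{
    // cast to int so each byte prints as a hex number rather than as a character
    cout << hex << static_cast<int>(dab.byte_buff[ i ]);
}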
I think it makes it more clear what you're doing and lets the compiler do all the math.

How are integers cast to bytes in Java?

I know Java doesn't allow unsigned types, so I was wondering how it casts an integer to a byte. Say I have an integer a with a value of 255 and I cast the integer to a byte. Is the value represented in the byte 11111111? In other words, is the value treated more as a signed 8 bit integer, or does it just directly copy the last 8 bits of the integer?

This is called a narrowing primitive conversion. According to the spec:
A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
So it's the second option you listed (directly copying the last 8 bits).
I am unsure from your question whether or not you are aware of how signed integral values are represented, so just to be safe I'll point out that the byte value 1111 1111 is equal to -1 in the two's complement system (which Java uses).
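A short sketch makes both points concrete: the cast copies the lowest 8 bits, and those bits are then read as a signed value. The names ByteCastDemo, narrow and unsignedValue are illustrative only.

```java
class ByteCastDemo {
    // Narrowing conversion: keeps only the lowest 8 bits of the int.
    static byte narrow(int value) {
        return (byte) value;
    }

    // Reads the same 8 bits back as an unsigned value in the range 0..255.
    static int unsignedValue(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = narrow(255);                                        // bit pattern 1111 1111
        System.out.println(b);                                       // -1
        System.out.println(Integer.toBinaryString(unsignedValue(b))); // 11111111
        System.out.println(unsignedValue(b));                        // 255
        System.out.println(narrow(0xff00));                          // 0
    }
}
```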

int i = 255;
byte b = (byte)i;
So the value of b in hex is 0xFF but the decimal value will be -1.
int i = 0xff00;
byte b = (byte)i;
The value of b is now 0x00. This shows that Java keeps only the last byte of the integer, i.e. the lowest 8 bits, and interprets them as a signed value.

or does it just directly copy the last
8 bits of the integer
yes, this is the way this casting works

The following fragment casts an int to a byte. If the integer’s value is larger than the range of a byte, it will be reduced modulo (the remainder of an integer division by the) byte’s range.
int a;
byte b;
// …
b = (byte) a;

Just a thought on what is said: Always mask your integer when converting to bytes with 0xFF (for ints). (Assuming myInt was assigned values from 0 to 255).
e.g.
char myByte = (char)(myInt & 0xFF);
Why? If myInt is bigger than 127, just typecasting to byte returns a negative value (2's complement), which you don't want.

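A byte is 8 bits, and 8 bits can represent 256 numbers (2 raised to the power 8 = 256).
The first bit is used for the sign (0 if positive, 1 if negative), so the range is -128 to 127.
Let's say you want to convert the integer 1099 to a byte. Just divide 1099 by 256; the remainder is the byte representation of the int. If the remainder falls outside -128 to 127, add or subtract 256 to bring it back into range (for example, 200 becomes -56).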
examples
1099/256 => remainder= 75
-1099/256 =>remainder=-75
2049/256 => remainder= 1
reason why? look at this image http://i.stack.imgur.com/FYwqr.png

According to my understanding, you meant
Integer i=new Integer(2);
byte b=i; //will not work
final int i=2;
byte b=i; //fine
At last
Byte b=new Byte((byte) 2); // the constructor takes a byte, so the int literal needs a cast
int a=b; //fine

for (int i=0; i <= 255; i++) {
byte b = (byte) i; // cast int values 0 to 255 to corresponding byte values
int neg = b; // neg will take on values 0..127, -128, -127, ..., -1
int pos = (int) (b & 0xFF); // pos will take on values 0..255
}
The conversion of a byte that contains a value bigger than 127 (i.e., values 0x80 through 0xFF) to an int results in sign extension of the high-order bit of the byte value (i.e., bit 0x80). To remove the 'extra' one bits, use x & 0xFF; this forces bits higher than 0x80 (i.e., bits 0x100, 0x200, 0x400, ...) to zero but leaves the lower 8 bits as is.
You can also write these; they are all equivalent:
int pos = ((int) b) & 0xFF; // convert b to int first, then strip high bits
int pos = b & 0xFF; // done as int arithmetic -- the cast is not needed
Java automatically 'promotes' integer types whose size (in # of bits) is smaller than int to an int value when doing arithmetic. This is done to provide a more deterministic result (than say C, which is less constrained in its specification).
You may want to have a look at this question on casting a 'short'.
