Understanding signed numbers and complements in Java

I have a 3 byte signed number that I need to determine the value of in Java. I believe it is signed with one's complement but I'm not 100% sure (I haven't studied this stuff in more than 10 years and the documentation of my problem isn't super clear). I think the problem I'm having is Java does everything in two's complement. I have a specific example to show:
The original 3-byte number: 0xEE1B17
Parsed as an integer (Integer.parseInt(s, 16)) this becomes: 15604503
If I do a simple bit flip (~) of this I get (I think) a two's complement representation: -15604504
But the value I should be getting is: -1172713
What I think is happening is I'm getting the two's complement of the entire int and not just the 3 bytes of the int, but I don't know how to fix this.
What I have been able to do is convert the integer to a binary string (Integer.toBinaryString()) and then manually "flip" all of the 0s to 1s and vice versa. When I then parse this string (Integer.parseInt(s, 2)) I get 1172712, which is very close. In all of the other examples I also need to add 1 to the result to get the answer.
Can anyone diagnose what type of signed number encoding is being used here and if there is a solution other than manually flipping every character of a string? I feel like there must be a much more elegant way to do this.
EDIT: All of the responders have helped in different ways, but my general question was how to flip a 3-byte number and #louis-wasserman answered this and answered first so I'm marking him as the solution. Thanks to everyone for the help!

If you want to flip the low three bytes of a Java int, then you just do ^ 0x00FFFFFF.
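As a minimal sketch of that one-liner, using the value from the question (the class and method names here are made up for illustration):

```java
class FlipLow24 {
    // Inverts only the low 24 bits; the top byte of the int is left untouched.
    static int flipLow24(int value) {
        return value ^ 0x00FFFFFF;
    }

    public static void main(String[] args) {
        int raw = Integer.parseInt("EE1B17", 16);
        System.out.println(raw);            // 15604503
        System.out.println(~raw);           // -15604504 (all 32 bits flipped)
        System.out.println(flipLow24(raw)); // 1172712 (only the low 24 bits flipped)
    }
}
```

Note that 1172712 + 1 = 1172713, the magnitude the question expects, which suggests the data is 24-bit two's complement rather than one's complement.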

0xFFEE1B17 is -1172713
You only need to add the leading byte: FF if the highest bit of the 3-byte value is set, and 00 otherwise.
A method which converts your 3-byte value to a proper int could look like this:
if (byte3val > 0x7FFFFF)
    return byte3val | 0xFF000000;
else
    return byte3val;

Negative signed numbers are defined so that a + (-a) = 0. So it means that all bits are flipped and then 1 added. See Two's complement. You can check that the condition is satisfied by this process by thinking what happens when you add a + ~a + 1.
You can recognize that a number is negative by its most significant bit. So if you need to convert a signed 3-byte number into a 4-byte number, you can do it by checking the bit and if it's set, set also the bits of the fourth byte:
if ((a & 0x800000) != 0)
a = a | 0xff000000;
You can also do it in a single expression, which may perform better, because there is no branching in the computation (branching doesn't play well with pipelining in current CPUs):
a = (a << 8) >> 8;
Here << and >> perform bit shifts. First we shift the number 8 bits to the left (so now it occupies the 3 "upper" bytes instead of the 3 "lower" ones), and then shift it back. The trick is that >> is a so-called arithmetic shift, also known as a signed shift: it copies the most significant bit into all bits that are made vacant by the operation. This is exactly what keeps the sign of the number. Indeed:
(0x1ffffe << 8) >> 8 -> 2097150
(0xfffffe << 8) >> 8 -> -2
Just note that Java also has an unsigned right shift operator, >>>. For more information, see Java Tutorial: Bitwise and Bit Shift Operators.
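The two sign-extension approaches described above can be compared side by side. This is a sketch only: the class and method names are hypothetical, and withBranch assumes the top byte of its argument is zero on input.

```java
class SignExtend24 {
    // Branching version: if bit 23 (the 24-bit sign bit) is set, fill the top byte with 1s.
    static int withBranch(int a) {
        return (a & 0x800000) != 0 ? (a | 0xFF000000) : a;
    }

    // Branch-free version: move bit 23 up to bit 31, then let the
    // arithmetic shift >> copy it back down into the top byte.
    static int withShifts(int a) {
        return (a << 8) >> 8;
    }

    public static void main(String[] args) {
        System.out.println(withBranch(0xEE1B17)); // -1172713
        System.out.println(withShifts(0xEE1B17)); // -1172713
        System.out.println(withShifts(0x1FFFFE)); // 2097150
        System.out.println(withShifts(0xFFFFFE)); // -2
    }
}
```

Both versions give -1172713 for the value in the question, which is the result the asker was expecting.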

Related

How do I convert a short to an int without turning it into a negative in java

I am working on a file reader and ran into a problem when trying to read a short. In short (pun intended), Java converts the two bytes I'm using to make the short into ints in order to do bitwise operations, and it converts them in a way that keeps the same value. I need to convert each byte into an int in a way that keeps its bits the same instead.
example of what's happening:
byte number = -1; //-1
int otherNumber = 1;
number | otherNumber; // -1
example of what I want:
byte number = -1; //-1
int otherNumber = 1;
number | otherNumber; // 255
This can be done pretty easily with some bit magic.
I'm sure you're aware that a short is 16 bits (2 bytes) and an int is 32 bits (4 bytes). So, between an integer and a short, there is a two-byte difference. Now, for positive numbers, copying the value of a short to an int is effectively copying the binary data, however, as you've pointed out, this is not the case for negative numbers.
Now let's look at how negative numbers are represented in binary. It's a bit confusing, so I'll try to keep it simple. Modern systems use what's called two's complement to store negative numbers. One consequence is that the very first bit in the set of bytes representing the number tells you whether or not it's negative. To negate a number, all of its bits are inverted and then 1 is added (which also means there is no negative 0). For example, 2 as a short would be represented as 0000 0000 0000 0010, while -2 would be represented as 1111 1111 1111 1110. When a negative short is widened to an int, its sign bit is copied into the new bits, so -2 in int form is the same but with 2 more bytes (16 bits) at the beginning that are all set to 1.
So, in order to combat this, all we need to do is change the extra 1s to 0s. This can be done by simply using the bitwise AND operator (&). This operator goes through each bit position and checks the bits of both operands. If they're both 1, the result bit is 1. If not, the result bit is 0.
Now, with this knowledge, all we need to do is create another integer (a mask) where the upper two bytes are all 0 and the lower two bytes are all 1. This is fairly simple to do using hexadecimal literals. Since they are ints by default, the unwritten upper bytes are already 0, and we simply need two bytes of 1s. With a single byte, if you were to set every bit to 1, the max value you can get is 255. 255 in hex is 0xFF, so 2 bytes would be 0xFFFF. Pretty simple, now you just need to apply it.
Here is an example that does exactly that:
short a = -2;
int b = a & 0xFFFF;
You could also use Short.toUnsignedInt(), but where's the fun in that? 😉
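For the file-reader case in the question, a sketch along these lines may help. The helper names are hypothetical, and the big-endian byte order in readUnsignedShort is an assumption, since the question does not say how the file is laid out.

```java
class UnsignedWiden {
    // Keeps the 8 bits of the byte and zeroes everything above them.
    static int unsignedByte(byte b) {
        return b & 0xFF;
    }

    // Keeps the 16 bits of the short and zeroes everything above them.
    static int unsignedShort(short s) {
        return s & 0xFFFF;
    }

    // Assumption: the high byte comes first in the file (big-endian).
    static int readUnsignedShort(byte hi, byte lo) {
        return ((hi & 0xFF) << 8) | (lo & 0xFF);
    }

    public static void main(String[] args) {
        byte number = -1;
        short a = -2;
        System.out.println(number | 1);                // -1 (sign-extended before the OR)
        System.out.println(unsignedByte(number) | 1);  // 255
        System.out.println(unsignedShort(a));          // 65534
        System.out.println(Short.toUnsignedInt(a));    // 65534
        System.out.println(Byte.toUnsignedInt(number)); // 255
    }
}
```

Masking each byte before shifting matters: without the & 0xFF, a negative low byte would be sign-extended and its 1s would overwrite the high byte in the OR.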

Bit manipulation in Java - 2s complement and flipping bits

I was recently looking into some problems with bit manipulation in Java and I came up with two questions.
1) Firstly, I came up to the problem of flipping all the bits in a number.
I found this solution:
public class Solution {
    public int flipAllBits(int num) {
        int mask = (1 << (int)Math.floor(Math.log(num)/Math.log(2))+1) - 1;
        return num ^ mask;
    }
}
But what happens when k = 32 bits? Can the 1 be shifted 33 times?
What I understand from the code (although it doesn't really make sense), the mask is 0111111.(31 1's)....1 and not 32 1's, as someone would expect. And therefore when num is a really large number this would fail.
2) Another question I had was determining when something is a bit sequence in 2s complement or just a normal bit sequence. For example I read that 1010 when flipped is 0110 which is -10 but also 6. Which one is it and how do we know?
Thanks.
1) The Math calls are not necessary. Flipping all the bits of any integral type in Java (or C) is not an arithmetic operation; it is a bitwise operation. XORing with -1 using the '^' operator works regardless of the size of int in C/C++, and for any integral type in Java. The tilde '~' operator is the other option. For int:
int i = 0xf0f0f0f0;
System.out.println(Integer.toHexString(i)); // f0f0f0f0
i ^= -1; // flip every bit
System.out.println(Integer.toHexString(i)); // f0f0f0f (the leading zero is not printed)
i = ~i; // flip every bit back
System.out.println(Integer.toHexString(i)); // f0f0f0f0
2) Since the entire range of integers maps to the entire range of integers in a 2's complement transform, it is not possible to detect whether a number is or is not 2's complement unless one knows the range of numbers from which the 2's complement might be calculated and the two sets (before and after) are mutually exclusive.
That mask computation is fairly inscrutable. I'm going to guess that it attempts to (since you mention it's wrong) make a mask up to and including the highest set bit. Whether that's useful for "flipping all bits" is another possible point of discussion, since to me at least, "all bits" means all 32 of them, not some number that depends on the value. But if that's what you want then that's what you want. Especially combined with that second question, that looks like a mistake to me, so you'd be implementing the wrong thing from the start - see near the bottom.
Anyway, the mask can be generated with some reasonably nice bitmath, which does not create any doubt about possible edge cases (eg Math.log(0) is probably bad, and k=32 corresponds with negative numbers which are also probably bad to put into a log):
int m = num | (num >> 16);
m |= m >> 8;
m |= m >> 4;
m |= m >> 2;
m |= m >> 1;
return num ^ m;
Note that this function has odd properties, it almost always returns an unsigned-lower number than went in, except at 0. It flips bits so the name is not completely wrong, but flipAllBits(flipAllBits(x)) != x (usually), while the name suggests it should be an involution.
As for the second question, there is nothing to determine. Two's complement is a scheme by which you can interpret a bitvector - any bitvector. So it's really a choice you make: to interpret a given bitvector that way or some other way. In Java the "default" interpretation is two's complement (e.g. toString will print an int by interpreting it according to its two's complement meaning), but you don't have to go along with it; you can (with care) treat an int as unsigned, or as an array of booleans, or as several bitfields packed together, etc.
If you wanted to invert all the bits but made the common mistake to assume that the number of bits in an int is variable (and that you therefore needed to compute a mask that covers "all bits"), I have some great news for you, because inverting all bits is a lot easier:
return ~num;
If you were reading "invert all bits" in the context of two's complement, it would have the above meaning, so all bits, including those left of the highest set bit.
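The two readings of "flip all bits" discussed above can be put next to each other. This is a sketch, not the code from either answer: the value-dependent mask here is built with Integer.highestOneBit from the standard library instead of the shift-and-OR smear, and the class and method names are made up.

```java
class FlipBits {
    // Reading 1: invert all 32 bits. Applying it twice returns the input.
    static int flipAll(int num) {
        return ~num;
    }

    // Reading 2: invert only the bits up to and including the highest set bit.
    static int flipUpToHighestSetBit(int num) {
        if (num == 0) {
            return 0; // no set bit, so the mask is empty
        }
        // For negative input highestOneBit is the sign bit; shifting it out
        // leaves 0, and 0 - 1 gives a mask of all ones.
        int mask = (Integer.highestOneBit(num) << 1) - 1;
        return num ^ mask;
    }

    public static void main(String[] args) {
        System.out.println(flipAll(10));                // -11
        System.out.println(flipAll(flipAll(10)));       // 10
        System.out.println(flipUpToHighestSetBit(10));  // 5  (1010 -> 0101)
        System.out.println(flipUpToHighestSetBit(5));   // 2  (101 -> 010), not 10
    }
}
```

The last two lines show why the second reading is not an involution: flipping 10 gives 5, but flipping 5 gives 2 because the mask shrank with the value.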

Why is -1 right shift 1 = -1 in Java?

I came across the question "Why is -1 zero fill right shift 1=2147483647 for integers in Java?"
I understood the concept of zero fill right shift perfectly well from the above question's answer. But when I tried to work out -1>>1, I got a result that I found difficult to understand.
-1 in binary form is as follows: 11111111111111111111111111111111
After flipping the bits, I got: 00000000000000000000000000000000
Upon adding 1 to it, I got: 00000000000000000000000000000001
Now shifting one position right: 00000000000000000000000000000000
After flipping the bits, I got: 11111111111111111111111111111111
Now adding 1 to it: 00000000000000000000000000000000
I don't understand how -1>>1 is -1 itself, then?
When you do a normal right-shift (i.e. using >>, also known as an arithmetic right shift, as opposed to >>>, which is a logical right shift), the number is sign extended.
How this works is as follows:
When we right-shift we get an empty spot in front of the number, like so:
11111111111111111111111111111111
?1111111111111111111111111111111(1) (right-shift it one place)
The last 1 is shifted out, and in comes the ?.
Now, how we fill in the ? is dependent on how we shift.
If we do a logical shift (i.e. >>>), we simply fill it with 0.
If we do an arithmetic shift (i.e. >>), we fill it with the first bit from the original number, i.e. the sign bit (since it's 1 if the number is negative, and 0 if not). This is called sign extension.
So, in this case, -1 >> 1 sign-extends the 1 into the ?, leaving the original -1.
Further reading:
Arithmetic shift on Wikipedia
Logical shift on Wikipedia
>> is a 'signed right shift' operator, so it will preserve the leftmost bit.
So as -1 (dec) is 0b11111111111111111111111111111111 (bin), right shifting it using >> will preserve the 1 and the result is again 0b11111111111111111111111111111111 (bin).
For unsigned right shift, the operator >>> can be used. It always fills up with zeroes (0). In your case, -1 >>> 1 results in 0b01111111111111111111111111111111, or 2147483647, which is the largest possible (positive) 32-bit two's complement value (and equal to Integer.MAX_VALUE).
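A short sketch of the difference between the two right shifts, with hypothetical wrapper methods so the operators can be compared on the same inputs:

```java
class RightShifts {
    // Arithmetic shift: vacated high bits are filled with copies of the sign bit.
    static int arithmetic(int x, int n) {
        return x >> n;
    }

    // Logical shift: vacated high bits are always filled with 0.
    static int logical(int x, int n) {
        return x >>> n;
    }

    public static void main(String[] args) {
        System.out.println(arithmetic(-1, 1)); // -1
        System.out.println(logical(-1, 1));    // 2147483647
        System.out.println(arithmetic(-8, 1)); // -4
        System.out.println(logical(-8, 1));    // 2147483644
        // toBinaryString drops leading zeros, so this prints 31 ones.
        System.out.println(Integer.toBinaryString(logical(-1, 1)));
    }
}
```

There is no need to flip bits and add 1 by hand before shifting, as the question attempts; the shift operates directly on the stored bit pattern.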

For bitwise operations in Java, do I need to care that most numeric types in Java are signed?

This is a really basic question, but I've never fully convinced myself that my intuitive answer of "it makes no difference" is correct, so maybe someone has a good way to understand this:
If all I want to do with one of the primitive numeric types in Java is bitwise arithmetic, can I simply treat it as if it was an unsigned value or do I need to avoid negative numbers, i.e. keep the highest order bit always set to 0? For example, can I use an int as if it was an unsigned 32-bit number, or should I only use the lowest 31 bits?
I'm looking for as general an answer as possible, but let me give an example: Let's say I want to store 32 flags. Can I store all of them in a single int, if I use something like
store = store & ~(1 << index) | (value << index)
to set flag index to value and something like
return (store & (1 << index)) != 0
to retrieve flag index? Or could I run into any sort of issues with this or similar code if I ever set the flag with index 31 to 1?
I know I need to always be using >>> instead of >> for right shifting, but is this the only concern? Or could there be other things going wrong related to the two's complement representation of negative numbers when I use the highest bit?
"I know I need to always be using >>> instead of >> for right shifting, but is this the only concern?"
Yes, this is the only concern. Shifting left works the same on signed and unsigned numbers; same goes for ANDing, ORing, and XORing. As long as you use >>> for shifting right, you can use all 32 bits of a signed int.
There are legitimate reasons to use >> as well in that context (a common case is when making a mask that should be 0 or -1 directly, without having to negate a mask that is 0 or 1), so there is really no concern at all. Just be careful of what you're doing to make sure it matches your intent.
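The 32-flag example from the question can be sketched as follows, to check that index 31 behaves like any other index. The class name and the boolean parameter are illustrative choices, not part of the question's code.

```java
class Flags32 {
    // Clears the bit at index, then ORs in the new value at that position.
    static int set(int store, int index, boolean value) {
        int bit = value ? 1 : 0;
        return (store & ~(1 << index)) | (bit << index);
    }

    // Tests against != 0; a "> 0" test would fail for index 31,
    // because a store with only bit 31 set is negative.
    static boolean get(int store, int index) {
        return (store & (1 << index)) != 0;
    }

    public static void main(String[] args) {
        int store = set(0, 31, true);
        System.out.println(store);            // -2147483648
        System.out.println(get(store, 31));   // true
        System.out.println(get(store, 30));   // false
        System.out.println(set(store, 31, false)); // 0
    }
}
```

The stored int prints as a negative number once flag 31 is set, but that is only how the bits are displayed; the flag operations themselves are unaffected.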
Operations that care about signedness (i.e. they have distinct signed and unsigned forms with different semantics) are:
right shift
division (no unsigned operator in Java; Integer.divideUnsigned since Java 8)
modulo (no unsigned operator in Java; Integer.remainderUnsigned since Java 8)
comparisons (except equality) (no unsigned operators in Java; Integer.compareUnsigned since Java 8)
Operations that don't care about signedness are:
and
or
xor
addition
subtraction
two's complement negation (-x means ~x + 1)
one's complement (~x means -x - 1)
left shift
multiplication
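To make the two lists concrete, here is a small sketch that runs a few of these operations on the same bit pattern. It relies on the unsigned helper methods that java.lang.Integer has offered since Java 8; the class name is made up.

```java
class SignednessDemo {
    public static void main(String[] args) {
        int a = -2; // bit pattern 0xFFFFFFFE, which is 4294967294 when read as unsigned

        // Operations that care about signedness give different answers:
        System.out.println(a / 2);                             // -1
        System.out.println(Integer.divideUnsigned(a, 2));      // 2147483647
        System.out.println(a % 3);                             // -2
        System.out.println(Integer.remainderUnsigned(a, 3));   // 2
        System.out.println(a < 1);                             // true
        System.out.println(Integer.compareUnsigned(a, 1) > 0); // true

        // Operations that don't care produce the same bits either way:
        System.out.println(a + 3);                       // 1 (4294967294 + 3 wraps to 1)
        System.out.println(-a == ~a + 1);                // true
        System.out.println(Integer.toUnsignedString(a)); // 4294967294
    }
}
```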

Why is the binary representation of -127>>1 11000000?

I know the binary representation of -127 is 10000001 (two's complement).
Can anybody tell me why, when I right shift it by 1 digit, I get 11000000?
(-127) = 10000001
(-127>>1) = 11000000 ???
Thanks.
If your programming language does a sign-extending right shift (as Java does), then the left-most 1 comes from extending the sign. That is, because the top bit was set in the original number it remains set in the result for each shift (so shifting by more than 1 has all 1's in the top most bits corresponding to the number of shifts done).
This is language dependent - IIRC C and C++ sign-extend on right shift for a signed value and do not for an unsigned value. Java has a special >>> operator to shift without extending (in Java all integer primitive types except char are signed, including the misleadingly named byte).
Right-shifting in some languages will pad with whatever is in the most significant bit (in this case 1). This is so that the sign will not change on shifting a negative number, which would turn into a positive one if this was not in place.
-127 as a WORD (2 bytes) is 1111111110000001. If you right shift this by 1 bit, and represent it as a single byte, the result is 11000000. This is probably what you are seeing.
Because, if you divide -127 (two's-complement encoded as 10000001) by 2 and round down (towards -infinity, not towards zero), you get -64 (two's-complement encoded as 11000000).
Bit-wise, the reason is: when right-shifting signed values, you do sign-extension -- rather than shifting in zeroes, you duplicate the most significant bit. When working with two's-complement signed numbers, this ensures the correct result, as described above.
Assembly languages (and the machine languages they encode) typically have separate instructions for unsigned and signed right-shift operations (also called "logical shift right" vs. "arithmetic shift right"); and compiled languages typically pick the appropriate instruction when shifting unsigned and signed values, respectively.
It's sign extending, so that a negative number right shifted is still a negative number.
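The behaviour described in these answers can be reproduced in Java with a sketch like the one below. The bits8 helper is hypothetical and exists only to print the low 8 bits with leading zeros.

```java
class ByteShift {
    // Formats the low 8 bits of value as a zero-padded binary string.
    static String bits8(int value) {
        String s = Integer.toBinaryString(value & 0xFF);
        return String.format("%8s", s).replace(' ', '0');
    }

    public static void main(String[] args) {
        byte b = -127;
        System.out.println(bits8(b));         // 10000001
        System.out.println(b >> 1);           // -64
        System.out.println(bits8(b >> 1));    // 11000000
        // A byte is promoted to a 32-bit int before shifting, so >>> alone
        // still leaves 1s in the low byte; mask first for an 8-bit logical shift.
        System.out.println(bits8(b >>> 1));          // 11000000
        System.out.println(bits8((b & 0xFF) >>> 1)); // 01000000
        // The shift rounds towards negative infinity, unlike integer division.
        System.out.println(-127 / 2);                // -63
        System.out.println(Math.floorDiv(-127, 2));  // -64
    }
}
```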
