In Java, a variable of type int is represented internally as a 32-bit signed integer. Suppose that one bit stores the sign, and the other 31 bits store the magnitude of the number in base 2. In this scheme, what is the largest value that can be stores as type int?
The answer is (2^31)-1. I am curious to what the purpose of the -1 is? Why does 1 have to be subtracted from the magnitude of the number? I don't believe the 1 has anything to do with the sign because that is what the 32nd bit is for.
You are forgetting the 0.
In a decimal system, with a singe digit you are able to store 10 values and largest value is 9 (which is 10^1 - 1).
This is the same story. With a i digits number in base b you are able to store b^i values and the largest one is b^i-1 (I'm ignoring the sign bit right now).
As a side note: the representation used by numbers in Java (as in many other implementations) is the two's complement which has the peculiarity of storing just a 0 (otherwise you would end up having two 0's: -0 and +0 according to the first bit for the sign). So even if the largest value is 2^31-1, the smallest one is -2^31.
You should learn about two's complement notation. All java integer types are signed, and all use this to represent whole numbers. It is very nice, because you never need to subtract.
Related
I've searched and read many previous answers concerning this difference but I still don't get somethings, for example with this line of code :
System.out.println(Integer.parseUnsignedInt("11111111111111111111111111111111", 2));
I read that Unsigned can hold a larger positive value, and no negative value. Unsigned uses the leading bit as a part of the value, while the signed version uses the left-most-bit to identify if the number is positive or negative. signed integers can hold both positive and negative numbers. A Unisgned can go larger than MAX_VALUE, so why is that sysout gives -1 which is negative. And for this line code :
System.out.println(Integer.parseInt("11111111111111111111111111111111", 2));
Why does this line gives an error(32 times 1) ? Isn't a signed int suppose to treat the first 1 as a minus, and the other 31 1's as the positive value ? (so it should give - MAX_VALUE)
Java does not have unsigned integer types. When you call parseUnsignedInt, it does not return an unsigned integer, because there is no such thing in java.
Instead, parseUnsignedInt parses the input as an unsigned 32-bit value, and then returns the signed 32-bit value with the same bits.
If you provide it with a number that has the leading bit set in its unsigned representation, it will therefore return a negative number, because all the signed numbers with that bit set are negative.
Why does this line gives an error(32 times 1) ? Isn't a signed int
suppose to treat the first 1 as a minus, and the other 31 1's as the
positive value ? (so it should give - MAX_VALUE)
It throws a NumberFormatException for exactly the same reason that
Integer.parseInt("2147483648", 10)
does. The string does not represent a number that can be represented as an int when interpreted according to the specified radix. That int uses 32 bits to represent values is only peripherally related. If you want to parse a negative value with Integer.parseInt() then the string should contain a leading minus sign:
Integer.parseInt("-1111111111111111111111111111111", 2)
or even
Integer.parseInt("-10000000000000000000000000000000", 2) // note: 32 digits
The method is not mapping bits directly to an int representation. It is interpreting the string as a textual representation of a number.
A Unsigned can go larger than MAX_VALUE, so why is that sysout gives
-1 which is negative.
Integer holds a 32-bit value. The result of parseUnsignedInt(32 one-bits, 2) is a value with all bits set. What that "means" is in the eye of the beholder. Most of Java considers integer values to be signed. The formatter used by println, for example, regards the value as signed, and the result is -1.
All you really have 'inside' the Integer is a 32-bit value. parseUnsignedInt() handles its input argument specially, but then it has to store the result in a standard Integer. There is nothing to say that the bits are supposed to be 'unsigned'. If you want to treat the value as unsigned, you can't use anything that treats it as signed, since Java really does not have an unsigned-integer type.
Isn't a signed int suppose to treat the first 1 as a minus, and the
other 31 1's as the positive value ? (so it should give - MAX_VALUE)
See other answer for this part.
The representation I think you were expecting, where the high bit indicates the sign and the rest of the bits indicates the (positive) value, is called "sign and magnitude" representation. I am unaware of any current computer that uses sign-and-magnitude for its integers (may have happened in the very early days, 1950s or so, when people were getting to grips with this stuff).
I can store numbers ranging from -127 to 127 but other than that it is impossible and the compiler give warning. The binary value of 127 is 01111111, and 130 is 10000010 still the same size (8 bits) and what I think is I can store 130 in Byte but it is not possible. How did that happen?
Java does not have unsigned types, each numeric type in Java is signed (except char but it is not meant for representing numbers but unicode characters).
Let's take a look at byte. It is one byte which is 8 bits. If it would be unsigned, yes, its range would be 0..255.
But if it is signed, it takes 1 bit of information to store the sign (2 possible values: + or -), which leaves us 7 bits to store the numeric (absolute) value. Range of 7 bit information is 0..127.
Note that the representation of signed integer numbers use the 2's complement number format in most languages, Java included.
Note: The range of Java's byte type is actually -128..127. The range -127..127 only contains 255 numbers (not 256 which is the number of all combinations of 8 bits).
In Java, a byte is a signed data type. You are thinking about unsigned bytes, in which case it is possible to store the value 130 in 8 bits. But with a signed data type, that also allows negative numbers, the first bit is necessary to indicate a negative number.
There are two ways to store negative numbers, (one's complement and two's complement) but the most popular one is two's complement. The benefit of it is that for two's complement, most arithmetic operations do not need to take the sign of the number into account; they can work regardless of the sign.
The first bit indicates the sign of the number: When the first bit is 1, then the number is negative. When the first bit is 0, then the number is positive. So you basically only have 7 bits available to store the magnitude of the number. (Using a small trick, this magnitude is shifted by 1 for negative numbers - otherwise, there would be two different bit patterns for "zero", namely 00000000 and 10000000).
When you want to store a number like 130, whose binary representation is 10000010, then it will be interpreted as a negative number, due to the first bit being 1.
Also see http://en.wikipedia.org/wiki/Two%27s_complement , where the trick of how the magnitude is shifted is explained in more detail.
I'm checking the Java doc, and seeing that Integer.MIN_VALUE is -2^31:
A constant holding the minimum value an int can have, -2^31.
While in C++, the 32 bits signed integer "long" has a different MIN value:
LONG_MIN: Minimum value for an object of type long int -2147483647
(-2^31+1) or less*
I am very confused why they are different and how Java get -2^31? In Java, if an integer has 32 bits and the first bit is used for sign, -2^31+1 is more logical, isn't it?
The value of the most significant bit in a 32 bit number is 2^31, and as this is negative in a signed integer the value is -2^31. I guess C++ uses -2^31+1 as the MIN_VALUE because this means it has the same absolute value as MAX_VALUE i.e. 2^31-1. This means an integer could store -MIN_VALUE, which is not the case in Java (which can cause some fun bugs).
C and C++ are intended to work with the native representation on machines using 1s or 2s complement or signed magnitude. With 2s complement you get a range from -2 n through 2 n -1, about like in Java. Otherwise you lose the most negative number and gain a representation for -0.
In C++ you're only guaranteed that n (the number of bits)is at least 16 where Java guarantees it's exactly 32.
I know the binary representation of -127 is 10000001 (complement).
Can any body tell me why I right shift it by 1 digit, then I get 11000000 ?
(-127) = 10000001
(-127>>1) = 11000000 ???
Thanks.
If your programming language does a sign-extending right shift (as Java does), then the left-most 1 comes from extending the sign. That is, because the top bit was set in the original number it remains set in the result for each shift (so shifting by more than 1 has all 1's in the top most bits corresponding to the number of shifts done).
This is language dependent - IIRC C and C++ sign-extend on right shift for a signed value and do not for an unsigned value. Java has a special >>> operator to shift without extending (in java all numeric primitive values are signed, including the misleadingly named byte).
Right-shifting in some languages will pad with whatever is in the most significant bit (in this case 1). This is so that the sign will not change on shifting a negative number, which would turn into a positive one if this was not in place.
-127 as a WORD (2 bytes) is 1111111110000001. If you right shift this by 1 bit, and represent it as a single byte the result is 11000000 This is probably what you are seeing.
Because, if you divide -127 (two's-complement encoded as 10000001) by 2 and round down (towards -infinity, not towards zero), you get -64 (two's-complement encoded as 11000000).
Bit-wise, the reason is: when right-shifting signed values, you do sign-extension -- rather than shifting in zeroes, you duplicate the most significant bit. When working with two's-complement signed numbers, this ensures the correct result, as described above.
Assembly languages (and the machine languages they encode) typically have separate instructions for unsigned and signed right-shift operations (also called "logical shift right" vs. "arithmetic shift right"); and compiled languages typically pick the appropriate instruction when shifting unsigned and signed values, respectively.
It's sign extending, so that a negative number right shifted is still a negative number.
i know unsigned,two's complement, ones' complement and sign magnitude, and the difference between these, but what i'm curious about is:
why it's called two's(or ones') complement, so is there a more generalize N's complement?
in which way did these genius deduce such a natural way to represent negative numbers?
Two's complement came about when someone realized that 'going negative' by subtracting 1 from 0 and letting the bits rollunder actually made signed arithmetic simpler because no special checks have to be done to check if the number is negative or not. Other solutions give you a discontinuity between -1 and 0. The only oddity with two's complement is that you get one more negative number in your range than you have positive numbers. But, then, other solutions give you strange things like +0 and -0.
According to Wikipedia, the name itself comes from mathematics and is based on ways of making subtraction simpler when you have limited number places. The system is actually a "radix complement" and since binary is base two, this becomes "two's complement". And it turns out that "one's complement" is named for the "diminished radix complement", which is the radix minus one. If you look at this for decimal, the meanings behind the names makes more sense.
Method of Complements (Wikipedia)
You can do the same thing in other bases. With decimal, you would have 9's complement, where each digit X is replaced by 9-X, and the 10's complement of a number is the 9's complement plus one. You can then subtract by adding the 10's complement, assuming a fixed number of digits.
An example - in a 4 digit system, given the subtraction
0846
-0573
=0273
First find the 9's complement of 573, which is 9-0 9-5 9-7 9-3 or 9426
the 10's complement of 573 is 9426+1, or 9427
Now add the 10's complement and throw away anything that carries out of 4 digits
0846
+9427 .. 10's complement of 573
= 10273 .. toss the 'overflow' digit
= 0273 .. same answer
Obviously that's a simple example. But the analogy carries. Interestingly the most-negative value in 4-digit 10's complement? 5000!
As for the etymology, I'd speculate that the term 1's complement is a complement in the same sense as a complementary angle from geometry is 90 degrees minus the angle - i.e., it's the part left over when you subtract the given from some standard value. Not sure how "2's" complement
makes sense, though.
In the decimal numbering system, the radix is ten:
radix complement is called as ten's complement
diminished radix complement is called as nines' complement
In the binary numbering system, the radix is two:
radix complement is called as two's complement
diminished radix complement is called as ones' complement
Source: https://en.wikipedia.org/wiki/Method_of_complements
I've been doing a lot of reading on this lately. I could be wrong, but I think I've got it...
The basic idea of a complement is straightforward: it's the remaining difference between one digit and another digit. For example, in our regular decimal notation, where we only have ten digits ranging from 0 to 9, we know that the difference between 9 and 3 is 6, so we can say that "the nines' complement of 3 is 6".
From there on out, there's something that I find gets easily confused, with very little help online: how we choose to use these complements to achieve subtraction or negative value representation is up to us! There are multiple methods, with two majorly accepted methods that both work, but with different pros and cons. The whole point of the complements is to be used in these methods, but "nine's complement" itself is not a subtraction or negative sign representation method, it's just the difference between nine and another digit.
The old-style "nines' complement" way of flipping a decimal number (nines' complement can also be called the "diminished radix complement" in the context of decimal, because we needed to find a complicated fancy way to say it's one less than ten) to perform addition of a negative value worked fine, but it gave two different values for 0 (+0 and -0), which was an expensive waste of memory on computing machines, and it also required additional tools and steps for carrying values, which added time or resource.
Later, someone realized that if you took the nines' complement and added 1 afterwards, and then dropped any carrying values beyond the most significant digit, you could also achieve negative value representation or subtraction, while only having one 0 value, and not needing to perform any carry-over at the end. (The only downside was that your distribution of values was uneven across negative and positive numbers.) Because the operation involved taking nines' complement and adding one to it, we called it "ten's complement" as a kind of shorthand. Notice the different placement of the apostrophe in the name. We have two different calculations that use the same name. The method "ten's complement" is not the same as "tens' complement". The former uses the second method I mentioned, while the latter uses the first (older) method I mentioned.
Then, to make the names simpler later, we said, "Hey, we call it ten's complement and it flips a base 10 number (decimal representation), so when we're using it we should just call it the "radix complement". And when we use nines' complement in base 10 we should just call it the "diminished radix complement". Again, this is confusing because we're reversing the way it actually happened in our terminology... ten's complement was actually named because it was "nines' complement plus one", but now we're calling it "ten's complement diminished" basically.
And then the same thing applies with ones' complement and two's complement for binary.