Why do C, C++, and Java not use one's complement? - java

I heard that C, C++, and Java use two's complement for binary representation. Why not use one's complement? Is there any advantage to using two's complement over one's complement?

Working with two's complement signed integers is a lot cleaner. You can basically add signed values as if they were unsigned and have things work as you might expect, rather than having to explicitly deal with an additional carry addition. It is also easier to check if a value is 0, because two's complement only contains one 0 value, whereas one's complement allows one to define both a positive and a negative zero.
As for the additional carry addition, think about adding a positive number to a smallish negative number. Because of the one's complement representation, the smallish negative number will actually be fairly large when viewed as an unsigned quantity. Adding the two together might lead to an overflow with a carry bit. Unlike unsigned addition, this doesn't necessarily mean that the value is too large to represent in the one's complement number, just that the representation temporarily exceeded the number of bits available. To compensate for this, you add the carry bit back in after adding the two one's complement numbers together.
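The end-around carry described above can be sketched in Java by simulating an 8-bit one's complement machine. This is only an illustration: the class and the helpers encode, decode and add are hypothetical names, and the 8-bit width is an assumption made to keep the numbers small.

```java
final class OnesComplementAdd {
    // Encode a value in the range -127..127 as an 8-bit one's complement pattern.
    static int encode(int value) {
        return value >= 0 ? value : (~(-value)) & 0xFF;
    }

    // Decode an 8-bit one's complement pattern (both zero patterns decode to 0).
    static int decode(int bits) {
        return (bits & 0x80) == 0 ? bits : -((~bits) & 0xFF);
    }

    // Add two 8-bit patterns; a carry out of bit 7 is added back into bit 0.
    static int add(int a, int b) {
        int sum = a + b;              // plain unsigned addition, may need 9 bits
        int carry = sum >>> 8;        // 1 if the sum did not fit in 8 bits
        return (sum + carry) & 0xFF;  // end-around carry
    }

    public static void main(String[] args) {
        int a = encode(5);   // 0000 0101
        int b = encode(-3);  // 1111 1100, large when viewed as unsigned
        System.out.println(decode(add(a, b)));
    }
}
```

Here 5 + (-3) overflows the 8 bits as an unsigned sum, and adding the carry back in yields the pattern for 2. Adding 5 and -5 produces the all-ones pattern, which is the negative zero mentioned earlier.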

The internal representation of numbers is not part of any of those languages; it's a feature of the architecture of the machine itself. Most implementations use 2's complement because it makes addition and subtraction the same binary operation for signed and unsigned values.
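A small Java sketch of the "same binary operation" claim: the operands are added once as signed ints and once reinterpreted as unsigned 32-bit values, and the low 32 bits of both results are compared. The class and method names are made up for illustration.

```java
final class SameBitsAddition {
    // Add the operands as unsigned 32-bit values in a wider type,
    // then keep only the low 32 bits of the result.
    static int lowBitsOfUnsignedSum(int a, int b) {
        long unsignedSum = Integer.toUnsignedLong(a) + Integer.toUnsignedLong(b);
        return (int) unsignedSum;
    }

    public static void main(String[] args) {
        int a = -3;  // bit pattern 0xFFFFFFFD
        int b = 5;
        int signedSum = a + b;
        int unsignedBits = lowBitsOfUnsignedSum(a, b);
        // Both lines print the same bit pattern.
        System.out.println(Integer.toHexString(signedSum));
        System.out.println(Integer.toHexString(unsignedBits));
    }
}
```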

Is this a homework question? If so, think of how you would represent 0 in a 1's complement system.

The answer is different for different languages.
In the case of C, you could in theory implement the language on a 1's complement machine ... if you could still find a working 1's complement machine to run your programs! Using 1's complement would introduce portability issues, but that's the norm for C. I'm not sure what the deal is for C++, but I wouldn't be surprised if it is the same.
In the case of Java, the language specification sets out precise sizes and representations for the primitive types, and precise behaviour for the arithmetic operators. This is done to eliminate the portability issues that arise when you make these things implementation specific. The Java designers specified 2's complement arithmetic because all modern CPU architectures implement 2's complement and not 1's complement integers.
For reasons why modern hardware implements 2's complement and not 1's complement, take a look at (for example) the Wikipedia pages on the subject. See if you can figure out the implications of the alternatives.

At least C and C++ offer 1's complement negation (which is the same as bitwise negation) via the language's ~ operator. Most processors - and all modern ones - use 2's complement representation for a couple reasons:
Adding positive and negative numbers is the same as adding unsigned integers
No "wasted" values due to two representations of 0 (+0 and -0)
Edit: The draft of C++0x does not specify whether signed integer types are 1's complement or 2's complement, which means it's highly unlikely that earlier versions of C and C++ did specify it. What you have observed is implementation-defined behavior, which is 2's complement on at least modern processors for performance reasons.

Almost all existing CPU hardware uses two's complement, so it makes sense that most programming languages do, too.
C and C++ support one's complement, if the hardware provides it.

It has to do with zero and rounding. If you use one's complement, you can end up having two zeros. See here for more info.

Sign-magnitude representation would be much better for numeric code. The lack of symmetry is a real problem with 2's complement and also precludes a lot of useful (numerically oriented) bit-hacks. 2's complement also introduces tricky situations where an arithmetic operation may not give you the result you expect. So you must be mindful with regard to division, shifting and negation.

Related

Count number of bits set to 1, why non-negative integers?

In Elements of Programming Interviews in Java, the authors say: "Writing a program to count the number of bits that are set to 1 in a nonnegative integer is a good way to get up to speed with primitive types".
Why do they qualify this with 'nonnegative'?
Non-negative means positive or zero.
You are assumed to use shifts and bit operations.
Bit shifts can be sign-extending, which can be misleading. To avoid this issue, you are asked to handle non-negative values only.
But you can handle negative values too, if you want. Java has unsigned shifts >>>.
Sure, this is an exercise. In real code you'd use a library implementation: java.lang.Integer has bitCount(int i). I assume bitCount will execute as a hardware bit-counting instruction, such as x86 popcnt, so it should be better than manual bit manipulation, or at least use a very decent bit manipulation implementation rather than one you'd write as an exercise. There's a whole tag, hammingweight, about counting those bits.
Negative integers are stored a little bit differently.
In a 4 Bit signed integer, 1111b would be -1, whereas 1000b would be -8, making the count of 1's dependent on the number of bits in the number. That way you can't answer the question anymore.
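As a rough sketch of the points above, the loop below counts set bits with the unsigned shift >>>, so it also terminates for negative inputs, and the result can be checked against Integer.bitCount. The class and method names are hypothetical.

```java
final class BitCounting {
    // Counts set bits by testing the lowest bit and shifting right.
    // The unsigned shift >>> fills with zeros, so x eventually becomes 0
    // even when the input is negative.
    static int countBits(int x) {
        int count = 0;
        while (x != 0) {
            count += x & 1;
            x >>>= 1;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countBits(5));          // 5 is binary 101
        System.out.println(countBits(-1));         // all 32 bits of an int are set
        System.out.println(Integer.bitCount(-1));  // library version for comparison
    }
}
```

With the sign-extending shift >> instead of >>>, a negative x would never reach 0 and the loop would not terminate, which is one reason the exercise restricts itself to non-negative inputs.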

java long datatype conversion to unsigned value

I'm porting some C++ code to Java code.
There is no unsigned datatype in java which can hold 64 bits.
I have a hashcode which is stored in Java's long datatype (which of course is signed).
long vp = hashcode / 38; // hashcode is of type 'long'
Since 38 here is greater than 2, the resulting number can be safely used for any other arithmetic in java.
The question is what happens if the sign bit in 'hashcode' is set to 1. I don't want to get a negative value in variable vp. I want a positive value, as if the datatype were an unsigned one.
P.S: I don't want to use BigInteger for this purpose because of performance issues.
Java's primitive integral types are considered signed, and there isn't really anything you can do about it. However, depending on what you need it for, this may not matter.
Since the integers are all done in two's complement, signed and unsigned are exact same at the binary level. The difference is how you interpret them, and in certain operations. Specifically, right shift, division, modulus and comparison differ. Unsigned right shifts can be done with the >>> operator. As long as you don't need one of the missing operators, you can use longs perfectly well.
If you can use third-party libraries, you can e.g. use Guava's UnsignedLongs class to treat long values as unsigned for many purposes, including division. (Disclosure: I contribute to Guava.)
Well, here is how I solved this. Right-shift hashcode by 1 bit (division by 2), using the unsigned shift >>> so that a set sign bit does not produce a negative result. Then divide the shifted number by 19 (which is 38/2). So essentially I divided the number by 38 exactly as it is done in C++, and I got the same value that I got in C++.
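A minimal sketch of the shift-then-divide workaround, assuming the shift is the unsigned >>>, next to Long.divideUnsigned, which the standard library has provided since Java 8. The class and method names are made up for the example.

```java
final class UnsignedDivide {
    // Workaround: halve with an unsigned shift, then divide by 19 (38 / 2).
    // After the shift the sign bit is clear, so ordinary signed division is safe.
    static long divideBy38Shift(long hashcode) {
        return (hashcode >>> 1) / 19;
    }

    // Library alternative: treats the 64 bits of hashcode as an unsigned value.
    static long divideBy38Library(long hashcode) {
        return Long.divideUnsigned(hashcode, 38L);
    }

    public static void main(String[] args) {
        long hashcode = -1L; // all 64 bits set, the largest unsigned value
        System.out.println(divideBy38Shift(hashcode));
        System.out.println(divideBy38Library(hashcode));
    }
}
```

Both methods return the same non-negative quotient, because halving and then dividing by 19 with truncation gives the same result as dividing by 38 directly.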

I want to store a number between 0 and 100 in a variable - should I use an INT or a BYTE and why?

I know this is a n00b question but I want to define a variable with a value between 0 and 100 in JAVA. I can use INT or BYTE as data types - but which one would be the best to use. And why?(benefits?)
Either int or byte would work for storing a number in that range.
Which is best depends on what you are aiming to do.
If you need to store lots of these numbers in an array, then a byte[] will take less space than an int[] with the same length.
(Surprisingly) a byte variable takes the same amount of space as an int ... due to the way that the JVM does the stack frame / object frame layout.
If you are doing lots of arithmetic, etc using these values, then int is more convenient than byte. In Java, arithmetic and bitwise operations automatically promote the operands to int, so when you need to assign back to a byte you need to use a (byte) type-cast.
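The promotion rule can be seen in a short sketch. The class and the helper addBytes are hypothetical; the values are chosen so that the sum no longer fits in a byte.

```java
final class BytePromotion {
    // a + b is evaluated as an int, so the cast is required to narrow
    // the result back to 8 bits (values outside -128..127 wrap around).
    static byte addBytes(byte a, byte b) {
        return (byte) (a + b);
    }

    public static void main(String[] args) {
        byte a = 100;
        byte b = 50;
        System.out.println(a + b);           // 150, because the sum is an int
        System.out.println(addBytes(a, b));  // -106, because 150 does not fit in a byte
    }
}
```

Writing byte c = a + b; without the cast does not compile, which is the inconvenience the answer refers to.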
Use the byte data type to store a value between -128 and 127.
I think since you have used 0.00 with decimals, this means you want to allow decimals.
The best way to do this is by using a floating point type: double or float.
However, if you don't need decimals, and the number is always an integer, you can use either int or byte. Byte can hold anything up to +127 or down to -128, and takes up much less space than an int.
with 32-bit ints on your system, your byte would take up 1/4 of the space!
The disadvantages are that you have to be careful when doing arithmetic on bytes - they can overflow, and cause unexpected results if they go out of range.
If you are using basic IO functions, like creating sounds or reading old file formats, byte is invaluable. But watch out for the sign convention: negative values have the high-order bit set. However, when doing normal numerical calculations, I always use int, since it allows easy extension to larger values if needed, and on 32-bit CPUs there is minimal speed cost. The only exception is if you are storing a large quantity of them.
Citing Primitive Data Types:
byte: The byte data type is an 8-bit signed two's complement integer. It has a minimum value of -128 and a maximum value of 127 (inclusive). The byte data type can be useful for saving memory in large arrays, where the memory savings actually matters. They can also be used in place of int where their limits help to clarify your code; the fact that a variable's range is limited can serve as a form of documentation.
int: The int data type is a 32-bit signed two's complement integer. It has a minimum value of -2,147,483,648 and a maximum value of 2,147,483,647 (inclusive). For integral values, this data type is generally the default choice unless there is a reason (like the above) to choose something else. This data type will most likely be large enough for the numbers your program will use, but if you need a wider range of values, use long instead.
Well, it depends on what you're doing... out of the blue like that I would say use int because it's the standard way to declare an integer. But if you do something specific that requires memory optimization, then maybe byte could be better in that case.
Technically, neither one is appropriate--if you need to store the decimal part, you want a float.
I'd personally argue that unless you're actually getting into resource-constrained issues (embedded system, big data, etc.) your code will be slightly clearer if you go with int. Good style is about being legible to people, so indicating that you're working with integers might make more sense than prematurely optimizing and thus forcing future maintainers of your code (including yourself) to speculate as to the true nature of that variable.

why is it called two's complement? [duplicate]

i know unsigned,two's complement, ones' complement and sign magnitude, and the difference between these, but what i'm curious about is:
why it's called two's(or ones') complement, so is there a more generalize N's complement?
in which way did these genius deduce such a natural way to represent negative numbers?
Two's complement came about when someone realized that 'going negative' by subtracting 1 from 0 and letting the bits roll under actually made signed arithmetic simpler, because no special checks have to be done to see whether the number is negative or not. Other solutions give you a discontinuity between -1 and 0. The only oddity with two's complement is that you get one more negative number in your range than you have positive numbers. But, then, other solutions give you strange things like +0 and -0.
According to Wikipedia, the name itself comes from mathematics and is based on ways of making subtraction simpler when you have limited number places. The system is actually a "radix complement" and since binary is base two, this becomes "two's complement". And it turns out that "one's complement" is named for the "diminished radix complement", which is the radix minus one. If you look at this for decimal, the meanings behind the names makes more sense.
Method of Complements (Wikipedia)
You can do the same thing in other bases. With decimal, you would have 9's complement, where each digit X is replaced by 9-X, and the 10's complement of a number is the 9's complement plus one. You can then subtract by adding the 10's complement, assuming a fixed number of digits.
An example - in a 4 digit system, given the subtraction
0846
-0573
=0273
First find the 9's complement of 573, which is 9-0 9-5 9-7 9-3 or 9426
the 10's complement of 573 is 9426+1, or 9427
Now add the 10's complement and throw away anything that carries out of 4 digits
0846
+9427 .. 10's complement of 573
= 10273 .. toss the 'overflow' digit
= 0273 .. same answer
Obviously that's a simple example. But the analogy carries. Interestingly the most-negative value in 4-digit 10's complement? 5000!
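The worked example above can be reproduced in a few lines of Java. This is a sketch for a fixed width of four decimal digits; the class, the constant MODULUS and the method names are hypothetical.

```java
final class TensComplement {
    static final int MODULUS = 10_000; // four decimal digits

    // Replace each digit X by 9 - X, which equals 9999 - n for a 4-digit n.
    static int ninesComplement(int n) {
        return (MODULUS - 1) - n;
    }

    // The 10's complement is the 9's complement plus one, kept to four digits.
    static int tensComplement(int n) {
        return (ninesComplement(n) + 1) % MODULUS;
    }

    // Subtract by adding the 10's complement and discarding the carry out of four digits.
    static int subtract(int a, int b) {
        return (a + tensComplement(b)) % MODULUS;
    }

    public static void main(String[] args) {
        System.out.println(ninesComplement(573)); // 9426
        System.out.println(tensComplement(573));  // 9427
        System.out.println(subtract(846, 573));   // 273
    }
}
```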
As for the etymology, I'd speculate that the term 1's complement is a complement in the same sense as a complementary angle from geometry is 90 degrees minus the angle - i.e., it's the part left over when you subtract the given from some standard value. Not sure how "2's" complement makes sense, though.
In the decimal numbering system, the radix is ten:
the radix complement is called the ten's complement
the diminished radix complement is called the nines' complement
In the binary numbering system, the radix is two:
the radix complement is called the two's complement
the diminished radix complement is called the ones' complement
Source: https://en.wikipedia.org/wiki/Method_of_complements
I've been doing a lot of reading on this lately. I could be wrong, but I think I've got it...
The basic idea of a complement is straightforward: it's the remaining difference between one digit and another digit. For example, in our regular decimal notation, where we only have ten digits ranging from 0 to 9, we know that the difference between 9 and 3 is 6, so we can say that "the nines' complement of 3 is 6".
From there on out, there's something that I find gets easily confused, with very little help online: how we choose to use these complements to achieve subtraction or negative value representation is up to us! There are multiple methods, with two majorly accepted methods that both work, but with different pros and cons. The whole point of the complements is to be used in these methods, but "nine's complement" itself is not a subtraction or negative sign representation method, it's just the difference between nine and another digit.
The old-style "nines' complement" way of flipping a decimal number (nines' complement can also be called the "diminished radix complement" in the context of decimal, because we needed to find a complicated fancy way to say it's one less than ten) to perform addition of a negative value worked fine, but it gave two different values for 0 (+0 and -0), which was an expensive waste of memory on computing machines, and it also required additional tools and steps for carrying values, which added time or resource.
Later, someone realized that if you took the nines' complement and added 1 afterwards, and then dropped any carrying values beyond the most significant digit, you could also achieve negative value representation or subtraction, while only having one 0 value, and not needing to perform any carry-over at the end. (The only downside was that your distribution of values was uneven across negative and positive numbers.) Because the operation involved taking nines' complement and adding one to it, we called it "ten's complement" as a kind of shorthand. Notice the different placement of the apostrophe in the name. We have two different calculations that use the same name. The method "ten's complement" is not the same as "tens' complement". The former uses the second method I mentioned, while the latter uses the first (older) method I mentioned.
Then, to make the names simpler later, we said, "Hey, we call it ten's complement and it flips a base 10 number (decimal representation), so when we're using it we should just call it the "radix complement". And when we use nines' complement in base 10 we should just call it the "diminished radix complement". Again, this is confusing because we're reversing the way it actually happened in our terminology... ten's complement was actually named because it was "nines' complement plus one", but now we're calling it "ten's complement diminished" basically.
And then the same thing applies with ones' complement and two's complement for binary.

Can java floats be sorted by their byte representations?

I'm working in Hadoop, and I need to provide a comparator to sort objects as raw network order byte arrays. This is easy for me to do with integers -- I just compare each byte in order. I also need to do this for floats. I think, but I can't find a reference, that the IEEE 754 format for floats used by Java can be sorted by just comparing each byte as a signed 8 bit value.
Can anyone confirm or refute this?
Edit: the representation is IEEE 754 32 bit floating point. I actually have a (larger) byte buffer and an offset and length within that buffer. I found some utility methods already there that make it easy to turn this into a float, so I guess this question is moot. I'm still curious if anyone knows the answer.
Positive floats have the same ordering as their bit representations viewed as 2s complement integers. Negative floats do not.
For example, the bit representation of -2.0f is 0xc0000000 and -1.0f is 0xbf800000. If you try to use a comparison on the representations, you get -2.0f > -1.0f, which is incorrect.
There's also the issue of NaNs (which compare unordered against all floating point data, whereas the representations do not), but you may not care about them.
This almost works:
int bits = Float.floatToIntBits(x);
bits ^= (bits >> 31) & Integer.MAX_VALUE;
Here negative floats have bits 0-30 inverted (because you want the opposite order to what the original sign/magnitude representation would give you, whilst preserving the sign bit.)
Caveats:
NaNs are included in the ordering (best to consider the results undefined if NaNs are involved.)
+0 now compares as greater than -0 (the built-in relational operators consider them equal.)
It works for all other values though, including denormals and infinities.
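The transformation above, wrapped in a method so the effect on ordering can be checked. This is only a sketch: the class and the method sortableBits are hypothetical names, and the keys are meant to be compared as signed ints, with NaN inputs left out as the caveats suggest.

```java
final class FloatSortKey {
    // Maps a float to an int whose signed ordering matches the float ordering
    // (with -0 sorting below +0). Negative floats get bits 0-30 inverted.
    static int sortableBits(float x) {
        int bits = Float.floatToIntBits(x);
        bits ^= (bits >> 31) & Integer.MAX_VALUE;
        return bits;
    }

    public static void main(String[] args) {
        int rawA = Float.floatToIntBits(-2.0f); // 0xc0000000
        int rawB = Float.floatToIntBits(-1.0f); // 0xbf800000
        // Raw bits order the two negative values the wrong way round.
        System.out.println(Integer.compare(rawA, rawB) > 0);
        // The transformed keys order them correctly.
        System.out.println(Integer.compare(sortableBits(-2.0f), sortableBits(-1.0f)) < 0);
    }
}
```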
Use Float.floatToIntBits(float) and compare the integers.
Edit: This works only for positive numbers, including positive infinity, but not NaN. For negative numbers you have to reverse the ordering. And positive numbers are of course greater than negative numbers.
Well, if you are transmitting data over the network, you should have some form of semantic representation for when you are transmitting an int and when you are transmitting a float. Since it is machine-agnostic information, data type width should also be defined somewhere or predefined by the spec (i.e. 32-bit or 64-bit floats). So what you should really do is accumulate your bytes into the appropriate data type, then use the language's native data types to do the comparison.
To be really accurate with an answer tho, we would need to see your transmit and receive code to see if you are autoboxing primitives via some kind of decorated i/o stream or some such thing. For a better answer, please provide better detail.
