BigInteger(String) and BigInteger(byte[]) are not equal - java

I was expecting the two constructors in the BigInteger class, BigInteger(String) and BigInteger(byte[]), to behave similarly but they don't.
Why are the two BigIntegers not equal, and how can I create a matching BigInteger from the byte array?
String hex = "94B4";
byte[] b = DatatypeConverter.parseHexBinary(hex); // -108, -76
BigInteger b1 = new BigInteger(hex, 16); //38068
BigInteger b2 = new BigInteger(b); //-27468

Looks like the byte[] constructor treats input as regular 2's complement data, whereas the hex constructor treats it as, well, a hex string.
Using the BigInteger(int signum, byte[] magnitude) constructor lets you force the value to be treated as positive, so new BigInteger(1, b) will be 38068.
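Side by side, using the values from the question:
String hex = "94B4";
byte[] b = DatatypeConverter.parseHexBinary(hex); // -108, -76
BigInteger fromHex = new BigInteger(hex, 16);     // 38068: the digits are parsed as a positive hex number
BigInteger signed = new BigInteger(b);            // -27468: the same 16 bits read as signed 2's complement
BigInteger unsigned = new BigInteger(1, b);       // 38068: signum 1 treats the bytes as an unsigned magnitude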

It will come as no surprise that 38068 + 27468 is 65536.
Remember that a java.lang.String is a sequence of characters, and a char in Java is an unsigned 16-bit type, so parsing the four hex digits gives the unsigned 16-bit value.
BigInteger b2 = new BigInteger(b); circumvents that: it interprets the data as a 2's-complement signed 16-bit value.
Hence the difference.
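A quick way to see both interpretations of the same 16 bits (my own illustration):
char c = 0x94B4;          // char is unsigned 16-bit: (int) c is 38068
short s = (short) 0x94B4; // short is signed 16-bit: -27468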

Related

Java->C# BigInteger + Math Conversion

I am attempting to convert some BigInteger objects and math from Java to C#.
The Java flow is as follows:
1. Construct 2 BigIntegers from a base-10 string (i.e. 0-9 values).
2. Construct a third BigInteger from an inputted byte array.
3. Create a fourth BigInteger as third.modPow(first, second).
4. Return the byte result of fourth.
The main complications in converting to C# seem to consist of endianness and signed/unsigned values.
I have tried a couple different ways to convert the initial 2 BigIntegers from Java->C#. I believe that using the base-10 string with BigInteger.Parse will work as intended, but I am not completely sure.
Another complication comes from the use of a BinaryReader/BinaryWriter implementation, in C#, that is already big-endian (like Java). I use the BR/BW to supply the byte array to create the third BigInteger and consume the byte array produced from the modPow (the fourth BigInteger).
I have tried reversing the byte arrays for input and output in every way, and still do not get the expected output.
Java:
public static byte[] doMath(byte[] input)
{
    BigInteger exponent = new BigInteger("BASE-10-STRING");
    BigInteger mod = new BigInteger("BASE-10-STRING");
    BigInteger bigInput = new BigInteger(input);
    return bigInput.modPow(exponent, mod).toByteArray();
}
C#:
public static byte[] CSharpDoMath(byte[] input)
{
    BigInteger exponent = BigInteger.Parse("BASE-10-STRING");
    BigInteger mod = BigInteger.Parse("BASE-10-STRING");
    // big->little endian
    byte[] reversedBytes = input.Reverse().ToArray();
    BigInteger bigInput = new BigInteger(reversedBytes);
    BigInteger output = BigInteger.ModPow(bigInput, exponent, mod);
    // little->big endian
    byte[] bigOutput = output.ToByteArray().Reverse().ToArray();
    return bigOutput;
}
I need the same output from both.
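One asymmetry worth checking here (an assumption about the cause, not a verified diagnosis): Java's toByteArray() returns the minimal two's-complement representation, so a non-negative result whose top bit is set gains a leading 0x00 sign byte that a fixed-width unsigned layout would not have; C#'s BigInteger(byte[]) is likewise signed, so a reversed input whose most significant byte has its high bit set is read as negative unless a trailing zero byte is appended. A sketch of normalizing the Java output, with a hypothetical helper name:
// Hypothetical helper: drop the leading sign byte that
// BigInteger.toByteArray() adds when the high bit of the magnitude is set.
static byte[] toUnsignedBytes(BigInteger value) {
    byte[] raw = value.toByteArray();
    if (raw.length > 1 && raw[0] == 0) {
        byte[] trimmed = new byte[raw.length - 1];
        System.arraycopy(raw, 1, trimmed, 0, trimmed.length);
        return trimmed;
    }
    return raw;
}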

Assign primitive max and min

Remembering the maximum and minimum values for each integer primitive in decimal is quite difficult; it is not so difficult in binary and hex. I understand that the wrapper classes have these values built in, but I like working with binary and hex. The problem is that when I try
// value is byte
value = 0x80; // 1000_0000
but it doesn't work. But (after a little research),
value = -0x80;
and
value = (byte) 0x80;
both work. I thought that when assigning a literal to the smaller integer primitives, no cast is necessary. Are the binary, octal, and hex values considered literals or are they all 32 bits because they are not decimal? Any help in understanding what is going on is welcome.
To remember the maximum and minimum values, use the built in constants:
int iMax = Integer.MAX_VALUE;
byte bMin = Byte.MIN_VALUE;
Binary, octal, and hex literals are evaluated at compile time exactly as if their decimal equivalent were written; whatever the radix, an unsuffixed integer literal has type int. A compile-time constant can be assigned to a byte without a cast only if its value fits in the byte range -128 to 127: 0x80 is the int 128, which does not fit, while -0x80 is -128, which does. Some examples:
byte b1 = 0b1111111; //OK (same as byte b1 = 127)
byte b2 = 0b10000000; //error: possible lossy conversion (the literal is the int 128, outside the byte range)
int i1 = 0xFFFFFFFF; //OK (the int -1)
int i2 = 0xFFFFFFFFF; //error: 9 hex digits is 36 bits, too large to fit inside int
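For completeness, a sketch (my own examples) of writing each minimum value in hex: byte and short need a cast because the literal itself is a positive int, while the int and long patterns fit their types directly:
byte bMin  = (byte) 0x80;         // -128: 0x80 is the int 128, so a cast is required
short sMin = (short) 0x8000;      // -32768: 0x8000 is the int 32768
int iMin   = 0x80000000;          // -2147483648: legal, the literal fills all 32 bits of int
long lMin  = 0x8000000000000000L; // -9223372036854775808: needs the L suffix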

How does BigInteger interpret the bytes from a string?

I'm working on a program that is an implementation of the RSA encryption algorithm, just as a personal exercise; it's not guarding anyone's information or anything. I am trying to understand how a plaintext passage is interpreted numerically, allowing it to be encrypted. I understand that most UTF-8 characters end up using only 1 byte of space, and not the 2 bytes one might think, but that's about it. Here's my code:
BigInteger ONE = new BigInteger("1");
SecureRandom rand = new SecureRandom();
BigInteger d, e, n;
BigInteger p = BigInteger.probablePrime(128, rand);
BigInteger q = BigInteger.probablePrime(128, rand);
BigInteger phi = (p.subtract(ONE)).multiply(q.subtract(ONE));
n = p.multiply(q);
e = new BigInteger("65537");
d = e.modInverse(phi);
String string = "test";
BigInteger plainText = new BigInteger(string.getBytes("UTF-8"));
BigInteger cipherText = plainText.modPow(e, n);
BigInteger originalMessage = cipherText.modPow(d, n);
String decrypted = new String(originalMessage.toByteArray(),"UTF-8");
System.out.println("original: " + string);
System.out.println("decrypted: " + decrypted);
System.out.println(plainText);
System.out.println(cipherText);
System.out.println(originalMessage);
System.out.println(string.getBytes("UTF-8"));
byte[] byteArray = string.getBytes("UTF-8");
for (byte littleByte : byteArray) {
    System.out.println(littleByte);
}
It outputs:
original: test
decrypted: test
1952805748
16521882695662254558772281277528769227027759103787217998376216650996467552436
1952805748
[B@60d70b42
116
101
115
116
Maybe more specifically, I am wondering about this line:
BigInteger plainText = new BigInteger(string.getBytes("UTF-8"));
Does each letter of "test" have a value, and are they literally added together here? Say t=1, e=2, s=3, t=1 for example: if you get the bytes from that string, do you end up with 7, or are the values just put together like 1231? And why does
BigInteger plainText = new BigInteger(string.getBytes("UTF-8")); output 1952805748?
I am trying to understand how a plaintext passage is being interpreted numerically, allowing it to be encrypted.
It really boils down to understanding what this line does:
BigInteger plainText = new BigInteger(string.getBytes("UTF-8"));
Let's break it down.
We start with a String (string). A Java string is a sequence of characters represented as Unicode code points (encoded in UTF-16 ...).
The getBytes("UTF-8") then encodes the characters as a sequence of bytes, and returns them in a newly allocated byte array.
The BigInteger(byte[]) constructor interprets that byte array as a number. As the javadoc says:
Translates a byte array containing the two's-complement binary representation of a BigInteger into a BigInteger. The input array is
assumed to be in big-endian byte-order: the most significant byte is
in the zeroth element.
The method being used here does not give an intrinsically meaningful number, just one that corresponds to the byte-encoded string. Going from the byte array to the number simply treats the bytes as a bit sequence that represents an integer in 2's-complement form ... which is the most common representation for integers on modern hardware.
The key thing is that the transformation from the text to the (unencrypted) BigInteger is lossless and reversible. Any other transformation with those properties could be used.
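A minimal round trip showing the reversibility for this particular string (my own sketch; a leading zero byte in the encoding would need extra care):
byte[] bytes = "test".getBytes(java.nio.charset.StandardCharsets.UTF_8);
BigInteger number = new BigInteger(bytes);
String back = new String(number.toByteArray(), java.nio.charset.StandardCharsets.UTF_8);
System.out.println(back); // test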
References:
The Wikipedia page on 2's Complement representation
The Wikipedia page on the UTF-8 text encoding scheme
javadoc BigInteger(byte[])
javadoc String.getBytes(String)
I'm still not quite understanding how the UTF-8 values for each character in "test" (116, 101, 115, 116 respectively) come together to form 1952805748.
Convert the numbers 116,101,115,116 to hex.
Convert the number 1952805748 to hex
Compare them
See the pattern?
The answer is in the output: "test" is encoded into an array of 4 bytes, [116, 101, 115, 116]. This is then interpreted by BigInteger as a big-endian binary integer representation. Since all four byte values are positive here, the value can be calculated this way:
value = (116 << 24) + (101 << 16) + (115 << 8) + 116;
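You can check this in hex (my own verification): 116 = 0x74, 101 = 0x65, 115 = 0x73, and 0x74657374 is exactly 1952805748:
byte[] bytes = "test".getBytes(java.nio.charset.StandardCharsets.UTF_8); // [116, 101, 115, 116]
int value = (116 << 24) + (101 << 16) + (115 << 8) + 116;
System.out.println(Integer.toHexString(value));      // 74657374
System.out.println(value);                           // 1952805748
System.out.println(new java.math.BigInteger(bytes)); // 1952805748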

How to convert int[] to BigInteger?

I would like to convert an array of integer values, which were originally bytes, into a BigInteger.
First, make sure you know in which format your int[] is meant to be interpreted.
Each int can be seen as consisting of four bytes, and these bytes together can be converted to a BigInteger. The details to settle are the byte order (which byte is the most and which the least significant?)
and whether you have a signed or unsigned number.
A simple way to convert your ints to bytes (for latter use in a BigInteger constructor) would be to use ByteBuffer and wrap an IntBuffer around it.
// uses java.nio.ByteBuffer and java.nio.IntBuffer
public BigInteger toBigInteger(int[] data) {
    byte[] array = new byte[data.length * 4];
    ByteBuffer bbuf = ByteBuffer.wrap(array); // backed by array; big-endian by default
    IntBuffer ibuf = bbuf.asIntBuffer();
    ibuf.put(data);                           // writes each int as 4 big-endian bytes
    return new BigInteger(array);
}
Obvious adaptations would be to set the byte order of bbuf, or to use another BigInteger constructor (for unsigned values).
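For example (my own usage sketch), the ints 1 and 2 become the eight bytes 00 00 00 01 00 00 00 02:
BigInteger n = toBigInteger(new int[] {1, 2});
System.out.println(n); // 4294967298, i.e. (1L << 32) + 2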
Well, what about new BigInteger(byte[] val)?
To quote the API docs I linked to:
Translates a byte array containing the two's-complement binary representation of a BigInteger into a BigInteger. The input array is assumed to be in big-endian byte-order: the most significant byte is in the zeroth element.

How to Convert Int to Unsigned Byte and Back

I need to convert a number into an unsigned byte. The number is always less than or equal to 255, and so it will fit in one byte.
I also need to convert that byte back into that number. How would I do that in Java? I've tried several ways and none work. Here's what I'm trying to do now:
int size = 5;
// Convert size int to binary
String sizeStr = Integer.toString(size);
byte binaryByte = Byte.valueOf(sizeStr);
and now to convert that byte back into the number:
Byte test = new Byte(binaryByte);
int msgSize = test.intValue();
Clearly, this does not work. For some reason, it always converts the number into 65. Any suggestions?
A byte is always signed in Java. You can get its unsigned value by bitwise-ANDing it with 0xFF, though:
int i = 234;
byte b = (byte) i;
System.out.println(b); // -22
int i2 = b & 0xFF;
System.out.println(i2); // 234
Java 8 provides Byte.toUnsignedInt to convert byte to int by unsigned conversion. In Oracle's JDK this is simply implemented as return ((int) x) & 0xff; because HotSpot already understands how to optimize this pattern, but it could be intrinsified on other VMs. More importantly, no prior knowledge is needed to understand what a call to toUnsignedInt(foo) does.
In total, Java 8 provides methods to convert byte and short to unsigned int and long, and int to unsigned long. A method to convert byte to unsigned short was deliberately omitted because the JVM only provides arithmetic on int and long anyway.
To convert an int back to a byte, just use a cast: (byte)someInt. The resulting narrowing primitive conversion will discard all but the last 8 bits.
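A round trip for illustration (my own example):
byte b = (byte) 200;                  // narrowing keeps the low 8 bits: -56
int unsigned = Byte.toUnsignedInt(b); // 200
byte back = (byte) unsigned;          // -56 again; the 8-bit pattern is unchanged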
If you just need to convert an expected 8-bit value from a signed int to an unsigned value, you can use simple bit shifting:
int signed = -119; // 11111111 11111111 11111111 10001001
/**
* Use unsigned right shift operator to drop unset bits in positions 8-31
*/
int pseudoUnsigned = (signed << 24) >>> 24; // 00000000 00000000 00000000 10001001 -> 137 base 10
/**
* Convert back to signed by using the sign-extension properties of the right shift operator
*/
int backToSigned = (pseudoUnsigned << 24) >> 24; // back to original bit pattern
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/op3.html
If using something other than int as the base type, you'll obviously need to adjust the shift amount: http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
Also, bear in mind that you can't use byte type, doing so will result in a signed value as mentioned by other answerers. The smallest primitive type you could use to represent an 8-bit unsigned value would be a short.
Except for char, every numeric data type in Java is signed.
As said in a previous answer, you can get the unsigned value by performing an and operation with 0xFF. In this answer, I'm going to explain how it happens.
int i = 234;
byte b = (byte) i;
System.out.println(b); // -22
int i2 = b & 0xFF;
// This is like casting b to int and perform and operation with 0xFF
System.out.println(i2); // 234
In Java, the int data type always needs 32 bits to store its values, regardless of the machine; byte needs only 8 bits.
The int variable i is represented in memory as follows (as a 32-bit integer; 0{24} denotes 24 zero bits):
0{24}11101010
Then the byte variable b is represented as:
11101010
As bytes are signed, this value represents -22. (Search for 2's complement to learn more about how negative integers are represented in memory.)
If you then cast b to int, it is sign-extended, so it still represents -22:
1{24}11101010
The cast 32-bit value of b is then ANDed with 0xFF:
1{24}11101010 & 0{24}11111111
=0{24}11101010
Then you get 234 as the answer.
The solution works fine (thanks!), but if you want to avoid casting and leave the low level work to the JDK, you can use a DataOutputStream to write your int's and a DataInputStream to read them back in. They are automatically treated as unsigned bytes then:
For converting int's to binary bytes;
ByteArrayOutputStream bos = new ByteArrayOutputStream();
DataOutputStream dos = new DataOutputStream(bos);
int val = 250;
dos.write(val); // writes the low 8 bits of the int as one (unsigned) byte
...
dos.flush();
Reading them back in:
// important to use a (non-Unicode!) encoding like US_ASCII or ISO-8859-1,
// i.e., one that uses one byte per character
ByteArrayInputStream bis = new ByteArrayInputStream(
bos.toString("ISO-8859-1").getBytes("ISO-8859-1"));
DataInputStream dis = new DataInputStream(bis);
int byteVal = dis.readUnsignedByte();
Esp. useful for handling binary data formats (e.g. flat message formats, etc.)
The Integer.toString(size) call converts your int into its decimal string representation, i.e. the character '5'. If you then treat that string's bytes as the value, you get the character's code (the ASCII code of '5' is 53), not the number itself.
You need to parse the string back to an integer value first, e.g. by using Integer.parseInt, to get back the original int value.
As a bottom line, for a signed/unsigned conversion, it is best to leave String out of the picture and use bit manipulation as #JB suggests.
Even though it's too late, I'd like to give my input on this, as it might clarify why the solution given by JB Nizet works. I stumbled upon this little problem while working on a byte parser and to-string conversion myself.
When you copy from a bigger integral type to a smaller integral type, this happens, as the JLS says:
https://docs.oracle.com/javase/specs/jls/se7/html/jls-5.html#jls-5.1.3
A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
You can be sure that a byte is an integral type as this java doc says
https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
byte: The byte data type is an 8-bit signed two's complement integer.
So in the case of casting an integer(32 bits) to a byte(8 bits), you just copy the last (least significant 8 bits) of that integer to the given byte variable.
int a = 128;
byte b = (byte)a; // Last 8 bits gets copied
System.out.println(b); // -128
Second part of the story involves how Java unary and binary operators promote operands.
https://docs.oracle.com/javase/specs/jls/se7/html/jls-5.html#jls-5.6.2
Widening primitive conversion (§5.1.2) is applied to convert either or both operands as specified by the following rules:
If either operand is of type double, the other is converted to double.
Otherwise, if either operand is of type float, the other is converted to float.
Otherwise, if either operand is of type long, the other is converted to long.
Otherwise, both operands are converted to type int.
Rest assured, if you are working with integral type int and/or lower it'll be promoted to an int.
// byte b (0x80) is promoted to int (0xFFFFFF80) by the & operator, and then
// 0xFFFFFF80 & 0xFF (0xFF is the int 0x000000FF) yields
// 0x00000080
a = b & 0xFF;
System.out.println(a); // 128
I scratched my head around this too :). There is a good answer for this here by rgettman.
Bitwise operators in java only for integer and long?
If you want to use the primitive wrapper classes, this will work, but all Java integer types are signed by default.
public static void main(String[] args) {
    Integer i = 5;
    Byte b = Byte.valueOf(i + ""); // converts i to String and calls Byte.valueOf()
    System.out.println(b);
    System.out.println(Integer.valueOf(b));
}
In terms of readability, I favor Guava's:
UnsignedBytes.checkedCast(long) to convert a signed number to an unsigned byte.
UnsignedBytes.toInt(byte) to convert an unsigned byte to a signed int.
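For illustration, a usage sketch (my own example, assuming Guava's com.google.common.primitives.UnsignedBytes is on the classpath):
byte b = UnsignedBytes.checkedCast(200L); // (byte) -56; throws IllegalArgumentException outside 0..255
int i = UnsignedBytes.toInt(b);           // 200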
Handling bytes and unsigned integers with BigInteger:
byte[] b = ...                        // your integer in big-endian byte order
BigInteger ui = new BigInteger(1, b); // signum 1 makes BigInteger treat the bytes as an unsigned magnitude
int i = ui.intValue();                // unsigned value assigned to i
In Java 7:
public class Main {
    public static void main(String[] args) {
        byte b = -2;
        int i = b & 0b1111_1111; // binary literals with underscores are new in Java 7
        System.err.println(i);
    }
}
result: 254
I have tested it and understood it.
In Java, byte is signed, so 234 stored in one byte becomes -22: the bit pattern is 11101010, and because the sign bit is 1, the two's-complement interpretation is -22.
ANDing with 0xFF first widens the byte to a 32-bit signed int (keeping the low 8 bits unchanged) and then clears the upper 24 bits, which yields 234 again.
I use String to solve this:
int a = 14206;
byte[] b = String.valueOf(a).getBytes();
String c = new String(b);
System.out.println(Integer.valueOf(c));
and the output is 14206.
