The question is about the correct way of creating a hash in Java:
Let's assume I have a positive BigInteger value that I would like to hash, and that the messageDigest instance below is a valid SHA-256 instance.
public static final BigInteger B = new BigInteger("BD0C61512C692C0CB6D041FA01BB152D4916A1E77AF46AE105393011BAF38964DC46A0670DD125B95A981652236F99D9B681CBF87837EC996C6DA04453728610D0C6DDB58B318885D7D82C7F8DEB75CE7BD4FBAA37089E6F9C6059F388838E7A00030B331EB76840910440B1B27AAEAEEB4012B7D7665238A8E3FB004B117B58", 16);
byte[] byteArrayBBigInt = B.toByteArray();
this.printArray(byteArrayBBigInt);
messageDigest.reset();
messageDigest.update(byteArrayBBigInt);
byte[] outputBBigInt = messageDigest.digest();
Now I only assume that the code above is correct, since in my tests the hashes I produce match the one produced by:
http://www.fileformat.info/tool/hash.htm?hex=BD0C61512C692C0CB6D041FA01BB152D4916A1E77AF46AE105393011BAF38964DC46A0670DD125B95A981652236F99D9B681CBF87837EC996C6DA04453728610D0C6DDB58B318885D7D82C7F8DEB75CE7BD4FBAA37089E6F9C6059F388838E7A00030B331EB76840910440B1B27AAEAEEB4012B7D7665238A8E3FB004B117B58
However, I am not sure why we perform the step below.
Because the byte array returned by the digest() call is signed, and in this case negative, I suspect that we need to convert it to a positive number, for example with a function like this:
public static String byteArrayToHexString(byte[] b) {
String result = "";
for (int i=0; i < b.length; i++) {
result += Integer.toString((b[i] & 0xff) + 0x100, 16).substring(1);
}
return result;
}
thus:
String hex = byteArrayToHexString(outputBBigInt);
BigInteger unsignedBigInteger = new BigInteger(hex, 16);
When I construct a BigInteger from the new hex string and convert it back to a byte array, I see that the sign bit (the most significant, i.e. leftmost, bit) is set to 0, which means that the number is positive; moreover, the whole leading byte consists of zeros (00000000).
My question is: is there any RFC that describes why we always need to convert the hash to a "positive" unsigned byte array? I mean, even if the number produced by the digest call is negative, it is still a valid hash, right? So why do we need that additional procedure? Basically, I am looking for a paper, standard or RFC stating that we need to do so.
A hash consists of an octet string (called a byte array in Java). How you convert it to or from a large number (a BigInteger in Java) is completely out of the scope for cryptographic hash algorithms. So no, there is no RFC to describe it as there is (usually) no reason to treat a hash as a number. In that sense a cryptographic hash is rather different from Object.hashCode().
That you can only treat hexadecimals as unsigned is a bit of an issue, but if you really want to, you can first convert the hex string back to a byte array and then perform new BigInteger(result). That constructor does treat the encoding within result as signed. Note that in protocols it is often not needed to convert back and forth to hexadecimals; hexadecimals are mainly for human consumption, a computer is fine with bytes.
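To make the signed versus unsigned distinction concrete, here is a minimal sketch. The class and method names (HashAsNumber, sha256, asUnsigned, asSigned) are made up for illustration and the input "abc" is just a standard test string. It shows that the digest itself is only a byte array, and that the sign appears only once you pick a BigInteger constructor.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not part of the original question or answer.
class HashAsNumber {

    /** Returns the SHA-256 digest of the input as a raw 32-byte octet string. */
    static byte[] sha256(byte[] input) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(input);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    /** Interprets the digest as an unsigned (non-negative) big-endian number. */
    static BigInteger asUnsigned(byte[] digest) {
        return new BigInteger(1, digest);
    }

    /** Interprets the same bytes as a two's complement big-endian number. */
    static BigInteger asSigned(byte[] digest) {
        return new BigInteger(digest);
    }

    public static void main(String[] args) {
        byte[] digest = sha256("abc".getBytes(StandardCharsets.UTF_8));
        // The bytes are identical in both cases; only the numeric reading differs.
        System.out.println("unsigned: " + asUnsigned(digest).toString(16));
        System.out.println("signed  : " + asSigned(digest).toString(16));
    }
}
```

Using new BigInteger(1, digest) avoids the detour through a hex string that the question takes, assuming an unsigned interpretation is what the surrounding protocol expects.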
Related
I wrote a RSA encryption in Java. I am trying to turn the numbers that it outputs into text or characters. For example if I feed it Hello I get:
23805663430659911910
However, online RSA encryptions return something to the effect of this:
GVom5zCerZ+dmOCE7YAp0F+N3L26L
I would just like to know how to convert my numbers into something similar. The number returned by my system is a BigInteger. This is what I've tried so far:
RSA rsa = new RSA("Hello");
BigInteger cypher_number = rsa.encrypt(); // 23805663430659911910
byte[] cypher_bytes = cypher_number.toByteArray(); // [B@368102c8
String cypher_text = new String(cypher_bytes); // J^��*���
// Now even though cypher_text is J^��*��� I wouldn't care as long as I can turn it back.
byte[] plain_bytes = cypher_text.getBytes(); // [B@6996db8 | Not the same as cypher_bytes but let's keep going.
BigInteger plain_number = new BigInteger(plain_bytes); // 28779359581043512470254837759607478877667261
// plain_number has more than doubled in size compared to cypher_number and won't decrypt properly.
Using bytes is the only way I can think of. Can someone please help me understand what I'm supposed to be doing or if it's even possible?
This is generally a 2-step process:
convert to binary encoding of the number;
convert the binary encoding to a text base encoding.
For both steps there are multiple schemes possible.
For binary encoding: the PKCS#1 specifications have always included one that converts the number to a statically sized integer. To be precise, it encodes the number as a statically sized, unsigned, big endian octet string. An octet string is nothing but a byte array.
Now, BigInteger.toByteArray returns a dynamically sized, signed, big endian octet string. So you need to implement the possible resizing and removal of initial 00 byte in a separate method, which I have at my other post here. Fortunately going back to a number is much easier as the Java implementation provides a BigInteger(int sign, byte[] value) constructor that reads in an unsigned number and skips leading zero bytes.
Having a standardized and statically sized octet string can be terribly useful, so I would not go for any other scheme.
This leaves the conversion to and from text. For that you can (indeed) use the java.util.Base64 class, which doesn't need much explaining. The only note that I must make is that it converts to an ASCII byte[] for some of the methods, so you need to use the encodeToString(byte[] src) instead.
Another method would be hexadecimals, but since Java doesn't contain a hex encoder for byte arrays in the base classes, I'd go for base 64 instead.
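The two steps described above could be sketched roughly as follows. This is an illustration under assumptions, not the code from the linked post: the class name NumberToText and its methods are hypothetical, and the target size in bytes is a parameter the caller has to choose (for RSA it would normally be the modulus size in bytes).

```java
import java.math.BigInteger;
import java.util.Base64;

// Hypothetical helper class sketching the number -> bytes -> text round trip.
class NumberToText {

    /** Encodes a non-negative number as an unsigned big-endian array of exactly size bytes. */
    static byte[] toFixedSizeUnsigned(BigInteger value, int size) {
        if (value.signum() < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        byte[] signed = value.toByteArray();
        // toByteArray() may prepend a 00 byte to hold the sign bit; skip it.
        int start = (signed.length > 1 && signed[0] == 0) ? 1 : 0;
        int length = signed.length - start;
        if (length > size) {
            throw new IllegalArgumentException("value does not fit in " + size + " bytes");
        }
        byte[] result = new byte[size];
        // Right-align the magnitude; the left part stays zero padded.
        System.arraycopy(signed, start, result, size - length, length);
        return result;
    }

    /** Step 1 (fixed-size binary encoding) followed by step 2 (base 64 text). */
    static String encode(BigInteger value, int size) {
        return Base64.getEncoder().encodeToString(toFixedSizeUnsigned(value, size));
    }

    /** Reverse direction: base 64 text to bytes, then an unsigned read that ignores leading zeros. */
    static BigInteger decode(String text) {
        return new BigInteger(1, Base64.getDecoder().decode(text));
    }
}
```

For example, encode(new BigInteger("23805663430659911910"), 16) yields a base 64 string that decode turns back into the same number, because the unsigned constructor skips the zero padding.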
I have found the answer. In case you've found this looking for the answer, you just need to encode the numbers into Base64.
The following code converts the number into a dynamically sized, signed, big endian encoded integer, and then converts it back into a number using the reverse process.
// Encode
BigInteger numbers = new BigInteger("5109763");
byte[] bytes = Base64.getEncoder().encode(numbers.toByteArray());
String encoded = new String(bytes); // Encoded value
// Decode
byte[] decoded_bytes = Base64.getDecoder().decode(encoded.getBytes());
BigInteger numbers_again = new BigInteger(decoded_bytes); // Original numbers
I have two different programs that need to hash the same string using Murmur3, in Python and Java respectively.
Python version 2.7.9:
mmh3.hash128('abc')
Gives 79267961763742113019008347020647561319L.
Java is Guava 18.0:
HashCode hashCode = Hashing.murmur3_128().newHasher().putString("abc", StandardCharsets.UTF_8).hash();
Gives the string "6778ad3f3f3f96b4522dca264174a23b"; converting it to a BigInteger gives 137537073056680613988840834069010096699.
How to get same result from both?
Thanks
Here's how to get the same result from both:
byte[] mm3_le = Hashing.murmur3_128().hashString("abc", UTF_8).asBytes();
byte[] mm3_be = Bytes.toArray(Lists.reverse(Bytes.asList(mm3_le)));
assertEquals("79267961763742113019008347020647561319",
new BigInteger(mm3_be).toString());
The hash code's bytes need to be treated as little endian but BigInteger interprets bytes as big endian. You were presumably using new BigInteger(hex, 16) to create the BigInteger, but the output of HashCode.toString() is actually a series of pairs of hexadecimal digits representing the hash bytes in the same order they're returned by asBytes() (little endian). (You can also reverse those pairs of hexadecimal to get a hex number that does produce the same result when passed to new BigInteger(reversedHex, 16)).
I think the documentation of toString() is somewhat confusing because of the way it refers to "big endian"; it doesn't actually mean that the output of the method is the hexadecimal number representing the bytes interpreted as big endian.
We have an open issue for adding asBigInteger() to HashCode.
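The parenthetical remark about reversing the pairs of hexadecimal digits can be sketched without Guava. The class and method names below are hypothetical; the hex string in the usage line is the one quoted in the question.

```java
import java.math.BigInteger;

// Hypothetical helper: reads a hex string whose bytes are in little-endian order.
class LittleEndianHex {

    /** Reverses a hex string two characters (one byte) at a time. */
    static String reverseHexPairs(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("expected an even number of hex digits");
        }
        StringBuilder reversed = new StringBuilder(hex.length());
        for (int i = hex.length() - 2; i >= 0; i -= 2) {
            reversed.append(hex, i, i + 2); // copy one byte (two digits) at a time
        }
        return reversed.toString();
    }

    /** Interprets little-endian hex as a non-negative number. */
    static BigInteger littleEndianHexToBigInteger(String hex) {
        return new BigInteger(reverseHexPairs(hex), 16);
    }

    public static void main(String[] args) {
        // Hex string as printed by HashCode.toString() in the question.
        System.out.println(littleEndianHexToBigInteger("6778ad3f3f3f96b4522dca264174a23b"));
    }
}
```

Note that new BigInteger(hex, 16) always reads the digits as a magnitude, whereas new BigInteger(byte[]) is signed, so the two routes only agree when the most significant byte is below 0x80, as it is for this input.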
If anyone is interested in the reverse answer, converting the python output to the Java output:
import mmh3
import string
murmur = mmh3.hash_bytes('abc')
result = [f'{string.hexdigits[(byte >> 4) & 0xf]}{string.hexdigits[byte & 0xf]}' for byte in murmur]
print(''.join(result))
So, I'm using a proprietary library that has its own implementation for the creation of RSA key pairs. The public key struct looks like this:
typedef struct
{
unsigned int bits; //Length of modulus in bits
unsigned char modulus[MAX_RSA_MOD_LEN]; //Modulus
unsigned char exponent[MAX_RSA_MOD_LEN]; //Exponent
} RSA_PUB_KEY;
I need to figure out a way to extract both the exponent and the modulus so I can send them to a server as part of a validation scheme. I guess that this is a pretty standard procedure (or so I hope). I've already read these two similar questions:
How to convert an Unsigned Character array into a hexadecimal string in C
Printing the hexadecimal representation of a char array[]
But so far I've had no luck. I'm also not sure how to use the "bits" field, if it is necessary at all, to extract the modulus. In short, what I have to do is be able to recreate this public key in Java:
BigInteger m = new BigInteger(MODULUS);
BigInteger e = new BigInteger(EXPONENT);
RSAPublicKeySpec keySpec = new RSAPublicKeySpec(m, e);
KeyFactory fact = KeyFactory.getInstance("RSA");
PublicKey pubKey = fact.generatePublic(keySpec);
return pubKey;
Edit:
This is what I'm doing right now: (RSAPublic is a RSA_PUB_KEY struct as described above).
//RSAPublic.bits = length of modulus in bits
log("Modulus length: "+std::to_string(RSAPublic.bits));
log("Key length: "+std::to_string(keyLengthInBits));
//Calculating buffer size for converted hexadec. representations
int modulusLengthInBytes = (RSAPublic.bits+7)/8 ;
int exponentLengthInBytes = (keyLengthInBits+7)/8;
char convertedMod[modulusLengthInBytes*2+1];
char convertedExp[exponentLengthInBytes*2+1];
//Conversion
int i;
for(i=0; i<modulusLengthInBytes ; i++){
sprintf(&convertedMod[i*2], "%02X", RSAPublic.modulus[i]);
}
for(i=0; i<exponentLengthInBytes ; i++){
sprintf(&convertedExp[i*2], "%02X", RSAPublic.exponent[i]);
}
//Print results
printf("Modulus: %s\n", convertedMod);
printf("Exponent: %s\n", convertedExp);
And this is the output:
Modulus length: 16
Key length: 512
Modulus: 0000
Exponent: 0A000200FFFFFFFFFFFF0000600007004DDA0100B01D0000AEC642017A4513000000000000000000000000000000000000000000000000000000000000000000
I'm assuming that you can't just send binary data since you mention the hexadecimal conversion. The most compact way you can send the data as text would be with base 64 but this is more complex than hexadecimal.
Client side
Convert the unsigned char array to a hexadecimal string using a method from the links you have. The bits field will determine how many bytes from the array to use given by (bits+7)/8.
Depending on implementation you might have to explicitly select the overflow bits or the rest might be zeroed, this also depends on the endianness so since you are unsure on implementation details you might have to fiddle around with it a bit.
Once you have the encoded strings, send them to the server.
Server side
Read the encoded strings from the connection and then pass them to the BigInteger(String val, int radix) constructor using the radix of hexadecimal (16).
You will then have a BigInteger with the value you require.
If the first bytes of the public exponent are all zeros then you are dealing with a big endian array. This is most common. In principle the public exponent can be as large as the modulus, but this is commonly not the case. Most common values are 65537, 17 and 3, maybe even 2 but the 3 and 2 are not such good values. Other 2-4 byte primes are also common.
Now if you know the endianness, you can have a look at the modulus. If the highest byte value is 00 then you are dealing with a signed representation of the modulus. Otherwise it is likely unsigned. The highest order byte of the modulus that contains bits should always be 80 or higher. The reason is that otherwise the key size would be smaller than the given key size. This is assuming that the key size is a multiple of 8 of course.
Java only works with big endian for BigInteger (and any other number representation). So if you have little endian encoding in C then you need to reverse the values in Java. It is probably the best to reverse the hexadecimal values in the string to accomplish that. Make sure you handle 2 hexadecimal characters at a time.
Then, as DrYap suggested, use the hexadecimal constructor of BigInteger. Note that if you end up using a byte array then you may want to use new BigInteger(1, MODULUS) as this makes sure you get a positive number regardless of the highest order bit value in the encoding.
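A rough sketch of the server side, combining the points above: decode the hex, reverse the bytes if the C side turned out to be little endian, and force a positive value with the sign-magnitude constructor. RsaHexParser and its methods are hypothetical names, and whether the proprietary library really uses little endian is something you would have to determine as described.

```java
import java.math.BigInteger;
import java.security.spec.RSAPublicKeySpec;

// Hypothetical helper for turning transmitted hex strings into RSA key material.
class RsaHexParser {

    /** Decodes a hex string (two digits per byte) into a byte array. */
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Reads the hex as a positive number, reversing the byte order first if it is little endian. */
    static BigInteger toPositive(String hex, boolean littleEndian) {
        byte[] bytes = hexToBytes(hex);
        if (littleEndian) {
            for (int i = 0, j = bytes.length - 1; i < j; i++, j--) {
                byte tmp = bytes[i];
                bytes[i] = bytes[j];
                bytes[j] = tmp;
            }
        }
        // Signum 1 guarantees a positive result even if the top bit is set.
        return new BigInteger(1, bytes);
    }

    /** Builds the key spec; pass it to KeyFactory.getInstance("RSA").generatePublic(...). */
    static RSAPublicKeySpec toKeySpec(String modulusHex, String exponentHex, boolean littleEndian) {
        return new RSAPublicKeySpec(toPositive(modulusHex, littleEndian),
                                    toPositive(exponentHex, littleEndian));
    }
}
```

Leading zero bytes (or trailing ones, in the little-endian case) are harmless here, so the zero padding of a fixed-size C array does not need to be stripped before sending.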
I am consuming a C# web service, and one of the parameters it expects is an MD5 hash. Java creates the MD5 hash as signed bytes (the byte array contains negative numbers), while C# generates unsigned bytes (the byte array contains no negative numbers).
I have gone through multiple similar questions on Stack Overflow but did not find any answer to my satisfaction.
All I need is an unsigned byte array similar to the one C# generates. I have tried using BigInteger, but I need an unsigned byte array since I need to do further processing afterwards. BigInteger gives me one single integer, and toByteArray() still returns negative numbers.
If I have to apply two's complement, how can I do that? Then I could loop through the byte array and convert the negative numbers to positive ones.
I am using the following Java code for generating MD5 hash:
String text = "abc";
MessageDigest md = MessageDigest.getInstance("MD5");
// Hash the UTF-8 bytes of the text (needs java.nio.charset.StandardCharsets).
// The length passed to update() must be the byte count, not text.length().
byte[] input = text.getBytes(StandardCharsets.UTF_8);
md.update(input, 0, input.length);
byte[] md5hash = md.digest();
Java bytes are signed numbers, but that only means that when considering a byte (which is a sequence of 8 bits) as a number, Java treats one of the bits as a sign bit, whereas other languages read the same sequence of bits as an unsigned number, containing no sign bit.
The MD5 algorithm is a binary algorithm that transforms a sequence of bits (or bytes) into another sequence of bits (or bytes). The way Java does that is the same as the way any other language does it. It's only when displaying the bytes as numbers that you'll get different outputs depending on the way the language transforms bytes into numbers.
So the short answer is, send an MD5 hash generated using Java to a C# program, and it will work fine.
If you want to display the byte array in Java as unsigned numbers, just use the following code:
for (byte b : bytes) {
System.out.println(b & 0xFF);
}
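As a small self-contained check of the point above, the sketch below hashes "abc" with MD5 and prints the same 16 bytes as hex and as unsigned values. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo class; the hashing itself uses only the standard MessageDigest API.
class Md5Unsigned {

    /** Returns the 16-byte MD5 digest of the UTF-8 encoding of the text. */
    static byte[] md5(String text) {
        try {
            return MessageDigest.getInstance("MD5").digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    /** Widens each byte to an int in the range 0..255, which is how C# would display it. */
    static int[] toUnsigned(byte[] bytes) {
        int[] result = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            result[i] = bytes[i] & 0xFF;
        }
        return result;
    }

    /** Formats the bytes as lowercase hex, two digits per byte. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] hash = md5("abc");
        System.out.println(toHex(hash)); // 900150983cd24fb0d6963f7d28e17f72
        System.out.println(java.util.Arrays.toString(toUnsigned(hash)));
    }
}
```

The byte array sent to the C# service stays untouched; toUnsigned and toHex are only needed for display or for arithmetic on the individual values.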
How can I read a file to bytes in Java?
It is important to note that all the bytes need to be positive, i.e. the negative range cannot be used.
Can this be done in Java, and if yes, how?
I need to be able to multiply the contents of a file by a constant. I was assuming that I can read the bytes into a BigInteger and then multiply, however since some of the bytes are negative I am ending up with 12 13 15 -12 etc and get stuck.
Well, Java doesn't have the concept of unsigned bytes... the byte type is always signed, with values from -128 to 127 inclusive. However, this will interoperate just fine with other systems which have worked with unsigned values. For example, C# code writing a byte of "255" will produce a file where the same value is read as "-1" in Java. Just be careful, and you'll be okay.
EDIT: You can convert the signed byte to an int with the unsigned value very easily using a bitmask. For example:
byte b = -1; // Imagine this was read from the file
int i = b & 0xff;
System.out.println(i); // 255
Do all your arithmetic using int, and then cast back to byte when you need to write it out again.
You generally read binary data from files using FileInputStream or possibly FileChannel.
It's hard to know what else you're looking for at the moment... if you can give more details in your question, we may be able to help you more.
With the unsigned API in Java 8 you have Byte.toUnsignedInt. That'll be a lot cleaner than manually casting and masking out.
To convert the int back to a byte after working with it, you just need a cast: (byte) value
You wrote in a comment (please put such information in the question - there is an edit link for this):
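A minimal sketch of the Java 8 route, using the sample values from the question (12 13 15 -12). UnsignedByteDemo and its methods are hypothetical names; Byte.toUnsignedInt is the standard library call mentioned above.

```java
// Hypothetical demo class for the Java 8 unsigned helpers.
class UnsignedByteDemo {

    /** Converts every signed byte to its unsigned value in the range 0..255. */
    static int[] toUnsignedInts(byte[] data) {
        int[] result = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            result[i] = Byte.toUnsignedInt(data[i]); // same result as data[i] & 0xFF
        }
        return result;
    }

    /** Narrows an int back to a byte; only the lowest 8 bits are kept. */
    static byte toByte(int value) {
        return (byte) value;
    }

    public static void main(String[] args) {
        int[] unsigned = toUnsignedInts(new byte[] {12, 13, 15, -12});
        System.out.println(java.util.Arrays.toString(unsigned)); // [12, 13, 15, 244]
        System.out.println(toByte(244)); // -12
    }
}
```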
I need to be able to multiply the contents of a file by a constant.
I was assuming that I can read the bytes into a BigInteger and then
multiply, however since some of the bytes are negative I am ending
up with 12 13 15 -12 etc and gets stuck.
If you want to use the whole file as a BigInteger, read it in a byte[], and give this array (as a whole) to the BigInteger-constructor.
/**
 * Reads a file and converts the content to a BigInteger.
 * @param file the file to read. The content is interpreted as a
 *        big-endian base-256 number.
 * @param signed if true, interpret the file's content as the two's complement
 *        representation of a signed number.
 *        If false, interpret the file's content as an unsigned
 *        (nonnegative) number.
 */
public static BigInteger fileToBigInteger(File file, boolean signed)
    throws IOException
{
    // Array lengths are ints, so this assumes the file is smaller than 2 GiB.
    byte[] array = new byte[(int) file.length()];
    try (InputStream in = new FileInputStream(file)) {
        int i = 0; int r;
        // read() returns the number of bytes read, or -1 at the end of the stream
        while (i < array.length && (r = in.read(array, i, array.length - i)) > 0) {
            i = i + r;
        }
    }
    if (signed) {
        return new BigInteger(array);
    }
    else {
        return new BigInteger(1, array);
    }
}
Then you can multiply your BigInteger and save the result in a new file (using the toByteArray() method).
Of course, this very much depends on the format of your file - my method assumes the file contains the result of the toByteArray() method, not some other format. If you have some other format, please add information about this to your question.
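For completeness, the whole read, multiply and write cycle could look roughly like this. It is a sketch under assumptions: FileMultiplier and its methods are hypothetical names, the file content is treated as an unsigned big-endian number, and Files.readAllBytes is used instead of a manual read loop, which is only reasonable for files that fit in memory.

```java
import java.io.IOException;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; not part of the original answer.
class FileMultiplier {

    /** Treats content as an unsigned big-endian number and returns the encoded product. */
    static byte[] multiplyUnsigned(byte[] content, BigInteger factor) {
        BigInteger value = new BigInteger(1, content);
        // toByteArray() is signed, so a leading 00 byte may be present in the result.
        return value.multiply(factor).toByteArray();
    }

    /** Reads the input file, multiplies its content and writes the product to the output file. */
    static void multiplyFile(Path input, Path output, BigInteger factor) throws IOException {
        Files.write(output, multiplyUnsigned(Files.readAllBytes(input), factor));
    }
}
```

For instance, a file holding the single byte 0xFF (255) multiplied by 2 produces the two bytes 0x01 0xFE (510).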
"I need to be able to multiply the contents of a file by a constant." seems quite a dubious goal - what do you really want to do?
If using a larger integer type internally is not a problem, just go with the easy solution, and add 128 to all integers before multiplying them. Instead of -128 to 127, you get 0 to 255. Addition is not difficult ;)
Also, remember that the arithmetic and bitwise operators in Java return an int when applied to bytes, so:
byte a = 0;
byte b = 1;
byte c = a | b;
would give a compile-time error since a | b returns an int. You would have to do
byte c = (byte) (a | b);
So I would suggest just adding 128 to all your numbers before you multiply them.
Some testing revealed that this returns the unsigned byte values in [0…255] range one by one from the file:
Reader bytestream = new BufferedReader(new InputStreamReader(
new FileInputStream(inputFileName), "ISO-8859-1"));
int unsignedByte;
while((unsignedByte = bytestream.read()) != -1){
// do work
}
It seems to work for all bytes in the range, including those that no characters are defined for in ISO 8859-1.