SHA-1 hashing on Java and C#

I'm trying to validate the content of an XML node with SHA-1. Basically, we generate a SHA-1 hash from the content of that node, and both sides (the C# client and the Java server) should produce exactly the same hash.
The problem is that I have compared both texts with a diff tool and there is no difference, yet I'm getting a different hash than the client.
C# hash : 60-53-58-69-29-EB-53-BD-85-31-79-28-A0-F9-42-B6-DE-1B-A6-0A
Java hash: E79D7E6F2A6F5D776447714D896D4C3A0CBC793
The way the client (C#) is generating the hash is this:
try
{
Byte[] stream = null;
using (System.Security.Cryptography.SHA1CryptoServiceProvider shaProvider = new System.Security.Cryptography.SHA1CryptoServiceProvider())
{
stream = shaProvider.ComputeHash(System.Text.Encoding.UTF8.GetBytes(text));
if (stream == null)
{
hash = "Error";
}
else
{
hash = System.BitConverter.ToString(stream);
}
}
}
catch (Exception error)
{
hash = string.Format("Error SHA-1: {0}", error);
}
return hash;
and this is how the server (Java) is generating the hash:
byte[] key = content.getBytes();
MessageDigest md = MessageDigest.getInstance("SHA1");
byte[] hash = md.digest(key);
String result = "";
for (byte b : hash) {
result += Integer.toHexString(b & 255);
}
return result.toUpperCase();
Can someone help me? Thanks :)
UPDATE:
In order to check what's going on I have checked other ways to get a SHA1 hash in C# and I found this:
/// <summary>
/// Compute hash for string encoded as UTF8
/// </summary>
/// <param name="s">String to be hashed</param>
/// <returns>40-character hex string</returns>
public static string SHA1HashStringForUTF8String(string s)
{
byte[] bytes = Encoding.UTF8.GetBytes(s);
using (var sha1 = SHA1.Create())
{
byte[] hashBytes = sha1.ComputeHash(bytes);
return System.BitConverter.ToString(hashBytes).Replace("-",string.Empty);
}
}
This code gives this output:
E79D07E6F2A6F5D776447714D896D4C3A0CBC793
And I just noticed that Python gives the same output (sorry, I should have double-checked this).
So this is the deal:
Using this provider: System.Security.Cryptography.SHA1CryptoServiceProvider shaProvider = new System.Security.Cryptography.SHA1CryptoServiceProvider()
gives a completely different output on three different machines.
Using the above method in C# gives the same result as Python does. Also, for some reason Java gives a slightly different output:
E79D7E6F2A6F5D776447714D896D4C3A0CBC793
Any ideas? Is Java the problem? Is the byte-to-hex method in Java the problem? Is there an alternative?
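The Java output above is one character shorter than the C#/Python output: E79D7E... versus E79D07E.... That is what happens when Integer.toHexString(b & 255) is appended without padding, because a byte such as 0x07 is rendered as "7" instead of "07". A minimal sketch of a padded conversion follows; the class and method names are made up for illustration, and UTF-8 is assumed to match the C# side.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Sha1Hex {
    // Formats every byte as exactly two uppercase hex digits,
    // so 0x07 becomes "07" rather than "7".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    // Hashes the UTF-8 bytes of the text (assumed to match Encoding.UTF8 in C#).
    static String sha1Hex(String content) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        return toHex(md.digest(content.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        System.out.println(sha1Hex("test"));
    }
}
```

With this conversion the result is always 40 characters long for SHA-1, whatever the byte values are.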

Try using this as your hashing in C#:
static string Hash(string input)
{
using (SHA1Managed sha1 = new SHA1Managed())
{
var hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(input));
var sb = new StringBuilder(hash.Length * 2);
foreach (byte b in hash)
{
// use "X2" instead if you want uppercase
sb.Append(b.ToString("x2"));
}
return sb.ToString();
}
}
Hash("test"); //a94a8fe5ccb19ba61c4c0873d391e987982fbbd3
And then use this as your Java hashing:
private static String convertToHex(byte[] data) {
StringBuilder buf = new StringBuilder();
for (byte b : data) {
int halfbyte = (b >>> 4) & 0x0F;
int two_halfs = 0;
do {
buf.append((0 <= halfbyte) && (halfbyte <= 9) ? (char) ('0' + halfbyte) : (char) ('a' + (halfbyte - 10)));
halfbyte = b & 0x0F;
} while (two_halfs++ < 1);
}
return buf.toString();
}
public static String SHA1(String text) throws NoSuchAlgorithmException, UnsupportedEncodingException {
MessageDigest md = MessageDigest.getInstance("SHA-1");
byte[] textBytes = text.getBytes("UTF-8"); // same encoding as Encoding.UTF8 on the C# side
md.update(textBytes, 0, textBytes.length);
byte[] sha1hash = md.digest();
return convertToHex(sha1hash);
}
SHA1("test"); //a94a8fe5ccb19ba61c4c0873d391e987982fbbd3
Note you need the following imports:
import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
You can catch these exceptions instead of declaring them with throws; adjust to best fit your code!

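Your problem is that you're not hashing the same bytes in both APIs.
The C# code encodes the text as UTF-8, while content.getBytes() in Java uses the platform default charset. If you choose to modify the Java version, it should look like this:
byte[] key = content.getBytes("UTF-8");
[...]
If you choose to modify the C# version instead, it has to use whatever encoding the Java side uses. For example, if Java called content.getBytes("UTF-16LE"), the C# side would be (note that the property is Encoding.Unicode; there is no Encoding.UTF16):
stream = shaProvider.ComputeHash(System.Text.Encoding.Unicode.GetBytes(text));
[...]
Either way, both APIs should get the key's bytes through the same encoding.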

Related

Make JAVA MD5 hash match C# MD5 hash

My job is to rewrite a bunch of Java code in C#.
This is the Java code:
public static String CreateMD5(String str) {
try {
byte[] digest = MessageDigest.getInstance("MD5").digest(str.getBytes("UTF-8"));
StringBuffer stringBuffer = new StringBuffer();
for (byte b : digest) {
// i can not understand here
stringBuffer.append(Integer.toHexString((b & 255) | 256).substring(1, 3));
}
return stringBuffer.toString();
} catch (UnsupportedEncodingException | NoSuchAlgorithmException unused) {
return null;
}
}
OK. As you can see, this code computes an MD5 hash, but I cannot understand the part I have marked.
I tried this C# code to rewrite the Java code:
public static string CreateMD5(string input)
{
// Use input string to calculate MD5 hash
using (System.Security.Cryptography.MD5 md5 = System.Security.Cryptography.MD5.Create())
{
byte[] inputBytes = System.Text.Encoding.ASCII.GetBytes(input);
byte[] hashBytes = md5.ComputeHash(inputBytes);
// Convert the byte array to hexadecimal string
StringBuilder sb = new StringBuilder();
for (int i = 0; i < hashBytes.Length; i++)
{
sb.Append(hashBytes[i].ToString("X2"));
}
return sb.ToString();
}
}
Both snippets produce MD5 hash strings, but the results are different.
There is a difference in encoding between the two code snippets you've shown - your Java code uses UTF-8, but your C# code uses ASCII. This will result in a different MD5 hash computation.
Change your C# code from:
byte[] inputBytes = System.Text.Encoding.ASCII.GetBytes(input);
to:
byte[] inputBytes = System.Text.Encoding.UTF8.GetBytes(input);
This should™ fix your problem, provided there are no other code conversion errors.
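As for the line marked in the Java code: (b & 255) turns the signed byte into a value from 0 to 255, and | 256 sets bit 8 so that Integer.toHexString always yields three digits ("100" to "1ff"). substring(1, 3) then drops the leading "1", leaving exactly two hex digits with any leading zero preserved. A small sketch of that trick in isolation (the class and method names are hypothetical):

```java
class HexPadding {
    // Returns exactly two lowercase hex digits for any byte value.
    static String twoDigits(byte b) {
        int unsigned = b & 255;           // 0..255, removes the sign extension
        int withMarker = unsigned | 256;  // 0x100..0x1ff, always three hex digits
        return Integer.toHexString(withMarker).substring(1, 3); // drop the marker digit
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(7 & 255)); // unpadded form, for comparison
        System.out.println(twoDigits((byte) 7));          // padded form
    }
}
```

The C# call hashBytes[i].ToString("X2") pads the same way but produces uppercase digits, so the two outputs still need to be compared case-insensitively or converted to the same case.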

Encrypt string using given modulus and exponent

I need to replicate the functionality of the following JAVA code that receives a string with the exponent and modulus of a public key to generate a public key with said parameters and encrypt a string:
package snippet;
import java.math.BigInteger;
import java.security.KeyFactory;
import java.security.Security;
import java.security.interfaces.RSAPublicKey;
import java.security.spec.RSAPublicKeySpec;
import javax.crypto.Cipher;
public class Snippet {
public static void main(String ... strings) {
try {
// Needed if you don't have this provider
Security.addProvider(new org.bouncycastle.jce.provider.BouncyCastleProvider());
//String and received public key example
String ReceivedString = "1234";
String publicRSA = "010001|0097152d7034a8b48383d3dba20c43d049";
EncryptFunc(ReceivedString, publicRSA);
//The result obtained from the ReceivedString and the publicRSA is as follows:
//Result in hex [1234] -> [777786fe162598689a8dc172ed9418cb]
} catch (Exception ex) {
System.out.println("Error: " );
ex.printStackTrace();
}
}
public static String EncryptFunc(String ReceivedString, String clavePublica) throws Exception {
String result = "";
//We separate the received public string into exponent and modulus
//We receive it as "exponent|modulus"
String[] SplitKey = clavePublica.split("\\|");
KeyFactory keyFactory = KeyFactory.getInstance("RSA","BC");
RSAPublicKeySpec ks = new RSAPublicKeySpec(new BigInteger(hex2byte(SplitKey[1])), new BigInteger(hex2byte(SplitKey[0])));
//With these specs, we generate the public key
RSAPublicKey pubKey = (RSAPublicKey)keyFactory.generatePublic(ks);
//We instantiate the cypher, with the EncryptFunc and the obtained public key
Cipher cipher= Cipher.getInstance("RSA/None/NoPadding","BC");
cipher.init(Cipher.ENCRYPT_MODE, pubKey);
//We reverse the ReceivedString and encrypt it
String ReceivedStringReverse = reverse(ReceivedString);
byte[] cipherText2 = cipher.doFinal(ReceivedStringReverse.getBytes("UTF8"));
result = byte2hex(cipherText2);
System.out.println("result in hex ["+ReceivedString+"] -> ["+result+"]");
return result;
}
public static byte[] hex2byte(String s) {
int len = s.length();
byte[] data = new byte[len / 2];
for (int i = 0; i < len; i += 2) {
data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4)
+ Character.digit(s.charAt(i+1), 16));
}
return data;
}
public static String byte2hex(byte[] bytes) {
StringBuilder result = new StringBuilder();
for (byte aByte : bytes) {
result.append(String.format("%02x", aByte));
// upper case
// result.append(String.format("%02X", aByte));
}
return result.toString();
}
public static String reverse(String source) {
int i, len = source.length();
StringBuilder dest = new StringBuilder(len);
for (i = (len - 1); i >= 0; i--){
dest.append(source.charAt(i));
}
return dest.toString();
}
}
I've tried several approaches with this one and have searched through several related questions.
I managed to create the public key with the given parameters, but the results are always different when I encrypt the string:
using Org.BouncyCastle.Crypto;
using Org.BouncyCastle.Crypto.Digests;
using Org.BouncyCastle.Crypto.Encodings;
using Org.BouncyCastle.Crypto.Engines;
using Org.BouncyCastle.Crypto.Generators;
using Org.BouncyCastle.Crypto.Paddings;
using Org.BouncyCastle.Crypto.Parameters;
using Org.BouncyCastle.Math;
using Org.BouncyCastle.Security;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;
namespace RSACypherTest
{
public class Program
{
public static RSACryptoServiceProvider rsa;
static void Main(string[] args)
{
string str = "1234";
string publicRSA = "010001|0097152d7034a8b48383d3dba20c43d049";
string encrypted = "";
Console.WriteLine("Original text: " + str);
encrypted = Encrypt(str, publicRSA);
Console.WriteLine("Encrypted text: " + encrypted);
Console.ReadLine();
}
public static string Encrypt(string str, string PublicRSA)
{
string[] Separated = PublicRSA.Split('|');
RsaKeyParameters pubParameters = MakeKey(Separated[1], Separated[0], false);
IAsymmetricBlockCipher eng = new Pkcs1Encoding(new RsaEngine());
eng.Init(true, pubParameters);
byte[] plaintext = Encoding.UTF8.GetBytes(Reverse(str));
byte[] encdata = eng.ProcessBlock(plaintext, 0, plaintext.Length);
return ByteArrayToString(encdata);
}
public static string Reverse(string s)
{
char[] charArray = s.ToCharArray();
Array.Reverse(charArray);
return new string(charArray);
}
public static string ByteArrayToString(byte[] ba)
{
return BitConverter.ToString(ba).Replace("-", "");
}
public static byte[] StringToByteArray(string hex)
{
int NumberChars = hex.Length;
byte[] bytes = new byte[NumberChars / 2];
for (int i = 0; i < NumberChars; i += 2)
bytes[i / 2] = Convert.ToByte(hex.Substring(i, 2), 16);
return bytes;
}
private static RsaKeyParameters MakeKey(string modulusHexString, string exponentHexString, bool isPrivateKey)
{
var modulus = new BigInteger(modulusHexString, 16);
var exponent = new BigInteger(exponentHexString, 16);
return new RsaKeyParameters(isPrivateKey, modulus, exponent);
}
}
}
I'm trying to use BouncyCastle because it seems to be the most efficient way of dealing with the key generation and everything. Any help concerning this would be very much appreciated.
Thanks in advance.
This is not the answer to your question, but it may help you understand RSA encryption.
I set up a sample encryption program in C# and used your public key: I converted the BigInteger modulus and exponent to Base64 values and wrote the XML string representation of the public key to use it for encryption. The key length is good for at most 5 bytes of data (a 16-byte modulus minus 11 bytes of PKCS#1 v1.5 padding).
When you run the encryption 5 times, you receive different encoded data (here in Base64 encoding) on each run, because the padding is random. That is the expected behavior of RSA encryption.
Although C# allows me to "build" such a short key, it is not possible to generate a fresh key pair of that length, and I doubt that Bouncy Castle would do it either (but here on SO there are many colleagues with a much better understanding of BC :-).
If you would like to try the program, you can use the following external link: https://jdoodle.com/ia/40.
Result:
load a pre created public key
publicKeyXML2: <RSAKeyValue><Modulus>lxUtcDSotIOD09uiDEPQSQ==</Modulus><Exponent>AQAB</Exponent></RSAKeyValue>
encryptedData in Base64: JIFfO7HXCvdi0nSxKb0eLA==
encryptedData in Base64: dvtRw0U0KtT/pDJZW2X0FA==
encryptedData in Base64: CqJJKZevO6jWH6DQ1dnkhQ==
encryptedData in Base64: G7cL6BBwxysItvD/Rg0PuA==
encryptedData in Base64: HcfZJITu/PzN84WgI8yc6g==
code:
using System;
using System.Security.Cryptography;
using System.Text;
class RSACSPSample
{
static void Main()
{
try
{
//Create byte arrays to hold original, encrypted, and decrypted data.
byte[] dataToEncrypt = System.Text.Encoding.UTF8.GetBytes("1234");
byte[] encryptedData;
//Create a new instance of RSACryptoServiceProvider to generate
//public and private key data.
using (RSACryptoServiceProvider RSA = new RSACryptoServiceProvider())
{
Console.WriteLine("load a pre created public key");
string publicKeyXML = "<RSAKeyValue><Modulus>AJcVLXA0qLSDg9PbogxD0Ek=</Modulus><Exponent>AQAB</Exponent></RSAKeyValue>";
RSA.FromXmlString(publicKeyXML);
string publicKeyXML2 = RSA.ToXmlString(false);
Console.WriteLine("publicKeyXML2: " + publicKeyXML2);
Console.WriteLine();
//Pass the data to ENCRYPT, the public key information
//(using RSACryptoServiceProvider.ExportParameters(false),
//and a boolean flag specifying no OAEP padding.
for (int i = 0; i < 5; i++)
{
encryptedData = RSAEncrypt(dataToEncrypt, RSA.ExportParameters(false), false);
string encryptedDataBase64 = Convert.ToBase64String(encryptedData);
Console.WriteLine("encryptedData in Base64: " + encryptedDataBase64);
}
}
}
catch (ArgumentNullException)
{
//Catch this exception in case the encryption did
//not succeed.
Console.WriteLine("Encryption failed.");
}
}
public static byte[] RSAEncrypt(byte[] DataToEncrypt, RSAParameters RSAKeyInfo, bool DoOAEPPadding)
{
try
{
byte[] encryptedData;
//Create a new instance of RSACryptoServiceProvider.
using (RSACryptoServiceProvider RSA = new RSACryptoServiceProvider())
{
//Import the RSA Key information. This only needs
//to include the public key information.
RSA.ImportParameters(RSAKeyInfo);
//Encrypt the passed byte array and specify OAEP padding.
//OAEP padding is only available on Microsoft Windows XP or
//later.
encryptedData = RSA.Encrypt(DataToEncrypt, DoOAEPPadding);
}
return encryptedData;
}
//Catch and display a CryptographicException
//to the console.
catch (CryptographicException e)
{
Console.WriteLine(e.Message);
return null;
}
}
}
While I won't mark my own answer as the correct one, I've found that it is possible to recreate the entire functionality of the Java code mentioned in my question.
As Michael Fehr mentions in his answer, it is absolutely logical that any encryption method will try to avoid creating repeating or predictable patterns.
Since in this particular situation the aim is to replicate the Java code's functionality, and that functionality revolves around getting the same result when encrypting a string with a given public key, we can build on an answer from another post to write a piece of code like the following:
private static string EncryptMessage(string str, string publicRSA)
{
string[] Separated = publicRSA.Split('|');
RsaKeyParameters pubParameters = MakeKey(Separated[1], Separated[0], false);
var eng = new RsaEngine();
eng.Init(true, pubParameters);
string x = Reverse(str);
byte[] plaintext = Encoding.UTF8.GetBytes(x);
var encdata = ByteArrayToString(eng.ProcessBlock(plaintext, 0, plaintext.Length));
return encdata;
}
private static RsaKeyParameters MakeKey(string modulusHexString, string exponentHexString, bool isPrivateKey)
{
byte[] mod = StringToByteArray(modulusHexString);
byte[] exp = StringToByteArray(exponentHexString);
var modulus = new BigInteger(mod);
var exponent = new BigInteger(exp);
return new RsaKeyParameters(isPrivateKey, modulus, exponent);
}
To recap:
As Michael Fehr says, it is not only normal but expected for a cryptography engine NOT to generate repeatable/predictable patterns.
To deliver on the previous point, they add random "padding" to the messages.
It's possible (but not recommended) to use BouncyCastle to create a no-padding engine, emulating the functionality of Java code such as this: Cipher rsa = Cipher.getInstance("RSA/ECB/nopadding");
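To see why a no-padding engine is repeatable, it helps to note that unpadded ("textbook") RSA encryption is just the modular exponentiation c = m^e mod n, with nothing random in it. The sketch below shows that arithmetic with java.math.BigInteger only. The class and method names are invented for illustration, it is not a drop-in replacement for the BouncyCastle code, and no claim is made that its output formatting matches BouncyCastle byte for byte.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

class RawRsa {
    // Textbook RSA: c = m^e mod n. Deterministic, hence insecure for real use.
    static BigInteger encrypt(BigInteger m, BigInteger e, BigInteger n) {
        if (m.signum() < 0 || m.compareTo(n) >= 0) {
            throw new IllegalArgumentException("message must satisfy 0 <= m < n");
        }
        return m.modPow(e, n);
    }

    // Interprets the UTF-8 bytes of the text as one unsigned big-endian integer.
    static BigInteger toInteger(String text) {
        return new BigInteger(1, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Key from the question, in "exponent|modulus" order.
        BigInteger e = new BigInteger("010001", 16);
        BigInteger n = new BigInteger("0097152d7034a8b48383d3dba20c43d049", 16);
        // The question's code reverses the input before encrypting.
        String reversed = new StringBuilder("1234").reverse().toString();
        BigInteger c = encrypt(toInteger(reversed), e, n);
        System.out.println(c.toString(16)); // same value on every run
    }
}
```

Because the same message always maps to the same ciphertext, an attacker can confirm guesses by encrypting them with the public key, which is the reason padded modes are preferred whenever compatibility does not force this scheme.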

SHA512 in C# and Java are different

I have the following code in C# and Java, and the SHA-512 results differ. Is it possible to make the results identical? I suspect the problem is in BitConverter; is Base64 its analog in Java? I tried
Base64.getEncoder().encodeToString(str);
But I get an error because of getEncoder(). Do I need a library for this?
Code in C#:
public string Hash(string str)
{
string resultStr = String.Empty;
byte[] data = new UTF8Encoding().GetBytes(str);
byte[] result;
SHA512 shaM = new SHA512Managed();
result = shaM.ComputeHash(data);
resultStr = ReverseString(BitConverter.ToString(result).ToLower().Replace("-", String.Empty));
return resultStr.Substring(5, 25);
}
public static string ReverseString(string s)
{
char[] charArray = s.ToCharArray();
Array.Reverse(charArray);
return new string(charArray);
}
Code in Java:
public String Hash(String str) {
String result = ""; // declared here so it is visible after the try block
try {
MessageDigest digest = MessageDigest.getInstance("SHA-512");
digest.update(str.getBytes("UTF-16LE"));
byte messageDigest[] = digest.digest();
StringBuffer hexString = new StringBuffer();
for (int i = 0; i < messageDigest.length; i++) {
String h = Integer.toHexString(0xFF & messageDigest[i]);
while (h.length() < 2)
h = "0" + h;
hexString.append(h);
}
result = hexString.toString().toLowerCase();
} catch (NoSuchAlgorithmException e) {
e.printStackTrace();
} catch (UnsupportedEncodingException e) {
e.printStackTrace();
}
return ReverseString(result).substring(5, 25);
}
public static String ReverseString(String s)
{
return new StringBuilder(s).reverse().toString();
}
You're hashing different data - in Java you're converting the string to UTF-16:
digest.update(str.getBytes("UTF-16LE"));
In C# you're using UTF-8:
byte[] data = new UTF8Encoding().GetBytes(str);
(I'm not sure why you're creating a new UTF8Encoding rather than using Encoding.UTF8, admittedly.)
With different input, you will get different hashes.
In general, the way to diagnose problems like this is to compare the data at every step of the transformation, whether that's through logging or debugging. In this case you have four transformations:
Message string to message bytes
Message bytes to hash bytes
Hash bytes to hash string (hex)
Hash string (hex) to reversed hash string (hex)
Next time, check the output of each step, and you'll work out where the problem is.
(It's not obvious why you'd want to reverse the hex output anyway, but that's a different matter.)
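One way to make that step-by-step comparison concrete is to give each transformation its own method and print every intermediate value, then do the same on the C# side and compare line by line. This is only a sketch: the class name, method names, and sample message are illustrative, and UTF-8 is assumed to match the C# code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class HashSteps {
    // Step 1: message string to message bytes.
    static byte[] messageBytes(String message) {
        return message.getBytes(StandardCharsets.UTF_8);
    }

    // Step 2: message bytes to hash bytes.
    static byte[] hashBytes(byte[] data) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-512").digest(data);
    }

    // Step 3: hash bytes to lowercase hex, two digits per byte.
    static String hex(byte[] hash) {
        StringBuilder sb = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Step 4: hex string to reversed hex string.
    static String reversed(String hex) {
        return new StringBuilder(hex).reverse().toString();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] data = messageBytes("sample message");
        System.out.println("1 bytes:    " + Arrays.toString(data));
        byte[] hash = hashBytes(data);
        System.out.println("2 hash:     " + Arrays.toString(hash));
        String hexString = hex(hash);
        System.out.println("3 hex:      " + hexString);
        System.out.println("4 reversed: " + reversed(hexString));
    }
}
```

The first line where the two platforms disagree identifies the faulty step; in the question it would already be step 1, because the encodings differ.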
The problem was in the input to the hash: the string contained only the salt without the rest of the data. The rest of the data was empty because of an error in the definition of the EditText (it returned an empty string). I also fixed the encoding in Java to UTF-8.

Android's string hash doesn't match server side's

I'm developing an Android app and I need to send some data from a server to the Android device.
To prevent the app from downloading too much data, I wrote a PHP service that takes a hash (the MD5 hash of the last downloaded data) provided by the Android app and compares it to the hash of the latest data on the server. If the hashes match, it prints 'no_new_data'; otherwise it prints the latest data. PHP uses md5($string) to calculate the hash, and this part seems to work fine.
The problem is that the hash calculated on the device never matches the server's, even though the strings seem to be the same. I even tried changing the encoding, but it didn't help.
My md5 java code:
public static String md5(String base){
try {
MessageDigest md = MessageDigest.getInstance("MD5");
md.update(base.getBytes());
byte byteData[] = md.digest();
//convert the byte to hex format method 1
StringBuffer sb = new StringBuffer();
for (int i = 0; i < byteData.length; i++) {
sb.append(Integer.toString((byteData[i] & 0xff) + 0x100, 16).substring(1));
}
//System.out.println("Digest(in hex format):: " + sb.toString());
//convert the byte to hex format method 2
StringBuffer hexString = new StringBuffer();
for (int i=0;i<byteData.length;i++) {
String hex=Integer.toHexString(0xff & byteData[i]);
if(hex.length()==1) hexString.append('0');
hexString.append(hex);
}
return hexString.toString();
}catch (Exception e){
return "a";
}
}
Thanks :)
Sometimes the MD5 hash is different from the server-side hash. Try this method.
public static String getMD5Hash(String s) throws NoSuchAlgorithmException {
String result = s;
if (s != null) {
MessageDigest md = MessageDigest.getInstance("MD5"); // or "SHA-1"
md.update(s.getBytes());
BigInteger hash = new BigInteger(1, md.digest());
result = hash.toString(16);
while (result.length() < 32) { // 40 for SHA-1
result = "0" + result;
}
}
return result;
}
Never, ever use String.getBytes(), which depends on the platform-default charset, which is almost never what you want. It seems likely that the platform default charset differs between Android and your server side.
Pass it a Charset instead, e.g.
myString.getBytes(StandardCharsets.UTF_8)
if you have Java 7, or
myString.getBytes("UTF-8")
if you cannot.
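To illustrate how much the charset matters, the following sketch hashes the same string under two encodings. For pure ASCII text the digests agree, but as soon as a non-ASCII character appears the byte sequences, and therefore the digests, differ. The class and method names are made up for this example.

```java
import java.math.BigInteger;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class CharsetMatters {
    // MD5 of the string's bytes in the given charset, as 32 lowercase hex digits.
    static String md5Hex(String s, Charset charset) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5").digest(s.getBytes(charset));
        return String.format("%032x", new BigInteger(1, digest));
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        String ascii = "hello";
        String accented = "caf\u00e9"; // "café": é is 1 byte in ISO-8859-1, 2 bytes in UTF-8

        // Same digest: ASCII characters encode identically in both charsets.
        System.out.println(md5Hex(ascii, StandardCharsets.UTF_8));
        System.out.println(md5Hex(ascii, StandardCharsets.ISO_8859_1));

        // Different digests: the underlying bytes differ.
        System.out.println(md5Hex(accented, StandardCharsets.UTF_8));
        System.out.println(md5Hex(accented, StandardCharsets.ISO_8859_1));
    }
}
```

This also explains why such bugs can go unnoticed for a long time: tests with plain English strings pass, and the mismatch only shows up with real data.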

Getting a File's MD5 Checksum in Java

I am looking to use Java to get the MD5 checksum of a file. I was really surprised but I haven't been able to find anything that shows how to get the MD5 checksum of a file.
How is it done?
There's an input stream decorator, java.security.DigestInputStream, so that you can compute the digest while using the input stream as you normally would, instead of having to make an extra pass over the data.
MessageDigest md = MessageDigest.getInstance("MD5");
try (InputStream is = Files.newInputStream(Paths.get("file.txt"));
DigestInputStream dis = new DigestInputStream(is, md))
{
/* Read decorated stream (dis) to EOF as normal... */
}
byte[] digest = md.digest();
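For completeness, here is a minimal runnable version of the snippet above in which "reading as normal" is simply draining the stream into a buffer. The class name, method name, and buffer size are choices made for this sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class StreamDigest {
    // Computes the MD5 digest of a file without loading it fully into memory.
    static byte[] md5Of(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream is = Files.newInputStream(file);
             DigestInputStream dis = new DigestInputStream(is, md)) {
            byte[] buffer = new byte[8192];
            while (dis.read(buffer) != -1) {
                // Nothing to do: each read updates the digest as a side effect.
            }
        }
        return md.digest();
    }

    public static void main(String[] args) throws Exception {
        byte[] digest = md5Of(Paths.get(args[0]));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        System.out.println(hex);
    }
}
```

In a real program the loop body would usually do the actual work on the data (parsing, copying, and so on), which is the point of the decorator: the digest comes for free during that single pass.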
Use DigestUtils from Apache Commons Codec library:
try (InputStream is = Files.newInputStream(Paths.get("file.zip"))) {
String md5 = org.apache.commons.codec.digest.DigestUtils.md5Hex(is);
}
There's an example at Real's Java-How-to using the MessageDigest class.
Check that page for examples using CRC32 and SHA-1 as well.
import java.io.*;
import java.security.MessageDigest;
public class MD5Checksum {
public static byte[] createChecksum(String filename) throws Exception {
InputStream fis = new FileInputStream(filename);
byte[] buffer = new byte[1024];
MessageDigest complete = MessageDigest.getInstance("MD5");
int numRead;
do {
numRead = fis.read(buffer);
if (numRead > 0) {
complete.update(buffer, 0, numRead);
}
} while (numRead != -1);
fis.close();
return complete.digest();
}
// see this How-to for a faster way to convert
// a byte array to a HEX string
public static String getMD5Checksum(String filename) throws Exception {
byte[] b = createChecksum(filename);
String result = "";
for (int i=0; i < b.length; i++) {
result += Integer.toString( ( b[i] & 0xff ) + 0x100, 16).substring( 1 );
}
return result;
}
public static void main(String args[]) {
try {
System.out.println(getMD5Checksum("apache-tomcat-5.5.17.exe"));
// output :
// 0bb2827c5eacf570b6064e24e0e6653b
// ref :
// http://www.apache.org/dist/
// tomcat/tomcat-5/v5.5.17/bin
// /apache-tomcat-5.5.17.exe.MD5
// 0bb2827c5eacf570b6064e24e0e6653b *apache-tomcat-5.5.17.exe
}
catch (Exception e) {
e.printStackTrace();
}
}
}
The com.google.common.hash API offers:
A unified user-friendly API for all hash functions
Seedable 32- and 128-bit implementations of murmur3
md5(), sha1(), sha256(), sha512() adapters, change only one line of code to switch between these, and murmur.
goodFastHash(int bits), for when you don't care what algorithm you use
General utilities for HashCode instances, like combineOrdered / combineUnordered
Read the User Guide (IO Explained, Hashing Explained).
For your use-case Files.hash() computes and returns the digest value for a file.
For example a sha-1 digest calculation (change SHA-1 to MD5 to get MD5 digest)
HashCode hc = Files.asByteSource(file).hash(Hashing.sha1());
System.out.println("SHA-1: " + hc.toString());
Note that crc32 is much faster than md5, so use crc32 if you do not need a cryptographically secure checksum. Note also that md5 should not be used to store passwords and the like, since it is too easy to brute force; for passwords use a dedicated password-hashing function such as bcrypt or scrypt instead.
For long term protection with hashes a Merkle signature scheme adds to the security and The Post Quantum Cryptography Study Group sponsored by the European Commission has recommended use of this cryptography for long term protection against quantum computers (ref).
Note that crc32 has a higher collision rate than the others.
Using nio2 (Java 7+) and no external libraries:
byte[] b = Files.readAllBytes(Paths.get("/path/to/file"));
byte[] hash = MessageDigest.getInstance("MD5").digest(b);
To compare the result with an expected checksum:
String expected = "2252290BC44BEAD16AA1BF89948472E8";
String actual = DatatypeConverter.printHexBinary(hash);
System.out.println(expected.equalsIgnoreCase(actual) ? "MATCH" : "NO MATCH");
Guava now provides a new, consistent hashing API that is much more user-friendly than the various hashing APIs provided in the JDK. See Hashing Explained. For a file, you can get the MD5 sum, CRC32 (with version 14.0+) or many other hashes easily:
HashCode md5 = Files.hash(file, Hashing.md5());
byte[] md5Bytes = md5.asBytes();
String md5Hex = md5.toString();
HashCode crc32 = Files.hash(file, Hashing.crc32());
int crc32Int = crc32.asInt();
// the Checksum API returns a long, but it's padded with 0s for 32-bit CRC
// this is the value you would get if using that API directly
long checksumResult = crc32.padToLong();
Ok. I had to add. One line implementation for those who already have Spring and Apache Commons dependency or are planning to add it:
DigestUtils.md5DigestAsHex(FileUtils.readFileToByteArray(file))
For an Apache Commons-only option (credit @duleshi):
DigestUtils.md5Hex(FileUtils.readFileToByteArray(file))
Hope this helps someone.
A simple approach with no third party libraries using Java 7
String path = "your complete file path";
MessageDigest md = MessageDigest.getInstance("MD5");
md.update(Files.readAllBytes(Paths.get(path)));
byte[] digest = md.digest();
If you need to print this byte array. Use as below
System.out.println(Arrays.toString(digest));
If you need hex string out of this digest. Use as below
String digestInHex = DatatypeConverter.printHexBinary(digest).toUpperCase();
System.out.println(digestInHex);
where DatatypeConverter is javax.xml.bind.DatatypeConverter
I recently had to do this for just a dynamic string; MessageDigest can represent the hash in numerous ways. To get the signature like you would get with the md5sum command, I had to do something like this:
try {
String s = "TEST STRING";
MessageDigest md5 = MessageDigest.getInstance("MD5");
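byte[] bytes = s.getBytes(java.nio.charset.StandardCharsets.UTF_8);
md5.update(bytes, 0, bytes.length); // use the byte count, not s.length(), which counts chars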
String signature = new BigInteger(1,md5.digest()).toString(16);
System.out.println("Signature: "+signature);
} catch (final NoSuchAlgorithmException e) {
e.printStackTrace();
}
This obviously doesn't answer your question about how to do it specifically for a file; the above answer deals with that quite nicely. I just spent a lot of time getting the sum to look the way most applications display it, and thought you might run into the same trouble.
public static void main(String[] args) throws Exception {
MessageDigest md = MessageDigest.getInstance("MD5");
FileInputStream fis = new FileInputStream("c:\\apache\\cxf.jar");
byte[] dataBytes = new byte[1024];
int nread = 0;
while ((nread = fis.read(dataBytes)) != -1) {
md.update(dataBytes, 0, nread);
}
fis.close();
byte[] mdbytes = md.digest();
StringBuffer sb = new StringBuffer();
for (int i = 0; i < mdbytes.length; i++) {
sb.append(Integer.toString((mdbytes[i] & 0xff) + 0x100, 16).substring(1));
}
System.out.println("Digest(in hex format):: " + sb.toString());
}
Or you may get more info
http://www.asjava.com/core-java/java-md5-example/
We were using code that resembles the code above in a previous post using
...
String signature = new BigInteger(1,md5.digest()).toString(16);
...
However, watch out for using BigInteger.toString() here, as it will truncate leading zeros...
(for an example, try s = "27", checksum should be "02e74f10e0327ad868d138f2b4fdd6f0")
I second the suggestion to use Apache Commons Codec, I replaced our own code with that.
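The truncation is easy to reproduce without computing any hash at all: BigInteger treats the digest as a number, and numbers have no leading zeros. The sketch below contrasts the lossy conversion with a padded one; the class and method names are illustrative, and the two-byte array stands in for a digest whose first byte is below 0x10.

```java
import java.math.BigInteger;

class LeadingZeros {
    // Lossy: leading zero digits disappear because the value is printed as a number.
    static String lossy(byte[] digest) {
        return new BigInteger(1, digest).toString(16);
    }

    // Padded: the width is derived from the digest length (two hex digits per byte).
    static String padded(byte[] digest) {
        return String.format("%0" + (digest.length * 2) + "x", new BigInteger(1, digest));
    }

    public static void main(String[] args) {
        byte[] digest = {0x02, (byte) 0xe7};
        System.out.println(lossy(digest));  // prints 2e7
        System.out.println(padded(digest)); // prints 02e7
    }
}
```

For MD5 the padded width works out to 32 characters, for SHA-1 to 40, so the same method covers any digest algorithm.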
public static String MD5Hash(String toHash) {
try {
return String.format("%032x", // produces lower case 32 char wide hexa left-padded with 0
new BigInteger(1, // handles large POSITIVE numbers
MessageDigest.getInstance("MD5").digest(toHash.getBytes(StandardCharsets.UTF_8))));
}
catch (NoSuchAlgorithmException e) {
// should not happen for MD5; rethrow so the method always returns or throws
throw new RuntimeException(e);
}
}
Very fast & clean Java-method that doesn't rely on external libraries:
(Simply replace MD5 with SHA-1, SHA-256, SHA-384 or SHA-512 if you want those)
public String calcMD5() throws Exception{
byte[] buffer = new byte[8192];
MessageDigest md = MessageDigest.getInstance("MD5");
DigestInputStream dis = new DigestInputStream(new FileInputStream(new File("Path to file")), md);
try {
while (dis.read(buffer) != -1);
}finally{
dis.close();
}
byte[] bytes = md.digest();
// bytesToHex-method
final char[] hexArray = "0123456789abcdef".toCharArray(); // lookup table for hex digits
char[] hexChars = new char[bytes.length * 2];
for ( int j = 0; j < bytes.length; j++ ) {
int v = bytes[j] & 0xFF;
hexChars[j * 2] = hexArray[v >>> 4];
hexChars[j * 2 + 1] = hexArray[v & 0x0F];
}
return new String(hexChars);
}
Here is a handy variation that makes use of InputStream.transferTo() from Java 9, and OutputStream.nullOutputStream() from Java 11. It requires no external libraries and does not need to load the entire file into memory.
public static String hashFile(String algorithm, File f) throws IOException, NoSuchAlgorithmException {
MessageDigest md = MessageDigest.getInstance(algorithm);
try(BufferedInputStream in = new BufferedInputStream((new FileInputStream(f)));
DigestOutputStream out = new DigestOutputStream(OutputStream.nullOutputStream(), md)) {
in.transferTo(out);
}
String fx = "%0" + (md.getDigestLength()*2) + "x";
return String.format(fx, new BigInteger(1, md.digest()));
}
and
hashFile("SHA-512", Path.of("src", "test", "resources", "some.txt").toFile());
returns
"e30fa2784ba15be37833d569280e2163c6f106506dfb9b07dde67a24bfb90da65c661110cf2c5c6f71185754ee5ae3fd83a5465c92f72abd888b03187229da29"
String checksum = DigestUtils.md5Hex(new FileInputStream(filePath));
Another implementation: Fast MD5 Implementation in Java
String hash = MD5.asHex(MD5.getHash(new File(filename)));
Standard Java Runtime Environment way:
public String checksum(File file) {
try {
InputStream fin = new FileInputStream(file);
java.security.MessageDigest md5er =
MessageDigest.getInstance("MD5");
byte[] buffer = new byte[1024];
int read;
do {
read = fin.read(buffer);
if (read > 0)
md5er.update(buffer, 0, read);
} while (read != -1);
fin.close();
byte[] digest = md5er.digest();
if (digest == null)
return null;
String strDigest = "0x";
for (int i = 0; i < digest.length; i++) {
strDigest += Integer.toString((digest[i] & 0xff)
+ 0x100, 16).substring(1).toUpperCase();
}
return strDigest;
} catch (Exception e) {
return null;
}
}
The hex digits match the output of the Linux md5sum utility, apart from the 0x prefix and the upper case.
Here is a simple function that wraps around Sunil's code so that it takes a File as a parameter. The function does not need any external libraries, but it does require Java 7.
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import javax.xml.bind.DatatypeConverter;
public class Checksum {
/**
* Generates an MD5 checksum as a String.
* @param file The file that is being checksummed.
* @return Hex string of the checksum value.
* @throws NoSuchAlgorithmException
* @throws IOException
*/
public static String generate(File file) throws NoSuchAlgorithmException,IOException {
MessageDigest messageDigest = MessageDigest.getInstance("MD5");
messageDigest.update(Files.readAllBytes(file.toPath()));
byte[] hash = messageDigest.digest();
return DatatypeConverter.printHexBinary(hash).toUpperCase();
}
public static void main(String argv[]) throws NoSuchAlgorithmException, IOException {
File file = new File("/Users/foo.bar/Documents/file.jar");
String hex = Checksum.generate(file);
System.out.printf("hex=%s\n", hex);
}
}
Example output:
hex=B117DD0C3CBBD009AC4EF65B6D75C97B
If you're using ANT to build, this is dead-simple. Add the following to your build.xml:
<checksum file="${jarFile}" todir="${toDir}"/>
Where jarFile is the JAR you want to generate the MD5 against, and toDir is the directory you want to place the MD5 file.
More info here.
Google guava provides a new API. Find the one below :
public static HashCode hash(File file,
HashFunction hashFunction)
throws IOException
Computes the hash code of the file using hashFunction.
Parameters:
file - the file to read
hashFunction - the hash function to use to hash the data
Returns:
the HashCode of all of the bytes in the file
Throws:
IOException - if an I/O error occurs
Since:
12.0
public static String getMd5OfFile(String filePath)
{
String returnVal = "";
try
{
InputStream input = new FileInputStream(filePath);
byte[] buffer = new byte[1024];
MessageDigest md5Hash = MessageDigest.getInstance("MD5");
int numRead = 0;
while (numRead != -1)
{
numRead = input.read(buffer);
if (numRead > 0)
{
md5Hash.update(buffer, 0, numRead);
}
}
input.close();
byte [] md5Bytes = md5Hash.digest();
for (int i=0; i < md5Bytes.length; i++)
{
returnVal += Integer.toString( ( md5Bytes[i] & 0xff ) + 0x100, 16).substring( 1 );
}
}
catch(Throwable t) {t.printStackTrace();}
return returnVal.toUpperCase();
}
Pulling together ideas from other answers, here's simple code with no third-party dependencies (and no DatatypeConverter, which is no longer in the latest JDKs) that generates a hex string compatible with the output of the md5sum tool:
import java.io.IOException;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
...
static String calculateMD5(String path) throws IOException
{
try {
MessageDigest md = MessageDigest.getInstance("MD5");
md.update(Files.readAllBytes(Paths.get(path)));
return String.format("%032x", new BigInteger(1, md.digest())); // hex, padded to 32 chars
} catch (NoSuchAlgorithmException ex)
{
throw new RuntimeException(ex); // MD5 is always available so this should be impossible
}
}
