private static String encodeFileToBase64Binary(String fileName) throws IOException {
    File file = new File(fileName);
    byte[] bytes = loadFile(file);
    byte[] encoded = Base64.encodeBase64(bytes);
    String encodedString = new String(encoded, StandardCharsets.US_ASCII);
    return encodedString;
}

private static byte[] loadFile(File file) throws IOException {
    InputStream is = new FileInputStream(file);
    long length = file.length();
    if (length > Integer.MAX_VALUE) {
        // File is too large
    }
    byte[] bytes = new byte[(int) length];
    int offset = 0;
    int numRead = 0;
    while (offset < bytes.length
            && (numRead = is.read(bytes, offset, bytes.length - offset)) >= 0) {
        offset += numRead;
    }
    if (offset < bytes.length) {
        throw new IOException("Could not completely read file " + file.getName());
    }
    is.close();
    return bytes;
}
// to get the encoded string
String encoded=encodeFileToBase64Binary("file.fmr");
// the encoded string is:
Rk1SACAyMAAAAAFiAAABQAHgAMUAxQEAAABGNkDZADP/SEC8AD6CSECqAEcGSED+AFJtO0CgAGCKZEC6AGuFZEDgAHz1ZECzAI6HZEENAJluNEBWAJ4ZZEB1AKkTZEECALbuZEA/ALqfSECCALySSECxAMP/ZECIAMURVUAXAN2jGkCnAOD8ZEAoAOWlZEBnAOyhLkCyAP/tZECHAQMSGkD8AQTdZECfASKFGkCHASUaGkA1ASy6ZEDAAS3JZEDPAS7NZEAnATG4ZEDxATzOZEBOAUPLZEBzAVbuGkCAAWF8NEDTAWsxLkDnAXa0LkC/AX2nLkC0AYojIEBMAYvkSEDJAa0fT0CsAbwVIIDqANTsZIDIAPfnZICbAQKHO4D5AR/XZIBlASS7IIEoASbYO4CsAUetLoDvAVXSZIDaAVvDO4EHAWrLZICsAX2fNIDnAYEwNIDQAZKnT4BfAZxtZAAA
// encoded string for the same file, obtained from another source:
Rk1SACAyMAAAAAFiAAABQAHgAMUAxQEAAABGNkCLACELSEDAADYDZEEYAGFxO0DGAGJ9SEC1AGkCSEA6AHWYVUDJAHp5ZEBEAHwVZECVAJgIZEEaALHrZEB4ALuOZEELAMFqZEEzAM/sNEDRANvwZEBkAN0VZECcAOIAZEEwAOjnLkEvAPXlO0CnAP71ZEB7AQYRNEBdAQ0eZED8ARDhZEDXASXcZECZAS3uGkBoAT4eO0AUAUMxSEA7AUYqZEDxAUnSZECmAVNDO0EIAXDHSEDYAXW7ZEEUAXXKSEEGAYY8IEEhAYrDNEDfAZ81ZEDQAcGqLoEBAC/7O4EGAE7zVYB+AP2QSICEARuLZIBnATUfO4D/ATXaZIDEATjSZIDRATrVZICnATvSNIBTATwnZIARAV1LGoB1AV2oO4CrAV68SIDnAWHGZIB+AWauNICVAX0ySICNAYytO4CJAZorSAAA
When I try to match the two encoded strings, I get a mismatch.
Please suggest a method to encode the file to Base64 so that it matches the encoded string obtained from the other source.
I have tried both StandardCharsets.UTF_8 and StandardCharsets.US_ASCII.
You're already using Apache commons-codec, so I recommend adding commons-io for reading the file. That way you can remove your loadFile() method and just have:
private static String encodeFileToBase64Binary(String fileName) throws IOException {
    File file = new File(fileName);
    byte[] encoded = Base64.encodeBase64(FileUtils.readFileToByteArray(file));
    return new String(encoded, StandardCharsets.US_ASCII);
}
Here is a solution without any external dependencies (Apache et al.), requiring only JDK 8+:
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Base64;

private static String encodeFileToBase64(File file) {
    try {
        byte[] fileContent = Files.readAllBytes(file.toPath());
        return Base64.getEncoder().encodeToString(fileContent);
    } catch (IOException e) {
        throw new IllegalStateException("could not read file " + file, e);
    }
}
Since Java 8 you can use the class java.util.Base64 and the corresponding inner classes:
java.util.Base64.Encoder
java.util.Base64.Decoder
See JavaDoc: Base64-Doc
And a sample of its use: Example from Oracle
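For reference, a minimal sketch of that API on a Java 8 runtime (the class name is arbitrary and the file name is just the placeholder from the question above):

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

public class Base64RoundTrip {
    public static void main(String[] args) throws Exception {
        // encode the raw file bytes to a Base64 string
        byte[] raw = Files.readAllBytes(Paths.get("file.fmr"));
        String encoded = Base64.getEncoder().encodeToString(raw);

        // decode the string back to the original bytes
        byte[] decoded = Base64.getDecoder().decode(encoded);
        System.out.println(decoded.length == raw.length); // prints true
    }
}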
This example worked great for me: https://grokonez.com/java/java-advanced/java-8-encode-decode-an-image-base64
public static String encoder(String filePath) {
    String base64File = "";
    File file = new File(filePath);
    try (FileInputStream imageInFile = new FileInputStream(file)) {
        // Reading a file from the file system; loop because a single read()
        // call may return fewer bytes than requested
        byte[] fileData = new byte[(int) file.length()];
        int offset = 0;
        while (offset < fileData.length) {
            int read = imageInFile.read(fileData, offset, fileData.length - offset);
            if (read < 0) {
                break;
            }
            offset += read;
        }
        base64File = Base64.getEncoder().encodeToString(fileData);
    } catch (FileNotFoundException e) {
        System.out.println("File not found: " + e);
    } catch (IOException ioe) {
        System.out.println("Exception while reading the file: " + ioe);
    }
    return base64File;
}
Question at the bottom
I'm using netty to transfer a file to another server.
I limit my file chunks to 1024*64 bytes (64 KB) because of the WebSocket protocol. The following method is a local example of what happens to the file:
public static void rechunck(File file1, File file2) {
    FileInputStream is = null;
    FileOutputStream os = null;
    try {
        byte[] buf = new byte[1024*64];
        is = new FileInputStream(file1);
        os = new FileOutputStream(file2);
        while(is.read(buf) > 0) {
            os.write(buf);
        }
    } catch (IOException e) {
        Controller.handleException(Thread.currentThread(), e);
    } finally {
        try {
            if(is != null && os != null) {
                is.close();
                os.close();
            }
        } catch (IOException e) {
            Controller.handleException(Thread.currentThread(), e);
        }
    }
}
The file is loaded by the InputStream into a byte buffer and written directly to the OutputStream.
The content of the file cannot change during this process.
To get the MD5 hashes of the files I wrote the following method:
public static String checksum(File file) {
    InputStream is = null;
    try {
        is = new FileInputStream(file);
        MessageDigest digest = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[8192];
        int read = 0;
        while((read = is.read(buffer)) > 0) {
            digest.update(buffer, 0, read);
        }
        return new BigInteger(1, digest.digest()).toString(16);
    } catch(IOException | NoSuchAlgorithmException e) {
        Controller.handleException(Thread.currentThread(), e);
    } finally {
        try {
            is.close();
        } catch(IOException e) {
            Controller.handleException(Thread.currentThread(), e);
        }
    }
    return null;
}
So, in theory it should return the same hash, shouldn't it? The problem is that it returns two different hashes, and they do not vary between runs. The file size stays the same, and so does the content.
When I run the method once with in: file-1, out: file-2 and again with in: file-2, out: file-3, the hashes of file-2 and file-3 are the same! This means the method changes the file the same way every time.
1. 58a4a9fbe349a9e0af172f9cf3e6050a
2. 7b3f343fa1b8c4e1160add4c48322373
3. 7b3f343fa1b8c4e1160add4c48322373
Here is a little test that compares all the buffers for equality. The test is positive, so there aren't any differences.
File file1 = new File("controller/templates/Example.zip");
File file2 = new File("controller/templates2/Example.zip");
try {
    byte[] buf1 = new byte[1024*64];
    byte[] buf2 = new byte[1024*64];
    FileInputStream is1 = new FileInputStream(file1);
    FileInputStream is2 = new FileInputStream(file2);
    boolean run = true;
    while(run) {
        int read1 = is1.read(buf1), read2 = is2.read(buf2);
        String result1 = Arrays.toString(buf1), result2 = Arrays.toString(buf2);
        boolean test = result1.equals(result2);
        System.out.println("1: " + result1);
        System.out.println("2: " + result2);
        System.out.println("--- TEST RESULT: " + test + " ----------------------------------------------------");
        if(!(read1 > 0 && read2 > 0) || !test) run = false;
    }
} catch (IOException e) {
    e.printStackTrace();
}
Question: Can you help me chunk the file without changing the hash?
while(is.read(buf) > 0) {
    os.write(buf);
}
The read() method with the array argument returns the number of bytes read from the stream. When the file size isn't an exact multiple of the byte array length, the last call returns fewer bytes than the array length because you have reached the end of the file.
However, your os.write(buf); call writes the whole byte array to the stream, including the leftover bytes beyond the end of the file's data. This means the written file ends up bigger, and therefore the hash changes.
Interestingly, you didn't make this mistake when updating the message digest:
while((read = is.read(buffer)) > 0) {
    digest.update(buffer, 0, read);
}
You just have to do the same when you "rechunk" your files.
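For illustration, a minimal sketch of the corrected copy loop, reusing the variable names from your rechunck method:

int read;
while ((read = is.read(buf)) > 0) {
    // write only the bytes that were actually read in this iteration
    os.write(buf, 0, read);
}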
Your rechunck method has a bug in it. Since you use a fixed buffer, your file is split into byte-array parts, but the last part of the file can be smaller than the buffer, which is why you write too many bytes to the new file. That's why you no longer get the same checksum. The error can be fixed like this:
public static void rechunck(File file1, File file2) {
    FileInputStream is = null;
    FileOutputStream os = null;
    try {
        byte[] buf = new byte[1024*64];
        is = new FileInputStream(file1);
        os = new FileOutputStream(file2);
        int length;
        while((length = is.read(buf)) > 0) {
            os.write(buf, 0, length);
        }
    } catch (IOException e) {
        Controller.handleException(Thread.currentThread(), e);
    } finally {
        try {
            if(is != null)
                is.close();
            if(os != null)
                os.close();
        } catch (IOException e) {
            Controller.handleException(Thread.currentThread(), e);
        }
    }
}
Thanks to the length variable, the write method knows that only the bytes up to index length - 1 of the byte array belong to the file; beyond that, the array still contains old bytes from an earlier read that no longer belong to the file.
I'm trying to compress a string. I'm using Base64 encoding and decoding for the conversion of String to bytes and vice versa.
import org.apache.axis.encoding.Base64;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class UtilTesting {

    public static void main(String[] args) {
        try {
            String original = "I am the god";
            System.out.println("Starting Zlib");
            System.out.println("==================================");
            String zcompressed = compressString(original);
            String zdecompressed = decompressString(zcompressed);
            System.out.println("Original String: " + original);
            System.out.println("Compressed String: " + zcompressed);
            System.out.println("Decompressed String: " + zdecompressed);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static String compressString(String uncompressedString) {
        String compressedString = null;
        byte[] bytes = Base64.decode(uncompressedString);
        try {
            bytes = compressBytes(bytes);
            compressedString = Base64.encode(bytes);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return compressedString;
    }

    public static String decompressString(String compressedString) {
        String decompressedString = null;
        byte[] bytes = Base64.decode(compressedString);
        try {
            bytes = decompressBytes(bytes);
            decompressedString = Base64.encode(bytes);
        } catch (IOException e) {
            e.printStackTrace();
        } catch (DataFormatException e) {
            e.printStackTrace();
        }
        return decompressedString;
    }

    public static byte[] compressBytes(byte[] data) throws IOException {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length);
        deflater.finish();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int count = deflater.deflate(buffer); // number of compressed bytes written to buffer
            outputStream.write(buffer, 0, count);
        }
        outputStream.close();
        byte[] output = outputStream.toByteArray();
        return output;
    }

    public static byte[] decompressBytes(byte[] data) throws IOException, DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length);
        byte[] buffer = new byte[1024];
        while (!inflater.finished()) {
            int count = inflater.inflate(buffer);
            outputStream.write(buffer, 0, count);
        }
        outputStream.close();
        byte[] output = outputStream.toByteArray();
        return output;
    }
}
This gives the following result:
Starting Zlib
==================================
Original String: I am the god
Compressed String: eJxTXLm29YUGAApUAw0=
Decompressed String: Iamthego
As you can see, it is missing the whitespace and it even lost the final letter of the given String.
Can someone please suggest what is wrong with this code?
I'm following the steps below:
Decode
compress
encode
save
retrieve
decode
decompress
encode.
Please help. Thank you.
In compressString, replace:
Base64.decode(uncompressedString)
with
uncompressedString.getBytes(StandardCharsets.UTF_8)
You're not passing in a base64-encoded string; you simply want the bytes of the input string. Note that spaces never appear in base64 encoding, so they are likely treated as redundant and discarded.
Similarly in decompressString, replace:
Base64.encode(bytes)
with
new String(bytes, StandardCharsets.UTF_8)
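Putting it together, a minimal sketch of the two corrected methods under that change, assuming java.nio.charset.StandardCharsets is imported and the rest of your class (compressBytes, decompressBytes, the axis Base64 import) stays as it is:

public static String compressString(String uncompressedString) {
    try {
        // take the raw UTF-8 bytes of the input; the plain text itself is not Base64
        byte[] bytes = uncompressedString.getBytes(StandardCharsets.UTF_8);
        return Base64.encode(compressBytes(bytes));
    } catch (IOException e) {
        e.printStackTrace();
        return null;
    }
}

public static String decompressString(String compressedString) {
    try {
        // the compressed text really is Base64, so decoding it here is correct
        byte[] bytes = decompressBytes(Base64.decode(compressedString));
        return new String(bytes, StandardCharsets.UTF_8);
    } catch (IOException | DataFormatException e) {
        e.printStackTrace();
        return null;
    }
}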
The functions I'm using to convert the file to a string and then to an MD5 hash are below. I'm outputting the file paths and file names to make sure everything is cool. Is there anything I'm not considering that could change the file's (an mp4 video, actually) fingerprint? I'm checking it against md5sum on Ubuntu.
private static String readFileToString(String filePath) throws java.io.IOException {
    StringBuffer fileData = new StringBuffer(1000);
    BufferedReader reader = new BufferedReader(new FileReader(filePath));
    char[] buf = new char[1024];
    int numRead = 0;
    while((numRead = reader.read(buf)) != -1){
        String readData = String.valueOf(buf, 0, numRead);
        fileData.append(readData);
        buf = new char[1024];
    }
    reader.close();
    System.out.println(fileData.toString());
    return fileData.toString();
}
public static String getMD5EncryptedString(String encTarget){
    MessageDigest mdEnc = null;
    try {
        mdEnc = MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException e) {
        System.out.println("Exception while encrypting to md5");
        e.printStackTrace();
    } // Encryption algorithm
    mdEnc.update(encTarget.getBytes(), 0, encTarget.length());
    String md5 = new BigInteger(1, mdEnc.digest()).toString(16);
    return md5;
}
String isn't a container for binary data. Lose the two conversions between byte array and String. You should be reading the file as bytes and computing the MD5 directly over those bytes. You can do that while you're reading: you don't need to read the entire file first.
And MD5 isn't an encryption: it's a secure hash.
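A minimal sketch of that approach, hashing the bytes as they are read with no String in between (the method name and buffer size are just placeholders):

public static String fileToMd5Hex(String filePath) throws IOException, NoSuchAlgorithmException {
    MessageDigest digest = MessageDigest.getInstance("MD5");
    try (InputStream in = new FileInputStream(filePath)) {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) > 0) {
            // feed only the bytes actually read into the digest
            digest.update(buffer, 0, read);
        }
    }
    // pad to 32 hex characters so leading zero bytes are not dropped
    return String.format("%032x", new BigInteger(1, digest.digest()));
}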
Found this answer here: How to generate an MD5 checksum for a file in Android?
public static String fileToMD5(String filePath) {
    InputStream inputStream = null;
    try {
        inputStream = new FileInputStream(filePath);
        byte[] buffer = new byte[1024];
        MessageDigest digest = MessageDigest.getInstance("MD5");
        int numRead = 0;
        while (numRead != -1) {
            numRead = inputStream.read(buffer);
            if (numRead > 0)
                digest.update(buffer, 0, numRead);
        }
        byte[] md5Bytes = digest.digest();
        return convertHashToString(md5Bytes);
    } catch (Exception e) {
        return null;
    } finally {
        if (inputStream != null) {
            try {
                inputStream.close();
            } catch (Exception e) { }
        }
    }
}

private static String convertHashToString(byte[] md5Bytes) {
    String returnVal = "";
    for (int i = 0; i < md5Bytes.length; i++) {
        returnVal += Integer.toString((md5Bytes[i] & 0xff) + 0x100, 16).substring(1);
    }
    return returnVal;
}
I am using the following code to compress and decompress string data, but the problem I am facing is that the string compresses without error, while the decompress method throws the following error:
Exception in thread "main" java.io.IOException: Not in GZIP format
public static void main(String[] args) throws Exception {
    String string = "I am what I am hhhhhhhhhhhhhhhhhhhhhhhhhhhhh"
            + "bjggujhhhhhhhhh"
            + "rggggggggggggggggggggggggg"
            + "esfffffffffffffffffffffffffffffff"
            + "esffffffffffffffffffffffffffffffff"
            + "esfekfgy enter code here`etd`enter code here wdd"
            + "heljwidgutwdbwdq8d"
            + "skdfgysrdsdnjsvfyekbdsgcu"
            + "jbujsbjvugsduddbdj";
    System.out.println("after compress:");
    String compressed = compress(string);
    System.out.println(compressed);
    System.out.println("after decompress:");
    String decomp = decompress(compressed);
    System.out.println(decomp);
}

public static String compress(String str) throws Exception {
    if (str == null || str.length() == 0) {
        return str;
    }
    System.out.println("String length : " + str.length());
    ByteArrayOutputStream obj = new ByteArrayOutputStream();
    GZIPOutputStream gzip = new GZIPOutputStream(obj);
    gzip.write(str.getBytes("UTF-8"));
    gzip.close();
    String outStr = obj.toString("UTF-8");
    System.out.println("Output String length : " + outStr.length());
    return outStr;
}

public static String decompress(String str) throws Exception {
    if (str == null || str.length() == 0) {
        return str;
    }
    System.out.println("Input String length : " + str.length());
    GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(str.getBytes("UTF-8")));
    BufferedReader bf = new BufferedReader(new InputStreamReader(gis, "UTF-8"));
    String outStr = "";
    String line;
    while ((line = bf.readLine()) != null) {
        outStr += line;
    }
    System.out.println("Output String length : " + outStr.length());
    return outStr;
}
Still couldn't figure out how to fix this issue!
This is because of
String outStr = obj.toString("UTF-8");
Send the byte[] that you can get from your ByteArrayOutputStream and use it as-is in your ByteArrayInputStream to construct your GZIPInputStream. The following are the changes that need to be made in your code:
byte[] compressed = compress(string); // in the main method

public static byte[] compress(String str) throws Exception {
    ...
    ...
    return obj.toByteArray();
}

public static String decompress(byte[] bytes) throws Exception {
    ...
    GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(bytes));
    ...
}
The above answer solves the problem, but in addition to that: if we try to decompress an uncompressed (not in GZIP format) byte[], we will get the "Not in GZIP format" exception message.
To handle that, we can add the following code to our class:
public static boolean isCompressed(final byte[] compressed) {
    return (compressed[0] == (byte) (GZIPInputStream.GZIP_MAGIC)) && (compressed[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8));
}
My complete compression class with compress/decompress would then look like this:
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GZIPCompression {

    public static byte[] compress(final String str) throws IOException {
        if ((str == null) || (str.length() == 0)) {
            return null;
        }
        ByteArrayOutputStream obj = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(obj);
        gzip.write(str.getBytes("UTF-8"));
        gzip.flush();
        gzip.close();
        return obj.toByteArray();
    }

    public static String decompress(final byte[] compressed) throws IOException {
        final StringBuilder outStr = new StringBuilder();
        if ((compressed == null) || (compressed.length == 0)) {
            return "";
        }
        if (isCompressed(compressed)) {
            final GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed));
            final BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(gis, "UTF-8"));
            String line;
            while ((line = bufferedReader.readLine()) != null) {
                outStr.append(line);
            }
        } else {
            // input was not gzipped: interpret the bytes as plain UTF-8 text
            outStr.append(new String(compressed, "UTF-8"));
        }
        return outStr.toString();
    }

    public static boolean isCompressed(final byte[] compressed) {
        return (compressed[0] == (byte) (GZIPInputStream.GZIP_MAGIC)) && (compressed[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8));
    }
}
If you ever need to transfer the zipped content over the network or store it as text, you have to use a Base64 encoder (such as Apache Commons Codec's Base64) to convert the byte array to a Base64 String, and decode the String back to a byte array at the remote client.
Found an example at Use Zip Stream and Base64 Encoder to Compress Large String Data!
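For illustration, a minimal sketch of that round trip using the JDK's java.util.Base64; GZIPCompression here refers to the class above, and the payload string is just an example:

import java.util.Base64;

public class GZIPOverText {
    public static void main(String[] args) throws Exception {
        // compress, then wrap the binary result in Base64 so it can travel as text
        byte[] gzipped = GZIPCompression.compress("some payload");
        String wireText = Base64.getEncoder().encodeToString(gzipped);

        // on the receiving side: unwrap the Base64 text, then decompress
        byte[] received = Base64.getDecoder().decode(wireText);
        System.out.println(GZIPCompression.decompress(received)); // prints "some payload"
    }
}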
Another example of correct compression and decompression:
import static java.nio.charset.StandardCharsets.UTF_8;
import static java.util.Objects.isNull;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

import org.apache.commons.io.IOUtils;

import lombok.extern.slf4j.Slf4j;

@Slf4j
public class GZIPCompression {

    public static byte[] compress(final String stringToCompress) {
        if (isNull(stringToCompress) || stringToCompress.length() == 0) {
            return null;
        }
        try (final ByteArrayOutputStream baos = new ByteArrayOutputStream();
             final GZIPOutputStream gzipOutput = new GZIPOutputStream(baos)) {
            gzipOutput.write(stringToCompress.getBytes(UTF_8));
            gzipOutput.finish();
            return baos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException("Error while compression!", e);
        }
    }

    public static String decompress(final byte[] compressed) {
        if (isNull(compressed) || compressed.length == 0) {
            return null;
        }
        try (final GZIPInputStream gzipInput = new GZIPInputStream(new ByteArrayInputStream(compressed));
             final StringWriter stringWriter = new StringWriter()) {
            IOUtils.copy(gzipInput, stringWriter, UTF_8);
            return stringWriter.toString();
        } catch (IOException e) {
            throw new UncheckedIOException("Error while decompression!", e);
        }
    }
}
The problem is this line:
String outStr = obj.toString("UTF-8");
The byte array inside obj contains arbitrary binary data. You can't "decode" arbitrary binary data as if it were UTF-8. If you try, you will get a String that cannot then be "encoded" back to bytes. Or at least, the bytes you get will be different from what you started with ... to the extent that they are no longer a valid GZIP stream.
The fix is to store or transmit the contents of the byte array as-is. Don't try to convert it into a String. It is binary data, not text.
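A small standalone demo of the point (hypothetical, not part of the original code): round-tripping arbitrary bytes through a UTF-8 String does not preserve them.

import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryThroughStringDemo {
    public static void main(String[] args) {
        // 0x1F 0x8B is the GZIP magic number; 0x8B and 0xFF are not valid UTF-8 on their own
        byte[] original = {0x1F, (byte) 0x8B, 0x08, 0x00, (byte) 0xFF};

        // "decoding" as UTF-8 replaces invalid sequences with U+FFFD
        String asText = new String(original, StandardCharsets.UTF_8);
        byte[] roundTripped = asText.getBytes(StandardCharsets.UTF_8);

        System.out.println(Arrays.equals(original, roundTripped)); // prints false
    }
}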
The client sends some messages that need to be compressed, and the server (Kafka) decompresses the string message.
Below is my sample:
compress:
public static String compress(String str, String inEncoding) {
    if (str == null || str.length() == 0) {
        return str;
    }
    try {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(out);
        gzip.write(str.getBytes(inEncoding));
        gzip.close();
        return URLEncoder.encode(out.toString("ISO-8859-1"), "UTF-8");
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}
decompress:
public static String decompress(String str, String outEncoding) {
    if (str == null || str.length() == 0) {
        return str;
    }
    try {
        String decode = URLDecoder.decode(str, "UTF-8");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ByteArrayInputStream in = new ByteArrayInputStream(decode.getBytes("ISO-8859-1"));
        GZIPInputStream gunzip = new GZIPInputStream(in);
        byte[] buffer = new byte[256];
        int n;
        while ((n = gunzip.read(buffer)) >= 0) {
            out.write(buffer, 0, n);
        }
        return out.toString(outEncoding);
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}
You can't convert binary data to a String directly. As a solution, you can encode the binary data and then convert it to a String. For example, see: How do you convert binary data to Strings and back in Java?
I have a zip file, and in my Java code I want to calculate the MD5 hash of the zip file. Is there any Java library I can use for this purpose? An example would be really appreciated.
Thank you.
I got that working a few weeks ago with this article:
http://www.javalobby.org/java/forums/t84420.html
Just to have it on Stack Overflow:
public static void main(String[] args) throws NoSuchAlgorithmException, FileNotFoundException {
    MessageDigest digest = MessageDigest.getInstance("MD5");
    File f = new File("c:\\myfile.txt");
    InputStream is = new FileInputStream(f);
    byte[] buffer = new byte[8192];
    int read = 0;
    try {
        while( (read = is.read(buffer)) > 0) {
            digest.update(buffer, 0, read);
        }
        byte[] md5sum = digest.digest();
        BigInteger bigInt = new BigInteger(1, md5sum);
        String output = bigInt.toString(16);
        System.out.println("MD5: " + output);
    }
    catch(IOException e) {
        throw new RuntimeException("Unable to process file for MD5", e);
    }
    finally {
        try {
            is.close();
        }
        catch(IOException e) {
            throw new RuntimeException("Unable to close input stream for MD5 calculation", e);
        }
    }
}
There is also an option to do that with Apache Commons Codec, like this (if you specifically need MD5, DigestUtils.md5Hex works the same way):
final String sha256 = DigestUtils.sha256Hex(new FileInputStream(file));
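A slightly safer sketch of the same idea, closing the stream with try-with-resources and using the MD5 variant; the class and method names are placeholders, and file is assumed to point at your zip file:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import org.apache.commons.codec.digest.DigestUtils;

public class Md5OfZip {
    public static String md5Of(File file) throws IOException {
        // try-with-resources closes the stream once the digest has been computed
        try (FileInputStream in = new FileInputStream(file)) {
            return DigestUtils.md5Hex(in);
        }
    }
}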