Junk Characters Coming while Decoding Data in Base 64 - java

I am receiving PDF content that is Base64 encoded. I tried to decode it in NiFi with the Base64EncodeContent processor, and I send the decoded file by email. Below is a small sample of the output that arrives in the email.
"No data should be available in . ¹ Check if sent . . All documents are sent as pdf to* 9 : ’ ³: > < âA m¬‘²#%é‚ÇŽÇ¢|ÀÈ™$Éز§Uû÷LÒTB¨ l,îåù˜$â´º?6N¬JC¤ŒÃ°‰_Ïg -æ¿;ž‰ìÛÖYl`õ?èÓÌ[ ÿÿ PK"
How can I extract the PDF data as sent by the third party?
I have also tried to decode it with Java code, and that fails too: the PDF cannot be opened and shows the same junk characters.
ConvertedJPGPDF.pdf file used below contains Base64 encoded String.
String filePath = "C:\\Users\\xyz\\Desktop\\";
String originalFileName = "ConvertedJPGPDF.pdf";
String newFileName = "test.pdf";
byte[] input_file =
Files.readAllBytes(Paths.get(filePath+originalFileName));
// byte[] decodedBytes = Base64.getDecoder().decode(input_file);
byte[] decodedBytes1 = Base64.getMimeDecoder().decode(input_file);
FileOutputStream fos = new FileOutputStream(filePath+newFileName);
fos.write(decodedBytes1);
fos.flush();
fos.close();

You mentioned that the file contains base64 encoded string already.
ConvertedJPGPDF.pdf file used below contains Base64 encoded String.
So, you don't need to run this line:
byte[] encodedBytes = Base64.getEncoder().encode(input_file);
By doing so, you are trying to encode those bytes again.
Directly decode the input_file array and then save the obtained byte array into a .pdf file.
Update:
The ConvertedJPGPDF.pdf doesn't really have to be named .pdf. It's really a plain text file considering that it is base 64 encoded.
Anyway, the following piece of code is working for me:
String filePath = "C:\\Users\\xyz\\Desktop\\";
String originalFileName = "ConvertedJPGPDF.pdf";
String newFileName = "test.pdf";
byte[] input_file = Files.readAllBytes(Paths.get(filePath+originalFileName));
byte[] decodedBytes1 = Base64.getMimeDecoder().decode(input_file);
Files.write(Paths.get(filePath+newFileName), decodedBytes1);
Hope this helps!
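A quick way to confirm that decoding worked is to check whether the decoded bytes start with the PDF signature %PDF-. The following is a minimal sketch and not part of the original answer: the class name PdfBase64Check and its helper methods are made up for illustration, and it uses an in-memory sample instead of the files above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
class PdfBase64Check {

    // Decodes Base64 text; the MIME decoder tolerates line breaks in the input.
    static byte[] decode(byte[] base64Text) {
        return Base64.getMimeDecoder().decode(base64Text);
    }

    // A PDF file begins with the ASCII bytes "%PDF-".
    static boolean looksLikePdf(byte[] data) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for the content of the Base64 text file.
        byte[] sample = "%PDF-1.4\nsample body".getBytes(StandardCharsets.US_ASCII);
        byte[] encoded = Base64.getMimeEncoder().encode(sample);

        byte[] decoded = decode(encoded);
        System.out.println("Looks like a PDF: " + looksLikePdf(decoded));
    }
}
```

If the check fails on real data, the payload is probably not a plain Base64-encoded PDF (it may be wrapped, double-encoded, or a different file type), and writing it to a .pdf file will not produce something a reader can open.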

Related

How to 'decode' a UTF-8 String which is built upon gzipped byte array

I have some legacy text data that was produced by decoding a gzipped byte array as UTF-8.
I'm wondering whether I can get the raw data back,
something like:
String text = "Hello World!";
byte[] binData = text.getBytes("UTF-8");
byte[] compressData = gzip(binData);//via GZIPOutputStream
//this is what I have
String encodedString = new String(compressData, "UTF-8");
assertEquals(text, smartDecode(encodedString));
Is it possible to provide a function like smartDecode to help me retrieve the original text 'Hello World!' back?

Cannot properly decode Base64 MIME image to byte array (Java)

I'm trying to write some selenium/java test that checks 2FA configuration process. Thus I have to scan some QR code from a page in order to process it with zxing. The image format is Base64 and I'm struggling with decoding it to the byte array. The following code should convert base64 string to byte array, and then write it to the file.
Here is the code I wrote:
String base64Source = LocalDriverManager.get().findElement(By.xpath("//img[@class='qr-code']")).getAttribute("src");
String base64Image = base64Source.split(",")[1];
byte[] decoded = Base64.getMimeDecoder().decode(base64Image);
try (OutputStream stream = new FileOutputStream("QR_CODE.png")){
stream.write(decoded);
}
This code compiles with no errors, but when I try to open generated png file I get only "Fatal error reading PNG image file: Decompression error in IDAT".
I know that base64 string is valid as I was able to convert it to the image using some online converter. Also, I checked the string with online validator and it said that this is a valid base64 MIME string.
Example of the base64 code below:
iVBORw0KGgoAAAANSUhEUgAAAeoAAAHqAQAAAADjFjCXAAAET0lEQVR4nO2dXYrrOgyAP50E5jGB%0AWUCX4uxgljScJd0dxEvpAgacx4KDzoPsxJ3hcqHppadUegiZxB9uQEjWjz2iHJD46wgNjjvuuOOO%0AO+64447fF5ciPbCUi8i4CnEEYBWZAJmWOnS63+yOvygeVFU1AfF0MV3TmU5lWnobob+lB+hUVVWv%0A8YOzO/6i+FLMl3yee2DI9kznzcx9pjK02MR7zu74a+H99wdx7LJAnyUoENJ7FpZ3lXi6yL1nd/w1%0A8R9aB6DxI6EsIxo/LqKQ/5/ZHX9NvGrdoMACwCp11dZlCQmVMK890Cks0OaVn/rbHX8wHkVEZARC%0A6lQ+zz0yATWkfVOCPVsthL3r7I6/GG62rjFfcezQeMooZLuD4SIwZPTa0j36xzv+pDiWBwkJ2NIi%0AJvOgqpo61XnI9e31uPmpv93xR+FUDcuoasZ0babk6wiaG8Xctc4yfK51jt8ku23D4oXdpA3F4Jlz%0AnSlWz+7c1jl+EJdpEYHhIjoDMgGq5x5gFQskPtMqsLypTEO2IX/Hj3f86fDqPrX4VXOpxa6pElJX%0AXli9YvO/buscv132qmrxmqlWWkMqvraEFECJOgb1dZ3jR6TRsNmebIu2Tbk21Su6NmS3dY4fkK0i%0AtgpB154wr6Iso9k1jeNXr1a5GPayWOc9J44fkbZvqfmzU8uc2LN0nUhJnjlx/JDUaGJP1e2BBJbD%0AI9SL5+scvwdeK2LLiIRznyWkUQhp7ZWly7C8F4MXpWtt4l1md/w18epShx91sN3+WeQaUmdRh8ew%0Ajt+pIgatXpWkSb3MUPWvZvNc6xy/Xbb+uq+eeLqIWgfdAqXTBASGhICAeWJdvdPJ8SPS5Ib3Smux%0AfzWugGrr2LtP3NY5frPsHtYammzlBm1tTBM0RbNmEfjU3+74o/A2XxdSUb1duZoti00M4bbO8WOy%0Ax7BNa1MJKaoSliE/1dG1zvGbpGSJm/r+FrTWAkXTbrf7Wrd1jt8uu4e9ustX6zrdfK3WdifXOsdv%0AlxrD0uThml46bdLCuyaCe1jHbxfL10mYFdvszzLaG2XpswCi8aOWwOKYtD0B4Km/3fFH4TVLvIxo%0AFNBy4gRQtv0DLAIMqca6/7zXpqen/nbHH4VfRRNbzcvWdQnM11odVrXuUfQY1vFDslXELqIsfdY4%0AfiEAGk+51zi9bS517Quw9G7rHD8iei0WNGyFryZeLXHtHle4rXP8VmlqE6W+muv2nGHbI7u72bRt%0AmXWtc/wgHmr6RCY6FTmVUxOt8SSc37YlXbnIdMfZHX8xvD3nZC/8A9TMnTabx8qL5PthHb8rvkez%0AZutYRSarza5iuvb75LbO8SPy84TYU0bCWcqpYXF6Uyx8HXJPnAAG35no+BH5flZnTZ+AMHz126md%0AaocTBwWBtb546m93/FH49xi2Pts26qheNRnvuyp8Xef4rSL632P+Xfy/1znuuOOOO+64447/Lfgf%0AFuoX02DU2vMAAAAASUVORK5CYII=
try this
String base64Source = LocalDriverManager.get().findElement(By.xpath("//img[@class='qr-code']")).getAttribute("src");
String base64Image = base64Source.split(";")[1].split(",")[1]; //Try this
byte[] decoded = Base64.getMimeDecoder().decode(base64Image);
try (OutputStream stream = new FileOutputStream("QR_CODE.png")){
stream.write(decoded);
}
Okay, so I figured it out, and now it works.
The thing is, when I run:
String base64Source = LocalDriverManager.get().findElement(By.xpath("//img[@class='qr-code']")).getAttribute("src");
the returned string contains URL-encoded newline (%0A) sequences, so before decoding it to a byte array I need to run qrCodeImage = qrCodeImage.replaceAll("%0A", ""); in order to remove them.
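The reason the leftover %0A corrupts the image rather than raising an error is that Base64.getMimeDecoder() silently skips characters outside the Base64 alphabet: the % is dropped, but 0 and A are valid Base64 characters and get decoded as data. A minimal sketch of the cleanup, assuming the src attribute is a data URI of the form data:image/png;base64,... (the class name DataUriDecoder is made up for illustration):

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
class DataUriDecoder {

    // Extracts the Base64 payload after the first comma, removes the
    // URL-encoded newlines, and decodes the rest.
    static byte[] decode(String dataUri) {
        // If there is no comma, indexOf returns -1 and the whole string is used.
        String payload = dataUri.substring(dataUri.indexOf(',') + 1);
        payload = payload.replace("%0A", "");
        return Base64.getMimeDecoder().decode(payload);
    }

    public static void main(String[] args) {
        // Stand-in bytes: the 8-byte PNG signature plus a few arbitrary bytes.
        byte[] original = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A, 1, 2, 3, 4};
        String b64 = Base64.getEncoder().encodeToString(original);

        // Simulate what the browser returned: a data URI with %0A in the payload.
        String src = "data:image/png;base64," + b64.substring(0, 8) + "%0A" + b64.substring(8);

        byte[] decoded = decode(src);
        System.out.println("Round trip intact: " + Arrays.equals(original, decoded));
    }
}
```

This only handles the uppercase %0A seen in the example string above; if other percent-encoded sequences can appear, running the payload through a proper URL decoder first may be the safer choice.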

PDF file content to Base 64 and vice versa in Java

I need to convert PDF content to Base64 and use that as a String.
When I test with the program below, out.pdf comes out blank.
byte[] pdfRawData = FileUtils.readFileToByteArray(new File("C:\\in.pdf")) ;
String pdfStr = new String(pdfRawData);
//My data is available in the form of String
BASE64Encoder encoder = new BASE64Encoder();
String encodedPdf = encoder.encode(pdfStr.getBytes());
System.out.println(encodedPdf);
// Decode the encoded content to test
BASE64Decoder decoder = new BASE64Decoder();
FileUtils.writeByteArrayToFile(new File("C:\\out.pdf") , decoder.decodeBuffer(encodedPdf));
Can anyone please help me?
Why are you doing:
String pdfStr = new String(pdfRawData);
instead of passing pdfRawData to the encoder?
Doing so leads to lots of encoding issues, as you don't specify the encoding used to build the string from the byte array (it will use the platform default). It is also clearly redundant (byte array -> string -> byte array).
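To make the problem concrete, here is a small sketch that pushes a few non-text bytes through a String and back, and compares that with encoding the raw bytes directly. It uses java.util.Base64 rather than the sun.misc classes from the question, and fixes the charset to UTF-8 so the result does not depend on the platform default; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class, for illustration only.
class BinaryRoundTrip {

    // Lossy path: byte sequences that are not valid UTF-8 are replaced
    // by U+FFFD when the String is built, so the original bytes are lost.
    static byte[] viaString(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Lossless path: encode the raw bytes directly; the Base64 String is
    // plain ASCII and can be stored or transmitted safely.
    static byte[] viaBase64(byte[] raw) {
        String encoded = Base64.getEncoder().encodeToString(raw);
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        // "%PDF" followed by four bytes that do not form valid UTF-8.
        byte[] raw = {0x25, 0x50, 0x44, 0x46, (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};

        System.out.println("via String intact: " + Arrays.equals(raw, viaString(raw)));
        System.out.println("via Base64 intact: " + Arrays.equals(raw, viaBase64(raw)));
    }
}
```

The String returned by encodeToString is the value to keep around "in the form of String"; the raw PDF bytes themselves should never be turned into a String.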

Issues in converting base64 decoded byte array to String in java

public static String decode(String strcontent) throws Exception
{
BASE64Decoder decoder = new BASE64Decoder();
byte[] imgBytes = decoder.decodeBuffer(strcontent);
return new String(imgBytes);
}
With the above code I am trying to create a String from the Base64-decoded byte array (imgBytes); the input strcontent is a Base64-encoded string. For text files it works fine, but for PDF and image files the string conversion has issues. I have tried different encodings such as UTF-8 and UTF-16, to no avail. The returned string differs from the original one.
When I write the byte array to a file like this:
OutputStream out = new FileOutputStream(##path);
out.write(imgBytes);
out.close();
File is getting created properly without any issues.
I tried the below code:
byte[] imgBytes= ( new String(imgBytes1)).getBytes(); //Converting to String and back to bytes
OutputStream out = new FileOutputStream(##Filename);
out.write(imgBytes); out.close();
This time the image file is corrupted.
Please suggest.

java apache IOUtils breaks file content

I need to encode/decode pdf file into Base64 format.
So I read the file from disk into a String (because in the future I will receive the file as a Base64 String):
String pdfString = IOUtils.toString(new FileInputStream(new
File("D:\\vrpStamped.pdf")));
byte[] encoded = Base64.encodeBase64(pdfString.getBytes());
byte[] newPdfArray = Base64.decodeBase64(encoded);
FileOutputStream imageOutFile = new FileOutputStream(
"D:\\1.pdf");
imageOutFile.write(newPdfArray);
imageOutFile.close();
imageOutFile.flush();
With this code D:\\1.pdf does not open in Adobe Reader. But if I read the file straight into a byte array, using IOUtils.toByteArray(..) instead, everything works and D:\\1.pdf opens successfully in Adobe Reader:
byte[] encoded = Base64.encodeBase64(IOUtils.toByteArray(new FileInputStream(new File("D:\\vrpStamped.pdf"))));
It seems to me that IOUtils.toString(..) changes something inside the file content. So how can I convert the file to a String without breaking its content?
How to encode a pdf...
byte[] bytes = IOUtils.toByteArray(new FileInputStream(new File("/home/fschaetz/test.pdf")));
byte[] encoded = Base64.encode(bytes);
String str = new String(encoded);
...now do something with this encoded String, for example, send it via a Rest service.
And now, if you receive an encoded String, you can decode and save it like this...
byte[] decoded = Base64.decode(str.getBytes());
FileOutputStream output = new FileOutputStream(new File("/home/fschaetz/result.pdf"));
output.write(decoded);
output.close();
Works perfectly fine with all files, not limited to images or pdfs.
What your example is doing is...
Read the pdf into a String (which pretty much destroys the data, since you are reading binary data into a String)
Encode this String (which is in all likelihood not a valid representation of the original pdf anymore)
Decode it and save it to disk
