Decompressing a buffer from a SOAP response - java

This might be a dumb problem but I'm really not able to figure it out.
I'm making a SOAP request in SoapUI that retrieves me a GZIP compressed buffer for a certain file.
My issue is that I'm not able to decompress the buffer I get back (I'm not that experienced with Java). The only results I have obtained so far are a random 10-11 character string ( [B#6d03e736 ) or errors like "Not in GZIP format".
The buffer looks like this: "1f8b0800000000000000a58e4d0ac2400c85f78277e811f2e665329975bbae500f2022dd2978ff95715ae82cdcf9415efec823c6710247582d5965c32c65aab0f5fc0a5204c415855e7c190ef61b34710bcdc7486d2bab8a7a4910d022d5e107d211ed345f2f37a103da2ddb1f619ab8acefe7fdb1beb6394998c7dfbde3dcac3acf3f399f3eeae152012e010000"
I've looked at many similar threads but only found examples where someone compresses a string and then decompresses it again (mostly through GZIPInputStream/GZIPOutputStream).
String stringBuffer = "buffer_from_above";
byte[] buffer = stringBuffer.getBytes(StandardCharsets.ISO_8859_1); // also tried UTF-8
System.out.println(decompress(buffer));
public static String decompress(final byte[] compressed) throws IOException {
    final StringBuilder outStr = new StringBuilder();
    if ((compressed == null) || (compressed.length == 0)) {
        return "";
    }
    if (isCompressed(compressed)) {
        final GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed));
        final BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(gis, "UTF-8"));
        String line;
        while ((line = bufferedReader.readLine()) != null) {
            outStr.append(line);
        }
    } else {
        outStr.append(compressed);
    }
    return outStr.toString();
}
I would highly appreciate if someone is able to give me any tips or any advice for this matter.
Thanks for the time spent and I wish you have a great day!
Decompress function stolen from this thread: compression and decompression of string data in java

If that's really the string, then you need to convert the hexadecimal to binary before feeding it to the gzip decoder. See Convert a string representation of a hex dump to a byte array using Java?
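A minimal sketch of that suggestion, assuming the response really is a hex dump of gzip data and that the decompressed content is UTF-8 text (the question does not say). The class and method names here are made up for illustration, and gzipToHex exists only to produce test input.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the original question or answer.
class HexGzip {
    // Turn "1f8b08..." into the bytes 0x1f, 0x8b, 0x08, ...
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd-length hex string");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit in pair " + i);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Decode the hex dump first, then feed the real bytes to the gzip decoder.
    static String gunzipHex(String hex) {
        try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(hexToBytes(hex)))) {
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gis.read(buf)) != -1) {
                plain.write(buf, 0, n);
            }
            // Assumption: the payload is UTF-8 text.
            return new String(plain.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo-only helper: gzip some text and print it as a hex dump.
    static String gzipToHex(String text) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : bos.toByteArray()) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

Calling HexGzip.gunzipHex(stringBuffer) with the string from the question replaces the getBytes(...) step, which only produced the character codes of the hex digits rather than the compressed bytes.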

Related

GZIPInputStream unable to decode at receiver side (invalid code lengths set)

I'm attempting to encode a String in a client using GZIPOutputStream, then decode the String in a server using GZIPInputStream.
The client's side code (after the initial socket connection establishment) is:
// ... Establishing connection, getting a socket object.
// ... Now proceeding to send data using that socket:
DataOutputStream out = new DataOutputStream(socket.getOutputStream());
String message = "Hello World!";
ByteArrayOutputStream out = new ByteArrayOutputStream();
GZIPOutputStream gzip = new GZIPOutputStream(out);
gzip.write(message);
gzip.close();
String encMessage = out.toString();
out.writeInt(encMessage.getBytes().length);
out.write(encMessage.getBytes());
out.flush();
And the server's side code (again, after establishing a connection):
DataInputStream input = new DataInputStream(socket.getInputStream());
int length = input.readInt();
byte[] buffer = new byte[length];
input.readFully(buffer);
GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(buffer));
BufferedReader r = new BufferedReader(new InputStreamReader(gz));
String s = "";
String line;
while ((line = r.readLine()) != null)
{
    s += line;
}
I checked and the buffer length (i.e., the coded message's size) is passed correctly, so the right number of bytes is transferred.
However, I'm getting this:
java.util.zip.ZipException: invalid code lengths set
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:122)
at parsing.ReceiveResponsesTest$TestReceiver.run(ReceiveResponsesTest.java:147)
at java.lang.Thread.run(Thread.java:745)
Any ideas?
Thanks in advance for any assistance!
You're calling toString() on the ByteArrayOutputStream - that is incorrect, and it opens up all kinds of character encoding problems that are probably biting you here. You need to call toByteArray instead:
byte[] encMessage = out.toByteArray();
out.writeInt(encMessage.length);
out.write(encMessage);
Detail:
If you use toString(), Java will decode your bytes into characters using your platform default character encoding. That could be some Windows codepage, UTF-8, or whatnot.
However, not all byte sequences are valid in that encoding, and the invalid ones will be replaced by a substitute character, a question mark perhaps. Without knowing the details, it's hard to tell.
But in any case, decoding the byte array to a String, and then encoding it to a byte array again when you write it out, is very likely to change the data in the byte array. And there is no need to do it; you can just get the byte array straight away as shown in the code above.
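The effect can be seen in a small sketch. It assumes UTF-8 as the charset so the result is reproducible (the original code used the platform default, which may differ), and the class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not taken from the question.
class StringRoundTrip {
    // Compress text and return the raw gzip bytes via toByteArray().
    static byte[] gzip(String text) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // What toString() followed by getBytes() effectively does to binary data.
    // Assumption: UTF-8 stands in for the platform default charset.
    static byte[] viaString(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = gzip("Hello World!");
        byte[] mangled = viaString(raw);
        System.out.println("raw length:     " + raw.length);
        System.out.println("mangled length: " + mangled.length);
        System.out.println("identical:      " + Arrays.equals(raw, mangled));
    }
}
```

The gzip header's second byte, 0x8b, is not valid UTF-8 on its own, so the round trip through a String already alters the stream at that point.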
Why on earth are you indulging in all this complication? You can reduce it all to this:
GZIPOutputStream gzip = new GZIPOutputStream(socket.getOutputStream());
DataOutputStream out = new DataOutputStream(gzip);
String message = "Hello World!";
out.writeUTF(message);
out.close();
// ...
GZIPInputStream gz = new GZIPInputStream(socket.getInputStream());
DataInputStream input = new DataInputStream(gz);
String line = input.readUTF();
I would also note that your code doesn't actually compile, and that unless the messages are several orders of magnitude larger, there is no benefit to gzipping them.

Different Number of Character in Java Android InputStream and C++ ifstream

So, I am developing an Android application that reads a JSON text file containing some data. I have a 300 KB (307,312 bytes) JSON text file (here). I am also developing a desktop application (C++) to generate, load, and parse the JSON text file.
When I open and read it using ifstream in C++, I get the correct string length (307,312). I even parse it successfully.
Here is my code in C++:
std::string json = "";
std::string line;
std::ifstream myfile("textfile.txt");
if (myfile.is_open()) {
    while (std::getline(myfile, line)) {
        json += line;
        json.push_back('\n');
    }
    json.pop_back(); // pop back the last '\n'
    myfile.close();
} else {
    std::cout << "Unable to open file";
}
In my Android application, I put the JSON text file in the res/raw folder. When I open and read it using InputStream, the length of the string is only 291,896, and I can't parse it (I parse it through JNI with the same C++ code, though maybe that is not important).
InputStream is = getResources().openRawResource(R.raw.textfile);
byte[] b = new byte[is.available()];
is.read(b);
in_str = new String(b);
UPDATE:
I also tried it this way.
InputStream is = getResources().openRawResource(R.raw.textfile);
BufferedReader reader = new BufferedReader(new InputStreamReader(is));
String line = reader.readLine();
while (line != null) {
    in_str += line;
    in_str += '\n';
    line = reader.readLine();
}
if (in_str != null && in_str.length() > 0) {
    in_str = in_str.substring(0, in_str.length() - 1);
}
I even tried moving the file from the res/raw folder to the assets folder of the Android project, changing the InputStream line to InputStream is = getAssets().open("textfile.txt"). It still does not work.
Okay, I found the solution. It is an ASCII versus UTF-8 problem.
From here:
UTF-8: variable-length encoding, 1-4 bytes per code point. ASCII values are encoded as ASCII using 1 byte.
ASCII: single-byte encoding.
My file size is 307,312 bytes and I basically need one character per byte. So I need to read the file as ASCII.
When I use C++ ifstream, the string size is 307,312 (the same as the number of characters under ASCII encoding).
Meanwhile, when I use Java InputStream, the string size is 291,896. I assume this happens because the reader is using UTF-8 encoding instead.
So, how do we get ASCII encoding in Java?
Through this thread and this article, we can use InputStreamReader in Java and set it to ASCII. Here is my complete code:
String in_str = "";
try {
    InputStream is = getResources().openRawResource(R.raw.textfile);
    BufferedReader reader = new BufferedReader(new InputStreamReader(is, "ASCII"));
    String line = reader.readLine();
    while (line != null) {
        in_str += line;
        in_str += '\n';
        line = reader.readLine();
    }
    if (in_str != null && in_str.length() > 0) {
        in_str = in_str.substring(0, in_str.length() - 1);
    }
} catch (Exception e) {
    e.printStackTrace();
}
If you have the same problem, hope this helps. Cheers.
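The length difference described above can be reproduced in a small sketch. The sample text and the class name are made up for illustration; the only assumption about the real file is that it contains some multi-byte UTF-8 characters, which the byte and character counts in the question suggest.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the sample text is not from the original file.
class CharsetLength {
    // Number of Java chars produced when the bytes are decoded with the given charset.
    static int decodedLength(byte[] bytes, Charset charset) {
        return new String(bytes, charset).length();
    }

    public static void main(String[] args) {
        // "café": the 'é' takes two bytes in UTF-8, so this is 5 bytes for 4 characters.
        byte[] bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println("bytes:            " + bytes.length);
        System.out.println("UTF-8 chars:      " + decodedLength(bytes, StandardCharsets.UTF_8));
        System.out.println("US-ASCII chars:   " + decodedLength(bytes, StandardCharsets.US_ASCII));
        System.out.println("ISO-8859-1 chars: " + decodedLength(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

Note that decoding as US-ASCII makes the lengths match only because each non-ASCII byte becomes a replacement character; the original non-ASCII characters are not preserved. If the native parser only needs one char per byte, ISO-8859-1 keeps every byte value distinct, and if it can accept bytes directly, passing a byte[] avoids the decoding step altogether.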

Desired result data is not coming from compressed base64 string when scanning QR code from Android phone?

I extracted a base64 string from an image, then compressed that base64 string, and then generated a QR code from the compressed string.
But when I scan the QR code with an Android phone, I get a value like 17740296, which is not the value I want. My goal is to decompress the scanned value and display the image as a bitmap decoded from the base64. What is wrong with my code? I am using Java to generate the QR code (I have also tried UTF-8, but it does not work). This code works for a plain String but not for the image.
Compressing Code is -
public static String compressBD(String str) throws IOException {
    if (str == null || str.length() == 0) {
        return str;
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    GZIPOutputStream gzip = new GZIPOutputStream(out);
    gzip.write(str.getBytes());
    gzip.close();
    return out.toString("ISO-8859-1");
}
Decompress Code is -
public static String decompressBD(String str) throws Exception {
    if (str == null || str.length() == 0) {
        return str.toString();
    }
    // System.out.println("Input String length : " + str.toString().length());
    GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(str.getBytes("ISO-8859-1")));
    BufferedReader bf = new BufferedReader(new InputStreamReader(gis, "ISO-8859-1"));
    String outStr = "";
    String line;
    while ((line = bf.readLine()) != null) {
        outStr += line;
    }
    // System.out.println("Output String length : " + outStr.length());
    return outStr;
}
This won't work.
Base64 is an encoding that can be used entirely in strings. You can print it on a piece of paper, type it into your machine, and decode an image from it.
However, if you apply gzip compression and turn the result into a string, the compression produces bytes that fall outside the printable range of the string encoding and cannot be printed or presented as text.
Base64 is meant to be the text-safe encoding for binary data; it does not compress anything (it grows the data by about a third). I would really encourage you not to use a string as storage, but to store or transmit the binary data directly. This would also be faster, since it skips the Base64 encoding and decoding steps.
Its purpose is entirely to store binary content, which contains non-printable bytes, in text messages, for whatever reason.
I hope this is understandable, but it basically means you can't store gzipped Base64 content in a string. You have to store the binary representation if you want to compress.
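If the result still has to travel as text (as in a QR code holding a string), one common workaround, which is not what this answer proposes, is to Base64-encode the gzip output rather than converting it to a String directly. The sketch below assumes UTF-8 for the text and uses made-up class and method names; whether the result is small enough for a QR code is a separate question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, shown only to illustrate the byte/text boundary.
class GzipBase64 {
    // text -> UTF-8 bytes -> gzip bytes -> printable Base64 string
    static String compressToBase64(String text) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    // printable Base64 string -> gzip bytes -> UTF-8 bytes -> text
    static String decompressFromBase64(String encoded) {
        byte[] compressed = Base64.getDecoder().decode(encoded);
        try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream plain = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gis.read(buf)) != -1) {
                plain.write(buf, 0, n);
            }
            return new String(plain.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because image data that has already been compressed (JPEG, PNG) rarely shrinks much further, the gzip step may gain little here; encoding the raw image bytes as Base64 once, without the extra layers, is the simpler path.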

Java buffered base64 encoder for streams

I have lots of PDF files whose content I need to encode as Base64. I have an Akka app that fetches the files as streams and distributes them to many workers, which encode the files and return the Base64 string for each file. I have a basic solution for the encoding:
import org.apache.commons.codec.binary.Base64InputStream;
...
Base64InputStream b64IStream = null;
InputStreamReader reader = null;
BufferedReader br = null;
StringBuilder sb = new StringBuilder();
try {
    b64IStream = new Base64InputStream(input, true);
    reader = new InputStreamReader(b64IStream);
    br = new BufferedReader(reader);
    String line;
    while ((line = br.readLine()) != null) {
        sb.append(line);
    }
} finally {
    if (b64IStream != null) {
        b64IStream.close();
    }
    if (reader != null) {
        reader.close();
    }
    if (br != null) {
        br.close();
    }
}
It works, but I would like to know what would be the best way that I can encode the files using a buffer and if there is a faster alternative for this.
I tested some other approaches such as:
Base64.getEncoder
sun.misc.BASE64Encoder
Base64.encodeBase64
javax.xml.bind.DatatypeConverter.printBase64Binary
com.google.common.io.BaseEncoding.base64
They are faster but they need the entire file, correct? Also, I do not want to block other threads while encoding 1 PDF file.
Any input is really helpful. Thank you!
Fun fact about Base64: It takes three bytes, and converts them into four letters. This means that if you read binary data in chunks that are divisible by three, you can feed the chunks to any Base64 encoder, and it will encode it in the same way as if you fed it the entire file.
Now, if you want your output stream to just be one long line of Base64 data - which is perfectly legal - then all you need to do is something along the lines of:
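private static final int BUFFER_SIZE = 3 * 1024;
try (BufferedInputStream in = new BufferedInputStream(input, BUFFER_SIZE)) {
    Base64.Encoder encoder = Base64.getEncoder();
    StringBuilder result = new StringBuilder();
    byte[] chunk = new byte[BUFFER_SIZE];
    int len = 0;
    // readNBytes (Java 9+) keeps reading until the chunk is full or the stream ends;
    // a plain read() may return fewer bytes mid-stream and break the multiple-of-3 rule.
    while ((len = in.readNBytes(chunk, 0, BUFFER_SIZE)) == BUFFER_SIZE) {
        result.append(encoder.encodeToString(chunk));
    }
    if (len > 0) {
        // Last, partial chunk: encode only the bytes actually read.
        chunk = Arrays.copyOf(chunk, len);
        result.append(encoder.encodeToString(chunk));
    }
}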
This means that only the last chunk may have a length that is not divisible by three and will therefore contain the padding characters.
The above example is with Java 8 Base64, but you can really use any encoder that takes a byte array of an arbitrary length and returns the base64 string of that byte array.
This means that you can play around with the buffer size as you wish.
If you want your output to be MIME compatible, however, you need to have the output separated into lines. In this case, I would set the chunk size in the above example to something that, when multiplied by 4/3, gives you a round number of lines. For example, if you want to have 64 characters per line, each line encodes 64 / 4 * 3, which is 48 bytes. If you encode 48 bytes, you'll get one line. If you encode 480 bytes, you'll get 10 full lines.
So modify the above BUFFER_SIZE to something like 4800. Instead of Base64.getEncoder() use Base64.getMimeEncoder(64,new byte[] { 13, 10}). And then, when it encodes, you'll get 100 full-sized lines from each chunk except the last. You may need to add a result.append("\r\n") to the while loop.
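The MIME variant might look like the following sketch. It assumes Java 9+ for InputStream.readNBytes, and the class name, method name, and constant names are made up for illustration. The separator is added before every chunk except the first, so the output does not end with a stray line break.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class sketching the chunked MIME encoding described above.
class MimeChunkedBase64 {
    // 4800 input bytes = 100 full lines of 64 characters (48 bytes per line).
    static final int CHUNK = 4800;
    static final byte[] CRLF = {13, 10};

    static String encode(InputStream input) {
        Base64.Encoder encoder = Base64.getMimeEncoder(64, CRLF);
        StringBuilder result = new StringBuilder();
        byte[] chunk = new byte[CHUNK];
        try (InputStream in = input) {
            int len;
            // readNBytes only returns a short count at end of stream,
            // so every chunk except possibly the last is exactly CHUNK bytes.
            while ((len = in.readNBytes(chunk, 0, CHUNK)) > 0) {
                if (result.length() > 0) {
                    // The encoder adds no separator after its last line,
                    // so join consecutive chunks manually.
                    result.append("\r\n");
                }
                byte[] part = (len == CHUNK) ? chunk : Arrays.copyOf(chunk, len);
                result.append(encoder.encodeToString(part));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toString();
    }
}
```

For large files, appending to a StringBuilder still holds the whole encoded result in memory; writing each encoded chunk to a Writer or OutputStream instead would keep memory use bounded by the chunk size.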

Read String and bytes from the same file java

I'm looking for a way to switch between reading bytes (as byte[]) and reading lines of Strings from a file. I know that a byte[] can be obtained from a file through a FileInputStream, and a String can be obtained through a BufferedReader, but using both of them at the same time is proving problematic. I know how long each section of bytes is. The String encoding can be kept constant from when I write the file. The file type is a custom one that is still in development, so I can change how I write data to it.
How can I read Strings and byte[]s from the same file in java?
Read as bytes. When you have read a sequence of bytes that you know should be a string, place those bytes in an array, put the array inside a ByteArrayInputStream and use that as the underlying InputStream for a Reader to get the bytes as characters, then read those characters to produce a String.
For the later parts of this process see the related SO question on how to create a String from an InputStream.
Read the file as Strings using a BufferedReader then use String.getBytes().
Why not try this:
BufferedReader bufferedReader = null;
try {
    bufferedReader = new BufferedReader(new FileReader("testing.txt"));
    String line = bufferedReader.readLine();
    while (line != null) {
        byte[] b = line.getBytes();
        line = bufferedReader.readLine(); // advance, otherwise the loop never ends
    }
} finally {
    if (bufferedReader != null) {
        bufferedReader.close();
    }
}
or
FileInputStream in = null;
BufferedReader bufferedReader = null;
try {
    bufferedReader = new BufferedReader(new FileReader("xanadu.txt"));
    String line = bufferedReader.readLine();
    while (line != null) {
        //read your line
        line = bufferedReader.readLine(); // advance, otherwise the loop never ends
    }
    in = new FileInputStream("xanadu.txt");
    int c;
    while ((c = in.read()) != -1) {
        //read your bytes (c)
    }
} finally {
    if (in != null) {
        in.close();
    }
    if (bufferedReader != null) {
        bufferedReader.close();
    }
}
Read everything as bytes from the buffered input stream, and convert the string sections into Strings using the constructor that accepts a byte array:
String string = new String(bytes, offset, length, "US-ASCII");
Depending on how the data are actually encoded, you may need to use "UTF-8" or something else as the name of the charset.
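Since the file format is still under the asker's control, one way to apply the read-everything-as-bytes advice is to length-prefix each section. The sketch below assumes a made-up layout (4-byte length, UTF-8 text, 4-byte length, raw bytes) and hypothetical class names; it is not a format taken from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical record format: [int textLen][text as UTF-8][int blobLen][blob].
class MixedRecords {
    static final class Entry {
        final String text;
        final byte[] blob;

        Entry(String text, byte[] blob) {
            this.text = text;
            this.blob = blob;
        }
    }

    // Serialize one text section followed by one binary section.
    static byte[] write(String text, byte[] blob) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            byte[] textBytes = text.getBytes(StandardCharsets.UTF_8);
            out.writeInt(textBytes.length);
            out.write(textBytes);
            out.writeInt(blob.length);
            out.write(blob);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Read both sections through a single byte stream; no Reader is involved,
    // so nothing buffers ahead into the binary part.
    static Entry read(InputStream source) {
        try {
            DataInputStream in = new DataInputStream(source);
            byte[] textBytes = new byte[in.readInt()];
            in.readFully(textBytes);
            byte[] blob = new byte[in.readInt()];
            in.readFully(blob);
            return new Entry(new String(textBytes, StandardCharsets.UTF_8), blob);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Mixing a BufferedReader and a FileInputStream on the same file tends to fail because the reader buffers ahead and consumes bytes that belong to the binary section; keeping a single byte-oriented stream and decoding only the known text ranges avoids that.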
