Java: Gzip string to output string - java

How do I take a string, gzip it with something like GZIPOutputStream, and then output the zipped content as a string?
My intention is to transfer the zipped content as a POST variable over HTTP.

The steps are actually pretty simple:
Wrap a ByteArrayOutputStream in a GZIPOutputStream, write the string's bytes to it, then close the GZIPOutputStream so the gzip trailer is written
Call ByteArrayOutputStream.toByteArray() to get the byte array
Use a Base64 encoder on the result
The server will perform essentially the reverse of these operations.
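The steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the class and method names (GzipBase64, compress, decompress) are made up for illustration, it assumes UTF-8 text, uses java.util.Base64 (Java 8+) as the encoder, and relies on InputStream.readAllBytes() (Java 9+) for the reverse direction.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; the names are illustrative only.
final class GzipBase64 {

    // Gzip the UTF-8 bytes of the text, then Base64-encode the compressed bytes.
    static String compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed at this point, so the trailer has been written.
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    // The reverse: Base64-decode, then gunzip back to a UTF-8 string.
    static String decompress(String encoded) {
        byte[] compressed = Base64.getDecoder().decode(encoded);
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Standard Base64 output can contain '+', '/' and '=', which still need URL-encoding in a form body; Base64.getUrlEncoder() (with getUrlDecoder() on the server) is one way to avoid that.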

Related

Read AWS S3 GZIP Object using GetObjectRequest with range

I am trying to read a large gzip-compressed object (.gz) from AWS S3. I don't want to read the whole object; I want to read it in parts so that I can process the uncompressed data in parallel.
I am reading it with GetObjectRequest and the "Range" header, where I set the byte range.
However, when I request a byte range in the middle of the object, such as (100, 200), it fails with "Not in GZIP format".
The reason is that the request returns a stream that starts mid-file, and GZIPInputStream expects the stream to begin with the gzip magic number (GZIP_MAGIC = 0x8b1f), which is not present in the ranged stream.
GetObjectRequest rangeObjectRequest = new GetObjectRequest(<<Bucket>>, <<Key>>).withRange(100, 200);
S3Object object = s3Client.getObject(rangeObjectRequest);
S3ObjectInputStream rawData = object.getObjectContent();
InputStream data = new GZIPInputStream(rawData);
Can anyone guide the right approach?
GZIP is a compression format in which decoding each byte depends on the bytes that precede it, which means that you can't pick an arbitrary byte range out of the file and make sense of it.
If you need to read byte ranges, you'll need to store it uncompressed.
You could also create your own file storage format that stores chunks of the file as separately-compressed blocks. You could do this using the ZIP format, where each file in the archive represents a specific block size. But you'd need to implement your own ZIP directory reader to make that work.
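As a rough illustration of the separately-compressed-blocks idea, here is a sketch that uses independent gzip members plus an offset index instead of a ZIP archive. Everything in it (ChunkedGzip, pack, readBlock, the index layout) is a made-up format for illustration, not something S3 or the AWS SDK provides, and it works on in-memory byte arrays to stay self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical container: concatenated gzip members plus an offset index.
final class ChunkedGzip {
    final byte[] data;    // all compressed blocks, back to back
    final int[] offsets;  // offsets[i] = start of block i; last entry = data.length

    private ChunkedGzip(byte[] data, int[] offsets) {
        this.data = data;
        this.offsets = offsets;
    }

    // Compress each blockSize-byte slice of raw as its own gzip member.
    static ChunkedGzip pack(byte[] raw, int blockSize) {
        int blocks = (raw.length + blockSize - 1) / blockSize;
        int[] offsets = new int[blocks + 1];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            for (int i = 0; i < blocks; i++) {
                offsets[i] = out.size();
                int start = i * blockSize;
                int length = Math.min(blockSize, raw.length - start);
                // Closing writes the gzip trailer; closing a
                // ByteArrayOutputStream itself has no effect.
                try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
                    gzip.write(raw, start, length);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        offsets[blocks] = out.size();
        return new ChunkedGzip(out.toByteArray(), offsets);
    }

    // Decompress one block using only its own byte range.
    byte[] readBlock(int index) {
        int start = offsets[index];
        int length = offsets[index + 1] - start;
        try (GZIPInputStream gzip = new GZIPInputStream(
                 new ByteArrayInputStream(data, start, length))) {
            return gzip.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With S3, the index would have to be stored somewhere the readers can get at it (an assumption of this sketch), and each worker would request the range from offsets[i] to offsets[i + 1] - 1, since the range end is inclusive.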

Reading a binary file from the file system as a BLOB to use in rhino with javascript

I'm planning to use SheetJS with Rhino. SheetJS takes a binary object (a BLOB, if I'm correct) as its input, so I need to read a file from the file system using standard Java I/O methods and store it in a blob before passing it to SheetJS, e.g.:
var XLDataWorkBook = XLSX.read(blobInput, {type : "binary"});
So how can I create a BLOB (or another appropriate type) from a binary file in Java in order to pass it in?
I guess I can't pass streams, because XLSX probably needs a completely created object to process.
I found the answer to this myself. I was able to get it done this way.
Read the file with an InputStream and write it to a ByteArrayOutputStream, like below.
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
...
buffer.write(bytes, 0, len);
Then create a byte array from it.
byte[] byteArray = buffer.toByteArray();
Finally, I converted it to a Base64 string (which also works in my case) using the Base64.encodeBase64String() method from the org.apache.commons.codec.binary package, so I can pass the Base64 string as a method parameter.
If you need it, there are plenty of libraries (third-party and built-in) available for Base64-to-Blob conversion as well.
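Put together, the read loop and the encoding step might look like the sketch below. Note the substitutions: it uses java.util.Base64 from the JDK (Java 8+) rather than the commons-codec method mentioned above, and the class and method names (FileToBase64, encode, encodeFile) are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper; names are illustrative only.
final class FileToBase64 {

    // Drain the stream into a buffer, writing only the bytes actually read.
    static String encode(InputStream in) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] bytes = new byte[8192];
        int len;
        try {
            while ((len = in.read(bytes)) != -1) {
                buffer.write(bytes, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] byteArray = buffer.toByteArray();
        return Base64.getEncoder().encodeToString(byteArray);
    }

    // Convenience wrapper for a file on disk.
    static String encodeFile(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            return encode(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If I recall the SheetJS options correctly, a Base64 string can be handed to XLSX.read with {type : "base64"} instead of "binary", but check that against the SheetJS documentation.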

Send textfile over socket using writer/reader?

Is there a way to send a textfile from client to server using XXXwriter and XXXreader instead of sending bytes?
Any suggestions?
You can wrap the InputStream in an InputStreamReader, and the OutputStream in an OutputStreamWriter. These classes bridge between binary data (byte[], *Stream) and Java's Unicode text (String, char, *Reader, *Writer). Use the constructor that takes the correct encoding, and use the same encoding on both sides.
Charset encoding = StandardCharsets.UTF_8;
// or, by name: String encoding = "Windows-1252";
... new InputStreamReader(inputStream, encoding);
This assumes, however, that the underlying stream transfer itself is done correctly. Common errors are:
forgetting to close (or flush), so that not all data is transferred;
using available(), which is not needed;
reading into a buffer but not writing the actual number of bytes read, which leaves old data at the end.
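A small sketch of the wrapping approach that also avoids the errors listed above. The class and method names (TextTransfer, send, receive) are hypothetical, UTF-8 is an assumption, and the methods take plain streams so they work with socket.getOutputStream() and socket.getInputStream() as well as with in-memory streams.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; names are illustrative only.
final class TextTransfer {
    // Both sides must agree on the encoding.
    static final Charset ENCODING = StandardCharsets.UTF_8;

    // Client side: read characters from source and write them encoded to out.
    static void send(Reader source, OutputStream out) {
        try {
            Writer writer = new OutputStreamWriter(out, ENCODING);
            copy(source, writer);
            writer.flush(); // push any buffered characters onto the wire
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Server side: decode bytes from in and write the characters to sink.
    static void receive(InputStream in, Writer sink) {
        try {
            copy(new InputStreamReader(in, ENCODING), sink);
            sink.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Copy until end of stream, writing only the characters actually read.
    private static void copy(Reader from, Writer to) throws IOException {
        char[] buffer = new char[4096];
        int n;
        while ((n = from.read(buffer)) != -1) {
            to.write(buffer, 0, n);
        }
    }
}
```

With a real socket, the sender still has to close the socket (or call shutdownOutput()) after send, otherwise the receiver never sees end of stream and its read loop blocks.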

Saving a file returned by web-service in java

I have to call a web service whose response looks like this:
<ns2:wsresponse>
<ns2:length>10582</ns2:length>
<ns2:filecontent>H4sIAAAAAAAAALVZa3OjSLL9fB...
(Snip)
</ns2:filecontent>
<ns2:contentType>application/gzip</ns2:contentType>
</ns2:wsresponse>
The web service is actually returning a file with the MIME type application/gzip (as given in ns2:contentType). How do I save this file to disk on the client side in Java?
The <ns2:filecontent> tag appears to hold a Base64-encoded string, which is probably the content of the file.
Basically, take that Base64-encoded string and decode it; the resulting byte[] can then be written to disk.
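A minimal sketch of the decode-and-save step, assuming the text of <ns2:filecontent> has already been extracted into a String by whatever XML or SOAP library is in use. The class and method names (ResponseFileSaver, decode, save) are hypothetical. It uses the JDK's MIME decoder because that one ignores line breaks, which often appear inside long Base64 payloads in XML.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper; names are illustrative only.
final class ResponseFileSaver {

    // Decode the element text; the MIME decoder skips line separators
    // and other characters outside the Base64 alphabet.
    static byte[] decode(String fileContent) {
        return Base64.getMimeDecoder().decode(fileContent);
    }

    // Write the decoded bytes to target, replacing any existing file.
    static void save(String fileContent, Path target) {
        try {
            Files.write(target, decode(fileContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Since the content type is application/gzip, the saved file is still compressed. Give it a .gz name, or wrap the decoded bytes in a GZIPInputStream if the uncompressed content is what you want on disk.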

InputStream reading

Good night in my timezone.
I am building an HTTP bot, and when I receive the response from the server I want to do two things. First, print the body of the response. Second, because I know that the body is of type text/html, parse it with an HTML parser (in this case NekoHTML).
Snippet of code :
//Print the first call
printResponse(urlConnection.getInputStream());
document = new InputSource(urlConnection.getInputStream());
parser.setDocument(document);
The problem is that after the first line (printResponse) runs, the second line throws an exception.
Does this happen because an InputStream can only be read once? Are the bytes consumed every time we read from the stream?
How can we read the content of the InputStream more than once?
Thanks in advance
Best regards
In addition to what Ted Hopp said, take a look at the Apache Commons IO library. You will find:
IOUtils.toString(urlConnection.getInputStream(), "UTF-8"): a utility method that fully reads an input stream and returns a string in the given encoding
TeeInputStream: an InputStream decorator that also copies every byte read into a given output stream.
Should work:
InputStream is = new TeeInputStream(urlConnection.getInputStream(), System.out);
Read the response from the server into a byte array. You can then create a ByteArrayInputStream to repeatedly read the bytes.
As Ted Hopp said:
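// Read the whole body once; available() only allocates a buffer and does not report the full length.
byte[] bytes = IOUtils.toByteArray(urlConnection.getInputStream());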
printResponse(new ByteArrayInputStream(bytes));
document = new InputSource(new ByteArrayInputStream(bytes));
parser.setDocument(document);
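The same buffering idea with only the JDK might look like this sketch. BufferedBody and its methods are hypothetical names, and InputStream.readAllBytes() needs Java 9 or later. The stream is consumed exactly once, and every later reader gets its own ByteArrayInputStream over the same bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper; names are illustrative only.
final class BufferedBody {
    private final byte[] bytes;

    // Consume the source stream exactly once and keep its content in memory.
    BufferedBody(InputStream source) {
        try (InputStream in = source) {
            this.bytes = in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Each call returns a fresh, independent stream over the same bytes.
    InputStream open() {
        return new ByteArrayInputStream(bytes);
    }

    // Decode the body as text, for example to print it.
    String asText(Charset charset) {
        return new String(bytes, charset);
    }
}
```

In the question's code this would mean building one BufferedBody from urlConnection.getInputStream(), then passing body.open() to printResponse and another body.open() to new InputSource(...). This holds the whole response in memory, which is fine for typical HTML pages but not for very large bodies.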
