Java decompressing array of bytes - java

On server (C++), binary data is compressed using ZLib function:
compress2()
and it's sent over to client (Java).
On client side (Java), data should be decompressed using the following code snippet:
public static String unpack(byte[] packedBuffer) {
InflaterInputStream inStream = new InflaterInputStream(new ByteArrayInputStream(packedBuffer));
ByteArrayOutputStream outStream = new ByteArrayOutputStream();
int readByte;
try {
while((readByte = inStream.read()) != -1) {
outStream.write(readByte);
}
} catch(Exception e) {
JMDCLog.logError(" unpacking buffer of size: " + packedBuffer.length);
e.printStackTrace();
// ... the rest of the code follows
}
The problem is that the read in the while loop always throws:
java.util.zip.ZipException: invalid stored block lengths
Before I check for other possible causes, can someone please tell me whether I can compress on one side with compress2 and decompress on the other side using the above code, so I can eliminate this as a problem? Also, does anyone have a clue about what might be wrong here? (I know I didn't provide much of the code, but the projects are rather big.)
Thanks.

I think the problem is not in the unpack method but in the content of packedBuffer. unpack works fine with data produced like this:
public static byte[] pack(String s) throws IOException {
ByteArrayOutputStream out = new ByteArrayOutputStream();
DeflaterOutputStream dout = new DeflaterOutputStream(out);
dout.write(s.getBytes());
dout.close();
return out.toByteArray();
}
public static void main(String[] args) throws Exception {
byte[] a = pack("123");
String s = unpack(a); // calls your unpack
System.out.println(s);
}
output
123

public static String unpack(byte[] packedBuffer) {
try (InflaterInputStream inStream = new InflaterInputStream(
new ByteArrayInputStream(packedBuffer));
ByteArrayOutputStream outStream = new ByteArrayOutputStream()) {
inStream.transferTo(outStream);
//...
return outStream.toString(StandardCharsets.UTF_8);
} catch (Exception e) {
JMDCLog.logError(" unpacking buffer of size: " + packedBuffer.length);
e.printStackTrace();
throw new IllegalArgumentException(e);
}
}
compress2() writes the zlib format (RFC 1950), which InflaterInputStream reads by default. GZIPInputStream expects the gzip format (RFC 1952) and would reject this data.
As you seem to expect the bytes to represent text, they must be in some encoding. Pass that encoding (a Charset) to the conversion to String, which always holds Unicode.
Note that UTF-8 here is the assumed encoding of the bytes; in your case it might be another encoding.
The try-with-resources syntax, ugly as it is, closes the streams even on an exception or, as here, on return.
I rethrow a RuntimeException because it seems dangerous to carry on without a result.
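Before inflating, it can help to check whether the received buffer starts like a zlib stream at all. The sketch below is only a diagnostic under stated assumptions: the class and method names are made up for illustration, and the "length prefix" case is a hypothetical failure mode, not something the question confirms. It relies on the zlib header rule from RFC 1950: the low nibble of the first byte is 8 (deflate), and the first two bytes read as a big-endian number are a multiple of 31.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

class ZlibHeaderCheck {

    /** Returns true if the buffer starts with a valid two-byte zlib header (RFC 1950). */
    static boolean looksLikeZlib(byte[] buf) {
        if (buf == null || buf.length < 2) {
            return false;
        }
        int cmf = buf[0] & 0xFF; // compression method and info
        int flg = buf[1] & 0xFF; // flags, including the header check bits
        boolean deflateMethod = (cmf & 0x0F) == 8;          // CM = 8 means deflate
        boolean checkBitsOk = ((cmf << 8) | flg) % 31 == 0; // FCHECK makes the pair a multiple of 31
        return deflateMethod && checkBitsOk;
    }

    /** Compresses with java.util.zip.Deflater, which emits the zlib wrapper by default. */
    static byte[] zlibCompress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] good = zlibCompress("123".getBytes(StandardCharsets.UTF_8));
        System.out.println(looksLikeZlib(good));

        // Hypothetical failure mode: a 4-byte length prefix left in front of the stream.
        byte[] prefixed = new byte[good.length + 4];
        System.arraycopy(good, 0, prefixed, 4, good.length);
        System.out.println(looksLikeZlib(prefixed));
    }
}
```

If the check fails on the client, the buffer is probably not what compress2() produced on the server (extra framing bytes, truncation, or a text-mode conversion somewhere in transit are common suspects).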

Related

Compress JSON to GZIP and Upload to S3

I'm trying to pull in JSON from a lambda, compress it to gzip format and upload it to S3. I can do all of this except the gzip compression. I have pulled various bits of code from here (S.O.), but none of them seems to work correctly. Here is what I have tried and the outcome.
This first method seems to make the file much smaller, and the result is in gzip format:
public void compressAndUpload(AmazonS3 s3, InputStream in) throws IOException {
Path tmpPath = Files.createTempFile("atest", ".json.gz");
OutputStream out = Files.newOutputStream(tmpPath);
GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(out);
IOUtils.copy(in, gzOut);
InputStream fileIn = Files.newInputStream(tmpPath);
long size = Files.size(tmpPath);
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentType("application/x-gzip");
metadata.setContentLength(size);
s3.putObject(bucketName, "atest.json.gz", fileIn, metadata);
}
However, when I pull it down to my local machine and try to use 'gunzip' on it, I get the following error message:
gzip: atest.json.gz: unexpected end of file
This next method is not actually compressing the file, and when I pull it down locally it says "not in gzip format":
public String handleRequest(Input input, Context context) {
try {
byte[] btArr = compress(input.getMessage());
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentType("application/x-gzip");
metadata.setContentLength(btArr.length);
AmazonS3ClientBuilder.defaultClient().putObject(new PutObjectRequest(bucketName, "test22.json.gz",
new ByteArrayInputStream(btArr), metadata));
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
public static byte[] compress(String str) throws Exception {
if (str == null || str.length() == 0) {
return null;
}
System.out.println("String length : " + str.length());
ByteArrayOutputStream obj=new ByteArrayOutputStream();
GzipCompressorOutputStream gzip = new GzipCompressorOutputStream(obj);
gzip.write(str.getBytes("UTF-8"));
gzip.flush(); // Update: this was missing and caused it to fail.
gzip.close();
return obj.toByteArray();
}
Am I missing a step here? I feel like this should be a fairly straightforward thing...
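One likely cause of the "unexpected end of file" in the first method is that the gzip stream is never closed before the temp file is read back, so the gzip trailer (CRC-32 and length) is never written. The sketch below shows the close-before-read pattern in isolation. It is an illustration under assumptions: it uses the JDK's GZIPOutputStream rather than the Commons Compress class from the question, an in-memory sink instead of the temp file, and made-up class and method names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipCloseDemo {

    /** Gzips a string; the stream is closed before the bytes are taken. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } // close() finishes the deflate stream and writes the gzip trailer
        return sink.toByteArray();
    }

    /** Reverses gzip(), to confirm the bytes form a complete gzip member. */
    static String gunzip(byte[] data) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] packed = gzip("{\"message\":\"hello\"}");
        System.out.println(gunzip(packed));
    }
}
```

Applied to the first method, the same idea would mean closing gzOut (for example with try-with-resources) before calling Files.size and Files.newInputStream on the temp file.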

java array byte file to human readable

I have a file containing a byte array, which I am trying to convert into human-readable text. I tried the following approaches:
public static void main(String args[]) throws IOException
{
//System.out.println("Platform Encoding : " + System.getProperty("file.encoding"));
FileInputStream fis = new FileInputStream("<Path>");
// Using Apache Commons IOUtils to read file into byte array
byte[] filedata = IOUtils.toByteArray(fis);
String str = new String(filedata, "UTF-8");
System.out.println(str);
}
Another approach :
public static void main(String[] args) {
File file = new File("<Path>");
readContentIntoByteArray(file);
}
private static byte[] readContentIntoByteArray(File file) {
FileInputStream fileInputStream = null;
byte[] bFile = new byte[(int) file.length()];
try {
fileInputStream = new FileInputStream(file);
fileInputStream.read(bFile);
fileInputStream.close();
for (int i = 0; i < bFile.length; i++) {
System.out.print((char) bFile[i]);
}
} catch (Exception e) {
e.printStackTrace();
}
return bFile;
}
Both snippets compile, but neither yields human-readable output. Excuse me if this is a repeated or basic question.
Could someone please tell me where I am going wrong here?
Your code (from the first snippet) for decoding a byte file into UTF-8 text looks correct to me (assuming FileInputStream fis = new FileInputStream("Path") yields the correct file input stream).
If you're expecting a text file but are not sure which encoding it is in (perhaps it's not UTF-8), you can use a library like the one below to find out.
https://code.google.com/archive/p/juniversalchardet/
Or just try some of the different Charsets in your String initialization line and see what each one produces:
new String(byteArray, Charset.defaultCharset()) // try other Charsets here.
The second method you show has pitfalls in the byte-to-char conversion, depending on the characters, as discussed here (Byte and char conversion in Java).
Chances are that if you cannot find a valid encoding for this file, it was not human readable to begin with, or the byte array file passed to you lost something along the way that made it decodable.
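Trying charsets by eye can be misleading, because new String(bytes, charset) silently replaces invalid sequences. A minimal sketch of a stricter check follows, assuming you have the bytes in memory; the class and method names are illustrative, and the sample bytes stand in for the real file content. It uses a CharsetDecoder configured to report errors instead of replacing them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

class StrictDecode {

    /** Decodes the bytes, or returns empty if they are not valid in the given charset. */
    static Optional<String> tryDecode(byte[] data, Charset charset) {
        try {
            return Optional.of(charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data))
                    .toString());
        } catch (CharacterCodingException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the file content: "café" encoded as ISO-8859-1.
        byte[] data = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        for (Charset cs : new Charset[] {StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1}) {
            System.out.println(cs + ": " + tryDecode(data, cs).orElse("<not valid in this charset>"));
        }
    }
}
```

Keep in mind that single-byte charsets such as ISO-8859-1 accept every byte sequence, so a successful decode there proves little; a strict UTF-8 failure is the more informative signal.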

Why is my binary data bigger after getting it from the webserver?

I need to serve a binary file through a web service implemented in Python/Django. The problem is, that when I compare the original file with the transferred file with vbindiff I see trailing bytes on the transferred file, sadly rendering it useless.
The binary file is fetched and saved by a Java client with:
HttpURLConnection userdataConnection = null;
URL userdataUrl = null;
try {
userdataUrl = new URL("http://localhost:8000/app/vuforia/10");
userdataConnection = (HttpURLConnection) userdataUrl.openConnection();
userdataConnection.setRequestMethod("GET");
userdataConnection.setRequestProperty("Content-Type", "application/octet-stream");
userdataConnection.connect();
InputStream userdataStream = new BufferedInputStream(userdataConnection.getInputStream());
try (ByteArrayOutputStream fileStream = new ByteArrayOutputStream()) {
byte[] buffer = new byte[4094];
while (userdataStream.read(buffer) != -1) {
fileStream.write(buffer);
}
byte[] fileBytes = fileStream.toByteArray();
try (FileOutputStream fos = new FileOutputStream("./test.dat")) {
fos.write(fileBytes);
}
}
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
I think that HttpURLConnection.getInputStream only reads the body of the response, doesn't it?
This code serves the data in the backend
in views.py:
if request.method == "GET":
all_data = VuforiaDatabase.objects.all()
data = all_data.get(id=version)
return FileResponse(data.get_dat_bytes())
in models.py:
def get_dat_bytes(self):
return self.dat_upload.open()
How do I go about transferring the binary data 1:1?
You’re ignoring the return value of InputStream.read.
From the documentation:
Returns:
the total number of bytes read into the buffer, or -1 if there is no more data because the end of the stream has been reached.
Your code is assuming that the buffer is filled with every call to userdataStream.read(buffer), instead of checking how many bytes were actually read into buffer.
You don’t need to write the read loop yourself at all. Just use Files.copy:
Path file = Paths.get("./test.dat");
try (InputStream userdataStream = new BufferedInputStream(userdataConnection.getInputStream())) {
Files.copy(userdataStream, file, StandardCopyOption.REPLACE_EXISTING);
}
You always write a multiple of 4094 bytes, no matter how many bytes you actually read.
Don't do .write(buffer); write only the amount you actually read. That amount is what userdataStream.read returns. It can be smaller than the buffer size while still being positive.
If your project already uses Apache Commons IO, you can just use FileUtils.copyInputStreamToFile.
Note: 4K = 4096, not 4094, and it's a ridiculously small buffer, unless you operate something like a smartcard. On a PC, use something like a few hundred KB at least.
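For reference, the manual loop both answers describe can be sketched as follows. This is a minimal illustration with made-up names; the in-memory streams stand in for the HTTP response body and the output file. The key line is the three-argument write, which emits only the n bytes the last read() actually delivered.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class CopyLoop {

    /** Copies everything from in to out, writing only the bytes each read() returned. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // not out.write(buffer): the tail may hold stale bytes
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] original = new byte[10_000]; // stand-in for the response body
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(original), sink);
        System.out.println(copied + " bytes copied, sink holds " + sink.size());
    }
}
```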

Gzip used for compression returns the similar size of that of original

I have written this code to compress an input stream:
public InputStream compress(InputStream inputStream) throws IOException {
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
try {
GZIPOutputStream gzipOutputStream = new GZIPOutputStream(byteArrayOutputStream);
byte[] inBytes = IOUtils.toByteArray(inputStream);
gzipOutputStream.write(inBytes);
ByteArrayInputStream retStream = new ByteArrayInputStream(byteArrayOutputStream.toByteArray());
gzipOutputStream.close();
return retStream;
} catch (IOException e) {
throw new RuntimeException(e);
}
}
It returns a stream of similar size. Is there any bug in the code?
The input stream passed in is that of an image file, supplied through a test:
@Test
public void test() throws IOException {
try {
in = this.getClass().getResourceAsStream("/testimage.jpg");
} catch (NullPointerException e) {
logger.log(Level.SEVERE, "Exception reading file in test - ", e);
}
compress(in);
}
A JPEG (.jpg) file is already compressed. Trying to compress it again will most likely expand it a tiny bit.
If you compress already compressed content, it won't get much smaller (it can even get bigger). Infinite compression is not possible. JPEG is already compressed (lossily). If you gzipped a bitmap or a simple text file, you would notice the difference.
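The effect is easy to reproduce without an image file. The sketch below is an illustration under assumptions: pseudo-random bytes stand in for already-compressed JPEG data, and the class name is made up. It gzips a highly repetitive buffer and a noisy one of the same size and prints both results. It also closes the GZIPOutputStream before taking the bytes; in the question's compress method, toByteArray() is called before close(), so the returned snapshot can miss the end of the gzip stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.GZIPOutputStream;

class GzipRatioDemo {

    /** Gzips a byte array fully in memory. */
    static byte[] gzip(byte[] input) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(input);
        } // closed here, so the trailer is written before toByteArray()
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] repetitive = new byte[100_000];
        Arrays.fill(repetitive, (byte) 'a'); // compresses extremely well

        byte[] noise = new byte[100_000];
        new Random(42).nextBytes(noise);     // stand-in for already-compressed data

        System.out.println("repetitive: " + repetitive.length + " -> " + gzip(repetitive).length);
        System.out.println("noise:      " + noise.length + " -> " + gzip(noise).length);
    }
}
```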

Read first bytes of a file

I need a very simple function that allows me to read the first 1k bytes of a file through FTP. I want to use it in MATLAB to read the first lines and, according to some parameters, to download only files I really need eventually. I found some examples online that unfortunately do not work. Here I'm proposing the sample code where I'm trying to download one single file (I'm using the Apache libraries).
FTPClient client = new FTPClient();
FileOutputStream fos = null;
try {
client.connect("data.site.org");
// filename to be downloaded.
String filename = "filename.Z";
fos = new FileOutputStream(filename);
// Download file from FTP server
InputStream stream = client.retrieveFileStream("/pub/obs/2008/021/ab120210.08d.Z");
byte[] b = new byte[1024];
stream.read(b);
fos.write(b);
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (fos != null) {
fos.close();
}
client.disconnect();
} catch (IOException e) {
e.printStackTrace();
}
}
The error is in stream, which comes back empty. I know I'm passing the folder name the wrong way, but I cannot work out how it should be done. I've tried many ways.
I've also tried Java's URL classes:
URL url;
url = new URL("ftp://data.site.org/pub/obs/2008/021/ab120210.08d.Z");
URLConnection con = url.openConnection();
BufferedInputStream in =
new BufferedInputStream(con.getInputStream());
FileOutputStream out =
new FileOutputStream("C:\\filename.Z");
int i;
byte[] bytesIn = new byte[1024];
if ((i = in.read(bytesIn)) >= 0) {
out.write(bytesIn);
}
out.close();
in.close();
but it gives an error when I close the InputStream in!
I'm definitely stuck. Any comments would be very useful!
Try this test
InputStream is = new URL("ftp://test:test@ftp.secureftp-test.com/bookstore.xml").openStream();
byte[] a = new byte[1000];
int n = is.read(a);
is.close();
System.out.println(new String(a, 0, n));
it definitely works
In my experience, when you read bytes from a stream obtained from ftpClient.retrieveFileStream, the first read is not guaranteed to fill your byte buffer. Either loop on the return value of stream.read(b) until the buffer is full, or use a library to fill the 1024-byte buffer:
InputStream stream = null;
try {
// Download file from FTP server
stream = client.retrieveFileStream("/pub/obs/2008/021/ab120210.08d.Z");
byte[] b = new byte[1024];
IOUtils.read(stream, b); // calls stream.read() repeatedly until the buffer is full or end-of-file is reached
fos.write(b);
} catch (IOException e) {
e.printStackTrace();
} finally {
IOUtils.closeQuietly(stream);
}
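The "loop on the return value" option can be sketched with the JDK alone. This is a minimal illustration, not the FTP code itself: the class and method names are made up, and the trickle helper is a hypothetical test stream that returns at most 100 bytes per call to imitate a slow network source.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class ReadPrefix {

    /** Reads up to limit bytes, looping because a single read() may return fewer. */
    static byte[] readPrefix(InputStream in, int limit) throws IOException {
        byte[] buffer = new byte[limit];
        int filled = 0;
        while (filled < limit) {
            int n = in.read(buffer, filled, limit - filled);
            if (n == -1) {
                break; // stream ended before the limit was reached
            }
            filled += n;
        }
        return Arrays.copyOf(buffer, filled); // trim to what was actually read
    }

    /** Hypothetical test stream that hands out at most 100 bytes per read. */
    static InputStream trickle(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, 100));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] first = readPrefix(trickle(new byte[5000]), 1024);
        System.out.println(first.length);
    }
}
```

Returning a trimmed array also avoids writing padding bytes when the file is shorter than the requested prefix.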
I cannot understand why it doesn't work. I found this link where they used the Apache library to read 4096 bytes each time. I read the first 1024 bytes and it eventually works; the only problem is that if completePendingCommand() is used, the program hangs forever. So I've removed it, and everything works fine.
