Reading from a ZipInputStream into a ByteArrayOutputStream

I am trying to read a single file from a java.util.zip.ZipInputStream, and copy it into a java.io.ByteArrayOutputStream (so that I can then create a java.io.ByteArrayInputStream and hand that to a 3rd party library that will end up closing the stream, and I don't want my ZipInputStream getting closed).
I'm probably missing something basic here, but I never enter the while loop here:
ByteArrayOutputStream streamBuilder = new ByteArrayOutputStream();
int bytesRead;
byte[] tempBuffer = new byte[8192*2];
try {
while ((bytesRead = zipStream.read(tempBuffer)) != -1) {
streamBuilder.write(tempBuffer, 0, bytesRead);
}
} catch (IOException e) {
// ...
}
What am I missing that will allow me to copy the stream?
Edit:
I should have mentioned earlier that this ZipInputStream is not coming from a file, so I don't think I can use a ZipFile. It is coming from a file uploaded through a servlet.
Also, I have already called getNextEntry() on the ZipInputStream before getting to this snippet of code. If I don't try copying the file into another InputStream (via the OutputStream mentioned above), and just pass the ZipInputStream to my 3rd party library, the library closes the stream, and I can't do anything more, like dealing with the remaining files in the stream.

Your loop looks valid. What does the following call (just on its own) return?
zipStream.read(tempBuffer)
If it returns -1, then the zipStream is already exhausted or is not positioned on an entry before you get it, and all bets are off. It's time to use your debugger and make sure that what's being passed to you is actually valid.
When you call getNextEntry(), does it return a value, and is the data in the entry meaningful (i.e. does getCompressedSize() return a valid value)? If you are reading a Zip file that doesn't have read-ahead zip entries embedded, then ZipInputStream isn't going to work for you.
Some useful tidbits about the Zip format:
Each file embedded in a zip file has a header. This header can contain useful information (such as the compressed length of the stream, its offset in the file, and its CRC), or it can contain some magic values that basically say 'The information isn't in the stream header, you have to check the Zip post-amble'.
Each zip file then has a table that is attached to the end of the file that contains all of the zip entries, along with the real data. The table at the end is mandatory, and the values in it must be correct. In contrast, the values embedded in the stream do not have to be provided.
If you use ZipFile, it reads the table at the end of the zip. If you use ZipInputStream, I suspect that getNextEntry() attempts to use the entries embedded in the stream. If those values aren't specified, then ZipInputStream has no idea how long the stream might be. The inflate algorithm is self terminating (you actually don't need to know the uncompressed length of the output stream in order to fully recover the output), but it's possible that the Java version of this reader doesn't handle this situation very well.
I will say that it's fairly unusual to have a servlet returning a ZipInputStream (it's much more common to receive an InflaterInputStream if you are going to be receiving compressed content).

You probably tried reading from a FileInputStream like this:
ZipInputStream in = new ZipInputStream(new FileInputStream(...));
This won’t work since a zip archive can contain multiple files and you need to specify which file to read.
You could use java.util.zip.ZipFile and a library such as IOUtils from Apache Commons IO or ByteStreams from Guava that assist you in copying the stream.
Example:
ByteArrayOutputStream out = new ByteArrayOutputStream();
try (ZipFile zipFile = new ZipFile("foo.zip")) {
ZipEntry zipEntry = zipFile.getEntry("fileInTheZip.txt");
try (InputStream in = zipFile.getInputStream(zipEntry)) {
IOUtils.copy(in, out);
}
}

I'd use IOUtils from the commons io project.
IOUtils.copy(zipStream, byteArrayOutputStream);

You're missing a call to
ZipEntry entry = (ZipEntry) zipStream.getNextEntry();
which positions the stream at the first decompressed byte of the first entry.
ByteArrayOutputStream streamBuilder = new ByteArrayOutputStream();
int bytesRead;
byte[] tempBuffer = new byte[8192*2];
ZipEntry entry = (ZipEntry) zipStream.getNextEntry();
try {
while ( (bytesRead = zipStream.read(tempBuffer)) != -1 ){
streamBuilder.write(tempBuffer, 0, bytesRead);
}
} catch (IOException e) {
...
}

You could implement your own wrapper around the ZipInputStream that ignores close() and hand that off to the third-party library.
thirdPartyLib.handleZipData(new CloseIgnoringInputStream(zipStream));
class CloseIgnoringInputStream extends InputStream
{
private ZipInputStream stream;
public CloseIgnoringInputStream(ZipInputStream inStream)
{
stream = inStream;
}
public int read() throws IOException {
return stream.read();
}
public void close()
{
//ignore
}
public void reallyClose() throws IOException
{
stream.close();
}
}
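A variation on the wrapper above, offered as a sketch rather than a drop-in replacement: extending java.io.FilterInputStream means every read method is forwarded to the wrapped stream automatically, so only close() has to be overridden. The class name NonClosingInputStream and the reallyClose() helper are illustrative names, not part of the JDK.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class; the name is an assumption made for this example.
class NonClosingInputStream extends FilterInputStream {

    NonClosingInputStream(InputStream in) {
        super(in);
    }

    // Swallow close() so the wrapped stream stays open for its real owner.
    @Override
    public void close() {
        // intentionally empty
    }

    // The owner of the wrapped stream calls this once it is truly finished.
    void reallyClose() throws IOException {
        in.close();
    }
}
```

With a ZipInputStream as the wrapped stream, the library sees end-of-stream at the end of the current entry, and the caller can still call getNextEntry() afterwards.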

I would call getNextEntry() on the ZipInputStream until it is at the entry you want (use ZipEntry.getName() etc.). Calling getNextEntry() advances the "cursor" to the beginning of the entry that it returns. Then use ZipEntry.getSize() to determine how many bytes you should read using zipInputStream.read(). Note that getSize() returns -1 when the size is not known, in which case you read until read() returns -1.
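The approach described above can be sketched as follows. This is a minimal example, not a definitive implementation: the class name ZipEntryCopier and the method readEntry are made up for illustration, and the loop reads until read() returns -1 instead of relying on getSize(), since the size may be unknown when streaming.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper; the names are assumptions for this example.
class ZipEntryCopier {

    // Returns the decompressed bytes of the named entry, or null if no
    // remaining entry in the stream has that name.
    static byte[] readEntry(ZipInputStream zipStream, String entryName) throws IOException {
        ZipEntry entry;
        while ((entry = zipStream.getNextEntry()) != null) {
            if (!entry.getName().equals(entryName)) {
                continue; // getNextEntry() skips the rest of this entry
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int bytesRead;
            // read() returns -1 at the end of the current entry, not of the archive
            while ((bytesRead = zipStream.read(buffer)) != -1) {
                out.write(buffer, 0, bytesRead);
            }
            return out.toByteArray();
        }
        return null;
    }
}
```

The returned array can be wrapped in a ByteArrayInputStream and handed to the third-party library, which may close it without affecting the ZipInputStream.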


It is unclear how you got the zipStream. It should work when you get it like this:
zipStream = zipFile.getInputStream(zipEntry)
If you are obtaining the stream from a ZipFile, you can get one stream for the third-party library, let the library use it, and obtain another input stream for yourself with the same call.
Remember, an input stream is a cursor. If you have the entire data (as with a ZipFile), you can ask for any number of cursors over it.
It is a different case if you only have a zipped byte stream, such as a "GZip" input stream. In that case your ByteArrayOutputStream buffer makes perfect sense.

Please try the code below:
private static final int BUFFER = 8192; // assumed size; the original constant is not shown
private static byte[] getZipArchiveContent(File zipName) throws WorkflowServiceBusinessException {
// try-with-resources closes both streams even if reading fails
try (InputStream in = new BufferedInputStream(new FileInputStream(zipName));
ByteArrayOutputStream byteOut = new ByteArrayOutputStream()) {
byte[] data = new byte[BUFFER];
int count;
// write only the bytes that read() actually delivered
while ((count = in.read(data, 0, BUFFER)) != -1) {
byteOut.write(data, 0, count);
}
return byteOut.toByteArray();
} catch (IOException e) {
throw new WorkflowServiceBusinessException(e.getMessage(), e);
}
}

Check whether the input stream is positioned at the beginning.
Otherwise, as for the implementation: I do not think you need to write to the result stream while you are reading, unless you process this exact stream in another thread.
Just create a byte array, read the input stream into it, then create the output stream.

Related

Why is my binary data bigger after getting it from the webserver?

I need to serve a binary file through a web service implemented in Python/Django. The problem is, that when I compare the original file with the transferred file with vbindiff I see trailing bytes on the transferred file, sadly rendering it useless.
The binary file is fetched and saved by a Java client with:
HttpURLConnection userdataConnection = null;
URL userdataUrl = null;
try {
userdataUrl = new URL("http://localhost:8000/app/vuforia/10");
userdataConnection = (HttpURLConnection) userdataUrl.openConnection();
userdataConnection.setRequestMethod("GET");
userdataConnection.setRequestProperty("Content-Type", "application/octet-stream");
userdataConnection.connect();
InputStream userdataStream = new BufferedInputStream(userdataConnection.getInputStream());
try (ByteArrayOutputStream fileStream = new ByteArrayOutputStream()) {
byte[] buffer = new byte[4094];
while (userdataStream.read(buffer) != -1) {
fileStream.write(buffer);
}
byte[] fileBytes = fileStream.toByteArray();
try (FileOutputStream fos = new FileOutputStream("./test.dat")) {
fos.write(fileBytes);
}
}
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
I think that HttpURLConnection.getInputStream only reads the body of the response, or not?
This code serves the data in the backend
in views.py:
if request.method == "GET":
all_data = VuforiaDatabase.objects.all()
data = all_data.get(id=version)
return FileResponse(data.get_dat_bytes())
in models.py:
def get_dat_bytes(self):
return self.dat_upload.open()
How do I go about transferring the binary data 1:1?

You’re ignoring the return value of InputStream.read.
From the documentation:
Returns:
the total number of bytes read into the buffer, or -1 if there is no more data because the end of the stream has been reached.
Your code is assuming that the buffer is filled with every call to userdataStream.read(buffer), instead of checking how many bytes were actually read into buffer.
You don’t need to write the read loop yourself at all. Just use Files.copy:
Path file = Paths.get("./test.dat");
try (InputStream userdataStream = new BufferedInputStream(userdataConnection.getInputStream())) {
Files.copy(userdataStream, file, StandardCopyOption.REPLACE_EXISTING);
}

You always write a multiple of 4094 bytes, no matter how many bytes you actually read.
Don't call .write(buffer); write only the amount you actually read, which is what userdataStream.read returns. It can return a number smaller than the buffer size but still positive.
If your project already uses Apache Commons IO, you can just use FileUtils.copyInputStreamToFile.
Note: 4K = 4096, not 4094, and it's a very small buffer unless you are running on something like a smartcard. On a PC, use something like a few hundred KB at least.
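To make the difference concrete, here is a small sketch that runs both loop styles over the same 10-byte input with a 4-byte buffer. The class and method names are invented for this example; the input is an in-memory ByteArrayInputStream rather than an HTTP connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; names are assumptions for this example.
class CopyLoopComparison {

    // Buggy: writes the whole buffer regardless of how much read() delivered.
    static byte[] copyIgnoringCount(InputStream in, int bufferSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        while (in.read(buffer) != -1) {
            out.write(buffer);
        }
        return out.toByteArray();
    }

    // Correct: writes only the bytes that the last read() actually filled.
    static byte[] copyUsingCount(InputStream in, int bufferSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[10];
        // Reads deliver 4, 4 and 2 bytes; the buggy loop writes 4 each time.
        System.out.println(copyIgnoringCount(new ByteArrayInputStream(source), 4).length); // 12
        System.out.println(copyUsingCount(new ByteArrayInputStream(source), 4).length);    // 10
    }
}
```

The two extra bytes in the first result are leftovers from the previous read, which is the same kind of trailing data seen in the transferred file.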

How to detect if BufferedInputStream was over while filling an array in Java?

FileInputStream fin = new FileInputStream(path);
BufferedInputStream bin = new BufferedInputStream(fin);
byte[] inputByte1= new byte[500];
byte[] inputByte2= new byte[500];
byte[] inputByte3 =new byte[34];
bin.read(inputByte1);
bin.read(inputByte2);
bin.read(inputByte3);
Let's say the file has only 400 bytes. How can I detect that?
I know that I could check if (bin.read(inputByte1) != 500),
but that looks really ugly to write on each line.
My main question is:
how can I detect that the stream ended before an array was filled?
I do not want to call bin.read() for each byte and check bin.read() != -1.

First, on a Windows based system you need to escape the \ when you use it as a path separator. Next, you could use a FileInputStream (which you could wrap with a BufferedInputStream). Finally, you should close the InputStream when you're done (or you risk leaking file handles, sockets or some other resource). You might use a try-with-resources statement. Putting it all together, it might look something like
File f = new File("c:\\test\\test.txt");
try (InputStream is = new BufferedInputStream(new FileInputStream(f))) {
int val;
while ((val = is.read()) != -1) {
System.out.println((byte) val);
}
} catch (IOException ioe) {
ioe.printStackTrace();
}

I would recommend a DataInputStream over a BufferedInputStream over a FileInputStream. That way you can use the readFully() method to read exactly as many bytes as you need each time without having to loop.
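A minimal sketch of that suggestion, assuming it is acceptable to treat a short file as a failed fill: readFully either fills the whole array or throws EOFException. The class name ReadFullyDemo and the method tryFill are illustrative, not an existing API.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical helper; names are assumptions for this example.
class ReadFullyDemo {

    // Returns true if the array was filled completely, false if the
    // stream ended before target.length bytes could be read.
    static boolean tryFill(DataInputStream in, byte[] target) throws IOException {
        try {
            in.readFully(target);
            return true;
        } catch (EOFException e) {
            // The array may be partially filled at this point.
            return false;
        }
    }
}
```

Usage would look like tryFill(new DataInputStream(new BufferedInputStream(new FileInputStream(path))), inputByte1), repeated for each array, stopping at the first false.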
c:\test\test.txt
Use forward slashes in Java:
c:/test/test.txt

Extract tar.gz file in memory in Java

I'm using the Apache Compress library to read a .tar.gz file, something like this:
final TarArchiveInputStream tarIn = initializeTarArchiveStream(this.archiveFile);
try {
TarArchiveEntry tarEntry = tarIn.getNextTarEntry();
while (tarEntry != null) {
byte[] btoRead = new byte[1024];
BufferedOutputStream bout = new BufferedOutputStream(new FileOutputStream(destPath)); //<- I don't want this!
int len = 0;
while ((len = tarIn.read(btoRead)) != -1) {
bout.write(btoRead, 0, len);
}
bout.close();
tarEntry = tarIn.getNextTarEntry();
}
tarIn.close();
}
catch (IOException e) {
e.printStackTrace();
}
Is it possible not to extract this into a separate file, and read it in memory somehow? Maybe into a giant String or something?

You could replace the file stream with a ByteArrayOutputStream.
i.e. replace this:
BufferedOutputStream bout = new BufferedOutputStream(new FileOutputStream(destPath)); //<- I don't want this!
with this:
ByteArrayOutputStream bout = new ByteArrayOutputStream();
and then after closing bout, use bout.toByteArray() to get the bytes.

Is it possible not to extract this into a separate file, and read it in memory somehow? Maybe into a giant String or something?
Yes, sure.
Just replace the code in the inner loop that opens files and writes to them with code that writes to a ByteArrayOutputStream ... or a series of such streams.
The natural representation of the data that you read from the TAR (like that) will be bytes / byte arrays. If the bytes are properly encoded characters, and you know the correct encoding, then you can convert them to strings. Otherwise, it is better to leave the data as bytes. (If you attempt to convert non-text data to strings, or if you convert using the wrong charset/encoding you are liable to mangle it ... irreversibly.)
Obviously, you are going to need to think through some of these issues yourself, but basic idea should work ... provided you have enough heap space.
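The warning about mangling non-text data can be illustrated with a short round-trip check. This is only a sketch using UTF-8 as the example charset; the class and method names are invented for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; names are assumptions for this example.
class BytesVersusStrings {

    // Decodes the bytes as UTF-8 text, encodes the text again, and reports
    // whether the original bytes came back unchanged.
    static boolean survivesUtf8RoundTrip(byte[] data) {
        String asText = new String(data, StandardCharsets.UTF_8);
        byte[] back = asText.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(data, back);
    }
}
```

Properly encoded text survives the round trip, while arbitrary binary content such as the bytes 0xFF 0xFE does not, because invalid sequences are replaced during decoding. That is why entries of unknown type are better kept as byte arrays.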

Copy the value of btoRead to a String, like
String s = String.valueOf(byteVar);
and go on appending the byte values to the string until the end of the file is reached.

About downloading a file with the Java Drive API

I use the "get" method from the Java Drive API, and I can get the InputStream, but I can't open the file when I use the InputStream to create it. It looks like the file is broken.
private static String fileurl = "C:\\googletest\\drive\\";
public static void newFile(String filetitle, InputStream stream) throws IOException {
String filepath = fileurl + filetitle;
BufferedInputStream bufferedInputStream=new BufferedInputStream(stream);
byte[] buffer = new byte[bufferedInputStream.available()];
File file = new File(filepath);
if (!file.exists()) {
file.getParentFile().mkdirs();
BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(new FileOutputStream(filepath));
while( bufferedInputStream.read(buffer) != -1) {
bufferedOutputStream.write(buffer);
}
bufferedOutputStream.flush();
bufferedOutputStream.close();
}
}

Firstly, C:\googletest\drive\ is not a URL. It is a file system pathname.
Next, the following probably does not do what you think it does:
byte[] buffer = new byte[bufferedInputStream.available()];
The problem is that the available() call can return zero ... for a non-empty stream. The value returned by available() is an estimate of how many bytes are currently available to read ... right now. That is not necessarily the stream length ... or anything related to it. And indeed the device drivers for some devices consistently return zero, even when there is data to be read.
Finally, this is wrong:
while( bufferedInputStream.read(buffer) != -1) {
bufferedOutputStream.write(buffer);
You are assuming that whenever read does not return -1, it has filled the buffer. That is not so. Any one of the read calls could return with a partly full buffer. But then you write the entire buffer contents to the output stream ... including "junk" from previous reads.
Either or both of the 2nd and 3rd problems could lead to file corruption. In fact, the third one is likely to.
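One way to avoid both problems is to let the JDK do the copying. The sketch below assumes Java 9 or later, where InputStream.transferTo is available; the class name DownloadSaver and the method save are hypothetical. Unlike the code in the question, it overwrites an existing file instead of skipping it.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; names are assumptions for this example.
class DownloadSaver {

    // Copies everything from the stream into the target file and returns
    // the number of bytes written. No available() call, no manual buffer.
    static long save(InputStream stream, Path target) throws IOException {
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(target))) {
            return stream.transferTo(out);
        }
    }
}
```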

How to create ByteArrayInputStream from a file in Java?

I have a file that can be anything: ZIP, RAR, txt, CSV, doc, etc. I would like to create a ByteArrayInputStream from it.
I'm using it to upload a file to FTP through FTPClient from Apache Commons Net.
Does anybody know how to do it?
For example:
String data = "hdfhdfhdfhd";
ByteArrayInputStream in = new ByteArrayInputStream(data.getBytes());
My code:
public static ByteArrayInputStream retrieveByteArrayInputStream(File file) {
ByteArrayInputStream in;
return in;
}

Use the FileUtils#readFileToByteArray(File) from Apache Commons IO, and then create the ByteArrayInputStream using the ByteArrayInputStream(byte[]) constructor.
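public static ByteArrayInputStream retrieveByteArrayInputStream(File file) throws IOException {
// readFileToByteArray declares IOException, so the method must propagate or handle it
return new ByteArrayInputStream(FileUtils.readFileToByteArray(file));
}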

The general idea is that a File would yield a FileInputStream and a byte[] a ByteArrayInputStream. Both implement InputStream so they should be compatible with any method that uses InputStream as a parameter.
Putting all of the file contents in a ByteArrayInputStream can be done of course:
read the full file into a byte[]; Java 7 and later provide a convenience method, Files.readAllBytes, that reads all data from a file;
create a ByteArrayInputStream around the file content, which is now in memory.
Note that this may not be the optimal solution for very large files, since the whole file will be held in memory at once. Using the right stream for the job is important.
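The two steps above can be sketched with the standard library alone. The class name FileToByteArrayStream and the method open are made up for this example; Files.readAllBytes and the ByteArrayInputStream(byte[]) constructor are the only JDK calls involved.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical helper; names are assumptions for this example.
class FileToByteArrayStream {

    // Loads the whole file into memory and exposes it as an in-memory stream.
    static ByteArrayInputStream open(File file) throws IOException {
        byte[] content = Files.readAllBytes(file.toPath());
        return new ByteArrayInputStream(content);
    }
}
```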

A ByteArrayInputStream is an InputStream wrapper around a byte array. This means you'll have to fully read the file into a byte[], and then use one of the ByteArrayInputStream constructors.
Can you give any more details of what you are doing with the ByteArrayInputStream? It's likely there are better ways to achieve what you are trying to do.
Edit:
If you are using Apache FTPClient to upload, you just need an InputStream. You can do this:
String remote = "whatever";
InputStream is = new FileInputStream(new File("your file"));
ftpClient.storeFile(remote, is);
You should of course remember to close the input stream once you have finished with it.

This isn't exactly what you are asking, but is a fast way of reading files in bytes.
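File file = new File(yourFileName);
byte[] b = new byte[(int) file.length()];
// Read-only mode is enough here; try-with-resources closes the file
try (RandomAccessFile ra = new RandomAccessFile(file, "r")) {
// readFully keeps reading until b is full; a single read() may stop early
ra.readFully(b);
} catch (IOException e) {
e.printStackTrace();
}
//Then iterate through b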

This piece of code comes in handy:
private static byte[] readContentIntoByteArray(File file)
{
FileInputStream fileInputStream = null;
byte[] bFile = new byte[(int) file.length()];
try
{
//convert file into array of bytes
fileInputStream = new FileInputStream(file);
fileInputStream.read(bFile);
fileInputStream.close();
}
catch (Exception e)
{
e.printStackTrace();
}
return bFile;
}
Reference: http://howtodoinjava.com/2014/11/04/how-to-read-file-content-into-byte-array-in-java/
