I have a zip file that is AES encrypted. After decrypting I'm left with a byte[] containing the zip content. But when I try to unzip it, using a ByteArrayInputStream, ZipInputStream.getNextEntry() returns null right away. Debugging, I see that my byte[] doesn't have a required local file header signature which is
static long LOCSIG = 0x04034b50L; // "PK\003\004"
so ZipInputStream.getNextEntry() returns null.
If, however, I write those decrypted bytes out to a file and then use a FileInputStream() that I pass to ZipInputStream(), everything works as expected. Below is my current code. Can anyone suggest a way to unzip without first writing out to a temporary file?
byte[] data = AESUtil.decryptInputStream(...);
ByteArrayInputStream bis = new ByteArrayInputStream(data);
ZipInputStream stream = new ZipInputStream(bis);
ZipEntry entry;
while ((entry = stream.getNextEntry()) != null) {
...
}
I've come to the conclusion that ZipInputStream just isn't flexible enough. When I substituted the above code with Apache Commons' compress classes, everything works. Below is a working implementation using that library and the same byte array:
byte[] data = AESUtil.decryptInputStream(...);
SeekableInMemoryByteChannel inMemoryByteChannel = new SeekableInMemoryByteChannel(data);
ZipFile zipFile = new ZipFile(inMemoryByteChannel);
Iterator<ZipArchiveEntry> iterator = = zipFile.getEntriesInPhysicalOrder().asIterator();
while (iterator.hasNext()) {
...
}
Related
Requirement: compress a byte[] to get another byte[] using java.util.zip.ZipOutputStream BUT without using any files on disk or in-memory(like here https://stackoverflow.com/a/18406927/9132186). Is this even possible?
All the examples I found online read from a file(.txt) and write to a file(.zip). ZipOutputStream needs a ZipEntry to work with and that ZipEntry needs a file.
However, my use case is as follows: I need to compress a chunk (say 10MB) of a file at a time using a zip format and append all these compressed chunks to make a .zip file. But, when I unzip the .zip file then it is corrupted.
I am using in-memory files as suggested in https://stackoverflow.com/a/18406927/9132186 to avoid files on disk but need a solution without these files also.
public void testZipBytes() {
String infile = "test.txt";
FileInputStream in = new FileInputStream(infile);
String outfile = "test.txt.zip";
FileOutputStream out = new FileOutputStream(outfile);
byte[] buf = new byte[10];
int len;
while ((len = in.read(buf)) > 0) {
out.write(zipBytes(buf));
}
in.close();
out.close();
}
// ACTUAL function that compresses byte[]
public static class MemoryFile {
public String fileName;
public byte[] contents;
}
public byte[] zipBytesMemoryFileWORKS(byte[] input) {
MemoryFile memoryFile = new MemoryFile();
memoryFile.fileName = "try.txt";
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ZipOutputStream zos = new ZipOutputStream(baos);
ZipEntry entry = new ZipEntry(memoryFile.fileName);
entry.setSize(input.length);
zos.putNextEntry(entry);
zos.write(input);
zos.finish();
zos.closeEntry();
zos.close();
return baos.toByteArray();
}
Scenario 1:
if test.txt has small amount of data (less than 10 bytes) like "this" then unzip test.txt.zip yeilds try.txt with "this" in it.
Scenario 2:
if test.txt has larger amount of data (more than 10 bytes) like "this is a test for zip output stream and it is not working" then unzip test.txt.zip yields try.txt with broken pieces of data and is incomplete.
this 10 bytes is the buffer size in testZipBytes and is the amount of data that is compressed at a time by zipBytes
Expected (or rather desired):
1. unzip test.txt.zip does not use the "try.txt" filename i gave in the MemoryFile but rather unzips to filename test.txt itself.
2. unzipped data is not broken and yields the input data as is.
3. I have done the same with GzipOutputStream and it works perfectly fine.
Requirement: compress a byte[] to get another byte[] using java.util.zip.ZipOutputStream BUT without using any files on disk or in-memory(like here https://stackoverflow.com/a/18406927/9132186). Is this even possible?
Yes, you've already done it. You don't actually need MemoryFile in your example; just delete it from your implementation and write ZipEntry entry = new ZipEntry("try.txt") instead.
But you can't concatenate the zips of 10MB chunks of file and get a valid zip file for the combined file. Zipping doesn't work like that. You could have a solution which minimizes how much is in memory at once, perhaps. But breaking the original file up into chunks seems unworkable.
I'm trying to get a zip file from the server.
Im using HttpURLConnection to get InputStream and this is what i have:
myInputStream.toString().getBytes().toString() is equal to [B#4.....
byte[] bytes = Base64.decode(myInputStream.toString(), Base64.DEFAULT);
String string = new String(bytes, "UTF-8");
string == �&ܢ��z�m����y....
I realy tried to unzip this file but nothing works, also there is so many questions, should I use GZIPInputStream or ZipInputStream? Do I have to save this stream as file, or I can work on InputStream
Please help, my boss is getting impatient:O
I have no idea what is in this file i have to find out:)
GZipInputStream and ZipInputStream are two different formats. https://en.wikipedia.org/wiki/Gzip
It is not a good idea to retrieve a string directly from the stream.From an InputStream, you can create a File and write data into it using a FileOutputStream.
Decoding in Base 64 is something else. If your stream has already decoded the format upstream, it's OK; otherwise you have to encapsulate your stream with another input stream that decodes the Base64 format.
The best practice is to use a buffer to avoid memory overflow.
Here is some Kotlin code that decompresses the InputStream zipped into a file. (simpler than java because the management of byte [] is tedious) :
val fileBinaryDecompress = File(..path..)
val outputStream = FileOutputStream(fileBinaryDecompress)
readFromStream(ZipInputStream(myInputStream), BUFFER_SIZE_BYTES,
object : ReadBytes {
override fun read(buffer: ByteArray) {
outputStream.write(buffer)
}
})
outputStream.close()
interface ReadBytes {
/**
* Called after each buffer fill
* #param buffer filled
*/
#Throws(IOException::class)
fun read(buffer: ByteArray)
}
#Throws(IOException::class)
fun readFromStream(inputStream: InputStream, bufferSize: Int, readBytes: ReadBytes) {
val buffer = ByteArray(bufferSize)
var read = 0
while (read != -1) {
read = inputStream.read(buffer, 0, buffer.size)
if (read != -1) {
val optimizedBuffer: ByteArray = if (buffer.size == read) {
buffer
} else {
buffer.copyOf(read)
}
readBytes.read(optimizedBuffer)
}
}
}
If you want to get the file from the server without decompressing it, remove the ZipInputStream() decorator.
Usually, there is no significant difference between GZIPInputStream or ZipInputStream, so if at all, both should work.
Next, you need to identify whether the zipped stream was Base64 encoded, or the some Base64 encoded contents was put into a zipped stream - from what you put to your question, it seems to be the latter option.
So you should try
ZipInputStream zis = new ZipInputStream( myInputStream );
ZipEntry ze = zis.getNextEntry();
InputStream is = zis.getInputStream( ze );
and proceed from there ...
basically by setting inputStream to be GZIPInputStream should be able to read the actual content.
Also for simplicity using IOUtils package from apache.commons makes your life easy
this works for me:
InputStream is ; //initialize you IS
is = new GZIPInputStream(is);
byte[] bytes = IOUtils.toByteArray(is);
String s = new String(bytes);
System.out.println(s);
I have file with names like ex.zip. In this example, the Zip file contains only one file with the same name(ie. `ex.txt'), which is quite large. I don't want to extract the zip file every time.Hence I need to read the content of the file(ex.txt) without extracting the zip file. I tried some code like below But i can only read the name of the file in the variable.
How do I read the content of the file and stores it in the variable?
Thank you in Advance
fis=new FileInputStream("C:/Documents and Settings/satheesh/Desktop/ex.zip");
ZipInputStream zis = new ZipInputStream(new BufferedInputStream(fis));
ZipEntry entry;
while((entry = zis.getNextEntry()) != null) {
i=i+1;
System.out.println(entry);
System.out.println(i);
//read from zis until available
}
Your idea is to read the zip file as it is into a byte array and store it in a variable.
Later when you need the zip you extract it on demand, saving memory:
First read the content of the Zip file in a byte array zipFileBytes
If you have Java 1.7:
Path path = Paths.get("path/to/file");
byte[] zipFileBytes= Files.readAllBytes(path);
otherwise use Appache.commons lib
byte[] zipFileBytes;
zipFileBytes = IOUtils.toByteArray(InputStream input);
Now your Zip file is stored in a variable zipFileBytes, still in compressed form.
Then when you need to extract something use
ByteArrayInputStream bis = new ByteArrayInputStream(zipFileBytes));
ZipInputStream zis = new ZipInputStream(bis);
Try this:
String zipFile = "ex.zip";
try (ZipFile zip = new ZipFile(zipFile)) {
int i = 0;
for (Enumeration<? extends ZipEntry> e = zip.entries(); e.hasMoreElements(); ) {
ZipEntry entry = (ZipEntry) e.nextElement();
System.out.println(entry);
System.out.println(i);
InputStream in = zip.getInputStream(entry);
}
}
For example, if the file contains text, and you want to print it as a String, you can read the InputStream like this: How do I read / convert an InputStream into a String in Java?
I think that in your case the fact that a zipfile is a container that can hold many files (and thus forces you to navigate to the right contained file each time you open it) seriously complicates things, as you state that each zipfile only contains one textfile. Maybe it's a lot easier to just gzip the text file (gzip is not a container, just a compressed version of your data). And it's very simple to use:
GZIPInputStream gis = new GZIPInputStream(new FileInputStream("file.txt.gz"));
// and a BufferedReader on top to comfortably read the file
BufferedReader in = new BufferedReader(new InputStreamReader(gis) );
Producing them is equally simple:
GZIPOutputStream gos = new GZIPOutputStream(new FileOutputStream("file.txt.gz"));
I get a java.lang.ArrayIndexOutOfBoundsException when using ByteArrayInputStream.
First, I use a ZipInputStream to read through a zip file,
and while looping through the zipEntries,
I use a ByteArrayInputStream to capture the data of each zipEntry
using the
ZipInputStream.read(byte[] b) and ByteArrayInputStream(byte[] b) methods.
At the end, I have a total of 6 different ByteArrayInputStream objects containing data from 6 different zipEntries.
I then use OpenCSV to read through each of the ByteArrayInputStream.
I have no problem reading 4 of the 6 ByteArrayInputStream objects, of which have byte sizes of less than 2000.
The other 2 ByteArrayInputStream objects have byte sizes of 2155 and 4010 respectively and the CSVreader was only able to read part of these 2 objects, then give an java.lang.ArrayIndexOutOfBoundsException.
This is the code I used to loop through the ZipInputStream
InputStream fileStream = attachment.getInputStream();
try {
ZipInputStream zippy = new ZipInputStream(fileStream);
ZipEntry entry = zippy.getNextEntry();
ByteArrayInputStream courseData = null;
while (entry!= null) {
String name = entry.getName();
long size = entry.getSize();
if (name.equals("course.csv")) {
courseData = copyInputStream(zippy, (int)size);
}
//similar IF statements for 5 other ByteArrayInputStream objects
entry = zippy.getNextEntry();
}
CourseDataManager.load(courseData);
}catch(Exception e){
e.printStackTrace();
}
The following is the code with which I use to copy the data from the ZipInputStream to the ByteArrayInputStream.
public ByteArrayInputStream copyInputStream(InputStream in, int size)
throws IOException {
byte[] buffer = new byte[size];
in.read(buffer);
ByteArrayInputStream b = new ByteArrayInputStream(buffer);
return b;
}
The 2 sets of openCSV codes are able to read a few lines of data, before throwing that exception, which leads me to believe that it is the byteArray that is causing the problem. Is there anything I can do or work around this problem?
I am trying to make an application that accepts a zip file, while not storing any temporary files in the web app, as I am deploying to both google app engine and tomcat server.
Fixed!!! Thanks to stephen C, i realized that read(byte[]) does not read everything so I adjusted the code to make the copyInputStream fully functional.
Since this looks like homework, here's a hint:
The read(byte[]) method returns the number bytes read.
On what line do you get the error? And have you checked the value of size? I suspect it's 0
I am trying to read a single file from a java.util.zip.ZipInputStream, and copy it into a java.io.ByteArrayOutputStream (so that I can then create a java.io.ByteArrayInputStream and hand that to a 3rd party library that will end up closing the stream, and I don't want my ZipInputStream getting closed).
I'm probably missing something basic here, but I never enter the while loop here:
ByteArrayOutputStream streamBuilder = new ByteArrayOutputStream();
int bytesRead;
byte[] tempBuffer = new byte[8192*2];
try {
while ((bytesRead = zipStream.read(tempBuffer)) != -1) {
streamBuilder.write(tempBuffer, 0, bytesRead);
}
} catch (IOException e) {
// ...
}
What am I missing that will allow me to copy the stream?
Edit:
I should have mentioned earlier that this ZipInputStream is not coming from a file, so I don't think I can use a ZipFile. It is coming from a file uploaded through a servlet.
Also, I have already called getNextEntry() on the ZipInputStream before getting to this snippet of code. If I don't try copying the file into another InputStream (via the OutputStream mentioned above), and just pass the ZipInputStream to my 3rd party library, the library closes the stream, and I can't do anything more, like dealing with the remaining files in the stream.
Your loop looks valid - what does the following code (just on it's own) return?
zipStream.read(tempBuffer)
if it's returning -1, then the zipStream is closed before you get it, and all bets are off. It's time to use your debugger and make sure what's being passed to you is actually valid.
When you call getNextEntry(), does it return a value, and is the data in the entry meaningful (i.e. does getCompressedSize() return a valid value)? IF you are just reading a Zip file that doesn't have read-ahead zip entries embedded, then ZipInputStream isn't going to work for you.
Some useful tidbits about the Zip format:
Each file embedded in a zip file has a header. This header can contain useful information (such as the compressed length of the stream, it's offset in the file, CRC) - or it can contain some magic values that basically say 'The information isn't in the stream header, you have to check the Zip post-amble'.
Each zip file then has a table that is attached to the end of the file that contains all of the zip entries, along with the real data. The table at the end is mandatory, and the values in it must be correct. In contrast, the values embedded in the stream do not have to be provided.
If you use ZipFile, it reads the table at the end of the zip. If you use ZipInputStream, I suspect that getNextEntry() attempts to use the entries embedded in the stream. If those values aren't specified, then ZipInputStream has no idea how long the stream might be. The inflate algorithm is self terminating (you actually don't need to know the uncompressed length of the output stream in order to fully recover the output), but it's possible that the Java version of this reader doesn't handle this situation very well.
I will say that it's fairly unusual to have a servlet returning a ZipInputStream (it's much more common to receive an inflatorInputStream if you are going to be receiving compressed content.
You probably tried reading from a FileInputStream like this:
ZipInputStream in = new ZipInputStream(new FileInputStream(...));
This won’t work since a zip archive can contain multiple files and you need to specify which file to read.
You could use java.util.zip.ZipFile and a library such as IOUtils from Apache Commons IO or ByteStreams from Guava that assist you in copying the stream.
Example:
ByteArrayOutputStream out = new ByteArrayOutputStream();
try (ZipFile zipFile = new ZipFile("foo.zip")) {
ZipEntry zipEntry = zipFile.getEntry("fileInTheZip.txt");
try (InputStream in = zipFile.getInputStream(zipEntry)) {
IOUtils.copy(in, out);
}
}
I'd use IOUtils from the commons io project.
IOUtils.copy(zipStream, byteArrayOutputStream);
You're missing call
ZipEntry entry = (ZipEntry) zipStream.getNextEntry();
to position the first byte decompressed of the first entry.
ByteArrayOutputStream streamBuilder = new ByteArrayOutputStream();
int bytesRead;
byte[] tempBuffer = new byte[8192*2];
ZipEntry entry = (ZipEntry) zipStream.getNextEntry();
try {
while ( (bytesRead = zipStream.read(tempBuffer)) != -1 ){
streamBuilder.write(tempBuffer, 0, bytesRead);
}
} catch (IOException e) {
...
}
You could implement your own wrapper around the ZipInputStream that ignores close() and hand that off to the third-party library.
thirdPartyLib.handleZipData(new CloseIgnoringInputStream(zipStream));
class CloseIgnoringInputStream extends InputStream
{
private ZipInputStream stream;
public CloseIgnoringInputStream(ZipInputStream inStream)
{
stream = inStream;
}
public int read() throws IOException {
return stream.read();
}
public void close()
{
//ignore
}
public void reallyClose() throws IOException
{
stream.close();
}
}
I would call getNextEntry() on the ZipInputStream until it is at the entry you want (use ZipEntry.getName() etc.). Calling getNextEntry() will advance the "cursor" to the beginning of the entry that it returns. Then, use ZipEntry.getSize() to determine how many bytes you should read using zipInputStream.read().
It is unclear how you got the zipStream. It should work when you get it like this:
zipStream = zipFile.getInputStream(zipEntry)
t is unclear how you got the zipStream. It should work when you get it like this:
zipStream = zipFile.getInputStream(zipEntry)
If you are obtaining the ZipInputStream from a ZipFile you can get one stream for the 3d party library, let it use it, and you obtain another input stream using the code before.
Remember, an inputstream is a cursor. If you have the entire data (like a ZipFile) you can ask for N cursors over it.
A diferent case is if you only have an "GZip" inputstream, only an zipped byte stream. In that case you ByteArrayOutputStream buffer makes all sense.
Please try code bellow
private static byte[] getZipArchiveContent(File zipName) throws WorkflowServiceBusinessException {
BufferedInputStream buffer = null;
FileInputStream fileStream = null;
ByteArrayOutputStream byteOut = null;
byte data[] = new byte[BUFFER];
try {
try {
fileStream = new FileInputStream(zipName);
buffer = new BufferedInputStream(fileStream);
byteOut = new ByteArrayOutputStream();
int count;
while((count = buffer.read(data, 0, BUFFER)) != -1) {
byteOut.write(data, 0, count);
}
} catch(Exception e) {
throw new WorkflowServiceBusinessException(e.getMessage(), e);
} finally {
if(null != fileStream) {
fileStream.close();
}
if(null != buffer) {
buffer.close();
}
if(null != byteOut) {
byteOut.close();
}
}
} catch(Exception e) {
throw new WorkflowServiceBusinessException(e.getMessage(), e);
}
return byteOut.toByteArray();
}
Check if the input stream is positioned in the begging.
Otherwise, as implementation: I do not think that you need to write to the result stream while you are reading, unless you process this exact stream in another thread.
Just create a byte array, read the input stream, then create the output stream.