How to download multiple files from URL as one zip file - java

I want to download multiple ZIP files as one ZIP file for a single request.
I have ZIP file paths like C, https://test12.zip etc. How can I download these files as one ZIP file? I have been searching for a while, but all I found were examples that zip multiple local files. This is what I tried for downloading one file; it does not work for multiple files.
URL url = new URL("https://test12.zip");
URLConnection connection = url.openConnection();
InputStream stream = connection.getInputStream();
BufferedOutputStream outs = new BufferedOutputStream(response.getOutputStream());
int len;
byte[] buf = new byte[1024];
while ((len = stream.read(buf)) > 0) {
    outs.write(buf, 0, len);
}
outs.close();
Any help would be much appreciated.

A ZIP file consists of two parts: first the compressed file entries (filename, attributes and data), and at the end of the file a central directory containing a list of all entries, again with filename and attributes.
Hence, you cannot directly combine or concatenate ZIP files. What you can do in Java is decompress the downloaded ZIP files on the fly (without storing them in the file system) and at the same time use the decompressed content to create a new combined ZIP file:
First create a ZipOutputStream for the ZIP file you want to create.
Then wrap the InputStream of each download in a ZipInputStream.
Iterate through all the entries in every ZipInputStream, and for each entry create a new identical entry in the ZipOutputStream and copy the content from the ZipInputStream to the ZipOutputStream.
For how to use ZipInputStream, see for example: https://stackoverflow.com/a/36648504/150978
Note that this process has to decompress and then re-compress the file content. Depending on the archive size, this can result in high utilization of one CPU core.
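The steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the class name ZipMerger and the method merge are made up for illustration, and prefixing each entry with the index of its source archive ("1/", "2/", ...) is an assumption added here to avoid the "duplicate entry" ZipException when two archives contain the same file name.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of any library.
final class ZipMerger {

    /** Copies every entry of every source ZIP stream into one combined ZIP written to target. */
    static void merge(List<InputStream> sources, OutputStream target) throws IOException {
        byte[] buf = new byte[8192];
        // Closing the ZipOutputStream writes the central directory and closes target.
        try (ZipOutputStream zout = new ZipOutputStream(target)) {
            int index = 0;
            for (InputStream source : sources) {
                index++;
                try (ZipInputStream zin = new ZipInputStream(source)) {
                    ZipEntry entry;
                    while ((entry = zin.getNextEntry()) != null) {
                        // Assumption: prefix with the archive index so names cannot clash.
                        zout.putNextEntry(new ZipEntry(index + "/" + entry.getName()));
                        int len;
                        while ((len = zin.read(buf)) != -1) {
                            zout.write(buf, 0, len); // decompressed here, re-compressed by zout
                        }
                        zout.closeEntry();
                    }
                }
            }
        }
    }
}
```

In the servlet from the question, the sources would be the streams obtained from url.openConnection().getInputStream() for each URL, and the target would be response.getOutputStream().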

Related

Write ZipEntry with given byte array in memory

I have a very confusing problem and hope that I can get some ideas here.
My problem is very simple, but I didn't find a solution yet.
I want to create a simple ZIP File with ZipEntry's in it. The ZipEntry's are created by a given byte array (saved in a Postgres-DB with Hibernate).
When I put this byte array into my ZipOutputStream.write(..) the ZIP File created is always corrupt. What am I doing wrong?
The ZIP file is transferred to an FTP server afterwards.
ByteArrayOutputStream bos = new ByteArrayOutputStream();
final ZipOutputStream zipOut = new ZipOutputStream(bos);
String filename = "test.zip";
for (final Attachment attachment : transportDoc.getAttachments()) {
    log.debug("Adding " + attachment.getFileName() + " to ZIP file /tmp/" + filename);
    ZipEntry ze = new ZipEntry(attachment.getFileName());
    zipOut.putNextEntry(ze);
    zipOut.write(attachment.getFileContent());
    zipOut.flush();
    zipOut.closeEntry();
}
zipOut.close();
org.apache.commons.io.FileUtils.writeByteArrayToFile(new File("/tmp/"+filename), bos.toByteArray());
I am confused, because when I replaced
zipOut.write(attachment.getFileContent()); //This is the byte array from db
with
zipOut.write("Bla bla".getBytes());
it worked!
But the byte array from the DB can't be corrupt, because it can be written to a file with
org.apache.commons.io.FileUtils.writeByteArrayToFile(new File("/tmp/test.png"), attachment.getFileContent());
with no problem. It is a correct file.
I hope you have some ideas left.
Thanks in advance.
EDIT:
I tried to repair the ZIP file offline and then this message appears:
zip warning: no end of stream entry found: cglhnngplpmhipfg.png
(This png file is the byte-Array-File)
Simple unzip-command output the following:
unzip created.zip
Archive: created.zip
error [created.zip]: missing 2 bytes in zipfile
(attempting to process anyway)
error [created.zip]: attempt to seek before beginning of zipfile
(please check that you have transferred or created the zipfile in the
appropriate BINARY mode and that you have compiled UnZip properly)
(attempting to re-compensate)
replace cglhnngplpmhipfg.png? [y]es, [n]o, [A]ll, [N]one, [r]ename: y
inflating: cglhnngplpmhipfg.png
error: invalid compressed data to inflate
file #2: bad zipfile offset (local header sig): 24709
(attempting to re-compensate)
inflating: created.xml
EDIT 2:
When I write this file to the filesystem and add it to the ZIP through an InputStream, it doesn't work either! But the file on the filesystem is OK; I can open the image with no problem. It's very confusing.
File tmpAttachment = new File("/tmp/"+filename+attachment.getFileName());
FileUtils.writeByteArrayToFile(tmpAttachment, attachment.getFileContent());
FileInputStream inTmp = new FileInputStream(tmpAttachment);
int len;
byte[] buffer = new byte[1024];
while ((len = inTmp.read(buffer)) > 0) {
    zipOut.write(buffer, 0, len);
}
inTmp.close();
EDIT 3:
This problem only appears when I try to add "complex" files like png or pdf. If I put a txt-file in it, it works.
The problem was NOT in the Zip-Library itself.
It was the transmission to an external FTP server in the wrong mode (not binary).
Thanks all for your help.
Try closeEntry() before flush(). Also you can try to explicitly specify the size of the entry using ze.setSize(attachment.getFileContent().length).
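Since the archive itself turned out to be fine, a quick way to rule out the ZIP code in a case like this is to build the archive in memory and read it straight back before any transfer happens. The following is only a sketch of such a round-trip check; the class InMemoryZip and its two methods are hypothetical names, and the Map of file name to content stands in for the Attachment objects from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for a round-trip check, not part of any library.
final class InMemoryZip {

    /** Builds a ZIP archive in memory from name -> content pairs. */
    static byte[] zip(Map<String, byte[]> files) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bos)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                zout.putNextEntry(new ZipEntry(file.getKey()));
                zout.write(file.getValue());
                zout.closeEntry();
            }
        } // close() writes the central directory, so bos is complete only after this point
        return bos.toByteArray();
    }

    /** Reads the archive back into name -> content pairs. */
    static Map<String, byte[]> unzip(byte[] archive) throws IOException {
        Map<String, byte[]> result = new LinkedHashMap<>();
        byte[] buf = new byte[4096];
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int len;
                while ((len = zin.read(buf)) != -1) {
                    content.write(buf, 0, len);
                }
                result.put(entry.getName(), content.toByteArray());
            }
        }
        return result;
    }
}
```

If the round trip returns identical bytes, the corruption happens later in the pipeline. If the upload goes through Apache Commons Net (an assumption, the question does not say which FTP client is used), binary mode is selected with ftpClient.setFileType(FTP.BINARY_FILE_TYPE).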

Download zip file servlet with contentLength

I am trying to write a servlet to download files as a zip, which will read multiple nodes in a repository (CRX) to get the InputStream of multiple images.
I am using ZipOutputStream to download the zip file.
But as the length after zipping is not known, I can't set the content-length header in response, hence the browser is not able to show the remaining time to download.
Current code:
String[] paths = request.getRequestParameters("path");
ZipOutputStream out = new ZipOutputStream(response.getOutputStream());
for (int i = 0; i < paths.length; i++) {
    InputStream is = getStream(paths[i]);
    IOUtils.copy(is, out);
    IOUtils.closeQuietly(is);
    out.closeEntry();
}
Is there a way to generate the ZipFile and then write to the output stream?
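One possible approach, sketched here under assumptions, is to write the archive to a temporary file first, read its size, announce that size, and only then copy the file to the output stream. The class SizedZipWriter, the method write and the LongConsumer callback are hypothetical names chosen so the sketch does not depend on the servlet API; the Map of entry name to InputStream stands in for the repository nodes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.function.LongConsumer;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of any library.
final class SizedZipWriter {

    /** Zips the sources to a temp file, reports the archive size, then streams it to out. */
    static long write(Map<String, InputStream> sources, LongConsumer announceLength,
                      OutputStream out) throws IOException {
        Path tmp = Files.createTempFile("download", ".zip");
        try {
            byte[] buf = new byte[8192];
            try (ZipOutputStream zout = new ZipOutputStream(Files.newOutputStream(tmp))) {
                for (Map.Entry<String, InputStream> source : sources.entrySet()) {
                    zout.putNextEntry(new ZipEntry(source.getKey()));
                    try (InputStream in = source.getValue()) {
                        int len;
                        while ((len = in.read(buf)) != -1) {
                            zout.write(buf, 0, len);
                        }
                    }
                    zout.closeEntry();
                }
            }
            long size = Files.size(tmp);   // final compressed size, known only now
            announceLength.accept(size);   // must happen before any body bytes are sent
            Files.copy(tmp, out);
            return size;
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

In a servlet this could be called as SizedZipWriter.write(sources, response::setContentLengthLong, response.getOutputStream()), assuming Servlet 3.1 or later. The trade-off is that the download starts only after the whole archive has been built.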

Compare the size of zip created using java.util.zip to original folder size using java

I have just created a Zip file using java.util.zip. Now I would want to check if the ZIP created is correct and the uncompressed size of the ZIP file is equal to the actual folders that I have zipped.
I know of a method ZipFile.isValid() which returns true if all the headers in the ZIP are correct. Doesn't solve my problem though.
Thanks :)
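Checking if the ZIP file is created correctly:
if (myZipFile.isValid())
{
    // The file has been created successfully
}
Knowing the length of the directory and the file:
File dir = new File("path/to/the/directory/");
// File.length() returns a long; for a directory the value is unspecified,
// so sum the lengths of the contained files instead of relying on this
long size = dir.length();
File zip = new File("path/to/zipfile.zip");
long zipSize = zip.length(); // size of the compressed archive in bytes
Now you can compare them.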
Unzip the *.zip file, iterate over each ZipEntry element, and sum each ZipEntry.getSize(). Compare this to the sum of the file sizes you zipped. Alternatively (if you don't trust the ZipEntry headers for some reason) unzip each ZipEntry, counting the bytes but discarding them. You might do either of these things as, say, a quick check of your zip code, or maybe even in a unit test.
You can iterate over the ZipEntry's thusly:
ZipInputStream zipInStream = new ZipInputStream(new FileInputStream(zipFile));
ZipEntry zipEntry;
// getNextEntry() is the ZipInputStream method; getNextJarEntry() belongs to JarInputStream
while ((zipEntry = zipInStream.getNextEntry()) != null) {
    String entryName = zipEntry.getName();
    ....
}
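The comparison described above could look roughly like this. It is a sketch with hypothetical names (ZipSizeCheck, uncompressedSize, folderSize). It reads the sizes through ZipFile, which takes them from the central directory, rather than through ZipInputStream, where getSize() can return -1 until the entry data has been read.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of any library.
final class ZipSizeCheck {

    /** Sums the uncompressed sizes of all file entries recorded in the archive. */
    static long uncompressedSize(Path zip) throws IOException {
        long total = 0;
        try (ZipFile zipFile = new ZipFile(zip.toFile())) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory()) {
                    total += entry.getSize();
                }
            }
        }
        return total;
    }

    /** Sums the lengths of all regular files below dir (File.length() on dir itself is unspecified). */
    static long folderSize(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile)
                        .mapToLong(p -> p.toFile().length())
                        .sum();
        }
    }
}
```

Equal totals only show that the recorded sizes match; comparing checksums or file contents would be a stronger check.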

Is there a Java zip library that can fix files, à la zip -FF?

I occasionally receive .zip files in my app that throw "start of central directory not found; zipfile corrupt." exceptions. These zip files open just fine in my Mac's Finder.
I can fix these files every time from the command line, using zip -FF bad.zip --out good.zip
Can any Java ZIP libraries out there accomplish the same thing?
You probably want to just let Java execute this command, because strictly speaking ZIP is more like a container that can hold data compressed with different algorithms.
In general, investigating and solving problems with compressed archives programmatically is likely to be a long and tricky task.
Try this with your command.
I tried using ZipInputStream and ZipOutputStream. But ZipInputStream always failed at some point when doing: "getNextEntry()". Basically the following lines of code in "getNextEntry()":
...
if ((entry = readLOC()) == null) {
    return null;
}
...
returned null after some entries and I could not get further.
But finally I could solve the issue using ZipFile together with ZipOutputStream because ZipFile was reading all zip entries without problem and the solution looks like this:
protected void repairZipFile(String file) throws IOException {
    File repairZipFile = new File(file + ".repair");
    byte[] b = new byte[1024];
    // try-with-resources closes both archives even if copying fails
    try (ZipFile zipFile = new ZipFile(file);
         ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(repairZipFile))) {
        Enumeration<? extends ZipEntry> zipFileEntries = zipFile.entries();
        while (zipFileEntries.hasMoreElements()) {
            ZipEntry zipEntry = zipFileEntries.nextElement();
            // A fresh entry avoids a ZipException when the re-compressed size
            // differs from the compressed size stored in the original entry
            zos.putNextEntry(new ZipEntry(zipEntry.getName()));
            try (InputStream zis = zipFile.getInputStream(zipEntry)) {
                int n;
                while ((n = zis.read(b)) >= 0) {
                    zos.write(b, 0, n);
                }
            }
            zos.closeEntry();
        }
    }
    Files.move(repairZipFile.toPath(), new File(file).toPath(), StandardCopyOption.REPLACE_EXISTING);
}
There are two ways to open ZIP files in Java, using the ZipFile class, or using ZipInputStream.
As far as I remember, ZipFile reads the central directory of a zip file first - it can do this because it uses a RandomAccessFile underneath. However, ZipInputStream uses the in-line entry information, which might be better if the central directory, which I think exists at the end of the file, is missing or corrupt.
So, it might be possible to 'repair' a ZIP file in Java by reading a ZIP file using ZipInputStream, and writing it back out to another file using a ZipOutputStream, copying entry information between them. You might end up getting IO exceptions reading from the last entry of the ZipInputStream if it got truncated, but it might still save the other previous entries from the file.
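The ZipInputStream variant described above might be sketched like this. ZipSalvage and salvage are hypothetical names, and whether this recovers anything depends on how the file is damaged; the sketch only covers the case where the entries at the front are intact and the end of the file is missing or unreadable.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of any library.
final class ZipSalvage {

    /**
     * Copies entries from a possibly damaged ZIP stream into a new archive,
     * relying on the in-line local headers rather than the central directory.
     * Returns the number of entries that were copied completely.
     */
    static int salvage(InputStream damaged, OutputStream repaired) throws IOException {
        int copied = 0;
        byte[] buf = new byte[8192];
        try (ZipInputStream zin = new ZipInputStream(damaged);
             ZipOutputStream zout = new ZipOutputStream(repaired)) {
            try {
                ZipEntry entry;
                while ((entry = zin.getNextEntry()) != null) {
                    zout.putNextEntry(new ZipEntry(entry.getName()));
                    int len;
                    while ((len = zin.read(buf)) != -1) {
                        zout.write(buf, 0, len);
                    }
                    zout.closeEntry();
                    copied++;
                }
            } catch (IOException truncated) {
                // Reading failed part-way: keep what was copied so far. The entry
                // that was being read may be present in the output but incomplete.
            }
        } // closing zout writes a fresh central directory
        return copied;
    }
}
```

Swallowing the IOException is a deliberate simplification here; real code would probably log it and report which entry was cut off.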

The compressed (zipped) folder is invalid Java

I'm trying to zip files from server into a folder using ZipOutputStream.
After the archive is downloaded, it cannot be opened by double-clicking. The error "The compressed (zipped) folder is invalid" occurs. But if I open it from the context menu -> 7zip -> open file, it works normally. What can be the reason for the problem?
sourceFileName = "./file.txt";
sourceFile = new File(sourceFileName);
try {
    // set the content type and the filename
    response.setContentType("application/zip");
    response.addHeader("Content-Disposition", "attachment; filename=" + sourceFileName + ".zip");
    response.setContentLength((int) sourceFile.length());
    // get a ZipOutputStream, so we can zip our files together
    ZipOutputStream outZip = new ZipOutputStream(response.getOutputStream());
    // Add ZIP entry to output stream.
    outZip.putNextEntry(new ZipEntry(sourceFile.getName()));
    int length = 0;
    byte[] bbuf = new byte[(int) sourceFile.length()];
    DataInputStream in = new DataInputStream(new FileInputStream(sourceFile));
    while ((in != null) && ((length = in.read(bbuf)) != -1)) {
        outZip.write(bbuf, 0, length);
    }
    outZip.closeEntry();
    in.close();
    outZip.flush();
    outZip.close();
7Zip can open a wide variety of zip formats, and is relatively tolerant of oddities. Windows double-click requires a relatively specific format and is far less tolerant.
You need to look up the zip format and then look at your file (and "good" ones) with a hex editor (such as Hex Editor Neo), to see what may be wrong.
(One possibility is that you're using the wrong compression algorithm. And there are several other variations to consider as well, particularly whether or not you generate a "directory".)
It could be that a close is missing. It could be that the path encoding in the zip cannot be handled by Windows. It might be that Windows has difficulty with the directory structure, or that a path name contains a (back)slash. So it is detective work, trying different files. If you stream the zip directly to the HTTP response, then finish() has to be called instead of close().
After the code was posted:
The problem is that setContentLength is given the original file size. If it is set at all, it must be the compressed size.
DataInputStream is not needed, and here one should do a readFully.
response.setContentType("application/zip");
response.addHeader("Content-Disposition", "attachment; filename=file.zip");
//Path sourcePath = sourceFile.toPath();
Path sourcePath = Paths.get(sourceFileName);
ZipOutputStream outZip = new ZipOutputStream(response.getOutputStream(),
        StandardCharsets.UTF_8);
outZip.putNextEntry(new ZipEntry(sourcePath.getFileName().toString()));
Files.copy(sourcePath, outZip);
outZip.closeEntry();
Either finish or close the zip at the end.
outZip.finish();
//outZip.close();
I am not sure (about the best code style) whether to close the response output stream already oneself.
But when not closing finish() must be called, flush() will not suffice, as at the end data is written to the zip.
For file names with for instance Cyrillic letters, it would be best to add a Unicode charset like UTF-8. In fact let UTF-8 be the Esperanto standard world-wide.
A last note: if there is only one file, one could use GZIPOutputStream to deliver file.txt.gz, or check the browser's capabilities (the Accept-Encoding request header) and deliver file.txt with gzip content encoding.
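To make the point about finish() concrete, here is a small sketch that writes a single file as a ZIP to an arbitrary OutputStream and leaves that stream open. StreamedZip and write are hypothetical names; in the servlet, out would be response.getOutputStream(), and no content length would be set because the compressed size is not known in advance.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of any library.
final class StreamedZip {

    /** Writes source as the only entry of a ZIP archive to out, without closing out. */
    static void write(Path source, OutputStream out) throws IOException {
        // UTF-8 entry names, as suggested above for non-ASCII file names
        ZipOutputStream zip = new ZipOutputStream(out, StandardCharsets.UTF_8);
        zip.putNextEntry(new ZipEntry(source.getFileName().toString()));
        Files.copy(source, zip);
        zip.closeEntry();
        // finish() writes the central directory and the end record; flush() alone
        // would leave the archive without them. The underlying stream stays open.
        zip.finish();
    }
}
```

Leaving the stream open hands the decision about closing it back to the caller, which here would be the servlet container.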
