How to read the content of a zipped file without extracting it in Java

I have files with names like ex.zip. In this example, the zip file contains only one file with the same base name (i.e. `ex.txt`), which is quite large. I don't want to extract the zip file every time, so I need to read the content of ex.txt without extracting the archive. I tried the code below, but I can only read the name of the file into a variable.
How do I read the content of the file and store it in a variable?
Thank you in advance.
FileInputStream fis = new FileInputStream("C:/Documents and Settings/satheesh/Desktop/ex.zip");
ZipInputStream zis = new ZipInputStream(new BufferedInputStream(fis));
ZipEntry entry;
int i = 0; // counts the entries seen so far
while ((entry = zis.getNextEntry()) != null) {
    i = i + 1;
    System.out.println(entry);
    System.out.println(i);
    // read from zis until available
}

The idea is to read the zip file as it is into a byte array and keep it in a variable.
Later, when you need the content, you extract it on demand; the data stays compressed in the meantime, which saves memory.
First, read the content of the zip file into a byte array, zipFileBytes.
If you have Java 7 or later:
Path path = Paths.get("path/to/file");
byte[] zipFileBytes = Files.readAllBytes(path);
Otherwise, use the Apache Commons IO library:
byte[] zipFileBytes;
zipFileBytes = IOUtils.toByteArray(input); // input is an InputStream over the zip file
Now your zip file is stored in the variable zipFileBytes, still in compressed form.
Then, when you need to extract something, use:
ByteArrayInputStream bis = new ByteArrayInputStream(zipFileBytes);
ZipInputStream zis = new ZipInputStream(bis);
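The on-demand extraction step could look roughly like the sketch below. The class name InMemoryZip and the method readEntry are hypothetical names chosen for illustration, and InputStream.readAllBytes assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; not part of any library.
class InMemoryZip {
    /**
     * Decompresses one entry from a zip archive that is held in memory.
     * Returns null if the archive has no entry with the given name.
     */
    static byte[] readEntry(byte[] zipFileBytes, String entryName) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipFileBytes))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.getName().equals(entryName)) {
                    // readAllBytes (Java 9+) stops at the end of the current entry
                    return zis.readAllBytes();
                }
            }
        }
        return null;
    }
}
```

Only the requested entry is decompressed; the archive itself stays in the byte array in compressed form.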

Try this:
String zipFile = "ex.zip";
try (ZipFile zip = new ZipFile(zipFile)) {
    int i = 0;
    for (Enumeration<? extends ZipEntry> e = zip.entries(); e.hasMoreElements(); ) {
        ZipEntry entry = e.nextElement();
        System.out.println(entry);
        System.out.println(i++);
        // getInputStream returns the decompressed content of this entry
        try (InputStream in = zip.getInputStream(entry)) {
            // read the content from in here
        }
    }
}
For example, if the file contains text and you want it as a String, you can read the InputStream as described in "How do I read / convert an InputStream into a String in Java?"
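As a rough sketch of that last step, the following reads the first file entry of an archive into a String. ZipTextReader and readFirstEntry are hypothetical names, UTF-8 is an assumed encoding, and InputStream.readAllBytes assumes Java 9 or later. For a very large file you would probably process the stream line by line instead of holding it all in one String.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration; not part of any library.
class ZipTextReader {
    /** Returns the text of the first non-directory entry, or null if there is none. */
    static String readFirstEntry(String zipPath) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    // Assumes the entry is UTF-8 text
                    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
            }
        }
        return null;
    }
}
```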

I think that in your case the fact that a zip file is a container that can hold many files (and thus forces you to navigate to the right entry each time you open it) complicates things unnecessarily, since you state that each zip file contains only one text file. It may be a lot easier to just gzip the text file (gzip is not a container, just a compressed version of your data). It is very simple to use:
GZIPInputStream gis = new GZIPInputStream(new FileInputStream("file.txt.gz"));
// and a BufferedReader on top to comfortably read the file
BufferedReader in = new BufferedReader(new InputStreamReader(gis) );
Producing them is equally simple:
GZIPOutputStream gos = new GZIPOutputStream(new FileOutputStream("file.txt.gz"));
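Put together, a gzip round trip might look like the sketch below. GzipTextFile and its two methods are hypothetical names, and UTF-8 is an assumed encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of any library.
class GzipTextFile {
    /** Writes the text to a gzip-compressed file. */
    static void write(Path file, String text) throws IOException {
        try (Writer writer = new OutputStreamWriter(
                new GZIPOutputStream(Files.newOutputStream(file)), StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    /** Reads a gzip-compressed text file line by line without unpacking it on disk. */
    static List<String> readLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(file)), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

For a large file, the loop in readLines could process each line directly instead of collecting them in a list.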

Related

Not able to read ZipInputStream returned by ZipFile.getInputStream(ZipEntry) method

I am trying to extract a given file from a zip file. The zip file contains directories and sub-directories as well. I tried the Java 7 NIO file APIs, but since my zip has subdirectories, I need to provide the complete path to extract the file, which is not suitable in my scenario because I have to take the name of the file to extract as input from the user. I have been trying the code below, but somehow the read method of ZipInputStream does not read any contents into the buffer. While debugging I found that the ZipEntry object inside ZipInputStream is null, due to which its read method simply returns -1. But now I am stuck, as I cannot figure out how that value is supposed to be set.
try(OutputStream out=new FileOutputStream("filetoExtract");) {
    zipFile = new ZipFile("zipFile");
    Enumeration<? extends ZipEntry> e = zipFile.entries();
    while (e.hasMoreElements()) {
        ZipEntry entry = e.nextElement();
        if (!entry.isDirectory()) {
            String entryName = entry.getName();
            String fileName = entryName.substring(entryName.lastIndexOf("/") + 1);
            System.out.println(i++ + "." + entryName);
            if (searchFile.equalsIgnoreCase(fileName)) {
                System.out.println("File Found");
                BufferedInputStream bufferedInputStream = new BufferedInputStream(zipFile.getInputStream(entry));
                ZipInputStream zin = new ZipInputStream(bufferedInputStream);
                byte[] buffer = new byte[9000];
                int len;
                while ((len = zin.read(buffer)) != -1) {
                    out.write(buffer, 0, len);
                }
                out.close();
                break;
            }
        }
    }
} catch (IOException ioe) {
    System.out.println("Error opening zip file" + ioe);
}
Please advise what I am doing wrong here. Thanks.
EDIT:
After debugging a little more, I found out that the ZipFile class has an inner class with a similar name (ZipFileInputStream), so it was creating an object of that class rather than the outside ZipFileInputStream class. I tried the code below and it worked well, but I don't quite understand what happened here. If someone could explain the logic behind the scenes, that would be really great.
// BufferedInputStream bufferedInputStream = new
//BufferedInputStream(zipFile.getInputStream(entry));
//ZipInputStream zin = new ZipInputStream(bufferedInputStream);
InputStream zin= zipFile.getInputStream(entry);
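The second line of the following snippet is unnecessary: zipFile.getInputStream(entry) already returns an InputStream that delivers the decompressed data of that entry. Wrapping that InputStream in yet another ZipInputStream is therefore not just needless but wrong. A ZipInputStream expects a complete zip archive as its input and returns data only after getNextEntry() has positioned it on an entry; since that was never called, its internal entry was null and read returned -1: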
BufferedInputStream bufferedInputStream = new BufferedInputStream(zipFile.getInputStream(entry));
ZipInputStream zin = new ZipInputStream(bufferedInputStream);
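A corrected version of the search-and-extract logic might look like this sketch, which copies the stream returned by getInputStream directly to the target. ZipEntryExtractor and extractByFileName are hypothetical names, and the sketch assumes the first matching file name is the one wanted.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration; not part of any library.
class ZipEntryExtractor {
    /**
     * Finds the first entry whose file name (ignoring its directory path and case)
     * matches searchFile and copies its decompressed content to target.
     * Returns true if a matching entry was found.
     */
    static boolean extractByFileName(Path zipPath, String searchFile, Path target) throws IOException {
        try (ZipFile zipFile = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                String entryName = entry.getName();
                String fileName = entryName.substring(entryName.lastIndexOf('/') + 1);
                if (searchFile.equalsIgnoreCase(fileName)) {
                    // The returned stream is already decompressed; no ZipInputStream wrapper
                    try (InputStream in = zipFile.getInputStream(entry)) {
                        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                    }
                    return true;
                }
            }
        }
        return false;
    }
}
```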

Java: how to compress a byte[] using ZipOutputStream without intermediate file

Requirement: compress a byte[] to get another byte[] using java.util.zip.ZipOutputStream BUT without using any files on disk or in-memory(like here https://stackoverflow.com/a/18406927/9132186). Is this even possible?
All the examples I found online read from a file(.txt) and write to a file(.zip). ZipOutputStream needs a ZipEntry to work with and that ZipEntry needs a file.
However, my use case is as follows: I need to compress a chunk (say 10MB) of a file at a time using a zip format and append all these compressed chunks to make a .zip file. But, when I unzip the .zip file then it is corrupted.
I am using in-memory files as suggested in https://stackoverflow.com/a/18406927/9132186 to avoid files on disk but need a solution without these files also.
public void testZipBytes() {
    String infile = "test.txt";
    FileInputStream in = new FileInputStream(infile);
    String outfile = "test.txt.zip";
    FileOutputStream out = new FileOutputStream(outfile);
    byte[] buf = new byte[10];
    int len;
    while ((len = in.read(buf)) > 0) {
        out.write(zipBytes(buf));
    }
    in.close();
    out.close();
}
// ACTUAL function that compresses byte[]
public static class MemoryFile {
    public String fileName;
    public byte[] contents;
}
public byte[] zipBytesMemoryFileWORKS(byte[] input) {
    MemoryFile memoryFile = new MemoryFile();
    memoryFile.fileName = "try.txt";
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ZipOutputStream zos = new ZipOutputStream(baos);
    ZipEntry entry = new ZipEntry(memoryFile.fileName);
    entry.setSize(input.length);
    zos.putNextEntry(entry);
    zos.write(input);
    zos.finish();
    zos.closeEntry();
    zos.close();
    return baos.toByteArray();
}
Scenario 1:
If test.txt has a small amount of data (less than 10 bytes), like "this", then unzipping test.txt.zip yields try.txt with "this" in it.
Scenario 2:
If test.txt has a larger amount of data (more than 10 bytes), like "this is a test for zip output stream and it is not working", then unzipping test.txt.zip yields try.txt with broken pieces of data, and it is incomplete.
These 10 bytes are the buffer size in testZipBytes, i.e. the amount of data that is compressed at a time by zipBytes.
Expected (or rather desired):
1. Unzipping test.txt.zip does not use the "try.txt" filename I gave in the MemoryFile but rather unzips to the filename test.txt itself.
2. The unzipped data is not broken and yields the input data as is.
3. I have done the same with GZIPOutputStream and it works perfectly fine.
Requirement: compress a byte[] to get another byte[] using java.util.zip.ZipOutputStream BUT without using any files on disk or in-memory(like here https://stackoverflow.com/a/18406927/9132186). Is this even possible?
Yes, you've already done it. You don't actually need MemoryFile in your example; just delete it from your implementation and write ZipEntry entry = new ZipEntry("try.txt") instead.
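But you can't concatenate the zips of 10MB chunks of a file and get a valid zip file for the combined file. Zipping doesn't work like that: each call to your method produces a complete archive with its own entry headers and central directory, so concatenating several of them yields a sequence of separate archives, not one archive of the combined file. You could perhaps have a solution that minimizes how much is in memory at once. But breaking the original file up into chunks and zipping each one separately seems unworkable.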
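One way to keep memory use low, sketched here under the assumption that the goal is a single valid archive, is to open one ZipOutputStream with one entry and push the input through it in small chunks. StreamingZip and its zip method are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; not part of any library.
class StreamingZip {
    /**
     * Compresses everything read from in into a single zip entry written to out.
     * Only one buffer of input is held in memory at a time.
     * Closing the ZipOutputStream writes the central directory and also closes out.
     */
    static void zip(InputStream in, OutputStream out, String entryName) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(new ZipEntry(entryName));
            byte[] buf = new byte[8192];
            int len;
            while ((len = in.read(buf)) != -1) {
                // Write only the bytes actually read, not the whole buffer
                zos.write(buf, 0, len);
            }
            zos.closeEntry();
        }
    }
}
```

Passing "test.txt" as entryName would make the archive unpack to test.txt rather than try.txt.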

Save bytes to a pdf file and zip it

I am trying to save byte[] data to a PDF and zip it up. Every time I try, I get a blank PDF. Below is the code. Can anyone tell me what I am doing wrong?
byte[] decodedBytes = Base64.decodeBase64(contents);
ZipOutputStream zos = new ZipOutputStream(new BufferedOutputStream(new FileOutputStream("c:\\output\\asdwd.zip")));
//now create the entry in zip file
ZipEntry entry = new ZipEntry("asd.pdf");
zos.putNextEntry(entry);
zos.write(decodedBytes);
zos.close();
This is what I am following
To obtain the actual PDF document, you must decode the Base64-encoded string, save it as a binary file with a “.zip” extension, and then extract the PDF file from the ZIP file.
You don’t actually need to create a zip file. The instructions are telling you that the base64 data represents a zipped PDF, so you can unzip it in code and write the PDF file itself:
Path pdfFile = Files.createTempFile(null, ".pdf");
try (ZipInputStream zip = new ZipInputStream(
        new ByteArrayInputStream(
            Base64.getDecoder().decode(contents)))) {
    ZipEntry entry;
    while ((entry = zip.getNextEntry()) != null) {
        String name = entry.getName();
        if (name.endsWith(".pdf") || name.endsWith(".PDF")) {
            Files.copy(zip, pdfFile, StandardCopyOption.REPLACE_EXISTING);
            break;
        }
    }
}

Create a temporary java.io.File from byte[]

I must use an existing method that is like saveAttachment(Attachment attachment) where Attachment has a File attribute.
My problem is that I'm retrieving a byte[] and I want to save it using this method. How can I create a "local" File just for saving?
Sorry if my question is dumb, I don't know much about Files in Java.
File tempFile = File.createTempFile(prefix, suffix, null);
// try-with-resources closes the stream so the bytes are flushed to disk
try (FileOutputStream fos = new FileOutputStream(tempFile)) {
    fos.write(byteArray);
}
Check out related docs:
File.createTempFile(prefix, suffix, directory);
Reading All Bytes or Lines from a File
Path file = ...;
byte[] fileArray;
fileArray = Files.readAllBytes(file);
Writing All Bytes or Lines to a File
Path file = ...;
byte[] buf = ...;
Files.write(file, buf);
You're in luck.
File.createTempFile(String prefix, String suffix)
Creates a file in the default temp directory of the OS, where it's guaranteed you can write to.
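Combining the two suggestions, a small helper could look like the sketch below. TempFiles and fromBytes are hypothetical names, and the call to deleteOnExit is an assumption about the desired cleanup.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical helper for illustration; not part of any library.
class TempFiles {
    /**
     * Writes the bytes to a new file in the default temp directory.
     * The prefix must be at least three characters long (a File.createTempFile rule).
     */
    static File fromBytes(byte[] data, String prefix, String suffix) throws IOException {
        File tempFile = File.createTempFile(prefix, suffix);
        // Assumption: the file is only needed while the JVM is running
        tempFile.deleteOnExit();
        Files.write(tempFile.toPath(), data);
        return tempFile;
    }
}
```

The returned File could then be set on the Attachment before calling saveAttachment.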

Unzip in memory for google App Engine Java

I would like to unzip a file uploaded to a servlet, and store all decompressed files to the DataStore as byte[]. Since there is no file system in GAE, I have to put everything in memory. Suppose I have byte[] allzipdata to store the original zip file data. How do I unzip the file and especially how to get inputstream from each zipentry which are in memory?
ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(allzipdata));
ZipEntry ze = zis.getNextEntry();
while(ze!=null){
}
So what's in the while loop?
Also, if I upload a file, I know the contentType using item.getContentType(); in which item is a FileItemStream. So for a zipentry, is there a way to know the contentType?
To read image data from the ZipInputStream, I'd recommend using the Apache Commons IO library. Its IOUtils.toByteArray reads the current ZIP entry from the input stream into a byte array:
ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(allzipdata));
ZipEntry ze = null;
while ((ze = zis.getNextEntry()) != null) {
    // write your code to use the zip entry, e.g. as below:
    String filename = ze.getName();
    System.out.println("File Name of Entry file=" + filename);
    byte[] data = IOUtils.toByteArray(zis);
    // now work with the image `data`
}
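If adding Commons IO is not an option, the same loop can be written with the JDK alone. The sketch below collects every file entry into a map keyed by entry name; InMemoryUnzip and unzipAll are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; not part of any library.
class InMemoryUnzip {
    /** Decompresses every file entry of an in-memory zip archive, keyed by entry name. */
    static Map<String, byte[]> unzipAll(byte[] allzipdata) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(allzipdata))) {
            ZipEntry ze;
            while ((ze = zis.getNextEntry()) != null) {
                if (ze.isDirectory()) {
                    continue;
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int len;
                // read returns -1 at the end of the current entry, not of the whole archive
                while ((len = zis.read(buffer)) != -1) {
                    out.write(buffer, 0, len);
                }
                files.put(ze.getName(), out.toByteArray());
            }
        }
        return files;
    }
}
```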
