List .zip directories without extracting

I am building a file explorer in Java and I am listing the files/folders in JTrees. What I am trying to do now is when I get to a zipped folder I want to list its contents, but without extracting it first.
If anyone has an idea, please share.

I suggest you have a look at ZipFile.entries().
Here's some code:
try (ZipFile zipFile = new ZipFile("test.zip")) {
    Enumeration<? extends ZipEntry> zipEntries = zipFile.entries();
    while (zipEntries.hasMoreElements()) {
        String fileName = zipEntries.nextElement().getName();
        System.out.println(fileName);
    }
}
If you're using Java 8 or later, you can avoid the legacy Enumeration interface by using ZipFile.stream() as follows:
zipFile.stream()
        .map(ZipEntry::getName)
        .forEach(System.out::println);
If you need to know whether an entry is a directory or not, you can use ZipEntry.isDirectory. Beyond entry metadata such as ZipEntry.getSize, getCompressedSize, getTime and getCrc, you can't get much more information than that without extracting the file (for obvious reasons).
If you want to avoid extracting all files, you can extract one file at a time using ZipFile.getInputStream for each ZipEntry. (Note that you don't need to store the unpacked data on disk; you can just read the input stream and discard the bytes as you go.)
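As a rough sketch of the "read and discard" idea above, the following reads each file entry through ZipFile.getInputStream and only counts the bytes, so nothing is written to disk. The class name ZipEntryScanner and the method countBytes are made up for this illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of the original answer.
public class ZipEntryScanner {

    /** Streams every file entry and returns its uncompressed byte count, keyed by entry name. */
    public static Map<String, Long> countBytes(String zipPath) throws IOException {
        Map<String, Long> sizes = new LinkedHashMap<>();
        try (ZipFile zipFile = new ZipFile(zipPath)) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            byte[] buffer = new byte[8192];
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue; // directory entries carry no data
                }
                long total = 0;
                try (InputStream in = zipFile.getInputStream(entry)) {
                    int n;
                    // Read the decompressed bytes and drop them immediately.
                    while ((n = in.read(buffer)) != -1) {
                        total += n;
                    }
                }
                sizes.put(entry.getName(), total);
            }
        }
        return sizes;
    }
}
```

In a real file explorer the loop body would presumably do something more useful with the bytes (preview, search, checksum) instead of counting them.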

Use java.util.zip.ZipFile class and, specifically, its entries method.
You'll have something like this:
// try-with-resources closes the archive when done
try (ZipFile zipFile = new ZipFile("testfile.zip")) {
    Enumeration<? extends ZipEntry> zipEntries = zipFile.entries();
    String fname;
    while (zipEntries.hasMoreElements()) {
        fname = zipEntries.nextElement().getName();
        ...
    }
}

For handling ZIP files you can use the ZipFile class. Its entries() method returns an Enumeration of the entries contained in the ZIP file. This information is stored in the archive's central directory, so extraction is not required.
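Since the question is about showing the contents in a JTree, here is a minimal sketch of how the flat entry names from entries() could be grouped by parent folder before building tree nodes. The class ZipTreeIndex and the method childrenByFolder are invented for this example, and it assumes the archive contains explicit directory entries; folders that exist only implicitly in file paths will appear as keys but not as children of their parent.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of the original answers.
public class ZipTreeIndex {

    /** Maps each parent folder ("" for the archive root) to the names directly inside it. */
    public static Map<String, List<String>> childrenByFolder(String zipPath) throws IOException {
        Map<String, List<String>> index = new TreeMap<>();
        try (ZipFile zipFile = new ZipFile(zipPath)) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                // Directory entries end with '/', so drop it before splitting.
                String trimmed = name.endsWith("/") ? name.substring(0, name.length() - 1) : name;
                int slash = trimmed.lastIndexOf('/');
                String parent = slash < 0 ? "" : trimmed.substring(0, slash);
                String child = trimmed.substring(slash + 1);
                index.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
            }
        }
        return index;
    }
}
```

Each map key could then become a tree node whose children are the listed names.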

Related

How can I get all the directories within a jar as a List of File objects?

I have a jar which has a resources folder that contains a folder, let's call it toplevel. toplevel contains another folder, called level1. level1 then contains a list of directories. I'd like to retrieve these directories as java.io.File objects, so that another function can do things with these File objects. With the below example that'd be a List<File> like List{dira, dirb, dirc} How can this be done?
toplevel
---level1
------dir a
------dir b
------dir c

I would suggest extracting matching entries from the jar file and save to a temporary location to get the java.io.File reference.
Option #1:
If you are reading from a file system, use ZipFile to read the file then use ZipFile.getEntry("zip-path") to get the entry and save using Files.copy
See: ZipEntry to File
Option #2:
If you are reading from an input stream source, use ZipInputStream to read the jar file, then iterate, filter and apply action to matching entries. Each matching entry is coupled with a matching ZipInputStream and you can use those input streams to save them to a temporary location, then create the List<File> reference to hand off to another function.
I wrote a quick example in this repo:
https://github.com/nfet/java-zip-demo/tree/main/src/main/resources
The demo essentially just reads the jar file in the resource folder and finds a single matching zip entry (META-INF/license.txt) and saves it to a file.
See Example Implementation in:
https://github.com/nfet/java-zip-demo/blob/7dbdba9c47e0773f959d740d62fbb63949eaca94/src/main/java/com/example/jar/demo/ReadJarFile.java
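Independently of the linked demo, Option #2 might look roughly like the sketch below: it reads the jar from an InputStream, keeps only entries under a given prefix, and copies them to a temporary directory so they can be handed over as java.io.File objects. The names JarFolderExtractor, extractUnder and the "jar-extract" temp-directory prefix are assumptions made for this example.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not taken from the linked repository.
public class JarFolderExtractor {

    /** Copies every entry whose name starts with prefix into a temp directory and returns the files. */
    public static List<File> extractUnder(InputStream jarStream, String prefix) throws IOException {
        Path tempDir = Files.createTempDirectory("jar-extract");
        List<File> extracted = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(jarStream)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (!entry.getName().startsWith(prefix)) {
                    continue;
                }
                Path target = tempDir.resolve(entry.getName()).normalize();
                // Guard against entry names such as "../x" escaping the temp directory.
                if (!target.startsWith(tempDir)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target);
                } else {
                    Files.createDirectories(target.getParent());
                    // Reads only the current entry's bytes; does not close zis.
                    Files.copy(zis, target, StandardCopyOption.REPLACE_EXISTING);
                }
                extracted.add(target.toFile());
            }
        }
        return extracted;
    }
}
```

For the layout in the question, a call like extractUnder(stream, "toplevel/level1/") would return the files below level1. Directories show up in the list only if the jar stores explicit directory entries.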

Thanks for the help, folks. After much ado, I found a solution by building on this previous answer, https://stackoverflow.com/a/1529707/9486041, and narrowing it down to the folder and its contents.
// folderURL and path are defined elsewhere by the caller
JarURLConnection connection = (JarURLConnection) folderURL.openConnection();
try (JarFile jar = new JarFile(new File(connection.getJarFileURL().toURI()))) {
    Enumeration<JarEntry> enumEntries = jar.entries();
    while (enumEntries.hasMoreElements()) {
        JarEntry file = enumEntries.nextElement();
        if (!file.getName().startsWith(path + "/")) {
            continue;
        }
        File f = new File(System.getProperty("user.home") + "/tmp" + File.separator + file.getName());
        if (file.isDirectory()) { // directory entries have no content to copy
            f.mkdirs();
            continue;
        }
        f.getParentFile().mkdirs();
        try (InputStream is = jar.getInputStream(file); // get the input stream
             FileOutputStream fos = new FileOutputStream(f)) {
            byte[] buffer = new byte[8192];
            int n;
            // read until end of stream; available() is not a reliable end-of-data check
            while ((n = is.read(buffer)) != -1) { // write contents of 'is' to 'fos'
                fos.write(buffer, 0, n);
            }
        }
    }
}

Programmatically Extract Single Specific File From 7zip Archive - Java - Linux

I would really appreciate your input on the below scenario please.
The requirements:
- I have a 7zip archive file with several thousands of files in it
- I have a java application running on linux that is required to retrieve individual files from the 7 zip file
I would like to retrieve a file from the archive by its path (e.g. my7zFile.7z/file1.pdf) without having to iterate through all the files in the archive and comparing file names.
I would like to avoid having to extract all files from the archive before running the search (the uncompressed archive is several TB).
I had a look into 7zip Java Binding - specifically the IInArchive class, the only extract method seems to work via file index, not via file name:
http://sevenzipjbind.sourceforge.net/javadoc/net/sf/sevenzipjbinding/IInArchive.html
Do you know of any other libraries that could help me with this use case or am I overlooking a way of doing this with 7zip jbinding?
Thank you
Kind regards,
Tobi

Sadly it appears the API doesn't provide enough to fulfill all your requirements. In order to extract a single file it appears you need to walk the archive index. The simplified interface to the archive makes this much easier:
The ISimpleInArchive interface provides:
ISimpleInArchiveItem[] getArchiveItems()
Allowing you to retrieve a list of items in the archive.
The ISimpleInArchiveItem interface provides the method:
java.lang.String getPath()
Hence you can walk the archive items, comparing on path. Granted, this is against your requirements.
However, note that this walks the index table and does not extract the files until requested. Once you have the item you're after, you can use:
ExtractOperationResult extractSlow(ISequentialOutStream SequentialOutStream)
on the item you have found to actually extract it.
Looking at the 7z file format (note this is not the official site of 7zip), the header information is all at the end of the file with the Signature header at the start of the file giving an offset to the start of the header info. So provided the SevenZip bindings are written nicely, your search will at most read the start of the file (SignatureHeader) to find the offset to the HeaderInfo section, then walk the HeaderInfo section in order to build up the file list required in getArchiveItems(). Only once you have the item you need will it shift back to the index of the actual stream for the file you want extracted (most likely when you call extractSlow).
So whilst not all your requirements are met, the overhead of the search/compare required is limited to only searching the header info of the archive.

I once wrote some code to read all the files and folders from a zip file, which contained a deep hierarchy of text files and folders. I am not sure whether it will help you, but here is the skeleton of the code.
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

try (ZipFile zipFile = new ZipFile(filepath)) { // filepath of the zip file
    Enumeration<? extends ZipEntry> entries = zipFile.entries();
    while (entries.hasMoreElements()) {
        ZipEntry entry = entries.nextElement();
        if (entry.isDirectory()) { // found directory inside the zipFile
            // write your code here
        } else {
            InputStream stream = zipFile.getInputStream(entry);
            BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
            // write your code to read the content of the file
        }
    }
}
You can modify the code to reach your desired file in the zip. But I don't think you will be able to access the file directly; rather, you have to walk through all the paths of the archive. (For plain ZIP archives, ZipFile.getEntry(name) does look an entry up by name.) Note that ZipFile.entries() returns entries in the order they are stored in the archive's central directory, which is not necessarily depth-first. You will find detailed examples on the web.
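For plain ZIP archives (not 7z, which java.util.zip does not read), a lookup by name does not need a hand-written loop, because ZipFile.getEntry accepts the entry path directly. A minimal sketch, with the class name ZipDirectLookup and the method readText invented for illustration, and UTF-8 assumed as the text encoding:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper; applies to ZIP archives only.
public class ZipDirectLookup {

    /** Returns the text of the named entry, or null if there is no such file entry. */
    public static String readText(String zipPath, String entryName) throws IOException {
        try (ZipFile zipFile = new ZipFile(zipPath)) {
            ZipEntry entry = zipFile.getEntry(entryName); // lookup by path inside the archive
            if (entry == null || entry.isDirectory()) {
                return null;
            }
            StringBuilder text = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(zipFile.getInputStream(entry), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    text.append(line).append('\n');
                }
            }
            return text.toString();
        }
    }
}
```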

Read tgz w/out unpacking it onto computer or Unpack as temp & delete when program closes?

Hey guys, I'm currently using jarchivelib. I'm stuck on figuring out a way to read the file without having to use the extract method, because it writes an unpacked copy to disk. Example:
File archive = new File("/home/jack/archive.zip");
File destination = new File("/home/jack/archive");
Archiver archiver = ArchiverFactory.createArchiver(ArchiveFormat.ZIP);
archiver.extract(archive, destination);
I want to make it so I don't have to unpack the archive to read the files. If there is no way to do that, I'm guessing I'll have to write a custom close handler in place of JFrame.setDefaultCloseOperation so that it deletes the files? Or is there a better way of handling temp files?

If all you want to do is extract the file, why not use Java's built-in zip support or, if the archive is password protected, Zip4j? These libraries support streams, so you can read the contents of the file without writing them to disk.

As of version 0.4.0, the jarchivelib Archiver API supports streaming an archive rather than extracting it directly onto the filesystem.
ArchiveStream stream = archiver.stream(archive);
ArchiveEntry entry;
while ((entry = stream.getNextEntry()) != null) {
    // access each archive entry individually using the stream
    // or extract it using entry.extract(destination)
    // or fetch meta-data using entry.getName(), entry.isDirectory(), ...
}
stream.close();
When the stream is pointing to an entry after calling getNextEntry, you can use the stream.read methods just as you would when reading an individual entry.

Efficient use of FileSystems for listing zip file entries stored in resources

In Java 7, zip files can be treated much like a file system:
http://fahdshariff.blogspot.cz/2011/08/java-7-working-with-zip-files.html
In my project I'd like to process the zip file stored in resources. But I cannot find any efficient way for converting the resource stream (getResourceAsStream) into a new FileSystems object directly without storing that file to the disk first:
Map<String, String> nameMap = new HashMap<>();
Path path = Files.createTempFile(null, ".zip");
Files.copy(MyClass.class.getResourceAsStream("/data.zip"), path, StandardCopyOption.REPLACE_EXISTING);
try (FileSystem zipFileSystem = FileSystems.newFileSystem(path, null)) {
    Files.walkFileTree(zipFileSystem.getPath("/"), new NameMapParser(nameMap));
}
Files.delete(path);
Am I missing something?

No, this isn't possible. The reasons are that a) streams often can't be read twice and b) ZIP archives need random access.
The list of files is attached at the end, so you need to skip the data, find the file entry, locate the position of the data entry and then seek backwards.
This is why code like Java WebStart downloads and caches the files.
Note that you don't have to write the ZIP archive to disk, though. You can use ShrinkWrap to create an in-memory filesystem.
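If a FileSystem object is not strictly required and a single sequential pass over the entries is enough, ZipInputStream can read the resource stream directly, since it walks the local entry headers in order rather than seeking to the central directory. This is a different technique from the FileSystems approach in the question, shown only as a sketch; the class ZipStreamLister and the method listNames are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: one sequential pass, no temp file, no FileSystem.
public class ZipStreamLister {

    /** Returns the entry names in the order they are stored in the stream. */
    public static List<String> listNames(InputStream zipStream) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }
}
```

With the resource from the question, the call would be something like ZipStreamLister.listNames(MyClass.class.getResourceAsStream("/data.zip")). The trade-off is that entries can only be visited once and in storage order.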

How to access files inside of a folder that is inside of a ZipEntry

Sorry for the confusing title. Basically I have a ZipFile that has a bunch of .txt files in it but also has one folder. The code below finds that folder among the zip entries, and that part works fine. The problem is that once I find the folder, it is a ZipEntry, which has no useful methods for getting the entries inside that folder. The folder contains more .txt files that I want to process (that is the main goal).
zipFile = new ZipFile(zipName);
Enumeration<? extends ZipEntry> entries = zipFile.entries();
while (entries.hasMoreElements()) {
    ZipEntry current = entries.nextElement();
    if (current.getName().equals(folderName)) {
        assertTrue(current.isDirectory());
        // Here is where I want to get the files in the folder
    }
}

ZipEntry has a method isDirectory() which
Returns true if this is a directory entry. A directory entry is
defined to be one whose name ends with a '/'.
What you'll want to do is iterate over all the entries (as you are doing) and get the InputStream for those that are inside the directory, i.e. those whose path is relative to the directory.
Say folderName has the value "/zip/myzip/directory"; then a file inside that directory will have a name such as "/zip/myzip/directory/myfile.txt". You can use the Java NIO Path API to help you:
Path directory = Paths.get("/zip/myzip/directory"); // you get this directory path from the ZipEntry
Path file = Paths.get(current.getName());
if (file.startsWith(directory)) {
    // do your thing
}
You can get the InputStream as
zipFile.getInputStream(current);
Note that paths inside a Zip file will be relative to the root of the Zip location. If the zip is at
C:/Users/You/Desktop/myzip.zip
a folder directly inside the zip will show a path like
directory/
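Putting the pieces above together, a plain string-prefix check on the entry name is one possible way to collect the files inside a folder. The sketch below assumes entry names are relative to the archive root and use '/' as separator; the class ZipFolderLister and the method filesUnder are made-up names.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of the original answer.
public class ZipFolderLister {

    /** Returns the names of all file entries located under the given folder. */
    public static List<String> filesUnder(String zipPath, String folderName) throws IOException {
        // Ensure a trailing slash so "directory" does not also match "directory2/...".
        String prefix = folderName.endsWith("/") ? folderName : folderName + "/";
        List<String> result = new ArrayList<>();
        try (ZipFile zipFile = new ZipFile(zipPath)) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().startsWith(prefix)) {
                    result.add(entry.getName());
                }
            }
        }
        return result;
    }
}
```

Each returned name can be passed to ZipFile.getEntry and ZipFile.getInputStream to read that file's contents.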

Something like that may help you
final ZipFile zf = new ZipFile(filename);
for (final Enumeration<? extends ZipEntry> e = zf.entries(); e.hasMoreElements();) {
    final ZipEntry ze = e.nextElement();
    if (!ze.isDirectory()) {
        final String name = ze.getName();
        //.....
    }
}
Enjoy it ;-)

Actually there's a simpler way to do this. ZipInputStream returns entries in the order they are stored in the archive, so after a directory entry the following entries are typically the contents of that directory. For example, let's say your directory structure is this:
Dir1/A.txt Dir1/B.txt Dir2/C.txt D.txt
Then to access all of the files above, you simply go about it this way:
ZipInputStream zis = new ZipInputStream(in);
ZipEntry entry = zis.getNextEntry();
while (entry != null) {
    if (!entry.isDirectory()) {
        // do something with entry
    }
    // else continue
    entry = zis.getNextEntry();
}
This will iterate through all the files (in the order they're listed) without having to explicitly check to see if they're directories since you're not doing anything different with them.
