Reading gzip files inside gzip file using Java - java

Using Java, I have to read text files that are inside .gz files, which are themselves inside a .tar.gz archive.
The archive is named gz_ltm_logs.tar.gz. It contains the files ltm.1.gz, ltm.2.gz, and so on, and each of those contains a text file.
I wanted to do this using only java.util.zip.*, but if that is impossible I can look at other libraries.
I thought java.util.zip would be enough, but it doesn't seem straightforward.

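Here's some code to give you an idea. This method extracts a given tar.gz file to outputFolder. It uses the TarInputStream class from Apache Ant (org.apache.tools.tar), because java.util.zip has no support for the tar format.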
public static void extract(File input, File outputFolder) throws IOException {
byte[] buffer = new byte[1024];
// Decompress the outer gzip layer into memory (fine for small archives)
GZIPInputStream gzipFile = new GZIPInputStream(new FileInputStream(input));
// java.io.ByteArrayOutputStream replaces the JDK-internal ByteOutputStream class
ByteArrayOutputStream tarStream = new ByteArrayOutputStream();
int gzipLengthRead;
while ((gzipLengthRead = gzipFile.read(buffer)) != -1) {
tarStream.write(buffer, 0, gzipLengthRead);
}
gzipFile.close();
org.apache.tools.tar.TarInputStream tarFile = null;
// files inside the tar
OutputStream out = null;
try {
tarFile = new org.apache.tools.tar.TarInputStream(new ByteArrayInputStream(tarStream.toByteArray()));
TarEntry entry = null;
while ((entry = tarFile.getNextEntry()) != null) {
String outFilename = entry.getName();
if (entry.isDirectory()) {
File directory = new File(outputFolder, outFilename);
directory.mkdirs();
} else {
File outputFile = new File(outputFolder, outFilename);
File outputDirectory = outputFile.getParentFile();
if (!outputDirectory.exists()) {
outputDirectory.mkdirs();
}
out = new FileOutputStream(outputFile);
// Transfer bytes from the tarFile to the output file
int innerLen;
while ((innerLen = tarFile.read(buffer)) > 0) {
out.write(buffer, 0, innerLen);
}
out.close();
}
}
} finally {
if (tarFile != null) {
tarFile.close();
}
if (out != null) {
out.close();
}
}
}
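If the goal is only to read the text inside the inner .gz files, and only the JDK is allowed, the tar layer can be walked by hand, since java.util.zip has gzip support but no tar support. The following is a minimal sketch, not a full tar implementation. The class and method names (NestedGzipReader, readInnerGzipText) are made up for illustration. It assumes plain ustar-style 512-byte headers, short entry names (no GNU long-name or prefix handling), octal size fields, UTF-8 text, and members small enough to hold in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;

public class NestedGzipReader {

    /** Returns entry name -> decompressed text for every *.gz member of a .tar.gz stream. */
    public static Map<String, String> readInnerGzipText(InputStream tarGz) throws IOException {
        Map<String, String> result = new LinkedHashMap<>();
        try (DataInputStream tar = new DataInputStream(new GZIPInputStream(tarGz))) {
            byte[] header = new byte[512];
            while (true) {
                try {
                    tar.readFully(header);
                } catch (EOFException e) {
                    break; // stream ended without the usual zero blocks
                }
                if (header[0] == 0) {
                    break; // an all-zero block marks the end of the archive
                }
                String name = field(header, 0, 100);
                // Assumption: size is stored as octal text (no base-256 extension)
                long size = Long.parseLong(field(header, 124, 12).trim(), 8);
                byte type = header[156];

                byte[] data = new byte[(int) size];
                tar.readFully(data);
                // Entry data is padded up to the next 512-byte boundary
                int padding = (int) ((512 - size % 512) % 512);
                tar.readFully(new byte[padding]);

                boolean regularFile = type == '0' || type == 0;
                if (regularFile && name.endsWith(".gz")) {
                    result.put(name, gunzipToString(data));
                }
            }
        }
        return result;
    }

    /** Reads a NUL-terminated ASCII field from a tar header block. */
    private static String field(byte[] block, int offset, int length) {
        int end = offset;
        while (end < offset + length && block[end] != 0) {
            end++;
        }
        return new String(block, offset, end - offset, StandardCharsets.US_ASCII);
    }

    /** Decompresses the inner gzip layer and decodes it as UTF-8 (an assumption). */
    private static String gunzipToString(byte[] gz) throws IOException {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

It could be called as readInnerGzipText(new FileInputStream("gz_ltm_logs.tar.gz")). For archives with long paths or very large members, a tar library such as Apache Commons Compress or Ant's TarInputStream is the safer choice.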

Related

How to decompress BZIP (not BZIP2) with Apache Commons

I have been working on a task to decompress several archive formats such as zip, tar, tbz and tgz. I can handle all of them except tbz, because the Apache Commons Compress library only provides BZIP2 compressors, and I need to decompress old BZIP, not BZIP2. Is there any way to do this in Java? The code I have so far for extracting the different tar archives with Apache Commons Compress is below.
public List<ArchiveFile> processTarFiles(String compressedFilePath, String fileType) throws IOException {
List<ArchiveFile> extractedFileList = null;
TarArchiveInputStream is = null;
FileOutputStream fos = null;
BufferedOutputStream dest = null;
try {
if(fileType.equalsIgnoreCase("tar"))
{
is = new TarArchiveInputStream(new FileInputStream(new File(compressedFilePath)));
}
else if(fileType.equalsIgnoreCase("tbz")||fileType.equalsIgnoreCase("bz"))
{
is = new TarArchiveInputStream(new BZip2CompressorInputStream(new FileInputStream(new File(compressedFilePath))));
}
else if(fileType.equalsIgnoreCase("tgz")||fileType.equalsIgnoreCase("gz"))
{
is = new TarArchiveInputStream(new GzipCompressorInputStream(new FileInputStream(new File(compressedFilePath))));
}
TarArchiveEntry entry = is.getNextTarEntry();
extractedFileList = new ArrayList<>();
while (entry != null) {
// grab a zip file entry
String currentEntry = entry.getName();
if (!entry.isDirectory()) {
File destFile = new File(Constants.DEFAULT_ZIPOUTPUTPATH, currentEntry);
File destinationParent = destFile.getParentFile();
// create the parent directory structure if needed
destinationParent.mkdirs();
ArchiveFile archiveFile = new ArchiveFile();
int currentByte;
// use a fixed-size buffer; sizing it from entry.getSize() breaks on empty or very large entries
byte data[] = new byte[8192];
// write the current file to disk
fos = new FileOutputStream(destFile);
dest = new BufferedOutputStream(fos);
// read and write until the end of the current entry
while ((currentByte = is.read(data, 0, data.length)) != -1) {
dest.write(data, 0, currentByte);
}
dest.flush();
dest.close();
archiveFile.setExtractedFilePath(destFile.getAbsolutePath());
archiveFile.setFormat(destFile.getName().split("\\.")[1]);
extractedFileList.add(archiveFile);
entry = is.getNextTarEntry();
} else {
new File(Constants.DEFAULT_ZIPOUTPUTPATH, currentEntry).mkdirs();
entry = is.getNextTarEntry();
}
}
} catch (IOException e) {
System.out.println(("ERROR: " + e.getMessage()));
} catch (Exception e) {
System.out.println(("ERROR: " + e.getMessage()));
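} finally {
// guard against null: either stream may never have been opened
if (dest != null) {
dest.close(); // close() also flushes
}
if (is != null) {
is.close();
}
}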
return extractedFileList;
}
The original bzip reportedly used arithmetic coding, which was patent-encumbered at the time, so bzip2 was created using techniques (Huffman coding) that were not patented.
That is probably why the original format is no longer in widespread use and why open source libraries ignore it.
There's some C code for decompressing bzip files shown here (gist.github.com mirror).
You might want to read and rewrite that in Java.
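Before porting the C code, it may be worth checking whether the files really are original bzip, since files with a .tbz extension are often bzip2 in practice. A bzip2 stream starts with the bytes "BZh" followed by a block-size digit from '1' to '9', so the header can be sniffed with the JDK alone. This is only a sketch: the class and method names are made up for illustration, and it says nothing about what a non-matching file actually is.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class Bzip2Sniffer {

    /** Returns true if the file starts with a bzip2 stream header ("BZh" + '1'..'9'). */
    public static boolean looksLikeBzip2(Path file) throws IOException {
        byte[] head = new byte[4];
        try (InputStream in = Files.newInputStream(file)) {
            int read = 0;
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n == -1) {
                    return false; // file is shorter than a bzip2 header
                }
                read += n;
            }
        }
        return head[0] == 'B' && head[1] == 'Z' && head[2] == 'h'
                && head[3] >= '1' && head[3] <= '9';
    }
}
```

If this returns true, the existing BZip2CompressorInputStream branch should be able to handle the file. If it returns false, the file needs a different decoder.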

Java - Create Zip-file with multiple files from different locations with subfolders

I am trying to generate a zip file in Java that contains several files of different types (e.g. images, fonts etc.) that are located in different places. Furthermore, I want the zip file to have subfolders into which the files are sorted by type (e.g. images should go into the images folder within the zip).
These are the files that I have (each can be in a different location):
index.html
img1.jpg
img2.jpg
font1.woff
font2.woff
style.css
custom.js
And this is how they should be in the zip file:
index.html
images/img1.jpg
images/img2.jpg
fonts/font1.woff
fonts/font2.woff
js/custom.js
css/style.css
So far I have managed to take one file in a specific path and prompt the user for the output location. A zip-file will be generated with the file that is specified in the input. Here is the code I have so far:
JFrame parentFrame = new JFrame();
JFileChooser fileChooser = new JFileChooser();
fileChooser.setDialogTitle("Speicherort auswählen");
int userSelection = fileChooser.showSaveDialog(parentFrame);
String pathToFile;
if (userSelection == JFileChooser.APPROVE_OPTION) {
File fileToSave = fileChooser.getSelectedFile();
print(fileToSave.getAbsolutePath());
pathToFile = fileToSave.getAbsolutePath();
}
pathToFile = pathToFile.replace("\\", "/");
String outFileName = pathToFile;
String inFileName = "C:/Users/asoares/Desktop/mobio_export_test/index.html";
ZipOutputStream zos = null;
FileInputStream fis = null;
try {
zos = new ZipOutputStream(new FileOutputStream(outFileName));
fis = new FileInputStream(inFileName);
zos.putNextEntry(new ZipEntry(new File(inFileName).getName()));
int len;
byte[] buffer = new byte[2048];
while((len = fis.read(buffer, 0, buffer.length)) > 0) {
zos.write(buffer, 0, len);
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if(fis != null){
try {
fis.close();
} catch (IOException e) {}
}
if(zos != null){
try {
zos.closeEntry();
zos.close();
} catch (IOException e) {}
}
}
I would be really glad if someone can help me!!!
Something like this should work.
Ideally, the directory name inside the zip should be determined by a separate method (there are more image types than jpg :)).
public static Path zip(List<Path> files, Path zipFileTarget) throws IOException {
// FileOutputStream creates the target file; try-with-resources closes both streams
try (FileOutputStream fos = new FileOutputStream(zipFileTarget.toFile());
ZipOutputStream zos = new ZipOutputStream(fos)) {
createEntries(files, zos);
}
return zipFileTarget;
}
private static List<String> createEntries(List<Path> files, ZipOutputStream zos) throws IOException {
List<String> zippedFiles = new ArrayList<>();
Matcher matcherFileExt = Pattern.compile("^.*\\.([^.]+)$").matcher("");
for (Path f : files) {
if (Files.isRegularFile(f)) {
String fileName = f.getFileName().toString();
String fileExt = matcherFileExt.reset(fileName).matches()
? matcherFileExt.replaceAll("$1")
: "unknown";
// You should determine the dir name with a more sophisticated
// approach.
String dir;
if (fileExt.equals("jpg")) dir = "images";
else if (fileExt.equals("woff")) dir = "fonts";
else dir = fileExt;
zos.putNextEntry(new ZipEntry(dir + "/" + fileName));
Files.copy(f, zos);
zippedFiles.add(fileName);
}
}
return zippedFiles;
}
Edit: this approach works with java 1.7+. You can easily convert a File object to a Path object by calling its toPath() method.
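One possible way to do the "more sophisticated" directory lookup is a small extension-to-folder map, with unmapped files (such as index.html) placed at the root of the zip. This is a sketch under assumptions: the class name TypedZip, the method entryNameFor and the contents of the map are invented for illustration and should be adapted to the file types you actually have.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class TypedZip {

    // Hypothetical mapping from file extension to folder inside the zip
    private static final Map<String, String> FOLDER_BY_EXTENSION = new HashMap<>();
    static {
        FOLDER_BY_EXTENSION.put("jpg", "images");
        FOLDER_BY_EXTENSION.put("jpeg", "images");
        FOLDER_BY_EXTENSION.put("png", "images");
        FOLDER_BY_EXTENSION.put("woff", "fonts");
        FOLDER_BY_EXTENSION.put("css", "css");
        FOLDER_BY_EXTENSION.put("js", "js");
    }

    /** Builds the entry name: "folder/file" for mapped extensions, plain file name otherwise. */
    public static String entryNameFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        String folder = FOLDER_BY_EXTENSION.get(ext);
        return folder == null ? fileName : folder + "/" + fileName;
    }

    /** Zips the given files, wherever they live on disk, into typed subfolders. */
    public static void zip(List<Path> files, Path zipTarget) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipTarget))) {
            for (Path f : files) {
                if (!Files.isRegularFile(f)) {
                    continue; // skip directories and missing files
                }
                zos.putNextEntry(new ZipEntry(entryNameFor(f.getFileName().toString())));
                Files.copy(f, zos);
                zos.closeEntry();
            }
        }
    }
}
```

Note that two input files with the same name would map to the same entry, and putNextEntry rejects duplicate entry names with a ZipException, so real code may need to rename or skip such files.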

Zip file not deleted even if I am getting its correct name and path

I am trying to delete a zip file after unzipping it, but I am not able to delete it:
if (file.getName().contains(".zip")) {
System.out.println(file.getAbsolutePath()); // I am getting the correct path
file.delete();
System.out.println(file.getName()); // I am getting the correct name Script-1.zip
}
This is the full code
public class Zip4 {
public static void main(String[] args) {
File[] files = new File(args[0]).listFiles();
for(File file : files)
// System.out.println(file.getName());
//if(file.getName().contains("1400") && file.getName().contains(".zip"))
extractFolder(args[0] + file.getName(), args[1]);
DeleteFiles();
// for(File file : files)
// System.out.println("File:C:/1/"+ file.getName());
// extractFolder(args[0]+file.getName(),args[1]);
}
private static void DeleteFiles()
{
File f = null;
File[] paths;
f = new File("D:/Copyof");
paths = f.listFiles();
for(File path:paths)
{
// prints file and directory paths
if(path.getName().contains("J14_0_0RC") || path.getName().contains(".zip") || path.getName().contains(".log"))
{
//System.out.println(path);
path.delete();
}
}
}
private static void extractFolder(String zipFile,String extractFolder)
{
try
{
int BUFFER = 2048;
File file = new File(zipFile);
ZipFile zip = new ZipFile(file);
String newPath = extractFolder;
new File(newPath).mkdir();
Enumeration zipFileEntries = zip.entries();
// Process each entry
while (zipFileEntries.hasMoreElements())
{
// grab a zip file entry
ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();
String currentEntry = entry.getName();
File destFile = new File(newPath, currentEntry);
//destFile = new File(newPath, destFile.getName());
File destinationParent = destFile.getParentFile();
// create the parent directory structure if needed
destinationParent.mkdirs();
if (!entry.isDirectory())
{
BufferedInputStream is = new BufferedInputStream(zip
.getInputStream(entry));
int currentByte;
// establish buffer for writing file
byte data[] = new byte[BUFFER];
// write the current file to disk
FileOutputStream fos = new FileOutputStream(destFile);
BufferedOutputStream dest = new BufferedOutputStream(fos,
BUFFER);
// read and write until last byte is encountered
while ((currentByte = is.read(data, 0, BUFFER)) != -1) {
dest.write(data, 0, currentByte);
}
dest.flush();
dest.close();
fos.flush();
fos.close();
is.close();
}
}
if(file.getName().contains(".zip"))
{
System.out.println(file.getAbsolutePath());
file.delete();
System.out.println(file.getName());
}
}
catch (Exception e)
{
System.out.println("Error: " + e.getMessage());
}
}
}
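ZipFile is a closeable resource, and while it is open it keeps a handle on the file, which (on Windows in particular) prevents file.delete() from succeeding. So either close() it in a finally block once you're done, or create it with try-with-resources (since Java 7):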
try(ZipFile zip = new ZipFile(file)){
//unzip here
}
file.delete();
Apart from this, you should revisit this block
dest.flush();
dest.close();
fos.flush();
fos.close();
is.close();
which is quite prone to resource leaks. If one of the earlier calls throws, none of the subsequent calls are invoked, leaving resources unclosed.
So it would be best to use try-with-resources here, too.
try(BufferedInputStream is = new BufferedInputStream(zip.getInputStream(entry));
FileOutputStream fos = new FileOutputStream(destFile);
BufferedOutputStream dest = new BufferedOutputStream(fos, BUFFER)) {
//write the data
} //all streams are closed implicitly here
Or use an existing tool for that, for example Apache Commons IO's IOUtils.closeQuietly(resource), or wrap every single call in
if(resource != null) {
try{
resource.close();
} catch(IOException e){
//omit
}
}
You could also omit the call to flush() which is done implicitly when closing the resource.
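Putting these pieces together, the whole unzip-then-delete flow could look roughly like the sketch below. The class and method names are made up for illustration. It uses java.nio.file.Files.delete rather than File.delete so that a failure raises an exception with a reason instead of silently returning false, and it adds a check (not in the original code) that rejects entries which would land outside the target folder.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class UnzipAndDelete {

    /** Extracts zipPath into targetDir, then deletes the zip file. */
    public static void unzipThenDelete(Path zipPath, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                Path dest = root.resolve(entry.getName()).normalize();
                if (!dest.startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(dest);
                    continue;
                }
                Files.createDirectories(dest.getParent());
                // The entry stream is closed even if the copy fails
                try (InputStream in = zip.getInputStream(entry)) {
                    Files.copy(in, dest, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        } // the ZipFile is closed here, releasing its handle on the file
        Files.delete(zipPath); // throws with a reason if the delete fails
    }
}
```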

How should I extract compressed folders in java?

I am using the following code to extract a zip file in Java.
import java.io.*;
import java.util.zip.*;
class testZipFiles
{
public static void main(String[] args)
{
try
{
String filename = "C:\\zip\\includes.zip";
testZipFiles list = new testZipFiles( );
list.getZipFiles(filename);
}
catch (Exception e)
{
e.printStackTrace();
}
}
public void getZipFiles(String filename)
{
try
{
String destinationname = "c:\\zip\\";
byte[] buf = new byte[1024];
ZipInputStream zipinputstream = null;
ZipEntry zipentry;
zipinputstream = new ZipInputStream(
new FileInputStream(filename));
zipentry = zipinputstream.getNextEntry();
while (zipentry != null)
{
//for each entry to be extracted
String entryName = zipentry.getName();
System.out.println("entryname "+entryName);
int n;
FileOutputStream fileoutputstream;
File newFile = new File(entryName);
String directory = newFile.getParent();
if(directory == null)
{
if(newFile.isDirectory())
break;
}
fileoutputstream = new FileOutputStream(
destinationname+entryName);
while ((n = zipinputstream.read(buf, 0, 1024)) > -1)
fileoutputstream.write(buf, 0, n);
fileoutputstream.close();
zipinputstream.closeEntry();
zipentry = zipinputstream.getNextEntry();
}//while
zipinputstream.close();
}
catch (Exception e)
{
e.printStackTrace();
}
}
}
Obviously this will not extract a folder tree because of the break statement. I tried to use recursion to process a folder tree but failed. Could someone show me how to improve this code so that it handles a folder tree instead of a single-level compressed folder?
You can use File.mkdirs() to create folders. Try changing your method like this:
public static void getZipFiles(String filename) {
try {
String destinationname = "c:\\zip\\";
byte[] buf = new byte[1024];
ZipInputStream zipinputstream = null;
ZipEntry zipentry;
zipinputstream = new ZipInputStream(
new FileInputStream(filename));
zipentry = zipinputstream.getNextEntry();
while (zipentry != null) {
//for each entry to be extracted
String entryName = destinationname + zipentry.getName();
entryName = entryName.replace('/', File.separatorChar);
entryName = entryName.replace('\\', File.separatorChar);
System.out.println("entryname " + entryName);
int n;
FileOutputStream fileoutputstream;
File newFile = new File(entryName);
if (zipentry.isDirectory()) {
if (!newFile.mkdirs()) {
break;
}
zipentry = zipinputstream.getNextEntry();
continue;
}
fileoutputstream = new FileOutputStream(entryName);
while ((n = zipinputstream.read(buf, 0, 1024)) > -1) {
fileoutputstream.write(buf, 0, n);
}
fileoutputstream.close();
zipinputstream.closeEntry();
zipentry = zipinputstream.getNextEntry();
}//while
zipinputstream.close();
} catch (Exception e) {
e.printStackTrace();
}
}
Another option is commons-compress, for which there is sample code on the site linked above.
I needed to do this because an API I was using required a File parameter, which you can't get from a resource in a JAR.
I found that the answer from @Emre didn't work correctly. For some reason ZipEntry skipped a few files in the JAR (with no apparent pattern). I fixed this by using JarEntry instead. There is also a bug in the above code: a file entry can be enumerated before its directory entry, which causes an exception because the directory hasn't been created yet.
Note that the code below depends on the Apache Commons utility classes.
/**
* Extract a directory in a JAR on the classpath to an output folder.
*
* Note: User's responsibility to ensure that the files are actually in a JAR.
* The way that I do this is to get the URI with
* URI url = getClass().getResource("/myresource").toURI();
* and then if url.isOpaque() we are in a JAR. There may be a more reliable
* way however, please edit this answer if you know of one.
*
* @param classInJar A class in the JAR file which is on the classpath
* @param resourceDirectory Path to resource directory in JAR
* @param outputDirectory Directory to write to
* @return String containing the path to the folder in the outputDirectory
* @throws IOException
*/
private static String extractDirectoryFromClasspathJAR(Class<?> classInJar, String resourceDirectory, String outputDirectory)
throws IOException {
// JAR entry names always use '/', even on Windows, so don't append File.separator
resourceDirectory = StringUtils.strip(resourceDirectory, "\\/") + "/";
URL jar = classInJar.getProtectionDomain().getCodeSource().getLocation();
//Note: If you want to extract from a named JAR, remove the above
//line and replace "jar.getFile()" below with the path to the JAR.
JarFile jarFile = new JarFile(new File(jar.getFile()));
byte[] buf = new byte[1024];
Enumeration<JarEntry> jarEntries = jarFile.entries();
while (jarEntries.hasMoreElements()) {
JarEntry jarEntry = jarEntries.nextElement();
if (jarEntry.isDirectory() || !jarEntry.getName().startsWith(resourceDirectory)) {
continue;
}
String outputFileName = FilenameUtils.concat(outputDirectory, jarEntry.getName());
//Create directories if they don't exist
new File(FilenameUtils.getFullPath(outputFileName)).mkdirs();
//Write file
FileOutputStream fileOutputStream = new FileOutputStream(outputFileName);
int n;
InputStream is = jarFile.getInputStream(jarEntry);
while ((n = is.read(buf, 0, 1024)) > -1) {
fileOutputStream.write(buf, 0, n);
}
is.close();
fileOutputStream.close();
}
jarFile.close();
String fullPath = FilenameUtils.concat(outputDirectory, resourceDirectory);
return fullPath;
}
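The ordering problem mentioned in this answer (a file entry arriving before its directory entry) also applies to the ZipInputStream approach, and it can be avoided with the JDK alone by creating each file's parent directories just before writing it. The following is a sketch, assuming Java 7+. The class name StreamUnzip is made up for illustration, and the check that entries stay inside the target folder is an addition that none of the answers above include.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class StreamUnzip {

    /** Extracts a zip stream into targetDir, whatever order the entries arrive in. */
    public static void extract(InputStream zipData, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        try (ZipInputStream zin = new ZipInputStream(zipData)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                Path dest = root.resolve(entry.getName()).normalize();
                if (!dest.startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(dest); // no error if it already exists
                } else {
                    // Create parents here in case no directory entry came first
                    Files.createDirectories(dest.getParent());
                    // Copies up to the end of the current entry; does not close zin
                    Files.copy(zin, dest, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }
}
```

Files.createDirectories does not fail when the directory already exists, which sidesteps the case where mkdirs() returns false for an existing folder.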

IOException - Access Denied Using FileOutputStream

I get the following IOException :
java.io.IOException: Access is denied
at java.io.WinNTFileSystem.createFileExclusively(Native Method)
at java.io.File.createNewFile(File.java:850)
at zipUnzipper.main(zipUnzipper.java:41)
When trying to run the following piece of code :
public class zipUnzipper {
public zipUnzipper() {
}
public static void main(String[] args){
//Unzip to temp folder. Add all files to mFiles. Print names of all files in mFfiles.
File file = new File("C:\\aZipFile.zip");
String filename = file.getName();
String filePathName = new String();
int o = filename.lastIndexOf('.');
filename = filename.substring(0,o);
try {
ZipFile zipFile = new ZipFile (file.getAbsoluteFile());
Enumeration entries = zipFile.entries();
while(entries.hasMoreElements()) {
ZipEntry zipEntry = (ZipEntry) entries.nextElement();
System.out.println("Unzipping: " + zipEntry.getName());
BufferedInputStream bis = new BufferedInputStream(zipFile.getInputStream(zipEntry));
byte[] buffer = new byte[2048];
filePathName = "C:\\TEMP\\"+filename+"\\";
File fileToWrite = new File(filePathName+ zipEntry.getName());
fileToWrite.mkdirs();
fileToWrite.createNewFile();
FileOutputStream fos = new FileOutputStream(fileToWrite);
BufferedOutputStream bos = new BufferedOutputStream( fos , buffer.length );
int size;
while ((size = bis.read(buffer, 0, buffer.length)) != -1) {
bos.write(buffer, 0, size);
}
bos.flush();
bos.close();
bis.close();
}
zipFile.close();
File folder = new File (filePathName);
File [] mFiles = folder.listFiles();
for (int x=0; x<mFiles.length; x++) {
System.out.println(mFiles[x].getAbsolutePath());
}
} catch (ZipException ze) {
ze.printStackTrace();
} catch (IOException ioe) {
ioe.printStackTrace();
}
}
}
It seems to me that for some reason the JVM can't create a new file. The code runs perfectly well if the files already exist. Is there some kind of access file which dictates whether the JVM can create a new file or am I simply doing something wrong?
Any help is much appreciated :-)
I'm running Java 1.4 and have been testing in JDeveloper in Windows XP.
The issue is that these calls step on each other:
fileToWrite.mkdirs(); //creates a directory e.g. C:\temp\foo\x
fileToWrite.createNewFile(); //attempts to create a file C:\temp\foo\x
The create operation fails because you have just created a directory with the same name as the file you want to create.
What you want to do instead is:
fileToWrite.getParentFile().mkdirs();
And also, the call to createNewFile() is unnecessary.
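As a small self-contained illustration of that fix, the helper below creates only the parent directories and lets FileOutputStream create the file itself. The class and method names (ParentDirs, openForWrite) are made up for this sketch.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class ParentDirs {

    /** Opens target for writing, creating any missing parent directories first. */
    public static OutputStream openForWrite(File target) throws IOException {
        File parent = target.getParentFile();
        // mkdirs() returns false if the directory already exists, so check that first
        if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
            throw new IOException("Cannot create directory: " + parent);
        }
        // FileOutputStream creates the file; no createNewFile() call is needed
        return new FileOutputStream(target);
    }
}
```

Calling mkdirs() on the target itself, as in the question, would instead turn the last path element into a directory, which is why the later file creation is denied.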
Based on your code, the following "unzips" a zip file:
import java.io.*;
import java.util.zip.ZipFile;
import java.util.zip.ZipEntry;
import java.util.Enumeration;
public class Unzipper {
public static void main(String[] args)
throws IOException {
final File file = new File(args[0]);
final ZipFile zipFile = new ZipFile(file);
final byte[] buffer = new byte[2048];
// use file.getName(): zipFile.getName() returns the full path of the zip file
final File tmpDir = new File(System.getProperty("java.io.tmpdir"), file.getName());
if(!tmpDir.mkdir() && !tmpDir.exists()) {
System.err.println("Cannot create: " + tmpDir);
System.exit(1);
}
for(final Enumeration entries = zipFile.entries(); entries.hasMoreElements();) {
final ZipEntry zipEntry = (ZipEntry) entries.nextElement();
System.out.println("Unzipping: " + zipEntry.getName());
final InputStream is = zipFile.getInputStream(zipEntry);
final File fileToWrite = new File(tmpDir, zipEntry.getName());
final File folder = fileToWrite.getParentFile();
if(!folder.mkdirs() && !folder.exists()) {
System.err.println("Cannot create: " + folder);
System.exit(1);
}
if(!zipEntry.isDirectory()) {
//No need to use buffered streams since we're doing our own buffering
final FileOutputStream fos = new FileOutputStream(fileToWrite);
int size;
while ((size = is.read(buffer)) != -1) {
fos.write(buffer, 0, size);
}
fos.close();
is.close();
}
}
zipFile.close();
}
}
Disclaimer: I haven't tested it beyond the very basics.
Why are you calling createNewFile()? Just create the FileOutputStream.
It could also be that, in the context where you are launching the application, you don't have access rights to the location where you are trying to create the file. Launch the app as an administrator or create the file in the project folder.
