Tar problem with apache commons compress - java

I'm having a hard time trying to tar some files using the Commons Compress library.
My code is the following, and is taken from the commons.compress wiki examples:
private static File createTarFile(String[] filePaths, String saveAs) throws Exception{
File tarFile = new File(saveAs);
OutputStream out = new FileOutputStream(tarFile);
TarArchiveOutputStream aos = (TarArchiveOutputStream) new ArchiveStreamFactory().createArchiveOutputStream("tar", out);
for(String filePath : filePaths){
File file = new File(filePath);
TarArchiveEntry entry = new TarArchiveEntry(file);
entry.setSize(file.length());
aos.putArchiveEntry(entry);
IOUtils.copy(new FileInputStream(file), aos);
aos.closeArchiveEntry();
}
aos.finish();
out.close();
return tarFile;
}
There is no error during the process, but when I try to untar the file, I get the following:
XXXX:XXXX /home/XXXX$ tar -xf typeCommandes.tar
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
Also, the archive is slightly smaller in size than the original file, which isn't normal for a tar, so there definitely is a problem...
-rw-r--r-- 1 XXXX nobody 12902400 Jan 14 17:11 typeCommandes.tar
-rw-r--r-- 1 XXXX nobody 12901888 Jan 14 17:16 typeCommandes.csv
Can anyone tell me what I'm doing wrong? Thanks.

You're not closing the TarArchiveOutputStream. Add aos.close() after aos.finish()
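One way to check this diagnosis without any extra library is to look at the end of the generated file. The tar format ends an archive with at least two 512-byte blocks of zeros, and an archive whose buffered tail was never flushed will normally lack them. In the listing from the question the two sizes differ by exactly 512 bytes, one tar block, which would leave no room for that trailer next to the entry header. The sketch below is only a heuristic under those assumptions; `TarTrailerCheck` and its methods are hypothetical names, not part of Commons Compress, and no headers or checksums are validated.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class TarTrailerCheck {
    private static final int BLOCK_SIZE = 512;

    /** True if the length is block-aligned and the supplied last 1024 bytes are all zero. */
    static boolean hasEndOfArchiveMarker(long archiveLength, byte[] lastTwoBlocks) {
        if (archiveLength < 2L * BLOCK_SIZE || archiveLength % BLOCK_SIZE != 0) {
            return false;
        }
        if (lastTwoBlocks.length != 2 * BLOCK_SIZE) {
            return false;
        }
        for (byte b : lastTwoBlocks) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    /** Reads the last 1024 bytes of the file at {@code path} and applies the check above. */
    static boolean hasEndOfArchiveMarker(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long length = raf.length();
            if (length < 2L * BLOCK_SIZE) {
                return false;
            }
            byte[] tail = new byte[2 * BLOCK_SIZE];
            raf.seek(length - tail.length);
            raf.readFully(tail);
            return hasEndOfArchiveMarker(length, tail);
        }
    }
}
```

Calling `TarTrailerCheck.hasEndOfArchiveMarker("typeCommandes.tar")` before and after adding `aos.close()` should show the difference. A data file that itself ends in 1024 zero bytes would fool this check.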

A small correction to the code above.
It does not close the input stream, and the Apache library assumes that the stream is managed by the calling client.
See the fix below (put this code after the line 'aos.putArchiveEntry(entry)'):
FileInputStream fis = new FileInputStream(fileForPuttingIntoTar);
IOUtils.copy(fis, aos);
fis.close();
aos.closeArchiveEntry();
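On Java 7 or later, try-with-resources closes the input stream even when the copy throws. The following is a minimal sketch using only the JDK. `EntryCopier.copyInto` is a hypothetical helper, and its `OutputStream` parameter stands in for the `TarArchiveOutputStream` from the question, which deliberately stays open so the next entry can be written.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class EntryCopier {
    /** Copies the file's bytes into {@code target}; closes the input but leaves the target open. */
    static long copyInto(Path source, OutputStream target) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        try (InputStream in = Files.newInputStream(source)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                target.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

In the loop from the question this would be called as `EntryCopier.copyInto(file.toPath(), aos);` between `aos.putArchiveEntry(entry)` and `aos.closeArchiveEntry()`.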

The example at http://commons.apache.org/compress/examples.html uses the method putNextEntry(entry), which you seem to omit.

See also my answer here
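import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import org.apache.commons.compress.archivers.ArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;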
public class TarUpdater {
private static final int buffersize = 8048;
public static void updateFile(File tarFile, File[] flist) throws IOException {
// get a temp file
File tempFile = File.createTempFile(tarFile.getName(), null);
// delete it, otherwise you cannot rename your existing tar to it.
if (tempFile.exists()) {
tempFile.delete();
}
if (!tarFile.exists()) {
tarFile.createNewFile();
}
boolean renameOk = tarFile.renameTo(tempFile);
if (!renameOk) {
throw new RuntimeException(
"could not rename the file " + tarFile.getAbsolutePath() + " to " + tempFile.getAbsolutePath());
}
byte[] buf = new byte[buffersize];
TarArchiveInputStream tin = new TarArchiveInputStream(new FileInputStream(tempFile));
OutputStream outputStream = new BufferedOutputStream(Files.newOutputStream(tarFile.toPath()));
TarArchiveOutputStream tos = new TarArchiveOutputStream(outputStream);
tos.setLongFileMode(TarArchiveOutputStream.LONGFILE_POSIX);
//read from previous version of tar file
ArchiveEntry entry = tin.getNextEntry();
while (entry != null) {//previous file have entries
String name = entry.getName();
boolean notInFiles = true;
for (File f : flist) {
if (f.getName().equals(name)) {
notInFiles = false;
break;
}
}
if (notInFiles) {
// Add TAR entry to output stream.
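if (!entry.isDirectory()) {
// The new header must carry the entry size, otherwise writing the data fails
TarArchiveEntry copy = new TarArchiveEntry(name);
copy.setSize(entry.getSize());
tos.putArchiveEntry(copy);
// Transfer bytes from the TAR file to the output file
int len;
while ((len = tin.read(buf)) > 0) {
tos.write(buf, 0, len);
}
tos.closeArchiveEntry();
}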
}
entry = tin.getNextEntry();
}
// Close the streams
tin.close();//finished reading existing entries
// Compress new files
for (int i = 0; i < flist.length; i++) {
if (flist[i].isDirectory()) {
continue;
}
InputStream fis = new FileInputStream(flist[i]);
TarArchiveEntry te = new TarArchiveEntry(flist[i],flist[i].getName());
//te.setSize(flist[i].length());
tos.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);
tos.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_POSIX); // same value as the literal 2
tos.putArchiveEntry(te); // Add TAR entry to output stream.
// Transfer bytes from the file to the TAR file
int count = 0;
while ((count = fis.read(buf, 0, buffersize)) != -1) {
tos.write(buf, 0, count);
}
tos.closeArchiveEntry();
fis.close();
}
// Complete the TAR file
tos.close();
tempFile.delete();
}
}

Why does my program skip files when unzipping by java.util.zip?

I have read quite a few articles, but I did not find a similar problem or a solution.
I'm trying to read all the files in an archive, but some are skipped by zis.getNextEntry:
public static void main(String[] args) throws Exception {
String fileZip = "src/main/resources/unzipTest/fias_xml.zip";
ZipInputStream zis = new ZipInputStream(new FileInputStream(fileZip));
ZipEntry entry;
while ((entry = zis.getNextEntry()) != null) {
System.out.println(entry.getName());
}
}
}
But if you unzip it with WinRAR, for example, everything is unzipped correctly.
Archive files
After running the program
How can I find out why some files are not read?
Can the archive be broken?
After I unzipped and re-zipped the files using WinRAR, the program worked correctly. Why was WinRAR able to do this, but the Java code was not?
zipArchive
jdk1.8.0_161
Based on the test I ran, I was able to print each directory and file name correctly.
Two scenarios come to mind:
i) The file name length or the complete path length is more than the platform can handle. But that would also be the case when unzipping with WinRAR.
ii) There is a permission issue, but again that would not affect files selectively.
Could you please let me know which JDK version you are using?
Would you be able to send me the zip file, so I can try to reproduce the problem?
public void unzip(String zipFile, String destDir)
{
try
{
int BUFFER = 8*1024;
File file = new File(zipFile);
ZipFile zip = new ZipFile(file);
String newPath = destDir;
new File(newPath).mkdir();
Enumeration zipFileEntries = zip.entries();
while (zipFileEntries.hasMoreElements())
{
ZipEntry entry = (ZipEntry) zipFileEntries.nextElement();
String currentEntry = entry.getName();
File destFile = new File(newPath, currentEntry);
File destinationParent = destFile.getParentFile();
destinationParent.mkdirs();
if (!entry.isDirectory())
{
BufferedInputStream is = new BufferedInputStream(zip
.getInputStream(entry));
int currentByte;
byte[] data = new byte[BUFFER];
FileOutputStream fos = new FileOutputStream(destFile);
BufferedOutputStream dest = new BufferedOutputStream(fos,
BUFFER);
while ((currentByte = is.read(data, 0, BUFFER)) != -1) {
dest.write(data, 0, currentByte);
}
dest.flush();
dest.close();
is.close();
}
}
}
catch (Exception e)
{
System.out.println(e.getMessage());
}
}
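To narrow down where entries go missing, it can help to list the same archive in both ways the JDK offers. `ZipInputStream` walks the local file headers from the start of the file and stops at the first record it does not recognise as one, while `ZipFile` reads the central directory at the end of the file, which is also what most desktop tools rely on. The sketch below (the class and method names are hypothetical) reports names that the central directory knows about but the streaming reader never returned. It does not say why they were skipped.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

final class ZipListingDiff {
    /** Entry names found by walking local headers from the start of the file. */
    static List<String> namesFromStream(File zip) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(new FileInputStream(zip))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    /** Entry names recorded in the central directory at the end of the file. */
    static List<String> namesFromCentralDirectory(File zip) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zf = new ZipFile(zip)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    /** Names present in the central directory that the streaming reader never returned. */
    static List<String> missingFromStream(File zip) throws IOException {
        List<String> missing = namesFromCentralDirectory(zip);
        missing.removeAll(namesFromStream(zip));
        return missing;
    }
}
```

If `missingFromStream` returns a non-empty list for the archive in question, reading it through `ZipFile`, as the `unzip` method above does, is a reasonable workaround.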

Reading gzip files inside gzip file using Java

Using Java, I have to read text files that are inside .gz files, which are themselves inside a .tar.gz archive.
The archive is named gz_ltm_logs.tar.gz. It contains the files ltm.1.gz, ltm.2.gz and so on, and each of those contains a text file.
I would like to do this using java.util.zip.* only, but if that is impossible I can look at other libraries.
I thought I would be able to do it with java.util.zip, but it doesn't seem straightforward.
Here's some code to give you an idea. This method will try to extract a given tar.gz file to outputFolder.
public static void extract(File input, File outputFolder) throws IOException {
byte[] buffer = new byte[1024];
GZIPInputStream gzipFile = new GZIPInputStream(new FileInputStream(input));
ByteOutputStream tarStream = new ByteOutputStream();
int gzipLengthRead;
while ((gzipLengthRead = gzipFile.read(buffer)) > 0){
tarStream.write(buffer, 0, gzipLengthRead);
}
gzipFile.close();
org.apache.tools.tar.TarInputStream tarFile = null;
// files inside the tar
OutputStream out = null;
try {
tarFile = new org.apache.tools.tar.TarInputStream(tarStream.newInputStream());
tarStream.close();
TarEntry entry = null;
while ((entry = tarFile.getNextEntry()) != null) {
String outFilename = entry.getName();
if (entry.isDirectory()) {
File directory = new File(outputFolder, outFilename);
directory.mkdirs();
} else {
File outputFile = new File(outputFolder, outFilename);
File outputDirectory = outputFile.getParentFile();
if (!outputDirectory.exists()) {
outputDirectory.mkdirs();
}
out = new FileOutputStream(outputFile);
// Transfer bytes from the tarFile to the output file
int innerLen;
while ((innerLen = tarFile.read(buffer)) > 0) {
out.write(buffer, 0, innerLen);
}
out.close();
}
}
} finally {
if (tarFile != null) {
tarFile.close();
}
if (out != null) {
out.close();
}
}
}
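Since the question asks for java.util.zip only, here is a rough sketch of that route. The JDK has no tar reader, so this walks the 512-byte tar headers by hand (name at offset 0, octal size at offset 124, type flag at offset 156) and gunzips every inner `*.gz` entry in memory. It is written under several assumptions that may not hold for real archives: plain ustar-style headers without GNU long names or PAX extensions, entries small enough to fit in memory, and UTF-8 text. `NestedGzipReader` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;

final class NestedGzipReader {
    private static final int BLOCK = 512;

    /** Maps each regular *.gz entry of a .tar.gz stream to its decompressed text. */
    static Map<String, String> readGzEntries(InputStream tarGz) throws IOException {
        Map<String, String> texts = new LinkedHashMap<>();
        DataInputStream tar = new DataInputStream(new GZIPInputStream(tarGz));
        byte[] header = new byte[BLOCK];
        while (true) {
            tar.readFully(header); // throws EOFException if the archive is truncated
            if (header[0] == 0) {
                break; // a header whose name starts with NUL is treated as end of archive
            }
            String name = text(header, 0, 100);
            String octalSize = text(header, 124, 12).trim();
            int size = octalSize.isEmpty() ? 0 : Integer.parseInt(octalSize, 8);
            byte type = header[156];
            byte[] data = new byte[size];
            tar.readFully(data);
            tar.readFully(new byte[(BLOCK - size % BLOCK) % BLOCK]); // skip block padding
            boolean regularFile = type == '0' || type == 0;
            if (regularFile && name.endsWith(".gz")) {
                texts.put(name, gunzip(data));
            }
        }
        return texts;
    }

    /** Decodes a NUL-terminated ASCII field from a tar header block. */
    private static String text(byte[] block, int offset, int length) {
        int end = offset;
        while (end < offset + length && block[end] != 0) {
            end++;
        }
        return new String(block, offset, end - offset, StandardCharsets.US_ASCII);
    }

    /** Decompresses one gzip member held in memory and decodes it as UTF-8. */
    private static String gunzip(byte[] gz) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

It would be called with something like `NestedGzipReader.readGzEntries(new FileInputStream("gz_ltm_logs.tar.gz"))`. For large logs or archives produced with long path names, a dedicated tar library is likely the safer choice.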

Extracting zip file into a folder throws "Invalid entry size (expected 46284 but got 46285 bytes)" for one of the entries

When I try to extract the zip file into a folder using the code below, I get the error "Invalid entry size (expected 46284 but got 46285 bytes)" for one of the entries (a text file), and the extraction stops abruptly. My zip file contains around 12 text files and 20 TIF files. The problem occurs on the text file, and the extraction cannot proceed because execution jumps into the catch block.
I see this problem only on the production server, which runs on Unix; there is no problem on the other servers (Dev, Test, UAT).
The zip arrives on the server through an external team that does the file transfer, and then my code starts extracting it.
...
int BUFFER = 2048;
java.io.BufferedOutputStream dest = null;
String ZipExtractDir = "/y34/ToBeProcessed/";
java.io.File MyDirectory = new java.io.File(ZipExtractDir);
MyDirectory.mkdir();
ZipFilePath = "/y34/work_ZipResults/Test.zip";
// Creating fileinputstream for zip file
java.io.FileInputStream fis = new java.io.FileInputStream(ZipFilePath);
// Creating zipinputstream for using fileinputstream
java.util.zip.ZipInputStream zis = new java.util.zip.ZipInputStream(new java.io.BufferedInputStream(fis));
java.util.zip.ZipEntry entry;
while ((entry = zis.getNextEntry()) != null)
{
int count;
byte data[] = new byte[BUFFER];
java.io.File f = new java.io.File(ZipExtractDir + "/" + entry.getName());
// write the files to the directory created above
java.io.FileOutputStream fos = new java.io.FileOutputStream(ZipExtractDir + "/" + entry.getName());
dest = new java.io.BufferedOutputStream(fos, BUFFER);
while ((count = zis.read(data, 0, BUFFER)) != -1)
{
dest.write(data, 0, count);
}
dest.flush();
dest.close();
}
zis.closeEntry();
zis.close();
}
catch (Exception Ex)
{
System.out.println("Exception in \"ExtractZIPFiles\"---- " + Ex.getMessage());
}
I don't fully understand the problem you're running into, but here is the method I use to unzip an archive:
public static void unzip(File zip, File extractTo) throws IOException {
ZipFile archive = new ZipFile(zip);
Enumeration<? extends ZipEntry> e = archive.entries();
while (e.hasMoreElements()) {
ZipEntry entry = e.nextElement();
File file = new File(extractTo, entry.getName());
if (entry.isDirectory()) {
file.mkdirs();
} else {
if (!file.getParentFile().exists()) {
file.getParentFile().mkdirs();
}
InputStream in = archive.getInputStream(entry);
BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(file));
IOUtils.copy(in, out);
in.close();
out.close();
}
}
}
Calling:
File zip = new File("/path/to/my/file.zip");
File extractTo = new File("/path/to/my/destination/folder");
unzip(zip, extractTo);
I have never had any issue with the code above, so I hope it helps.
Off the top of my head, I could think of these reasons:
There could be problem with the encoding of the text file.
The file needs to be read/transferred in "binary" mode.
There could be an issue with the line ending \n or \r\n
The file could simply be corrupt. Try opening the file with a zip utility.
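The message in the question is what java.util.zip raises when the number of bytes read for an entry differs from the size recorded for it, and a text file that gained one byte would be consistent with a line-ending conversion during a non-binary transfer, although that is only a guess. To test the corruption theory on the server itself, the following sketch reads every entry through `ZipFile` and compares its own CRC-32 and byte count with the values stored in the archive. `ZipVerifier` is a hypothetical name, and the check says nothing about how the damage happened.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class ZipVerifier {
    /** Returns the names of entries whose data does not match the recorded size or CRC-32. */
    static List<String> findDamagedEntries(File zip) throws IOException {
        List<String> damaged = new ArrayList<>();
        byte[] buffer = new byte[8192];
        try (ZipFile archive = new ZipFile(zip)) {
            Enumeration<? extends ZipEntry> entries = archive.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                CRC32 crc = new CRC32();
                long bytes = 0;
                try (InputStream in = archive.getInputStream(entry)) {
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        crc.update(buffer, 0, n);
                        bytes += n;
                    }
                } catch (IOException e) {
                    // Data that cannot even be decompressed counts as damaged too
                    damaged.add(entry.getName() + " (" + e.getMessage() + ")");
                    continue;
                }
                boolean sizeOk = entry.getSize() == -1 || entry.getSize() == bytes;
                boolean crcOk = entry.getCrc() == -1 || entry.getCrc() == crc.getValue();
                if (!sizeOk || !crcOk) {
                    damaged.add(entry.getName());
                }
            }
        }
        return damaged;
    }
}
```

Running `ZipVerifier.findDamagedEntries(new File("/y34/work_ZipResults/Test.zip"))` on the production copy and on a copy from another environment should show whether the archive itself differs between servers.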

Java: Maintaining zipped files Modified Date

A proprietary program that I'm working with zips up and extracts certain files without changing the modified date of the files when unzipping. I'm also creating my own zip and extraction tool based off the source code in our program but when I'm unzipping the files the modified date of all zipped files is showing with the unzip time & date. Here's the code for my extraction:
public static int unzipFiles(File zipFile, File extractDir) throws Exception
{
int totalFileCount = 0;
String zipFilePath = zipFile.getPath();
System.out.println("Zip File Path: " + zipFilePath);
ZipFile zfile = new ZipFile(zipFile);
System.out.println("Size of ZipFile: "+zfile.size());
Enumeration<? extends ZipEntry> entries = zfile.entries();
while (entries.hasMoreElements())
{
ZipEntry entry = entries.nextElement();
System.out.println("ZipEntry File: " + entry.getName());
File file = new File(extractDir, entry.getName());
if (entry.isDirectory())
{
System.out.println("Creating Directory");
file.mkdirs();
}
else
{
file.getParentFile().mkdirs();
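InputStream in = zfile.getInputStream(entry);
// copy() takes an OutputStream, so open one for the target file
OutputStream out = new FileOutputStream(file);
try
{
copy(in, out);
}
finally
{
out.close();
in.close();
}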
}
totalFileCount++;
}
return totalFileCount;
}
private static void copy(InputStream in, OutputStream out) throws IOException
{
byte[] buffer = new byte[1024];
System.out.println("InputStream/OutputStram copy");
while (true)
{
int readCount = in.read(buffer);
if (readCount < 0)
{
break;
}
out.write(buffer, 0, readCount);
}
}
I'm sure there is a better way to do this than the InputStream/OutputStream copy. I'm fairly sure the copy is the culprit, since extracting with WinRAR does not change the dates of the files I zipped.
Use ZipEntry.getTime to get the last-modified time and File.setLastModified to set it on the file after you are done copying it.
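A minimal sketch of that suggestion, using only the JDK. The class name is hypothetical, and the timestamp is applied after the output stream is closed because writing to the file would update it again. Zip timestamps in the classic format have a two-second resolution, so small differences from the original are expected. The sketch also does not guard against entry names that point outside the target directory.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class TimestampPreservingUnzip {
    /** Extracts all entries and restores each file's last-modified time; returns the file count. */
    static int unzip(File zipFile, File extractDir) throws IOException {
        int fileCount = 0;
        byte[] buffer = new byte[8192];
        try (ZipFile archive = new ZipFile(zipFile)) {
            Enumeration<? extends ZipEntry> entries = archive.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                File target = new File(extractDir, entry.getName());
                if (entry.isDirectory()) {
                    target.mkdirs();
                    continue;
                }
                File parent = target.getParentFile();
                if (parent != null) {
                    parent.mkdirs();
                }
                try (InputStream in = archive.getInputStream(entry);
                     OutputStream out = new FileOutputStream(target)) {
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                }
                // Streams are closed here, so the time set below is not overwritten
                long time = entry.getTime();
                if (time != -1) {
                    target.setLastModified(time);
                }
                fileCount++;
            }
        }
        return fileCount;
    }
}
```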

IOException - Access Denied Using FileOutputStream

I get the following IOException :
java.io.IOException: Access is denied
at java.io.WinNTFileSystem.createFileExclusively(Native Method)
at java.io.File.createNewFile(File.java:850)
at zipUnzipper.main(zipUnzipper.java:41)
When trying to run the following piece of code :
public class zipUnzipper {
public zipUnzipper() {
}
public static void main(String[] args){
//Unzip to temp folder. Add all files to mFiles. Print names of all files in mFfiles.
File file = new File("C:\\aZipFile.zip");
String filename = file.getName();
String filePathName = new String();
int o = filename.lastIndexOf('.');
filename = filename.substring(0,o);
try {
ZipFile zipFile = new ZipFile (file.getAbsoluteFile());
Enumeration entries = zipFile.entries();
while(entries.hasMoreElements()) {
ZipEntry zipEntry = (ZipEntry) entries.nextElement();
System.out.println("Unzipping: " + zipEntry.getName());
BufferedInputStream bis = new BufferedInputStream(zipFile.getInputStream(zipEntry));
byte[] buffer = new byte[2048];
filePathName = "C:\\TEMP\\"+filename+"\\";
File fileToWrite = new File(filePathName+ zipEntry.getName());
fileToWrite.mkdirs();
fileToWrite.createNewFile();
FileOutputStream fos = new FileOutputStream(fileToWrite);
BufferedOutputStream bos = new BufferedOutputStream( fos , buffer.length );
int size;
while ((size = bis.read(buffer, 0, buffer.length)) != -1) {
bos.write(buffer, 0, size);
}
bos.flush();
bos.close();
bis.close();
}
zipFile.close();
File folder = new File (filePathName);
File [] mFiles = folder.listFiles();
for (int x=0; x<mFiles.length; x++) {
System.out.println(mFiles[x].getAbsolutePath());
}
} catch (ZipException ze) {
ze.printStackTrace();
} catch (IOException ioe) {
ioe.printStackTrace();
}
}
}
It seems to me that for some reason the JVM can't create a new file. The code runs perfectly well if the files already exist. Is there some kind of access file which dictates whether the JVM can create a new file or am I simply doing something wrong?
Any help is much appreciated :-)
I'm running Java 1.4 and have been testing in JDeveloper in Windows XP.
The issue is that these calls step on each other:
fileToWrite.mkdirs(); //creates a directory e.g. C:\temp\foo\x
fileToWrite.createNewFile(); //attempts to create a file C:\temp\foo\x
The create operation fails because you just created a directory with the same name as the file you want to create.
What you want to do instead is:
fileToWrite.getParentFile().mkdirs()
And also, the call to createNewFile() is unnecessary.
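Put together, the pattern could look like the sketch below. `TargetFiles.openForWrite` is a hypothetical helper, not something from the question. It creates only the missing parent directories and lets `FileOutputStream` create the file itself.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class TargetFiles {
    /** Creates missing parent directories (never the file itself) and opens the file for writing. */
    static OutputStream openForWrite(File target) throws IOException {
        File parent = target.getParentFile();
        if (parent != null && !parent.mkdirs() && !parent.isDirectory()) {
            throw new IOException("Cannot create directory: " + parent);
        }
        // FileOutputStream creates the file if it does not exist, so createNewFile() is not needed
        return new FileOutputStream(target);
    }
}
```

In the loop from the question, `new FileOutputStream(fileToWrite)` would become `TargetFiles.openForWrite(fileToWrite)`, and the `mkdirs()` and `createNewFile()` calls on `fileToWrite` would be dropped. Entries that are directories still need to be skipped before opening a stream.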
Based on your code, the following "unzips" a zip file:
import java.io.*;
import java.util.zip.ZipFile;
import java.util.zip.ZipEntry;
import java.util.Enumeration;
public class Unzipper {
public static void main(String[] args)
throws IOException {
final File file = new File(args[0]);
final ZipFile zipFile = new ZipFile(file);
final byte[] buffer = new byte[2048];
// ZipFile.getName() returns the full path of the archive, so use the plain file name here
final File tmpDir = new File(System.getProperty("java.io.tmpdir"), file.getName());
if(!tmpDir.mkdir() && !tmpDir.exists()) {
System.err.println("Cannot create: " + tmpDir);
System.exit(1);
}
for(final Enumeration entries = zipFile.entries(); entries.hasMoreElements();) {
final ZipEntry zipEntry = (ZipEntry) entries.nextElement();
System.out.println("Unzipping: " + zipEntry.getName());
final InputStream is = zipFile.getInputStream(zipEntry);
final File fileToWrite = new File(tmpDir, zipEntry.getName());
final File folder = fileToWrite.getParentFile();
if(!folder.mkdirs() && !folder.exists()) {
System.err.println("Cannot create: " + folder);
System.exit(0);
}
if(!zipEntry.isDirectory()) {
//No need to use buffered streams since we're doing our own buffering
final FileOutputStream fos = new FileOutputStream(fileToWrite);
int size;
while ((size = is.read(buffer)) != -1) {
fos.write(buffer, 0, size);
}
fos.close();
is.close();
}
}
zipFile.close();
}
}
Disclaimer: I haven't tested it beyond the very basics.
Why are you calling createNewFile()? Just create the FileOutputStream.
It could also be that, in the context where you are launching the application, you don't have access rights to the location where you are trying to create the file. Launch the app as admin or create the file in the project folder.
