Creating a gzip archive using Apache Commons Compress - java

I managed to create a gz archive with the expected content, but how can I set the filename inside the archive?
I mean, if the archive myfile.gz is created, the file inside it is named "myfile", but I want it to be named like the source file, for example "1.txt".
Current code:
public static void gz() throws FileNotFoundException, IOException {
GZIPOutputStream out = null;
String filePaths[] = {"C:/Temp/1.txt","C:/Temp/2.txt"};
try {
out = new GZIPOutputStream(
new BufferedOutputStream(new FileOutputStream("C:/Temp/myfile.gz")));
RandomAccessFile f = new RandomAccessFile(filePaths[0], "r");
byte[] b = new byte[(int)f.length()];
f.read(b);
out.write(b, 0, b.length);
out.finish();
out.close();
} finally {
if(out != null) out.close();
}
}

GZip compresses a single stream. Typically, when people use GZip with multiple files, they also use tar to bundle them together.
See also: gzip archive with multiple files inside
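On the original question of the inner name: the gzip format has an optional "original file name" header field, and java.util.zip.GZIPOutputStream does not expose it. Apache Commons Compress can, as far as I know, set it through GzipParameters passed to GzipCompressorOutputStream. The sketch below only illustrates the header layout using the JDK; the class name GzipWithName and its methods are hypothetical, and it compresses a byte array that is already in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: writes one gzip member whose header carries a file name.
class GzipWithName {

    static void write(OutputStream out, String fileName, byte[] data) throws IOException {
        // Fixed 10-byte header: magic 1f 8b, method 8 (deflate), flags 0x08 (FNAME),
        // 4 bytes of mtime (0 = unknown), extra flags 0, OS 255 (unknown).
        out.write(new byte[] {0x1f, (byte) 0x8b, 8, 0x08, 0, 0, 0, 0, 0, (byte) 0xff});
        // FNAME field: ISO-8859-1 bytes followed by a zero terminator.
        out.write(fileName.getBytes(StandardCharsets.ISO_8859_1));
        out.write(0);

        // Body: raw deflate data (nowrap = true, so no zlib wrapper).
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        DeflaterOutputStream body = new DeflaterOutputStream(out, deflater);
        body.write(data);
        body.finish();   // emit the remaining compressed bytes, keep 'out' open
        deflater.end();

        // Trailer: CRC-32 and uncompressed length, both 32-bit little-endian.
        CRC32 crc = new CRC32();
        crc.update(data);
        writeIntLE(out, (int) crc.getValue());
        writeIntLE(out, data.length);
    }

    private static void writeIntLE(OutputStream out, int v) throws IOException {
        out.write(v);
        out.write(v >>> 8);
        out.write(v >>> 16);
        out.write(v >>> 24);
    }

    // Reads the archive back with the standard GZIPInputStream to show it is valid gzip.
    static byte[] read(byte[] gz) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

For the question's case this would be called with the bytes of C:/Temp/1.txt, the name "1.txt", and a FileOutputStream for C:/Temp/myfile.gz. Tools that honour the stored name (for example gunzip -N) should then restore "1.txt".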

Split/join a binary file into multiple parts without loading file into memory?

In Java, how do you split a binary file into multiple parts while only loading a small portion of the File into memory at one time?
So I have a file FullFile that is large. I need to upload it to cloud storage but it's so large that it often times out.
I can make this problem less likely if I split the file and upload in chunks.
So I need to split FullFile into files of chunk size MaxChunkSize.
List<File> fileSplit(File fullFile, int maxChunkSize)
File fileJoin(List<File> splitFiles)
Most code snippets around require the file to be text. But in my case the files are compressed binary.
What would be the best way to implement these methods?
Below is the full answer:
The maxChunkSize represents the size in bytes of a file chunk.
In the example below I read a 5 MB zip file, split it into five 1 MB chunks, and later join them back using the fileJoin function.
The method stageLocally stages the files locally, but you can modify it to work with any cloud storage. (It is better to abstract this out so you can switch between multiple storage implementations.)
You can tweak maxChunkSize based on the amount of data you want to hold in memory at a given time.
The IOUtils.copy() method is from the Apache Commons IO library (available via Maven). You can also use Files.copy() in lieu of it. Files.copy() comes from the java.nio.file package, so you don't have to add an external dependency to use it.
I have omitted the exception handling for brevity.
public static void main(String[] args) throws IOException {
File input = new File(_5_MB_FILE_PATH);
File outPut = fileJoin(split(input, 1_024_000));
System.out.println(IOUtils.contentEquals(Files.newInputStream(input.toPath()), Files.newInputStream(outPut.toPath())));
}
public static List<File> split(File largeFile, int maxChunkSize) throws IOException {
List<File> list = new ArrayList<>();
final byte[] buffer = new byte[maxChunkSize];
// try-with-resources closes the input stream even if staging a chunk fails
try (InputStream in = Files.newInputStream(largeFile.toPath())) {
// readNBytes (Java 9+) reads until the buffer is full or EOF; a plain read() may return fewer bytes
int dataRead;
while ((dataRead = in.readNBytes(buffer, 0, buffer.length)) > 0) {
list.add(stageLocally(buffer, dataRead));
}
}
return list;
}
private static File stageLocally(byte[] buffer, int length) throws IOException {
File outPutFile = File.createTempFile("temp-", "split", new File(TEMP_DIRECTORY));
FileOutputStream fos = new FileOutputStream(outPutFile);
fos.write(buffer, 0, length);
fos.close();
return outPutFile;
}
public static File fileJoin(List<File> list) throws IOException {
File outPutFile = File.createTempFile("temp-", "unsplit", new File(TEMP_DIRECTORY));
FileOutputStream fileOutputStream = new FileOutputStream(outPutFile);
for (File file : list) {
InputStream in = Files.newInputStream(file.toPath());
IOUtils.copy(in, fileOutputStream);
in.close();
}
fileOutputStream.close();
return outPutFile;
}
Let me know if this helps.
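The code above holds one full chunk (maxChunkSize bytes) in memory at a time. If even that is too much, a possible variant is to let FileChannel.transferTo move the bytes, so no chunk-sized buffer is allocated in Java code. This is only a sketch under assumptions: the class name ChannelSplitter, the "part-" prefix, and the target directory parameter are hypothetical, and maxChunkSize must be greater than zero.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits and joins files via channel-to-channel transfers.
class ChannelSplitter {

    static List<Path> split(Path source, Path dir, long maxChunkSize) throws IOException {
        List<Path> parts = new ArrayList<>();
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            long size = in.size();
            for (long pos = 0; pos < size; pos += maxChunkSize) {
                long chunk = Math.min(maxChunkSize, size - pos);
                Path part = Files.createTempFile(dir, "part-", ".bin");
                try (FileChannel out = FileChannel.open(part, StandardOpenOption.WRITE)) {
                    // transferTo may move fewer bytes than requested, so loop until done
                    long done = 0;
                    while (done < chunk) {
                        done += in.transferTo(pos + done, chunk - done, out);
                    }
                }
                parts.add(part);
            }
        }
        return parts;
    }

    static Path join(List<Path> parts, Path target) throws IOException {
        try (FileChannel out = FileChannel.open(target, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            for (Path part : parts) {
                try (FileChannel in = FileChannel.open(part, StandardOpenOption.READ)) {
                    long size = in.size();
                    long done = 0;
                    while (done < size) {
                        done += in.transferTo(done, size - done, out);
                    }
                }
            }
        }
        return target;
    }
}
```

Because the data is copied byte for byte, it does not matter whether the file is text or compressed binary.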

java: how to open exist zip file and store bufferedimage into it

I am doing a project dealing with images.
One function zips images. As my code shows, it creates a new ZipOutputStream every time I call compress. As a result, the previous zip file is overwritten if the path is the same.
public void compress() throws IOException {
String localPath = iProcessor.getPath();
String name = getName(localPath);
String type = getType(name);
ByteArrayOutputStream os = new ByteArrayOutputStream();
ImageIO.write(iProcessor.getImg(), type, os);
ByteArrayInputStream file = new ByteArrayInputStream(os.toByteArray());
ZipOutputStream out = new ZipOutputStream(new FileOutputStream(path));
out.putNextEntry(new ZipEntry(name));
//write into zip
int len;
byte[] buffer = new byte[1024];
while ((len = file.read(buffer)) > 0) {
out.write(buffer, 0, len);
}
out.closeEntry();
file.close();
out.close();
System.out.println("Create zip file successfully!\n");
}
Is there a way to reopen the existing zip file when I pass the same path, and store the image in it? Thanks
With JDK 13 and above, a zip filesystem provides an easy way to modify ZIP files. Once the ZIP filesystem is set up, you can create directories and copy files using Files.createDirectories() and Files.copy():
private static void addToZip(Path zip, Path file, String zipPath) throws IOException {
try (FileSystem fs = FileSystems.newFileSystem(zip, Map.of("create","true"))) {
Path root = fs.getRootDirectories().iterator().next();
Path target = root.resolve(zipPath);
Files.createDirectories(target.getParent());
Files.copy(file, target, StandardCopyOption.REPLACE_EXISTING);
}
}
For example these calls will update / create ZIP by copying a file to particular path:
addToZip(Path.of("/tmp/xyz.zip"), Path.of("/tmp/123.txt"), "some/path/somename.txt");
addToZip(Path.of("/tmp/xyz.zip"), Path.of("/tmp/123.txt"), "anothername.txt");
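To connect this back to the question, the same zip filesystem idea can take a BufferedImage directly instead of a file on disk. The following is a minimal sketch, not a drop-in replacement for the question's compress() method: the class name ImageZip and its parameters are hypothetical, and it uses the URI-based FileSystems.newFileSystem overload so that it should also work on Java versions before 13.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import javax.imageio.ImageIO;

// Hypothetical helper: adds (or replaces) an image entry in a zip, creating the zip if needed.
class ImageZip {

    static void addImage(Path zip, BufferedImage img, String format, String entryName)
            throws IOException {
        URI uri = URI.create("jar:" + zip.toUri());
        // "create" = "true" makes the zip filesystem create the archive when it does not exist yet
        try (FileSystem fs = FileSystems.newFileSystem(uri, Collections.singletonMap("create", "true"))) {
            Path target = fs.getPath(entryName);
            if (target.getParent() != null) {
                Files.createDirectories(target.getParent());
            }
            // The default options create the entry or truncate an existing one with the same name
            try (OutputStream os = Files.newOutputStream(target)) {
                if (!ImageIO.write(img, format, os)) {
                    throw new IOException("No ImageIO writer for format: " + format);
                }
            }
        }
    }
}
```

Calling addImage twice with the same zip path and different entry names leaves both images in the archive, because existing entries are kept when the filesystem is closed.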

Using LZ4 Compression in Java for multiple files

I'm trying to compress multiple files into a single archive but with my current code, it only compresses it into a single blob inside the zip. Does anyone know how to segment the files with LZ4?
public void zipFile(File[] fileToZip, String outputFileName, boolean activeZip)
{
try (FileOutputStream fos = new FileOutputStream(new File(outputFileName), true);
LZ4FrameOutputStream lz4fos = new LZ4FrameOutputStream(fos);)
{
for (File a : fileToZip)
{
try (FileInputStream fis = new FileInputStream(a))
{
byte[] buf = new byte[bufferSizeZip];
int length;
while ((length = fis.read(buf)) > 0)
{
lz4fos.write(buf, 0, length);
}
}
}
}
catch (Exception e)
{
LOG.error("Zipping file failed ", e);
}
}
LZ4 is loosely related to LZMA (both belong to the LZ77 family). If you can use LZMA instead, you can create a zip archive with LZMA compression:
List<Path> files = Collections.emptyList();
Path zip = Paths.get("lzma.zip");
ZipEntrySettings entrySettings = ZipEntrySettings.builder()
.compression(Compression.LZMA, CompressionLevel.NORMAL)
.lzmaEosMarker(true).build();
ZipSettings settings = ZipSettings.builder().entrySettingsProvider(fileName -> entrySettings).build();
ZipIt.zip(zip)
.settings(settings)
.add(files);
See details in zip4jvm
LZ4 compresses a stream of bytes. You would need to archive your multiple files into a single archive such as a Tar Archive, then feed it into the LZ4 compressor.
I created a Java library that does this for you https://github.com/spoorn/tar-lz4-java.
If you want to implement it yourself, here's a technical doc that includes details on how to LZ4 compress a directory using TarArchive from Apache Commons and lz4-java: https://github.com/spoorn/tar-lz4-java/blob/main/SUMMARY.md#lz4
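To illustrate the "bundle first, compress second" idea without any extra library, here is a deliberately simple sketch. It is not tar and not compatible with any existing tool: SimpleBundle and its length-prefixed layout are hypothetical, and it keeps file contents in memory. In real code the OutputStream passed to write would be the LZ4FrameOutputStream from the question, and a tar archive would be the more conventional container.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical container: entry count, then (name, length, bytes) per file.
class SimpleBundle {

    // 'compressed' can be any compressing stream, e.g. an LZ4 frame stream.
    static void write(Map<String, byte[]> files, OutputStream compressed) throws IOException {
        DataOutputStream out = new DataOutputStream(compressed);
        out.writeInt(files.size());
        for (Map.Entry<String, byte[]> e : files.entrySet()) {
            out.writeUTF(e.getKey());            // entry name
            out.writeInt(e.getValue().length);   // entry length, so the reader knows where it ends
            out.write(e.getValue());             // entry content
        }
        out.flush();
    }

    // 'decompressed' is the matching decompressing stream.
    static Map<String, byte[]> read(InputStream decompressed) throws IOException {
        DataInputStream in = new DataInputStream(decompressed);
        int count = in.readInt();
        Map<String, byte[]> files = new LinkedHashMap<>();
        for (int i = 0; i < count; i++) {
            String name = in.readUTF();
            byte[] data = new byte[in.readInt()];
            in.readFully(data);
            files.put(name, data);
        }
        return files;
    }
}
```

The point is that the boundaries between files live in the container layer; the compressor only ever sees one continuous byte stream.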

Zip entry stream not writing anything [duplicate]

I am currently extracting the contents of a war file and then adding some new files to the directory structure and then creating a new war file.
This is all done programmatically from Java, but I am wondering whether it would be more efficient to copy the war file and then just append the files. Then I wouldn't have to wait so long while the war expands and then has to be compressed again.
I can't seem to find a way to do this in the documentation or in any online examples, though.
Can anyone give some tips or pointers?
UPDATE:
TrueZip as mentioned in one of the answers seems to be a very good java library to append to a zip file (despite other answers that say it is not possible to do this).
Does anyone have experience or feedback on TrueZip, or can anyone recommend other similar libraries?
In Java 7 we got Zip File System that allows adding and changing files in zip (jar, war) without manual repackaging.
We can directly write to files inside zip files as in the following example.
Map<String, String> env = new HashMap<>();
env.put("create", "true");
Path path = Paths.get("test.zip");
URI uri = URI.create("jar:" + path.toUri());
try (FileSystem fs = FileSystems.newFileSystem(uri, env))
{
Path nf = fs.getPath("new.txt");
try (Writer writer = Files.newBufferedWriter(nf, StandardCharsets.UTF_8, StandardOpenOption.CREATE)) {
writer.write("hello");
}
}
As others mentioned, it's not possible to append content to an existing zip (or war). However, it's possible to create a new zip on the fly without temporarily writing extracted content to disk. It's hard to guess how much faster this will be, but it's the fastest you can get (at least as far as I know) with standard Java. As mentioned by Carlos Tasada, SevenZipJBindings might squeeze out some extra seconds for you, but porting this approach to SevenZipJBindings will still be faster than using temporary files with the same library.
Here's some code that writes the contents of an existing zip (war.zip) and appends an extra file (answer.txt) to a new zip (append.zip). All it takes is Java 5 or later, no extra libraries needed.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;
public class Main {
// 4MB buffer
private static final byte[] BUFFER = new byte[4096 * 1024];
/**
* copy input to output stream - available in several StreamUtils or Streams classes
*/
public static void copy(InputStream input, OutputStream output) throws IOException {
int bytesRead;
while ((bytesRead = input.read(BUFFER))!= -1) {
output.write(BUFFER, 0, bytesRead);
}
}
public static void main(String[] args) throws Exception {
// read war.zip and write to append.zip
ZipFile war = new ZipFile("war.zip");
ZipOutputStream append = new ZipOutputStream(new FileOutputStream("append.zip"));
// first, copy contents from existing war
Enumeration<? extends ZipEntry> entries = war.entries();
while (entries.hasMoreElements()) {
ZipEntry e = entries.nextElement();
System.out.println("copy: " + e.getName());
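// use a fresh entry: reusing e carries over its compressed size, which can
// trigger a ZipException if recompression produces a different size
append.putNextEntry(new ZipEntry(e.getName()));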
if (!e.isDirectory()) {
copy(war.getInputStream(e), append);
}
append.closeEntry();
}
// now append some extra content
ZipEntry e = new ZipEntry("answer.txt");
System.out.println("append: " + e.getName());
append.putNextEntry(e);
append.write("42\n".getBytes());
append.closeEntry();
// close
war.close();
append.close();
}
}
I had a similar requirement some time back, but it was for reading and writing zip archives (the .war format should be similar). I tried doing it with the existing Java Zip streams but found the writing part cumbersome, especially when directories were involved.
I'll recommend you to try out the TrueZIP (open source - apache style licensed) library that exposes any archive as a virtual file system into which you can read and write like a normal filesystem. It worked like a charm for me and greatly simplified my development.
You could use this bit of code I wrote
public static void addFilesToZip(File source, File[] files)
{
try
{
File tmpZip = File.createTempFile(source.getName(), null);
tmpZip.delete();
if(!source.renameTo(tmpZip))
{
throw new Exception("Could not make temp file (" + source.getName() + ")");
}
byte[] buffer = new byte[1024];
ZipInputStream zin = new ZipInputStream(new FileInputStream(tmpZip));
ZipOutputStream out = new ZipOutputStream(new FileOutputStream(source));
for(int i = 0; i < files.length; i++)
{
InputStream in = new FileInputStream(files[i]);
out.putNextEntry(new ZipEntry(files[i].getName()));
for(int read = in.read(buffer); read > -1; read = in.read(buffer))
{
out.write(buffer, 0, read);
}
out.closeEntry();
in.close();
}
for(ZipEntry ze = zin.getNextEntry(); ze != null; ze = zin.getNextEntry())
{
out.putNextEntry(ze);
for(int read = zin.read(buffer); read > -1; read = zin.read(buffer))
{
out.write(buffer, 0, read);
}
out.closeEntry();
}
out.close();
tmpZip.delete();
}
catch(Exception e)
{
e.printStackTrace();
}
}
I don't know of a Java library that does what you describe. But what you described is practical. You can do it in .NET, using DotNetZip.
Michael Krauklis is correct that you cannot simply "append" data to a war file or zip file, but it is not because there is an "end of file" indication, strictly speaking, in a war file. It is because the war (zip) format includes a directory, which is normally present at the end of the file, that contains metadata for the various entries in the war file. Naively appending to a war file results in no update to the directory, and so you just have a war file with junk appended to it.
What's necessary is an intelligent class that understands the format, and can read+update a war file or zip file, including the directory as appropriate. DotNetZip does this, without uncompressing/recompressing the unchanged entries, just as you described or desired.
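The directory at the end of the file can be seen directly. As a sketch under simplifying assumptions (no archive comment, no zip64, so the "end of central directory" record is exactly the last 22 bytes), the hypothetical ZipTail class below reads that record and reports where the central directory sits.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Path;

// Hypothetical helper: inspects the end-of-central-directory record of a zip/war file.
class ZipTail {

    /** Returns {entry count, central directory size, central directory offset}. */
    static long[] readEndRecord(Path zip) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(zip.toFile(), "r")) {
            byte[] tail = new byte[22];
            raf.seek(raf.length() - 22);
            raf.readFully(tail);
            ByteBuffer buf = ByteBuffer.wrap(tail).order(ByteOrder.LITTLE_ENDIAN);
            // Record signature "PK\5\6", read as a little-endian int
            if (buf.getInt(0) != 0x06054b50) {
                throw new IOException("End record not at the expected position (archive comment, or not a zip?)");
            }
            long entries = buf.getShort(10) & 0xffff;     // total number of entries
            long size = buf.getInt(12) & 0xffffffffL;     // central directory size in bytes
            long offset = buf.getInt(16) & 0xffffffffL;   // where the central directory starts
            return new long[] {entries, size, offset};
        }
    }
}
```

For such an archive, offset + size + 22 equals the file length, which is why bytes appended after the end record are not picked up as entries by readers that rely on the directory.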
As Cheeso says, there's no way of doing it. AFAIK the zip front-ends are doing exactly the same as you internally.
Anyway if you're worried about the speed of extracting/compressing everything, you may want to try the SevenZipJBindings library.
I covered this library in my blog some months ago (sorry for the self-promotion). Just as an example, extracting a 104 MB zip file using java.util.zip took me 12 seconds, while using this library took 4 seconds.
In both links you can find examples about how to use it.
Hope it helps.
See this bug report.
Using append mode on any kind of structured data like zip files or tar files is not something you can really expect to work. These file formats have an intrinsic "end of file" indication built into the data format.
If you really want to skip the intermediate step of un-waring/re-waring, you could read the war file, get all the zip entries, then write to a new war file, "appending" the new entries you wanted to add. Not perfect, but at least a more automated solution.
Yet Another Solution: You may find code below useful in other situations as well. I have used ant this way to compile Java directories, generating jar files, updating zip files,...
public static void antUpdateZip(String zipFilePath, String libsToAddDir) {
Project p = new Project();
p.init();
Target target = new Target();
target.setName("zip");
Zip task = new Zip();
task.init();
task.setDestFile(new File(zipFilePath));
ZipFileSet zipFileSet = new ZipFileSet();
zipFileSet.setPrefix("WEB-INF/lib");
zipFileSet.setDir(new File(libsToAddDir));
task.addFileset(zipFileSet);
task.setUpdate(true);
task.setProject(p);
task.init();
target.addTask(task);
target.setProject(p);
p.addTarget(target);
DefaultLogger consoleLogger = new DefaultLogger();
consoleLogger.setErrorPrintStream(System.err);
consoleLogger.setOutputPrintStream(System.out);
consoleLogger.setMessageOutputLevel(Project.MSG_DEBUG);
p.addBuildListener(consoleLogger);
try {
// p.fireBuildStarted();
// ProjectHelper helper = ProjectHelper.getProjectHelper();
// p.addReference("ant.projectHelper", helper);
// helper.parse(p, buildFile);
p.executeTarget(target.getName());
// p.fireBuildFinished(null);
} catch (BuildException e) {
p.fireBuildFinished(e);
throw new AssertionError(e);
}
}
This is simple code that builds a zip file in a servlet and sends it as the response:
myZipPath = bla bla...
byte[] buf = new byte[8192];
String zipName = "myZip.zip";
String zipPath = myZipPath + File.separator + "pdf" + File.separator + zipName;
File pdfFile = new File("myPdf.pdf");
ZipOutputStream out = new ZipOutputStream(new FileOutputStream(zipPath));
ZipEntry zipEntry = new ZipEntry(pdfFile.getName());
out.putNextEntry(zipEntry);
InputStream in = new FileInputStream(pdfFile);
int len;
while ((len = in.read(buf)) > 0) {
out.write(buf, 0, len);
}
out.closeEntry();
in.close();
out.close();
FileInputStream fis = new FileInputStream(zipPath);
response.setContentType("application/zip");
response.addHeader("content-disposition", "attachment;filename=" + zipName);
OutputStream os = response.getOutputStream();
// stream the zip file to the client, reusing buf from above
int length = fis.read(buf);
while (length != -1)
{
os.write(buf, 0, length);
length = fis.read(buf);
}
fis.close();
Here are examples of how easily files can be appended to an existing zip using TrueVFS:
// append a file to archive under different name
TFile.cp(new File("existingFile.txt"), new TFile("archive.zip", "entry.txt"));
// recursively append a dir to the root of archive
TFile src = new TFile("dirPath", "dirName");
src.cp_r(new TFile("archive.zip", src.getName()));
TrueVFS, the successor of TrueZIP, uses Java 7 NIO 2 features under the hood when appropriate but offers much more features like thread-safe async parallel compression.
Beware also that Java 7 ZipFileSystem by default is vulnerable to OutOfMemoryError on huge inputs.
Here is a Java 1.7 version of Liam's answer, which uses try-with-resources and Apache Commons IO.
The output is written to a new zip file but it can be easily modified to write to the original file.
/**
* Modifies, adds or deletes file(s) from an existing zip file.
*
* @param zipFile the original zip file
* @param newZipFile the destination zip file
* @param filesToAddOrOverwrite the names of the files to add or modify from the original file
* @param filesToAddOrOverwriteInputStreams the input streams containing the content of the files
* to add or modify from the original file
* @param filesToDelete the names of the files to delete from the original file
* @throws IOException if the new file could not be written
*/
public static void modifyZipFile(File zipFile,
File newZipFile,
String[] filesToAddOrOverwrite,
InputStream[] filesToAddOrOverwriteInputStreams,
String[] filesToDelete) throws IOException {
try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(newZipFile))) {
// add existing ZIP entry to output stream
try (ZipInputStream zin = new ZipInputStream(new FileInputStream(zipFile))) {
ZipEntry entry = null;
while ((entry = zin.getNextEntry()) != null) {
String name = entry.getName();
// check if the file should be deleted
if (filesToDelete != null) {
boolean ignoreFile = false;
for (String fileToDelete : filesToDelete) {
if (name.equalsIgnoreCase(fileToDelete)) {
ignoreFile = true;
break;
}
}
if (ignoreFile) {
continue;
}
}
// check if the file should be kept as it is
boolean keepFileUnchanged = true;
if (filesToAddOrOverwrite != null) {
for (String fileToAddOrOverwrite : filesToAddOrOverwrite) {
if (name.equalsIgnoreCase(fileToAddOrOverwrite)) {
keepFileUnchanged = false;
}
}
}
if (keepFileUnchanged) {
// copy the file as it is
out.putNextEntry(new ZipEntry(name));
IOUtils.copy(zin, out);
}
}
}
// add the modified or added files to the zip file
if (filesToAddOrOverwrite != null) {
for (int i = 0; i < filesToAddOrOverwrite.length; i++) {
String fileToAddOrOverwrite = filesToAddOrOverwrite[i];
try (InputStream in = filesToAddOrOverwriteInputStreams[i]) {
out.putNextEntry(new ZipEntry(fileToAddOrOverwrite));
IOUtils.copy(in, out);
out.closeEntry();
}
}
}
}
}
This works 100% if you don't want to use extra libs.
1) First, the class that appends files to the zip:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;
public class AddZip {
public AddZip() {
}
public void addToZipFile(ZipOutputStream zos, String nombreFileAnadir, String nombreDentroZip) {
FileInputStream fis = null;
try {
if (!new File(nombreFileAnadir).exists()) {// file does not exist
System.out.println(" File does not exist: " + nombreFileAnadir);return;
}
File file = new File(nombreFileAnadir);
System.out.println(" Adding file '" + nombreFileAnadir + "' to the ZIP ");
fis = new FileInputStream(file);
ZipEntry zipEntry = new ZipEntry(nombreDentroZip);
zos.putNextEntry(zipEntry);
byte[] bytes = new byte[1024];
int length;
while ((length = fis.read(bytes)) >= 0) {zos.write(bytes, 0, length);}
zos.closeEntry();
fis.close();
} catch (FileNotFoundException ex ) {
Logger.getLogger(AddZip.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
Logger.getLogger(AddZip.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
2) Then you can call it in your controller:
//in the top
try {
fos = new FileOutputStream(rutaZip);
zos = new ZipOutputStream(fos);
} catch (FileNotFoundException ex) {
Logger.getLogger(UtilZip.class.getName()).log(Level.SEVERE, null, ex);
}
...
//inside your method
addZip.addToZipFile(zos, pathFolderFileSystemHD() + itemFoto.getNombre(), "foto/" + itemFoto.getNombre());
Based on the answer given by @sfussenegger above, the following code is used to append to a jar file and download it:
public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
Resource resourceFile = resourceLoader.getResource("WEB-INF/lib/custom.jar");
ByteArrayOutputStream baos = new ByteArrayOutputStream();
try (ZipOutputStream zos = new ZipOutputStream(baos, StandardCharsets.ISO_8859_1);) {
try (ZipFile zin = new ZipFile(resourceFile.getFile(), StandardCharsets.ISO_8859_1);) {
zin.stream().forEach((entry) -> {
try {
zos.putNextEntry(entry);
if (!entry.isDirectory()) {
zin.getInputStream(entry).transferTo(zos);
}
zos.closeEntry();
} catch (Exception ex) {
ex.printStackTrace();
}
});
}
/* build file records to be appended */
....
for (FileContents record : records) {
zos.putNextEntry(new ZipEntry(record.getFileName()));
zos.write(record.getBytes());
zos.closeEntry();
}
zos.flush();
}
response.setContentType("application/java-archive");
response.setContentLength(baos.size());
response.setHeader(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=\"custom.jar\"");
try (BufferedOutputStream out = new BufferedOutputStream(response.getOutputStream())) {
baos.writeTo(out);
}
}

Java: Bzip2 library

I need to create a Bzip2 archive.
I downloaded the bzip2 library from Apache Ant.
I use the class CBZip2OutputStream:
String s = .....
CBZip2OutputStream os = new CBZip2OutputStream(fos);
os.write(s.getBytes(Charset.forName("UTF-8")));
os.flush();
os.close();
(I didn't find any example of how to use it, so I decided to use it this way.)
But it creates a corrupted archive on disk.
You have to write the BZip2 header (two bytes: 'B','Z') before writing the content:
//Write 'BZ' before compressing the stream
fos.write("BZ".getBytes());
//Write to compressed stream as usual
CBZip2OutputStream os = new CBZip2OutputStream(fos);
... the rest ...
Then, for instance, you can extract contents of your bzipped file with cat compressed.bz2 | bunzip2 > uncompressed.txt on a *nix system.
I have not found an example but in the end I understood how to use CBZip2OutputStream so here is one :
public void createBZipFile() throws IOException{
// file to zip
File file = new File("plane.jpg");
// compressed file
File fileZiped= new File("plane.bz2");
// Outputstream for fileZiped
FileOutputStream fileOutputStream = new FileOutputStream(fileZiped);
fileOutputStream.write("BZ".getBytes());
// we getting the data in a byte array
byte[] fileData = getArrayByteFromFile(file);
CBZip2OutputStream bzip = null;
try{
bzip = new CBZip2OutputStream(fileOutputStream );
bzip.write(fileData, 0, fileData.length);
bzip.flush() ;
bzip.close();
}catch (IOException ex) {
ex.printStackTrace();
}
fileOutputStream.close();
}
