Check compressed archive for corruption - java

I am creating compressed archives with tar and bzip2 using jarchivelib, which utilizes org.apache.commons.compress:
try {
    Archiver archiver = ArchiverFactory.createArchiver(ArchiveFormat.TAR, CompressionType.BZIP2);
    File archive = archiver.create(archiveName, destination, sourceFilesArr);
} catch (IOException e) {
    e.printStackTrace();
}
Sometimes the created file is corrupted, so I want to check for that and recreate the archive if necessary. No error is thrown; I only detected the corruption when trying to decompress the file manually with tar -xf file.tar.bz2 (note: extracting with tar -xjf file.tar.bz2 works flawlessly):
tar: Archive contains `\2640\003\203\325#\0\0\0\003\336\274' where numeric off_t value expected
tar: Archive contains `\0l`\t\0\021\0' where numeric mode_t value expected
tar: Archive contains `\003\301\345\0\0\0\0\006\361\0p\340' where numeric time_t value expected
tar: Archive contains `\0\210\001\b\0\233\0' where numeric uid_t value expected
tar: Archive contains `l\001\210\0\210\001\263' where numeric gid_t value expected
tar: BZh91AY&SY"'ݛ\003\314>\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\343\262\037\017\205\360X\001\210: Unknown file type `', extracted as normal file
tar: BZh91AY&SY"'ݛ�>��������������������������������������X�: implausibly old time stamp 1970-01-01 00:59:59
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
Is there a way, using org.apache.commons.compress, to check whether a compressed archive is corrupted? Since the files can be several GB in size, an approach without decompressing would be great.

As bzip2 compression produces a stream, there is no way to check for corruption without decompressing that stream and passing the result to tar to verify.
In your case, however, you are extracting directly with tar without first decompressing with bzip2, and that is the root cause: since the archive is bzip2-compressed, you always need to pass the -j flag to tar. That is why the second command works correctly.
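If you want to run that verification from Java with commons-compress itself, a minimal sketch (the class and method names below are just placeholders) could stream the archive through BZip2CompressorInputStream and TarArchiveInputStream and treat any IOException as corruption, without writing the extracted files to disk:

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream;
import org.apache.commons.compress.utils.IOUtils;

public class TarBz2Check {
    // Returns true if the whole .tar.bz2 stream can be decompressed and parsed.
    static boolean isReadable(String path) {
        try (TarArchiveInputStream tar = new TarArchiveInputStream(
                new BZip2CompressorInputStream(
                        new BufferedInputStream(new FileInputStream(path))))) {
            TarArchiveEntry entry;
            while ((entry = tar.getNextTarEntry()) != null) {
                // Drain the entry's data; a damaged bzip2 block or tar header
                // surfaces here as an IOException.
                IOUtils.skip(tar, entry.getSize());
            }
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}

This still reads and decompresses the entire archive; as noted above, there is no way around that for a bzip2 stream.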

Related

Downloading Java JDK on Linux

I'm trying to download the Java JDK, and when I try to extract the file I get this message:
tar (child): jdk-8u241-linux-i586.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
I have it downloaded, the file is on my desktop, and this is what I entered:
root#faiq-desktop:~/Desktop# tar zxvf jdk-8u241-linux-i586.tar.gz
It seems like you are facing this because you are not in the directory that contains the file. First cd into that directory (in your case, ~/Desktop), then run:
tar -zxvf filename.tar.gz # replace the filename with yours
referenced from: “Cannot open: No such file or directory” when extracting a tar file

Can't install java sdk, webupd8team repo gives sha256 errors and tar.gz from oracle gives broken output on Ubuntu 18.04

When attempting to install Java from the linuxuprising/java or ppa:webupd8team/java repositories, I get an error like this:
Hash Sum mismatch
Hashes of expected file:
- SHA256:973d8ef6268da61e1003963186b8157207c6bc6e48d37d5e11fc2a5885a5b708
- SHA1:28ede7da1dca520a59f4d7787511742c9b923084 [weak]
- MD5Sum:7ba8544ff3e98f3c13d07a1803916781 [weak]
- Filesize:78077648 [weak]
Hashes of received file:
- SHA256:ddb59fc787e254508a3cab0eec31bf0139b9510d83591e0f54a605e6a1f5615b
- SHA1:1d78a279a988822b0c34312b284e73cd02103425 [weak]
- MD5Sum:f44ade468641a9aea12a59539886c580 [weak]
- Filesize:78077648 [weak]
I've tried installing multiple versions across both repos. I've also tried installing from oracle.com, but when running tar -xzf jdk-version.tar.gz on the downloaded file I get output like this, regardless of the JDK version I download from their site:
$ sudo tar -xzf jdk-8u181-linux-x64.tar.gz
tar: Skipping to next header
tar: Archive contains ‘*\264\0\005*\264\0\a’ where numeric mode_t value expected
tar: Archive contains ‘I\003\0\001\0\032\0\030\0\001\0\023’ where numeric time_t value expected
tar: Archive contains ‘\207o\257\0\0\0\002’ where numeric uid_t value expected
tar: Archive contains ‘\024\0\0\0\006\0\001’ where numeric gid_t value expected
tar: *Y�: implausibly old time stamp 1969-12-31 18:59:59
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
It also generates an empty file called *Y� (invalid encoding)
I'm very lost at this point and would just like to get the JDK working. I feel like something is corrupted, but how? Why?

Issue using Coldfusion FileExists when checking files with UTF-8 and ASCII

When trying to detect the existence of files whose names are encoded in UTF-8 with the FileExists function, the files cannot be found.
I found that the Java File Encoding on the ColdFusion server was originally set to "UTF-8". For some unknown reason it was back to the default "ASCII". I suspect this is the issue.
For example, a user uploaded a photo named 云拼花.jpg while the server Java file encoding was set to UTF-8, and now with the server Java file encoding set to ASCII, I use
<cfif FileExists("#currentpath##pic#")>
The result is "not found", i.e. the file does not exist. However, if I simply display it using:
<IMG SRC="/images/#pic#">
The image displays. This causes issues when I try to test for the existence of the images: they are there but can't be found by FileExists.
Now the directory has a mix of files encoded in either UTF-8 or ASCII. Is there any way to:
force any upload file to UTF-8 encoding
check for the existence of the file
regardless of CF Admin Java File Encoding setting?
Add this to your page.
<cfprocessingdirective pageencoding="utf-8">
This should fix the issue.

How to unzip file zipped by PKZIP in mainframe by Java?

I am trying to write a program in Java to unzip files zipped by the PKZIP tool on a mainframe. However, I have tried the three approaches below and none of them solves my problem.
By executables
I have tried to open it with WinRAR, 7-Zip and the Linux unzip command.
All fail with the error message below:
The archive is either in unknown format or damaged
By JDK API - java.util.ZipFile
I have also tried to unzip it with the JDK API, as this website described.
However, it fails with the error message:
IO Error: java.util.zip.ZipException: error in opening zip file
By Zip4J
I have also tried Zip4j. It failed too, with the error message:
Caused by: java.io.IOException: Negative seek offset
at java.io.RandomAccessFile.seek(Native Method)
at net.lingala.zip4j.core.HeaderReader.readEndOfCentralDirectoryRecord(HeaderReader.java:117)
... 5 more
Is there any Java library or Linux command that can extract a zip file created by PKZIP on a mainframe? Thanks a lot!
I have successfully read files that were compressed with PKZip on z/OS and transferred to Linux. I was able to read them with the java.util.zip.* classes:
import java.io.InputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

try (ZipFile ifile = new ZipFile(inFileName)) {
    // faster to loop through entries than to open the zip file as a stream
    Enumeration<? extends ZipEntry> entries = ifile.entries();
    while (entries.hasMoreElements()) {
        ZipEntry entry = entries.nextElement();
        if (!entry.isDirectory()) { // skip directories
            String entryName = entry.getName();
            // code to determine whether to process omitted
            InputStream zis = ifile.getInputStream(entry);
            // process the stream
        }
    }
}
The jar file format is just a zip file, so the "jar" command can also read such files.
Like the others, I suspect that maybe the file was not transferred in binary and so was corrupted. On Linux you can use the xxd utility (piped through head) to dump the first few bytes to see if it looks like a zip file:
# xxd myfile.zip | head
0000000: 504b 0304 2d00 0000 0800 2c66 a348 eb5e PK..-.....,f.H.^
The first 4 bytes should be as shown. See also the Wikipedia entry for ZIP files.
Even if the first 4 bytes are correct, if the file was truncated during transmission that could also cause the corrupt file message.
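The same signature check can be done from Java; here is a small sketch (the class and method names are made up for this example) that reads the first four bytes and compares them with the local file header signature PK\3\4:

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class ZipSignatureCheck {
    // "PK\3\4" read big-endian by readInt() is 0x504B0304.
    static boolean looksLikeZip(String path) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(path))) {
            return in.readInt() == 0x504B0304;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(looksLikeZip(args[0]) ? "looks like a zip file" : "not a zip file");
    }
}

A file mangled by an ASCII-mode transfer will normally fail this check immediately, while a truncated file may still pass it, as noted above.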

File size limitations of ZipOutputStream?

I am using the ZipOutputStream to create ZIP files. It works fine, but the Javadoc is quite sparse, so I'm left with questions about the characteristics of ZipOutputStream:
Is there a limit for the maximum supported file sizes? Both for files contained in the ZIP and for the resulting ZIP file itself? The size argument is long, but who knows. (Let us assume that the filesystem imposes no limits.)
What is the minimum input file size that justifies use of the DEFLATED method?
I will always read the resulting ZIP file using ZipInputStream.
The most important aspect is that in a current Java 7 JDK, ZipOutputStream creates ZIP files according to the 2012 PKZIP specification, which also includes support for ZIP64. Note that the ZIP64 features had bugs at first, but any recent version of the Java 7 JDK will be OK.
The maximum file size is thus 2^64 − 1 bytes. I tried it with a 10 GB test file. This is much larger than the 4 GB limit of standard ZIP. I could add it to the ZIP file with no problems, even when the resulting ZIP file itself grew beyond 4 GB.
The minimum file size which justifies use of the DEFLATED method is 22 bytes. This has nothing to do with the minimum ZIP file size, which is incidentally also 22 bytes (for empty ZIP files). I determined this number empirically by adding strings of 'a's of increasing length. Such a sequence of identical characters compresses very well, so in the real world the break-even point will be higher.
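For reference, a minimal sketch of the kind of writer code this discussion assumes, adding one DEFLATED entry with ZipOutputStream (the file names and buffer size are arbitrary placeholders):

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class BigZipSketch {
    public static void main(String[] args) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(new FileOutputStream("test.zip"));
             FileInputStream in = new FileInputStream("big-input.bin")) {
            zos.setMethod(ZipOutputStream.DEFLATED); // DEFLATED is already the default
            zos.putNextEntry(new ZipEntry("big-input.bin"));
            byte[] buf = new byte[64 * 1024];
            int n;
            while ((n = in.read(buf)) > 0) {
                zos.write(buf, 0, n);
            }
            zos.closeEntry();
        }
    }
}

On a Java 7+ JDK the stream writes the ZIP64 records by itself once an entry or the archive grows past the classic 4 GB limits, so nothing in this code has to change for large files.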
Following are the limits of ZIP file format:
The minimum size of a .ZIP file is 22 bytes. The maximum size for both the archive file and the individual files inside it is 4,294,967,295 bytes (2^32 − 1 bytes, or 4 GiB minus 1 byte) for standard .ZIP, and 18,446,744,073,709,551,615 bytes (2^64 − 1 bytes, or 16 EiB minus 1 byte) for ZIP64.
Reference : Zip (file format)
