File separators of Path name of ZipEntry? - java

ZIP entries store the full path name of the entry because (I'm sure of the next part) the ZIP archive is not organized as directories. The metadata contains the info about how files are supposed to be stored (inside directories).
If I create a ZIP file in Windows, when I unzip the data in another OS, e.g. Mac OS X, the file structure remains as it used to be in Windows. Is this because the unzipper is designed to handle this, or is it because the file separators inside the ZIP are standard?
I'm asking this because I'm trying to find an entry inside a ZIP file using the name of the zipped file. But which file separator should I use to make it work in systems other than Windows?
I'm using Java, and the method: .getName() of the ZipEntry gives me the path using the Windows file separator \. Would it be enough if I use the java File.separator separator to make it work on another OS? Or will I have to try to find my file with each possible separator?
Honorary Correct Answer Mention
The answer given by @Eren Yilmaz correctly describes the behavior of many tools (or even one you could code yourself). But since the .zip standard clearly documents how it must be done, the correct answer had to be updated.

The .zip file specification states:
4.4.17.1 The name of the file, with optional relative path.
The path stored MUST not contain a drive or
device letter, or a leading slash. All slashes
MUST be forward slashes '/' as opposed to
backwards slashes '\' for compatibility with Amiga
and UNIX file systems etc. If input came from standard
input, there is no file name field.

The file separator depends on the application that creates the zip file. Some applications use the system file separator, whereas others use the "civilized" forward slash "/". So, if you are both creating the zip file and consuming it, you can simply use a forward slash as the file separator. If the zip file is created somewhere else, you should find out which separator was used. I don't know a simple way, but you can use a brute-force approach and check both separator types as you go.
Some applications, especially custom zip creation code, can mix the separators across different zip entries, so don't forget to check each entry.
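One way to avoid guessing the separator is to normalize both the stored entry name and the name you are looking for to forward slashes before comparing them. The following is a minimal sketch of that idea, not a definitive implementation; the class name ZipEntryLookup and its methods are hypothetical helpers introduced here for illustration.

```java
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper (not part of any library): separator-agnostic entry lookup.
final class ZipEntryLookup {

    // Convert backslashes to the forward slashes that the ZIP specification requires.
    static String normalize(String entryName) {
        return entryName.replace('\\', '/');
    }

    // Returns the first entry whose normalized name equals the normalized
    // wanted name, or null if the archive contains no such entry.
    static ZipEntry findEntry(ZipFile zip, String wantedName) {
        String wanted = normalize(wantedName);
        Enumeration<? extends ZipEntry> entries = zip.entries();
        while (entries.hasMoreElements()) {
            ZipEntry entry = entries.nextElement();
            if (normalize(entry.getName()).equals(wanted)) {
                return entry;
            }
        }
        return null;
    }
}
```

Because every entry is checked individually, this also copes with archives that mix separators from one entry to the next. It deliberately does not use File.separator, since the separator of the machine reading the archive says nothing about the one used when the archive was written.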

Related

Java Cross Platform File Operations

I developed an application in NetBeans on Ubuntu and then converted the runnable .jar file produced by NetBeans to an .exe file using converter software.
I used:
File f = new File("./dir/fileName");
which works fine in Ubuntu but gives an error in Windows, because the directory conventions of the two OSs are different.
Absolute paths should not be hardcoded. They should be read e.g. from a config file or user input.
Then you can use the NIO.2 File API to create your file paths: Paths.get(...) (java.io.File is a legacy API).
In your case it could be:
Path filePath = Paths.get("dir", "fileName");
I used: File f = new File("./dir/fileName") which works fine in Ubuntu but it gives an error in Windows, because the directory pattern of both OSs are different.
It is presumably failing because that file doesn't exist at that path. Note that it is a relative path, so the problem could be that the path could not be resolved from the current directory, because the current directory was not what the application was expecting.
In fact, it is perfectly fine to use forward slashes in pathnames in Java on Windows. That's because at the OS level, Windows accepts both / and \ as path separators. (It doesn't work in the other direction though. UNIX, Linux and macOS do not accept backslash as a pathname separator.)
However Puce's advice is mostly sound:
It is inadvisable to hard-code paths into your application. Put them into a config file.
Use the NIO2 Path and Paths APIs in preference to File. If you need to assemble paths from their component parts, these APIs offer clean ways to do it while hiding the details of path separators. The APIs are also more consistent than File, and give better diagnostics.
But: if you do need to get the pathname separator, File.separator is an acceptable way to get it. Calling FileSystem.getSeparator() may be better, but you will only see a difference if your application is using different FileSystem objects for different file systems with different separators.
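The points above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class PortablePaths and its method names are hypothetical, and the base directory is assumed to come from a config file or user input rather than being hard-coded.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (illustration only): building paths without hard-coded separators.
final class PortablePaths {

    // Joins the parts with whatever separator the default file system uses.
    static Path dataFile(String dir, String fileName) {
        return Paths.get(dir, fileName);
    }

    // Resolves the relative path against an explicit base directory instead of
    // relying on the current working directory of the process.
    static Path resolveAgainst(Path baseDir, String dir, String fileName) {
        return baseDir.resolve(dataFile(dir, fileName)).normalize();
    }

    // The separator of the default file system, for the rare case it is needed.
    static String separator() {
        return FileSystems.getDefault().getSeparator();
    }
}
```

Resolving against an explicit base directory also addresses the likely cause of the original failure, namely that the current directory was not what the application expected.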
You can use File.separator as you can see in api docs:
https://docs.oracle.com/javase/8/docs/api/java/io/File.html

String changes its value causing a java.io.FileNotFoundException

I am loading a document from a file. This file (which is in a URL) for some reason changes its name, causing a java.io.FileNotFoundException.
Although I use a user input I have tried putting the name of the file directly, but it shows the same error.
File input = new File("/example/");
I expect the file name to be /example/, but the debugging shows it to be \example
You are obviously running your code in a Windows OS, which uses '\' as its file path separator character.
File automatically converts the file separators in the String path to the local file system's separator (on Windows, '/' becomes '\') and drops a trailing separator, thereby using the normalized local form, which is what you are seeing.
Your path is an absolute path, so the example file should be in the root directory. If you are expecting the file to be relative to where you are running your app from, remove the leading / to make it a relative path.
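The normalization described above can be observed directly by asking File for the path it stored. A small sketch follows; the class name FileNormalization is a hypothetical name used only for this example.

```java
import java.io.File;

// Hypothetical helper (illustration only): shows the path string File keeps.
final class FileNormalization {

    // Returns the path as stored by File after it has normalized separators
    // and removed any trailing separator.
    static String storedPath(String rawPath) {
        return new File(rawPath).getPath();
    }
}
```

Calling FileNormalization.storedPath("/example/") returns the local separator followed by example, which is \example on Windows and /example on Linux or macOS. So the value seen in the debugger is the same path in local form, not a renamed file.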

Split two files

I have combined two files on Android, using this Linux command
cat file1.png file2.zip > file3.png
How can I split the two files again? I just want the zip file to be retrieved separately.
Is there any specific command? I've tried these commands:
unzip file3.png
Replaced png with zip:
unzip file3.zip
but none of them work.
The only application with which I can open the combined file is WinRAR on Windows.
I also tried several unzipping and unraring apps on Android, but none of them work except the RAR app by RARLAB.
Is there any source for those apps I mentioned to unrar/unzip the file?
Strictly speaking: there's no way.
You might look for PK 0x03 0x04 as a separator, as the other answer suggests, but you don't have any guarantee that this byte sequence does not show up in the image data of file1.png as well.
All together it's a funny question. If you want to split files like this on a regular basis, rethink your strategy. If you need it to correct a one-time mistake or something, you can split by finding the separator and be OK in over 99% of the cases.
What you need to know is that PNG files start with the hex value 0x89 followed by the text PNG. Zip files start with PK 0x03 0x04. You could write a utility which reads a file and writes the bytes read to a new file, switching to a new output file when a certain file signature is detected.
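Such a utility could look roughly like the sketch below. It assumes the combined file fits in memory and that the first occurrence of the bytes P, K, 0x03, 0x04 really is the start of the zip, which is not guaranteed if the same sequence happens to occur inside the image data. The class name ZipSplitter and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical utility (illustration only): splits "first file + zip" at the zip signature.
final class ZipSplitter {

    // Local file header signature at the start of a ZIP archive: 'P' 'K' 0x03 0x04.
    private static final byte[] ZIP_SIGNATURE = {0x50, 0x4B, 0x03, 0x04};

    // Index of the first occurrence of the signature, or -1 if it is absent.
    static int indexOfZip(byte[] data) {
        for (int i = 0; i + ZIP_SIGNATURE.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < ZIP_SIGNATURE.length; j++) {
                if (data[i + j] != ZIP_SIGNATURE[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    // Writes everything before the signature to firstOut and the rest to zipOut.
    static void split(Path combined, Path firstOut, Path zipOut) throws IOException {
        byte[] data = Files.readAllBytes(combined);
        int start = indexOfZip(data);
        if (start < 0) {
            throw new IOException("No ZIP signature found in " + combined);
        }
        Files.write(firstOut, Arrays.copyOfRange(data, 0, start));
        Files.write(zipOut, Arrays.copyOfRange(data, start, data.length));
    }
}
```

If the extracted zip turns out to be unreadable, the first match was probably a false positive inside the image, and the search would have to continue from the next occurrence.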
For a one-off solution, you can use vim, though you have to be careful to stop vim from adding a newline character to the end of the file.
Copy your input file for safety
cp file3.png f1
cp file3.png f2
vim -b f1
and in vim type
:set noeol
search for the start of the zip file
/PK
checking that the sequence found is PK^C^D. If not, look for the next match.
Delete the end of the line from PK with
d$
Move down a line, delete the remainder of the file and save
j
:.,$ d
ZZ
Similarly, delete the top of the file in f2 to get the zip file.
Note: don't name f2 as f2.zip because vim is smart enough to open this as a zip file, which is not what you want here.
I'm not sure what you are trying to accomplish by "hiding" a zip file in a PNG, but if you are trying to make a single file WinRAR can open, then that's an odd way to do it.
You do not make a .zip (or any other type of archive) file when you cat a file to either the start or end of a zip file. That simply appends two binary files together.
The reason WinRAR can open your "combined" binary file is that it most likely recognizes the file headers and can work out that you have 2 files.
I suggest you look into the usage of the zip command for how to add files to an archive. A quick search shows, for example
zip -rv zipfile.zip newfile.txt
Will add newfile.txt to zipfile.zip.

Best way to process command line file path argument in Java

I'd like to pass a file path argument to my application in a relative form, e.g. ~/test.conf or ../test.conf, but I can't get a proper full file path, though I've tried it with the old java.io and the new java.nio Files/Paths. Is there a general way to get a resolved file path without a large amount of code? It would be fine for the solution to work only in Unix environments like OS X or Debian.
Update
With a provided argument like ~/test.conf
In the case of getAbsolutePath, it returns a path prefixed with the current folder: /Users/currentUser/Projects/Personal/TestProject/~/test.conf. The canonical path is the same.
The hard bit here is dealing with POSIX home directories, and making sure you deal with ~otheruser/dir/test.conf too (if you want to do it properly).
Luckily that's covered in How to handle ~ in file paths.
TL;DR - use something like:
path.replace("~/", System.getProperty("user.home") + "/");
Once you've done that, and as others have commented, you can just use standard java.io methods (including getCanonicalPath()).
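A slightly more defensive variant of that idea expands the tilde only when it is the first path element, then makes the result absolute and normalized. This is a sketch under the assumption that only the current user's home directory needs handling; it does not cover the ~otheruser form. The class name TildePaths is hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (illustration only): resolves a command-line path argument.
final class TildePaths {

    // Expands a leading "~" or "~/" to the user.home system property, then
    // returns an absolute path with "." and ".." elements resolved.
    static Path resolve(String arg) {
        String home = System.getProperty("user.home");
        String expanded = arg;
        if (arg.equals("~")) {
            expanded = home;
        } else if (arg.startsWith("~/")) {
            // Keep the slash so that "~/x" becomes "<home>/x".
            expanded = home + arg.substring(1);
        }
        return Paths.get(expanded).toAbsolutePath().normalize();
    }
}
```

Unlike a plain replace call, this leaves a ~ that appears in the middle of a path untouched. Relative arguments such as ../test.conf are resolved against the current working directory.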

Unzip files created with WinZIP with I18N file names?

People these days create their ZIP archives with WinZIP, which allows for internationalized (i.e. non-latin: cyrillic, greek, chinese, you name it) file names.
Sadly, trying to unpack such file causes trouble:
UNIX unzip creates garbage-named files and dirs like "®£¤ ©¤¥èì".
Java and its jar command fail miserably on such archives.
Is there a passable way to unpack such files programmatically? UNIX or Java.
DotNetZip supports unicode and arbitrary encodings for filenames within zipfiles, either for reading or writing zips.
It's a .NET library. For Unix usage, you would need Mono as a pre-requisite.
If the zipfile is correctly constructed by WinZip, in other words if it's compliant with the zip spec from PKWare, then there's no special work you need to do to specify the encoding at the time you unpack it. According to the zip spec, there are two supported encodings used for filenames in zipfiles: UTF-8 and IBM437. The use of one or the other of these encodings is specified in the zip metadata and any zip library can detect and use it. DotNetZip automatically detects it when reading a compliant zip, like this:
using (var zip = ZipFile.Read("thearchive.zip"))
{
    foreach (var e in zip)
    {
        // e.FileName refers to the name on the entry
        e.Extract("extract-directory");
    }
}
There are archive programs that produce zips that are "non compliant" w.r.t. encoding. WinRar is one: it will create a zip that has filenames encoded in the default encoding in use on the computer. In Taipei it will use cp950, while in Iceland, something else, and in Lisbon, something else. The advantage to "non compliance" here is that Windows Explorer will open and correctly display i18n-ized filenames in such zips. In other words, "non compliance" is often what people want, because Windows doesn't (yet?) support UTF-8 zip files.
(This all has to do with the encoding used in the zipfile, not the encoding used in the files contained in the zip file)
The zip spec doesn't allow for the specification of an arbitrary text encoding in the zip metadata. In other words if you use cp950 when creating the zip, then your extract logic needs to "know" to use cp950 when extracting - nothing in the zip file carries that information. In addition, of course, the zip library you use to programmatically extract must support arbitrary encodings. As far as I know, Java's zip library does not. DotNetZip does. Like so:
using (ZipFile zip = ZipFile.Read(zipToExtract,
                                  System.Text.Encoding.GetEncoding(950)))
{
    foreach (ZipEntry e in zip)
    {
        e.Extract(extractDirectory);
    }
}
DotNetZip can also create zip files with arbitrary encodings - "non compliant" zips.
DotNetZip is free, and open source.
The solution I've found:
Apache commons-compress can unzip such archives just fine, if supplied with correct fallback charset.
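On Java 7 and later, the standard java.util.zip.ZipFile class also has a constructor that takes a Charset, which it uses to decode entry names that are not flagged as UTF-8. The sketch below lists entry names that way. The class name ZipNameLister is hypothetical, and the charset has to be supplied by the caller, since nothing in the archive records which one was used.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper (illustration only): reads entry names with a chosen charset.
final class ZipNameLister {

    // Opens the archive with the given fallback charset and returns the
    // decoded names of all entries in archive order.
    static List<String> entryNames(Path archive, Charset nameCharset) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archive.toFile(), nameCharset)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }
}
```

Whether a particular legacy charset is available depends on the Java runtime in use, so it is worth checking Charset.isSupported before relying on it.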
