Search for a string in compressed files inside a compressed file


I have a compressed file (an EAR), which contains several compressed files (EARs, WARs and JARs), which may in turn contain compressed files (JARs). Is there a way to find a specific string in this structure using UNIX commands without manually decompressing them one by one?
Thank You.

With the Java jar executable (jar tf yourEar.ear), you could list all files contained in the EAR on standard output.
But it doesn't list the contents of nested JARs recursively.
So you could:
pipe this result to a grep that matches the searched string in the filename.
for each filename ending in .jar listed in the output, extract that archive (jar xf) and reuse the same logic on it.
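
The same list-then-recurse idea can also be sketched in Java instead of shell commands. This is only a minimal sketch under some assumptions: the class and method names (NestedArchiveSearch, findEntries) are made up for illustration, the string is matched against entry names (not file contents), and nested archives are recognised purely by their extension.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class NestedArchiveSearch {

    /**
     * Returns the paths of all entries whose name contains needle,
     * descending into nested .ear/.war/.jar/.zip entries without
     * extracting anything to disk.
     */
    public static List<String> findEntries(InputStream archive, String prefix, String needle)
            throws IOException {
        List<String> hits = new ArrayList<>();
        // Not closed on purpose: closing it would also close the enclosing stream.
        ZipInputStream zip = new ZipInputStream(archive);
        ZipEntry entry;
        while ((entry = zip.getNextEntry()) != null) {
            String path = prefix + entry.getName();
            if (entry.getName().contains(needle)) {
                hits.add(path);
            }
            if (!entry.isDirectory() && isArchive(entry.getName())) {
                // The current entry's bytes are themselves a zip stream.
                hits.addAll(findEntries(zip, path + "!/", needle));
            }
        }
        return hits;
    }

    // Assumption: nested archives are identified by extension only.
    private static boolean isArchive(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        return lower.endsWith(".jar") || lower.endsWith(".war")
                || lower.endsWith(".ear") || lower.endsWith(".zip");
    }

    public static void main(String[] args) throws IOException {
        // Usage: java NestedArchiveSearch yourEar.ear SearchedString
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            for (String hit : findEntries(in, "", args[1])) {
                System.out.println(hit);
            }
        }
    }
}
```

Nested hits are printed with a "!/" separator between the archive and the entry inside it, mirroring the way jar URLs are written.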

Related

How do I move a file into a compiled .jar file using Java?

I'm making a file import system, and I can't move files into the compiled .jar file the application is in.
Here's what I'm trying to do:
Path FROM = Paths.get(filePath.getText());
Path TO = Paths.get("C:\\Users\\" + System.getProperty("user.name") +
"\\AppData\\Roaming\\.minecraft\\mods\\music_crafter-1.0\\src\\main\\resources\\assets\\music_crafter\\sounds\\block\\music_player");
//jar file
Files.move(FROM, TO.resolve(FROM.getFileName()), StandardCopyOption.REPLACE_EXISTING);

You need to handle the jar file internally. A Jar is not a directory, it is a compressed container file (pretty much a ZIP file with a different extension).
To do this, given that you are on Java 6, you have 2 options:
1. Unzip the contents to a temporary working directory (there are built-in APIs for this, or use a library such as Apache Commons Compress), do your work (copying, deleting, etc.) and then re-zip.
2. Make external command line calls to the Jar utilities that come with Java.
Of those, only (1) makes any real sense.
A third option would be available if you could upgrade your Java to 7+, which would be:
3. Use a Zip File System Provider to treat it as a file system in code
All that said, however:
As per the comments on your question, you might want to consider whether this is something you need to do at all. Why do you need to insert into existing jars? If this is 'external' data, it would be much better kept in a separate resource location/container, not the application jar.
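
For reference, option 3 (the Zip File System Provider, Java 7+) might look roughly like the following. This is a minimal sketch, not a drop-in fix: the class and method names (JarInsert, copyIntoJar) are hypothetical, it assumes the jar already exists and is writable (which may not hold for a jar that is currently in use), and it copies rather than moves the source file.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Collections;

public class JarInsert {

    /**
     * Copies source into the given jar at entryPath (e.g. "/assets/sounds/clip.ogg"),
     * replacing any existing entry with that name.
     */
    public static void copyIntoJar(Path jarFile, Path source, String entryPath)
            throws IOException {
        // "jar:file:/..." URIs are handled by the zip file system provider.
        URI uri = URI.create("jar:" + jarFile.toUri());
        try (FileSystem jarFs =
                     FileSystems.newFileSystem(uri, Collections.<String, Object>emptyMap())) {
            Path target = jarFs.getPath(entryPath);
            if (target.getParent() != null) {
                // Create intermediate directories inside the jar if needed.
                Files.createDirectories(target.getParent());
            }
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        } // Closing the file system writes the changes back to the jar.
    }
}
```

A call could look like JarInsert.copyIntoJar(Paths.get("app.jar"), FROM, "/assets/music_crafter/sounds/" + FROM.getFileName()), where the entry path is only an example.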

JAR - Listing files into a folder

I would like to get a list of the files contained in a directory which is in a jar package.
I have an "images" folder, and within it I have an Images class that should load all images from that directory.
In the past I used MyClass.class.getResourceAsStream("filename"); to read files, but how do I read a directory?
This is what I tried:
System.out.println(Images.class.getResource("").getPath());
System.out.println(new File(Images.class.getResource("").getPath()).listFiles());
I tried with Images.class.getResource because I have to work with File and there isn't a constructor that accepts an InputStream.
The code produces
file:/home/k55/Java/MyApp/dist/Package.jar!/MyApp/images/
null
So it is finding the folder which I want to list files from, but it is not able to list files.
I've read on other forums that in fact you can't use this method for folders in a jar archive, so how can I accomplish this?
Update: if possible, I would like to read files without having to use the ZipInputStream

You can't do that easily.
What you need to do:
1. Get the path of the jar file: Images.class.getResource("/something/that/exists").getPath()
2. Strip "!/something/that/exists".
3. Use Zip File System to browse the Jar file.
It's a little bit of hacking.
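
The steps above could be sketched as follows, assuming Java 8 or later. The names (JarDirectoryLister, jarPathOf, list) are hypothetical, and the helper assumes the resource URL has the usual jar:file:...!/... shape, which is not guaranteed in every deployment.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class JarDirectoryLister {

    /**
     * Steps 1 and 2: turns "jar:file:/path/Package.jar!/MyApp/images/"
     * into the path of Package.jar by stripping "jar:" and everything from "!/".
     */
    public static Path jarPathOf(String resourceUrl) {
        String fileUri = resourceUrl.substring("jar:".length(), resourceUrl.indexOf("!/"));
        return Paths.get(URI.create(fileUri));
    }

    /** Step 3: lists the names directly inside a directory of the jar, sorted. */
    public static List<String> list(Path jarFile, String directory) throws IOException {
        URI uri = URI.create("jar:" + jarFile.toUri());
        try (FileSystem jarFs =
                     FileSystems.newFileSystem(uri, Collections.<String, Object>emptyMap());
             Stream<Path> entries = Files.list(jarFs.getPath(directory))) {
            // Collect before the file system is closed.
            return entries.map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

With the question's layout, usage might be list(jarPathOf(Images.class.getResource("").toString()), "/MyApp/images").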

Java Desktop Application - how to obtain list of files with similar file names in a specific folder

I have a Java Desktop Application, in which at an intermediate stage, some files with the following file names are generated
file-01-1.xml
file-01-2.xml
file-01-3.xml
and so on.
The number of files with such names is variable. I want to determine the total number of files with this kind of name, so that I can then do further processing on them. (These files are generated by a DOS command which does not report the number of files generated in its output, and the number varies depending on the input file, hence this problem.)

You can implement a pure Java solution using File.list() or File.listFiles(), combined with a FilenameFilter or FileFilter. Both methods return arrays, so you can read the array length to get the number of files.
This approach might be inefficient if the number of files is very large (e.g. thousands). In that case I'd suggest a platform-specific solution that runs an external command through a shell, such as ls file* | wc -l.

You can use a custom FilenameFilter when listing files in the output folder.
See: http://download.oracle.com/javase/6/docs/api/java/io/File.html#list%28java.io.FilenameFilter%29

Use File.listFiles(FilenameFilter) or similar methods.
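
A minimal sketch of the FilenameFilter approach suggested in the answers above. The class name and the regular expression are assumptions based on the example names in the question (file-01-1.xml, file-01-2.xml, ...); adjust the pattern to the real naming scheme.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.regex.Pattern;

public class GeneratedFileCounter {

    // Assumed naming scheme: file-01-<number>.xml
    private static final Pattern NAME = Pattern.compile("file-01-\\d+\\.xml");

    /** Returns the generated files in dir; the count is the array length. */
    public static File[] findGenerated(File dir) {
        File[] matches = dir.listFiles(new FilenameFilter() {
            @Override
            public boolean accept(File parent, String name) {
                return NAME.matcher(name).matches();
            }
        });
        // listFiles returns null if dir is not a directory or cannot be read.
        return matches == null ? new File[0] : matches;
    }
}
```

The number of files is then findGenerated(outputDir).length, and the same array can be used for the further processing.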

How to get rid of Thumbs.db file in Windows NTFS file system?

My web-based Java application stores files on a local drive (e.g. D:/AppData). It scans a folder for files (String[] nameOfFiles = dirName.list();) and displays all the files in the folder. Thumbs.db shows up among them. How can I omit that file? For now, I am deleting it before scanning the folder.
Is there any other way in Java to skip that file when scanning?

Assuming that dirName is a File object then File.list() has an overloaded member that takes a FilenameFilter object which can be used to filter the list of files returned.
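
As a rough sketch of that overload, assuming Java 8 or later for the lambda syntax (the class and method names are made up for illustration, and the case-insensitive comparison is a design choice, not something the question specifies):

```java
import java.io.File;

public class SkipThumbs {

    /** Lists the names in dirName, leaving out Thumbs.db. */
    public static String[] listWithoutThumbs(File dirName) {
        // File.list(FilenameFilter): the lambda receives (directory, name).
        String[] names = dirName.list((dir, name) -> !"Thumbs.db".equalsIgnoreCase(name));
        // list returns null if dirName is not a directory or cannot be read.
        return names == null ? new String[0] : names;
    }
}
```

This leaves the file on disk and only hides it from the listing, so there is no need to delete it before each scan.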

Unzip files created with WinZIP with I18N file names?

People these days create their ZIP archives with WinZIP, which allows for internationalized (i.e. non-latin: cyrillic, greek, chinese, you name it) file names.
Sadly, trying to unpack such file causes trouble:
UNIX unzip creates garbage-named files and dirs like "®£¤ ©¤¥èì".
Java and its jar command fail miserably on such archives.
Is there a passable way to unpack such files programmatically? UNIX or Java.

DotNetZip supports unicode and arbitrary encodings for filenames within zipfiles, either for reading or writing zips.
It's a .NET library. For Unix usage, you would need Mono as a pre-requisite.
If the zipfile is correctly constructed by WinZip, in other words if it's compliant with the zip spec from PKWare, then there's no special work you need to do to specify the encoding at the time you unpack it. According to the zip spec, there are two supported encodings used for filenames in zipfiles: UTF-8 and IBM437. The use of one or the other of these encodings is specified in the zip metadata and any zip library can detect and use it. DotNetZip automatically detects it when reading a compliant zip, like this:
using (var zip = ZipFile.Read("thearchive.zip"))
{
    foreach (var e in zip)
    {
        // e.FileName refers to the name on the entry
        e.Extract("extract-directory");
    }
}
There are archive programs that produce zips that are "non compliant" w.r.t. encoding. WinRAR is one - it will create a zip that has filenames encoded in the default encoding in use on the computer. In Shanghai it will use cp936, while in Iceland, something else, and in Lisbon, something else. The advantage to "non compliance" here is that Windows Explorer will open and correctly display i18n-ized filenames in such zips. In other words, "non compliance" is often what people want, because Windows doesn't (yet?) support UTF-8 zip files.
(This all has to do with the encoding used in the zipfile, not the encoding used in the files contained in the zip file)
The zip spec doesn't allow for the specification of an arbitrary text encoding in the zip metadata. In other words if you use cp950 when creating the zip, then your extract logic needs to "know" to use cp950 when extracting - nothing in the zip file carries that information. In addition, of course, the zip library you use to programmatically extract must support arbitrary encodings. As far as I know, Java's zip library does not. DotNetZip does. Like so:
using (ZipFile zip = ZipFile.Read(zipToExtract,
                                  System.Text.Encoding.GetEncoding(950)))
{
    foreach (ZipEntry e in zip)
    {
        e.Extract(extractDirectory);
    }
}
DotNetZip can also create zip files with arbitrary encodings - "non compliant" zips.
DotNetZip is free, and open source.

The solution I've found:
Apache commons-compress can unzip such archives just fine, if supplied with the correct fallback charset.
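
The same idea, supplying the filename charset explicitly, can also be sketched with the JDK alone, assuming Java 7 or later, where java.util.zip.ZipInputStream has a constructor that takes a Charset. This is not the commons-compress API, and the class and method names below are hypothetical. As noted above, the charset itself still has to be known or guessed by the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class LegacyZipNames {

    /**
     * Reads the entry names of a zip whose filenames were written in
     * nameCharset (for example Charset.forName("IBM866") for Cyrillic DOS names).
     * Entries flagged as UTF-8 in the zip metadata are still decoded as UTF-8.
     */
    public static List<String> entryNames(InputStream in, Charset nameCharset)
            throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in, nameCharset)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }
}
```

Extraction works the same way: read each entry's bytes from the stream after getNextEntry() and write them to a file named after entry.getName().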
