Recursively finding only directories with FileUtils.listFiles - java

I want to collect a list of all files under a directory, in particular including subdirectories. I like not doing things myself, so I'm using FileUtils.listFiles from Apache Commons IO. So I have something like:
import java.io.File;
import java.util.Collection;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.filefilter.TrueFileFilter;
public class TestListFiles {
public static void main(String[] args) {
Collection<File> found = FileUtils.listFiles(new File("foo"),
TrueFileFilter.INSTANCE, TrueFileFilter.INSTANCE);
for (File f : found) {
System.out.println("Found file: " + f);
}
}
}
Problem is, this only appears to find normal files, not directories:
$ mkdir -p foo/bar/baz; touch foo/one_file
$ java -classpath commons-io-1.4.jar:. TestListFiles
Found file: foo/one_file
I'm already passing TrueFileFilter to both of the filters, so I can't think of anything more inclusive. I want it to list: "foo", "foo/one_file", "foo/bar", "foo/bar/baz" (in any order).
I would accept non-FileUtils solutions as well, but it seems silly to have to write my own BFS, or even to collect the set of parent directories from the list I do get. (That would miss empty subdirectories anyway.) This is on Linux, FWIW.

This is an old question, but this works for me:
FileUtils.listFilesAndDirs(new File(dir), TrueFileFilter.INSTANCE, DirectoryFileFilter.DIRECTORY);
This returns both files and directories (listFilesAndDirs is available from Commons IO 2.2 onward). To get only directories and no files, I use:
FileUtils.listFilesAndDirs(new File(dir), new NotFileFilter(TrueFileFilter.INSTANCE), DirectoryFileFilter.DIRECTORY);

Have you tried simply:
File rootFolder = new File(...);
File[] folders = rootFolder.listFiles((FileFilter) FileFilterUtils.directoryFileFilter());
It seems to work for me.
You will need recursion, of course.
Hope it helps.
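The recursion mentioned above could look roughly like the sketch below. It uses only java.io rather than Commons IO; the class name DirectoryLister and its method names are made up for illustration. It assumes the tree contains no symbolic-link cycles, since isDirectory() follows links.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.List;

class DirectoryLister {
    // Accepts directories only; plain java.io, no Commons IO needed.
    private static final FileFilter DIRECTORIES_ONLY = new FileFilter() {
        public boolean accept(File f) {
            return f.isDirectory();
        }
    };

    /** Returns every directory below root (root itself excluded), depth-first. */
    static List<File> listDirectories(File root) {
        List<File> result = new ArrayList<File>();
        collect(root, result);
        return result;
    }

    private static void collect(File dir, List<File> result) {
        File[] children = dir.listFiles(DIRECTORIES_ONLY);
        if (children == null) {
            return; // not a directory, or an I/O error occurred
        }
        for (File child : children) {
            result.add(child);
            collect(child, result); // descend into the subdirectory
        }
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        for (File d : listDirectories(root)) {
            System.out.println("Found directory: " + d);
        }
    }
}
```

Add root to the result yourself if you also want the starting directory listed.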

I avoid the Java IO libraries in most of my non-trivial applications, preferring Commons VFS instead. I believe a call to this method with the appropriate parameters will accomplish your goal, though I'll grant it's a long way to go for the functionality.
Specifically, this code will do what you want:
FileObject[] files = fileObject.findFiles(new FileSelector() {
public boolean includeFile(FileSelectInfo fileInfo) {
return fileInfo.getFile().getType() == FileType.FOLDER;
}
public boolean traverseDescendents(FileSelectInfo fileInfo) {
return true;
}
});
where fileObject is an instance of FileObject.

If you look at the source code and read between the lines in the JavaDoc, you will see that -- unfortunately -- this API is not designed to do what you want. It will return a list of files (not a list of files and directories) that match the provided arguments. In the source code -- look at the method innerListFiles -- you will see that directories are searched and not added to the result list.
I am not aware of any public API that will do what you want. Hopefully someone else will know of one. Most will probably be a DFS, not a BFS, which may or may not matter for your purposes. (So far, all Java code I've ever looked at that did a directory tree traversal did it via a depth-first search. Which doesn't mean that BFS's aren't out there, of course.)
If you really want a list of everything under a given directory, it's easy enough to roll your own. But I understand your wish to not reinvent the wheel.
Note: It's possible that Apache Commons Finder will support what you need, but this library is in The Commons Sandbox, which means it is more experimental at this stage. It may or may not be complete and it may or may not be maintained. It also may be heavyweight for what you are looking for.
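If rolling your own is acceptable and Java 8 or later is available (an assumption; the question predates it), java.nio.file.Files.walk returns the starting directory plus every file and directory beneath it, which matches the output the question asks for. This is a minimal sketch; the class name TreeLister is made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TreeLister {
    /** Returns root plus every file and directory beneath it, in depth-first order. */
    static List<Path> listAll(Path root) throws IOException {
        // Files.walk returns a lazy stream that holds open directories, so close it.
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : "foo");
        for (Path p : listAll(root)) {
            System.out.println("Found: " + p);
        }
    }
}
```

Files.walk is depth-first and does not follow symbolic links by default, so empty subdirectories are included and link cycles are not a concern.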

A simpler and complete Commons VFS solution:
FileSystemManager fsManager = VFS.getManager();
FileObject fileObject = fsManager.resolveFile("yourFileNameHere");
FileObject[] files = fileObject.findFiles(new FileTypeSelector(FileType.FOLDER));

It should work, based on their API.
Here is my own version of FileUtils; it is not as complete as Commons IO and contains only what I need. Search for findFiles, or use iterate to avoid creating huge lists (most of the time you just want to do something with those files, so collecting them in a List doesn't make sense).

Related

Hadoop setting fsPermission recursively to dir using Java

Hi, I have a test program which loads files into HDFS at the path user/user1/data/app/type/file.gz. This test program is run multiple times by multiple users, so I want to set the file permission to rwx so that anyone can delete this file. I have the following code:
fs.setPermission(new Path("user/user1/data"), new FsPermission(FsAction.ALL, FsAction.ALL, FsAction.ALL));
The line above gives drwxrwxrwx to all dirs, but file.gz still gets -rw-r--r--. Why? Because of this, users other than me are not able to delete the file through the test program. I can delete it because I have full permission.
Please guide. I am new to Hadoop. Thanks in advance.
Using the FsShell API solved my directory permission problem. It may not be the optimal way, but since I am solving it for test code it should be fine.
FsShell shell = new FsShell(conf);
try {
shell.run(new String[]{"-chmod", "-R", "777", "user/user1/data"});
}
catch (Exception e) {
LOG.error("Couldn't change the file permissions", e);
throw new IOException(e);
}
I think Hadoop doesn't provide a Java API to set permissions recursively in the version you are using. Your code actually sets the permission on the directory user/user1/data and nothing else. You are better off using the FileSystem.listStatus(Path f) method to list all files in the directory and calling FileSystem.setPermission on each of them individually. This currently works on my 2.0.5-alpha-gphd-2.1.0.0 cluster.
From Hadoop 2.2.0 onward there is also a new constructor:
FsPermission(FsAction u, FsAction g, FsAction o, boolean sb)
Its boolean parameter sb is the sticky bit, not a recursive flag, so it does not apply the permission to child paths. The documentation (including the parameter name) is rather terse on this.
Also have a look at Why does "hadoop fs -mkdir" fail with Permission Denied? although your situation may be different (in my case dfs.permissions.enabled is still true).
I wrote this in scala but I think you could easily adapt it:
def changeUserGroup(user:String,fs:FileSystem, path: Path): Boolean ={
val changedPermission = new FsPermission(FsAction.ALL,FsAction.ALL,FsAction.ALL, true)
val fileList = fs.listFiles(path, true)
while (fileList.hasNext()) {
fs.setPermission(fileList.next().getPath(),changedPermission)
}
return true
}
You will also have to add error handling; as written, the function always returns true.

Determine the location of a Java package

I need to find the jar from a Java project that provides a certain logical Java package (e.g. com.example.functionality), but there are hundreds of them, and their names aren't particularly useful.
How to find out the mappings that are created between dirs/files/jars and packages/classes?
obj.getClass().getProtectionDomain().getCodeSource()
See: javadoc
You can do it in code:
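// Class.forName needs a class name, not a package name; SomeClass is a placeholder for any class in the package
Class<?> myClass = Class.forName("com.example.functionality.SomeClass");
// e.g. /com/example/functionality/SomeClass.class
String classfilePath = '/' + myClass.getName().replace('.', '/') + ".class";
URL location = myClass.getResource(classfilePath);
That URL will point at the class file inside the JAR (or inside the class folder if it isn't in a JAR), so the JAR path can be read from it.
Slightly hacky though, and it may not work for all classloaders.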
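Both lookups mentioned in these answers can be put side by side in a small helper. This is only a sketch: the class name ClassLocator and its method names are made up for illustration. Note that getCodeSource() returns null for classes loaded by the bootstrap loader (such as java.lang.String), so the first method has to allow for that.

```java
import java.net.URL;
import java.security.CodeSource;

class ClassLocator {
    /** Location recorded by the class loader (typically the JAR or class folder), or null if none. */
    static URL codeSourceOf(Class<?> c) {
        CodeSource source = c.getProtectionDomain().getCodeSource();
        return source == null ? null : source.getLocation();
    }

    /** URL of the .class resource itself, or null if it cannot be resolved. */
    static URL classFileOf(Class<?> c) {
        String resource = "/" + c.getName().replace('.', '/') + ".class";
        return c.getResource(resource);
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name, e.g. a class from the package you are hunting for.
        Class<?> c = Class.forName(args.length > 0 ? args[0] : "java.lang.String");
        System.out.println("Code source: " + codeSourceOf(c));
        System.out.println("Class file:  " + classFileOf(c));
    }
}
```

Running it against one class from the package in question, with all the candidate JARs on the classpath, should print which JAR actually supplied that class.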
For a one-off search, http://www.jarfinder.com/ is handy. It has an impressive index, which seems to know about everything in Maven Central as well as many other download sites around the web, and lets you search by class name to find which JARs contain that class.

Java 1.4.2 File.listFiles not working properly with CIFS mounts - workaround?

I'm using Java 1.4.2 and Debian 6.0.3. There's a shared Windows folder in the network, which is correctly mounted to /mnt/share/ via fstab (e.g. it's fully visible from OS and allows all operations) using CIFS. However, when I try to do this in Java:
System.out.println(new File("/mnt/share/").listFiles().length);
it would always return 0, meaning File[] returned by listFiles is empty. The same problem applies to every subdirectory of /mnt/share/. list returns empty array as well. Amusingly enough, other File functions like "create", "isDirectory" or even "delete" work fine. Directories mounted from USB flash drive (fat32) also work fine.
I tested this on 2 different "shared folders" from different Windows systems; one using domain-based authentication system, another using "simple sharing" - that is, guest access. The situation seems weird, since mounted directories should become a part of a file system, so any program could use it. Or so I thought, at least.
I want to delete a directory in my program, and I currently see no other way of doing it except recursive walking on listFiles, so this bug becomes rather annoying. The only "workaround" I could think of is to somehow run an external bash script, but it seems like a terrible solution.
Edit: It seems this is a 1.4.2-specific bug; everything works fine in Java 6. But I can't migrate, so the problem remains.
Could you suggest some workaround? Preferably without switching to third-party libs instead of native ones, I can't say I like the idea of rewriting the whole project for the sake of single code line.
The method File.getCanonicalFile() has been available since Java 1.2. For a mounted directory like yours, use it like this:
new File("/mnt/share/").getCanonicalFile().listFiles()
So, two and a half years after giving up, I ran into the same problem again, still stuck with 1.4.2 because I need to embed the code into an obsolete Oracle Forms 10g version.
If someone, by chance, stumbles onto this problem and decides to solve it properly, not hack his way through, it most probably has to do with (highly) unusual inode mapping that CIFS does upon mounting the remote filesystem, causing more obscure bugs some of which can be found on serverfault. One of the side-effects of such mapping is that all directories have zero hard-link count. Another one is that all directories have "size" of exactly 0, instead of usual "sector size or more", which can be checked even with ls.
I can't be sure without examining the (proprietary) source code, but I can guess that Java prior to 1.5 used some shortcut like checking link count internally instead of actually calling readdir() with C, which works equally well for any mounted FS.
Anyway, the second side-effect can be used to create a simple wrapper around File which won't rely on system calls unless it suspects a directory is mounted using CIFS. Other versions of list and listFiles functions in java.io.File, even ones using filters, rely on list() internally, so it's OK to override only it.
I didn't care about listFiles returning File[] rather than FileEx[], so I didn't bother to override it, but it should be simple enough. Obviously, this code can work only on Unix-like systems that have the ls command handy.
package FSTest;
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
public class FileEx extends File
{
public FileEx(String path)
{
super(path);
}
public FileEx(File f)
{
super(f.getAbsolutePath());
}
public String[] list()
{
if (this.canRead() && this.isDirectory())
{
/*
* Checking the length of dir is not the most reliable way to distinguish CIFS mounts.
* However, zero directory length generally indicates something unusual,
* so calling ls on it wouldn't hurt. Ordinary directories don't suffer any overhead this way.
* If this "zero-size" behavior is ever changed by CIFS but list() still won't work,
* it will be safer to call super.list() first and call this.listUsingExec if returned array has 0 elements.
* Though it might have serious performance implications, of course.
*/
if (this.length() > 0)
return super.list();
else
return this.listUsingExec();
}
else
return null;
}
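private String[] listUsingExec()
{
// Passing the arguments as an array keeps paths that contain spaces intact.
String[] command = new String[] { "/bin/ls", "-1a", this.getAbsolutePath() };
ArrayList list = new ArrayList();
try
{
Process p = Runtime.getRuntime().exec(command);
BufferedReader reader = new BufferedReader(new InputStreamReader(p.getInputStream()));
// Read the output before waiting; otherwise a long listing can fill the pipe and block ls.
for (String line = reader.readLine(); line != null; line = reader.readLine())
{
if (!line.equals(".") && !line.equals(".."))
list.add(line);
}
reader.close();
p.waitFor();
String[] ret = new String[list.size()];
list.toArray(ret);
return ret;
}
catch (IOException e)
{
return null;
}
catch (InterruptedException e)
{
// waitFor() declares this checked exception; restore the interrupt flag and give up.
Thread.currentThread().interrupt();
return null;
}
}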
}

How can I make OS X recognize drive letters?

I know. Heresy. But I'm in a bind. I have a lot of config files that use absolute path names, which creates an incompatibility between OS X and Windows. If I can get OS X (which I'm betting is the more flexible of the two) to recognize Q:/foo/bar/bim.properties as a valid absolute file name, it'll save me days of work spelunking through stack traces and config files.
In the end, I need this bit of Java test code to print "SUCCESS!" when it runs:
import java.io.*;
class DriveLetterTest {
static public void main(String... args) {
File f = new File("S:");
if (f.isDirectory()) {
System.out.println("SUCCESS!");
} else {
System.out.println("FAIL!");
}
}
}
Anyone know how this can be done?
UPDATE: Thanks for all the feedback, everyone. It's now obvious to me I really should have been clearer in my question.
Both the config files and the code that uses them belong to a third-party package I cannot change. (Well, I can change them, but that means incurring an ongoing maintenance load, which I want to avoid if at all possible.)
I'm in complete agreement with all of you who are appalled by this state of affairs. But the fact remains: I can't change the third-party code, and I really want to avoid forking the config files.
Short answer: No.
Long answer: In Java you can read system properties with System.getProperty(...).
Then you can load a Properties file or configuration based on what you find in os.name.
Alternate solution: just strip off the S: when you read the existing configuration files on non-Windows machines, and replace it with the appropriate path.
Opinion: Personally I would bite the bullet and deal with the technical debt now, fix all the configuration files at build time when the deployment for OSX is built and be done with it.
public class WhichOS
{
public static void main(final String[] args)
{
System.out.format("System.getProperty(\"os.name\") = %s\n", System.getProperty("os.name"));
System.out.format("System.getProperty(\"os.arch\") = %s\n", System.getProperty("os.arch"));
System.out.format("System.getProperty(\"os.version\") = %s\n", System.getProperty("os.version"));
}
}
the output on my iMac is:
System.getProperty("os.name") = Mac OS X
System.getProperty("os.arch") = x86_64
System.getProperty("os.version") = 10.6.4
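The "strip off the S:" alternative above could look roughly like the following sketch. It assumes the configuration is loaded into a java.util.Properties object by code you control; the class name DrivePrefixStripper and its methods are made up for illustration, and the OS name is passed in as a parameter so the logic can be exercised on any platform.

```java
import java.util.Locale;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DrivePrefixStripper {
    // One letter and a colon at the start, followed by a separator or nothing, e.g. "Q:" in "Q:/foo".
    private static final Pattern DRIVE_PREFIX = Pattern.compile("^[A-Za-z]:(?=[/\\\\]|$)");

    /** Removes a leading drive letter on non-Windows systems; other values pass through unchanged. */
    static String strip(String value, String osName) {
        if (osName.toLowerCase(Locale.ROOT).contains("windows")) {
            return value;
        }
        Matcher m = DRIVE_PREFIX.matcher(value);
        if (!m.find()) {
            return value;
        }
        String rest = value.substring(m.end()).replace('\\', '/');
        return rest.isEmpty() ? "/" : rest; // a bare "S:" maps to the root
    }

    /** Returns a copy of the properties with every value passed through strip(). */
    static Properties stripAll(Properties in, String osName) {
        Properties out = new Properties();
        for (String key : in.stringPropertyNames()) {
            out.setProperty(key, strip(in.getProperty(key), osName));
        }
        return out;
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name");
        System.out.println(strip("Q:/foo/bar/bim.properties", os));
    }
}
```

This only helps where you control the code that reads the configuration, which the question's update says is not the case for the third-party package.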
Honestly, don't hard-code absolute paths in a program, even for a single-platform app. Do the correct thing.
The following is my wrong solution, kept to remind myself not to give misdirected advice again ... shame on me.
Just create a symbolic link named Q: in the root directory /, pointing to / itself.
$ cd /
$ ln -s / Q:
$ ln -s / S:
You might need to use sudo. Then, at the start of your program, just chdir to /.
If you don't want Q: and S: to show up in the Finder, perform
$ /Developer/Tools/SetFile -P -a V Q:
$ /Developer/Tools/SetFile -P -a V S:
which set the invisible-to-the-Finder bit of the files.
The only way you can replace java.io.File is to replace that class in rt.jar.
I don't recommend that, but the best way to do this is to grab a bsd-port of the OpenJDK code, make necessary changes, build it and redistribute the binary with your project. Write a shell script to use your own java binary and not the built-in one.
PS. Just change your config files! Practice your regex skills and save yourself a lot of time.
If you are not willing to change your config file per OS, what are they for in first place?
Every installation should have its own set of config files and use it accordingly.
But if you insist, you just have to detect the OS and, if it is not Windows, ignore the drive letter.
Something along these lines:
boolean isWindows = System.getProperty("os.name").toLowerCase()
.contains("windows");
String folder = "S:";
if (!isWindows && folder.matches("\\w:")) {
folder = "/"; // a bare drive letter maps to the root
} else if (!isWindows && folder.matches("\\w:.+")) {
folder = folder.substring(2); // drop the two-character drive prefix, e.g. "S:"
}
You get the idea.
Most likely you'd have to provide a different java.io.File implementation that can parse out the file paths correctly, maybe there's one someone already made.
The real solution is to put this kind of stuff (hard-coded file paths) in configuration files and not in the source code.
Just tested something out, and discovered something interesting: In Windows, if the current directory is on the same logical volume (i.e. root is the same drive letter), you can leave off the drive letter when using a path. So you could just trim off all those drive letters and colons and you should be fine as long as you aren't using paths to items on different disks.
Here's what I finally ended up doing:
I downloaded the source code for the java.io package, and tweaked the code for java.io.File to look for path names that start with a letter and a colon. If it finds one, it prepends "/Volumes/" to the path name, coughs a warning into System.err, then continues as normal.
I've added symlinks under /Volumes to the "drives" I need mapped, so I have:
/Volumes/S:
/Volumes/Q:
I put it into its own jar, and put that jar at the front of the classpath for this project only. This way, the hack affects only me, and only this project.
Net result: java.io.File sees a path like "S:/bling.properties", and then checks the OS. If the OS is OS X, it prepends "/Volumes/", and looks for a file in /Volumes/S:/bling.properties, which is fine, because it can just follow the symlink.
Yeah, it's ugly as hell. But it gets the job done for today.
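The path rewriting described above can be sketched as a standalone helper. This is not the patched java.io.File from the answer, only an illustration of the mapping rule it describes; the class name VolumesMapper and the osName parameter are assumptions made so the logic can be tried on any platform.

```java
class VolumesMapper {
    /**
     * On OS X, rewrites a path that starts with a drive letter and a colon
     * (e.g. "S:/bling.properties") to live under /Volumes/. Other paths are returned unchanged.
     */
    static String map(String path, String osName) {
        boolean isMac = osName.toLowerCase(java.util.Locale.ROOT).contains("mac");
        boolean hasDrive = path.matches("[A-Za-z]:.*");
        if (!isMac || !hasDrive) {
            return path;
        }
        // Mirrors the warning the answer says it prints before continuing.
        System.err.println("Warning: mapping drive-letter path " + path + " under /Volumes/");
        return "/Volumes/" + path;
    }

    public static void main(String[] args) {
        System.out.println(map("S:/bling.properties", System.getProperty("os.name")));
    }
}
```

With symlinks named /Volumes/S: and /Volumes/Q: in place, the rewritten path resolves through the link to the real directory.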

Unzip Archive with Groovy

Is there built-in support in Groovy for handling Zip files (the Groovy way)?
Or do I have to use Java's java.util.zip.ZipFile to process Zip files in Groovy?
Maybe Groovy doesn't have 'native' support for zip files, but it is still pretty trivial to work with them.
I'm working with zip files and the following is some of the logic I'm using:
def zipFile = new java.util.zip.ZipFile(new File('some.zip'))
zipFile.entries().each {
println zipFile.getInputStream(it).text
}
You can add additional logic using a findAll method:
def zipFile = new java.util.zip.ZipFile(new File('some.zip'))
zipFile.entries().findAll { !it.directory }.each {
println zipFile.getInputStream(it).text
}
In my experience, the best way to do this is to use AntBuilder:
def ant = new AntBuilder() // create an antbuilder
ant.unzip( src:"your-src.zip",
dest:"your-dest-directory",
overwrite:"false" )
This way you aren't responsible for doing all the complicated stuff - ant takes care of it for you. Obviously if you need something more granular then this isn't going to work, but for most 'just unzip this file' scenarios this is really effective.
To use antbuilder, just include ant.jar and ant-launcher.jar in your classpath.
AFAIK, there isn't a native way. But check out this article on how you'd add a .zip(...) method to File, which would be very close to what you're looking for. You'd just need to make an .unzip(...) method.
The Groovy common extension project provides this functionality for Groovy 2.0 and above: https://github.com/timyates/groovy-common-extensions
The Groovy methods below unzip into a specific folder (C:\folder). Hope this helps.
import org.apache.commons.io.FileUtils
import java.nio.file.Files
import java.nio.file.Paths
import java.util.zip.ZipFile
def unzipFile(File file) {
cleanupFolder()
def zipFile = new ZipFile(file)
zipFile.entries().each { it ->
def path = Paths.get('c:\\folder\\' + it.name)
if(it.directory){
Files.createDirectories(path)
}
else {
def parentDir = path.getParent()
if (!Files.exists(parentDir)) {
Files.createDirectories(parentDir)
}
Files.copy(zipFile.getInputStream(it), path)
}
}
}
private cleanupFolder() {
FileUtils.deleteDirectory(new File('c:\\folder\\'))
}
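For comparison, here is a plain Java sketch of the same extraction using only java.util.zip and java.nio.file, which Groovy can call directly. The class name Unzipper is made up for illustration. Unlike the snippet above, it takes the target directory as a parameter, does not delete anything first, overwrites existing files, and rejects entries whose names would escape the target directory.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

class Unzipper {
    /** Extracts every entry of zip into targetDir, creating directories as needed. */
    static void unzip(Path zip, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        try (ZipFile zipFile = new ZipFile(zip.toFile())) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                Path out = root.resolve(entry.getName()).normalize();
                // Refuse entries such as "../evil" that would land outside targetDir.
                if (!out.startsWith(root)) {
                    throw new IOException("Entry outside target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    try (InputStream in = zipFile.getInputStream(entry)) {
                        Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: Unzipper <archive.zip> <target-dir>");
            return;
        }
        unzip(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```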
This article expands on the AntBuilder example.
http://preferisco.blogspot.com/2010/06/using-goovy-antbuilder-to-zip-unzip.html
However, as a matter of principle: is there a way to find out all of the properties, closures, maps etc. that can be used when researching a new facet of Groovy/Java?
There seem to be loads of really useful things, but how to unlock their hidden treasures? The NetBeans/Eclipse code-complete features now seem hopelessly limited in the new language richness that we have here.
Unzipping with AntBuilder is a good way.
A second option is to use a third-party library; I recommend Zip4j.
Although taking the question a bit into another direction, I started off using Groovy for a DSL that I was building, but ended up using Gradle as a starting point to better handle a lot of the file-based tasks that I wanted to do (eg., unzip and untar files, execute other programs, etc). Gradle builds on what groovy can do, and can be extended further via plugins.
// build.gradle
task doUnTar << {
copy {
// tarTree uses file ext to guess compression, or may be specific
from tarTree(resources.gzip('foo.tar.gz'))
into getBuildDir()
}
}
task doUnZip << {
copy {
from zipTree('bar.zip')
into getBuildDir()
}
}
Then, for example (this extracts bar.zip and foo.tar.gz into the build directory):
$ gradle doUnZip
$ gradle doUnTar
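import java.util.zip.GZIPOutputStream

// Compresses a string with GZIP and returns the result Base64-encoded
def zip(String s){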
def targetStream = new ByteArrayOutputStream()
def zipStream = new GZIPOutputStream(targetStream)
zipStream.write(s.getBytes())
zipStream.close()
def zipped = targetStream.toByteArray()
targetStream.close()
return zipped.encodeBase64()
}
