My task is to show a tree of all the directories/files on a PC drive. I have a class DirectoryNode that extends DefaultMutableTreeNode and has a File field, directoryPath. I build the nodes recursively:
public void buildDirectoryTree() {
    if (!directoryPath.isDirectory()) {
        return;
    }
    for (File f : directoryPath.listFiles()) {
        if (f.isHidden() || !f.exists()) continue;
        DirectoryNode newChild = new DirectoryNode(f);
        add(newChild);
        newChild.buildDirectoryTree();
    }
}
It works fine for specific directories, but when I try to use it for a whole drive, or some large directories, the JTree with this node does not show up at all.
I think it runs into a problem with specific directories. I've added exists() and isHidden() checks to skip these problem roots, but it didn't help.
In addition, exists(), isHidden(), and isDirectory() return false for some of my valid directories (I am using Windows 10).
File.listFiles() is one of those ancient methods that violate the Java convention/good practice of never returning null from a method that returns an array. So you have to check for null.
From the docs:
An array of abstract pathnames denoting the files and directories in
the directory denoted by this abstract pathname. The array will be
empty if the directory is empty. Returns null if this abstract
pathname does not denote a directory, or if an I/O error occurs.
I have changed your code to make it a little safer. If it's called from the EDT, you might want to add some log message or the like instead of throwing the exception into nirvana.
public void buildDirectoryTree() throws IOException {
    if (!directoryPath.isDirectory()) {
        return;
    }
    final File[] files = directoryPath.listFiles();
    if (files != null) {
        for (File f : files) {
            if (f.isHidden() || !f.exists()) continue;
            DirectoryNode newChild = new DirectoryNode(f);
            add(newChild);
            newChild.buildDirectoryTree();
        }
    } else {
        throw new IOException("Failed to list files for " + directoryPath);
    }
}
As others have pointed out, there are more modern APIs, and they were introduced for good reasons. I recommend reading up on NIO.2 and the Path API for better solutions.
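For instance, here is a minimal sketch of the same traversal using NIO.2, assuming (as in the question) a DirectoryNode with a File field directoryPath and a constructor that takes a File:

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public void buildDirectoryTree() throws IOException {
    Path dir = directoryPath.toPath();
    if (!Files.isDirectory(dir)) {
        return;
    }
    // Unlike File.listFiles(), newDirectoryStream() never returns null;
    // an unreadable directory surfaces as an IOException instead
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
        for (Path child : stream) {
            if (Files.isHidden(child)) continue;
            DirectoryNode newChild = new DirectoryNode(child.toFile());
            add(newChild);
            newChild.buildDirectoryTree();
        }
    }
}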
Before you speculate something like "This guy is asking for homework help", I'll go ahead and clear any doubts you may have and say yes, this is related to homework. However, I hope that does not take away from the learning that this question provides to me and/or anyone who reads this in the future.
Background: We're currently working on recursion, and our assignment asks that we write a program that uses command-line arguments to check a directory and its files' contents for a string (which is also a command argument). We must use recursion for this.
-I want to make this clear that I UNDERSTAND WHAT THE ASSIGNMENT IS ASKING
I am simply asking how this would work recursively, because I just don't get it.
We did a problem where we had to find the size of a directory and it made sense, but I don't get how to check whether something is a directory or a file and, based on that, either read its contents or go deeper into the directory until we find a file.
Here's what I've currently done. Not too sure how wrong this is, as I'm basing it entirely off of the 'check the size of a directory' assignment we previously did:
The folder that I'm checking is structured something like this:
Directory ---> files inside the main directory ---> two directories ---> files within both of those directories
public class SearchingForStrings {

    public static void main(String[] args) {
        String path = "."; // default location of this project
        File sf = new File(path);
        String mysteriesDirectory = args[0];
        String keyString = args[1];
        countLinesWithString(sf, mysteriesDirectory, keyString);
    }

    public static int countLinesWithString(File startPath, String mysteriesDirectory, String keyString) {
        if (!startPath.exists()) {
            throw new IllegalArgumentException("File " + startPath + " does not exist!");
        } else if (startPath.isFile()) {
            // This is where we would begin reading the contents of the files.
            // Just to show where the file is located; the parsing is only there to
            // stop an error from flagging on this part. Going to ask the professor
            // if it's okay with him.
            return Integer.parseInt(startPath.getAbsolutePath());
        } else if (startPath.isDirectory()) {
            // This is where our recursion would take place: essentially
            // we will be going 'deeper' into the directory until we find a file
            //File[] subFiles = startPath.listFiles();
            return countLinesWithString(startPath, mysteriesDirectory, keyString);
        } else {
            throw new IllegalStateException("Unknown file type: " + startPath);
        }
    }
}
In short: Could someone explain how recursion would work if you wanted to go deeper into a director(y/ies)?
I'll give this a try. It's something that is easier to explain than to understand.
The recursive method, on which you have made a decent start, might be documented as follows:
"For a given directory: for each file in the directory, count all the lines which contain a given string; for each directory in the directory, recurse".
The recursion is possible - and useful - because your original target is a container, and one of the types of things it can contain is another container.
So think of the counting method like this:
int countLines(dir, string) // the string could be an instance variable, also, and not passed in
{
    var countedLines = 0;
    for each item in dir:
        if item is file, countedLines += matchedLinesInFile(item, string);
        else if item is dir, countedLines += countLines(item, string);
        else throw up; // or throw an exception -- your choice
    return countedLines;
}
Then call countLines from an exterior method with the original dir to use, plus the string.
One of the things that trips people up about recursion is that, after you get it written, it doesn't seem possible that it can do all that it does. But think through the above for different scenarios. If the dir passed in has files and no dirs, it will accumulate countedLines for each file in the dir, and return the result. That's what you want.
If the dir does contain other dirs, then for each one of those, you're going to call the routine and start on that contained dir. The call will accumulate countedLines for each file in that dir, and call itself for each dir recursively down the tree, until it reaches a dir that has no dirs in it. And it still counts lines in those, it just doesn't have any further down to recurse.
At the lowest level, it is going to accumulate those lines and return them. Then the second-lowest level will get that total to add to its total, and start the return trips back up the recursion tree.
Does that explain it any better?
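If it helps to see it concretely, here is roughly that pseudocode in Java. This is a sketch, not your assignment's solution; matchedLinesInFile is the hypothetical helper from the pseudocode, which you would write yourself:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

static int countLines(File dir, String target) throws IOException {
    int countedLines = 0;
    File[] items = dir.listFiles();
    if (items == null) {
        return 0; // unreadable directory; you might prefer to throw
    }
    for (File item : items) {
        if (item.isFile()) {
            countedLines += matchedLinesInFile(item, target);
        } else if (item.isDirectory()) {
            countedLines += countLines(item, target); // recurse into subdirectory
        }
    }
    return countedLines;
}

// Hypothetical helper: count the lines in one file that contain the target
static int matchedLinesInFile(File file, String target) throws IOException {
    int matches = 0;
    try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.contains(target)) matches++;
        }
    }
    return matches;
}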
Just to help you get started with recursion, check this:
It will recursively go from the base directory, printing all the folders and files.
Modify it to your requirements. Try it and let us know.
import java.io.File;

public class Test {

    public static void getResource(final String resourcePath) {
        File file = new File(resourcePath);
        if (file.isFile()) {
            System.out.println("File Name : " + file.getName());
            return;
        } else {
            File[] listFiles = file.listFiles();
            if (listFiles != null) {
                for (File resourceInDirectory : listFiles) {
                    if (!resourceInDirectory.isFile()) {
                        System.out.println("Folder "
                                + resourceInDirectory.getAbsolutePath());
                        getResource(resourceInDirectory.getAbsolutePath());
                    } else {
                        getResource(resourceInDirectory.getAbsolutePath());
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        final String folderPath = "C:/Test";
        getResource(folderPath);
    }
}
My program collects all paths to files on the computer (OS: Ubuntu) into one Map.
The key in the Map is a file size, and the value is a list of canonical paths to the files whose size equals the key.
Map<Long, ArrayList<String>> map = new HashMap<>(100000);
The total number of files on the computer is 281091.
The method that collects the files is recursive:
private void scanner(String path) throws Exception {
    File[] dirs = new File(path).listFiles(new FileFilter() {
        @Override
        public boolean accept(File file) {
            if (file.isFile() && file.canRead()) {
                long size = file.length();
                try {
                    String canonPath = file.getCanonicalPath();
                    if (map.containsKey(size))
                        map.get(size).add(canonPath);
                    else map.put(size, new ArrayList<>(Arrays.asList(canonPath)));
                } catch (IOException e) {
                    // accept() cannot throw the checked IOException, so wrap it
                    throw new RuntimeException(e);
                }
                return false;
            }
            return file.isDirectory() && file.canRead();
        }
    });
    if (dirs != null) {
        for (File dir : dirs) {
            scanner(dir.getCanonicalPath());
        }
    }
}
When I start scanning from the root folder "/", I get an exception:
Exception in thread "main" java.lang.StackOverflowError
at java.io.UnixFileSystem.canonicalize0(Native Method)
at java.io.UnixFileSystem.canonicalize(UnixFileSystem.java:172)
at java.io.File.getCanonicalPath(File.java:589)
at taskB.FileScanner.setCanonPath(FileScanner.java:49)
at taskB.FileScanner.access$000(FileScanner.java:12)
at taskB.FileScanner$1.accept(FileScanner.java:93)
at java.io.File.listFiles(File.java:1217)
at taskB.FileScanner.scanner(FileScanner.java:85)
at taskB.FileScanner.scanner(FileScanner.java:109)
at taskB.FileScanner.scanner(FileScanner.java:109)
...
But as a test I filled the directory "~/Documents" with more than 400 thousand files and began scanning from it. Everything worked fine.
Why do I get the exception when the program starts from the root directory "/", where there are fewer than 300 thousand files? What should I do to prevent it?
StackOverflowError means that you called so many nested functions that your program ran out of memory for the function-call information (which is retained until each call returns). In your case I suspect it is due to parsing the "." (current directory) and ".." (parent directory) entries when they are returned in the directory list, making you recurse into the same directory more than once.
The most likely explanation is that you have a symbolic link somewhere in the filesystem that creates a cycle (an infinite loop). For example, the following would be a cycle:
/home/userid/test/data -> /home/userid
While scanning files you need to ignore symbolic links to directories.
@Jim Garrison was right: it was due to symbolic links. I found the solution to the problem here.
I use the isSymbolicLink(Path) method:
return file.isDirectory() && file.canRead() && !Files.isSymbolicLink(file.toPath());
My question is: do these two functions differ in anything besides what they return? I know that they return different things, but is it possible that the number of elements in one would be different from the number in the other? I will try to explain. I implemented TreeModel for one of my classes, trying to make a nice view of the files on the PC based on a JTree. Here is the relevant part:
public Object getChild(Object parent, int index) {
    File[] children = ((File) parent).listFiles();
    if (children == null || index < 0 || index >= children.length) {
        return null;
    }
    File result = new MyFile(children[index]);
    return result;
}

public int getChildCount(Object parent) {
    //---
    //String[] children = ((File) parent).list();
    File[] children = ((File) parent).listFiles();
    //---
    if (children == null) {
        return 0;
    }
    return children.length;
}
I marked the interesting code. If I swap these two lines for the commented one, I sometimes get a NullPointerException after loading the TreeModel: jtree.setModel(treeModel);. The uncommented version does not cause any trouble. I checked the docs and they say nothing unusual; both methods can return null. What is going on here?
Both methods do essentially the same thing; look at http://www.docjar.com/html/api/java/io/File.java.html for details.
As already pointed out, but clarified only in the comments on the post from D.R.:
list() returns a String array with filenames (files and directories)
listFiles() returns an array of File objects for the same
See the doc pages, e.g. https://docs.oracle.com/javase/7/docs/api/java/io/File.html
String[] list()
Returns an array of strings naming the files and directories in the directory denoted by this abstract pathname.
File[] listFiles()
Returns an array of abstract pathnames denoting the files in the directory denoted by this abstract pathname.
I am not sure why both methods exist; presumably the String array will be faster and less memory-consuming than the File array.
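One practical difference is worth a small sketch: list() gives you bare names that you must resolve against the parent yourself, while listFiles() hands back already-resolved File objects:

File parent = new File("/tmp");

// list(): bare names. new File(name) on its own would resolve against
// the current working directory, not against parent.
String[] names = parent.list();
if (names != null) {
    for (String name : names) {
        File resolved = new File(parent, name); // parent must be passed explicitly
    }
}

// listFiles(): File objects already rooted at parent
File[] files = parent.listFiles();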
I need to create an app which does a non-recursive walk through the filesystem and prints out the files which are at a certain depth.
What I have:
public void putFileToQueue() throws IOException, InterruptedException {
    File root = new File(rootPath).getAbsoluteFile();
    checkFile(root, depth);
    Queue<DepthControl> queue = new ArrayDeque<DepthControl>();
    DepthControl e = new DepthControl(0, root);
    do {
        root = e.getFileName();
        if (root.isDirectory()) {
            File[] files = root.listFiles();
            if (files != null)
                for (File file : files) {
                    if (e.getDepth() + 1 <= depth && file.isDirectory()) {
                        queue.offer(new DepthControl(e.getDepth() + 1, file));
                    }
                    if (file.getName().contains(mask)) {
                        if (e.getDepth() == depth) {
                            System.out.println(Thread.currentThread().getName()
                                    + " putting in queue: "
                                    + file.getAbsolutePath());
                        }
                    }
                }
        }
        e = queue.poll();
    } while (e != null);
}
And the helper class:
public class DepthControl {

    private int depth;
    private File file;

    public DepthControl(int depth, File file) {
        this.depth = depth;
        this.file = file;
    }

    public File getFileName() {
        return file;
    }

    public int getDepth() {
        return depth;
    }
}
I received an answer saying that this program uses additional memory because of the breadth-first search (I hope the translation is right). It is O(k^n), where k is the average number of subdirectories and n is the depth. The program could easily be done in O(k*n). Please help me fix my algorithm.
I think this should do the job and is a bit simpler. It just keeps track of the files at the next level, expands them, then repeats the process. The loop itself keeps track of the depth, so there is no need for the extra class.
// start in the current working directory ("user.dir").
File root = new File(System.getProperty("user.dir"));
List<File> expand = new LinkedList<File>();
expand.add(root);
for (int depth = 0; depth < 10; depth++) {
    File[] expandCopy = expand.toArray(new File[expand.size()]);
    expand.clear();
    for (File file : expandCopy) {
        System.out.println(depth + " " + file);
        if (file.isDirectory()) {
            File[] children = file.listFiles();
            if (children != null) { // listFiles() can return null on I/O error
                expand.addAll(Arrays.asList(children));
            }
        }
    }
}
In Java 8, you can use streams, Files.walk, and a maxDepth of 1:
try (Stream<Path> walk = Files.walk(Paths.get(filePath), 1)) {
    List<String> result = walk.filter(Files::isRegularFile)
            .map(Path::toString).collect(Collectors.toList());
    result.forEach(System.out::println);
} catch (IOException e) {
    e.printStackTrace();
}
To avoid recursion when walking a tree there are basically two options:
Use a "work list" (similar to the above) to track work to be done. As each item is examined new work items that are "discovered" as a result are added to the work list (can be FIFO, LIFO, or random order -- doesn't matter conceptually though it will often affect "locality of reference" for performance).
Use a stack/"push down list" so essentially simulate the recursive scheme.
For #2 you have to write an algorithm that is something of a state machine, returning to the stack after every step to determine what to do next. The stack entries, for a tree walk, basically contain the current tree node and the index into the child list of the next child to examine.
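A minimal sketch of option #2, simplified to push child nodes directly rather than storing a child index per stack entry:

import java.io.File;
import java.util.ArrayDeque;
import java.util.Deque;

static void walk(File root) {
    Deque<File> stack = new ArrayDeque<>(); // explicit stack replaces the call stack
    stack.push(root);
    while (!stack.isEmpty()) {
        File current = stack.pop();
        System.out.println(current);
        if (current.isDirectory()) {
            File[] children = current.listFiles();
            if (children != null) {
                for (File child : children) {
                    stack.push(child); // LIFO: gives a depth-first order
                }
            }
        }
    }
}

Note that pushing children in listing order means they are visited in reverse; push them in reverse order if the original order matters.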
If you're using Java 7, there is a very elegant method of walking file trees. You'll need to confirm whether it meets your needs recursion-wise, though.
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import static java.nio.file.FileVisitResult.*;
public class myFinder extends SimpleFileVisitor<Path> {

    public FileVisitResult visitFile(Path file, BasicFileAttributes attr) {
        return CONTINUE;
    }

    public FileVisitResult postVisitDirectory(Path dir, IOException exc) {
        return CONTINUE;
    }

    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
        return CONTINUE;
    }

    public FileVisitResult visitFileFailed(Path file, IOException exc) {
        return CONTINUE;
    }

    <snip>
}
Essentially it does a depth first walk of the tree and calls certain methods when it enters/exits directories and when it "visits" a file.
I believe this requires Java 7, though.
http://docs.oracle.com/javase/tutorial/essential/io/walk.html
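Invoking the visitor is then a one-liner (assuming the skeleton above is filled in to return CONTINUE from each method, as shown):

Files.walkFileTree(Paths.get("C:/Test"), new myFinder()); // throws IOException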
Assuming you want to limit the amount of space used and:
you can assume the list of files/directories is static over the course of your traversal, AND
you can assume the list of files/directories in a given directory is always returned in the same order, AND
you have access to the parent of the current directory
Then you can traverse the directory using only the information about the last node visited. Specifically, something along the lines of:
1. Keep track of the last Entry (directory or file) visited
2. Keep track of the current directory
3. Get a list of files in the current directory
4. Find the index of the last Entry visited in the list of files
5. If lastVisited is the last Entry in the current directory,
5.1.1 If current directory == start directory, we're done
5.1.2 Otherwise, lastVisited = the current directory and current directory is the parent directory
5.2. Otherwise, visit the element after lastVisited and set lastVisited to that element
6. Repeat from step 3
If I can, I'll try to write up some code to show what I mean tomorrow... but I just don't have the time right now.
NOTE: This isn't a GOOD way to traverse the directory structure... it's just a possible way. Outside the normal box, and probably for good reason.
You'll have to forgive me for not giving sample code in Java; I don't have the time to work on that atm. Doing it in Tcl is faster for me and it shouldn't be too hard to understand. So, that being said:
proc getFiles {dir} {
    set result {}
    foreach entry [glob -tails -directory $dir * .*] {
        if { $entry != "." && $entry != ".." } {
            lappend result [file join $dir $entry]
        }
    }
    return [lsort $result]
}

proc listdir {startDir} {
    if {! ([file exists $startDir] && [file isdirectory $startDir])} {
        error "File '$startDir' either doesn't exist or isn't a directory"
    }
    set result {}
    set startDir [file normalize $startDir]
    set currDir $startDir
    set currFile {}
    set fileList [getFiles $currDir]
    for {set i 0} {$i < 1000} {incr i} { # use for to avoid infinite loop
        set index [expr {1 + ({} == $currFile ? -1 : [lsearch $fileList $currFile])}]
        if {$index < ([llength $fileList])} {
            set currFile [lindex $fileList $index]
            lappend result $currFile
            if { [file isdirectory $currFile] } {
                set currDir $currFile
                set fileList [getFiles $currDir]
                set currFile {}
            }
        } else {
            # at last entry in the dir, move up one dir
            if {$currDir == $startDir} {
                # at the starting directory, we're done
                return $result
            }
            set currFile $currDir
            set currDir [file dirname $currDir]
            set fileList [getFiles $currDir]
        }
    }
}
puts "Files:\n\t[join [listdir [lindex $argv 0]] \n\t]"
And, running it:
VirtualBox:~/Programming/temp$ ./dirlist.tcl /usr/share/gnome-media/icons/hicolor
Files:
/usr/share/gnome-media/icons/hicolor/16x16
/usr/share/gnome-media/icons/hicolor/16x16/status
/usr/share/gnome-media/icons/hicolor/16x16/status/audio-input-microphone-high.png
/usr/share/gnome-media/icons/hicolor/16x16/status/audio-input-microphone-low.png
/usr/share/gnome-media/icons/hicolor/16x16/status/audio-input-microphone-medium.png
/usr/share/gnome-media/icons/hicolor/16x16/status/audio-input-microphone-muted.png
/usr/share/gnome-media/icons/hicolor/22x22
[snip]
/usr/share/gnome-media/icons/hicolor/48x48/devices/audio-subwoofer-testing.svg
/usr/share/gnome-media/icons/hicolor/48x48/devices/audio-subwoofer.svg
/usr/share/gnome-media/icons/hicolor/scalable
/usr/share/gnome-media/icons/hicolor/scalable/status
/usr/share/gnome-media/icons/hicolor/scalable/status/audio-input-microphone-high.svg
/usr/share/gnome-media/icons/hicolor/scalable/status/audio-input-microphone-low.svg
/usr/share/gnome-media/icons/hicolor/scalable/status/audio-input-microphone-medium.svg
/usr/share/gnome-media/icons/hicolor/scalable/status/audio-input-microphone-muted.svg
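For readers who want it in Java: a rough translation of the Tcl above, under the same assumptions as the steps (static listing, stable order, access to the parent). A sketch, not production code:

import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

static List<File> listdir(File startDir) {
    startDir = startDir.getAbsoluteFile(); // normalize, like [file normalize]
    if (!startDir.isDirectory()) {
        throw new IllegalArgumentException(startDir + " doesn't exist or isn't a directory");
    }
    List<File> result = new ArrayList<>();
    File currDir = startDir;
    File currFile = null; // last entry visited
    File[] fileList = sortedChildren(currDir);
    while (true) {
        // index of the entry right after the last one visited
        int index = (currFile == null)
                ? 0
                : Arrays.asList(fileList).indexOf(currFile) + 1;
        if (index < fileList.length) {
            currFile = fileList[index];
            result.add(currFile);
            if (currFile.isDirectory()) { // descend into it
                currDir = currFile;
                fileList = sortedChildren(currDir);
                currFile = null;
            }
        } else {
            // past the last entry in this dir: move up one level
            if (currDir.equals(startDir)) {
                return result; // back at the start, we're done
            }
            currFile = currDir;
            currDir = currDir.getParentFile();
            fileList = sortedChildren(currDir);
        }
    }
}

static File[] sortedChildren(File dir) {
    File[] children = dir.listFiles();
    if (children == null) {
        children = new File[0]; // unreadable directory
    }
    Arrays.sort(children); // the stable order the algorithm relies on
    return children;
}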
And - of course - there's always the multi-threaded option to avoid recursion.
1. Create a queue of files.
2. If it's a file, add it to the queue.
3. If it's a folder, start a new thread to list the files in it, feeding this same queue.
4. Get the next item.
5. Repeat from 2 as necessary.
Obviously this may not list the files in a predictable order.
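Here is a rough sketch of that multi-threaded idea, using a Phaser for termination. Hedged: the ordering is unpredictable, and a Phaser tops out at roughly 65k unarrived parties, so extremely wide trees would need a tiered phaser:

import java.io.File;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;

public class ThreadedLister {
    private final Queue<File> files = new ConcurrentLinkedQueue<>();
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final Phaser phaser = new Phaser(1); // party 0 is the caller

    public Queue<File> list(File root) {
        scanLater(root);
        phaser.arriveAndAwaitAdvance(); // block until every folder task is done
        pool.shutdown();
        return files;
    }

    private void scanLater(File dir) {
        phaser.register(); // one party per folder task
        pool.submit(() -> {
            try {
                File[] children = dir.listFiles();
                if (children != null) {
                    for (File child : children) {
                        if (child.isDirectory()) {
                            scanLater(child); // folders spawn new tasks
                        } else {
                            files.add(child); // files feed the shared queue
                        }
                    }
                }
            } finally {
                phaser.arriveAndDeregister();
            }
        });
    }
}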
I need to determine if a user-supplied string is a valid file path (i.e., if createNewFile() will succeed or throw an Exception) but I don't want to bloat the file system with useless files, created just for validation purposes.
Is there a way to determine if the string I have is a valid file path without attempting to create the file?
I know the definition of "valid file path" varies depending on the OS, but I was wondering if there was any quick way of accepting C:/foo or /foo and rejecting banana.
A possible approach may be attempting to create the file and eventually deleting it if the creation succeeded, but I hope there is a more elegant way of achieving the same result.
The Path class, introduced in Java 7, adds new alternatives, like the following:
/**
 * <pre>
 * Checks if a string is a valid path.
 * Null safe.
 *
 * Calling examples:
 *    isValidPath("c:/test");      //returns true
 *    isValidPath("c:/te:t");      //returns false
 *    isValidPath("c:/te?t");      //returns false
 *    isValidPath("c/te*t");       //returns false
 *    isValidPath("good.txt");     //returns true
 *    isValidPath("not|good.txt"); //returns false
 *    isValidPath("not:good.txt"); //returns false
 * </pre>
 */
public static boolean isValidPath(String path) {
    try {
        Paths.get(path);
    } catch (InvalidPathException | NullPointerException ex) {
        return false;
    }
    return true;
}
Edit: Note Ferrybig's comment: "The only disallowed character in a file name on Linux is the NUL character, this does work under Linux."
This would check for the existence of the directory as well.

File file = new File("c:\\cygwin\\cygwin.bat");
if (!file.isDirectory())
    file = file.getParentFile();
if (file.exists()) {
    ...
}
It seems like file.canWrite() does not give you a clear indication of whether you have permission to write to the directory.
File.getCanonicalPath() is quite useful for this purpose. IO exceptions are thrown for certain types of invalid filenames (e.g. CON, PRN, *?* in Windows) when resolving against the OS or file system. However, this only serves as a preliminary check; you will still need to handle other failures when actually creating the file (e.g. insufficient permissions, lack of drive space, security restrictions).
A number of things can go wrong when you try and create a file:
You lack the requisite permissions;
There is not enough space on the device;
The device experiences an error;
Some custom security policy prohibits you from creating a file of a particular type;
etc.
More to the point, those conditions can change between the time you query whether you can create the file and the time you actually try. In a multithreaded environment this is one of the primary causes of race conditions and can be a real vulnerability in some programs.
Basically you just have to try to create it and see if it works. And that's the correct way to do it. It's why things like ConcurrentHashMap have putIfAbsent(), so the check and insert are an atomic operation and don't suffer from race conditions. Exactly the same principle is in play here.
If this is just part of some diagnostic or install process, just do it and see if it works. Again, there's no guarantee that it'll work later, however.
Basically your program has to be robust enough to die gracefully if it can't write a relevant file.
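For example, a minimal "just try it" sketch using NIO, where Files.createFile is atomic, in the same spirit as putIfAbsent() (the target path here is hypothetical):

import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

Path target = Paths.get("output.log"); // hypothetical path
try {
    Files.createFile(target); // atomic: creates the file or fails
    // success: we own the file now
} catch (FileAlreadyExistsException e) {
    // someone else created it between our "check" and our "act"
} catch (IOException e) {
    // no permission, invalid name, full disk, ...
}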
boolean canWrite(File file) {
    if (file.exists()) {
        return file.canWrite();
    } else {
        try {
            file.createNewFile();
            file.delete();
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
Here's something you can do that works across operating systems: use a regex match to check for known invalid characters.
if (newName.matches(".*[/\n\r\t\0\f`?*\\<>|\":].*")) {
    System.out.println("Invalid!");
} else {
    System.out.println("Valid!");
}
Pros
This works across operating systems.
You can customize it whatever way you want by editing that regex.
Cons
This might not be a complete list and needs more research to fill in more invalid patterns or characters.
Just do it (and clean up after yourself)
A possible approach may be attempting to create the file and eventually deleting it if the creation succeeded, but I hope there is a more elegant way of achieving the same result.
Maybe that's the most robust way.
Below is canCreateOrIsWritable that determines whether your program is able to create a file and its parent directories at a given path, or, if there's already a file there, write to it.
It does so by actually creating the necessary parent directories as well as an empty file at the path. Afterwards, it deletes them (if there existed a file at the path, it's left alone).
Here's how you might use it:
var myFile = new File("/home/me/maybe/write/here.log");
if (canCreateOrIsWritable(myFile)) {
    // We're good. Create the file or append to it
    createParents(myFile);
    appendOrCreate(myFile, "new content");
} else {
    // Let's pick another destination. Maybe the OS's temporary directory:
    var tempDir = System.getProperty("java.io.tmpdir");
    var alternative = Paths.get(tempDir, "second_choice.log");
    appendOrCreate(alternative, "new content in temporary directory");
}
The essential method with a few helper methods:
static boolean canCreateOrIsWritable(File file) {
    boolean canCreateOrIsWritable;

    // The non-existent ancestor directories of the file.
    // The file's parent directory is first.
    List<File> parentDirsToCreate = getParentDirsToCreate(file);

    // Create the parent directories that don't exist, starting with the one
    // highest up in the file system hierarchy (closest to root, farthest
    // away from the file)
    reverse(parentDirsToCreate).forEach(File::mkdir);

    try {
        boolean wasCreated = file.createNewFile();
        if (wasCreated) {
            canCreateOrIsWritable = true;
            // Remove the file and its parent dirs that didn't exist before
            file.delete();
            parentDirsToCreate.forEach(File::delete);
        } else {
            // There was already a file at the path → Let's see if we can
            // write to it
            canCreateOrIsWritable = java.nio.file.Files.isWritable(file.toPath());
        }
    } catch (IOException e) {
        // File creation failed
        canCreateOrIsWritable = false;
    }
    return canCreateOrIsWritable;
}

static List<File> getParentDirsToCreate(File file) {
    var parentsToCreate = new ArrayList<File>();
    File parent = file.getParentFile();
    while (parent != null && !parent.exists()) {
        parentsToCreate.add(parent);
        parent = parent.getParentFile();
    }
    return parentsToCreate;
}

static <T> List<T> reverse(List<T> input) {
    var reversed = new ArrayList<T>();
    for (int i = input.size() - 1; i >= 0; i--) {
        reversed.add(input.get(i));
    }
    return reversed;
}

static void createParents(File file) {
    File parent = file.getParentFile();
    if (parent != null) {
        parent.mkdirs();
    }
}
Keep in mind that between calling canCreateOrIsWritable and creating the actual file, the contents and permissions of your file system might have changed.