I have a folder into which new files are continuously being dumped. In Java, what is the best way to detect changes in the file system (i.e. in the specified folder where the files arrive) and add the newly arrived files to a queue data structure so that I can process each incoming file sequentially?
I'm aware of the listFiles() method in the File class, but with it I can only get the files that are present at a given instant. Of course, I can continuously poll the folder from a thread and list its files each time. But is this the best way, or is there a better way to accomplish this?
Continuous polling is the way to do it in Java as of now, though don't poll too often: it can be quite a heavy operation if the directory contains lots of entries.
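A minimal sketch of the polling approach, assuming it is enough to identify a file by its name and that a file can be processed as soon as it appears in the listing. The class name DirectoryPoller and the 5 second interval are made up for illustration.

```java
import java.io.File;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Polls one directory and queues files that were not present on earlier polls. */
final class DirectoryPoller {
    private final File dir;
    private final Set<String> seen = new HashSet<>();
    private final Queue<File> queue = new ConcurrentLinkedQueue<>();

    DirectoryPoller(File dir) {
        this.dir = dir;
    }

    /** Lists the directory once, enqueues unseen regular files, returns how many were added. */
    int pollOnce() {
        File[] entries = dir.listFiles(); // null if dir is missing or unreadable
        if (entries == null) {
            return 0;
        }
        int added = 0;
        for (File f : entries) {
            if (f.isFile() && seen.add(f.getName())) {
                queue.add(f);
                added++;
            }
        }
        return added;
    }

    Queue<File> queue() {
        return queue;
    }

    public static void main(String[] args) throws InterruptedException {
        DirectoryPoller poller = new DirectoryPoller(new File(args.length > 0 ? args[0] : "."));
        while (true) {
            poller.pollOnce();
            File next;
            while ((next = poller.queue().poll()) != null) {
                System.out.println("processing " + next);
            }
            Thread.sleep(5_000); // assumed interval; keep it coarse for large directories
        }
    }
}
```

Note that a file may be queued while its writer is still busy with it; the sketch does nothing about that.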
JDK 7 will have a specific API for doing just this: java.nio.file.WatchService.
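The JDK 7 API in question is java.nio.file.WatchService. A rough sketch of how it could feed a queue, assuming only file creations matter; the class and method names (NewFileWatcher, watch, drain) are invented for this example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class NewFileWatcher {
    /** Registers dir for ENTRY_CREATE events on a fresh WatchService. */
    static WatchService watch(Path dir) throws IOException {
        WatchService ws = dir.getFileSystem().newWatchService();
        dir.register(ws, StandardWatchEventKinds.ENTRY_CREATE);
        return ws;
    }

    /** Waits up to timeoutMs for events and queues the created paths; returns how many were queued. */
    static int drain(WatchService ws, Path dir, BlockingQueue<Path> queue, long timeoutMs)
            throws InterruptedException {
        WatchKey key = ws.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (key == null) {
            return 0; // nothing happened within the timeout
        }
        int added = 0;
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                continue; // events were lost; a full rescan would be needed here
            }
            Path name = (Path) event.context(); // relative to the watched directory
            queue.add(dir.resolve(name));
            added++;
        }
        key.reset(); // re-arm the key so further events are delivered
        return added;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        BlockingQueue<Path> queue = new LinkedBlockingQueue<>();
        try (WatchService ws = watch(dir)) {
            while (true) {
                drain(ws, dir, queue, 1_000);
                Path next;
                while ((next = queue.poll()) != null) {
                    System.out.println("processing " + next);
                }
            }
        }
    }
}
```

Files that already exist when the directory is registered produce no event, so an initial listFiles() pass would still be needed for those.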
There is unfortunately no standard way to do this until JDK7 comes out.
But there are some libraries available on the internet which use the native functions of different operating systems to do this.
The libraries I have looked at are jPoller and jNotify.
In the end, though, I simply polled the directory I was interested in when I had to do that.
I'm using WatchService to synchronize data files with the application workbench. When I rename or move the watched directory, I don't get any event, and the WatchKey doesn't become invalid. I still get events from the renamed directory, but as far as I know there is no way to find out the actual Path for the WatchKey other than WatchKey.watchable(), which still returns the original directory path. I would like to avoid locking the watched directory against changes, since I want to keep the application as lightweight as possible.
I have experienced this problem with JDK 7u10 on Windows 7.
Do you know of any workaround for this issue that doesn't involve locking the directory or watching all directories up to the root?
UPDATE
On Linux I have observed the same behavior.
So far it seems I have three options.
1) Rely on the user's discipline not to move the data directories. I don't really like this option, since it might lead to undefined behavior.
2) Use a more extensive non-standard native library.
3) Create a hierarchy of watchdogs on the parent directories. These would accept only ENTRY_DELETE events, since this event (or OVERFLOW) must appear at the moment the actual watched directory is moved or deleted and thus becomes invalid.
My understanding is that renaming a directory will generate file system events on the old and new parent directories, not on the directory that is renamed. According to the answer to Can iNotify tell me where a monitored file is moved?, the OS cannot tell you where something was moved to unless you are monitoring the destination directory. (And besides, in Java 7/8 MOVE events aren't handled by the watch service implementation.)
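Watching the parent directory could look roughly like the sketch below. It assumes the watched directory has a parent (it is not a file system root) and only reports that the directory disappeared, not where it went. ParentWatch, watchParent and goneWithin are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.TimeUnit;

final class ParentWatch {
    /** Registers the parent of watchedDir for ENTRY_DELETE events (assumes a parent exists). */
    static WatchService watchParent(Path watchedDir) throws IOException {
        Path parent = watchedDir.toAbsolutePath().getParent();
        WatchService ws = parent.getFileSystem().newWatchService();
        parent.register(ws, StandardWatchEventKinds.ENTRY_DELETE);
        return ws;
    }

    /** Returns true if a delete (or overflow) event for watchedDir arrives within timeoutMs. */
    static boolean goneWithin(WatchService ws, Path watchedDir, long timeoutMs)
            throws InterruptedException {
        Path name = watchedDir.getFileName();
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            WatchKey key = ws.poll(remaining, TimeUnit.MILLISECONDS);
            if (key == null) {
                return false; // timed out without any event
            }
            boolean gone = false;
            for (WatchEvent<?> event : key.pollEvents()) {
                // The event context is the entry name relative to the parent directory.
                if (event.kind() == StandardWatchEventKinds.OVERFLOW || name.equals(event.context())) {
                    gone = true;
                }
            }
            key.reset();
            if (gone) {
                return true;
            }
        }
    }
}
```

Once this reports true, the WatchKey on the original directory can be treated as stale and cancelled by the application.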
UPDATE
You could try the jpathwatch project that adds support for (platform specific) extended events using the standard Java7 WatchService APIs.
References:
documentation - http://jpathwatch.wordpress.com/
javadoc - http://jpathwatch.sourceforge.net/
Using JDK 7, I've had success watching specific directories for file creations, deletions and modifications using java.nio.file.StandardWatchEventKinds.*
I'm hoping someone may know a way to get Java to detect new file creations regardless of their path.
I want to do this so I can calculate an MD5 sum for each newly written file.
Thanks for any advice you can offer.
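For reference, the per-file MD5 step could be sketched as follows, assuming the file is completely written by the time it is read. Md5 and md5Hex are names chosen for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5 {
    /** Streams the file through an MD5 digest and returns it as lowercase hex. */
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```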
OK, the short answer is that I don't think Java can do that out of the box. You'd either have to intercept calls to the operating system, which would require something closer to the bare metal, or you could do as suggested in another answer and register listeners on every folder from the root down, not to mention the other drives in the case of Windows machines.
The first approach would need custom JNI code, and it assumes the OS has such a hook and allows user code to access it.
The second approach would work but could consume a large amount of memory to track all the listeners. In Windows right-click on c:\ and select and see just how many folders we're talking about.
One possibility - not a convenient one, but a possibility - is to walk the directory tree for the directories you want to watch, registering each in a WatchService. That's not a very nice way to go about it, and it could be a problem depending on how large the actual directory tree is.
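A sketch of that tree walk, assuming creation events are the ones of interest. TreeWatcher and registerAll are illustrative names; the returned map is there so an event can later be traced back to the directory it came from.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashMap;
import java.util.Map;

final class TreeWatcher {
    /** Registers root and every directory below it; returns a key-to-directory map. */
    static Map<WatchKey, Path> registerAll(Path root, final WatchService ws) throws IOException {
        final Map<WatchKey, Path> keys = new HashMap<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                    throws IOException {
                // One registration per directory: this is where the memory cost comes from.
                keys.put(dir.register(ws, StandardWatchEventKinds.ENTRY_CREATE), dir);
                return FileVisitResult.CONTINUE;
            }
        });
        return keys;
    }
}
```

Directories created after the walk are not covered automatically; the event loop would have to call registerAll again for each new directory it sees.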
I do not know StandardWatchEventKinds (although it sounds convenient).
One way to do what you want is to use a native Windows API such as ReadDirectoryChangesW (or volume changes). It's painful, but it works (been there, done that, wish I had had another option at the time).
I have looked at the source code of the Apache Commons FileUtils.java class to see how they implement Unix-like touch functionality. I wanted to confirm with the community here whether that implementation would meet my use case, since it opens and closes a FileOutputStream to provide the touch functionality.
We have two web servers and one common server between them, on which a file resides.
For our application, we need to use this file's last-modified time to make some decisions. We don't actually want to modify the file, only change its last-modified date when some particular activity happens on one of the web servers.
It's important that the last-modified time set on the file is taken from the central server, to avoid worrying about time differences between the two web servers. Therefore calling File.setLastModified is not a good option, as the web server would send its own time.
But I am wondering whether, even if I use the Apache Commons FileUtils touch method, closing the stream on one web server would set the file's last-modified time using the clock of that web server or of the central server.
Sorry for so much detail, but I could not see any other way to explain the issue.
If you "touch" a file in the filesystem of one webserver, then the timestamp of the file will be set using the clock of that server. I don't think you can solve your problem that way.
I think you've got three options:
configure the servers to synchronize their clocks to the common timebase; e.g. using NTP,
put all files whose timestamps must be accurate to the common timebase on one server, or
change your system design so that it is immune to problems with different servers' clocks being out of sync.
It would be much better to make use of a shared database if you have one so that you can avoid issues of concurrency and synchronisation. I can't recommend any simple and safe distributed file flag system.
Supposing I have a File f that represents a directory, then f.delete() will only delete the directory if it is empty. I've found a couple of examples online that use File.listFiles() or File.list() to get all the files in the directory and then recursively traverse the directory structure and delete all the files. However, since it's possible to create infinitely recursive directory structures (in both Windows and Linux, with symbolic links), presumably programs written in this style might never terminate.
So, is there a better way to write such a program so that it doesn't fall into these pitfalls? Do I need to keep track of everywhere I've traversed and make sure I don't go around in circles or is there a nicer way?
Update: In response to some of the answers (thanks guys!) - I'd rather the code didn't follow symbolic links and stayed within the directory it was supposed to delete. Can I rely on the Commons-IO implementation to do that, even in the Windows case?
If you really want your recursive directory deletion to follow through symbolic links, then I don't think there is any platform independent way of doing so without keeping track of all the directories you have traversed.
However, in pretty much every case I can think of you would just want to delete the actual symbolic link pointing to the directory rather than recursively following through the symbolic link.
If this is the behaviour you want then you can use the FileUtils.deleteDirectory method in Apache Commons IO.
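On JDK 7 or later, the same "delete the link, not its target" behaviour can also be sketched without an external library, relying on the fact that Files.walkFileTree does not follow symbolic links unless asked to. This is an illustration, not the Commons IO implementation; TreeDeleter and deleteRecursively are made-up names.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

final class TreeDeleter {
    /** Deletes root and everything below it; symbolic links are removed, never followed. */
    static void deleteRecursively(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                // A symlink to a directory arrives here as a plain entry, so only the link goes.
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc; // listing the directory failed
                }
                Files.delete(dir); // empty by now
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

Because links are never descended into, a link that points back at an ancestor cannot send the walk around in circles.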
Try Apache Commons IO for a tested implementation.
However, I don't think this handles the infinite-recursion problem.
File.getCanonicalPath() will tell you the “real” name of the file, with symlinks resolved. If, while scanning, you come across a directory you already know (because you stored the ones you visited in a Map), bail out.
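A small sketch of that bookkeeping, using a Set of canonical paths rather than a Map since only membership matters here. The walk only collects files instead of deleting them, to keep the example harmless; LoopSafeWalker and walk are hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.Set;

final class LoopSafeWalker {
    /**
     * Walks dir depth-first, following links, and adds every non-directory entry to out.
     * A directory whose canonical path is already in visited is skipped.
     */
    static void walk(File dir, Set<String> visited, List<File> out) throws IOException {
        if (!visited.add(dir.getCanonicalPath())) {
            return; // already seen: a link led back to a known directory
        }
        File[] entries = dir.listFiles();
        if (entries == null) {
            return; // not a directory, or unreadable
        }
        for (File f : entries) {
            if (f.isDirectory()) {
                walk(f, visited, out);
            } else {
                out.add(f);
            }
        }
    }
}
```

A typical call would be walk(root, new HashSet<String>(), new ArrayList<File>()).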
If you could know which files are symlinks, you could just skip over those.
There is unfortunately no "clean" way of detecting symlinks in Java. Check out this pure Java workaround or this one involving native code.
At least under Mac OS X, deleting a symbolic link to a directory does not delete the directory itself, so the link can be deleted even if the target directory is not empty.
I assume this holds for most POSIX operating systems. And as far as I know, links under Windows are also just files, and can be deleted as such from a Java program.