I have a folder into which new files are continuously being dumped. In Java, what is the best way to detect changes in the file system (i.e. in a specified folder into which the files are being dumped) and add the newly arrived files to a queue data structure so that I can process each incoming file sequentially?
I'm aware of the listFiles() function in the File class, but with it I can only get the files that are present at a given instant. Of course I can continuously poll the folder from a thread and get the list of files in it. But is this the best way, or is there a better way to accomplish this?
Continuously polling is the way to do it in Java as of now - though don't poll too often, it can be quite a heavy operation if the directory contains lots of entries.
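A minimal sketch of the polling approach, assuming file names are never reused and that a file is ready to process as soon as it shows up in the listing. The class name DirectoryPoller and its methods are made up for illustration.

```java
import java.io.File;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper: remembers which names it has already seen and
// enqueues only the files that appeared since the previous scan.
class DirectoryPoller {
    private final File dir;
    private final Set<String> seen = new HashSet<String>();
    private final BlockingQueue<File> queue = new LinkedBlockingQueue<File>();

    DirectoryPoller(File dir) {
        this.dir = dir;
    }

    /** Scans the directory once; returns how many new files were enqueued. */
    int pollOnce() {
        File[] files = dir.listFiles();   // null if dir is missing or unreadable
        if (files == null) {
            return 0;
        }
        int added = 0;
        for (File f : files) {
            if (f.isFile() && seen.add(f.getName())) {
                queue.add(f);
                added++;
            }
        }
        return added;
    }

    /** The consumer thread takes files from this queue one at a time. */
    BlockingQueue<File> queue() {
        return queue;
    }
}
```

You would call pollOnce() from a timer or a scheduled executor every few seconds and have a single consumer thread call queue().take(). Note that a file can be listed while it is still being written, so the producer may need to write under a temporary name and rename when done.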
JDK 7 will have a specific API for doing just this: java.nio.file.WatchService.
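For anyone already on JDK 7 or later, here is a rough sketch of how the watch API in java.nio.file (WatchService) can feed a queue. The class name DirectoryWatcher and the helper drainNewFiles are my own, not part of the JDK, and how quickly events arrive depends on the platform.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class DirectoryWatcher {
    /** Waits up to timeoutMillis for creation events and adds the new paths to queue. */
    static int drainNewFiles(WatchService watcher, Path dir,
                             BlockingQueue<Path> queue, long timeoutMillis)
            throws InterruptedException {
        WatchKey key = watcher.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (key == null) {
            return 0;   // nothing happened within the timeout
        }
        int added = 0;
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
                // context() is the file name relative to the watched directory
                queue.add(dir.resolve((Path) event.context()));
                added++;
            }
            // An OVERFLOW event means some events were lost; a real program
            // would rescan the directory with listFiles() at this point.
        }
        key.reset();    // re-arm the key so further events are delivered
        return added;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Paths.get(args[0]);
        BlockingQueue<Path> queue = new LinkedBlockingQueue<Path>();
        try (WatchService watcher = dir.getFileSystem().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            while (true) {
                drainNewFiles(watcher, dir, queue, 1000);
                Path p;
                while ((p = queue.poll()) != null) {
                    System.out.println("new file: " + p);
                }
            }
        }
    }
}
```

As with polling, ENTRY_CREATE fires when the file appears, not when the writer has finished, so the same "write under a temporary name, then rename" convention is worth considering.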
There is unfortunately no standard way to do this until JDK7 comes out.
But there are some libraries available on the internet which use the native functions of different operating systems to do this.
The libraries I have looked at are jPoller and jNotify.
But in the end, when I had to do this myself, I simply polled the directory I was interested in.
I'm using a third-party commercial library which seems to be leaking file handles (I verified this on Linux using lsof). Eventually the server (Tomcat) starts getting the infamous "Too many open files" error, and I have to restart the JVM.
I've already contacted the vendor. In the meantime, however, I would like to find a workaround for this. I do not have access to their source code. Is there any way, in Java, to clean up file handles without having access to the original File object (or FileWriter, FileOutputStream, etc.)?
A fun way would be to write a dynamic library and use LD_PRELOAD to load it into the Java instance you are launching. This shared library could override the appropriate underlying open(2) system call (or use some other logic) to close existing file descriptors of the process before passing the call on to the libc implementation (or the kernel). You need to do some serious accounting and possibly deal with threads, but it can be done, especially if you take hints from /proc/pid/fd/ to figure out whether or not a close is appropriate for the target fd.
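The /proc hint can also be used from inside the JVM, at least to watch the leak grow. This sketch only lists descriptors, it does not close anything, and it assumes Linux with /proc mounted and JDK 7 or later. The class name OpenDescriptors is invented for the example.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

class OpenDescriptors {
    /** Maps each descriptor number to its target; empty where /proc/self/fd is absent. */
    static Map<String, String> list() throws IOException {
        Map<String, String> result = new TreeMap<String, String>();
        Path fdDir = Paths.get("/proc/self/fd");
        if (!Files.isDirectory(fdDir)) {
            return result;   // not Linux, or /proc is not mounted
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(fdDir)) {
            for (Path entry : entries) {
                String target;
                try {
                    // Each entry is a symlink to the file, socket or pipe behind the fd
                    target = Files.readSymbolicLink(entry).toString();
                } catch (IOException e) {
                    target = "?";   // descriptor was closed while we were looking
                }
                result.put(entry.getFileName().toString(), target);
            }
        }
        return result;
    }
}
```

Logging the result periodically shows which paths the library keeps open, which is useful evidence for the vendor even if it is not a fix.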
You could, on startup, open a bunch of files and use File*putStream.getFD() to obtain a bunch of java.io.FileDescriptors, then close them, but hold onto the descriptors. Later you might be able to create streams using those stored FileDescriptors and close them.
I have not tested this, so would not be surprised if it did not work on some platforms.
I have looked at the source code of the Apache Commons FileUtils.java class to see how it implements Unix-like touch functionality. But I wanted to confirm with the community here whether my use case would be met by that implementation, as it opens and closes a FileOutputStream to provide the touch functionality.
We have two webservers and one common server between them on which a file resides.
For our application we need to use the last-modified time of this file to make some decisions. We don't actually want to modify the file, only to change its last-modified date when some particular activity happens on one of the webservers.
It's important that the last-modified time set on the file is taken from the central server, to avoid worrying about time differences between the two webservers. Therefore calling File.setLastModified is not a good option, as the webserver would send its own time.
But I am wondering: even if I use the Apache Commons FileUtils touch method to do this, would closing the stream on one webserver set the last-modified time of the file using the time of the webserver or of the central server?
Sorry for so much detail, but I could not see any other way to explain the issue.
If you "touch" a file in the filesystem of one webserver, then the timestamp of the file will be set using the clock of that server. I don't think you can solve your problem that way.
I think you've got three options:
configure the servers to synchronize their clocks to the common timebase; e.g. using NTP,
put all files whose timestamps must be accurate to the common timebase on one server, or
change your system design so that it is immune to problems with different servers' clocks being out of sync.
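To illustrate why a touch issued from a webserver carries that webserver's clock, here is roughly what a touch written in plain Java amounts to. This is a sketch of the general idea, not a copy of the Commons IO source; the class name Touch is made up.

```java
import java.io.File;
import java.io.IOException;

class Touch {
    /** Creates the file if needed, then stamps it using the clock of the machine running this JVM. */
    static void touch(File file) throws IOException {
        if (!file.exists()) {
            file.createNewFile();
        }
        long now = System.currentTimeMillis();   // local clock, not the file server's
        if (!file.setLastModified(now)) {
            throw new IOException("Could not set last-modified time of " + file);
        }
    }
}
```

The timestamp is computed on the caller's side and handed to the file system, so two webservers with skewed clocks will write skewed timestamps no matter where the file lives.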
It would be much better to make use of a shared database if you have one so that you can avoid issues of concurrency and synchronisation. I can't recommend any simple and safe distributed file flag system.
I have a set of files. The files are read-only on an NTFS share, so they can have many readers. Each file is updated occasionally by one writer that has write access.
How do I ensure that:
If the write fails, that the previous file is still readable
Readers cannot hold up the single writer
I am using Java, and my current solution is for the writer to write to a temporary file and then swap it in for the existing file using File.renameTo(). The problem is that on NTFS, renameTo fails if the target file already exists, so you have to delete it yourself. But if the writer deletes the target file and then fails (computer crash), I don't have a readable file.
NIO's FileLock only works within the same JVM, so it is useless to me.
How do I safely update a file with many readers using Java?
According to the JavaDoc:
"This file-locking API is intended to map directly to the native locking facility of the underlying operating system. Thus the locks held on a file should be visible to all programs that have access to the file, regardless of the language in which those programs are written."
I don't know if this is applicable, but if you are running in a pure Vista/Windows Server 2008 solution, I would use TxF (transactional NTFS) and then make sure you open the file handle and perform the file operations by calling the appropriate file APIs through JNI.
If that is not an option, then I think you need to have some sort of service that all clients access which is responsible to coordinate the reading/writing of the file.
On a Unix system, I'd remove the file and then open it for writing. Anybody who had it open for reading would still see the old one, and once they'd all closed it, it would vanish from the file system. I don't know if NTFS has similar semantics, although I've heard that it's loosely based on BSD's file system, so maybe it does.
Something that should always work, no matter what OS etc, is changing your client software.
If this is an option, then you could have a file "settings1.ini" and if you want to change it, you create a file "settings2.ini.wait", then write your stuff to it and then rename it to "settings2.ini" and then delete "settings1.ini".
Your changed client software would simply always check for settings2.ini if it has read settings1.ini last, and vice versa.
This way you have always a working copy.
There might be no need for locking. I am not too familiar with the FS API on Windows, but as NTFS supports both hard links and soft links, AFAIK, you can try this if your setup allows it:
Use a hard or soft link to point to the actual file, and name the file differently. Let everyone access the file using the link's name.
Write the new file under a different name, in the same folder.
Once it is finished, have the link point to the new file. Ideally, Windows would allow you to create the new link, replacing the existing link, in one atomic operation. Then you'd effectively have the link always identify a valid file, either the old or the new one. At worst, you'd have to delete the old link first, then create the link to the new file. In that case, there'd be a short time span in which a program would not be able to locate the file. (Also, Mac OS X offers an "ExchangeObjects" function that allows you to swap two items atomically; maybe Windows offers something similar.)
This way, any program that already has the old file open will continue to access the old one, and you won't get in its way by creating the new one. Only once an app notices that a new version exists would it close the current file and open it again, thereby getting access to the new version.
I don't know, however, how to create links in Java. Maybe you have to use some native API for that.
I hope this helps anyways.
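On JDK 7 or later, the "write elsewhere, then swap in one step" idea can be sketched with Files.move and ATOMIC_MOVE instead of links. Treat this as an assumption-laden sketch: the JDK documentation says it is implementation specific whether an atomic move replaces an existing target or throws, and I have not verified how it behaves on an NTFS share while readers hold the file open. SafeFileUpdate and replace are names I made up.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class SafeFileUpdate {
    /** Writes content to a temp file beside target, then swaps it into place. */
    static void replace(Path target, byte[] content) throws IOException {
        // Same directory as the target, so the move stays on one volume
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            Files.write(tmp, content);
            try {
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            } catch (AtomicMoveNotSupportedException e) {
                // Fallback is NOT atomic: readers may briefly see no file
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } finally {
            Files.deleteIfExists(tmp);   // no-op after a successful move
        }
    }
}
```

If the write or the move fails, the old file is left untouched and only the temp file is cleaned up, which is the property the question asks for.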
I have been dealing with something similar recently. If you are running Java 5, perhaps you could consider using NIO file locks in conjunction with a ReentrantReadWriteLock? Make sure all code referencing the FileChannel object ALSO references the ReentrantReadWriteLock. This way the NIO locks it at a per-VM level while the reentrant lock locks it at a per-thread level.
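reentrantReadWriteLock.writeLock().lock();   // in-JVM lock first; readers use readLock()
try {
    FileLock fileLock = filechannel.lock(position, size, shared);   // then the OS-level lock
    try {
        // do stuff
    } finally {
        fileLock.release();
    }
} finally {
    reentrantReadWriteLock.writeLock().unlock();
}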
Of course, some exception handling would be required.
Supposing I have a File f that represents a directory, then f.delete() will only delete the directory if it is empty. I've found a couple of examples online that use File.listFiles() or File.list() to get all the files in the directory and then recursively traverse the directory structure and delete all the files. However, since it's possible to create infinitely recursive directory structures (in both Windows and Linux, with symbolic links), presumably programs written in this style might never terminate.
So, is there a better way to write such a program so that it doesn't fall into these pitfalls? Do I need to keep track of everywhere I've traversed and make sure I don't go around in circles or is there a nicer way?
Update: In response to some of the answers (thanks guys!) - I'd rather the code didn't follow symbolic links and stayed within the directory it was supposed to delete. Can I rely on the Commons-IO implementation to do that, even in the Windows case?
If you really want your recursive directory deletion to follow through symbolic links, then I don't think there is any platform independent way of doing so without keeping track of all the directories you have traversed.
However, in pretty much every case I can think of you would just want to delete the actual symbolic link pointing to the directory rather than recursively following through the symbolic link.
If this is the behaviour you want then you can use the FileUtils.deleteDirectory method in Apache Commons IO.
Try Apache Commons IO for a tested implementation.
However, I don't think this handles the infinite-recursion problem.
File.getCanonicalPath() will tell you the "real" name of the file, including resolved symlinks. If, while scanning, you come across a directory you already know (because you stored its canonical path in a Map), bail out.
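A small sketch of the canonical-path bookkeeping, using a Set rather than a Map since only membership matters. It counts files instead of deleting them so that it is safe to try out; CycleSafeWalk is a hypothetical name.

```java
import java.io.File;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

class CycleSafeWalk {
    /** Counts regular files under root, visiting each real directory at most once. */
    static int countFiles(File root) throws IOException {
        return walk(root, new HashSet<String>());
    }

    private static int walk(File dir, Set<String> visited) throws IOException {
        // getCanonicalPath() resolves symlinks, so a link back to an
        // ancestor yields a path that is already in the set.
        if (!visited.add(dir.getCanonicalPath())) {
            return 0;   // already seen under another name: bail out
        }
        File[] children = dir.listFiles();
        if (children == null) {
            return 0;   // unreadable, or vanished in the meantime
        }
        int count = 0;
        for (File child : children) {
            if (child.isDirectory()) {
                count += walk(child, visited);
            } else if (child.isFile()) {
                count++;
            }
        }
        return count;
    }
}
```

This guarantees termination but still follows links out of the starting directory, so it is not by itself what you want for deletion.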
If you could know which files are symlinks, you could just skip over those.
There is unfortunately no "clean" way of detecting symlinks in Java. Check out this pure Java workaround or this one involving native code.
At least under Mac OS X, deleting a symbolic link to a directory does not delete the directory itself, so the link can be deleted even if the target directory is not empty.
I assume this holds for most POSIX operating systems. And as far as I know, links under Windows are also just files, and can be deleted as such from a Java program.
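With JDK 7 or later, a delete that stays inside the directory and removes links as links can be sketched with Files.walkFileTree, which does not follow symbolic links unless asked to. I have only reasoned about this on POSIX-style systems; the Windows behaviour is an assumption. DeleteTree is a made-up class name.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

class DeleteTree {
    /** Deletes root and everything below it without following symbolic links. */
    static void delete(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                // Symbolic links, including links to directories, arrive here
                // and are removed as links; their targets are left alone.
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;   // listing the directory failed
                }
                Files.delete(dir);   // empty by now, children were visited first
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

Because links are never entered, a loop made of symbolic links cannot cause infinite recursion here, and nothing outside the starting directory is touched.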