Java - Check if other programs read a file? - java

Can I somehow check if another program reads a specified file?
I want my program to monitor a file, and whenever it is accessed by another program, it should run some code. Is this possible?

As some people have mentioned, Java's NIO.2 API (java.nio.file.WatchService) lets you watch a directory for the following kinds of activity:
ENTRY_CREATE – A directory entry is created.
ENTRY_DELETE – A directory entry is deleted.
ENTRY_MODIFY – A directory entry is modified.
OVERFLOW – Indicates that events might have been lost or discarded. You do not have to register for the OVERFLOW event to receive it.
However, as you can see, none of these events tells you that a file has been read. If you really want to detect that, you will have to write some native code.
On Windows, you can list the processes that have a file open using the Sysinternals Handle utility. I believe you could run this command from a Java program (say, every couple of minutes) and parse its output to detect whether the file is in use.
I'm pretty sure there are alternatives for other operating systems.
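As a rough illustration of the event kinds listed above, here is a minimal sketch that registers a directory with a WatchService and prints whatever events arrive. The class and method names are made up for this example, and it watches the current directory unless a path is passed as an argument. Note that it reports creations, deletions and modifications only; a plain read of a file produces no event.

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import static java.nio.file.StandardWatchEventKinds.*;

public class DirectoryWatchSketch {

    /** Registers dir for create, delete and modify events. */
    public static WatchService watch(Path dir) throws IOException {
        WatchService watcher = dir.getFileSystem().newWatchService();
        dir.register(watcher, ENTRY_CREATE, ENTRY_DELETE, ENTRY_MODIFY);
        return watcher;
    }

    /** Waits up to timeoutSeconds for one batch of events and describes them. */
    public static List<String> nextEvents(WatchService watcher, long timeoutSeconds)
            throws InterruptedException {
        List<String> seen = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutSeconds, TimeUnit.SECONDS);
        if (key == null) {
            return seen; // nothing happened within the timeout
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            // context() is the affected entry's name (null for OVERFLOW)
            seen.add(event.kind().name() + " " + event.context());
        }
        key.reset(); // re-arm the key so later events are delivered
        return seen;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        try (WatchService watcher = watch(dir)) {
            while (true) {
                nextEvents(watcher, 60).forEach(System.out::println);
            }
        }
    }
}
```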

The BasicFileAttributes interface exposes the last access time, but it cannot tell you which program accessed the file. As others have mentioned, WatchService cannot tell you that either. What you want can instead be achieved by having those programs log their accesses and then checking the logs to decide what to do next.
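Reading the last access time could look roughly like the sketch below. The class and method names are hypothetical. One caveat worth stating: many file systems are mounted so that access times are updated lazily or not at all (for example with noatime or relatime on Linux), so an unchanged timestamp does not prove the file was left alone.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.nio.file.attribute.FileTime;

public class LastAccessSketch {

    /** Returns the last access time the file system reports for the file. */
    public static FileTime lastAccess(Path file) throws IOException {
        return Files.readAttributes(file, BasicFileAttributes.class).lastAccessTime();
    }

    /** True if the reported access time is later than a previously stored one. */
    public static boolean accessedSince(Path file, FileTime previous) throws IOException {
        return lastAccess(file).compareTo(previous) > 0;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]);
        System.out.println("Last access: " + lastAccess(file));
    }
}
```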

Related

What would be best practice If I am trying to constantly check if a directory exists? JAVA

I have a Java application that creates multiple threads. There is 1 producer thread which reads from a 10 GB file, parses that information, creates objects from it and puts them into multiple blocking queues (5 queues).
The 5 consumer threads each read from a blocking queue (each consumer thread has its own queue). Each consumer thread then writes to an individual file, so 5 files get created in total. It takes around 30 minutes to create all the files.
The problem:
The threads are writing to an external mount directory on a Linux box. We've experienced problems where other Linux mounts have gone down and applications crashed, so I want to prevent that in this application.
What I would like to do is keep checking whether the mount (directory) exists before writing to it. I'm assuming that if the directory goes down, a FileNotFoundException will be thrown. If that is the case, I want the application to keep checking whether the directory is there for about 10-20 minutes before crashing completely. Because I don't want to have to read the 10 GB file again, I want the consumer threads to be able to pick up where they left off.
What I'm not sure about is the best practice:
Is it best to check if the directory exists in the main class before creating the threads? Or check in each consumer thread?
If I keep checking whether the directory exists in each consumer thread, it seems like repeated code. I can check in the main class, but it takes 30 minutes to create these files. If the mount goes down during those 30 minutes and I'm only checking in the main class, the application will crash. Or, if I'm already writing to a directory, is it impossible for an external directory to go down? Does it get locked?
thank you
We have something similar in our application, but in our case we are running a web app and if our mounted file system goes down we just throw an exception, but we want to do something more elegant, like you do...
I would recommend using a combination of the following patterns: State, CircuitBreaker (which I believe is a more specific version of the State pattern), and Observer/Observable.
These would work in the following way...
Create something that represents your file system. Maybe a class called MountedFileSystem. Make all your write calls to this particular class.
This class will catch every FileNotFoundException, and when one occurs, the CircuitBreaker gets triggered. This change works like the State pattern: one state is when things are working 'fine', and the other is when things aren't working 'fine', meaning that the mount has gone away.
Then, in the background, I would have a task that starts on a thread and checks the actual underlying file system to see if it is back. When the file system is back, change the state in the MountedFileSystem, and fire an Event (Observer/Observable) to try writing the files again to disk.
And as yuan quigfei stated, I am fairly certain you're going to have to rewrite those files. I just don't see being able to restart writing to them, but perhaps someone else has an idea.
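The MountedFileSystem idea described above might be sketched as follows. This is only one possible shape, not a tested production design: the method names (append, probe, onRecovery) are invented for the example, the sketch catches IOException in general because java.nio.file reports a missing directory as NoSuchFileException rather than FileNotFoundException, and it assumes probe() is called from a single background thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class MountedFileSystem {

    public enum State { AVAILABLE, UNAVAILABLE }

    private final Path root;
    private final List<Runnable> recoveryListeners = new CopyOnWriteArrayList<>();
    private volatile State state = State.AVAILABLE;

    public MountedFileSystem(Path root) {
        this.root = root;
    }

    public State state() {
        return state;
    }

    /** Observer hook: listeners run when the mount is seen again. */
    public void onRecovery(Runnable listener) {
        recoveryListeners.add(listener);
    }

    /** Appends data to a file under the mount; returns false if the write did not happen. */
    public boolean append(String relativeName, byte[] data) {
        if (state == State.UNAVAILABLE) {
            return false; // breaker is open: fail fast without touching the mount
        }
        try {
            Files.write(root.resolve(relativeName), data,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            return true;
        } catch (IOException e) {
            state = State.UNAVAILABLE; // trip the breaker
            return false;
        }
    }

    /** Meant to be called periodically by a background task while the mount is down. */
    public void probe() {
        if (state == State.UNAVAILABLE && Files.isDirectory(root)) {
            state = State.AVAILABLE;
            recoveryListeners.forEach(Runnable::run);
        }
    }
}
```

A consumer thread would keep the record it failed to write and retry it from a recovery listener; whether a partially written file can be resumed is a separate question, as noted above.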
1. Write a method that detects whether the folder exists.
2. Call this method before each actual write.
3. Create the 5 threads based on step 2. Once you detect that the file no longer exists, you seem to have no choice but to rewrite it. Of course, you don't need to re-read the input if all your content is already in memory (which needs a lot of memory).
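The "detect whether the folder exists" method, combined with the 10-20 minute grace period from the question, could be sketched like this. The class name, method name and parameters are assumptions made for the example; each consumer thread could call the same helper, which avoids repeating the logic.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

public class MountWaitSketch {

    /**
     * Polls until dir exists as a directory or maxWait has elapsed.
     * Returns true if the directory is there, false if the wait timed out.
     */
    public static boolean waitForDirectory(Path dir, Duration maxWait, Duration pollInterval)
            throws InterruptedException {
        long deadline = System.nanoTime() + maxWait.toNanos();
        while (true) {
            if (Files.isDirectory(dir)) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Sleep for the poll interval, but never past the deadline.
            long sleepMillis = Math.min(pollInterval.toMillis(), remainingNanos / 1_000_000 + 1);
            Thread.sleep(sleepMillis);
        }
    }
}
```

For example, a consumer might call waitForDirectory(mount, Duration.ofMinutes(15), Duration.ofSeconds(10)) from the catch block of a failed write and give up only when it returns false.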

Is java.io.File.createNewFile() atomic in a network file system?

EDIT : Well, I'm back a bunch of months later, the lock mechanism that I was trying to code doesn't work, because createNewFile isn't reliable on the NFS. Check the answer below.
Here is my situation : I have only 1 application which may access the files, so I don't have any constraint about what other applications may do, but the application is running concurrently on several servers in the production environment for redundancy and performance purposes (a couple of machines are hosting each a couple of JVM with our apps).
Basically, what I need is to put some kind of flag in a folder to tell the other instances to leave this folder alone as another instance is already dealing with it.
Many search results are telling to use FileLock to achieve this, but I checked the Javadoc, and from my understanding it will not help much, since it's using the hosting OS's locking possibilities. So I doubt that it will help much since there are different hosting machines.
This question covers a similar subject : Java file locking on a network , and the accepted answer is recommending to implement your own kind of cooperative locking process (using the File.createNewFile() as asked by the OP).
The Javadoc of File.createNewFile() says that the process is atomically creating the file if it doesn't already exist. Does that work reliably in a network file system ?
I mean, how is it possible with the potential network lag to do both existence check and creation simultaneously ? :
The check for the existence of the file and the creation of the file if it does not exist are a single operation that is atomic with respect to all other filesystem activities that might affect the file.
No, createNewFile doesn't work properly on a network file system.
Even if the system call is atomic, it's only atomic regarding the OS, and not over the network.
Over time, I got a couple of collisions, roughly once every 2-3 months (approx. once every 600k files).
What happens is this: my program runs as 6 separate instances on 2 separate servers, so let's call them A1, A2, A3 and B1, B2, B3.
When A1, A2, and A3 try to create the same file, the OS can properly ensure that only one file is created, since it is working with itself.
When A1 and B1 try to create the same file at the same exact moment, there is some form of network cache and/or network delays happening, and they both get a true return from File.createNewFile().
My code then proceeds by renaming the parent folder to stop the other instances of the program from unnecessarily trying to process the folder and that's where it fails :
On A1, the folder renaming operation is successful, but the lock file can't be removed, so A1 just leaves it as it is and keeps on processing new incoming folders.
On B1, the folder renaming operation (File.renameTo(), so I can't do much to fix it) gets stuck in an infinite loop because the folder was already renamed (also causing huge I/O traffic according to my sysadmin), and B1 is unable to process any new file until the program is restarted.
The check for the existence of the file and the creation of the file if it does not exist are a single operation that is atomic with respect to all other filesystem activities that might affect the file.
That can be implemented easily via the open() system call or its equivalents in any operating system I have ever used.
I mean, how is it possible with the potential network lag to do both existence check and creation simultaneously ?
There is a difference between simultaneously and atomically. The Javadoc does not say that this function performs two simultaneous actions; it says the two actions are designed to work atomically. If the method performs the two operations atomically, that means a file will never be created without first checking for its existence. If the file is created by the current call, no file by that name was present; if the file is not created, a file by that name already existed.
I don't see a reason to doubt that the function is atomic or works reliably, whether the call goes over the network or to a local disk. A local call is equally unreliable: so many things can go wrong in I/O.
What you do have to doubt is using the empty file created by this function as a lock, as explained in D-Mac's answer to this question, and as the Javadoc for this function explicitly mentions too.
You are looking for a directory lock, and empty files working as a directory lock (to signal other processes and threads not to touch it) have worked quite well for me, provided due care is taken to write the logic for checking file existence, cleaning up lock files and handling orphaned locks.
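A cooperative directory lock of the kind discussed in this thread might be sketched as below. The ".lock" file name, the class name and the age-based staleness rule are assumptions for the example. Bear in mind the collision report earlier in the thread: on a network file system two hosts may both see createNewFile() return true, so this sketch is only as trustworthy as the underlying file system.

```java
import java.io.File;
import java.io.IOException;

public class DirectoryLockSketch {

    private static final String LOCK_NAME = ".lock"; // hypothetical marker file name

    /** Returns true if this caller created the marker file and so holds the lock. */
    public static boolean tryLock(File dir) throws IOException {
        return new File(dir, LOCK_NAME).createNewFile();
    }

    /** Removes the marker file; returns false if it could not be deleted. */
    public static boolean unlock(File dir) {
        return new File(dir, LOCK_NAME).delete();
    }

    /** Treats a marker older than maxAgeMillis as orphaned (a deliberately crude rule). */
    public static boolean isStale(File dir, long maxAgeMillis) {
        File lock = new File(dir, LOCK_NAME);
        return lock.exists()
                && System.currentTimeMillis() - lock.lastModified() > maxAgeMillis;
    }
}
```

A caller would typically wrap the work in try/finally so that unlock runs even when processing fails, and decide separately what to do with a stale marker.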

Detect java program running on linux machine

Ok, so I have a couple of Java programs that I'm running using a cron job on a Linux server. These jobs run every ten minutes or so, take literally two minutes to run, and then exit. I need to add a way for the programs to detect, when they start up, if there is already an instance of themselves running, and if so to exit without going any further. I'm really not sure of the best way to handle this though and am hoping someone can offer some advice.
One approach I've considered is to run a command from the Java code that does some sort of ps command and looks through the output to see if the program is running. This seems pretty finicky and complex, though, for something so small. Plus, I'm not all that knowledgeable about Linux and am not even sure of the best way to do that. If anyone has some better thoughts, please let me know. Or if that is the best way, I'd appreciate it if you could provide the Linux commands I'd need. Thanks.
If you have a writable /tmp directory you can use a lockfile.
When your Java program starts up, check for a file with a name unique to your application (e.g. "my-lock-file.lock") in the /tmp directory. If none exists, create one, and remove it when you're done. If one exists, just exit.
You can check the existence of a file with the .exists() method of the java.io.File class.
If your code needs to be portable, you can use System.getProperty("java.io.tmpdir") to get an appropriate temporary directory for the platform your code is running on.
You could look at JMX and the Attach API to query for running JVMs.
Or, as Andrew logvinov mentioned, by using a lock file.
If you are using Java WebStart, there's already native support for this.
Many programs solve this by creating a temporary file that points to their PID (often referred to as a "lock" file). The filename should encode all relevant information to distinguish this process from other processes that could legitimately run in parallel.
For example, if the process is bound to a user, it should contain the user name. If the process is bound to a machine, it should (also) contain the hostname. (If you put the file in a machine-bound temp directory, this is debatable. If you put it in a home directory, think of the case of multiple machines sharing a home via NFS.)
The location of these files is typically /tmp. This is a great location, as /tmp is typically wiped during system boot, so no orphan files are left in case of a system crash. Another solution employed by some programs is to put the lock file in the user settings directory, if it is related to the settings. E.g. Mozilla Thunderbird has a file called /home/<username>/.thunderbird/<profilename>.default/lock.
The file should contain the PID of the process. The idea is simple: If the file contains the PID, it is easy to check whether this process is indeed still running. So if the process crashes, the file gets orphaned. The new process instance will check the PID in the file, see that it is not running any more, and ignore the file (overwrite).
Putting it all together, you could create a file like this:
/tmp/myawesomeservice-username-hostname-lock
With the content:
12345
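The PID lock file described above could be sketched as follows, assuming Java 9 or later for ProcessHandle. The class name and the file name built in main are illustrative only. Two limitations are worth keeping in mind: a PID can be reused by an unrelated process after a crash, and there is a small window between deleting an orphaned file and creating the new one in which two starting instances could interfere with each other.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class PidLockSketch {

    /** Returns true if this process now owns the lock file. */
    public static boolean acquire(Path lockFile) throws IOException {
        if (Files.exists(lockFile)) {
            if (ownerIsAlive(lockFile)) {
                return false; // another instance is still running
            }
            Files.deleteIfExists(lockFile); // orphaned lock left by a crashed process
        }
        byte[] pid = Long.toString(ProcessHandle.current().pid())
                .getBytes(StandardCharsets.UTF_8);
        try {
            // CREATE_NEW fails if someone else created the file in the meantime.
            Files.write(lockFile, pid, StandardOpenOption.CREATE_NEW);
        } catch (FileAlreadyExistsException e) {
            return false;
        }
        lockFile.toFile().deleteOnExit(); // clean up on normal JVM exit
        return true;
    }

    /** True if the PID stored in the file belongs to a live process. */
    static boolean ownerIsAlive(Path lockFile) throws IOException {
        String text = new String(Files.readAllBytes(lockFile), StandardCharsets.UTF_8).trim();
        try {
            long pid = Long.parseLong(text);
            return ProcessHandle.of(pid).map(ProcessHandle::isAlive).orElse(false);
        } catch (NumberFormatException e) {
            return false; // unreadable content: treat the file as orphaned
        }
    }

    public static void main(String[] args) throws IOException {
        Path lock = Paths.get(System.getProperty("java.io.tmpdir"),
                "myawesomeservice-" + System.getProperty("user.name") + "-lock");
        if (!acquire(lock)) {
            System.out.println("Another instance is running; exiting.");
            return;
        }
        System.out.println("Lock acquired; doing the work.");
    }
}
```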

How to check whether a file is being used by another program

Is it possible in java to check whether a file is being used by another program?
If you have no control over the program that could potentially be using the file, then generally no.
If you do have control, then the program could tell you whether or not it's using the file.
If the file is accessed by another Java program, you can use java.nio.channels.FileLock, bearing in mind that its behavior is entirely dependent on the OS.
See the link for further details.
http://docs.oracle.com/javase/1.4.2/docs/api/java/nio/channels/FileLock.html
On Windows platforms a file is locked when it is opened. So by testing (directly or indirectly) for a read or write lock, you can get a hint of whether some other process is reading or writing the file. The problem is that most platforms don't lock open files by default, and on all platforms, files can be locked independently of reading and writing. So this approach is going to be non-portable and unreliable, unless you have specific information about how "the other" application behaves.
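Testing for a lock, as described above, might look like the following sketch. The class and method names are invented for the example. A true result only means that no conflicting lock was observed; on platforms where opening a file does not lock it, another process may still be reading or writing the file. On Windows, opening the channel itself may throw an IOException if another process holds the file exclusively.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class FileLockProbeSketch {

    /** Returns true if an exclusive lock could be taken (it is released again at once). */
    public static boolean canLockExclusively(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
            FileLock lock = channel.tryLock();
            if (lock == null) {
                return false; // another process holds an overlapping lock
            }
            lock.release();
            return true;
        } catch (OverlappingFileLockException e) {
            return false; // this JVM already holds a lock on the file
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canLockExclusively(Paths.get(args[0]))
                ? "No conflicting lock detected"
                : "File appears to be locked");
    }
}
```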

How to check for a dynamically created file in Java?

I have an application where I need to check for a file which may be created dynamically during my execution, I will give up after some MAX time where the file has yet to show up. I wanted to know if there was a more efficient method in Java of checking for the file other than polling for it and then sleeping every X seconds? If not what would be the most efficient manner of doing this?
Before Java 7, you have to poll the file system as you mentioned. Java 7 added file system notifications in the form of java.nio.file.WatchService, which makes this easier.
If the same program is doing the creation of the file as the polling, you could instead have the logic that creates the file notify the part of the program using Object.notify(). A general description of the wait() and notify/notifyAll() mechanism can be found here: http://java.sun.com/docs/books/tutorial/essential/concurrency/guardmeth.html
You could try JPoller to poll for the file changes.
If you are running on Windows, you can get directory change notifications, see Obtaining Directory Change Notifications. Of course, this is not cross-platform, and will require use of JNA or a similar native bridge. In fact, JNA offers such a class, the FileMonitor class (in the download), which uses the underlying platform's file change notification.
If you are watching a handful of files or fewer, then of course polling is unlikely to be a performance problem. It's just not a "feel-good" solution, but not so bad as to warrant the pain of a non-pure-Java solution. Monitoring directories containing thousands of files, on the other hand, would benefit from direct notification from the OS.
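On Java 7 and later, waiting for a file to appear with a maximum timeout can be sketched with WatchService instead of a sleep loop. The class and method names below are assumptions made for the example. On platforms without native notification support the JDK falls back to polling internally, so this is not guaranteed to be cheaper everywhere.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.TimeUnit;

public class FileAppearanceSketch {

    /** Waits until file exists or the timeout elapses; returns whether it exists. */
    public static boolean awaitCreation(Path file, long timeout, TimeUnit unit)
            throws IOException, InterruptedException {
        Path target = file.toAbsolutePath();
        Path dir = target.getParent();
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        try (WatchService watcher = dir.getFileSystem().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            // Check after registering so a file created in between is not missed.
            if (Files.exists(target)) {
                return true;
            }
            while (true) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    return Files.exists(target);
                }
                WatchKey key = watcher.poll(remaining, TimeUnit.NANOSECONDS);
                if (key == null) {
                    return Files.exists(target); // timed out
                }
                key.pollEvents(); // drain; the exists check also covers OVERFLOW
                if (Files.exists(target)) {
                    return true;
                }
                if (!key.reset()) {
                    return Files.exists(target); // directory can no longer be watched
                }
            }
        }
    }
}
```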
