EDIT: Well, I'm back several months later. The lock mechanism I was trying to code doesn't work, because createNewFile isn't reliable on NFS. See the answer below.
Here is my situation: only one application may access the files, so I have no constraints regarding what other applications may do. However, the application runs concurrently on several servers in the production environment for redundancy and performance (a couple of machines each host a couple of JVMs running our app).
Basically, what I need is to put some kind of flag in a folder to tell the other instances to leave this folder alone as another instance is already dealing with it.
Many search results suggest using FileLock to achieve this, but I checked the Javadoc, and from my understanding it relies on the locking facilities of the host OS. So I doubt it will help much, since several different host machines are involved.
This question covers a similar subject : Java file locking on a network , and the accepted answer is recommending to implement your own kind of cooperative locking process (using the File.createNewFile() as asked by the OP).
The Javadoc of File.createNewFile() says that the process is atomically creating the file if it doesn't already exist. Does that work reliably in a network file system ?
I mean, how is it possible with the potential network lag to do both existence check and creation simultaneously ? :
The check for the existence of the file and the creation of the file if it does not exist are a single operation that is atomic with respect to all other filesystem activities that might affect the file.
No, createNewFile doesn't work properly on a network file system.
Even if the system call is atomic, it's only atomic regarding the OS, and not over the network.
Over time, I got a couple of collisions, about once every 2-3 months (approx. once every 600k files).
What happens is this: my program runs as 6 separate instances on 2 separate servers, so let's call them A1, A2, A3 and B1, B2, B3.
When A1, A2, and A3 try to create the same file, the OS can properly ensure that only one file is created, since it is working with itself.
When A1 and B1 try to create the same file at the same exact moment, there is some form of network cache and/or network delays happening, and they both get a true return from File.createNewFile().
My code then proceeds by renaming the parent folder to stop the other instances of the program from unnecessarily trying to process the folder and that's where it fails :
On A1, the folder renaming operation is successful, but the lock file can't be removed, so A1 just leaves it like that and keeps on processing new incoming folders.
On B1, the folder renaming operation (File.renameTo(), can't do much to fix it) gets stuck in an infinite loop because the folder was already renamed (also causing huge I/O traffic according to my sysadmin), and B1 is unable to process any new file until the program is restarted.
The check for the existence of the file and the creation of the file if it does not exist are a single operation that is atomic with respect to all other filesystem activities that might affect the file.
That can be implemented easily via the open() system call or its equivalents in any operating system I have ever used.
I mean, how is it possible with the potential network lag to do both existence check and creation simultaneously ?
There is a difference between simultaneously and atomically. The Javadoc does not say that this method performs two simultaneous actions; it says the two actions are designed to work atomically. If the method performs the two operations atomically, the file will never be created without first checking whether it exists. If the current call creates the file, no file by that name was present; if the call does not create the file, a file by that name already existed.
I don't see a reason to doubt that the function is atomic or works reliably, whether the call goes over the network or to a local disk. A local call is equally unreliable: so many things can go wrong in I/O.
What you should doubt is using the empty file created by this function as a lock, as explained in D-Mac's answer to this question; the Javadoc for this function explicitly mentions that too.
You are looking for a directory lock, and empty files working as a directory lock (to signal other processes and threads not to touch it) have worked quite well for me, provided due care is taken to write the logic for checking file existence, cleaning up lock files, and handling orphaned locks.
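As a rough illustration of that kind of cooperative lock file with orphan cleanup, here is a minimal sketch. The class name FolderLock, the ".lock" file name and the staleAfter timeout are all hypothetical, and the sketch assumes the file system really does create files atomically, which the edit at the top of the question reports was not the case on NFS.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: a marker file inside a folder acts as a cooperative lock. */
final class FolderLock {
    private static final String LOCK_NAME = ".lock"; // assumed name, not from the original post

    private final Path lockFile;
    private final Duration staleAfter;

    FolderLock(Path folder, Duration staleAfter) {
        this.lockFile = folder.resolve(LOCK_NAME);
        this.staleAfter = staleAfter;
    }

    /** Returns true only if this caller created the lock file. */
    boolean tryAcquire() throws IOException {
        removeIfOrphaned();
        try {
            Files.createFile(lockFile); // throws if the file already exists
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // somebody else holds the lock
        }
    }

    /** Removes the lock file; safe to call even if it is already gone. */
    void release() throws IOException {
        Files.deleteIfExists(lockFile);
    }

    /**
     * Deletes a lock file older than staleAfter. This step is itself racy and
     * compares the file's timestamp with the local clock, so it is only a
     * best-effort cleanup of locks left behind by crashed instances.
     */
    private void removeIfOrphaned() throws IOException {
        try {
            Instant modified = Files.getLastModifiedTime(lockFile).toInstant();
            if (modified.plus(staleAfter).isBefore(Instant.now())) {
                Files.deleteIfExists(lockFile);
            }
        } catch (NoSuchFileException e) {
            // no lock file present, nothing to clean up
        }
    }
}
```

A caller would wrap the folder processing in tryAcquire() and a finally block that calls release(), and skip the folder when tryAcquire() returns false.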
I have a Java application that creates multiple threads. There is 1 producer thread which reads from a 10gb file, parses that information, creates objects from it and puts them into multiple blocking queues (5 queues).
The rest of the 5 consumer threads read from a blockingqueue (each consumer thread has its own blockingqueue). The consumer threads then each write to an individual file, so 5 files in total get created. It takes around 30min to create all files.
The problem:
The threads are writing to an external mount directory on a Linux box. We've experienced problems where other Linux mounts have gone down and applications have crashed, so I want to prevent that in this application.
What I would like to do is keep checking whether the mount (directory) exists before writing to it. I'm assuming that if the directory goes down, a FileNotFoundException will be thrown. If that is the case, I want the application to keep checking whether the directory is there for about 10-20 minutes before crashing completely. Because I don't want to have to read the 10 GB file again, I want the consumer threads to be able to pick up from where they left off.
What I'm not sure about, in terms of best practice, is:
Is it best to check whether the directory exists in the main class before creating the threads? Or to check in each consumer thread?
If I keep checking whether the directory exists in each consumer thread, it seems like repeated code. I can check in the main class, but it takes 30 minutes to create these files. If the mount goes down during those 30 minutes and I'm only checking in the main class, the application will crash. Or, if I'm already writing to a directory, is it impossible for an external directory to go down? Does it get locked?
thank you
We have something similar in our application, but in our case we are running a web app and if our mounted file system goes down we just throw an exception, but we want to do something more elegant, like you do...
I would recommend using a combination of the following patterns: State, CircuitBreaker (which I believe is a more specific version of the State pattern), and Observer/Observable.
These would work in the following way...
Create something that represents your file system. Maybe a class called MountedFileSystem. Make all your write calls to this particular class.
This class will catch every FileNotFoundException, and when one occurs, the CircuitBreaker gets triggered. This change works like the State pattern: one state is when things are working 'fine', the other state is when things aren't working 'fine', meaning that the mount has gone away.
Then, in the background, I would have a task that starts on a thread and checks the actual underlying file system to see if it is back. When the file system is back, change the state in the MountedFileSystem, and fire an Event (Observer/Observable) to try writing the files again to disk.
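A minimal sketch of how those pieces might fit together is shown below. Only the MountedFileSystem name comes from the description above; the State enum, append(), probe() and onRecovery() are hypothetical. The sketch also catches IOException in general rather than only FileNotFoundException, on the assumption that a vanished mount can surface as other I/O errors too.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical wrapper through which all writes to the mount are routed. */
final class MountedFileSystem {
    enum State { AVAILABLE, UNAVAILABLE }

    private final Path root;
    private final List<Runnable> recoveryListeners = new CopyOnWriteArrayList<>();
    private volatile State state = State.AVAILABLE;

    MountedFileSystem(Path root) {
        this.root = root;
    }

    State state() {
        return state;
    }

    /** Observer registration: listeners run when the mount comes back. */
    void onRecovery(Runnable listener) {
        recoveryListeners.add(listener);
    }

    /** Appends data to a file under the mount; returns false if the breaker is open or the write fails. */
    boolean append(String fileName, byte[] data) {
        if (state == State.UNAVAILABLE) {
            return false; // breaker is open, do not touch the mount
        }
        try {
            Files.write(root.resolve(fileName), data,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            return true;
        } catch (IOException e) {
            state = State.UNAVAILABLE; // trip the breaker
            return false;
        }
    }

    /** Meant to be called periodically by a background task while the breaker is open. */
    synchronized void probe() {
        if (state == State.UNAVAILABLE && Files.isDirectory(root)) {
            state = State.AVAILABLE;
            recoveryListeners.forEach(Runnable::run); // notify observers to retry
        }
    }
}
```

The background task could be a ScheduledExecutorService that calls probe() every few seconds and gives up after the 10-20 minutes mentioned in the question; consumers would keep their unwritten items queued while append() returns false.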
And as yuan quigfei stated, I am fairly certain you're going to have to rewrite those files. I just don't see being able to restart writing to them, but perhaps someone else has an idea.
1. Write a method that detects whether the folder exists.
2. Call this method before the actual write.
3. Create the 5 threads based on step 2. Once you detect that a file no longer exists, you seem to have no choice but to rewrite it. Of course, you don't need to re-read the input if all your content is already in memory (which requires a lot of memory).
I'm working on a multithreaded server in Java.
The server monitors a directory of files. Clients can ask the server:
to download a file from the server directory
to upload a new version of an already existing file to the server, overwriting the old version in the server directory.
To do the transfers, I'm planning to use FileChannels and SocketChannels, using the methods transferFrom and transferTo. According to the documentation, these two methods are thread safe.
The catch is that a single call to either method might not be sufficient to read or write the entire file.
The problem arises if there is more than one request for the same file at the same time. In this scenario, multiple threads could be doing read/write operations on the same file. Each individual call to transferFrom/transferTo is thread safe, according to the Java documentation, but a transfer may need several calls. If thread A is replying to a download request and thread B is replying to an upload request for the same file, the following could happen:
Thread A starts reading from the file
In thread A, for some reason the read call returns before the EOF
Thread B overwrites the entire file with a single write call
Thread A continues reading from the file
In this case, the downloading client receives a portion of the old version and a portion of the new version.
To solve this I think I should be using some sort of locking, but I'm not sure how to do it in an efficient way. I could create two synchronized methods for reading and writing, but that obviously creates too much contention.
The best solution I have in mind is to use lock striping. Before doing any read/write operation, a hash based on the filename is calculated. Then, the lock in position lockArr[hash % numOfLocks] is acquired.
I think also that I should be using ReadWriteLocks, since multiple simultaneous reads should be allowed.
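For illustration, the striping idea combined with ReadWriteLocks might look like the sketch below. The class name StripedFileLocks and its methods are hypothetical. One detail worth noting: String.hashCode() can be negative, so a plain hash % numOfLocks could yield a negative index; the sketch uses Math.floorMod to stay in range.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical lock-striping helper: a fixed array of read/write locks shared by all files. */
final class StripedFileLocks {
    private final ReadWriteLock[] stripes;

    StripedFileLocks(int numOfLocks) {
        stripes = new ReadWriteLock[numOfLocks];
        for (int i = 0; i < numOfLocks; i++) {
            stripes[i] = new ReentrantReadWriteLock();
        }
    }

    /** Maps a file name to a stripe; floorMod keeps the index non-negative. */
    int stripeIndex(String fileName) {
        return Math.floorMod(fileName.hashCode(), stripes.length);
    }

    /** The same file name always yields the same lock; different names may share one. */
    ReadWriteLock lockFor(String fileName) {
        return stripes[stripeIndex(fileName)];
    }
}
```

A download would hold lockFor(name).readLock() for the whole loop of transferTo calls, and an upload would hold lockFor(name).writeLock() for the whole loop of transferFrom calls, each released in a finally block. Two unrelated files that hash to the same stripe will still block each other, which is the usual trade-off of striping.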
Now, this is my analysis of the problem and I could be completely wrong. Is there any better solution to this?
Locking means that somebody has to wait for somebody else -- not the best solution.
When the client uploads a file, you should write it out to a temp file on the same disk (usually in the same directory), and then when the file upload is done:
Rename the old version to a temporary name. Any current readers should be forced to close the old one, re-open the temp version, and seek to the correct position.
Rename the uploaded file to the target file name.
Delete the temp version of the old file when any readers are done with it.
In a typical implementation, you'd need a centralized class (let's call it ConcurrentFileAccessor) to manage the interactions between threads.
Readers would need to register with this class, and synchronize on some object during the actual read operation. When an upload completes, the writer would have to claim all those locks to block reads, close all the read files, rename the old version, reopen, seek, and then release them to allow the readers to continue.
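The write-to-temp-then-rename part can be sketched on its own, as below. This is a simplified variant, not the full scheme above: it swaps the two renames for a single Files.move with ATOMIC_MOVE and leaves out the reader registration entirely. It assumes a file system where an atomic move replaces an existing target (the Javadoc says this is implementation specific) and where readers that already have the old file open keep seeing the old content, as is typical on POSIX systems. The class name AtomicUpload is hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: publish an uploaded file by renaming a fully written temp file. */
final class AtomicUpload {
    static void replace(Path target, InputStream upload) throws IOException {
        // The temp file lives in the target's directory so the move stays on one file system.
        Path dir = target.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "upload-", ".tmp");
        try {
            // Write the complete upload first; the target is untouched while this runs.
            Files.copy(upload, temp, StandardCopyOption.REPLACE_EXISTING);
            // Then swap it in with one rename.
            Files.move(temp, target,
                    StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(temp); // only still present if something failed
        }
    }
}
```

New downloads that open the target after the move see the new version in full; they never see a half-written file because the target is never written in place.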
I am looking to force synchronisation to disk after files are written at certain points in my application. Since it runs on Linux, I could get away with just running
Runtime.getRuntime().exec("sync");
However, I would rather not introduce Linux specific system calls and would rather use
java.io.FileDescriptor#sync();
However, I use Apache VFS to perform operations on the local file system and to my knowledge it does not provide access to the underlying file descriptor. But do I need access to the actual file descriptor that was just written to to force synchronization? Could I not just use any FileDescriptor to call sync for the same effect, for example
FileDescriptor.in.sync();
Would that be a valid approach, and would the results match that of calling sync in Linux?
Just in case anyone knows if / how it is possible to get access to the underlying FileDescriptor in VFS, it would be useful to know as well.
Edit: it appears that
FileDescriptor.in.sync();
does not want to work on Linux (although it works on my Windows machine when run from Eclipse), but
new FileOutputStream(new File("anyfile")).getFD().sync();
definitely works and the results of calling this match the results of calling the Linux sync command directly. However, it involves opening and closing a redundant file output stream, so it's not exactly ideal. Any other reason this might be a bad idea, as it does seem to work? Is there some other way to get a FileDescriptor that can be used to sync?
I investigated such issues some time ago: Question 1, Question 2.
On Linux, a java.io.FileDescriptor#sync call ensures that the modified data of the file associated with the descriptor is sent to the disk. (That cheap disks tend to skip the write and only place the data in an unreliable (i.e. non-NVRAM) write cache is a different, additional problem.)
It does not guarantee that modified data of other files is written back as well. That is simply not in the contract of sync or of the underlying POSIX fsync function.
However, in certain circumstances (e.g. ext3 in data=ordered mode), an fsync on one file writes back all modified data of the file system. This is really fun, because it may create significant latencies just because some other application has created a ton of dirty blocks.
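Since the guarantee is per file, the portable approach is to sync the specific file that was written. A minimal sketch with plain NIO follows; it assumes the file can be opened directly by path rather than through Apache VFS, and the class and method names are hypothetical. FileChannel.force(true) asks for both content and metadata to be written to the storage device.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: write a file and force that same file to disk before closing it. */
final class DurableWrite {
    static void writeAndSync(Path file, String text) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer); // a single write may be partial
            }
            channel.force(true); // flush this file's data and metadata
        }
    }
}
```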
I am writing a Java application which should (among other things) generate a sequence of integers, starting with a given number (such as 900, 901, 902, 903, ... - the 900 is given as a parameter).
The current sequence value should persist when the application gets shut down and then started again.
When multiple instances of the application are running at the same time, they should share the same sequence (e.g. the union of the sequences generated by all instances should be the same as the sequence generated by a single instance, when running alone).
The administrator should be able to shut down the application and reset the current sequence value manually.
If the application crashes, the file should always stay accessible for other instances so that they can continue to work.
It was decided that the application would use a plain text file which would contain just the current number. When the application starts, it checks whether the file already exists and, if not, creates it and writes the initial number into it. Every time the application is about to generate a new number, it should read the current value inside the file, use it as the current sequence value, and then increment the number in the file.
I would like to know how to do these two things atomically (with regard to other running instances of the same application):
check whether a file exists and, if not, create it and write a number into it
read the current content of a file and then change it
read the current content of a file and then change it
Suggestions on how to achieve the listed goals in other ways are appreciated as well.
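One way the two steps might be made atomic with respect to other instances is an exclusive java.nio FileLock held around the whole read-modify-write, as in the sketch below. The class name FileSequence is hypothetical. The sketch assumes a file system on which OS-level file locks work between all instances (typically a local disk), and it is not crash-safe in the sense that a crash between truncating and rewriting could leave the file empty. OS-level locks are normally released when a process dies, so a crashed instance should not block the others.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sequence generator backed by a text file holding the next number. */
final class FileSequence {
    /**
     * Returns the next sequence value and stores value + 1 in the file.
     * synchronized covers threads in this JVM; the FileLock covers other processes.
     */
    static synchronized long next(Path file, long initial) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileLock lock = channel.lock()) { // blocks until the exclusive lock is granted
            ByteBuffer in = ByteBuffer.allocate(32);
            while (in.hasRemaining() && channel.read(in) != -1) {
                // keep reading until end of file or the buffer is full
            }
            in.flip();
            String text = StandardCharsets.UTF_8.decode(in).toString().trim();
            // An empty file means it was just created: start from the initial value.
            long current = text.isEmpty() ? initial : Long.parseLong(text);

            ByteBuffer out = ByteBuffer.wrap(
                    Long.toString(current + 1).getBytes(StandardCharsets.UTF_8));
            channel.truncate(0);
            channel.position(0);
            while (out.hasRemaining()) {
                channel.write(out);
            }
            channel.force(true); // persist before the lock is released
            return current;
        }
    }
}
```

With an initial value of 900, successive calls from any mix of instances would hand out 900, 901, 902, and so on, each exactly once, as long as the lock is honoured by the file system.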
Using a database sequence would be a simple and solid solution but you've decided it will be a file. Then you'll need to manage the distributed synchronization yourself. There are systems offering that, like Terracotta or Hazelcast. I would definitely use one of them instead of implementing a new one based on locking a file. Why not a database?
I would create a lock file when a client writes the file and delete that lock file immediately when the write process is done.
When the lock file is present, other clients will not read or write the db file and will wait until the lock file is deleted. Simultaneous reads are allowed.
Your questions:
1. Shutdown hook.
2. This is solved by using the lock file mechanism.
3. Every client could create an ID file beside the db file, and when that file is deleted by the admin, the client shuts down.
4. It depends: if the shutdown hook is respected this should not be a problem, but if the client is killed immediately you don't have any chance to clean up.
Problems:
If too many clients try to write the db file, you cannot make sure that the first client will be served first.
What happens if a client crashes during the write process and is not able to clean up the lock file?
What happens if two clients try to create the lock file at the same time? I think this depends on the OS and file system.
I have multiple applications running in one virtual machine. I have multiple virtual machines running on one server. And I have multiple servers. They all share a file using a shared folder on linux. The file is read and written by all applications. During the write process no application is allowed to read this file. The same for writing: If an application is reading the file no application is allowed to write it.
How do I manage to synchronize the applications so they will wait for the write process to finish before they read, and vice versa? (the applications inside a vm have to be synchronized and also applications across servers)
The current implementation uses "file semaphores". If the file is about to be written, the application tries to "acquire" the semaphore by creating an additional file (let's name it "file.semaphore") in the shared folder. If the "file.semaphore" file already exists, the semaphore is already locked by a different application. The problem with this approach is that I cannot make sure that the "file exists" test and the "create file" operation are executed atomically. So it is possible that two applications test for the "file.semaphore" file, see that it does not exist, and try to create the file at the same time.
You can use NIO locking capabilities. See FileChannel#lock().
However, this will work only if underlying filesystem supports locking over the network. Recent NFS should support it. Probably, Samba supports them too, but can’t say for sure.
See article for example.
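A minimal sketch of the reader/writer rule from the question using FileChannel locks: readers take a shared lock, writers an exclusive one. The class and method names are hypothetical, and whether the locks are honoured between machines depends on the network file system, as noted above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: shared lock for readers, exclusive lock for writers. */
final class SharedFileAccess {
    static String read(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ);
             // shared = true: other readers may hold the lock at the same time
             FileLock lock = channel.lock(0, Long.MAX_VALUE, true)) {
            ByteBuffer buffer = ByteBuffer.allocate((int) channel.size());
            while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                // keep reading until the buffer is full or end of file
            }
            buffer.flip();
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }

    static void write(Path file, String text) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = channel.lock()) { // exclusive: blocks readers and writers
            // Truncate only after the lock is held, so readers never see an empty file.
            channel.truncate(0);
            ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            channel.force(true);
        }
    }
}
```

The file is deliberately not opened with TRUNCATE_EXISTING, because that would empty it before the exclusive lock has been acquired.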
Have a look at the Javadocs for the createNewFile() method: they specifically state that creating files is not a reliable method for synchronization, and recommend the FileLock facility instead (it lives in a different package, java.nio.channels, so this is essentially the same as what Ivan Dubrov is suggesting).
This would imply that your identification of the problem is accurate, and no amount of playing around will solve this with traditional file creation. My first thought was to check the return code from createNewFile(), but if the Javadocs say it's not suitable then it's time to move on.
You need to combine file locking, for protection between JVMs, with synchronization between the threads of a given JVM. See the answer by cyber-monk here
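The reason both layers are needed is that a FileLock is held on behalf of the whole JVM: a second thread in the same JVM that tries to lock the same file gets an OverlappingFileLockException instead of waiting. A minimal sketch of the combination is below; it is not the code from the linked answer, and the names JvmAndFileLock, IoAction and withExclusiveLock are hypothetical. It uses one global in-JVM lock for simplicity, where a map with one lock per file would allow more concurrency.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper: in-JVM lock for threads plus FileLock for other processes. */
final class JvmAndFileLock {
    private static final ReentrantLock JVM_LOCK = new ReentrantLock();

    /** Work to run while both locks are held. */
    interface IoAction<T> {
        T run(FileChannel channel) throws IOException;
    }

    static <T> T withExclusiveLock(Path file, IoAction<T> action) throws IOException {
        JVM_LOCK.lock(); // first keep other threads of this JVM out
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileLock fileLock = channel.lock()) { // then keep other processes out
            return action.run(channel);
        } finally {
            JVM_LOCK.unlock(); // runs after the file lock and channel are closed
        }
    }
}
```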
I am also trying to determine the best way to solve this problem for a similar situation (fewer participating processes, but still the same underlying problem). If you haven't been able to employ the file locking scheme suggested by Ivan (e.g. system|language|network service does not support it), maybe you could designate one of the participants as a referee.
All participants write unique semaphores, call them "participant#.request", when they want the file. The referee polls the file system for these semaphores. When he sees one, he writes back "participant#.lock", and deletes the request. If he happens to see multiple at the "same time", he selects one at random (or the first by file modification time) and deletes only that request. Then, the participant issued the lock knows they can access the file safely. When the participant is done with the file, they delete their own lock. While there is a lock in place, no other locks are issued by the referee.
Any requests that are present after the user deletes their lock could be served a new lock without issuing a new request, so you could have the other users poll for their lock after sending the request. Probably this is what the locking mechanism is doing anyway, except maybe for the ability to manage the lock as a queue that comes with requests being processed in the order they are received (i.e. if the referee uses modification time).
Also, since you're in charge of the referee, you could set timeouts on locks, allowing him to issue timeout semaphores to the process that is hogging the file and then remove the lock (hoping of course that if the process with the lock died, it did so nicely).