I know that many OSes perform some sort of locking on the filesystem to prevent inconsistent views. Are there any guarantees that Java and/or Android make about thread-safety of file access? I would like to know as much about this as possible before I go ahead and write the concurrency code myself.
If I missed a similar question that was answered feel free to close this thread. Thanks.
Android is built on top of Linux, so inherits Linux's filesystem semantics. Unless you explicitly lock a file, multiple applications and threads can open it for read/write access. Unless you actually need cross-process file synchronization, I would suggest using normal Java synchronization primitives for arbitrating access to the file.
The normal reading/writing classes (FileInputStream, etc.) do not provide any locking AFAIK. To lock the file you need to go through a FileChannel. That would look something like:
FileInputStream in = new FileInputStream("file.txt");
FileChannel channel = in.getChannel();
// a read-only channel needs a *shared* lock; lock() with no arguments is exclusive
FileLock lock = channel.lock(0L, Long.MAX_VALUE, true);
try {
    // do some reading
} finally {
    lock.release();
    in.close();
}
I would read the File Lock doc, and take care with the threading!
From what I have searched so far, I have found two solutions.
One is to create or reuse a file, then try to lock the file.
File file = new File(lockFile);
RandomAccessFile randomAccessFile = new RandomAccessFile(file, "rw");
FileLock fileLock = randomAccessFile.getChannel().tryLock();
if (fileLock == null) {
    // someone else already holds the lock, so another instance of the application is running
    System.exit(0);
}
Advantage of this approach:
If the application is shut down abnormally, the OS will release the lock for us. We do not need to delete the file manually for this to keep working.
Source: How to implement a single instance Java application?
The other is to create a file, without any lock, and just use the presence of the file to determine whether the application is already running.
Disadvantage of this approach:
If the application shuts down abnormally, we need to remove the file manually before it can start again.
Though I personally think this is the worse option compared to 1, it seems libraries use this approach more often; for example, the Play Framework (RUNNING_PID file). A rough sketch of it is shown below.
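For illustration only, a minimal sketch of approach 2 (the class name and the marker file name are my own placeholders, not taken from Play or any other framework):

import java.io.File;
import java.io.IOException;

public class SingleInstanceMarker {
    public static void main(String[] args) throws IOException {
        // hypothetical marker file; Play, for example, uses RUNNING_PID
        File marker = new File("RUNNING.lock");
        // createNewFile() returns false if the file already exists
        if (!marker.createNewFile()) {
            // assume another instance is already running
            System.exit(0);
        }
        // best-effort cleanup on normal shutdown; does not help after a crash
        marker.deleteOnExit();
        // ... run the application ...
    }
}

Note that deleteOnExit() only runs on a normal JVM exit, which is exactly why the manual-removal disadvantage above applies.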
So can someone explain why frameworks seem to prefer 2 over 1? What are the advantages of such an approach?
In addition, does the choice depend on performance and ease of use? For example, should a client-side application choose 1 and a server-side application choose 2?
I have a multi-threaded application where some of my threads should read data from a queue and write it to a file. The problem is that I am not sure whether I should create a new BufferedWriter instance every time one of my threads reads a value from the queue and writes it to the same file, or whether I can have just one BufferedWriter instance and call flush() every time. One problem with the second solution is that I would have to detect when to close the BufferedWriter, without being able to rely on Java 7's neat try-with-resources for closing the resource.
Does the second solution solve any performance issues?
What are best practices on this?
I'd recommend using one BufferedWriter for writing to the file that is shared by all threads. For the sake of performance, the BufferedWriter should be kept open until the application decides that there is no more output. (Opening and closing files is relatively expensive.)
You also need to have the application threads use some kind of locking (e.g. synchronize on the BufferedWriter) to ensure that they don't try to write at the same time. The BufferedWriter class is not thread-safe.
The try/finally or try-with-resources approach to managing file resources is important in cases where you are opening lots of files. If you are only dealing with one file, and it needs to be open for the entire duration of the application, then resource management is not so critical. (But you do need to make sure that you either close or flush before the application exits.)
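A minimal sketch of that arrangement (the class name LogSink and its methods are made up for illustration):

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class LogSink implements AutoCloseable {
    // one shared writer for the whole application
    private final BufferedWriter writer;

    public LogSink(String path) throws IOException {
        this.writer = new BufferedWriter(new FileWriter(path, true)); // append mode
    }

    // synchronized so concurrent threads cannot interleave their lines
    public synchronized void writeLine(String line) throws IOException {
        writer.write(line);
        writer.newLine();
    }

    // called once, when the application decides there is no more output
    @Override
    public synchronized void close() throws IOException {
        writer.close(); // also flushes anything still buffered
    }
}

Every worker thread calls writeLine(...); only the shutdown path calls close().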
But I think BufferedWriter is thread-safe, because the underlying implementation of the write() methods uses synchronized blocks.
Well in that sense, yes it is. However, that assumes that when your thread writes data, it does it in a single write(...) call.
Note also that BufferedWriter is not specified to be thread-safe, even if it is thread-safe in the current implementation.
A BufferedWriter should only ever feed one file/Writer/OutputStream; if you have many targets you will need many buffers. If you want to write to the same file from multiple threads, you will need to synchronize somewhere; you can synchronize on the BufferedWriter itself if you don't have a higher-level construct than the character stream. If you synchronize on the BufferedWriter, you will not need to call flush() at the end of each access.
I read that new File("path") doesn't physically create a file on disk. However, the API documentation says:
Instances of this class may or may not denote an actual file-system
object such as a file or a directory. If it does denote such an object then that object resides in a partition. A partition is an operating system-specific portion of storage for a file system. A single storage device (e.g. a physical disk-drive, flash memory, CD-ROM) may contain multiple partitions.
So I'm curious if it is safe to have such code in multithreaded environment:
File file = new File( "myfile.zip" );
// do some operations with file (fill it with content)
file.saveSomewhere(); // just to denote that I save it after several operations
For example, thread1 gets here, creates an instance and starts doing operations. Meanwhile thread2 preempts it, creates its own instance with the same name (myfile.zip) and does some other operations. After that, each of them saves its file in turn.
I need to be sure that they don't work with the same file, and that the last thread to save the file overwrites the previous one.
No, File does not keep a lock and is not safe for the pattern you describe. You should either lock or keep the file in some wrapper class.
If you would provide a little bit more of your code, somebody can certainly help you find a suitable pattern.
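Purely as an illustration of the "wrapper class" idea (the SharedFile name and its save method are invented here, not taken from the question's code), the wrapper can tie the File to a lock so all writes to that path go through one synchronized method:

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class SharedFile {
    private final File file;

    public SharedFile(String name) {
        this.file = new File(name);
    }

    // all threads share one SharedFile instance, so saves cannot interleave
    public synchronized void save(byte[] content) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(content);
        }
    }
}

The threads would then share a single SharedFile instance instead of each constructing their own File for the same name.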
Certainly the lines you commented will not be thread-safe; you will have to protect them with a mutex or a monitor. The golden rule is: every time you write to something in a multithreaded context, you need to protect that region to guarantee thread-safety (Bernstein conditions).
I'm not sure whether the statement you propose needs to be protected too, as I have never used that call, but I thought this could be helpful to someone else as well.
From what I know and have researched, the synchronized keyword in Java lets you synchronize a method or code block to handle multi-threaded access. If I want to lock a file for writing purposes in a multi-threaded environment, I should use the classes in the Java NIO package to get the best results. Yesterday, I came up with a question about handling a shared servlet for file I/O operations, and BalusC's comments were a good help towards the solution, but the code in this answer confuses me. I'm not asking the community to "burn that post" or "let's downvote him" (note: I haven't downvoted it or anything, and I have nothing against the answer); I'm asking for an explanation of whether the code fragment can be considered good practice.
private static File theFile = new File("theonetoopen.txt");
private void someImportantIOMethod(Object stuff) {
    /*
     * This is the line that confuses me. You can use any object as a lock, but
     * is it good to use a File object for this purpose?
     */
    synchronized (theFile) {
        // Your file output writing code here.
    }
}
The problem is not about locking on a File object - you can lock on any object and it does not really matter (to some extent).
What strikes me is that you are using a non-final monitor, so if another part of your code reassigns theFile (theFile = new File(...);), the next thread that comes along will lock on a different object, and you no longer have any guarantee that your code won't be executed by 2 threads simultaneously.
Had theFile been final, the code would be ok, although it is preferable to use private monitors, just to make sure there is not another piece of code that uses it for other locking purposes.
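One way to apply both suggestions looks like this (just a sketch; the method body is elided as in the question):

private static final File theFile = new File("theonetoopen.txt");
// a dedicated private monitor, so no other code can accidentally lock on the File instance
private static final Object theFileLock = new Object();

private void someImportantIOMethod(Object stuff) {
    synchronized (theFileLock) {
        // Your file output writing code here.
    }
}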
If you only need to lock the file within a single application then it's OK (assuming final is added).
Note that the solution won't work if you load the class more than once using different class loaders. For example, if you have a web application that is deployed twice in the same web server, each instance of the application will have its own lock object.
As you mention, if you want the locking to be robust and have the file locked from other programs too, you should use FileLock (see the docs, on some systems it is not guaranteed that all programs must respect the lock).
Had you seen final Object lock = new Object(), would you be asking?
As @assylias pointed out, the problem is that the lock is not final here.
Every object in Java can act as a lock for synchronization. They are called intrinsic locks. Only one thread at a time can execute a block of code guarded by a given lock.
More on that: http://docs.oracle.com/javase/tutorial/essential/concurrency/locksync.html
Using the synchronized keyword on a whole method can have a performance impact on your application. That's why you can sometimes use a synchronized block instead.
You should remember that the lock reference must not be changed. The best solution is to use the final keyword.
I need to create a File System Manager (more or less) which can read or write data to files.
My problem is how do I handle concurrency?
I can do something like
public class FileSystemManager {
    private ReadWriteLock readWriteLock = new ReentrantReadWriteLock();

    public byte[] read(String path) {
        readWriteLock.readLock().lock();
        try {
            ...
        } finally {
            readWriteLock.readLock().unlock();
        }
    }

    public void write(String path, byte[] data) {
        readWriteLock.writeLock().lock();
        try {
            ...
        } finally {
            readWriteLock.writeLock().unlock();
        }
    }
}
But this would mean that all access to write() (for example) is serialized behind a single lock, even if the first invocation targets /tmp/file1.txt and the second targets /tmp/file2.txt.
Any ideas how to go about this?
Suggestion: use message passing for concurrency, not threads.
In general, this kind of locking happens beneath the Java level. Are you really planning on reading and writing the same files and directories? Implementing directories?
Right now there is lots of unsafe threading code that may start blowing up as threads start really running together on multicore hardware.
My suggestion would be to manage concurrency via message passing. You can roll your own if this is an educational exercise, or use one of zillions of queuing and message systems out there. In this kind of system you have only one reader/writer process, but possibly many clients. You avoid the need for file and thread concurrency management.
One advantage of the message passing paradigm is that it will be much, much easier to create a distributed system for load balancing.
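A minimal in-process sketch of that idea, with a single writer thread draining a queue that any number of client threads feed (class and file names are placeholders):

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class SingleWriter {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // any thread may enqueue messages
    public void submit(String line) {
        queue.add(line);
    }

    // only the thread started here ever touches the file
    public void start(String path) {
        Thread writerThread = new Thread(() -> {
            try (BufferedWriter out = new BufferedWriter(new FileWriter(path, true))) {
                while (!Thread.currentThread().isInterrupted()) {
                    out.write(queue.take());
                    out.newLine();
                }
            } catch (InterruptedException | IOException e) {
                // shutdown or I/O failure; handle as appropriate
            }
        });
        writerThread.setDaemon(true);
        writerThread.start();
    }
}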
Can't you create a different object for each Path and then use synchronized blocks and synchronize on "this"?
You can store the ReadWriteLock instances in a map keyed on path, just make sure that you get concurrent access to the map correct (possibly using ConcurrentHashMap).
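For example, something along these lines (a sketch only; computeIfAbsent requires Java 8):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class PathLocks {
    private final ConcurrentMap<String, ReadWriteLock> locks = new ConcurrentHashMap<>();

    // each distinct path gets its own ReadWriteLock, created on first use
    public ReadWriteLock forPath(String path) {
        return locks.computeIfAbsent(path, p -> new ReentrantReadWriteLock());
    }
}

read(path) would then take forPath(path).readLock() and write(path) the corresponding write lock, so /tmp/file1.txt and /tmp/file2.txt no longer block each other.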
If you actually care about locking the file using operating system primitives you might try looking into using java.nio.FileChannel. This has support for fine grained locking of file regions among other things. Also check out java.nio.channels.FileLock.
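A rough sketch of a region lock (the file name and the 1024-byte region are arbitrary choices for illustration):

import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class RegionLockExample {
    public static void main(String[] args) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile("data.bin", "rw");
             FileChannel channel = raf.getChannel()) {
            // lock only the first 1024 bytes, exclusively (shared = false)
            FileLock lock = channel.lock(0, 1024, false);
            try {
                // read or write within the locked region
            } finally {
                lock.release();
            }
        }
    }
}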
I would look deeply into Java 5 and the java.util.concurrent package. I'd also recommend reading Brian Goetz' "Java Concurrency in Practice".