Manage a lock in a file system - java

I have two Java processes and need to make sure that they do not simultaneously access directory /dir. I am not sure how to properly implement this behaviour.
My idea would be to define a certain file lock.txt and do something like
if not (lock.txt exists)
{
create lock.txt with content "process 1"
do something in /dir
delete lock.txt
}
But I guess I could run into some kind of race condition if both processes check this simultaneously.
EDIT: my Java processes are separate programs.

Look at the FileLock class here: http://docs.oracle.com/javase/6/docs/api/java/nio/channels/FileLock.html
You could have found this with a bit of googling
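A minimal sketch of how FileLock might be used here, assuming both programs agree on a separate lock file (its path is up to you) and take the lock before touching /dir. The names DirectoryLock and withLock are made up for illustration and are not part of any API.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only.
final class DirectoryLock {

    /**
     * Runs the task while holding an exclusive lock on lockFile.
     * Blocks until the lock becomes available.
     */
    static void withLock(Path lockFile, Runnable task) throws IOException {
        try (FileChannel channel = FileChannel.open(lockFile,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = channel.lock()) {
            // The lock is held until the try block exits.
            task.run();
        }
    }
}
```

Each process would call something like DirectoryLock.withLock(lockPath, () -> doSomethingInDir()). File locks are held on behalf of the whole JVM, so this coordinates separate processes; it is not a substitute for synchronizing threads inside one process. The operating system releases the lock if the process dies, so no stale lock.txt is left to clean up.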

Related

Is Files.copy a thread-safe function in Java?

I have a function whose purpose is to create a directory and copy a csv file to that directory. This same function gets run multiple times, each time by an object in a different thread. It gets called in the object's constructor, but I have logic in there to only copy the file if it does not already exist (meaning, it checks to make sure that one of the other instances in parallel did not already create it).
Now, I know that I could simply rearrange the code so that this directory is created and the file is copied before the objects are run in parallel, but that is not ideal for my use case.
I am wondering, will the following code ever fail? That is, due to one of the instances being in the middle of copying a file, while another instance attempts to start copying that same file to the same location?
private void prepareGroupDirectory() {
    new File(outputGroupFolderPath).mkdirs();
    String map = "/path/map.csv";
    File source = new File(map);
    String myFile = "/path/test_map.csv";
    File dest = new File(myFile);
    // copy file
    if (!dest.exists()) {
        try {
            Files.copy(source, dest);
        } catch (Exception e) {
            // do nothing
        }
    }
}
To sum it all up. Is this function thread-safe in the sense that, different threads could all run this function in parallel without it breaking? I think yes, but any thoughts would be helpful!
To be clear, I have tested this many many times and it has worked every time. I am asking this question to make sure, that in theory, it will still never fail.
EDIT: Also, this is highly simplified so that I could ask the question in an easy to understand format.
This is what I have now after following comments (I still need to use nio instead), but this is currently working:
private void prepareGroupDirectory() {
    new File(outputGroupFolderPath).mkdirs();
    logger.info("created group directory");
    String map = instance.getUploadedMapPath().toString();
    File source = new File(map);
    String myFile = FilenameUtils.getBaseName(map) + "." + FilenameUtils.getExtension(map);
    File dest = new File(outputGroupFolderPath + File.separator + "results_" + myFile);
    instance.setWritableMapForGroup(dest.getAbsolutePath());
    logger.info("instance details at time of preparing group folder: {} ", instance);
    final ReentrantLock lock = new ReentrantLock();
    lock.lock();
    try {
        // copy file
        if (!dest.exists()) {
            String pathToWritableMap = createCopyOfMap(source, dest);
            logger.info(pathToWritableMap);
        }
    } catch (Exception e) {
        // do nothing
        // thread-safe
    } finally {
        lock.unlock();
    }
}
It isn't.
What you're looking for is the concept of rotate-into-place. The problem with file operations is that almost none of it is atomic.
Presumably you don't just want 'only one' thread to win the race for making this file, you also want that file to either be perfect, or not exist at all: You would not want anybody to be able to observe that CSV file in a half-baked state, and you most certainly wouldn't want a crash halfway through generating the CSV file to mean that the file is there, half-baked, but its mere existence means it prevents any attempt to write it out properly. You can't use finally blocks or exception catching to address this issue; someone might trip over a powercable.
So, how do you solve all these problems?
You do not write to foo.csv. Instead you write to foo.csv.23498124908.tmp where that number is randomly generated. Because that just isn't the actual CSV file anybody is looking for, you can take all the time in the world to finish it properly. Once it is done, then you do the magic trick:
You rename foo.csv.23498124908.tmp into foo.csv, and do so atomically - one instant in time foo.csv does not exist, the next instant in time it does and it has the complete contents. One caveat: what happens when foo.csv already exists is implementation specific. The Files.move documentation says an atomic move onto an existing target either replaces it or fails with an IOException, depending on the platform (a POSIX rename, for instance, replaces). So if two separate threads both rename their foo.csv.23481498.tmp file into foo.csv with perfect timing, either one of them (arbitrary which one) 'wins' and the other gets an IOException, or the later rename replaces the earlier one. Either way, nobody ever observes a half-written foo.csv.
The way to do this is using Files.move(from, to, StandardCopyOption.ATOMIC_MOVE). ATOMIC_MOVE is even kind enough to flat out refuse to execute (it throws AtomicMoveNotSupportedException) if somehow the OS/filesystem combination simply does not support ATOMIC_MOVE (they pretty much all do, though).
The second advantage is that this mechanism works even if you have multiple entirely different apps running. If they all use ATOMIC_MOVE or the equivalent of this in that language's API, readers only ever see a complete file, whether we're talking 'threads in a JVM' or 'apps on a system'.
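The rename-into-place idea can be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: AtomicPublish and publish are made-up names, and the temp file is created by Files.createTempFile in the target's own directory so that the move stays on one file system.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, for illustration only.
final class AtomicPublish {

    /** Writes content to a temp file next to target, then renames it into place. */
    static void publish(Path target, String content) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // Randomly named temp file in the same directory as the target.
        Path tmp = Files.createTempFile(dir, target.getFileName() + ".", ".tmp");
        try {
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            // Throws AtomicMoveNotSupportedException if the move cannot be atomic.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up if writing or moving failed.
            Files.deleteIfExists(tmp);
        }
    }
}
```

If the target already exists, whether the move replaces it or throws an IOException depends on the platform, so callers that care should be prepared for both.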
If you want to instead avoid the notion that multiple threads are both simultaneously doing the work to make this CSV file even though only one should do so and the rest should 'wait' until the first thread is done, file system locks are not the answer - you can try (make an empty file whose existence is a sign that some other thread is working on it) - and there's even a primitive for that in java's java.nio.file APIs. The CREATE_NEW flag can be used when creating a file, which means: Atomically create it, failing if the file already exists with concurrency guarantees (if multiple processes/threads all run that simultaneously, one succeeds and all others fail, guaranteed). However, CREATE_NEW can only atomically create. It cannot atomically write, nothing can (hence the whole 'rename it into place' trick above).
The problems with such locks are twofold:
If the JVM crashes that file doesn't go away. Ever launched a linux daemon process, such as postgres, and it told you that 'the pid file is still there, if there is no postgres running please delete it'? Yeah, that problem.
There's no way to know when it is done, other than to just re-check for that file's existence every few milliseconds. If you wait very few milliseconds you're potentially thrashing the disk (hopefully your OS and disk cache algorithms do a decent job). If you wait a lot you might be waiting around for no reason for a long time.
Hence why you shouldn't do this stuff, and just use locks within the process. Use synchronized or make a new java.util.concurrent.ReentrantLock or whatnot.
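A minimal sketch of the in-process approach, assuming all the threads live in one JVM and use java.nio.file paths. CopyOnce, copyIfAbsent and COPY_LOCK are hypothetical names. The important detail is that the lock object is shared by every caller; a lock created inside the method is private to each call and protects nothing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
final class CopyOnce {

    // One lock shared by all threads in this JVM.
    private static final Object COPY_LOCK = new Object();

    /** Copies source to dest unless dest already exists. Returns true if this call copied. */
    static boolean copyIfAbsent(Path source, Path dest) throws IOException {
        synchronized (COPY_LOCK) {
            // The check and the copy happen under the same lock,
            // so no other thread can slip in between them.
            if (Files.exists(dest)) {
                return false;
            }
            Files.createDirectories(dest.toAbsolutePath().getParent());
            Files.copy(source, dest);
            return true;
        }
    }
}
```

This only coordinates threads inside one process. It does nothing about other processes or about a crash halfway through the copy, which is what the rename-into-place trick is for.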
To answer your code snippet specifically, no that is broken: It is possible for 2 threads to run simultaneously and both get false when it runs dest.exists(), thus both entering the copy block, and then they fall all over each other when copying - depending on file system, usually one thread ends up 'winning', with their copy operation succeeding and the other thread's seemingly lost to the aether (most file systems are ref/node based, meaning, the file was written to disk but its 'pointer' was immediately overwritten, and the filesystem considers it garbage, more or less).
Presumably you consider that a failing scenario, and your code does not guarantee that it can't happen.
NB: What API are you using? Files.copy(instanceOfJavaIoFile, anotherInstanceOfJavaIoFile) isn't java. There is java.nio.file.Files.copy(instanceOfjnfPath, anotherInstanceOfjnfPath) - that's the one you want. Perhaps this Files you have is from a third-party library such as Guava (com.google.common.io.Files has a copy(File, File)) or apache commons? I strongly suggest you don't use that stuff; those APIs are usually obsolete (java itself has better APIs to do the same thing), and badly designed.
Ditch java.io.File, it's outdated API. Use java.nio.file instead. The old API doesn't have ATOMIC_MOVE or CREATE_NEW, and doesn't throw exceptions when things go wrong - it just returns false which is easily ignored and has no room to explain what went wrong. Hence why you should not use it.
One of the major issues with the apache libraries is that they use the anti-pattern of piling a ton of static utility methods into a giant container. Unfortunately, the second take on file stuff in java itself (java.nio.file) is similarly boneheaded API design. I guess in the java world, third time will be the charm. At any rate, a bad core java API with advanced capabilities is still better than a bad apache utility API that wraps around the older API which simply does not expose the kinds of capabilities you need here.

Forcefully terminating a thread I didn't write in Java

Everywhere I look about how to forcefully stop a thread in Java, I see "just do an exit variable check instead, your program is broken if you need to force kill."
I have a rather unique situation though. I am writing a Java program that dynamically loads and runs other Java classes in a separate thread. (No comments about security risks please, this is a very specific use case).
The trouble is, since other people will have written the classes that need to be loaded, there's no way to guarantee they'll implement the stop checking and whatnot correctly. I need a way to immediately terminate their thread, accepting all the risks involved. Basically I want to kill -9 their thread if I need to. How can I do this in Java?
Update: here's a bit more info:
This is actually an Android app
The user code depends on classes in my application
A user class must be annotated with @UserProgram in order to be "registered" by my application
The user also has the option of building their classes right into the application (by downloading a project with the internal classes already compiled into a library and putting their classes in a separate module) rather than having them dynamically loaded from a JAR.
The user classes extend from my template class which has a runUserProgram() method that they override. Inside that method, they are free to do anything they want. They can check isStopRequested() to see if I want them to stop, but I have no guarantee that they'll do that.
On startup, my application loads any JARs specified and scans both all the classes in the application and the classes in those JARs to find any classes annotated with the aforementioned annotation. Once a list of those classes is built, it is fed into the frontend where the UI provides a list of programs that can be run. Once a program is selected, a "start" button must be pressed to actually start it. When it is pressed, the button changes to a "stop" button and a callback is fired into the backend to load up the selected class in a new thread and call the runUserProgram() method. When the "stop" button is pressed, a variable is set which causes isStopRequested() to return true.
You can kill -9 it by running in its own process i.e. start with a ProcessBuilder and call Process.destroyForcibly() to kill it.
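// Assumes myjar.jar's manifest names a Main-Class; otherwise use -cp plus the main class name.
ProcessBuilder pb = new ProcessBuilder("java", "-jar", "myjar.jar");
pb.redirectErrorStream(true); // merge the child's stderr into its stdout
Process process = pb.start();
// do something with the program.
// getInputStream() is the child's output as seen from this (parent) process.
Scanner sc = new Scanner(process.getInputStream());
while (sc.hasNextLine()) {
    System.out.println(sc.nextLine());
}
// when done, possibly in another thread so it doesn't get blocked by reading.
process.waitFor(1, TimeUnit.SECONDS);
if (process.isAlive())
    process.destroyForcibly();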
Java 8 had Thread.stop(). The problem is that it could only work reasonably for very limited use cases, so limited you were better off using interrupts, and if the code isn't trusted, neither are any good.
There is the deprecated Thread.stop() but don't use it.
There is no way to cleanly terminate another thread without it cooperating.
The thread can be in a state where it allocated some memory, or added some objects to some global state, locked some mutexes, etc. If you kill it at the wrong moment, you risk leaking memory or even causing a deadlock.
It would be possible through JNI. Under Windows there is a TerminateThread API that you can call, and there is probably (hopefully) a similar thing under Android. The trouble will be getting the thread's native handle: you would need to obtain it when your user "program" is first loaded, probably by calling another JNI method from the thread in question as part of the initialisation process and getting the current thread handle from that.
I have not tried this myself. Best case is that this "works" and kills the thread, but it is going to cause that thread to leak resources. Worst case is that it will leave the JVM in an inconsistent state internally, which will probably crash your entire application.
I really think this is a Bad Idea.
A better design, if you want to allow this, is to run your user code in another process and communicate with it via sockets or pipes. This way you can relatively safely terminate the other process if necessary. It's more work, but it's going to be a lot better in the long run.
You should use Thread.interrupt().

Java concurrent writes from multiple threads to a single text file?

I have a multi-threaded Java 7 program (a jar file) which uses JDBC to perform work (it uses a fixed thread pool).
The program works fine and it logs things as it progresses to the command shell console window (System.out.printf()) from multiple concurrent threads.
In addition to the console output I also need to add the ability for this program to write to a single plain ASCII text log file - from multiple threads.
The volume of output is low, the file will be relatively small as it's a log file, not a data file.
Can you please suggest a good and relatively simple design/approach to get this done using Java 7 features (I don't have Java 8 yet)?
Any code samples would also be appreciated.
thank you very much
EDIT:
I forgot to add: in Java 7 using Files.newOutputStream() static factory method is stated to be thread safe - according to official Java documentation. Is this the simplest option to write a single shared text log file from multiple threads?
If you want to log output, why not use a logging library, like e.g. log4j2? This will allow you to tailor your log to your specific needs, and can log without synchronizing your threads on stdout (you know that running System.out.print involves locking on System.out?)
Edit: For the latter, if the things you log are thread-safe, and you are OK with adding LMAX' disruptor.jar to your build, you can configure async loggers (just add "async") that will have a logging thread take care of the whole message formatting and writing (and keeping your log messages in order) while allowing your threads to run on without a hitch.
Given that you've said the volume of output is low, the simplest option would probably be to just write a thread-safe writer which uses synchronization to make sure that only one thread can actually write to the file at a time.
If you don't want threads to block each other, you could have a single thread dedicated to the writing, using a BlockingQueue - threads add write jobs (in whatever form they need to - probably just as strings) to the queue, and the single thread takes the values off the queue and writes them to the file.
Either way, it would be worth abstracting out the details behind a class dedicated for this purpose (ideally implementing an interface for testability and flexibility reasons). That way you can change the actual underlying implementation later on - for example, starting off with the synchronized approach and moving to the producer/consumer queue later if you need to.
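The single-writer-thread idea could look roughly like this. It is a sketch under assumptions, written to compile on Java 7: QueueLogWriter, log and the STOP sentinel are made-up names, and the file is written as UTF-8 (plain ASCII text encodes identically).

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper, for illustration only.
final class QueueLogWriter implements AutoCloseable {

    // Unique sentinel object, compared by identity, that tells the writer to stop.
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
    private final Thread writerThread;

    QueueLogWriter(final Path logFile) {
        writerThread = new Thread(new Runnable() {
            @Override
            public void run() {
                try (BufferedWriter out = Files.newBufferedWriter(logFile, StandardCharsets.UTF_8,
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
                    String line;
                    // Only this thread ever touches the file.
                    while ((line = queue.take()) != STOP) {
                        out.write(line);
                        out.newLine();
                        out.flush();
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "log-writer");
        writerThread.start();
    }

    /** Called from any worker thread; it only enqueues the message. */
    void log(String message) {
        queue.add(message);
    }

    /** Lets the writer drain everything queued so far, then stops it. */
    @Override
    public void close() throws InterruptedException {
        queue.add(STOP);
        writerThread.join();
    }
}
```

Worker threads call log(...) and carry on; the pool's owner calls close() once all workers have finished. Hiding this behind an interface, as suggested above, would let you swap it for a plain synchronized writer without touching the callers.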
Keep a common PrintStream reference where you'll write to (instead of System.out) and set it to System.out or channel it through to a FileOutputStream depending on what you want.
Your code won't change much (barely at all) and PrintStream is already synchronized too.

Process Synchronization in java

Process A writes to a file XYZ when executed. Processes B and C, when executed, read the file XYZ. So, while process A is running, B and C should wait for A to complete. To provide synchronization, can I use the java.nio package? Or should I use something like FileLock or sockets? Can we specify how long the second process should wait?
Edited: The file is created during the first write process. In such case, can I make it shared resource?
Using java.nio package's file lock could be a better solution, I hope. But, I think java.nio is not full-fledged till JDK 1.6.
http://www.withoutbook.com/DifferenceBetweenSubjects.php?subId1=7&subId2=43&d=Java%206%20vs%20Java%207
FileLock:
http://docs.oracle.com/javase/7/docs/api/java/nio/channels/FileLock.html
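FileLock has no built-in timeout, but one can be approximated by polling tryLock(). The sketch below assumes all three processes lock a separate, agreed-upon lock file rather than XYZ itself; TimedFileLock, tryLock and runLocked are hypothetical names, and the 50 ms poll interval is arbitrary.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
final class TimedFileLock {

    /** Polls for an exclusive lock; returns null if it is not acquired within timeoutMillis. */
    static FileLock tryLock(FileChannel channel, long timeoutMillis)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            FileLock lock = channel.tryLock(); // null while another process holds the lock
            if (lock != null) {
                return lock;
            }
            if (System.nanoTime() - deadline >= 0) {
                return null;
            }
            Thread.sleep(50);
        }
    }

    /** Runs task under the lock; returns false if the lock could not be taken in time. */
    static boolean runLocked(Path lockFile, long timeoutMillis, Runnable task)
            throws IOException, InterruptedException {
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock = tryLock(channel, timeoutMillis);
            if (lock == null) {
                return false;
            }
            try {
                task.run();
                return true;
            } finally {
                lock.release();
            }
        }
    }
}
```

Process A would wrap its write in runLocked, and B and C would wrap their reads. With an exclusive lock B and C also exclude each other; if they should read concurrently, the three-argument tryLock(position, size, shared) on a channel opened for reading can request a shared lock instead, on platforms that support shared locks.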
One way could be to use a flag: just a boolean stillWriting which is readable from outside.
As soon as process A has done its job, this flag is set to false and your processes B/C can start their work with this file.
Assuming A wants to start editing this file again, it'll set this flag back to true and block the other two processes.
Using locks would be a good idea. You can use Conditions from the Java API.
Refer to http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Condition.html#awaitNanos(long)
When A is working it should signal the thread to await and then on completion it can signal so that other thread waiting to start can proceed. Also this is very appropriate when we use shared resource.

What do I do about a Java program that spawned two instances of itself?

I have a java JAR file that is triggered by a SQL server job. It's been running successfully for months. The process pulls in a structured flat file to a staging database then pushes that data into an XML file.
However yesterday the process was triggered twice at the same time. I can tell from a log file that gets created, it looks like the process ran twice simultaneously. This caused a lot of issues and the XML file that it kicked out was malformed and contained duplicate nodes etc.
My question is: is this a known issue with JVMs spawning multiple instances of themselves? Or should I be looking at SQL Server as the culprit? I'm looking into 'socket locking' or file locking to prevent multiple instances in the future.
This is the first instance of this issue that I've ever heard of.
More info:
The job is scheduled to run every minute.
The job triggers a .bat file that contains the java.exe -jar filename.jar
The java program runs, scans a directory for a file and then executes a loop to process the file if it finds one. After it processes the file it runs another loop that kicks out XML messages.
I can provide code samples if that would help.
Thank you,
Kevin
It's not a Java problem. If you want the app to run alone, no copies, you should use the shell script or the java app to make and remove a lock somewhere.
You actually start multiple JVMs by starting more than one batch job with the same command. Neither Windows nor Java can know that's not what you want. You could solve that with something like:
public static void main(String[] args) throws MyLockAlreadyExists {
    createLockIfNotExists();
    try {
        // your stuff
    } finally {
        releaseLock();
    }
}
private static void createLockIfNotExists() throws MyLockAlreadyExists {
    // A bit tricky:
    // check if LOCKFILE exists; if yes, throw MyLockAlreadyExists.
    // Try to create LOCKFILE; this can fail if another app created that file
    // 1 ms earlier, so an exception while creating also results in MyLockAlreadyExists.
}
Are there good examples somewhere which handle this locking? Maybe in Apache Commons?
Here seems to be a functioning example for Windows.
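One possible way to fill in the lock creation is sketched below, assuming a lock file at a path both instances agree on. LockFile, acquire and release are hypothetical names. Files.createFile performs the existence check and the creation as a single atomic operation, so the separate "check if LOCKFILE exists" step is not needed.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
final class LockFile {

    /** Atomically creates lockFile; returns false if it already exists. */
    static boolean acquire(Path lockFile) throws IOException {
        try {
            Files.createFile(lockFile); // check-and-create in one atomic step
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // another instance holds the lock
        }
    }

    /** Removes the lock file; call this from a finally block. */
    static void release(Path lockFile) throws IOException {
        Files.deleteIfExists(lockFile);
    }
}
```

A main method would exit early when acquire returns false, and call release in its finally block. If the JVM is killed before release runs, the file stays behind and has to be removed by hand or detected as stale, for example with a timestamp.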
You could also use the database to write your lock. Lock the locking table before you use it of course so no 2 processes write their lock at the same time, and afterwards read the lock record to check whether you actually got the lock. Something like pseudo code:
SELECT * FROM lock_table;
if locks.length > 0: someone else is running
LOCK lock_table;
INSERT INTO lock_table VALUES(my_pid);
UNLOCK lock_table;
SELECT pid FROM lock_table;
if pids.length > 1: what happened?
if pids[0] != my_pid: someone else got the lock
A bit more juice and you also add not only the PID but also a timestamp, and check whether that timestamp is stale (too old).
