WatchService: missed and unhandled events - java

I'm having an issue with WatchService. Here is a snippet of my code:
public void watch(){
//define a folder root
Path myDir = Paths.get(rootDir+"InputFiles/"+dirName+"/request");
try {
WatchService watcher = myDir.getFileSystem().newWatchService();
myDir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
WatchKey watckKey = watcher.take();
List<WatchEvent<?>> events = watckKey.pollEvents();
for (WatchEvent event : events) {
//stuff
}
}catch(Exception e){}
watckKey.reset();
}
*First of all, know that watch() is called inside an infinite loop.
The problem is that when multiple files are created at once, some events are missed. For example, if I copy and paste three files into the ".../request" folder, only one is caught; the others are ignored as if nothing happened, and no OVERFLOW event is triggered either. On a different computer and OS, up to two files are caught, but with three or more the rest are still left untouched.
I found a workaround though, but I don't think it's the best practice. This is the flow:
The process starts and then stops at
WatchKey watckKey = watcher.take();
as expected (as per Processing events). Then I drop 3 files together into the "request" folder, so the process resumes at
List<WatchEvent<?>> events = watckKey.pollEvents();
The issue is here. It seems the thread goes through this line so fast that two CREATE events are left behind and lost; only one is taken. The workaround was to add an extra line right above this one, like this:
Thread.sleep(1000);
List<WatchEvent<?>> events = watckKey.pollEvents();
This seems to be a solution, at least for three and several more simultaneous files, but it's not scalable at all.
So in conclusion, I would like to know if there is a better solution for this issue. FYI, I'm running Windows 7 64-bit.
Thanks a lot in advance!

Be sure to reset your watchKey. Some of the aforementioned answers don't, which could explain dropped events as well. I recommend the examples given in the official Oracle documentation: https://docs.oracle.com/javase/tutorial/essential/io/notification.html
Beware that, even when used correctly, the reliability of file watch services depends heavily on the underlying OS. In general, a WatchService should be considered a best-effort mechanism that doesn't give a 100% guarantee.
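To illustrate the reset advice, here is a minimal sketch of the usual pattern: create the WatchService once, then repeatedly wait for a key, drain its events, and reset it. The class name CreateWatcher and the method awaitCreated are hypothetical names chosen for this example, and poll with a timeout is used instead of take() only so that the sketch can terminate.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: registers a directory once and collects ENTRY_CREATE events.
final class CreateWatcher implements AutoCloseable {
    private final WatchService watcher;

    CreateWatcher(Path dir) throws IOException {
        // Create the service and register the directory exactly once.
        watcher = dir.getFileSystem().newWatchService();
        dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
    }

    /** Waits until the expected number of distinct files was reported or the timeout elapsed. */
    Set<Path> awaitCreated(int expected, long timeoutMillis) throws InterruptedException {
        Set<Path> created = new LinkedHashSet<>();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (created.size() < expected) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                break;
            }
            WatchKey key = watcher.poll(remaining, TimeUnit.MILLISECONDS);
            if (key == null) {
                break; // timed out without further events
            }
            for (WatchEvent<?> event : key.pollEvents()) {
                if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                    continue; // events were lost; a real program would rescan the directory
                }
                created.add((Path) event.context());
            }
            // Without reset() the key is never queued again and later events go unseen.
            if (!key.reset()) {
                break; // the directory is no longer accessible
            }
        }
        return created;
    }

    @Override
    public void close() throws IOException {
        watcher.close();
    }
}
```

Events that arrive while one batch is being processed are queued on the key and delivered after reset(), so no sleep is needed before pollEvents().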

If watch() is called inside an infinite loop, then you are creating a new watch service on every iteration, which is why events can be lost. I would suggest the following: call your method watchservice() only once and let it run the loop itself:
public void watchservice()
{
Thread fileWatcher = new Thread(() ->
{
Path dataDir = Paths.get(rootDir+"InputFiles/"+dirName+"/request");
try
{
WatchService watcher = dataDir.getFileSystem().newWatchService();
dataDir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
while (true)
{
WatchKey watckKey;
try
{
watckKey = watcher.take();
}
catch (Exception e)
{
logger.error("watchService interrupted:", e);
return;
}
List<WatchEvent<?>> events = watckKey.pollEvents();
for (WatchEvent<?> event : events)
{
logger.debug("Event Type : "+ event.kind() +" , File name found :" + event.context());
if (event.kind() != StandardWatchEventKinds.OVERFLOW)
{
// do your stuff
}
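}
// Reset the key so it can be queued again; without this no further events are delivered.
if (!watckKey.reset())
{
logger.error("Watch key is no longer valid, stopping watcher");
return;
}
}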
}
catch (Exception e)
{
logger.error("Error: " , e);
}
});
fileWatcher.setName("File-Watcher");
// Install the handler before starting the thread so that early failures are not missed.
fileWatcher.setUncaughtExceptionHandler((Thread t, Throwable throwable) ->
{
logger.error("Error occurred in Thread " + t, throwable);
});
fileWatcher.start();
}

Related

FileNotFound exception even though file is in the place during watch service in java

I have a watch service running on a folder. When I try to modify an existing file, handled with eventKind == ENTRY_MODIFY (basically pasting the same file over the existing one without removing it first), I get a FileNotFoundException (The process cannot access the file because it is being used by another process.)
if (eventKind == StandardWatchEventKinds.ENTRY_MODIFY) {
String newFileChecksum = null;
if (eventPath.toFile().exists()) {
newFileChecksum = getFileChecksum(eventPath.toFile());
}
if (fileMapper.containsKey(eventPath)) {
String existingFileChecksum = fileMapper.get(eventPath);
if (!existingFileChecksum.equals(newFileChecksum)) {
fileMapper.replace(eventPath, existingFileChecksum, newFileChecksum);
log.info("listener.filemodified IN");
for (DirectoryListener listener : this.listeners) {
listener.fileModified(this, eventPath);
}
log.info("listener.filemodified OUT");
} else {
log.info("existing checksum");
log.debug(String.format(
"Checksum for file [%s] has not changed. Skipping plugin processing.",
eventPath.getFileName()));
}
}
}
The exception occurs in this part of the code, when getFileChecksum() is called:
if (eventPath.toFile().exists()) {
newFileChecksum = getFileChecksum(eventPath.toFile());
}
Here eventPath.toFile().exists() returns true, so execution enters the if block, but when getFileChecksum() is called, it goes into this method:
private synchronized String getFileChecksum(File file) throws IOException, NoSuchAlgorithmException {
MessageDigest md5Digest = MessageDigest.getInstance("MD5");
FileInputStream fis = null;
if(file.exists()) {
try {
fis = new FileInputStream(file);
} catch(Exception e) {
e.printStackTrace();
}
} else {
log.warn("File not detected.");
}
byte[] byteArray = new byte[1024];
int bytesCount = 0;
while ((bytesCount = fis.read(byteArray)) != -1) {
md5Digest.update(byteArray, 0, bytesCount);
};
fis.close();
byte[] bytes = md5Digest.digest();
StringBuilder stringBuilder = new StringBuilder();
for (int i=0; i< bytes.length ;i++) {
stringBuilder.append(Integer.toString((bytes[i] & 0xff) + 0x100, 16).substring(1));
}
return stringBuilder.toString();
}
}
The exception is thrown at fis = new FileInputStream(file); even though the file is present in the folder.
FileNotFoundException (The process cannot access the file because it is being used by another process.)
I created a RandomAccessFile and a channel to release any LOCK placed on file, but it is not working. Please suggest what could be happening here.
UPDATE: the infinite while loop that I have is shown at the end of this post.
What is happening? When I put a file into the folder, one create and two modify events are fired. When I delete the file, one delete and one modify event are fired. If I then put the same file back into the folder, I get a CREATE, but MODIFY is fired before the CREATE handling has finished, so the modify handler runs instead of the create handler.
I fixed this issue by putting Thread.sleep(500) between these two lines:
WatchKey wk = watchService.take();
Thread.sleep(500);
for (WatchEvent<?> event : wk.pollEvents()) {
But I don't think I can justify the use of sleep here. Please help.
WatchService watchService = null;
WatchKey watchKey = null;
while (!this.canceled && (watchKey == null)) {
watchService = watchService == null
? FileSystems.getDefault().newWatchService() : watchService;
watchKey = this.directory.register(watchService,
StandardWatchEventKinds.ENTRY_MODIFY, StandardWatchEventKinds.ENTRY_DELETE,
StandardWatchEventKinds.ENTRY_CREATE);
}
while (!this.canceled) {
try {
WatchKey wk = watchService.take();
for (WatchEvent<?> event : wk.pollEvents()) {
Kind<?> eventKind = event.kind();
System.out.println("Event kind : " + eventKind);
Path dir = (Path)wk.watchable();
Path eventPath = (Path) event.context();
Path fullPath = dir.resolve(eventPath);
fireEvent(eventKind, fullPath);
}
wk.reset();
}
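A better approach is to loop on a flag such as isFileReady and, inside the loop, try to open the file in a try/catch until it succeeds:
boolean isFileReady = false;
while (!isFileReady) {
try (FileInputStream fis = new FileInputStream(eventPath.toFile())) {
// The file could be opened, so the other process has released it.
isFileReady = true;
} catch (IOException e) {
// Still locked by the writer: report it and try again on the next iteration.
System.out.println("File not ready: " + e.getMessage());
}
}
This should solve your problem.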
The WatchService is verbose and may report multiple ENTRY_MODIFY events for a single save operation, even while another application is only part way through its writes or is writing repeatedly. Your code is probably acting on a modify event while the other app is still writing, and there may be a second ENTRY_MODIFY on its way.
A safer strategy for using the WatchService is to collate the events you receive and only act on the changes when there is a pause. Something like this will ensure that you block on first event but then poll the watch service with small timeout to see if more changes are present before you act on the previous set:
WatchService ws = ...
HashSet<Path> modified = new HashSet<>();
while(appIsRunning) {
int countNow = modified.size();
WatchKey k = countNow == 0 ? ws.take() : ws.poll(1, TimeUnit.MILLISECONDS);
if (k != null) {
// Loop through k.pollEvents() and put modify file path into modified set:
// DO NOT CALL fireEvent HERE, save the path instead:
...
if (eventKind == ENTRY_MODIFY)
modified.add(filePath);
}
// Don't act on changes unless no new events:
if (countNow == modified.size()) {
// ACT ON modified list here - the watch service did not report new changes
for (Path filePath : modified) {
// call fireEvent HERE:
fireEvent(filePath);
}
// reset the list so next watch call is take() not poll(1)
modified.clear();
}
}
If you are also looking out for CREATE and DELETE operations with MODIFY you will have to collate and ignore some of the earlier events because the last recorded event type can take precedence over a previously recorded type. For example, if calling take() then poll(1) until nothing new is reported:
Any DELETE then CREATE => you might want to consider as MODIFY
Any CREATE then MODIFY => you might want to consider as CREATE
Any CREATE or MODIFY then a DELETE => treat as DELETE
Your logic should likewise act only when the value of modified.size() + created.size() + deleted.size() has stopped changing between runs, that is, when the last poll reported no new events.
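The precedence rules listed above can be sketched as a small helper that folds a burst of events into one net change per path. This is only an illustration under assumptions: the class EventCollator and its methods combine, record and drain are hypothetical names, and any combination not covered by the three rules simply keeps the most recent kind.

```java
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that folds a burst of events into one net change per path.
final class EventCollator {
    private final Map<Path, WatchEvent.Kind<?>> pending = new LinkedHashMap<>();

    /** Combines two consecutive kinds for the same path using the rules listed above. */
    static WatchEvent.Kind<?> combine(WatchEvent.Kind<?> earlier, WatchEvent.Kind<?> later) {
        if (earlier == StandardWatchEventKinds.ENTRY_DELETE
                && later == StandardWatchEventKinds.ENTRY_CREATE) {
            return StandardWatchEventKinds.ENTRY_MODIFY; // the file was replaced
        }
        if (earlier == StandardWatchEventKinds.ENTRY_CREATE
                && later == StandardWatchEventKinds.ENTRY_MODIFY) {
            return StandardWatchEventKinds.ENTRY_CREATE; // still a new file
        }
        return later; // covers CREATE or MODIFY followed by DELETE => DELETE
    }

    /** Records one event; call this instead of acting on the event immediately. */
    void record(Path path, WatchEvent.Kind<?> kind) {
        pending.merge(path, kind, EventCollator::combine);
    }

    /** Returns the collated changes and clears the internal state. */
    Map<Path, WatchEvent.Kind<?>> drain() {
        Map<Path, WatchEvent.Kind<?>> result = new LinkedHashMap<>(pending);
        pending.clear();
        return result;
    }
}
```

In the loop shown earlier, record() would be called for every event of a key, and drain() would be called once a poll with a short timeout returns nothing new.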
My guess:
A modify event is fired when you modify a file. To modify the file you most likely use a separate tool, such as Notepad, which opens and LOCKS the file.
Your watcher receives the event while the file is still being modified, but you cannot open the file yourself (which FileInputStream tries to do) because it is already locked.

Java WatchService and multithreading

I'm trying to build a process that will watch a list of directories (populated via JPA). When a new file is detected in a folder, a new thread is started to process that folder. At most one thread should be running per folder, but multiple threads can run across different folders.
I've got that somewhat working with the code below, but I've found an issue. Say 1 out of 5 files has been moved so far. A thread is created as soon as the first file is detected, and the ProcessDatasource thread then loops through the directory and creates 1 file object to process. In the meantime, the other 4 files trigger the file system watcher but are blocked because a datasource thread is already running on that folder. Since the watcher has already fired when those files landed, it won't fire again, which leaves those 4 files in limbo until another file lands in that folder.
To solve this I thought if a file lands and a thread is already running I could call a method within the thread to add the file to the List of files it's currently processing but I'm struggling to do that when the threads are made dynamically in the below loop. Of course this could just be an awful way of doing all this so open to any suggestions.
private boolean checkThreadRunning(String threadName){
Set<Thread> threadSet = Thread.getAllStackTraces().keySet();
for ( Thread t : threadSet){
if ( t.getThreadGroup() == Thread.currentThread().getThreadGroup() && t.getName().equals(threadName)) {
return true;
}
}
return false;
}
public void run(String... args) throws IOException {
WatchService watchService = FileSystems.getDefault().newWatchService();
List<DataSource> datasourceList = readDataSources(); // Load a list of DataSource objects into the datasourceList.
Map<WatchKey, DataSource> keys = registerKeys(watchService, datasourceList);
WatchKey key;
while ((key = watchService.take()) != null) {
DataSource dataSource = keys.get(key);
for (WatchEvent<?> event : key.pollEvents()) {
String dataSourceName = dataSource.getDatasourceName();
String threadName = "datasourceThread-" + dataSourceName;
// Check if there is already a thread running on this datasource (folder)
if (checkThreadRunning(threadName)) {
System.out.println("Found another file for datasource " + dataSourceName + "but an instance is already running");
// Need something here to pass this new file into the currently running thread to be processed...
} else {
// If not then start a thread which will work through processing the files within the folder.
new Thread(new ProcessDatasource(threadName, dataSource)).start();
}
}
key.reset();
}
}
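One possible direction for the idea described above, offered only as a sketch and not taken from the original thread: instead of checking thread names, keep one single-threaded executor per folder, so that a file arriving while the folder is busy is queued rather than dropped. The class PerFolderDispatcher and its methods are hypothetical names; in the loop above, the call would replace the checkThreadRunning branch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical dispatcher: one single-threaded worker per folder.
final class PerFolderDispatcher {
    private final Map<String, ExecutorService> workers = new ConcurrentHashMap<>();

    /** Queues work for a folder; tasks of the same folder run one at a time, in order. */
    Future<?> submit(String folderName, Runnable task) {
        return workers
                .computeIfAbsent(folderName, name -> Executors.newSingleThreadExecutor())
                .submit(task);
    }

    /** Lets queued tasks finish and then stops all workers. */
    void shutdown() {
        workers.values().forEach(ExecutorService::shutdown);
    }
}
```

With this design, folders still run in parallel with each other because each has its own worker, while the per-folder queue guarantees that at most one task runs per folder at any time.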

Using FileWatcher with Multithreading

I am trying to integrate multithreading with a file watcher service in Java: I constantly listen to a particular directory, and whenever a new file is created I need to spawn a new thread that processes the file (say, it prints the file contents). I managed to write code that compiles and works, but not as expected. It works sequentially, meaning file2 is processed after file1 and file3 is processed after file2. I want these to be processed in parallel.
Adding the code snippet:
while(true) {
WatchKey key;
try {
key = watcher.take();
Path dir = keys.get(key);
for (WatchEvent<?> event: key.pollEvents()) {
WatchEvent.Kind<?> kind = event.kind();
if (kind == StandardWatchEventKinds.OVERFLOW) {
continue;
}
if(kind == StandardWatchEventKinds.ENTRY_CREATE){
boolean valid = key.reset();
if (!valid) {
break;
}
log.info("New entry is created in the listening directory, Calling the FileProcessor");
WatchEvent<Path> ev = (WatchEvent<Path>)event;
Path newFileCreatedResolved = dir.resolve(ev.context());
try{
FileProcessor processFile = new FileProcessor(newFileCreatedResolved.getFileName().toString());
Future<String> result = executor.submit(processFile);
try {
System.out.println("Processed File" + result.get());
} catch (ExecutionException e) {
e.printStackTrace();
}
//executor.shutdown(); add logic to shut down
}
}
}
}
}
and the FileProcessor class
public class FileProcessor implements Callable <String>{
FileProcessor(String triggerFile) throws FileNotFoundException, IOException{
this.triggerFile = triggerFile;
}
public String call() throws Exception{
//logic to write to another file, this new file is specific to the input file
//returns success
}
What is happening now: if I transfer 3 files at a time, they are processed sequentially. First file1 is written to its destination file, then file2, then file3, and so on.
Am I making sense? Which part do I need to change to make it parallel? Or is ExecutorService designed to work like that?
The call to Future.get() is blocking. The result isn't available until processing is complete, of course, and your code doesn't submit another task until then.
Wrap your Executor in a CompletionService and submit() tasks to it instead. Have another thread consume the results of the CompletionService to do any processing that is necessary after the task is complete.
Alternatively, you can use the helper methods of CompletableFuture to set up an equivalent pipeline of actions.
A third, simpler, but perhaps less flexible option is simply to incorporate the post-processing into the task itself. I demonstrated a simple task wrapper to show how this might be done.
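As a rough sketch of the CompletionService suggestion: every task is submitted without calling get() on its Future, and the results are then taken in completion order. The class name ParallelFileTasks and the method runAll are hypothetical. For brevity the sketch drains the results on the calling thread after all tasks have been submitted; in a watcher loop that draining would run on a separate consumer thread, as described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper showing submit-first, collect-later with a CompletionService.
final class ParallelFileTasks {
    /** Submits every task without waiting, then collects results in completion order. */
    static List<String> runAll(List<Callable<String>> tasks, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            CompletionService<String> completion = new ExecutorCompletionService<>(executor);
            for (Callable<String> task : tasks) {
                completion.submit(task); // returns immediately; no get() here
            }
            List<String> results = new ArrayList<>();
            for (int i = 0; i < tasks.size(); i++) {
                // take() blocks only until the next task, whichever it is, has finished.
                results.add(completion.take().get());
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }
}
```

Because nothing blocks between submissions, tasks run concurrently up to the pool size, and a slow file no longer holds up the files submitted after it.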

Java File.exists and other File operations returning wrong results for an existing File (network, macosx)

The filesystem AirportHDD is mounted (AFP) from the beginning and the file exists when I start this little program.
I spent the whole day trying to figure out why the following is not working, but couldn't find any solution:
public static void main(String[] arguments)
{
while(1==1)
{
File f=new File(
"/Volumes/AirportHDD/test/lock.csv");
System.out.println(f.exists());
AmySystem.sleep(100);
}
}
The output is:
true, true, ...
As soon as I remove the file from a different computer (AirportHDD is a hard disk mounted over the network), the output keeps saying:
true, true, ...
When I open Finder and go to this directory, the output changes to: false, false, ...
When the file is added again (via another PC), the output is still:
false, false, ...
But if I open Finder again and click on the directory so that Finder shows the existing file, the output suddenly changes to: false, true, true, true, ...
NOTE:
All other file operations, such as opening the file for reading, also fail as long as Java 'thinks' the file is not there.
If the program itself creates and deletes the files, the problem does not occur.
I just found out while testing that with Samba sharing everything is OK, but with AFP it just won't work.
Is there a way to tell Java to do the same thing as Finder, like a refresh, or to not use a cache?
I think you might be looking for the WatchService. Oracle was also kind enough to provide a tutorial.
Because the longevity of these links isn't guaranteed, I'll edit in some example code in a couple of minutes. I just wanted to let you know I think I found something in case you want to start looking at it for yourself.
UPDATE
Following the linked tutorial, I came up with code like this. I'm not sure it'll work (don't have time to test it), but it might be enough to get you started. The WatchService also has a take() method that will wait for events, which means you could potentially assume the file's existence (or lack thereof) based on the last output you gave. That will really depend on what this program will be interacting with.
If this works, good. If not, maybe we can figure out how to fix it based on whatever errors you're getting. Or maybe someone else will come along and give a better version of this code (or better option altogether) if they're more acquainted with this than I am.
public static void main(String[] arguments) throws IOException {
Path path = Paths.get("/Volumes/AirportHDD/test/lock.csv");
// A WatchService can only watch directories, so register the parent and filter by file name.
Path dir = path.getParent();
WatchService watcher = FileSystems.getDefault().newWatchService();
try {
dir.register(watcher,
ENTRY_CREATE,
ENTRY_DELETE);
} catch (IOException x) {
System.err.println(x);
}
while(true) {//I tend to favor this infinite loop, but that's just preference.
WatchKey key = watcher.poll();
if(key != null) {
for (WatchEvent<?> event: key.pollEvents()) {
WatchEvent.Kind<?> kind = event.kind();
if (kind == OVERFLOW) {
//events may have been lost, so fall back to a direct check
System.out.println(path.toFile().exists());
}
else if (!path.getFileName().equals(event.context())) {
continue;//event belongs to some other file in the directory
}
else if (kind == ENTRY_DELETE) {
System.out.println(false);
}
else if (kind == ENTRY_CREATE) {
System.out.println(true);
}
}//for(all events)
key.reset();//re-arm the key so later events are delivered
}//if(file event occured)
else {
File f=path.toFile();
System.out.println(f.exists());
}//else(no file event occured)
AmySystem.sleep(100);
}//while(true)
}//main() method
Here is a JUnit test that shows the problem.
The problem still happens using Samba on OS X Mavericks. A possible reason is explained by this statement in:
http://appleinsider.com/articles/13/06/11/apple-shifts-from-afp-file-sharing-to-smb2-in-os-x-109-mavericks
It aggressively caches file and folder properties and uses opportunistic locking to enable better caching of data.
Please find below a checkExists method that actually attempts to read a few bytes, forcing a true file access to avoid the caching misbehaviour.
JUnit test:
/**
* test file exists function on Network drive
* @throws Exception
*/
@Test
public void testFileExistsOnNetworkDrive() throws Exception {
String testFileName="/Volumes/bitplan/tmp/testFileExists.txt";
File testFile=new File(testFileName);
testFile.delete();
for (int i=0;i<10;i++) {
Thread.sleep(50);
System.out.println(""+i+":"+OCRJob.checkExists(testFile));
switch (i) {
case 3:
// FileUtils.writeStringToFile(testFile, "here we go");
Runtime.getRuntime().exec("/usr/bin/ssh phobos /usr/bin/touch "+testFileName);
break;
}
}
}
checkExists source code:
/**
* check if the given file exists
* @param f
* @return true if file exists
*/
public static boolean checkExists(File f) {
try {
byte[] buffer = new byte[4];
InputStream is = new FileInputStream(f);
if (is.read(buffer) != buffer.length) {
// do something
}
is.close();
return true;
} catch (java.io.IOException fnfe) {
}
return false;
}
The problem is the network file system AFP. With the use of SAMBA everything works like expected.
Maybe the OS returns the wrong file info in OSX with the use of AFP in these scenarios.

Use jpathwatch to find out when a program finished writing to a file

I'm currently using jpathwatch to watch for new files created in a folder. All fine, but I need to find out when a program finished writing to a file.
The library's author describes on his website (http://jpathwatch.wordpress.com/faq/) how that's done, but I don't have a clue how to do it. Maybe it's described a bit unclearly, or I just don't get it.
I would like to ask whether you could give me a snippet which demonstrates how to do that.
This is the basic construct:
public void run() {
while (true) {
WatchKey signalledKey;
try {
signalledKey = watchService.take();
} catch (InterruptedException ix) {
continue;
} catch (ClosedWatchServiceException cwse) {
break;
}
List<WatchEvent<?>> list = signalledKey.pollEvents();
signalledKey.reset();
for (WatchEvent<?> e : list) {
if (e.kind() == StandardWatchEventKind.ENTRY_CREATE) {
Path context = (Path) e.context();
String filename = context.toString();
// do something
} else if (e.kind() == StandardWatchEventKind.ENTRY_DELETE) {
Path context = (Path) e.context();
String filename = context.toString();
// do something
} else if (e.kind() == StandardWatchEventKind.OVERFLOW) {
}
}
}
}
From the FAQ for jpathwatch, the author says that you will get an ENTRY_MODIFY event regularly when a file is being written and that event will stop being generated when the file writing is complete. He is suggesting that you keep a list of files and the time stamp for the last generated event for each file.
At some interval (which he refers to as a timeout), you scan through the list of files and their timestamps. If any file has a time stamp that is older than your timeout interval, then that should mean that it isn't being updated anymore and is probably complete.
He even suggests that you try to determine the rate at which a file is growing and calculate when it should be complete, so that you can set your poll time to the expected completion duration.
Does that clear it up at all? Sorry I'm not up to expressing that in code :)
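The timestamp approach described above could look roughly like the following sketch. It is not taken from the jpathwatch FAQ: the class WriteCompletionTracker and its methods touch and pollCompleted are hypothetical names, and the current time is passed in explicitly so that the logic does not depend on any particular watch library.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical tracker: remembers when each file last produced an event.
final class WriteCompletionTracker {
    private final Map<Path, Long> lastEventMillis = new HashMap<>();
    private final long timeoutMillis;

    WriteCompletionTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Call for every ENTRY_CREATE or ENTRY_MODIFY event reported for a file. */
    void touch(Path file, long nowMillis) {
        lastEventMillis.put(file, nowMillis);
    }

    /** Call at a fixed interval; returns files with no event for at least the timeout. */
    List<Path> pollCompleted(long nowMillis) {
        List<Path> completed = new ArrayList<>();
        Iterator<Map.Entry<Path, Long>> it = lastEventMillis.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Path, Long> entry = it.next();
            if (nowMillis - entry.getValue() >= timeoutMillis) {
                completed.add(entry.getKey());
                it.remove(); // stop tracking; a later event starts a new cycle
            }
        }
        return completed;
    }
}
```

In the run() loop above, touch() would be called from the create and modify branches with System.currentTimeMillis(), and pollCompleted() would be called periodically, for example after a timed poll on the watch service returns no key. A file reported as completed is only probably complete, since a writer that pauses longer than the timeout looks the same as one that has finished.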
