Parallel, identical rename operations on NFS all succeed - java

Is it possible for multiple parallel rename operations of the same file to the same new name on NFS to all succeed? This is what I experience.
I implemented file processors that work in parallel. They process files, by first "reserving" the file. The reservation is implemented via rename - adding some fixed prefix (e.g. RESERVED_, same for all processors). Only after successful reservation, they start processing the file. In this implementation I assumed that rename will succeed in only one processor. Unfortunately, what I experience is that rename succeeds in multiple processors...
More details:
Processors are implemented in Java. I use the call File::renameTo.
Processors work on separate servers.
Time is synced with the NFS server via NTP.
I avoid FS-specific implementation (like using fcntl locks), because I wanted the same logic to work also for FTP/SFTP.
Does anyone know what is the expected behavior? Is there something I'm doing wrong or an easy fix (maybe some mounting option)? If this approach is just wrong, do you know of other approaches to generic file-locking on NFS?
Thanks for any help
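For reference, the reservation scheme described above can be sketched roughly as follows. This only illustrates the pattern and is not a fix: FileReservation and tryReserve are hypothetical names, and only File::renameTo and the RESERVED_ prefix come from the question. On a local filesystem a second attempt fails because the source file no longer exists; the question reports that on NFS more than one caller can see success.

```java
import java.io.File;

// Hypothetical helper illustrating the rename-based reservation from the question.
final class FileReservation {
    static final String PREFIX = "RESERVED_";

    /** Returns the reserved file if the rename reported success, otherwise null. */
    static File tryReserve(File file) {
        File reserved = new File(file.getParentFile(), PREFIX + file.getName());
        // renameTo reports failure by returning false rather than throwing.
        return file.renameTo(reserved) ? reserved : null;
    }
}
```

A processor would call tryReserve and only start working on the file when the result is non-null.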

Related

Design pattern for 2 varieties of the same feature

All,
I am developing a feature that will on execution of an operation write the logs to a file server using ftp. Note that the write to file server will happen only if a file server is configured. If no server is configured the operation will exit and return the status. The flow is something like this:
1. Execute operation
2. If file server connected (check in DB and ping), write logs
3. return
I would like to know whether there are design patterns for this: the feature is the same, but its scope varies depending on whether or not some configuration has been done. I would much appreciate help with 2 scenarios:
Static - The DB config is read once during boot-up, so after boot-up the system can "assume" that the file server is or is not there based on what it read from the DB.
Dynamic - While the system is up and running, I might bring up a file server and configure the DB. Ideally, in this scenario the system should detect the file server and start writing logs to it, rather than having to be rebooted.
Requesting help in this regard.
Thanks
Your design looks like a breach of the Single Responsibility Principle. You are entangling two different concerns: the first concern is the operation itself, and the second is shipping the logs to a central location.
Think about separating your component into two simpler, independent components. One of them performs the business operation and writes logs, say to a local file, and that's it. The other component checks for the existence of new logs on the local file system and copies them to the central location.
You didn't mention whether or not you are using an existing logging framework, such as Log4J. If not, it would probably be a good idea - if you try to roll out your own logging framework, you can end up having to deal with additional unforeseen complexities, such as dealing with log levels (INFO, DEBUG, ERROR, etc.).
With regards to your original message - I'd consider using the Factory pattern - create a factory class that can internally check whether the file server is available, and return one of two different logger types - something like a ConsoleLogger and an FTPLogger. Make sure that both of these implement the same interface so that your calling code doesn't have to care about what type of logger it's using. Alternatively, you can also use a Decorator that can wrap the object performing your operation - and once it completes a request, have the decorator do the logging.
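A minimal sketch of the factory idea, under some assumptions: OperationLogger, FtpLogger's buffering, and the BooleanSupplier used for the availability check are all made up for illustration, and the FTP upload itself is left out (a real FtpLogger would delegate to an FTP client library).

```java
import java.util.function.BooleanSupplier;

// Common interface so calling code does not care which logger it gets.
interface OperationLogger {
    void log(String message);
}

final class ConsoleLogger implements OperationLogger {
    @Override
    public void log(String message) {
        System.out.println(message);
    }
}

// Placeholder only: buffers messages instead of performing a real FTP upload.
final class FtpLogger implements OperationLogger {
    private final StringBuilder pending = new StringBuilder();

    @Override
    public void log(String message) {
        pending.append(message).append('\n');
    }

    String pending() {
        return pending.toString();
    }
}

final class LoggerFactory {
    private final BooleanSupplier fileServerAvailable;

    LoggerFactory(BooleanSupplier fileServerAvailable) {
        this.fileServerAvailable = fileServerAvailable;
    }

    /** Picks the logger type based on the (assumed) availability check. */
    OperationLogger create() {
        return fileServerAvailable.getAsBoolean() ? new FtpLogger() : new ConsoleLogger();
    }
}
```

For the static scenario the factory could be called once at boot-up; for the dynamic one it could be called again whenever the configuration changes.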
A final comment - try to avoid checking whether the file server is available every time that you log. A database hit on every log call could result in horrible performance, not to mention that you'll have to ensure that errors in the logging method (such as DB locks) don't result in your entire operation failing.
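One way to avoid hitting the database on every log call is to cache the result of the availability check for a while. The sketch below assumes the expensive check (DB lookup plus ping) can be expressed as a BooleanSupplier; CachedAvailability, the TTL, and the injected clock are illustrative names and choices, not part of any framework.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

// Caches an expensive availability check for ttlMillis milliseconds.
final class CachedAvailability implements BooleanSupplier {
    private final BooleanSupplier check;
    private final LongSupplier clockMillis;
    private final long ttlMillis;
    private boolean hasValue;
    private boolean cached;
    private long checkedAt;

    CachedAvailability(BooleanSupplier check, LongSupplier clockMillis, long ttlMillis) {
        this.check = check;
        this.clockMillis = clockMillis;
        this.ttlMillis = ttlMillis;
    }

    @Override
    public synchronized boolean getAsBoolean() {
        long now = clockMillis.getAsLong();
        if (!hasValue || now - checkedAt >= ttlMillis) {
            try {
                cached = check.getAsBoolean();
            } catch (RuntimeException e) {
                // A failing check is treated as "not available" so the operation itself does not fail.
                cached = false;
            }
            checkedAt = now;
            hasValue = true;
        }
        return cached;
    }
}
```

In production the clock would typically be System::currentTimeMillis; passing it in just makes the expiry easy to test. A TTL also covers the dynamic scenario: a newly configured file server is picked up once the cached value expires.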

How to refactor procedural start-up code?

I have a class (Android Activity) which handles start-up of my application. The application has some pretty complex start-up rules. Right now it looks like a bunch of spaghetti and I'm looking for strategies for refactoring it.
It's honestly such a mess that I'm having problems cutting it down to provide pseudocode. In general there are some rules for start-up that are basically codified in logic:
Steps:
Check for error on last exit and flush local cache if necessary
Download settings file
Parse settings and save settings to local native format
Using the values in settings, do a bunch of 'house keeping'
Using a value in settings, download core data component A
Parse component A and load up local cache
During this logic, it's also updating the user interface. All of this is handled in a zig-zagging, single monolithic class. It's very long, it's got a bunch of dependencies, the logic is very hard to follow and it seems to touch way too many parts of the application.
Is there a strategy or framework that can be used to break up procedural start-up code?
Hmmm. Based on your steps, I see various different "concerns":
Reading and saving settings.
Downloading settings and components (not sure what a "component" is here) from the server.
Reading and instantiating components.
Flush and read cache.
Housekeeping (not really sure what this all entails).
UI updates (not really sure what this requires either).
You might try splitting up the code into various objects along the lines of the above, for example:
SettingsReader
ServerCommunicationManager (?)
ComponentReader
Cache
Not sure about 5 and 6, since I don't have much to go on there.
Regarding frameworks, well, there are various ones such as the previously mentioned Roboguice, that can help with dependency injection. Those may come in handy, or it may be easier just to do this by hand. I think that before you consider dependency injection, though, you need to untangle the code. All that dependency injection frameworks do is to initialize your objects for you -- you have to make sure that the objects make sense first.
Without any more details, the only suggestion that I can think of is to group the various steps behind well structured functions which do one thing and one thing only.
Your 6 steps look to be a good start for the 6 functions your init function should have. If #2 was synchronous (I doubt it), I would merge #2, #3 into a getSettings function.
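As a rough illustration of "one function per step", the sequence could be expressed as a list of named steps run in order, with a callback where the UI update would go. StartupStep and StartupSequence are hypothetical names, the step bodies are left to the caller, and threading (e.g. running downloads off the UI thread) is deliberately not addressed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// One named unit of start-up work.
final class StartupStep {
    final String name;
    final Runnable action;

    StartupStep(String name, Runnable action) {
        this.name = name;
        this.action = action;
    }
}

// Runs the steps in the order they were added, reporting each name before it runs.
final class StartupSequence {
    private final List<StartupStep> steps = new ArrayList<>();
    private final Consumer<String> progress;

    StartupSequence(Consumer<String> progress) {
        this.progress = progress;
    }

    StartupSequence then(String name, Runnable action) {
        steps.add(new StartupStep(name, action));
        return this;
    }

    void run() {
        for (StartupStep step : steps) {
            progress.accept(step.name); // hook for a UI update
            step.action.run();
        }
    }
}
```

The Activity would then only assemble the sequence, e.g. then("Download settings", ...) followed by then("Parse settings", ...), while each step lives in its own class or method.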

Detecting newly created files through Java in realtime

Using JDK 7 I've had success in watching specific directories for new file creations, deletions and modifications using java.nio.file.StandardWatchEventKinds.*
I'm hoping someone may know a way to get Java to detect new file creations regardless of their path.
I am wanting to do this so I can calculate an MD5 sum for each newly written file.
Thanks for any advice you can offer.
Ok, short answer is I don't think Java can do that out of the box. You'd have to either intercept calls to the operating system which would require something closer to the bare metal, or you could do as suggested in another answer and register listeners to every folder from the root down, not to mention other drives in the case of windows machines.
The first approach would need custom JNI which assumes the OS has such a hook and allows user code access.
The second approach would work but could consume a large amount of memory to track all the listeners. In Windows, right-click on c:\ and open its properties to see just how many folders we're talking about.
One possibility - not a convenient one, but a possibility - is to walk the directory tree for the directories you want to watch, registering each in a WatchService. That's not a very nice way to go about it, and it could be a problem depending on how large the actual directory tree is.
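A sketch of that tree walk, using only standard java.nio.file calls; TreeWatcher and registerTree are names made up for this example. It registers every directory that exists at the time of the walk for ENTRY_CREATE events and returns a map from each WatchKey to its directory, which is needed later to resolve event paths.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashMap;
import java.util.Map;

final class TreeWatcher {
    /** Registers root and all of its current subdirectories with the watcher. */
    static Map<WatchKey, Path> registerTree(Path root, final WatchService watcher) throws IOException {
        final Map<WatchKey, Path> keys = new HashMap<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                    throws IOException {
                WatchKey key = dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
                keys.put(key, dir);
                return FileVisitResult.CONTINUE;
            }
        });
        return keys;
    }
}
```

Directories created after the walk are not covered automatically; the event loop would have to register each new directory when its ENTRY_CREATE event arrives.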
I do not know StandardWatchEvents (although it sounds convenient).
One way to do what you want is to use a native Windows API such as ReadDirectoryChangesW (or volume changes). It's painful, but works (been there, done that, wish I had another option at the time).

2 java processes - one reading and one writing to the same file

I have two java processes which I want completely decoupled from each other.
I figure that the best way to do this is for one to write out its data to a file and the other to read it from that file (the second might also have to write to the file to say it has processed the line).
The problems I envisage are to do with simultaneous access to the file. Is there a good simple pattern I can use to get around this problem? Is there a library that handles this sort of functionality?
Best way to describe it is as a simple direct message passing mechanism I could implement using files. (Simpler than JMS).
Thanks Dan
If you want a simple solution and you can assume that "rename file" is an atomic operation (this is not completely true), each one of the processes can rename the file when reading it or writing to it and rename back when it finishes. The other one will not find the file and will wait until the file appears.
You mean like a named pipe? It's possible, but Java doesn't allow pipe creation unless you use non-portable processes.
You are asking for functionality that is exactly what JMS does. JMS is an API which has many implementations. Can you not just use a lightweight implementation? I don't see why you think this is "complicated". By the time you've managed to reliably implement your solution you'll have found that it's not trivial to deal with all the edge cases.
Correct me if I don't understand your problem...
Why don't you look at file locks? When one program acquires the lock, the other waits until the lock is released.
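A minimal sketch of the file-lock idea with java.nio: the writer takes an exclusive lock, appends one line, and releases the lock when the try block ends. LockedAppender is a made-up name. Note that, per the FileLock documentation, locks are held on behalf of the whole JVM (they do not coordinate threads inside one process) and may be only advisory on some platforms, so both processes have to use the same locking protocol.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class LockedAppender {
    /** Appends one line to the file while holding an exclusive lock on it. */
    static void appendLine(Path file, String line) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
             FileLock lock = channel.lock()) { // blocks until the exclusive lock is granted
            ByteBuffer buffer = ByteBuffer.wrap((line + "\n").getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        } // the lock is released first, then the channel is closed
    }
}
```

The reading process would take the same lock (or a shared one) around its reads.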
If you are not locked on a file-based solution, a database can solve your problem.
Each record will be a line written by the writing process. A single column in the record will be left untouched by the writer, and the reading process will use it to indicate that it has read the record.
Naturally you will have to deal with cleaning up the table before it becomes too large, or with partitioning it so that it is easy for the reading process to find information inside it.
If you must use a file - you can think of another file that just has the ID of the record that the reader process read - that way you don't need to have concurrently writing processes on the same file.
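The two-file idea can be sketched like this, assuming the writer only ever appends complete lines to the data file and the reader is the only process that touches the second file. CursorReader and readNew are hypothetical names, and the "ID" is simply the number of lines already processed. A half-written last line is not handled here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class CursorReader {
    /** Returns the lines not yet processed and records the new position in the cursor file. */
    static List<String> readNew(Path data, Path cursor) throws IOException {
        int done = 0;
        if (Files.exists(cursor)) {
            String text = new String(Files.readAllBytes(cursor), StandardCharsets.UTF_8).trim();
            if (!text.isEmpty()) {
                done = Integer.parseInt(text);
            }
        }
        List<String> all = Files.readAllLines(data, StandardCharsets.UTF_8);
        int from = Math.min(done, all.size());
        List<String> fresh = new ArrayList<>(all.subList(from, all.size()));
        // Only the reader writes this file, so the two processes never write to the same file.
        Files.write(cursor, String.valueOf(all.size()).getBytes(StandardCharsets.UTF_8));
        return fresh;
    }
}
```

Re-reading the whole data file on every call keeps the sketch short; remembering a byte offset instead would scale better for large files.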

Touch a file using apache FileUtils

I have looked at the source code of Apache Commons FileUtils.java class to see how they implement unix like touch functionality. But I wanted to confirm with the community here if my use case would be met by the implementation as it opens and closes a FileOutputStream to provide touch functionality
We have two webservers and one common server between them where a File is residing
For our application we need to use the time modified of this file to make some decisions. We actually don't want to modify the file but change its last modified date when some particular activity happens on one of the webservers.
It's important that the last-modified time set for the file is taken from the central server, so that we don't have to worry about time differences between the two web servers. Calling file.setLastModified is therefore not a good option, as the webserver would send its own time.
But I am wondering whether, even if I use the Apache Commons FileUtils touch method to do this, closing the stream on one webserver would set the file's last-modified time using the clock of that webserver or of the central server.
Sorry for so much details but could not see any other way to explain the issue
If you "touch" a file in the filesystem of one webserver, then the timestamp of the file will be set using the clock of that server. I don't think you can solve your problem that way.
I think you've got three options:
configure the servers to synchronize their clocks to the common timebase; e.g. using NTP,
put all files whose timestamps must be accurate to the common timebase on one server, or
change your system design so that it is immune to problems with different servers' clocks being out of sync.
It would be much better to make use of a shared database if you have one so that you can avoid issues of concurrency and synchronisation. I can't recommend any simple and safe distributed file flag system.
