parallel write and read from file - java

I want to have a central log file in my system that a certain application can both write to and read from.
The writes are for new data, and the reads will compare generated data against what has already been written.
I would like this application to run in multiple instances at a time, which means I need a way to read diffs from the file as well as write to it.
I have seen the code below, but it only handles one pass over the file, and I don't see it working with multiple instances.
I'm building this app as a command line tool, so I'm thinking about creating a file for each instance and then migrating it to the "general" log file.
I'd like to hear input from the forum regarding the different approaches to this question.
What I'm worried about is having a few instances reading and writing from the same file and running into locking problems.
This is the code I have found so far:
import java.io.*;

public class Tp {
    public static void main(String[] args) throws IOException {
        File f = new File("/path/to/your/file/filename.txt");
        // The writer truncates the file on open; the reader is opened on the same file.
        BufferedWriter bw = new BufferedWriter(new FileWriter(f));
        BufferedReader br = new BufferedReader(new FileReader(f));

        bw.write("Some text");
        bw.flush(); // push the buffered text to disk so the reader can see it
        System.out.println(br.readLine());

        bw.write("Some more text");
        bw.flush();

        bw.close();
        br.close();
    }
}

You seem to be trying to write and read the same file not only from one program but even within one thread. I don't see the benefit of this: while the program runs, you already know when and what you wrote, so you can drop the whole I/O logic.
To start, try writing two different programs that run as separate processes. If need be, you can still bring them into the same JVM later as separate threads.
Writing is certainly not the problem, so the more interesting part is the reading logic. I'd probably implement this algorithm:
Loop until the program is terminated:
open the file and use skip() to jump to the location where the new data starts
consume the new data
remember how many bytes were read / remember the file size
close the file
wait until the file has changed
Waiting for the file to change can be done by monitoring File.lastModified or File.length, or by using the WatchService. A minimal sketch of this loop follows.
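Here is a rough sketch of that loop, assuming the log only ever grows by appended text and uses a single-byte encoding (the path and polling interval are placeholders):

import java.io.*;

public class LogFollower {
    public static void main(String[] args) throws IOException, InterruptedException {
        File log = new File("/path/to/your/file/filename.txt"); // placeholder path
        long consumed = 0; // how far into the file we have read so far

        while (true) {
            long length = log.length();
            if (length > consumed) { // the file has grown: read only the diff
                try (BufferedReader br = new BufferedReader(new FileReader(log))) {
                    br.skip(consumed); // note: skip() counts chars, so this assumes a single-byte encoding
                    String line;
                    while ((line = br.readLine()) != null) {
                        System.out.println("new: " + line);
                    }
                }
                consumed = length; // remember the file size we have caught up to
            }
            Thread.sleep(500); // crude "wait until file has changed"; WatchService would avoid polling
        }
    }
}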
But be aware that if you have multiple applications writing to the same file in parallel, they can break any meaningful structure in your data. Log4j ensures that parallel writes from within one application/multiple threads go into the file correctly. If you need synchronized writes from multiple processes, consider logging into a database instead.

Related

How to achieve Synchronization in Multithreading Java App

I've designed a server-client app in Java and have more than one user connected to the server.
The server provides some features, such as:
downloading files
creating files
writing/appending to files, etc.
I spotted some actions that need to be synchronized when two or more users send the same request.
For example: when users want to download the same file at the same time, how can I synchronize this action using synchronized blocks or some other method?
//main (connecting 2 users to the server)
ServerSocket server = new ServerSocket(8080, 50);
MyThread client1 = new MyThread(server);
MyThread client2 = new MyThread(server);
client1.start();
client2.start();
Here is the method I would like to synchronize:
//outstream = new BufferedWriter(new OutputStreamWriter(sock.getOutputStream())); // output to client
//instream = new BufferedReader(new InputStreamReader(sock.getInputStream()));    // input from client
public void downloadFile(File file, String name) throws FileNotFoundException, IOException {
    synchronized (this) {
        if (file.exists()) {
            BufferedReader readfile = new BufferedReader(new FileReader(name + ".txt"));
            String newpath = "../Socket/" + name + ".txt";
            BufferedWriter socketfile = new BufferedWriter(new FileWriter(newpath));
            String line;
            while ((line = readfile.readLine()) != null) {
                outstream.write(line + "\n");
                outstream.flush();
                socketfile.write(line);
            }
            outstream.write("EOF\n");
            outstream.flush();
            readfile.close();
            socketfile.close();
            outstream.write("Downloaded\n");
            outstream.flush();
        } else {
            outstream.write("FAIL\n");
        }
        outstream.flush();
    }
}
Note: this method is in a class that extends Thread and is used when I want to "download" the file in the overridden run() method.
Does this example assure me that when two users want to download the same file, one of them will have to wait? And will the other one get it? Thanks for your time!
Locking in concurrent programming is used to provide mutual exclusion for some piece of code. For locking you can use either synchronized or an unstructured lock like ReentrantLock, among others.
The main goal of any lock is to provide mutual exclusion for the piece of code placed inside it, meaning that piece will be executed by only one thread at a time. The section inside the lock is called the critical section.
To achieve proper locking it is not enough just to place the critical code there. You also have to make sure that modification of your variables happens only inside the critical section. If you lock some piece of code, but references to the variables used inside it are also passed to some concurrently executing thread that does not take the lock, the lock won't save you and you will get a data race. Locks secure only the execution of a critical section and only guarantee that the code placed in it is executed by one thread at a time.
//outstream = new BufferedWriter(new OutputStreamWriter(sock.getOutputStream())); //output to client
//instream = new BufferedReader(new InputStreamReader(sock.getInputStream())); //input from client
public void downloadFile(File file, String name) throws FileNotFoundException, IOException {
    synchronized (this) {
Who is the owner of this method? The client? If so, it won't work. You should lock on the same object, one shared by all threads that require the locking. In your case every client has its own lock, and the other threads know nothing about the other threads' locks. You can lock on Client.class instead; that will work.
synchronize(this) vs synchronize(MyClass.class)
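A minimal sketch of the difference, modeled on the question's thread class (method bodies elided; Client and the method names are illustrative):

import java.io.*;

public class Client extends Thread {
    public void downloadFile(File file, String name) throws IOException {
        // Locks on the one shared Class object: all Client threads compete for the same monitor.
        synchronized (Client.class) {
            // ... read the file and stream it to the client ...
        }
    }

    public void downloadFileBroken(File file, String name) throws IOException {
        // Locks on this particular instance: two clients holding different instances never block each other.
        synchronized (this) {
            // ...
        }
    }
}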
After doing that you will have proper locking for reading (downloading) the file. But what about writes? Imagine that during a read some other thread wants to modify the file, and you only have locks around the reads. You are reading the beginning of the file while the other thread is modifying the end of it. The writing thread will succeed, and you will end up with a logically corrupted file: the beginning of one version and the end of another. Of course file systems and the standard Java library try to take care of such cases (by using locks in readers/writers, locking file offsets, etc.), but in general it is a possible scenario. So you also need the same lock around writes, and the read and write methods should share and use the same lock.
And now we have come to a situation with correct behavior but low performance. This is our tradeoff. But we can do better. Right now we use the same lock for every read and write method, which means we can read or write only one file at a time, whichever file it is. That is stricter than necessary, because we can modify or read different files without any possible corruption. So the better approach is to associate a lock with each file, not with the whole method. And here nio comes to help you.
How can I lock a file using java (if possible)
https://docs.oracle.com/javase/7/docs/api/java/nio/channels/FileLock.html
And you can actually read a file concurrently if the offsets are different. For obvious physical reasons you can't read the same part of a file concurrently. But concurrent reading while keeping track of offsets seems like a huge overhead from my point of view, and I'm not sure you will need it. Anyway, here is some info: Concurrent reading of a File (java preferred)
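A minimal per-file locking sketch using nio's FileLock, which also coordinates cooperating processes (the file name is a placeholder; note that FileLock is advisory on most platforms, so all writers must take it):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class FileLockSketch {
    public static void appendExclusively(Path path, byte[] data) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            try (FileLock lock = channel.lock()) { // blocks until the exclusive lock is granted
                channel.write(ByteBuffer.wrap(data));
            } // lock released here, before the channel closes
        }
    }

    public static void main(String[] args) throws IOException {
        appendExclusively(Paths.get("shared.log"), "one line\n".getBytes("UTF-8"));
    }
}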

Reuse an InputStream to a Process in Java

I am using ProcessBuilder to send input to and receive output from a C++ program, using Java. After starting the process once, I would like to be able to send in new strings and receive their output without having to restart the entire process. This is the approach I have taken thus far:
public void getData(String sentence) throws InterruptedException, IOException {
    InputStream stdout = process.getInputStream();
    InputStreamReader isr = new InputStreamReader(stdout);
    OutputStream stdin = process.getOutputStream();
    OutputStreamWriter osr = new OutputStreamWriter(stdin);
    BufferedWriter writer = new BufferedWriter(osr);
    BufferedReader reader = new BufferedReader(isr);

    writer.write(sentence);
    writer.close();

    String ch = reader.readLine();
    preprocessed = "";
    while (ch != null) {
        preprocessed = preprocessed + "~" + ch;
        ch = reader.readLine();
    }
    reader.close();
}
Each time I want to send an input to the running process, I call this method. The first time I send an input, it works fine, and the output is received perfectly. However, the second time I call it, I receive the error
java.io.IOException: Stream closed
which is unexpected, as everything is theoretically recreated when the method is called again. Moreover, removing the line that closes the BufferedWriter results in the code halting at the readLine() call, as if the BufferedReader were waiting for the BufferedWriter to be closed.
One final thing: even when I create a NEW BufferedWriter and instruct the method to use that when called the second time, I get the same exception, which I do not understand at all.
Is there any way this can be resolved?
Thanks a lot!
Your unexpected IOException happens because when Readers and Writers are closed, they close their underlying streams in turn.
When you call your method the first time, everything appears to work. But you close the writer, which closes the process output stream, which closes stdin from the perspective of the process. Not sure what your C++ binary looks like, but probably it just exits happily when it's done with all its input.
So subsequent calls to your method don't work.
There's a separate but similar issue on the Reader side. You call readLine() until it returns null, meaning the Reader has reached the end of the stream. But this only happens when the process is completely done with its stdout.
You need some way of identifying when you're done processing a unit of work (whatever you mean by "sentence") without waiting for the entire stream to end. The stream has no concept of a logical pause between outputs; it's just a continuous stream. Reader and Writer are just a thin veneer that buffers between bytes and characters, but they basically work the same as streams.
Maybe the outputs could have delimiters. Or you could send the length of each chunk of output before actually sending the output and distinguish outputs that way. Or maybe you know in advance how long each response will be?
You only get one shot at a stream, so the streams will have to outlive this method. You can't keep opening and closing streams if you want to avoid restarting your process every time. (There are other ways for processes to communicate, e.g. sockets, but that's probably out of scope.) A sketch of the delimiter approach follows.
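A minimal sketch of the delimiter idea: the streams are created once and kept open, and each response is assumed to end with a line containing only "EOF" (a made-up sentinel; your C++ program's protocol may differ):

import java.io.*;

public class ProcessClient {
    private final BufferedWriter toProcess;
    private final BufferedReader fromProcess;

    public ProcessClient(Process process) {
        // Created once; they must outlive getData() so the process keeps running.
        toProcess = new BufferedWriter(new OutputStreamWriter(process.getOutputStream()));
        fromProcess = new BufferedReader(new InputStreamReader(process.getInputStream()));
    }

    public String getData(String sentence) throws IOException {
        toProcess.write(sentence);
        toProcess.newLine();
        toProcess.flush(); // flush instead of close: the stream stays usable for the next call

        StringBuilder preprocessed = new StringBuilder();
        String line;
        while ((line = fromProcess.readLine()) != null && !line.equals("EOF")) {
            preprocessed.append('~').append(line); // StringBuilder instead of repeated concatenation
        }
        return preprocessed.toString();
    }
}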
On an orthogonal note, appending to a StringBuilder is generally more efficient than a big loop of string concatenations when you're accumulating your output.
You might also have some thread check process.exitValue() or otherwise make sure the process is working as intended.
Don't keep trying to create and close your Streams, because once you close it, it's closed for good. Create them once, then in your getData(...) method use the existing Streams. Only close your Streams or their wrapping classes when you're fully done with them.
Note that you should open and close the Streams in the same method, and thus may need additional methods or classes to help you process the Streams. Consider creating a Runnable class for this and then reading from the Streams in another Thread. Also don't ignore the error stream, as that may be sending key information that you will need to fully understand what's going on here.
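A rough sketch of draining the error stream on its own thread, as suggested above (the class and method names are made up):

import java.io.*;

public class StderrDrainer {
    // Start a daemon thread that drains stderr so the child process never blocks on a full pipe.
    public static void drain(Process process) {
        Thread t = new Thread(() -> {
            try (BufferedReader err = new BufferedReader(
                    new InputStreamReader(process.getErrorStream()))) {
                String line;
                while ((line = err.readLine()) != null) {
                    System.err.println("process stderr: " + line);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
        t.setDaemon(true);
        t.start();
    }
}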

what are the concerns regarding simultaneous read and write to a file?

consider the following scenario:
Process 1 (Writer) continuously appends lines to a file (sharedFile.txt)
Process 2 (Reader) continuously reads lines from sharedFile.txt
my questions are:
In Java, is it possible that:
the Reader process somehow crashes the Writer process (i.e. breaks the Writer's processing)?
the Reader somehow knows when to stop reading purely based on the file's stats (the Reader doesn't know whether others are writing to the file)?
To demonstrate:
Process One (Writer):
...
while (!done) {
    String nextLine; // process the line
    writeLine(nextLine);
    ...
}
...
Process Two (Reader):
...
while (hasNextLine()) {
    String nextLine = readLine();
    ...
}
...
NOTE:
The Writer process has priority, so nothing must interfere with it.
Since you are talking about processes, not threads, the answer depends on how the underlying OS manages open file handles:
On every OS I'm familiar with, a Reader will never crash a Writer process, as the Reader's file handle only allows reading. On Linux, the system calls a Reader can potentially invoke, open(2) with the O_RDONLY flag, lseek(2), and read(2), are known not to interfere with the syscalls the Writer is invoking, such as write(2).
The Reader most likely won't know when to stop reading on most OSes. More precisely, on some read attempt it will receive zero as the number of bytes read and will treat this as an EOF (end of file). At that very moment there can be a Writer preparing to append some data to the file, but the Reader has no way of knowing it.
If you need a way for two processes to communicate via a file, you can do it using some extra files that pass meta-information between Readers and Writers, such as whether a Writer is currently running. Introducing some structure into the file can be useful too (for example, the Writer appends a byte to the file indicating that a write is in progress).
For very fast non-blocking I/O you may want consider memory mapped files via Java's MappedByteBuffer.
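A minimal read-only mapping sketch, assuming the whole file fits in a single mapping (the file name is a placeholder):

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MappedRead {
    public static void main(String[] args) throws IOException {
        try (FileChannel ch = FileChannel.open(Paths.get("sharedFile.txt"), StandardOpenOption.READ)) {
            // Map the whole file into memory; subsequent reads avoid explicit read() syscalls.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            byte[] bytes = new byte[buf.remaining()];
            buf.get(bytes);
            System.out.print(new String(bytes, "UTF-8"));
        }
    }
}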
The code will not crash. However, the reader will terminate when it reaches the end of the file, even if the writer is still writing. You will have to synchronize somehow!
Concerns:
Your reader thread can read a stale value even when you think another writer thread has already updated it.
Even when writing to a file, if there is no synchronization you can see a different value while reading.
Java file I/O and plain files were not designed for simultaneous writes and reads. Either your reader will overtake your writer, or your reader will never finish.
JB Nizet provided the answer in his comment: use a BlockingQueue to hold the writer's data while you're reading it. Either the queue will empty, or the reader will never finish. The BlockingQueue methods give you the means to detect either situation; a sketch follows.
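A minimal producer/consumer sketch with a BlockingQueue standing in for the shared file (the names are made up; poll with a timeout is one way to detect that the writer has gone quiet):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class QueueHandoff {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> lines = new LinkedBlockingQueue<>();

        Thread writer = new Thread(() -> {
            for (int i = 0; i < 5; i++) {
                lines.add("line " + i); // producer: appends lines instead of writing a file
            }
        });
        writer.start();

        while (true) {
            // Consumer: wait up to one second for the next line; null means the writer went quiet.
            String line = lines.poll(1, TimeUnit.SECONDS);
            if (line == null) break;
            System.out.println("read: " + line);
        }
    }
}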

java BufferedWriter file deleted from other source

I have written a small piece of code that can be summarized as:
Thread() {
    run() {
        BufferedWriter fileout = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(log, true), "UTF-8"));
        while (true) {
            fileout.write(blockingQueue.take());
        }
    }
}
Now, some other threads will produce rows and add them to blockingQueue.
Now, if I remove the file from the console, fileout.write() doesn't fail or throw an exception.
I was wondering how I can re-open the file if someone removes it from the filesystem via rm logfile.txt from the console.
The problem is not how to reopen it, but how to detect that the file was removed.
Some options are:
1. take() and save the result to a string
2. open the file and write to it
But even if I change the code in this way, it doesn't guarantee that the file gets written before someone removes it.
The other option is to lock the file, but I don't want to do that.
I don't want to prevent the delete of the file :)
If the file you are writing to can disappear, your best option is not to keep the stream open, but to create a fresh FileOutputStream whenever you need to write something. That will recreate the file, too (which I suppose is what you want).
Or you could check whether the file exists before each write. I suppose that, performance-wise, the two methods come to about the same.
If performance is an issue, you could buffer in memory, and when the buffer is full, open the FileOutputStream (and immediately close it again after writing out the buffer). A sketch of the reopen-per-write approach is below.
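A rough sketch of the reopen-per-write approach, reusing the question's queue-draining loop (log and blockingQueue stand in for the question's variables):

import java.io.*;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ReopeningLogger {
    private final File log = new File("logfile.txt");
    private final BlockingQueue<String> blockingQueue = new LinkedBlockingQueue<>();

    public void run() throws InterruptedException {
        while (true) {
            String row = blockingQueue.take();
            // Open, write, close on every row: if the file was rm'ed, append mode recreates it.
            try (BufferedWriter fileout = new BufferedWriter(
                    new OutputStreamWriter(new FileOutputStream(log, true), "UTF-8"))) {
                fileout.write(row);
                fileout.newLine();
            } catch (IOException e) {
                e.printStackTrace(); // a real logger would retry or report the failure
            }
        }
    }
}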

Writing to file blocks sometimes

I implemented a download manager, which works fine, except I noticed one thing: sometimes the thread blocks for a while (50 milliseconds up to 10 seconds) when writing to files. I am running this program on Android (Linux based). My guess is that there is some kind of OS-level buffer that needs to be flushed, my writes actually go to that buffer, and when the buffer is full, writing has to wait.
My question is what is the possible reason that could cause the blocking?
IO is well known to be a 'blocking' activity, so your question should really be 'what should my program do while it is busy waiting for IO to complete?'
Adopting one of the well-known concurrency strategies or event-based programming patterns is a good start.
- I have done the writing and reading of files in the following way and never encountered any problems.
Eg:
File f = new File("Path");
FileWriter fw = new FileWriter(f);
BufferedWriter bw = new BufferedWriter(fw);
- You can alternatively try out the NIO package in Java.
