I have a multi-threaded application where some of my threads should read data from a queue and write it to a file. The problem is that I am unsure whether I should create a new BufferedWriter instance every time one of my threads reads a value from the queue and writes it to the same file, or whether I can have just one BufferedWriter instance and flush() every time. One problem with the second solution is that I have to detect when to close the BufferedWriter, without using the Java 7 try-with-resources solution for closing resources.
Does the second solution solve any performance issues?
What are the best practices here?
I'd recommend using one BufferedWriter, shared by all threads, for writing to the file. For the sake of performance, the BufferedWriter should be kept open until the application decides that there is no more output. (Opening and closing files is relatively expensive.)
You also need to have the application threads use some kind of locking (e.g. synchronize on the BufferedWriter) to ensure that they don't try to write at the same time. The BufferedWriter class is not thread-safe.
The try/finally or try-with-resources approach to managing file resources is important in cases where you are opening lots of files. If you are only dealing with one file, and it needs to be open for the entire duration of the application, then resource management is not so critical. (But you do need to make sure that you either close or flush before the application exits.)
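To illustrate that arrangement, here is a minimal sketch (the class, method and path names are made up): all worker threads share one BufferedWriter, synchronize on it for each write, and close it exactly once at the end.

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

// Sketch only: one shared writer guarded by its own monitor.
public class SharedLogFile {
    private final BufferedWriter writer;

    public SharedLogFile(String path) throws IOException {
        this.writer = new BufferedWriter(new FileWriter(path));
    }

    // Called by any worker thread after it takes a value off the queue.
    public void writeLine(String line) throws IOException {
        synchronized (writer) { // BufferedWriter is not specified to be thread-safe
            writer.write(line);
            writer.newLine();
        }
    }

    // Called exactly once, when the application decides there is no more output.
    public void close() throws IOException {
        synchronized (writer) {
            writer.close(); // close() also flushes any remaining buffered data
        }
    }
}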
But I think BufferedWriter is thread-safe, because the underlying implementation of the write() methods uses synchronized blocks.
Well in that sense, yes it is. However, that assumes that when your thread writes data, it does it in a single write(...) call.
Note also that BufferedWriter is not specified to be thread-safe, even if it is thread-safe in the current implementation.
A BufferedWriter should only ever feed one file/Writer/OutputStream; if you have many targets, you will need many buffers. If you want to write to the same file from multiple threads, you will need to synchronize at the earliest point you can; you can synchronize on the BufferedWriter itself if you don't have any higher-level construct than the character stream. If you synchronize on the BufferedWriter, you will not need to call flush() at the end of each access.
Related
Consider the following scenario:
Process 1 (Writer) continuously appends a line to a file (sharedFile.txt)
Process 2 (Reader) continuously reads a line from sharedFile.txt
My questions are:
In Java, is it possible that:
the Reader process somehow crashes the Writer process (i.e. breaks the Writer's processing)?
the Reader somehow knows when to stop reading the file purely based on the file's stats (the Reader doesn't know whether others are writing to the file)?
To demonstrate:
Process One (Writer):
...
while (!done) {
    String nextLine; // process the line
    writeLine(nextLine);
    ...
}
...
Process Two (Reader):
...
while (hasNextLine()) {
    String nextLine = readLine();
    ...
}
...
NOTE:
The Writer process has priority, so nothing must interfere with it.
Since you are talking about processes, not threads, the answer depends on how the underlying OS manages open file handles:
On every OS I'm familiar with, a Reader will never crash a Writer process, because the Reader's file handle only allows reading. On Linux, the system calls a Reader can invoke on the underlying OS -- open(2) with the O_RDONLY flag, lseek(2) and read(2) -- are known not to interfere with the syscalls the Writer is invoking, such as write(2).
The Reader most likely won't know when to stop reading on most OSes. More precisely, on some read attempt it will receive zero as the number of bytes read and will treat this as an EOF (end of file). At that very moment there can be a Writer preparing to append some data to the file, but the Reader has no way of knowing it.
If you need a way for two processes to communicate via a file, you can do it using some extra files that pass meta-information between Readers and Writers, such as whether a Writer is currently running. Introducing some structure into the file can be useful too (for example, every Writer appends a byte to the file indicating that a write is in progress).
For very fast non-blocking I/O you may want to consider memory-mapped files via Java's MappedByteBuffer.
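For illustration only, here is a minimal sketch of mapping a file with FileChannel.map (the file name and the 4096-byte region size are arbitrary choices):

import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MappedWriteSketch {
    public static void main(String[] args) throws Exception {
        try (FileChannel channel = FileChannel.open(Paths.get("sharedFile.txt"),
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map a fixed-size region of the file into memory.
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            buffer.put("a line of output\n".getBytes(StandardCharsets.UTF_8));
            buffer.force(); // push the mapped region out to the underlying file
        }
    }
}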
The code will not crash. However, the reader will terminate when the end is reached, even if the writer may still be writing. You will have to synchronize somehow!
Concern:
Your reader thread can read a stale value even when you think another writer thread has updated the value.
Even if you are writing to a file, without synchronization you may see a different value while reading.
Java File IO and plain files were not designed for simultaneous writes and reads. Either your reader will overtake your writer, or your reader will never finish.
JB Nizet provided the answer in his comment. You use a BlockingQueue to hold the writer data while you're reading it. Either the queue will empty, or the reader will never finish. You have the means through the BlockingQueue methods to detect either situation.
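A minimal sketch of that idea (the class name and sentinel value are made up): the producer hands lines to the consumer through a BlockingQueue and puts a poison-pill value on the queue when it is finished, so the consumer knows exactly when to stop.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueHandoffSketch {
    private static final String POISON_PILL = "<<EOF>>"; // sentinel meaning "the writer is finished"

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);

        Thread writer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) {
                    queue.put("line " + i); // hand each line directly to the reader
                }
                queue.put(POISON_PILL); // signal that no more lines are coming
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread reader = new Thread(() -> {
            try {
                String line;
                while (!(line = queue.take()).equals(POISON_PILL)) {
                    System.out.println("read: " + line); // process the line
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        writer.start();
        reader.start();
        writer.join();
        reader.join();
    }
}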
So I have a Java process that needs to constantly append a new line to a file every 100 milliseconds. I am currently using BufferedWriter for this, but from what I have read, the BufferedWriter object should always be .close()'d when it's finished.
If I did this, I would have to create a new BufferedWriter object every few milliseconds, which is not ideal. Are there any issues with creating one static BufferedWriter, and just .flush()'ing it after every write?
Finally, is BufferedWriter the best class to use for this, if performance is a concern? Are there any viable alternatives?
Thanks!
The BufferedWriter should be closed when it's finished. If you're doing something like logging, it's entirely acceptable to hold an open writer in the object responsible for the logging and then close it at the end of the run (or whenever you roll over to a new log file).
What you shouldn't do is simply open the writer and then discard the reference without closing it; this can leak resources, and in the case of something with buffering, might lose the last part of the output.
Are there any issues with creating one static BufferedWriter, and just .flush()'ing it after every write?
There is nothing wrong with a single, long-lived BufferedWriter. In fact it is a good idea. (Whether you use a static or something else is a different issue ... but that design decision does not impact functionality or performance.)
Calling flush after each write is more questionable from a performance perspective. It will cause your application to make a lot of write syscalls ... which is expensive. I would only do that if you need the logging to be written immediately. The alternative is to flush on a timer ... or not flush at all and rely on the (final) close of the BufferedWriter to flush any outstanding data.
But either way, a long-lived BufferedWriter that you flush is likely to be better than creating, writing and closing lots of BufferedWriter objects.
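As a rough sketch of the "flush on a timer" option (the file name, flush interval and write cadence are arbitrary): one long-lived BufferedWriter, a background task that flushes it periodically, and a single close at the end.

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class TimedFlushSketch {
    public static void main(String[] args) throws Exception {
        BufferedWriter out = new BufferedWriter(new FileWriter("output.log"));
        ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();

        // Flush roughly once per second instead of after every write.
        flusher.scheduleAtFixedRate(() -> {
            synchronized (out) {
                try {
                    out.flush();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }, 1, 1, TimeUnit.SECONDS);

        for (int i = 0; i < 50; i++) {
            synchronized (out) {
                out.write("line " + i);
                out.newLine();
            }
            Thread.sleep(100); // the question's 100 ms cadence
        }

        flusher.shutdown();
        out.close(); // the final close flushes anything still buffered
    }
}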
Suppose a Java application writes to a file using BufferedWriter API (and does not call flush after every write). I guess that if the application exits with System.exit the buffer is not flushed and so the file might be corrupted.
Suppose also that the application component, which decides to exit, is not aware of the component, which writes to the file.
What is the easiest and correct way to solve the "flush problem" ?
You may use the Runtime.addShutdownHook method, which can be used to add a JVM shutdown hook. This is basically an unstarted Thread that is executed on shutdown of the Java Virtual Machine.
So if you have a handle to the file available to that thread, you can try to close the stream and flush the output.
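A minimal sketch of that approach (the file name and content are made up): register a hook that closes the writer, so the buffer is flushed even if some other component calls System.exit.

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class ShutdownFlushSketch {
    public static void main(String[] args) throws IOException {
        BufferedWriter out = new BufferedWriter(new FileWriter("output.txt"));

        // An unstarted thread that the JVM runs on normal shutdown.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                out.close(); // close() flushes whatever is still in the buffer
            } catch (IOException e) {
                // nothing sensible left to do this late in shutdown
            }
        }));

        out.write("data that would otherwise be lost in the buffer");
        System.exit(0); // the hook still runs and flushes the buffer
    }
}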
Note: Although it seems feasible to use this, I believe there will be implementation challenges, because you cannot be sure your file handle is not stale by the time your shutdown hook is called. So the better approach is to close your streams gracefully, using finally blocks in the code where the file operations are done.
You can add a shutdown hook but you need to have a reference to each of these BufferedWriter or other Flushable or Closable objects. You won't gain anything from it. You should perform close() and flush() directly in the code that is manipulating the object.
Think of the Information Expert GRASP pattern: the code manipulating the BufferedWriter is the place that has the information about when an operation is finished and should be flushed, so that's where that logic should go.
If some application component is calling System.exit when things aren't done, I would consider that an abnormal exit; it should not return 0 and therefore shouldn't guarantee that streams are flushed.
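To make that concrete, a small sketch (the class and method names are made up) where the component that does the writing also owns the close, via try-with-resources:

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.List;

public class ReportWriter {
    // The code that writes the file is the code that knows when it is finished.
    public void writeReport(String path, List<String> lines) throws IOException {
        try (BufferedWriter out = new BufferedWriter(new FileWriter(path))) {
            for (String line : lines) {
                out.write(line);
                out.newLine();
            }
        } // close() here flushes and releases the file, no matter what exits later
    }
}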
I implemented a download manager, which works fine except for one thing I noticed: sometimes the thread blocks for a while (50 milliseconds up to 10 seconds) when writing to files. I am running this program on Android (Linux-based). My guess is that there is some kind of buffer at the OS level that needs to be flushed, that my writes actually go to that buffer, and that if the buffer is full, the write has to wait.
My question is what is the possible reason that could cause the blocking?
IO is well known to be a 'blocking' activity, hence your question should really be 'what should you do while your program is busy waiting for IO to complete?'
Adopting some of the well-known concurrency strategies and event-based programming patterns is a good start.
- I have done the writing and reading of files in the following way and never encountered any problems.
Eg:
File f = new File("Path");
FileWriter fw = new FileWriter(f);
BufferedWriter bw = new BufferedWriter(fw);
- You can alternatively try out the NIO package in Java.
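For example, a small sketch of the NIO alternative (the path and content are placeholders), appending a line with java.nio.file.Files in a single call:

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class NioAppendSketch {
    public static void main(String[] args) throws Exception {
        Path path = Paths.get("Path");
        // The file is opened, appended to and closed in one call.
        Files.write(path, "a line of output\n".getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}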
Is it possible to have one thread write to the OutputStream of a Java Socket, while another reads from the socket's InputStream, without the threads having to synchronize on the socket?
Sure. The exact situation you're describing shouldn't be a problem (reading and writing simultaneously).
Generally, the reading thread will block if there's nothing to read, and might timeout on the read operation if you've got a timeout specified.
Since the input stream and the output stream are separate objects within the Socket, the only thing you might concern yourself with is, what happens if you had 2 threads trying to read or write (two threads, same input/output stream) at the same time? The read/write methods of the InputStream/OutputStream classes are not synchronized. It is possible, however, that if you're using a sub-class of InputStream/OutputStream, that the reading/writing methods you're calling are synchronized. You can check the javadoc for whatever class/methods you're calling, and find that out pretty quick.
Yes, that's safe.
If you wanted more than one thread reading from the InputStream you would have to be more careful (assuming you are reading more than one byte at a time).
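As an illustration (the host, port and message contents are placeholders, and a hypothetical echo server is assumed), one thread writes to the socket's OutputStream while another reads from its InputStream, with no lock shared between them:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

public class SocketDuplexSketch {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("localhost", 7000)) {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream()));
            PrintWriter out = new PrintWriter(socket.getOutputStream(), true);

            // One thread writes to the OutputStream...
            Thread writer = new Thread(() -> {
                for (int i = 0; i < 10; i++) {
                    out.println("message " + i);
                }
            });

            // ...while another reads from the InputStream at the same time.
            Thread reader = new Thread(() -> {
                try {
                    for (int i = 0; i < 10; i++) {
                        System.out.println("echoed: " + in.readLine());
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });

            writer.start();
            reader.start();
            writer.join();
            reader.join();
        }
    }
}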