ReadWriteLock for concurrent access to a file - java

I have a class which implements read and write operations on a file in a concurrent environment. I know BufferedInputStream and BufferedWriter are synchronized, but in my case read and write operations can be invoked simultaneously. For now I use ReentrantReadWriteLock, but I'm not confident that my solution is correct.
import java.io.*;
import java.util.Objects;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class FileWrapper {
    private final File file;
    private final ReadWriteLock lock;

    public FileWrapper(final File file) {
        if (Objects.isNull(file)) {
            throw new IllegalArgumentException("File can't be null!");
        }
        this.file = file;
        this.lock = new ReentrantReadWriteLock();
    }

    public String getContent() {
        final Lock readLock = lock.readLock();
        readLock.lock();
        final StringBuilder sb = new StringBuilder();
        try (final BufferedInputStream in =
                     new BufferedInputStream(
                             new FileInputStream(file))) {
            int data;
            while ((data = in.read()) != -1) {
                sb.append((char) data);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            readLock.unlock();
        }
        return sb.toString();
    }

    public void saveContent(final String content) {
        final Lock writeLock = lock.writeLock();
        writeLock.lock();
        try (BufferedWriter out =
                     new BufferedWriter(
                             new FileWriter(file))) {
            out.write(content);
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            writeLock.unlock();
        }
    }
}
Is ReentrantReadWriteLock the correct solution in this case, or do I need to use ReentrantLock or something else? (With a reason.)
This discussion is not about class design (File as state, passing the File directly into the method, using the nio package, etc.). It shouldn't be a utility class. The method signatures and the File field must stay unchanged. It is about potential concurrency problems with File and InputStream/OutputStream.

RRWL is fine here. Of course, if some code makes a new FileWrapper("/foo/bar.txt") and then some other code also makes a separate new FileWrapper("/foo/bar.txt"), those two wrappers will be falling all over themselves and will cause things to go pear-shaped; I assume you have some external mechanism to ensure this cannot happen. If you don't, some take on ConcurrentHashMap and its concurrency methods (such as computeIfAbsent; don't use the plain-jane get/put for these) can help you out.
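For illustration, a minimal sketch of such a mechanism; the class name, method name, and the choice of canonical path as cache key are all mine, not part of the question:

import java.io.File;
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public final class FileWrappers {
    private static final ConcurrentMap<String, FileWrapper> CACHE =
            new ConcurrentHashMap<>();

    // One wrapper per canonical path, created atomically by computeIfAbsent.
    public static FileWrapper of(File file) throws IOException {
        String key = file.getCanonicalPath();
        return CACHE.computeIfAbsent(key, k -> new FileWrapper(new File(k)));
    }
}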
Note that your exception handling is bad. Exception messages should not end in punctuation (think about it: without this rule, 80% of all exception messages would end in an exclamation mark, which would make log files a fun exercise), and in general, if you ever write catch (Exception e) { e.printStackTrace(); }, you go to that special place reserved for people who talk in movie theaters, and people who write that.
I'd say a method called saveContent is justified in throwing some checked exception; after all, it's rather obvious that it can fail, and code calling it can feasibly be expected to take some action if it does.
If you just can't get there, the proper ¯\_(ツ)_/¯ "I dunno" catch block handler is: throw new RuntimeException("uncaught", e); and not e.printStackTrace();. The latter logs to an uncontrollable place, shaves off useful information and, crucially, keeps on running as if nothing is wrong, silently ignoring the fact that a save call just failed, whereas the former preserves everything and will in fact abort code execution. That makes recovery hard, but it is still better than e.printStackTrace, and if you want recovery to be easier, then make a dedicated exception type. Or just throw that IOException unmolested (you get way shorter code to boot!).
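To illustrate the shape of that advice (and only that: it changes the method signature, which the question explicitly wants frozen, so treat it as a sketch rather than a drop-in replacement), here is saveContent rethrowing the checked exception and pinning the charset via Files.newBufferedWriter (Java 7+):

public void saveContent(final String content) throws IOException {
    final Lock writeLock = lock.writeLock();
    writeLock.lock();
    try (BufferedWriter out = Files.newBufferedWriter(
            file.toPath(), StandardCharsets.UTF_8)) {
        out.write(content);
    } finally {
        writeLock.unlock();
    }
}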
Another insidious bug in this code is that it uses the platform-default charset encoding to read your file, which is very rarely what you want.
The new Files API can also read the entire file in one go, saves you a ton of code on that read op. As you upgrade your code, you get the benefit of the Files API's unique take on charset encodings: Unlike most other places in the java libraries, java.nio.file.Files will assume you meant to use UTF-8 encoding if you fail to specify (instead of 'platform default', i.e. 'the thing that you cannot test for that will blow up in production and waste a week of your time chasing after it').
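For example, a sketch of getContent on top of Files.readString (Java 11+); it too has to declare IOException, so it bends the frozen-signature constraint and is shown only for the shape:

public String getContent() throws IOException {
    final Lock readLock = lock.readLock();
    readLock.lock();
    try {
        return Files.readString(file.toPath()); // UTF-8 by default
    } finally {
        readLock.unlock();
    }
}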

Related

file thread synchronization

How to perform read and write operations using thread synchronization?
Condition: if one file exists that writers may write information to, only one writer may write at a time. Confusion may arise if a reader is trying to read at the same time as a writer is writing. Since readers only look at the data but do not modify it, we can allow more than one reader to read at the same time.
import java.io.*;
import java.util.Scanner;

//reader thread
class Read extends Thread {
    static FileReader fr1 = null;
    static BufferedReader br1 = null;

    static synchronized void reader() throws IOException {
        String path = "C:/Users/teja/Documents/file1.txt";
        fr1 = new FileReader(path);
        br1 = new BufferedReader(fr1);
        int i;
        while ((i = br1.read()) != -1)
            System.out.print((char) i);
        System.out.println();
    }

    public void run() {
        try {
            reader();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
//writer code
class Writer extends Thread {
    static BufferedWriter bw1 = null;
    static FileWriter fw1 = null;

    static synchronized void writer() throws IOException {
        Scanner scanner = new Scanner(System.in);
        System.out.println("enter data to be added:");
        String data = scanner.nextLine();
        String path = "C:/Users/vt/Documents/file1.txt";
        fw1 = new FileWriter(path, true);
        bw1 = new BufferedWriter(fw1);
        bw1.newLine();
        bw1.write(data);
        bw1.flush();
        scanner.close();
        System.out.println("data added");
    }

    public void run() {
        try {
            writer();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
//main method
public class FileReadWrite {
    public static void main(String[] args) {
        Read rd1 = new Read();
        Read rd2 = new Read();
        Writer wt1 = new Writer();
        rd1.run();
        rd2.run();
        wt1.run();
        rd1.run();
    }
}
I am new to files and threading in Java. I know this is not the correct approach. Guide me.
If one file exists that writers may write information to, only one writer may write at a time. Confusion may arise if a reader is trying to read at the same time as a writer is writing. Since readers only look at the data but do not modify it, we can allow more than one reader to read at the same time.
There are two approaches to this.
(1) Either lock the resource and have the readers wait until the writer has completed the writing operation (or likewise, have a writer wait until all readers are done). This approach guarantees consistency, but can be slow if a lot of writers/readers are working on the resource at the same time (see Lock in the java.util.concurrent.locks package).
(2) Keep an in-memory version of the contents of the file that is served to readers only. When a change is made, this in-memory version is updated. Here you'll have more speed, but you lose consistency and you'll need more memory.
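A bare-bones sketch of approach (2); all names here are hypothetical, and the volatile field is what lets readers see a consistent snapshot without blocking:

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class CachedFile {
    private final Object writeLock = new Object();
    private final Path path;
    private volatile String cache; // readers never block on this

    CachedFile(Path path) throws IOException {
        this.path = path;
        this.cache = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }

    String read() {
        return cache; // no lock needed: a volatile read
    }

    void write(String content) throws IOException {
        synchronized (writeLock) { // one writer at a time
            Files.write(path, content.getBytes(StandardCharsets.UTF_8));
            cache = content;
        }
    }
}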
What you are describing is generally referred to as a race condition, and what you need to avoid it is a synchronization mechanism between threads. There are more choices available, but the most suitable for your case are a mutex and a read-write lock.
A mutex simply locks the resource before any operation on the shared resource is performed, independently of the type of operation, and frees it after the operation has terminated. So a read will lock the resource, and any other operation, read or write, will be blocked.
A write will lock the resource too, so again no other read or write operation can be performed before the action has terminated and the mutex is unlocked. So a mutex basically has two states: locked and unlocked.
A read-write lock gives you more freedom, based on the fact that read-only operations do not cause inconsistencies. A read-write lock has three states: unlocked, read-locked, and write-locked. A write lock works like a regular mutex, blocking every other operation. A read lock, on the contrary, blocks only write operations.
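In Java the read-write lock described above maps onto ReentrantReadWriteLock; a minimal sketch of its three states in use:

import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

ReadWriteLock rwl = new ReentrantReadWriteLock();

// Read lock: shared, any number of readers may hold it at once.
rwl.readLock().lock();
try {
    // read from the file
} finally {
    rwl.readLock().unlock();
}

// Write lock: exclusive, blocks all readers and other writers.
rwl.writeLock().lock();
try {
    // write to the file
} finally {
    rwl.writeLock().unlock();
}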
I am not a Java expert, but from this answer a mutex in Java can be used as follows:
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

private final Lock lock = new ReentrantLock(true);

lock.lock();
try {
    // open file and do stuff
} catch (Exception e) {
    // handle the exception
} finally {
    lock.unlock();
}
Here, instead, you can find a description of a read-write lock class.
From the implementation point of view, you can create an instance of one of the two synchronization mechanisms and have your read/write thread instances keep a reference to it as an instance variable.

Optional debugging output in a multi-threaded application--I mean *for the client*, not for the sake of figuring out deadlocks or bugs

I'm working on my first multi-threaded application, for the sake of learning. I really need to learn it. I already have a single-threaded function that reads in all text files in a directory, and replaces all indentation tabs to three spaces.
It lets the caller pass in an Appendable for optional extra information (listing each file, giving statistics, et cetera). If they pass in null, they want no debugging.
I'm trying to determine what's the best way of handling this in a multi-threaded version, but searching for "debugging multi-threaded java" is giving me nothing but how to diagnose bugs and deadlocks.
Can I safely stick with an Appendable or should I be considering something else? I'm not sure how to deal with interleaving messages, but the first thing I want to figure out is thread safety.
Rather than passing in an Appendable, consider using slf4j in your library to do the logging.
If no logging framework is linked in at run-time, no logging will be done. If the application is doing logging already, then there's probably a front-end to it that slf4j will output to.
I'd recommend using Logback for your logging output, as it's nicely configurable, either through configuration files or directly in code. All you need to do to get rudimentary output is include the JAR.
Debugging threads is often a case of trying to figure out presentation. Log4j is great generally. You can configure it to tag each line with the thread name as well as the timestamp. Once you do this you can filter the output based on thread name and follow a single thread.
A good filtering tool is really important. The most basic would be tail and pipe it through grep--but if it's something you do a lot you might want to layer something on top of the log--like a GUI with tabs for each thread or something like that.
Log4j itself will have no problem dealing with threads.
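For instance, a sketch of what that looks like through slf4j; the class and method here are made up, and %thread is the Logback conversion word that tags each line with the thread name:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class TabFixer {
    private static final Logger log = LoggerFactory.getLogger(TabFixer.class);

    void processFile(String name) {
        // With a backend pattern such as "%d [%thread] %-5level %logger - %msg%n"
        // every line carries the thread name, so per-thread output can be
        // recovered later with grep or a log viewer.
        log.debug("processing {}", name);
    }
}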
If you really want to do it yourself, pass in a DIFFERENT appendable to each thread, then when the thread is done dump it or save it to a file. You probably want to use just one thread to dump/save the appendables.
The problem with using Appendable from multiple threads is that it is not specified as thread safe.
Thread safety is the responsibility of classes that extend and implement this interface.
The answer is therefore to use a thread-safe multiplexor. This one uses a BlockingQueue and a thread that pulls data out of it and forwards it to their Appendable.
import java.io.IOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class TellThemWhatIsHappening implements Appendable {
    // The pipe to their system/log.
    private final Appendable them;
    // My internal queue.
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);
    // Have I been interrupted?
    private volatile boolean interrupted = false;

    public TellThemWhatIsHappening(Appendable them) {
        // Record the target Appendable.
        this.them = them;
        // Grow my thread.
        Thread t = new Thread(consumer);
        // Make sure it doesn't hold your app open.
        t.setDaemon(true);
        // Start the consumer running.
        t.start();
    }

    // The runnable that consumes the queue and passes it on to them.
    private Runnable consumer = new Runnable() {
        @Override
        public void run() {
            while (!interrupted) {
                try {
                    // Pull from the queue and push to them.
                    them.append(queue.take());
                } catch (InterruptedException ex) {
                    // We got interrupted.
                    interrupted = true;
                } catch (IOException ex) {
                    // Not sure what you should do here. Their appendable threw you an exception.
                    interrupted = true;
                }
            }
        }
    };
    private void append(String s) throws IOException {
        // No point if they are null.
        if (them != null) {
            try {
                queue.put(s);
            } catch (InterruptedException ex) {
                // What should we do here?
                interrupted = true;
            }
        }
    }

    @Override
    public Appendable append(CharSequence csq) throws IOException {
        append(csq.toString());
        return this;
    }

    @Override
    public Appendable append(CharSequence csq, int start, int end) throws IOException {
        append(csq.subSequence(start, end).toString());
        return this;
    }

    @Override
    public Appendable append(char c) throws IOException {
        append("" + c);
        return this;
    }
}
However - it is a very good idea to use a proper logging system for logging rather than growing your own.

Will not closing a stringwriter cause a leak?

I realize that in Java the GC will eventually clean up objects, but I'm asking if it is bad practice not to close your StringWriter. Currently I am doing this:
private static String processTemplate(final Template template, final Map root) {
    StringWriter writer = new StringWriter();
    try {
        template.process(root, writer);
    } catch (TemplateException e) {
        logger.error(e.getMessage());
    } catch (IOException e) {
        logger.error(e.getMessage());
    } finally {
    }
    return writer.toString();
}
Should I be closing the writer and creating a new String like this:
String result = "";
...
finally {
result = writer.toString();
writer.close();
}
Is this better to do?
The javadoc is quite explicit:
Closing a StringWriter has no effect.
And a quick look at the code confirms it:
public void close() throws IOException {
}
It's not holding any non-memory resource. It will be garbage collected like anything else.
The close() probably merely exists because other writer objects do hold resources that need to be cleaned up, and close() is needed to satisfy the interface.
No, not closing a StringWriter will not cause a leak: as noted, StringWriter#close() is a nop, and the writer only holds memory, not external resources, so these will be collected when the writer is collected. (Explicitly, it holds references to objects in private fields that do not escape the object, concretely a StringBuffer, so no outside references.)
Further, you generally shouldn't close a StringWriter, because it adds boilerplate to your code, obscuring the main logic, as we'll see. However, to reassure readers that you're being careful and doing this intentionally, I'd recommend commenting this fact:
// Don't need to close StringWriter, since no external resource.
Writer writer = new StringWriter();
// Do something with writer.
If you do want to close the writer, most elegant is to use try-with-resources, which will automatically call close() when you exit the body of the try block:
try (Writer writer = new StringWriter()) {
    // Do something with writer.
    return writer.toString();
}
However, since Writer#close() throws IOException, your method now needs to also throw IOException even though it never occurs, or you need to catch it, to prove to the compiler that it is handled. This is quite involved:
Writer writer = new StringWriter();
try {
    // Do something with writer, which may or may not throw IOException.
    return writer.toString();
} finally {
    try {
        writer.close();
    } catch (IOException e) {
        throw new AssertionError("StringWriter#close() should not throw IOException", e);
    }
}
This level of boilerplate is necessary because you can't just put a catch on the overall try block, as otherwise you might accidentally swallow an IOException thrown by the body of your code. Even if there isn't any currently, some might be added in future and you'd want to be warned of this by the compiler. The AssertionError is documenting the current behavior of StringWriter#close(), which could potentially change in a future release, though that is extremely unlikely; it also masks any exception that may occur in the body of the try (again, this should never occur in practice). This is far too much boilerplate and complexity, and you'd clearly be better off omitting the close() and commenting why.
A subtle point is that not only does Writer#close() throw an IOException, but so does StringWriter#close(), so you can't eliminate the exception by making the variable a StringWriter instead of a Writer. This is different from StringReader, which overrides the close() method and specifies that it does not throw an exception! See my answer to Should I close a StringReader?. This may look wrong – why would you have a method that does nothing but may throw an exception?? – but is presumably for forward compatibility, to leave open the possibility of throwing an IOException on close in future, as this is an issue for writers generally. (It could also just be a mistake.)
To summarize: it's fine to not close a StringWriter, but the reason to not do the usual right thing, namely try-with-resources, is just because close() declares that it throws an exception that it doesn't actually throw in practice, and handling this precisely is a lot of boilerplate. In any other case it's better to just use the conventionally correct resource-management pattern and prevent problems and head-scratching.
At the end of the method there is no reference left to the writer, so it will be freed by the GC.

Java: merging InputStreams

My goal is to create (or use an existing) InputStream implementation (say, MergeInputStream) that will try to read from multiple InputStreams and return the first result. After that it will release the lock and stop reading from all InputStreams until the next mergeInputStream.read() call. I was quite surprised that I didn't find any such tool. The thing is: all of the source InputStreams are not quite finite (not a file, for example, but System.in, a socket or such), so I cannot use SequenceInputStream. I understand that this will probably require some multi-threading mechanism, but I have absolutely no idea how to do it. I tried to google it but with no result.
The problem of reading input from multiple sources and serializing them into one stream is preferably solved using SelectableChannel and Selector. This however requires that all sources are able to provide a selectable channel. This may or may not be the case.
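For the curious, a rough sketch of the selectable-channel route; it assumes the sources are SocketChannels, and the method name is invented:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

static void pump(SocketChannel... channels) throws IOException {
    Selector selector = Selector.open();
    for (SocketChannel ch : channels) {
        ch.configureBlocking(false); // required before registering
        ch.register(selector, SelectionKey.OP_READ);
    }
    ByteBuffer buf = ByteBuffer.allocate(4096);
    while (!selector.keys().isEmpty() && selector.select() > 0) {
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();
            buf.clear();
            int n = ((SocketChannel) key.channel()).read(buf);
            if (n == -1) {
                key.cancel(); // this source is exhausted
            } else {
                buf.flip(); // hand these bytes to the merged consumer
            }
        }
    }
}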
If selectable channels are not available, you could choose to solve it with a single thread by letting the read implementation do the following: for each input stream is, check if is.available() > 0, and if so return is.read(). Repeat this procedure until some input stream has data available.
This method, however, has two major drawbacks:
Not all implementations of InputStream implement available() in a way such that it returns 0 if and only if read() would block. The result is, naturally, that data may not be read from this stream, even though is.read() would return a value. Whether or not this is to be considered a bug is questionable, as the documentation merely states that it should return an "estimate" of the number of bytes available.
It uses a so-called "busy loop", which basically means that you'll either need to put a sleep in the loop (which results in reading latency) or hog the CPU unnecessarily.
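For concreteness, a sketch of that polling read(); it assumes a sources array field, omits end-of-stream detection, and both caveats above apply to it:

private final InputStream[] sources; // the streams being merged (set in the constructor)

public int read() throws IOException {
    while (true) {
        for (InputStream in : sources) {
            if (in.available() > 0) { // caveat 1: available() may lie
                return in.read();
            }
        }
        try {
            Thread.sleep(10); // caveat 2: latency vs. CPU trade-off
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException(e);
        }
    }
}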
Your third option is to deal with the blocking reads by spawning one thread for each input stream. This however will require careful synchronization and possibly some overhead if you have a very high number of input streams to read from. The code below is a first attempt to solve it. I'm by no means certain that it is sufficiently synchronized, or that it manages the threads in the best possible way.
import java.io.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class MergedInputStream extends InputStream {
    AtomicInteger openStreamCount;
    BlockingQueue<Integer> buf = new ArrayBlockingQueue<Integer>(1);
    InputStream[] sources;

    public MergedInputStream(InputStream... sources) {
        this.sources = sources;
        openStreamCount = new AtomicInteger(sources.length);
        for (int i = 0; i < sources.length; i++)
            new ReadThread(i).start();
    }

    public void close() throws IOException {
        String ex = "";
        for (InputStream is : sources) {
            try {
                is.close();
            } catch (IOException e) {
                ex += e.getMessage() + " ";
            }
        }
        if (ex.length() > 0)
            throw new IOException(ex.substring(0, ex.length() - 1));
    }

    public int read() throws IOException {
        if (openStreamCount.get() == 0)
            return -1;
        try {
            return buf.take();
        } catch (InterruptedException e) {
            throw new IOException(e);
        }
    }

    private class ReadThread extends Thread {
        private final int src;

        public ReadThread(int src) {
            this.src = src;
        }

        public void run() {
            try {
                int data;
                while ((data = sources[src].read()) != -1)
                    buf.put(data);
            } catch (IOException ioex) {
                // source failed: treat it as end-of-stream
            } catch (InterruptedException e) {
                // interrupted while blocked on the queue: stop reading
            }
            openStreamCount.decrementAndGet();
        }
    }
}
I can think of three ways to do this:
Use non-blocking I/O (API documentation). This is the cleanest solution.
Multiple threads, one for each merged input stream. The threads would block on the read() method of the associated input stream, then notify the MergeInputStream object when data becomes available. The read() method in MergedInputStream would wait for this notification, then read data from the corresponding stream.
Single thread with a busy loop. Your MergeInputStream.read() methods would need to loop checking the available() method of every merged input stream. If no data is available, sleep a few ms. Repeat until data becomes available in one of the merged input streams.

RAII in Java... is resource disposal always so ugly?

I just played with the Java file system API, and came up with the following function, used to copy binary files. The original source came from the Web, but I added try/catch/finally clauses to be sure that, should something go wrong, the buffered streams would be closed (and thus my OS resources freed) before quitting the function.
I trimmed down the function to show the pattern:
public static void copyFile(FileOutputStream oDStream, FileInputStream oSStream) throws etc...
{
    BufferedInputStream oSBuffer = new BufferedInputStream(oSStream, 4096);
    BufferedOutputStream oDBuffer = new BufferedOutputStream(oDStream, 4096);
    try
    {
        try
        {
            int c;
            while ((c = oSBuffer.read()) != -1) // could throw an IOException
            {
                oDBuffer.write(c); // could throw an IOException
            }
        }
        finally
        {
            oDBuffer.close(); // could throw an IOException
        }
    }
    finally
    {
        oSBuffer.close(); // could throw an IOException
    }
}
As far as I understand it, I cannot put the two close() in the finally clause because the first close() could well throw, and then, the second would not be executed.
I know C# has the Dispose pattern that would have handled this with the using keyword.
I know even better what the equivalent C++ code would look like (using a Java-like API):
void copyFile(FileOutputStream & oDStream, FileInputStream & oSStream)
{
    BufferedInputStream oSBuffer(oSStream, 4096);
    BufferedOutputStream oDBuffer(oDStream, 4096);
    int c;
    while ((c = oSBuffer.read()) != -1) // could throw an IOException
    {
        oDBuffer.write(c); // could throw an IOException
    }
    // I don't care about resources, as RAII handles them for me
}
I am missing something, or do I really have to produce ugly and bloated code in Java just to handle exceptions in the close() method of a Buffered Stream?
(Please, tell me I'm wrong somewhere...)
EDIT: Is it me, or when updating this page did I see both the question and all the answers decrease by one point in a couple of minutes? Is someone enjoying himself too much while remaining anonymous?
EDIT 2: McDowell offered a very interesting link I felt I had to mention here:
http://illegalargumentexception.blogspot.com/2008/10/java-how-not-to-make-mess-of-stream.html
EDIT 3: Following McDowell's link, I stumbled upon a proposal for Java 7 of a pattern similar to the C# using pattern: http://tech.puredanger.com/java7/#resourceblock . My problem is explicitly described. Apparently, even with the Java 7 proposal, the problems remain.
The try/finally pattern is the correct way to handle streams in most cases for Java 6 and lower.
Some are advocating silently closing streams. Be careful doing this for these reasons: Java: how not to make a mess of stream handling
Java 7 introduces try-with-resources:
/** transcodes text file from one encoding to another */
public static void transcode(File source, Charset srcEncoding,
                             File target, Charset tgtEncoding)
        throws IOException {
    try (InputStream in = new FileInputStream(source);
         Reader reader = new InputStreamReader(in, srcEncoding);
         OutputStream out = new FileOutputStream(target);
         Writer writer = new OutputStreamWriter(out, tgtEncoding)) {
        char[] buffer = new char[1024];
        int r;
        while ((r = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, r);
        }
    }
}
AutoCloseable types will be automatically closed:
public class Foo {
    public static void main(String[] args) {
        class CloseTest implements AutoCloseable {
            public void close() {
                System.out.println("Close");
            }
        }
        try (CloseTest closeable = new CloseTest()) {}
    }
}
There are issues, but the code you found lying about on the web is really poor.
Closing the buffered streams closes the stream underneath. You really don't want to do that. All you want to do is flush the output stream. Also, there's no point in specifying that the underlying streams are for files. Performance sucks because you are copying one byte at a time (if you use java.nio you can use transferTo/transferFrom, which is a bit faster still). While we are about it, the variable names suck too. So:
public static void copy(
    InputStream in, OutputStream out
) throws IOException {
    byte[] buff = new byte[8192];
    for (;;) {
        int len = in.read(buff);
        if (len == -1) {
            break;
        }
        out.write(buff, 0, len);
    }
}
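As for the transferTo/transferFrom remark above, here is a sketch of a file-to-file copy via FileChannel; it assumes the streams really are file streams:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

public static void copy(FileInputStream fin, FileOutputStream fout)
        throws IOException {
    FileChannel in = fin.getChannel();
    FileChannel out = fout.getChannel();
    long pos = 0;
    long size = in.size();
    while (pos < size) {
        pos += in.transferTo(pos, size - pos, out); // may copy in chunks
    }
}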
If you find yourself using try-finally a lot, then you can factor it out with the "execute around" idiom.
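A sketch of that idiom; the helper interface and method are invented names. The try/finally is written once, and callers supply only the body:

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

interface OutputStreamBlock {
    void use(OutputStream out) throws IOException;
}

static void withFileOutput(File file, OutputStreamBlock block)
        throws IOException {
    OutputStream out = new FileOutputStream(file);
    try {
        block.use(out);
        out.flush();
    } finally {
        out.close();
    }
}

// Usage: withFileOutput(target, out -> out.write(data));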
In my opinion: Java should have some way of closing resources at the end of a scope. I suggest adding private as a unary postfix operator to close at the end of the enclosing block.
Unfortunately, this type of code tends to get a bit bloated in Java.
By the way, if one of the calls to oSBuffer.read or oDBuffer.write throws an exception, then you probably want to let that exception propagate up the call hierarchy.
Having an unguarded call to close() inside a finally-clause will cause the original exception to be replaced by one produced by the close()-call. In other words, a failing close()-method may hide the original exception produced by read() or write(). So, I think you want to ignore exceptions thrown by close() if and only if the other methods did not throw.
I usually solve this by including an explicit close-call, inside the inner try:
try {
    while (...) {
        read...
        write...
    }
    oSBuffer.close(); // exception NOT ignored here
    oDBuffer.close(); // exception NOT ignored here
} finally {
    silentClose(oSBuffer); // exception ignored here
    silentClose(oDBuffer); // exception ignored here
}

static void silentClose(Closeable c) {
    try {
        c.close();
    } catch (IOException ie) {
        // Ignored; caller must have this intention
    }
}
Finally, for performance, the code should probably work with buffers (multiple bytes per read/write). Can't back that by numbers, but fewer calls should be more efficient than adding buffered streams on top.
Yes, that's how java works. There is control inversion - the user of the object has to know how to clean up the object instead of the object itself cleaning up after itself. This unfortunately leads to a lot of cleanup code scattered throughout your java code.
C# has the "using" keyword to automatically call Dispose when an object goes out of scope. Java has no such thing.
For common IO tasks such as copying a file, code such as that shown above is reinventing the wheel. Unfortunately, the JDK doesn't provide any higher level utilities, but apache commons-io does.
For example, FileUtils contains various utility methods for working with files and directories (including copying). On the other hand, if you really need to use the IO support in the JDK, IOUtils contains a set of closeQuietly() methods that close Readers, Writers, Streams, etc. without throwing exceptions.
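A sketch of both utilities in use, assuming commons-io is on the classpath (the file names are placeholders):

import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;

public class CommonsIoExample {
    public static void main(String[] args) throws IOException {
        // One-call copy, buffering and cleanup included.
        FileUtils.copyFile(new File("in.txt"), new File("out.txt"));

        // closeQuietly() swallows the close-time IOException for you.
        Reader reader = new FileReader("out.txt");
        try {
            // ... read ...
        } finally {
            IOUtils.closeQuietly(reader);
        }
    }
}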
