Java: How can I read Reader multiple times? - java

If I call IOUtils.toString(reader); it returns correct string value. Second call returns "". Reset does not supported by reader
How can I solve this situation?

You can't make the Reader "re-readable" if it doesn't support mark() and reset(). But you could use the String returned from the call you've shown to create a StringReader any number of times, and read those as needed (or use mark() and reset() on a single instance to re-read it as needed.)

Use a java.io.Reader that does support reset, such as CharArrayReader (see http://download.oracle.com/javase/6/docs/api/java/io/CharArrayReader.html).
A BufferedReader also supports reset() of a limited number of characters if a mark is set.
More generally the markSupported method indicates whether the implementation of Reader you are using supports mark/reset (thanks to comment from Bala R pointing that out).

Related

FilterReader Class's mark method

public void mark(int readAheadLimit)
throws IOException
FilterReader class in Java.io package has a mark method that marks the next element, what is the use of that parameter in that method.
JavaDocs - Limit on the number of characters that may be read while still preserving the mark. After reading this many characters, attempting to reset the stream may fail
What does that mean . Examples and explanations are appreciated about failing reset and what does that parameter do!!
There is a better explanation in InputStream.mark. A reader that supports mark() should mirror this behavior, e.g. by delegating it to the underlying InputStream:
Marks the current position in this input stream. A subsequent call to the reset method repositions this stream at the last marked position so that subsequent reads re-read the same bytes.
The readlimit arguments tells this input stream to allow that many bytes to be read before the mark position gets invalidated.
The general contract of mark is that, if the method markSupported returns true, the stream somehow remembers all the bytes read after the call to mark and stands ready to supply those same bytes again if and whenever the method reset is called. However, the stream is not required to remember any data at all if more than readlimit bytes are read from the stream before reset is called.
Marking a closed stream should not have any effect on the stream.
The mark method of InputStream does nothing.
So, the parameter tells the mark() method how large the buffer for remembering elements needs to be. This allows it to allocate a buffer of appropriate size, if needed.

ObjectInputStream available() method doesn't work as expected (Java)

I've been trying to figure out why a method I've written to read objects from a file didn't work and realized that the available() method of ObjectInputStream gave 0 even though the file wasn't fully read.
The method did work after I've used the FileInputStream available() method instead to determine the EOF and it worked!
Why doesn't the method work for ObjectInputStram while it works for FileInputStream?
Here's the code:
public static void getArrFromFile() throws IOException, ClassNotFoundException {
Product p;
FileInputStream in= new FileInputStream(fName);
ObjectInputStream input= new ObjectInputStream(in);
while(in.available()>0){
p=(Product)input.readObject();
if (p.getPrice()>3000)
System.out.println(p);
}
input.close();
P.S-
I've read that I should use the EOF exception instead of available() for this, but I just wanna know why this doesn't work.
Thanks a lot!!!
Because, as the javadoc tells, available() returns an estimation of the number of bytes that can be read without blocking. The base InputStream implementation always returns 0, because this is a valid estimation. But whatever it returns, the fact that it returns 0 doesn't mean that there is nothing to read anymore. Only that the stream can't guarantee that at least one byte can be read without blocking.
Although this is not documented clearly I have realized from experience that it has to do with dynamic data. If your class only contains statically typed data, then available() is able to estimate the size. If there are dynamic data in your object, like lists etc, then it is not possible to make that estimation.
The available() method just tells how many bytes can be read without blocking. It's not very useful in regular code, but people see the name and erroneously think it does something else.
So in short: don't use available(), it's not the right method to use. Streams indicate ending differently, such as returning -1 or in ObjectInputStream's case, throwing an EOFException.
I found some trick! If you still want to use .available() to read all objects to the end, you can add an integer (ex: out.writeInt(0)) before adding each Object (ex: out.writeObject(obj)) when you write to the file and also read integer before reading each Object. So .available() can read byte left on file and won't be crash! Hope it helps you!
Use available function of InputStream instead of ObjectInputStream. Then if there is any data, get them as an object.
Something like:
if(inputStreamObject.available() > 0){
Object anyName = objectInputStreamObject.readObject();
}
You can get the inputStreamObject directly from the Socket.
I used it and the problem solved.

PrintWriter autoflush puzzling logic

public PrintWriter(OutputStream out, boolean autoFlush):
out - An output stream
autoFlush - A boolean; if true, the println, printf, or format methods
will flush the output buffer
public PrintStream(OutputStream out, boolean autoFlush):
out - The output stream to which values and objects will be printed
autoFlush - A boolean; if true, the output buffer will be flushed
whenever a byte array is written, one of the println methods is invoked,
or a newline character or byte ('\n') is written
What was the reason for changing autoflush logic between these classes?
Because they are always considered as identical except for encoding moments and "autoflush" without flushing on print() hardly corresponds to principle of least astonishment, silly bugs occur:
I created a PrintWriter with autoflush on; why isn't it autoflushing?
I think the answer lies in the history of Java. The trio InputStream, OutputStream and PrintStream in java.io date back to Java 1.0. That is before serious support for file encodings and character sets were built into the language.
To quote the Javadoc:
"A PrintStream adds functionality to
another output stream, namely the
ability to print representations of
various data values conveniently. Two
other features are provided as well.
Unlike other output streams, a
PrintStream never throws an
IOException; instead, exceptional
situations merely set an internal flag
that can be tested via the checkError
method..."
To summarize, it is a convenience for generating textual output, grafted on top of lower level IO.
In Java 1.1, Reader, Writer and PrintWriter were introduced. Those all support character sets. While InputStream and OutputStream still had a real uses (raw data processing), PrintStream became far less relevant, because printing by nature is about text.
The Javadoc for PrintWriter explicitly states:
Unlike the PrintStream class, if
automatic flushing is enabled it will
be done only when one of the println()
methods is invoked, rather than
whenever a newline character happens
to be output. The println() methods
use the platform's own notion of line
separator rather than the newline
character.
Put another way, PrintWriter should only be used through the print*(...) APIs, because writing newline characters etc should not be the caller's responsibility, the same way dealing with file encodings and character sets are not the caller's responsibility.
I would argue that PrintWriter should have been java.io.Printer instead, and not have extended Writer. I don't know whether they extended to mimic PrintStream, or because they were stuck on maintaining the pipe design idiom.
The reason it wasn't the same at start was probably just an accident. Now it's backwards compatibility.

Java's [Input|Output]Streams' one-method-call-for-each-byte: a performance problem?

[Input|Output]Streams exist since JDK1.0, while their character-counterparts Readers|Writers exist since JDK1.1.
Most concepts seem similar, with one exception: the base classes of streams declare an abstract method which processes one single byte at a time, while base readers/writers classes declare an abstract method which processes whole char-arrays.
Thus, given that I understand it correctly, every overridden stream class is limited to process single bytes (thereby performing at least one method call for each byte!), while overridden readers/writers only need a method call per array(-buffer).
Isn't that a huge performance problem?
Can a stream be implemented as subclass of either InputStream or OutputStream, but nevertheless be based on byte-arrays?
Actually, subclasses of InputStream have to override the method that reads a single byte at a time, but also can override others methods that read while byte-arrays. I think that's actually the case for most of the Input/Output streams.
So this isn't much of a performance problem, in my opinion, and yes, you can extend Input/Output stream and be based on byte-arrays.
Single byte reading is almost always a huge performance problem. But if you read the API docs of InputStream, you see that you HAVE TO override read(), but SHOULD also override read(byte[],int,int). Most code that uses any kind of InputStream calls the array style method anyway, but the default implementation of that function is just implemented by calling read() for every byte, so the negative performance impact.
For OutputStream the same holds.
As Daniel said, you must override read(), since client might use it directly, but you also should override read (byte[],int,int).
However, I doubt if you should be concerned with performance since the jvm can and will inline that method for you. Most of all, it doesn't seem like an issue to me.
Also, most readers use some underlying input stream behind the scene, so in any case those char-array based methods end up calling read(byte[],int,int) or even read() directly.
Note that Readers/Writers are for reading characters which can be made of more than one byte such as unicode characters. Streams on the other hand are more suitable when you're dealing with non-string (binary) data.
Apart from that, InputStream and OutputStream also have methods to read/write an entire array of bytes.
performance wise, if you wrap it with BufferedInputStream, JVM should be able to optimize the overhead of single byte read method calls to nothing, i.e. it's as fast as if you do the buffering yourself.

Why use BufferedReader in this case?

What is the difference between using a BufferedReader around the StringReader in the following code vs using the StringReader only? By loading up the DOM in line 2 of both examples, it seems like the BufferedReader is not necessary?
InputSource is = new InputSource(new StringReader(html));
Document dom = XMLResource.load(is).getDocument();
VS
InputSource is = new InputSource(new BufferedReader(new StringReader(html)));
Document dom = XMLResource.load(is).getDocument();
In this particular case, I see no benefit. In general there are two benefits:
The oh-so-handy readLine() method is only defined in BufferedReader rather than Reader (irrelevant here)
BufferedReader reduces IO where individual calls to the underlying reader are potentially expensive (i.e. fewer chunky calls are faster than lots of little ones) - again, irrelevant for StringReader
Cut and paste fail?
EDIT: My original answer below. The below isn't relevant in this case, since the buffered reader is wrapping a StringReader, which wraps a String. So there's no buffering to be performed, and the BufferedReader appears to be redundant. You could make an argument for using best/consistent practises, but it would be pretty tenuous.
Possibly the result of a copy/paste, or perhaps an IDE-driven refactor too far!
BufferedReader will attempt to read in a more optimal fashion.
That is, it will read larger chunks of data in one go (in a configurable amount), and then make available as required. This will reduce the number of reads from disk (etc.) at the expense of some memory usage.
To quote from the Javadoc:
In general, each read request made of
a Reader causes a corresponding read
request to be made of the underlying
character or byte stream. It is
therefore advisable to wrap a
BufferedReader around any Reader whose
read() operations may be costly, such
as FileReaders and InputStreamReaders
The BufferedReader version was copied from some code that used to read from a FileReader?

Categories

Resources