How to buffer OutputStream without any buffer limits in Java?

This case is a bit complex; I hope I can simplify it well.
My task starts when I receive a PrintStream to which I am supposed to output some data. However, the whole task is calculation plus printing, and I can only print once the calculation is done. So this could be a two-pass task, but I am hoping for one pass.
To achieve this, I would like to create some output buffer, do the calculation and printing (to the buffer), and then write from the buffer to the real output stream.
So far so good; the problem is that I am unable to find an appropriate class for buffering. BufferedOutputStream, if I understand correctly, starts writing from the buffer once the buffer is full. I need much stricter control over it: nothing is written to the real output until I explicitly say so.
Question: is there any class appropriate for this task?

You could use ByteArrayOutputStream as your buffer. The byte array where this stream writes is enlarged automatically to hold everything you write.
When you are done generating output, just call the writeTo method to write the contents of the buffer to an output stream that writes to some actual device.
For further details see http://docs.oracle.com/javase/6/docs/api/java/io/ByteArrayOutputStream.html
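A minimal sketch of that approach (the class and method names here are illustrative, not from the question): everything is printed into an in-memory ByteArrayOutputStream, which grows without any fixed limit, and nothing reaches the real stream until writeTo is called.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintStream;

public class BufferedCalculation {
    // Calculate and print into an in-memory buffer, then copy the whole
    // buffer to the real output stream in one step.
    static void calculateAndPrint(PrintStream realOut) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream(); // grows as needed
        PrintStream bufferedOut = new PrintStream(buffer, false, "UTF-8");

        // Do the calculation, printing to the buffer as we go.
        int sum = 0;
        for (int i = 1; i <= 10; i++) {
            sum += i;
            bufferedOut.println("partial sum after " + i + ": " + sum);
        }
        bufferedOut.flush();

        // Nothing has reached realOut yet; now emit the buffered output at once.
        buffer.writeTo(realOut);
    }

    public static void main(String[] args) throws IOException {
        calculateAndPrint(System.out);
    }
}
```

Note that this keeps the whole output in memory, which is fine as long as the generated data comfortably fits in the heap.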

From the Javadoc of the flush method:
Flushes this buffered output stream. This forces any buffered output
bytes to be written out to the underlying output stream.

Related

Buffer and File in Java

I'm new to Java and I want to ask what the difference is between using FileReader/FileWriter and using BufferedReader/BufferedWriter. Apart from speed, is there any other reason to use the Buffered variants?
In code that copies a file and writes its content into another file, is it better to use BufferedReader and BufferedWriter?
The short version is: an unbuffered writer/reader pushes every single write/read straight through to the underlying file, which is simple but inefficient, while a buffered writer/reader saves up writes/reads and performs them in chunks (based on the buffer size), which requires far fewer system calls and is far more efficient. The trade-off is that written data may sit in the buffer until it is flushed.
So to answer your question, a buffered writer/reader is generally best, especially if you are not sure on which one to use.
Take a look at the JavaDoc for the BufferedWriter, it does a great job of explaining how it works:
In general, a Writer sends its output immediately to the underlying
character or byte stream. Unless prompt output is required, it is
advisable to wrap a BufferedWriter around any Writer whose write()
operations may be costly, such as FileWriters and OutputStreamWriters.
For example,
PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter("foo.out")));
will buffer the PrintWriter's output to the file. Without buffering,
each invocation of a print() method would cause characters to be
converted into bytes that would then be written immediately to the
file, which can be very inefficient.
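To the copy question above, a minimal sketch of a line-by-line copy with buffered streams (file names here are just placeholders):

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class BufferedCopy {
    // Copy a text file line by line using buffered reader/writer.
    static void copy(String src, String dst) throws IOException {
        try (BufferedReader in = new BufferedReader(new FileReader(src));
             BufferedWriter out = new BufferedWriter(new FileWriter(dst))) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(line);
                out.newLine(); // readLine() strips the terminator, so add one back
            }
        } // try-with-resources closes (and therefore flushes) both streams
    }

    public static void main(String[] args) throws IOException {
        Files.write(Path.of("foo.txt"), "hello\nworld\n".getBytes());
        copy("foo.txt", "foo-copy.txt");
        System.out.print(Files.readString(Path.of("foo-copy.txt")));
    }
}
```

For a byte-for-byte copy (e.g. of a binary file) you would use BufferedInputStream/BufferedOutputStream instead, since readers and writers are for character data.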

FileChannel read behaviour [duplicate]

For example I have a file whose content is:
abcdefg
then I use the following code to read 'defg':
FileChannel channel = FileChannel.open(Paths.get("file.txt"), StandardOpenOption.READ); // opening the channel, omitted in the original snippet
ByteBuffer bb = ByteBuffer.allocate(4);
int read = channel.read(bb, 3); // positional read starting at byte offset 3
assert read == 4;
There is adequate data in the file, so can I assume that? Can I assume that the method returns a number less than the limit of the given buffer only when there aren't enough bytes left in the file?
Can I assume that the method returns a number less than limit of the given buffer only when there aren't enough bytes in the file?
The Javadoc says:
a read might not fill the buffer
and gives some examples, and
returns the number of bytes read, possibly zero, or -1 if the channel has reached end-of-stream.
This is NOT sufficient to allow you to make that assumption.
In practice, you are likely to always get a full buffer when reading from a file, modulo the end of file scenario. And that makes sense from an OS implementation perspective, given the overheads of making a system call.
But I can also imagine situations where returning a partially filled buffer might make sense. For example, when reading from a locally-mounted remote file system over a slow network link, there is some advantage in returning a partially filled buffer so that the application can start processing the data sooner. Some future OS may implement the read system call to do that in this scenario. If you assume that you will always get a full buffer, you may get a surprise when your application runs on that (hypothetical) new platform.
Another issue is that there are some kinds of stream where you will definitely get partially filled buffers. Socket streams, pipes and console streams are obvious examples. If you code your application assuming file stream behavior, you could get a nasty surprise when someone runs it against another kind of stream ... and fails.
No, in general you cannot assume that the number of bytes read will be equal to the number of bytes requested, even if there are bytes left to be read in the file.
If you are reading from a local file, chances are that the number of bytes requested will actually be read, but this is by no means guaranteed (and won't likely be the case if you're reading a file over the network).
See the documentation for the ReadableByteChannel.read(ByteBuffer) method (which applies for FileChannel.read(ByteBuffer) as well). Assuming that the channel is in blocking mode, the only guarantee is that at least one byte will be read.
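Given those answers, the safe pattern is to loop until the buffer is full or end-of-file is reached, rather than relying on a single read() call. A minimal sketch (the helper name readFully is illustrative):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ReadFully {
    // Keep calling read() until the buffer is full or the channel hits
    // end-of-file; never assume one call delivers everything requested.
    static int readFully(FileChannel channel, ByteBuffer bb, long position) throws IOException {
        int total = 0;
        while (bb.hasRemaining()) {
            int n = channel.read(bb, position + total);
            if (n < 0) break; // end-of-file before the buffer filled
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.write(Paths.get("sample.txt"), "abcdefg".getBytes());
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
            ByteBuffer bb = ByteBuffer.allocate(4);
            int read = readFully(ch, bb, 3);
            System.out.println(read + " " + new String(bb.array(), 0, read));
        }
    }
}
```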

Is it more effective to buffer an output stream than an input stream in Java?

Being bored earlier today I started thinking a bit about the relative performance of buffered and unbuffered byte streams in Java. As a simple test, I downloaded a reasonably large text file and wrote a short program to determine the effect that buffered streams has when copying the file. Four tests were performed:
Copying the file using unbuffered input and output byte streams.
Copying the file using a buffered input stream and an unbuffered output stream.
Copying the file using an unbuffered input stream and a buffered output stream.
Copying the file using buffered input and output streams.
Unsurprisingly, using buffered input and output streams is orders of magnitude faster than using unbuffered streams. However, the really interesting thing (to me at least) was the difference in speed between cases 2 and 3. Some sample results are as follows:
Unbuffered input, unbuffered output
Time: 36.602513585
Buffered input, unbuffered output
Time: 26.449306847
Unbuffered input, buffered output
Time: 6.673194184
Buffered input, buffered output
Time: 0.069888689
For those interested, the code is available here at Github. Can anyone shed any light on why the times for cases 2 and 3 are so asymmetric?
When you read a file, the filesystem and devices below it do various levels of caching. They almost never read one byte at at time; they read a block. On a subsequent read of the next byte, the block will be in cache and so will be much faster.
It stands to reason then that if your buffer size is the same size as your block size, buffering the input stream doesn't actually gain you all that much (it saves a few system calls, but in terms of actual physical I/O it doesn't save you too much).
When you write a file, the filesystem can't cache for you because you haven't given it a backlog of things to write. It could potentially buffer the output for you, but it has to make an educated guess at how often to flush the buffer. By buffering the output yourself, you let the device do much more work at once because you manually build up that backlog.
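A minimal sketch of case 3 from the question (unbuffered input, buffered output), to make the asymmetry concrete: every read() is a separate call into the stream, but writes accumulate in an in-memory buffer and hit the file in large chunks. The buffer size shown is just an assumption.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CopyBufferedOut {
    // Case 3: unbuffered input, buffered output. Each read() goes straight
    // to the stream, but writes are batched into 8 KiB chunks.
    static void copy(String src, String dst) throws IOException {
        try (InputStream in = new FileInputStream(src);
             OutputStream out = new BufferedOutputStream(new FileOutputStream(dst), 8192)) {
            int b;
            while ((b = in.read()) != -1) {
                out.write(b); // lands in the in-memory buffer, not on disk yet
            }
        } // closing the BufferedOutputStream flushes the final partial buffer
    }

    public static void main(String[] args) throws IOException {
        try (FileOutputStream f = new FileOutputStream("in.bin")) {
            f.write(new byte[]{1, 2, 3, 4, 5});
        }
        copy("in.bin", "out.bin");
        System.out.println(new File("out.bin").length());
    }
}
```

Swapping which side the BufferedInputStream/BufferedOutputStream wraps reproduces the other cases from the benchmark.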
To your title question: yes, it is more effective to buffer the output. One reason for this is the way hard disk drives (HDDs) write data to their sectors, especially on fragmented disks. Reading is faster because the disk already knows where the data is, whereas when writing it has to work out where the data will fit. With a buffer, the drive can allocate a larger contiguous free region for the data than it can in the unbuffered case.
Run another test for giggles. Create a new partition on your disk and run your tests reading and writing to the clean slate. To compare apples to apples, format the newly created partition between tests. Please post your numbers after this if you run the tests.
Generally, writing is more work for the computer because it cannot cache writes the way it caches reads. It is much like real life: reading is faster and easier than writing!

Java: I/O, read() will not fill buffer?

I am learning about I/O, files and sockets, and I don't understand the meaning of this sentence:
read will not always fill a buffer
What does it mean? Anyone has some explanation for me?
"read will not always fill a buffer"
The above sentence means that the buffer you pass to read has a certain size, but a single read call may deliver fewer bytes than that: if less data is currently available than the buffer can hold, the call returns with a partially filled buffer instead of waiting for it to fill.
For further details read the SCJP Programmer guide by Kathy Sierra or Thinking in Java's I/O chapter.
The read() method accepts a byte array that it will fill from the stream or reader.
If there is not enough data available to fill the buffer, it can either
wait until enough data is available
return immediately but only provide the available data without filling the buffer completely.
The standard implementation does a mixture of both: it blocks until at least one byte is available, then returns whatever is available.
Note: the second case implies that read() may return with fewer bytes than the buffer can hold.
It will block until at least one byte is available, and return the number of bytes that can be read at that point without blocking again. See the Javadoc.
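Because of this, code that needs an exactly-full buffer has to loop. A minimal sketch of such a loop (the helper name readFully is illustrative; since Java 9, InputStream.readNBytes does essentially the same job):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadLoop {
    // read(byte[], off, len) may return fewer bytes than requested;
    // loop until the buffer is full or the stream ends.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) break; // end of stream
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("hello world".getBytes());
        byte[] buf = new byte[8];
        int n = readFully(in, buf);
        System.out.println(n + " " + new String(buf, 0, n));
    }
}
```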

