Is there any advantage in wrapping a BufferedOutputStream around a ByteArrayOutputStream instead of just using the ByteArrrayOutputStream by itself?
Generally BufferedOutputStream wrapper is mostly used to avoid frequent disk or network writes. It can be much more expensive to separately write a lot of small pieces than make several rather large operations. The ByteArrayOutputStream operates in memory, so I think the wrapping is pointless.
If you want to know the exact answer, try to create a simple performance-measuring application.
Absolutely none. Though BufferedWriter and BufferedReader do offer extra functionality were you to be operating on strings.
ByteArrayOutputStream is not recommended if you want to get high performance, but one interesting feature is to send a message with unknown length. For a better comprehension about how these two methods work, see http://java-performance.info/java-io-bytearrayoutputstream/.
Related
In a Spring Boot app, I am reading csv file data using OpenCSV and it is possible to use FileReader to BufferedReader with it. However, when I compare both of them, I have a dilemma for the following point:
BufferedReader is faster than FileReader, but it uses much more memory.
As I am reading multiple data file (having hundreds of thousands records) in the same method (first I read data from one csv and then use the retrieved id fields to read the second csv), I think I shouldn't use BufferedReader for less memory usage. But I am really not sure what is the most proper way.
So, in this situation, Should I prefer FileReader to BufferedReader?
Generally speaking, depends on your constraints. If performance is an issue, allocate more resources and go for the faster solution. If memory is an issue, do the reverse.
With BufferedReader you can also use the reader and int constructor to set buffer size, which suits your needs.
BufferedReader reader = new BufferedReader(Reader, bufferSize);
Another general rule of thumb, don't do premature optimizations, be it memory or performance. Strive for clean code, if a problem arises, use a profiler to identify the bottlenecks and then deal with them.
As far as i know the difference in size lies simply in the buffer size, which by default is 8k or 16k, so the difference is memory isn't huge; the most important thing is you remember to free the resources when you don' use them anymore, calling close(), also remember to do it incase of Exceptions
We can get a BufferedInputStream by decorating an FileInputStream. And Channel got from FileInputStream.getChannel can also read content into a Buffer.
So, What's the difference between BufferedInputStream and java.nio.Buffer? i.e., when should I use BufferedInputStream and when should I use java.nio.Buffer and java.nio.Channel?
Getting started with new I/O (NIO), an article excerpt:
A stream-oriented I/O system deals with data one byte at a time. An
input stream produces one byte of data, and an output stream consumes
one byte of data. It is very easy to create filters for streamed data.
It is also relatively simply to chain several filters together so that
each one does its part in what amounts to a single, sophisticated
processing mechanism. On the flip side, stream-oriented I/O is often
rather slow.
A block-oriented I/O system deals with data in blocks. Each operation
produces or consumes a block of data in one step. Processing data by
the block can be much faster than processing it by the (streamed)
byte. But block-oriented I/O lacks some of the elegance and simplicity
of stream-oriented I/O.
These classes where written at different times for different packages.
If you are working with classes in the java.io package use BufferedInputStream.
If you are using java.nio use the ByteBuffer.
If are not using either you could use a plain byte[]. ByteBuffer has some useful methods for working with primitives so you might use it for that.
It is unlikely there will be any confusion because in general you will only use one when you have to and in which case only one will compile when you do.
I think we use BufferedInputStream to wrap the InputStream to make it works like block-oriented. But when deal with too much data, it actually consume more time than the real block-oriented I/O (Channel), but still faster than the unwrapperd InputStream.
So if there is a java.io.StringBufferInputStream, you would think that there would be a StringBufferOutputStream.
Any ideas as to why there isn't??
Likewise,there is also a SequenceInputStream but no SequenceOutputStream.
My guess is that someone never got around to making a StringBufferOutputStream in Java 1.0 since the product was somewhat "rushed to market." By the time Java 1.1 rolled around and people actually understood that readers and writers were for characters, and inputstreams and outputstreams were for bytes, the whole concept of using streams for strings was realized to be wrong, so the StringBufferInputStream was rightly deprecated, with no chance ever of a partner coming along.
A SequenceInputStream is a nice way to read from a bunch of streams all concatenated together, but it doesn't make much sense to write a single stream to multiple streams. Well, I suppose you could make sense of this if you wanted to write a large stream into multiple partitions (reminds me of Hadoop here). It's just not common enough to be in a standard library. A complication here would be that you would need to specify the size of each output partition and would really only make sense for files (which can have names with increasing suffixes, perhaps), and so would not generalize into arbitrary output streams in a nice manner.
StringBufferInputStream is deprecated, because bytes and characters are not the same thing. The correct classes to use for this are StringReader and StringWriter.
If you think about it, there is no way to make a SequenceOutputStream work. SequenceInputStream reads from the first stream until it is exhausted, then reads from the next stream. Since an OutputStream is never exhausted (unless, say, it happens to be connected to a socket whose peer closes the connection), how would a SequenceOutputStream class know when to move on to the next stream?
StringBufferInputStream has long been depreciated.
Use StringReader and StringWriter.
I want to constantly write data to disc.
And I want to flush data to disc frequently (for example every chunk of 64MB). What solution can you propose?
I think standard OutputStream might be a better choice than nio.channels because it is more straightforward.
If you are writing a continuous stream of data, for example appending to the end of a file, regular OutputStream with flush() called once in a while is just as good or better than nio. Where nio could give you a big advantage would be writing many small chunks spread over different regions of a file. In that case you could use a memory mapped file and this could be an improvement over old-style writes. However, from the question I understand you are rather dealing with a continuous stream of data. I suggest you implement the regular solution which gives you code you find nicer and only search for alternatives if you find performance lacking. In this case I wouldn't expect nio to make noticeable difference.
I am reading from an InputStream.
and writing what I read into an outputStream.
I also check a few things.
Like if I read an
& (ampersand)
I need to write
"& amp;"
My code works. But now I wonder if I have written the most efficient way (which I doubt).
I read byte by byte. (but this is because I need to do odd modifications)
Can somebody who's done this suggest the fastest way ?
Thanks
If you are using BufferedInputStream and BufferedOutputStream then it is hard to make it faster.
BTW if you are processing the input as characters as opposed to bytes, you should use readers/writers with BufferedReader and BufferedWriter.
The code should be reading/writing characters with Readers and Writers. For example, if its in the middle of a UTF-8 sequence, or it gets the second half of a UCS-2 character and it happens to read the equivalent byte value of an ampersand, then its going to damage the data that its attempting to copy. Code usually lives longer than you would expect it to, and somebody might try to pick it up later and use it in a situation where this could really matter.
As far as being faster or slower, using a BufferedReader will probably help the most. If you're writing to the file system, a BufferedWriter won't make much of a difference, because the operating system will buffer writes for you and it does a good job. If you're writing to a StringWriter, then buffering will make no difference (may even make it slower), but otherwise buffering your writes ought to help.
You could rewrite it to process arrays; and that might make it faster. You can still do that with arrays. You will have to write more complicated code to handle boundary conditions. That also needs to be a factor in the decision.
Measure, don't guess, and be wary of opinions from people who aren't informed of all the details. Ultimately, its up to you ot figure out if its fast enough for this situation. There is no single answer, because all situations are different.
I would prefer to use BufferedReader for reading input and BufferedWriter for output. Using Regular Expressions for matching your input can make your code short and also improve your time complexity.