A BufferedInputStream that I have isn't marking correctly. This is my code:
public static void main(String[] args) throws Exception {
    byte[] b = "HelloWorld!".getBytes();
    BufferedInputStream bin = new BufferedInputStream(new ByteArrayInputStream(b));
    bin.mark(3);
    while (true) {
        byte[] buf = new byte[4096];
        int n = bin.read(buf);
        if (n == -1) break;
        System.out.println(n);
        System.out.println(new String(buf, 0, n));
    }
}
This is outputting:
11
HelloWorld!
I want it to output
3
Hel
8
loWorld!
I also tried the code with just a pure ByteArrayInputStream as bin, and it didn't work either.
I think you're misunderstanding what mark does.
The purpose of mark is to make the stream remember its current position so that you can return to it later using reset(). The argument isn't how many bytes will be read next; it's how many bytes you'll be able to read afterward before the mark is considered invalid (i.e., you won't be able to reset() back to it; you'll either get an exception or end up at the start of the stream instead).
See the docs on InputStream for details. Readers' mark methods work quite similarly.
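For example, here is a minimal sketch of what mark/reset is actually for, re-reading a section of the stream (it reuses the data from your question; the readlimit of 16 is arbitrary, just large enough to cover the bytes read before the reset):
import java.io.*;

public class MarkResetExample {
    public static void main(String[] args) throws IOException {
        byte[] b = "HelloWorld!".getBytes();
        BufferedInputStream bin = new BufferedInputStream(new ByteArrayInputStream(b));

        bin.mark(16);                // remember the current position
        byte[] buf = new byte[3];
        int n = bin.read(buf);
        System.out.println(new String(buf, 0, n)); // Hel

        bin.reset();                 // jump back to the marked position
        n = bin.read(buf);
        System.out.println(new String(buf, 0, n)); // Hel, again
    }
}
If what you actually want is to read 3 bytes and then the rest, pass the lengths to read explicitly (e.g. bin.read(buf, 0, 3)); mark plays no part in that.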
That's not what mark() does; re-read the documentation. mark() records a position so that reset() can take you backward through the stream.
BufferedInputStream#mark(int) takes as its argument the limit of bytes that can be read; once that many bytes have been read, the mark becomes invalidated.
In the OCP book mark(int) is described as:
...you can call mark(int) with a read-ahead limit value. You can then
read as many bytes as you want up to the limit value.
So the code below sets the limit to 1 byte; after reading that byte, the mark should be invalidated, and calling reset() should throw a RuntimeException, yet that is not happening. Is it the JVM that is somehow overriding the argument passed to the mark function?
import java.io.*;
import java.nio.file.*;

public class Main {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get("Java8_IOandNIO\\src\\main\\resources\\abcd.txt");
        File f = new File(path.toString());
        FileInputStream fis = new FileInputStream(f);
        BufferedInputStream bis = new BufferedInputStream(fis);
        bis.mark(1);
        System.out.println((char) bis.read());
        System.out.println((char) bis.read());
        System.out.println((char) bis.read());
        bis.reset();
        System.out.println("called reset");
        System.out.println((char) bis.read());
        System.out.println((char) bis.read());
        System.out.println((char) bis.read());
    }
}
Each time, the code prints the data from the sample file:
A
B
C
called reset
A
B
C
Well, the documentation (original contract from InputStream) states:
If the method mark has not been called since the stream was created, or the number of bytes read from the stream since mark was last called is larger than the argument to mark at that last call, then an IOException might be thrown.
(Emphasis mine)
This means that the limit is a recommendation. It is not mandatory that the mark will be invalidated after that number of bytes have been read.
Because:
the OCP book you quoted doesn't say anything about throwing a RuntimeException;
it doesn't say you can't necessarily read more;
the OCP isn't a normative reference;
the real normative reference doesn't say so either; and
the stream is buffered, so it can support a mark of up to its internal buffer size, which is currently 8192 bytes.
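To actually see the mark invalidated, construct the BufferedInputStream with a buffer smaller than the amount you read past the mark. A minimal sketch (a ByteArrayInputStream stands in for your file, and the 2-byte buffer is deliberately tiny, purely for illustration):
import java.io.*;

public class MarkInvalidation {
    public static void main(String[] args) throws IOException {
        InputStream source = new ByteArrayInputStream("ABCDEFGH".getBytes());
        // A 2-byte buffer cannot retain the marked data once we read past it.
        BufferedInputStream bis = new BufferedInputStream(source, 2);
        bis.mark(1);
        bis.read();
        bis.read();
        bis.read();  // forces a refill; buffer.length >= marklimit, so the mark is dropped
        bis.reset(); // throws IOException ("Resetting to invalid mark" in the reference JDK)
    }
}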
Dude, I'm using the following code to read a large file (2 MB or more) and do some business logic with the data.
I have to read 128 bytes on each read call.
At first I used this code (no problem, works fine):
InputStream is; // = something...
int read = -1;
byte[] buff = new byte[128];
while (true) {
    for (int idx = 0; idx < 128; idx++) {
        read = is.read();
        if (read == -1) { return; } // end of stream
        buff[idx] = (byte) read;
    }
    process_data(buff);
}
Then I tried this code, and the problems appeared (error! weird responses sometimes):
InputStream is; // = something...
int read = -1;
byte[] buff = new byte[128];
while (true) {
    // ERROR! Java doesn't read 128 bytes even though they're available
    if ((read = is.read(buff, 0, 128)) == 128) { process_data(buff); } else { return; }
}
The above code doesn't work all the time; I'm sure that much data is available, but it sometimes reads 127, 125, or 123 bytes. What is the problem?
I also found code for this that uses DataInputStream#readFully(byte[]), which works too, but I'm just wondering why the second solution doesn't fill the array while the data is available.
Thanks buddy.
Consulting the javadoc for FileInputStream (I'm assuming that's what you have, since you're reading from a file):
Reads up to len bytes of data from this input stream into an array of bytes. If len is not zero, the method blocks until some input is available; otherwise, no bytes are read and 0 is returned.
The key here is that the method only blocks until some data is available. The returned value tells you how many bytes were actually read. The reason you may be reading fewer than 128 bytes could be a slow drive or implementation-defined behavior; the contract only guarantees at least one byte per call (or -1 at end of stream).
For a proper read sequence, you should keep reading into the buffer, offsetting by what has been read so far, until the desired amount of data has arrived or read() returns -1 (end of stream).
Example of a proper implementation of your code:
InputStream is; // = something...
int read;
int read_total;
byte[] buf = new byte[128];
// Infinite loop
while (true) {
    read_total = 0;
    // Repeatedly read until break or end of stream, offsetting each read
    // at the last filled position in the array
    while ((read = is.read(buf, read_total, buf.length - read_total)) != -1) {
        // Add the amount read to the running total
        read_total = read_total + read;
        // Break if read_total has reached the buffer length (128)
        if (read_total == buf.length) {
            break;
        }
    }
    if (read_total != buf.length) {
        // Incomplete read before 128 bytes: end of stream, so stop
        return;
    } else {
        process_data(buf);
    }
}
Edit:
Don't try to use available() as an indicator of data availability (sounds weird, I know); again, from the javadoc:
Returns an estimate of the number of remaining bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. Returns 0 when the file position is beyond EOF. The next invocation might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
In some cases, a non-blocking read (or skip) may appear to be blocked when it is merely slow, for example when reading large files over slow networks.
The key there is estimate, don't work with estimates.
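Separately, the DataInputStream#readFully approach the question mentions wraps exactly this kind of loop for you. A rough sketch (process_data is assumed to be the method from the question; note that, like the question's first loop, this discards a partial final block):
import java.io.*;

static void readInBlocks(InputStream is) throws IOException {
    DataInputStream dis = new DataInputStream(is);
    byte[] buff = new byte[128];
    while (true) {
        try {
            dis.readFully(buff); // blocks until all 128 bytes have been read
        } catch (EOFException e) {
            return; // end of stream before a full 128-byte block
        }
        process_data(buff);
    }
}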
Since the accepted answer was provided, a new option has become available. Starting with Java 9, the InputStream class has a readNBytes(byte[], int, int) method (Java 11 added a readNBytes(int) overload) that eliminates the need for the programmer to write a read loop. For example, your method could look like:
public static void some_method(String[] args) throws IOException {
    InputStream is = new FileInputStream(args[1]);
    byte[] buff = new byte[128];
    while (true) {
        int numRead = is.readNBytes(buff, 0, buff.length);
        if (numRead == 0) {
            break;
        }
        // The last read before end-of-stream may read fewer than 128 bytes.
        process_data(buff, numRead);
    }
}
or, on Java 11 and later, the slightly simpler:
public static void some_method(String[] args) throws IOException {
    InputStream is = new FileInputStream(args[1]);
    while (true) {
        byte[] buff = is.readNBytes(128);
        if (buff.length == 0) {
            break;
        }
        // The last read before end-of-stream may read fewer than 128 bytes.
        process_data(buff);
    }
}
According to the Java documentation, the readlimit parameter of the mark method in the class InputStream serves to set "the maximum limit of bytes that can be read before the mark position becomes invalid."
I have a file named sample.txt whose content is "hello". And I wrote this code:
import java.io.*;

// Class renamed so it doesn't shadow java.io.InputStream
public class MarkDemo {
    public static void main(String[] args) throws IOException {
        InputStream reader = new FileInputStream("sample.txt");
        BufferedInputStream bis = new BufferedInputStream(reader);
        bis.mark(1);
        bis.read();
        bis.read();
        bis.read();
        bis.read();
        bis.reset();
        System.out.println((char) bis.read());
    }
}
The output is "h". But if I read more than one byte after calling mark, shouldn't I get an error for the invalid reset() call?
I would put this down to documentation error.
The method-level doc for BufferedInputStream.mark is "See the general contract of the mark method of InputStream", which to me indicates that BufferedInputStream does not behave differently, the parameter doc notwithstanding.
And the general contract, as specified by InputStream, is
The readlimit arguments tells this input stream to allow that many bytes to be read before the mark position gets invalidated [...] the stream is not required to remember any data at all if more than readlimit bytes are read from the stream
In other words, readlimit is a suggestion; the stream is free to under-promise and over-deliver.
If you look at the source, particularly the fill() method, you can see (after a while!) that it only invalidates the mark when it absolutely has to, i.e. it is more tolerant than the documentation might suggest.
...
else if (pos >= buffer.length)  /* no room left in buffer */
    if (markpos > 0) {          /* can throw away early part of the buffer */
        int sz = pos - markpos;
        System.arraycopy(buffer, markpos, buffer, 0, sz);
        pos = sz;
        markpos = 0;
    } else if (buffer.length >= marklimit) {
        markpos = -1;           /* buffer got too big, invalidate mark */
        pos = 0;                /* drop buffer contents */
...
The default buffer size is relatively large (8K), so invalidation won't be triggered in your example.
Looking at the implementation of BufferedInputStream, it describes the significance of the marker position in the JavaDocs (of the protected markpos field):
[markpos is] the value of the pos field at the time the last mark method was called.
This value is always in the range -1 through pos. If there is no marked position in the input stream, this field is -1. If there is a marked position in the input stream, then buf[markpos] is the first byte to be supplied as input after a reset operation. If markpos is not -1, then all bytes from positions buf[markpos] through buf[pos-1] must remain in the buffer array (though they may be moved to another place in the buffer array, with suitable adjustments to the values of count, pos, and markpos); they may not be discarded unless and until the difference between pos and markpos exceeds marklimit.
Hope this helps. Take a peek at the definitions of read, reset and the private method fill in the class to see how it all ties together.
In short, only when the class retrieves more data to fill its buffer will the mark position be taken into account. It will be correctly invalidated if more bytes are read than the call to mark allowed. As a result, calls to read will not necessarily trigger the behaviour advertised in the public JavaDoc comments.
This looks like a subtle bug. If you reduce the buffer size, you'll get an IOException:
public static void main(String[] args) throws IOException {
    InputStream reader = new ByteArrayInputStream(new byte[]{1, 2, 3, 4, 5, 6, 7, 8});
    BufferedInputStream bis = new BufferedInputStream(reader, 3);
    bis.mark(1);
    bis.read();
    bis.read();
    bis.read();
    bis.read();
    bis.reset();
    System.out.println((char) bis.read());
}
I've searched through all the questions I can find relating to PipedInputStreams and PipedOutputStreams and have not found anything that can help me. Hopefully someone here will have come across something similar.
Background:
I have a class that reads data from any java.io.InputStream. The class has a method called hasNext(), which checks the given InputStream for data, returning true if data is found, false otherwise. This hasNext() method works perfectly with other InputStreams but when I try to use a PipedInputStream (fed from a PipedOutputStream in a different Thread, encapsulated in the inputSupplier variable below), it hangs. After looking into how the hasNext() method works, I recreated the problem with the following code:
public static void main(String[] args) throws IOException {
    PipedInputStream inputSourceStream = new PipedInputStream(inputSupplier.getOutputStream());
    byte[] input = new byte[4096];
    int bytes_read = inputSourceStream.read(input, 0, 4096);
}
The inputSupplier is simply an instance of a small class I wrote that runs in its own thread with a local PipedOutputStream to avoid getting deadlocks.
The Problem
So, my problem is that the hasNext() method calls PipedInputStream.read() to ascertain whether there is any data to be read. This causes a blocking read operation that never exits until some data arrives. This means my hasNext() method will never return false (or return at all) if the stream is empty.
Disclaimer: I know about the available() method but all that tells me is that there are no bytes available, not that we are at the end of the stream (whatever implementation of a Stream that may be), and so read() is required to check this.
[Edit] The whole purpose of me initially using a PipedInputStream was to simulate a "bursty" source of data. That is, I need to have a Stream that I can write to sporadically to see if my hasNext() method will detect that there is new data on the Stream upon reading it. If there is a better way of doing this then I would be thrilled to hear it!
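One way to get that: a writer thread that writes in spurts and then closes its end of the pipe. read() blocks during the gaps, but returns -1 once the write side is closed, so a hasNext() built on read() can terminate. A rough sketch (the burst sizes and delays are made up):
import java.io.*;

public class BurstySourceDemo {
    public static void main(String[] args) throws Exception {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out);

        Thread writer = new Thread(() -> {
            try {
                for (int burst = 0; burst < 3; burst++) {
                    out.write(("burst-" + burst + "\n").getBytes());
                    out.flush();
                    Thread.sleep(500); // gap between bursts
                }
                out.close(); // without this, read() below would block forever
            } catch (IOException | InterruptedException e) {
                e.printStackTrace();
            }
        });
        writer.start();

        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) { // blocks during gaps, -1 after close
            System.out.print(new String(buf, 0, n));
        }
    }
}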
I hate to necro a question this old, but this is near the top of Google's results, and I just found a solution for myself: this circular byte buffer exposes in and out streams, and the read method returns -1 immediately when no data is present. A little bit of threading, and your test classes can provide data exactly the way you want.
http://ostermiller.org/utils/src/CircularByteBuffer.java.html
Edit
Turns out I misunderstood the documentation of the above class; it only returns -1 when a thread calling read() is interrupted. I made a quick mod to the read method that gives me what I want (original code commented out; the only new code is the substitution of an else for the else if):
@Override
public int read(byte[] cbuf, int off, int len) throws IOException {
    //while (true){
    synchronized (CircularByteBuffer.this) {
        if (inputStreamClosed) throw new IOException("InputStream has been closed; cannot read from a closed InputStream.");
        int available = CircularByteBuffer.this.available();
        if (available > 0) {
            int length = Math.min(len, available);
            int firstLen = Math.min(length, buffer.length - readPosition);
            int secondLen = length - firstLen;
            System.arraycopy(buffer, readPosition, cbuf, off, firstLen);
            if (secondLen > 0) {
                System.arraycopy(buffer, 0, cbuf, off + firstLen, secondLen);
                readPosition = secondLen;
            } else {
                readPosition += length;
            }
            if (readPosition == buffer.length) {
                readPosition = 0;
            }
            ensureMark();
            return length;
        //} else if (outputStreamClosed){
        } else { // << new line of code
            return -1;
        }
    }
    //try {
    //    Thread.sleep(100);
    //} catch (Exception x){
    //    throw new IOException("Blocking read operation interrupted.");
    //}
    //}
}
Java SE 1.4 and later come with the java.nio package, which is designed for non-blocking I/O (NIO.2 in Java 7 added asynchronous channels); that sounds like what you are describing.
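A sketch of what that buys you, using java.nio.channels.Pipe: a source channel in non-blocking mode returns 0 from read() when no data is present, instead of blocking (the data written here is arbitrary):
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

public class NonBlockingPipeDemo {
    public static void main(String[] args) throws Exception {
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false); // reads now return 0 instead of blocking

        ByteBuffer buf = ByteBuffer.allocate(4096);
        System.out.println(pipe.source().read(buf)); // 0: no data yet, but not end of stream

        pipe.sink().write(ByteBuffer.wrap("hello".getBytes()));
        System.out.println(pipe.source().read(buf)); // typically 5: the bytes just written
    }
}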
Why does the following method hang?
public void pipe(Reader in, Writer out) throws IOException {
    CharBuffer buf = CharBuffer.allocate(DEFAULT_BUFFER_SIZE);
    while (in.read(buf) >= 0) {
        out.append(buf.flip());
    }
}
Answering my own question: you have to call buf.clear() between reads. Presumably, read is hanging because the buffer is full. The correct code is
public void pipe(Reader in, Writer out) throws IOException {
    CharBuffer buf = CharBuffer.allocate(DEFAULT_BUFFER_SIZE);
    while (in.read(buf) >= 0) {
        out.append(buf.flip());
        buf.clear();
    }
}
I would assume that it is a deadlock. The in.read(buf) locks the CharBuffer and prevents the out.append(buf) call.
That is assuming that CharBuffer uses locks (of some kind) in the implementation. What does the API say about the class CharBuffer?
Edit: Sorry, some kind of short circuit in my brain... I confused it with something else.
CharBuffers don't work with Readers and Writers as cleanly as you might expect. In particular, there is no Writer.append(CharBuffer buf) method. The method called by the question snippet is Writer.append(CharSequence seq), which just calls seq.toString(). The CharBuffer.toString() method does return the string value of the buffer, but it doesn't drain the buffer. The subsequent call to Reader.read(CharBuffer buf) gets an already full buffer and therefore returns 0, forcing the loop to continue indefinitely.
Though this feels like a hang, it is in fact appending the first read's buffer contents to the writer every pass through the loop. So you'll either start to see a lot of output in your destination or the writer's internal buffer will grow, depending on how the writer is implemented.
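The no-draining behaviour is easy to demonstrate in isolation (a tiny sketch, independent of the pipe code):
import java.nio.CharBuffer;

public class ToStringDemo {
    public static void main(String[] args) {
        CharBuffer buf = CharBuffer.allocate(8);
        buf.put("hi");
        buf.flip();                          // ready for reading: position=0, limit=2
        System.out.println(buf.toString()); // "hi"
        System.out.println(buf.position()); // still 0: toString() consumed nothing
    }
}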
As annoying as it is, I'd recommend a char[] implementation if only because the CharBuffer solution winds up building at least two new char[] every pass through the loop.
public void pipe(Reader in, Writer out) throws IOException {
    char[] buf = new char[DEFAULT_BUFFER_SIZE];
    int count = in.read(buf);
    while (count >= 0) {
        out.write(buf, 0, count);
        count = in.read(buf);
    }
}
I'd recommend only using this if you need to support converting between two character encodings, otherwise a ByteBuffer/Channel or byte[]/IOStream implementation would be preferable even if you're piping characters.
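For completeness, the byte-stream version has the same shape; a minimal sketch (the method name and buffer size are my own):
public void pipe(InputStream in, OutputStream out) throws IOException {
    byte[] buf = new byte[8192];
    int count = in.read(buf);
    while (count >= 0) {
        out.write(buf, 0, count);
        count = in.read(buf);
    }
}
(On Java 9 and later, InputStream.transferTo(OutputStream) does this for you.)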