The BufferedInputStream#mark(int) method takes as its argument the limit of bytes that can be read before the mark becomes invalidated.
In the OCP book, mark(int) is described as:
...you can call mark(int) with a read-ahead limit value. You can then
read as many bytes as you want up to the limit value.
So the code below sets the limit value to 1 byte; after reading that byte, the mark should be invalidated, and calling the .reset() method should throw a RuntimeException, yet that is not happening. Is the JVM somehow overriding the argument passed to the mark method?
import java.io.*;
import java.nio.file.*;

public class Main {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get("Java8_IOandNIO\\src\\main\\resources\\abcd.txt");
        File f = new File(path.toString());
        FileInputStream fis = new FileInputStream(f);
        BufferedInputStream bis = new BufferedInputStream(fis);
        bis.mark(1);
        System.out.println((char) bis.read());
        System.out.println((char) bis.read());
        System.out.println((char) bis.read());
        bis.reset();
        System.out.println("called reset");
        System.out.println((char) bis.read());
        System.out.println((char) bis.read());
        System.out.println((char) bis.read());
    }
}
Each time, the code prints the following data from the sample file:
A
B
C
called reset
A
B
C
Well, the documentation (original contract from InputStream) states:
If the method mark has not been called since the stream was created, or the number of bytes read from the stream since mark was last called is larger than the argument to mark at that last call, then an IOException might be thrown.
(Emphasis mine)
This means that the limit is a recommendation. It is not mandatory that the mark will be invalidated after that number of bytes have been read.
Because:

- the OCP book you quoted doesn't say anything about throwing a RuntimeException;
- it doesn't say you can't read more than that;
- the OCP book isn't a normative reference;
- the real normative reference doesn't say so either; and
- the stream is buffered, so it can support a mark of up to its internal buffer size, which is currently 8192 bytes.
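To see the invalidation actually happen, you can shrink the internal buffer so that the readlimit matters. A minimal sketch (the class name, the "ABCD" input, and the 2-byte buffer size are just choices for the demo):

```java
import java.io.*;

public class MarkInvalidationDemo {
    public static void main(String[] args) throws IOException {
        // A 2-byte internal buffer instead of the 8192-byte default
        BufferedInputStream bis = new BufferedInputStream(
                new ByteArrayInputStream("ABCD".getBytes()), 2);
        bis.mark(1);  // readlimit of 1 byte
        bis.read();
        bis.read();
        bis.read();   // forces a refill; buffer (2 bytes) >= marklimit (1), so the mark is dropped
        try {
            bis.reset();
        } catch (IOException e) {
            System.out.println("reset failed: " + e.getMessage());
        }
    }
}
```

With the default 8192-byte buffer the same sequence of reads would reset successfully, which is exactly the behavior the question observed.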
Related
I think the only change between the two snippets below is the while loop around the read, so why does fileInputStream.read() print 255 in the first case but -1 in the second?
import java.io.*;

public class FileStream_byte1 {
    public static void main(String[] args) throws IOException {
        FileOutputStream fOutputStream = new FileOutputStream("FileStream_byte1.txt");
        fOutputStream.write(-1);
        fOutputStream.close();

        FileInputStream fileInputStream = new FileInputStream("FileStream_byte1.txt");
        int i;
        System.out.println(" " + fileInputStream.read());
        fileInputStream.close();
    }
}
// The result is 255
import java.io.*;

public class FileStream_byte1 {
    public static void main(String[] args) throws IOException {
        FileOutputStream fOutputStream = new FileOutputStream("FileStream_byte1.txt");
        fOutputStream.write(-1);
        fOutputStream.close();

        FileInputStream fileInputStream = new FileInputStream("FileStream_byte1.txt");
        int i;
        while ((i = fileInputStream.read()) != -1)
            System.out.println(" " + fileInputStream.read());
        fileInputStream.close();
    }
}
// The result is -1
The reason you read 255 (in the first case) despite writing -1 can be seen in the documentation of OutputStream.write(int) (emphasis mine):
Writes the specified byte to this output stream. The general contract for write is that one byte is written to the output stream. The byte to be written is the eight low-order bits of the argument b. The 24 high-order bits of b are ignored.
FileOutputStream gives no indication of changing that behavior.
Basically, InputStream.read and OutputStream.write(int) use ints to allow the use of unsigned "bytes". They both expect the int to be in the range 0-255 (the range of an unsigned byte). So although you called write(-1), it only writes the "eight low-order bits" of -1, which results in writing 255 to the file.
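The masking described in that quote can be checked directly; a tiny sketch (the class name is just for the demo):

```java
public class LowOrderBitsDemo {
    public static void main(String[] args) {
        int b = -1;
        // write(int) keeps only the eight low-order bits of its argument
        int written = b & 0xFF;
        System.out.println(written); // prints 255
    }
}
```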
The reason you get -1 in the second case is that you are calling read twice, but there is only one byte in the file, and you print the result of the second read.
From the documentation of InputStream.read (emphasis mine):
Reads the next byte of data from the input stream. The value byte is returned as an int in the range 0 to 255. If no byte is available because the end of the stream has been reached, the value -1 is returned. This method blocks until input data is available, the end of the stream is detected, or an exception is thrown.
You can find that out by reading the documentation, quoting:
"
public int read() throws IOException. It reads a byte of data from this input stream. This method blocks if no input is yet available. Specified by:
read in class InputStream. Returns: the next byte of data, or -1 if the
end of the file is reached. Throws: IOException - if an I/O error
occurs.
"
It returns -1 because it reaches the end of the file.
In your while loop, the condition reads the byte you wrote and enters the loop, but the println then calls read() again; there are no more bytes, so that second call returns -1, which is what gets printed. On the next iteration the condition itself reads -1, and the loop exits.
In your code you are calling read twice: the first call returns 255, and the next returns -1, indicating that the stream has ended.
Try:

while ((i = fileInputStream.read()) != -1)
    System.out.println(" " + i);
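Putting it together, a corrected version of the whole program (a sketch using the file name from your snippet, with try-with-resources for cleanup) prints 255 exactly once:

```java
import java.io.*;

public class FileStreamFixed {
    public static void main(String[] args) throws IOException {
        try (FileOutputStream out = new FileOutputStream("FileStream_byte1.txt")) {
            out.write(-1); // only the eight low-order bits are written: 0xFF
        }
        try (FileInputStream in = new FileInputStream("FileStream_byte1.txt")) {
            int i;
            while ((i = in.read()) != -1) {
                System.out.println(" " + i); // prints 255, then the loop ends
            }
        }
    }
}
```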
Dude, I'm using the following code to read up a large file (2 MB or more) and do some business with the data.
I have to read 128 bytes on each read call.
At first I used this code (no problem, it works fine):
InputStream is; // = something...
int read = -1;
byte[] buff = new byte[128];
while (true) {
    for (int idx = 0; idx < 128; idx++) {
        read = is.read();
        if (read == -1) { return; } // end of stream
        buff[idx] = (byte) read;
    }
    process_data(buff);
}
Then I tried this code, and the problems appeared (error! weird responses sometimes):
InputStream is; // = something...
int read = -1;
byte[] buff = new byte[128];
while (true) {
    // ERROR! Java doesn't read 128 bytes even though they're available
    if ((read = is.read(buff, 0, 128)) == 128) {
        process_data(buff);
    } else {
        return;
    }
}
The above code doesn't work all the time. I'm sure the data is available, but read sometimes returns 127, 125, or 123. What is the problem?
I also found that DataInputStream#readFully(byte[]) works, but I'm still wondering why the second solution doesn't fill the array while the data is available.
Thanks, buddy.
Consulting the Javadoc for FileInputStream (I'm assuming, since you're reading from a file):
Reads up to len bytes of data from this input stream into an array of bytes. If len is not zero, the method blocks until some input is available; otherwise, no bytes are read and 0 is returned.
The key here is that the method only blocks until some data is available. The return value tells you how many bytes were actually read. The reason you may read fewer than 128 bytes could be a slow drive or other implementation-defined behavior.
For a proper read sequence, you should check that read() does not return -1 (end of stream) and keep filling the buffer until the desired amount of data has been read.
Example of a proper implementation of your code:
InputStream is; // = something...
int read;
int read_total;
byte[] buf = new byte[128];
// Infinite loop
while (true) {
    read_total = 0;
    // Repeatedly read until break or end of stream, offsetting by the amount already read
    while ((read = is.read(buf, read_total, buf.length - read_total)) != -1) {
        // Add the amount read to read_total
        read_total = read_total + read;
        // Break once read_total reaches the buffer length (128)
        if (read_total == buf.length) {
            break;
        }
    }
    if (read_total != buf.length) {
        // Incomplete read before 128 bytes (end of stream); handle it and stop here
    } else {
        process_data(buf);
    }
}
Edit:
Don't try to use available() as an indicator of data availability (sounds weird I know), again the javadoc:
Returns an estimate of the number of remaining bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream. Returns 0 when the file position is beyond EOF. The next invocation might be the same thread or another thread. A single read or skip of this many bytes will not block, but may read or skip fewer bytes.
In some cases, a non-blocking read (or skip) may appear to be blocked when it is merely slow, for example when reading large files over slow networks.
The key there is estimate, don't work with estimates.
Since the accepted answer was provided, a new option has become available. Starting with Java 9, the InputStream class has a readNBytes(byte[], int, int) method (and, since Java 11, a readNBytes(int) overload) that eliminates the need for the programmer to write a read loop. For example, your method could look like:
public static void some_method(String[] args) throws IOException {
    InputStream is = new FileInputStream(args[1]);
    byte[] buff = new byte[128];
    while (true) {
        int numRead = is.readNBytes(buff, 0, buff.length);
        if (numRead == 0) {
            break;
        }
        // The last read before end-of-stream may read fewer than 128 bytes.
        process_data(buff, numRead);
    }
}
or the slightly simpler (Java 11+):
public static void some_method(String[] args) throws IOException {
    InputStream is = new FileInputStream(args[1]);
    while (true) {
        byte[] buff = is.readNBytes(128);
        if (buff.length == 0) {
            break;
        }
        // The last read before end-of-stream may read fewer than 128 bytes.
        process_data(buff);
    }
}
According to the Java documentation, the readlimit parameter of the mark method in the class InputStream serves to set "the maximum limit of bytes that can be read before the mark position becomes invalid."
I have a file named sample.txt whose content is "hello", and I wrote this code:
import java.io.*;

public class InputStreamDemo {
    public static void main(String[] args) throws IOException {
        InputStream reader = new FileInputStream("sample.txt");
        BufferedInputStream bis = new BufferedInputStream(reader);
        bis.mark(1);
        bis.read();
        bis.read();
        bis.read();
        bis.read();
        bis.reset();
        System.out.println((char) bis.read());
    }
}
The output is "h". But if, after calling mark, I read more than one byte, shouldn't I get an error for the invalid reset call?
I would put this down to documentation error.
The main (non-parameter) documentation for BufferedInputStream.mark is "See the general contract of the mark method of InputStream", which to me indicates that BufferedInputStream does not behave differently, the parameter documentation notwithstanding.
And the general contract, as specified by InputStream, is
The readlimit arguments tells this input stream to allow that many bytes to be read before the mark position gets invalidated [...] the stream is not required to remember any data at all if more than readlimit bytes are read from the stream
In other words, readlimit is a suggestion; the stream is free to under-promise and over-deliver.
If you look at the source, particularly the fill() method, you can see (after a while!) that it only invalidates the mark when it absolutely has to, i.e. it is more tolerant than the documentation might suggest.
...
else if (pos >= buffer.length) {  /* no room left in buffer */
    if (markpos > 0) {            /* can throw away early part of the buffer */
        int sz = pos - markpos;
        System.arraycopy(buffer, markpos, buffer, 0, sz);
        pos = sz;
        markpos = 0;
    } else if (buffer.length >= marklimit) {
        markpos = -1;             /* buffer got too big, invalidate mark */
        pos = 0;                  /* drop buffer contents */
...
The default buffer size is relatively large (8K), so invalidation won't be triggered in your example.
Looking at the implementation of BufferedInputStream, it describes the significance of the marker position in the JavaDocs (of the protected markpos field):
[markpos is] the value of the pos field at the time the last mark method was called.
This value is always in the range -1 through pos. If there is no marked position in the input stream, this field is -1. If there is a marked position in the input stream, then buf[markpos] is the first byte to be supplied as input after a reset operation. If markpos is not -1, then all bytes from positions buf[markpos] through buf[pos-1] must remain in the buffer array (though they may be moved to another place in the buffer array, with suitable adjustments to the values of count, pos, and markpos); they may not be discarded unless and until the difference between pos and markpos exceeds marklimit.
Hope this helps. Take a peek at the definitions of read, reset and the private method fill in the class to see how it all ties together.
In short, only when the class retrieves more data to fill its buffer will the mark position be taken into account. It will be correctly invalidated if more bytes are read than the call to mark allowed. As a result, calls to read will not necessarily trigger the behaviour advertised in the public JavaDoc comments.
This looks like a subtle bug. If you reduce the buffer size, you'll get an IOException:
public static void main(String[] args) throws IOException {
    InputStream reader = new ByteArrayInputStream(new byte[]{1, 2, 3, 4, 5, 6, 7, 8});
    BufferedInputStream bis = new BufferedInputStream(reader, 3);
    bis.mark(1);
    bis.read();
    bis.read();
    bis.read();
    bis.read();
    bis.reset();
    System.out.println((char) bis.read());
}
A BufferedInputStream that I have isn't marking correctly. This is my code:
public static void main(String[] args) throws Exception {
    byte[] b = "HelloWorld!".getBytes();
    BufferedInputStream bin = new BufferedInputStream(new ByteArrayInputStream(b));
    bin.mark(3);
    while (true) {
        byte[] buf = new byte[4096];
        int n = bin.read(buf);
        if (n == -1) break;
        System.out.println(n);
        System.out.println(new String(buf, 0, n));
    }
}
This is outputting:
11
HelloWorld!
I want it to output
3
Hel
8
loWorld!
I also tried the code with just a pure ByteArrayInputStream as bin, and it didn't work either.
I think you're misunderstanding what mark does.
The purpose of mark is to cause the stream to remember its current position, so you can return to it later using reset(). The argument isn't how many bytes will be read next -- it's how many bytes you'll be able to read afterward before the mark is considered invalid (ie: you won't be able to reset() back to it; you'll either get an exception or end up at the start of the stream instead).
See the docs on InputStream for details. Readers' mark methods work quite similarly.
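To make that concrete, here's a minimal sketch of what mark/reset actually does: it re-reads the same bytes, rather than splitting one read into chunks (class name and the readlimit of 16 are arbitrary choices for the demo):

```java
import java.io.*;

public class MarkResetDemo {
    public static void main(String[] args) throws IOException {
        BufferedInputStream bin = new BufferedInputStream(
                new ByteArrayInputStream("HelloWorld!".getBytes()));
        bin.mark(16);                        // remember the current position
        byte[] buf = new byte[3];
        bin.read(buf, 0, 3);
        System.out.println(new String(buf)); // Hel
        bin.reset();                         // jump back to the marked position
        bin.read(buf, 0, 3);
        System.out.println(new String(buf)); // Hel again
    }
}
```

If you simply want the first 3 bytes and then the rest, you don't need mark at all; just pass the desired length to read(byte[], int, int).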
That's not what mark() does. You need to re-read the documentation. Mark lets you go backward through the stream.
Why does the following method hang?
public void pipe(Reader in, Writer out) throws IOException {
    CharBuffer buf = CharBuffer.allocate(DEFAULT_BUFFER_SIZE);
    while (in.read(buf) >= 0) {
        out.append(buf.flip());
    }
}
Answering my own question: you have to call buf.clear() between reads. Presumably, read is hanging because the buffer is full. The correct code is
public void pipe(Reader in, Writer out) throws IOException {
    CharBuffer buf = CharBuffer.allocate(DEFAULT_BUFFER_SIZE);
    while (in.read(buf) >= 0) {
        out.append(buf.flip());
        buf.clear(); // make the buffer writable again for the next read
    }
}
I would assume that it is a deadlock. The in.read(buf) locks the CharBuffer and prevents the out.append(buf) call.
That is assuming that CharBuffer uses locks (of some kind) in the implementation. What does the API say about the class CharBuffer?
Edit: Sorry, some kind of short circuit in my brain... I confused it with something else.
CharBuffers don't work with Readers and Writers as cleanly as you might expect. In particular, there is no Writer.append(CharBuffer buf) method. The method called by the question snippet is Writer.append(CharSequence seq), which just calls seq.toString(). The CharBuffer.toString() method does return the string value of the buffer, but it doesn't drain the buffer. The subsequent call to Reader.read(CharBuffer buf) gets an already full buffer and therefore returns 0, forcing the loop to continue indefinitely.
Though this feels like a hang, it is in fact appending the first read's buffer contents to the writer every pass through the loop. So you'll either start to see a lot of output in your destination or the writer's internal buffer will grow, depending on how the writer is implemented.
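The claim that append doesn't drain the buffer is easy to verify; a small sketch (StringWriter stands in for whatever Writer you're piping to):

```java
import java.io.*;
import java.nio.CharBuffer;

public class AppendNoDrainDemo {
    public static void main(String[] args) throws IOException {
        CharBuffer buf = CharBuffer.allocate(8);
        buf.put("abc");
        buf.flip();                          // position 0, limit 3
        StringWriter out = new StringWriter();
        out.append(buf);                     // Writer.append(CharSequence): uses buf.toString()
        System.out.println(out);             // abc
        System.out.println(buf.remaining()); // 3 -- position unchanged, buffer not drained
    }
}
```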
As annoying as it is, I'd recommend a char[] implementation, if only because the CharBuffer solution winds up building at least two new char[] every pass through the loop.
public void pipe(Reader in, Writer out) throws IOException {
    char[] buf = new char[DEFAULT_BUFFER_SIZE];
    int count = in.read(buf);
    while (count >= 0) {
        out.write(buf, 0, count);
        count = in.read(buf);
    }
}
I'd recommend using this only if you need to convert between two character encodings; otherwise, a ByteBuffer/Channel or byte[]/IOStream implementation would be preferable, even if you're piping characters.
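For what it's worth, on Java 10 and later the whole loop can be delegated to the platform via Reader.transferTo(Writer); a sketch (StringReader/StringWriter are stand-ins for your actual endpoints):

```java
import java.io.*;

public class TransferToDemo {
    public static void pipe(Reader in, Writer out) throws IOException {
        in.transferTo(out); // reads from in and writes to out until end of stream
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        pipe(new StringReader("hello, pipe"), out);
        System.out.println(out); // hello, pipe
    }
}
```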