I just started with I/O in Java, so please excuse me if my question is silly, but I find it difficult to understand how this code works.
The variable c in the code below is never incremented, so I would expect the while loop never to terminate, even if there is only one character in the input stream, and the output stream to be filled with that one character over and over. Yet the code works as intended, and I don't understand why.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
public class CopyBytes {
public static void main(String[] args) throws IOException {
FileInputStream in = null;
FileOutputStream out = null;
try {
in = new FileInputStream("xanadu.txt");
out = new FileOutputStream("outagain.txt");
int c;
while ((c = in.read()) != -1) {
out.write(c);
}
} finally {
if (in != null) {
in.close();
}
if (out != null) {
out.close();
}
}
}
}
From the Java API: "A FileInputStream obtains input bytes from a file in a file system. What files are available depends on the host environment."
Your variable c receives each byte read from the stream. When there are no more bytes left to read, the read() method returns -1, and that ends the loop.
That's what
((c = in.read()) != -1)
means: keep reading until the value is -1.
The in.read() method reads one byte at a time. On each iteration one byte is read and assigned to c.
Suppose the following characters are present in the xanadu.txt file.
"ABCDE-1"
Note: -1 stands for end of file (it is shown for illustration only and is not actually stored in the file).
In the first iteration
c = 'A'; the condition 'A' != -1 is true, so the loop continues and writes 'A' to outagain.txt.
In the second iteration
c = 'B'; the condition 'B' != -1 is true, so the loop continues and writes 'B' to outagain.txt.
In the third iteration
c = 'C'; the condition 'C' != -1 is true, so the loop continues and writes 'C' to outagain.txt.
In the fourth iteration
c = 'D'; the condition 'D' != -1 is true, so the loop continues and writes 'D' to outagain.txt.
In the fifth iteration
c = 'E'; the condition 'E' != -1 is true, so the loop continues and writes 'E' to outagain.txt.
In the sixth iteration
c = -1; the condition -1 != -1 is false, so the while loop exits.
When you use the read() method, you move the position you're reading from forward every time, until you get to the end of the file. That's why it works.
The read method of a FileInputStream object returns the next byte of data, or -1 if the end of the file is reached. (It actually returns an int.)
Source: http://docs.oracle.com/javase/6/docs/api/java/io/FileInputStream.html#read%28%29
In your loop, you are doing c = in.read() until c equals -1. So, at each iteration, c contains the next byte; read is doing the increment itself.
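To see the advancing position in isolation, here is a minimal sketch that runs the same loop over an in-memory stream instead of a file. The class name ReadAdvancesDemo and the sample data are made up for illustration; only the loop itself comes from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class ReadAdvancesDemo {
    // Copies every byte from in to out and returns how many bytes were copied.
    static int copy(InputStream in, OutputStream out) throws IOException {
        int count = 0;
        int c;
        // Each call to read() consumes one byte and moves the stream position
        // forward, so c holds a different byte on every pass without being
        // incremented by the loop.
        while ((c = in.read()) != -1) {
            out.write(c);
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("ABCDE".getBytes("US-ASCII"));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int copied = copy(in, out);
        System.out.println(copied + " bytes copied: " + out.toString("US-ASCII"));
        // The stream is exhausted now, so a further read() reports end of stream.
        System.out.println("read() after the end: " + in.read());
    }
}
```

Once the loop has finished, any further read() on the same stream keeps returning -1, because the position stays at the end.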
The inputstream reads several bytes but never throws a -1
Here is write:
private void sendData(byte[] data) throws Exception {
outputStream.write(data);
outputStream.flush();
...
String txtSend = "$00\r";
sendData(txtSend.getBytes());
Here is the read code:
int i;
char c;
while((i = mInputStream.read()) != -1) {
c = (char)i;
}
System.out.println("it never reaches here.");
It will get stuck in the while loop.
Should I be passing a different character?
FYI, this is for serial comm, and in minicom I pass the exact same string and it runs fine, so I don't know whether that last character is the culprit.
Maybe something like this:
Reads the next byte of data from the input stream. The value byte is returned as an int in the range 0 to 255. If no byte is available because the end of the stream has been reached, the value -1 is returned. This method blocks until input data is available, the end of the stream is detected, or an exception is thrown.
A subclass must provide an implementation of this method.
Returns:
the next byte of data, or -1 if the end of the stream is reached.
Throws:
IOException – if an I/O error occurs.
https://developer.android.com/reference/java/io/InputStream#read()
As fanyang said, the read function returns a value between 0 and 255, and returning -1 means the end of the stream.
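Since read() only returns -1 when the stream actually ends, a loop that waits for -1 on a connection that stays open will block indefinitely. One common workaround, which is not taken from the answers above, is to stop at a message terminator instead. The sketch below assumes the device ends each reply with '\r', mirroring the command that was sent; that is a guess about the protocol, and the class and method names are hypothetical. An in-memory stream stands in for the serial port.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class TerminatorReadDemo {
    // Reads bytes until the terminator is seen or the stream ends.
    // The terminator itself is consumed but not included in the result.
    static String readUntil(InputStream in, int terminator) throws IOException {
        StringBuilder reply = new StringBuilder();
        int i;
        while ((i = in.read()) != -1) {
            if (i == terminator) {
                break; // a complete message has arrived; do not wait for end of stream
            }
            reply.append((char) i);
        }
        return reply.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the serial input stream; the '\r' terminator is an assumption.
        InputStream in = new ByteArrayInputStream("$00\rrest".getBytes("US-ASCII"));
        System.out.println(readUntil(in, '\r'));
    }
}
```

Whether this fits depends on what the device really sends back; if it sends no terminator at all, a read timeout configured on the serial library would be needed instead.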
I thought the only difference between the two programs below was the use of i, so why does fileInputStream.read() give a different result?
import java.io.*;
public class FileStream_byte1 {
public static void main(String[] args) throws IOException {
FileOutputStream fOutputStream = new FileOutputStream("FileStream_byte1.txt");
fOutputStream.write(-1);
fOutputStream.close();
FileInputStream fileInputStream = new FileInputStream("FileStream_byte1.txt");
int i;
System.out.println(" " + fileInputStream.read());
fileInputStream.close();
}
}
//The result is 255
import java.io.*;
public class FileStream_byte1 {
public static void main(String[] args) throws IOException {
FileOutputStream fOutputStream = new FileOutputStream("FileStream_byte1.txt");
fOutputStream.write(-1);
fOutputStream.close();
FileInputStream fileInputStream = new FileInputStream("FileStream_byte1.txt");
int i ;
while ((i = fileInputStream.read()) != -1)
System.out.println(" " + fileInputStream.read());
fileInputStream.close();
}
}
//The result is -1
The reason why you read 255 (in first case) despite writing -1 can be seen in the documentation of OutputStream.write(int) (emphasis mine):
Writes the specified byte to this output stream. The general contract for write is that one byte is written to the output stream. The byte to be written is the eight low-order bits of the argument b. The 24 high-order bits of b are ignored.
FileOutputStream gives no indication of changing that behavior.
Basically, InputStream.read and OutputStream.write(int) use ints to allow the use of unsigned "bytes". They both expect the int to be in the range 0-255 (the range of an unsigned byte). So when you called write(-1), it wrote only the "eight low-order bits" of -1, which results in writing 255 to the file.
The reason you get -1 in the second case is because you are calling read twice but there is only one byte in the file and you print the result of the second read.
From the documentation of InputStream.read (emphasis mine):
Reads the next byte of data from the input stream. The value byte is returned as an int in the range 0 to 255. If no byte is available because the end of the stream has been reached, the value -1 is returned. This method blocks until input data is available, the end of the stream is detected, or an exception is thrown.
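The truncation to the eight low-order bits can be checked without touching the file system. This is a small sketch using in-memory streams; the class name LowOrderBitsDemo and the helper roundTrip are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

class LowOrderBitsDemo {
    // Writes value with write(int) and returns what read() hands back for it.
    static int roundTrip(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(value); // only the eight low-order bits are stored
        ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
        return in.read(); // the stored byte, reported as an int in 0..255
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(-1));  // -1 is all ones in binary, low 8 bits = 0xFF
        System.out.println(-1 & 0xFF);      // the same masking done by hand
        System.out.println(roundTrip(321)); // 321 = 0x141, low 8 bits = 0x41
    }
}
```

Here roundTrip(-1) gives 255 and roundTrip(321) gives 65, which matches the "24 high-order bits are ignored" wording in the documentation.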
You can find that out by reading the documentation. Quoting:
"public int read() throws IOException
Reads a byte of data from this input stream. This method blocks if no input is yet available.
Specified by: read in class InputStream.
Returns: the next byte of data, or -1 if the end of the file is reached.
Throws: IOException - if an I/O error occurs."
It returns -1 because it reaches the end of the file.
In your while loop, the condition reads the byte of data you wrote without printing it. The println then calls read() again; there is no data left, so that call returns -1, and that is what gets printed. The next check of the condition also gets -1, so the loop exits.
In your code you are calling read twice: the first time it returns 255, and the next time -1, indicating that the stream has ended.
Try:
while ((i = fileInputStream.read()) != -1)
System.out.println(" " + i);
I was refreshing myself on I/O, and while going over the example code I saw something that confused me:
public class CopyBytes {
public static void main(String[] args) throws IOException {
FileInputStream in = null;
FileOutputStream out = null;
try {
in = new FileInputStream("xanadu.txt");
out = new FileOutputStream("outagain.txt");
int c;
while ((c = in.read()) != -1) {
out.write(c);
}
How can an int variable (c) be assigned a byte of data from the input stream (in.read())? And why does the while loop wait for it to not equal -1?
This (c = in.read()) will return -1 when the end of input is reached and hence the while loop will stop.
Read this awesome answer.
From Oracle docs:
public abstract int read() throws IOException
Reads the next byte of data from the input stream. The value byte is returned as an int in the range 0 to 255. If no byte is available because the end of the stream has been reached, the value -1 is returned. This method blocks until input data is available, the end of the stream is detected, or an exception is thrown. A subclass must provide an implementation of this method.
Returns: the next byte of data, or -1 if the end of the stream is reached.
Throws: IOException - if an I/O error occurs.
From the documentation for FileInputStream.read():
public int read()
throws IOException
Thus read() returns ints, not bytes, so it can be assigned to an int variable.
Note that bytes can be implicitly converted to ints with no loss. Also from the docs:
Returns:
the next byte of data, or -1 if the end of the file is reached.
The loop check against -1 determines whether the end of file has been reached, and stops looping if so.
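A short sketch of why the return type has to be wider than a byte: if the result of read() is narrowed to byte before the comparison, a genuine 0xFF data byte becomes -1 and is mistaken for the end of the stream. The class and method names below are hypothetical, and an in-memory stream replaces the file.

```java
import java.io.ByteArrayInputStream;

class IntVersusByteDemo {
    // Counts bytes up to the end of the stream, keeping each result in an int.
    static int countWithInt(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        int count = 0;
        int c;
        while ((c = in.read()) != -1) { // data bytes are 0..255, so -1 is unambiguous
            count++;
        }
        return count;
    }

    // Same loop, but the result is narrowed to byte before the comparison.
    static int countWithByte(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        int count = 0;
        byte b;
        while ((b = (byte) in.read()) != -1) { // (byte) 255 is -1: looks like end of stream
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] data = {0x41, (byte) 0xFF, 0x42};
        System.out.println("int loop counted  " + countWithInt(data));
        System.out.println("byte loop counted " + countWithByte(data));
    }
}
```

With this input the int loop counts all three bytes, while the byte loop stops after the first one because it treats the 0xFF byte as the end marker.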
Is there a better [pre-existing optional Java 1.6] solution than creating a streaming file reader class that will meet the following criteria?
Given an ASCII file of arbitrary large size where each line is terminated by a \n
For each invocation of some method readLine() read a random line from the file
And for the life of the file handle no call to readLine() should return the same line twice
Update:
All lines must eventually be read
Context: the file's contents are created from Unix shell commands that produce a listing of all paths contained within a given directory; there are between millions and a billion files (which yields millions to a billion lines in the target file). If there is some way to randomly distribute the paths into the file at creation time, that is an acceptable solution as well.
In order to avoid reading in the whole file, which may not be possible in your case, you may want to use a RandomAccessFile instead of a standard java FileInputStream. With RandomAccessFile, you can use the seek(long position) method to skip to an arbitrary place in the file and start reading there. The code would look something like this.
RandomAccessFile raf = new RandomAccessFile("path-to-file","r"); //read-only access is sufficient
HashMap<Integer,String> sampledLines = new HashMap<Integer,String>();
for(int i = 0; i < numberOfRandomSamples; i++)
{
    //seek to a random point in the file
    raf.seek((long)(Math.random()*raf.length()));
    //skip from the random location to the beginning of the next line
    int nextByte = raf.read();
    while(nextByte != -1 && nextByte != '\n')
        nextByte = raf.read();
    //wrap around to the first line if no further line starts before the end of the file
    if(raf.getFilePointer() >= raf.length()) raf.seek(0);
    //read the line into a buffer
    StringBuilder lineBuffer = new StringBuilder();
    nextByte = raf.read();
    while(nextByte != -1 && nextByte != '\n')
    {
        lineBuffer.append((char)nextByte);
        nextByte = raf.read(); //advance to the next byte, otherwise this loop never terminates
    }
    //ensure uniqueness (note: two different lines with the same hashCode are treated as duplicates)
    String line = lineBuffer.toString();
    if(sampledLines.get(line.hashCode()) != null)
        i--;
    else
        sampledLines.put(line.hashCode(),line);
}
Here, sampledLines should hold your randomly selected lines at the end. You may need to check that you haven't randomly skipped to the end of the file as well to avoid an error in that case.
EDIT: I made it wrap to the beginning of the file in case you reach the end. It was a pretty simple check.
EDIT 2: I made it verify uniqueness of lines by using a HashMap.
Pre-process the input file and remember the offset of each new line. Use a BitSet to keep track of used lines. If you want to save some memory, then remember the offset of every 16th line; it is still easy to jump into the file and do a sequential lookup within a block of 16 lines.
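A rough sketch of the offset-index idea, under some simplifying assumptions: every line offset is kept (not every 16th), lines are plain ASCII ending in \n, and the line count fits in an int. The class name RandomLineReader is invented. To avoid retrying on collisions, it probes forward from a random index to the next unused line; that guarantees no repeats and that every line is eventually read, but the selection is not perfectly uniform.

```java
import java.io.BufferedInputStream;
import java.io.Closeable;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Random;

class RandomLineReader implements Closeable {
    private final List<Long> offsets = new ArrayList<Long>(); // start offset of every line
    private final BitSet used = new BitSet();                 // bit i set = line i already returned
    private final Random random = new Random();
    private final RandomAccessFile file;
    private int remaining;

    RandomLineReader(File source) throws IOException {
        // Pre-processing pass: scan sequentially and remember where each line starts.
        InputStream in = new BufferedInputStream(new FileInputStream(source));
        try {
            long position = 0;
            boolean atLineStart = true;
            int b;
            while ((b = in.read()) != -1) {
                if (atLineStart) {
                    offsets.add(position);
                }
                atLineStart = (b == '\n');
                position++;
            }
        } finally {
            in.close();
        }
        remaining = offsets.size();
        file = new RandomAccessFile(source, "r");
    }

    boolean hasNext() {
        return remaining > 0;
    }

    // Returns a line that has not been returned before.
    String readLine() throws IOException {
        if (remaining == 0) {
            throw new NoSuchElementException("all lines have been read");
        }
        // Start at a random line and probe forward to the next unused one.
        int index = used.nextClearBit(random.nextInt(offsets.size()));
        if (index >= offsets.size()) {
            index = used.nextClearBit(0); // wrap around to the first unused line
        }
        used.set(index);
        remaining--;
        file.seek(offsets.get(index));
        return file.readLine();
    }

    public void close() throws IOException {
        file.close();
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("paths", ".txt"); // throwaway sample file
        tmp.deleteOnExit();
        FileWriter writer = new FileWriter(tmp);
        writer.write("/a\n/b\n/c\n/d\n");
        writer.close();
        RandomLineReader reader = new RandomLineReader(tmp);
        try {
            while (reader.hasNext()) {
                System.out.println(reader.readLine());
            }
        } finally {
            reader.close();
        }
    }
}
```

For a file with a billion lines the full offset list would need several gigabytes, which is where the suggestion of storing only every 16th offset comes in.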
Since you can pad the lines, I would do something along these lines. Note that even then, there may be a limit on how many entries a List can actually hold.
Picking a random number each time you want to read a line and adding it to a Set would also work; the approach below, however, ensures that the file is completely read:
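import java.io.Closeable;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Random;
public class VeryLargeFileReading
    implements Iterator<String>, Closeable
{
    private static final Random RND = new Random();
    // List of all line offsets, shuffled once up front
    final List<Long> indices = new ArrayList<Long>();
    final RandomAccessFile fd;
    // lineSize is the fixed length in bytes of every padded line, terminator included
    public VeryLargeFileReading(String fileName, long lineSize) throws IOException
    {
        fd = new RandomAccessFile(fileName, "r");
        long nrLines = fd.length() / lineSize;
        for (long i = 0; i < nrLines; i++)
            indices.add(i * lineSize);
        Collections.shuffle(indices, RND);
    }
    // Iterator methods
    @Override
    public boolean hasNext()
    {
        return !indices.isEmpty();
    }
    @Override
    public void remove()
    {
        // Nope
        throw new UnsupportedOperationException();
    }
    @Override
    public String next()
    {
        if (indices.isEmpty())
            throw new NoSuchElementException();
        // Take from the end: removing the last element of an ArrayList is O(1)
        final long offset = indices.remove(indices.size() - 1);
        try {
            fd.seek(offset);
            return fd.readLine().trim();
        } catch (IOException e) {
            // Iterator.next() cannot throw a checked exception
            throw new IllegalStateException(e);
        }
    }
    @Override
    public void close() throws IOException
    {
        fd.close();
    }
}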
If the number of files is truly arbitrary, tracking which files have been processed could itself become a problem in terms of memory usage (or I/O time, if you track them in files instead of a list or set). Solutions that keep a growing list of selected lines also run into timing-related issues.
I'd consider something along the lines of the following:
Create n "bucket" files. n could be chosen based on the number of files and the available system memory. (If n is large, you could work on a subset of the n buckets at a time to keep the number of open file handles down.)
Each file's name is hashed and goes into the appropriate bucket file, "sharding" the directory based on arbitrary criteria.
Read in the bucket file contents (just filenames) and process them as-is (the randomness is provided by the hashing mechanism), or pick rnd(n) and remove as you go, which provides a bit more randomness.
Alternatively, you could pad and use the random access idea, removing indices/offsets from a list as they're picked.
I've searched through all the questions I can find relating to PipedInputStreams and PipedOutputStreams and have not found anything that can help me. Hopefully someone here will have come across something similar.
Background:
I have a class that reads data from any java.io.InputStream. The class has a method called hasNext(), which checks the given InputStream for data, returning true if data is found, false otherwise. This hasNext() method works perfectly with other InputStreams but when I try to use a PipedInputStream (fed from a PipedOutputStream in a different Thread, encapsulated in the inputSupplier variable below), it hangs. After looking into how the hasNext() method works, I recreated the problem with the following code:
public static void main(String[] args) throws IOException {
    PipedInputStream inputSourceStream = new PipedInputStream(inputSupplier.getOutputStream());
    byte[] input = new byte[4096];
    // blocks here until the writer thread supplies data or closes its end
    int bytes_read = inputSourceStream.read(input, 0, 4096);
}
The inputSupplier is simply an instance of a small class I wrote that runs in its own thread with a local PipedOutputStream to avoid getting deadlocks.
The Problem
So, my problem is that the hasNext() method calls the PipedInputStream.read() method on the stream to ascertain whether there is any data to be read. This is a blocking read that does not return until some data arrives. This means that my hasNext() method will never return false (or return at all) if the stream is empty.
Disclaimer: I know about the available() method but all that tells me is that there are no bytes available, not that we are at the end of the stream (whatever implementation of a Stream that may be), and so read() is required to check this.
[Edit] The whole purpose of me initially using a PipedInputStream was to simulate a "bursty" source of data. That is, I need to have a Stream that I can write to sporadically to see if my hasNext() method will detect that there is new data on the Stream upon reading it. If there is a better way of doing this then I would be thrilled to hear it!
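The difference between "no bytes right now" and "end of stream" on a pipe can be observed with a small single-threaded sketch. It only works because the data is smaller than the pipe's internal buffer (1024 bytes by default); with more data, the write would block until another thread reads. The class name PipeEndOfStreamDemo and the helper observe are made up for this illustration.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

class PipeEndOfStreamDemo {
    // Returns {available() before writing, available() after writing,
    //          number of bytes read before read() returned -1}.
    static int[] observe(byte[] data) throws IOException {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out);

        // Nothing written yet: available() is 0, but the stream has NOT ended.
        // Calling read() at this point would block.
        int beforeWrite = in.available();

        out.write(data); // data must fit in the pipe buffer, or this call blocks
        int afterWrite = in.available();

        // Closing the writing end is what makes read() eventually return -1.
        out.close();
        int consumed = 0;
        while (in.read() != -1) {
            consumed++;
        }
        in.close();
        return new int[] {beforeWrite, afterWrite, consumed};
    }

    public static void main(String[] args) throws IOException {
        int[] result = observe(new byte[] {1, 2, 3});
        System.out.println("available before write: " + result[0]);
        System.out.println("available after write:  " + result[1]);
        System.out.println("bytes read before -1:   " + result[2]);
    }
}
```

So a hasNext() that must never block can check available() first, but it can only report end of stream once the writer has closed its side; an open, idle pipe is indistinguishable from a slow one.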
I hate to necro a question this old, but this is near the top of google's results, and I just found a solution for myself: this circular byte buffer exposes in and out streams, and the read method returns -1 immediately when no data is present. A little bit of threading, and your test classes can provide data exactly the way you want.
http://ostermiller.org/utils/src/CircularByteBuffer.java.html
Edit
Turns out I misunderstood the documentation of the above class, and it only returns -1 when a thread calling read() is interrupted. I made a quick mod to the read method that gives me what I want (original code commented out; the only new code is the substitution of an else for the else if):
@Override public int read(byte[] cbuf, int off, int len) throws IOException {
//while (true){
synchronized (CircularByteBuffer.this){
if (inputStreamClosed) throw new IOException("InputStream has been closed; cannot read from a closed InputStream.");
int available = CircularByteBuffer.this.available();
if (available > 0){
int length = Math.min(len, available);
int firstLen = Math.min(length, buffer.length - readPosition);
int secondLen = length - firstLen;
System.arraycopy(buffer, readPosition, cbuf, off, firstLen);
if (secondLen > 0){
System.arraycopy(buffer, 0, cbuf, off+firstLen, secondLen);
readPosition = secondLen;
} else {
readPosition += length;
}
if (readPosition == buffer.length) {
readPosition = 0;
}
ensureMark();
return length;
//} else if (outputStreamClosed){
} else { // << new line of code
return -1;
}
}
//try {
// Thread.sleep(100);
//} catch(Exception x){
// throw new IOException("Blocking read operation interrupted.");
//}
//}
}
Java 1.4 and later come with the java.nio package, which supports non-blocking I/O and sounds like what you are describing.