That is, if I do:
channel.position(0);
channel.read(buffer); // will read in 1st byte of file and so on
vs
channel.position(1);
channel.read(buffer); // will read in 2nd byte of file and so on
Are my assumptions correct? Reading the documentation doesn't really say anything about that so I wanted to make sure
Is FileChannel position(long newPosition) 0-indexed?
Yes.
Reading the documentation doesn't really say anything about that so I wanted to make sure
It is clear to me. The javadoc for position() says:
"Returns: This channel's file position, a non-negative integer counting the number of bytes from the beginning of the file to the current position".
"[A] non-negative integer" means zero or greater. If they had meant one or greater, they would have written "a positive integer" or "a strictly positive integer".
The method is 0-indexed.
Also, when you call the read method, the file position is advanced by the number of bytes actually read. The channel's position() method returns the current position.
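A minimal sketch of both points, assuming a small temporary file whose contents are the ASCII bytes "ABCDE". The class name, the helper methods, and the sample data are made up for illustration; only FileChannel.position(long), read(ByteBuffer) and position() come from the API discussed above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelPositionDemo {

    // Reads the single byte stored at the given 0-based file position.
    static byte byteAt(FileChannel channel, long position) throws IOException {
        ByteBuffer one = ByteBuffer.allocate(1);
        channel.position(position);
        if (channel.read(one) != 1) {
            throw new IOException("no byte at position " + position);
        }
        return one.get(0);
    }

    // Creates a temporary file holding "ABCDE" (hypothetical sample data).
    static Path sampleFile() throws IOException {
        Path file = Files.createTempFile("position-demo", ".bin");
        Files.write(file, new byte[] {'A', 'B', 'C', 'D', 'E'});
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path file = sampleFile();
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            System.out.println((char) byteAt(channel, 0)); // A, the 1st byte
            System.out.println((char) byteAt(channel, 1)); // B, the 2nd byte
            System.out.println(channel.position());        // 2, advanced by the read
        } finally {
            Files.delete(file);
        }
    }
}
```

Position 0 yields the first byte, position 1 the second, and after reading one byte from position 1 the channel reports position 2.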
I recently ran into a rather weird problem with the put method of java.nio.ByteBuffer, and I wondered if you might know the answer, so let's get to it.
My goal is to append some data to a ByteBuffer. The problem is that after I call the put method, it always adds a 0 after the byte array.
Here is the method which has those side effects:
public ByteBuffer toByteBuffer() {
    ByteBuffer buffer = ByteBuffer.allocateDirect(1024);
    buffer.put(type.name().getBytes()); // 0 added here
    data.forEach((key, value) -> {
        buffer.putChar('|');
        buffer.put(key.getBytes()); // 0 added here
        buffer.putChar('=');
        buffer.put(value.getBytes()); // 0 added here
    });
    buffer.flip();
    return buffer;
}
Expected Output should look something like this:
ClientData|OAT=14.9926405|FQRIGHT=39.689075|.....
Actual output, where _ represents the 0s:
ClientData_|OAT_=14.9926405_|FQRIGHT_=39.689075_|.....
The documentation doesn't say anything about this side effect.
Also the put method only puts a 0 in between the byte arrays but not at the very end of the buffer.
I assume the method might be wrong or at least not properly documented, but I really have no clue why it would behave that way.
I think you may be slightly misinterpreting what is happening here. I note your comment about "but not at the very end of the buffer".
The \0 is actually coming from the putChar(char) calls, not the put(byte[]) calls. As it says in the docs (emphasis added):
Writes two bytes containing the given char value, in the current byte order
The default byte order is big endian; given that the chars you are writing are in the 7-bit ASCII range, this means "the byte you want" is going to be preceded by 0x00.
If you want to write a byte, use put(byte):
buffer.put((byte) '|');
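To make the difference visible, here is a small sketch that writes the same character both ways and dumps the bytes that ended up in the buffer. The class and helper names are made up for the example; it uses a heap buffer with the default big-endian order.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class PutCharDemo {

    // Returns the bytes written so far (index 0 up to the current position).
    static byte[] written(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate(); // leaves the original position untouched
        view.flip();
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }

    static byte[] withPutChar() {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        buffer.putChar('|');    // two bytes: 0x00 0x7C in big-endian order
        return written(buffer);
    }

    static byte[] withPutByte() {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        buffer.put((byte) '|'); // one byte: 0x7C
        return written(buffer);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(withPutChar())); // [0, 124]
        System.out.println(Arrays.toString(withPutByte())); // [124]
    }
}
```

The zero byte sits in front of each separator, which is why it appears to follow the preceding byte array and never shows up at the very end of the buffer.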
I am new to random file access and have run into a question. As far as I understand, the RandomAccessFile class provides random access to a file for reading and writing: I can use the seek() method to move to any position and start reading or writing from there. Is that all there is to random access? FileInputStream seems to give me the same ability with
read(byte[] b, int off, int len)
This method lets me read from a particular place. So what is the difference? (My guess is that InputStream reads the whole file and simply steps through everything before the off position, but that is only a guess.)
Looking at the documentation of the read method:
https://docs.oracle.com/javase/7/docs/api/java/io/FileInputStream.html#read(byte[],%20int,%20int)
it states that off is "the start offset in the destination array b". So using this call, you can read the next len bytes from the stream and put them in a certain place in your memory buffer. This does not allow you to skip forward like the seek method of a random access file.
The read method you mention does not let you read from any particular place. It always reads from the "next" position in the stream, where it left off, and it puts the read bytes into the byte array at position off. off is the offset in the output, not the input.
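A short sketch contrasting the two calls, assuming a temporary file that contains the ten ASCII characters "0123456789". The class name, method names, and sample data are invented for the illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class OffsetVsSeekDemo {

    // FileInputStream: 'off' only chooses where in the destination array the bytes land.
    static String readWithOffset(Path file) throws IOException {
        byte[] dest = "..........".getBytes(StandardCharsets.US_ASCII); // 10 placeholder bytes
        try (FileInputStream in = new FileInputStream(file.toFile())) {
            // Reads from the START of the file into dest[3..6]. A single read may
            // return fewer bytes in general; a tiny local file is assumed here.
            in.read(dest, 3, 4);
        }
        return new String(dest, StandardCharsets.US_ASCII);
    }

    // RandomAccessFile: seek() chooses where in the FILE the next read starts.
    static String readAfterSeek(Path file) throws IOException {
        byte[] dest = new byte[4];
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(3);
            raf.readFully(dest);
        }
        return new String(dest, StandardCharsets.US_ASCII);
    }

    // Creates a temporary file holding "0123456789" (hypothetical sample data).
    static Path sampleFile() throws IOException {
        Path file = Files.createTempFile("offset-demo", ".txt");
        Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII));
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path file = sampleFile();
        try {
            System.out.println(readWithOffset(file)); // ...0123...
            System.out.println(readAfterSeek(file));  // 3456
        } finally {
            Files.delete(file);
        }
    }
}
```

With the stream, the file's first four bytes land at index 3 of the array; with seek(3), the read starts at the fourth byte of the file.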
The standard idiom when reading from a stream is to check for EOF (-1):
while((bytesRead = inputStream.read(buffer)) != -1)
This seems pretty standard - I checked the source of popular libraries like Apache Commons and it seems to be the de facto standard.
Why do we not stop at 0 as well? Wouldn't > -1 be better? Why do whatever work is in the loop when we didn't read anything?
Basically because it would be pointless. Look at the documentation:
If the length of b is zero, then no bytes are read and 0 is returned; otherwise, there is an attempt to read at least one byte. If no byte is available because the stream is at the end of the file, the value -1 is returned; otherwise, at least one byte is read and stored into b.
So unless you're passing in an empty buffer (which is basically a bug in pretty much all cases; I personally wish the method would throw an exception in that case) the return value will never be 0. It will block for at least one byte to be read (in which case the return value will be 1 or more), or for the end of the stream to be reached (in which case the return value will be -1).
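As a rough illustration, the sketch below runs the idiom over an in-memory stream and fails loudly if a read ever returns 0. The class and method names are made up, and ByteArrayInputStream stands in for whatever stream you actually read from.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadLoopDemo {

    // Counts the bytes in a stream with the standard idiom.
    // Throws if read() ever returns 0, which should only happen for an empty buffer.
    static long countBytes(InputStream in, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            if (bytesRead == 0) {
                throw new IllegalStateException("read returned 0: is the buffer empty?");
            }
            total += bytesRead;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[10]);
        System.out.println(countBytes(in, 4)); // 10, read in chunks of 4, 4 and 2
    }
}
```

Calling countBytes with a bufferSize of 0 on a stream that still has data triggers the exception, which is the "empty buffer is a bug" case described above.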
I have the following piece of code:
byte[] payloadArray = getPayload();
int size = (HEADER_SIZE+payloadArray.length);
ByteBuffer cmdBuffer = ByteBuffer.allocate(HEADER_SIZE+payloadArray.length);
//create command
ByteBuffer lengthBuf = ByteBuffer.allocate(2);
lengthBuf.order(ByteOrder.BIG_ENDIAN);
lengthBuf.putChar((char)(size-2));
cmdBuffer.put(lengthBuf);
cmdBuffer.put(getFlag());
After the last statement executes, cmdBuffer should start with the two bytes from lengthBuf followed by the value from getFlag(). However, the bytes from lengthBuf do not show up in cmdBuffer.
I am not sure what is the issue here. Could someone please help?
I suspect the reason is that there are no remaining bytes in lengthBuf.
From the JavaDoc on putChar(char):
Writes two bytes containing the given char value, in the current byte order, into this buffer at the current position, and then increments the position by two.
From the JavaDoc on put(ByteBuffer):
...
Otherwise, this method copies n = src.remaining() bytes from the given buffer into this buffer, starting at each buffer's current position. The positions of both buffers are then incremented by n.
So the put operation reads from the current position of lengthBuf which is the end and there are no bytes left.
Calling lengthBuf.rewind() after putChar(...) would set the position back to zero so that the two bytes can be copied. (I originally suggested lengthBuf.reset() here, but reset() only returns to a previously set mark and throws InvalidMarkException when no mark has been set, so it would fail in this code.)
Edit:
Oleg Estekhin noted that you'd better use lengthBuf.flip() instead of lengthBuf.reset().
From the JavaDoc:
After a sequence of channel-read or put operations, invoke this method to prepare for a sequence of channel-write or relative get operations.
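Here is a small sketch of the effect, using the same 2-byte length buffer as in the question. The class name, the method, and the 8-byte target buffer are invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class FlipBeforePutDemo {

    // Writes a 2-byte length into a scratch buffer, copies it into a fresh
    // 8-byte buffer, and returns how many bytes were actually copied.
    static int copyLength(char length, boolean flipFirst) {
        ByteBuffer lengthBuf = ByteBuffer.allocate(2);
        lengthBuf.order(ByteOrder.BIG_ENDIAN);
        lengthBuf.putChar(length);  // position is now 2, so remaining() == 0
        if (flipFirst) {
            lengthBuf.flip();       // limit = 2, position = 0, so remaining() == 2
        }
        ByteBuffer cmdBuffer = ByteBuffer.allocate(8);
        cmdBuffer.put(lengthBuf);   // copies lengthBuf.remaining() bytes
        return cmdBuffer.position();
    }

    public static void main(String[] args) {
        System.out.println(copyLength((char) 40, false)); // 0, nothing was copied
        System.out.println(copyLength((char) 40, true));  // 2, both length bytes copied
    }
}
```

Without the flip, put(ByteBuffer) succeeds silently but copies zero bytes, which matches the symptom in the question.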
I'm using a FileReader wrapped in a LineNumberReader to index a large text file for speedy access later on. Trouble is I can't seem to find a way to read a specific line number directly. BufferedReader supports the skip() function, but I need to convert the line number to a byte offset (or index the byte offset in the first place).
I took a crack at it using RandomAccessFile, and while it worked, it was horribly slow during the initial indexing. BufferedReader's speed is fantastic, but... well, you see the problem.
Some key info:
The file can be any size (currently 35,000 lines)
It's stored on Android's internal filesystem (via getFilesDir() to be exact)
The formatting is not fixed width, unfortunately (hence the need to read by line)
Any ideas?
Describes an extended RandomAccessFile with buffering semantics
Trouble is I can't seem to find a way to read a specific line number directly
Unless you know the length of each line you can't read it directly
There is no shortcut: you will need to read the entire file up front and calculate the offsets manually.
I would just use a BufferedReader and then get the length of each string and add 1 (or 2?) for the EOL string.
Consider saving a file index along with the large text file. If this file is something you are generating, either on your server or on the device, it should be trivial to generate an index once and distribute and/or save it along with the file.
I'd recommend an int[] where each value is the absolute offset in bytes for the n*(index+1) th line. So you could have an array of size 35,000 with the start of each line, or an array of size 350, with the start of every 100th line.
Here's an example, assuming you have an index file containing a raw sequence of int values:
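public String getLineByNumber(RandomAccessFile index,
                              RandomAccessFile data,
                              int lineNum) throws IOException {
    // Each index entry is a 4-byte int holding the byte offset of one line.
    index.seek((long) lineNum * 4);
    data.seek(index.readInt());
    return data.readLine();
}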
I took a crack at it using RandomAccessFile, and while it worked, it was horribly slow during the initial indexing
You've started the hard part already. Now for the harder part.
BufferedReader's speed is fantastic, but...
Is there something in your use of RandomAccessFile that made it slower than it has to be? How many bytes did you read at a time? If you read one byte at a time it will be sloooooow. If you read in an array of bytes at a time, you can speed things up and use the byte array as a buffer.
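As a rough sketch of that idea, the code below scans a stream in 8 KB blocks and records the byte offset at which each line starts. It assumes lines end in '\n' and an ASCII-compatible encoding such as UTF-8 (it would not work for UTF-16); the class name, method name, and block size are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LineIndexDemo {

    // Returns the byte offset at which each line starts, assuming '\n' ends a line.
    static List<Long> indexLineStarts(InputStream in) throws IOException {
        List<Long> starts = new ArrayList<>();
        byte[] block = new byte[8192];  // read many bytes per call, not one
        long offset = 0;                // file offset of block[0]
        boolean atLineStart = true;
        int n;
        while ((n = in.read(block)) != -1) {
            for (int i = 0; i < n; i++) {
                if (atLineStart) {
                    starts.add(offset + i);
                    atLineStart = false;
                }
                if (block[i] == '\n') {
                    atLineStart = true; // the next byte, if any, begins a new line
                }
            }
            offset += n;
        }
        return starts;
    }

    public static void main(String[] args) throws IOException {
        byte[] text = "ab\ncde\n\nf".getBytes(StandardCharsets.US_ASCII);
        System.out.println(indexLineStarts(new ByteArrayInputStream(text))); // [0, 3, 7, 8]
    }
}
```

The resulting offsets can be passed to RandomAccessFile.seek() later, so the slow byte-by-byte pass is replaced by one buffered scan.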
Just wrapping up the previous comments:
Either you use RandomAccessFile to count bytes and then parse what you read to find the lines by hand, or you use a LineNumberReader to read line by line and count the bytes in each line's chars by hand (2 bytes per char in UTF-16?).