How can I start reading from a specific byte? I have the following code:
try {
    while ((len = f.read(buffer)) > 0) {}
}
For example, I want to start to read at byte 50.
You should use the skip method.
http://developer.android.com/reference/java/io/InputStream.html#skip(long)
You can skip the number of bytes you want.
long nbToSkip = 50;
while (nbToSkip > 0) {
    long nbSkipped = f.skip(nbToSkip);
    if (nbSkipped <= 0) {
        break; // skip() could not advance (e.g. end of stream); avoid looping forever
    }
    nbToSkip -= nbSkipped;
}
I'm not sure what the type of 'f' is, however I will assume it's some kind of stream.
Also, do you mean you want to read in blocks of 50 bytes, or start from the 50th byte in the stream? I'll assume the latter.
Not sure what language you are using, but here goes anyway:
Maybe there's a seek kind of function with which you can seek to a specific position in the stream, like so:
f.seek(50, SEEK_START);
otherwise you can do it the poor man's way by just reading 50 bytes, or 50 times 1 byte from the stream.
Hard to give a good answer without more context.
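In Java, for instance, the "poor man's way" might look roughly like this (a minimal sketch, assuming f is an InputStream and you simply want to discard the first 50 bytes):
// Sketch: discard the first 50 bytes by reading them into a throwaway buffer.
// read() may return fewer bytes than requested, so loop until 50 are consumed.
byte[] throwaway = new byte[50];
int discarded = 0;
while (discarded < throwaway.length) {
    int n = f.read(throwaway, discarded, throwaway.length - discarded);
    if (n == -1) {
        break; // end of stream before reaching byte 50
    }
    discarded += n;
}
// From here on, the next read() starts at byte 50 (if the stream was long enough).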
Related
I recently ran into an actually quite weird problem with the put method of java.nio.ByteBuffer, and I wondered if you might know the answer, so let's get to it.
My goal is to concatenate some data into a ByteBuffer. The problem is that after I call the put method, it always adds a 0 after the byte array.
Here is the method which has those side effects:
public ByteBuffer toByteBuffer() {
    ByteBuffer buffer = ByteBuffer.allocateDirect(1024);
    buffer.put(type.name().getBytes()); // 0 added here
    data.forEach((key, value) -> {
        buffer.putChar('|');
        buffer.put(key.getBytes()); // 0 added here
        buffer.putChar('=');
        buffer.put(value.getBytes()); // 0 added here
    });
    buffer.flip();
    return buffer;
}
Expected Output should look something like this:
ClientData|OAT=14.9926405|FQRIGHT=39.689075|.....
Actual output, where _ represents the 0s:
ClientData_|OAT_=14.9926405_|FQRIGHT_=39.689075_|.....
The documentation doesn't say anything about this side effect.
Also the put method only puts a 0 in between the byte arrays but not at the very end of the buffer.
I assume the method might be wrong or at least not properly documented, but I really have no clue why it would behave that way.
I think you may be slightly misinterpreting what is happening here. I note your comment about "but not at the very end of the buffer".
The \0 is actually coming from the putChar(char) calls, not the put(byte[]) calls. As it says in the docs (emphasis added):
Writes two bytes containing the given char value, in the current byte order
The default byte order is big endian; given that the chars you are writing are in the 7-bit ASCII range, this means "the byte you want" is going to be preceded by 0x00.
If you want to write a byte, use put(byte):
buffer.put((byte) '|');
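Applied to the method above, a corrected version might look like this (a sketch only; it keeps your type and data fields and simply replaces the putChar calls):
public ByteBuffer toByteBuffer() {
    ByteBuffer buffer = ByteBuffer.allocateDirect(1024);
    buffer.put(type.name().getBytes());
    data.forEach((key, value) -> {
        buffer.put((byte) '|');   // single byte, no leading 0x00
        buffer.put(key.getBytes());
        buffer.put((byte) '=');   // single byte, no leading 0x00
        buffer.put(value.getBytes());
    });
    buffer.flip();
    return buffer;
}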
The standard idiom when reading from a stream is to check for EOF (-1):
while((bytesRead = inputStream.read(buffer)) != -1)
This seems pretty standard - I checked the source of popular libraries like Apache Commons and it seems to be the de facto standard.
Why do we not stop at 0 as well? Wouldn't > 0 be better? Why do whatever work is in the loop when we didn't read anything?
Basically because it would be pointless. Look at the documentation:
If the length of b is zero, then no bytes are read and 0 is returned; otherwise, there is an attempt to read at least one byte. If no byte is available because the stream is at the end of the file, the value -1 is returned; otherwise, at least one byte is read and stored into b.
So unless you're passing in an empty buffer (which is basically a bug in pretty much all cases; I personally wish the method would throw an exception in that case) the return value will never be 0. It will block for at least one byte to be read (in which case the return value will be 1 or more), or for the end of the stream to be reached (in which case the return value will be -1).
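A tiny self-contained demo of those three return values (using a ByteArrayInputStream purely for illustration):
import java.io.ByteArrayInputStream;
import java.io.IOException;

public class ReadReturnDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayInputStream in = new ByteArrayInputStream(new byte[] { 1, 2, 3 });
        System.out.println(in.read(new byte[0])); // 0: zero-length buffer, nothing is attempted
        System.out.println(in.read(new byte[8])); // 3: at least one byte was read
        System.out.println(in.read(new byte[8])); // -1: end of stream
    }
}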
During my internship I encountered the piece of code shown below (it turns out to be the standard pattern for this kind of task).
InputStream input = new BufferedInputStream(url.openStream());
OutputStream output = new FileOutputStream(file);
byte[] data = new byte[1024];
int total = 0;
int count;
while ((count = input.read(data)) != -1) {
    total += count;
    output.write(data, 0, count);
}
So here are my questions. Assume the data is 2050 bytes long.
What is the reason for using the constant 1024?
Since I took a computer networks class, I can relate some of my knowledge to this matter. Assuming we have a fast connection, will we read 1024 bytes at every iteration? So will the count variable be 1024, 1024, 2 on each iteration, or is 1000, 1000, 50 possible?
If we have a really slow connection, is it possible that the read() method will try to fill the 1024 byte buffer, even if it would take minutes?
What is the reason for using the constant 1024?
None. It's arbitrary. I use 8192. The code you posted will work with any size >= 1.
Assuming we have a fast connection, will we read 1024 bytes at every iteration?
No, you will either get an exception or end of stream or at least 1 byte on every iteration.
So will the count variable be 1024, 1024, 2 on each iteration, or is 1000, 1000, 50 possible?
Anything >= 1 byte per iteration is possible unless an exception or end of stream occurs.
If we have a really slow connection, is it possible that the read() method will try to fill the 1024 byte buffer, even if it would take minutes?
No. It will block until it reads at least one byte or an exception or end of stream occurs.
This is all stated in the Javadoc.
I/O operations are expensive, so it is generally recommended to batch them; in your case the batch size is 1 KB, and you can change that to more or less depending on your requirements.
You have to remember that it is a blocking call, so if the buffer is too big you might get the impression that your program is not moving.
You should also not read byte by byte, because that would mean far too many I/O operations and the program would spend all its time in I/O; the size should depend on the rate at which you can process the data.
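If you want to see the variable chunk sizes for yourself, a small sketch along these lines (the URL and file name are just placeholders) prints the count returned by each read() call; over a network the values will typically vary rather than always being 1024:
import java.io.BufferedInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;

public class ChunkSizeDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical URL and file name, just for illustration.
        URL url = new URL("http://example.com/somefile.bin");
        try (InputStream input = new BufferedInputStream(url.openStream());
             OutputStream output = new FileOutputStream("somefile.bin")) {
            byte[] data = new byte[8192]; // any size >= 1 works; 8192 is a common choice
            int count;
            int total = 0;
            while ((count = input.read(data)) != -1) {
                System.out.println("read() returned " + count + " bytes");
                output.write(data, 0, count);
                total += count;
            }
            System.out.println("total: " + total + " bytes");
        }
    }
}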
How can I make this piece of code extremely quick?
It reads a raw image using a RandomAccessFile (in) and writes it to a file using a DataOutputStream (out):
final int WORD_SIZE = 4;
byte[] singleValue = new byte[WORD_SIZE];
long position = 0;

for (int i = 1; i <= 100000; i++) {
    out.writeBytes(i + " ");
    for (int j = 1; j <= 17; j++) {
        in.seek(position);
        in.read(singleValue);
        String str = Integer.toString(ByteBuffer.wrap(singleValue).order(ByteOrder.LITTLE_ENDIAN).getInt());
        out.writeBytes(str + " ");
        position += WORD_SIZE;
    }
    out.writeBytes("\n");
}
The loop starts a new line in the file after every 17 elements.
Thanks
I assume that the reason you are asking is because this code is running really slowly. If that is the case, then one reason is that each seek and read call is doing a system call. A RandomAccessFile has no buffering. (I'm guessing that singleValue is a byte[] of length 1.)
So the way to make this go faster is to step back and think about what it is actually doing. If I understand it correctly, it is reading each 4th byte in the file, converting them to decimal numbers and outputting them as text, 17 to a line. You could easily do that using a BufferedInputStream like this:
int b = bis.read(); // read a byte
bis.skip(3); // skip 3 bytes.
(with a bit of error checking ....). If you use a BufferedInputStream like this, most of the read and skip calls will operate on data that has already been buffered, and the number of syscalls will reduce to 1 for every N bytes, where N is the buffer size.
UPDATE - my guess was wrong. You are actually reading alternate words, so ...
bis.read(singleValue);
bis.skip(4);
Every 100000 offsets I have to jump 200000 and then do it again till the end of the file.
Use bis.skip(800000) to do that. It should do a big skip by moving the file position without actually reading any data. One syscall at most. (For a FileInputStream, at least.)
You can also speed up the output side by a roughly equivalent amount by wrapping the DataOutputStream around a BufferedOutputStream.
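Putting those suggestions together, a buffered version of the posted loop might look roughly like this. It is only a sketch with hypothetical file names: it reads consecutive 4-byte little-endian words like the code in the question, and you would add bis.skip(...) calls for the alternate-word / big-jump pattern discussed above.
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BufferedDump {
    public static void main(String[] args) throws Exception {
        final int WORD_SIZE = 4;
        byte[] singleValue = new byte[WORD_SIZE];
        // Hypothetical file names, just for illustration.
        try (BufferedInputStream bis =
                     new BufferedInputStream(new FileInputStream("image.raw"));
             DataOutputStream out = new DataOutputStream(
                     new BufferedOutputStream(new FileOutputStream("image.txt")))) {
            for (int i = 1; i <= 100000; i++) {
                out.writeBytes(i + " ");
                for (int j = 1; j <= 17; j++) {
                    if (bis.read(singleValue) != WORD_SIZE) {
                        return; // end of file (or short read); stop here for simplicity
                    }
                    int value = ByteBuffer.wrap(singleValue)
                                          .order(ByteOrder.LITTLE_ENDIAN)
                                          .getInt();
                    out.writeBytes(value + " ");
                }
                out.writeBytes("\n");
            }
        }
    }
}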
But System.out is already buffered.
Is this ASCII?
And how can I print it in an understandable form, for example as chars?
This is the answer I get from my PortCom.
Here is how I read it:
boolean ok = false;
int read = 0;
System.out.println("In Read :");
while (ok == false) {
    int availableBytes = 0;
    try {
        availableBytes = inputStream.available();
        if (availableBytes > 0) {
            read = read + availableBytes;
            int raw = inputStream.read(readBuffer, read - availableBytes, availableBytes);
            System.out.println("Inputstream = " + raw);
            traduction = new String(readBuffer, read - availableBytes, availableBytes);
            System.out.println("2=>" + traduction);
            Response = new String(readBuffer, "UTF-8"); // bytes -> String
        }
    } catch (IOException e) {
    }
    if (availableBytes == 0 && (read == 19 || read == 8)) {
        ok = true;
    }
}
As I read your comments, I am under the impression that you're a little confused as to what characters and ASCII are.
Characters are numbers. Plain dumb numbers. It just so happens that people created standard mappings between numbers and letters. For instance, according to the ASCII character map, 97 is a. The implications of this are that when display software sees 97, it knows that it has to find the glyph for the character a in a given font, and draw it to the screen.
Integer values 0 through 31, when interpreted with the ASCII character map, are so-called control characters and as such have no visual glyph associated with them. They tell software how to behave rather than what to display. For instance, character #0 is the NUL character, used to signal the end of a string in the C string library, and it has little to no practical use in most other languages. Off the top of my head, character #10 is LF, for "line feed" (new line), and it tells the rendering software to move the drawing cursor to the next line rather than render a character.
Most ASCII control characters are outdated and are not meant to be sent to text rendering software. As such, implementations decide how they deal with them if they don't know what to do. Many of them do nothing, some print question marks, and some print completely unrelated characters.
ASCII only maps integers from 0 to 127 to glyphs or control characters, which leaves the other 128 possible integers in a byte undefined. Integers above 127 have no associated glyph in the ASCII standard, and only these can be called "not ASCII". So what you should really be asking is "is that text?" rather than "is that ASCII?", because any sequence of integers between 0 and 127 is necessarily ASCII, which however says nothing about whether or not it's human-readable.
And the obvious answer to that question is "no, it's not text". Asking what it is if it's not text is asking us to be psychics, since there's no "universal bug" that maims text. It could be almost anything.
However, since you state that you're reading from a serial link, I'd advise you to check the baud rate and other link settings, because there's no built-in mechanism to detect mismatches from one end to the other, and a mismatch can mangle data exactly the way it does here.
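If you just want to inspect the raw bytes, a small helper method along these lines (purely illustrative, not part of your code) prints each byte as hex and, when it falls in the printable ASCII range, also as a character:
// Illustrative helper: dump bytes as hex plus, where printable, the ASCII character.
static String dump(byte[] data, int length) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < length; i++) {
        int b = data[i] & 0xFF;                      // treat the byte as unsigned
        sb.append(String.format("%02X", b));
        if (b >= 0x20 && b <= 0x7E) {                // printable ASCII range
            sb.append('(').append((char) b).append(')');
        }
        sb.append(' ');
    }
    return sb.toString();
}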
Use the raw value instead of availableBytes:
traduction = new String(readBuffer, read-availableBytes, raw);
The raw value indicates how many bytes were actually read, as opposed to how many you requested. If you ask for 10 bytes and it only reads 5, the remaining 5 will be unknown garbage.
UPDATE
The response is obviously wrong too and for the same reason:
Response = new String(readBuffer, "UTF-8");
You are telling it to convert the entire buffer even though you may have only read 1 byte. If you're a bit unlucky you'll get mangled characters, because not all byte sequences can be decoded as UTF-8.
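Putting the two fixes together, the body of the loop might look roughly like this (a sketch reusing your variable names inside your existing try/catch; read is advanced only by the number of bytes actually returned):
int raw = inputStream.read(readBuffer, read, readBuffer.length - read);
if (raw > 0) {
    // Decode only the bytes actually read by this call, starting where they were written.
    traduction = new String(readBuffer, read, raw);
    read += raw;
    System.out.println("2=>" + traduction);
    // Build the response from everything received so far, not from the whole buffer.
    Response = new String(readBuffer, 0, read, "UTF-8");
}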