InputStream misses byte while reading in a loop

I'm trying to write a Java program that reads from a COM port. There are 266 bytes to read, and because they are not all generated at once, the input stream can be empty at times, so I used a while loop to read all 266 bytes. The problem is that SOMETIMES exactly one byte is missed, as I found by checking the received bytes one by one. Here is the code:
while(numOfBytes < 266) {
    if(!(inputStream.available() > 0)) continue;
    inputStream.read(buffer);
    data[numOfBytes] = buffer[0];
    numOfBytes++;
}

You give the input stream an array to store data in (inputStream.read(buffer)), but regardless of how many bytes it actually reads, you store only one byte and increment the byte count by 1. Any other bytes read in that call are lost.
Try something like this instead:
while(numOfBytes < 266) {
    if(!(inputStream.available() > 0)) continue;
    // read() without arguments returns exactly one byte (0..255), or -1 at end of stream
    int b = inputStream.read();
    if(b >= 0){
        data[numOfBytes] = (byte) b;
        numOfBytes++;
    }
}
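The same idea can be written without polling available() at all. This is only a minimal sketch, assuming the port's stream blocks until at least one byte arrives; the class name ExactRead and the helper readExactly are names made up for this example, and a ByteArrayInputStream stands in for the COM port.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class ExactRead {
    // Hypothetical helper: keeps reading until exactly `count` bytes have arrived.
    public static byte[] readExactly(InputStream in, int count) throws IOException {
        byte[] data = new byte[count];
        int total = 0;
        while (total < count) {
            // read() may deliver fewer bytes than requested, so always use its return value.
            int n = in.read(data, total, count - total);
            if (n < 0) {
                throw new EOFException("stream ended after " + total + " bytes");
            }
            total += n;
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the serial port: 300 bytes, of which we want the first 266.
        byte[] source = new byte[300];
        for (int i = 0; i < source.length; i++) {
            source[i] = (byte) i;
        }
        byte[] data = readExactly(new ByteArrayInputStream(source), 266);
        System.out.println(data.length); // prints 266
    }
}
```

Because every byte that read() returns is written at the current offset, nothing is dropped even when several bytes arrive in one call.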

Normally I would do it like this:
byte[] in = new byte[4196];
int bytesRead = 0;
while ((bytesRead = is.read(in)) != -1) {
    // add to a StringBuffer maybe
}

Related

Java Reading large files into byte array chunk by chunk

So I've been trying to make a small program that inputs a file into a byte array, then it will turn that byte array into hex, then binary. It will then play with the binary values (I haven't thought of what to do when I get to this stage) and then save it as a custom file.
I studied a lot of internet code and I can turn a file into a byte array and into hex, but the problem is I can't turn huge files into byte arrays (out of memory).
This is the code that is not a complete failure
public void rundis(Path pp) {
byte bb[] = null;
try {
bb = Files.readAllBytes(pp); //Files.toByteArray(pathhold);
System.out.println("byte array made");
} catch (Exception e) {
e.printStackTrace();
}
if (bb != null && bb.length != 0) { // check for null first, otherwise bb.length throws a NullPointerException
System.out.println("byte array filled");
//send to method to turn into hex
} else {
System.out.println("byte array NOT filled");
}
}
I know how the process should go, but I don't know how to code that properly.
The process if you are interested:
Input file using File
Read the chunk by chunk of the file into a byte array. Ex. each byte array record hold 600 bytes
Send that chunk to be turned into a hex value --> Integer.toHexString
Send that hex value chunk to be made into a binary value --> Integer.toBinaryString
Mess around with the Binary value
Save to custom file line by line
Problem:: I don't know how to turn a huge file into a byte array chunk by chunk to be processed.
Any and all help will be appreciated, thank you for reading :)

To chunk your input, use a FileInputStream:
Path pp = FileSystems.getDefault().getPath("logs", "access.log");
final int BUFFER_SIZE = 1024*1024; // buffer size in bytes (1 MiB)
FileInputStream fis = new FileInputStream(pp.toFile());
byte[] buffer = new byte[BUFFER_SIZE];
int read = 0;
while( ( read = fis.read( buffer ) ) > 0 ){
    // only the first `read` bytes of buffer are valid here
    // call your other methods here...
}
fis.close();
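As for what those "other methods" could look like, here is a rough sketch of the hex and binary steps from the question. The class name ChunkText and both helper names are invented for illustration; the only library calls are Integer.toHexString and Integer.toBinaryString. Each byte is masked with 0xFF so negative bytes are not sign-extended, and 0x100 is added so leading zeros survive.

```java
public class ChunkText {
    // Hypothetical helper: two lowercase hex digits per byte for the first `length` bytes.
    public static String toHex(byte[] chunk, int length) {
        StringBuilder sb = new StringBuilder(length * 2);
        for (int i = 0; i < length; i++) {
            // (b & 0xFF) + 0x100 is in 0x100..0x1FF, so the string is "1xx"; drop the leading 1.
            sb.append(Integer.toHexString((chunk[i] & 0xFF) + 0x100).substring(1));
        }
        return sb.toString();
    }

    // Hypothetical helper: eight binary digits per byte for the first `length` bytes.
    public static String toBinary(byte[] chunk, int length) {
        StringBuilder sb = new StringBuilder(length * 8);
        for (int i = 0; i < length; i++) {
            sb.append(Integer.toBinaryString((chunk[i] & 0xFF) + 0x100).substring(1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] chunk = {0x0A, (byte) 0xFF, 0x41};
        System.out.println(toHex(chunk, chunk.length));    // prints 0aff41
        System.out.println(toBinary(chunk, chunk.length)); // prints 000010101111111101000001
    }
}
```

Passing the count returned by read() as `length` matters, because the last chunk of a file is usually shorter than the buffer.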

To stream a file, you need to step away from Files.readAllBytes(). It's a nice utility for small files, but as you noticed not so much for large files.
In pseudocode it would look something like this:
while there are more bytes available
read some bytes
process those bytes
(write the result back to a file, if needed)
In Java, you can use a FileInputStream to read a file byte by byte or chunk by chunk. Let's say we want to write back our processed bytes. First we open the files:
FileInputStream is = new FileInputStream(new File("input.txt"));
FileOutputStream os = new FileOutputStream(new File("output.txt"));
We need the FileOutputStream to write back our results - we don't want to just drop our precious processed data, right? Next we need a buffer which holds a chunk of bytes:
byte[] buf = new byte[4096];
How many bytes is up to you; I kinda like chunks of 4096 bytes. Then we need to actually read some bytes:
int read = is.read(buf);
This reads up to buf.length bytes, stores them in buf, and returns the number of bytes actually read (or -1 at the end of the file). Then we process the bytes:
//Assuming the processing function looks like this:
//byte[] process(byte[] data, int bytes);
byte[] ret = process(buf, read);
process() in the above example is your processing method. It takes a byte array and the number of bytes it should process, and returns the result as a byte array.
Last, we write the result back to a file:
os.write(ret);
We have to execute this in a loop until there are no bytes left in the file, so let's write a loop for it:
int read = 0;
while((read = is.read(buf)) > 0) {
byte[] ret = process(buf, read);
os.write(ret);
}
And finally, close the streams:
is.close();
os.close();
And that's it. We processed the file in 4096-byte chunks and wrote the result back to a file. It's up to you what to do with the result: you could also send it over TCP, drop it if it's not needed, or even read from TCP instead of a file. The basic logic is the same.
This still needs some proper error handling to deal with missing files or wrong permissions, but that's up to you to implement.
An example implementation for the process method:
//returns the hex representation of the bytes as ASCII characters
//needs: import java.nio.charset.StandardCharsets;
public static byte[] process(byte[] bytes, int length) {
    final byte[] hexchars = "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII);
    // two hex digits per input byte; a byte[] so it matches the declared return type
    byte[] ret = new byte[length * 2];
    for (int i = 0; i < length; ++i) {
        int b = bytes[i] & 0xFF;
        ret[i * 2] = hexchars[b >>> 4];
        ret[i * 2 + 1] = hexchars[b & 0x0F];
    }
    return ret;
}
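Putting the pieces together, the whole thing might look roughly like the sketch below. It uses try-with-resources so both streams are closed even when an exception is thrown, which covers part of the error handling mentioned above. The class name HexDump, the convert method and the temporary files in main are made up for the example; process() follows the same idea as the example implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class HexDump {
    // Returns the hex representation (ASCII characters) of the first `length` bytes.
    public static byte[] process(byte[] bytes, int length) {
        byte[] hexchars = "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII);
        byte[] ret = new byte[length * 2];
        for (int i = 0; i < length; i++) {
            int b = bytes[i] & 0xFF;
            ret[i * 2] = hexchars[b >>> 4];
            ret[i * 2 + 1] = hexchars[b & 0x0F];
        }
        return ret;
    }

    // Hypothetical driver: streams `input` through process() in 4096-byte chunks.
    public static void convert(Path input, Path output) throws IOException {
        byte[] buf = new byte[4096];
        try (InputStream is = Files.newInputStream(input);
             OutputStream os = Files.newOutputStream(output)) {
            int read;
            while ((read = is.read(buf)) > 0) {
                // Only the first `read` bytes of buf belong to this chunk.
                os.write(process(buf, read));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("input", ".bin");
        Path out = Files.createTempFile("output", ".txt");
        Files.write(in, new byte[] {0x00, 0x7F, (byte) 0xFF});
        convert(in, out);
        System.out.println(new String(Files.readAllBytes(out), StandardCharsets.US_ASCII)); // prints 007FFF
    }
}
```

Memory use stays at one buffer plus one processed chunk, no matter how large the input file is.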

Java - Read mixed data types from a file (char + int + binary)

I am reading data from a file (Actually a fifo pipe).
The format is as follows
SECTION_NAME
SECTION_SIZE
SECTION_DATA[SECTION_SIZE]
.....
Ex:
pps_frame
1404
<Binary data of 1404 bytes>
sps_frame
1000
<Binary data of 1000 bytes>
...
How do i Read the binary data of 1404 bytes in the above example. Note that the SECTION_SIZE keeps varying.
PS: In my native code, I am writing this data into the pipe.

Unfortunately that is indeed a bit cumbersome: first reading text, and then binary data. Here is a solution that stays with binary reading throughout.
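FileInputStream fin = ...
BufferedInputStream bin = new BufferedInputStream(fin, 2048);
String name = readLine(bin);
String lineWithSize = readLine(bin);
int size = Integer.parseInt(lineWithSize);
byte[] data = new byte[size];
// read() may return fewer bytes than requested (especially on a pipe), so loop until the section is complete
int off = 0;
while (off < size) {
    int n = bin.read(data, off, size - off);
    if (n < 0) {
        throw new EOFException("section truncated");
    }
    off += n;
}
bin.close();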
This uses a helper function readLine that reads bytes and converts them to text.
Like BufferedReader.readLine, it discards the line ending, under the simplifying assumption that a line ending is either LF or CR+LF.
String readLine(BufferedInputStream bin) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    for (;;) {
        int ch = bin.read();
        if (ch == -1) {
            break;
        }
        if (ch == '\r' || ch == '\n') {
            if (ch == '\r') {
                bin.read(); // skip the LF that is assumed to follow the CR
            }
            break;
        }
        baos.write(ch);
    }
    return baos.toString("ISO-8859-1"); // Basic Latin-1 encoding.
}
Reading text with a Reader always converts the bytes, using some encoding, into Java's internal Unicode representation (a char is a two-byte UTF-16 code unit). Hence you cannot reliably carry arbitrary binary data through a String, and because of the size and the conversion it is not even desirable.
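Since the pipe carries several sections, the same approach can be wrapped in a loop. The following is a sketch under stated assumptions, not a tested reader for the real pipe: it assumes the name line and the size line each end with LF or CR+LF, and that the next section name follows the binary data directly with no separator. SectionReader and readSections are names invented here, and a byte array stands in for the FIFO.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SectionReader {
    // Reads one line as Latin-1 text, dropping CR and LF; returns null at end of stream.
    static String readLine(DataInputStream in) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        int ch;
        while ((ch = in.read()) != -1) {
            if (ch == '\n') {
                return baos.toString("ISO-8859-1");
            }
            if (ch != '\r') {
                baos.write(ch);
            }
        }
        return baos.size() == 0 ? null : baos.toString("ISO-8859-1");
    }

    // Hypothetical helper: returns (name, data) pairs in the order they appear in the stream.
    public static List<Map.Entry<String, byte[]>> readSections(InputStream raw) throws IOException {
        List<Map.Entry<String, byte[]>> sections = new ArrayList<>();
        DataInputStream in = new DataInputStream(new BufferedInputStream(raw));
        String name;
        while ((name = readLine(in)) != null) {
            String sizeLine = readLine(in);
            if (sizeLine == null) {
                throw new EOFException("missing size for section " + name);
            }
            byte[] data = new byte[Integer.parseInt(sizeLine.trim())];
            in.readFully(data); // blocks until the whole section arrived, or throws EOFException
            sections.add(new AbstractMap.SimpleEntry<>(name, data));
        }
        return sections;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream pipe = new ByteArrayOutputStream();
        pipe.write("pps_frame\n3\n".getBytes(StandardCharsets.ISO_8859_1));
        pipe.write(new byte[] {1, 2, 3});
        pipe.write("sps_frame\r\n2\r\n".getBytes(StandardCharsets.ISO_8859_1));
        pipe.write(new byte[] {9, 8});
        for (Map.Entry<String, byte[]> e : readSections(new ByteArrayInputStream(pipe.toByteArray()))) {
            System.out.println(e.getKey() + " " + e.getValue().length);
        }
    }
}
```

DataInputStream.readFully takes care of short reads, so a section that arrives in several pieces is still read completely before the next name is parsed.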

TCP socket data getting scrambled

I have a multi-threaded TCP socket listener program. I do a blocking read for a particular number of bytes (128 bytes and multiples of it up to 4x), so my packet sizes are 128 bytes, 256 bytes, 384 bytes and 512 bytes.
I am having a problem because sometimes the data gets mixed up in the socket. For example:
Supposed to read:
<header><data payload(padded with zeros to compensate size)><footer>
ex-- ABCDddddddddd0000000000WXYZ
What i read sometimes:
ex-- ABCDdd00000000000000000dddddd00
and then the next packet looks like
00000WXYZABCDddddd00000000000000000
So I close the socket, and we have defined the protocol to send back 2 or 3 old packets to avoid the loss.
My questions are:
1. Why does the data get scrambled/mixed up?
2. Can it be avoided by any means?
Here is my code for reading data.
in = new DataInputStream(conn.getInputStream());
outStream = conn.getOutputStream();
while (m_bRunThread) {
// read incoming stream
in.readFully(rec_data_in_byte, 0, 128); //blocks until 128 bytes are read from the socket
{
//converting the read byte array into string
//finding out the size from a particular position,helps determine if any further reads are required or not.
//if the size is a multiple of 128 and the size is a multiple higher than 1 then more reads are required.
if ((Integer.parseInt(SIZE) % 128 == 0) && ((SIZE / 128) > 1)) {
for(int z = 1;z < lenSIZE;z++) {
in.readFully(rec_data_in_byte1, 0, 128);//changed from in.read(rec_data_in_byte1, 0, 128); as per suggestions
}
//extracting the data,validating and processing it
}
}
}
UPDATE:
Implemented Peter's fix, but the problem still persists: the data is still getting scrambled.
Adding a few lines of extra code where the byte array is converted into a string.
byte[] REC_data=new byte[1024];
System.arraycopy(rec_data_in_byte1, 0, REC_data, 128*z, 128);
rec_data_string=MyClass2.getData(REC_data,0,Integer.parseInt(SIZE)-1,Integer.parseInt(SIZE));
the getdata() method is below:
String msg = "";//the return String
int count = 1;
for (int i = 0; i < datasize; i++) {
if (i >= startindex) {
if (count <= lengthofpacket) {
msg += String.valueOf((char) (bytedata[i]));
count++;
}
}
}
return msg;
Can any of this be the reason for the scramble?
P.S. The scramble is happening the same way as it was happening before.

When you do
int lengthActuallyRead = in.read(rec_data_in_byte1, 0, 128);
You need to check the length read. Otherwise it might read 1 byte, or anything up to 128 in this case. Note, any bytes after what was actually read are untouched so they might be 0 or they could be garbage left from a previous message.
If you expect 128 bytes you can use readFully as you did previously
in.readFully(rec_data_in_byte, 0, 128);
Note: If the amount remaining is less than 128 you might want to do this.
int remaining = size - sizeReadSoFar;
int length = in.read(rec_data_in_byte1, 0, remaining);
This prevents you reading part of the next message while you are still reading the old one.
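To make this concrete, here is a small sketch of reading one complete packet with readFully, so that a packet boundary is never crossed. The layout is an assumption made only for the example: bytes 4..7 of the first 128-byte block are taken to hold the total packet size as ASCII digits. PacketReader, packetSize, readPacket and makePacket are all hypothetical names, and a byte array stands in for the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class PacketReader {
    static final int BLOCK = 128;

    // Assumed layout (hypothetical): bytes 4..7 of the first block are the packet size in ASCII digits.
    static int packetSize(byte[] firstBlock) {
        return Integer.parseInt(new String(firstBlock, 4, 4, StandardCharsets.US_ASCII));
    }

    // Reads exactly one packet of 128, 256, 384 or 512 bytes and nothing beyond it.
    public static byte[] readPacket(DataInputStream in) throws IOException {
        byte[] first = new byte[BLOCK];
        in.readFully(first);
        int size = packetSize(first);
        if (size < BLOCK || size > 4 * BLOCK || size % BLOCK != 0) {
            throw new IOException("bad packet size " + size);
        }
        byte[] packet = new byte[size];
        System.arraycopy(first, 0, packet, 0, BLOCK);
        // The remainder goes straight to its final offset; readFully stops at the packet end.
        in.readFully(packet, BLOCK, size - BLOCK);
        return packet;
    }

    // Builds a test packet: "ABCD", four size digits, zero padding, then "WXYZ".
    public static byte[] makePacket(int size) {
        byte[] p = new byte[size];
        byte[] head = ("ABCD" + String.format(Locale.ROOT, "%04d", size)).getBytes(StandardCharsets.US_ASCII);
        byte[] foot = "WXYZ".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(head, 0, p, 0, head.length);
        System.arraycopy(foot, 0, p, size - foot.length, foot.length);
        return p;
    }

    public static void main(String[] args) throws IOException {
        byte[] a = makePacket(256);
        byte[] b = makePacket(128);
        byte[] wire = new byte[a.length + b.length];
        System.arraycopy(a, 0, wire, 0, a.length);
        System.arraycopy(b, 0, wire, a.length, b.length);
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
        System.out.println(readPacket(in).length); // prints 256
        System.out.println(readPacket(in).length); // prints 128
    }
}
```

Allocating a fresh array per packet also rules out stale bytes from an earlier message showing up in a later one.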

File gets corrupted when transferring it via socket

My Java client sends a file to a C++ server using this code:
FileInputStream fileInputStream = new FileInputStream(path);
byte[] buffer = new byte[64*1024];
int bytesRead = 0;
while ( (bytesRead = fileInputStream.read(buffer)) != -1)
{
if (bytesRead > 0)
{
this.outToServer.write(buffer, 0, bytesRead);
}
}
My C++ server receives the bytes using this code:
vector<char> buf5(file_length);
size_t read_bytes;
do
{
read_bytes = socket.read_some(boost::asio::buffer(buf5,file_length));
file_length -= read_bytes;
}
while(read_bytes != 0);
string file(buf5.begin(), buf5.end());
And then creates the file using this code:
ofstream out_file( (some_path).c_str() );
out_file << file << endl;
out_file.close();
However, somehow the file gets corrupted during this process.
At the end of the process, both files(the one sent and the one created) have the same size.
What am I doing wrong? Any help would be appreciated!
Edit: tried to use different code for receiving the file, same result:
char buf[file_length];
size_t length = 0;
while( length < file_length )
{
length += socket.read_some(boost::asio::buffer(&buf[length], file_length - length), error);
}
string file(buf);

1) Is it a text file?
2) If not, try opening the output file in binary mode before writing. Also, do not use the << operator; use the write or put methods instead.

In your first example the problem appears to be this line:
read_bytes = socket.read_some(boost::asio::buffer(buf5,file_length));
Every call reads into the start of the buffer, so each read overwrites the first N bytes instead of appending after the data from the previous read.
In your second example the problem is likely:
string file(buf);
If buf contains any NUL characters then the string will be truncated. Use the same string creation as in your first example with a std::vector<char>.
If you still have problems I would recommend doing a binary diff of the source and copied files (most hex editors can do this). This should give you a better picture of exactly where the difference is and what may be causing it.

difference between input.read and input.read(array, offset, length)

I'm trying to understand how input streams work. The following block of code is one of the many ways to read data from a text file:
File file = new File("./src/test.txt");
InputStream input = new BufferedInputStream(new FileInputStream(file));
int data = 0;
while (data != -1) // -1 means we reached the end of the file
{
    data = input.read(); // if a byte was read, we get its integer value, so a is 97 and b is 98
    System.out.println(data + " " + (char) data); // prints the number, then a space, then the character
}
input.close();
Now, to use input.read(byte[], offset, length), I have this code, which I got from an article:
File file = new File("./src/test.txt");
InputStream input = new BufferedInputStream (new FileInputStream(file));
int totalBytesRead = 0, bytesRemaining, bytesRead;
byte[] result = new byte[ ( int ) file.length()];
while ( totalBytesRead < result.length )
{
bytesRemaining = result.length - totalBytesRead;
bytesRead = input.read ( result, totalBytesRead, bytesRemaining );
if ( bytesRead > 0 )
totalBytesRead = totalBytesRead + bytesRead;
//printing integer version of bytes read
for (int i = 0; i < bytesRead; i++)
System.out.print(result[i] + " ");
System.out.println();
//printing character version of bytes read
for (int i = 0; i < bytesRead; i++)
System.out.print((char)result[i]);
}
input.close();
I'm assuming, based on the name bytesRead, that this read method returns the number of bytes read. The documentation says that the function will try to read as many bytes as possible, so there must be reasons why it might not read them all.
My first question is: What are these reasons?
I could replace that entire while loop with one line of code: input.read(result, 0, result.length)
I'm sure the creator of the article thought about this. It's not about the output because I get the same output in both cases. So there has to be a reason. At least one. What is it?

The documentation of read(byte[], int, int) says that it:
Reads up to len bytes of data.
An attempt is made to read as many as len bytes
A smaller number may be read.
Since we are working with files that are right there on our hard disk, it seems reasonable to expect that the attempt will read the whole file, but input.read(result, 0, result.length) is not guaranteed to read the whole file (nothing in the documentation says so). Relying on undocumented behavior is a source of bugs when that behavior changes.
For instance, the file stream may be implemented differently in other JVMs, some operating systems may impose a limit on the number of bytes that you can read at once, the file may be located on the network, or you may later use that piece of code with another stream implementation that doesn't behave that way.
Alternatively, if you are reading the whole file into an array, perhaps you could use DataInputStream.readFully.
About the loop with read(), it reads a single byte each time. That reduces performance if you are reading a big chunk of data, since each call to read() will perform several tests (has the stream ended? etc) and may ask the OS for one byte. Since you already know that you want file.length() bytes, there is no reason for not using the other more efficient forms.
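For illustration, a minimal sketch of the readFully variant follows. The class name WholeFile and the method readAll are invented for this example, and a temporary file stands in for ./src/test.txt. It assumes the file does not change size while it is being read.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class WholeFile {
    // Hypothetical helper: reads the complete file into one array.
    public static byte[] readAll(Path file) throws IOException {
        long size = Files.size(file);
        if (size > Integer.MAX_VALUE - 8) {
            throw new IOException("file too large for a single array: " + size);
        }
        byte[] result = new byte[(int) size];
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            // readFully loops internally until the array is full, or throws EOFException.
            in.readFully(result);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("test", ".txt");
        Files.write(file, "abc".getBytes(StandardCharsets.US_ASCII));
        byte[] bytes = readAll(file);
        System.out.println(bytes.length);    // prints 3
        System.out.println((char) bytes[0]); // prints a
    }
}
```

This does the same work as the while loop in the question, just with the retry logic moved into the library call.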

Imagine you are reading from a network socket, not from a file. In this case you don't have any information about the total number of bytes in the stream. You would allocate a buffer of fixed size and read from the stream in a loop. During one iteration of the loop you can't expect BUFFERSIZE bytes to be available in the stream, so you fill the buffer as far as possible and iterate again until the buffer is full. This can be useful if you have data blocks of fixed size, for example serialized objects.
ArrayList<MyObject> list = new ArrayList<MyObject>();
try {
InputStream input = socket.getInputStream();
byte[] buffer = new byte[1024];
int bytesRead;
int off = 0;
int len = 1024;
while(true) {
bytesRead = input.read(buffer, off, len);
if(bytesRead == len) {
list.add(createMyObject(buffer));
// reset variables
off = 0;
len = 1024;
continue;
}
if(bytesRead == -1) break;
// buffer is not full, adjust size
off += bytesRead;
len -= bytesRead;
}
} catch(IOException io) {
// stream was closed
}
P.S. The code is not tested and is only meant to show how this function can be useful.

You specify the amount of bytes to read because you might not want to read the entire file at once or maybe you couldn't or might not want to create a buffer as large as the file.
