First off forgive me if I am mistaken for how blocking works, to my understanding blocking will pause the thread until it is ready, for exsample when reading user input the program will wait until the user hits return.
My problem is that instead of waiting for data to become available it reads bytes with the value 0. Is there a way to block until data becoms available?
The method readBytes is called in a loop.
public byte[] readBytes(){
try{
//read the head int that will be 4 bytes telling the number of bytes that follow containing data
byte[] rawLen = new byte[4];
socketReader.read(rawLen);
ByteBuffer bb = ByteBuffer.wrap(rawLen);
int len = bb.getInt();
byte[] data = new byte[len];
if (len > 0) {
socketReader.readFully(data);
}
return data;
} catch (Exception e){
e.printStackTrace();
logError("Failed to read data: " + socket.toString());
return null;
}
}
If read() returned -1, the peer has disconnected. You aren't handling that case. If you detect end of stream you must close the connection and stop reading. At present you have no way of doing so. You need to reconsider your method signature.
You should use readInt() instead of those four lines of code that read the length. At present you are assuming have read four bytes without actually checking. readInt() will check for you.
This way also you will never get out of sync with the sender, which at present is a serious risk.
public class JavaCopyFileProgram {
public static void main(String[] args)
{
File sourceFile = new File("F:/Study/Java/Java Programs/Factory Methods.txt");
File destFile = new File("D:/DestFile.txt");
FileInputStream inStream = null;
FileOutputStream outStream = null;
try
{
inStream = new FileInputStream(sourceFile);
outStream = new FileOutputStream(destFile);
byte[] buffer = new byte[1024];
int length;
while ((length = inStream.read(buffer)) != -1)
{
outStream.write(buffer, 0, length);
}
}
catch (IOException e)
{
e.printStackTrace();
}
finally
{
try
{
inStream.close();
outStream.close();
}
catch (IOException e)
{
e.printStackTrace();
}
}
System.out.println("Success");
}
}
I am not able to understand how is thw write() method working? When it is first invoked then it is writing the cotents from 0 index to length of the byte array but when it is invoked second time then how is it appending the new text to the end of the previous text? It should override the previous content as again write is called with 0 as starting index. Please help me if i am understood something wrong?
The start offset in the write method does not refer to the offset in the FileOutputStream but to the offset in the array from which you write.
You can read in the documentation of OutputStream (rather than FileOutputStream) how the write method is supposed to be used.
For your specific write method call
outStream.write(buffer, 0, length);
it means: "write out the contents of buffer to stream outStream starting at buffer[0] the next length bytes".
The second and third parameter refer to the bounds of the array. The method will write buffer[0] to buffer[length - 1]. And since you read from inStream again and again (see head of the while loop), the contents of buffer will be filled with consecutive bytes from this input stream. The resulting operation is a file copy here.
The simplest form of the write() method is write( byte[] buffer ); it writes the entire buffer into the output stream.
However, it is often the case that we do not want to write an entire buffer, but only a portion of it. That's why the write( byte[] buffer, int offset, int length ) exists.
The code that you provided uses this variant of the write() method because the buffer is not always full, since read() call does not always read an entire buffer, it reads a length number of bytes.
In the code that you posted, the read() method is invoked each time with an offset of 0, which means that it reads bytes from the input stream and stores them into the buffer starting at offset 0. Therefore the write() method needs to also start fetching bytes to write into the output stream starting at offset zero of the buffer. That's why the offset to write is given as zero.
However, if you had a buffer containing 100 bytes, and you only wanted to write the middle 80, then you would have said write( buffer, 10, 80 ).
So I've been trying to make a small program that inputs a file into a byte array, then it will turn that byte array into hex, then binary. It will then play with the binary values (I haven't thought of what to do when I get to this stage) and then save it as a custom file.
I studied a lot of internet code and I can turn a file into a byte array and into hex, but the problem is I can't turn huge files into byte arrays (out of memory).
This is the code that is not a complete failure
public void rundis(Path pp) {
byte bb[] = null;
try {
bb = Files.readAllBytes(pp); //Files.toByteArray(pathhold);
System.out.println("byte array made");
} catch (Exception e) {
e.printStackTrace();
}
if (bb.length != 0 || bb != null) {
System.out.println("byte array filled");
//send to method to turn into hex
} else {
System.out.println("byte array NOT filled");
}
}
I know how the process should go, but I don't know how to code that properly.
The process if you are interested:
Input file using File
Read the chunk by chunk of the file into a byte array. Ex. each byte array record hold 600 bytes
Send that chunk to be turned into a Hex value --> Integer.tohexstring
Send that hex value chunk to be made into a binary value --> Integer.toBinarystring
Mess around with the Binary value
Save to custom file line by line
Problem:: I don't know how to turn a huge file into a byte array chunk by chunk to be processed.
Any and all help will be appreciated, thank you for reading :)
To chunk your input use a FileInputStream:
Path pp = FileSystems.getDefault().getPath("logs", "access.log");
final int BUFFER_SIZE = 1024*1024; //this is actually bytes
FileInputStream fis = new FileInputStream(pp.toFile());
byte[] buffer = new byte[BUFFER_SIZE];
int read = 0;
while( ( read = fis.read( buffer ) ) > 0 ){
// call your other methodes here...
}
fis.close();
To stream a file, you need to step away from Files.readAllBytes(). It's a nice utility for small files, but as you noticed not so much for large files.
In pseudocode it would look something like this:
while there are more bytes available
read some bytes
process those bytes
(write the result back to a file, if needed)
In Java, you can use a FileInputStream to read a file byte by byte or chunk by chunk. Lets say we want to write back our processed bytes. First we open the files:
FileInputStream is = new FileInputStream(new File("input.txt"));
FileOutputStream os = new FileOutputStream(new File("output.txt"));
We need the FileOutputStream to write back our results - we don't want to just drop our precious processed data, right? Next we need a buffer which holds a chunk of bytes:
byte[] buf = new byte[4096];
How many bytes is up to you, I kinda like chunks of 4096 bytes. Then we need to actually read some bytes
int read = is.read(buf);
this will read up to buf.length bytes and store them in buf. It will return the total bytes read. Then we process the bytes:
//Assuming the processing function looks like this:
//byte[] process(byte[] data, int bytes);
byte[] ret = process(buf, read);
process() in above example is your processing method. It takes in a byte-array, the number of bytes it should process and returns the result as byte-array.
Last, we write the result back to a file:
os.write(ret);
We have to execute this in a loop until there are no bytes left in the file, so lets write a loop for it:
int read = 0;
while((read = is.read(buf)) > 0) {
byte[] ret = process(buf, read);
os.write(ret);
}
and finally close the streams
is.close();
os.close();
And thats it. We processed the file in 4096-byte chunks and wrote the result back to a file. It's up to you what to do with the result, you could also send it over TCP or even drop it if it's not needed, or even read from TCP instead of a file, the basic logic is the same.
This still needs some proper error-handling to work around missing files or wrong permissions but that's up to you to implement that.
A example implementation for the process method:
//returns the hex-representation of the bytes
public static byte[] process(byte[] bytes, int length) {
final char[] hexchars = "0123456789ABCDEF".toCharArray();
char[] ret = new char[length * 2];
for ( int i = 0; i < length; ++i) {
int b = bytes[i] & 0xFF;
ret[i * 2] = hexchars[b >>> 4];
ret[i * 2 + 1] = hexchars[b & 0x0F];
}
return ret;
}
I was trying to read a file into an array by using FileInputStream, and an ~800KB file took about 3 seconds to read into memory. I then tried the same code except with the FileInputStream wrapped into a BufferedInputStream and it took about 76 milliseconds. Why is reading a file byte by byte done so much faster with a BufferedInputStream even though I'm still reading it byte by byte? Here's the code (the rest of the code is entirely irrelevant). Note that this is the "fast" code. You can just remove the BufferedInputStream if you want the "slow" code:
InputStream is = null;
try {
is = new BufferedInputStream(new FileInputStream(file));
int[] fileArr = new int[(int) file.length()];
for (int i = 0, temp = 0; (temp = is.read()) != -1; i++) {
fileArr[i] = temp;
}
BufferedInputStream is over 30 times faster. Far more than that. So, why is this, and is it possible to make this code more efficient (without using any external libraries)?
In FileInputStream, the method read() reads a single byte. From the source code:
/**
* Reads a byte of data from this input stream. This method blocks
* if no input is yet available.
*
* #return the next byte of data, or <code>-1</code> if the end of the
* file is reached.
* #exception IOException if an I/O error occurs.
*/
public native int read() throws IOException;
This is a native call to the OS which uses the disk to read the single byte. This is a heavy operation.
With a BufferedInputStream, the method delegates to an overloaded read() method that reads 8192 amount of bytes and buffers them until they are needed. It still returns only the single byte (but keeps the others in reserve). This way the BufferedInputStream makes less native calls to the OS to read from the file.
For example, your file is 32768 bytes long. To get all the bytes in memory with a FileInputStream, you will require 32768 native calls to the OS. With a BufferedInputStream, you will only require 4, regardless of the number of read() calls you will do (still 32768).
As to how to make it faster, you might want to consider Java 7's NIO FileChannel class, but I have no evidence to support this.
Note: if you used FileInputStream's read(byte[], int, int) method directly instead, with a byte[>8192] you wouldn't need a BufferedInputStream wrapping it.
A BufferedInputStream wrapped around a FileInputStream, will request data from the FileInputStream in big chunks (512 bytes or so by default, I think.) Thus if you read 1000 characters one at a time, the FileInputStream will only have to go to the disk twice. This will be much faster!
It is because of the cost of disk access. Lets assume you will have a file which size is 8kb. 8*1024 times access disk will be needed to read this file without BufferedInputStream.
At this point, BufferedStream comes to the scene and acts as a middle man between FileInputStream and the file to be read.
In one shot, will get chunks of bytes default is 8kb to memory and then FileInputStream will read bytes from this middle man.
This will decrease the time of the operation.
private void exercise1WithBufferedStream() {
long start= System.currentTimeMillis();
try (FileInputStream myFile = new FileInputStream("anyFile.txt")) {
BufferedInputStream bufferedInputStream = new BufferedInputStream(myFile);
boolean eof = false;
while (!eof) {
int inByteValue = bufferedInputStream.read();
if (inByteValue == -1) eof = true;
}
} catch (IOException e) {
System.out.println("Could not read the stream...");
e.printStackTrace();
}
System.out.println("time passed with buffered:" + (System.currentTimeMillis()-start));
}
private void exercise1() {
long start= System.currentTimeMillis();
try (FileInputStream myFile = new FileInputStream("anyFile.txt")) {
boolean eof = false;
while (!eof) {
int inByteValue = myFile.read();
if (inByteValue == -1) eof = true;
}
} catch (IOException e) {
System.out.println("Could not read the stream...");
e.printStackTrace();
}
System.out.println("time passed without buffered:" + (System.currentTimeMillis()-start));
}
Looking to read in some bytes over a socket using an inputStream. The bytes sent by the server may be of variable quantity, and the client doesn't know in advance the length of the byte array. How may this be accomplished?
byte b[];
sock.getInputStream().read(b);
This causes a 'might not be initialized error' from the Net BzEAnSZ. Help.
You need to expand the buffer as needed, by reading in chunks of bytes, 1024 at a time as in this example code I wrote some time ago
byte[] resultBuff = new byte[0];
byte[] buff = new byte[1024];
int k = -1;
while((k = sock.getInputStream().read(buff, 0, buff.length)) > -1) {
byte[] tbuff = new byte[resultBuff.length + k]; // temp buffer size = bytes already read + bytes last read
System.arraycopy(resultBuff, 0, tbuff, 0, resultBuff.length); // copy previous bytes
System.arraycopy(buff, 0, tbuff, resultBuff.length, k); // copy current lot
resultBuff = tbuff; // call the temp buffer as your result buff
}
System.out.println(resultBuff.length + " bytes read.");
return resultBuff;
Assuming the sender closes the stream at the end of the data:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buf = new byte[4096];
while(true) {
int n = is.read(buf);
if( n < 0 ) break;
baos.write(buf,0,n);
}
byte data[] = baos.toByteArray();
Read an int, which is the size of the next segment of data being received. Create a buffer with that size, or use a roomy pre-existing buffer. Read into the buffer, making sure it is limited to the aforeread size. Rinse and repeat :)
If you really don't know the size in advance as you said, read into an expanding ByteArrayOutputStream as the other answers have mentioned. However, the size method really is the most reliable.
Without re-inventing the wheel, using Apache Commons:
IOUtils.toByteArray(inputStream);
For example, complete code with error handling:
public static byte[] readInputStreamToByteArray(InputStream inputStream) {
if (inputStream == null) {
// normally, the caller should check for null after getting the InputStream object from a resource
throw new FileProcessingException("Cannot read from InputStream that is NULL. The resource requested by the caller may not exist or was not looked up correctly.");
}
try {
return IOUtils.toByteArray(inputStream);
} catch (IOException e) {
throw new FileProcessingException("Error reading input stream.", e);
} finally {
closeStream(inputStream);
}
}
private static void closeStream(Closeable closeable) {
try {
if (closeable != null) {
closeable.close();
}
} catch (Exception e) {
throw new FileProcessingException("IO Error closing a stream.", e);
}
}
Where FileProcessingException is your app-specific meaningful RT exception that will travel uninterrupted to your proper handler w/o polluting the code in between.
The simple answer is:
byte b[] = new byte[BIG_ENOUGH];
int nosRead = sock.getInputStream().read(b);
where BIG_ENOUGH is big enough.
But in general there is a big problem with this. A single read call is not guaranteed to return all that the other end has written.
If the nosRead value is BIG_ENOUGH, your application has no way of knowing for sure if there are more bytes to come; the other end may have sent exactly BIG_ENOUGH bytes ... or more than BIG_ENOUGH bytes. In the former case, you application will block (for ever) if you try to read. In the latter case, your application has to do (at least) another read to get the rest of the data.
If the nosRead value is less than BIG_ENOUGH, your application still doesn't know. It might have received everything there is, part of the data may have been delayed (due to network packet fragmentation, network packet loss, network partition, etc), or the other end might have blocked or crashed part way through sending the data.
The best answer is that EITHER your application needs to know beforehand how many bytes to expect, OR the application protocol needs to somehow tell the application how many bytes to expect or when all bytes have been sent.
Possible approaches are:
the application protocol uses fixed message sizes (not applicable to your example)
the application protocol message sizes are specified in message headers
the application protocol uses end-of-message markers
the application protocol is not message based, and the other end closes the connection to say that is the end.
Without one of these strategies, your application is left to guess, and is liable to get it wrong occasionally.
Then you use multiple read calls and (maybe) multiple buffers.
Stream all Input data into Output stream. Here is working example:
InputStream inputStream = null;
byte[] tempStorage = new byte[1024];//try to read 1Kb at time
int bLength;
try{
ByteArrayOutputStream outputByteArrayStream = new ByteArrayOutputStream();
if (fileName.startsWith("http"))
inputStream = new URL(fileName).openStream();
else
inputStream = new FileInputStream(fileName);
while ((bLength = inputStream.read(tempStorage)) != -1) {
outputByteArrayStream.write(tempStorage, 0, bLength);
}
outputByteArrayStream.flush();
//Here is the byte array at the end
byte[] finalByteArray = outputByteArrayStream.toByteArray();
outputByteArrayStream.close();
inputStream.close();
}catch(Exception e){
e.printStackTrace();
if (inputStream != null) inputStream.close();
}
Either:
Have the sender close the socket after transferring the bytes. Then at the receiver just keep reading until EOS.
Have the sender prefix a length word as per Chris's suggestion, then read that many bytes.
Use a self-describing protocol such as XML, Serialization, ...
Use BufferedInputStream, and use the available() method which returns the size of bytes available for reading, and then construct a byte[] with that size. Problem solved. :)
BufferedInputStream buf = new BufferedInputStream(is);
int size = buf.available();
Here is a simpler example using ByteArrayOutputStream...
socketInputStream = socket.getInputStream();
int expectedDataLength = 128; //todo - set accordingly/experiment. Does not have to be precise value.
ByteArrayOutputStream baos = new ByteArrayOutputStream(expectedDataLength);
byte[] chunk = new byte[expectedDataLength];
int numBytesJustRead;
while((numBytesJustRead = socketInputStream.read(chunk)) != -1) {
baos.write(chunk, 0, numBytesJustRead);
}
return baos.toString("UTF-8");
However, if the server does not return a -1, you will need to detect the end of the data some other way - e.g., maybe the returned content always ends with a certain marker (e.g., ""), or you could possibly solve using socket.setSoTimeout(). (Mentioning this as it is seems to be a common problem.)
This is both a late answer and self-advertising, but anyone checking out this question may want to take a look here:
https://github.com/GregoryConrad/SmartSocket
This question is 7 years old, but i had a similiar problem, while making a NIO and OIO compatible system (Client and Server might be whatever they want, OIO or NIO).
This was quit the challenge, because of the blocking InputStreams.
I found a way, which makes it possible and i want to post it, to help people with similiar problems.
Reading a byte array of dynamic sice is done here with the DataInputStream, which kann be simply wrapped around the socketInputStream. Also, i do not want to introduce a specific communication protocoll (like first sending the size of bytes, that will be send), because i want to make this as vanilla as possible. First of, i have a simple utility Buffer class, which looks like this:
import java.util.ArrayList;
import java.util.List;
public class Buffer {
private byte[] core;
private int capacity;
public Buffer(int size){
this.capacity = size;
clear();
}
public List<Byte> list() {
final List<Byte> result = new ArrayList<>();
for(byte b : core) {
result.add(b);
}
return result;
}
public void reallocate(int capacity) {
this.capacity = capacity;
}
public void teardown() {
this.core = null;
}
public void clear() {
core = new byte[capacity];
}
public byte[] array() {
return core;
}
}
This class only exists, because of the dumb way, byte <=> Byte autoboxing in Java works with this List. This is not realy needed at all in this example, but i did not want to leave something out of this explanation.
Next up, the 2 simple, core methods. In those, a StringBuilder is used as a "callback". It will be filled with the result which has been read and the amount of bytes read will be returned. This might be done different of course.
private int readNext(StringBuilder stringBuilder, Buffer buffer) throws IOException {
// Attempt to read up to the buffers size
int read = in.read(buffer.array());
// If EOF is reached (-1 read)
// we disconnect, because the
// other end disconnected.
if(read == -1) {
disconnect();
return -1;
}
// Add the read byte[] as
// a String to the stringBuilder.
stringBuilder.append(new String(buffer.array()).trim());
buffer.clear();
return read;
}
private Optional<String> readBlocking() throws IOException {
final Buffer buffer = new Buffer(256);
final StringBuilder stringBuilder = new StringBuilder();
// This call blocks. Therefor
// if we continue past this point
// we WILL have some sort of
// result. This might be -1, which
// means, EOF (disconnect.)
if(readNext(stringBuilder, buffer) == -1) {
return Optional.empty();
}
while(in.available() > 0) {
buffer.reallocate(in.available());
if(readNext(stringBuilder, buffer) == -1) {
return Optional.empty();
}
}
buffer.teardown();
return Optional.of(stringBuilder.toString());
}
The first method readNext will fill the buffer, with byte[] from the DataInputStream and return the amount bytes read this way.
In the secon method, readBlocking, i utilized the blocking nature, not to worry about consumer-producer-problems. Simply readBlocking will block, untill a new byte-array is received. Before we call this blocking method, we allocate a Buffer-size. Note, i called reallocate after the first read (inside the while loop). This is not needed. You can safely delete this line and the code will still work. I did it, because of the uniqueness of my problem.
The 2 things, i did not explain in more detail are:
1. in (the DataInputStream and the only short varaible here, sorry for that)
2. disconnect (your disconnect routine)
All in all, you can now use it, this way:
// The in has to be an attribute, or an parameter to the readBlocking method
DataInputStream in = new DataInputStream(socket.getInputStream());
final Optional<String> rawDataOptional = readBlocking();
rawDataOptional.ifPresent(string -> threadPool.execute(() -> handle(string)));
This will provide you with a way of reading byte arrays of any shape or form over a socket (or any InputStream realy). Hope this helps!