The ring buffer doesn't read the correct data - Java

I started working on a project to generate sounds at different frequencies. For that purpose, I'm using 3 objects:
a generator to create the sinusoidal signal and produce numbered sound packets (with a simple integer ID), sampled at 44100 Hz and using an "Observable" pattern.
a "ring buffer" which is an "Observer" of the generator. It stores the packets in an array (not a list) used as the buffer
a reader to read the sound data from the ring buffer
In the Main class, I added a pause (Thread.sleep) between the creation of the buffer and the creation of the reader. It allows me to partially fill the ring buffer before reading starts (and I also wanted to see what would happen...)
The problem is in the ring buffer (the code is below). If the pause is short enough that the ring buffer is not full when reading starts, everything seems more or less OK.
But if my buffer is full (and in many other unexplained cases), my sound has a lot of noise. And if I look at the IDs of the packets being read, here is what I get:
Buffer.load - packet ID : 63 - buffer pos. = 62
Buffer.load - packet ID : 64 - buffer pos. = 63
Buffer.load - packet ID : 65 - buffer pos. = 64
Buffer.read - packet ID : 65 - buffer pos. = 2
Buffer.read - packet ID : 65 - buffer pos. = 3
Buffer.read - packet ID : 65 - buffer pos. = 4
Buffer.read - packet ID : 65 - buffer pos. = 5
The read method always returns the same packet ID, whatever the reader's position in the buffer. The load method, however, seems to work correctly and loads successive IDs into successive positions.
public class Buffer2 implements Observer {
private Object[] buffer;
/*
* Important note about the buffer:
* It is an array, not a list.
* So it is impossible to shift the elements to the left or to the right
* (a method like deleteFirst or similar doesn't exist for arrays)
* when reading (then removing) or adding an element.
* So I'll use three variables: one for the read position in the buffer,
* one for the write position,
* and a last one to track the amount of data in the buffer.
*/
private int inBuffer, bufferSize, first, last;
public Buffer2 (int bufferSize) {
buffer = new Object[bufferSize];
inBuffer=0;
last=0;
this.bufferSize=bufferSize;
}
public synchronized byte[] read () {
while (inBuffer==0) {
try {
wait();
System.out.println("Buffer.read = null");
} catch (InterruptedException e) {
e.printStackTrace();
}
}
// See the method "load()" for the explanation of the next line
first=(first+1)%bufferSize;
inBuffer--;
/*
* This wakes up the threads that were put to sleep by the load() method
* when the buffer was full
*/
notifyAll();
Object[] temp = (Object[]) buffer[first];
System.out.println("Buffer.read - packet ID : "+(int)temp[0]+" - buffer pos. = "+first);
return (byte[]) temp[1];
}
public synchronized void load (Object[] data) {
while (inBuffer==bufferSize) {
try {
System.out.println("Buffer.load : full !");
wait();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
/*
* How the position for the new value is calculated:
* Ex: buffer size = 10 and position 3 was just used
* (3+1)%10 = 4 => the new value will be added at position 4 of the array
* Ex2: if the position was 9:
* (9+1)%10 = 0 => the value will be added at position 0 of the array.
*/
last = (last+1)%bufferSize;
inBuffer++;
System.out.println("Buffer.load - packet ID : "+(int)data[0]+" - buffer pos. = "+last);
buffer[last]=data;
/*
* Wake up all the threads that were waiting
* because the buffer was empty
*/
notifyAll();
}
@Override
public void update(Observable observable, Object obj) {
if (observable instanceof Generator) {
load((Object[]) obj);
}
}
}
What is wrong? Thank you for all your answers.
After a few tests and checks, I noticed a few things:
the update() method receives different values and calls the load() method with those values. So it seems to work as requested.
just before altering the buffer with the instruction buffer[last]=data;, I checked whether the data is the data from the update method: yes, it is. So that works.
Just after altering the buffer, I checked the data in the whole buffer (for the tests, I just used a buffer with 10 slots). And here is the problem: all the values are the same, even those that were introduced earlier in the array. To say it another way, when I load a new value into the array, all the values are modified and become equal to this last value.
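This is the classic sign of reference aliasing. A minimal sketch (with made-up values) of what was happening: the producer reuses one array instance while the buffer only stores references:
Object[] reused = new Object[1];
Object[][] buffer = new Object[3][];
for (int i = 0; i < 3; i++) {
    reused[0] = i;      // the producer overwrites the shared instance
    buffer[i] = reused; // the buffer stores a reference, not a copy
}
// prints "2 2 2": every slot points at the same object
System.out.println(buffer[0][0] + " " + buffer[1][0] + " " + buffer[2][0]);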
Thanks to Thomas for the tips that helped me pin down the problem a bit more precisely, but I still can't solve it.
Who can help me further?
Thank you all.

I finally found and fixed the problem, helped by Thomas's tips. Indeed, I had a side effect when I received my data via the update() method.
The packet received is an Object[2]; the 2 objects in it are another Object[] for the header and a byte[] for the data itself.
So, I have to read each value of each array of this packet and write them into a "single use" object (Object[] packet = new Object[2]).
So, the fixed code for the update() method is:
@Override
public synchronized void update(Observable observable, Object obj) {
if (observable instanceof Generator) {
// A packet contains a header (header) and data (data)
Object[] packet = new Object[2];
/*
* The header contains 3 fields:
* [0] ID
* [1] The number of bytes in the array of bytes
* [2] the AudioFormat
*/
Object[] header = new Object[3];
header[0]= ((Object[]) ((Object[]) obj)[0])[0];
header[1]= ((Object[]) ((Object[]) obj)[0])[1];
header[2]= ((Object[]) ((Object[]) obj)[0])[2];
// I rebuild the header at position [0] of the packet
packet[0]=header;
byte[] data = new byte[(int) header[1]*4];
for (int i=0; i< data.length ; i++) {
/*
* Read each byte at position [i] of the byte[].
* This array of bytes is at position [1] of the Object[] obj
*/
data[i] = ((byte[]) ((Object[]) obj)[1])[i];
}
// I rebuild the data at position [1] of the packet
packet[1]=data;
// I feed the load() method with the packet
load(packet);
}
}
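As a side note, the element-by-element copy can be written more compactly with clone(), which makes the same kind of copy. A sketch, assuming the packet always has the Object[]{header, byte[]} layout shown above and the byte[] already has its final length:
@Override
public synchronized void update(Observable observable, Object obj) {
    if (observable instanceof Generator) {
        Object[] src = (Object[]) obj;
        Object[] packet = new Object[2];
        packet[0] = ((Object[]) src[0]).clone(); // copy of the 3-field header
        packet[1] = ((byte[]) src[1]).clone();   // copy of the sound bytes
        load(packet);
    }
}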
I hope this answer will help someone else. Have a good day.

Related

Problems converting byte array to collection of objects

I am trying to convert an input byte array into a collection of another data structure. The input byte array has a lot of bytes that correspond to a data structure that I call Records which consist of 8 bytes for data (long) and 8 bytes for key (double) for each record. What I am trying to do is define a function that does this automatically because I will do this process very frequently. This is the function as I have it right now:
public Record[] bytesToRecord(byte[] byteArray) {
Record[] arrayRecords = new Record[(byteArray.length/16)];
for (int i = 0; i <= (byteArray.length); i+=16) {
arrayRecords[i] = new Record(Arrays.copyOfRange(byteArray, i, i + 16));
}
return arrayRecords;
}
So as you can see above, the function takes an array of bytes and loops through every 16 bytes to create a new Record object and append it to arrayRecords, which is the collection of Records in a Record array. The problem I have is that something is going wrong so that my function is not taking exactly 16 bytes per record, and when a Record object is created I get a NullPointerException in the Record class because it cannot properly slice the subarray of 16 bytes for a Record to get the long and double values for data and key. Below is the Record class constructor:
public class Record {
private byte[] record;
long idData = ByteBuffer.wrap(Arrays.copyOfRange(record, 0,9)).getLong();
double key = ByteBuffer.wrap(Arrays.copyOfRange(record, 9,16)).getDouble();
public Record(byte[] recordArray) {
this.record = recordArray;
}
}
I hope somebody can help me to fix this function or suggest another method to do what I am trying to do.
I identified four issues in your code that lead to failures. I will go over each of them separately.
Problem 1: The problem giving you the NullPointerException has to do with your Record class. When a new class instance is created, the fields are initialized before the constructor is executed. This means in your case that ByteBuffer.wrap(Arrays.copyOfRange(record, 0,9)).getLong(); runs before this.record = recordArray;. The problem then is that the first statement uses an array that was not initialized.
Problem 2: The index you use to specify the start of the double in the array is off by one. The long has 8 bytes 0 to 7 and the double the remaining 8 bytes from 8 to 15. The end index given to the method is defined to be 1 more than the actual end index.
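A quick illustration of the exclusive end index:
// Arrays.copyOfRange's 'to' index is exclusive
byte[] a = {0, 1, 2, 3};
byte[] firstTwo = Arrays.copyOfRange(a, 0, 2); // {0, 1}: indices 0 and 1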
So, to solve these two issues you have to change your Record to something like this:
public class Record {
long idData;
double key;
public Record(byte[] recordArray) {
this.idData = ByteBuffer.wrap(Arrays.copyOfRange(recordArray, 0, 8)).getLong();
this.key = ByteBuffer.wrap(Arrays.copyOfRange(recordArray, 8, 16)).getDouble();
}
}
Problem 3: In the function bytesToRecord you have a for loop starting at index zero that runs while the index is smaller than or equal to the length of the array. This has to be changed to strictly smaller, as otherwise the loop iterates one time too many.
Problem 4: You use the loop index for addressing entries in both the Record[] and the byte[], which have completely different lengths. The simplest solution is to iterate over the Record[] and calculate the index into the byte[].
Something like this should solve these two issues:
public static Record[] bytesToRecord(byte[] byteArray) {
Record[] arrayRecords = new Record[(byteArray.length/16)];
for (int i = 0; i < (arrayRecords.length); i++) {
arrayRecords[i] = new Record(Arrays.copyOfRange(byteArray, i*16, i * 16 + 16));
}
return arrayRecords;
}
Edit: In my opinion, an enhancement to your approach would be to limit the Record class to only storing the values and to make all computations inside bytesToRecord. This way you don't have to copy parts of the array, and you save memory. The code would then look like this:
public class Record {
long idData;
double key;
public Record(long idData, double key) {
this.idData = idData;
this.key = key;
}
}
public static Record[] bytesToRecord(byte[] byteArray) {
Record[] arrayRecords = new Record[(byteArray.length / 16)];
for (int i = 0; i < (arrayRecords.length); i++) {
long id = ByteBuffer.wrap(byteArray, i * 16, 8).getLong();
double key = ByteBuffer.wrap(byteArray, i * 16 + 8, 8).getDouble();
arrayRecords[i] = new Record(id, key);
}
return arrayRecords;
}
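A quick sanity check of the parsing, with made-up values:
// 16 bytes: a big-endian long followed by a big-endian double
byte[] raw = ByteBuffer.allocate(16).putLong(42L).putDouble(3.14).array();
Record[] records = bytesToRecord(raw);
// records[0].idData == 42 and records[0].key == 3.14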

Sending data to a database in size-limited chunks

I have a method which takes a Partition enum as a parameter. This method will be called by multiple background threads (15 max) around the same time period, each passing a different value of partition. Here dataHoldersByPartition is a map of Partition to ConcurrentLinkedQueue<DataHolder>.
private final ImmutableMap<Partition, ConcurrentLinkedQueue<DataHolder>> dataHoldersByPartition;
//... some code to populate entry in `dataHoldersByPartition`
private void validateAndSend(final Partition partition) {
ConcurrentLinkedQueue<DataHolder> dataHolders = dataHoldersByPartition.get(partition);
Map<byte[], byte[]> clientKeyBytesAndProcessBytesHolder = new HashMap<>();
int totalSize = 0;
DataHolder dataHolder;
while ((dataHolder = dataHolders.poll()) != null) {
byte[] clientKeyBytes = dataHolder.getClientKey().getBytes(StandardCharsets.UTF_8);
if (clientKeyBytes.length > 255)
continue;
byte[] processBytes = dataHolder.getProcessBytes();
int clientKeyLength = clientKeyBytes.length;
int processBytesLength = processBytes.length;
int additionalLength = clientKeyLength + processBytesLength;
if (totalSize + additionalLength > 50000) {
Message message = new Message(clientKeyBytesAndProcessBytesHolder, partition);
// here size of `message.serialize()` byte array should always be less than 50k at all cost
sendToDatabase(message.getAddress(), message.serialize());
clientKeyBytesAndProcessBytesHolder = new HashMap<>();
totalSize = 0;
}
clientKeyBytesAndProcessBytesHolder.put(clientKeyBytes, processBytes);
totalSize += additionalLength;
}
// calling again with remaining values only if clientKeyBytesAndProcessBytesHolder is not empty
if(!clientKeyBytesAndProcessBytesHolder.isEmpty()) {
Message message = new Message(clientKeyBytesAndProcessBytesHolder, partition);
// here size of `message.serialize()` byte array should always be less than 50k at all cost
sendToDatabase(message.getAddress(), message.serialize());
}
}
And below is my Message class:
public final class Message {
private final byte dataCenter;
private final byte recordVersion;
private final Map<byte[], byte[]> clientKeyBytesAndProcessBytesHolder;
private final long address;
private final long addressFrom;
private final long addressOrigin;
private final byte recordsPartition;
private final byte replicated;
public Message(Map<byte[], byte[]> clientKeyBytesAndProcessBytesHolder, Partition recordPartition) {
this.clientKeyBytesAndProcessBytesHolder = clientKeyBytesAndProcessBytesHolder;
this.recordsPartition = (byte) recordPartition.getPartition();
this.dataCenter = Utils.CURRENT_LOCATION.get().datacenter();
this.recordVersion = 1;
this.replicated = 0;
long packedAddress = new Data().packAddress();
this.address = packedAddress;
this.addressFrom = 0L;
this.addressOrigin = packedAddress;
}
// Output of this method should always be less than 50k
public byte[] serialize() {
int bufferCapacity = getBufferCapacity(clientKeyBytesAndProcessBytesHolder); // 36 + dataSize + 1 + 1 + keyLength + 8 + 2;
ByteBuffer byteBuffer = ByteBuffer.allocate(bufferCapacity).order(ByteOrder.BIG_ENDIAN);
// header layout
byteBuffer.put(dataCenter).put(recordVersion).putInt(clientKeyBytesAndProcessBytesHolder.size())
.putInt(bufferCapacity).putLong(address).putLong(addressFrom).putLong(addressOrigin)
.put(recordsPartition).put(replicated);
// now the data layout
for (Map.Entry<byte[], byte[]> entry : clientKeyBytesAndProcessBytesHolder.entrySet()) {
byte keyType = 0;
byte[] key = entry.getKey();
byte[] value = entry.getValue();
byte keyLength = (byte) key.length;
short valueLength = (short) value.length;
ByteBuffer dataBuffer = ByteBuffer.wrap(value);
long timestamp = valueLength > 10 ? dataBuffer.getLong(2) : System.currentTimeMillis();
byteBuffer.put(keyType).put(keyLength).put(key).putLong(timestamp).putShort(valueLength)
.put(value);
}
return byteBuffer.array();
}
private int getBufferCapacity(Map<byte[], byte[]> clientKeyBytesAndProcessBytesHolder) {
int size = 36;
for (Entry<byte[], byte[]> entry : clientKeyBytesAndProcessBytesHolder.entrySet()) {
size += 1 + 1 + 8 + 2;
size += entry.getKey().length;
size += entry.getValue().length;
}
return size;
}
// getters and to string method here
}
Basically, what I have to make sure is that whenever the sendToDatabase method is called, the size of the message.serialize() byte array is always less than 50k, at all costs. My sendToDatabase method sends the byte array coming out of the serialize method. Because of that condition, I am doing the validation below plus a few other things. In the method, I iterate over the dataHolders CLQ and extract clientKeyBytes and processBytes from it. Here is the validation I am doing:
If the clientKeyBytes length is greater than 255 then I will skip it and continue iterating.
I will keep incrementing the totalSize variable which will be the sum of clientKeyLength and processBytesLength, and this totalSize length should always be less than 50000 bytes.
As soon as it reaches the 50000 limit, I will send the clientKeyBytesAndProcessBytesHolder map to the sendToDatabase method and clear out the map, reset totalSize to 0 and start populating again.
If it doesn't reach that limit and dataHolders becomes empty, it will send whatever it has.
I believe there is some bug in my current code because of which some records may not be sent properly or get dropped somewhere because of my condition, and I am not able to figure it out. It looks like, to properly enforce this 50k condition, I may have to use the getBufferCapacity method to figure out the actual size before calling the sendToDatabase method?
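Something like this is what I have in mind (a sketch only, assuming getBufferCapacity above matches the real serialized layout: 36 header bytes, plus 12 bytes of per-entry overhead, plus key and value lengths):
int serializedSize = 36; // header size, as in getBufferCapacity
while ((dataHolder = dataHolders.poll()) != null) {
    byte[] clientKeyBytes = dataHolder.getClientKey().getBytes(StandardCharsets.UTF_8);
    if (clientKeyBytes.length > 255)
        continue;
    byte[] processBytes = dataHolder.getProcessBytes();
    // 1 (keyType) + 1 (keyLength) + 8 (timestamp) + 2 (valueLength) per entry
    int entrySize = 1 + 1 + 8 + 2 + clientKeyBytes.length + processBytes.length;
    if (serializedSize + entrySize > 50000 && !clientKeyBytesAndProcessBytesHolder.isEmpty()) {
        Message message = new Message(clientKeyBytesAndProcessBytesHolder, partition);
        sendToDatabase(message.getAddress(), message.serialize());
        clientKeyBytesAndProcessBytesHolder = new HashMap<>();
        serializedSize = 36;
    }
    clientKeyBytesAndProcessBytesHolder.put(clientKeyBytes, processBytes);
    serializedSize += entrySize;
}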
I checked your code; it looks good as far as your logic goes. You said it should always store information amounting to less than 50K, but it will actually store information up to 50K. To make it strictly less than 50K you have to change the if condition to if (totalSize + additionalLength >= 50000).
If your code is still not fulfilling your requirement, i.e. it stores information when totalSize + additionalLength is greater than 50k, I can offer a few suggestions.
As multiple threads call this method, you need to consider making two sections of your code synchronized.
One is the global variable, the container object dataHoldersByPartition. If multiple concurrent and parallel lookups happen on this container object, the outcome might not be correct. Check whether the container type is synchronized or not. If not, make the block look like below:
synchronized(this){
ConcurrentLinkedQueue<DataHolder> dataHolders = dataHoldersByPartition.get(partition);
}
Now, I can give only two suggestions to fix this issue. One is that instead of if (totalSize + additionalLength > 50000) you can check the size of the clientKeyBytesAndProcessBytesHolder object, e.g. if(sizeof(clientKeyBytesAndProcessBytesHolder) >= 50000) (check for an appropriate way to measure object size in Java). The second is to narrow down the area in order to check whether it is a side effect of multithreading or not. All these suggestions are meant to locate where exactly the problem is; the fix has to come from your end.
First check whether your method validateAndSend exactly satisfies your requirement or not. To do that, synchronize the whole validateAndSend method first and check whether everything is fine or you still have the same result. If you still have the same result, it is not because of multithreading; your code simply does not match the requirement. If it works fine, it is a multithreading problem. If synchronizing the method fixes your issue but degrades performance, remove the synchronization from the method and instead wrap each small block of code that might cause the issue in a synchronized block, one at a time, removing it again if it still doesn't fix the issue. That way you finally locate the block of code which is actually creating the issue, and you leave it synchronized as the final fix.
For example, first attempt:
`private synchronized void validateAndSend`
Second attempt: remove the synchronized keyword from the method and do the step below:
synchronized(this){
Message message = new Message(clientKeyBytesAndProcessBytesHolder, partition);
sendToDatabase(message.getAddress(), message.serialize());
}
If you think I did not understand you correctly, please let me know.
In your validateAndSend I would put all the data onto a queue and do all the processing in a separate thread. Please consider the command pattern. That way all threads just put their load on the queue. The consumer thread has all the data and all the information in place, and can process it quite effectively. The only complicated part is sending the response / result back to the calling thread. Since in your case that is not a problem - all the better. There are some more benefits to this pattern - please look at netflix/hystrix.
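A rough sketch of that handoff (the names are illustrative, not from your code):
BlockingQueue<DataHolder> queue = new LinkedBlockingQueue<>();
// producer threads simply enqueue and return immediately:
queue.add(dataHolder);
// one consumer thread owns all batching and sending:
new Thread(() -> {
    try {
        while (true) {
            DataHolder next = queue.take(); // blocks until work arrives
            // accumulate next into a batch here and call sendToDatabase(...)
            // from this single thread; no synchronization needed
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}).start();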

Queue of byte buffers in Java

I want to add ByteBuffers to a queue in Java, so I have the following code:
public class foo{
private Queue <ByteBuffer> messageQueue = new LinkedList<ByteBuffer>();
protected boolean queueInit(ByteBuffer bbuf)
{
if(bbuf.capacity() > 10000)
{
int limit = bbuf.limit();
bbuf.position(0);
for(int i = 0;i<limit;i=i+10000)
{
int kb = 1024;
for(int j = 0;j<kb;j++)
{
ByteBuffer temp = ByteBuffer.allocate(kb);
temp.array()[j] = bbuf.get(j);
System.out.println(temp.get(j));
addQueue(temp);
}
}
}
System.out.println(messageQueue.peek().get(1));
return true;
}
private void addQueue(ByteBuffer bbuf)
{
messageQueue.add(bbuf);
}
}
The inner workings of the for loop appear to be correct, as the temp value is set to the right value and should then be added to the queue by calling the addQueue method. However, only the first letter of the ByteBuffer gets added to the queue and nothing else. When I peek at the first value at the head of the queue I get the number 116, as I should, but when I try to get other values from the head they are 0, which is not correct. Why might this be happening, where no values except the first value of the ByteBuffer get added to the head of the queue?
ByteBuffer.allocate creates a new ByteBuffer. In each iteration of your inner j loop, you are creating a new buffer, placing a single byte in it, and passing that buffer to addQueue. You are doing this 1024 times (in each iteration of the outer loop), so you are creating 1024 buffers which have a single byte set; in each buffer, all other bytes will be zero.
You are not using the i loop variable of your outer loop at all. I'm not sure why you'd want to skip over 10000 bytes anyway, if your buffers are only 1024 bytes in size.
The slice method can be used to create smaller ByteBuffers from a larger one:
int kb = 1024;
while (bbuf.remaining() >= kb) {
ByteBuffer temp = bbuf.slice();
temp.limit(1024);
addQueue(temp);
bbuf.position(bbuf.position() + kb);
}
if (bbuf.hasRemaining()) {
ByteBuffer temp = bbuf.slice();
addQueue(temp);
}
It's important to remember that the new ByteBuffers will be sharing content with bbuf. Meaning, making changes to any byte in bbuf would also change exactly one of the sliced buffers. That is probably what you want, as it's more efficient than making copies of the buffer. (Potentially much more efficient, if your original buffer is large; would you really want two copies of a one-gigabyte buffer in memory?)
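A tiny demonstration of that shared content:
ByteBuffer big = ByteBuffer.wrap("hello world".getBytes(StandardCharsets.US_ASCII));
ByteBuffer slice = big.slice();
big.put(0, (byte) 'H');                  // change a byte through the original...
System.out.println((char) slice.get(0)); // ...and the slice sees 'H' too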
If you truly need to copy all the bytes into independent buffers, regardless of incurred memory usage, you could have your addQueue method copy each buffer space:
private void addQueue(ByteBuffer bbuf)
{
bbuf = ByteBuffer.allocate(bbuf.remaining()).put(bbuf); // copy
bbuf.flip();
messageQueue.add(bbuf);
}

Reading bytes from a java socket: getting ArrayIndexOutOfBounds

I am having this very strange problem: I have a small program that reads bytes off a socket;
whenever I am debugging, the program runs fine; but every time I run it (like straight up run it), I get the ArrayIndexOutOfBounds exception. What gives? Am I reading it too fast for the socket? Am I missing something?
Here is the main():
public static void main(String[] args){
TParser p = new TParser();
p.init();
p.readPacket();
p.sendResponse();
p.readPacket();
p.sendResponse();
p.shutdown();
}
The init method is where I create the sockets for reading and writing.
The next method (readPacket) is where problems start to arise. I read the entire buffer into a private byte array so I can manipulate the data freely; for instance, depending on some bytes in the data I set some properties:
public void readPacket(){
System.out.println("readPacket");
readInternalPacket();
setPacketInfo();
}
private void readInternalPacket(){
System.out.println("readInternalPacket");
try {
int available=dataIN.available();
packet= new byte[available];
dataIN.read(packet,0,available);
dataPacketSize=available;
}
catch (Exception e) {
e.printStackTrace();
}
}
private void setPacketInfo() {
System.out.println("setPacketInfo");
System.out.println("packetLen: " +dataPacketSize);
byte[] pkt= new byte[2];
pkt[0]= packet[0];
pkt[1]= packet[1];
String type= toHex(pkt);
System.out.println("packet type: "+type);
if(type.equalsIgnoreCase("000F")){
recordCount=0;
packetIterator=0;
packetType=Constants.PacketType.ACKPacket;
readIMEI();
validateDevice();
}
}
The line where it breaks is
pkt[1]= packet[1]; (setPacketInfo)
meaning it only has 1 byte at that time... but how can that be, if when I debug it it runs perfectly? Is there some sanity check I must do on the socket? (dataIN is of type DataInputStream)
Should I put the methods on separate threads? I've gone over this over and over, even replaced my memory modules (when I started having weird ideas about this).
...please help me.
I don't know the surrounding code, especially the class of dataIN, but I think your code does this:
int available=dataIN.available(); does not wait for data at all; it just returns that there are 0 bytes available,
so your array is of size 0 and you then do:
pkt[0]= packet[0]; pkt[1]= packet[1]; which is out of bounds.
I would recommend that you at least loop until available() returns the 2 bytes you expect, but I cannot be sure that this is the correct (*) or right (**) way to do it because I don't know dataIN's class implementation.
Notes: (*) it is not correct if it is possible for available() to e.g. return the 2 bytes separately. (**) it is not the right way to do it if dataIN itself provides methods that wait.
Can it be that reading the data from the socket is an asynchronous process and setPacketInfo() is called before your packet[] is completely filled? If this is the case, it is possible that it runs great when debugging but terribly when it really uses sockets on different machines.
You can add some code to the setPacketInfo() method to check the length of the packet[] variable.
byte[] pkt= new byte[packet.length];
for(int x = 0; x < packet.length; x++)
{
pkt[x]= packet[x];
}
not really sure though why you even copy the packet[] variable into pkt[]?
You are using a packet-oriented protocol on a stream-oriented layer without transmitting the real packet length. Because of fragmentation, the size of the received data can be smaller than the packet you sent.
Therefore I strongly recommend sending the data packet size before sending the actual packet. On the receiver side you can use a DataInputStream and a blocking read to detect an incoming packet:
private void readInternalPacket() {
System.out.println("readInternalPacket");
try {
int packetSize = dataIN.readInt();
packet = new byte[packetSize];
dataIN.readFully(packet, 0, packetSize); // read() may return fewer bytes; readFully blocks until all arrive
dataPacketSize = packetSize;
} catch (Exception e) {
e.printStackTrace();
}
}
Of course you have to modify the sender side as well, sending the packet size before the packet.
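The sender side could look like this (a sketch; dataOUT stands for whatever DataOutputStream wraps your socket's output stream):
dataOUT.writeInt(packet.length); // length prefix, matched by readInt() above
dataOUT.write(packet);
dataOUT.flush();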
To add to the answer from @eznme: you need to read from your underlying stream until there is no more pending data. This may require one or more reads, but an end of stream will be indicated when the available method returns 0. I would recommend using Apache IOUtils to 'copy' the input stream to a ByteArrayOutputStream, then getting the byte[] array from that.
In your setPacketInfo method you should do a check on your data buffer length before getting your protocol header bytes:
byte[] pkt= new byte[2];
if((packet != null) && (packet.length >= 2)) {
pkt[0]= packet[0];
pkt[1]= packet[1];
// ...
}
That will get rid of the out-of-bounds exceptions you are getting when you read zero-length data buffers from your protocol.
You should never rely on dataIN.available(), and dataIN.read(packet,0,available); returns an integer that says how many bytes you received. That's not always the same value as what available says, and it can also be less than the size of the buffer.
This is how you should read:
byte[] packet = new byte[1024];
dataPacketSize = dataIN.read(packet,0,packet.length);
You should also wrap your DataInputStream in a BufferedInputStream, and take care of the case where you get less than 2 bytes, so that you don't try to process bytes that you haven't received.
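For the "less than 2 bytes" case, a small helper along these lines guarantees a full read (DataInputStream.readFully does essentially the same):
static void readExactly(InputStream in, byte[] buf, int n) throws IOException {
    int off = 0;
    while (off < n) {
        int r = in.read(buf, off, n - off); // may return fewer bytes than requested
        if (r < 0) throw new EOFException("stream ended after " + off + " bytes");
        off += r;
    }
}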

How to convert OutputStream to InputStream?

I am at the stage of development where I have two modules: from one I get output as an OutputStream, and the second one accepts only an InputStream. Do you know how to convert an OutputStream to an InputStream (not vice versa, I mean really this way) so that I can connect these two parts?
Thanks
There seem to be many links and other such stuff, but no actual code using pipes. The advantage of using java.io.PipedInputStream and java.io.PipedOutputStream is that there is no additional consumption of memory. ByteArrayOutputStream.toByteArray() returns a copy of the original buffer, so whatever you have in memory, you now have two copies of it. Then writing to an InputStream means you now have three copies of the data.
The code using lambdas (hat-tip to @John Manko from the comments):
PipedInputStream in = new PipedInputStream();
final PipedOutputStream out = new PipedOutputStream(in);
// in a background thread, write the given output stream to the
// PipedOutputStream for consumption
new Thread(() -> {
// writeTo throws a checked IOException, so the lambda must handle it
try { originalOutputStream.writeTo(out); } catch (IOException e) { e.printStackTrace(); }
}).start();
One thing that @John Manko noted is that in certain cases, when you don't have control of the creation of the OutputStream, you may end up in a situation where the creator may clean up the OutputStream object prematurely. If you are getting the ClosedPipeException, then you should try inverting the constructors:
PipedOutputStream out = new PipedOutputStream();
PipedInputStream in = new PipedInputStream(out);
new Thread(() -> {
try { originalOutputStream.writeTo(out); } catch (IOException e) { e.printStackTrace(); }
}).start();
Note you can invert the constructors for the examples below too.
Thanks also to @AlexK for correcting me with starting a Thread instead of just kicking off a Runnable.
The code using try-with-resources:
// take the copy of the stream and re-write it to an InputStream
PipedInputStream in = new PipedInputStream();
new Thread(new Runnable() {
public void run () {
// try-with-resources here
// putting the try block outside the Thread will cause the
// PipedOutputStream resource to close before the Runnable finishes
try (final PipedOutputStream out = new PipedOutputStream(in)) {
// write the original OutputStream to the PipedOutputStream
// note that in order for the below method to work, you need
// to ensure that the data has finished writing to the
// ByteArrayOutputStream
originalByteArrayOutputStream.writeTo(out);
}
catch (IOException e) {
// logging and exception handling should go here
}
}
}).start();
The original code I wrote:
// take the copy of the stream and re-write it to an InputStream
PipedInputStream in = new PipedInputStream();
final PipedOutputStream out = new PipedOutputStream(in);
new Thread(new Runnable() {
public void run () {
try {
// write the original OutputStream to the PipedOutputStream
// note that in order for the below method to work, you need
// to ensure that the data has finished writing to the
// ByteArrayOutputStream
originalByteArrayOutputStream.writeTo(out);
}
catch (IOException e) {
// logging and exception handling should go here
}
finally {
// close the PipedOutputStream here because we're done writing data
// once this thread has completed its run
if (out != null) {
try {
// close the PipedOutputStream cleanly
out.close();
} catch (IOException e) {
// close() throws a checked IOException; log it here
}
}
}
}
}).start();
This code assumes that the originalByteArrayOutputStream is a ByteArrayOutputStream as it is usually the only usable output stream, unless you're writing to a file. The great thing about this is that since it's in a separate thread, it also is working in parallel, so whatever is consuming your input stream will be streaming out of your old output stream too. That is beneficial because the buffer can remain smaller and you'll have less latency and less memory usage.
If you don't have a ByteArrayOutputStream, then instead of using writeTo(), you will have to use one of the write() methods in the java.io.OutputStream class or one of the other methods available in a subclass.
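For example, with a hypothetical producer object that knows how to write itself to any OutputStream:
new Thread(() -> {
    // try-with-resources closes the pipe when the producer is done
    try (PipedOutputStream o = out) {
        producer.writeDataTo(o); // writeDataTo is a stand-in for your own code
    } catch (IOException e) {
        // logging and exception handling should go here
    }
}).start();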
An OutputStream is one where you write data to. If some module exposes an OutputStream, the expectation is that there is something reading at the other end.
Something that exposes an InputStream, on the other hand, is indicating that you will need to listen to this stream, and there will be data that you can read.
So it is possible to connect an InputStream to an OutputStream
InputStream----read---> intermediateBytes[n] ----write----> OutputStream
As someone mentioned, this is what the copy() method from IOUtils lets you do. It does not make sense to go the other way... hopefully this makes some sense.
UPDATE:
Of course the more I think of this, the more I can see how this actually would be a requirement. I know some of the comments mentioned Piped input/ouput streams, but there is another possibility.
If the output stream that is exposed is a ByteArrayOutputStream, then you can always get the full contents by calling the toByteArray() method. Then you can create an input stream wrapper by using the ByteArrayInputStream sub-class. These two are pseudo-streams, they both basically just wrap an array of bytes. Using the streams this way, therefore, is technically possible, but to me it is still very strange...
As input and output streams are just start and end points, the solution is to temporarily store the data in a byte array. So you must create an intermediate ByteArrayOutputStream, from which you create a byte[] that is used as input for a new ByteArrayInputStream.
public void doTwoThingsWithStream(InputStream inStream, OutputStream outStream){
//create temporary byte array output stream
ByteArrayOutputStream baos = new ByteArrayOutputStream();
doFirstThing(inStream, baos);
//create input stream from baos
InputStream isFromFirstData = new ByteArrayInputStream(baos.toByteArray());
doSecondThing(isFromFirstData, outStream);
}
Hope it helps.
ByteArrayOutputStream buffer = (ByteArrayOutputStream) aOutputStream;
byte[] bytes = buffer.toByteArray();
InputStream inputStream = new ByteArrayInputStream(bytes);
You will need an intermediate class which will buffer between. Each time InputStream.read(byte[]...) is called, the buffering class will fill the passed in byte array with the next chunk passed in from OutputStream.write(byte[]...). Since the sizes of the chunks may not be the same, the adapter class will need to store a certain amount until it has enough to fill the read buffer and/or be able to store up any buffer overflow.
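A minimal sketch of such an adapter (assuming a single producer and a single consumer, with close() on the writing side acting as the end-of-stream marker):
import java.io.*;
import java.util.Arrays;
import java.util.concurrent.*;

class StreamAdapter {
    private static final byte[] EOF = new byte[0]; // sentinel chunk
    private final BlockingQueue<byte[]> chunks = new LinkedBlockingQueue<>();
    private byte[] current; // chunk currently being drained by the reader
    private int pos;

    OutputStream sink() {
        return new OutputStream() {
            @Override public void write(int b) {
                chunks.add(new byte[] { (byte) b });
            }
            @Override public void write(byte[] b, int off, int len) {
                chunks.add(Arrays.copyOfRange(b, off, off + len)); // one chunk per write
            }
            @Override public void close() {
                chunks.add(EOF); // signal end-of-stream to the reader
            }
        };
    }

    InputStream source() {
        return new InputStream() {
            @Override public int read() throws IOException {
                if (current == EOF) return -1;
                while (current == null || pos == current.length) {
                    try {
                        current = chunks.take(); // block until the producer writes
                        pos = 0;
                    } catch (InterruptedException e) {
                        throw new InterruptedIOException();
                    }
                    if (current == EOF) return -1;
                }
                return current[pos++] & 0xFF;
            }
        };
    }
}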
This article has a nice breakdown of a few different approaches to this problem:
http://blog.ostermiller.org/convert-java-outputstream-inputstream
The easystream open source library has direct support to convert an OutputStream to an InputStream: http://io-tools.sourceforge.net/easystream/tutorial/tutorial.html
// create conversion
final OutputStreamToInputStream<Void> out = new OutputStreamToInputStream<Void>() {
@Override
protected Void doRead(final InputStream in) throws Exception {
LibraryClass2.processDataFromInputStream(in);
return null;
}
};
try {
LibraryClass1.writeDataToTheOutputStream(out);
} finally {
// don't miss the close (or a thread would not terminate correctly).
out.close();
}
They also list other options: http://io-tools.sourceforge.net/easystream/outputstream_to_inputstream/implementations.html
Write the data into a memory buffer (ByteArrayOutputStream), get the byteArray, and read it again with a ByteArrayInputStream. This is the best approach if you're sure your data fits into memory.
Copy your data to a temporary file and read it back.
Use pipes: this is the best approach both for memory usage and speed (you can take full advantage of the multi-core processors) and also the standard solution offered by Sun.
Use InputStreamFromOutputStream and OutputStreamToInputStream from the easystream library.
I encountered the same problem with converting a ByteArrayOutputStream to a ByteArrayInputStream and solved it by using a derived class from ByteArrayOutputStream which is able to return a ByteArrayInputStream that is initialized with the internal buffer of the ByteArrayOutputStream. This way no additional memory is used and the 'conversion' is very fast:
package info.whitebyte.utils;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
/**
* This class extends the ByteArrayOutputStream by
* providing a method that returns a new ByteArrayInputStream
* which uses the internal byte array buffer. This buffer
* is not copied, so no additional memory is used. After
* creating the ByteArrayInputStream the instance of the
* ByteArrayInOutStream can not be used anymore.
* <p>
* The ByteArrayInputStream can be retrieved using <code>getInputStream()</code>.
* @author Nick Russler
*/
public class ByteArrayInOutStream extends ByteArrayOutputStream {
/**
* Creates a new ByteArrayInOutStream. The buffer capacity is
* initially 32 bytes, though its size increases if necessary.
*/
public ByteArrayInOutStream() {
super();
}
/**
* Creates a new ByteArrayInOutStream, with a buffer capacity of
* the specified size, in bytes.
*
* @param size the initial size.
* @exception IllegalArgumentException if size is negative.
*/
public ByteArrayInOutStream(int size) {
super(size);
}
/**
* Creates a new ByteArrayInputStream that uses the internal byte array buffer
* of this ByteArrayInOutStream instance as its buffer array. The initial value
* of pos is set to zero and the initial value of count is the number of bytes
* that can be read from the byte array. The buffer array is not copied. This
* instance of ByteArrayInOutStream can not be used anymore after calling this
* method.
* @return the ByteArrayInputStream instance
*/
public ByteArrayInputStream getInputStream() {
// create new ByteArrayInputStream that respects the current count
ByteArrayInputStream in = new ByteArrayInputStream(this.buf, 0, this.count);
// set the buffer of the ByteArrayOutputStream
// to null so it can't be altered anymore
this.buf = null;
return in;
}
}
I put the stuff on github: https://github.com/nickrussler/ByteArrayInOutStream
The library io-extras may be useful. For example if you want to gzip an InputStream using GZIPOutputStream and you want it to happen synchronously (using the default buffer size of 8192):
InputStream is = ...
InputStream gz = IOUtil.pipe(is, o -> new GZIPOutputStream(o));
Note that the library has 100% unit test coverage (for what that's worth of course!) and is on Maven Central. The Maven dependency is:
<dependency>
<groupId>com.github.davidmoten</groupId>
<artifactId>io-extras</artifactId>
<version>0.1</version>
</dependency>
Be sure to check for a later version.
From my point of view, java.io.PipedInputStream/java.io.PipedOutputStream is the best option to consider. In some situations you may want to use ByteArrayInputStream/ByteArrayOutputStream. The problem is that you need to duplicate the buffer to convert a ByteArrayOutputStream to a ByteArrayInputStream. Also, ByteArrayOutputStream/ByteArrayInputStream are limited to 2GB. Here is an OutputStream/InputStream implementation I wrote to bypass the ByteArrayOutputStream/ByteArrayInputStream limitations (Scala code, but easily understandable for Java developers):
import java.io.{IOException, InputStream, OutputStream}
import scala.annotation.tailrec
/** Acts as a replacement for ByteArrayOutputStream
*
*/
class HugeMemoryOutputStream(capacity: Long) extends OutputStream {
private val PAGE_SIZE: Int = 1024000
private val ALLOC_STEP: Int = 1024
/** Pages array
*
*/
private var streamBuffers: Array[Array[Byte]] = Array.empty[Array[Byte]]
/** Allocated pages count
*
*/
private var pageCount: Int = 0
/** Allocated bytes count
*
*/
private var allocatedBytes: Long = 0
/** Current position in stream
*
*/
private var position: Long = 0
/** Stream length
*
*/
private var length: Long = 0
allocSpaceIfNeeded(capacity)
/** Gets page count based on given length
*
* @param length Buffer length
* @return Page count to hold the specified amount of data
*/
private def getPageCount(length: Long) = {
var pageCount = (length / PAGE_SIZE).toInt + 1
if ((length % PAGE_SIZE) == 0) {
pageCount -= 1
}
pageCount
}
/** Extends pages array
*
*/
private def extendPages(): Unit = {
if (streamBuffers.isEmpty) {
streamBuffers = new Array[Array[Byte]](ALLOC_STEP)
}
else {
val newStreamBuffers = new Array[Array[Byte]](streamBuffers.length + ALLOC_STEP)
Array.copy(streamBuffers, 0, newStreamBuffers, 0, streamBuffers.length)
streamBuffers = newStreamBuffers
}
pageCount = streamBuffers.length
}
/** Ensures buffers are big enough to hold the specified amount of data
*
* @param value Amount of data
*/
private def allocSpaceIfNeeded(value: Long): Unit = {
@tailrec
def allocSpaceIfNeededIter(value: Long): Unit = {
val currentPageCount = getPageCount(allocatedBytes)
val neededPageCount = getPageCount(value)
if (currentPageCount < neededPageCount) {
if (currentPageCount == pageCount) extendPages()
streamBuffers(currentPageCount) = new Array[Byte](PAGE_SIZE)
allocatedBytes = (currentPageCount + 1).toLong * PAGE_SIZE
allocSpaceIfNeededIter(value)
}
}
if (value < 0) throw new Error("AllocSpaceIfNeeded < 0")
if (value > 0) {
allocSpaceIfNeededIter(value)
length = Math.max(value, length)
if (position > length) position = length
}
}
/**
* Writes the specified byte to this output stream. The general
* contract for <code>write</code> is that one byte is written
* to the output stream. The byte to be written is the eight
* low-order bits of the argument <code>b</code>. The 24
* high-order bits of <code>b</code> are ignored.
* <p>
* Subclasses of <code>OutputStream</code> must provide an
* implementation for this method.
*
* @param b the <code>byte</code>.
*/
@throws[IOException]
override def write(b: Int): Unit = {
val buffer: Array[Byte] = new Array[Byte](1)
buffer(0) = b.toByte
write(buffer)
}
/**
* Writes <code>len</code> bytes from the specified byte array
* starting at offset <code>off</code> to this output stream.
* The general contract for <code>write(b, off, len)</code> is that
* some of the bytes in the array <code>b</code> are written to the
* output stream in order; element <code>b[off]</code> is the first
* byte written and <code>b[off+len-1]</code> is the last byte written
* by this operation.
* <p>
* The <code>write</code> method of <code>OutputStream</code> calls
* the write method of one argument on each of the bytes to be
* written out. Subclasses are encouraged to override this method and
* provide a more efficient implementation.
* <p>
* If <code>b</code> is <code>null</code>, a
* <code>NullPointerException</code> is thrown.
* <p>
* If <code>off</code> is negative, or <code>len</code> is negative, or
* <code>off+len</code> is greater than the length of the array
* <code>b</code>, then an <tt>IndexOutOfBoundsException</tt> is thrown.
*
* @param b the data.
* @param off the start offset in the data.
* @param len the number of bytes to write.
*/
@throws[IOException]
override def write(b: Array[Byte], off: Int, len: Int): Unit = {
@tailrec
def writeIter(b: Array[Byte], off: Int, len: Int): Unit = {
val currentPage: Int = (position / PAGE_SIZE).toInt
val currentOffset: Int = (position % PAGE_SIZE).toInt
if (len != 0) {
val currentLength: Int = Math.min(PAGE_SIZE - currentOffset, len)
Array.copy(b, off, streamBuffers(currentPage), currentOffset, currentLength)
position += currentLength
writeIter(b, off + currentLength, len - currentLength)
}
}
allocSpaceIfNeeded(position + len)
writeIter(b, off, len)
}
/** Gets an InputStream that points to HugeMemoryOutputStream buffer
*
* @return InputStream
*/
def asInputStream(): InputStream = {
new HugeMemoryInputStream(streamBuffers, length)
}
private class HugeMemoryInputStream(streamBuffers: Array[Array[Byte]], val length: Long) extends InputStream {
/** Current position in stream
*
*/
private var position: Long = 0
/**
* Reads the next byte of data from the input stream. The value byte is
* returned as an <code>int</code> in the range <code>0</code> to
* <code>255</code>. If no byte is available because the end of the stream
* has been reached, the value <code>-1</code> is returned. This method
* blocks until input data is available, the end of the stream is detected,
* or an exception is thrown.
*
* <p> A subclass must provide an implementation of this method.
*
* @return the next byte of data, or <code>-1</code> if the end of the
* stream is reached.
*/
@throws[IOException]
def read: Int = {
val buffer: Array[Byte] = new Array[Byte](1)
if (read(buffer) == 0) throw new Error("End of stream")
else buffer(0)
}
/**
* Reads up to <code>len</code> bytes of data from the input stream into
* an array of bytes. An attempt is made to read as many as
* <code>len</code> bytes, but a smaller number may be read.
* The number of bytes actually read is returned as an integer.
*
* <p> This method blocks until input data is available, end of file is
* detected, or an exception is thrown.
*
* <p> If <code>len</code> is zero, then no bytes are read and
* <code>0</code> is returned; otherwise, there is an attempt to read at
* least one byte. If no byte is available because the stream is at end of
* file, the value <code>-1</code> is returned; otherwise, at least one
* byte is read and stored into <code>b</code>.
*
* <p> The first byte read is stored into element <code>b[off]</code>, the
* next one into <code>b[off+1]</code>, and so on. The number of bytes read
* is, at most, equal to <code>len</code>. Let <i>k</i> be the number of
* bytes actually read; these bytes will be stored in elements
* <code>b[off]</code> through <code>b[off+</code><i>k</i><code>-1]</code>,
* leaving elements <code>b[off+</code><i>k</i><code>]</code> through
* <code>b[off+len-1]</code> unaffected.
*
* <p> In every case, elements <code>b[0]</code> through
* <code>b[off]</code> and elements <code>b[off+len]</code> through
* <code>b[b.length-1]</code> are unaffected.
*
* <p> The <code>read(b,</code> <code>off,</code> <code>len)</code> method
* for class <code>InputStream</code> simply calls the method
* <code>read()</code> repeatedly. If the first such call results in an
* <code>IOException</code>, that exception is returned from the call to
* the <code>read(b,</code> <code>off,</code> <code>len)</code> method. If
* any subsequent call to <code>read()</code> results in a
* <code>IOException</code>, the exception is caught and treated as if it
* were end of file; the bytes read up to that point are stored into
* <code>b</code> and the number of bytes read before the exception
* occurred is returned. The default implementation of this method blocks
* until the requested amount of input data <code>len</code> has been read,
* end of file is detected, or an exception is thrown. Subclasses are encouraged
* to provide a more efficient implementation of this method.
*
* @param b the buffer into which the data is read.
* @param off the start offset in array <code>b</code>
* at which the data is written.
* @param len the maximum number of bytes to read.
* @return the total number of bytes read into the buffer, or
* <code>-1</code> if there is no more data because the end of
* the stream has been reached.
* @see java.io.InputStream#read()
*/
@throws[IOException]
override def read(b: Array[Byte], off: Int, len: Int): Int = {
@tailrec
def readIter(acc: Int, b: Array[Byte], off: Int, len: Int): Int = {
val currentPage: Int = (position / PAGE_SIZE).toInt
val currentOffset: Int = (position % PAGE_SIZE).toInt
val count: Int = Math.min(len, length - position).toInt
if (count == 0 || position >= length) acc
else {
val currentLength = Math.min(PAGE_SIZE - currentOffset, count)
Array.copy(streamBuffers(currentPage), currentOffset, b, off, currentLength)
position += currentLength
readIter(acc + currentLength, b, off + currentLength, len - currentLength)
}
}
readIter(0, b, off, len)
}
/**
* Skips over and discards <code>n</code> bytes of data from this input
* stream. The <code>skip</code> method may, for a variety of reasons, end
* up skipping over some smaller number of bytes, possibly <code>0</code>.
* This may result from any of a number of conditions; reaching end of file
* before <code>n</code> bytes have been skipped is only one possibility.
* The actual number of bytes skipped is returned. If <code>n</code> is
* negative, the <code>skip</code> method for class <code>InputStream</code> always
* returns 0, and no bytes are skipped. Subclasses may handle the negative
* value differently.
*
* The <code>skip</code> method of this class creates a
* byte array and then repeatedly reads into it until <code>n</code> bytes
* have been read or the end of the stream has been reached. Subclasses are
* encouraged to provide a more efficient implementation of this method.
* For instance, the implementation may depend on the ability to seek.
*
* @param n the number of bytes to be skipped.
* @return the actual number of bytes skipped.
*/
@throws[IOException]
override def skip(n: Long): Long = {
if (n < 0) 0
else {
position = Math.min(position + n, length)
length - position
}
}
}
}
Easy to use, no buffer duplication, no 2GB memory limit
val out: HugeMemoryOutputStream = new HugeMemoryOutputStream(initialCapacity /*may be 0*/)
out.write(...)
...
val in1: InputStream = out.asInputStream()
in1.read(...)
...
val in2: InputStream = out.asInputStream()
in2.read(...)
...
As some here have answered already, there is no efficient way to just 'convert' an OutputStream to an InputStream. The trick to solving a problem like yours is to execute all the code that requires the OutputStream in its own thread. By using piped streams, we can then transfer the data out of the created thread into an InputStream.
Example usage:
public static InputStream downloadFileAsStream(final String uriString) throws IOException {
final InputStream inputStream = runInOwnThreadWithPipedStreams((outputStream) -> {
try {
downloadUriToStream(uriString, outputStream);
} catch (final Exception e) {
LOGGER.error("Download of uri '{}' has failed", uriString, e);
}
});
return inputStream;
}
Helper function:
public static InputStream runInOwnThreadWithPipedStreams(
final Consumer<OutputStream> outputStreamConsumer) throws IOException {
final PipedInputStream inputStream = new PipedInputStream();
final PipedOutputStream outputStream = new PipedOutputStream(inputStream);
new Thread(new Runnable() {
public void run() {
try {
outputStreamConsumer.accept(outputStream);
} finally {
try {
outputStream.close();
} catch (final IOException e) {
LOGGER.error("Closing outputStream has failed. ", e);
}
}
}
}).start();
return inputStream;
}
Unit Test:
@Test
void testRunInOwnThreadWithPipedStreams() throws IOException {
final InputStream inputStream = LoadFileUtil.runInOwnThreadWithPipedStreams((OutputStream outputStream) -> {
try {
IOUtils.copy(IOUtils.toInputStream("Hello World", StandardCharsets.UTF_8), outputStream);
} catch (final IOException e) {
LoggerFactory.getLogger(LoadFileUtilTest.class).error(e.getMessage(), e);
}
});
final String actualResult = IOUtils.toString(inputStream, StandardCharsets.UTF_8);
Assertions.assertEquals("Hello World", actualResult);
}
If you want to make an InputStream from an OutputStream there is one basic problem. A method writing to an OutputStream blocks until it is done, so the result is available only when the writing method has finished. This has 2 consequences:
If you use only one thread, you need to wait until everything is written (so you need to store the stream's data in memory or on disk).
If you want to access the data before writing is finished, you need a second thread.
Variant 1 can be implemented using byte arrays or files.
Variant 2 can be implemented using pipes (either directly or with an extra abstraction - e.g. a RingBuffer or the Google lib from the other comment).
Indeed, with standard Java there is no other way to solve the problem. Each solution is an implementation of one of these.
There is a concept called "continuation" (see Wikipedia for details). In this case it basically means:
there is a special output stream that expects a certain amount of data
if the amount is reached, the stream gives control to its counterpart, which is a special input stream
the input stream makes the amount of data available until it is read; after that, it passes control back to the output stream
While some languages have this concept built in, for Java you need some "magic". For example, "commons-javaflow" from Apache implements it for Java. The disadvantage is that this requires some special bytecode modifications at build time, so it would make sense to put all the stuff in an extra library with custom build scripts.
Though you cannot convert an OutputStream to an InputStream directly, Java provides a way, using PipedOutputStream and PipedInputStream, whereby data written to a PipedOutputStream becomes available through an associated PipedInputStream. Some time back I faced a similar situation when dealing with third-party libraries that required an InputStream instance to be passed to them instead of an OutputStream instance. The way I fixed this issue was to use PipedInputStream and PipedOutputStream. By the way, they are tricky to use and you must use multithreading to achieve what you want. I recently published an implementation on GitHub which you can use. Here is the link. You can go through the wiki to understand how to use it.
Old post, but it might help others. Use it this way:
ByteArrayOutputStream out = new ByteArrayOutputStream();
...
out.write(...);
...
InputStream inputStream = new ByteArrayInputStream(out.toByteArray());
