"available" of DataInputStream from Socket - java

I have this code on the client side :
DataInputStream dis = new DataInputStream(socketChannel.socket().getInputStream());
while (dis.available() > 0) {
    SomeOtherClass.method(dis);
}
But available() keeps returning 0, although there is readable data in the stream. So after the actual data to be read is finished, empty data is passed to the other class to be read, and this causes corruption.
After a little search, I found that available() is not reliable when used with sockets, and that I should read the first few bytes from the stream to actually see if data is available to parse.
But in my case, I have to pass the DataInputStream reference I get from the socket to some other class that I cannot change.
Is it possible to read a few bytes from the DataInputStream without corrupting it? Any other suggestions?

Putting a PushbackInputStream in between allows you to read some bytes without corrupting the data.
EDIT: Untested code example below. This is from memory.
static class MyWrapper extends PushbackInputStream {
    MyWrapper(InputStream in) {
        super(in);
    }

    @Override
    public int available() throws IOException {
        // Peek one byte, then push it back so the caller still sees it.
        // Note that this read() may block, unlike a plain available().
        int b = super.read();
        if (b == -1) {
            return 0; // end of stream: nothing available
        }
        super.unread(b);
        return super.available();
    }
}
public static void main(String... args) {
    InputStream originalSocketStream = null; // e.g. socketChannel.socket().getInputStream()
    DataInputStream dis = new DataInputStream(new MyWrapper(originalSocketStream));
}

This should work:
PushbackInputStream pbi = new PushbackInputStream(socketChannel.socket().getInputStream(), 1);
DataInputStream dis = new DataInputStream(pbi);
int singleByte;
while ((singleByte = pbi.read()) != -1) {
    pbi.unread(singleByte);
    SomeOtherClass.method(dis);
}
But please note that this code will behave differently from the example with available() (if available() worked), because available() does not block while read() may block.

But available() keeps returning 0, although there is readable data in the stream
If available() returns zero, either:
The input stream you are using doesn't support available() and so it just returns zero. That isn't the case here, as you are using a DataInputStream wrapped directly around the socket's input stream, and that configuration does support available(), OR ...
There is no readable data in the stream. That appears to be the case here. In fact the only possible way you can know there is readable data in the stream without actually reading it is to call available() and get a positive result. There is no other way of telling.
There are few correct uses of available(), and this isn't one of them. Why should you fall out of that loop just because there isn't any data in the socket receive buffer? The only way you should get out of that loop is by getting an end of stream condition.
I should read the first few bytes from the stream to actually see if data is available to parse.
That doesn't even make sense. If you can read anything from the stream, there is data available, and if you can't, there isn't.
Just read, block, and react correctly to EOS, in its various manifestations.
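For illustration, a minimal untested sketch of that approach, assuming the loop can simply run until the peer closes the connection:
InputStream in = socketChannel.socket().getInputStream();
byte[] buffer = new byte[8192];
int n;
while ((n = in.read(buffer)) != -1) { // read() blocks until data arrives or the stream ends
    // process buffer[0..n)
}
// -1 means the peer has closed the connection: that is the real end-of-stream signal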


converting OutputStream into an InputStream

Is there any way to convert an OutputStream into an InputStream?
So that the following would work:
InputStream convertOStoIS(OutputStream os) {
    // ...
}
I do not want to use any libraries; I read that there are some who manage to accomplish this with bytecode manipulation.
Edit
I want to be able to intercept a sink, to analyze the data or redirect the output. I want to place another OutputStream under the one given by some function and redirect the data into another InputStream.
The related topics used a ByteArrayOutputStream or piped streams, which is not the case in my question.
Related:
How to convert OutputStream to InputStream?
Most efficient way to create InputStream from OutputStream
Use a java.io.FilterOutputStream to wrap the existing OutputStream. By overriding the write() method you can intercept output and do whatever you want with it, either send it somewhere else, modify it, or discard it completely.
As to your second question, you cannot change the sink of an OutputStream after the fact, i.e. cause previously written data to "move" somewhere else, but using a FilterOutputStream you can intercept and redirect any data written after you wrap the original OutputStream.
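For illustration (an untested sketch, and the class and variable names are made up), a FilterOutputStream that copies everything written to it into a second sink could look like this:
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class TeeOutputStream extends FilterOutputStream {
    private final OutputStream copy;

    TeeOutputStream(OutputStream out, OutputStream copy) {
        super(out);
        this.copy = copy;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);  // pass the byte through to the original sink
        copy.write(b); // keep an intercepted copy, e.g. for analysis or logging
    }
}
The other write overloads in FilterOutputStream fall back to write(int), so overriding this single method is enough for a simple interceptor.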
To answer my own question, yes you can build a redirect like this:
class OutInInputRedirect {
    public final transient InputStream is;
    public final transient OutputStream os;

    public OutInInputRedirect() throws IOException {
        this(1024);
    }

    public OutInInputRedirect(int size) throws IOException {
        PipedInputStream is = new PipedInputStream(size);
        PipedOutputStream os = new PipedOutputStream(is);
        this.is = is;
        this.os = os;
    }
}
Just use the OutputStream as a replacement and the InputStream in those places you need it, but be aware that closing the OutputStream also closes the InputStream!
It is quite easy and works as expected. Either way, you cannot change an already connected stream (without reflection).
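A short untested usage sketch (the "hello" payload is made up): write on one thread and read on another, because piped streams block once the pipe fills up:
OutInInputRedirect redirect = new OutInInputRedirect();
new Thread(() -> {
    try (OutputStream os = redirect.os) {
        os.write("hello".getBytes());
    } catch (IOException e) {
        e.printStackTrace();
    }
}).start();
int b;
while ((b = redirect.is.read()) != -1) { // reads until the writer closes its end
    System.out.print((char) b);
}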

Java - Resetting InputStream

I'm dealing with some Java code in which there's an InputStream that I read one time and then I need to read it once again in the same method.
The problem is that I need to reset its position to the start in order to read it twice.
I've found a hack-ish solution to the problem:
is.mark(Integer.MAX_VALUE);
// Read the InputStream is fully
// { ... }
try {
    is.reset();
} catch (IOException e) {
    e.printStackTrace();
}
Does this solution lead to some unexpected behaviour? Or will it work in its dumbness?
As written, you have no guarantees, because mark() is not required to report whether it was successful. To get a guarantee, you must first call markSupported(), and it must return true.
Also as written, the specified read limit is very dangerous. If you happen to be using a stream that buffers in-memory, it will potentially allocate a 2GB buffer. On the other hand, if you happen to be using a FileInputStream, you're fine.
A better approach is to use a BufferedInputStream with an explicit buffer.
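A sketch of that safer pattern (my own untested example; originalStream stands for whatever stream you actually have):
InputStream is = new BufferedInputStream(originalStream); // BufferedInputStream supports mark/reset
if (!is.markSupported()) {
    throw new IllegalStateException("mark/reset not supported");
}
is.mark(64 * 1024); // only guarantee re-reading the first 64 KB
// ... first pass over the data ...
is.reset();         // rewind to the marked position for the second pass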
It depends on the InputStream implementation. You can also think about whether it would be better to use a byte[]. The easiest way is to use Apache commons-io:
byte[] bytes = IOUtils.toByteArray(inputStream);
You can't do this reliably; some InputStreams (such as ones connected to terminals or sockets) don't support mark and reset (see markSupported). If you really have to traverse the data twice, you need to read it into your own buffer.
Instead of trying to reset the InputStream load it into a buffer like a StringBuilder or if it's a binary data stream a ByteArrayOutputStream. You can then process the buffer within the method as many times as you want.
ByteArrayOutputStream bos = new ByteArrayOutputStream();
int read = 0;
byte[] buff = new byte[1024];
while ((read = inStream.read(buff)) != -1) {
    bos.write(buff, 0, read);
}
byte[] streamData = bos.toByteArray();
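To re-process those captured bytes as many times as you like, wrap them in a fresh ByteArrayInputStream for each pass (illustration only):
InputStream firstPass = new ByteArrayInputStream(streamData);
InputStream secondPass = new ByteArrayInputStream(streamData);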
For me, the easiest solution was to pass the object from which the InputStream could be obtained, and just obtain it again. In my case, it was from a ContentResolver.

Convert Jetty Buffer to InputStream

I have to parse XML from the content of a Jetty buffer using SAX.
From my ContentExchange I can call getRequestContent, and then I get a Buffer.
I need an InputStream, an InputSource, a String, or a File in order to parse it with SAX. How can I convert the buffer to one of those, and which way is the most efficient?
It looks like something obvious, but I can not find any information in the documentation.
Apologies for answering an old question, but someone (such as myself) may stumble upon this in the future.
Jetty's Buffer class implements a writeTo(OutputStream) method. A simple solution would be to do the following:
PipedInputStream is = new PipedInputStream();
PipedOutputStream os = new PipedOutputStream(is);
Then for each Buffer received, do:
void processBuffer(Buffer buf) {
    buf.writeTo(os);
}
This way you can stream responses without need for caching them.
EDIT:
Of course, make sure that processBuffer() and readers of the PipedInputStream are running in separate threads to avoid potential deadlock.
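For example (an untested sketch of my own; parseXml stands for whatever SAX parsing call you use), the parser runs on its own thread and is fed through the pipe:
PipedInputStream is = new PipedInputStream();
PipedOutputStream os = new PipedOutputStream(is);
Thread parser = new Thread(() -> {
    try {
        parseXml(is); // placeholder for e.g. SAXParser.parse(is, handler)
    } catch (Exception e) {
        e.printStackTrace();
    }
});
parser.start();
// Jetty callback thread keeps calling processBuffer(buf), which does buf.writeTo(os)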
Perhaps you could wrap the buffer in your own custom (anonymous?) InputStream since you only need to implement the read() method. For example:
public InputStream forBuffer(final Buffer buf) {
    return new InputStream() {
        @Override
        public int read() /* throws IOException */ {
            return buf.get();
        }
    };
}
From the Jetty docs it's hard to tell what happens when the Buffer#get() method hits the end but some simple testing should reveal it (and if it happens to return -1 then this example is complete!).

Using an InputStream for Logging and then XML parsing

What I want to do is log the output from an InputStream that I got using
org.apache.http.HttpEntity entity = response.getEntity();
java.io.InputStream content = entity.getContent();
// Print the result to the screen for debugging purposes
if (Logging.DEBUG) {
    int i;
    StringBuilder b = new StringBuilder();
    while ((i = content.read()) != -1) {
        b.append((char) i);
    }
    Log.d(TAG, b.toString());
}
Now, after I have finished logging, I want to use the exact same stream through an XML parser. The problem is that it tells me that the stream has already been used.
I tried to use the mark() and reset() calls before and after logging, but it didn't work.
It depends on whether the InputStream that is returned supports it. The default implementation in the InputStream class does nothing, as described in the API. So you can't be sure whether the returned stream actually supports it. To be sure of this, you should wrap it in a BufferedInputStream, which does support these methods.
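A sketch of that idea for this case (my own untested example): buffer the entity stream, mark it, log it, then reset it before handing it to the parser.
InputStream content = new BufferedInputStream(entity.getContent());
content.mark(1024 * 1024); // the read limit must cover everything you log
if (Logging.DEBUG) {
    int i;
    StringBuilder b = new StringBuilder();
    while ((i = content.read()) != -1) {
        b.append((char) i);
    }
    Log.d(TAG, b.toString());
    content.reset(); // rewind so the XML parser sees the stream from the start
}
// now pass 'content' to the XML parser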
In general mark() and reset() won't work on an arbitrary InputStream. They only work on subclasses such as BufferedInputStream or ByteArrayInputStream, where the data is buffered so the stream can rewind.
For something like a SocketInputStream or a console InputStream, your only option will be to read and buffer the entire stream contents somewhere; e.g. in memory or by writing it to a temporary file.

What could lead to the creation of false EOF in a GZip compressed data stream

We are streaming data between a server (written in .Net running on Windows) and a client (written in Java running on Ubuntu) in batches. The data is in XML format. Occasionally the Java client throws an unexpected EOF while trying to decompress the stream. The message content always varies and is user driven. The response from the client is also compressed using GZip. This never fails and seems to be rock solid. The response from the client is controlled by the system.
Is there a chance that some arrangement of characters or some special characters are creating false EOF markers? Could it be white-space related? Is GZip suitable for compressing XML?
I am assuming that the code to read and write from the input/output streams works, because we only occasionally get this exception, and when we inspect the user data at the time there seem to be special characters (which is why I asked the question), such as the '#' sign.
Any ideas?
UPDATE:
The actual code as requested. I thought it wasn't this, because I had been to a couple of sites to get help on this issue and they all more or less had the same code. Some sites mentioned appended GZip streams. Something to do with GZip creating multiple segments?
public String receive() throws IOException {
    byte[] buffer = new byte[8192];
    ByteArrayOutputStream baos = new ByteArrayOutputStream(8192);
    do {
        int nrBytes = in.read(buffer);
        if (nrBytes > 0) {
            baos.write(buffer, 0, nrBytes);
        }
    } while (in.available() > 0);
    return compressor.decompress(baos.toByteArray());
}
public String decompress(byte[] data) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    ByteArrayInputStream in = new ByteArrayInputStream(data);
    try {
        GZIPInputStream inflater = new GZIPInputStream(in);
        byte[] byteBuffer = new byte[8192];
        int r;
        while ((r = inflater.read(byteBuffer)) > 0) {
            buffer.write(byteBuffer, 0, r);
        }
    } catch (IOException e) {
        log.error("Could not decompress stream", e);
        throw e;
    }
    return new String(buffer.toByteArray());
}
At first I thought there must be something wrong with the way that I am reading in the stream, and I thought perhaps I am not looping properly. I then generated a ton of data to be streamed and checked that it was looping. Also, the fact that it happens so seldom and so far has not been reproducible led me to believe that it was the content rather than the scenario. But at this point I am totally baffled and for all I know it is the code.
Thanks again everyone.
Update 2:
As requested the .Net code:
Dim DataToCompress = Encoding.UTF8.GetBytes(Data)
Dim CompressedData = Compress(DataToCompress)
That gets the raw data into bytes, and then it gets compressed:
Private Function Compress(ByVal Data As Byte()) As Byte()
    Try
        Using MS = New MemoryStream()
            Using Compression = New GZipStream(MS, CompressionMode.Compress)
                Compression.Write(Data, 0, Data.Length)
                Compression.Flush()
                Compression.Close()
                Return MS.ToArray()
            End Using
        End Using
    Catch ex As Exception
        Log.Error("Error trying to compress data", ex)
        Throw
    End Try
End Function
Update 3: Also added more Java code. The in variable is the InputStream returned from socket.getInputStream().
It certainly shouldn't be due to the data involved - the streams deal with binary data, so that shouldn't make any odds at all.
However, without seeing your code, it's hard to say for sure. My first port of call would be to check anywhere that you're using InputStream.read() - check that you're using the return value correctly, rather than assuming a single call to read() will fill the buffer.
If you could provide some code, that would help a lot...
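For what it's worth, the receive() loop in the question is one such spot: it stops as soon as available() returns 0, which can happen in the middle of a message. An untested sketch of a more robust approach, assuming (and this framing is an assumption, not part of the original protocol) that the server writes a 4-byte length prefix before each compressed payload:
// Requires the server to send the payload length as a 4-byte int first.
public String receive() throws IOException {
    DataInputStream din = new DataInputStream(in);
    int length = din.readInt();
    byte[] payload = new byte[length];
    din.readFully(payload); // blocks until the whole message has arrived
    return compressor.decompress(payload);
}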
I would suspect that for some reason the data is altered along the way by treating it as text, not as binary, so it may be either \n conversion or a codepage alteration.
How is the gzipped stream transferred between the two systems?
It is not possible. EOF in TCP is delivered as an out-of-band FIN segment, not via the data.
