Why are there streams in the HttpURLConnection API? - java

From what I understand about HTTP, it works like this: The client assembles a message, consisting of some header fields and (possibly) a body and sends it to the server. The server processes it, assembles its own response message and sends it back to the client.
And so I come to the question:
Why are there all of a sudden streams in HttpURLConnection?
This makes no sense to me. It makes it look like there is a continuous open channel. At what point does the message actually get sent to the server? On connect? On getInputStream? When trying to read from the stream? What if I have payload, does it get sent at a different time then? Can I do write-read-write-read with just a single connection?
I'm sure I just haven't understood it right yet, but right now it just seems like a bad API to me.
I would have more expected to see something like this:
HttpURLConnection http = url.openConnection();
HttpMessage req = new HttpMessage();
req.addHeader(...);
req.setBody(...);
http.post(req);
// Block until response is available (Future pattern)
HttpMessage res = http.getResponse();

IMHO HttpURLConnection does indeed have a bad API. But handling the request and response bodies as streams is a way to deal efficiently with large amounts of data. I think all the other answers (5 at the moment!) are correct. Some questions remain open:
At what point does the message actually get sent to the server? On connect? On getInputStream? When trying to read from the stream?
Certain calls trigger the actual transmission of everything collected so far (headers, timeout options, ...) to the server. In most cases you don't have to call connect(); it happens implicitly, e.g. when you call getResponseCode() or getInputStream(). (BTW, I recommend calling getResponseCode() before getInputStream(): if you get an error code (e.g. 404), getInputStream() will throw an exception, and you should call getErrorStream() instead.)
What if I have payload, does it get sent at a different time then?
You have to call setDoOutput(true) and then getOutputStream(), and write the payload to that stream. This must be done after you have added the headers. After closing the stream you can expect a response from the server.
Can I do write-read-write-read with just a single connection?
No. Technically this would be possible using keep-alive, but HttpURLConnection handles that under the covers, and you can only do one request-response round trip with an instance of this class.
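The order of calls described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name RequestLifecycleSketch, the /echo path and the throwaway local server (the JDK's built-in com.sun.net.httpserver.HttpServer) are made up for the example so that it runs without any external service.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RequestLifecycleSketch {

    /** Sends one POST and returns "<status> <body>". One instance = one round trip. */
    public static String post(String url, String payload) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
        con.setRequestMethod("POST");
        con.setRequestProperty("Content-Type", "text/plain; charset=utf-8");
        con.setDoOutput(true); // required before getOutputStream()

        // Headers have to be set before this point.
        try (OutputStream out = con.getOutputStream()) {
            out.write(payload.getBytes(StandardCharsets.UTF_8));
        }

        // Asking for the status code forces the exchange to complete.
        int status = con.getResponseCode();

        // For 4xx/5xx getInputStream() throws, so read the error stream instead.
        InputStream stream = status >= 400 ? con.getErrorStream() : con.getInputStream();
        String body = "";
        if (stream != null) {
            try (InputStream in = stream) {
                body = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
        return status + " " + body;
    }

    /** Starts a throwaway echo server on a free local port, posts to it, stops it. */
    public static String demo(String path, String payload) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/echo", exchange -> {
            byte[] request = exchange.getRequestBody().readAllBytes();
            exchange.sendResponseHeaders(200, request.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(request);
            }
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            return post("http://127.0.0.1:" + port + path, payload);
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo("/echo", "hello"));    // successful round trip
        System.out.println(demo("/missing", "hello")); // error status, body from getErrorStream()
    }
}
```

A second request needs a fresh openConnection() call; whether the underlying TCP connection is reused is decided behind the scenes.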
Making life easier
If you don't want to fight with the horrible API of HttpURLConnection, you could have a look at some abstraction APIs listed on DavidWebb. When using DavidWebb, a typical request looks like this:
Webb webb = Webb.create();
String result = webb.post("http://my-server/path/resource")
        .header("auth-token", myAuthToken)
        .body(myBody)
        .ensureSuccess()
        .asString()
        .getBody();

While the underlying transport does take place using individual packets, there's no guarantee that what you think of as a single HTTP request/response will "fit" in a single HTTP "packet". In turn, there's also no guarantee that a single HTTP "packet" will fit in a single TCP packet, and so on.
Imagine downloading a 20 MB image using HTTP. It's a single HTTP "response", but I guarantee there will be multiple packets going back and forth between the browser and the website serving it up.
Every block is made up of possibly multiple smaller blocks, at each level. Since you might start processing the response before all the different bits of it have arrived, and you really don't want to concern yourself with how many of them there are, a stream is the common abstraction over this.

HTTP works over a connection-oriented TCP connection. So internally, HttpURLConnection creates a TCP connection, sends the HTTP request over it, receives the response back, and then drops the TCP connection. That is why there are two different streams: one for the request and one for the response.

Because streams are the generic way to push data between two places in Java, and that's what the HTTP connection does. HTTP works over TCP, which is a streamed connection so this API mimics that.
As for why it isn't abstracted further: consider that there are no size limits on HTTP requests. For example, a file upload can be many MB or even GB in size.
Using a streamed API you can read data from a file or other source and stream it out over the connection at the same time without needing to load all that data into memory at once.
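As a rough sketch of that idea, assuming a hypothetical class StreamingUploadSketch: the copy loop below only ever holds one 8 KiB buffer in memory, and upload() shows how it might be wired to an HttpURLConnection using setChunkedStreamingMode so that the request body is not buffered internally either.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;

public class StreamingUploadSketch {

    /** Copies with a fixed 8 KiB buffer, so memory use does not depend on the input size. */
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    /** Streams a request body from any source; must be called before the connection is used. */
    public static long upload(HttpURLConnection con, InputStream source) throws IOException {
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setChunkedStreamingMode(0); // 0 = default chunk size; avoids buffering the whole body
        try (OutputStream out = con.getOutputStream()) {
            return copy(source, out);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a large file: copy from one stream to another.
        byte[] data = new byte[20_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        System.out.println(copy(new ByteArrayInputStream(data), sink) + " bytes copied");
    }
}
```

In real use the source would typically be a FileInputStream, and the response would still be read afterwards with getResponseCode() and getInputStream().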

TCP is a byte stream. The body of an HTTP request or response is an arbitrary byte stream. Not sure what kind of API you were expecting, but when you have byte stream data you get a byte stream API.

A streaming response can be consumed on the fly without holding all the data in local memory, which is better from a memory point of view. For instance, if you have to parse a huge JSON file, you can parse it from the stream and discard the raw data once it has been consumed. And in theory the parsing can begin as soon as the first byte has arrived.
And it is getInputStream() that does the send/receive part, as well as initiating the creation of the underlying socket.

Related

Java: read from webservice and buffer

I am looking for an Object. I am quite sure it exists but I don't know its name.
My application receives a JMS message. In the message there are file ids for another system. This system is connected by consuming its webservice. As soon as I receive the message, I will start reading the data from the webservice and write it into some kind of object. This will be about 20000 requests that I will stream into some sort of data structure.
In the meanwhile or some time after I finished, a consumer will call a webservice of my application. This will open a stream to the already loaded data and send it as a StreamingResponseBody to the caller. My other thread is still writing data to the streamed object.
The object I am looking for is not a fixed size buffer as I want to read all data even though nobody is interested in the data (yet) and nobody calls the service to download the data.
What object / data structure do you suggest?
I would be able to use a File with an InputStream and an OutputStream but I am not sure if this is the best solution. The performance however won't be the problem as the network speed will limit the read performance anyway.
As my application will be a cloud application, everything is in memory, so persistence is nothing to care about.

HTTP requests without waiting for response

Is it possible to send an HTTP request without waiting for a response?
I'm working on an IoT project that requires logging of data from sensors. In every setup, there are many sensors, and one central coordinator (Will mostly be implemented with Raspberry Pi) which gathers data from the sensors and sends the data to the server via the internet.
This logging happens every second. Therefore, the sending of data should happen quickly so that the queue does not become too large. If the request doesn't wait for a response (like UDP), it would be much faster.
It is okay if few packets are dropped every now and then.
Also, please do tell me the best way to implement this. Preferably in Java.
The server side is implemented using PHP.
Thanks in advance!
EDIT:
The sensors are wireless, but the tech they use has very little (or no) latency in sending to the coordinator. This coordinator has to send the data over the internet. But, just assume the internet connection is bad. As this is going to be implemented in a remote part of India.
You are looking for an asynchronous HTTP library such as OkHttp. It lets you specify a Callback that is executed asynchronously (by a second thread).
Therefore your main thread continues execution.
You can set the TCP timeout for a GET request to less than a second, and keep retriggering the access in a thread. Use more threads for more devices.
Something like:
HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
con.setRequestMethod("GET");
con.setConnectTimeout(1000); // give up connecting after 1 second
con.setReadTimeout(1000); // also give up waiting for the response after 1 second
if (con.getResponseCode() == HttpURLConnection.HTTP_OK) {
    ...
}
Sleep the thread for the remainder of the second if the access takes less than a second. You can consume the results on another thread if you add the results to thread-safe queues. Make sure to handle exceptions.
You can't use UDP with HTTP here: HttpURLConnection speaks HTTP over TCP only.
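The thread-safe queue idea could look roughly like this. It is a minimal sketch, not a tested design: SensorUploaderSketch and its methods are hypothetical names, and the actual sending is passed in as a Consumer so that the blocking HTTP call (with timeouts) can be plugged in separately.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class SensorUploaderSketch {
    private final BlockingQueue<String> queue;
    private final Thread worker;
    private volatile boolean running = true;

    public SensorUploaderSketch(int capacity, Consumer<String> sender) {
        queue = new ArrayBlockingQueue<>(capacity);
        worker = new Thread(() -> {
            // Keep going until asked to stop and the queue has been drained.
            while (running || !queue.isEmpty()) {
                try {
                    String reading = queue.poll(100, TimeUnit.MILLISECONDS);
                    if (reading != null) {
                        sender.accept(reading); // e.g. a blocking HTTP request with timeouts
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                } catch (RuntimeException e) {
                    // A failed send drops this reading; the question allows occasional loss.
                }
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    /** Never blocks the sampling loop; returns false if the queue is full. */
    public boolean submit(String reading) {
        return queue.offer(reading);
    }

    /** Stops accepting work and waits until queued readings have been handed to the sender. */
    public void shutdown() throws InterruptedException {
        running = false;
        worker.join();
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> sent = new CopyOnWriteArrayList<>();
        SensorUploaderSketch uploader = new SensorUploaderSketch(16, sent::add);
        uploader.submit("t=21.5");
        uploader.submit("t=21.7");
        uploader.shutdown();
        System.out.println(sent);
    }
}
```

The sampling loop only ever calls submit(), so a slow or failing connection delays the worker thread rather than the once-per-second logging.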

Cannot read from TCP socket

I have a C++ client using Qt and a Java server. I have successfully written from the client to the server, but I cannot write from the server to the client. My code:
QString Client::readTCP()
{
    socketTCP->waitForReadyRead();
    QTextStream in(socketTCP);
    return in.readAll();
}
// Later on
qDebug() << Client::readTCP();
But no matter what method I choose I can't get a response from the server. The server code is as follows:
DataOutputStream output = new DataOutputStream (SOCKET.getOutputStream());
output.writeBytes ( "myString" );
ANSWER:
It works either because I changed in.readAll() to in.readLine() or it is because I waited a couple seconds after the server started before sending a message.
The QTextStream::readAll function attempts to read the entire contents of the stream. Either this message is or isn't the entire contents of the stream.
If this message isn't the entire contents of the stream, then it should not return. It would be a serious error if readAll returned only a part of the contents of the stream despite the fact that it's specified to return all the contents.
If this is the entire contents of the stream, then the server is broken. If it doesn't close the socket, how can the client know it's received the entire contents? Unless there's some other way to indicate end of message, it has to be indicated by closing the stream, and you don't show the server closing the stream.
I'll repeat the advice I always give when I see problems like this -- do not ever implement a network protocol until you specify that protocol in a protocol specification. Otherwise, it's not possible to fix problems where the server and client disagree because there's no way to know which end is right. Here, the server and client disagree over how the end of a message is to be marked, and without a protocol specification to refer to, there's no way to know which end to fix.
If you had a protocol specification, you could just look at the section that explains how the ends of messages are marked and detected. Then you could fix whichever end doesn't follow the specification. (Or, if the specification doesn't say how, then fix the specification! Clearly, this has to happen somehow and it's the specification's job to explain how.)
In Java, after sending data to buffer, flush it. Output streams have flush() method which forces any data left in stream to be written/sent. Try that if using readAll() on client side.
Also, readLine() is advised if you know that each message ends with a line break. You can loop through in.readLine() until it returns null. Also, readLine() removes the trailing \n or \r\n.
I have no experience with Qt so I cannot say if you are reading correctly from the socket, but with Java I use PrintStream's method println when sending text, and a VS C++ Client receives it just fine using recv.
Also, you may want to check if the packet is actually sent over the socket using Wireshark.
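To make the end-of-message and flush() points concrete, here is a minimal sketch of one possible convention on the Java side: every message is UTF-8 text terminated by '\n' and flushed immediately. The class LineFramingSketch and its method names are hypothetical, the convention assumes a message never contains a line break itself, and in-memory streams stand in for the socket streams.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class LineFramingSketch {

    /** Sender side: one message = UTF-8 text plus '\n', flushed right away. */
    public static void send(OutputStream out, String message) throws IOException {
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        writer.write(message);
        writer.write('\n'); // the agreed end-of-message marker
        writer.flush();     // push the bytes out instead of leaving them in a buffer
    }

    /** Wraps a raw stream (e.g. socket.getInputStream()) for line-based reading. */
    public static BufferedReader reader(InputStream in) {
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    /** Receiver side: returns the next message, or null once the stream has ended. */
    public static String receive(BufferedReader in) throws IOException {
        return in.readLine();
    }

    public static void main(String[] args) throws IOException {
        // An in-memory "wire" stands in for the socket in this sketch.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        send(wire, "myString");
        send(wire, "second");

        BufferedReader in = reader(new ByteArrayInputStream(wire.toByteArray()));
        String message;
        while ((message = receive(in)) != null) {
            System.out.println("got: " + message);
        }
    }
}
```

With such a convention the Qt client can use readLine() and return as soon as one complete message has arrived, without the server having to close the socket.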

Java socket get unsent bytes

I have a server-client application and I've hit a weird speed bump with "login failure codes" of sorts.
What I want to do is send the code that describes the login's validity and close the OutputStream if necessary.
The problem with this is that the socket is closed before the client can read the response, which is leading to seemingly random and cryptic failures.
Is there a way, aside from using a setSoLinger() (etc.), to check that the last byte (or more) that was written has been read by the client?
Thanks.
In Java, the only way for your server application to know that the client has received something is to include this property in the protocol: the client must send some kind of message stating that it received the data (maybe with some kind of checksum if you want), and the server must then wait for this message. Only then can it be sure of anything.
There's no other reliable way to be sure that the other endpoint received some data: no .close(), no .flush(), nothing will guarantee reception by the other endpoint.
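Such an acknowledgement step might be sketched like this. Everything here is an assumption made for illustration: the class AckBeforeCloseSketch, the single ACK byte, the int-sized login code and the 5 second timeout are not taken from the question, and both ends run in one process over the loopback interface only so that the example is self-contained.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CompletableFuture;

public class AckBeforeCloseSketch {
    static final int ACK = 0x06; // hypothetical acknowledgement byte

    /** Server side: send the code, then wait for the ACK before closing the socket. */
    public static boolean sendCodeAndAwaitAck(Socket socket, int code) throws IOException {
        try (Socket s = socket) {
            s.setSoTimeout(5000); // do not wait forever for a silent client
            DataOutputStream out = new DataOutputStream(s.getOutputStream());
            out.writeInt(code);
            out.flush();
            // read() returns -1 if the client closed without acknowledging.
            return s.getInputStream().read() == ACK;
        }
    }

    /** Client side: read the code, then acknowledge it. */
    public static int readCodeAndAck(Socket socket) throws IOException {
        try (Socket s = socket) {
            int code = new DataInputStream(s.getInputStream()).readInt();
            s.getOutputStream().write(ACK);
            s.getOutputStream().flush();
            return code;
        }
    }

    /** Runs both ends in one process over the loopback interface. */
    public static String demo(int code) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            CompletableFuture<Boolean> acked = CompletableFuture.supplyAsync(() -> {
                try {
                    return sendCodeAndAwaitAck(listener.accept(), code);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            int received = readCodeAndAck(new Socket(loopback, listener.getLocalPort()));
            return "code=" + received + " acked=" + acked.get();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo(42));
    }
}
```

The server only closes after the read returns, so the client has either confirmed receipt or the server knows (via -1 or a timeout) that it cannot be sure.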

Java: ignoring an input stream - will buffers overflow and bad things happen?

I have a client connecting to my server. The client sends some messages to the server which I do not care about and do not want to waste time parsing its messages if I'm not going to be using them. All the i/o I'm using is simple java i/o, not nio.
If I create the input stream and just never read from it, can that buffer fill up and cause problems? If so, is there something I can do or a property I can set to have it just throw away data that it sees?
Now what if the server doesn't create the input stream at all? Will that cause any problems on the client/sending side?
Please let me know.
Thanks,
jbu
When you accept a connection from a client, you get an InputStream. If you don't read from that stream, the client's data will buffer up. Eventually, the buffer will fill up and the client will block when it tries to write more data. If the client writes all of its data before reading a response from the server, you will end up with a pretty classic deadlock situation. If you really don't care about the data from the client, just read (or call skip) until EOF and drop the data. Alternatively, if it's not a standard request/response (like HTTP) protocol, fire up a new thread that continually reads the stream to keep it from getting backed up.
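A minimal sketch of the "read until EOF and drop the data" idea, including the background-thread variant. DrainSketch and its method names are hypothetical, and a ByteArrayInputStream stands in for socket.getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class DrainSketch {

    /** Reads and discards until EOF; returns how many bytes were thrown away. */
    public static long drain(InputStream in) throws IOException {
        byte[] scratch = new byte[4096];
        long discarded = 0;
        int n;
        while ((n = in.read(scratch)) != -1) {
            discarded += n;
        }
        return discarded;
    }

    /** Drains on a daemon thread so the caller can keep writing to the client meanwhile. */
    public static Thread drainInBackground(InputStream in) {
        Thread t = new Thread(() -> {
            try {
                drain(in);
            } catch (IOException ignored) {
                // Connection closed or reset; nothing left to discard.
            }
        }, "drain");
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        InputStream fake = new ByteArrayInputStream(new byte[10_000]);
        Thread t = drainInBackground(fake);
        t.join();
        System.out.println("left unread: " + fake.available());
    }
}
```

Because the loop stops when read() returns -1, the thread ends on its own once the client closes its side of the connection.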
If you get no useful data from the client, what's the point of allowing it to connect?
I'm not sure of the implications of never reading from a buffer in Java -- I'd guess that eventually the OS would stop accepting data on that socket, but I'm not sure there.
Why don't you just call the skip method of your InputStream occasionally with a large number, to ensure that you discard the data?
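InputStream in = ....
byte[] buffer = new byte[4096]; // or whatever
// read() returns -1 at end of stream; without this check the loop would spin forever
while (in.read(buffer) != -1) {
    // discard the data
}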
If you accept the connection, you should read the data. To tell you the truth, I have never seen (nor can I foresee) a situation where this (a server that ignores all data) could be useful.
I think you get the InputStream once you accept the request, so if you don't acknowledge that request the underlying framework (i.e. tomcat) will drop that request (after some lapsed time).
Regards.
