I'm having trouble using HttpURLConnection. I'm working with multiple servers; some send their responses gzip-encoded and some don't. For gzip encoding, I'm using
inputStream = new GZIPInputStream(connection.getInputStream());
inputStreamReader = new InputStreamReader(inputStream);
And for normal encoding, I'm using
inputStreamReader = new InputStreamReader(connection.getInputStream());
Is it possible to know the encoding of getInputStream, so that I know beforehand whether or not to use GZIPInputStream? Or is there a generic input stream reader for both compressed and uncompressed responses? Thanks.
Get the content encoding from the HttpURLConnection using getContentEncoding().
If it's gzip-encoded, the result of that call should be "gzip", and then you know which type of input stream you need to create.
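For example, a minimal sketch along the lines of your snippet (connection is your HttpURLConnection; the UTF-8 charset is an assumption, ideally taken from the Content-Type header):
String encoding = connection.getContentEncoding(); // value of the Content-Encoding header, or null if not set
InputStream raw = connection.getInputStream();
// Only wrap in GZIPInputStream when the server actually compressed the body
InputStream in = "gzip".equalsIgnoreCase(encoding) ? new GZIPInputStream(raw) : raw;
inputStreamReader = new InputStreamReader(in, "UTF-8");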
I'm a beginner in Java trying to decompress an HTTP response in gzip format. Roughly, I have a BufferedReader which lets me read lines of the HTTP response from a socket. With that, I parse the HTTP header, and if it specifies that the body is in gzip format, then I have to decompress it. Here is the code I use:
DataInputStream response = new DataInputStream(clientSideSocket.getInputStream());
BufferedReader buffer = new BufferedReader(new InputStreamReader(response))
header = parseHTTPHeader(buffer); // return a map<String,String> with header options
StringBuilder SBresponseBody = new StringBuilder();
String responseBody = new String();
String line;
while((line = buffer.readLine())!= null) // extract the body as if was a string...
SBresponseBody.append(line);
responseBody = SBresponseBody.toString();
if (header.get("Content-Encoding").contains("gzip"))
responseBody = unzip(responseBody); // function I try to construct
My attempt for the unzip function is as follows:
private String unzip(String body) throws IOException {
String responseBody = "";
byte[] readBuffer = new byte[5000];
GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(body.getBytes()));
int read = gzip.read(readBuffer,0,readBuffer.length);
gzip.close();
byte[] result = Arrays.copyOf(readBuffer, read);
responseBody = new String(result, "UTF-8");
return responseBody;
}
I get an error in the GZIPInputStream: not GZIP format (because gzip header is not found in body).
Here are my thoughts:
• Is body.getBytes() wrong, since the body has been read by a BufferedReader as a character string, so converting it back to byte[] makes no sense because it has already been interpreted the wrong way? Or am I converting the String body back to byte[] incorrectly?
• Do I have to build a gzip header myself, using the information provided in the HTTP header, and add it to the String body?
• Do I need to create another InputStream from my socket.getInputStream() to read the information byte by byte, or is that tricky since there is already a buffer "connected" to this socket?
Roughly, I have a bufferReader which allows me to read lines of http response from a socket.
You've hand-rolled an HTTP client.
This is not a good thing; HTTP is considerably more complicated than you think it is. gzip is just one of about 10,000 things you need to think about. There's HTTP/2, SPDY, HTTP/3, chunked transfer encoding, TLS, redirects, MIME packing, and so much more to think about.
So, if you want to write an actual HTTP client, you need about 100x this code and a ton of domain knowledge, because the actual specs of the HTTP protocol, while handy, don't really tell the story. The de-facto protocol you're implementing is 'whatever servers connected to the internet tend to send' and what they tend to send is tightly wound up with 'whatever commonly used browsers tend to get right', which is almost, but not quite, what that spec document says. This is one of those cases where pragmatics and implementations are the 'real spec', and the actual spec is merely attempting to document reality.
That's a long way around to saying: your mistake is trying to hand-roll an HTTP client. Don't do that. Use OkHttp, or the HTTP client (java.net.http) introduced in JDK 11 in the core libraries.
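For illustration, a minimal sketch of a GET with the JDK 11 client (classes from java.net.http; the URL is a placeholder, and exceptions still need handling):
HttpClient client = HttpClient.newHttpClient();
HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com/")) // placeholder URL
        .GET()
        .build();
// BodyHandlers.ofString() decodes the body using the charset from the Content-Type header (UTF-8 if absent)
HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
System.out.println(response.statusCode());
System.out.println(response.body());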
But, I know what I want!
Your code is loaded up with bugs, though.
DataInputStream response = new DataInputStream(clientSideSocket.getInputStream());
DataInputStream is useless here. Remove that wrapper.
BufferedReader buffer = new BufferedReader(new InputStreamReader(response))
Missing semicolon. Also, this is broken: it converts the bytes flowing over the wire to characters using the platform default encoding, which is wrong; you need to look at the Content-Type header.
responseBody = unzip(responseBody)
You cannot do this. Your major misunderstanding is that you appear to think there is no difference between a bunch of bytes and a sequence of characters.
That's wrong. Once you have converted the bytes into chars, you can no longer unzip them.
The fix is to check the Content-Encoding header FIRST, then wrap your InputStream in a GZIPInputStream before any Reader gets involved.
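In sketch form, assuming a variant of parseHTTPHeader that reads directly from the InputStream and consumes only the header bytes (and ignoring chunked transfer encoding for the moment):
InputStream in = clientSideSocket.getInputStream();
Map<String, String> header = parseHTTPHeader(in);      // hypothetical variant that reads raw header bytes
InputStream bodyStream = in;
if ("gzip".equalsIgnoreCase(header.get("Content-Encoding"))) {
    // Decompress the raw bytes BEFORE any byte-to-character conversion happens
    bodyStream = new GZIPInputStream(bodyStream);
}
// Only now turn bytes into characters, using the charset declared in Content-Type
BufferedReader reader = new BufferedReader(new InputStreamReader(bodyStream, "UTF-8"));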
I am using HttpsURLConnection to call a server and returning its response from my servlet. I copy the response from the HttpsURLConnection to the HttpServletResponse using streams: bytes read from the connection's input stream are written to the servlet response's output stream, and I detect the end when read() returns a value < 0.
Following is the code for copying the response. The variable response is of type HttpServletResponse and the variable httpCon is of type HttpsURLConnection.
InputStream responseStream = httpCon.getInputStream();
if (responseStream != null)
{
OutputStream os = response.getOutputStream();
byte[] buffer = new byte[1024];
int len;
while ((len = responseStream.read(buffer)) >= 0)
{
os.write(buffer, 0, len);
}
os.flush();
os.close();
}
On the client side, I am using the Python requests library to read the response.
What I am seeing is that if I use curl to test my servlet, I get the proper response JSON: response = u'{"key":"value"}'.
If I read it with Python requests, it puts some extra characters in the response; the response looks like the following:
response = u'b0\r\n{"key":"value"}\r\n0\r\n\r\n'
Both strings are Unicode, but the second one has extra characters.
The same response comes through properly if I try it from curl or the Postman REST client, but from Python requests it does not work. I tried another Python library ("livetest"), and with that the response has the same extra characters too. I also tried changing the Accept-Encoding header, but it had no effect.
Because of this, I am not able to parse the json.
I don't want to change the client to parse this kind of string.
Can I change something on the server so that it will work correctly?
Did the response contain the header "Transfer-Encoding: chunked"? If so, the response is using chunked transfer encoding (https://en.wikipedia.org/wiki/Chunked_transfer_encoding).
In that case, getting \r\n0\r\n\r\n at the end of the response is expected, since that is the terminating sequence of the encoding (and the leading b0 is a chunk-size line). I guess curl and Postman just handle chunked transfer encoding for you, which is why you don't see the chunk markers there.
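If you would rather the client never see chunked framing at all, one option on the servlet side (assuming the payload is small enough to buffer in memory) is to read the upstream body fully and set Content-Length explicitly, so the container has no need to chunk:
InputStream responseStream = httpCon.getInputStream();
ByteArrayOutputStream buffered = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
int len;
while ((len = responseStream.read(buffer)) >= 0) {
    buffered.write(buffer, 0, len);
}
byte[] body = buffered.toByteArray();
response.setContentLength(body.length);   // a known length means the container need not use chunked encoding
response.getOutputStream().write(body);
response.getOutputStream().flush();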
I have been going over the following tutorial and came across this code, which I do not understand the purpose of:
URLConnection conn = url.openConnection();
conn.setDoOutput(true);
OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
wr.write( data );
wr.flush();
I don't know what the purpose of the above code is, or where it is writing this data to.
From what I could gather, the documentation states that it converts characters to bytes... but then it writes them somewhere, and I'm not sure why.
It is basically used for turning a character stream into a byte stream.
Byte streams and character streams cannot be linked directly: the first operates on raw 8-bit bytes, the other on 16-bit Unicode characters. To bridge them, two classes exist in the java.io package: InputStreamReader and OutputStreamWriter.
InputStreamReader adapts a byte stream to a character stream on the reading side (often wrapped in a BufferedReader), decoding bytes into characters.
OutputStreamWriter does the opposite: it encodes characters into bytes using a character encoding.
For a Java program to interact with a server-side process it simply must be able to write to a URL, thus providing data to the server. It can do this by following these steps:
1. Create a URL.
2. Retrieve the URLConnection object.
3. Set output capability on the URLConnection.
4. Open a connection to the resource.
5. Get an output stream from the connection.
6. Write to the output stream.
7. Close the output stream.
Now, in the snippet you provided,
OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
creates an output stream on the connection and opens an OutputStreamWriter on it (step 5). If the URL supports output, getOutputStream returns an output stream that is connected to the input stream of the URL on the server side: the client's output is the server's input. If the URL does not support output, getOutputStream throws an UnknownServiceException.
And
wr.write( data );
wr.flush();
wr.close();
writes the required information to the output stream, flushes it, and closes the stream (steps 6 and 7; flush() should come before close(), although closing the writer also flushes it). The data written to the output stream on the client side is the input on the server side.
It's writing it to the output stream of the URLConnection, which is basically used for the body of an HTTP request (assuming it's an HTTP URL, of course).
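Putting the steps above together, a minimal sketch (the URL and form data are placeholders) might look like this:
URL url = new URL("https://example.com/form");                       // placeholder URL
URLConnection conn = url.openConnection();
conn.setDoOutput(true);                                               // enable output on the connection

try (OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream(), "UTF-8")) {
    wr.write("name=value");                                           // write the request body
    wr.flush();
}                                                                     // the writer is closed here

// Getting the input stream actually sends the request and lets you read the server's reply
try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
    String line;
    while ((line = in.readLine()) != null) {
        System.out.println(line);
    }
}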
So here's my problem: I'm reading JSON from the web using HttpURLConnection. That JSON contains German special characters (äöü). Inside NetBeans, everything is fine. When I build the jar and run it, "Silberanhänger" comes out garbled (the umlauts turn into mojibake). Here's the code, nothing special inside:
URL url = new URL("jsonUrl);
HttpURLConnection con = (HttpURLConnection) url.openConnection();
con.setUseCaches(false);
con.setRequestProperty("Accept-Language","de-de,de;q=0.8,en-us;q=0.5,en;q=0.3");
con.setRequestProperty("Cookie","s="+session);
try (BufferedReader bf = new BufferedReader(new InputStreamReader(
con.getInputStream()))) {
jsonRepresentation = bf.readLine(); //only 1 line
}
con.disconnect();
System.out.println(jsonRepresentation); // "ä" in the IDE, mojibake when run from the jar
Setting -Dfile.encoding=UTF8 is a hack that will have side effects on all code run on that JVM. A better hack is to specify the charset in the InputStreamReader's constructor:
new InputStreamReader(con.getInputStream(), "UTF-8")
However, this might still fail if the HTTP server on the other end changes its encoding. You would be better off using an HTTP library such as Apache HttpComponents to parse the HTTP response into a String; it will read the encoding from the HTTP header and do the right thing in all circumstances.
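If you want to stay with HttpURLConnection but honour whatever charset the server declares, a rough sketch (falling back to UTF-8 when no charset parameter is present) could look like:
String contentType = con.getContentType();                 // e.g. "application/json; charset=UTF-8", or null
String charset = "UTF-8";                                   // assumed fallback
if (contentType != null) {
    for (String param : contentType.split(";")) {
        param = param.trim();
        if (param.toLowerCase().startsWith("charset=")) {
            charset = param.substring("charset=".length());
        }
    }
}
BufferedReader bf = new BufferedReader(new InputStreamReader(con.getInputStream(), charset));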
Set the JVM encoding with -Dfile.encoding=UTF8.
This is my code
URL url = new URL("http://172.16.32.160:8080/epramaan/loginotp");
URLConnection connection1 = url.openConnection();
connection1.setDoOutput(true);
ObjectOutputStream out=new ObjectOutputStream(connection1.getOutputStream());
out.writeObject(send);
out.flush();
out.close();
ObjectInputStream in = new ObjectInputStream(connection1.getInputStream());
String output=(String)in.readObject();
in.close();
//Rest of the code
Once the OutputStream writes data to the stream, will the object InputStream stop execution till the response is received?
I assume that by "stop execution" you mean block.
I just noticed that you are using readObject and not read. Please elaborate on what kind of data you are reading/writing and why you are using object streams.
As you mentioned you are using a String, I would suggest using readFully(byte[] buf). This method blocks until all the bytes are read. Once you have the byte array, a String can be created from it.
You can read the entire body into a byte array in memory (the HTTP Content-Length header tells you how large to make the array; note that a single InputStream.read(byte[]) call may return fewer bytes than requested, so loop or use readFully), and use URLConnection.setReadTimeout() to avoid blocking for too long.
From the byte array you can then construct your object by building your ObjectInputStream over a ByteArrayInputStream.
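A hedged sketch of that idea, assuming the server really sends a serialized String and a Content-Length header (exception handling omitted):
int length = connection1.getContentLength();               // -1 if the header is missing, so check it first
byte[] data = new byte[length];
try (DataInputStream in = new DataInputStream(connection1.getInputStream())) {
    in.readFully(data);                                     // blocks until the whole body has been read
}
try (ObjectInputStream oin = new ObjectInputStream(new ByteArrayInputStream(data))) {
    String output = (String) oin.readObject();
    System.out.println(output);
}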
Once the OutputStream writes data to the stream, will the object InputStream stop execution till the response is received?
Not precisely. Opening the InputStream doesn't block anything, and doesn't even cause the request headers to be sent. However, reading from the InputStream will do both of those things.
I suspect that the real cause of your problems is that you are getting an error response from the server that is something other than a serialized object; e.g. it could be a generic HTML error page from the server. Naturally, attempting to deserialize this fails.
The correct procedure is (sketched in code after this list):
1. Create the URLConnection object.
2. Set any request headers you need to.
3. Connect it (or skip this ... it will happen implicitly).
4. Open and write to the OutputStream.
5. Close the OutputStream.
6. Use getResponseCode() to see if the request succeeded or failed.
7. If it succeeded, call getInputStream() and read and process the response.
8. If it failed, call getErrorStream() and process the error output.
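In code, that flow might look roughly like this (assuming an HttpURLConnection and the serialized String payload from the question; exception handling omitted):
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setDoOutput(true);

try (ObjectOutputStream out = new ObjectOutputStream(conn.getOutputStream())) {
    out.writeObject(send);                                  // write and close the request body
}

int status = conn.getResponseCode();                        // sends the request and reads the status line
if (status >= 200 && status < 300) {
    // only attempt deserialization when the server reports success
    try (ObjectInputStream in = new ObjectInputStream(conn.getInputStream())) {
        String output = (String) in.readObject();
        System.out.println(output);
    }
} else {
    // the error body is often an HTML page, not a serialized object
    try (BufferedReader err = new BufferedReader(
            new InputStreamReader(conn.getErrorStream(), "UTF-8"))) {
        String line;
        while ((line = err.readLine()) != null) {
            System.out.println(line);
        }
    }
}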