InputStream of a socket not readable

I have an HTML form that makes a POST request to a Socket I made with Java. I read each line with
/**
 * Read a line from an {@link InputStream}
 * @param inFromClient The {@link InputStream} to read from
 * @return The {@link String} read
 * @throws IOException When something went wrong while reading
 */
private String readLine(InputStream inFromClient) throws IOException {
    StringBuilder lineb = new StringBuilder();
    char c = (char) inFromClient.read();
    while (c != '\n'){
        lineb.append(Character.toString(c));
        c = (char) (inFromClient.read());
    }
    String line = lineb.toString();
    return line.substring(0,line.lastIndexOf('\r')<0?0:line.lastIndexOf('\r'));
}
That way, I'm able to parse the request till the boundary and then save the file sent. Everything works perfectly.
However, I'm also trying to make a POST request with Java to the same socket. First, I create a second socket connected to my server socket. Then I do:
PrintWriter pw = new PrintWriter(socket.getOutputStream(), true);
pw.println("POST / HTTP/1.1");
pw.println("Host: ...");
...
The problem
The problem is that my method cannot read any line, and it all ends with an "OutOfMemory" exception at line 5. Why am I not able to read lines sent from a Java socket when I can read those sent from my browser (HTML form)? Thank you.

Your server code must read() into an int and check whether that's -1 before casting to a char. You're ignoring end-of-file from the stream and appending (char) -1 to your StringBuilder forever.
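A minimal sketch of that check, assuming the request lines end in LF or CRLF. The class name LineReaderSketch and the choice of ISO-8859-1 for decoding are illustrative assumptions, not part of the original code:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class LineReaderSketch {

    /**
     * Reads one line terminated by '\n' (a preceding '\r' is dropped).
     * Returns null if the stream is already at end-of-file.
     */
    public static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b = in.read();              // keep the int so that -1 stays distinguishable
        if (b == -1) {
            return null;                // nothing left to read
        }
        while (b != -1 && b != '\n') {  // stop at end-of-file as well as at LF
            buf.write(b);
            b = in.read();
        }
        byte[] bytes = buf.toByteArray();
        int len = bytes.length;
        if (len > 0 && bytes[len - 1] == '\r') {
            len--;                      // strip the CR of a CRLF pair
        }
        // Assumption: ISO-8859-1, which maps every byte to exactly one char
        return new String(bytes, 0, len, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "POST / HTTP/1.1\r\nHost: example\r\n".getBytes(StandardCharsets.ISO_8859_1));
        String line;
        while ((line = readLine(in)) != null) {
            System.out.println(line);
        }
    }
}
```

Because the loop also stops on -1, a peer that closes the connection ends the read instead of growing the buffer without bound.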
However:
I'd recommend using an existing HTTP server framework in your server to read and parse requests, rather than writing your own. (Or at least use an off-the-shelf HTTP request parser / response serialiser if you want to use your own socket code.)
Both your client and server code ignore character encoding. You need to convert bytes to/from chars using a Charset instance.
Use HttpURLConnection in your client, rather than a simple TCP socket.
Better, use something like https://hc.apache.org/ for your HTTP functionality.

Related

Unzip http response

I'm a beginner in Java and I'm trying to decompress an HTTP response in gzip format. Roughly, I have a bufferReader which allows me to read lines of http response from a socket. Thanks to that, I parse the HTTP header, and if it specifies that the body is in gzip format, I have to decompress it. Here is the code I use:
DataInputStream response = new DataInputStream(clientSideSocket.getInputStream());
BufferedReader buffer = new BufferedReader(new InputStreamReader(response))
header = parseHTTPHeader(buffer); // return a map<String,String> with header options
StringBuilder SBresponseBody = new StringBuilder();
String responseBody = new String();
String line;
while((line = buffer.readLine())!= null) // extract the body as if it was a string...
    SBresponseBody.append(line);
responseBody = SBresponseBody.toString();
if (header.get("Content-Encoding").contains("gzip"))
    responseBody = unzip(responseBody); // function I try to construct
My attempt for the unzip function is as follows:
private String unzip(String body) throws IOException {
    String responseBody = "";
    byte[] readBuffer = new byte[5000];
    GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(body.getBytes()));
    int read = gzip.read(readBuffer,0,readBuffer.length);
    gzip.close();
    byte[] result = Arrays.copyOf(readBuffer, read);
    responseBody = new String(result, "UTF-8");
    return responseBody;
}
I get an error in the GZIPInputStream: not GZIP format (because gzip header is not found in body).
Here are my thoughts:
• Is body.getBytes() wrong, since the body has been read by a BufferedReader as a character string, so converting it back to byte[] makes no sense because it has already been interpreted in the wrong way? Or am I converting the String body back to byte[] in the wrong way?
• Do I have to build a GZIP header myself using the information provided in the HTTP header and adding it to the String body ?
• Do I need to create another InputStream from my socket.getInputStream() to read the information byte by byte, or is it tricky since there is already a buffer "connected" to this socket?

Roughly, I have a bufferReader which allows me to read lines of http response from a socket.
You've hand-rolled an HTTP client.
This is not a good thing; HTTP is considerably more complicated than you think it is. gzip is just one of about 10,000 things you need to think about. There's HTTP/2, SPDY, HTTP/3, chunked transfer encoding, TLS, redirects, MIME packing, and so much more.
So, if you want to write an actual HTTP client, you need about 100x this code and a ton of domain knowledge, because the actual specs of the HTTP protocol, while handy, don't really tell the story. The de-facto protocol you're implementing is 'whatever servers connected to the internet tend to send' and what they tend to send is tightly wound up with 'whatever commonly used browsers tend to get right', which is almost, but not quite, what that spec document says. This is one of those cases where pragmatics and implementations are the 'real spec', and the actual spec is merely attempting to document reality.
That's a long way around to say: your mistake is trying to hand-roll an HTTP client. Don't do that. Use OkHttp or the HTTP client (java.net.http.HttpClient) introduced in the core libraries in JDK 11.
But, I know what I want!
Your code is loaded up with bugs, though.
DataInputStream response = new DataInputStream(clientSideSocket.getInputStream());
DataInputStream is useless here. Remove that wrapper.
BufferedReader buffer = new BufferedReader(new InputStreamReader(response))
Missing semicolon. Also, this is broken: it converts the bytes flowing over the wire to characters using the platform default encoding, which is wrong; you need to look at the Content-Type header.
responseBody = unzip(responseBody)
You cannot do this. Your major misunderstanding is that you appear to think there is no difference between a bunch of bytes and a sequence of characters.
That's wrong. Once you have decoded the bytes into chars, you can no longer unzip them.
The fix is to check the response headers for gzip FIRST, then wrap your InputStream in a GZIPInputStream.
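As a rough sketch of that order of operations: read the header lines as bytes from the raw stream, decide on gzip, and only then decode the body. The class and method names below are hypothetical, UTF-8 for the body is an assumption (a real client would take the charset from Content-Type), and the sketch assumes the body simply runs until the stream ends, with no chunked transfer encoding:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipBodySketch {

    /** Reads one header line byte by byte; returns null at end-of-stream. */
    public static String readHeaderLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') buf.write(b);
        }
        if (b == -1 && buf.size() == 0) return null;
        return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Parses the status line and headers, then returns the decoded body text. */
    public static String readBody(InputStream in) throws IOException {
        Map<String, String> headers = new HashMap<>();
        readHeaderLine(in); // status line, ignored in this sketch
        String line;
        while ((line = readHeaderLine(in)) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                headers.put(line.substring(0, colon).trim().toLowerCase(Locale.ROOT),
                        line.substring(colon + 1).trim());
            }
        }
        String encoding = headers.getOrDefault("content-encoding", "").toLowerCase(Locale.ROOT);
        // Decide on gzip BEFORE touching the body, and wrap the raw byte stream
        InputStream body = encoding.contains("gzip") ? new GZIPInputStream(in) : in;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = body.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        // Assumption: UTF-8 body; only now are the bytes turned into characters
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Builds a fake gzip-encoded response so the sketch can run without a network. */
    public static byte[] fakeGzipResponse(String text) throws IOException {
        ByteArrayOutputStream gz = new ByteArrayOutputStream();
        try (GZIPOutputStream g = new GZIPOutputStream(gz)) {
            g.write(text.getBytes(StandardCharsets.UTF_8));
        }
        ByteArrayOutputStream response = new ByteArrayOutputStream();
        response.write("HTTP/1.1 200 OK\r\nContent-Encoding: gzip\r\n\r\n"
                .getBytes(StandardCharsets.ISO_8859_1));
        response.write(gz.toByteArray());
        return response.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readBody(new ByteArrayInputStream(fakeGzipResponse("hello, gzip"))));
    }
}
```

The headers are read one byte at a time on purpose: a BufferedReader would read ahead and swallow part of the compressed body before the GZIPInputStream gets to see it.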

Is the official Oracle SSLSocketClient.java demo code insecure?

From this link, a demo for SSLSocketClient.java is given:
import java.net.*;
import java.io.*;
import javax.net.ssl.*;
/*
 * This example demostrates how to use a SSLSocket as client to
 * send a HTTP request and get response from an HTTPS server.
 * It assumes that the client is not behind a firewall
 */
public class SSLSocketClient {
    public static void main(String[] args) throws Exception {
        try {
            SSLSocketFactory factory =
                (SSLSocketFactory)SSLSocketFactory.getDefault();
            SSLSocket socket =
                (SSLSocket)factory.createSocket("www.verisign.com", 443);
            /*
             * send http request
             *
             * Before any application data is sent or received, the
             * SSL socket will do SSL handshaking first to set up
             * the security attributes.
             *
             * SSL handshaking can be initiated by either flushing data
             * down the pipe, or by starting the handshaking by hand.
             *
             * Handshaking is started manually in this example because
             * PrintWriter catches all IOExceptions (including
             * SSLExceptions), sets an internal error flag, and then
             * returns without rethrowing the exception.
             *
             * Unfortunately, this means any error messages are lost,
             * which caused lots of confusion for others using this
             * code. The only way to tell there was an error is to call
             * PrintWriter.checkError().
             */
            socket.startHandshake();
            PrintWriter out = new PrintWriter(
                                  new BufferedWriter(
                                  new OutputStreamWriter(
                                  socket.getOutputStream())));
            out.println("GET / HTTP/1.0");
            out.println();
            out.flush();
            /*
             * Make sure there were no surprises
             */
            if (out.checkError())
                System.out.println(
                    "SSLSocketClient: java.io.PrintWriter error");
            /* read response */
            BufferedReader in = new BufferedReader(
                                    new InputStreamReader(
                                    socket.getInputStream()));
            String inputLine;
            while ((inputLine = in.readLine()) != null)
                System.out.println(inputLine);
            in.close();
            out.close();
            socket.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
I have two questions:
According to this official document, if we are using a raw SSLSocketFactory rather than the HttpsURLConnection, there is no hostname verification enforced in the handshake process. Therefore, hostname verification should be done manually.
When using raw SSLSocket and SSLEngine classes, you should always check the peer's credentials before sending any data. The SSLSocket and SSLEngine classes do not automatically verify that the host name in a URL matches the host name in the peer's credentials. An application could be exploited with URL spoofing if the host name is not verified. Since JDK 7, endpoint identification/verification procedures can be handled during SSL/TLS handshaking. See the SSLParameters.getEndpointIdentificationAlgorithm method.
Does it mean the demo is insecure?
I saw a solution to add hostname verification in Java 7 as:
SSLParameters sslParams = new SSLParameters();
sslParams.setEndpointIdentificationAlgorithm("HTTPS");
sslSocket.setSSLParameters(sslParams);
When the algorithm is specified as "HTTPS", the handshake will verify the hostname. Otherwise (the algorithm is empty when only using a raw SSLSocketFactory), hostname verification is not invoked at all.
I'm curious whether I could fix it as follows:
SSLSocketFactory factory =
    (SSLSocketFactory)SSLSocketFactory.getDefault();
SSLSocket socket =
    (SSLSocket)factory.createSocket("www.verisign.com", 443);
HostnameVerifier hv = HttpsURLConnection.getDefaultHostnameVerifier();
if(!hv.verify(socket.getSession().getPeerHost(),socket.getSession())){
    throw new CertificateException("Hostname does not match!");
}
I saw that HttpsURLConnection.getDefaultHostnameVerifier() returns a default HostnameVerifier; can I use it to do the verification? I see many people talking about using a custom HostnameVerifier. If there is a default one, why would we need to customize it?

Borderline as an answer but got much too long for comments.
(1) Yes, for HTTPS (as noted in the paragraph after the one you quoted) this is a security flaw; probably this example was written before Java 7 and not updated since. You could file a bug report for them to update it. (Of course there are some applications using SSL/TLS that don't validate the hostname, like SNMPS and LDAPS, and that don't even have URLs, but they can still be implemented using Java JSSE.)
(2) the HTTP is wrong or poor also:
PrintWriter uses the JVM's lineSeparator which varies by platform, but HTTP standards (RFCs 2068, 2616, 7230) require CRLF for request header(s) on all platforms, though some servers (probably including google) will accept just-LF following the traditional Postel maxim 'be conservative in what you send and liberal in what you receive';
the read side assumes all data is line-oriented and won't be damaged by canonicalizing EOLs, which is true for HTTP header and some bodies like the text/html you will get from most webservers when request has no Accept (or Accept-encoding), but is not guaranteed;
the read side also assumes all data can be decoded from and re-encoded to the JVM default 'charset' safely; this is true for HTTP header (which is effectively 7-bit ASCII) but not many/most bodies: in particular handling 8859 or similar as UTF8 will destroy much of it, and handling UTF8 as 8859 or CP1252 will mojibake it.
(3) HTTP/1.0 is officially obsolete, although it is still widely supported and makes a significantly simpler demo, so I'd let that one slide.
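Points (1) and (2) could be sketched roughly as follows: turn on endpoint identification through SSLParameters before the handshake, and build the request with explicit CRLF line endings instead of relying on println. The class and method names and the host example.com are placeholders made up for this illustration, not part of the Oracle demo:

```java
import java.nio.charset.StandardCharsets;
import javax.net.ssl.SSLParameters;
import javax.net.ssl.SSLSocket;

public class HttpsSocketSketch {

    /** Sets HTTPS-style endpoint identification so the handshake checks the host name. */
    public static SSLParameters withHostnameCheck(SSLParameters params) {
        params.setEndpointIdentificationAlgorithm("HTTPS");
        return params;
    }

    /** Applies the check to a socket; intended to be called before startHandshake(). */
    public static void enableHostnameCheck(SSLSocket socket) {
        socket.setSSLParameters(withHostnameCheck(socket.getSSLParameters()));
    }

    /** Builds a minimal request with CRLF line endings on every platform. */
    public static byte[] buildGetRequest(String host, String path) {
        String request = "GET " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";
        // The request line and these header fields are plain ASCII
        return request.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        SSLParameters params = withHostnameCheck(new SSLParameters());
        System.out.println(params.getEndpointIdentificationAlgorithm());
        System.out.print(new String(buildGetRequest("example.com", "/"), StandardCharsets.US_ASCII));
    }
}
```

In the demo, enableHostnameCheck(socket) would go between createSocket(...) and socket.startHandshake(), and the request bytes would be written straight to socket.getOutputStream(), which also avoids PrintWriter swallowing IOExceptions.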

HttpsUrlConnection response returned from servlet contains extra 'b''0''\r\n' characters when read through python library

I am using HttpsURLConnection to call a server, and my servlet returns the response it gets from the HttpsURLConnection. I copy the response from the HttpsURLConnection to the HttpServletResponse using streams: I copy bytes from the connection's response input stream to the servlet response's output stream, detecting the end by checking whether read returns < 0.
Following is the code for copying the response. The variable response is of type HttpServletResponse and the variable httpCon is of type HttpsURLConnection.
InputStream responseStream = httpCon.getInputStream();
if (responseStream != null)
{
    OutputStream os = response.getOutputStream();
    byte[] buffer = new byte[1024];
    int len;
    while ((len = responseStream.read(buffer)) >= 0)
    {
        os.write(buffer, 0, len);
    }
    os.flush();
    os.close();
}
On the client side, I am using the Python requests library to read the response.
If I use curl to test my servlet, I get the proper JSON response: response = u'{"key":"value"}'.
If I read it with Python requests, there are extra characters in the response, which looks like the following:
response = u'b0\r\n{"key":"value"}\r\n0\r\n\r\n'
Both strings are unicode, but the second one has extra characters.
With curl or the Postman REST client I get the same response properly, but with Python requests it does not work. I tried another Python library, livetest, and it fails the same way, with the same characters in the response. I also tried changing the Accept-Encoding header, but it had no effect.
Because of this, I am not able to parse the JSON.
I don't want to change the client to parse this kind of string.
Can I change something on the server so that it works correctly?

Did the response contain the header "Transfer-Encoding: chunked"?
If so, the response is in chunked transfer encoding: https://en.wikipedia.org/wiki/Chunked_transfer_encoding
In that case, the \r\n0\r\n\r\n at the end of the response is expected, since it is the terminating sequence of this encoding. I guess curl and Postman simply handle chunked transfer encoding for us, so you don't see these chunk markers.
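To make the framing concrete, here is a minimal sketch of what a client that understands chunked transfer encoding does with such a body: each chunk is a hexadecimal size line, CRLF, that many bytes of data, CRLF, and a chunk of size 0 ends the body. The class name ChunkedSketch is made up for this illustration, and trailers after the last chunk are ignored:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ChunkedSketch {

    /** Reads bytes up to LF and returns them without the CR/LF terminator. */
    private static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') sb.append((char) b);
        }
        return sb.toString();
    }

    /** Decodes a chunked body into the plain payload bytes. */
    public static byte[] decodeChunked(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semi = sizeLine.indexOf(';');      // drop optional chunk extensions
            if (semi >= 0) sizeLine = sizeLine.substring(0, semi);
            int size = Integer.parseInt(sizeLine.trim(), 16);
            if (size == 0) break;                  // zero-size chunk marks the end
            byte[] chunk = new byte[size];
            int off = 0;
            while (off < size) {                   // read() may return fewer bytes than asked
                int n = in.read(chunk, off, size - off);
                if (n == -1) throw new IOException("truncated chunk");
                off += n;
            }
            out.write(chunk, 0, size);
            readLine(in);                          // consume the CRLF after the chunk data
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // 0xf = 15, the byte length of {"key":"value"}
        byte[] raw = "f\r\n{\"key\":\"value\"}\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(decodeChunked(new ByteArrayInputStream(raw)),
                StandardCharsets.UTF_8));
    }
}
```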

Passing String through Socket duplicates value

I am creating a simple Client-Server application and facing some weird behaviour when passing messages through a Socket: When the Client writes to the server, the message is passed correctly, however when the server sends a response, whichever value is sent through the socket seems to get duplicated...
Here is a sample code of what the server does:
.
.
.
public void respond(Socket socket)
{
    try
    {
        InputStreamReader inStream = new InputStreamReader( socket.getInputStream() );
        PrintWriter outStream = new PrintWriter(
            new OutputStreamWriter( socket.getOutputStream(), "UTF-16" ) );
        outStream.write("Message received\n");
        outStream.flush();
        .
        .
        .
    }
    catch (Exception e) { /* Do something */ }
}
.
.
.
Server and Client are currently running on the same machine.
Furthermore, encoding seems to be no issue when writing from client to server, but it is when writing from server to client: If I specify any other (or no) encoding than UTF-16 for the OutputStreamWriter, the Client won't be able to parse the message correctly.
Does anyone have an idea why that might be?

The character encoding on each end of the conversation needs to be the same: the Charset used for decoding by the InputStreamReader at the client must match the one used for encoding by the OutputStreamWriter at the server (and vice versa).
If you don't specify one, the JVM's default is used.
Although you didn't provide your client's code, the fact that the server uses the default Charset to read and UTF-16 to write makes me think there is a potential mismatch.
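A small sketch of the matching rule, using in-memory streams in place of the socket. The class and method names are invented for this illustration; the message and the UTF-16 charset come from the question:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMatchSketch {

    /** Encodes text the way the server's OutputStreamWriter would. */
    public static byte[] encode(String text, Charset cs) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(wire, cs)) {
            w.write(text);
        }
        return wire.toByteArray();
    }

    /** Decodes one line the way the client's InputStreamReader would. */
    public static String decodeLine(byte[] wire, Charset cs) throws IOException {
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(wire), cs))) {
            return r.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = encode("Message received\n", StandardCharsets.UTF_16);
        // Same charset on both ends: the text survives the round trip
        System.out.println(decodeLine(wire, StandardCharsets.UTF_16));
        // Different charset on the reading end: the text comes out garbled
        System.out.println(decodeLine(wire, StandardCharsets.UTF_8));
    }
}
```

Passing an explicit Charset to both the reader and the writer, on both the client and the server, removes the dependency on each JVM's default.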

reading bytes from web site

I am trying to create a proxy server.
I want to read the websites byte by byte so that I can display images and everything else. I tried readLine, but I can't display images. Do you have any suggestions on how I can change my code to send all the data to the browser with a DataOutputStream object?
try{
    Socket s = new Socket(InetAddress.getByName(req.hostname), 80);
    String file = parcala(req.url);
    DataOutputStream out = new DataOutputStream(clientSocket.getOutputStream());
    BufferedReader dis = new BufferedReader(new InputStreamReader(s.getInputStream()));
    PrintWriter socketOut = new PrintWriter(s.getOutputStream());
    socketOut.print("GET "+ req.url + "\n\n");
    //socketOut.print("Host: "+req.hostname);
    socketOut.flush();
    String line;
    while ((line = dis.readLine()) != null){
        System.out.println(line);
    }
}
catch (Exception e){}
}
Edited Part
This is what I have to do. I can block banned web sites, but I can't let other web sites through in my program.
In the filter program, you will open a TCP socket at the specified port and wait for connections. If a request comes (i.e. the client types a URL to access a web site), the application will process it to decide whether access is allowed or not and then, using the same socket, it will send the reply back to the client. After the client opened her connection to WebPolice (and her request has been checked and is allowed), the real web page needs to be shown to the client. Therefore, since the user already gave her request, now it is WebPolice’s turn to forward the request so that the user can get the web page. Thus, WebPolice acts as a client and requests the web page. This means you need to open a connection to the web server (without closing the connection to the user), forward the request over this connection, get the reply and forward it back to the client. You will use threads to handle multiple connections (at the same time and/or at different times).

I don't know exactly what you're trying to do, but crafting an HTTP request and reading its response involves somewhat more than you have done here. readLine won't work on binary data anyway.
You can take a look at the URLConnection class (stolen here):
URL oracle = new URL("http://www.oracle.com/");
URLConnection yc = oracle.openConnection();
BufferedReader in = new BufferedReader(new InputStreamReader(yc.getInputStream()));
Then you can read textual data from the in object; for binary data, read from yc.getInputStream() directly instead of wrapping it in a Reader.

readLine will treat the line read as a String, so unless you want to mess around with conversions back to bytes, I wouldn't recommend it.
I would just read bytes until you can't read any more, then write them out to a file. This should allow you to grab the images while keeping file headers intact, which can be important when dealing with files other than text.
Hope this helps.

Instead of using BufferedReader you can try to use InputStream.
It has several methods for reading bytes.
http://docs.oracle.com/javase/6/docs/api/java/io/InputStream.html
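A minimal sketch of such a byte-level copy, which a proxy could use to pass the web server's response to the browser unchanged. The class name ByteRelaySketch and the 8192-byte buffer are arbitrary choices for this illustration; in the question's code, the input would be s.getInputStream() and the output would be clientSocket.getOutputStream():

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ByteRelaySketch {

    /** Copies everything from in to out without interpreting it; returns the byte count. */
    public static long relay(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {   // -1 signals end-of-stream
            out.write(buffer, 0, n);            // write only the bytes actually read
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // The eight-byte PNG signature stands in for binary image data
        byte[] png = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = relay(new ByteArrayInputStream(png), sink);
        System.out.println(copied + " bytes relayed");
    }
}
```

Since no Reader is involved, image bytes and line endings pass through untouched.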
