Socket HTTP request returning invalid GZIP - java

I am teaching myself more about HTTP requests, so I wrote a simple POST request using Java's HttpURLConnection class. It returns compressed data, which is easily decompressed. I then decided to go a level lower and send the HTTP request with sockets (for practice). I figured it out after a series of Google searches, but there is one issue: when the server responds with compressed data, the data isn't valid. Here is an image of a bit of debugging.
http://i.imgur.com/KfAcero.png
The portion below the "=" separator line is the response when using a HttpURLConnection instance, but the portion above it is the response when using sockets. I'm not too sure what is going on here. The bottom part is valid, while the top is not.
The HttpParameter and HttpHeader classes simply store a key and a value.
public String sendPost(String host, String path, List<HttpParameter> parameters, List<HttpHeader> headers) throws UnknownHostException, IOException {
String data = this.encodeParameters(parameters);
Socket socket = new Socket(host, 80);
PrintWriter writer = new PrintWriter(socket.getOutputStream());
BufferedReader reader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
writer.println("POST " + path + " HTTP/1.1");
for(HttpHeader header : headers) {
writer.println(header.getField() + ": " + header.getValue());
}
writer.println();
writer.println(data);
writer.flush();
StringBuilder contentBuilder = new StringBuilder();
for(String line; (line = reader.readLine()) != null;) {
contentBuilder.append(line + "\n");
}
reader.close();
writer.close();
return contentBuilder.toString();
}

Your problem is that you are using Readers and Writers for something that is not text.
InputStream and OutputStream work with bytes; Reader and Writer work with encoded text. If you try to use Reader and Writer with something that is not encoded text, you will mangle it.
Sending the request with a Writer is fine.
You want to do something like this instead:
InputStream in = socket.getInputStream();
// ...
ByteArrayOutputStream contentBuilder = new ByteArrayOutputStream();
byte[] buffer = new byte[32768]; // the size of this doesn't matter too much
int num_read;
while(true) {
num_read = in.read(buffer);
if(num_read < 0)
break;
contentBuilder.write(buffer, 0, num_read);
}
in.close();
writer.close();
return contentBuilder.toByteArray();
and make sendPost return a byte array.
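To see why the Reader breaks the gzip data, here is a minimal sketch that is not part of the original code. It assumes UTF-8 decoding, and the class and method names (GzipBytesDemo, gzip, gunzip, throughText) are made up for illustration. It compresses a small payload, then compares the untouched bytes with the same bytes after a round trip through text:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipBytesDemo {
    // Compresses raw bytes, standing in for a gzip-encoded response body.
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        }
        return out.toByteArray();
    }

    // Decompresses gzip data that was kept as bytes the whole way.
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gz.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return out.toByteArray();
    }

    // Imitates a Reader: decode the bytes as text, then encode them again.
    // Byte sequences that are not valid UTF-8 get replaced along the way.
    static byte[] throughText(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] body = gzip("hello".getBytes(StandardCharsets.UTF_8));
        // The byte-only path decompresses cleanly.
        System.out.println(new String(gunzip(body), StandardCharsets.UTF_8));
        // The text round trip no longer matches the original bytes.
        System.out.println(Arrays.equals(body, throughText(body)));
    }
}
```

Keep in mind that the byte array returned by sendPost would still begin with the status line and headers, so the gzip data only starts after the first blank line. A response sent with chunked transfer encoding would also need to be de-chunked before decompressing.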

Related

URLConnection doesn't read whole page

In my app I need to download some web page. I do it in a way like this
URL url = new URL(myUrl);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setReadTimeout(5000000);//5 seconds to download
conn.setConnectTimeout(5000000);//5 seconds to connect
conn.setRequestMethod("GET");
conn.setDoInput(true);
conn.connect();
int response = conn.getResponseCode();
is = conn.getInputStream();
String s = readIt(is, len);
System.out.println("got: " + s);
My readIt function is:
public String readIt(InputStream stream) throws IOException {
int len = 10000;
Reader reader;
reader = new InputStreamReader(stream, "UTF-8");
char[] buffer = new char[len];
reader.read(buffer);
return new String(buffer);
}
The problem is that it doesn't download the whole page. For example, if myUrl is "https://wikipedia.org", then the output is
How can I download the whole page?
Update
The second answer to "Read/convert an InputStream to a String" solved my problem. The problem is in the readIt function. You should read the response from the InputStream like this:
static String convertStreamToString(java.io.InputStream is) {
java.util.Scanner s = new java.util.Scanner(is).useDelimiter("\\A");
return s.hasNext() ? s.next() : "";
}
There are a number of mistakes in your code:
You are reading into a character buffer with a fixed size.
You are ignoring the result of the read(char[]) method. It returns the number of characters actually read, and you need to use that.
You are assuming that read(char[]) will read all of the data. In fact, it is only guaranteed to return at least one character, or -1 to indicate that you have reached the end of the stream. When you read from a network connection, you are liable to get only the data that has already been sent by the other end and buffered locally.
When you create the String from the char[], you are assuming that every position in the character array contains a character from your stream.
There are multiple ways to do it correctly, and this is one way:
public String readIt(InputStream stream) throws IOException {
Reader reader = new InputStreamReader(stream, "UTF-8");
char[] buffer = new char[4096];
StringBuilder builder = new StringBuilder();
int len;
while ((len = reader.read(buffer)) > 0) {
builder.append(buffer, 0, len);
}
return builder.toString();
}
Another way to do it is to look for an existing 3rd-party library that provides a readFully(Reader) method.
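If a Java 9 or later runtime is available, the standard library can also do this without a third-party dependency. The sketch below is only one option and makes two assumptions: the page is UTF-8, and it is small enough to hold in memory. The class name ReadAll is made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ReadAll {
    // Reads until end of stream, then decodes the bytes in one step.
    // InputStream.readAllBytes() requires Java 9 or later.
    static String readIt(InputStream stream) throws IOException {
        return new String(stream.readAllBytes(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // A stand-in "page" larger than the 10000-character buffer
        // used in the question.
        char[] chars = new char[25000];
        Arrays.fill(chars, 'x');
        byte[] page = new String(chars).getBytes(StandardCharsets.UTF_8);

        String result = readIt(new ByteArrayInputStream(page));
        System.out.println(result.length());
    }
}
```

Decoding once at the end also avoids splitting a multi-byte character across two reads, which can happen when raw byte chunks are converted to strings one at a time.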
You need to read in a loop till there are no more bytes left in the InputStream.
while (-1 != (len = in.read(buffer))) { /* do stuff here */ }
You are reading at most 10000 characters from the input stream.
Use a BufferedReader to make your life easier.
public String readIt(InputStream stream) throws IOException {
BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
StringBuilder out = new StringBuilder();
String newLine = System.getProperty("line.separator");
String line;
while ((line = reader.readLine()) != null) {
out.append(line);
out.append(newLine);
}
return out.toString();
}

Writing a simple HTTP server to accept GET requests

I'm trying to create a simple server that accepts a request, and then writes the content of a file to the browser that sent the request. The server connects and writes to the socket. However my browser says
no data received
and doesn't display anything.
public class Main {
/**
* @param args
*/
public static void main(String[] args) throws IOException{
while(true){
ServerSocket serverSock = new ServerSocket(6789);
Socket sock = serverSock.accept();
System.out.println("connected");
InputStream sis = sock.getInputStream();
BufferedReader br = new BufferedReader(new InputStreamReader(sis));
String request = br.readLine(); // Now you get "GET index.html HTTP/1.1"
String[] requestParam = request.split(" ");
String path = requestParam[1];
System.out.println(path);
PrintWriter out = new PrintWriter(sock.getOutputStream(), true);
File file = new File(path);
BufferedReader bfr = null;
String s = "Hi";
if (!file.exists() || !file.isFile()) {
System.out.println("writing not found...");
out.write("HTTP/1.0 200 OK\r\n");
out.write(new Date() + "\r\n");
out.write("Content-Type: text/html");
out.write("Content length: " + s.length() + "\r\n");
out.write(s);
}else{
FileReader fr = new FileReader(file);
bfr = new BufferedReader(fr);
String line;
while ((line = bfr.readLine()) != null) {
out.write(line);
}
}
if(bfr != null){
bfr.close();
}
br.close();
out.close();
serverSock.close();
}
}
}
Your code works for me (data shows up in the browser), if I use
http://localhost:6789/etc/hosts
and there is a file /etc/hosts (Linux filesystem notation).
If the file does not exist, this snippet
out.write("HTTP/1.0 200 OK\r\n");
out.write(new Date() + "\r\n");
out.write("Content-Type: text/html\r\n");
out.write("\r\n");
out.write("File " + file + " not found\r\n");
out.flush();
will return data that shows up in the browser. Note that I have explicitly added a call to flush() here. Make sure that out is flushed in the other case as well.
The other possibility is to reorder your close statements.
A quote from EJP's answer on How to close a socket:
You should close the outermost output stream you have created from the socket. That will flush it.
This is especially the case if the outermost output stream is (another quote from the same source):
a buffered output stream, or a stream wrapped around one. If you don't close that, it won't be flushed.
So out.close() should be called before br.close().
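Putting the points above together, here is a rough sketch of building a complete response and flushing it before anything is closed. It is not the asker's code: the class SimpleResponse and its methods are hypothetical, and the choice of UTF-8 plus a Content-Length header is an assumption about what the server should send.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class SimpleResponse {
    // Builds the status line, the headers, the mandatory blank line,
    // and then the body. Every header line ends with CRLF.
    static byte[] buildResponse(String status, String body) {
        byte[] payload = body.getBytes(StandardCharsets.UTF_8);
        String head = "HTTP/1.0 " + status + "\r\n"
                + "Content-Type: text/html; charset=utf-8\r\n"
                + "Content-Length: " + payload.length + "\r\n"
                + "\r\n";
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(headBytes, 0, headBytes.length);
        out.write(payload, 0, payload.length);
        return out.toByteArray();
    }

    // Sends the response and flushes it before the caller closes anything.
    static void send(OutputStream socketOut, byte[] response) throws IOException {
        socketOut.write(response);
        socketOut.flush();
    }

    public static void main(String[] args) throws IOException {
        // System.out stands in for sock.getOutputStream() here.
        send(System.out, buildResponse("404 Not Found", "File not found"));
    }
}
```

Content-Length is counted in bytes of the encoded body, not in characters, which is why the sketch measures the byte array rather than calling length() on the String.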

Splitting strings by newline trouble

I am reading in a file that is being sent through a socket and then trying to split it on newlines (\n). When I read in the file I use a byte[], and I convert the byte array to a String so that I can split it.
public String getUserFileData()
{
try
{
byte[] mybytearray = new byte[1024];
InputStream is = clientSocket.getInputStream();
int bytesRead = is.read(mybytearray, 0, mybytearray.length);
is.close();
return new String(mybytearray);
}
catch(IOException e)
{
}
return "";
}
Here is the code used to attempt to split the String
public void readUserFile(String userData, Log logger)
{
String[] data;
String companyName;
data = userData.split("\n");
username = data[0];
password = data[1].toCharArray();
companyName = data[2];
quota = Float.parseFloat(data[3]);
company = new Company();
company.readCompanyFile("C:\\Users\\Chris\\Documents\\NetBeansProjects\\ArFile\\ArFile Clients\\" + companyName + "\\"
+ companyName + ".cmp");
cloudFiles = new CloudFiles();
cloudFiles.readCloudFiles(this, logger);
}
It causes this error
Exception in thread "AWT-EventQueue-1" java.lang.ArrayIndexOutOfBoundsException
You can use the readLine method in BufferedReader class.
Wrap the InputStream under InputStreamReader, and wrap it under BufferedReader:
InputStream is = clientSocket.getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(is));
Please also check the encoding of the stream - you might need to specify the encoding in the constructor of InputStreamReader.
As stated in comments, using a BufferedReader would be best - you should be using an InputStreamReader anyway in order to convert from binary to text.
// Or use a different encoding - whatever's appropriate
BufferedReader reader = new BufferedReader(
new InputStreamReader(clientSocket.getInputStream(), "UTF-8"));
try {
String line;
// I'm assuming you want to read every incoming line
while ((line = reader.readLine()) != null) {
processLine(line);
}
} finally {
reader.close();
}
Note that it's important to state which encoding you want to use - otherwise it'll use the platform's default encoding, which will vary from machine to machine, whereas presumably the data is in one specific encoding. If you don't know which encoding that is yet, you need to find out. Until then, you simply can't reliably understand the data.
(I hope your real code doesn't have an empty catch block, by the way.)
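As a minimal sketch of the same idea, the lines can be collected into a list and counted before any index is used, which avoids the ArrayIndexOutOfBoundsException when fewer lines arrive than expected. The class name LineReading, the sample data, and the UTF-8 encoding are assumptions made for illustration only.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LineReading {
    // Reads every line until end of stream, decoding with an explicit charset.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Made-up stand-in for the data arriving on the socket.
        byte[] sample = "alice\nsecret\nAcme\n12.5\n".getBytes(StandardCharsets.UTF_8);
        List<String> data = readLines(new ByteArrayInputStream(sample));

        // Check the line count before indexing into the result.
        if (data.size() < 4) {
            throw new IOException("Expected 4 lines but got " + data.size());
        }
        System.out.println(data.get(0) + " / " + data.get(2) + " / "
                + Float.parseFloat(data.get(3)));
    }
}
```

Note that readLine() only returns null once the sender closes its side of the connection, so this approach assumes the whole file is sent and the stream is then closed.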

Streaming byte array to a Google App Engine servlet

I'm trying to stream an image in the form of a byte[] to a Google App Engine servlet. I've done this before with servlets running on Tomcat, but for some reason doing so with GAE seems to be more problematic.
The byte array is being streamed fine from the client side and has the correct size, but it is always empty when being read on the server side.
Here's the important snippet of code doing the streaming from the client:
URL myURL = new URL("http://myapp.appspot.com/SetAvatar?memberId=1");
URLConnection servletConnection = myURL.openConnection();
servletConnection.setRequestProperty("Content-Type", "application/octet-stream");
servletConnection.setDoOutput(true);
servletConnection.setDoInput(true);
OutputStream os = servletConnection.getOutputStream();
InputStream is = servletConnection.getInputStream();
IOUtils.write(imageBytes, os);
os.flush();
os.close();
BufferedReader in = new BufferedReader(new InputStreamReader(is));
String inputLine;
while ((inputLine = in.readLine()) != null) {
System.out.println(inputLine);
}
in.close();
Here's the code from the GAE servlet:
public void doPost(HttpServletRequest request, HttpServletResponse response) throws IOException, ServletException {
BufferedReader reader = request.getReader();
byte[] imageBytes = IOUtils.toByteArray(reader);
PrintWriter outputWriter = response.getWriter();
int len = request.getContentLength();
outputWriter.println("Content type is: " + request.getContentType());
outputWriter.println("Content length is: " + request.getContentLength());
outputWriter.println("Bytes read: " + imageBytes.length);
outputWriter.close();
}
The output from the server is:
Content type is: application/octet-stream
Content length is: 0
Bytes read: 0
I've tried just about everything like different readers and streams, but always with the same result: An empty byte array on the server side. I'm using the IOUtils class from the Apache Commons IO package, but I've tried without as well.
Any ideas why this is happening? Thanks in advance for any clues!

Upload image from J2ME client to a Servlet

I want to send an image from a J2ME client to a Servlet.
I am able to get a byte array of the image and send it using HTTP POST.
conn = (HttpConnection) Connector.open(url, Connector.READ_WRITE, true);
conn.setRequestMethod(HttpConnection.POST);
conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
os.write(bytes, 0, bytes.length); // bytes = byte array of image
This is the Servlet code:
String line;
BufferedReader r1 = new BufferedReader(new InputStreamReader(in));
while ((line = r1.readLine()) != null) {
System.out.println("line=" + line);
buf.append(line);
}
String s = buf.toString();
byte[] img_byte = s.getBytes();
But the problem I found is, when I send bytes from the J2ME client, some bytes are lost. Their values are 0A and 0D hex. Exactly, the Carriage Return and Line Feed.
Thus, either POST method or readLine() are not able to accept 0A and 0D values.
Any one have any idea how to do this, or how to use any another method?
That's because you're using a BufferedReader to read the binary stream line by line. The readLine() method splits the content on line terminators (CR, LF, or CRLF) and strips them, so the individual lines don't contain those bytes anymore.
Don't use the BufferedReader for binary streams, it doesn't make sense. Just write the obtained InputStream to an OutputStream of any flavor, e.g. FileOutputStream, the usual Java IO way.
InputStream input = null;
OutputStream output = null;
try {
input = request.getInputStream();
output = new FileOutputStream("/path/to/file.ext");
byte[] buffer = new byte[10240];
for (int length = 0; (length = input.read(buffer)) > 0;) {
output.write(buffer, 0, length);
}
} finally {
if (output != null) output.close();
if (input != null) input.close();
}
That said, the Content-Type you're using is technically wrong. You aren't sending a WWW-form URL-encoded value in the request body. You are sending a binary stream. It should be application/octet-stream or maybe image. This is not the cause of this problem, but it is just plain wrong.
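The difference can be seen with a handful of bytes. The following is an illustrative sketch only: the class BinaryCopyDemo and both method names are invented, and ISO-8859-1 is used in the readLine() variant purely so that the only bytes lost are the line terminators.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class BinaryCopyDemo {
    // Byte-for-byte copy: nothing is interpreted, so 0x0A and 0x0D survive.
    static byte[] copyBytes(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[10240];
        int length;
        while ((length = in.read(buffer)) != -1) {
            out.write(buffer, 0, length);
        }
        return out.toByteArray();
    }

    // The approach from the question: readLine() strips CR and LF.
    static byte[] copyViaReadLine(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.ISO_8859_1));
        StringBuilder buf = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            buf.append(line);
        }
        return buf.toString().getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        // Six "image" bytes that happen to include CR (0x0D) and LF (0x0A).
        byte[] image = {0x01, 0x0D, 0x0A, 0x02, 0x0A, 0x03};
        System.out.println(copyBytes(new ByteArrayInputStream(image)).length);
        System.out.println(copyViaReadLine(new ByteArrayInputStream(image)).length);
    }
}
```

With the default platform encoding instead of ISO-8859-1, the readLine() variant could damage other byte values as well, not only the line terminators.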
