Urlencoding data for post request body. Am I using wrong charset? - java

I want to replicate a working POST request in Java. For testing purposes, let's take a message like: 'äöõüäöõüäöõüäöõü'
Working POST request (with encoded message of 'äöõüäöõüäöõüäöõü'):
Header
POST http://www.mysite.com/newreply.php?do=postreply&t=477352 HTTP/1.1
Host: www.warriorforum.com
Connection: keep-alive
Content-Length: 403
Origin: http://www.mysite.com
X-Requested-With: XMLHttpRequest
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko)Chrome/14.0.835.202 Safari/535.1
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Accept: */*
Referer: http://www.mysite.com/test-forum/477352-test.html
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: bblastvisit=1319205053; bblastactivity=0; bbuserid=265374; bbpassword=1125e9ec1ab41f532ab8ec6f77ddaf94; bbsessionhash=91444317c100996990a04d6c5bbd8375;
Body
securitytoken=1319806096-618e5f9012901e2d818bf2c74c2121baa064be57&ajax=1&ajax_lastpost=1319806096&message=%u00E4%u00F6%u00F5%u00FC%u00E4%u00F6%u00F5%u00FC%u00E4%u00F6%u00F5%u00FC%u00E4%u00F6%u00F5%u00FC&wysiwyg=0&styleid=1&signature=1&fromquickreply=1&s=&do=postreply&t=477352&p=who%20cares&specifiedpost=0&parseurl=1&loggedinuser=265374
As we can see, in the request body 'äöõüäöõüäöõüäöõü' is encoded as: %u00E4%u00F6%u00F5%u00FC%u00E4%u00F6%u00F5%u00FC%u00E4%u00F6%u00F5%u00FC%u00E4%u00F6%u00F5%u00FC
Now I want to replicate it.
Let's URL-encode the text with charset utf-8 in Java:
String userText = "äöõüäöõüäöõüäöõü";
String encoded = URLEncoder.encode(userText, "utf-8");
Result: %C3%A4%C3%B6%C3%B5%C3%BC%C3%A4%C3%B6%C3%B5%C3%BC%C3%A4%C3%B6%C3%B5%C3%BC%C3%A4%C3%B6%C3%B5%C3%BC%0A%0A%0A%5BSIZE%3D%221%22%5D%5BI%5D << NOT THE SAME
Let's try ISO-8859-1:
String userText = "äöõüäöõüäöõüäöõü";
String encoded = URLEncoder.encode(userText, "ISO-8859-1");
Result: %E4%F6%F5%FC%E4%F6%F5%FC%E4%F6%F5%FC%E4%F6%F5%FC%0A%0A%0A%5BSIZE%3D%221%22%5D%5BI%5D << NOT THE SAME
Neither of them produces the same encoded string as the working example, although they all have the same input. What am I missing here?

%u00E4%u00F6%u00F5%u00FC%u00E4%u00F6%u00F5%u00FC%u00E4%u00F6%u00F5%u00FC%u00E4%u00F6%u00F5%u00FC
I don't know exactly what produced the above data, but it isn't application/x-www-form-urlencoded; charset=UTF-8 as the request claims. This is not legal data for this MIME type.
It looks like the non-standard %uXXXX form, in which each UTF-16 code unit is written as four hex digits (for example, ä is U+00E4, hence %u00E4). URLEncoder cannot produce this form with any charset.
URLEncoder.encode(userText, "utf-8"); would be the correct way to encode the application/x-www-form-urlencoded; charset=UTF-8 values if this were actually what the server was expecting. (ref)
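If the server really does insist on the %uXXXX form shown in the captured body, it would have to be produced by hand. The following is only a minimal sketch under that assumption: the class name PercentUEncoder and the method encodePercentU are hypothetical, and the choice to leave only letters, digits, '-', '_' and '.' unescaped (with other ASCII written as %XX, matching the %20 in the capture) is a guess at what the original page's script does.

```java
public class PercentUEncoder {

    // Hypothetical helper: writes every non-ASCII UTF-16 code unit as %uXXXX,
    // mimicking the non-standard form seen in the captured request body.
    static String encodePercentU(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9') || c == '-' || c == '_' || c == '.';
            if (unreserved) {
                out.append(c);                                   // safe ASCII stays as-is
            } else if (c < 0x80) {
                out.append(String.format("%%%02X", (int) c));    // other ASCII as %XX
            } else {
                out.append(String.format("%%u%04X", (int) c));   // everything else as %uXXXX
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Unicode escapes avoid depending on the source file's encoding.
        System.out.println(encodePercentU("\u00E4\u00F6\u00F5\u00FC"));
    }
}
```

Whether the server accepts this is something only a test against the real endpoint can confirm.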

Related

Handling blank line in BufferedReader

I am receiving this POST request from a client:
HTTP method: POST
Host: 127.0.0.1:52400
Connection: keep-alive
Content-Length: 18
Pragma: no-cache
Cache-Control: no-cache
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Origin: null
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.122 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip,deflate
Accept-Language: da-DK,da;q=0.8,en-US;q=0.6,en;q=0.4,es;q=0.2
fname=foof&pw=bar
I have a small and very simple Java Webserver running, getting this request from InputStream.
From the BufferedReader I set data to a String, containing the request, like this:
for (String line; (line = in.readLine()) != null; ) {
if (line.isEmpty()) break;
header += line + "\n";
}
When I print header to the console, I get this:
POST / HTTP/1.1
Host: 127.0.0.1:52400
Connection: keep-alive
Content-Length: 18
Pragma: no-cache
Cache-Control: no-cache
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Origin: null
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.122 Safari/537.36
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip,deflate
Accept-Language: da-DK,da;q=0.8,en-US;q=0.6,en;q=0.4,es;q=0.2
The POST parameters are left out.
I guess the problem is caused by the blank line in the POST request.
How can I make sure the BufferedReader reads the request to the end, without stopping at the blank line, but still stopping when it hits the end of the request?
Please ignore the lack of security in this example - I simply need to get the POST request into plain string representation for now.
Any help on this is appreciated, thanks!
Jesper.
The problem is that you have a break in your for loop. When you reach the blank line, it hits the break and exits the loop, so none of the lines after that are added. Instead, you should use this:
for (String line; (line = in.readLine()) != null; ) {
if (line.isEmpty()) continue;
header += line + "\n";
}
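By using continue instead of break, the loop will simply proceed to the next iteration, and the rest of the lines can be added. Note that the loop then only ends when readLine() returns null, that is, when the client closes the connection. On a keep-alive connection, and with a body that does not end in a newline, readLine() may block waiting for more data; reading exactly Content-Length characters after the blank line avoids that.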
More information can be found here

jquery ajax and java server, lost data

I have this Ajax function that looks like so:
$.ajax({
type: "POST",
url: "http://localhost:55556",
data: "lots and lots of pie",
cache: false,
success: function(result)
{
alert("sent");
},
error: function()
{
alert('An error has occurred, please try again.');
}
});
and a server that looks like so
clientSocket = AcceptConnection();
inp = new BufferedReader(new InputStreamReader (clientSocket.getInputStream()));
String requestString = inp.readLine();
BufferedReader ed = new BufferedReader(new InputStreamReader(clientSocket.getInputStream()));
while(true){
String tmp = inp.readLine();
System.out.println(tmp);
}
Now the odd thing is that when I send my Ajax request, this is what my server prints via System.out:
Host: localhost:55556
User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Content-Length: 20
Origin: null
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
The question is: where is the data that I sent through? Where is "lots and lots of pie"?
The data should come after a blank line after the header lines, but I think the problem is that the data does not end with a newline character, and therefore, you cannot read it with the .readLine() method.
While looping through the header lines, you could look for the "Content-Length" line and get the length of the data. When you have reached the blank line, stop using .readLine(). Instead switch to reading one character at a time, reading the number of characters specified by the "Content-Length" header. I think you can find example code for this in this answer.
If you can, I suggest you use a library to help with this. I think the Apache HTTP Core library can help with this. See this answer.
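The approach described above (read header lines until the blank line, remember Content-Length, then read exactly that many characters) can be sketched as follows. This is a minimal illustration, not a full HTTP parser: the class name RequestReader and the method readBody are hypothetical, and it assumes a single-byte body so that one character equals one byte of Content-Length.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class RequestReader {

    // Hypothetical helper: skips the request line and headers, then returns the body.
    static String readBody(BufferedReader in) {
        try {
            int contentLength = 0;
            String line;
            // The header section ends at the first empty line.
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                int colon = line.indexOf(':');
                if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                    contentLength = Integer.parseInt(line.substring(colon + 1).trim());
                }
            }
            // The body has no trailing newline, so read a fixed number of characters
            // instead of calling readLine() again.
            char[] body = new char[contentLength];
            int read = 0;
            while (read < contentLength) {
                int n = in.read(body, read, contentLength - read);
                if (n == -1) break; // stream ended early
                read += n;
            }
            return new String(body, 0, read);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the socket stream, using the request from the question.
        String raw = "POST / HTTP/1.1\r\nHost: localhost:55556\r\n"
                + "Content-Length: 20\r\n\r\nlots and lots of pie";
        System.out.println(readBody(new BufferedReader(new StringReader(raw))));
    }
}
```

With a real socket, the same BufferedReader must be used for both the headers and the body, because it may already have buffered part of the body while reading the header lines.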

Send WebForm as POST - wrong format

I have a web page with a web form. It's basically an input field and a selector, but there are a couple of hidden fields. I need to send a POST request to this page (it is basically a search page). I've constructed the request as follows:
HttpClient httpClient = new DefaultHttpClient();
HttpConnectionParams.setConnectionTimeout(httpClient.getParams(), 10000);
HttpConnectionParams.setSoTimeout(httpClient.getParams(), 10000);
HttpPost httpPost = new HttpPost("http://www3.u-szeged.hu/kereses_terem.ivy");
httpPost.setHeader("Content-Type", "multipart/form-data");
httpPost.setHeader("name", "search-terem");
List<NameValuePair> nameValuePairs = new ArrayList<NameValuePair>();
nameValuePairs.add(new BasicNameValuePair("qsearch", "601"));
nameValuePairs.add(new BasicNameValuePair("medium-subtype", "meta-KOD"));
//and a lot of other parameters, please read on
httpPost.setEntity(new UrlEncodedFormEntity(nameValuePairs));
HttpResponse response = httpClient.execute(httpPost);
I've captured the sent request with Fiddler2, here is the raw request:
POST **************** HTTP/1.1
Content-Type: multipart/form-data
name: search-terem
Content-Length: 760
Host: ****************
Connection: Keep-Alive
User-Agent: Apache-HttpClient/UNAVAILABLE (java 1.4)
qsearch=601&medium-subtype=meta-KOD&meta-KOD=&meta-CIM=&num=20&root-id=epulet&relation-types=child+vchild&medium-type=meta&object-types=etr_epulet&result-order=caption&result-order-direction=ASC&pageloader.preexecute=%09%2Fivy%2Fiem-shared%2Fsystem%2Fgems%2Fform-cye%2Fform-engine.xslt&request.formXML=%25webroot%25%2Fxml%2Fforms%2Fusz-search-terem.xml&request.formAPage=%09start&request.form-save-id=epulet&request.form-save-type=col&request.form-save-enabled=true&request.form-instance=2F58DD5D-BFCC-43A4-9B0C-0BC7BF887242&request.form-save-to-instance=false&request.form-redirect-to=kereses_terem.ivy&request.form-language=hu-HU&request.form-save-language=hu-HU&request.instance-id=2F58DD5D-BFCC-43A4-9B0C-0BC7BF887242&request.formPage=--finish--&cmd=submit
If I send the form in the browser, this is the request that is sent:
POST http://www3.u-szeged.hu/object.epulet.ivy HTTP/1.1
Host: www3.u-szeged.hu
Connection: keep-alive
Content-Length: 3033
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Origin: http://www3.u-szeged.hu
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryuiCLY3ISz0tLmrvg
Referer: http://www3.u-szeged.hu/object.epulet.ivy
Accept-Encoding: gzip,deflate,sdch
Accept-Language: hu-HU,hu;q=0.8,en-US;q=0.6,en;q=0.4
Cookie: ivy:state-id:persistent=c9581db3-7a49-4e39-9130-3635b491385b; ivy:state-id:session=271fb66f-9579-4f13-a5b6-56c81dea4206
------WebKitFormBoundaryuiCLY3ISz0tLmrvg
Content-Disposition: form-data; name="meta-KOD"
------WebKitFormBoundaryuiCLY3ISz0tLmrvg
Content-Disposition: form-data; name="meta-CIM"
------WebKitFormBoundaryuiCLY3ISz0tLmrvg
Content-Disposition: form-data; name="num"
20
//and so on
This is obviously not the same, so the server responds with 0 results. How can I send a POST request like this? Thanks!
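Comparing the two captures, the Java request labels itself multipart/form-data but carries a URL-encoded body and no boundary parameter, while the browser sends one part per field, separated by a boundary. The browser also posts to a different URL (object.epulet.ivy). The sketch below only illustrates the multipart wire format using plain Java; the class name MultipartSketch, the method buildBody and the boundary value are made up for the example, and it is not a statement about what this particular server requires.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MultipartSketch {

    // Hypothetical helper: renders simple text fields as a multipart/form-data body.
    static String buildBody(Map<String, String> fields, String boundary) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            sb.append("--").append(boundary).append("\r\n");
            sb.append("Content-Disposition: form-data; name=\"")
              .append(field.getKey()).append("\"\r\n");
            sb.append("\r\n");                        // blank line between part headers and value
            sb.append(field.getValue()).append("\r\n");
        }
        sb.append("--").append(boundary).append("--\r\n"); // closing delimiter
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("meta-KOD", "");
        fields.put("num", "20");
        String boundary = "----ExampleBoundary123"; // assumption: any string not occurring in the values
        // The same boundary must also appear in the header:
        // Content-Type: multipart/form-data; boundary=----ExampleBoundary123
        System.out.print(buildBody(fields, boundary));
    }
}
```

In practice a multipart entity class from an HTTP client library would normally generate both the body and the matching Content-Type header, so the header should not be set by hand without a boundary.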

html src hidden

I'm trying to read a web page using HttpClient, but some of the HTML is hidden by some JS magic. Try hitting view source on this page: http://uc.worldoftanks.eu/uc/accounts/#wot&at_search=a
Any idea how to get HttpClient to return the "full" html page?
HttpClient does not process JavaScript, which means no content can be hidden from it when it reads the HTTP response from the server.
It's probably the other way round: the JavaScript that runs on the page likely creates new HTML elements and appends them to the DOM, which is not something you can handle using HttpClient. HttpClient is a communication client designed purely to read data across an HTTP connection.
When that page loads, a request is being sent to
http://uc.worldoftanks.eu/uc/accounts/?type=table&offset=0&limit=25&order_by=name&search=a&echo=1&id=accounts_index
Try hitting that address up with your HttpClient to see the table data. Play with the offset, limit and order_by values to change pagination and sorting.
Manually browsing to that URL yields a redirect, though, so there appear to be some request headers that you need to include in your HttpClient. The full headers of the request my browser issues, which does yield a JSON response with the table data, are as follows:
GET /uc/accounts/?type=table&offset=0&limit=25&order_by=name&search=&echo=1&id=accounts_index HTTP/1.1
Host: uc.worldoftanks.eu
Connection: keep-alive
Referer: http://uc.worldoftanks.eu/uc/accounts/?type=table&offset=0&limit=25&order_by=name&search=a&echo=1&id=accounts_index
X-Requested-With: XMLHttpRequest
X-CSRFToken: 5e33bf57602f76de9285e9b14bcfe7fe
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.107 Safari/535.1
Accept: application/json, text/javascript, */*; q=0.01
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-GB,en;q=0.8,en-US;q=0.6,ar;q=0.4
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: csw_popup=true; __utma=21812543.1316104722.1312873581.1312873581.1312873581.1; __utmb=21812543.2.10.1312873581; __utmc=21812543; __utmz=21812543.1312873581.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); csrftoken=5e33bf57602f76de9285e9b14bcfe7fe
They might be looking for X-Requested-With, Accept or Referer, for instance.
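As a rough illustration of sending those headers, here is a sketch that uses the JDK's HttpURLConnection rather than Apache HttpClient, purely to stay dependency-free. The class name AjaxHeaderSketch and the method prepare are hypothetical, and which headers the server actually checks is unknown; the cookie and X-CSRFToken values from the capture are deliberately left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class AjaxHeaderSketch {

    // Hypothetical helper: configures (but does not send) a GET request that
    // carries the headers an XMLHttpRequest from the page would send.
    static HttpURLConnection prepare(String url, String referer) {
        try {
            HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("GET");
            conn.setRequestProperty("X-Requested-With", "XMLHttpRequest");
            conn.setRequestProperty("Accept", "application/json, text/javascript, */*; q=0.01");
            conn.setRequestProperty("Referer", referer);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn = prepare(
                "http://uc.worldoftanks.eu/uc/accounts/?type=table&offset=0&limit=25"
                        + "&order_by=name&search=a&echo=1&id=accounts_index",
                "http://uc.worldoftanks.eu/uc/accounts/"); // assumption: a plausible Referer value
        // Nothing has been sent yet; calling conn.getInputStream() would perform the request.
        System.out.println(conn.getRequestProperty("X-Requested-With"));
    }
}
```

Adding the headers one at a time and watching when the redirect disappears is a simple way to find out which of them the server depends on.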

HTTP POST request with authorization on android

When I set the "Authorization" header with setHeader on HttpPost, the hostname disappears from the request and error 400 (bad request) is always returned. The same code works fine on plain Java (without Android), and when I remove the "Authorization" header it also works fine on Android, but I need authorization.
This is the code (domain changed):
HttpClient client = new DefaultHttpClient();
HttpPost post = new HttpPost("http://myhost.com/test.php");
post.setHeader("Accept", "application/json");
post.setHeader("User-Agent", "Apache-HttpClient/4.1 (java 1.5)");
post.setHeader("Host", "myhost.com");
post.setHeader("Authorization",getB64Auth());
List <NameValuePair> nvps = new ArrayList <NameValuePair>();
nvps.add(new BasicNameValuePair("data[body]", "test"));
AbstractHttpEntity ent=new UrlEncodedFormEntity(nvps, HTTP.UTF_8);
ent.setContentType("application/x-www-form-urlencoded; charset=UTF-8");
ent.setContentEncoding("UTF-8");
post.setEntity(ent);
post.setURI(new URI("http://myhost.com/test.php"));
HttpResponse response =client.execute(post);
Method getB64Auth() returns "login:password" encoded using Base64 like: "YnxpcYRlc3RwMTulHGhlSGs=" but it's not important.
This is a piece of lighttpd's error.log when the above code is run on plain Java:
2011-02-23 15:37:36: (request.c.304) fd: 8 request-len: 308
POST /test.php HTTP/1.1
Accept: application/json
User-Agent: Apache-HttpClient/4.1 (java 1.5)
Host: myhost.com
Authorization: Basic YnxpcYRlc3RwMTulHGhlSGs=
Content-Length: 21
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Content-Encoding: UTF-8
Connection: Keep-Alive
HTTP/1.1 200 OK
Content-type: text/html
Transfer-Encoding: chunked
and record from access.log (IP changed):
1.1.1.1 myhost.com - [23/Feb/2011:15:37:36 +0100] "POST /test.php HTTP/1.1" 200 32 "-" "Apache-HttpClient/4.1 (java 1.5)"
When the same code is invoked on android, I get this in logs:
POST /test.php HTTP/1.1
Accept: application/json
User-Agent: Apache-HttpClient/4.1 (java 1.5)
Host: myhost.com
Authorization: Basic YnxpcYRlc3RwMTulHGhlSGs=
Content-Length: 21
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Content-Encoding: UTF-8
Connection: Keep-Alive
Expect: 100-Continue
2011-02-23 15:45:10: (response.c.128) Response-Header:
HTTP/1.1 400 Bad Request
Content-Type: text/html
Content-Length: 349
Connection: close
access.log:
1.1.1.1 - - [23/Feb/2011:15:45:10 +0100] "POST /test.php HTTP/1.1" 400 349 "-" "Apache-HttpClient/4.1 (java 1.5)"
How can I get Authorization with POST working on Android?
When I use HttpURLConnection instead of HttpClient, it makes no difference.
Thanks to Samuh for a hint :)
There was an extra newline character inserted, which makes no difference in GET requests but matters in POST ones.
This is the proper way to generate the Authorization header on Android (in getB64Auth in this case):
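private String getB64Auth (String login, String pass) {
String source=login+":"+pass;
// NO_WRAP stops the encoder from appending a newline. Basic auth uses the
// standard Base64 alphabet, so the URL_SAFE flag is not set here.
String ret="Basic "+Base64.encodeToString(source.getBytes(),Base64.NO_WRAP);
return ret;
}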
The Base64.NO_WRAP flag was lacking.
Or simply use this:
String authorizationString = "Basic " + Base64.encodeToString(
("your_login" + ":" + "your_password").getBytes(),
Base64.NO_WRAP); //Base64.NO_WRAP flag
post.setHeader("Authorization", authorizationString);
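For comparison, outside Android (Java 8 and later) the same header can be built with java.util.Base64, whose basic encoder does not insert line separators, so there is no stray newline to break the header block. This is a minimal sketch; the class name BasicAuthSketch and the method basicAuthHeader are hypothetical, and the credentials are placeholders.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthSketch {

    // Hypothetical helper: builds the value of an HTTP Basic Authorization header.
    static String basicAuthHeader(String login, String password) {
        String source = login + ":" + password;
        // Base64.getEncoder() is the basic encoder: standard alphabet, no line breaks.
        return "Basic " + Base64.getEncoder()
                .encodeToString(source.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String header = basicAuthHeader("login", "password");
        System.out.println(header);
        // A newline inside a header value would end the header early, which is
        // the kind of corruption described above.
        System.out.println("contains newline: " + header.contains("\n"));
    }
}
```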
