URL encoding of "&" - java

I am having trouble sending some encoded characters. My aim is to send text in an HTTPS POST request. The catch is that the back-end is not a great one, so I have to send special letters (such as æ) in a custom format.
As a simple example, take the text hjælp. The letter æ must become &aelig for the back-end to recognize it as that specific letter.
My url looks like this:
https://example.com/back-end/sendText?user=admin&text=hj&aeliglp
Obviously, this doesn't work, because the back-end sees three parameter keys: user, text and aeliglp.
In code, of course, my url is an actual URL object. However, if I use URLEncoder.encode(value, "utf-8"); it turns my & into %26, and the %26 itself then becomes %2526.
On Wikipedia I read this:
Because the percent ("%") character serves as the indicator for
percent-encoded octets, it must be percent-encoded as "%25" for that
octet to be used as data within a URI.
Nevertheless, I must send it as %26, not as %2526, because I cannot change the back-end or ask for it to be changed.
In order to send the POST I use the most basic way:
HttpsURLConnection conn = (HttpsURLConnection) url.openConnection(); //url is my URL object
conn.setRequestMethod("POST");
int result = conn.getResponseCode();
Is there any way I can create a URL object without it being encoded automatically?
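A minimal sketch of one way to approach this, under stated assumptions. URLEncoder.encode applied once to the raw parameter value produces %26; %2526 only appears when an already-encoded string is encoded a second time. java.net.URL itself does not encode what it is given. The class and method names (QueryValueEncoding, encodeValue, buildUrl) are hypothetical, and whether the back-end accepts text=hj%26aeliglp is an assumption to check against the real server.

```java
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLEncoder;

final class QueryValueEncoding {

    // Encode a single parameter value exactly once.
    static String encodeValue(String rawValue) {
        try {
            return URLEncoder.encode(rawValue, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Assemble the URL from values that are encoded individually.
    // The separators "?", "=" and "&" are added unencoded on purpose.
    static URL buildUrl(String user, String text) {
        String query = "user=" + encodeValue(user) + "&text=" + encodeValue(text);
        try {
            return new URL("https://example.com/back-end/sendText?" + query);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The custom "&aelig" replacement is assumed to have been applied already.
        System.out.println(buildUrl("admin", "hj&aeliglp"));
        // prints https://example.com/back-end/sendText?user=admin&text=hj%26aeliglp

        // Encoding the already-encoded value again is what yields %2526.
        System.out.println(encodeValue(encodeValue("hj&aeliglp")));
        // prints hj%2526aeliglp
    }
}
```

If %2526 still shows up on the wire, it may be worth checking whether the value passes through an encoder twice somewhere between building the string and opening the connection.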

Related

How to send special character via HTTP post request made in Java

I need to send data to another system in a Java application via the HTTP POST method. Using the Apache HttpClient library is not an option.
I create the URL and the HTTP connection without problems. But when sending a special character like the Spanish Ñ, the system complains that it is receiving Ã‘ instead of Ñ.
I've read many posts, but there are some things I don't understand:
When making a POST connection and writing to the connection object, is it mandatory to apply URLEncoder.encode(data, encoding) to the data being sent?
When sending the data, some examples I have seen use
conn.writeBytes(strData), and others use conn.write(strData.getBytes(encoding)). Which one is better? Is it related to using the encoder?
Update:
The current code:
URL url = new URL(URLstr);
conn1 = (HttpsURLConnection) url.openConnection();
conn1.setRequestMethod("POST");
conn1.setDoOutput(true);
DataOutputStream wr = new DataOutputStream(conn1.getOutputStream());
wr.writeBytes(strToSend);//data sent
wr.flush();
wr.close();
(later I get the response)
strToSend has previously been passed through URLEncoder.encode with "UTF-8".
I still don't know whether I must use URLEncoder in my code and/or setRequestProperty("Content-Type", "application/x-www-form-urlencoded"),
or whether I must use .write(strToSend.getBytes(??)).
Any ideas are welcome. I am also testing against the real server (I don't know very much about it).
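A hedged sketch of how the pieces in this question might fit together, assuming the receiving system expects a standard form-encoded body in UTF-8 (the question does not confirm this). The class FormPostSketch and its methods are hypothetical names. The post method is shown for shape only and is not called in main, since it needs a real endpoint.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class FormPostSketch {

    // Build "key=value" with both parts form-encoded as UTF-8.
    static String formField(String key, String value) {
        try {
            return URLEncoder.encode(key, "UTF-8") + "=" + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Send a form-encoded body; the Content-Type tells the server how to decode it.
    static int post(URL endpoint, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) endpoint.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type",
                "application/x-www-form-urlencoded; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }

    // What DataOutputStream.writeBytes puts on the wire: it keeps only the
    // low eight bits of each char, so it is safe for pure-ASCII strings only.
    static byte[] viaWriteBytes(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeBytes(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        String raw = "\u00D1and\u00FA"; // the word "Ñandú"
        System.out.println(formField("name", raw));          // prints name=%C3%91and%C3%BA
        System.out.println(viaWriteBytes("\u00D1").length);  // prints 1
        System.out.println("\u00D1".getBytes(StandardCharsets.UTF_8).length); // prints 2
    }
}
```

Once the body has been through URLEncoder it is plain ASCII, so writeBytes and write(getBytes(...)) produce the same bytes. The two only differ for strings that still contain non-ASCII characters, where writeBytes silently truncates each character to one byte.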

Sending UTF-8 values in HTTP headers results in Mojibake

I want to send Arabic data from a servlet to the client using HttpServletResponse.
I am trying this:
response.setCharacterEncoding("UTF-8");
response.setHeader("Info", arabicWord);
and I receive the word like this:
String arabicWord = response.getHeader("Info");
On the client (receiving side) I also tried this:
byte[]d = response.getHeader("Info").getBytes("UTF-8");
arabicWord = new String(d);
but it seems that the Unicode is lost, because I receive strange English characters. How can I send and receive Arabic UTF-8 words?
HTTP headers don't support UTF-8. Officially they support ISO-8859-1 only. See also RFC 2616 section 2:
Words of *TEXT MAY contain characters from character sets other than ISO-8859-1 [22] only when encoded according to the rules of RFC 2047 [14].
Your best bet is to URL-encode and decode them.
response.setHeader("Info", URLEncoder.encode(arabicWord, "UTF-8"));
and
String arabicWord = URLDecoder.decode(response.getHeader("Info"), "UTF-8");
URL-encoding will transform them into %nn format, which is perfectly valid ISO-8859-1. Note that the data sent in headers may be subject to size limits. It is better to send it in the response body instead, as plain text, JSON, CSV or XML. Using custom HTTP headers this way is a design smell.
I don't know where the word variable is coming from, but try this:
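The round trip described in this answer can be sketched without a servlet container. The class HeaderValueCodec and its method names are hypothetical; the sketch only shows that the encoded form is pure ASCII (and therefore safe in a header) and that decoding restores the original string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class HeaderValueCodec {

    // Sender side: turn arbitrary text into an ASCII-only header value.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Receiver side: restore the original text from the header value.
    static String decode(String headerValue) {
        try {
            return URLDecoder.decode(headerValue, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // True when every character is in the 7-bit ASCII range.
    static boolean isAscii(String s) {
        return s.chars().allMatch(c -> c < 128);
    }

    public static void main(String[] args) {
        String arabicWord = "\u0645\u0631\u062D\u0628\u0627"; // sample Arabic text
        String headerValue = encode(arabicWord);
        System.out.println(headerValue);
        System.out.println(isAscii(headerValue));                  // prints true
        System.out.println(decode(headerValue).equals(arabicWord)); // prints true
    }
}
```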
arabicWord = new String(d, "UTF-8");
UPDATE: Looks like the problem is with UTF-8 encoded data in HTTP headers, see: HTTP headers encoding/decoding in Java for detailed discussion.

HttpURLConnection and encoded characters in the URL

I have code like this:
URL url = new URL("http://foo.com/?param=paj%E9");
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
...
However, it seems that openConnection is suppressing the "%E9" part of the URL, and the server ends up receiving the request http://foo.com?param=paj
Am I forgetting to apply some setting for this to work properly?
Thanks!
EDIT: The URL "http://foo.com/?param=paj%E9" is already encoded (from http://foo.com/?param=pajé), and this is the request the server should receive. If I access http://foo.com/?param=paj%E9 straight from the browser, it works as expected. If I URL-encode "paj%E9", I'll be double-encoding the parameter, and the server will see "paj%E9" instead of "pajé" upon decoding the value.
I'm actually trying to build a proxy, and therefore I receive the urls already encoded. The problem is that whenever I pass such an encoded parameter to be requested using HttpURLConnection, it simply ignores the encoded part (like %E9).
You need to use the java.net.URI class to encode your URL instead of handling it on your own. Check this:
HTTP URL Address Encoding in Java
You can use the following code
URLEncoder.encode("中文", "utf-8")
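A small sketch of the two points raised in the question's EDIT: the charset passed to URLEncoder decides whether é becomes %E9 or %C3%A9, and encoding an already-encoded value a second time escapes the percent sign. It also shows that java.net.URI keeps an already-encoded query verbatim. The class CharsetMatters and its methods are hypothetical names, and this does not diagnose why the %E9 was dropped in the asker's setup.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;

final class CharsetMatters {

    // Percent-encode a value using the named charset.
    static String encode(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Return the query exactly as written in an already-encoded URL.
    static String rawQueryOf(String alreadyEncodedUrl) {
        return URI.create(alreadyEncodedUrl).getRawQuery();
    }

    public static void main(String[] args) {
        String value = "paj\u00E9"; // the word "pajé"
        System.out.println(encode(value, "ISO-8859-1"));    // prints paj%E9
        System.out.println(encode(value, "UTF-8"));         // prints paj%C3%A9
        System.out.println(encode("paj%E9", "ISO-8859-1")); // prints paj%25E9
        System.out.println(rawQueryOf("http://foo.com/?param=paj%E9")); // prints param=paj%E9
    }
}
```

For a proxy that receives URLs already encoded, the last line suggests forwarding the raw form unchanged rather than decoding and re-encoding it.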

java.io.IOException: Server returns HTTP response code 505

I have HTTP-based queries in my code, and one specific kind seems to cause IOExceptions when a 505 response is received from the server. I have looked up the 505 response and found other people who seemed to have similar problems. Apparently 505 stands for an HTTP version mismatch, but when I copy the same query URL into any browser (I tried Firefox, SeaMonkey and Opera) there seems to be no problem. One of the posts I read suggested that browsers might handle the version mismatch automatically.
I have tried to dig in deeper by using the nice developer tool that comes with Opera, and it looks like there is no mismatch in versions (I believe Java uses HTTP 1.1) and a nice 200 OK response is received. Why do I experience problems when the same query goes through my Java code?
private InputStream openURL(String urlName) throws IOException{
URL url = new URL(urlName);
URLConnection urlConnection = url.openConnection();
return urlConnection.getInputStream();
}
sample link: http://www.uniprot.org/uniprot/?query=mnemonic%3aNUGM_HUMAN&format=tab&columns=id,entry%20name,reviewed,organism,length
There have been some issues in Tomcat with URLs that contain spaces. To fix the problem, you need to encode your URL with URLEncoder.
Example (notice the space):
String url="http://example.org/test test2/index.html";
String encodedURL=java.net.URLEncoder.encode(url,"UTF-8");
System.out.println(encodedURL); //outputs http%3A%2F%2Fexample.org%2Ftest+test2%2Findex.html
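Note that passing the whole URL through URLEncoder, as in the example above, also escapes the scheme and the slashes, so the result can no longer be opened as a URL. A minimal sketch of an alternative, assuming only the path needs escaping: the multi-argument java.net.URI constructor quotes illegal characters in each component separately. The class ComponentEncoding and the method encodePath are hypothetical names.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class ComponentEncoding {

    // Build an http URI from a host and an unencoded path.
    // The path must start with "/"; characters such as spaces are quoted.
    static String encodePath(String host, String rawPath) {
        try {
            URI uri = new URI("http", host, rawPath, null);
            return uri.toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodePath("example.org", "/test test2/index.html"));
        // prints http://example.org/test%20test2/index.html
    }
}
```

The returned string can then be handed to new URL(...) as it stands, since only the space was escaped and the URL structure is intact.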
As a developer at www.uniprot.org, I have the advantage of being able to look in the request logs. According to the logs, we have not sent a 505 response code in the last year. In any case, our servers understand HTTP 1.0 requests as well as the default HTTP 1.1 (though you might not get the results that you expect).
That makes me suspect that either there was some kind of data corruption on the way, or you were affected by a hardware failure (lately we have had some trouble with a switch and a whole datacentre ;). In any case, if you ever have questions or problems with uniprot.org, please contact help@uniprot.org and we can see whether we can help or fix the problem.
Your code snippet seems normal and should work.
Regards,
Jerven Bolleman
Are you behind a proxy? This code works for me and prints out the same text I see through a browser.
final URL url = new URL("http://www.uniprot.org/uniprot/?query=mnemonic%3aNUGM_HUMAN&format=tab&columns=id,entry%20name,reviewed,organism,length");
final URLConnection conn = url.openConnection();
final InputStream is = conn.getInputStream();
System.out.println(IOUtils.toString(is));
conn is an instance of HttpURLConnection.
From the API documentation for the URL class:
The URL class does not itself encode or decode any URL components
[...]. It is the responsibility of the caller to encode any fields,
which need to be escaped prior to calling URL, and also to decode any
escaped fields, that are returned from URL.
So if your URL string contains any spaces, encode them before calling new URL(url-str).
@posdef I was having the same HTTP 505 problem. When I pasted the URL from my Java code into Firefox or Chrome, it worked, but the code threw an IOException. Eventually I found that the URL string contained the brackets '(' and ')'. After removing them it worked, so it seems I needed URL encoding just as the browsers do.

Safe Data serialization for Plain HTTP GET & POST communication

I'm using the client's browser to submit HTTP request.
For report generation the securityToken is submitted via POST; for report download the same token needs to be submitted by the user's browser, this time using GET.
What encoding would you recommend for the securityToken, which actually represents encrypted data?
I've tried Base64, but this fails because the standard alphabet can include the "+" character, which gets translated to ' ' (a blank space) in an HTTP GET.
Then I tried URL encoding, but this fails because in an HTTP POST sequences such as %3d are transmitted without translation, whereas when the browser does an HTTP GET with the same data, %3d is converted to '='.
What encoding would you recommend to allow safe transmission over both HTTP POST and GET without the data being misinterpreted?
The environment is Java, Tomcat.
Thank you,
Maxim.
Hex string.
Apache commons-codec has a Hex class that provides this functionality.
It will look like this:
http://youraddress.com/context/servlet?param=ac7432be432b21
Well, you can keep the Base64 and use this solution:
Code for decoding/encoding a modified base64 URL
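Both suggestions can be sketched with the standard library alone. This assumes Java 8 or later, where java.util.Base64 offers a URL-safe alphabet ('-' and '_' instead of '+' and '/'); it is a stand-in for, not a copy of, the commons-codec Hex class and the linked modified Base64 code. The class TokenEncoding and its method names are hypothetical.

```java
import java.util.Base64;

final class TokenEncoding {

    // Lower-case hex: only 0-9 and a-f, so nothing needs escaping in a URL.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // URL-safe Base64 without '=' padding: shorter than hex, still URL-friendly.
    static String toUrlSafeBase64(byte[] data) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(data);
    }

    // Inverse of toUrlSafeBase64; padding is optional for the decoder.
    static byte[] fromUrlSafeBase64(String token) {
        return Base64.getUrlDecoder().decode(token);
    }

    public static void main(String[] args) {
        // These two bytes give "+/8=" in standard Base64, the problem case.
        byte[] encrypted = {(byte) 0xfb, (byte) 0xff};
        System.out.println(toHex(encrypted));           // prints fbff
        System.out.println(toUrlSafeBase64(encrypted)); // prints -_8
    }
}
```

Hex doubles the token length, while URL-safe Base64 grows it by roughly a third, which may matter if the GET URL is already long.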
