I am trying to send several locations in one request to the Google Elevation API; you are supposed to be able to send up to 512 locations per request. Their documentation says to use:
An array of coordinates separated using the pipe ('|') character: locations=40.714728,-73.998672|-34.397,150.644
but I am getting back the error:
Caused by: java.net.URISyntaxException: Illegal character in query at index 99: https://maps.googleapis.com/maps/api/elevation/json?locations=51.606013718523265,-8.432384161819547|51.606031961540985,-8.432374430210215|51.60607166348032,-8.432334651837888|51.60610446039263,-8.4322494395575&key=myAPIkey
It works if I just send a single point. I am told to use the pipe ('|') character yet it won't accept it. My code is
position = ellipsePositions.get(index1);
longitude = position.getLongitude();
latitude = position.getLatitude();
APIstring = latitude + "," + longitude;
for (int index = 1; index < indexList.size(); index++)
{
if (ellipsePositions.get(index) != null)
{
position = ellipsePositions.get(index);
longitude = position.getLongitude();
latitude = position.getLatitude();
APIstring = APIstring + "|" + latitude + "," + longitude;
}
}
and
WebResource webResource = client.resource("https://maps.googleapis.com/maps/api/elevation/json?locations="+ APIstring + "&key=myAPIkey");
ClientResponse response = webResource.accept("application/json").get(ClientResponse.class);
String data = response.getEntity(String.class);
Can anyone help?
You need to URL-encode any "unsafe" characters in the query string. Per the documentation on web services:
Building a Valid URL
You may think that a "valid" URL is self-evident, but that's not quite the case. A URL entered within an address bar in a browser, for example, may contain special characters (e.g. "上海+中國"); the browser needs to internally translate those characters into a different encoding before transmission. By the same token, any code that generates or accepts UTF-8 input might treat URLs with UTF-8 characters as "valid", but would also need to translate those characters before sending them out to a web server. This process is called URL-encoding.
We need to translate special characters because all URLs need to conform to the syntax specified by the W3 Uniform Resource Identifier specification. In effect, this means that URLs must contain only a special subset of ASCII characters: the familiar alphanumeric symbols, and some reserved characters for use as control characters within URLs.
Some common characters that must be encoded are:
Unsafe character Encoded value
| %7C
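As a minimal sketch of that advice in Java, the locations value can be passed through java.net.URLEncoder before it is appended to the URL. The class and method names below are illustrative and are not part of the Google API or the original code. Note that URLEncoder also turns the commas into %2C, which is still valid inside a query string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class ElevationUrlBuilder {
    // Hypothetical helper: form-encodes a query value, so '|' becomes %7C.
    static String encodeQueryValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Builds the request URL with both query values encoded.
    static String buildUrl(String locations, String apiKey) {
        return "https://maps.googleapis.com/maps/api/elevation/json?locations="
                + encodeQueryValue(locations)
                + "&key=" + encodeQueryValue(apiKey);
    }

    public static void main(String[] args) {
        String apiString = "40.714728,-73.998672|-34.397,150.644";
        System.out.println(buildUrl(apiString, "myAPIkey"));
    }
}
```

The resulting string contains no raw pipe character, so it can be handed to client.resource(...) without triggering the URISyntaxException.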
Related
My Flutter app calls a REST API method /user/search/<search string> and I am forming the URL endpoint using encodeQueryComponent like this:
String endpoint = "/user/search/"+Uri.encodeQueryComponent(searchString);
The back-end implemented in Java tries to retrieve the search string like this:
String value = URLDecoder.decode(value, StandardCharsets.UTF_8.toString());
However, when the search string contains the + sign, the raw encoded string in the back-end contains %2B and the decoded string contains a space. As a temporary hack, I am currently doing value = value.replace("%2B", "+"); instead of decoding. But this is obviously not the right approach, because the search string may contain characters from any language or special characters.
Can someone tell me what is the right way to get the original string sent by the user in Java?
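One possible explanation, offered only as an assumption since the back-end framework is not shown, is that the value is being decoded twice. A single pass of URLDecoder turns %2B into +, and a second pass would then turn that + into a space. The sketch below (class and method names are illustrative) shows the two passes side by side.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class DoubleDecodeDemo {
    // Form-decodes a value once: %XX escapes are resolved and '+' becomes a space.
    static String formDecode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "a%2Bb";             // what the client sent for "a+b"
        String once = formDecode(raw);    // the intended value
        String twice = formDecode(once);  // a second decode corrupts it
        System.out.println("[" + once + "] [" + twice + "]");
    }
}
```

If that is what is happening, decoding exactly once (either by the framework or by the application code, but not both) should return the original string.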
I have set up a URI route as below:
router.attach("/pmap/campaign/{campaign}/staffCat/{staffCat}/isp/{isp}", PMAPResource.class);
In the PMAPResource.class I have the following code:
public Representation represent()
{
String campaignID = (String) this.getRequestAttributes().get("campaign");
String staffCat = Reference.decode((String) this.getRequestAttributes().get("staffCat"));
String ispID = (String) this.getRequestAttributes().get("isp");
}
The staffCat field is manual input from the user; it can be anything. Some examples are:
BAA2(A)
BAB1#(A)
BA B1
It works for most cases until it hits the # sign, where it returns a 404 Not Found error. The console dump shows the following: /fwd-PMAP/pmap/campaign/1/staffCat/BAB1.
What should I do in order to read the "#" so I can get BAB1#(A) as it is?
# has special meaning in a URL. It must be escaped.
There are actually many special characters in URLs, so you should always escape arbitrary text when building a URL in the client.
In this particular case, the correct URL sent by the client would be something like:
/fwd-PMAP/pmap/campaign/1/staffCat/BAB1%23(A)/isp/XXX
Again, this is a problem that needs to be fixed on the client. The server will decode the %23 for you.
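For a Java client, a minimal sketch of that escaping could rely on the multi-argument java.net.URI constructor, which quotes characters that are illegal in a path, such as # and the space. The helper name and the isp value XXX are placeholders. One caveat: a / inside staffCat would not be escaped by this approach and would need separate handling.

```java
import java.net.URI;
import java.net.URISyntaxException;

class PathEscapeDemo {
    // Hypothetical client-side helper: assembles the path and lets URI quote illegal characters.
    static String buildPath(String campaign, String staffCat, String isp) {
        String rawPath = "/fwd-PMAP/pmap/campaign/" + campaign
                + "/staffCat/" + staffCat
                + "/isp/" + isp;
        try {
            // Arguments are (scheme, host, path, fragment); only the path is used here.
            return new URI(null, null, rawPath, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildPath("1", "BAB1#(A)", "XXX"));
        System.out.println(buildPath("1", "BA B1", "XXX"));
    }
}
```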
Pages with spaces in the URL are not resolved correctly. For example,
http://www.streetinsider.com/Press Releases/National Trends Reflected in Plano Housing Market/9778767.html
or
http://www.streetinsider.com/Press%20Releases/National+Trends+Reflected+in+Plano+Housing+Market/9778767.html
Both give a 404. Please note that in the second URL "Press Releases" is encoded as "Press%20Releases".
However, the following two versions work fine, where "Press Releases" is encoded as "Press+Releases":
http://www.streetinsider.com/Press+Releases/National+Trends+Reflected+in+Plano+Housing+Market/9778767.html
The article title parses fine with either plus signs or hex-encoded spaces (%20):
http://www.streetinsider.com/Press+Releases/National%20Trends%20Reflected%20in%20Plano%20Housing%20Market/9778767.html
Both + and %20 represent spaces, so why this behavior?
Also, in Java, what could I use to get the correctly encoded URL?
Both + and %20 represent spaces
Only in query strings. Elsewhere in a URL a plus is a plus, not a space. In this case the web server gives you the same content for the two different URLs
http://www.streetinsider.com/Press+Releases/National+Trends+Reflected+in+Plano+Housing+Market/9778767.html
and
http://www.streetinsider.com/Press+Releases/National%20Trends%20Reflected%20in%20Plano%20Housing%20Market/9778767.html
but the two URLs are distinct, they're not alternative representations of the same URL.
Officially, + may stand for a space only in the query string (after ?).
This is what URLEncoder is for:
"?x=" + URLEncoder.encode("Hello World", "UTF-8");
"?x=" + URLEncoder.encode("ŝi estas ĉarma", "UTF-8");
?x=Hello+World
?x=%C5%9Di+estas+%C4%89arma
The more general URI class follows the specification and escapes spaces as %20. Note that the last argument of the four-argument constructor is the fragment, so the five-argument form is used here to pass a query:
URI uri = new URI("http", "www.streetinsider.com",
"/Press Releases/National Trends Reflected in Plano Housing Market/9778767.html",
"x=ŝi estas ĉarma", null);
String u = uri.toString();
http://www.streetinsider.com/Press%20Releases/National%20Trends%20Reflected%20in%20Plano%20Housing%20Market/9778767.html?x=ŝi%20estas%20ĉarma
One sometimes encounters URI used as a generalisation for File and other resources, and then has to be careful not to introduce %20 into file names.
So streetinsider probably remaps + (and apparently, in places, %20) on the server so that both forms reach the same code.
Your statement
Both + and %20 represent spaces.
is not exactly true in all cases.
Space characters may only be encoded as "+" in one context: application/x-www-form-urlencoded key-value pairs.
RFC 1866 (the HTML 2.0 specification), section 8.2.1, subparagraph 1, says: "The form field names and values are escaped: space characters are replaced by `+', and then reserved characters are escaped."
Here is an example of such a string in URL where RFC-1866 allows encoding spaces as pluses: "http://example.com/over/there?name=foo+bar". So, only after "?", spaces can be replaced by pluses (in other cases, spaces should be encoded to %20). This way of encoding form data is also given in later HTML specifications, for example, look for relevant paragraphs about application/x-www-form-urlencoded in HTML 4.01 Specification, and so on.
The URL that you have provided is not form data containing key/value pairs; it's just a path to a 9778767.html file:
http://www.streetinsider.com/Press%20Releases/National+Trends+Reflected+in+Plano+Housing+Market/9778767.html
So, it is illegal to use pluses here. The correct URL in this case should have been the following:
http://www.streetinsider.com/Press%20Releases/National%20Trends%20Reflected%20in%20Plano%20Housing%20Market/9778767.html
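The difference between the two contexts can be checked in Java with a small sketch. It uses example.com as a placeholder host and illustrative class and method names. URI.getPath() resolves %20 but leaves a + in the path untouched, whereas URLDecoder, which implements application/x-www-form-urlencoded decoding, turns + into a space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;

class PlusVersusPercent20 {
    // Returns the decoded path component; only %XX escapes are resolved.
    static String decodedPath(String url) {
        return URI.create(url).getPath();
    }

    // Decodes a form-encoded value; here '+' does mean a space.
    static String formDecode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decodedPath("http://example.com/Press+Releases/a.html"));
        System.out.println(decodedPath("http://example.com/Press%20Releases/a.html"));
        System.out.println(formDecode("foo+bar"));
    }
}
```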
I'm trying to get a URL parameter in Java EE.
So I have this kind of URL:
http://MySite/MySite.jsp?page=recherche&msg=toto
First I tried request.getParameter("msg").toString();
It works well, but if I search for "c++", getParameter() returns "c" and not "c++", and I understand why.
So I tried something else: I get the current query string and parse it to extract the value of the message:
String msg[]= request.getQueryString().split("msg=");
message=msg[1].toString();
It now works for the search "c++", but now I can't search for accented characters. What can I do?
EDIT 1
I encode the message in the url
String urlString=Utils.encodeUrl(request.getParameter("msg"));
So for the URL http://MySite/MySite.jsp?page=recherche&msg=c++
I get this encoded URL: http://MySite/MySite.jsp?page=recherche&msg=c%2B%2B
And when I need it, I decode the message from the URL:
String decodedUrl = URLDecoder.decode(url, "ISO-8859-1");
Thanks everybody
Anything you send via the GET method goes as part of the URL, which must be URL-encoded to be valid if it contains any reserved characters. So any such character needs to be encoded before sending.
In order to send c++, you would have to send c%2B%2B. That would be interpreted properly at the server side.
Here is a reference you can check:
http://www.blooberry.com/indexdot/html/topics/urlencoding.htm
Now the question is, how and where do you generate your URL? Depending on the language, you will need to use the proper method to encode your strings.
if I try to search "c++" , the method "getParameter()" returns "c" and not "c++"
Query parameters are treated as application/x-www-form-urlencoded, so a + character in the URL means a space character in the parameter value. If you want to send a + character then it needs to be encoded in the URL as %2B:
http://MySite/MySite.jsp?page=recherche&msg=c%2B%2B
The same applies to accented characters, they need to be escaped as the bytes of their UTF-8 representation, so été would need to be:
msg=%C3%A9t%C3%A9
(é being Unicode character U+00E9, which is C3 A9 in UTF-8).
In short, it's not the fault of this code, it's the fault of whatever component is responsible for constructing the URL on the client side.
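If the client that builds the URL happens to be written in Java, a minimal sketch of the encoding step could look like this. The class and method names are illustrative, and the accented text is written with Unicode escapes only to keep the source file ASCII-safe.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class QueryValueEncoding {
    // Form-encodes a parameter value as UTF-8 bytes.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Form-decodes a parameter value, roughly what the server does for query parameters.
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("msg=" + encode("c++"));
        System.out.println("msg=" + encode("\u00e9t\u00e9")); // "été"
        // An unencoded "c++" decodes to "c" followed by two spaces.
        System.out.println("[" + decode("c++") + "]");
    }
}
```

The charset used for decoding on the server has to match the one used here for encoding; otherwise accented characters will be mangled.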
Call your URL with
msg=c%2B%2B
A + in a URL query string means 'space'. It needs to be escaped.
You need to escape special characters when passing them as URL parameters. Since + means a space and & introduces another parameter, these cannot be used unescaped in parameter values.
See this other S.O. question.
You may want to use the Apache HTTP client library to help you with the URL encoding/decoding. The URIUtil class has what you need.
Something like this should work:
String rawParam = request.getParameter("msg");
String msgParam = URIUtil.decode(rawParam);
Your example indicates that the data is not being properly encoded on the client side. See this JavaScript question.
I'm testing PHP urlencode() vs. Java java.net.URLEncoder.encode().
Java
String all = "";
for (int i = 32; i < 256; ++i) {
all += (char) i;
}
System.out.println("All characters: -||" + all + "||-");
try {
System.out.println("Encoded characters: -||" + URLEncoder.encode(all, "utf8") + "||-");
} catch (UnsupportedEncodingException e) {
e.printStackTrace();
}
PHP
$all = "";
for($i = 32; $i < 256; ++$i)
{
$all = $all.chr($i);
}
echo($all.PHP_EOL);
echo(urlencode(utf8_encode($all)).PHP_EOL);
All characters seem to be encoded in the same way with both functions, except for the 'asterisk' character that is not encoded by Java, and translated to %2A by PHP. Which behaviour is supposed to be the 'right' one, if any?
Note: I tried with rawurlencode(), too - no luck.
It is okay to have a * in a URL, (but it is also okay to have it in its encoded form).
RFC1738: Uniform Resource Locators (URL) states the following:
Reserved:
[...]
Usually a URL has the same interpretation when an octet is
represented by a character and when it encoded. However, this is not
true for reserved characters: encoding a character reserved for a
particular scheme may change the semantics of a URL.
Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL.
On the other hand, characters that are not required to be encoded
(including alphanumerics) may be encoded within the scheme-specific
part of a URL, as long as they are not being used for a reserved
purpose.
Wikipedia suggests that * is a reserved character when it comes to URIs, and that it must be encoded if not used for the reserved purpose. According to RFC3986, pages 12-13:
URIs include components and subcomponents that are delimited by
characters in the "reserved" set. These characters are called
"reserved" because they may (or may not) be defined as delimiters by
the generic syntax, by each scheme-specific syntax, or by the
implementation-specific syntax of a URI's dereferencing algorithm.
If data for a URI component would conflict with a reserved
character's purpose as a delimiter, then the conflicting data must be
percent-encoded before the URI is formed.
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
(The URL RFC still allows the * character to go unencoded because it doesn't have a reserved purpose in URLs, and as such doesn't have to be encoded. So whether you have to encode it or not depends on what sort of URI you're creating.)
Javadoc of URLEncoder refers to the HTML specification:
This class contains static methods for converting a String to the application/x-www-form-urlencoded MIME format. For more information about HTML form encoding, consult the HTML specification.
HTML4 is quite unclear regarding this question and refers to RFC1738, which is quoted by aioobe:
Control names and values are escaped. Space characters are replaced by '+', and then reserved characters are escaped as described in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by '%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as "CR LF" pairs (i.e., '%0D%0A').
However, HTML5 directly states that * should not be encoded:
If the character isn't in the range U+0020, U+002A, U+002D, U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005F, U+0061 to U+007A
Replace the character with a string formed as follows:
...
Otherwise
Leave the character as is.
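If the Java output has to match PHP's urlencode() byte for byte, one hedged workaround is to post-process the result. The sketch below assumes, as observed in the question, that the asterisk is the only character the two functions treat differently. The class and method names are illustrative.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class PhpCompatibleEncoder {
    // Plain java.net.URLEncoder behaviour: '*' is left as is.
    static String javaEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Additionally escapes '*' so the output matches what the question reports for PHP.
    static String phpStyleEncode(String value) {
        return javaEncode(value).replace("*", "%2A");
    }

    public static void main(String[] args) {
        System.out.println(javaEncode("a*b c"));
        System.out.println(phpStyleEncode("a*b c"));
    }
}
```

Both forms decode to the same string, so the replacement only matters when the encoded text itself is compared or signed.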