how to store special character '{' and '}' in StringBuilder.AppendFormat - java

I am building a project with RESTful web services. When I request the URL
url: "http://localhost:8080/RestfulWebservicesNewVersion2/REST/webservices/GetFriend"
I get this output:
"\u0027EmployeeList\u0027:{{\u0027emp_id\u0027:\u00272\u0027,\u0027emp_ename\u0027:
\u0027rkjha\u0027,\u0027emp_phoneno\u0027:\u00273232323232\u0027,\u0027emp_email\u0027
Can you tell me how I can remove the "\u0027" part?

You can use java.text.Normalizer to remove Unicode characters that are not in the "normal" ASCII character set.
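A minimal sketch of that suggestion, assuming the goal is to drop real non-ASCII characters from a string. The class and method names (AsciiOnly, stripNonAscii) are made up for illustration. Note that if the response contains the six-character text \u0027 itself, those characters are already ASCII and this would leave them untouched.

```java
import java.text.Normalizer;

class AsciiOnly {
    // Decompose accented characters (NFD), then drop everything outside ASCII.
    static String stripNonAscii(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("[^\\p{ASCII}]", "");
    }

    public static void main(String[] args) {
        // The escape in the literal is an "e" with an acute accent.
        System.out.println(stripNonAscii("caf\u00E9")); // prints cafe
    }
}
```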

Related

Why is java returning encoded values different

I am not quite sure why Java returns %27+ for special characters in the name.
For example, the value I am trying to encode is "Mc' Donald". It is encoded to "Mc%27+Donald" when it should be "Mc%27%20Donald". The reason I replace first is that the db has ' instead of ', so I replace and then encode again.
lastName = URLEncoder.encode(lastName.replace("'", "'"), "UTF-8");
In HTML form encoding (application/x-www-form-urlencoded, which is what URLEncoder produces), + is a valid replacement for a space (%20) as well.
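If the receiving side really needs %20, one common workaround is to rewrite the + after encoding. This is only a sketch; the helper names (formEncode, encodeWithPercent20) are hypothetical. The replacement is safe because a literal + in the input has already been turned into %2B by the time it runs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SpaceEncoding {
    // Plain URLEncoder call with the checked exception wrapped.
    static String formEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Form-encode, then turn every '+' (an encoded space) into "%20".
    static String encodeWithPercent20(String value) {
        return formEncode(value).replace("+", "%20");
    }

    public static void main(String[] args) {
        System.out.println(formEncode("Mc' Donald"));          // Mc%27+Donald
        System.out.println(encodeWithPercent20("Mc' Donald")); // Mc%27%20Donald
    }
}
```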

Apache Nutch 2.3.1 Fetcher giving Invalid uri exception

I have configured Apache Nutch 2.3.1 with the Hadoop ecosystem. I have to fetch some websites written in Perso-Arabic script. Nutch throws an exception for a few URLs at fetch time. The following is an example exception:
java.lang.IllegalArgumentException: Invalid uri 'http://agahi.safirak.com/ads/850/پیچ-بند-بادی-هفتیری-1800-دور-بادی-جیسون.html': escaped absolute path not valid
at org.apache.commons.httpclient.HttpMethodBase.<init>(HttpMethodBase.java:222)
at org.apache.commons.httpclient.methods.GetMethod.<init>(GetMethod.java:89)
at org.apache.nutch.protocol.httpclient.HttpResponse.<init>(HttpResponse.java:77)
at org.apache.nutch.protocol.httpclient.Http.getResponse(Http.java:173)
at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:245)
at org.apache.nutch.fetcher.FetcherReducer$FetcherThread.run(FetcherReducer.java:564)
I've been able to reproduce this issue even on the 1.x branch. The problem is that the Java URI class that the Apache HTTP client library uses internally doesn't support unescaped UTF-8 characters.
From the JavaDoc documentation for java.net.URI:
Character categories
RFC 2396 specifies precisely which characters are permitted in the various components of a URI reference. The following categories, most of which are taken from that specification, are used below to describe these constraints:
alpha: The US-ASCII alphabetic characters, 'A' through 'Z' and 'a' through 'z'
digit: The US-ASCII decimal digit characters, '0' through '9'
alphanum: All alpha and digit characters
unreserved: All alphanum characters together with those in the string "_-!.~'()*"
punct: The characters in the string ",;:$&+="
reserved: All punct characters together with those in the string "?/[]@"
escaped: Escaped octets, that is, triplets consisting of the percent character ('%') followed by two hexadecimal digits ('0'-'9', 'A'-'F', and 'a'-'f')
other: The Unicode characters that are not in the US-ASCII character set, are not control characters (according to the Character.isISOControl method), and are not space characters (according to the Character.isSpaceChar method) (Deviation from RFC 2396, which is limited to US-ASCII)
The set of all legal URI characters consists of the unreserved, reserved, escaped, and other characters.
Properly escaped, the URL would look more like:
http://agahi.safirak.com/ads/850/%D9%BE%DB%8C%DA%86-%D8%A8%D9%86%D8%AF-%D8%A8%D8%A7%D8%AF%DB%8C-%D9%87%D9%81%D8%AA%DB%8C%D8%B1%DB%8C-1800-%D8%AF%D9%88%D8%B1-%D8%A8%D8%A7%D8%AF%DB%8C-%D8%AC%DB%8C%D8%B3%D9%88%D9%86.html
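For reference, here is a rough sketch of how such an escaped form can be produced in plain Java with java.net.URI.toASCIIString(), which percent-encodes the "other" (non-ASCII) characters as UTF-8 octets. The class and method names are made up, and the path is shortened to the first word of the real one (written with Unicode escapes so the source file encoding does not matter). Whether feeding the result to the Nutch plugin fixes the fetch is not something I have tested.

```java
import java.net.URI;

class EscapeNonAscii {
    // Parse the URL, then re-emit it with non-ASCII characters percent-encoded.
    static String toAscii(String url) {
        return URI.create(url).toASCIIString();
    }

    public static void main(String[] args) {
        // The three escapes spell the first word of the path from the example URL.
        String raw = "http://agahi.safirak.com/ads/850/\u067E\u06CC\u0686.html";
        System.out.println(toAscii(raw));
        // prints http://agahi.safirak.com/ads/850/%D9%BE%DB%8C%DA%86.html
    }
}
```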
Actually, if you open the example URL in Chrome and then copy the URL from the address bar, you'll get the escaped representation. Feel free to open an issue for this (otherwise I'll do it). In the meantime you could try the protocol-http plugin, which does not use the Apache HTTP client. I've tested locally and the parsechecker works fine:
➜ local (master) ✗ bin/nutch parsechecker "http://agahi.safirak.com/ads/850/پیچ-بند-بادی-هفتیری-1800-دور-بادی-جیسون.html"
fetching: http://agahi.safirak.com/ads/850/پیچ-بند-بادی-هفتیری-1800-دور-بادی-جیسون.html
robots.txt whitelist not configured.
parsing: http://agahi.safirak.com/ads/850/پیچ-بند-بادی-هفتیری-1800-دور-بادی-جیسون.html
contentType: text/html
signature: 048b390ab07464f5d61ae09646253529
---------
Url
---------------
http://agahi.safirak.com/ads/850/پیچ-بند-بادی-هفتیری-1800-دور-بادی-جیسون.html
---------
ParseData
---------
Version: 5
Status: success(1,0)
Title: پیچ بند بادی هفتیری 1800 دور بادی جیسون-نیازمندی سفیرک
Outlinks: 76
outlink: toUrl: http://agahi.safirak.com/ads/850/پیچ-بند-بادی-هفتیری-1800-دور-بادی-جیسون.html anchor:
outlink: toUrl: http://agahi.safirak.com/assets/fonts/font-awesome/css/font-awesome.min.css anchor:
outlink: toUrl: http://agahi.safirak.com/assets/css/bootstrap.css anchor:
...

Unexpected output from URLEncoder in Java

I am trying to encode a URL parameter.
For example when I am encoding
qOddENxeLxL+13drGKYUgA==\n
using a URL encoder tool,
it gives the following output, which works when I call the API:
qOddENxeLxL%2B13drGKYUgA%3D%3D%5Cn
But when I encode it from my Java code (Android) using URLEncoder.encode("qOddENxeLxL+13drGKYUgA==\n", "UTF-8");
it gives me the following result:
qOddENxeLxL%252B13drGKYUgA%253D%253D%250A
I tried using other Encoding schemes too but could not produce the same result.
The issue is that the \n is being interpreted as a newline character. Java treats \ inside a string literal as the start of an escape sequence.
You have to escape the backslash itself in order to get the same thing as in the URL you provided.
System.out.println(URLEncoder.encode("qOddENxeLxL+13drGKYUgA==\\n", "UTF-8"));
This will provide the same result:
qOddENxeLxL%2B13drGKYUgA%3D%3D%5Cn
The issue is that you are feeding \n to the URLEncoder tool, which doesn't understand it as an escape sequence and so gives you %5Cn, and to the Java compiler inside a string literal, which does understand it and so gives you 0x0A.
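To make the difference concrete, here is a small sketch that encodes both literals side by side. The class and helper names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class NewlineEscape {
    // Plain URLEncoder call with the checked exception wrapped.
    static String formEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        // "\n" in a Java literal is one character, the line feed 0x0A.
        System.out.println(formEncode("qOddENxeLxL+13drGKYUgA==\n"));
        // prints qOddENxeLxL%2B13drGKYUgA%3D%3D%0A

        // "\\n" is two characters, a backslash followed by the letter n.
        System.out.println(formEncode("qOddENxeLxL+13drGKYUgA==\\n"));
        // prints qOddENxeLxL%2B13drGKYUgA%3D%3D%5Cn
    }
}
```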
Figured out the issue: the string was getting encoded twice.
Retrofit automatically encodes parameters passed to a call, and I was passing it an already encoded parameter, so it got encoded again.
BTW thanks for the explanations. :)
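The double encoding is easy to reproduce without Retrofit. The sketch below simply applies URLEncoder twice (names are made up for illustration); the second pass turns every % from the first pass into %25, which gives exactly the unexpected output from the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class DoubleEncoding {
    // Plain URLEncoder call with the checked exception wrapped.
    static String formEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        String raw = "qOddENxeLxL+13drGKYUgA==\n";
        String once = formEncode(raw);
        String twice = formEncode(once); // what happens when a client encodes again
        System.out.println(once);  // qOddENxeLxL%2B13drGKYUgA%3D%3D%0A
        System.out.println(twice); // qOddENxeLxL%252B13drGKYUgA%253D%253D%250A
    }
}
```

So the fix is to encode in exactly one place: either pass the raw value and let the client encode it, or encode it yourself and tell the client the value is already encoded.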

Resty - IllegalArgumentException with | character in URL

I use Resty client for handling Facebook REST API. The problem is that I want to use "|" character in Facebook token as it is in the doc:
https://graph.facebook.com/800309809778160/permissions?access_token=861093975893683|t5r-lFvnrsEQ_xTtUsdMuiEdFdsdE
When I paste this URL into the browser, it works fine. But when I do it using Resty (new Resty().text(url)), it throws an exception:
Exception in thread "main" java.lang.IllegalArgumentException: Illegal character in query at index 83: https://graph.facebook.com/800309809778160/permissions?access_token=861093975893683|t5r-lFvnrsEQ_xTtUsdMuiEdFdsdE
at java.net.URI.create(URI.java:852)
at us.monoid.web.Resty.text(Resty.java:271)
I wonder if I should use another REST client (like HttpURLConnection or Rapa), or whether the cause is elsewhere.
You need to replace special characters in the URL, such as "&" and "?", with their percent-encoded values.
Instead of "|", pass "%7C" in the URL.
See the complete list of character encodings at http://www.w3schools.com/tags/ref_urlencode.asp
You need to escape the | character in the URL with %7C
https://graph.facebook.com/800309809778160/permissions?access_token=861093975893683%7Ct5r-lFvnrsEQ_xTtUsdMuiEdFdsdE
You can check out more escape characters here: http://www.blooberry.com/indexdot/html/topics/urlencoding.htm
You could use this method
java.net.URLEncoder.encode()
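A short sketch of that approach, assuming only the parameter value is encoded (encoding the whole URL would also escape the :, / and ? that have to stay as they are). The class and helper names are made up; the URL and token are the ones from the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;

class PipeInQuery {
    // Encode a single query parameter value.
    static String encodeParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // True if java.net.URI accepts the string, false if it rejects it.
    static boolean isValidUri(String url) {
        try {
            URI.create(url);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String base = "https://graph.facebook.com/800309809778160/permissions?access_token=";
        String token = "861093975893683|t5r-lFvnrsEQ_xTtUsdMuiEdFdsdE";
        System.out.println(isValidUri(base + token));              // false, '|' is rejected
        System.out.println(isValidUri(base + encodeParam(token))); // true, '|' became %7C
    }
}
```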

Tomcat: possible to parse URL parameters containing '&'?

I have a servlet running on Tomcat 6 which should be called as follows:
http://<server>/Address/Details?summary="Acme & co"
However: when I iterate through the parameters in the servlet code:
//...
while (paramNames.hasMoreElements()) {
    paramName = (String) paramNames.nextElement();
    if (paramName.equals("summary")) {
        summary = request.getParameter(paramName).toString();
    }
}
//...
the value of summary is "Acme ".
I assume Tomcat ignores the quotes, so it sees "& co" as a second parameter (albeit improperly formed: there's no =...).
So: is there any way to avoid this? I want the value of summary to be "Acme & co". I tried replacing '&' in the URL with &amp; but that doesn't work (presumably because it's decoded back to a straight '&' before the params are parsed out).
Thanks.
Use http://<server>/Address/Details?summary="Acme %26 co", because reserved URL characters (e.g. & and /) cannot appear unescaped inside a parameter value.
Are you encoding and decoding the URL with URLEncoder? If so, can you check what the input and output of those are? It seems like one of the special characters is not being properly encoded or decoded.
Try %26 for the &
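On the calling side, the value can be encoded before it is appended to the URL. The sketch below is only an illustration with made-up names; the decode step stands in for what the servlet container does before request.getParameter("summary") returns the value.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class AmpersandParam {
    // Encode a single query parameter value ('&' becomes %26, space becomes '+').
    static String encodeParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Reverse of encodeParam, mimicking the container's parameter decoding.
    static String decodeParam(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = encodeParam("Acme & co");
        System.out.println("/Address/Details?summary=" + encoded); // ...?summary=Acme+%26+co
        System.out.println(decodeParam(encoded));                  // Acme & co
    }
}
```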
Try your parameter like
summary="Acme & co"
& is one of the reserved characters. Refer to RFC 2396, section 2.2, Reserved Characters.
how to encode URL to avoid special characters in java
Characters allowed in GET parameter
HTTP URL - allowed characters in parameter names
http://illegalargumentexception.blogspot.in/2009/12/java-safe-character-handling-and-url.html
