How to convert url encoded string to plain text in JAVA - java

I am looking for href references in an HTML page I am retrieving from the server. The problem is that for all the hyperlinks I retrieve, the text I get is URL encoded. Let's say the URL is "http://abc.def.com/gh?ij=x&kl=y&mn=z"; my program parses it as "http://abc.def.com/gh?ij=3Dx&kl=3Dy&mn=3Dz" (look at the difference around "=" and "&" in the two URLs). Some searching on the Web tells me that the second URL is a URL encoded form of the first URL.
What should I do to retrieve the actual URL as it is, and not its URL encoded version? Right now, I am replacing =3D with = and &amp; with &, but that is a very bad hack.

Try to use java.net.URLDecoder
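
A minimal sketch of that suggestion, assuming Java 10 or later (for the decode overload that takes a Charset) and UTF-8 as the character encoding. The class and method names are made up for illustration. Note that URLDecoder only understands percent-encoding such as %3D; the "=3D" form quoted in the question, with no percent sign, looks more like quoted-printable text and would pass through unchanged.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original question or answer.
class UrlDecodeSketch {

    // Decodes percent-encoded text (%xx) and turns '+' into a space.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Percent-encoded '=' (%3D) and '&' (%26) are restored.
        System.out.println(decode("http://abc.def.com/gh?ij%3Dx%26kl%3Dy"));
        // "=3D" contains no '%', so the decoder leaves it alone.
        System.out.println(decode("ij=3Dx"));
    }
}
```

If the input really is quoted-printable, a decoder for that format would be needed instead; which one applies depends on where the HTML came from.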

Related

URL that contains a ? is automatically changed to %3F with Thymeleaf

I read many URLs from a database through Java Spring Boot and a Thymeleaf expression. If a URL contains "?", it is automatically changed to "%3F". For instance, the URL www.myserver.de?id=3 becomes www.myserver.de%3Fid=3. I think I can't use the linked solution because the URLs are read from the DB together with their parameters.
<a th:href="@{//{link}(link=${link})}" target="_blank">For more information click here</a>
Your URLs are url encoded in the database so ? is represented as %3F. If you want to get the original URL from encoded URL you need to url decode it e.g. by doing:
String result = java.net.URLDecoder.decode(url, StandardCharsets.UTF_8);
You can decode it back to your original URL.
If you are using Java, use java.net.URLDecoder.
If you are just reading complete URLs straight from the database, you shouldn't be using Thymeleaf standard url syntax (as it's designed to escape URL parameters). You should just do something like:
<a th:href="|//${link}|" target="_blank">For more information click here</a>

Displaying a JSF requestLink parameter containing a + sign

Strange one this. I have a tag in my JSF page which contains a parameter that contains a + sign. When the resulting hyperlink is clicked the URL converts the + sign to a space, as this is how spaces in URLs are represented.
Is there any way of encoding this parameter to display "%2b" (which is the urlencoded string) instead of +? I am using JSF 1.2
<hx:requestLink styleClass="requestLink" id="link31"
    action="#{sellingMarginBean.changeView}">
  <h:outputText styleClass="outputText" id="text81"
      value="#{varsummaryDataList.tier.description}"></h:outputText>
  <f:param
      value="#{varsummaryDataList.tier.tierCode}"
      name="tierCode" id="param51"></f:param>
</hx:requestLink>
If I change the value of tierCode to replace any '+' with '%2b' before putting out to the screen this works, but it's a hack at best as it means creating a custom method on my Tier domain object or cycling through summaryDataList and performing the replace.
Thanks in advance
Steve
According to this post by BalusC:
JSP/Servlet request: during request processing an average application server will by default use the ISO 8859-1 character encoding to URL-decode the request parameters. You need to force the character encoding to UTF-8 yourself. First this: "URL encoding" must not be confused with "character encoding". URL encoding is merely a conversion of characters to their numeral representations in the %xx format, so that special characters can be passed through a URL without any problems. The client will URL-encode the characters before sending them to the server. The server should URL-decode the characters using the same character encoding.
So probably your client and server are not using the same URL(URI)-encoding. Your best bet is to force the server itself to use UTF-8 encoding. That depends on what server you're using.
You could also use JSTL's fn:replace for your parameter, as an alternative but more "hacky" solution. Remember to define the JSTL taglib in your namespace set (xmlns:fn="http://java.sun.com/jsp/jstl/functions").
<f:param
value="#{fn:replace(varsummaryDataList.tier.tierCode, '+', '%2b')}"
name="tierCode" id="param51" />
See also:
POST parameters using wrong encoding in JSF 1.2
JSF 2.0 request.getParameter return a string with wrong encoding
How can I manipulate a String in a JSF tag?
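
The behaviour described in the question, where a raw + arrives as a space while %2B arrives as +, can be reproduced outside JSF with the standard java.net classes. This is only a sketch, assuming Java 10+ and UTF-8; the tier code value "A+1" and the class name are invented for illustration and do not come from the original page.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; "A+1" stands in for a real tierCode value.
class PlusSignSketch {

    // What a client should do before putting the value in a query string.
    static String encodeParam(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Roughly what a server does when it reads the query parameter.
    static String decodeParam(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String tierCode = "A+1";
        // Sent raw, the '+' is read back as a space.
        System.out.println(decodeParam(tierCode));              // A 1
        // Encoded first, the '+' travels as %2B and survives.
        System.out.println(encodeParam(tierCode));              // A%2B1
        System.out.println(decodeParam(encodeParam(tierCode))); // A+1
    }
}
```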

I am being asked to encode a URL. How do I do this?

I am being asked to encode a header image parameter into a base url but I am not sure how to do this. Here is a screenshot of the document they have supplied me: http://postimage.org/image/6hkscqdld/
Basically they have a page that I am supposed to use, and I can add a custom header image to that page by adding a parameter to the URL. However, the dynamic header image tag must be encoded. How would I go about encoding it?
In PHP you can just use the function
urlencode($string);
http://php.net/manual/en/function.urlencode.php
$url = 'http://www.example.com/your/url/goes/here';
$encoded = urlencode($url); // urlencode() returns the encoded string
The above will take a string and encode it properly for use in a URL.
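
For readers working in Java rather than PHP, a comparable sketch uses java.net.URLEncoder. It assumes Java 10+ and UTF-8. The base URL, the image URL, and the parameter name headerImage are all placeholders, since the supplied document is only available as a screenshot and its real parameter name is not known here.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the parameter name "headerImage" is an assumption.
class HeaderImageUrlSketch {

    // Encodes only the parameter value, then appends it to the base URL.
    static String withHeaderImage(String baseUrl, String imageUrl) {
        String encoded = URLEncoder.encode(imageUrl, StandardCharsets.UTF_8);
        return baseUrl + "?headerImage=" + encoded;
    }

    public static void main(String[] args) {
        System.out.println(withHeaderImage(
                "http://www.example.com/page",
                "http://img.example.com/a b.png"));
    }
}
```

Only the value is encoded, not the whole URL; encoding the base URL as well would turn its own ":" and "/" characters into escapes.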

Why aren't UTF-8 characters being rendered correctly in this web page (generated with JSoup)?

I'm having trouble dealing with charsets while parsing and rendering a page using the JSoup library. Here is an example of the page it renders:
http://dl.dropbox.com/u/13093/charset-problem.html
As you can see, where there should be ' characters, ? is being rendered instead (even when you view the source).
This page is being generated by downloading a web page, parsing with JSoup, and then re-rendering it again having made some structural changes.
I'm downloading the page as follows:
final Document inputDoc = Jsoup.connect(sourceURL.toString()).get();
When I create the output document I do so as follows:
outputDoc.outputSettings().charset(Charset.forName("UTF-8"));
outputDoc.head().appendElement("meta").attr("charset", "UTF-8");
outputDoc.head().appendElement("meta").attr("http-equiv", "Content-Type")
.attr("content", "text/html; charset=UTF-8");
Can anyone offer suggestions as to what I'm doing wrong?
edit: Note that the source page is http://blog.locut.us/ and as you'll see, it appears to render correctly
The question marks are typical whenever you write characters to the outputstream of the response which are not covered by the response's character encoding. You seem to be relying on the platform default character encoding when serving the response. The response Content-Type header of your site also confirms this by a missing charset attribute.
Assuming that you're using a servlet to serve the modified HTML, then you should be using HttpServletResponse#setCharacterEncoding() to set the character encoding before writing the modified HTML out.
response.setCharacterEncoding("UTF-8");
response.getWriter().write(html);
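
The question-mark effect described above can be reproduced without a servlet container. The sketch below is only an illustration: it encodes a string containing a typographic apostrophe (U+2019, chosen here as an assumed stand-in for the character in the page) with ISO-8859-1, which cannot represent it, and then with UTF-8. The class and method names are made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from the original answer.
class CharsetQuestionMarkSketch {

    // Encodes the text to bytes and decodes it again with the same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String text = "it\u2019s"; // "it's" with a typographic apostrophe
        // ISO-8859-1 has no byte for U+2019, so it is replaced with '?'.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1)); // it?s
        // UTF-8 covers the character, so the text survives unchanged.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // true
    }
}
```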
The problem is most likely in reading the input page: you need to use the correct encoding for the source too.

Show Images with name containing special characters

I am trying to display some images whose names contain special characters like ☻ ☺ ♥, or Chinese or Arabic characters, using JSP, but the images are not displayed.
<img src = "pipo².jpg" />
<img src = "pip☺☻♥o².jpg" />
What am I doing wrong?
Try encoding the filename using the URLEncoder.encode() method before the HTML is sent to the page, e.g.
String encodedString = URLEncoder.encode(filename, "UTF-8");
This will convert the characters to percent-encoded escapes (%xx), which can be used safely in a URL.
You can percent-encode the URLs using encodeURIComponent in JavaScript to give you
<img src="pip%E2%98%BA%E2%98%BB%E2%99%A5o%C2%B2.jpg">
I'd recommend renaming your files.
Using special characters in src paths is not strictly allowed; you'd have to find the URL-style escape codes for those characters.
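
For the server side, the percent-encoding suggested above might be sketched in Java as follows, assuming Java 10+ and UTF-8. The helper name is hypothetical. One caveat: URLEncoder targets form data and writes a space as "+", which is not what a path needs, so this sketch replaces "+" with "%20" afterwards.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for building an <img src> value from a file name.
class ImageNameEncodeSketch {

    // Percent-encodes a single file name as UTF-8 for use in a URL path.
    static String encodeFileName(String fileName) {
        // URLEncoder emits '+' for spaces; a literal '+' is already %2B,
        // so every remaining '+' stands for a space and becomes %20.
        return URLEncoder.encode(fileName, StandardCharsets.UTF_8)
                .replace("+", "%20");
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file ASCII-only.
        String name = "pip\u263A\u263B\u2665o\u00B2.jpg";
        System.out.println("<img src=\"" + encodeFileName(name) + "\" />");
    }
}
```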
