Jsoup changes output from single quote to double quote on HTML attributes - java

We are using Jsoup to parse, manipulate and extend an HTML template. Everything works fine until single quotes are used to delimit HTML attributes:
<span data-attr='JSON'></span>
That HTML snippet is converted to
<span data-attr="JSON"></span>
This conflicts with the JSON data inside the attribute, which is only valid with double quotes:
{"param" : "value"} //valid
{'param' : 'value'} //invalid
So we need to force Jsoup NOT to change those single quotes to double quotes, but how? This is our current code to parse the template and produce the HTML content:
pageTemplate = Jsoup.parse(new File(mainTemplateFilePath), "UTF-8");
pageTemplate.outputSettings().escapeMode(Entities.EscapeMode.xhtml);
pageTemplate.outputSettings().charset("UTF-8");
... adding some html
pageTemplate.html(); // will output the double quoted attributes :(

You need to HTML-encode the JSON value before putting it into the data-attr attribute. When you do so, you should end up with this:
<span data-attr="{&quot;param&quot;:&quot;value&quot;}"></span>
Although that looks fairly daunting, it is actually valid HTML. When your corresponding JavaScript executes someSpan.getAttribute("data-attr"), the &quot; values will be transformed into " values automatically, giving you access to the original valid JSON string.
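As a rough sketch of the encoding step described above, the plain-Java helper below escapes the characters that matter inside a double-quoted attribute value. The names AttributeEscaper, escapeAttribute and spanWithData are hypothetical and not part of Jsoup; if the attribute is set through Jsoup itself, its serializer should apply an equivalent escaping when the document is written out.

```java
// Hypothetical helper, not a Jsoup API: escapes a raw string so it can sit
// safely inside a double-quoted HTML attribute.
final class AttributeEscaper {

    static String escapeAttribute(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '"': sb.append("&quot;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wraps the escaped JSON in the span from the question.
    static String spanWithData(String json) {
        return "<span data-attr=\"" + escapeAttribute(json) + "\"></span>";
    }
}
```

Calling AttributeEscaper.spanWithData("{\"param\":\"value\"}") yields a span whose attribute uses double quotes as delimiters while the JSON quotes are carried as &quot; entities.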

Related

Java - Escape quotes in style attribute

In a Java application I have HTML, as a String, that looks like this:
<DIV STYLE="font-family:&quot;Times New Roman&quot;">
And I wish to decode the encoded quotes so that it is correctly displayed on the page. The problem is that conventional StringEscapeUtils escape methods will decode each quote as a double quote, resulting in HTML like this:
<DIV STYLE="font-family:"Times New Roman"">
Which will not correctly render on the page. The desired result is for the HTML to look like this:
<DIV STYLE='font-family:"Times New Roman"'>
I can algorithmically examine the string to replace the encoded quotes to what I want but is there a dedicated method to correctly decode quotes for such a String?
If the string is defined in your Java code, you can add a \ before each ".
I assume you are expecting something like this:
String randomHtmlCode = " <DIV STYLE='font-family:\"Times New Roman\"'> ";
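For HTML that arrives at runtime rather than as a literal, one possible approach is to rewrite only the style attributes: decode the encoded quotes inside the value and switch the attribute delimiters to single quotes. The sketch below uses a regular expression for that; StyleQuoteFixer and fix are hypothetical names, and it assumes the style value contains no literal single quote, which would break the rewritten attribute.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: rewrites style="...&quot;...&quot;..." as style='..."..."...'.
final class StyleQuoteFixer {

    // Matches a double-quoted style attribute, case-insensitively.
    private static final Pattern STYLE_ATTR =
            Pattern.compile("(?i)\\b(style)=\"([^\"]*)\"");

    static String fix(String html) {
        Matcher m = STYLE_ATTR.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = m.group(2);
            String replacement = value.contains("&quot;")
                    // Decode the entity and use single quotes as delimiters.
                    ? m.group(1) + "='" + value.replace("&quot;", "\"") + "'"
                    // Leave attributes without encoded quotes untouched.
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

A regular expression is only adequate for simple, well-formed input like the example in the question; for arbitrary HTML a real parser would be the safer choice.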

Output string as html in freemarker

So we are storing HTML in our data model. I need to output this into a FreeMarker template:
example:
[#assign value = model.value!]
${value}
value = '<p>This is <a href='somelink'>Some link</a></p>'
I have tried [#noescape] but it throws an error saying there is no escape block. see FREEMARKER: avoid escaping HTML chars. This solution did not work for me.
[#noescape] or <#noescape> is only valid when used inside an [#escape] tag. Your data is probably stored with the HTML encoded. You need to get the backend to un-encode the html.
Otherwise you'll need to do something like...
${value?replace("&gt;", ">")?replace("&lt;", "<")}
But that isn't a good approach because it won't catch all the encoded values and shouldn't be done in the view layer.
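If the un-encoding is moved to the backend as suggested, a minimal sketch could look like the following. It handles only five common entities and is meant as an illustration; HtmlUnescaper and unescapeBasic are hypothetical names, and a library routine such as StringEscapeUtils.unescapeHtml4 from Apache Commons Text would cover the full entity set.

```java
// Hypothetical backend helper: reverses the most common HTML entities.
final class HtmlUnescaper {

    static String unescapeBasic(String encoded) {
        return encoded
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&#39;", "'")
                // "&amp;" goes last so that "&amp;lt;" becomes "&lt;", not "<".
                .replace("&amp;", "&");
    }
}
```

The decoded string can then be placed in the model and printed by the template without further replacement calls in the view layer.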

request.getParameter - unable reading &

I have a problem reading a value that contains '&' from the url using java spring.
My summary.jsp file contains the following code:
<h3>
<fmt:message key="contact.title"/>
<c:if test="${!editMode}">
<label class="eightyfivefont">
<a class="termslink eightyfivefont"
href="?submitForm=true&
institutionName=<c:out value="${institution.institutionName}" />&
repositoryName=<c:out value="${institution.repositoryName}" />&
editMode=true">
(edit)</a>
</label>
</c:if>
</h3>
which produces the following URL:
...summary.html?submitForm=true&institutionName=r&r&repositoryName=r&r
The problem arises when I try to fetch the institutionName, which holds the value "r&r".
When fetching the value using the following command:
String name = request.getParameter("institutionName");
it fetches only the string "r" and not "r&r".
The string is stored in XML file as "<institutionName>r&r</institutionName>" which is parsed and added using:
Document doc = DocumentHelper.createDocument();
Element root = doc.addElement("institution");
and for reading from the xml:
Document doc = DocumentHelper.parseText(xml);
Element root = doc.getRootElement();
(I assume the issue isn't with the XML).
Is there a solution for this problem?
The ampersand & is used to divide key-value pairs. If you want to have it as a value, or key, for that matter, you have to urlencode it. For example, like this:
encodeURIComponent('&') = "%26"
You should be able to use JavaScript for that. Or, of course, some Java method, if the URL is being created in the jsp itself.
You should use URLEncoder to encode the institutionName
java.net.URLEncoder.encode("r&r", "UTF-8")
This outputs the URL-encoded value, which is safe to use as a GET parameter:
r%26r
Check this answer. Create a custom EL function; otherwise you can use a scriptlet, though that is not recommended.
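To make the URLEncoder suggestion concrete, here is a small sketch that builds the query string from the question with each value encoded. EditLinkBuilder, encode and editLink are hypothetical names; the parameter names are taken from the JSP above. Such a method could back the custom EL function mentioned in the answers.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that URL-encodes each parameter value before
// assembling the query string for the "(edit)" link.
final class EditLinkBuilder {

    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    static String editLink(String institutionName, String repositoryName) {
        return "?submitForm=true"
                + "&institutionName=" + encode(institutionName)
                + "&repositoryName=" + encode(repositoryName)
                + "&editMode=true";
    }
}
```

With both values set to "r&r", the ampersand inside each value is sent as %26, so request.getParameter("institutionName") should receive the full "r&r" after the container decodes it.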

Process Thymeleaf variable as HTML code and not text

I'm using Thymeleaf to process html templates, I understood how to append inline strings from my controller, but now I want to append a fragment of HTML code into the page.
For example, let's say that I have this in my Java application:
String n="<span><i class=\"icon-leaf\"></i>"+str+"</span> \n";
final WebContext ctx = new WebContext(request, response,
servletContext, request.getLocale());
ctx.setVariable("n", n);
What do I need to write in the HTML page so that it would be replaced by the value of the n variable and be processed as HTML code instead of it being encoded as text?
You can use th:utext attribute that stands for unescaped text (see documentation). Use this with caution and avoid user input in th:utext as it can cause security problems.
<div th:remove="tag" th:utext="${n}"></div>
If you want short-hand syntax you can use following:
[(${variable})]
Escaped short-hand syntax is
[[${variable}]]
but if you replace the inner square brackets [ ] with parentheses ( ), the HTML is not escaped.
Example within tags:
<div>
[(${variable})]
</div>
Starting with Thymeleaf 3.0, the HTML-friendly syntax would be:
<div class="mailbox-read-message" data-th-utext="*{body}">

Thymeleaf string substitution and escaping

I have a string which contains raw data, which I want escaped. The string also contains markers which I want to replace with span tags.
For example my string is
"blah {0}something to span{1} < random chars <"
I would like the above to be rendered within a div, with {0} replaced by <span> and {1} replaced by </span>.
I have tried a number of things, including doing the substitution in my controller, and trying to use the th:utext attribute, however I then get SAX exceptions.
Any ideas?
Could you do this using i18n?
Something like:
resource.properties:
string.pattern=my name is {0} {1}
thymeleaf view:
<label th:text="#{__${#string.pattern('john', 'doe')}__}"></label>
The result should be:
my name is john doe
I'm not sure this is a good way, but I hope it helps.
It looks like using message parameters is the right approach to output formatted strings. See http://www.thymeleaf.org/doc/usingthymeleaf.html#messages
I suspect you need to pass character entity reference in order to avoid SAX exceptions
<span th:utext = "#{string.pattern(${'&lt;span&gt;john&lt;/span&gt;'}, ${'&lt;span&gt;doe&lt;/span&gt;'})}"/>
Alternatively place the markup in your .properties file:
string.pattern=my name is <span>{0}</span> <span>{1}</span>
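If the substitution is done in the controller instead, one way to keep the raw data escaped while still emitting real span tags is to escape first and replace the markers afterwards. The sketch below illustrates that order of operations; MarkerRenderer, escapeText and render are hypothetical names, and it assumes {0} and {1} appear in the data only as the opening and closing markers.

```java
// Hypothetical controller-side helper: escapes the raw text, then turns the
// {0} and {1} markers into an opening and closing span tag.
final class MarkerRenderer {

    static String escapeText(String raw) {
        // "&" must be escaped first so the entities added below stay intact.
        return raw.replace("&", "&amp;")
                  .replace("<", "&lt;")
                  .replace(">", "&gt;");
    }

    static String render(String raw) {
        return escapeText(raw)
                .replace("{0}", "<span>")
                .replace("{1}", "</span>");
    }
}
```

Because the stray < characters are escaped before the tags are inserted, the result is well-formed markup, which should avoid the parser errors when it is output through th:utext.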
