It is good practice for HTML template engines to HTML-escape placeholder text by default, to help prevent XSS (cross-site scripting) attacks. Is it possible to achieve this behavior in StringTemplate?
I have tried registering a custom AttributeRenderer that escapes HTML unless the format is "raw":
stg.registerRenderer(String.class, new AttributeRenderer() {
@Override
public String toString(Object o, String format, Locale locale) {
String s = (String)o;
return Objects.equals(format, "raw") ? s : StringEscapeUtils.escapeHtml4(s);
}
});
But this fails, because StringTemplate then escapes not only the placeholder text but also the template text itself. For example, this template:
example(title, content) ::= <<
<html>
<head>
<title>$title$</title>
</head>
<body>
$content; format = "raw"$
</body>
</html>
>>
Is rendered as:
&lt;html&gt;
&lt;head&gt;
&lt;title&gt;Example Title&lt;/title&gt;
&lt;/head&gt;
&lt;body&gt;
<p>Not escaped because of <code>format = "raw"</code>.</p>
&lt;/body&gt;
&lt;/html&gt;
Can anybody help?
There is no good solution to encode by default. The template is passed through the AttributeRenderer for the string data type, and there is no context information to detect if it is processing the template or a variable. So all strings, including the template, are encoded by default since you cannot specify "raw" for the template.
An alternative solution is to use format="xml-encode" in the variables that need to be encoded. The built-in StringRenderer has support for several formats:
upper
lower
cap
url-encode
xml-encode
So your example would be:
example(title, content) ::= <<
<html>
<head>
<title>$title; format="xml-encode"$</title>
</head>
<body>
$content$
</body>
</html>
>>
In order to encode by default, you have limited options. The alternatives are:
Use a custom data type (not String) for your variables, so you can register your HtmlEscapeStringRenderer for the custom data type. This is difficult if you use complex objects as variables that are already using standard strings.
Add the raw and the escaped variables to the model manually, e.g. add title (escaped) and title_raw (raw). You do not need a custom AttributeRenderer in this case. StringTemplate has a strict view/model separation and you need to have the model populated before it is rendered with both the raw and escaped values.
Neither option is particularly desirable, but I do not see any other alternatives with StringTemplate4.
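The second option can be sketched in plain Java as follows. This is only an illustration: the names EscapedModel, escapeHtml and put are hypothetical, and the escaper covers just the five characters that are significant in HTML markup (a library escaper such as the StringEscapeUtils.escapeHtml4 call from the question could be used instead). The map entries would then be added to the template as attributes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: stores each value twice, HTML-escaped under "name"
// and unchanged under "name_raw", so templates are safe by default.
final class EscapedModel {
    private final Map<String, String> values = new LinkedHashMap<>();

    // Minimal escaper for the five characters significant in HTML markup.
    static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&#39;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Registers both the escaped and the raw variant of one value.
    EscapedModel put(String name, String value) {
        values.put(name, escapeHtml(value));
        values.put(name + "_raw", value);
        return this;
    }

    Map<String, String> asMap() {
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> model = new EscapedModel()
                .put("title", "Tom & <Jerry>")
                .asMap();
        System.out.println(model.get("title"));
        System.out.println(model.get("title_raw"));
    }
}
```

With a model built this way, $title$ in the template would be escaped by default and $title_raw$ would be the explicit opt-out, without any custom AttributeRenderer.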
The answer is to revert to StringTemplate v3.
Related
In the API of a document converter, which generates HTML (or XHTML), I want to expose these methods:
// Convert the input file to a file using the specified charset
void convert(File in, File out, Charset charset);
// Convert the input document to a string using the specified charset
String convert(String in, Charset charset);
There is no way for client code to produce faulty documents with the file-based method: it safely writes a result document in the specified charset.
The String-based method will obviously lead to problems if the client code does not respect the chosen charset, for example if the charset parameter is ISO-8859-1 but the result String is served as UTF-8 content in a web application:
String html = convert(getInputDocument(), ISO_8859_1);
...
response.setContentType("text/html;charset=UTF-8");
response.setCharacterEncoding("UTF-8");
try (PrintWriter out = response.getWriter()) {
out.print(html);
}
Question: which options should I consider to design the API so that users are guided to correct usage of the result string?
deprecate the method and provide a method which returns a byte array
use method names which contain the encoding (convertToUTF_8, convertToISO_8859_1 ...)
The result string could for example be
<!DOCTYPE html>
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>Untitled document</title>
</head>
<body>
<p>Motörhead</p>
</body>
</html>
I don't know your exact use case, but one possibility is to wrap the document in a proper object (instead of it being just a String):
public interface Document {
void writeTo(ServletResponse response);
}
This way you can retain all control of how that "string" can be written to different targets.
I'm not sure whether you need a convert at all, since the document could automatically convert its content if it sees that the response already has a different encoding. But even if you need a convert you could do it this way:
public interface Document {
void writeTo(ServletResponse response);
Document convert(Charset targetCharset);
}
This would return a new document which is of a different charset.
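A minimal sketch of such an object, under some assumptions: the class name HtmlDocument and the methods toBytes and contentType are hypothetical, and writeTo takes a plain OutputStream instead of a ServletResponse so that the example needs only the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical value object: the text and its charset travel together,
// so callers cannot serve the bytes under a different encoding by accident.
final class HtmlDocument {
    private final String content;
    private final Charset charset;

    HtmlDocument(String content, Charset charset) {
        this.content = content;
        this.charset = charset;
    }

    // Value for a Content-Type header that matches the bytes produced below.
    String contentType() {
        return "text/html;charset=" + charset.name();
    }

    // Returns a new document that is encoded in the target charset.
    HtmlDocument convert(Charset targetCharset) {
        return new HtmlDocument(content, targetCharset);
    }

    // Encodes the text with the document's own charset.
    byte[] toBytes() {
        return content.getBytes(charset);
    }

    void writeTo(OutputStream out) throws IOException {
        out.write(toBytes());
    }

    public static void main(String[] args) throws IOException {
        HtmlDocument doc = new HtmlDocument("<p>Mot\u00f6rhead</p>", StandardCharsets.ISO_8859_1);
        HtmlDocument utf8 = doc.convert(StandardCharsets.UTF_8);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        utf8.writeTo(buffer);
        System.out.println(utf8.contentType() + ": " + buffer.size() + " bytes");
    }
}
```

Note that this sketch does not rewrite a charset declaration inside the markup itself (such as the META tag in the example document); a real implementation would have to keep that in sync with the target charset.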
Say I am displaying an escaped value in HTML, inside a text area, with the code below:
<c:out value="${person.name}" />
My question: do I need to decode this value manually on the server side, or will the browser do it automatically?
No, you do not need to decode this value manually. All you need is:
Specify UTF-8 as the encoding in the HTTP response content type. To be precise, use HttpServletResponse.setContentType("text/html;charset=utf-8");.
Set the same UTF-8 encoding in your JSP. To be precise, add this meta tag to your JSP and you should be good to go: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
With this tag in your JSP, the browser knows that the content of the page should be rendered according to UTF-8 encoding rules.
If you don't specify the page encoding explicitly, using this kind of meta tag or some other mechanism, the browser falls back to its default encoding when rendering the page, and you may not see the expected result, especially for characters from the higher blocks of the BMP and from the Supplementary Multilingual Plane. Check this to see how to find the browser's default encoding.
Concept
The server should specify the desired encoding in the response, and the same encoding should be declared in the JSP/ASP/HTML page.
Server side encoding options
PHP
header('Content-type: text/html; charset=utf-8');
Perl
print "Content-Type: text/html; charset=utf-8\n\n";
Python
Use the same solution as for Perl (except that you don't need a semicolon at the end).
Java Servlets
response.setContentType("text/html;charset=utf-8");
JSP
<%@ page contentType="text/html; charset=UTF-8" %>
ASP and ASP.Net
<%Response.charset="utf-8"%>
Client side encoding options
Use the following meta tag in your HTML page: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
Further reading:
HTTP-charset
This answer
When I read the request parameter for the escaped input (rendered with <c:out value="${person.name}" />), I get the escaped value and store it in the DB as it is. For example, <script>test</script> is stored as &lt;script&gt;test&lt;/script&gt;. When the value is fetched from the DB and displayed in the browser, it renders correctly, i.e. &lt;script&gt;test&lt;/script&gt; is displayed as <script>test</script>.
I don't know how to explain the problem but I will try my best.
I have HTML page code like the following stored in a SQL database table:
<html>
<head>
</head>
<body>
Dear ${RESOURCES.NAME},<br>
</body>
</html>
I have to pull this code into a Java String in my Java app and then replace placeholders such as ${RESOURCES.NAME} with their actual values. The values are stored in Java variables in my app.
I am thinking of replacing the ${variables} with %s and then using Java's String formatter to fill in my Java variables, but there are many such variables in my HTML page, so this is tedious, and I would also have to pass the arguments to the formatter in the right order.
Can anybody suggest a better idea to do this?
Try using the String.replace() method, for example :
String html = db.get(); // Your method to get the html field from the database
String name = "John Doe";
String output = html.replace("${RESOURCES.NAME}", name); // placeholder must match the template text exactly
Docs here
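When there are many placeholders, one call to replace() per variable gets repetitive. A possible alternative, sketched here with a hypothetical PlaceholderFiller class, is to keep the values in a Map and fill every ${...} in a single regex pass. The sketch assumes placeholder names contain no closing brace and leaves unknown placeholders untouched.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that fills ${NAME} placeholders from a map.
final class PlaceholderFiller {
    // Matches ${NAME}; group 1 captures NAME (anything but a closing brace).
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    static String fill(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Unknown placeholders are kept as they appear in the template.
            String replacement = values.getOrDefault(m.group(1), m.group());
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<body>Dear ${RESOURCES.NAME},<br></body>";
        System.out.println(fill(html, Map.of("RESOURCES.NAME", "John Doe")));
    }
}
```

If the values come from user input, they should be HTML-escaped before being put into the map.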
I've been using Play 2.0 framework for a couple of days now for a proof of concept application at my job. One of the first things I wanted to check out was the custom tag functionality, since it reminded me of HtmlHelpers in ASP.Net MVC. The thing is that I can't seem to make them work and was wondering if I'm misusing the feature or misunderstanding something.
Here's a simple example of what I want to do: I want to be able to use @script("scriptname.js") anywhere in the templates and have that substitute the entire script tags.
Here's what I got so far:
main.scala.html
@(title: String, scripts: Html = Html(""))(content: Html)
@import tags._
<!DOCTYPE html>
<html>
<head>
<!-- this is how I would like to use the helper/tag -->
@script("jquery.js")
@script("jquery-ui.js")
<!-- let views add their own scripts. this part is working OK -->
@scripts
</head>
<body>
@content
</body>
</html>
I created a subdirectory called "tags" under the app/views directory. There I created my script.scala.html tag/helper file:
@(name: String)
<script src="@routes.Assets.at("javascripts/@name")" type="text/javascript"></script>
The problem I'm having is that whenever I use @script() the output includes the @name parameter literally. For example, @script("x.js") actually outputs
<script src="assets/javascripts/@name" type="text/javascript"></script>
What am I doing wrong?
For the record, I did read the documentation and search here, but neither of these links have helped me figure this out:
http://www.playframework.org/documentation/2.0.3/JavaTemplateUseCases
How to define a tag with Play 2.0?
@routes.Assets.at(...) evaluates the Scala expression routes.Assets.at(...) and substitutes the result into your output. There is no recursive evaluation that would let you build an expression textually and then have it evaluated, which seems to be what you're expecting.
What you intend to do is achieved using
@routes.Assets.at("javascripts/" + name)
I have some HTML code that I store in a java.lang.String variable. I write that variable to a file on the filesystem, setting the encoding to UTF-8 when writing. When I open that file, everything looks great, e.g. → shows up as a right arrow.
However, if the same String (containing the same content) is used by a jsp page to render content in a browser, characters such as → show up as a question mark (?)
When storing content in the String variable, I make sure that I use:
String myStr = new String(bytes, charset);
instead of just:
String myStr = "<html><head/><body>→</body></html>";
Can someone please tell me why the String content gets written to the filesystem perfectly but does not render in the jsp/browser?
Thanks.
but does not render in the jsp/browser?
You need to set the response encoding as well. In a JSP you can do this using
<%@ page pageEncoding="UTF-8" %>
This has actually the same effect as setting the following meta tag in HTML <head>:
<meta http-equiv="content-type" content="text/html; charset=utf-8">
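The question mark can be reproduced outside a servlet container. The sketch below assumes the response falls back to ISO-8859-1 (the servlet default when no encoding is set) and uses a hypothetical ArrowEncodingDemo class; it relies on the documented behavior of String.getBytes(Charset), which substitutes the charset's replacement byte for characters it cannot encode.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any framework.
final class ArrowEncodingDemo {
    // Encodes the text with the given charset and decodes it again,
    // mimicking a response that is written and read with that encoding.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String arrow = "\u2192"; // the right arrow character

        // ISO-8859-1 has no right arrow, so the encoder emits its
        // replacement byte '?' and the original character is lost.
        System.out.println(roundTrip(arrow, StandardCharsets.ISO_8859_1).equals("?"));

        // UTF-8 can encode every Unicode character, so the arrow survives.
        System.out.println(roundTrip(arrow, StandardCharsets.UTF_8).equals(arrow));
    }
}
```

Both comparisons should hold, which matches the symptom in the question: the file written as UTF-8 is fine, while the page served with a narrower encoding shows a question mark.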
Possibilities:
The browser does not support UTF-8
You don't have Content-Type: text/html; charset=utf-8 in your HTTP Headers.
The lazy developer (=me) uses Apache Common Lang StringEscapeUtils.escapeHtml http://commons.apache.org/lang/api-release/org/apache/commons/lang/StringEscapeUtils.html#escapeHtml(java.lang.String) which will help you handle all 'odd' characters. Let the browser do the final translation of the html entities.