XSLT parameter not replaced - java

Could someone advise me what's wrong with the XSLT transformation below? I have stripped it down to a minimum.
Basically, I would like to have a parameter "title" replaced, but I cannot get it to run. The transformation simply ignores the parameter. I have highlighted the relevant bits with some exclamation marks.
Any advice is greatly appreciated.
public class Test {
private static String xslt =
"<?xml version=\"1.0\"?>\n" +
"<xsl:stylesheet\n" +
" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"\n" +
" version=\"1.0\">\n" +
" <xsl:param name=\"title\" />\n" +
" <xsl:template match=\"/Foo\">\n" +
" <html><head><title>{$title}</title></head></html>\n" + // !!!!!!!!!!!
" </xsl:template>\n" +
"</xsl:stylesheet>\n";
public static void main(String[] args) {
try {
final DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware( true );
final DocumentBuilder db = dbf.newDocumentBuilder();
final Document document = db.newDocument();
document.appendChild( document.createElement( "Foo" ) );
final StringWriter resultWriter = new StringWriter();
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer( new StreamSource( new StringReader( xslt ) ) );
// !!!!!!!!!!!!!!!!!!
transformer.setParameter( "title", "This is a title" );
// !!!!!!!!!!!!!!!!!!
transformer.transform( new DOMSource( document ), new StreamResult( resultWriter ) );
System.out.println( resultWriter.toString() );
} catch( Exception ex ) {
ex.printStackTrace();
}
}
}
I'm using Java 6 without any factory-specific system properties set.
Thank you in advance!

<html><head><title>{$title}</title></head></html>
The problem is in the above line.
In XSLT the {someXPathExpression} syntax can be used only in (some) attributes, and never in text nodes.
Solution:
Replace the above with:
<html><head><title><xsl:value-of select="$title"/></title></head></html>
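To make the distinction concrete, here is a minimal sketch of the corrected stylesheet driven from Java. The `lang` parameter and the `lang="{$lang}"` attribute are hypothetical additions, not part of the question; they are only there to show a place where the curly-brace syntax is accepted, next to the text node where `xsl:value-of` is needed. The class name `TitleParamDemo` and the explicit `xsl:output` settings are likewise assumptions made for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TitleParamDemo {
    // Stylesheet from the question with the text node fixed via xsl:value-of.
    // The "lang" parameter and attribute are hypothetical, added for illustration.
    private static final String XSLT =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:param name=\"title\"/>"
        + "<xsl:param name=\"lang\"/>"
        + "<xsl:template match=\"/Foo\">"
        // {$lang} is evaluated because it sits in an attribute of a literal result element;
        // the title text needs xsl:value-of because braces are not evaluated in text nodes.
        + "<html lang=\"{$lang}\"><head><title><xsl:value-of select=\"$title\"/></title></head></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    public static String render(String title, String lang) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        transformer.setParameter("title", title);
        transformer.setParameter("lang", lang);
        StringWriter out = new StringWriter();
        // A one-element input document, equivalent to the <Foo/> DOM built in the question
        transformer.transform(new StreamSource(new StringReader("<Foo/>")), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render("This is a title", "en"));
    }
}
```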

Related

XML pretty print add unnecessary whitespace element content containing CDATA

I have a piece of Java code which pretty prints xml. When using a LSSerializer to pretty print the output, it is formatted nicely and indented but elements which contain CDATA behave strangely. The XML
<root><outer><inner><text><![CDATA[Content of the CDATA block]]></text></inner></outer></root>
gets transformed into the following xml
<?xml version="1.0" encoding="UTF-8"?><root>
<outer>
<inner>
<text>
<![CDATA[Content of the CDATA block]]>
</text>
</inner>
</outer>
</root>
and has the CDATA section on a separate line. This causes issues when extracting the content later on with XPath expressions.
The code
@Test
public void testOutputXML() throws Exception {
final Document document = loadXMLFromString( "<root><outer><inner><text><![CDATA[Content of the CDATA block]]></text></inner></outer></root>" );
final String formattedXml = toXmlPrettyLS( document );
final Document formattedDocument = loadXMLFromString( formattedXml );
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//text/text()");
final String evaluate = expr.evaluate( formattedDocument );
assertThat( evaluate ).isEqualTo( "Content of the CDATA block" );
}
private String toXmlPrettyLS( final Document document ) throws Exception {
final ByteArrayOutputStream bos = new ByteArrayOutputStream();
final DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
final DOMImplementationLS loadSave = ( DOMImplementationLS ) registry.getDOMImplementation( "LS" );
final LSOutput output = loadSave.createLSOutput();
output.setByteStream( bos );
final LSSerializer serializer = loadSave.createLSSerializer();
final DOMConfiguration config = serializer.getDomConfig();
config.setParameter( "format-pretty-print", true );
serializer.write( document, output );
return String.valueOf( bos );
}
private Document loadXMLFromString( final String xml ) throws Exception {
final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware( true );
final DocumentBuilder builder = factory.newDocumentBuilder();
return builder.parse( new ByteArrayInputStream( xml.getBytes() ) );
}
is used to transform the XML and extract the content. The environment is Java 11.
How can I adjust the formatting to get
<text><![CDATA[Content of the CDATA block]]></text>
instead?
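This does not change what the serializer emits, but one possible workaround on the reading side is to make the XPath tolerant of the extra whitespace. The sketch below assumes it is acceptable to take the string value of the whole `<text>` element and let `normalize-space()` trim it; note that `normalize-space()` also collapses runs of whitespace inside the content, so it is not suitable when inner whitespace must be preserved. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CdataExtractDemo {
    // Pretty-printed form from the question, with the CDATA section on its own line
    static final String FORMATTED =
        "<root>\n<outer>\n<inner>\n<text>\n"
        + "<![CDATA[Content of the CDATA block]]>\n"
        + "</text>\n</inner>\n</outer>\n</root>";

    public static String extractText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Uses the string value of the first <text> element (all of its text and CDATA
        // children concatenated), then strips leading and trailing whitespace.
        return xpath.evaluate("normalize-space(//text)", doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("[" + extractText(FORMATTED) + "]");
    }
}
```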

Include delimiter when performing substring operation

How do I include the delimiter when performing a substring operation?
i.e. given the string message which looks like this:
<nutrition>
<daily-values>
<total-fat units="g">65</total-fat>
<saturated-fat units="g">20</saturated-fat>
<cholesterol units="mg">300</cholesterol>
<sodium units="mg">2400</sodium>
<carb units="g">300</carb>
<fiber units="g">25</fiber>
<protein units="g">50</protein>
</daily-values>
</nutrition>
<food>
<name>Avocado Dip</name>
<mfr>Sunnydale</mfr>
<serving units="g">29</serving>
<calories total="110" fat="100"/>
<total-fat>11</total-fat>
<saturated-fat>3</saturated-fat>
<cholesterol>5</cholesterol>
<sodium>210</sodium>
<carb>2</carb>
<fiber>0</fiber>
<protein>1</protein>
<vitamins>
<a>0</a>
<c>0</c>
</vitamins>
<minerals>
<ca>0</ca>
<fe>0</fe>
</minerals>
</food>
and then
message = message.substring(message.indexOf("<food>"), message.indexOf("</food>"));
returns
<food>
<name>Avocado Dip</name>
<mfr>Sunnydale</mfr>
<serving units="g">29</serving>
<calories total="110" fat="100"/>
<total-fat>11</total-fat>
<saturated-fat>3</saturated-fat>
<cholesterol>5</cholesterol>
<sodium>210</sodium>
<carb>2</carb>
<fiber>0</fiber>
<protein>1</protein>
<vitamins>
<a>0</a>
<c>0</c>
</vitamins>
<minerals>
<ca>0</ca>
<fe>0</fe>
</minerals>
How do I get it to keep the last </food> tag given I don't know the surrounding content of the XML file?
Here's a solution using javax.xml. It aims to solve the case when multiple <food> elements are present in the document. In order to handle this case correctly, you need to
deserialize your XML into org.w3c.dom.Document
extract the list of <food> nodes as org.w3c.dom.NodeList
serialize back to String at the end
Here's a simplified example:
private static final String XML =
"<?xml version = \"1.0\" encoding = \"UTF-8\"?>\n"
+ "<message>\n"
+ " <food>\n"
+ " <name>A</name>\n"
+ " </food>\n"
+ " <food>\n"
+ " <name>B</name>\n"
+ " </food>\n"
+ "</message>\n";
@Test
public void xpath() throws Exception {
// Deserialize
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
Document document;
try (InputStream in = new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8))) {
document = factory.newDocumentBuilder().parse(in);
}
XPath xPath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xPath.compile("//food");
NodeList nodeList = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
System.out.println(node.getNodeName() + ": " + node.getTextContent().trim());
}
// Serialize
Document exportDoc = factory.newDocumentBuilder().newDocument();
Node exportNode = exportDoc.importNode(nodeList.item(0), true);
exportDoc.appendChild(exportNode);
String content = serialize(exportDoc);
System.out.println(content);
}
private static String serialize(Document doc) throws TransformerException {
DOMSource domSource = new DOMSource(doc);
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
// set indent
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.transform(domSource, result);
return writer.toString();
}
The 1st output shows all <food> elements are deserialized correctly:
food: A
food: B
The 2nd output shows the 1st element is serialized back to a string:
<food>
<name>A</name>
</food>
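For the single-element case in the original question, a plain string approach may already be enough: `substring` excludes its end index, so the closing tag is kept by adding the delimiter's length to the position returned by `indexOf`. This is only a sketch; the helper name `between` is hypothetical, and it assumes the first `<food>` and the first `</food>` after it belong together (no nesting).

```java
public class SubstringWithDelimiter {
    /**
     * Returns the text from the first occurrence of start up to and including
     * the first occurrence of end that follows it, or null if either is missing.
     */
    public static String between(String message, String start, String end) {
        int from = message.indexOf(start);
        if (from < 0) {
            return null;
        }
        // Search for the end delimiter only after the start delimiter
        int to = message.indexOf(end, from + start.length());
        if (to < 0) {
            return null;
        }
        // substring's end index is exclusive, so extend it by the delimiter length
        return message.substring(from, to + end.length());
    }

    public static void main(String[] args) {
        String message = "<nutrition/>\n<food>\n<name>Avocado Dip</name>\n</food>\n<other/>";
        System.out.println(between(message, "<food>", "</food>"));
    }
}
```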

Unable to parse XML using java

I have an XML string that I received as a response, but I am unable to reach the ResponseCode and Remark elements. Can anybody help me get the response code?
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Body>
<GetIMEIInfoResponse xmlns="http://tempuri.org/">
<GetIMEIInfoResult>
<![CDATA[
<SerialsDetail>
<Item>
<ResponseCode>2</ResponseCode>
<Remark>Invalid Input</Remark>
</Item>
</SerialsDetail>
]]>
</GetIMEIInfoResult>
</GetIMEIInfoResponse>
</s:Body>
</s:Envelope>
This is how I am trying to do it:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try {
builder = factory.newDocumentBuilder();
Document doc = builder.parse(new InputSource(new StringReader(response)));
NodeList list = doc.getElementsByTagName("Remark");
System.out.println(list.getLength());
Node n = list.item(0);
System.out.println(n.getTextContent());
} catch (Exception e) {
e.printStackTrace();
}
You are asking for an element named "Remark", but your document does not contain such an element. Instead, it contains only a "GetIMEIInfoResult" element with a bunch of text in it. This text happens to be XML. To access the contents of the inner piece of XML, you have to parse the contents of "GetIMEIInfoResult" in the same way that you parsed the entire document.
Here is how you can do it:
import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.InputSource;
public class NestedCDATA {
private static String response =
"<s:Envelope xmlns:s=\"http://schemas.xmlsoap.org/soap/envelope/\">" +
" <s:Body>" +
" <GetIMEIInfoResponse xmlns=\"http://tempuri.org/\">" +
" <GetIMEIInfoResult>" +
" <![CDATA[" +
" <SerialsDetail>" +
" <Item>" +
" <ResponseCode>2</ResponseCode>" +
" <Remark>Aawwwwwwww yeaaaah!</Remark>" +
" </Item>" +
" </SerialsDetail>" +
" ]]>" +
" </GetIMEIInfoResult>" +
" </GetIMEIInfoResponse>" +
" </s:Body>" +
"</s:Envelope>";
public static String getCdata(Node parent) {
NodeList cs = parent.getChildNodes();
for(int i = 0; i < cs.getLength(); i++){
Node c = cs.item(i);
if(c instanceof CharacterData) {
CharacterData cdata = (CharacterData)c;
String content = cdata.getData().trim();
if (content.length() > 0) {
return content;
}
}
}
return "";
}
public static void main(String[] args) {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try {
builder = factory.newDocumentBuilder();
Document doc = builder.parse(new InputSource(new StringReader(response)));
Node cdataParent = doc.getElementsByTagName("GetIMEIInfoResult").item(0);
DocumentBuilder cdataBuilder = factory.newDocumentBuilder();
Document cdataDoc = cdataBuilder.parse(new InputSource(new StringReader(
getCdata(cdataParent)
)));
Node remark = cdataDoc.getElementsByTagName("Remark").item(0);
System.out.println("Content of Remark in CDATA: " + getCdata(remark));
} catch (Exception e) {
e.printStackTrace();
}
}
}
Result: "Content of Remark in CDATA: Aawwwwwwww yeaaaah!".
Here is another interesting question for you: why does your service output XML with XML in it? XML all by itself is already nested enough. Is it really necessary to wrap parts of it in CDATA?
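The problem with the XML is that the data in the GetIMEIInfoResult tag is CDATA, so the builder does not recognize it as XML. To access the data in the GetIMEIInfoResult tag you can use the following:
// list must hold the GetIMEIInfoResult elements, not the "Remark" lookup from the question
NodeList list = doc.getElementsByTagName("GetIMEIInfoResult");
Element infoResult = (Element) list.item(0);
// Caveat: if whitespace precedes the CDATA section, getFirstChild() is that whitespace text node
String elementData = getCharacterDataOfNode(infoResult.getFirstChild());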
public static String getCharacterDataOfNode(Node node) {
String data = "";
if (node instanceof CharacterData) {
data = ((CharacterData) node).getData();
}
return data;
}
Then you have to parse that data again with a DocumentBuilder, after which you can access the Remark tag. To get its content, use the getCharacterDataOfNode() method again.

Pretty print XML in java 8

I have an XML file stored as a DOM Document and I would like to pretty print it to the console, preferably without using an external library. I am aware that this question has been asked multiple times on this site, however none of the previous answers have worked for me. I am using java 8, so perhaps this is where my code differs from previous questions? I have also tried to set the transformer manually using code found from the web, however this just caused a not found error.
Here is my code which currently just outputs each xml element on a new line to the left of the console.
import java.io.*;
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
public class Test {
public Test(){
try {
//java.lang.System.setProperty("javax.xml.transform.TransformerFactory", "org.apache.xalan.xsltc.trax.TransformerFactoryImpl");
DocumentBuilderFactory dbFactory;
DocumentBuilder dBuilder;
Document original = null;
try {
dbFactory = DocumentBuilderFactory.newInstance();
dBuilder = dbFactory.newDocumentBuilder();
original = dBuilder.parse(new InputSource(new InputStreamReader(new FileInputStream("xml Store - Copy.xml"))));
} catch (SAXException | IOException | ParserConfigurationException e) {
e.printStackTrace();
}
StringWriter stringWriter = new StringWriter();
StreamResult xmlOutput = new StreamResult(stringWriter);
TransformerFactory tf = TransformerFactory.newInstance();
//tf.setAttribute("indent-number", 2);
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.transform(new DOMSource(original), xmlOutput);
java.lang.System.out.println(xmlOutput.getWriter().toString());
} catch (Exception ex) {
throw new RuntimeException("Error converting to String", ex);
}
}
public static void main(String[] args){
new Test();
}
}
In reply to Espinosa's comment, here is a solution when "the original xml is not already (partially) indented or contain new lines".
Background
Excerpt from the article (see References below) inspiring this solution:
Based on the DOM specification, whitespaces outside the tags are perfectly valid and they are properly preserved. To remove them, we can use XPath’s normalize-space to locate all the whitespace nodes and remove them first.
Java Code
public static String toPrettyString(String xml, int indent) {
try {
// Turn xml string into a document
Document document = DocumentBuilderFactory.newInstance()
.newDocumentBuilder()
.parse(new InputSource(new ByteArrayInputStream(xml.getBytes("utf-8"))));
// Remove whitespaces outside tags
document.normalize();
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']",
document,
XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); ++i) {
Node node = nodeList.item(i);
node.getParentNode().removeChild(node);
}
// Setup pretty print options
TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setAttribute("indent-number", indent);
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
// Return pretty print xml string
StringWriter stringWriter = new StringWriter();
transformer.transform(new DOMSource(document), new StreamResult(stringWriter));
return stringWriter.toString();
} catch (Exception e) {
throw new RuntimeException(e);
}
}
Sample usage
String xml = "<root>" + //
"\n " + //
"\n<name>Coco Puff</name>" + //
"\n <total>10</total> </root>";
System.out.println(toPrettyString(xml, 4));
Output
<root>
<name>Coco Puff</name>
<total>10</total>
</root>
References
Java: Properly Indenting XML String published on MyShittyCode
Save new XML node to file
I guess that the problem is related to blank text nodes (i.e. text nodes with only whitespaces) in the original file. You should try to programmatically remove them just after the parsing, using the following code. If you don't remove them, the Transformer is going to preserve them.
original.getDocumentElement().normalize();
XPathExpression xpath = XPathFactory.newInstance().newXPath().compile("//text()[normalize-space(.) = '']");
NodeList blankTextNodes = (NodeList) xpath.evaluate(original, XPathConstants.NODESET);
for (int i = 0; i < blankTextNodes.getLength(); i++) {
blankTextNodes.item(i).getParentNode().removeChild(blankTextNodes.item(i));
}
This works on Java 8:
public static void main (String[] args) throws Exception {
String xmlString = "<hello><from>ME</from></hello>";
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(new InputSource(new StringReader(xmlString)));
pretty(document, System.out, 2);
}
private static void pretty(Document document, OutputStream outputStream, int indent) throws Exception {
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
if (indent > 0) {
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", Integer.toString(indent));
}
Result result = new StreamResult(outputStream);
Source source = new DOMSource(document);
transformer.transform(source, result);
}
I've written a simple class for removing whitespace in documents. It supports the command line and does not use DOM / XPath.
Edit: Come to think of it, the project also contains a pretty-printer which handles existing whitespace:
PrettyPrinter prettyPrinter = PrettyPrinterBuilder.newPrettyPrinter().ignoreWhitespace().build();
Underscore-java has a static method U.formatXml(string). I am the maintainer of the project.
import com.github.underscore.U;
public class MyClass {
public static void main(String args[]) {
String xml = "<root>" + //
"\n " + //
"\n<name>Coco Puff</name>" + //
"\n <total>10</total> </root>";
System.out.println(U.formatXml(xml));
}
}
Output:
<root>
<name>Coco Puff</name>
<total>10</total>
</root>
I didn't like any of the common XML formatting solutions because they all remove more than 1 consecutive new line character (for some reason, removing spaces/tabs and removing new line characters are inseparable...). Here's my solution, which was actually made for XHTML but should do the job with XML as well:
public String GenerateTabs(int tabLevel) {
char[] tabs = new char[tabLevel * 2];
Arrays.fill(tabs, ' ');
//Or:
//char[] tabs = new char[tabLevel];
//Arrays.fill(tabs, '\t');
return new String(tabs);
}
public String FormatXHTMLCode(String code) {
// Split on new lines.
String[] splitLines = code.split("\\n", 0);
int tabLevel = 0;
// Go through each line.
for (int lineNum = 0; lineNum < splitLines.length; ++lineNum) {
String currentLine = splitLines[lineNum];
if (currentLine.trim().isEmpty()) {
splitLines[lineNum] = "";
} else if (currentLine.matches(".*<[^/!][^<>]+?(?<!/)>?")) {
splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
++tabLevel;
} else if (currentLine.matches(".*</[^<>]+?>")) {
--tabLevel;
if (tabLevel < 0) {
tabLevel = 0;
}
splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
} else if (currentLine.matches("[^<>]*?/>")) {
splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
--tabLevel;
if (tabLevel < 0) {
tabLevel = 0;
}
} else {
splitLines[lineNum] = GenerateTabs(tabLevel) + splitLines[lineNum];
}
}
return String.join("\n", splitLines);
}
It makes one assumption: that there are no <> characters except for those that comprise the XML/XHTML tags.
Check the XML file itself:
new FileInputStream("xml Store - Copy.xml"); // the XML in this file may not be well-formed
The builder parses the content of the given input source as an XML document and returns a new DOM Document object:
Document original = null;
...
original = dBuilder.parse("data.xml"); // input source as an XML document

Document.toString() is "[#document: null]" even though XML was parsed

Consider this example
@Test
public void testXML() {
final String s = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><results>\n" +
" <status>OK</status>\n" +
" <usage>By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html</usage>\n" +
" <url/>\n" +
" <language>english</language>\n" +
" <docSentiment>\n" +
" <type>neutral</type>\n" +
" </docSentiment>\n" +
"</results> ";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try
{
builder = factory.newDocumentBuilder();
Document doc = builder.parse( new InputSource( new StringReader( s ) ) );
System.out.println(doc.toString());
} catch (Exception e) {
e.printStackTrace();
}
}
When I run this example
System.out.println(doc.toString()); turns out to be [#document: null].
I also validated this XML online and no errors were found. What am I missing?
What I need?
I need to find out value of <docSentiment> in this XML
Thanks
As per MadProgrammer's advice, I managed to get the value.
Note: Even though [#document: null] was shown, the document was not null, in reality.
@Test
public void testXML() {
final String s = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><results>\n" +
" <status>OK</status>\n" +
" <usage>By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html</usage>\n" +
" <url/>\n" +
" <language>english</language>\n" +
" <docSentiment>\n" +
" <type>neutral</type>\n" +
" </docSentiment>\n" +
"</results>";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try
{
builder = factory.newDocumentBuilder();
Document doc = builder.parse( new InputSource( new StringReader( s ) ) );
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//docSentiment/type");
NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
System.out.println("Sentiment:" + ((DTMNodeList) nl).getDTMIterator().toString());
} catch (Exception e) {
e.printStackTrace();
}
}
and I got the output as
Sentiment:neutral
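As far as I can tell, the `[#document: null]` text is just how the JDK's bundled DOM implementation renders a node in `toString()`: the node name followed by the node value. For a Document node the name is `#document` and the value is `null` by definition, so the output says nothing about whether parsing succeeded. The sketch below shows this, and also reads the value through the standard `XPath.evaluate(String, Object)` call, which returns a String and avoids the cast to the internal `DTMNodeList` class. The class name `SentimentDemo` and the shortened XML are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class SentimentDemo {
    // Shortened version of the response from the question
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?><results>\n"
        + " <status>OK</status>\n"
        + " <docSentiment>\n"
        + "  <type>neutral</type>\n"
        + " </docSentiment>\n"
        + "</results>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    public static String sentiment(String xml) throws Exception {
        // evaluate(String, Object) returns the string value of the first matching node
        return XPathFactory.newInstance().newXPath()
            .evaluate("//docSentiment/type", parse(xml));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        // The two parts that the toString() output is built from
        System.out.println("node name:  " + doc.getNodeName());
        System.out.println("node value: " + doc.getNodeValue());
        System.out.println("Sentiment:" + sentiment(XML));
    }
}
```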
