Convert varying xml to another xml - java

I have an input XML file which looks like this:
<Root>
<Monday>Monday</Monday>
<Indicator>true</Indicator>
<Value>1</Value>
<Tuesday>Tuesday</Tuesday>
<Indicator>true</Indicator>
<Value>2</Value>
<Wednesday>Wednesday</Wednesday>
<Indicator>true</Indicator>
<Value>3</Value>
</Root>
It must be converted to the output XML file which is:
<Root>
<Monday>Monday</Monday>
<Value>1</Value>
<Tuesday>Tuesday</Tuesday>
<Value>2</Value>
<Wednesday>Wednesday</Wednesday>
<Value>3</Value>
</Root>
The problem is that the input XML can vary. Sometimes it might be
<Root>
<Monday>Monday</Monday>
<Indicator>true</Indicator>
<Value>1</Value>
<Thursday>Thursday</Thursday>
<Indicator>true</Indicator>
<Value>4</Value>
</Root>
Now the output must be
<Root>
<Monday>Monday</Monday>
<Value>1</Value>
<Thursday>Thursday</Thursday>
<Value>4</Value>
</Root>
I also have the list of valid tags (Monday, Tuesday, etc.) that can appear in the input XML, stored in an ArrayList in Java. Any ideas on how to accomplish this?

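From the answer to "How to remove elements from xml using xslt with stylesheet and xsltproc?": the identity template copies every node unchanged, and the empty template for Indicator drops those elements, so it does not matter which day elements appear in the input.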
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Indicator"/>
</xsl:stylesheet>
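
Since the question asks about Java, here is a minimal sketch of applying that stylesheet with the JDK's built-in javax.xml.transform API. The class name IndicatorRemover and the method removeIndicators are made up for this illustration, and the stylesheet is embedded as a string only to keep the example self-contained; in practice it would more likely be loaded from a file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of the original answer.
final class IndicatorRemover {

    // Identity transform plus an empty template that drops every <Indicator> element.
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output omit-xml-declaration=\"yes\"/>"
          + "<xsl:template match=\"node()|@*\">"
          + "<xsl:copy><xsl:apply-templates select=\"node()|@*\"/></xsl:copy>"
          + "</xsl:template>"
          + "<xsl:template match=\"Indicator\"/>"
          + "</xsl:stylesheet>";

    // Runs the stylesheet over the given XML text and returns the result as a string.
    static String removeIndicators(String inputXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(inputXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String input =
                "<Root><Monday>Monday</Monday><Indicator>true</Indicator><Value>1</Value>"
              + "<Thursday>Thursday</Thursday><Indicator>true</Indicator><Value>4</Value></Root>";
        System.out.println(removeIndicators(input));
    }
}
```

Because the stylesheet names only the element to drop, the ArrayList of valid day tags is not needed for this particular transformation.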

Related

java: replace namespace uri string in xml file with xslt [duplicate]

I am trying to replace namespace string using xslt.
I have the source and target namespace strings in another XML file, in the format <namespace source="xxx" target="xxx"/>. Every source namespace string in the XML to be transformed should be changed to the corresponding target value. Only elements need to be considered, since the attributes have no namespace.
I'm using the JDK default xalan xslt processor, which supports xslt 1.0. I'd like to stick to xslt 1.0 if possible, but I can change processor and use xslt 2.0 if really needed.
For example, to-be-transformed input xml:
<root xmlns="http://ns1" xmlns:ns2="http://ns2-old" xmlns:ns3="http://ns3">
<ElementA xmlns="http://ns4-old">
<ElementB/>
<ns2:elementD/>
</ElementA>
</root>
The output xml should be ("http://ns2-old" is changed to "http://ns2-new", and "http://ns4-old" is changed to "http://ns4-new"):
<root xmlns="http://ns1" xmlns:ns2="http://ns2-new" xmlns:ns3="http://ns3">
<ElementA xmlns="http://ns4-new">
<ElementB/>
<ns2:elementD/>
</ElementA>
</root>
Here is my xsl that is not working:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:variable name="nsmapdoc" select="document('my-map-file')"/>
<xsl:key name="nsmap" match="//namespace/@target" use="@source"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<!-- process each element -->
<xsl:template match="element()">
<xsl:for-each select="namespace::*">
<!-- for each namespace node of the element, save the string value-->
<xsl:variable name="sourceuri"><xsl:value-of select="."/>
</xsl:variable>
<!-- get the target value for this namespace-->
<xsl:variable name="targeturi">
<xsl:for-each select="$nsmapdoc">
<xsl:value-of select="key('nsmap', $sourceuri)"/>
</xsl:for-each>
</xsl:variable>
<!-- if target value exists, replace the current namespace node string value with the target value-->
<xsl:if test="$targeturi">
<xsl:value-of select="$targeturi"/>
</xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
I have a few questions:
For the identity template that does the copy, why use match="node()|@*" instead of just match="node()"? Is an attribute also a node?
I am not sure whether I correctly obtained the namespace string value (like "http://ns1", "http://ns2-old") for the element.
I think I defined the key correctly. However, I am not sure I used the types correctly; it looks like the targeturi variable is not a string. If the key has no entry, what does the lookup return? In my case, I should replace the namespace value only for namespaces that have an entry in the map.
How do I write a new string value for the namespace node?
I need to process each namespace node of the element. Is it the right way to define a variable inside ?
Please help me see what is wrong with my xsl. Any suggestion is appreciated.
I think you have two, possibly three, separate questions here.
The first question is: how to move elements from one namespace to another, using a "map" of source-to-target namespaces. Let me answer this question first. Given:
XML
<root xmlns="http://ns1" xmlns:ns2="http://ns2-old" xmlns:ns3="http://ns3">
<ElementA xmlns="http://ns4-old">
<ElementB/>
<ns2:ElementD/>
</ElementA>
</root>
map.xml
<root>
<namespace source="http://ns2-old" target="http://ns2-new"/>
<namespace source="http://ns4-old" target="http://ns4-new"/>
</root>
The following stylesheet:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:param name="nsmapdoc" select="document('map.xml')"/>
<xsl:template match="*">
<xsl:variable name="old-ns" select="namespace-uri()"/>
<xsl:variable name="map-entry" select="$nsmapdoc/root/namespace[@source=$old-ns]"/>
<xsl:variable name="new-ns">
<xsl:choose>
<xsl:when test="$map-entry">
<xsl:value-of select="$map-entry/@target"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$old-ns"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<xsl:element name="{local-name()}" namespace="{$new-ns}">
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
will return:
<?xml version="1.0" encoding="utf-8"?>
<root xmlns="http://ns1">
<ElementA xmlns="http://ns4-new">
<ElementB/>
<ElementD xmlns="http://ns2-new"/>
</ElementA>
</root>
Note:
ElementA and its child ElementB have been moved from namespace URI "http://ns4-old" to URI "http://ns4-new";
ElementD has been moved from namespace URI "http://ns2-old" to URI "http://ns2-new";
The prefix of ElementD has been stripped off; this is a meaningless, cosmetic change, and it should not present any problems for the receiving application;
The root element has remained in its original namespace;
Unused namespace declarations have been discarded.
Note also that we are not using a key to look up the new namespace: using a key across documents is quite awkward in XSLT 1.0, so I have chosen to do without one.
The second question is how to copy the unused namespace declarations. There are several possible answers to choose from:
It's not necessary to copy them, since they are not used for anything;
It's not possible to copy them;
It is possible to copy them by copying some element from the source document; however copying an element also copies its namespace - so this can be done only if there is a known element that is supposed to stay in its original namespace. For example, if you know beforehand that the root element is not supposed to be moved to another namespace, you can add another template to the stylesheet:
<xsl:template match="/*">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
to get this result:
<?xml version="1.0" encoding="utf-8"?>
<root xmlns="http://ns1" xmlns:ns2="http://ns2-old" xmlns:ns3="http://ns3">
<ElementA xmlns="http://ns4-new">
<ElementB/>
<ElementD xmlns="http://ns2-new"/>
</ElementA>
</root>
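
For comparison, the same source-to-target mapping can be sketched in plain Java with the DOM API instead of XSLT. This is a different technique from the stylesheet above, not the answerer's method: it rewrites the xmlns declarations and renames the elements in place with Document.renameNode, so prefixes and unused declarations are kept. The class NamespaceRemapper and its method names are invented for this example, and the map is passed as a java.util.Map rather than read from map.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class illustrating a DOM-based alternative to the stylesheet.
final class NamespaceRemapper {

    // Parses XML text into a namespace-aware DOM document.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Moves every element whose namespace URI is a key of nsMap to the mapped URI,
    // and rewrites matching xmlns / xmlns:prefix declarations to the new URI.
    static void remap(Document doc, Map<String, String> nsMap) {
        // Copy the live NodeList first so the in-place changes cannot disturb the iteration.
        NodeList live = doc.getElementsByTagNameNS("*", "*");
        List<Element> elements = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            elements.add((Element) live.item(i));
        }
        for (Element element : elements) {
            NamedNodeMap attributes = element.getAttributes();
            for (int i = 0; i < attributes.getLength(); i++) {
                Attr attribute = (Attr) attributes.item(i);
                // Namespace declarations are attributes in the reserved xmlns namespace.
                if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attribute.getNamespaceURI())
                        && nsMap.containsKey(attribute.getValue())) {
                    attribute.setValue(nsMap.get(attribute.getValue()));
                }
            }
            String oldUri = element.getNamespaceURI();
            if (oldUri != null && nsMap.containsKey(oldUri)) {
                // Keep the qualified name (including any prefix); only the URI changes.
                doc.renameNode(element, nsMap.get(oldUri), element.getNodeName());
            }
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<root xmlns=\"http://ns1\" xmlns:ns2=\"http://ns2-old\" xmlns:ns3=\"http://ns3\">"
              + "<ElementA xmlns=\"http://ns4-old\"><ElementB/><ns2:ElementD/></ElementA></root>");
        Map<String, String> nsMap = new HashMap<>();
        nsMap.put("http://ns2-old", "http://ns2-new");
        nsMap.put("http://ns4-old", "http://ns4-new");
        remap(doc, nsMap);
        NodeList all = doc.getElementsByTagNameNS("*", "*");
        for (int i = 0; i < all.getLength(); i++) {
            System.out.println(all.item(i).getNodeName() + " -> " + all.item(i).getNamespaceURI());
        }
    }
}
```

The trade-off is that this version mutates the parsed document rather than producing a new one, and it treats every xmlns declaration whose value is in the map as something to rewrite, whether or not any element uses it.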

Copy of element using java XSLT loses namespace with output html (not for first element)

I have the following XML
<root>
<a>test</a>
<b xmlns="bns">test</b>
<a>test</a>
<b xmlns="bns">test</b>
<a>test</a>
</root>
and this XSL
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html" indent="no"/>
<xsl:template match="*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
If I apply the XSL to the XML using Java, the namespace of the second b tag is removed:
<root>
<a>test</a>
<b xmlns="bns">test</b>
<a>test</a>
<b>test</b>
<a>test</a>
</root>
I noticed that this only happens for output method html. Can anybody tell me why, and what I can do about it?

xsl to search for all occurrences of part of the text in xml, comparing from the passed list of texts and replacing it by a node

The input file is as below:
<root>
<node1>
<child_node1>apple mango<sub_node1>water grapes</sub_node1> banana</child_node1>
<child_node2>Cherry mango<sub_node2>Date grapes</sub_node2> Coconut</child_node2>
</node1>
<node2>
<child_node3>banana grapes apple</child_node3>
</node2>
......
</root>
An XSL is required which works for the below requirement.
Requirement:
I need to pass in a list of strings, and each string in that list has to be checked against this input file for all occurrences of matching text. Wherever a match is found, the matched text should be enclosed in a tag, say <fruit>.
Example:
For the above input file if I pass list of Strings including: grapes, apple
The Expected output:
<root>
<node1>
<child_node1><fruit>apple</fruit> mango<sub_node1>water <fruit>grapes</fruit></sub_node1> banana</child_node1>
<child_node2>Cherry mango<sub_node2>Date <fruit>grapes</fruit></sub_node2> Coconut</child_node2>
</node1>
<node2>
<child_node3>banana <fruit>grapes</fruit> <fruit>apple</fruit></child_node3>
</node2>
......
</root>
Only the exact matching text needs to be tagged, and "<child_node1>apple mango<sub_node1>water grapes</sub_node1> banana</child_node1>" must remain valid (the nodes and their text).
The list of input strings may be anything, so a generic string-matching approach is required that checks each string in that list.
Any help in this regard is greatly appreciated. Thanks a lot in advance!
I tried something like this:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes" method="xml" />
<xsl:strip-space elements="*" />
<xsl:variable name="list">apple mango</xsl:variable>
<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="node() | @*" />
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:for-each select="tokenize(., '\s+')">
<xsl:choose>
<xsl:when test=". = tokenize($list, '\s+')">
<fruit><xsl:value-of select="."/></fruit>
</xsl:when>
<xsl:otherwise><xsl:copy-of select="."/></xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
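
The attempt above splits each text node on whitespace, which loses the original spacing. As an alternative to XSLT (a swapped-in technique, not a fix of the stylesheet), the requirement can be sketched in Java with the DOM API: walk the text nodes and replace each whole-word match with a wrapper element, leaving the surrounding text untouched. The class WordTagger and its methods are hypothetical names, and the sketch assumes the search strings are plain words made of letters or digits, since it relies on the regex word boundary \b.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical helper class: wraps whole-word matches inside text nodes in a new element.
final class WordTagger {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Collects all text nodes below the given node, in document order.
    static void collectTextNodes(Node node, List<Text> out) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Text) {
                out.add((Text) child);
            } else {
                collectTextNodes(child, out);
            }
        }
    }

    // Replaces each whole-word occurrence of any of the words with <tagName>word</tagName>.
    static void tagWords(Document doc, List<String> words, String tagName) {
        if (words.isEmpty()) {
            return; // an empty alternation would match the empty string everywhere
        }
        StringJoiner alternatives = new StringJoiner("|");
        for (String word : words) {
            alternatives.add(Pattern.quote(word));
        }
        Pattern pattern = Pattern.compile("\\b(?:" + alternatives + ")\\b");

        List<Text> textNodes = new ArrayList<>();
        collectTextNodes(doc.getDocumentElement(), textNodes);
        for (Text text : textNodes) {
            String data = text.getData();
            Matcher matcher = pattern.matcher(data);
            if (!matcher.find()) {
                continue;
            }
            Node parent = text.getParentNode();
            int last = 0;
            do {
                if (matcher.start() > last) {
                    // Unmatched text before this match is kept verbatim.
                    parent.insertBefore(
                            doc.createTextNode(data.substring(last, matcher.start())), text);
                }
                Element wrapper = doc.createElement(tagName);
                wrapper.setTextContent(matcher.group());
                parent.insertBefore(wrapper, text);
                last = matcher.end();
            } while (matcher.find());
            if (last < data.length()) {
                parent.insertBefore(doc.createTextNode(data.substring(last)), text);
            }
            parent.removeChild(text); // the original text node has been fully replaced
        }
    }
}
```

Calling tagWords(doc, Arrays.asList("grapes", "apple"), "fruit") on the parsed input would tag those two words wherever they occur as whole words, so a word such as "pineapple" is left alone.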

How to add namespace name and its value in XSLT?

I want to convert XML to XML using XSLT in Java. How do I add a namespace name and its value in the XSLT file? I have tried many ways to get the namespace value but didn't get the output I expect. Please help.
This is my XML,
<?xml version="1.0" encoding="ISO-8859-1"?>
<root xmlns="namespacename">
<child>A</child>
<child>B</child>
</root>
XSLT file,
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:element name="root" namespace="namespacename">
<xsl:element name="child-one">
<xsl:value-of select="root/child"/>
</xsl:element>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
I need the output XML file like this.
<?xml version="1.0" encoding="ISO-8859-1"?>
<root xmlns="namespacename">
<child-one>A</child-one>
</root>
If you know the namespace, then simply add it as the default namespace and write the result as literal elements.
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="namespacename"
xmlns:i="namespacename"
exclude-result-prefixes="i">
<xsl:template match="/">
<root>
<child-one>
<xsl:value-of select="i:root/i:child"/>
</child-one>
</root>
</xsl:template>
</xsl:stylesheet>
Note that in XSLT 1.0 an unprefixed XPath expression such as root/child never uses the default namespace; it only matches elements in no namespace. You therefore have to declare the same namespace a second time with a prefix (e.g. i) so the path becomes i:root/i:child. That extra prefix would otherwise show up as a declaration in the output, which is why it is excluded with exclude-result-prefixes="i".
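
The same rule applies to XPath 1.0 evaluated directly from Java, which makes it easy to see in isolation. The following is a small sketch using javax.xml.xpath; the class DefaultNamespaceXPath and its evaluate method are made up for the illustration, and the prefix i is bound to "namespacename" through a hand-written NamespaceContext.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: shows that unprefixed XPath steps ignore the default namespace.
final class DefaultNamespaceXPath {

    // Evaluates the expression against the XML text and returns its string value.
    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind the prefix "i" to the document's default namespace URI.
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return "i".equals(prefix) ? "namespacename" : XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String uri) {
                    return "namespacename".equals(uri) ? "i" : null;
                }

                @Override
                public Iterator<String> getPrefixes(String uri) {
                    String prefix = getPrefix(uri);
                    return prefix == null
                            ? Collections.<String>emptyIterator()
                            : Collections.singletonList(prefix).iterator();
                }
            });
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root xmlns=\"namespacename\"><child>A</child><child>B</child></root>";
        // Unprefixed names select elements in no namespace, so nothing is found.
        System.out.println("[" + evaluate(xml, "/root/child") + "]");
        // Prefixed names select the namespaced elements; the string value is the first match.
        System.out.println("[" + evaluate(xml, "/i:root/i:child") + "]");
    }
}
```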

Bypassing namespaces while copying an XML with XSLT

Starting from an XML with no namespace:
<Root>
<A>foo</A>
<B></B>
<C>bar</C>
</Root>
I apply an XSLT to remove the 'C' element:
<?xml version="1.0" ?>
<xsl:stylesheet version="2.0" xmlns="http://www.w3.org/1999/XSL/Transform" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" indent="no" encoding="utf-8" />
<xsl:template match="*">
<xsl:copy>
<xsl:copy-of select="@*" />
<xsl:apply-templates />
</xsl:copy>
</xsl:template>
<xsl:template match="C" />
</xsl:stylesheet>
and I end up with the following XML (it's OK to have 'B' not collapsed because I'm using HTML as output method):
<Root>
<A>foo</A>
<B></B>
</Root>
But then if I ever get another XML, this time with a namespace:
<Root xmlns="http://company.com">
<A>foo</A>
<B></B>
<C>bar</C>
</Root>
the 'C' element is not removed after XSLT process.
What can I do to bypass this namespace, is there a way?
Not really recommended, but it works:
<xsl:template match="*[local-name()='C']" />
Better:
<xsl:stylesheet
version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:foo="http://company.com"
exclude-result-prefixes="foo"
>
<!-- ... -->
<xsl:template match="C | foo:C" />
<!-- ... -->
</xsl:stylesheet>
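
To compare the two suggestions side by side from Java, here is a small sketch that builds an identity stylesheet with a configurable empty template and runs it with the JDK's javax.xml.transform API. The class CElementRemover and its methods are invented for this example. It uses version="1.0" and method="xml" instead of the 2.0 and html settings above, on the assumption that the JDK's built-in XSLT 1.0 processor is the one in use.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: an identity transform plus one empty template
// whose match pattern is supplied by the caller.
final class CElementRemover {

    // Builds the stylesheet; the prefix foo is bound to http://company.com as in the answer.
    static String stylesheetFor(String matchPattern) {
        return "<xsl:stylesheet version=\"1.0\""
             + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
             + " xmlns:foo=\"http://company.com\" exclude-result-prefixes=\"foo\">"
             + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
             + "<xsl:template match=\"node()|@*\">"
             + "<xsl:copy><xsl:apply-templates select=\"node()|@*\"/></xsl:copy>"
             + "</xsl:template>"
             + "<xsl:template match=\"" + matchPattern + "\"/>"
             + "</xsl:stylesheet>";
    }

    // Applies the stylesheet for the given pattern to the XML text.
    static String transform(String xml, String matchPattern) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer(
                    new StreamSource(new StringReader(stylesheetFor(matchPattern))));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String namespaced =
                "<Root xmlns=\"http://company.com\"><A>foo</A><B></B><C>bar</C></Root>";
        // "C" alone only matches a C element in no namespace, so C survives here.
        System.out.println(transform(namespaced, "C"));
        // Both alternatives from the answer remove the namespaced C element.
        System.out.println(transform(namespaced, "C | foo:C"));
        System.out.println(transform(namespaced, "*[local-name()='C']"));
    }
}
```

The local-name() pattern removes a C element in any namespace, which may be broader than intended; the prefixed pattern only touches the two cases that are listed explicitly.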
