XSLT select node without losing attributes - java

I am applying an XSLT transformation to XML elements. The element output is as expected, but all attribute values are dropped. I want every attribute to be carried over, with only specific attributes modified (for example, renamed).
Please find an example below.
<section>
<line number='1' style='none' bold='true' size='10pt'>Line 1</line>
<line number='2' bold='true' >Line 2</line>
<line number='3' style='none' bold='true' size='10pt' color='red'>Line 3</line>
</section>
I want to transform this to:
<div>
<p num='1' style='none' bold='true' size='10pt'>Line 1</p>
<p num='2' b='true' >Line 2</p>
<p num='3' style='none' bold='true' size='10pt' color='red'>Line 3</p>
</div>
This is what I have written so far. It is too complex, because I can't assume which attributes are present, so I don't want to list the attribute names explicitly.
<xsl:template match="section"><div use-attribute-sets="default"><xsl:apply-templates select="node()"/></div></xsl:template>
<xsl:template match="p"><p use-attribute-sets="default"><xsl:apply-templates select="node()"/></p></xsl:template>
<xsl:attribute-set name="default">
<xsl:attribute name="number"><xsl:value-of select="@num"/></xsl:attribute>
<xsl:attribute name="style"><xsl:value-of select="@style"/></xsl:attribute>
<xsl:attribute name="b"><xsl:value-of select="@bold"/></xsl:attribute>
<xsl:attribute name="size"><xsl:value-of select="@size"/></xsl:attribute>
<xsl:attribute name="color"><xsl:value-of select="@color"/></xsl:attribute>
</xsl:attribute-set>

If you want to copy all existing attributes by default, with some exceptions, you can do it like this:
<xsl:template match="*"> <!-- This matches all elements. Note that more specific templates have higher priority and win (you can probably use your match="p" or "line" instead) -->
<xsl:apply-templates select="@*"/> <!-- Without a select, only child nodes are processed, so no attributes would be hit -->
</xsl:template>
<xsl:template match="@*"> <!-- Match all the attributes so you can copy them -->
<xsl:copy/>
</xsl:template>
<xsl:template match="@bold"> <!-- This wins over @* because it is more specific, as described above -->
<xsl:attribute name="b" select="."/> <!-- select on xsl:attribute requires XSLT 2.0 -->
</xsl:template>
<!-- Here you can specify more of such attribute matches if needed -->
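For anyone driving this from Java, here is a minimal sketch of the same idea run through the JDK's built-in XSLT 1.0 processor. The class name `AttributeRenameDemo`, the `transform` method and the exact stylesheet are assumptions made for illustration, not code from the question. Because the built-in processor is XSLT 1.0, the renamed attributes use a nested `xsl:value-of` instead of `select` on `xsl:attribute`.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; the names are illustrative only.
final class AttributeRenameDemo {

    // XSLT 1.0: rename the elements, copy every attribute by default,
    // and override only the attributes that need a new name.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output omit-xml-declaration='yes'/>"
        + "<xsl:template match='section'><div><xsl:apply-templates select='@*|node()'/></div></xsl:template>"
        + "<xsl:template match='line'><p><xsl:apply-templates select='@*|node()'/></p></xsl:template>"
        // Default rule for attributes: copy unchanged.
        + "<xsl:template match='@*'><xsl:copy/></xsl:template>"
        // More specific attribute rules take priority over @*.
        + "<xsl:template match='@number'><xsl:attribute name='num'><xsl:value-of select='.'/></xsl:attribute></xsl:template>"
        + "<xsl:template match='@bold'><xsl:attribute name='b'><xsl:value-of select='.'/></xsl:attribute></xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML string and returns the serialized result.
    static String transform(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(
            "<section><line number='1' style='none' bold='true' size='10pt'>Line 1</line></section>"));
    }
}
```

Attributes without a dedicated template (`style`, `size`, `color`, or anything added later) pass through the `@*` rule unchanged, so the stylesheet never has to list them.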

Related

Fetch an XML node-name as regex in XSL Stylesheet transformation

I have an XML document (I know it is incorrect per the XML standard, but I cannot change it because I am processing a response from an external party) as follows:
XML Snippet : <root>
<3party>some_value</3party>
</root>
I would like to fetch <3party> from the above snippet in an XSLT stylesheet. Because <3party> is not a valid element name, I cannot refer to it as shown below; it fails XSL compilation. I need a way to refer to it by a partial name, maybe with a regex? The following is incorrect:
<xsl:variable select="$response/root/3party" />
Any answers would help me out.
Edit: A possible solution to the above use case would be:
<xsl:for-each select="$response/root">
<!-- Node name -->
<xsl:variable name="name" select="local-name()" />
<!-- check if it contains party -->
<xsl:if test="contains($name, 'party')">
<!-- Node value -->
<xsl:variable name="value" select="node()" />
</xsl:if>
</xsl:for-each>
With regard to the code added to your question:
<xsl:for-each select="$response/root">
<!-- Node name -->
<xsl:variable name="name" select="local-name()" />
<!-- check if it contains party -->
<xsl:if test="contains($name, 'party')">
<!-- Node value -->
<xsl:variable name="value" select="node()" />
</xsl:if>
</xsl:for-each>
The test contains($name, 'party') will never return true. The context node's name is root and "root" does not contain "party".
On a more general note: you seem to think that the problem is how to get your XSLT stylesheet to compile. That is not so. The real problem here is that your input is not well-formed XML. The input must be parsed before it can be transformed, and if it's not well-formed XML, the process will fail well before even considering your stylesheet.
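To see this for yourself, here is a small sketch that only parses the input, with no stylesheet involved. The class and method names (`WellFormedCheck`, `isWellFormed`) are made up for this example. The JDK parser rejects the snippet because an element name cannot start with a digit.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; the names are illustrative only.
final class WellFormedCheck {

    // Returns true if the string parses as well-formed XML, false if the parser rejects it.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown for fatal parse errors such as an illegal element name.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<root><3party>some_value</3party></root>"));
        System.out.println(isWellFormed("<root><party3>some_value</party3></root>"));
    }
}
```

Any XSLT processor has to get past this parsing step first, so the input needs to be repaired (or the third party asked to fix it) before a transformation can run.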

Coloring occasional lines in full XML Structure and show in HTML

My first question was marked as a duplicate, but it isn't one:
show XML in HTML with inline stylesheet
I hope this question is not immediately marked as a duplicate just because a moderator read the first two sentences and ignored the rest.
The problem is not how to show an XML structure in HTML, but how to show a fully dynamic XML structure, with all tags, and with occasional colored lines.
The structure and the fields inside it are fully dynamic, and every field can be correct or wrong depending on the XML file it is compared with.
So a field may be correct in one comparison but wrong in another. The fields and structure of the XML can vary greatly from one comparison to another.
I have been looking since yesterday for a suitable, professional solution to this problem.
Background process: different XML files are compared via SOA microservices in Java. The comparison is made by org.custommonkey.xmlunit. The result has to be an HTML popup that shows me the differences marked by colored lines.
Example output: XMLUnit diff result XPath
/ROOT[1]/MATDETAIL[1]/OUTPUT[1]/GENERAL[1]/CHANGED_BY[1]/text()[1]
The source XML is then transformed via XSLT, using the information from the XMLUnit diff result.
Example Input XML
<ROOT>
<MATDETAIL>
<OUTPUT>
<GENERAL>
<CREATED_ON/>
<CREATED_BY>ORIGINAL USER</CREATED_BY>
<LAST_CHNGE/>
<CHANGED_BY>NEW USER</CHANGED_BY>
</GENERAL>
<RETURN>
<TYPE>S</TYPE>
<MESSAGE/>
<LOG_NO/>
<LOG_MSG_NO>000000</LOG_MSG_NO>
</RETURN>
</OUTPUT>
</MATDETAIL>
</ROOT>
Example XSL Transformation
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="node() | @*" mode="unescape"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/ROOT[1]/MATDETAIL[1]/OUTPUT[1]/GENERAL[1]/CHANGED_BY[1]">
<xsl:element name = "span">
<xsl:attribute name="style">font-weight:bold; color:red </xsl:attribute>
<xsl:copy>
<xsl:value-of select = "current()" />
</xsl:copy>
<xsl:text>&lt;== Expected: dasda</xsl:text>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
Example Result of XSL Transformation
<ROOT>
<MATDETAIL>
<OUTPUT>
<GENERAL>
<CREATED_ON/>
<CREATED_BY>ORIGINAL USER</CREATED_BY>
<LAST_CHNGE/>
<span style="font-weight:bold; color:red "><CHANGED_BY>NEW USER</CHANGED_BY>&lt;== Expected: ORIGINAL USER</span>
</GENERAL>
<RETURN>
<TYPE>S</TYPE>
<MESSAGE/>
<LOG_NO/>
<LOG_MSG_NO>000000</LOG_MSG_NO>
</RETURN>
</OUTPUT>
</MATDETAIL>
</ROOT>
I'm not able to show this XML structure in HTML with all tags displayed correctly AND colored.
Either I get no tags, so only the raw data is visible, but the lines are colored;
or I get the XML structure with all data, but not colored.
I tried to replace the lt and gt characters inside the XSLT, but that failed. Doing it in Java after the transformation gives a very ugly result, and my colleague said we cannot use it that way.
Because the XML structure can be different every time, i.e. it is fully dynamic, I cannot style the XML with CSS and per-tag definitions.
Unfortunately, alternative implementations are not an option. I have to do this somehow with the means available to me (Java, XML & XSL, JS, HTML, CSS).
I hope to get some good ideas for solving this.
Thank you in advance.
I hope the following attempt solves your issue.
I. Input:
<ROOT baum="baum">
<MATDETAIL>
<OUTPUT>
<GENERAL>
<CREATED_ON/>
<CREATED_BY>ORIGINAL USER</CREATED_BY>
<LAST_CHNGE/>
<CHANGED_BY>NEW USER</CHANGED_BY>
</GENERAL>
<RETURN>
<TYPE>S</TYPE>
<MESSAGE/>
<LOG_NO/>
<LOG_MSG_NO>000000</LOG_MSG_NO>
</RETURN>
</OUTPUT>
</MATDETAIL>
</ROOT>
II. Stylesheet (XSLT 1.0):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output omit-xml-declaration="yes"/>
<xsl:template match="/">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="CHANGED_BY">
<span style="color:red;">
<xsl:apply-templates select="." mode="serialize"/>
</span>
</xsl:template>
<xsl:template match="*">
<xsl:apply-templates select="." mode="serialize"/>
</xsl:template>
<xsl:template match="@*">
<xsl:apply-templates select="." mode="serialize"/>
</xsl:template>
<xsl:template match="*" mode="serialize">
<xsl:value-of select="concat('&lt;', name())"/>
<xsl:apply-templates select="@*" />
<xsl:choose>
<xsl:when test="node()">
<xsl:text>&gt;</xsl:text>
<xsl:apply-templates />
<xsl:value-of select="concat('&lt;/', name(), '&gt;')"/>
</xsl:when>
<xsl:otherwise>
<xsl:text> /&gt;</xsl:text>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="@*" mode="serialize">
<xsl:value-of select="concat(' ', name(), '=&quot;', ., '&quot;')"/>
</xsl:template>
<xsl:template match="text()" mode="serialize">
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
III: Output:
<ROOT baum="baum">
<MATDETAIL>
<OUTPUT>
<GENERAL>
<CREATED_ON />
<CREATED_BY>ORIGINAL USER</CREATED_BY>
<LAST_CHNGE />
<span style="color:red;"><CHANGED_BY>NEW USER</CHANGED_BY></span>
</GENERAL>
<RETURN>
<TYPE>S</TYPE>
<MESSAGE />
<LOG_NO />
<LOG_MSG_NO>000000</LOG_MSG_NO>
</RETURN>
</OUTPUT>
</MATDETAIL>
</ROOT>
IV. Explanation:
Whenever mode="serialize" is applied, the markup of the context node is written out as escaped text. See the CHANGED_BY template for an example of wrapping a node in HTML tags. Because the XML structure is fully escaped, the browser shows it as a string instead of interpreting it as tags.
I really hope it solves your problem.

Dynamically set XML URI at runtime - XSLT option

I have a number of Java classes that I am using in conjunction with JAXB in order to generate XML. The Java classes have minimal changes from year to year, but the output XML needs very specific yearly changes, and that is proving a little elusive. I've tried updating the attributes using DOM, but nodes further along the tree keep the previous state. I've tried using reflection to update the annotations directly before marshalling, but it doesn't seem to have the desired effect. I've also tried replacing the XmlRootElement object (and XmlType, XmlElement) with local classes, but nothing seems to work properly: some information always seems to be retained somewhere, even when it seems that I have changed the member/attribute/etc.
I am not going to duplicate all the java objects on a yearly basis just so that I can change the namespaces to match the requirements.
Right now I'm at the point where I think XSLT might be the last option, but I have little to no knowledge of it. Is there a simple way to update the 5-8 namespace URIs that are declared on the root element? I don't want to change the prefixes (they are already set using a prefix mapper); I just want to change the namespace from "com.help.me.2014" to "com.help.me.2015".
Thanks
Andy
Resolution:
First off I greatly appreciate the effort and responses. I didn't actually try any of them as I came up with a different solution prior to getting back to see them.
Anyone coming along in the future can try the items listed below (as an XSLT solution) or you can try what I describe below.
I am generating XML in two different styles/formats, one with and one without SOAP wrappers. Due to my difficulty accessing the actual namespaces within the DOM/SOAP objects and my inability to alter the annotations at runtime I ended up capturing the output stream and manipulating the resulting string.
SOAP:
ByteArrayOutputStream stream = new ByteArrayOutputStream();
soapMessage.writeTo(stream);
String file = new String(stream.toByteArray());
... manipulate file (now a string), replace values, etc. -> actually passed to dependency injected converters, then send on to client via response.write
JAXB Marshalling is very similar to the SOAP, both send the resulting String onto converters which manipulate it as a StringBuilder then send it on.
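A rough sketch of what such a string-level converter might look like is below. The class `NamespaceStringRewrite` and its method are hypothetical, since the real converters are not shown here. It assumes each namespace URI appears as a complete quoted attribute value, which avoids touching the same characters inside element text.

```java
// Hypothetical converter; the actual dependency-injected converters are not shown in the post.
final class NamespaceStringRewrite {

    // Replaces oldUri with newUri only where it is a whole quoted value,
    // e.g. xmlns:os="com.help.me.2014", for both quote styles.
    static String replaceNamespace(String xml, String oldUri, String newUri) {
        return xml
                .replace("\"" + oldUri + "\"", "\"" + newUri + "\"")
                .replace("'" + oldUri + "'", "'" + newUri + "'");
    }

    public static void main(String[] args) {
        String xml = "<os:myroot xmlns:os=\"com.help.me.2014\"><os:item>1234</os:item></os:myroot>";
        System.out.println(replaceNamespace(xml, "com.help.me.2014", "com.help.me.2015"));
    }
}
```

This is plain text substitution, so it would also rewrite any other attribute whose value happens to equal the old URI. That may be acceptable for generated output with a known shape, but it is a trade-off compared with an XSLT-based rewrite.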
Thanks again for the suggestions. Hopefully it helps someone in the future although the requirement is a little out there.
Andy
Changing namespaces every year is almost certainly the wrong thing to do, but the following XSLT stylesheet will change namespaces
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:old="oldspace"
version="1.0">
<xsl:template match="old:*">
<xsl:element name="{local-name(.)}" namespace="newspace">
<xsl:apply-templates select="@*|node()"/>
</xsl:element>
</xsl:template>
<xsl:template match="@old:*">
<xsl:attribute name="{local-name()}" namespace="newspace">
<xsl:value-of select="."/>
</xsl:attribute>
</xsl:template>
<xsl:template match="*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="@*">
<xsl:copy-of select="."/>
</xsl:template>
<xsl:template match="processing-instruction()|comment()">
<xsl:copy-of select="."/>
</xsl:template>
</xsl:stylesheet>
This style sheet creates a copy of every element changing the namespace from oldspace to newspace when appropriate. Other than the namespace change, the original document is preserved. Similar templates can be made for every namespace that needs to be changed (note there are two templates that are namespace specific).
Note that prefixes WILL be altered. These are not really content as such, so it is nearly impossible to preserve them in a case like this. The only way I can think of to preserve those would involve writing a separate template for each element in the original, directly creating the new elements instead of using the xsl:element element.
For example, the given xml
<os:myroot xmlns:os="oldspace">
<?keep-this?>
<os:testing abc='3' def='9'>
<!-- This is a child -->
<os:item>1234</os:item>
</os:testing>
<!-- this element is in the default namespace -->
<testing2>
<abc>112233</abc>
</testing2>
</os:myroot>
is transformed to
<myroot xmlns="newspace">
<?keep-this?>
<testing abc="3" def="9">
<!-- This is a child -->
<item>1234</item>
</testing>
<!-- this element is in the default namespace -->
<testing2 xmlns="">
<abc>112233</abc>
</testing2>
</myroot>
where all elements that were in the oldspace namespace are now in the newspace namespace.
Here's an option that allows you to pass the old and new namespace URIs in as xsl:params.
XML Input (Borrowed from Matthew's answer; thanks!)
<os:myroot xmlns:os="com.help.me.2014">
<?keep-this?>
<os:testing abc='3' def='9'>
<!-- This is a child -->
<os:item>1234</os:item>
</os:testing>
<!-- this element is in the default namespace -->
<testing2>
<abc>112233</abc>
</testing2>
</os:myroot>
XSLT 1.0
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="oldns" select="'com.help.me.2014'"/>
<xsl:param name="newns" select="'com.help.me.2015'"/>
<xsl:template match="@*|node()" name="ident">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*" priority="1">
<xsl:choose>
<xsl:when test="namespace-uri()=$oldns">
<xsl:element name="{name()}" namespace="{$newns}">
<xsl:apply-templates select="@*|node()"/>
</xsl:element>
</xsl:when>
<xsl:otherwise>
<xsl:element name="{name()}" namespace="{namespace-uri()}">
<xsl:apply-templates select="@*|node()"/>
</xsl:element>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
XML Output
<os:myroot xmlns:os="com.help.me.2015"><?keep-this?>
<os:testing abc="3" def="9"><!-- This is a child -->
<os:item>1234</os:item>
</os:testing><!-- this element is in the default namespace -->
<testing2>
<abc>112233</abc>
</testing2>
</os:myroot>
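Since the question is about a Java/JAXB setup, here is a hedged sketch of passing the two parameters from Java with `Transformer.setParameter`. The class name `NamespaceParamDemo` is an assumption. The embedded stylesheet is a condensed copy of the XSLT 1.0 one above, with the parameter defaults removed so the values always come from the caller.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; the names are illustrative only.
final class NamespaceParamDemo {

    // Condensed XSLT 1.0 stylesheet: identity copy plus an element rule that
    // swaps the namespace URI when it equals $oldns.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:param name='oldns'/>"
        + "<xsl:param name='newns'/>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match='*' priority='1'>"
        + "<xsl:choose>"
        + "<xsl:when test='namespace-uri()=$oldns'>"
        + "<xsl:element name='{name()}' namespace='{$newns}'>"
        + "<xsl:apply-templates select='@*|node()'/></xsl:element>"
        + "</xsl:when>"
        + "<xsl:otherwise>"
        + "<xsl:element name='{name()}' namespace='{namespace-uri()}'>"
        + "<xsl:apply-templates select='@*|node()'/></xsl:element>"
        + "</xsl:otherwise>"
        + "</xsl:choose>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the stylesheet with the given old/new namespace URIs and returns the result as a string.
    static String transform(String xml, String oldNs, String newNs) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setParameter("oldns", oldNs);
            t.setParameter("newns", newNs);
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<os:myroot xmlns:os='com.help.me.2014'>"
                   + "<os:item abc='3'>1234</os:item><testing2>x</testing2></os:myroot>";
        System.out.println(transform(xml, "com.help.me.2014", "com.help.me.2015"));
    }
}
```

For several namespaces with an XSLT 1.0 processor, one option is to run the transformation once per old/new pair; the comma-separated variant further down needs an XSLT 2.0 processor.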
Here's an XSLT 2.0 option that produces the same output...
XSLT 2.0
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="oldns" select="'com.help.me.2014'"/>
<xsl:param name="newns" select="'com.help.me.2015'"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*" priority="1">
<xsl:element name="{name()}" namespace="{
if (namespace-uri()=$oldns) then $newns else namespace-uri()}">
<xsl:apply-templates select="@*|node()"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
Here's another 2.0 example that handles multiple namespace uris. The old and new uris are passed in as a string with commas as delimiters.
The order of the URIs is important: the first old URI corresponds to the first new URI, the second old URI to the second new URI, and so on.
XML Input (updated to have more than one namespace uri)
<os:myroot xmlns:os="com.help.me.2014">
<?keep-this?>
<os:testing abc='3' def='9'>
<!-- This is a child -->
<os:item>1234</os:item>
</os:testing>
<!-- this element is in the default namespace -->
<testing2>
<abc>112233</abc>
</testing2>
<os2:testing xmlns:os2="com.help.me.again.2014">
<os2:item>ABCD</os2:item>
</os2:testing>
</os:myroot>
XSLT 2.0
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="oldns" select="'com.help.me.2014,com.help.me.again.2014'"/>
<xsl:param name="newns" select="'com.help.me.2015,com.help.me.again.2015'"/>
<xsl:variable name="oldns-seq" select="tokenize($oldns,',')"/>
<xsl:variable name="newns-seq" select="tokenize($newns,',')"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*" priority="1">
<xsl:variable name="nsIdx" select="index-of($oldns-seq,namespace-uri())"/>
<xsl:element name="{name()}" namespace="{
if (namespace-uri()=$oldns-seq) then $newns-seq[$nsIdx] else namespace-uri()}">
<xsl:apply-templates select="@*|node()"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
XML Output
<os:myroot xmlns:os="com.help.me.2015"><?keep-this?>
<os:testing abc="3" def="9"><!-- This is a child -->
<os:item>1234</os:item>
</os:testing>
<!-- this element is in the default namespace -->
<testing2>
<abc>112233</abc>
</testing2>
<os2:testing xmlns:os2="com.help.me.again.2015">
<os2:item>ABCD</os2:item>
</os2:testing>
</os:myroot>

getting attributes of all the parent in xslt

I have following xml structure
<comp name = "a">
<subcomp1 name = "a1">
<subcomp2 name = "a2"/>
</subcomp1>
<subcomp3 name="a3"/>
</comp>
If I use the following syntax,
<xsl:value-of select="@name" />
it gives the value of the name attribute of the particular element I am at, i.e. when I am at
<comp> @name = a, at <subcomp2> it is a2
But I want to get all the attribute values, including those of the parents, i.e. when I am at
<subcomp2> I want value a->a1->a2
<subcomp1> a->a1
<subcomp3> a->a3
<xsl:value-of select="../@name" />
gives only the one parent above. Please let me know a solution for this.
The expression you want is ancestor-or-self::*/@name
If you are using XSLT 2.0, then xsl:value-of returns all matching attributes, so you can just do this to list them, for example
<xsl:value-of select="ancestor-or-self::*/@name" separator="-" />
However, in XSLT 1.0, xsl:value-of will only return the value of the first one. So, you can use xsl:for-each instead
<xsl:for-each select="ancestor-or-self::*/@name">
<xsl:if test="position() > 1">-</xsl:if>
<xsl:value-of select="." />
</xsl:for-each>
Or maybe this...
<xsl:for-each select="ancestor-or-self::*">
<xsl:if test="position() > 1">-</xsl:if>
<xsl:value-of select="@name" />
</xsl:for-each>
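To try the XSLT 1.0 variant end to end, here is a minimal sketch that runs the second for-each form from Java with the JDK's built-in processor and prints one path per element. The class name `AncestorNamesDemo`, the wrapper template, and the `->` separator (taken from the question) are assumptions for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; the names are illustrative only.
final class AncestorNamesDemo {

    // For every element, emit the @name values of its ancestors and itself,
    // joined with "->", followed by a line break; then recurse into child elements.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='*'>"
        + "<xsl:for-each select='ancestor-or-self::*'>"
        + "<xsl:if test='position() &gt; 1'>-&gt;</xsl:if>"
        + "<xsl:value-of select='@name'/>"
        + "</xsl:for-each>"
        + "<xsl:text>&#10;</xsl:text>"
        + "<xsl:apply-templates select='*'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Returns one "a->a1->..." path per element, in document order.
    static List<String> paths(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            // Split on any line break so the result does not depend on the platform line separator.
            return Arrays.asList(out.toString().trim().split("\\R"));
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<comp name='a'><subcomp1 name='a1'><subcomp2 name='a2'/></subcomp1>"
                   + "<subcomp3 name='a3'/></comp>";
        paths(xml).forEach(System.out::println);
    }
}
```

xsl:for-each visits the selected nodes in document order even though ancestor-or-self is a reverse axis, which is why the outermost name comes first.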

How can I get array of elements, including missing elements, using XPath in XSLT?

Given the following XML-compliant HTML:
<div>
<a>a1</a>
<b>b1</b>
</div>
<div>
<b>b2</b>
</div>
<div>
<a>a3</a>
<b>b3</b>
<c>c3</c>
</div>
doing //a will return:
[a1,a3]
The problem with above is that the third column data is now in second place, when A is not found it is completely skipped.
how can you express an xpath to get all A elements which will return:
[a1, null, a3]
same case for //c, I wonder if it's possible to get
[null, null, c3]
UPDATE: consider another scenario where are no common parents <div>.
<h1>heading1</h1>
<a>a1</a>
<b>b1</b>
<h1>heading2</h1>
<b>b2</b>
<h1>heading3</h1>
<a>a3</a>
<b>b3</b>
<c>c3</c>
UPDATE: I am now able to use XSLT as well.
There is no null value in XPath. There's a semi-related question here which also explains this: http://www.velocityreviews.com/forums/t686805-xpath-query-to-return-null-values.html
Realistically, you've got three options:
Don't use XPath at all.
Use this: //a | //div[not(a)], which returns the div element itself if there is no a within it, and have your Java code treat any returned div as 'no a element present'. Depending on the context, this may even let you output something more useful, because you have access to the entire contents of the div, for example an error such as 'no a element found in div (some identifier)'.
Preprocess your XML with an XSLT that inserts a elements in any div element that does not already have one with a suitable default.
Your second case is a little tricky, and to be honest, I'd actually recommend not using XPath for it at all, but it can be done:
//a | //h1[not(following-sibling::a) or generate-id(.) != generate-id(following-sibling::a[1]/preceding-sibling::h1[1])]
This will match any a elements, or any h1 elements where no following a element exists before the next h1 element, or the end of the document. As Dimitre pointed out though, this only works if you're using it from within XSLT, as generate-id is an XSLT function.
If you're not using it from within XSLT, you can use this rather contrived formula:
//a | //h1[not(following-sibling::a) or count(. | preceding-sibling::h1) != count(following-sibling::a[1]/preceding-sibling::h1)]
It works by matching h1 elements where the count of itself and all preceding h1 elements is not the same as the count of all h1 elements preceding the next a. There may be a more efficient way of doing it in XPath, but if it's going to get any more contrived than that, I'd definitely recommend not using XPath at all.
Solution for the first problem:
This XPath expression:
/*/div/a
|
/*/div[not(a)]
When evaluated against the following XML document:
<t>
<div>
<a>a1</a>
<b>b1</b>
</div>
<div>
<b>b2</b>
</div>
<div>
<a>a3</a>
<b>b3</b>
<c>c3</c>
</div>
</t>
selects the following three nodes (a, div, a):
<a>a1</a>
<div>
<b>b2</b>
</div>
<a>a3</a>
In your Java array, any selected non-a element should be treated as (or replaced by) null.
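As a rough illustration of that last step, the sketch below evaluates the expression with the JDK's javax.xml.xpath API and maps every non-a hit to null. The class name `ColumnWithNulls` and the method `column` are made up for this example. It assumes the XPath implementation returns the union in document order, which the JDK's built-in engine is expected to do.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; the names are illustrative only.
final class ColumnWithNulls {

    // Selects each div's a child, or the div itself when it has no a,
    // then turns the placeholder divs into null entries.
    static List<String> column(String xml) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    "/*/div/a | /*/div[not(a)]",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                Node n = hits.item(i);
                // Assumption: results arrive in document order, one hit per div.
                result.add("a".equals(n.getNodeName()) ? n.getTextContent() : null);
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<t><div><a>a1</a><b>b1</b></div><div><b>b2</b></div>"
                   + "<div><a>a3</a><b>b3</b><c>c3</c></div></t>";
        System.out.println(column(xml));
    }
}
```

The one-hit-per-div assumption only holds if no div contains more than one a element; with repeated a children the positions would shift.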
Here is one solution to the second problem:
Use these XPath expressions for selecting the a elements from each group:
For the first group:
/*/h1[1]
/following-sibling::a
[not(/*/h1[2])
or
count(.|/*/h1[2]/preceding-sibling::a)
=
count(/*/h1[2]/preceding-sibling::a)
]
For the second group:
/*/h1[2]
/following-sibling::a
[not(/*/h1[3])
or
count(.|/*/h1[3]/preceding-sibling::a)
=
count(/*/h1[3]/preceding-sibling::a)
]
And for the 3rd group:
/*/h1[3]
/following-sibling::a
[not(/*/h1[4])
or
count(.|/*/h1[4]/preceding-sibling::a)
=
count(/*/h1[4]/preceding-sibling::a)
]
In case that:
count(/*/h1)
is $cnt,
generate $cnt such expressions (for i = 1 to $cnt) and evaluate all of them. The selected nodes by each of them either contains an a element, or not. If the $k-th group (nodes selected from evaluating the $k-th expression) contains an a, use its string value to generate the $k-th item of the wanted array -- otherwise generate null for the $k-th item of the wanted array.
Here is an XSLT - based verification of the above XPath expressions:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="vGroup1" select=
"/*/h1[1]
/following-sibling::a
[not(/*/h1[2])
or
count(.|/*/h1[2]/preceding-sibling::a)
=
count(/*/h1[2]/preceding-sibling::a)
]
"/>
<xsl:variable name="vGroup2" select=
"/*/h1[2]
/following-sibling::a
[not(/*/h1[3])
or
count(.|/*/h1[3]/preceding-sibling::a)
=
count(/*/h1[3]/preceding-sibling::a)
]
"/>
<xsl:variable name="vGroup3" select=
"/*/h1[3]
/following-sibling::a
[not(/*/h1[4])
or
count(.|/*/h1[4]/preceding-sibling::a)
=
count(/*/h1[4]/preceding-sibling::a)
]
"/>
Group1: "<xsl:copy-of select="$vGroup1"/>"
Group2: "<xsl:copy-of select="$vGroup2"/>"
Group3: "<xsl:copy-of select="$vGroup3"/>"
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the following XML document (the OP did not provide a complete, well-formed XML document):
<t>
<h1>heading1</h1>
<a>a1</a>
<b>b1</b>
<h1>heading2</h1>
<b>b2</b>
<h1>heading3</h1>
<a>a3</a>
<b>b3</b>
<c>c3</c>
</t>
the three XPath expressions are evaluated and the selected nodes by each of them are output:
Group1: "<a>a1</a>"
Group2: ""
Group3: "<a>a3</a>"
Explanation:
We use the well-known Kayessian formula for the intersection of two nodesets:
$ns1[count(. | $ns2) = count($ns2)]
The result of evaluating this expression contains exactly the nodes that belong both to the nodeset $ns1 and the nodeset $ns2.
What remains is to substitute $ns1 and $ns2 with expressions that are relevant to the problem.
We substitute $ns1 by:
/*/h1[1]
/following-sibling::a
and we substitute $ns2 by:
/*/h1[2]
/preceding-sibling::a
In other words, the a elements that are between the first and second /*/h1 are the intersection of the a elements that are following siblings of /*/h1[1] and the a elements that are preceding siblings of /*/h1[2].
This expression is only problematic for the a elements that follow the last of the /*/h1 elements. This is why we add an additional predicate that checks for the non-existence of a next /*/h1 element and combine it, using or, with the boolean expression that follows.
Finally, as a guiding example for a Java implementation here is a complete XSLT transformation, which does something similar -- produces a serialized array, and can be mechanically translated to a corresponding Java solution:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:my="my:my">
<xsl:output method="text"/>
<my:null>null</my:null>
<my:Q>"</my:Q>
<xsl:variable name="vNull" select="document('')/*/my:null"/>
<xsl:variable name="vQ" select="document('')/*/my:Q"/>
<xsl:template match="/">
<xsl:variable name="vGroup1" select=
"/*/h1[1]
/following-sibling::a
[not(/*/h1[2])
or
count(.|/*/h1[2]/preceding-sibling::a)
=
count(/*/h1[2]/preceding-sibling::a)
]
"/>
<xsl:variable name="vGroup2" select=
"/*/h1[2]
/following-sibling::a
[not(/*/h1[3])
or
count(.|/*/h1[3]/preceding-sibling::a)
=
count(/*/h1[3]/preceding-sibling::a)
]
"/>
<xsl:variable name="vGroup3" select=
"/*/h1[3]
/following-sibling::a
[not(/*/h1[4])
or
count(.|/*/h1[4]/preceding-sibling::a)
=
count(/*/h1[4]/preceding-sibling::a)
]
"/>
[<xsl:value-of select=
"concat($vQ[$vGroup1/self::a[1]],
$vGroup1/self::a[1],
$vQ[$vGroup1/self::a[1]],
$vNull[not($vGroup1/self::a[1])])"/>
<xsl:text>,</xsl:text>
<xsl:value-of select=
"concat($vQ[$vGroup2/self::a[1]],
$vGroup2/self::a[1],
$vQ[$vGroup2/self::a[1]],
$vNull[not($vGroup2/self::a[1])])"/>
<xsl:text>,</xsl:text>
<xsl:value-of select=
"concat($vQ[$vGroup3/self::a[1]],
$vGroup3/self::a[1],
$vQ[$vGroup3/self::a[1]],
$vNull[not($vGroup3/self::a[1])])"/>]
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the same XML document (above), the wanted, correct result is produced:
["a1",null,"a3"]
Update2:
Now the OP has added that he can use an XSLT solution. Here is one:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:my="my:my" exclude-result-prefixes="xsl">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:key name="kFollowing" match="a"
use="generate-id(preceding-sibling::h1[1])"/>
<my:null/>
<xsl:variable name="vNull" select="document('')/*/my:null"/>
<xsl:template match="/*">
<xsl:copy-of select=
"h1/following-sibling::a[1]
|
h1[not(key('kFollowing', generate-id()))]"/>
=============================================
<xsl:apply-templates select="h1"/>
</xsl:template>
<xsl:template match="h1">
<xsl:variable name="vAsInGroup" select=
"key('kFollowing', generate-id())"/>
<xsl:copy-of select="$vAsInGroup[1] | $vNull[not($vAsInGroup)]"/>
</xsl:template>
</xsl:stylesheet>
This transformation implements two different solutions. The difference is in what element is used to represent "null". In the first case it is the h1 element. This isn't recommended, because any h1 already has its own meaning which is different from "representing null". The second solution uses a special my:null element to represent null.
When this transformation is applied on the same XML document as above:
<t>
<h1>heading1</h1>
<a>a1</a>
<b>b1</b>
<h1>heading2</h1>
<b>b2</b>
<h1>heading3</h1>
<a>a3</a>
<b>b3</b>
<c>c3</c>
</t>
each of the two XPath expressions (containing XSLT key() references) is evaluated and the selected nodes are output (above and below "========", respectively):
<a>a1</a>
<h1>heading2</h1>
<a>a3</a>
=============================================
<a>a1</a>
<my:null xmlns:my="my:my" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"/>
<a>a3</a>
Note on performance:
Because keys are used, this solution will be significantly more efficient when more than one search is made -- for example, when the corresponding arrays for a, b, and c need to be produced.
I suggest you use the following, which might be rewritten to an xsl:function where the parent node name (here: div) is parametrized.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<root>
<aList><xsl:copy-of select="$divIncludingNulls//a"/></aList>
<bList><xsl:copy-of select="$divIncludingNulls//b"/></bList>
<cList><xsl:copy-of select="$divIncludingNulls//c"/></cList>
</root>
</xsl:template>
<xsl:variable name="divChild" select="distinct-values(//div/*/name())"/>
<xsl:variable name="divIncludingNulls">
<xsl:for-each select="//div">
<xsl:variable name="divElt" select="."/>
<div>
<xsl:for-each select="$divChild">
<xsl:variable name="divEltvalue" select="$divElt/*[name()=current()]"/>
<xsl:element name="{.}">
<xsl:choose>
<xsl:when test="$divEltvalue"><xsl:value-of select="$divEltvalue"/></xsl:when>
<xsl:otherwise>null</xsl:otherwise>
</xsl:choose>
</xsl:element>
</xsl:for-each>
</div>
</xsl:for-each>
</xsl:variable>
</xsl:stylesheet>
Applied to
<?xml version="1.0" encoding="UTF-8"?>
<root>
<div>
<a>a1</a>
<b>b1</b>
</div>
<div>
<b>b2</b>
</div>
<div>
<a>a3</a>
<b>b3</b>
<c>c3</c>
</div>
</root>
the output is
<?xml version="1.0" encoding="UTF-8"?>
<root>
<aList>
<a>a1</a>
<a>null</a>
<a>a3</a>
</aList>
<bList>
<b>b1</b>
<b>b2</b>
<b>b3</b>
</bList>
<cList>
<c>null</c>
<c>null</c>
<c>c3</c>
</cList>
</root>
