How to edit XML with XSL? - java

I'm writing a dummy "MyAgenda" application in Java which has to allow maintenance of the XML file that stores the data.
Say I have an XML file like:
<myagenda>
<contact>
<name>Matthew Blake</name>
<phone>12345678</phone>
</contact>
</myagenda>
How can I add a new <contact> by using XSLT ?
Thanks.

Start with the identity transform, which transforms any XML document into itself.
The identity transform is a simple machine: given a tree, it copies every node it finds recursively. You're going to override its behavior for one specific node - the myagenda element - which it's going to copy in a different way.
To do this, add a template that matches the element that you want to update and duplicates it. In your case:
<xsl:template match="myagenda">
<xsl:copy>
<xsl:apply-templates select="node() | @*"/>
</xsl:copy>
</xsl:template>
You might think, "wait isn't that the identity transform?" It is, but it's not going to stay that way.
Now decide on how you're going to get the new contact information into the transform. There are basically two ways: read it from a separate XML document using the document function, or pass the values into the transform using parameters. Let's assume that you're using parameters; in this case, you'd add the following to the top of your XSLT (right after the xsl:output element):
<xsl:param name="contactName"/>
<xsl:param name="contactPhone"/>
Now, instead of transforming myagenda into a copy of itself, you want to transform it into a copy of itself that has a new contact in it. So modify the template to do this:
<xsl:template match="myagenda">
<xsl:copy>
<xsl:apply-templates select="node() | @*"/>
<contact>
<name><xsl:value-of select="$contactName"/></name>
<phone><xsl:value-of select="$contactPhone"/></phone>
</contact>
</xsl:copy>
</xsl:template>
If you wanted to get the name and phone out of a separate XML document in the file system, you'd start the XSLT with something like this:
<xsl:variable name="contact" select="document('contact.xml')"/>
<xsl:variable name="contactName" select="$contact/*/name[1]"/>
<xsl:variable name="contactPhone" select="$contact/*/phone[1]"/>
That reads in contact.xml and finds the first name and phone element under the top-level element (using * in the pattern means that you don't care what the top-level element's name is).
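Since the question is about a Java application, here is a minimal sketch of how the parameter-based stylesheet could be driven through the standard JAXP API. The class name `AddContact`, the method `addContact`, and the sample values are made up for illustration; the stylesheet is embedded as a string only to keep the example self-contained, and it includes the identity template that the templates above build on.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of the original answer.
class AddContact {
    // Identity template plus the myagenda override that appends a new contact.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:param name='contactName'/>"
        + "<xsl:param name='contactPhone'/>"
        + "<xsl:template match='node() | @*'>"
        + "<xsl:copy><xsl:apply-templates select='node() | @*'/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match='myagenda'>"
        + "<xsl:copy><xsl:apply-templates select='node() | @*'/>"
        + "<contact><name><xsl:value-of select='$contactName'/></name>"
        + "<phone><xsl:value-of select='$contactPhone'/></phone></contact>"
        + "</xsl:copy></xsl:template>"
        + "</xsl:stylesheet>";

    // Returns a copy of the agenda document with one extra contact element.
    static String addContact(String agendaXml, String name, String phone) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            // These names must match the xsl:param declarations in the stylesheet.
            t.setParameter("contactName", name);
            t.setParameter("contactPhone", phone);
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(agendaXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String agenda = "<myagenda><contact><name>Matthew Blake</name>"
                + "<phone>12345678</phone></contact></myagenda>";
        System.out.println(addContact(agenda, "Jane Doe", "87654321"));
    }
}
```

In a real application you would probably read the stylesheet and the agenda from files and write the result to a new file, then replace the old one, since the transform produces a new document rather than modifying the input.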

Use xsl:param to declare global parameters at the top of your XSL stylesheet.
<xsl:param name="newname"/>
<xsl:param name="newphone"/>
Fill the new parameters through your XSLT engine and then add the new item via a template:
(...)
<xsl:template match="myagenda">
<xsl:copy>
<xsl:apply-templates select="contact"/>
<xsl:if test="string-length($newname) > 0">
<xsl:element name="contact">
<xsl:element name="name">
<xsl:value-of select="$newname"/>
</xsl:element>
<xsl:element name="phone">
<xsl:value-of select="$newphone"/>
</xsl:element>
</xsl:element>
</xsl:if>
</xsl:copy>
</xsl:template>
(...)

XSLT transforms one XML file into another XML or text file; it does not edit the source file in place.

Related

XSLT/XSL file: need to remove duplicates generically from the parent tag, given that all the child key values are the same

I have been working on this problem for a long time. I need to remove duplicates from an XML file based on the key values of the child tags. The parent tag "A" is always known and stays the same. The nested tags can have different names, e.g. there could be "Name", "Location", "Name". If the data under two "Name" tags are duplicates of each other, one of the "Name" tags must be removed along with its child nodes. This should happen only if all the child tags have the same keys and values, not if only some of them match while others have a different key, or the same key with a different value.
Example:
<A>
<Name>
<c>1</c>
<d>g</d>
<e>h</e>
</Name>
<Location>
<c>2</c>
<d>g</d>
<e>h</e>
</Location>
<Name>
<c>1</c>
<d>g</d>
<e>h</e>
</Name>
</A>
Expected output:
<A>
<Name>
<c>1</c>
<d>g</d>
<e>h</e>
</Name>
<Location>
<c>2</c>
<d>g</d>
<e>h</e>
</Location>
</A>
I tried this:
<xsl:template match="@*|node()">
<xsl:if test="not(node()) or not(preceding-sibling::node()[.=string(current())])">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:if>
</xsl:template>
but what ended up happening was that the child tags with the same key values got removed as well and I was getting something like this:
<A>
<Name>
<c>1</c>
<d>g</d>
<e>h</e>
</Name>
<Location>
<c>2</c>
</Location>
</A>
I'm looking for a generic way as I don't want to specify the tag values or keys in the file.
Thanks in advance :)!
In XSLT 3 using a composite key with for-each-group group-by might suffice:
<xsl:template match="A">
<xsl:copy>
<xsl:for-each-group select="*" composite="yes" group-by="*">
<xsl:apply-templates select="."/>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
If the grandchildren are not guaranteed to appear in the same order, then you might need
<xsl:for-each-group select="*" composite="yes" group-by="sort(*, (), function($c) { name($c) })">
instead of the simple group-by given above.
Both ways, as the base transformation, you need to set up the identity transformation by declaring <xsl:mode on-no-match="shallow-copy"/> as a child of xsl:stylesheet (or xsl:transform) in the XSLT.
But the problem is rather underspecified: it is not clear whether the names and order of the child elements are simply unknown but always the same, or whether there can be variations and, if so, how to handle them.
As an alternative, if A can have different elements as children but you need to eliminate duplicates only for a specific one like B, and not for any other elements, then a key explicitly declared for B elements can help:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
expand-text="yes">
<xsl:mode on-no-match="shallow-copy"/>
<xsl:key name="B-group" match="A/B" use="sort(*, (), function($c) { name($c) })"/>
<xsl:template match="B[not(. is key('B-group', sort(*, (), function($c) { name($c) }))[1])]"/>
</xsl:stylesheet>
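If an XSLT 3.0 processor is not available, the same idea can be sketched in plain Java with the JDK's DOM API instead of XSLT. This is a different technique from the answer above, shown only for comparison. It assumes that two children of the parent are duplicates when they have the same element name and the same set of grandchild name/text pairs, regardless of grandchild order; the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class DedupChildren {
    // Composite key: element name plus its sorted child name/text pairs.
    static String key(Element e) {
        List<String> parts = new ArrayList<>();
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                parts.add(c.getNodeName() + "\u0000" + c.getTextContent());
            }
        }
        Collections.sort(parts); // makes the key independent of grandchild order
        return e.getNodeName() + "\u0001" + String.join("\u0001", parts);
    }

    // Removes every child element whose key was already seen on an earlier sibling.
    static void removeDuplicates(Element parent) {
        Set<String> seen = new HashSet<>();
        Node c = parent.getFirstChild();
        while (c != null) {
            Node next = c.getNextSibling(); // fetch before a possible removal
            if (c.getNodeType() == Node.ELEMENT_NODE && !seen.add(key((Element) c))) {
                parent.removeChild(c);
            }
            c = next;
        }
    }

    // Parses the XML, deduplicates the root's children, returns the remaining child names.
    static List<String> dedupedChildNames(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            removeDuplicates(root);
            List<String> names = new ArrayList<>();
            for (Node c = root.getFirstChild(); c != null; c = c.getNextSibling()) {
                if (c.getNodeType() == Node.ELEMENT_NODE) names.add(c.getNodeName());
            }
            return names;
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
    }
}
```

Note that, unlike the group-by="*" key above, this sketch also includes the element's own name in the key, so a Name and a Location with identical children are never treated as duplicates of each other.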

How to determine if two XML files have the same structure even if the tags have different values?

I wish to compare two XML files and determine if they have the same structure, i.e. the same type and number of tags with, preferably, the same attributes. The values of the tags and attributes may be different.
This code detects ALL the differences. Even if the structure is the same but values are different. I want to refine this to detect only the structural differences.
public static List compareXML(Reader source, Reader target) throws
SAXException, IOException{
//creating Diff instance to compare two XML files
Diff xmlDiff = new Diff(source, target);
//for getting detailed differences between two xml files
DetailedDiff detailXmlDiff = new DetailedDiff(xmlDiff);
return detailXmlDiff.getAllDifferences();
}
Try this XSLT 3.0:
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template match="text()"/>
<xsl:template match="@*">
<xsl:attribute name="{name()}"/>
</xsl:template>
<xsl:variable name="doc1">
<xsl:apply-templates select="doc('one.xml')"/>
</xsl:variable>
<xsl:variable name="doc2">
<xsl:apply-templates select="doc('two.xml')"/>
</xsl:variable>
<xsl:template name="xsl:initial-template">
<xsl:value-of select="deep-equal($doc1, $doc2)"/>
</xsl:template>
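The same idea (discard all values, then compare what is left) can also be sketched in plain Java with the JDK's DOM API, which may be convenient if no XSLT 3.0 processor is on the classpath. The class name `StructureCompare` and its methods are hypothetical; the sketch assumes that "structure" means element names, nesting, element order, and attribute names, with attribute order ignored.

```java
import java.io.StringReader;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class StructureCompare {
    // Appends a value-free description of the element and its descendants.
    static void describe(Element e, StringBuilder sb) {
        sb.append('<').append(e.getNodeName());
        // Attribute names only, sorted so attribute order does not matter.
        TreeSet<String> attrNames = new TreeSet<>();
        NamedNodeMap attrs = e.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            attrNames.add(attrs.item(i).getNodeName());
        }
        for (String name : attrNames) {
            sb.append(" @").append(name);
        }
        sb.append('>');
        // Recurse into child elements; text nodes are skipped entirely.
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                describe((Element) c, sb);
            }
        }
        sb.append("</>");
    }

    // Parses the XML and returns its structural signature.
    static String signature(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            StringBuilder sb = new StringBuilder();
            describe(root, sb);
            return sb.toString();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
    }

    static boolean sameStructure(String xmlA, String xmlB) {
        return signature(xmlA).equals(signature(xmlB));
    }
}
```

For example, `sameStructure("<a x='1'><b>t</b></a>", "<a x='2'><b>u</b></a>")` compares two documents that differ only in values, while documents with different element names or a different number of children produce different signatures.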

How to transform an xml file by searching for some nodes and replacing the values

This is the input xml -
<payload id="001">
<termsheet>
<format>PDF</format>
<city>New York</city>
</termsheet>
</payload>
We are using Xalan for most of our xml transformations and we are on XSLT 1.0
I want to write a XSLT template which would convert the input to the below output -
<payload id="001">
<termsheet>
<format>pdf</format>
<city>Mr. ABC</city>
</termsheet>
</payload>
I tried a lot of answers on SO, but can't get around this problem.
Apologies for not being clear, toLower was an over simplification. I want to use the city name and invoke a java method which will return a business contact from that city. I have updated the original question
I think the simplest way is to use a Java extension with Xalan: write a simple Java class that implements the business logic you need, then call it from your XSLT. The stylesheet is quite simple:
<xsl:stylesheet version="1.0"
xmlns:java="http://xml.apache.org/xalan/java"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
exclude-result-prefixes="java">
<xsl:template match='node() | @*'>
<xsl:copy>
<xsl:apply-templates select='node() | @*'/>
</xsl:copy>
</xsl:template>
<xsl:template match="termsheet/city">
<xsl:copy>
<xsl:value-of select='java:org.example.Card.getName(.)'/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
You also need to write the Java class being invoked:
package org.example;
public class Card {
// Called from the stylesheet with the string value of the city element.
public static String getName(String id) {
// put here your code to get what you need
return "Mr. ABC";
}
}
There are other ways to do this, and you should really take a look at the documentation on Xalan extensions.

Transform xml using xslt to csv file

I'm having an issue with xsl:template and xsl:call-template tags. Perhaps it's a lack of understanding, but here's what I'm trying to do...
If I have a template that's matching on "/*", and I need to call other templates from within the enclosing template that require other document contexts, what is the most efficient method of doing this?
<xsl:template match="/*">
<xsl:call-template name="header">
<xsl:with-param name="headerContext" select="./[1]"/>
</xsl:call-template>
<xsl:call-template name="body">
<xsl:with-param name="bodyContext" select="*/*/[1]"/>
</xsl:call-template>
</xsl:template>
I'm using xsl:with-param when calling the header and body templates so that I can override the match="/*" from the enclosing template, but when I do this the output gets messed up. If I comment out the call to the "header" template, the body template works properly, and vice versa, but calling both from the main template, as in the example above, makes them behave strangely. The header and body templates require a selection of different parts of the document, which is why I chose to use with-param, but I don't think it's even working.
Should I be using apply-templates instead?
XSL was designed to be event-based. So, typically, you'll want to use template matching more than explicitly specifying which descendants to process.
<!-- Identity Template will copy every node to the output. -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- You listed ./[1] as your xpath, but you might want to add more information
to make it more specific. i.e. element names, not just * and position. -->
<xsl:template match="/*/header">
<someOutputHeader><xsl:apply-templates /></someOutputHeader>
</xsl:template>
<xsl:template match="/something/*/body">
<newBody><xsl:apply-templates /></newBody>
</xsl:template>
Also, it's good practice to specify a node test before a predicate. For example, instead of writing "./[1]" you could put * after the slash: "./*[1]". You don't need the "./" either, since it's implied by XPath, so really it's just "*[1]".

How can we convert XML file to CSV?

I have an XML file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<Results>
<Row>
<COL1></COL1>
<COL2>25.00</COL2>
<COL3>2009-07-06 15:49:34.984</COL3>
<COL4>00001720</COL4>
</Row>
<Row>
<COL1>RJ</COL1>
<COL2>26.00</COL2>
<COL3>2009-07-06 16:04:16.156</COL3>
<COL4>00001729</COL4>
</Row>
<Row>
<COL1>SD</COL1>
<COL2>28.00</COL2>
<COL3>2009-07-06 16:05:04.375</COL3>
<COL4>00001721</COL4>
</Row>
</Results>
I have to convert this XML into a CSV file. I have heard we can do such a thing using XSLT. How can I do this in Java (with or without XSLT)?
Using XSLT is often a bad idea. Use Apache Commons Digester. It's fairly easy to use. Here's a rough idea:
Digester digester = new Digester();
digester.addObjectCreate("Results/Row", MyRowHolder.class);
digester.addCallMethod("Results/Row/COL1","addCol", 0);
// Similarly for COL2, etc.
digester.parse("mydata.xml");
This will create a MyRowHolder instance (a class that you provide). This class would have an addCol() method, which would be called for each <COLn> with the contents of that tag.
In pseudo code:
loop through the rows:
loop through all children of `Row`:
write out the text
append a comma
new line
That quick little loop will write a comma at the end of each line, but I'm sure you can figure out how to remove that.
For actually parsing the XML, I suggest using JDOM. It has a pretty intuitive API.
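The pseudocode above could be sketched in Java roughly as follows. This version uses the DOM parser that ships with the JDK rather than JDOM, purely to avoid an extra dependency, and it joins the cells with String.join so that no trailing comma is written. The class name `XmlToCsvDom` is made up, and no quoting or escaping of values is attempted.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class XmlToCsvDom {
    static String toCsv(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder csv = new StringBuilder();
            // Loop through the rows.
            NodeList rows = doc.getElementsByTagName("Row");
            for (int i = 0; i < rows.getLength(); i++) {
                List<String> cells = new ArrayList<>();
                // Loop through all element children of Row and collect their text.
                for (Node c = rows.item(i).getFirstChild(); c != null; c = c.getNextSibling()) {
                    if (c.getNodeType() == Node.ELEMENT_NODE) {
                        cells.add(c.getTextContent());
                    }
                }
                // Joining avoids the trailing comma; each row ends with a newline.
                csv.append(String.join(",", cells)).append('\n');
            }
            return csv.toString();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not convert XML to CSV", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<Results><Row><COL1></COL1><COL2>25.00</COL2></Row>"
                + "<Row><COL1>RJ</COL1><COL2>26.00</COL2></Row></Results>";
        System.out.print(toCsv(xml));
    }
}
```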
In XSLT 1.0:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="ISO-8859-1" />
<xsl:template match="/Results">
<xsl:apply-templates select="Row" />
</xsl:template>
<xsl:template match="Row">
<xsl:apply-templates select="*" />
<!-- newline after every row except the last -->
<xsl:if test="position() != last()">
<xsl:value-of select="'&#10;'" />
</xsl:if>
</xsl:template>
<xsl:template match="Row/*">
<xsl:value-of select="." />
<!-- comma after every column except the last -->
<xsl:if test="position() != last()">
<xsl:value-of select="','" />
</xsl:if>
</xsl:template>
</xsl:stylesheet>
If your COL* values can contain commas, you could wrap the values in double quotes:
<xsl:template match="Row/*">
<xsl:value-of select="concat('&quot;', ., '&quot;')" />
<!-- ... -->
If they can contain commas and double quotes, things could get a bit more complex due to the required escaping. You know your data, you'll be able to decide how to best format the output. Using a different separator (e.g. TAB or a pipe symbol) is also an option.
Read the XML file in.
Loop through each record and add it to a CSV file.
With XSLT, you can use the JAXP interface to the XSLT processor and then use <xsl:text> in your stylesheet to produce text output. For example,
<xsl:text>
</xsl:text>
generates a newline.
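A minimal sketch of that JAXP route might look like the following. The stylesheet is embedded as a string only to keep the example self-contained, and it is a simplified variant written for this sketch rather than any of the stylesheets quoted in this thread; the class name `XmlToCsvXslt` is likewise hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of the original answer.
class XmlToCsvXslt {
    // method='text' makes the processor emit plain text instead of XML.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/Results'><xsl:apply-templates select='Row'/></xsl:template>"
        + "<xsl:template match='Row'>"
        + "<xsl:for-each select='*'>"
        + "<xsl:value-of select='.'/>"
        + "<xsl:if test='position() != last()'>,</xsl:if>"
        + "</xsl:for-each>"
        + "<xsl:text>&#10;</xsl:text>" // newline after each row
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String toCsv(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Results><Row><COL1></COL1><COL2>25.00</COL2></Row>"
                + "<Row><COL1>RJ</COL1><COL2>26.00</COL2></Row></Results>";
        System.out.print(toCsv(xml));
    }
}
```

To write straight to a file, pass a StreamResult built from a java.io.File instead of the StringWriter.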
Use the straightforward SAX API via the standard Java JAXP package. This will allow you to write a class that receives events for each XML element your reader encounters.
Briefly:
read your XML in using SAX
record text values via the SAX DefaultHandler characters() method
when you get an end event for a COL, record this string value
when you get the Row end event, simply write out a comma-separated line of the previously recorded values
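The steps above can be sketched with the JDK's SAX parser as follows. The class name `XmlToCsvSax` is made up, and the sketch assumes that every column element name starts with "COL" and that rows are named "Row", as in the sample file; values are not quoted or escaped.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler class, not part of the original answer.
class XmlToCsvSax extends DefaultHandler {
    private final StringBuilder csv = new StringBuilder();   // finished output
    private final List<String> cells = new ArrayList<>();    // values of the current row
    private final StringBuilder text = new StringBuilder();  // text of the current element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so append rather than assign.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.startsWith("COL")) {
            cells.add(text.toString());       // end of a column: record its value
        } else if (qName.equals("Row")) {
            csv.append(String.join(",", cells)).append('\n'); // end of a row: write the line
            cells.clear();
        }
        text.setLength(0);
    }

    static String toCsv(String xml) {
        try {
            XmlToCsvSax handler = new XmlToCsvSax();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.csv.toString();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not convert XML to CSV", ex);
        }
    }
}
```

Because SAX streams the document, this approach keeps only one row in memory at a time, which can matter for large files.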
