I am starting a program that uses DOM4j to parse an XML file, save its contents to some tables, and finally let the user manipulate the data.
Unfortunately I am stuck on the most basic step, the parsing.
Here is the portion of my XML I am attempting to include:
<?xml version="1.0"?>
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:camt.054.001.04">
<BkToCstmrDbtCdtNtfctn>
<GrpHdr>
<MsgId>000022222</MsgId>
When I attempt to find the root of my XML it does return the root correctly as "Document". When I attempt to get the child node from Document it also correctly gives me "BkToCstmrDbtCdtNtfctn". The problem is that when I try to go any further and get the child nodes from "Bk" I can't. I get this in the console:
org.dom4j.tree.DefaultElement@2b05039f [Element: <BkToCstmrDbtCdtNtfctn uri: urn:iso:std:iso:20022:tech:xsd:camt.054.001.04 attributes: []/>]
Here is my code; I would appreciate any feedback. Ultimately I want to get the "MsgId" value back, but in general I just want to figure out how to parse deeper into the XML, because in reality it probably has about 25 layers.
public static Document getDocument(final String xmlFileName){
Document document = null;
SAXReader reader = new SAXReader();
try{
document = reader.read(xmlFileName);
}
catch (DocumentException e)
{
e.printStackTrace();
}
return document;
}
public static void main(String args[]){
String xmlFileName = "C:\\Users\\jhamric\\Desktop\\Camt54.xml";
String xPath = "//Document";
Document document = getDocument(xmlFileName);
Element root = document.getRootElement();
List<Node> nodes = document.selectNodes(xPath);
for(Iterator i = root.elementIterator(); i.hasNext();){
Element element = (Element) i.next();
System.out.println(element);
}
for(Iterator i = root.elementIterator("BkToCstmrDbtCdtNtfctn");i.hasNext();){
Element bk = (Element) i.next();
System.out.println(bk);
}
}
}
The best approach is probably to use XPath, but since the XML document uses namespaces, you cannot use the "simple" selectNodes methods in the API. I would create a helper method to easily evaluate any XPath expression on either the Document or the Element level:
public static void main(String[] args) throws Exception {
Document doc = getDocument(...);
Map<String, String> namespaceContext = new HashMap<>();
namespaceContext.put("ns", "urn:iso:std:iso:20022:tech:xsd:camt.054.001.04");
// Select the first GrpHdr element in document order
Element element = (Element) select("//ns:GrpHdr[1]", doc, namespaceContext);
System.out.println(element.asXML());
// Select the text content of the MsgId element
Text msgId = (Text) select("./ns:MsgId/text()", element, namespaceContext);
System.out.println(msgId.getText());
}
static Object select(String expression, Branch contextNode, Map<String, String> namespaceContext) {
XPath xp = contextNode.createXPath(expression);
xp.setNamespaceURIs(namespaceContext);
return xp.evaluate(contextNode);
}
Note that the XPath expression must use namespace prefixes that are mapped to the namespace URIs used in the input document, but the actual value of the prefix doesn't matter: it need not match any prefix that appears in the document itself.
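To illustrate the point about prefixes, here is a minimal sketch that uses the JDK's built-in JAXP XPath rather than dom4j (a deliberate swap so that it runs without extra libraries). The class name PrefixDemo, the helper msgId and the trimmed, closed-off copy of the XML are assumptions made for the example. It evaluates the same query with two different prefixes, both bound to the camt namespace URI:

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class PrefixDemo {
    static final String CAMT_NS = "urn:iso:std:iso:20022:tech:xsd:camt.054.001.04";

    // Trimmed copy of the XML from the question, closed so that it is well-formed.
    static final String XML =
        "<Document xmlns=\"" + CAMT_NS + "\">"
        + "<BkToCstmrDbtCdtNtfctn><GrpHdr><MsgId>000022222</MsgId></GrpHdr>"
        + "</BkToCstmrDbtCdtNtfctn></Document>";

    // Evaluates //<prefix>:MsgId, binding the caller's prefix to the camt namespace.
    static String msgId(final String prefix) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed for namespace-qualified XPath steps
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(XML)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String p) {
                    return prefix.equals(p) ? CAMT_NS : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String uri) {
                    return CAMT_NS.equals(uri) ? prefix : null;
                }
                @Override public Iterator<String> getPrefixes(String uri) {
                    String p = getPrefix(uri);
                    return p == null ? Collections.<String>emptyIterator()
                                     : Collections.singleton(p).iterator();
                }
            });
            // With no return type given, evaluate() returns the string value.
            return xpath.evaluate("//" + prefix + ":MsgId", doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(msgId("ns"));   // prints 000022222
        System.out.println(msgId("camt")); // prints 000022222
    }
}
```

Only the URI has to match the document; the prefix is local to the XPath expression.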
Although I am able to set a text value inside a node with the code below
private static void setPhoneNumber(Document xmlDoc, String phoneNumber) {
Element root = xmlDoc.getDocumentElement();
Element phoneParent = (Element) root.getElementsByTagName("gl-bus:entityPhoneNumber").item(0);
Element phoneElement = (Element) phoneParent.getElementsByTagName("gl-bus:phoneNumber").item(0);
phoneElement.setTextContent(phoneNumber);
}
I cannot do the same with XPath because I get null for the node object
private static void setPhoneNumber(Document xmlDoc, String phoneNumber) {
try {
NodeList nodes = (NodeList) xPath.evaluate("/gl-cor:entityInformation/gl-bus:entityPhoneNumber/gl-bus:phoneNumber", xmlDoc, XPathConstants.NODESET);
Node node = nodes.item(0);
node.setTextContent(phoneNumber);
} catch (XPathExpressionException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
The fact that you're using the non-namespace-aware method getElementsByTagName(), passing it an element name containing a colon, suggests that you're not handling namespaces properly when you parse the XML. If your XML were parsed in a namespace-aware manner then this shouldn't have worked, but something like
String namespace = // the namespace URI bound to the gl-bus prefix in your doc
Element phoneParent = (Element) root.getElementsByTagNameNS(namespace, "entityPhoneNumber").item(0);
would work correctly. Note that the standard Java DocumentBuilderFactory is not namespace aware by default; you must call setNamespaceAware(true) on the factory before you ask it for a newDocumentBuilder.
XPath requires namespace-aware parsing, and if you want to access elements that are in a namespace via XPath then you must supply a NamespaceContext to the XPath object to tell it what prefix bindings to use - it does not inherit the prefix bindings from the original XML. Annoyingly there's no default implementation of NamespaceContext provided in the core Java library so you either have to write your own or use a third-party implementation such as Spring's SimpleNamespaceContext. With that:
SimpleNamespaceContext ctx = new SimpleNamespaceContext();
ctx.bindNamespaceUri("g", namespace); // the same URI as before
ctx.bindNamespaceUri("c", ...); // the namespace bound to gl-cor:
xPath.setNamespaceContext(ctx);
NodeList nodes = (NodeList) xPath.evaluate("/c:entityInformation/g:entityPhoneNumber/g:phoneNumber", xmlDoc, XPathConstants.NODESET);
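The effect of setNamespaceAware can be checked in isolation. The sketch below is only an illustration: the URI urn:example:gl-bus is a hypothetical placeholder (the question does not show the real one), and the class name ParseModeDemo, the sample phone number and the helpers are invented for the example. It parses the same small document in both modes and counts what getElementsByTagNameNS finds:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class ParseModeDemo {
    // Hypothetical URI: substitute the one bound to gl-bus in the real document.
    static final String GL_BUS = "urn:example:gl-bus";

    static final String XML =
        "<gl-bus:entityPhoneNumber xmlns:gl-bus=\"" + GL_BUS + "\">"
        + "<gl-bus:phoneNumber>555-0100</gl-bus:phoneNumber>"
        + "</gl-bus:entityPhoneNumber>";

    // Parses the sample with or without namespace awareness.
    static Document parse(boolean namespaceAware) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(namespaceAware);
            return dbf.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts phoneNumber elements found by namespace URI plus local name.
    static int countByNamespace(Document doc) {
        return doc.getElementsByTagNameNS(GL_BUS, "phoneNumber").getLength();
    }

    public static void main(String[] args) {
        System.out.println(countByNamespace(parse(true)));  // prints 1
        System.out.println(countByNamespace(parse(false))); // prints 0
    }
}
```

With the default, non-namespace-aware factory the elements carry no namespace URI or local name, so the namespace-qualified lookup comes back empty.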
I've come across a problem that I've looked up on Stack Overflow, but none of the solutions seem to solve it for me.
I'm retrieving XML data from Yahoo and it comes back as below (truncated for brevity's sake).
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<fantasy_content xmlns="http://fantasysports.yahooapis.com/fantasy/v2/base.rng" xmlns:yahoo="http://www.yahooapis.com/v1/base.rng" copyright="Data provided by Yahoo! and STATS, LLC" refresh_rate="31" time="55.814027786255ms" xml:lang="en-US" yahoo:uri="http://fantasysports.yahooapis.com/fantasy/v2/league/328.l.108462/settings">
<league>
<league_key>328.l.108462</league_key>
<league_id>108462</league_id>
<draft_status>postdraft</draft_status>
</league>
</fantasy_content>
I've been having a problem getting XPath to retrieve any elements so I've written a unit test to try to resolve it and it looks like:
final File file = new File("league-settings.xml");
javax.xml.parsers.DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setNamespaceAware(true);
javax.xml.parsers.DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
org.w3c.dom.Document doc = dBuilder.parse(file);
javax.xml.xpath.XPath xPath = XPathFactory.newInstance().newXPath();
xPath.setNamespaceContext(new YahooNamespaceContext());
final String expression = "yfs:league";
final XPathExpression expr = xPath.compile(expression);
Object nodes = expr.evaluate(doc, XPathConstants.NODESET);
assert(nodes instanceof NodeList);
NodeList leagueNodes = (NodeList)nodes;
int leaguesLength = leagueNodes.getLength();
assertEquals(leaguesLength, 1);
The YahooNamespaceContext class I created to map the namespaces looks as follows:
public class YahooNamespaceContext implements NamespaceContext {
public static final String YAHOO_NS = "http://www.yahooapis.com/v1/base.rng";
public static final String DEFAULT_NS = "http://fantasysports.yahooapis.com/fantasy/v2/base.rng";
public static final String YAHOO_PREFIX = "yahoo";
public static final String DEFAULT_PREFIX = "yfs";
private final Map<String, String> namespaceMap = new HashMap<String, String>();
public YahooNamespaceContext() {
namespaceMap.put(DEFAULT_PREFIX, DEFAULT_NS);
namespaceMap.put(YAHOO_PREFIX, YAHOO_NS);
}
public String getNamespaceURI(String prefix) {
return namespaceMap.get(prefix);
}
public String getPrefix(String uri) {
throw new UnsupportedOperationException();
}
public Iterator<String> getPrefixes(String uri) {
throw new UnsupportedOperationException();
}
}
Any help from people with more experience with XML namespaces, or debugging tips for XPath compilation/evaluation, would be appreciated.
If the problem is that you're getting zero as the length of the result nodelist, have you tried changing
final String expression = "yfs:league";
to
final String expression = "//yfs:league";
?
It appears that the context for evaluating your XPath expressions, doc, is the root node of the document. dBuilder.parse(file) returns the document root node, not the outermost element (a.k.a. document element). Remember, in XPath, a root node is not an element. So doc is not the yfs:fantasy_content element node but is its (invisible) parent.
In that context, the XPath expression "yfs:league" will only select an element that is a direct child of that root node, of which there is no yfs:league -- only yfs:fantasy_content.
The XPath expression yfs:league is equivalent to child::yfs:league. It means: find direct children nodes (not descendants) of doc with the specified local name (league) and namespace URI (http://fantasysports.yahooapis.com/fantasy/v2/base.rng).
You must take into account the outermost element (fantasy_content) or search for descendant instead of child nodes.
Replacing
final String expression = "yfs:league";
with
final String expression = "yfs:fantasy_content/yfs:league";
or with
final String expression = "//yfs:league";
will solve the problem.
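The difference between the root node and the document element can be seen in a small sketch. To keep the focus on the context node, this example drops the namespaces from the Yahoo document, which is a simplification and not what the real feed looks like. The class name ContextNodeDemo and the helper count are invented for the illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ContextNodeDemo {
    // Namespace-free stand-in for the document in the question.
    static final String XML =
        "<fantasy_content><league><league_id>108462</league_id></league></fantasy_content>";

    // Counts the nodes selected by the expression, starting either from the
    // document root node or from the document element (fantasy_content).
    static int count(String expression, boolean fromDocumentElement) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            Node context = fromDocumentElement ? doc.getDocumentElement() : doc;
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, context, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("league", false));                 // prints 0
        System.out.println(count("league", true));                  // prints 1
        System.out.println(count("fantasy_content/league", false)); // prints 1
        System.out.println(count("//league", false));               // prints 1
    }
}
```

The same reasoning applies to the namespaced original once the yfs prefix is bound through a NamespaceContext.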
I just started to try Jaxp13XPathTemplate but I'm a bit confused on parsing the XML.
Here is the sample XML
<fxDataSets>
<fxDataSet name="NAME_A">
<link rel="self" href="http://localhost:8080/linkA"/>
<baseCurrency>EUR</baseCurrency>
<description>TEST DESCRIPTION A</description>
</fxDataSet>
<fxDataSet name="NAME_B">
<link rel="self" href="http://localhost:8080/linkB"/>
<baseCurrency>EUR</baseCurrency>
<description>TEST DESCRIPTION B</description>
</fxDataSet>
</fxDataSets>
I'm already able to get NAME_A and NAME_B however I'm not able to get the description for both Node.
Here is what I have come up with.
XPathOperations xpathTemplate = new Jaxp13XPathTemplate();
String fxRateURL = "http://localhost:8080/rate/datasets";
RestTemplate restTemplate = new RestTemplate();
Source fxRate = restTemplate.getForObject(fxRateURL,Source.class);
List<Map<String, Object>> currencyList = xpathTemplate.evaluate("//fxDataSet", fxRate , new NodeMapper() {
public Object mapNode(Node node, int i) throws DOMException
{
Map<String, Object> singleFXMap = new HashMap<String, Object>();
Element fxDataSet = (Element) node;
String id = fxDataSet.getAttribute("name");
/* This part is not working
if(fxDataSet.hasChildNodes())
{
NodeList nodeList = fxDataSet.getChildNodes();
int length = nodeList.getLength();
for(int index=0;i<length;i++)
{
Node childNode = nodeList.item(index);
System.out.println("childNode name"+childNode.getLocalName()+":"+childNode.getNodeValue());
}
}*/
return new Object();
}
});
Try using the dom4j library and its SAXReader.
InputStream is = FileUtils.class.getResourceAsStream("file.xml");
SAXReader reader = new SAXReader();
org.dom4j.Document doc = reader.read(is);
is.close();
Element content = doc.getRootElement(); //this will return the root element in your xml file
List<Element> methodEls = content.elements("element"); // this will return a List of all Elements with name "element"
Take a look at the signature: public <T> List<T> evaluate(String expression, Source context, NodeMapper<T> nodeMapper)
evaluate takes a NodeMapper<T> as one of its parameters
and returns a List<T>.
But in your code snippet,
you pass a raw new NodeMapper() as the parameter
yet assign the result to a List<Map<String, Object>>, which does not match the contract of the API.
Probable solution:
I am assuming you want to return an object of type FxDataSet that wraps a <fxDataSet>...</fxDataSet> element. If so:
pass new NodeMapper<FxDataSet>() as the parameter;
use List<FxDataSet> currencyList = ... as the left-hand side;
change the method signature to public FxDataSet mapNode(Node node, int i) throws DOMException.
Also take a look at the documentation for NodeMapper.
I have not used Jaxp13XPathTemplate myself, but this is a general Java generics issue, which is how I spotted what was wrong. I hope this solution works.
If you want to get at the child nodes of the fxDataSet element you should be able to do:
Node descriptionNode= fxDataSet.getElementsByTagName("description").item(0);
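As a complement, here is a hedged sketch of walking the children of one fxDataSet with the plain DOM API. Note that the commented-out loop in the question tests and increments i rather than index. In addition, getNodeValue() is null for element nodes, and getLocalName() can be null when the document was not parsed in a namespace-aware way, so this sketch uses getNodeName() and getTextContent() instead. The class name ChildNodesDemo, the helpers and the trimmed sample are assumptions for the example:

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ChildNodesDemo {
    // Trimmed sample; the line breaks produce whitespace text nodes between elements.
    static final String XML =
        "<fxDataSets>\n"
        + "  <fxDataSet name=\"NAME_A\">\n"
        + "    <baseCurrency>EUR</baseCurrency>\n"
        + "    <description>TEST DESCRIPTION A</description>\n"
        + "  </fxDataSet>\n"
        + "</fxDataSets>";

    // Collects element-name -> text for the direct element children of parent.
    static Map<String, String> childValues(Element parent) {
        Map<String, String> values = new LinkedHashMap<>();
        NodeList children = parent.getChildNodes();
        for (int index = 0; index < children.getLength(); index++) {
            Node child = children.item(index);
            // Skip whitespace text nodes and anything else that is not an element.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                values.put(child.getNodeName(), child.getTextContent());
            }
        }
        return values;
    }

    // Parses the sample and maps the children of the first fxDataSet.
    static Map<String, String> firstDataSet() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return childValues((Element) doc.getElementsByTagName("fxDataSet").item(0));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints {baseCurrency=EUR, description=TEST DESCRIPTION A}
        System.out.println(firstDataSet());
    }
}
```

Inside mapNode, the same childValues call could be applied to the fxDataSet element that the mapper receives.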
I have an XML document, and an XPath expression for that doc. I have to update the doc by using XPath at runtime.
How can I do this using Java?
The below is my xml:
<?xml version="1.0" encoding="ISO-8859-1"?>
<PersonList>
<Person>
<Name>Sonu Kapoor</Name>
<Age>24</Age>
<Gender>M</Gender>
<PostalCode>54879</PostalCode>
</Person>
<Person>
<Name>Jasmin</Name>
<Age>28</Age>
<Gender>F</Gender>
<PostalCode>78745</PostalCode>
</Person>
<Person>
<Name>Josef</Name>
<Age>232</Age>
<Gender>F</Gender>
<PostalCode>53454</PostalCode>
</Person>
</PersonList>
I have to change the values of name and age under //PersonList/Person[2]/Name.
Use setNodeValue. First, get a NodeList, for example:
myNodeList = (NodeList) xpath.compile("//MyXPath/text()")
.evaluate(myXmlDoc, XPathConstants.NODESET);
Then set the value of e.g. the first node:
myNodeList.item(0).setNodeValue("Hi mom!");
More examples e.g. here.
As mentioned in two other answers here, as well as in your previous question: technically, XPath is not a way to "update" an XML document, but only to locate nodes within an XML document. But I presume the above is what you want.
EDIT: Responding to your comment... Are you asking how to write your DOM to an XML file after you've finished editing the DOM? If so, here are two examples of how to do it:
http://www.java2s.com/Code/Java/XML/WriteDOMout.htm
http://download.oracle.com/javaee/1.4/tutorial/doc/JAXPXSLT4.html
You can delete the file and create a new one.
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
new InputSource("data.xml"));
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate("//employee/name[text()='old']", doc,
XPathConstants.NODESET);
for (int idx = 0; idx < nodes.getLength(); idx++) {
nodes.item(idx).setTextContent("new value");
}
Transformer xformer = TransformerFactory.newInstance().newTransformer();
xformer.transform(new DOMSource(doc), new StreamResult(new File("data_new.xml")));
XPath is used to select parts of an XML document. It has no provision for updating. But since the Java XPath API returns DOM objects (Nodes, of which Elements are one kind), you can then use DOM methods to alter the document.
XPath can be used to select nodes in a document, not for modification
You apply the xpath expression to your document and get an element (in your case). Once you have this Element, you can use the Element methods to change values (name and age in your case)
Starting from a NodeList it should work like that:
NodeList nodes = getNodeListFromXPathExpression(); // you know how
if (nodes.getLength() == 0)
return; // empty nodelist, xpath didn't select anything
Node first = nodes.item(0); // take the first from the list, your element
// this is a shortcut for your example:
// first is the actual selected element (a node)
// .getFirstChild() returns the first child node, the "text node" (="Jasmin", ="28")
// .setNodeValue() replace the actual value of that text node with a new string
first.getFirstChild().setNodeValue("New Name or new age");
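Applied to the PersonList document from the question, the select-then-mutate approach might look like the sketch below. It is only an illustration: the class name UpdatePersonDemo, the helper update and the shortened inline XML are assumptions, and the XML is serialized back to a String here instead of a file:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class UpdatePersonDemo {
    // Shortened version of the PersonList from the question.
    static final String XML =
        "<PersonList>"
        + "<Person><Name>Sonu Kapoor</Name><Age>24</Age></Person>"
        + "<Person><Name>Jasmin</Name><Age>28</Age></Person>"
        + "</PersonList>";

    // Locates the second Person with XPath, then edits it through the DOM.
    static String update(String newName, String newAge) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            Node person = (Node) xpath.evaluate(
                    "/PersonList/Person[2]", doc, XPathConstants.NODE);
            if (person != null) {
                // Relative paths are evaluated against the selected Person.
                Node name = (Node) xpath.evaluate("Name", person, XPathConstants.NODE);
                Node age = (Node) xpath.evaluate("Age", person, XPathConstants.NODE);
                if (name != null) name.setTextContent(newName);
                if (age != null) age.setTextContent(newAge);
            }

            // Serialize the modified DOM back to text.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(update("Anonymous", "30"));
    }
}
```

XPath only does the locating; the change itself is an ordinary DOM call, and the first Person is left untouched.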
Consider using XQuery Update instead of XPath. This allows you to write
replace value of node //PersonList/Person[2]/Name with "Anonymous"
This is much easier than using the Java DOM API.
I've created a small project for using XPATH to create/update XML:
https://github.com/shenghai/xmodifier
the code to change your xml is like:
Document document = readDocument("personList.xml");
XModifier modifier = new XModifier(document);
modifier.addModify("//PersonList/Person[2]/Name", "newName");
modifier.modify();
This is a handy function that modifies the value of any tag in an XML document using its XPath. You pass three arguments (xml, xpathExpression and newValue) and it returns the XML as a String with the modified value.
If you want to pass the XML as a file, you need to change the function accordingly, but the logic will be the same.
public String updateXML(String xml, String xpathExpression, String newValue)
{
try {
//Creating document builder
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document document = builder.parse(new org.xml.sax.InputSource(new StringReader(xml)));
//Evaluating xpath expression using Element
XPath xpath = XPathFactory.newInstance().newXPath();
Element element = (Element)xpath.evaluate(xpathExpression, document, XPathConstants.NODE);
//Setting value in the text
element.setTextContent(newValue);
//Transformation of document to xml
StringWriter stringWriter = new StringWriter();
Transformer xformer = TransformerFactory.newInstance().newTransformer();
xformer.transform(new DOMSource(document), new StreamResult(stringWriter));
xml = stringWriter.toString();
}
catch (Exception e)
{
e.printStackTrace();
}
return xml;
}
Here is the code to change the content with vtd-xml... vtd-xml is unique in that it is the only API that offers incremental update capability.
import com.ximpleware.*;
import java.io.*;
public class changeName {
public static void main(String s[]) throws VTDException,java.io.UnsupportedEncodingException,java.io.IOException{
VTDGen vg = new VTDGen();
if (!vg.parseFile("input.xml", false))
return;
VTDNav vn = vg.getNav();
AutoPilot ap = new AutoPilot(vn);
XMLModifier xm = new XMLModifier(vn);
ap.selectXPath("//PersonList/Person[2]");
int i=0;
while((i=ap.evalXPath())!=-1){
if (vn.toElement(VTDNav.FIRST_CHILD,"Name")){
int k=vn.getText();
if (i!=-1)
xm.updateToken(k, "Jonathan");
vn.toElement(VTDNav.PARENT);
}
if (vn.toElement(VTDNav.FIRST_CHILD,"Age")){
int k=vn.getText();
if (i!=-1)
xm.updateToken(k, "42");
vn.toElement(VTDNav.PARENT);
}
}
xm.output("new.xml");
}
}
I have an XML file that contains lots of different nodes. Some in particular are nested like this:
<emailAddresses>
<emailAddress>
<value>sambj1981#gmail.com</value>
<typeSource>WORK</typeSource>
<typeUser></typeUser>
<primary>false</primary>
</emailAddress>
<emailAddress>
<value>sambj#hotmail.co.uk</value>
<typeSource>HOME</typeSource>
<typeUser></typeUser>
<primary>true</primary>
</emailAddress>
</emailAddresses>
From the above node, what I want to do is go through each emailAddress, get the values inside it (value, typeSource, typeUser etc.) and put them in a POJO.
I tried the XPath expression "//emailAddress", but it doesn't return the tags inside it. Maybe I am doing it wrong; I am pretty new to XPath.
I could do something like this:
//emailAddress/value | //emailAddress/typeSource | .. but, if I'm not mistaken, that would list all the element values together, leaving me to work out where one emailAddress tag ends and the next begins.
To sum up, I basically want the results returned the way a bog-standard SQL query returns rows: if the query produces 10 emailAddress entries, each one comes back as a row, and I can simply iterate over each emailAddress and get the appropriate value by column name or index.
No,
//emailAddress
doesn't return the tags inside, that is correct. What it does return is a NodeList/NodeSet. To actually get the values you can do something like this:
String emailpath = "//emailAddress";
String emailvalue = ".//value";
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xpath = xPathFactory.newXPath();
Document document;
public XpathStuff(String file) throws ParserConfigurationException, IOException, SAXException {
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = docFactory.newDocumentBuilder();
BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file));
document = builder.parse(bis);
NodeList nodeList = getNodeList(document, emailpath);
for(int i = 0; i < nodeList.getLength(); i++){
System.out.println(getValue(nodeList.item(i), emailvalue));
}
bis.close();
}
public NodeList getNodeList(Document doc, String expr) {
try {
XPathExpression pathExpr = xpath.compile(expr);
return (NodeList) pathExpr.evaluate(doc, XPathConstants.NODESET);
} catch (XPathExpressionException e) {
e.printStackTrace();
}
return null;
}
//extracts the String value for the given expression
private String getValue(Node n, String expr) {
try {
XPathExpression pathExpr = xpath.compile(expr);
return (String) pathExpr.evaluate(n,
XPathConstants.STRING);
} catch (XPathExpressionException e) {
e.printStackTrace();
}
return null;
}
Maybe I should point out that when iterating over the NodeList, the leading dot in .//value means the current context node. Without the dot, the expression is evaluated from the document root, so you would get the first node every time.
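Building on the relative-path idea, a "one row per emailAddress" mapping might be sketched as follows. The EmailAddress holder class, the class name EmailRowsDemo and the example.com addresses are placeholders invented for the illustration, not part of the original data:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class EmailRowsDemo {
    // Simple holder for one "row"; a real POJO would likely have getters.
    static final class EmailAddress {
        final String value;
        final String typeSource;
        final boolean primary;
        EmailAddress(String value, String typeSource, boolean primary) {
            this.value = value;
            this.typeSource = typeSource;
            this.primary = primary;
        }
    }

    // Placeholder data shaped like the XML in the question.
    static final String XML =
        "<emailAddresses>"
        + "<emailAddress><value>work@example.com</value><typeSource>WORK</typeSource>"
        + "<typeUser></typeUser><primary>false</primary></emailAddress>"
        + "<emailAddress><value>home@example.com</value><typeSource>HOME</typeSource>"
        + "<typeUser></typeUser><primary>true</primary></emailAddress>"
        + "</emailAddresses>";

    // Selects every emailAddress, then reads its fields with relative paths.
    static List<EmailAddress> read() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList rows = (NodeList) xpath.evaluate(
                    "//emailAddress", doc, XPathConstants.NODESET);

            List<EmailAddress> result = new ArrayList<>();
            for (int i = 0; i < rows.getLength(); i++) {
                Node row = rows.item(i);
                // No leading slash: each path is resolved against the current row.
                result.add(new EmailAddress(
                        xpath.evaluate("value", row),
                        xpath.evaluate("typeSource", row),
                        Boolean.parseBoolean(xpath.evaluate("primary", row))));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (EmailAddress e : read()) {
            System.out.println(e.value + " " + e.typeSource + " " + e.primary);
        }
    }
}
```

Each iteration sees only its own emailAddress, which gives the row-like behaviour the question asks for.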
//emailAddress/*
will get these nodes in the document order.
It depends on how you want to iterate through the nodes. We do all our XML using XOM (http://www.xom.nu/) which is an easy reliable Java package. It's possible to write your own strategy using XOM calls.
If you use XStream you can set it up quite easily. Like so:
@XStreamAlias("emailAddress") // matches the element name in the XML
public class EmailAddress {
@XStreamAlias("value")
private String value;
@XStreamAlias("typeSource")
private String typeSource;
@XStreamAlias("typeUser")
private String typeUser;
@XStreamAlias("primary")
private boolean primary;
// ... the rest omitted for brevity
}
You then marshal & unmarshal quite simply like so:
XStream xstream = new XStream();
xstream.processAnnotations( EmailAddress.class );
xstream.toXML( /* Object value here */ emailAddress );
xstream.fromXML( /* String xml value here */ "" );
IDK if you have to use XPath or not, but if not I'd consider an out of the box solution like this.
I am totally aware this is not what you were asking for, but you may consider using jibx. This is a tool for mapping human-readable XML to POJOs.
So I believe you could generate mapping for your email structure in a quick way and let the jibx do the work for you.