How to navigate a DOM tree from XML in Java

Below is a sample of the XML I am using; I have stripped out some of the fields as they are unnecessary to demonstrate my point.
I am trying to parse the orders from this XML. However, I encounter a problem when I try to parse the product sets for each order. When the first order is processed, instead of adding the 2 sets detailed below, it adds all the sets it can find in the XML to the first order. I am not sure how to get around this, as this is all quite new to me. Below is my Java...
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
doc.getDocumentElement().normalize();
// Create a list of orders and sub elements
System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
nList = doc.getElementsByTagName("order");
setList = doc.getElementsByTagName("set");
orders = new Order[nList.getLength()];
Node nNode = nList.item(i);
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
temp = new Order();
// Populate order with details from XML
parseClientDetails(eElement);
// Add sets
parseSets();
temp.setSets(setArray);
orders[i] = temp;
}
...
private void parseSets() {
Node nNode;
Element element;
for (int c = 0; c < setList.getLength(); c++) {
nNode = setList.item(c);
element = (Element) nNode;
tempSet = new Set();
tempSet.setBandwidth(getValue("bandwidth", element));
tempSet.setCategory(getValue("category", element));
tempSet.setSet_package(getValue("package", element));
setArray.add(tempSet);
}
}
XML:
<orderSet>
<order>
<customer name="SelectCustomerDetails">
<clientId>UK12345</clientId>
<etc>...</etc>
</customer>
<product>
<set>
<category>Silver</category>
<package>3000IP</package>
<bandwidth>160</bandwidth>
</set>
<set>
<category>Silver</category>
<package>3000IP</package>
<bandwidth>320</bandwidth>
</set>
</product>
</order>
<order>
...
</order>
</orderSet>

The problem is that you are calling doc.getElementsByTagName("set") which gives you a list of all sets in the entire document. Instead, you need to call it on each order, like this:
nList = doc.getElementsByTagName("order");
orders = new Order[nList.getLength()];
Node nNode = nList.item(i);
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
//get the sets for the current order only
NodeList setList = eElement.getElementsByTagName("set");
//now process the sets
}
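To make the scoped lookup concrete, here is a minimal, self-contained sketch. The class name OrderSetsSketch, the method bandwidthsPerOrder and the third bandwidth value in the sample are made up for illustration; only the element names come from the question. It also builds a fresh list for every order, on the assumption that a shared list would otherwise keep accumulating sets across orders.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class OrderSetsSketch {
    // Returns, for each <order>, the bandwidth values of the <set> elements inside it.
    static List<List<String>> bandwidthsPerOrder(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList orders = doc.getElementsByTagName("order");
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < orders.getLength(); i++) {
            Element order = (Element) orders.item(i);
            // Scoped lookup: only <set> descendants of this <order> are returned.
            NodeList sets = order.getElementsByTagName("set");
            List<String> bandwidths = new ArrayList<>(); // fresh list per order
            for (int c = 0; c < sets.getLength(); c++) {
                Element set = (Element) sets.item(c);
                bandwidths.add(set.getElementsByTagName("bandwidth").item(0).getTextContent());
            }
            result.add(bandwidths);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Reduced sample; the second order and its value 640 are invented for the demo.
        String xml = "<orderSet>"
                + "<order><product>"
                + "<set><bandwidth>160</bandwidth></set>"
                + "<set><bandwidth>320</bandwidth></set>"
                + "</product></order>"
                + "<order><product>"
                + "<set><bandwidth>640</bandwidth></set>"
                + "</product></order>"
                + "</orderSet>";
        System.out.println(bandwidthsPerOrder(xml)); // prints [[160, 320], [640]]
    }
}
```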

You can use the 'javax.xml.xpath' APIs to get the content you need from the XML document. These APIs were introduced in Java SE 5 and provide much more control than 'getElementsByTagName'.
Example
What is best way to change one value in XML files in Java?
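As a rough illustration of the XPath route for the orders document above, the sketch below evaluates a relative expression against each order node, so only that order's sets are matched. The class and method names (XPathSetsSketch, setCountsPerOrder) are hypothetical, and the paths assume the orderSet/order/product/set layout from the question.

```java
import java.io.StringReader;
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSetsSketch {
    // Counts the <set> elements that belong to each <order>.
    static int[] setCountsPerOrder(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Absolute path: every <order> directly under the root <orderSet>.
        NodeList orders = (NodeList) xpath.evaluate("/orderSet/order", doc, XPathConstants.NODESET);
        int[] counts = new int[orders.getLength()];
        for (int i = 0; i < orders.getLength(); i++) {
            // Relative path: evaluated with the current <order> as the context node.
            NodeList sets = (NodeList) xpath.evaluate("product/set", orders.item(i), XPathConstants.NODESET);
            counts[i] = sets.getLength();
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<orderSet>"
                + "<order><product><set/><set/></product></order>"
                + "<order><product><set/></product></order>"
                + "</orderSet>";
        System.out.println(Arrays.toString(setCountsPerOrder(xml))); // prints [2, 1]
    }
}
```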

Related

Best way to reach the tag I want in an XML file when it's repeated?

First post here. I have an XML file that includes the tag "usine" multiple times, and the way I'm handling it does not seem right, so I want to see if there's a better way to do it. This is my first time working with XML and Node/NodeList, so I'm still getting familiar with it.
Here is the XML file:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<metadonnees>
<usine type="usine-matiere">
<icones>
<icone type="vide" path="src/ressources/UMP0%.png"/>
<icone type="un-tiers" path="src/ressources/UMP33%.png"/>
<icone type="deux-tiers" path="src/ressources/UMP66%.png"/>
<icone type="plein" path="src/ressources/UMP100%.png"/>
</icones>
<sortie type = "metal"/>
<interval-production>100</interval-production>
</usine>
<usine type="usine-aile">
<icones>
<icone type="vide" path="src/ressources/UT0%.png"/>
<icone type="un-tiers" path="src/ressources/UT33%.png"/>
<icone type="deux-tiers" path="src/ressources/UT66%.png"/>
<icone type="plein" path="src/ressources/UT100%.png"/>
</icones>
<entree type="metal" quantite="2"/>
<sortie type="aile"/>
<interval-production>50</interval-production>
</usine>
</metadonnees>
<simulation>
<usine type="usine-matiere" id="11" x="32" y="32"/>
<usine type="usine-aile" id="21" x="320" y="32"/>
<chemins>
<chemin de="11" vers="21" />
<chemin de="21" vers="41" />
</chemins>
</simulation>
</configuration>
For example, if I want to retrieve the x value of 'usine type="usine-aile"' in the simulation tag, here is the code I use:
NodeList nList = doc.getElementsByTagName("simulation");
Node positionNode = nList.item(0);
Element elementPosition = (Element) positionNode;
NodeList cooList = elementPosition.getElementsByTagName("usine");
Node cooNode = cooList.item(0);
Element cooElem = (Element) cooNode;
System.out.println(cooElem.getAttribute("x"));
Basically I have to make two NodeLists because the <usine> I want is the one in the <simulation> tag and not the one in the <metadonnees> tag, so the first NodeList is to locate me in the <simulation> tag, then I go deeper by making a new NodeList to find the <usine> I want. Is there a better way to do this? I'm probably doing it the wrong way, so I'd like to hear your answers. Thanks
First, get the elements by tag name (e.g. the usine elements); this returns a NodeList for that tag. For example, simulation contains two usine tags, so a lookup scoped to it returns a NodeList of length 2.
Second, iterate over this NodeList and do whatever you want with each node (element); for example, you can read the attributes of each usine tag (x, y, id).
In summary:
1- Get the elements by tag name (NodeList)
2- Iterate over the NodeList
3- Process the nodes (e.g. get the attributes x, y, id of each node during the iteration)
I coded your scenario as follows:
public static void main(String argv[]) throws ParserConfigurationException, IOException, SAXException {
//Read xml file
File fXmlFile = new File("/test.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
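//Get the usine nodes inside <simulation> only; doc.getElementsByTagName("usine") would also match the ones in <metadonnees>
Element simulation = (Element) doc.getElementsByTagName("simulation").item(0);
NodeList nodeList = simulation.getElementsByTagName("usine");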
//Iterate nodeList
for (int temp = 0; temp < nodeList.getLength(); temp++) {
//Get each node and process it
Node node = nodeList.item(temp);
if (node.getNodeType() == Node.ELEMENT_NODE) {
//Print attributes of the node
Element element = (Element) node;
System.out.println("X = " + element.getAttribute("x"));
System.out.println("Y = " + element.getAttribute("y"));
System.out.println("ID = " + element.getAttribute("id"));
}
}
}
In this post we are using the DOM parser to parse XML. You can also look into other XML processing APIs such as the SAX parser, the StAX parser and JAXB; SAX and StAX are streaming parsers and usually need less memory than DOM on large documents.
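If the goal is one specific attribute, such as the x of the usine-aile entry under simulation, an XPath predicate can select it directly instead of indexing into a NodeList. This is only a sketch: the class and method names (UsineXPathSketch, xOfUsine) are hypothetical, the path assumes the configuration/simulation/usine layout shown in the question, and the type value is spliced into the expression, so it must not contain a single quote.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class UsineXPathSketch {
    // Returns the x attribute of the <usine> under <simulation> whose type matches,
    // or an empty string when nothing matches.
    static String xOfUsine(String xml, String type) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The predicate [@type='...'] filters by attribute value; /@x selects the attribute.
        String expression = "/configuration/simulation/usine[@type='" + type + "']/@x";
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<configuration>"
                + "<metadonnees><usine type=\"usine-aile\"/></metadonnees>"
                + "<simulation>"
                + "<usine type=\"usine-matiere\" id=\"11\" x=\"32\" y=\"32\"/>"
                + "<usine type=\"usine-aile\" id=\"21\" x=\"320\" y=\"32\"/>"
                + "</simulation>"
                + "</configuration>";
        System.out.println(xOfUsine(xml, "usine-aile")); // prints 320
    }
}
```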

Tag names are missing in the XML parsed except the root element

I am reading a dynamic XML file (without any known structure) and putting each relevant tag name and value into a HashMap (e.g. metadata<tagName, Value>).
My issue is that I cannot get the tag names: it only adds the root tag name, together with all the values of the entire XML.
My XML is:
<?xml version="1.0" encoding="UTF-8"?>
<form kwf="VARA">
<sec1>
<docID>2d2c5bf209b79d8b1a1f840ce4ce4030e66a76d6</docID>
<qrCode>xx.jpg</qrCode>
<title>NOOO FORM NAME</title>
<ELO_VARAFNAME>NO</ELO_VARAFNAME>
<ELO_VARALNAME>NAME</ELO_VARALNAME>
<ELO_VARAEMAIL>noname#gmail.com</ELO_VARAEMAIL>
<ELO_VARAORBEONDOCID>2d2c5bf209b79d8b1a1f840ce4ce4030e66a76d6</ELO_VARAORBEONDOCID>
</sec1>
</form>
My Code is:
public static Map<String,String> getMetaDataFromOrbeonXML(File fXmlFile) throws SAXException, ParserConfigurationException, IOException
{
Map metaData = new HashMap();
String formName="";
String docID = "";
try {
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("form");
for (int temp = 0; temp < nList.getLength(); temp++) {
Node nNode = nList.item(temp);
System.out.println("\nCurrent Element :" + nNode.getNodeName());
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
docID = eElement.getElementsByTagName("docID").item(0).getTextContent();
metaData.put("docID", docID);
metaData.put("appName", APP_NAME);
metaData.put(eElement.getTagName(), eElement.getTextContent());
System.out.println("META DATA MAP: "+ metaData.toString());
}
}
} catch (Exception e) {
e.printStackTrace();
}
return metaData;
}
And the output is:
{form= 2d2c5bf209b79d8b1a1f840ce4ce4030e66a76d6
xx.jpg
NOOO FORM NAME
NO
NAME
noname#gmail.com
2d2c5bf209b79d8b1a1f840ce4ce4030e66a76d6
, docID=2d2c5bf209b79d8b1a1f840ce4ce4030e66a76d6, appName=VIRGINAUSI, formName=AITSLForm}
Tag names are missing in the map except the root element. Please help !
The code above does what it is written to do. The entry keyed by the root tag maps the element form to its text content (which is the concatenation of the text content of all its descendant nodes).
If you want to access the descendant nodes, you'll need to use eElement.getChildNodes() and iterate over the NodeList returned.
This might be useful:
Java: Most efficient method to iterate over all elements in a org.w3c.dom.Document?
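One possible way to walk the children, sketched under the assumption that only leaf elements (elements with no child elements) should end up in the map. The names LeafElementsSketch and leafValues are invented for this example, and repeated tag names would overwrite each other in the map.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LeafElementsSketch {
    // Maps the tag name of every leaf element in the document to its trimmed text.
    static Map<String, String> leafValues(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Map<String, String> out = new LinkedHashMap<>();
        collect(doc.getDocumentElement(), out);
        return out;
    }

    private static void collect(Element element, Map<String, String> out) {
        NodeList children = element.getChildNodes();
        boolean hasChildElement = false;
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // getChildNodes() also returns text and comment nodes, so filter by type.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElement = true;
                collect((Element) child, out);
            }
        }
        if (!hasChildElement) {
            out.put(element.getTagName(), element.getTextContent().trim());
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<form kwf=\"VARA\"><sec1>"
                + "<docID>abc</docID><title>NOOO FORM NAME</title>"
                + "</sec1></form>";
        System.out.println(leafValues(xml)); // prints {docID=abc, title=NOOO FORM NAME}
    }
}
```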

Unable to read the attribute using DOM parser

I am having issues when reading the attribute of a link.
This is the structure of my XML:
<entry>
<updated>
<title>
<link href="">
</entry>
I managed to read the date and title correctly, but reading the href attribute of the link is not working.
Here is my code:
NodeList nList = doc.getElementsByTagName("entry");
System.out.println("============================");
for (int temp = 0; temp < nList.getLength(); temp++)
{
Node node = nList.item(temp);
System.out.println(""); //Just a separator
if (node.getNodeType() == Node.ELEMENT_NODE)
{
Element eElement = (Element) node;
System.out.println("Date : " + eElement.getElementsByTagName("updated").item(0).getTextContent());
System.out.println("Title : " + eElement.getElementsByTagName("title").item(0).getTextContent());
// The below code is for reading href attribute of link,
NodeList node1 = eElement.getElementsByTagName("link");
Element eElement1 = (Element) node1;
System.out.println(eElement1.getAttribute("href"));
}
}
I am creating a new NodeList for the attributes of link, but the code is not working.
Error:
java.lang.ClassCastException: com.sun.org.apache.xerces.internal.dom.DeepNodeListImpl cannot be cast to org.w3c.dom.Element
at Demo.main(Demo.java:45)
A NodeList is not an Element and cannot be cast to one (successfully), so this code isn't going to work:
NodeList node1 = eElement.getElementsByTagName("link");
Element eElement1 = (Element) node1;
A NodeList is, as the name suggests, a list of nodes (and in your case, the nodes will be Elements). So this code would work for the first link:
NodeList list = eElement.getElementsByTagName("link");
Element eElement1 = (Element) list.item(0);
...whereupon your getAttribute should work fine, as Element has getAttribute.
Side note: If your library has support for newer query functions, you could also do this:
String href = ((Element)eElement.querySelector("link")).getAttribute("href");
...because querySelector returns just the first match (not a list) (or null if no matches; if that's a possibility, add a guard to the above). Note that the org.w3c.dom.Element interface that ships with the JDK does not declare querySelector, so this only applies if your DOM library adds it.
// The below code is for reading href attribute of link,
NodeList node1 = eElement.getElementsByTagName("link");
Element eElement1 = (Element) node1;
A NodeList gives you Node objects, not an Element, so it cannot be cast to Element directly. You can get the href value from its first item as follows:
String hrefValue = node1.item(0).getAttributes().getNamedItem("href").getNodeValue();
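Putting the pieces together, here is a small sketch that reads the href of the first link in every entry and skips entries without one. The names LinkHrefSketch and firstLinkHrefs, the feed wrapper element and the sample URL are assumptions made for the demo; the question only shows entry, updated, title and link.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LinkHrefSketch {
    // Collects the href of the first <link> inside each <entry>.
    static List<String> firstLinkHrefs(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList entries = doc.getElementsByTagName("entry");
        List<String> hrefs = new ArrayList<>();
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            // Take one node out of the list first; the list itself is not an Element.
            Node firstLink = entry.getElementsByTagName("link").item(0);
            if (firstLink != null) { // item(0) is null when the entry has no <link>
                hrefs.add(((Element) firstLink).getAttribute("href"));
            }
        }
        return hrefs;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<feed>"
                + "<entry><title>A</title><link href=\"http://example.com/a\"/></entry>"
                + "<entry><title>B</title></entry>"
                + "</feed>";
        System.out.println(firstLinkHrefs(xml)); // prints [http://example.com/a]
    }
}
```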

XML node deletion not working properly in Java using DOM parser?

Here I have XML nodes which I display, and I select a particular node to delete. With the XML file and code below, only the first node is deleted even though I select the second node.
<root>
<book> <!--node 1 -->
<id>1111</id>
<name>abacd</name>
<author>abcd</author>
<price>700</price>
<category>abcd</category>
</book>
<book> <!--node 2 -->
<id>2222</id>
<name>abacd</name>
<author>abcd</author>
<price>700</price>
<category>abcd</category>
</book>
<book> <!--node 3 -->
<id>3333</id>
<name>abacd</name>
<author>abcd</author>
<price>700</price>
<category>abcd</category>
</book>
</root>
and my Java code to delete the node is:
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
int nodeValue = Integer.parseInt(nodeNumber);
//nodeValue is node number eg: 2;
NodeList bookList = doc.getElementsByTagName("book");
for (int i = 1; i <= bookList.getLength(); i++) {
if (i == nodeValue) {
Element rootElement = (Element) doc.getElementsByTagName("book").item(0);
Element idElement = (Element) doc.getElementsByTagName("id").item(0);
idElement.getParentNode().removeChild(idElement);
Element nameElement = (Element) doc.getElementsByTagName("name").item(0);
nameElement.getParentNode().removeChild(nameElement);
Element authorElement = (Element) doc.getElementsByTagName("author").item(0);
authorElement.getParentNode().removeChild(authorElement);
Element priceElement = (Element) doc.getElementsByTagName("price").item(0);
priceElement.getParentNode().removeChild(priceElement);
Element categoryElement = (Element) doc.getElementsByTagName("category").item(0);
categoryElement.getParentNode().removeChild(categoryElement);
rootElement.getParentNode().removeChild(rootElement);
doc.normalize();
}
}
Could anybody guide me on where to change my code?
You always fetch the first node with this:
doc.getElementsByTagName("book").item(0);
Instead, use the selected index. Since item() is zero-based and your node numbers start at 1, try:
doc.getElementsByTagName("book").item(nodeValue - 1);
Or use bookList.item(nodeValue - 1) to access the node directly.
If we want to delete a node according to its node number, the code below helps. I got my answer with this:
int nodeValue = Integer.parseInt(nodeNumber);
NodeList bookList = doc.getElementsByTagName("book");
// item() is zero-based, while nodeValue counts from 1
Node nNode = bookList.item(nodeValue - 1);
if (nNode != null && nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
eElement.getParentNode().removeChild(nNode);
}
It will delete the selected node (e.g. 2).
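For reference, a self-contained sketch of deleting the n-th book, assuming the node number the user picks counts from 1 as in the question. The names RemoveBookSketch, parse, removeBook and remainingIds are made up for this example. Removing the book element takes its children (id, name and so on) with it, so they do not need to be removed one by one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RemoveBookSketch {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Removes the book with the given 1-based number; returns false if there is no such book.
    static boolean removeBook(Document doc, int nodeNumber) {
        NodeList books = doc.getElementsByTagName("book");
        if (nodeNumber < 1 || nodeNumber > books.getLength()) {
            return false;
        }
        Node book = books.item(nodeNumber - 1); // item() is zero-based
        book.getParentNode().removeChild(book); // the whole subtree goes with it
        return true;
    }

    // Lists the text of the remaining <id> elements, in document order.
    static List<String> remainingIds(Document doc) {
        NodeList ids = doc.getElementsByTagName("id");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < ids.getLength(); i++) {
            out.add(ids.item(i).getTextContent());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root>"
                + "<book><id>1111</id></book>"
                + "<book><id>2222</id></book>"
                + "<book><id>3333</id></book>"
                + "</root>");
        removeBook(doc, 2);
        System.out.println(remainingIds(doc)); // prints [1111, 3333]
    }
}
```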

Java XML with namespace issue

I have this code:
org.w3c.dom.Document doc = docBuilder.parse(representation.getStream());
Element element = doc.getDocumentElement();
NodeList nodeList = element.getElementsByTagName("xnat:MRSession.scan.file");
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
if (node.getNodeType() == Node.ELEMENT_NODE) {
// do something with the current element
My problem is with getElementsByTagName("xnat:MRSession.scan.file").
My XML looks like this:
<?xml version="1.0" encoding="UTF-8"?><xnat:MRSession "REMOVED DATA IGNORE">
<xnat:sharing>
<xnat:share label="23_MR1" project="BOGUS_GSU">
<!--hidden_fields[xnat_experimentData_share_id="1",sharing_share_xnat_experimentDa_id="xnat_E00001"]-->
</xnat:share>
</xnat:sharing>
<xnat:fields>
<xnat:field name="studyComments">
<!--hidden_fields[xnat_experimentData_field_id="1",fields_field_xnat_experimentDat_id="xnat_E00001"]-->S</xnat:field>
</xnat:fields>
<xnat:subject_ID>xnat_S00002</xnat:subject_ID>
<xnat:scanner manufacturer="GE MEDICAL SYSTEMS" model="GENESIS_SIGNA"/>
<xnat:prearchivePath>/home/ryan/xnat_data/prearchive/BOGUS_OUA/20120717_131900137/23_MR1</xnat:prearchivePath>
<xnat:scans>
<xnat:scan ID="1" UID="1.2.840.113654.2.45.2.108830" type="SAG LOCALIZER" xsi:type="xnat:mrScanData">
<!--hidden_fields[xnat_imageScanData_id="1"]-->
<xnat:image_session_ID>xnat_E00001</xnat:image_session_ID>
<xnat:quality>usable</xnat:quality>
<xnat:series_description>SAG LOCALIZER</xnat:series_description>
<xnat:scanner manufacturer="GE MEDICAL SYSTEMS" model="GENESIS_SIGNA"/>
<xnat:frames>29</xnat:frames>
<xnat:file URI="/home/ryan/xnat_data/archive/BOGUS_OUA/arc001/23_MR1/SCANS/1/DICOM/scan_1_catalog.xml" content="RAW" file_count="29" file_size="3968052" format="DICOM" label="DICOM" xsi:type="xnat:resourceCatalog">
So basically I need to be able to iterate through all the xnat:MRSession/xnat:scan/xnat:file
elements and make some changes. The problem is that
getElementsByTagName("xnat:MRSession.scan.file")
is always null. Please help. Thanks
You could try the following using XPath:
Document document = // the parsed document
XPathFactory xPathFactory = XPathFactory.newInstance();
NodeList allFileNodes = (NodeList) xPathFactory.newXPath().evaluate("//XNAT_NAMESPACE:file", document.getDocumentElement(), XPathConstants.NODESET);
Instead of XNAT_NAMESPACE, you would need to use a prefix that is bound, through a NamespaceContext set on the XPath object, to the exact namespace URI that the prefix "xnat" stands for in your example. The document also needs to be parsed with a namespace-aware DocumentBuilderFactory for this to match.
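A minimal sketch of that setup follows. The namespace URI http://example.org/xnat is a placeholder, because the real one was removed from the question, and the names NamespaceXPathSketch and findFiles are invented. The prefix used inside the XPath expression (x here) does not have to match the prefix used in the document; only the URI has to match.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceXPathSketch {
    // Placeholder URI: replace it with the URI your document declares for "xnat".
    static final String XNAT_NS = "http://example.org/xnat";

    // Finds every file element that sits directly inside a scan element, in the XNAT namespace.
    static NodeList findFiles(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // off by default; needed for namespace matching
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "x".equals(prefix) ? XNAT_NS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return XNAT_NS.equals(namespaceURI) ? "x" : null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return XNAT_NS.equals(namespaceURI)
                        ? Collections.singletonList("x").iterator()
                        : Collections.<String>emptyIterator();
            }
        });
        return (NodeList) xpath.evaluate("//x:scan/x:file", doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<xnat:MRSession xmlns:xnat=\"" + XNAT_NS + "\"><xnat:scans>"
                + "<xnat:scan ID=\"1\"><xnat:file label=\"DICOM\"/></xnat:scan>"
                + "<xnat:scan ID=\"2\"><xnat:file label=\"SNAPSHOTS\"/></xnat:scan>"
                + "</xnat:scans></xnat:MRSession>";
        NodeList files = findFiles(xml);
        for (int i = 0; i < files.getLength(); i++) {
            System.out.println(((Element) files.item(i)).getAttribute("label"));
        }
    }
}
```

If XPath is more than you need, calling getElementsByTagNameNS(XNAT_NS, "file") on a namespace-aware document is a shorter alternative for matching by local name.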
