Retrieving XML node names

Retrieving XML node names - java

I am trying to retrieve the names of all the nodes from XML file using "node.getNodeName()". While doing so, every node name is preceeded and followed by "#text". Because of that, i am not getting the exact count of nodes as well. I want "#text" to be eliminated while retrieving the names. How do i do that??

With that :
package com.hum;
import java.io.InputStreamReader;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
/**
*
* #author herve
*/
public class PrintNameXML
{
public static void main(String[] args) throws Exception
{
String xml = "<a><o><u>ok</u></o></a>";
Document doc =
DocumentBuilderFactory
.newInstance()
.newDocumentBuilder()
.parse(new InputSource(new StringReader(xml)));
NodeList nl = doc.getElementsByTagName("*");
for (int i = 0; i < nl.getLength(); i++)
{
System.out.println("name is : "+nl.item(i).getNodeName());
}
}
}
I get :
name is : a
name is : o
name is : u
Is that you search ?

Related

Java Xpath multiple elements with same name of a parent node

I have an xml like below.
<name>
<value>123</value>
<value>456</value>
<value>789</value>
</name>
Now using java's Xpath query I tried below method
NodeList list3 = (NodeList) xpath.evaluate("name/value", element,XPathConstants.NODESET);
But it gives me only first value, how can I print all <value> tags ?

Your XPath expression is correct, there is most likely another problem in your code. You really should provide a complete example which demonstrates your problem.
The following code demonstrates how this would look like:
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
public class XmlTest {
public static void main(String[] args) throws Exception {
String xml = "<name>\n" +
"<value>123</value>\n" +
"<value>456</value>\n" +
"<value>789</value>\n" +
"</name>";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(new InputSource(new StringReader(xml)));
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
NodeList list = (NodeList) xpath.evaluate("name/value", doc, XPathConstants.NODESET);
for (int i = 0; i < list.getLength(); ++i) {
Node node = list.item(i);
System.out.println(node.getNodeName());
}
}
}
Running this results in the following output:
value
value
value

How I need to get tag name of sub child in xml by comparing the attribute value and given string value using Java?

I have a xml file. I need to get the sub child tag of the parent tag (Body) in xml file using Java. First I need to use DOM for reading an element
and get xml file from my local machine drive. I have one String varaible (Sring getSubChildValue = "181_paragraph_13") and I need to compare the value
with each and every attribute Value in the Xml file. If the given Value may be in sub child tag,I cont able to get a Value.
what I need to do for compare the String variable and with Xml File
What I need to do for print the Tag name if the String value is equal to any attrinbute Value.
Example: (P) Tag is the sub child of Tag (Body) which contain the given String Value. So I need to get tag name P.
How to avoid the Hard coding the sub-child Name to get the solution?
Example XML file:
<parent>
<Body class="student" id="181_student_method_3">
<Book class="Book_In_School_11" id="181_student_method_11"/>
<subject class="subject_information " id="181_student_subject_12"/>
<div class="div_passage " id="181_div_method_3">
<p class=" paragraph_book_name" id="181_paragraph_13">
<LiberaryBook class="Liberary" id="181_Liberary_9" >
<Liberary class="choice "
id="Liberary_replace_1" Uninversity="University_Liberary_1">
Dubliners</Liberary>
<Liberary class="choice "
id="Liberary_replace_2" Uninversity="University_Liberary_2">
Adventure if sherlock Holmes</Liberary>
<Liberary class="choice "
id="Liberary_replace_3" Uninversity="University_Liberary_3">
Charlotte’s Web</Liberary>
<Liberary class="choice "
id="Liberary_replace_4" Uninversity="University_Liberary_4">
The Outsiders</Liberary>
</LiberaryBook>
</p>
</div>
</Body>
</parent>
Example Java code:
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class PerfectTagChange {
public static void main(String[] args) {
String filePath = "/xmlfile/Xml/check/sample.xml";
File xmlFile = new File(filePath);
DocumentBuilderFactory
dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder;
try {
dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
doc.getDocumentElement().normalize();
Element root = doc.getDocumentElement();
changeValue(root,doc);
doc.getDocumentElement().normalize();
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File("/xmlfile/Xml/check/Demo.xml"));
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.transform(source, result);
System.out.println("XML file updated successfully");
} catch (SAXException | ParserConfigurationException | IOException | TransformerException e1) {
e1.printStackTrace();
}
}
//This Method is used to check which attribute contain given string Value : Hard code parent tag, But no other tag.
private static void changeValue(Node someNode,Document doc) {
Sring getSubChildValue = "181_paragraph_13"
NodeList childs = someNode.getChildNodes();
for (int in = 0; in < childs.getLength();) {
Node child = childs.item(in);
if (child.getNodeType() == Document.ELEMENT_NODE) {
if (child.getNodeName().equalsIgnoreCase("Body") ) {
//If I hard code the ID here on getNamedItem("id"),
If the attribute Name got Changed from ID to Name
it will be in problem.
//3.What is the solution for solving the problem.
if(child.getAtrribute.getNamedItem("id").getNodeValue().equals(getSubChildValue)){
system.out.println(child.getAtrribute.getNamedItem("id").getNodeValue());
}
}
}
}
}

If you change your code to this:
private static void changeValue(Node someNode, Document doc, String searchString) throws Exception {
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xPath.evaluate("//*[#*=\"" + searchString + "\"]",
doc.getDocumentElement(),
XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println("Tagname: " + nodes.item(i).getNodeName());
}
}
you don't have the name of the attribute to be hardcoded.
EDIT:
Added searchString as parameter.

Escaping entire XML for a webservice response

I have a Java Restful webservice that returns out an userdetails xml as below:
<userdetails>
<firstName>first</firstName>
<firstName>last</firstName>
<email>123#gmail.com</email>
</userdetails>
Any of the fields in the XML can contain special charecters which will cause issues for the client when they use Jaxb to convert xml into java object.
I can use "StringEscapeUtils.escapeXml" to escape special charecters in a field like say for firstName and it is escaping it correctly.
StringEscapeUtils.escapeXml(firstName);
But I have to do this for every field in my XML. Is there any way where I can escape the entire XML at once instead of doing it for every field.

Another option is to traverse all text elements, escaping them in the process (below code taken from here and slightly modified):
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.apache.commons.lang.StringEscapeUtils;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
public class EscapeXml {
static String inputFile = "data.xml";
static String outputFile = "data_new.xml";
public static void main(String[] args) throws Exception {
Document doc = DocumentBuilderFactory.newInstance()
.newDocumentBuilder().parse(new InputSource(inputFile));
// locate the node(s)
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList)doc.getElementsByTagName("*");
// escape text nodes
for (int idx = 0; idx < nodes.getLength(); idx++) {
Node node = nodes.item(idx);
if (node.getNodeType() == Node.ELEMENT_NODE) {
NodeList childNodes = node.getChildNodes();
for (int cIdx = 0; cIdx < childNodes.getLength(); cIdx++) {
Node childNode = childNodes.item(cIdx);
if (childNode.getNodeType() == Node.TEXT_NODE) {
String newTextContent =
StringEscapeUtils.escapeXml(childNode.getTextContent());
childNode.setTextContent(newTextContent);
}
}
}
}
// save the result
Transformer xformer = TransformerFactory.newInstance().newTransformer();
xformer.transform(new DOMSource(doc), new StreamResult(new File(outputFile)));
}
}

Issues with xpath that contained namespaces in Java (Dom parser)

I'm trying to parse an rdfs xml file in order to find all the Classes in an rdfs file.
The xpath: "/rdf:RDF/rdfs:Class"
is working in my XML editor.
When i insert the xpath in my Java program (i have implemented a dom parser), i get 0 Classes.
The following example runs but it outputs 0 classes!
I do:
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpressionException;
import org.xml.sax.SAXException;
public class Main {
public static void main(String args[]) throws XPathExpressionException, ParserConfigurationException, SAXException, IOException{
FindClasses FSB = new FindClasses();
FSB.FindAllClasses("C:\\Workspace\\file.xml"); //rdfs file
}
}
The class FindClasses is as follows:
import java.io.IOException;
import java.util.Collection;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class FindClasses {
public void FindAllClasses(String fileName) throws XPathExpressionException, ParserConfigurationException, SAXException, IOException {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse(fileName);
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression classes_expr = xpath.compile("/rdf:RDF/rdfs:Class");
Object result = classes_expr.evaluate(doc, XPathConstants.NODESET);
NodeList classes = (NodeList) result;
System.out.println("I found : " + classes.getLength() + " classes " );
}
}
The rdfs file is:
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xml:lang="en" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<rdfs:Class rdf:about="Class1">
</rdfs:Class>
<rdfs:Class rdf:about="Class2">
</rdfs:Class>
</rdf:RDF>
I don't really understand why the xpath returns 0 nodes in that example.
It's weird, cause i have implemented other dom parsers as well and they were working fine.
Can somebody help me?
Thanks

I visited the following link and i solved my problem:
Issues with xpath in java
The problem was that the xpath contained two namespaces (rdf,rdfs) like "/rdf:RDF/rdfs:Class".
If the xpath didn't contain any namespace e.g. /RDF/Class , there was not going to be an issue.
So after the line:
xpath = XPathFactory.newInstance().newXPath();
and before the line:
XPathExpression classes_expr = xpath.compile("/rdf:RDF/rdfs:Class");
I added the following:
xpath.setNamespaceContext(new NamespaceContext() {
public String getNamespaceURI(String prefix) {
switch (prefix) {
case "rdf": return "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
case "rdfs" : return "http://www.w3.org/2000/01/rdf-schema#";
}
return prefix;
}
public String getPrefix(String namespace) {
if (namespace.equals("rdf")) return "rdf";
else if (namespace.equals("rdfs")) return "rdfs";
else return null;
}
#Override
public Iterator getPrefixes(String arg0) {
// TODO Auto-generated method stub
return null;
}
});

XML filtering using getelementsbytagname

I'm trying to parse a xml file using the below program but wondering why the getFirstChild() is blank while printing...
The nodelist contains all the employee nodes and I am processing each node and trying to get the firstchild and lastchild..
xml file:
<?xml version="1.0"?>
<Employees>
<Employee emplid="1111" type="admin">
<firstname>John</firstname>
<lastname>Watson</lastname>
<age>30</age>
<email>johnwatson#sh.com</email>
</Employee>
<Employee emplid="2222" type="admin">
<firstname>Sherlock</firstname>
<lastname>Homes</lastname>
<age>32</age>
<email>sherlock#sh.com</email>
</Employee>
</Employees>
java program:
package XML;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
public class XMLTest {
/**
* #param args
*/
public static void main(String[] args) {
DocumentBuilderFactory builderfactory = DocumentBuilderFactory.newInstance();
try {
DocumentBuilder builder = builderfactory.newDocumentBuilder();
Document xmldocument = builder.parse(new FileInputStream(new File("c:/employees.xml")));
NodeList node = xmldocument.getElementsByTagName("Employee");
System.out.println("node length="+node.getLength());
for (int temp = 0; temp < node.getLength(); temp++){
System.out.println("First Child = " +node.item(temp).getFirstChild().getNodeValue());
System.out.println("Last Child = " +node.item(temp).getLastChild().getNodeValue());
}
} catch (ParserConfigurationException | SAXException | IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}

It's most likely due to the whitespace (spaces, tabs, line breaks etc.) that comes through as text nodes in the list as well as the elements.
When working with java's XML DOM I tend to write a helper like this as it's pretty tedious.

The DocumentBuilderFactory controls the handling of whitespaces try:
builderFactory.setIgnoringElementContentWhitespace(true);
Hope it helps!

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Retrieving XML node names - java

Related

Java Xpath multiple elements with same name of a parent node

How I need to get tag name of sub child in xml by comparing the attribute value and given string value using Java?

Escaping entire XML for a webservice response

Issues with xpath that contained namespaces in Java (Dom parser)

XML filtering using getelementsbytagname

Categories

Resources