Java XML XPath Full XML - java

got a little problem. I have the following code:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("result1.xml");
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//element");
String elements = (String) expr.evaluate(doc, XPathConstants.STRING);
What i get :
jcruz0#exblog.jp
Cheryl
Blake
195115
What i want:
<person>
<email>jcruz0#exblog.jp</email>
<firstname>Cheryl</firstname>
<lastname>Blake</lastname>
<number>195115</number>
</person>
So as you can see i want the full XML tree. Not just the NodeValue.
Maybe somebody knows the trick.
Thanks for any help.

You got the string value of the selected XML element because you specified XPathConstants.STRING to XPathExpression.evaluate().
Instead, specify a return type of XPathConstants.NODE if you know for sure that your XPath will select a single element,
String elements = (String) expr.evaluate(doc, XPathConstants.NODE);
or XPathConstants.NODESET for multiple elements, which you would then iterate over to process as necessary.

Something like this can be done.
XPathExpression expr = xpath.compile("/person");
NodeList elements = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < elements.getLength(); i++) {
// the person node
System.out.println(elements.item(i).getNodeName());
for (int x = 0; x < elements.item(i).getChildNodes().getLength(); x++) {
// the elements under person
if (elements.item(i).getChildNodes().item(x).getNodeType() == Node.ELEMENT_NODE) {
System.out.println("\t" + elements.item(i).getChildNodes().item(x).getNodeName() + " - " + elements.item(i).getChildNodes().item(x).getTextContent());
}
}
}
Output
person
email - jcruz0#exblog.jp
firstname - Cheryl
lastname - Blake
number - 195115
You can use the nodes to do what you want, or wrap them in < and > if you just want to print them.

Related

Java, XPath Expression to read all node names, node values, and attributes

I need help in make an xpath expression to read all node names, node values, and attributes in an xml string. I made this:
private List<String> listOne = new ArrayList<String>();
private List<String> listTwo = new ArrayList<String>();
public void read(String xml) {
try {
// Turn String into a Document
Document document = DocumentBuilderFactory.newInstance()
.newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes()));
// Setup XPath to retrieve all tags and values
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']", document, XPathConstants.NODESET);
// Iterate through nodes
for(int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
listOne.add(node.getNodeName());
listTwo.add(node.getNodeValue());
// Another list to hold attributes
}
} catch(Exception e) {
LogHandle.info(e.getMessage());
}
}
I found the expression //text()[normalize-space()=''] online; however, it doesn't work. When I get try to get the node name from listOne, it is just #text. I tried //, but that doesn't work either. If I had this XML:
<Data xmlns="Somenamespace.nsc">
<Test>blah</Test>
<Foo>bar</Foo>
<Date id="2">12242016</Date>
<Phone>
<Home>5555555555</Home>
<Mobile>5555556789</Mobile>
</Phone>
</Data>
listOne[0] should hold Data, listOne[1] should hold Test, listTwo[1] should hold blah, etc... All the attributes will be saved in another parallel list.
What expression should xPath evaluate?
Note: The XML String can have different tags, so I can't hard code anything.
Update: Tried this loop:
NodeList nodeList = (NodeList) xPath.evaluate("//*", document, XPathConstants.NODESET);
// Iterate through nodes
for(int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
listOne.add(i, node.getNodeName());
// If null then must be text node
if(node.getChildNodes() == null)
listTwo.add(i, node.getTextContent());
}
However, this only gets the root element Data, then just stops.
//* will select all element nodes, //#* all attribute nodes. However, an element node does not have a meaningful node value in the DOM, so you would need to read out getTextContent() instead of getNodeValue.
As you seem to consider an element with child elements to have a "null" value I think you need to check whether there are any child elements:
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
docBuilderFactory.setNamespaceAware(true);
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse("sampleInput1.xml");
XPathFactory fact = XPathFactory.newInstance();
XPath xpath = fact.newXPath();
NodeList allElements = (NodeList)xpath.evaluate("//*", doc, XPathConstants.NODESET);
ArrayList<String> elementNames = new ArrayList<>();
ArrayList<String> elementValues = new ArrayList<>();
for (int i = 0; i < allElements.getLength(); i++)
{
Node currentElement = allElements.item(i);
elementNames.add(i, currentElement.getLocalName());
elementValues.add(i, xpath.evaluate("*", currentElement, XPathConstants.NODE) != null ? null : currentElement.getTextContent());
}
for (int i = 0; i < elementNames.size(); i++)
{
System.out.println("Name: " + elementNames.get(i) + "; value: " + (elementValues.get(i)));
}
For the sample input
<Data xmlns="Somenamespace.nsc">
<Test>blah</Test>
<Foo>bar</Foo>
<Date id="2">12242016</Date>
<Phone>
<Home>5555555555</Home>
<Mobile>5555556789</Mobile>
</Phone>
</Data>
the output is
Name: Data; value: null
Name: Test; value: blah
Name: Foo; value: bar
Name: Date; value: 12242016
Name: Phone; value: null
Name: Home; value: 5555555555
Name: Mobile; value: 5555556789

How read more XSD schemas with Xpath?

I have 2 XSD schemas first.xsd and second.xsd
In first.xsd is:
<xs:include schemaLocation="second.xsd" />
and I want read elements in second.xsd.
I have defined:
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(new FileInputStream("first.xsd"));
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList result1 = null;
result1 = (NodeList) xPath.compile("//element[#name='List']").evaluate(xmlDocument, XPathConstants.NODESET);
for (int k = 0; k < result1.getLength(); k++) {
Element ele = (Element)result1.item(k);
System.out.println(ele.getAttribute("type")); }
Problem is program didnt find element list in second.xsd.
Can I define some as this?
Document xmlDocument = builder.parse(new FileInputStream("first.xsd","second.xsd"));
Is something as LSResourceResolver but that is for validation.
Can I use it for my code?
Thank you for advices.

Xpath query from a specific node

The usual queries that I'm currently supporting are from the root , meaning :
public Object evaluate(String expression, QName returnType) {...}
Now I want to do the Xpath query starting from some given Node , e.g. :
public Object evaluate(String expression, Node source, QName returnType) { ? }
Then , If my usual queries look like this (here's an exmaple) :
//load the document into a DOM Document
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true); // never forget this!
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("books.xml");
//create an XPath factory
XPathFactory factory = XPathFactory.newInstance();
//create an XPath Object
XPath xpath = factory.newXPath();
//make the XPath object compile the XPath expression
XPathExpression expr = xpath.compile("/inventory/book[3]/preceding-sibling::book[1]");
//evaluate the XPath expression
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
//print the output
System.out.println("1st option:");
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println("i: " + i);
System.out.println("*******");
System.out.println(nodeToString(nodes.item(i)));
System.out.println("*******");
What kind of changes would I need for making this happen for the above method (public Object evaluate(String expression, Node source, QName returnType);)
Thanks!

How to get the text from the group of XML nodes using Java?

Following is the XML file -
<Country>
<Group>
<C>Tokyo</C>
<C>Beijing</C>
<C>Bangkok</C>
</Group>
<Group>
<C>New Delhi</C>
<C>Mumbai</C>
</Group>
<Group>
<C>Colombo</C>
</Group>
</Country>
I want to save the name of Cities to a text file using Java & XPath -
Below is the Java code which is unable to do the needful.
.....
.....
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("Continent.xml");
XPath xpath = XPathFactory.newInstance().newXPath();
// XPath Query for showing all nodes value
XPathExpression expr = xpath.compile("//Country/Group");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
BufferedWriter out = new BufferedWriter(new FileWriter("Cities.txt"));
Node node;
for (int i = 0; i < nodes.getLength(); i++)
{
node = nodes.item(i);
String city = xpath.evaluate("C",node);
out.write(" " + city + "\r\n");
}
out.close();
.....
.....
Can somebody help me to get the required output?
You are getting only the first city because that's what you asked for. Your first XPATH expression returns all the Group nodes. You iterate over these and evaluate the XPATH C relative to each Group, returning a single city.
Just change the first XPATH to //Country/Group/C and eliminate the second XPATH altogether -- just print the text value of each node returned by the first XPATH.
I.e.:
XPathExpression expr = xpath.compile("//Country/Group/C");
...
for (int i = 0; i < nodes.getLength(); i++)
{
node = nodes.item(i);
out.write(" " + node.getTextContent() + "\n");
}

How do I resolve two nodes which have the same name but under different parents?

<PublicRecords>
<USBankruptcies>
<USBanktruptcy>...<USBankruptcy>
<CourtId>...</CourtId>
<USBanktruptcy>...<USBankruptcy>
<CourtId>...</CourtId>
</USBankruptcies>
<USTaxLiens>
<USTaxLien>...<USTaxLien>
<CourtId>...</CourtId>
<USTaxLien>...<USTaxLien>
<CourtId>...</CourtId>
</USTaxLiens>
<USLegalItems>
<USLegalItem><USLegalItem>
<CourtId></CourtId>
<USLegalItem><USLegalItem>
<CourtId></CourtId>
</USLegalItems>
</PubicRecords>
I am using a combination of doc and xpath objects to extract the attributes and node contents.
NodeList bp = doc.getElementsByTagName("USBankruptcy");
NodeList nl = doc.getElementsByTagName("CourtId");
long itrBP;
for (itrBP = 0; itrBP < bp.getLength(); itrBP++ )
{
Element docElement = (Element) bp.item(itrBP);
Element courtElement = (Element) nl.item(itrBP);
NodeList df = docElement.getElementsByTagName("DateFiled");
if(df.getLength() > 0)
{
dateFiled = nullIfBlank(((Element)df.item(0)).getFirstChild().getTextContent());
dateFiled = df.format(dateFiled);
}
But, when I say get elements of tag name CourtID, it will get all the CourtIDs, not just the ones under USBankruptcy.
Is there any way to specify the parent?
I tried NodeList nl = doc.getElementsByTagName("USBankruptcies/CourtId");
It gave me a dom error on run time.
Rather than calling the getElementsByTagName("CourtId") method on the Document, call it on the child Element (in your case, the <USBankruptcies> element).
NodeList bankruptcyNodes = doc.getElementsByTagName("USBankruptcies");
Element bankruptcyElement = (Element) bankruptcyNodes.item(0);
NodeList bankruptcyCourtNodes = bankruptcyElement.getElementsByTagName("CourtId");
// etc...
Please find the code here:
DocumentBuilderFactory domFactory = DocumentBuilderFactory
.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("test.xml");
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("*//USBankruptcies/CourtId");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i));
}

Categories

Resources