String (with spaces) to DOM in Java - java

I have a function which converts a string to DOM, and then uses javax.xml.xpath.XPathFactory on the DOM object to pull data.
The XPathFactory works fine for the following string
<root><test><name>A</name></test><test><name>B</name></test></root>
but it fails if I have spaces between tags
<root> <test> <name>A</name> </test> <test> <name>B</name> </test></root>
I'm using XPathFactory to read the values "A" and "B" from the DOM.
Can anyone tell me exactly why XPathFactory fails when the string has spaces between the tags?
Thanks
--SD

The XPath is correct and works fine. I think the problem is that
list.item(i).getChildNodes().item(0).getTextContent()
gets the first child node of the node matched by the XPath. In the XML with spaces, that child is the whitespace text node right after <employee>; in the XML without spaces, it is the <name> element.
In other words in the case with spaces the child nodes of the first employee element are (one per line):
[spaces]
<name> . . . </name>
[spaces]
<company-no> . . . </company-no>
[spaces]
<chunk-id> . . .</chunk-id>
in the case without spaces they are:
<name> . . . </name>
<company-no> . . . </company-no>
<chunk-id> . . .</chunk-id>
So in the first case the child nodes you need are at indexes 1, 3 and 5, and in the second case at indexes 0, 1 and 2.
I think you should modify this piece of code:
System.out.println("Name: " +list.item(i).getChildNodes().item(0).getTextContent());
System.out.println("Company: "+list.item(i).getChildNodes().item(1).getTextContent());
System.out.println("Chunk: "+list.item(i).getChildNodes().item(2).getTextContent());
to either use other XPaths to get the name, company and chunk sub-nodes or to skip the child nodes containing spaces.
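The second option (skipping the whitespace children) could look roughly like the sketch below. It assumes a sample document with the <employee>, <name>, <company-no> and <chunk-id> elements mentioned above; the class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ElementChildren {

    // Returns the text of each element child of the first <employee>,
    // skipping the whitespace-only text nodes that sit between the tags.
    static List<String> childElementTexts(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node employee = doc.getElementsByTagName("employee").item(0);
            NodeList children = employee.getChildNodes();
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Only element nodes carry the data; text nodes here are just spacing.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    texts.add(child.getTextContent());
                }
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample input with spaces between the tags.
        String xml = "<root> <employee> <name>A</name> <company-no>1</company-no>"
                + " <chunk-id>7</chunk-id> </employee> </root>";
        System.out.println(childElementTexts(xml));
    }
}
```

With this filter the element children end up at positions 0, 1 and 2 of the returned list whether or not the input has spaces between tags.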

/root/test/name
or even just
//name
optionally, directly get the child text nodes
//name/text()
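As a minimal sketch, here is /root/test/name evaluated with javax.xml.xpath against both strings from the question. The class and method names are hypothetical; the point is only that selecting the <name> elements directly is not affected by whitespace text nodes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NameXPathDemo {

    // Evaluates /root/test/name and returns the text of every matched element.
    static List<String> names(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList matches = (NodeList) xpath.evaluate(
                    "/root/test/name", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < matches.getLength(); i++) {
                result.add(matches.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String compact = "<root><test><name>A</name></test><test><name>B</name></test></root>";
        String spaced = "<root> <test> <name>A</name> </test> <test> <name>B</name> </test></root>";
        System.out.println(names(compact));
        System.out.println(names(spaced)); // same values as the compact input
    }
}
```<test><name>B</name></test></root>";
String spaced = "<root> <test> <name>A</name> </test> <test> <name>B</name> </test></root>";
assert NameXPathDemo.names(compact).equals(java.util.Arrays.asList("A", "B"));
assert NameXPathDemo.names(spaced).equals(java.util.Arrays.asList("A", "B"));
</test>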

Related

How to get the element from the xml using Xpath Java

<employees>
<employee>
<firstName>Lokesh</firstName>
<lastName>Gupta</lastName>
<department>
<id>101</id>
<name>IT</name>
</department>
</employee>
</employees>
I want to get the element names using XPath.
I count the elements using count(//employees/*) and count(//employees/employee/department/*),
and that returns the count for each parent.
I also need the element names. I tried //employees/employee/*/name() to get firstName, lastName and department,
and //employees/employee/department/*/name() to get id and name, but it fails with the error javax.xml.transform.TransformerException: Unknown nodetype: name.
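You want the element names, not their values. In XPath 1.0, name() is a function that takes the node as its argument, so it has to wrap the path instead of being the last step of it.
Since javax.xml.xpath only supports XPath 1.0, you can use: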
concat(name(//employees/employee/*[1]),",",name(//employees/employee/*[2]),",",name(//employees/employee/*[3]))
Output : firstName,lastName,department
concat(name(//employees/employee/department/*[1]),",",name(//employees/employee/department/*[2]))
Output : id,name
If you don't know the number of children of each parent element, use a loop. First count the children with count(//employees/employee/*) and store the result, then loop over a position index i, evaluating name(//employees/employee/*[i]) and incrementing i at each iteration.
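The loop approach could be sketched like this with javax.xml.xpath. The class name, method name and the parentPath parameter are assumptions for illustration, and the sketch assumes the parent path matches a single element (with several <employee> elements, count() would add up the children of all of them).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ChildElementNames {

    // Counts the element children of parentPath, then asks for name() of each
    // child by position. XPath positions start at 1, not 0.
    static List<String> childNames(String xml, String parentPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // count() yields an XPath number, which the API returns as a Double.
            Double count = (Double) xpath.evaluate(
                    "count(" + parentPath + "/*)", doc, XPathConstants.NUMBER);
            List<String> names = new ArrayList<>();
            for (int i = 1; i <= count.intValue(); i++) {
                names.add(xpath.evaluate("name(" + parentPath + "/*[" + i + "])", doc));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<employees><employee><firstName>Lokesh</firstName>"
                + "<lastName>Gupta</lastName><department><id>101</id><name>IT</name>"
                + "</department></employee></employees>";
        System.out.println(childNames(xml, "/employees/employee"));
        System.out.println(childNames(xml, "/employees/employee/department"));
    }
}
```

An alternative that avoids the counting step is to evaluate //employees/employee/* as a NODESET and call getNodeName() on each returned node.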

How to search for child nodes in xml using DOM Java

I have the following XML structure
<CodeSnippet>
<Code id="code1">
<Tags>button java</Tags>
<Snippet>sample code</Snippet>
</Code>
<Code id="code2">
<Tags>eclipse jbutton java</Tags>
<Snippet>sample code</Snippet>
</Code>
<.....>
</CodeSnippet>
Now I want to retrieve every Snippet from the above XML when I search by Tags. For instance, if I search for "java", then every Code node whose Tags contain "java" must return its Snippet.
My search query is:
//Code/Tags[contains(concat(' ',/text(),' '), ' "+ searchTags[0] +" ')]";
Here, searchTags[0] contains "java".
My result set should contain the Snippets of the selected nodes i.e. code1 and code2 from above xml structure.
Try this expression:
//Code/Tags[contains(., 'java')]/../Snippet
For retrieving all the "Tags" containing "java", I used the below xpath expression,
//Code/Tags/text()[contains(., 'java')]
For retrieving all the "Snippet" related to the tags "java", I used below expression
//Code/Tags[contains(./text(), 'java')]/parent::Code/Snippet/text()
Thanks to @dfsq for helping me out with his expression. Thanks a lot.
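For reference, a sketch of the tag search in Java using javax.xml.xpath. It is a variant of the expressions above, not the exact code from the thread: it matches whole space-separated tags (so "java" does not match "javascript"), and it passes the search term as an XPath variable named $tag instead of concatenating it into the expression. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TagSearch {

    // Pads both the tag list and the search term with spaces so that
    // contains() only matches complete, space-separated tags.
    private static final String EXPR =
            "//Code[contains(concat(' ', normalize-space(Tags), ' '),"
            + " concat(' ', $tag, ' '))]/Snippet";

    static List<String> snippetsForTag(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Resolve $tag from the method argument; other variables are unknown.
            xpath.setXPathVariableResolver(
                    name -> "tag".equals(name.getLocalPart()) ? tag : null);
            NodeList matches = (NodeList) xpath.evaluate(EXPR, doc, XPathConstants.NODESET);
            List<String> snippets = new ArrayList<>();
            for (int i = 0; i < matches.getLength(); i++) {
                snippets.add(matches.item(i).getTextContent());
            }
            return snippets;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<CodeSnippet>"
                + "<Code id='code1'><Tags>button java</Tags><Snippet>sample code 1</Snippet></Code>"
                + "<Code id='code2'><Tags>eclipse jbutton java</Tags><Snippet>sample code 2</Snippet></Code>"
                + "</CodeSnippet>";
        System.out.println(snippetsForTag(xml, "java"));
    }
}
```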
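If you are using XPath and just want every Snippet regardless of its tags, you can write:
//Code/Snippet
Note that //Tags//Snippet would match nothing here, because Snippet is a sibling of Tags, not a descendant.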

how to use text() in jxpath

Can you get the text() of a jxpath element or does it not work?
given some nice xml:
<?xml version="1.0" encoding="UTF-8"?>
<AXISWeb xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="AXISWeb.xsd">
<Action>
<Transaction>PingPOS</Transaction>
<PingPOS>
<PingStep>To POS</PingStep>
<PingDate>2012-11-15</PingDate>
<PingTime>16:35:57</PingTime>
</PingPOS>
<PingPOS>
<PingStep>POS.PROCESSOR18</PingStep>
<PingDate>2012-11-15</PingDate>
<PingTime>16:35:57</PingTime>
</PingPOS>
<PingPOS>
<PingStep>From POS</PingStep>
<PingDate>2012-11-15</PingDate>
<PingTime>16:35:57</PingTime>
</PingPOS>
</Action>
</AXISWeb>
//Does not work:
jxpc.getValue("/AXISWeb/Action/PingPOS[1]/PingStep/text()");
//Does not work:
jxpc.getValue("/action/pingPOS[1]/PingStep/text()");
//Does not work:
jxpc.getValue("/action/pingPOS[1]/PingStep[text()]");
I know I can get the text from using
jxpc.getValue("/action/pingPOS[1]/PingStep");
But that's not the point.
Shouldn't text() work? I could find no examples....
P.S. It's also very very picky about case and capitalization. Can you turn that off somehow?
Thanks,
-G
/AXISWeb/Action/PingPOS[1]/PingStep/text() is valid XPath for your document
But, from what I can see from the user guide of jxpath (note: I don't know jxpath at all), getValue() is already supposed to return the textual content of a node, so you don't need to use the XPath text() at all.
So you may use the following:
jxpc.getValue("/AXISWeb/Action/PingPOS[1]/PingStep");
Extracted from the user guide:
Consider the following XML document:
<?xml version="1.0" ?>
<address>
<street>Orchard Road</street>
</address>
With the same XPath, getValue("/address/street"), will return the string "Orchard Road", while
selectSingleNode("/address/street") - an object of type Element (DOM
or JDOM, depending on the type of parser used). The returned Element
is, of course, <street>Orchard Road</street>.
Now, about case-insensitive queries on tag names: if you are using XPath 2 you can use lower-case() and name(), but this is not really recommended; you are better off using the correct names.
/*[lower-case(name())='axisweb']/*[lower-case(name())='action']/...
or, if using XPath 1, you can use translate(), but it gets even worse:
/*[translate(name(),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') = 'axisweb']/*[translate(name(),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') = 'action']/...
All in all, try to ensure that you use correct query, you know it is case sensitive, so it's better to pay attention to it. As you would do in Java, foo and fOo are not the same variables.
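To illustrate the XPath 1 translate() workaround, here is a rough sketch using the JDK's javax.xml.xpath rather than JXPath (so it shows the expression, not JXPath behaviour). Each step compares the lower-cased name() of an element with a lower-case name; the helper names and the simplified sample document are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CaseInsensitivePath {

    private static final String UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final String LOWER = "abcdefghijklmnopqrstuvwxyz";

    // Builds one location step matching any child element whose name equals
    // lowerName once its ASCII letters are lower-cased with translate().
    static String step(String lowerName) {
        return "/*[translate(name(), '" + UPPER + "', '" + LOWER + "') = '" + lowerName + "']";
    }

    // Joins the steps into one absolute path and returns the string value of
    // the first matching node (an empty string if nothing matches).
    static String firstValue(String xml, String... lowerNames) {
        try {
            StringBuilder expr = new StringBuilder();
            for (String name : lowerNames) {
                expr.append(step(name));
            }
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(expr.toString(), doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Simplified version of the document from the question.
        String xml = "<AXISWeb><Action>"
                + "<PingPOS><PingStep>To POS</PingStep></PingPOS>"
                + "<PingPOS><PingStep>From POS</PingStep></PingPOS>"
                + "</Action></AXISWeb>";
        System.out.println(firstValue(xml, "axisweb", "action", "pingpos", "pingstep"));
    }
}
```

The generated expression is long and hard to read, which is a good argument for simply using the correctly capitalized names.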
Edit:
As I said, XML and thus XPath is case sensitive, so pingStep cannot match PingStep, use the correct name to find it.
Concerning text(), it is part of XPath 1.0; you do not need XPath 2 to use it. JXPath's getValue() already does the equivalent of text() for you. If you want to do it yourself, you will have to use selectSingleNode("//whatever/text()"), which returns an Object of type TextElement (depending on the underlying parser).
So to sum up, the method JXPathContext.getValue() already does the work to select the node's text content for you, so you don't need to do it yourself and explicitly call XPath's text().
From a post that I've answered before: the method getTextContent() does the job for you.
There is no need to use text() when you evaluate the XPath.
Example :
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new File("D:\\Loic_Workspace\\Test2\\res\\test.xml"));
System.out.println(doc.getElementsByTagName("retCode").item(0).getTextContent());
If not, you will get the tag and the value. If you want to do more, take a look at this

dom4j: How to resolve this XPath Error?

I am reading an XML using dom4j by using XPath techniques for selecting desired nodes. Consider that my XML looks like this:
<Emp_Dir>
<Emp_Classification type ="Permanent" >
<Emp id= "1">
<name>jame</name>
<Emp_Bio>
<age>12</age>
<height>5.4</height>
<weight>78</weight>
</Emp_Bio>
<Emp_Details>
<salary>2000</salary>
<designation>developer</designation>
</Emp_Details>
</Emp>
<Emp id= "2">
<name>jame</name>
<Emp_Bio>
<age>12</age>
<height>5.4</height>
<weight>78</weight>
</Emp_Bio>
<Emp_Details>
<salary>2000</salary>
<designation>developer</designation>
</Emp_Details>
</Emp>
</Emp_Classification>
<Emp_Classification type ="Contract" >
.
.
.
</Emp_Classification>
<Emp_Classification type ="PartTime" >
.
.
.
</Emp_Classification>
</Emp_Dir>
Note: The above XML might look ugly to you, but I only created this dummy file for the sake of understanding and to keep my project confidential.
When I specify a simple XPath expression, like:
//Emp_Classification (or)
/Emp_Dir/Emp_Classification
then it works fine, but when I specify a more complex expression like:
/Emp_Dir/Emp_Classification/[@type='Permanent'] (or)
//Emp_Dir/Emp_Classification/[@type='Permanent']
then it gives me the following error:
"Invalid XPath expression: /Emp_Dir/Emp_Classification/[@type='Permanent'] Expected one of '.', '..', '@', '*', <QName>"
Could anybody tell me what is wrong with my XPath?
My second question is: how do I select the Emp_Bio node of Permanent employees only? Does this work?
//Emp_Dir/Emp_Classification/[@type='Permanent']/Emp/Emp_Bio
Use: //Emp_Dir/Emp_Classification[@type='Permanent']
(note the removal of the / before the predicate; a predicate attaches directly to the step it filters)
And then use this: //Emp_Dir/Emp_Classification[@type='Permanent']/Emp/Emp_Bio for the latter part of the question.
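The corrected expression can be checked with a small sketch. Note that it uses the JDK's javax.xml.xpath instead of dom4j, on the assumption that the XPath itself behaves the same way; the class name, method name and the shortened sample document are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PermanentBio {

    // The predicate [@type='Permanent'] is attached directly to the
    // Emp_Classification step, with no slash in front of it.
    private static final String BIO_OF_PERMANENT =
            "//Emp_Dir/Emp_Classification[@type='Permanent']/Emp/Emp_Bio";

    // Returns the <age> of every Emp_Bio under a Permanent classification.
    static List<String> agesOfPermanent(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList bios = (NodeList) xpath.evaluate(
                    BIO_OF_PERMANENT, doc, XPathConstants.NODESET);
            List<String> ages = new ArrayList<>();
            for (int i = 0; i < bios.getLength(); i++) {
                // Relative expression, evaluated with the Emp_Bio node as context.
                ages.add(xpath.evaluate("age", bios.item(i)));
            }
            return ages;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Emp_Dir>"
                + "<Emp_Classification type='Permanent'>"
                + "<Emp id='1'><name>jame</name><Emp_Bio><age>12</age></Emp_Bio></Emp>"
                + "</Emp_Classification>"
                + "<Emp_Classification type='Contract'>"
                + "<Emp id='3'><name>other</name><Emp_Bio><age>40</age></Emp_Bio></Emp>"
                + "</Emp_Classification>"
                + "</Emp_Dir>";
        System.out.println(agesOfPermanent(xml)); // only the Permanent employee's age
    }
}
```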

XML handling in Java

I need to select all nodes at the path a/b/c as a NodeList from a Document using getElementsByTagName(). How do I provide the path of the node as input?
For example:
<root>
<a>
<b>
<c>1</c>
<c>2</c>
<c>3</c>
<c>4</c>
<c>5</c>
<c>6</c>
</b>
</a>
</root>
I need to select all 'c' nodes at the path a/b/c. Selecting c directly is an option, but to avoid ambiguity if more 'c' elements are present elsewhere, I need to give the path. How can I achieve this?
Take a look at the Java XPath API (javax.xml.xpath). getElementsByTagName() only takes a tag name, not a path, so you probably want an XPath of /root/a/b/c to select all the <c/> nodes in the above hierarchy.
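A minimal sketch of that with javax.xml.xpath, selecting /root/a/b/c as a NodeList. The class and method names are hypothetical, and the extra <x><c>99</c></x> branch in the sample is added only to show that a <c> outside the path is not picked up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PathSelect {

    // Evaluates the given path as a node set and returns each node's text.
    static List<String> textsAt(String xml, String path) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(path, doc, XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><a><b><c>1</c><c>2</c><c>3</c></b></a><x><c>99</c></x></root>";
        // Unlike getElementsByTagName("c"), this ignores the <c> under <x>.
        System.out.println(textsAt(xml, "/root/a/b/c"));
    }
}
```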
