Selecting multiple nodes using xpath in java - java

I am trying to get data using XPath.
I can reach the data I want, but when there are multiple matches, only the first one is selected.
I also want to count the number of matching nodes.
For example, I want to count the queues whose message-vpn is vpn/b.
XML structure is as follows:
<queues>
<queue>
<name> queue/a </name>
<info>
<message-vpn> vpn/a </message-vpn>
</info>
</queue>
<queue>
<name> queue/b </name>
<info>
<message-vpn> vpn/b </message-vpn>
</info>
</queue>
<queue>
<name> queue/c </name>
<info>
<message-vpn> vpn/b </message-vpn>
</info>
</queue>
</queues>
And here's the XPath expression I used:
/queues/queue/info/message-vpn[text()=("vpn/b")]
When I access the data, only queue/b is selected, not queue/c.
How can I select all matching nodes?

Your original XPath expression fails because the text vpn/b is surrounded by spaces. Use normalize-space() to compare the trimmed text instead:
/queues/queue/info/message-vpn[normalize-space()=("vpn/b")]
If you want to count the number of these queues, use the count() function:
count(/queues/queue/info/message-vpn[normalize-space()=("vpn/b")])
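On the Java side, one possible reason for seeing only the first match is evaluating the expression as a single node or string; requesting XPathConstants.NODESET returns every match. The sketch below uses the JDK's javax.xml.xpath API. The class name QueueXPathDemo, its method names, and the variant expression that selects the queue names are illustrative assumptions, not part of the original post.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are illustrative only.
final class QueueXPathDemo {

    /** Returns the trimmed text of every node matched by the expression. */
    static List<String> selectTexts(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // NODESET asks for all matches; NODE or STRING would yield only the first.
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent().trim());
            }
            return texts;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    /** Wraps the expression in count(...) and returns the result as an int. */
    static int count(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            Double n = (Double) xpath.evaluate(
                    "count(" + expression + ")",
                    new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
            return n.intValue();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<queues>"
                + "<queue><name> queue/a </name><info><message-vpn> vpn/a </message-vpn></info></queue>"
                + "<queue><name> queue/b </name><info><message-vpn> vpn/b </message-vpn></info></queue>"
                + "<queue><name> queue/c </name><info><message-vpn> vpn/b </message-vpn></info></queue>"
                + "</queues>";
        // Variant of the expression above that selects the queue names instead.
        String names = "/queues/queue[normalize-space(info/message-vpn)='vpn/b']/name";
        System.out.println(selectTexts(xml, names));
        System.out.println(count(xml, "/queues/queue/info/message-vpn[normalize-space()='vpn/b']"));
    }
}
```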

Try the following (note that contains() also matches longer values such as vpn/b2):
//queues/queue/info/message-vpn[contains(.,'vpn/b')]

Related

Check the quantity of similar nodes in xml, parse them 1 by 1

My program reads in XML data to automatically fill in the orders in the database. It is written in Java with JPA, since it builds on sets.
When one client orders one edition for one delivery address, I can easily parse the XML. But what if there are multiple editions or delivery addresses? Here is an example of a possible XML file, of course without sensitive data:
<Order>
<Id>id</Id>
<Platform>platform</Platform>
<Brand></Brand>
<Orderdate>2022-03-14T18:34:28+01:00</Orderdate>
<Items>
<Item>
<Paper>
<EditionId>editionnr</EditionId>
<Name>nameOfEdition</Name>
<Platform>platform</Platform>
<Format>
<Name>Tabloid</Name>
</Format>
<NumberOfPages>x</NumberOfPages>
<PaperKind>paperkind</PaperKind>
<File name="xxxxxx"/>
</Paper>
<Contacts>
<Contact>
<Copies>x</Copies>
<Company>companyName</Company>
<Salutation></Salutation>
<Firstname>firstname</Firstname>
<Lastname>lastname</Lastname>
<Street>street</Street>
<HouseNumber>nr</HouseNumber>
<BoxNumber/>
<Zip>xxxx</Zip>
<City>city</City>
<Country>countrytag</Country>
<C_Email>mail</C_Email>
</Contact>
</Contacts>
</Item>
<Item>
<Paper>
<EditionId>editionnr</EditionId>
<Name>nameOfEdition</Name>
<Platform>platform</Platform>
<Format>
<Name>Tabloid</Name>
</Format>
<NumberOfPages>x</NumberOfPages>
<PaperKind>paperkind</PaperKind>
<File name="xxxxxx"/>
</Paper>
<Contacts>
<Contact>
<Copies>x</Copies>
<Company>companyName</Company>
<Salutation></Salutation>
<Firstname>firstname</Firstname>
<Lastname>lastname</Lastname>
<Street>street</Street>
<HouseNumber>nr</HouseNumber>
<BoxNumber/>
<Zip>xxxx</Zip>
<City>city</City>
<Country>countrytag</Country>
<C_Email>mail</C_Email>
</Contact>
</Contacts>
</Item>
</Items>
</Order>
I use this line of Java code to get the specified information per node (of course I parse the entire XML in advance), which I then insert into my new Order as the corresponding attribute:
eElement.getElementsByTagName("EditionId").item(0).getTextContent();
My question is: when I have multiple items or delivery addresses, how do I instruct my program to check this?
In plain language, I want it to do this:
check whether there is only 1 item node or multiple
if only one, no problem
if multiple, create a line for each item
And of course the same for delivery addresses. Since there is only 1 contact per item in this example, you cannot see it in the XML, but some of the XML files also have multiple contacts within an item. I guess this can be handled with exactly the same method as multiple items within an order.
What is the best way to do this?
To answer the first part: the NodeList returned by getElementsByTagName has a getLength() method that tells you how many matching nodes there are.
var nodeList = eElement.getElementsByTagName("EditionId");
int count = nodeList.getLength();
For the second part, loop over the indexes from 0 to getLength() - 1 and fill in each value, which is equivalent to:
var EditionId1 = eElement.getElementsByTagName("EditionId").item(0).getTextContent();
var EditionId2 = eElement.getElementsByTagName("EditionId").item(1).getTextContent();
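As a rough sketch of that loop, the code below walks every Item element and, inside each item, every Contact, producing one line per combination. It assumes each Item contains an EditionId and each Contact a Lastname, as in the sample XML; the class name OrderItems, the method orderLines, and the output format are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; not taken from the original program.
final class OrderItems {

    /** Builds one "editionId -> lastname" line per Item/Contact pair. */
    static List<String> orderLines(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> lines = new ArrayList<>();
            NodeList items = doc.getElementsByTagName("Item");
            // Works the same whether there is one Item or many.
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                String editionId = text(item, "EditionId");
                // Search within this item only, so contacts are not mixed across items.
                NodeList contacts = item.getElementsByTagName("Contact");
                for (int c = 0; c < contacts.getLength(); c++) {
                    Element contact = (Element) contacts.item(c);
                    lines.add(editionId + " -> " + text(contact, "Lastname"));
                }
            }
            return lines;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse order XML", e);
        }
    }

    // Assumes the tag is present below the parent; item(0) would be null otherwise.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml = "<Order><Items>"
                + "<Item><Paper><EditionId>e1</EditionId></Paper>"
                + "<Contacts><Contact><Lastname>Smith</Lastname></Contact></Contacts></Item>"
                + "<Item><Paper><EditionId>e2</EditionId></Paper>"
                + "<Contacts><Contact><Lastname>Jones</Lastname></Contact>"
                + "<Contact><Lastname>Brown</Lastname></Contact></Contacts></Item>"
                + "</Items></Order>";
        orderLines(xml).forEach(System.out::println);
    }
}
```

Calling getElementsByTagName on the item element rather than on the document keeps each lookup scoped to that item, which is what makes the nested loop safe.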

How to get xml nodes count in camel

I want to get the count of xml nodes present in the file with specific tags in camel exchange or camel route.
My xml tags are like this:
<parent>
<child>
<data>A</data>
</child>
<child>
<data>B</data>
</child>
<child>
<data>C</data>
<child>
<data>C1</data>
</child>
<child>
<data>C2</data>
</child>
</child>
</parent>
I want to count the <child> tags, which should return 5 for this document.
Currently, I am getting the size from the Exchange, but it returns 3:
exchange.getIn().getBody(XmlTreesType.class).getParentTree().getChildNode().size();
This highly depends on your actual use case.
To extract the count, you could use XPath, which lets you extract information from XML easily.
To count all <child> nodes within your <parent>, you could use the following XPath expression:
count(/parent//child)
Extracting this value and storing it in a header would look like this:
from()
.setHeader("childCountHeader", xpath("count(/parent//child)", Integer.class));
Another typical use case in the Camel Java DSL is routing directly based on the count:
from()
.choice().xpath("count(/parent//child)>5")
//do something
.otherwise()
//do something else
.end();
If you want to use XPath in plain Java, for example inside a Camel processor, you can build an XPath processor as described in this answer.
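A minimal plain-Java sketch of such a count, using only the JDK's javax.xml.xpath API (no Camel classes), might look like the following. It also illustrates a plausible reason for the 3 in the question: /parent/child counts only the direct children, while /parent//child counts every descendant. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper that could be called from a Camel processor.
final class ChildCounter {

    /** Evaluates a count(...) expression against the XML and returns it as an int. */
    static int evaluateCount(String xml, String countExpression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // XPath numbers are doubles, so the result arrives as a Double.
            Double result = (Double) xpath.evaluate(
                    countExpression, new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
            return result.intValue();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + countExpression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<parent>"
                + "<child><data>A</data></child>"
                + "<child><data>B</data></child>"
                + "<child><data>C</data>"
                + "<child><data>C1</data></child>"
                + "<child><data>C2</data></child>"
                + "</child>"
                + "</parent>";
        System.out.println(evaluateCount(xml, "count(/parent/child)"));  // direct children only
        System.out.println(evaluateCount(xml, "count(/parent//child)")); // all descendants
    }
}
```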

Parse XML in which tag name is not fixed

It is easy to parse XML in which tag names are fixed. In XStream, we can simply use the @XStreamAlias("tagname") annotation. But how do I parse XML in which the tag name is not fixed? Suppose I have the following XML:
<result>
<result1>
<fixed1> ... </fixed1>
<fixed2> ... </fixed2>
</result1>
<result2>
<item>
<America>
<name> America </name>
<language> English </language>
</America>
</item>
<item>
<Spain>
<name> Spain </name>
<language> Spanish </language>
</Spain>
</item>
</result2>
</result>
The tag names America and Spain are not fixed, and sometimes I may get other tag names like Germany, India, etc.
How do I define a POJO for the tag result2 in such a case? Is there a way to tell XStream to accept anything as the alias name if the tag name is not known beforehand?
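If it is OK for you to get the tag name from inside the element itself (the 'name' field), you can use XPath. Since the variable elements sit below <item>, the path has to step through it:
//result2/item/*/name/text()
Another option is to select the whole element:
//result2/item/*
or to ask for the element names directly (name() as a path step requires XPath 2.0 or later):
//result2/item/*/name()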
Some technologies (specifically, data binding approaches) are optimized for handling XML whose structure is known at compile time. Others (like DOM and other DOM-like tree models - JDOM, XOM etc) are designed for handling XML whose structure is not known in advance. Use the right tool for the job.
XSLT and XQuery try to blend both. In their schema-aware form, they can take advantage of static structure information when it is available. But more usually they are run in "untyped" mode, where there is no a-priori knowledge of element names or structure, and everything is handled as it comes. The XSLT rule-based processing paradigm is particularly well suited to "semi-structured" XML whose content is unpredictable or variable.
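To illustrate the tree-model approach, here is a small sketch that reads the unknown element names with the JDK's DOM and XPath 1.0 support rather than with XStream. The wildcard step selects whatever element sits below each <item>, and the tag name is read at runtime. The class name DynamicTags and the choice of a map as the result are assumptions for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; assumes every variable element contains a <language> child.
final class DynamicTags {

    /** Maps each unknown tag name below result2/item to its <language> text. */
    static Map<String, String> languagesByTag(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // "*" matches any element, so the tag name need not be known in advance.
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("/result/result2/item/*", doc, XPathConstants.NODESET);
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                String language = element.getElementsByTagName("language")
                        .item(0).getTextContent().trim();
                result.put(element.getTagName(), language);
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException
                | XPathExpressionException e) {
            throw new IllegalArgumentException("Could not read result XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<result><result2>"
                + "<item><America><name> America </name><language> English </language></America></item>"
                + "<item><Spain><name> Spain </name><language> Spanish </language></Spain></item>"
                + "</result2></result>";
        System.out.println(languagesByTag(xml));
    }
}
```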

OAI Jaxen XPath problem

I'm having big problems with XPath evaluation using Jaxen.
Here's part of the XML I'm evaluating against:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
<responseDate>2011-05-31T13:04:08+00:00</responseDate>
<request metadataPrefix="oai_dc" verb="ListRecords">http://citeseerx.ist.psu.edu/oai2</request>
<ListRecords>
<record>
<header>
<identifier>oai:CiteSeerXPSU:10.1.1.1.1484</identifier>
<datestamp>2009-05-24</datestamp>
</header>
<metadata>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Winner-Take-All..</dc:title>
<dc:relation>10.1.1.134.6077</dc:relation>
<dc:relation>10.1.1.65.2144</dc:relation>
<dc:relation>10.1.1.54.7277</dc:relation>
<dc:relation>10.1.1.48.5282</dc:relation>
</oai_dc:dc>
</metadata>
</record>
<resumptionToken>10.1.1.1.2041-1547151-500-oai_dc</resumptionToken>
</ListRecords>
</OAI-PMH>
I'm using Jaxen because in my use case it's much faster than the Apache implementation. I'm using W3C DOM for the XML representation.
I need to select all record elements, and then evaluate other XPaths on the selected nodes (this is needed because of my processing architecture).
I'm selecting all record nodes (this works):
/OAI-PMH/ListRecords/record
Then on every selected record node I'm evaluating other xpaths to get needed data:
Select identifier text value (this works):
header/identifier/text()
Select title text value (this does NOT work):
metadata/oai_dc:dc/dc:title/text()
I've registered namespaces prefixes with their URIs (oai_dc and dc). I also tried other xpaths but none of them work:
metadata/dc/title/text()
metadata//dc:title/text()
I've read other Stack Overflow questions about XPaths and namespaces, and the suggested solution of adding a prefix "oai" with the URI "http://www.openarchives.org/OAI/2.0/". I tried adding that "oai:" prefix to the nodes without a defined prefix, but as a result I didn't even select the record nodes. Any ideas what I'm doing wrong?
Solution:
The problem was the parser (thanks jasso). It wasn't set to be namespace aware; after changing that setting everything works as expected.
I can't see how the XPath expression /OAI-PMH/ListRecords/record can possibly select anything, since your document does not have a {}OAI-PMH element, only a {http://www.openarchives.org/OAI/2.0/}OAI-PMH element. See http://jaxen.codehaus.org/faq.html
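The sketch below shows the namespace-aware setup using the JDK's own javax.xml.xpath API instead of Jaxen, so it illustrates the principle rather than the asker's exact code. The parser is made namespace aware, and every element in the default namespace is addressed through a registered prefix (here "oai"). The class name OaiTitles and the prefix map are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class using the JDK XPath engine, not Jaxen.
final class OaiTitles {

    private static final Map<String, String> PREFIXES = Map.of(
            "oai", "http://www.openarchives.org/OAI/2.0/",
            "oai_dc", "http://www.openarchives.org/OAI/2.0/oai_dc/",
            "dc", "http://purl.org/dc/elements/1.1/");

    /** Returns the dc:title of every record in the response. */
    static List<String> titles(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return PREFIXES.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
                }
                @Override
                public String getPrefix(String uri) {
                    return null; // reverse lookup is not needed here
                }
                @Override
                public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });

            NodeList records = (NodeList) xpath.evaluate(
                    "/oai:OAI-PMH/oai:ListRecords/oai:record", doc, XPathConstants.NODESET);
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < records.getLength(); i++) {
                // Relative expression, evaluated with the record as context node.
                titles.add(xpath.evaluate(
                        "oai:metadata/oai_dc:dc/dc:title/text()", records.item(i)));
            }
            return titles;
        } catch (ParserConfigurationException | SAXException | IOException
                | XPathExpressionException e) {
            throw new IllegalArgumentException("Could not read OAI response", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<OAI-PMH xmlns=\"http://www.openarchives.org/OAI/2.0/\"><ListRecords><record>"
                + "<metadata><oai_dc:dc xmlns:dc=\"http://purl.org/dc/elements/1.1/\""
                + " xmlns:oai_dc=\"http://www.openarchives.org/OAI/2.0/oai_dc/\">"
                + "<dc:title>Winner-Take-All..</dc:title></oai_dc:dc></metadata>"
                + "</record></ListRecords></OAI-PMH>";
        System.out.println(titles(xml));
    }
}
```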

Finding all valid xpath from xml

I am trying to write a program in Java that finds all the XPaths for a given XML document. I found an XPath generator on the internet, but it does not work when an element can repeat multiple times. For example, given the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<Report>
<Name>
<FirstName>A</FirstName>
<LastName>B</LastName>
<MiddleName>C</MiddleName>
</Name>
<Name>
<FirstName>D</FirstName>
<LastName>E</LastName>
<MiddleName>S</MiddleName>
</Name>
</Report>
It will produce the XPath
/Report/Name/FirstName for both FirstName nodes,
but the expected output is /Report/Name[1]/FirstName and /Report/Name[2]/FirstName.
Any ideas?
I think you may have to do this yourself.
Using a SAX parser will make it straightforward. Just maintain a stack of the elements you encounter and a count so you can increment the indexes (/Report/Name[1], /Report/Name[2]) easily.
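One way to sketch that idea is a SAX handler that keeps the current path on a stack and, for each open element, a map counting how many children of each name have been seen so far. The class name XPathLister is hypothetical. Note that this version writes an index on every step (for example /Report[1]/Name[2]), which addresses the same nodes as the shorter form.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical SAX handler that records an indexed XPath for every element.
final class XPathLister extends DefaultHandler {

    private final Deque<String> path = new ArrayDeque<>();
    // One counter map per open element: how often each child name has occurred.
    private final Deque<Map<String, Integer>> childCounts = new ArrayDeque<>();
    private final List<String> xpaths = new ArrayList<>();

    private XPathLister() {
        childCounts.push(new HashMap<>()); // counters for the document level
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // Position of this element among same-named siblings, starting at 1.
        int index = childCounts.peek().merge(qName, 1, Integer::sum);
        path.addLast(qName + "[" + index + "]");
        childCounts.push(new HashMap<>()); // fresh counters for this element's children
        xpaths.add("/" + String.join("/", path));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        path.removeLast();
        childCounts.pop();
    }

    /** Parses the XML and returns one XPath per element, in document order. */
    static List<String> list(String xml) {
        XPathLister handler = new XPathLister();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return handler.xpaths;
    }

    public static void main(String[] args) {
        String xml = "<Report>"
                + "<Name><FirstName>A</FirstName><LastName>B</LastName><MiddleName>C</MiddleName></Name>"
                + "<Name><FirstName>D</FirstName><LastName>E</LastName><MiddleName>S</MiddleName></Name>"
                + "</Report>";
        list(xml).forEach(System.out::println);
    }
}
```

Because SAX streams the document, this keeps memory use proportional to the nesting depth plus the list of generated paths, rather than to a full DOM tree.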
