get child nodes from parent (xml, java) - java

UPDATE
I was specifically targeting staff under some root node, not all "staff" elements in the whole document. I forgot to mention this important detail in the question. Sorry, guys.
I found this answer to my question:
getElementsByTagName
But with this data:
<one>
<two>
<three>
<company>
<staff id="1001">
<firstname>Golf</firstname>
<lastname>4</lastname>
<nickname>Schnecke</nickname>
<salary>1</salary>
</staff>
<staff id="2001">
<firstname>Audi</firstname>
<lastname>R8</lastname>
<nickname>Rennaudi</nickname>
<salary>1111111</salary>
</staff>
<staff id="2002">
<firstname>Skoda</firstname>
<lastname>xyz</lastname>
<nickname>xyz</nickname>
<salary>0.1</salary>
</staff>
</company>
</three>
</two>
</one>
and this code:
public static void parseXML2() {
File fXmlFile = new File("src\\main\\java\\staff.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = null;
try {
dBuilder = dbFactory.newDocumentBuilder();
} catch (ParserConfigurationException ex) {
Logger.getLogger(MyParser.class.getName()).log(Level.SEVERE, null, ex);
}
Document doc = null;
try {
doc = dBuilder.parse(fXmlFile);
} catch (SAXException ex) {
Logger.getLogger(MyParser.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
Logger.getLogger(MyParser.class.getName()).log(Level.SEVERE, null, ex);
}
System.out.println("test");
System.out.println(doc.getElementsByTagName("company").item(0).getTextContent());
}
I don't get just one staff element, but all of them. How come?
I was expecting to get:
Golf
4
Schnecke
1
but instead I get this:
Golf
4
Schnecke
1
Audi
R8
Rennaudi
1111111
Skoda
xyz
xyz
0.1

You are almost there. If you want to get the text contents of the first staff node, then get the elements by that tag name:
System.out.println(doc.getElementsByTagName("staff").item(0).getTextContent());
// ^^^^^^
Update
In case you want to get the first staff node under company, then you can find them with node type and node name checks. Here is a rudimentary loop to do this:
Node companyNode = doc.getElementsByTagName("company").item(0);
NodeList companyChildNodes = companyNode.getChildNodes();
for (int i = 0; i < companyChildNodes.getLength(); i++) {
Node node = companyChildNodes.item(i);
if (node.getNodeType() == Node.ELEMENT_NODE && Objects.equals("staff", node.getNodeName())) {
System.out.println(node.getTextContent());
break;
}
}
You might want to refactor the for loop into a separate method.
You can use XPATH too. I think it's more concise:
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//company/staff[1]");
NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
System.out.println(nl.item(0).getTextContent());
Explanation:
//company selects all the company elements, regardless of where they are in the XML. It's the // that ignores the rest of the document structure.
//company/staff selects all the staff elements that are direct children of a company element.
[1] selects the first such item (XPath positions start at 1, not 0).
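To make the XPath explanation concrete, here is a minimal, self-contained sketch. It assumes a trimmed-down copy of the question's XML inlined as a string; the class and method names (FirstStaffXPath, firstStaffFirstName) are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class FirstStaffXPath {
    // Trimmed-down version of the question's XML (assumption: inlined for the sketch).
    static final String XML =
        "<one><company>"
        + "<staff id=\"1001\"><firstname>Golf</firstname><lastname>4</lastname></staff>"
        + "<staff id=\"2001\"><firstname>Audi</firstname><lastname>R8</lastname></staff>"
        + "</company></one>";

    // Returns the firstname of the first staff element under a company element.
    static String firstStaffFirstName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // staff[1] is the FIRST staff child, because XPath counts from 1.
            Node staff = (Node) xpath.evaluate("//company/staff[1]", doc, XPathConstants.NODE);
            // A relative expression is evaluated against the staff node only.
            return xpath.evaluate("firstname", staff);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate the XPath expression", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstStaffFirstName(XML));
    }
}
```

Requesting XPathConstants.NODE instead of NODESET avoids the nl.item(0) step when only one node is wanted.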

You have a line of code which gets the text content from the first company.
System.out.println(doc.getElementsByTagName("company").item(0).getTextContent());
This gets all the text content in the first company node which in this case is the data inside 3 staff elements. If you want to select the first staff element, you can either select it by name or by getting the first child of the company.
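One caveat with "getting the first child of the company": in indented XML, getFirstChild() usually returns a whitespace text node, not the first staff element. The sketch below skips to the first element child; the class and helper names are hypothetical and the XML is a shortened copy of the question's data.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class FirstElementChild {
    // Indented on purpose, so that whitespace text nodes exist between the elements.
    static final String XML =
        "<company>\n"
        + "  <staff id=\"1001\"><firstname>Golf</firstname></staff>\n"
        + "  <staff id=\"2001\"><firstname>Audi</firstname></staff>\n"
        + "</company>";

    // Parses the string and returns the document's root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    // Walks the siblings until the first ELEMENT_NODE is found; returns null if there is none.
    static Element firstElementChild(Node parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Element company = parseRoot(XML);
        // The raw first child is the "\n  " text node, not a staff element.
        System.out.println(company.getFirstChild().getNodeType() == Node.TEXT_NODE);
        System.out.println(firstElementChild(company).getAttribute("id"));
    }
}
```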

If your XML has the format you mentioned, then this line of code
System.out.println(doc.getElementsByTagName("company").item(0).getTextContent());
prints ALL the text content under the company element, so it prints everything in company. If you want to print only the first staff element, change the tag name to "staff" and pass the index of the staff element you want (0, 1, 2, ...) to item():
System.out.println(doc.getElementsByTagName("staff").item(0).getTextContent());
I haven't tried this out, but I think it will work.

Related

How can I read all child elements of a XML file using Java?

The goal: read through the entire XML file and return only specific values, in this case, a "username" and "password", so I validate a user login.
The problem: when the program reads the "users.xml" file, it returns all attributes and values from the "user id=0001", but other child elements like "user id=0002" it won't return. Like there's nothing else in the file but the first element.
What I tried: printing all info from the file using a for loop to read line by line and it works just fine, but when it comes to getting one particular value, it can only see the first one.
I'd like to know how can I make it so that "getUser()" returns any "username" and "password" in the file?
Here's what I have so far:
users.xml
<!--I made some modifications in the password field-->
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE Users>
<Users>
<User id = "0001">
<firstname>sinead</firstname>
<lastname>o'connor</lastname>
<email>sinead#oconnor.ie</email>
<username>sinead</username>
<password>oconnor</password>
</User>
<User id= "0002">
<firstname>John</firstname>
<lastname>Don</lastname>
<email>john#don.ie</email>
<username>john</username>
<password>pass</password>
</User>
</Users>
readFile() method
void readFile(){
String pathXML = "src/user_system/my_library/xml/users.xml";
File usersXML = new File(pathXML);
try {
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = builderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(usersXML);
document.getDocumentElement().normalize();
} catch (ParserConfigurationException | SAXException | IOException e) {
e.printStackTrace();
}
}
getUser() method
//I just split between readFile() and getUser() for better readability.
//Just changed "Users" to "User".
void getUser(){
NodeList nodeList = document.getElementsByTagName("User");
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
//I also modified this bit.
Element xmlElement = (Element)nodeList.item(i);
String usernameElement = xmlElement.getElementsByTagName("username").item(0)
.getTextContent();
String passwordElement = xmlElement.getElementsByTagName("password").item(0)
.getTextContent();
if (!usernameElement.equals(users.getUsername())){
System.out.println("\nUsername "+ "\""+ users.getUsername()+ "\"" +
" was not found in our registry." + "\nPlease try again");
//users.login() is just a simple username/password form in another class.
users.login();
}
else if(!passwordElement.equals(users.getPassword())){
System.out.println("\nWrong password. Please try again.");
users.login();
}else {
System.out.println("Welcome " + WordUtils.capitalize(xmlElement.getElementsByTagName("firstname").item(0)
.getTextContent())+ " " +WordUtils.capitalize(xmlElement.getElementsByTagName("lastname").item(0).getTextContent()));
}
}
}
I updated the code, but I still can't get other users, only the first one.
I get the following output:
Please, choose one of the options below:
1) Press 1 to Log In.
2) Press 2 to Create an Account.
3) Press 3 to Log Off.
> 1
Log In.
Please, enter your username:
> john
Please, enter your password:
> pass
Username "john" was not found in our registry.
Any help is appreciated.
You can pick up all the nodes by their tag name. Here you have used Users as your base node, but you can also query the User tag directly by calling:
NodeList nodeList = document.getElementsByTagName("User");
Calling this on the document gives you only the User nodes. Once you have them, you can iterate over them one by one and query the nodes inside each User node, as shown below.
NodeList nodeList = document.getElementsByTagName("User");
for(int i=0;i<nodeList.getLength();i++) {
Element xmlElement = (Element)nodeList.item(i);
System.out.println(xmlElement.getElementsByTagName("username").item(0).getTextContent());
System.out.println(xmlElement.getElementsByTagName("password").item(0).getTextContent());
}
Note that if the username and password elements in your sample XML hold the same values, you might want to change them so that you can verify the output.
Some outputs would be helpful.
My guess is that when you create the NodeList, you're creating a list of length 1, since "Users" will grab the single element with the tag "Users".
NodeList nodeList = document.getElementsByTagName("Users");
You probably want to populate the nodelist with
NodeList nodeList = document.getElementsByTagName("User");
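Judging from the updated getUser() in the question, there may be a second cause: the loop reports "not found" as soon as the first User does not match, so later users such as john are never compared. A minimal sketch of one way around this is to skip non-matching users and report failure only after the whole list has been examined. The class, enum and method names are hypothetical, and the XML is a shortened copy of users.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LoginCheck {
    // Shortened copy of users.xml from the question (assumption: inlined for the sketch).
    static final String XML =
        "<Users>"
        + "<User id=\"0001\"><username>sinead</username><password>oconnor</password></User>"
        + "<User id=\"0002\"><username>john</username><password>pass</password></User>"
        + "</Users>";

    enum Result { OK, WRONG_PASSWORD, UNKNOWN_USER }

    static Result check(String xml, String username, String password) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList users = doc.getElementsByTagName("User");
            for (int i = 0; i < users.getLength(); i++) {
                Element user = (Element) users.item(i);
                String name = user.getElementsByTagName("username").item(0).getTextContent();
                if (!name.equals(username)) {
                    continue; // not this user: keep looking instead of failing right away
                }
                String pass = user.getElementsByTagName("password").item(0).getTextContent();
                return pass.equals(password) ? Result.OK : Result.WRONG_PASSWORD;
            }
            // Reached only after every User element was compared.
            return Result.UNKNOWN_USER;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read the users XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(check(XML, "john", "pass"));
    }
}
```

Returning a result and letting the caller decide whether to show the login form again also keeps users.login() out of the parsing loop.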

Getting null values from XPath query

I have this xml file:
<?xml version="1.0" encoding="UTF-8"?>
<iet:aw-data xmlns:iet="http://care.aw.com/IET/2007/12" class="com.aw.care.bean.resource.MessageResource">
<iet:metadata filter=""/>
<iet:message-resource>
<iet:message>some message 1</iet:message>
<iet:customer id="1"/>
<iet:code>edi.claimfilingindicator.11</iet:code>
<iet:locale>iw_IL</iet:locale>
</iet:message-resource>
<iet:message-resource>
<iet:message>some message 2</iet:message>
<iet:customer id="1"/>
<iet:code>edi.claimfilingindicator.12</iet:code>
<iet:locale>iw_IL</iet:locale>
</iet:message-resource>
.
.
.
.
</iet:aw-data>
Using the code below, I iterate over the data to find what I need.
try {
FileInputStream fileIS = new FileInputStream(new File("resources\\bootstrap\\content\\MessageResources_iw_IL\\MessageResource_iw_IL.ctdata.xml"));
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
builderFactory.setNamespaceAware(true); // never forget this!
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(fileIS);
XPath xPath = XPathFactory.newInstance().newXPath();
String query = "//*[local-name()='message-resource']//*[local-name()='code'][contains(text(), 'account')]";
NodeList nodeList = (NodeList) xPath.compile(query).evaluate(xmlDocument, XPathConstants.NODESET);
System.out.println("size= " + nodeList.getLength());
for (int i = 0; i < nodeList.getLength(); i++) {
System.out.println(nodeList.item(i).getNodeValue());
}
}
catch (Exception e){
e.printStackTrace();
}
The issue is that I'm getting only null values when printing in the for loop. Any idea why this happens?
The code needs to return a list of nodes whose code and message fields contain the given parameters (like an SQL query with two parameters joined by an AND operator).
Check the documentation:
https://docs.oracle.com/javase/7/docs/api/org/w3c/dom/Node.html
getNodeValue() applied to an element node returns null.
Use getTextContent().
Alternatively, if you find DOM too frustrating, switch to one of the better tree models like JDOM2 or XOM.
Also, if you used an XPath 2.0 engine like Saxon, it would (a) simplify your expression to
//*:message-resource//*:code[contains(text(), 'account')]
and (b) allow you to return a sequence of strings from the XPath expression, rather than a sequence of nodes, so you wouldn't have to mess around with nodelists.
Another point: I suspect that the predicate [contains(text(), 'account')] should really be [.='account']. I'm not sure of that, but using text() instead of ".", and using contains() instead of "=", are both common mistakes.
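To illustrate the getNodeValue() versus getTextContent() point with the JDK's built-in XPath 1.0 engine, here is a small sketch. It assumes a shortened copy of the question's XML and searches for a fragment that actually occurs in that sample; the class and method names are made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CodeLookup {
    // Shortened copy of the question's XML (assumption: inlined for the sketch).
    static final String XML =
        "<iet:aw-data xmlns:iet=\"http://care.aw.com/IET/2007/12\">"
        + "<iet:message-resource><iet:message>some message 1</iet:message>"
        + "<iet:code>edi.claimfilingindicator.11</iet:code></iet:message-resource>"
        + "<iet:message-resource><iet:message>some message 2</iet:message>"
        + "<iet:code>edi.claimfilingindicator.12</iet:code></iet:message-resource>"
        + "</iet:aw-data>";

    // Selects the code elements whose text contains the fragment.
    // Sketch only: the fragment is pasted into the query, so it must not contain a quote.
    static NodeList codeNodes(String xml, String fragment) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            String query = "//*[local-name()='message-resource']//*[local-name()='code']"
                + "[contains(text(), '" + fragment + "')]";
            return (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate the query", e);
        }
    }

    // getTextContent() returns the text inside each matched element.
    static List<String> matchingCodes(String xml, String fragment) {
        NodeList nodes = codeNodes(xml, fragment);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    // getNodeValue() on an element node is null, which is what the question observed.
    static String firstNodeValue(String xml, String fragment) {
        return codeNodes(xml, fragment).item(0).getNodeValue();
    }

    public static void main(String[] args) {
        System.out.println(matchingCodes(XML, ".12"));
        System.out.println(firstNodeValue(XML, ".12"));
    }
}
```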

PARSING A COMPLEX XML USING DOM

I know that this sort of question has been asked here before, but I still couldn't find a solution to my problem. So please, can anyone help me with it?
Situation:
I am parsing the following xml response using DOM parser.
<feed>
<post_id>16</post_id>
<user_id>68</user_id>
<restaurant_id>5</restaurant_id>
<dish_id>7</dish_id>
<post_img_id>14</post_img_id>
<rezing_post_id></rezing_post_id>
<price>8.30</price>
<review>very bad</review>
<rating>4.0</rating>
<latitude>22.299999000000</latitude>
<longitude>73.199997000000</longitude>
<near> Gujarat</near>
<posted>1340869702</posted>
<display_name>username</display_name>
<username>vivek</username>
<first_name>vivek</first_name>
<last_name>mitra</last_name>
<dish_name>Hash brows</dish_name>
<restaurant_name>Waffle House</restaurant_name>
<post_img>https://img1.yumzing.com/1000/9cab8fc91</post_img>
<post_comment_count>0</post_comment_count>
<post_like_count>0</post_like_count>
<post_rezing_count>0</post_rezing_count>
<comments>
<comment/>
</comments>
</feed>
<feed>
<post_id>8</post_id>
<user_id>13</user_id>
<restaurant_id>5</restaurant_id>
<dish_id>6</dish_id>
<post_img_id>7</post_img_id>
<rezing_post_id></rezing_post_id>
<price>3.50</price>
<review>This is cheesy!</review>
<rating>4.0</rating>
<latitude>42.187000000000</latitude>
<longitude>-88.346497000000</longitude>
<near>Lake in the Hills IL</near>
<posted>1340333509</posted>
<display_name>username</display_name>
<username>Nullivex</username>
<first_name>Bryan</first_name>
<last_name>Tong</last_name>
<dish_name>Hash Brows with Cheese</dish_name>
<restaurant_name>Waffle House</restaurant_name>
<post_img>https://img1.yumzing.com/1000/78e5c184fd3ae752f8665636381a8f0006762dc0</post_img>
<post_comment_count>6</post_comment_count>
<post_like_count>1</post_like_count>
<post_rezing_count>1</post_rezing_count>
<comments>
<comment>
<user_id>16</user_id>
<email>existentialism27#gmail.com</email>
<email_new></email_new>
<email_verification_code></email_verification_code>
<password>9d99ef4f72f9d2df968a75e058c78245fa45e9e7</password>
<password_reset_code></password_reset_code>
<salt>31a988badccd35a1be7dacc073f60f52e49ff881</salt>
<username>existentialism27</username>
<first_name>Daniel</first_name>
<last_name>Amaya</last_name>
<display_name>username</display_name>
<birth_month>10</birth_month>
<birth_day>5</birth_day>
<birth_year>1985</birth_year>
<city>Colorado Springs</city>
<state>CO</state>
<country>US</country>
<timezone>US/Mountain</timezone>
<last_seen>1338365509</last_seen>
<is_confirmed>1</is_confirmed>
<is_active>1</is_active>
<post_comment_id>9</post_comment_id>
<post_id>8</post_id>
<comment>this is a test comment!</comment>
<posted>1340334121</posted>
</comment>
<comment>
<user_id>16</user_id>
<email>existentialism27#gmail.com</email>
<email_new></email_new>
<email_verification_code></email_verification_code>
<password>9d99ef4f72f9d2df968a75e058c78245fa45e9e7</password>
<password_reset_code></password_reset_code>
<salt>31a988badccd35a1be7dacc073f60f52e49ff881</salt>
<username>existentialism27</username>
<first_name>Daniel</first_name>
<last_name>Amaya</last_name>
<display_name>username</display_name>
<birth_month>10</birth_month>
<birth_day>5</birth_day>
<birth_year>1985</birth_year>
<city>Colorado Springs</city>
<state>CO</state>
<country>US</country>
<timezone>US/Mountain</timezone>
<last_seen>1338365509</last_seen>
<is_confirmed>1</is_confirmed>
<is_active>1</is_active>
<post_comment_id>10</post_comment_id>
<post_id>8</post_id>
<comment>this is a test comment!</comment>
<posted>1340334128</posted>
</comment>
</comments>
</feed>
In the above XML response, I am getting multiple "feed" elements, which I am able to parse without any problem, but each "feed" can have zero or N "comment" elements. I am not able to parse the comments for an individual feed. Can anyone suggest how to proceed in the right direction?
I am also including a snippet of the code (NOT the entire code) that I am using to parse the XML document, so it will be easier to pinpoint the problem.
DocumentBuilderFactory odbf = DocumentBuilderFactory.newInstance();
DocumentBuilder odb = odbf.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(xml));
Document odoc = odb.parse(is);
odoc.getDocumentElement().normalize ();
NodeList LOP = odoc.getElementsByTagName("feed");
System.out.println(LOP.getLength());
for (int s=0 ; s<LOP.getLength() ; s++){
Node FPN =LOP.item(s);
try{
if(FPN.getNodeType() == Node.ELEMENT_NODE)
{
Element token = (Element)FPN;
NodeList oNameList0 = token.getElementsByTagName("post_id");
Element ZeroNameElement = (Element)oNameList0.item(0);
NodeList textNList0 = ZeroNameElement.getChildNodes();
feed_post_id = Integer.parseInt(((Node)textNList0.item(0)).getNodeValue().trim());
System.out.println("#####The Parsed data#####");
System.out.println("post_id : " + ((Node)textNList0.item(0)).getNodeValue().trim());
System.out.println("#####The Parsed data#####");
}
}catch(Exception ex){}
}
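Once you have the feed NodeList, iterate over it and, for each feed node, use:
NodeList nodes = feedNode.getChildNodes();
// NodeList is not Iterable, so use an index-based loop instead of for-each
for (int i = 0; i < nodes.getLength(); i++)
{
Node node = nodes.item(i);
if(node.getNodeName().equals("comments")){
//do something with comments node
}
}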
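A fuller sketch of that idea follows. Note that in the question's XML each comment element contains an inner comment element holding the text, so getElementsByTagName("comment") on a feed would return both levels; walking direct children avoids that. The feeds wrapper element, the class name and the helper names are assumptions added for illustration, and the XML is heavily trimmed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FeedComments {
    // Trimmed copy of the response; the <feeds> root is an assumption (a document needs one root).
    static final String XML =
        "<feeds>"
        + "<feed><post_id>16</post_id><comments><comment/></comments></feed>"
        + "<feed><post_id>8</post_id><comments>"
        + "<comment><post_id>8</post_id><comment>this is a test comment!</comment></comment>"
        + "<comment><post_id>8</post_id><comment>this is a test comment!</comment></comment>"
        + "</comments></feed>"
        + "</feeds>";

    // Returns only the DIRECT element children of parent that have the given name.
    static List<Element> directChildren(Element parent, String name) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && n.getNodeName().equals(name)) {
                result.add((Element) n);
            }
        }
        return result;
    }

    // Maps each feed's post_id to the texts of its comments (empty list if there are none).
    static Map<String, List<String>> commentsByPostId(String xml) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList feeds = doc.getElementsByTagName("feed");
            for (int i = 0; i < feeds.getLength(); i++) {
                Element feed = (Element) feeds.item(i);
                String postId = directChildren(feed, "post_id").get(0).getTextContent().trim();
                List<String> texts = new ArrayList<>();
                for (Element comments : directChildren(feed, "comments")) {
                    for (Element comment : directChildren(comments, "comment")) {
                        // The empty <comment/> placeholder has no inner comment, so it adds nothing.
                        for (Element text : directChildren(comment, "comment")) {
                            texts.add(text.getTextContent().trim());
                        }
                    }
                }
                result.put(postId, texts);
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the feeds", e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(commentsByPostId(XML));
    }
}
```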

parsing xml in java- multiple child elements

I want to parse XML elements using Java. I have succeeded in part, but I am not sure how to do the rest. I have XML such as:
<MainTag>
<userid>user1</userid>
<country>US</country>
<city>LA</city>
<phone>
<number>1111111111</number>
</phone>
<phone>
<number>222222222</number>
</phone>
</MainTag>
<MainTag>
<userid>user2</userid>
<country>Aus</country>
<city>MB</city>
<phone>
<number>23233</number>
</phone>
<phone>
<number>8787822</number>
</phone>
<phone>
<number>10101</number>
</phone>
I am able to parse XML elements such as country, city, etc. as below.
public void endelement()
{
if (someText.equalsIgnoreCase("country"))
{
pojo.setCountry(Val);
}
else if(someText.equalsIgnoreCase("city"))
{
pojo.setCity(Val);
}
}
public void stratelement()
{
............
}
But how can I parse the phone elements, given that one user can have multiple phone numbers?
I want to find all the phone numbers for a particular user.
For example, in the XML above,
user1 has two phone numbers and
user2 has three phone numbers.
Can anybody help with this? Thanks in advance.
I would recommend using JAXB, since it appears you are attempting to bind your xml to a POJO.
Looking at the code you have written here (and assuming that the example XML you have provided is a snippet of well-formed XML), I am guessing that your POJO should have a member for phone numbers of type List<String>, and a method that allows you to add a phone number to the list (perhaps addPhoneNumber(String phoneNumber) {...}).
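If you stay with the SAX-style callbacks from the question rather than JAXB, the List<String> idea could look roughly like the sketch below. It assumes the MainTag elements are wrapped in a single root (called Users here, a made-up name), and the User class and the handler name are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PhoneHandler extends DefaultHandler {
    // The <Users> wrapper is an assumption; the question's snippet has no single root.
    static final String XML =
        "<Users>"
        + "<MainTag><userid>user1</userid><country>US</country>"
        + "<phone><number>1111111111</number></phone>"
        + "<phone><number>222222222</number></phone></MainTag>"
        + "<MainTag><userid>user2</userid><country>Aus</country>"
        + "<phone><number>23233</number></phone>"
        + "<phone><number>8787822</number></phone>"
        + "<phone><number>10101</number></phone></MainTag>"
        + "</Users>";

    // Hypothetical POJO: one user with any number of phone numbers.
    static class User {
        String userid;
        final List<String> phones = new ArrayList<>();
    }

    final List<User> users = new ArrayList<>();
    private User current;
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting the text of the new element
        if (qName.equals("MainTag")) {
            current = new User();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) {
            return; // outside any MainTag, e.g. the closing root element
        }
        String value = text.toString().trim();
        if (qName.equals("userid")) {
            current.userid = value;
        } else if (qName.equals("number")) {
            current.phones.add(value); // add to the list instead of overwriting one field
        } else if (qName.equals("MainTag")) {
            users.add(current);
            current = null;
        }
    }

    static List<User> parse(String xml) {
        try {
            PhoneHandler handler = new PhoneHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.users;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the users", e);
        }
    }
}
```

Calling PhoneHandler.parse(PhoneHandler.XML) yields one User per MainTag, each carrying its own list of phone numbers.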
First, that is not a well-formed XML (as it has two root elements) and you can't parse it with any parser API unless it is well-formed. Now, to parse the XML you would normally use the APIs meant for it like SAX, DOM or StAX or even better the JAXB binding API.
Since you seem to be new to this, I suggest you start learning JAXP. Use StAX instead of DOM or SAX.
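For comparison, a StAX version of the same task might look like the sketch below. As before, the Users wrapper element and the class and method names are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxPhones {
    // The <Users> wrapper is an assumption; the question's snippet has no single root.
    static final String XML =
        "<Users>"
        + "<MainTag><userid>user1</userid>"
        + "<phone><number>1111111111</number></phone>"
        + "<phone><number>222222222</number></phone></MainTag>"
        + "<MainTag><userid>user2</userid>"
        + "<phone><number>23233</number></phone>"
        + "<phone><number>8787822</number></phone>"
        + "<phone><number>10101</number></phone></MainTag>"
        + "</Users>";

    // Pulls events from the reader and groups every number under the most recent userid.
    static Map<String, List<String>> phonesByUser(String xml) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            String currentUser = null;
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String name = reader.getLocalName();
                if (name.equals("userid")) {
                    // getElementText() reads up to the matching end tag.
                    currentUser = reader.getElementText();
                    result.put(currentUser, new ArrayList<>());
                } else if (name.equals("number") && currentUser != null) {
                    result.get(currentUser).add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read the XML stream", e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(phonesByUser(XML));
    }
}
```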
You can use the JDK's built-in DocumentBuilderFactory class if you know the incoming XML format, for example how many nodes it has and what they are called. It is very simple; look at this code:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
// DocumentBuilder instance
DocumentBuilder db = dbf.newDocumentBuilder();
Document dom = db.parse("employees.xml");
// then get the root element (inside the try block, so that dom is in scope)
Element de = dom.getDocumentElement();
// get the node list of the main elements
NodeList nl = de.getElementsByTagName("Employee");
for (int i = 0; i < nl.getLength(); i++) {
// get the employee element and read its data
Element el = (Element) nl.item(i);
getEmployee(el);
}
} catch (ParserConfigurationException | SAXException | IOException e) {
e.printStackTrace();
}
// and then get the data
private void getEmployee(Element el) {
// for each <Employee> element get the values
// (getTextValue and getIntValue are helper methods that are not shown in this answer)
String name = getTextValue(el,"Name");
int id = getIntValue(el,"Id");
int age = getIntValue(el,"Age");
// get any element attribute
// String type = el.getAttribute("type");
}
thats all
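The getTextValue and getIntValue helpers are not shown in the answer above. One plausible way to write them is sketched below; this is a guess at their intent, not the original author's code, and the sample Employee data is invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EmployeeHelpers {
    // Invented sample data, only used to exercise the helpers.
    static final String XML =
        "<Employee type=\"permanent\"><Name>Ann</Name><Id>7</Id><Age>30</Age></Employee>";

    // Returns the trimmed text of the first descendant with the given tag name, or null if absent.
    static String getTextValue(Element parent, String tagName) {
        NodeList nl = parent.getElementsByTagName(tagName);
        if (nl.getLength() == 0) {
            return null;
        }
        return nl.item(0).getTextContent().trim();
    }

    // Parses the element's text as an int; throws NumberFormatException if it is missing or not a number.
    static int getIntValue(Element parent, String tagName) {
        return Integer.parseInt(getTextValue(parent, tagName));
    }

    // Parses the string and returns the root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        Element employee = parseRoot(XML);
        System.out.println(getTextValue(employee, "Name") + " " + getIntValue(employee, "Age"));
    }
}
```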

Why does my XPath expression in Java return too many children?

I have the following xml file:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<config>
<a>
<b>
<param>p1</param>
<param>p2</param>
</b>
</a>
</config>
and the xpath code to get my node params:
Document doc = ...;
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("/config/a/b");
Object o = expr.evaluate(doc, XPathConstants.NODESET);
NodeList list = (NodeList) o;
but it turns out that the nodes list (list) has 5 children, including "\t\n", instead of just two. Is there something wrong with my code? How can I just get my two nodes?
Thank you!
The expression /config/a/b selects the b element, and b has five child nodes: three text nodes and two elements. That is, given your XML above and only showing the fragment in question:
<b>
<param>p1</param>
<param>p2</param>
</b>
the first child is the text (whitespace) following <b> and preceding <param>p1 .... The second child is the first param element. The third child is the text (whitespace) between the two param elements. And so on. The whitespace isn't ignored in XML, although many forms of processing XML ignore it.
You have a couple choices:
Change your xpath expression so it will only select element nodes, as suggested by Ted Dziuba, or
Loop over the five nodes returned and only select the non-text nodes.
You could do something like this:
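// nodes is assumed to hold the child nodes of b, e.g. list.item(0).getChildNodes()
for (int i = 0; i < nodes.getLength(); i++) {
if (nodes.item(i).getNodeType() != Node.TEXT_NODE) {
// getNodeValue() is null for element nodes, so read the text content instead
System.out.println(nodes.item(i).getTextContent());
}
}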
You can use the node type to select only element nodes, or to remove text nodes.
so the xpath looks like:
/config/a/b/*/text().
And the output for :
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getNodeValue());
}
would be as expected: p1 and p2
How about
/config/a/b/*/text()/..
?
import org.w3c.dom.*;
import javax.xml.xpath.*;
import javax.xml.parsers.*;
import java.io.IOException;
import org.xml.sax.SAXException;
public class TestClient_XPath {
public static void main(String[] args) throws ParserConfigurationException,
SAXException, IOException, XPathExpressionException {
DocumentBuilderFactory domFactory = DocumentBuilderFactory
.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("yourfile.xml");
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression xPathExpression = xpath.compile("/a/b/c");
Object res = xPathExpression.evaluate(doc);
System.out.println(res.toString());
}
}
Xalan and Xerces appear to be embedded in rt.jar.
Don't include xerces and xalan libs.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4624775
I am not sure but shouldn't /config/a/b just return b? /config/a/b/param should return the two param nodes...
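This is easy to check. The sketch below counts what a few expressions select on a copy of the question's XML (with newlines kept, so the whitespace text nodes exist); the class and method names are made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ParamCount {
    // Copy of the question's XML; the newlines are kept on purpose.
    static final String XML =
        "<config>\n<a>\n<b>\n<param>p1</param>\n<param>p2</param>\n</b>\n</a>\n</config>";

    // Evaluates the expression and returns the matching nodes.
    static NodeList select(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Only the b element itself is selected here.
        System.out.println(select(XML, "/config/a/b").getLength());
        // node() selects every child of b, including the whitespace text nodes.
        System.out.println(select(XML, "/config/a/b/node()").getLength());
        // Naming the element selects just the param elements.
        NodeList params = select(XML, "/config/a/b/param");
        for (int i = 0; i < params.getLength(); i++) {
            System.out.println(params.item(i).getTextContent());
        }
    }
}
```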
Could the view on the problem be the problem? Of course you get back the resulting node AND all its children. So you just have to look at the first element and not at its children.
But I could be totally wrong, because I usually just use XPath to navigate DOM trees (HtmlUnit).
