Parsing XML in Java: multiple child elements

I want to parse XML elements using Java. I have succeeded in part, but I am not sure how to do the rest. I have XML like this:
<MainTag>
<userid>user1</userid>
<country>US</country>
<city>LA</city>
<phone>
<number>1111111111</number>
</phone>
<phone>
<number>222222222</number>
</phone>
</MainTag>
<MainTag>
<userid>user2</userid>
<country>Aus</country>
<city>MB</city>
<phone>
<number>23233</number>
</phone>
<phone>
<number>8787822</number>
</phone>
<phone>
<number>10101</number>
</phone>
I am able to parse XML elements such as country, city, etc. as below.
public void endelement()
{
if (someText.equalsIgnoreCase("country"))
{
pojo.setCountry(Val);
}
else if(someText.equalsIgnoreCase("city"))
{
pojo.setCity(Val);
}
}
public void startelement()
{
............
}
But how can I parse phone, since one user has multiple phone numbers?
I want to find all the phone numbers for a particular user.
For example, in the XML above,
user1 has two phone numbers and
user2 has three phone numbers.
Can anybody help with this? Thanks in advance.

I would recommend using JAXB, since it appears you are attempting to bind your XML to a POJO.
Looking at the code you have written here (and assuming that the example XML you have provided is a snippet of well-formed XML), I am guessing that your POJO should have a member for phone numbers of type List<String>, and a method that allows you to add a phone number to the list (perhaps addPhoneNumber(String phoneNumber) {...}).
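To make the List<String> idea concrete without pulling in JAXB, here is a minimal sketch that stays with the SAX-style callbacks from the question. The User and UserHandler classes and the field names are hypothetical, and the sketch assumes the MainTag elements are wrapped in a single root element so that the input is well-formed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical POJO: one user with any number of phone numbers.
class User {
    String userId;
    String country;
    String city;
    final List<String> phoneNumbers = new ArrayList<>();

    void addPhoneNumber(String phoneNumber) {
        phoneNumbers.add(phoneNumber);
    }
}

// Starts a new User at <MainTag> and appends every <number> to that user.
class UserHandler extends DefaultHandler {
    final List<User> users = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private User current;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        text.setLength(0); // start collecting text for the new element
        if (qName.equals("MainTag")) {
            current = new User();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        switch (qName) {
            case "userid":  current.userId = value; break;
            case "country": current.country = value; break;
            case "city":    current.city = value; break;
            case "number":  current.addPhoneNumber(value); break;
            case "MainTag": users.add(current); current = null; break;
            default: break;
        }
        text.setLength(0);
    }
}

final class UserSaxDemo {
    // Assumes every userid/country/city/number element sits inside a MainTag.
    static List<User> parse(String xml) throws Exception {
        UserHandler handler = new UserHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.users;
    }
}
```

Calling UserSaxDemo.parse(xml) returns one User per MainTag, and each user's phoneNumbers list holds as many entries as that user has phone elements.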

First, that is not well-formed XML (it has two root elements), and you cannot parse it with any parser API unless it is well-formed. To parse XML you would normally use one of the APIs meant for it, such as SAX, DOM or StAX, or even better the JAXB binding API.
Since you seem to be new to this, I suggest you start learning JAXP. Use StAX instead of DOM or SAX.
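As a rough illustration of the StAX route, the sketch below pulls events from an XMLStreamReader and groups the phone numbers by user. The class and method names are made up for the example, and it assumes two things that are not in the question: the MainTag elements are wrapped in one root element, and userid always appears before the phone elements of the same user.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

final class PhoneStaxDemo {
    // Returns userid -> phone numbers, in document order.
    static Map<String, List<String>> phonesByUser(String xml) throws Exception {
        Map<String, List<String>> result = new LinkedHashMap<>();
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<String> currentPhones = null; // phone list of the user being read
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                if (name.equals("userid")) {
                    // A new user starts here; later <number> elements belong to it.
                    currentPhones = new ArrayList<>();
                    result.put(reader.getElementText().trim(), currentPhones);
                } else if (name.equals("number") && currentPhones != null) {
                    currentPhones.add(reader.getElementText().trim());
                }
            }
        }
        reader.close();
        return result;
    }
}
```

The pull model keeps the control flow in your own loop, which is often easier to follow than SAX callbacks when one parent has a variable number of children.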

You can use the standard DocumentBuilderFactory class if you know the incoming XML format, for example how many nodes it has and their names. It is very simple; look at this code:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
Document dom = null; // declared outside the try block so it is still in scope below
try {
    // DocumentBuilder instance
    DocumentBuilder db = dbf.newDocumentBuilder();
    dom = db.parse("employees.xml");
} catch (ParserConfigurationException pce) {
    pce.printStackTrace();
} catch (SAXException se) {
    se.printStackTrace();
} catch (IOException ioe) {
    ioe.printStackTrace();
}
// and then get the root element
Element de = dom.getDocumentElement();
// get the NodeList of <Employee> elements
NodeList nl = de.getElementsByTagName("Employee");
if (nl != null && nl.getLength() > 0) {
    for (int i = 0; i < nl.getLength(); i++) {
        // get the employee element and read its data
        Element el = (Element) nl.item(i);
        getEmployee(el);
    }
}
// and then get the data
private void getEmployee(Element el) {
    // for each <Employee> element get the values
    // (getTextValue and getIntValue are helper methods that are not shown here)
    String name = getTextValue(el, "Name");
    int id = getIntValue(el, "Id");
    int age = getIntValue(el, "Age");
    // get any element attribute
    // String type = el.getAttribute("type");
}
That's all.
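The snippet above calls getTextValue and getIntValue without showing them. One plausible way to write them is sketched below; this is a guess at what the answer intended, not its actual code, and the DomHelpers class and its parse method are added only to make the example self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DomHelpers {
    // Text of the first descendant element named tagName, or null if there is none.
    static String getTextValue(Element parent, String tagName) {
        NodeList nl = parent.getElementsByTagName(tagName);
        if (nl.getLength() == 0) {
            return null;
        }
        return nl.item(0).getTextContent().trim();
    }

    // Same lookup, parsed as an int.
    // Throws NumberFormatException if the element is missing or not numeric.
    static int getIntValue(Element parent, String tagName) {
        return Integer.parseInt(getTextValue(parent, tagName));
    }

    // Convenience for the example: parse XML held in a String.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }
}
```

Note that getElementsByTagName searches all descendants, so these helpers are only safe when the tag name is unique inside the element you pass in.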

Related

How to parse RDF/XML in Java

I used the Java DOM API to parse RDF/XML, but I couldn't get the output I wanted. Here is my code:
String[] arr = new String[100];
DocumentBuilderFactory factory_parsing = DocumentBuilderFactory.newInstance();
try {
    DocumentBuilder builder_parsing = factory_parsing.newDocumentBuilder();
    Document document = builder_parsing.parse("D://Test/dog_test.xml");
    Element root = document.getDocumentElement();
    System.out.println(root.getNodeName());
    Node firstNode = root.getFirstChild();
    Node next = firstNode.getNextSibling();
    NodeList childList = next.getChildNodes();
    System.out.println(childList.getLength());
    for (int i = 0; i < childList.getLength(); i++) {
        Node item = childList.item(i);
        if (item.getNodeType() == Node.ELEMENT_NODE) {
            System.out.println(item.getNodeName());
            arr[i] = item.getNodeName();
            System.out.println(item.getTextContent());
        } else {
            System.out.println("blank node.");
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}
The result :
rdf:RDF
5
blank node.
test:Animal.Name
Jeck
blank node.
test:Dog.breed
Akita
blank node.
dog_test.xml :
<rdf:RDF
xmlns:test="http://www.test/2022#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xml:base="http://www.desktop-eqkvgq5.net/2022#">
<test:Dog rdf:ID="_100">
<test:Animal.Name>Jeck</test:Animal.Name>
<test:Dog.breed>Akita</test:Dog.breed>
</test:Dog>
<test:Cat rdf:ID="_101">
<test:Animal.Name>Tom</test:Animal.Name>
<test:Cat.breed>Munchkin</test:Cat.breed>
</test:Cat>
</rdf:RDF>
I looked hard into using Jena instead, but I don't know how to use it. I'm exhausted.
I have two questions.
Is my Java DOM code wrong? The result is a mess. Even the information on the cat is completely missing. Is Java DOM unable to parse RDF/XML?
I'd like to use the RDFXMLParser provided by Jena, but I'd like to see an example of how to use it.
What I want is to store the data corresponding to s, p, and o as strings and turn the final result into triples.
The results I want are as follows:
100 rdf:type test:Dog
100 test:Animal.Name Jeck
100 test:Dog.breed Akita
101 rdf:type test:Cat
101 test:Animal.Name Tom
101 test:Cat.breed Munchkin
Thank you for any advice.
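On the first question: the code in the post only looks at root.getFirstChild().getNextSibling(), which is the first element child (test:Dog), so the cat is never visited, and the "blank node." lines are the whitespace text nodes between elements. A minimal DOM sketch that visits every element child of the root is shown below. It is not a real RDF parser: it only handles the flat shape of dog_test.xml, the class name is hypothetical, and stripping the leading underscore from rdf:ID is an assumption based on the desired output. A dedicated RDF library remains the robust choice for general RDF/XML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class RdfDomSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Returns "subject predicate object" strings for a flat RDF/XML document.
    static List<String> triples(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for getAttributeNS
        Element root = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        List<String> out = new ArrayList<>();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes between elements
            }
            Element resource = (Element) n;
            // Assumption: the subject is rdf:ID without its leading underscore.
            String id = resource.getAttributeNS(RDF_NS, "ID").replaceFirst("^_", "");
            out.add(id + " rdf:type " + resource.getNodeName());
            for (Node p = resource.getFirstChild(); p != null; p = p.getNextSibling()) {
                if (p.getNodeType() == Node.ELEMENT_NODE) {
                    out.add(id + " " + p.getNodeName() + " " + p.getTextContent().trim());
                }
            }
        }
        return out;
    }
}
```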

Getting null values from XPath query

I have this xml file:
<?xml version="1.0" encoding="UTF-8"?>
<iet:aw-data xmlns:iet="http://care.aw.com/IET/2007/12" class="com.aw.care.bean.resource.MessageResource">
<iet:metadata filter=""/>
<iet:message-resource>
<iet:message>some message 1</iet:message>
<iet:customer id="1"/>
<iet:code>edi.claimfilingindicator.11</iet:code>
<iet:locale>iw_IL</iet:locale>
</iet:message-resource>
<iet:message-resource>
<iet:message>some message 2</iet:message>
<iet:customer id="1"/>
<iet:code>edi.claimfilingindicator.12</iet:code>
<iet:locale>iw_IL</iet:locale>
</iet:message-resource>
.
.
.
.
</iet:aw-data>
Using the code below, I iterate over the data and find what I need.
try {
FileInputStream fileIS = new FileInputStream(new File("resources\\bootstrap\\content\\MessageResources_iw_IL\\MessageResource_iw_IL.ctdata.xml"));
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
builderFactory.setNamespaceAware(true); // never forget this!
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(fileIS);
XPath xPath = XPathFactory.newInstance().newXPath();
String query = "//*[local-name()='message-resource']//*[local-name()='code'][contains(text(), 'account')]";
NodeList nodeList = (NodeList) xPath.compile(query).evaluate(xmlDocument, XPathConstants.NODESET);
System.out.println("size= " + nodeList.getLength());
for (int i = 0; i < nodeList.getLength(); i++) {
System.out.println(nodeList.item(i).getNodeValue());
}
}
catch (Exception e){
e.printStackTrace();
}
The issue is that I'm getting only null values when printing in the for loop. Any idea why this happens?
The code needs to return a list of nodes whose code and message fields contain the given parameters (like an SQL query with two parameters joined by AND).
Check the documentation:
https://docs.oracle.com/javase/7/docs/api/org/w3c/dom/Node.html
getNodeValue() applied to an element node returns null.
Use getTextContent().
Alternatively, if you find DOM too frustrating, switch to one of the better tree models like JDOM2 or XOM.
Also, if you used an XPath 2.0 engine like Saxon, it would (a) simplify your expression to
//*:message-resource//*:code[contains(text(), 'account')]
and (b) allow you to return a sequence of strings from the XPath expression, rather than a sequence of nodes, so you wouldn't have to mess around with NodeLists.
Another point: I suspect that the predicate [contains(text(), 'account')] should really be [.='account']. I'm not sure of that, but using text() instead of ".", and using contains() instead of "=", are both common mistakes.
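A small sketch of the getNodeValue() versus getTextContent() difference, using only the JDK's DOM and XPath 1.0 classes. The NodeValueDemo class and its method names are hypothetical, and the query is passed in so you can try your own expression.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NodeValueDemo {
    // Parses xml (namespace-aware) and evaluates an XPath query that returns a node set.
    static NodeList select(String xml, String query) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(query, doc, XPathConstants.NODESET);
    }

    // Prints both accessors for every matched node.
    static void printBoth(NodeList nodes) {
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            // For element nodes getNodeValue() is null; getTextContent() returns the text.
            System.out.println("getNodeValue() = " + n.getNodeValue()
                    + ", getTextContent() = " + n.getTextContent());
        }
    }
}
```

For an element such as iet:code, the first accessor prints null and the second prints the element's text, which is why the loop in the question showed only null values.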

get child nodes from parent (xml, java)

UPDATE
I was specifically targeting staff under some root node, not all "staff" elements in the whole document. I forgot to mention this important detail in the question. Sorry, guys.
I found this answer to my question:
getElementsByTagName
But with this data:
<one>
<two>
<three>
<company>
<staff id="1001">
<firstname>Golf</firstname>
<lastname>4</lastname>
<nickname>Schnecke</nickname>
<salary>1</salary>
</staff>
<staff id="2001">
<firstname>Audi</firstname>
<lastname>R8</lastname>
<nickname>Rennaudi</nickname>
<salary>1111111</salary>
</staff>
<staff id="2002">
<firstname>Skoda</firstname>
<lastname>xyz</lastname>
<nickname>xyz</nickname>
<salary>0.1</salary>
</staff>
</company>
</three>
</two>
</one>
and this code:
public static void parseXML2() {
File fXmlFile = new File("src\\main\\java\\staff.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = null;
try {
dBuilder = dbFactory.newDocumentBuilder();
} catch (ParserConfigurationException ex) {
Logger.getLogger(MyParser.class.getName()).log(Level.SEVERE, null, ex);
}
Document doc = null;
try {
doc = dBuilder.parse(fXmlFile);
} catch (SAXException ex) {
Logger.getLogger(MyParser.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
Logger.getLogger(MyParser.class.getName()).log(Level.SEVERE, null, ex);
}
System.out.println("test");
System.out.println(doc.getElementsByTagName("company").item(0).getTextContent());
}
I don't get just one staff element, but all of them. How come?
I was expecting to get:
Golf
4
Schnecke
1
but instead I get this:
Golf
4
Schnecke
1
Audi
R8
Rennaudi
1111111
Skoda
xyz
xyz
0.1
You are almost there. If you want to get the text contents of the first staff node, then get the elements by that tag name:
System.out.println(doc.getElementsByTagName("staff").item(0).getTextContent());
// ^^^^^^
Update
In case you want to get the first staff node under company, then you can find them with node type and node name checks. Here is a rudimentary loop to do this:
Node companyNode = doc.getElementsByTagName("company").item(0);
NodeList companyChildNodes = companyNode.getChildNodes();
for (int i = 0; i < companyChildNodes.getLength(); i++) {
Node node = companyChildNodes.item(i);
if (node.getNodeType() == Node.ELEMENT_NODE && Objects.equals("staff", node.getNodeName())) {
System.out.println(node.getTextContent());
break;
}
}
You might want to refactor the for loop into a separate method.
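One way that refactoring could look is sketched below. The ChildElements class and the firstChildElement name are hypothetical, and parse is included only so the example runs on its own.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class ChildElements {
    // First direct child element of parent with the given tag name, or null if none.
    static Element firstChildElement(Node parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // Convenience for the example: parse XML held in a String.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }
}
```

With this in place, the loop becomes firstChildElement(companyNode, "staff"), and unlike getElementsByTagName it only looks at direct children.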
You can use XPath too. I think it's more concise:
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//company/staff[1]");
NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
System.out.println(nl.item(0).getTextContent());
Explanation:
//company selects all the company tags, regardless of where they are in the XML. It's the // that ignores the rest of the XML structure.
//company/staff selects all the staff tags that are direct children of a company tag.
[1] selects the first such item (XPath positions start at 1, not 0).
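Since XPath positions are 1-based, staff[1] is the first staff element. If you only need a single node or a single string, you can ask for that directly instead of a NodeList. A small sketch, with hypothetical class and method names:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class FirstStaffXPath {
    private static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // id attribute of the first staff under company, or null if there is none.
    static String firstStaffId(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Element staff = (Element) xpath.evaluate(
                "//company/staff[1]", parse(xml), XPathConstants.NODE);
        return staff == null ? null : staff.getAttribute("id");
    }

    // Text of the firstname element of the first staff, or "" if there is none.
    static String firstStaffFirstName(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The two-argument evaluate returns the string value of the first match.
        return xpath.evaluate("//company/staff[1]/firstname", parse(xml));
    }
}
```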
You have a line of code which gets the text content from the first company.
System.out.println(doc.getElementsByTagName("company").item(0).getTextContent());
This gets all the text content in the first company node which in this case is the data inside 3 staff elements. If you want to select the first staff element, you can either select it by name or by getting the first child of the company.
If your XML has the format you mentioned, then with this line of code,
System.out.println(doc.getElementsByTagName("company").item(0).getTextContent());
you are printing ALL the content under the company tag, so it prints EVERYTHING in company. If you want to print only the first staff, change the tag name to "staff" and the item index to the index of the staff you want (0, 1, 2, ...):
System.out.println(doc.getElementsByTagName("staff").item(0).getTextContent());
I haven't tried this out, but I think it's going to work.

Read element inside element from XML in SAX or Dom

<rootNode>
<Movies>
<Movie id="1">
<title> title1</title>
<Actors>
<Actor>Actor1</Actor>
<Actor>Actor2</Actor>
</Actors>
</Movie>
</Movies>
<performers >
<performer id="100">
<name>name1</name>
<movie idref="1"/>
</performer>
</performers>
</rootNode>
Question 1: I only want to get the Movie elements under Movies. I tried both DOM and SAX, and both also return the movie element under performers. How can I avoid this using SAX or DOM?
DOM:
doc.getElementsByTagName("movie");
SAX:
public void startElement(String uri, String localName,String qName,
Attributes attributes) throws SAXException {
if (qName.equalsIgnoreCase("movie"))
Question 2: How can I get an element inside another element (Actor under Movies) using DOM or SAX?
Basically, what I want to do is output the data in order:
1,title, Actor1,Actor2
100,name1,1
doc.getElementsByTagName("Movies").item(0).getChildNodes();
gets you all the child nodes of Movies, including the Movie elements (watch for lower-/upper-case!). See http://www.w3schools.com/dom/dom_intro.asp for a short tutorial.
XPath is designed for this type of extraction. For your example file, the query would be something like the following. For simplicity, I assumed your XML was in res/raw, but in practice you will need to create the InputSource from wherever you are getting your XML.
XPath xpath = XPathFactory.newInstance().newXPath();
String expression = "/rootNode/Movies/Movie";
try {
NodeList nodes = (NodeList) xpath.evaluate(expression, doc,XPathConstants.NODESET);
} catch (XPathExpressionException e) {
e.printStackTrace();
}
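Building on the XPath approach, here is a sketch that also answers the second question by evaluating relative expressions against each Movie node. It assumes a well-formed version of the sample (quoted attribute values, a closed Actors element); the MovieXPathDemo class and the comma-separated output format are illustrative only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class MovieXPathDemo {
    // One line per Movie under /rootNode/Movies: "id,title,actor1,actor2,..."
    static List<String> movieLines(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The absolute path excludes the lowercase <movie> under <performers>.
        NodeList movies = (NodeList) xpath.evaluate(
                "/rootNode/Movies/Movie", doc, XPathConstants.NODESET);
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < movies.getLength(); i++) {
            Element movie = (Element) movies.item(i);
            StringBuilder line = new StringBuilder(movie.getAttribute("id"));
            // Relative expressions are evaluated with the Movie element as context.
            line.append(',').append(xpath.evaluate("title", movie).trim());
            NodeList actors = (NodeList) xpath.evaluate(
                    "Actors/Actor", movie, XPathConstants.NODESET);
            for (int j = 0; j < actors.getLength(); j++) {
                line.append(',').append(actors.item(j).getTextContent().trim());
            }
            lines.add(line.toString());
        }
        return lines;
    }
}
```

The performer line from the question could be produced the same way with /rootNode/performers/performer as the starting expression.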

Java - parse xml string with variable tagnames?

I'm trying to parse an XML string, and the tagnames are variable; I haven't seen any examples on how to pull the information out without knowing them. For example, I will always know the <response> and <data> tags below, but what falls inside/outside of them could be anything from <employee> to you name it.
<?xml version="1.0" encoding="UTF-8"?>
<response>
<generic>
....
</generic>
<data>
<employee>
<name>Seagull</name>
<id>3674</id>
<age>34</age>
</employee>
<employee>
<name>Robin</name>
<id>3675</id>
<age>25</age>
</employee>
</data>
</response>
You could parse it into a generic dom object and traverse it. For example, you could use dom4j to do this.
From the dom4j quick start guide:
public void treeWalk(Document document) {
treeWalk( document.getRootElement() );
}
public void treeWalk(Element element) {
for ( int i = 0, size = element.nodeCount(); i < size; i++ ) {
Node node = element.node(i);
if ( node instanceof Element ) {
treeWalk( (Element) node );
}
else {
// do something....
}
}
}
public Document parse(URL url) throws DocumentException {
SAXReader reader = new SAXReader();
Document document = reader.read(url);
return document;
}
I have seen a similar situation in projects.
If you are going to deal with large XML documents, you can use a StAX or SAX parser to read the XML. At every step (for example, on reaching an end element), enter the data into a Map or a data structure of your choice, keeping the tag name as the key and the text as the value. Once the parsing is done, use this Map to figure out which object to build, since by then you have a proper entity representation of the information in the XML.
If the XML is small, use DOM and build the entity object directly by reading the specific tag (like <employee>), or use XPath to look where you expect the tag to be, which gives you a hint of the entity type. Build that object directly by reading the specific information from the XML.
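A minimal sketch of the Map approach with StAX is shown below. It assumes each direct child of data is a flat record whose children contain only text; the GenericRecordReader class and the "_type" key that stores the record's tag name are inventions for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

final class GenericRecordReader {
    // Each direct child of <data> becomes one map of childTag -> text.
    static List<Map<String, String>> readRecords(String xml) throws Exception {
        XMLStreamReader r =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<Map<String, String>> records = new ArrayList<>();
        Map<String, String> current = null;
        int depth = 0;      // nesting depth of the current element (root = 1)
        int dataDepth = -1; // depth of <data>, or -1 while outside it
        while (r.hasNext()) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
                String name = r.getLocalName();
                if (dataDepth < 0 && name.equals("data")) {
                    dataDepth = depth;
                } else if (dataDepth > 0 && depth == dataDepth + 1) {
                    // A record element whose name is not known in advance.
                    current = new LinkedHashMap<>();
                    current.put("_type", name);
                    records.add(current);
                } else if (dataDepth > 0 && depth == dataDepth + 2) {
                    // A field of the record; assumed to contain text only.
                    current.put(name, r.getElementText().trim());
                    depth--; // getElementText already consumed the END_ELEMENT
                }
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                if (depth == dataDepth) {
                    dataDepth = -1; // leaving <data>
                }
                depth--;
            }
        }
        r.close();
        return records;
    }
}
```

After parsing, the "_type" entry tells you which kind of entity each map represents, so you can decide which object to build from it.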