I used java dom to parsing rdf xml in java, but I couldn't parsing rdf xml in the direction I wanted. Here is my code :
String[] arr = new String[100];
DocumentBuilderFactory factory_parsing = DocumentBuilderFactory.newInstance();
try {
DocumentBuilder builder_parsing = factory_parsing.newDocumentBuilder();
Document document = builder_parsing.parse("D://Test/dog_test.xml");
Element root = document.getDocumentElement();
System.out.println(root.getNodeName());
Node firstNode = root.getFirstChild();
Node next = firstNode.getNextSibling();
NodeList childList = next.getChildNodes();
System.out.println(childList.getLength());
for(i=0; i<childList.getLength(); i++){
Node item = childList.item(i);
if(item.getNodeType() == Node.ELEMENT_NODE){
System.out.println(item.getNodeName());
arr[i] = item.getNodeName();
System.out.println(item.getTextContent());
}else {
System.out.println("blank node.");
}
}
The result :
rdf:RDF
5
blank node.
test:Animal.Name
Jeck
blank node.
test:Dog.breed
Akita
blank node.
dog_test.xml :
<rdf:RDF
xmlns:test="http://www.test/2022#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xml:base="http://www.desktop-eqkvgq5.net/2022#">
<test:Dog rdf:ID="_100">
<test:Animal.Name>Jeck</test:Animal.Name>
<test:Dog.breed>Akita</test:Dog.breed>
</test:Dog>
<test:Cat rdf:ID="_101">
<test:Animal.Name>Tom</test:Animal.Name>
<test:Cat.breed>Munchkin</test:Cat.breed>
</test:Cat>
</rdf:RDF>
So I analyzed it hard to use the Jena method, but I don't know how to use it. I'm exhausted.
My question is two.
Is my java dom code wrong? The result is a mess. Even information on the cat is completely missing. Is java dome unable to parsing rdf xml?
I'd like to use the RDFXMLParser provided by Jena, but I'd like to know an example of using it.
What I want is to store data corresponding to s, p, and o as a string and change the final result to triple.
The results I want are as follows :
100 rdf:type test:Dog
100 test:Animal.Name Jeck
100 test:Dog.breed Akita
101 rdf:type test:Cat
101 test:Animal.Name Tom
101 test:Cat.breed Munchkin
Thank you for any advice
Related
I am trying to parse the XML document at http://web.mta.info/status/ServiceStatusSubway.xml and extract all the PtSituationElement elements with the following code:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document subwayStatusDoc = builder.parse(new URL("http://web.mta.info/status/ServiceStatusSubway.xml").openStream());
NodeList situationList = subwayStatusDoc.getDocumentElement().getElementsByTagName("PtSituationElement");
System.out.println(situationList.item(0)); //prints null
What am I doing wrong here ?
The PtSituationElement tags contain child tags, so you need to go into those. Just printing .item(0) relies on the toString() method, and apparently it does not do a great job of explaining your nodes.
So add this to see some of the data in the child nodes:
Node item = situationList.item(0);
NodeList childNodes = item.getChildNodes();
for (int j = 0; j < childNodes.getLength(); j++) {
System.out.println(childNodes.item(j).getTextContent());
}
(I'm not sure what you want to do with the data in the xml structure, but this example shows how you can proceed with your work.)
Also, I noted that the LongDescription tags contain HTML that is not correct XML (<br clear=left> should be <br clear=left> etc). The parser could have a problem with that. It would be better if the HTML was escaped (see How to escape "&" in XML?).
I have this xml file:
<?xml version="1.0" encoding="UTF-8"?>
<iet:aw-data xmlns:iet="http://care.aw.com/IET/2007/12" class="com.aw.care.bean.resource.MessageResource">
<iet:metadata filter=""/>
<iet:message-resource>
<iet:message>some message 1</iet:message>
<iet:customer id="1"/>
<iet:code>edi.claimfilingindicator.11</iet:code>
<iet:locale>iw_IL</iet:locale>
</iet:message-resource>
<iet:message-resource>
<iet:message>some message 2</iet:message>
<iet:customer id="1"/>
<iet:code>edi.claimfilingindicator.12</iet:code>
<iet:locale>iw_IL</iet:locale>
</iet:message-resource>
.
.
.
.
</iet:aw-data>
Using this code below i'm getting over the data and finding what I need.
try {
FileInputStream fileIS = new FileInputStream(new File("resources\\bootstrap\\content\\MessageResources_iw_IL\\MessageResource_iw_IL.ctdata.xml"));
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
builderFactory.setNamespaceAware(true); // never forget this!
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(fileIS);
XPath xPath = XPathFactory.newInstance().newXPath();
String query = "//*[local-name()='message-resource']//*[local-name()='code'][contains(text(), 'account')]";
NodeList nodeList = (NodeList) xPath.compile(query).evaluate(xmlDocument, XPathConstants.NODESET);
System.out.println("size= " + nodeList.getLength());
for (int i = 0; i < nodeList.getLength(); i++) {
System.out.println(nodeList.item(i).getNodeValue());
}
}
catch (Exception e){
e.printStackTrace();
}
The issue is that i'm getting only null values while printing in the for loop, any idea why it's happened?
The code needs to return a list of nodes which have a code and message fields that contains a given parameters (same as like SQL query with two parameters with operator of AND between them)
Check the documentation:
https://docs.oracle.com/javase/7/docs/api/org/w3c/dom/Node.html
getNodeValue() applied to an element node returns null.
Use getTextContent().
Alternatively, if you find DOM too frustrating, switch to one of the better tree models like JDOM2 or XOM.
Also, if you used an XPath 2.0 engine like Saxon, it would (a) simplify your expression to
//*:message-resource//*:code][contains(text(), 'account')]
and (b) allow you to return a sequence of strings from the XPath expression, rather than a sequence of nodes, so you wouldn't have to mess around with nodelists.
Another point: I suspect that the predicate [contains(text(), 'account')] should really be [.='account']. I'm not sure of that, but using text() instead of ".", and using contains() instead of "=", are both common mistakes.
I know that this sort of question has been asked here before, but still i couldn't find any solution to my problem. So please, can anyone help me with it.....
Situation:
I am parsing the following xml response using DOM parser.
<feed>
<post_id>16</post_id>
<user_id>68</user_id>
<restaurant_id>5</restaurant_id>
<dish_id>7</dish_id>
<post_img_id>14</post_img_id>
<rezing_post_id></rezing_post_id>
<price>8.30</price>
<review>very bad</review>
<rating>4.0</rating>
<latitude>22.299999000000</latitude>
<longitude>73.199997000000</longitude>
<near> Gujarat</near>
<posted>1340869702</posted>
<display_name>username</display_name>
<username>vivek</username>
<first_name>vivek</first_name>
<last_name>mitra</last_name>
<dish_name>Hash brows</dish_name>
<restaurant_name>Waffle House</restaurant_name>
<post_img>https://img1.yumzing.com/1000/9cab8fc91</post_img>
<post_comment_count>0</post_comment_count>
<post_like_count>0</post_like_count>
<post_rezing_count>0</post_rezing_count>
<comments>
<comment/>
</comments>
</feed>
<feed>
<post_id>8</post_id>
<user_id>13</user_id>
<restaurant_id>5</restaurant_id>
<dish_id>6</dish_id>
<post_img_id>7</post_img_id>
<rezing_post_id></rezing_post_id>
<price>3.50</price>
<review>This is cheesy!</review>
<rating>4.0</rating>
<latitude>42.187000000000</latitude>
<longitude>-88.346497000000</longitude>
<near>Lake in the Hills IL</near>
<posted>1340333509</posted>
<display_name>username</display_name>
<username>Nullivex</username>
<first_name>Bryan</first_name>
<last_name>Tong</last_name>
<dish_name>Hash Brows with Cheese</dish_name>
<restaurant_name>Waffle House</restaurant_name>
<post_img>https://img1.yumzing.com/1000/78e5c184fd3ae752f8665636381a8f0006762dc0</post_img>
<post_comment_count>6</post_comment_count>
<post_like_count>1</post_like_count>
<post_rezing_count>1</post_rezing_count>
<comments>
<comment>
<user_id>16</user_id>
<email>existentialism27#gmail.com</email>
<email_new></email_new>
<email_verification_code></email_verification_code>
<password>9d99ef4f72f9d2df968a75e058c78245fa45e9e7</password>
<password_reset_code></password_reset_code>
<salt>31a988badccd35a1be7dacc073f60f52e49ff881</salt>
<username>existentialism27</username>
<first_name>Daniel</first_name>
<last_name>Amaya</last_name>
<display_name>username</display_name>
<birth_month>10</birth_month>
<birth_day>5</birth_day>
<birth_year>1985</birth_year>
<city>Colorado Springs</city>
<state>CO</state>
<country>US</country>
<timezone>US/Mountain</timezone>
<last_seen>1338365509</last_seen>
<is_confirmed>1</is_confirmed>
<is_active>1</is_active>
<post_comment_id>9</post_comment_id>
<post_id>8</post_id>
<comment>this is a test comment!</comment>
<posted>1340334121</posted>
</comment>
<comment>
<user_id>16</user_id>
<email>existentialism27#gmail.com</email>
<email_new></email_new>
<email_verification_code></email_verification_code>
<password>9d99ef4f72f9d2df968a75e058c78245fa45e9e7</password>
<password_reset_code></password_reset_code>
<salt>31a988badccd35a1be7dacc073f60f52e49ff881</salt>
<username>existentialism27</username>
<first_name>Daniel</first_name>
<last_name>Amaya</last_name>
<display_name>username</display_name>
<birth_month>10</birth_month>
<birth_day>5</birth_day>
<birth_year>1985</birth_year>
<city>Colorado Springs</city>
<state>CO</state>
<country>US</country>
<timezone>US/Mountain</timezone>
<last_seen>1338365509</last_seen>
<is_confirmed>1</is_confirmed>
<is_active>1</is_active>
<post_comment_id>10</post_comment_id>
<post_id>8</post_id>
<comment>this is a test comment!</comment>
<posted>1340334128</posted>
</comment>
</comments>
</feed>
In the above xml response, i am getting multiple "feed", which i am able to parse without any problem, but here each "feed" can have None or N numbers of "comment". I am not able to parse the comment for an individual feed. Can anyone suggest me how do proceed in the right direction.
I am also putting a snippet of code here, NOT the entire code.. that i am using to parse the xml doc, so it will be easier to pin point the problem.
DocumentBuilderFactory odbf = DocumentBuilderFactory.newInstance();
DocumentBuilder odb = odbf.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(xml));
Document odoc = odb.parse(is);
odoc.getDocumentElement().normalize ();
NodeList LOP = odoc.getElementsByTagName("feed");
System.out.println(LOP.getLength());
for (int s=0 ; s<LOP.getLength() ; s++){
Node FPN =LOP.item(s);
try{
if(FPN.getNodeType() == Node.ELEMENT_NODE)
{
Element token = (Element)FPN;
NodeList oNameList0 = token.getElementsByTagName("post_id");
Element ZeroNameElement = (Element)oNameList0.item(0);
NodeList textNList0 = ZeroNameElement.getChildNodes();
feed_post_id = Integer.parseInt(((Node)textNList0.item(0)).getNodeValue().trim());
System.out.println("#####The Parsed data#####");
System.out.println("post_id : " + ((Node)textNList0.item(0)).getNodeValue().trim());
System.out.println("#####The Parsed data#####");
}
}catch(Exception ex){}
}
Once you have the feed NodeList run on it and use:
NodeList nodes = feedNode.getChildNodes();
for (Node node: nodes)
{
if(node.getNodeName().equals("comments")){
//do something with comments node
}
}
I want to parse xml elemets using java.I m succeeded in some part...But not sure how to do rest..I have xml as,
<MainTag>
<userid>user1</userid>
<country>US</country>
<city>LA</city>
<phone>
<number>1111111111</number>
</phone>
<phone>
<number>222222222</number>
</phone>
</MainTag>
<MainTag>
<userid>user2</userid>
<country>Aus</country>
<city>MB</city>
<phone>
<number>23233</number>
</phone>
<phone>
<number>8787822</number>
</phone>
<phone>
<number>10101</number>
</phone>
I am able to parse xml elements such as country,city etc as below.
public void endelement()
{
if (someText.equalsIgnoreCase("country"))
{
pojo.setCountry(Val);
}
else if(someText.equalsIgnoreCase("city"))
{
pojo.setCity(Val);
}
}
public void stratelement()
{
............
}
But in case of phone how I can parse it ? since one user has multiple phone nos.
I want to find multiple phone nos for particular user.
for e.g. in above xml
for user1 there are two phone nos.
for user2 there are three phone nos.
Can anybody help in this ? Thanks in advance.
I would recommend using JAXB, since it appears you are attempting to bind your xml to a POJO.
Looking at the code you have written here (and assuming that the example xml you have provided is a snippet of well formed xml), I am guess that your pojo object should have a member for phone numbers that is of type List<String>, and your pojo should have a method that allows you to add a phone number to the List (perhaps addPhoneNumber(String phoneNumber) {...})
First, that is not a well-formed XML (as it has two root elements) and you can't parse it with any parser API unless it is well-formed. Now, to parse the XML you would normally use the APIs meant for it like SAX, DOM or StAX or even better the JAXB binding API.
Since you seem to be new to this, I suggest you start learning JAXP. Use StAX instead of DOM or SAX.
you can use DocumetBuilderFactory java default class if you know the incoming xml format for example how many node it has and the names it is very simple look at this code ;
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
//documentBuilder instance
DocumentBuilder db = dbf.newDocumentBuilder();
Document dom = db.parse("employees.xml");
}catch(ParserConfigurationException pce) {
pce.printStackTrace();
}catch(SAXException se) {
se.printStackTrace();
}catch(IOException ioe) {
ioe.printStackTrace();
}
//and than get root element
Element de= dom.getDocumentElement();
//get the nodelist of main element
NodeList nl = de.getElementsByTagName("Employee");
if(nl != null && nl.getLength() > 0) {
for(int i = 0 ; i < nl.getLength();i++) {
//get the employee element
Element el = (Element)nl.item(i);
}
}
//and then get data
private void getEmployee(Element el) {
//for each <employee> element get values
String name = getTextValue(el,"Name");
int id = getIntValue(el,"Id");
int age = getIntValue(el,"Age");
//get any element attribute
//String type = el.getAttribute("type");
}
thats all
I am new to programming in java and i have just learned how to parse an xml file. But i am not getting any idea on how to parse this xml file. Please help me with a code on how to get the tags day1 and their inner tags order1,order2
<RoutePlan>
<day1>
<Order1>
<customer> XYZ</customer>
<address> INDIA </address>
<data> 10-10-2011 </data>
<time> 9.30 A.M </time>
</Order1>
<Order2>
<customer> ABC </customer>
<address> US </address>
<data> 10-10-2011 </data>
<time> 10.30 A.M </time>
</Order2>
</day1>
I wrote the following code to retrieve. But i am only getting the data in order1 but not in order2
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document document = db.parse(file);
document.getDocumentElement().normalize();
System.out.println("Root Element: "+document.getDocumentElement().getNodeName());
NodeList node = document.getElementsByTagName("day1");
for(int i=0;i<node.getLength();i++){
Node firstNode = node.item(i);
Element element = (Element) firstNode;
NodeList customer = element.getElementsByTagName("customer");
Element customerElement = (Element) customer.item(0);
NodeList firstName = customerElement.getChildNodes();
System.out.println("Name: "+((firstName.item(0).getNodeValue())));
NodeList address = element.getElementsByTagName("address");
Element customerAddress = (Element) address.item(0);
NodeList addName = customerAddress.getChildNodes();
System.out.println("Address: "+((addName.item(0).getNodeValue())));
NodeList date = element.getElementsByTagName("date");
Element customerdate = (Element) date.item(0);
NodeList dateN = customerdate.getChildNodes();
System.out.println("Address: "+((dateN.item(0).getNodeValue())));
NodeList time = element.getElementsByTagName("time");
Element customertime = (Element) time.item(0);
NodeList Ntime = customertime.getChildNodes();
System.out.println("Time: "+((Ntime.item(0).getNodeValue())));
}
I can give you not one, not two, but three directions to parse this XML (there are more but let's say they are the most commons ones):
DOM -> two good resources to start : here and here
SAX -> quickstart from official website: here
StAX -> a good introduction: here
Judging by the size of your XML document, I'd probably go for a DOM parsing, which gonna be the easiest to implement and to use (but if you have to deal with larger files, take a look at SAX for reading-only manipulations and StAX for reading and writing ones).
The reason you are getting only "Order1" elements is because:
You lock on the "day1" node.
You retrieve the "customer" elements by tag name which returns 2 elements.
You retrieve the first element and print its value and hence the second "customer" is ignored.
When working with DOM, be prepared to spin up multiple loops for retrieving data. Also, you are a bit misguided when it comes to representing your schema. You really don't need to name "elements" as "day1"/"order1" etc. In XML, that can be simply expressed by having multiple "day" or "order" elements which in turn automatically enforces ordering. An example XML would look like:
<route-plan>
<day>
<order>
<something>
</order>
</day>
<day>
<order>
<something>
</order>
</day>
</route-plan>
Now retrieving "day" elements is a simple matter of:
Look up "day" elements by tag name
For each "day" element
Look up "order" element by tag name
For each "order" element
Print out the value of "customer"/"address" etc.