Here is my code:
// get the factory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
// Using factory get an instance of document builder
DocumentBuilder db = dbf.newDocumentBuilder();
// parse using builder to get DOM representation of the XML file
dom = db.parse(file);
} catch (ParserConfigurationException pce) {
pce.printStackTrace();
} catch (SAXException se) {
se.printStackTrace();
} catch (IOException ioe) {
ioe.printStackTrace();
}
NodeList n1 = dom.getChildNodes();
Element e1 = (Element) n1.item(0);
System.out.println(n1.getLength());
System.out.println(e1.getNodeName());
NodeList n2 = n1.item(0).getChildNodes();
Element e2 = (Element) n2.item(0); //Line 61
System.out.println(n2.getLength());
System.out.println(e2.getNodeName());
Here is my XML file:
<?xml version="1.0" encoding="utf-8"?>
<test-fw:test
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:test-fw="http://simitar/test-fw">
<rule-tree>
<rule class="matchlines">
<property name="contiguous"> true</property>
<property name="inOrder">false</property>
<property name="exact">false</property>
<property name="lines">modelInstantiated</property>
</rule>
<rule class="matchlines">
<property name="contiguous"> true</property>
<property name="inOrder">true</property>
<property name="exact">false</property>
<property name="lines">InitEvent</property>
</rule>
</rule-tree>
</test-fw:test>
Here is my output:
1
test-fw:test
Exception in thread "main" java.lang.ClassCastException: com.sun.org.apache.xerces.internal.dom.DeferredTextImpl cannot be cast to org.w3c.dom.Element
at testpack.Main.run(Main.java:61)
at testpack.Main.main(Main.java:86)
I keep getting this error. I am completely lost. I have no idea what to do. I want to able to have one node, and be able to grab all it's children and put them into an array or list, so I can iterate through them.
Here are all my imports:
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.Stack;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
I've had the hardest time trying to get this Java to parse this XML file.
NodeList n1 = dom.getChildNodes();
Element e1 = (Element) n1.item(0);
The node is not an Element, but a Node.
Try this:
Node no1 = (Node) n1.item(0);
Nodes can be text nodes or elements, for example. In particular,
<root>
<element/>
</root>
is 4 nodes. A root element, a text node containing \n, the element element and another text node containing \n.
Notice that NodeList.itemreturns a Node object, which can but does not have to be an Element.
In your case, the method returns a DeferredTextImpl instance, which represents a text node. This class implements the DeferredNode interface, which is, in turn, a subinterface of Node.
In order to process the Node instances, you'll have to make sure you can safely perform the cast. The Node interface provides methods that allow you to check the type of a node getNodeType, which returns a short value that you can compare to the constants defined in the very same interface like ELEMENT_NODE
Just need to check the Nodeis an Element or not . Following is the way to convert Node into Element.
NodeList nodes = root.getChildNodes();
for (int i = 0; i < nodes.getLength(); i++) {
if(nodes.item(i).getNodeType() == Node.ELEMENT_NODE){
Element element = (Element) nodes.item(i);
............................
}
}
Related
Currently, I need to get the element of XML without escaping.
For example:
<?xml version="1.0" encoding="UTF-8"?>
<Message>
<Header>H001</Header>
<Body>
<Item>ABC&ABC"</Item>
</Body>
</Message>
I need to get the value of "Item" element via XPath.
However, it is escaped automatically.
My Result = ABC&ABC"
Expected = ABC&ABC"
How can I get the expected value?
XPath will always return the values of nodes that result from XML parsing. The string value of the Item element in your XML, after parsing, is ABC&ABC", so that's what XPath gives you. If you want ABC&ABC" then you will have to reverse the action of the XML parser - this is known as serialization. Parsing "unescapes" entity and character references (it turns & into &). Serialization escapes special characters such as "&" (it turns & into &).
Put content surrounded by CDATA.
Note: Charater data (CDATA) will tell the parser to send the text as regular text (no markup) without parsing.
For example :
abc.xml
<?xml version="1.0" encoding="UTF-8"?>
<Messages>
<Message>
<Header>H001</Header>
<Body>
<Item><![CDATA[ABC&&ABC"]]></Item>
</Body>
</Message>
</Messages>
Java code :
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class Test {
public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputStream input = Thread.currentThread().getContextClassLoader().getResourceAsStream("abc.xml");
Document doc = builder.parse(input);
doc.getDocumentElement().normalize();
NodeList list = doc.getElementsByTagName("Message");
for (int i = 0; i < list.getLength(); i++) {
Node node = list.item(i);
NodeList children = node.getChildNodes();
for (int j = 0; j < children.getLength(); j++) {
node = children.item(j);
System.out.println(node.getTextContent().trim());
}
}
}
}
Output :
H001
ABC&&ABC"
How can I parse and send data fields to db using java. I need code for store data to db. It would require any extra dependencies.
<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soap:Body>
<WS_FulfillmentResponse xmlns="https://imfawstest.codemantra.com/AIDC/">
<WS_FulfillmentResult>
<Date_Generated>12/2/2019 2:24:03 AM</Date_Generated>
<product>
<PUB_ID>25905</PUB_ID>
<LOG_NO>46725</LOG_NO>
<PIN_NO>94389</PIN_NO>
<SERIES_CODE>001</SERIES_CODE>
<SERIES_DESC>IMF Working Papers</SERIES_DESC>
<TITLE><![CDATA[cM_Safe and Wholesome Food - Nordic Reflections]]></TITLE>
<SUBTITLE />
<PUBL_SERIES_VOLNO>Working Paper No. 12/596</PUBL_SERIES_VOLNO>
<LANGUAGE_ID>4</LANGUAGE_ID>
<EDITION />
<PRC_STATUS_ID>11</PRC_STATUS_ID>
<PRC_STATUS_CODE>40</PRC_STATUS_CODE>
<Process_Status_Desc>Key' metadata required for promotion is entered by now (Project ID & description, Stock No.)</Process_Status_Desc>
<PUBLISHED_DATE />
<REVISION_DATE />
<EST_COMPL_DATE />
<ISBN>9781498302258</ISBN>
<ISSN_ID>152</ISSN_ID>
<COPYRIGHT_YR>2019</COPYRIGHT_YR>
<DOI>10.5089/9781498302258.001</DOI>
<Persistent_Link>https://elibrary.imf.org/view/IMF001/25905-9781498302258/25905-9781498302258/25905-9781498302258.xml</Persistent_Link>
<SKU>802258</SKU>
<DRAFT>0</DRAFT>
<FRONT_MATTERS>0</FRONT_MATTERS>
<MAIN>0</MAIN>
<Size>Small</Size>
<TOTAL>0</TOTAL>
<ROMAN_ARABIC />
<Dept_Phone>37779</Dept_Phone>
<Dept_Email>JLI2</Dept_Email>
<Dept_Contact_ID>22382</Dept_Contact_ID>
<Dept_Contact_Name />
<EXR_Editor_Name>Jim Beardow</EXR_Editor_Name>
<EXR_Editor_ID>45</EXR_Editor_ID>
<EXR_Editor_Phone>37899</EXR_Editor_Phone>
<EXR_Editor_Email>jbeardow#imf.org</EXR_Editor_Email>
<OutsidePublisher_Id>0</OutsidePublisher_Id>
<OutsidePublisher_Name />
<OutsidePublisher_Email />
<OutsidePublisher_Phone />
<OutsidePublisher_Address />
<PUBL_MANUS_EXPT_DATE>10/31/2019</PUBL_MANUS_EXPT_DATE>
<PUBL_DATE_TO_GRAPHICS />
<PUBL_DATE_TRANSLATION_RECD />
<PUBL_DATE_SENT_TRANSLATION>10/31/2019</PUBL_DATE_SENT_TRANSLATION>
<PUBL_MANUS_RCVD_DATE>10/31/2019</PUBL_MANUS_RCVD_DATE>
<PUBL_ASG_EDITOR_DATE>10/31/2019</PUBL_ASG_EDITOR_DATE>
<FIRST_RECIEPT_DATE>10/31/2019</FIRST_RECIEPT_DATE>
<PUBL_OUT_OF_PRINT_DATE />
<PRIORITY>Medium</PRIORITY>
<PUBL_DEPT_SUMMARY />
<PUBL_NOTE_CORESP />
<PUBL_INTERNAL_REMARKS />
above is sample format how to parse and take some fields name and isbn,price some more send to db how.
You can extract each data using inbuilt Java API for XML Processing called DOM XML Parser. It turns the XML file into DOM or Tree structure, and you have to traverse a node by node to get what you want. Here is an example that shows how to do it,
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
try {
// Creating a constructor of file class and parsing an XML file
File file = new File("XMLFile.xml");
// An instance of factory that gives a document builder
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// An instance of builder to parse the specified XML file
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(file);
doc.getDocumentElement().normalize();
System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
NodeList nodeList = doc.getElementsByTagName("product");
// NodeList is not iterable, so we are using for loop
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
System.out.println("\nNode Name :" + node.getNodeName());
if (node.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) node;
System.out.println("PUB ID: " + eElement.getElementsByTagName("PUB_ID").item(0).getTextContent());
System.out.println("LOG NO: " + eElement.getElementsByTagName("LOG_NO").item(0).getTextContent());
System.out.println("PIN NO: " + eElement.getElementsByTagName("PIN_NO").item(0).getTextContent());
System.out.println("SERIES CODE: " + eElement.getElementsByTagName("SERIES_CODE").item(0).getTextContent());
System.out.println("SERIES DESC: " + eElement.getElementsByTagName("SERIES_DESC").item(0).getTextContent());
}
}
} catch (SAXException | IOException | ParserConfigurationException ex) {
Logger.getLogger(YourClass.class.getName()).log(Level.SEVERE, null, ex);
}
And then you can code an insert query to send data to your database.
Hope this helps you!
I am trying to find the relative depth of given XML element from specific element in the given XML file, I tried to use XPATH but I'm not very familiar with XML parsing and I'm not getting the desired result. I need as well to ignore the data elements while counting.
Below is the code that I have written and the sample XML file.
E.g. the depth of NM109_BillingProviderIdentifier from TS837_2000A_Loop element is 4.
The parent nodes are: TS837_2000A_Loop < NM1_SubLoop_2 < TS837_2010AA_Loop < NM1_BillingProviderName
as NM109_BillingProviderIdentifier is a child of NM1_BillingProviderName and thus the relative depth of NM1_BillingProviderName from TS837_2000A_Loop is 4 (including TS837_2000A_Loop).
package com.xmlexamples;
import java.io.File;
import java.io.FileInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
public class XmlParser {
public static void main(String[] args) throws Exception {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setValidating(false);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new FileInputStream(new File("D://sample.xml")));
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
String expression;
expression = "count(NM109_BillingProviderIdentifier/preceding-sibling::TS837_2000A_Loop)+1";
Double d = (Double) xpath.compile(expression).evaluate(doc, XPathConstants.NUMBER);
System.out.println("position from TS837_2000A_Loop " + d);
}
}
<?xml version='1.0' encoding='UTF-8'?>
<X12_00501_837_P xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<TS837_2000A_Loop>
<NM1_SubLoop_2>
<TS837_2010AA_Loop>
<NM1_BillingProviderName>
<NM103_BillingProviderLastorOrganizationalName>VNA of Cape Cod</NM103_BillingProviderLastorOrganizationalName>
<NM109_BillingProviderIdentifier>1487651915</NM109_BillingProviderIdentifier>
</NM1_BillingProviderName>
<N3_BillingProviderAddress>
<N301_BillingProviderAddressLine>8669 NORTHWEST 36TH ST </N301_BillingProviderAddressLine>
</N3_BillingProviderAddress>
</TS837_2010AA_Loop>
</NM1_SubLoop_2>
</TS837_2000A_Loop>
</X12_00501_837_P>
The pivotal method for getting the depth of any node is by counting its ancestors (which include the parent, the parent of the parent etc):
count(NM109_BillingProviderIdentifier/ancestor-or-self::*)
This will give you the count up to the root. To get the relative count, i.e. from anything other than the root, assuming the names do not overlap, you can do this:
count(NM109_BillingProviderIdentifier/ancestor-or-self::*)
- count(NM109_BillingProviderIdentifier/ancestor::TS837_2000A_Loop/ancestor::*)
Depending on whether the current, or the base element should be included in the count, use the ancestor-or-self or ancestor axis.
PS: you should probably thank Pietro Saccardi for so kindly making your post and your huge (4kB on one line..) sample XML readable.
hello everyone i have on question for xpath
/abcd/nsanity/component_details[#component="ucs"]/command_details[<*configScope inHierarchical="true" cookie="{COOKIE}" dn="org-root" */>]/collected_data
i want to retrieve the string for above the xpath statement but when i am giving this xpath to xpath expression for evaulate it is throwing an exception like
Caused by: javax.xml.transform.TransformerException: A location path was expected, but the following token was encountered: <configScope
The bold part in your XPath expression is not a valid predicate expression. I can only guess, what do you want to achieve. If you want only the <command_details/> elements, which have a <configScope/> child element with attributes set to inHierarchical="true", cookie="{COOKIE}" and dn="org-root" then the XPath expression should be:
/abcd/nsanity/component_details[#component='ucs']/command_details[configScope[#inHierarchical='true' and #cookie='{COOKIE}' and #dn='org-root']]/collected_data
Here is an example XML:
<abcd>
<nsanity>
<component_details component="ucs">
<command_details>
<configScope inHierarchical="true" cookie="{COOKIE}" dn="org-root" />
<collected_data>Yes</collected_data>
</command_details>
<command_details>
<configScope inHierarchical="true" cookie="{COOKIE}" dn="XXX"/>
<collected_data>No</collected_data>
</command_details>
</component_details>
</nsanity>
</abcd>
The following Java program reads the XML file test.xml and evaluates the XPath expression (and prints the text node of element <collected_data/>.
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
public class Test {
public static void main(String[] args) throws Exception {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
Document document = dbf.newDocumentBuilder().parse("test.xml");
XPath xpath = XPathFactory.newInstance().newXPath() ;
NodeList nl = (NodeList) xpath.evaluate("/abcd/nsanity/component_details[#component='ucs']/command_details[configScope[#inHierarchical='true' and #cookie='{COOKIE}' and #dn='org-root']]/collected_data", document, XPathConstants.NODESET);
for(int i = 0; i < nl.getLength(); i++) {
Element el = (Element) nl.item(i);
System.out.println(el.getTextContent());
}
}
}
I have the following xml file:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<config>
<a>
<b>
<param>p1</param>
<param>p2</param>
</b>
</a>
</config>
and the xpath code to get my node params:
Document doc = ...;
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("/config/a/b");
Object o = expr.evaluate(doc, XPathConstants.NODESET);
NodeList list = (NodeList) o;
but it turns out that the nodes list (list) has 5 children, including "\t\n", instead of just two. Is there something wrong with my code? How can I just get my two nodes?
Thank you!
When you select /config/a/b/, you are selecting all children of b, which includes three text nodes and two elements. That is, given your XML above and only showing the fragment in question:
<b>
<param>p1</param>
<param>p2</param>
</b>
the first child is the text (whitespace) following <b> and preceding <param>p1 .... The second child is the first param element. The third child is the text (whitespace) between the two param elements. And so on. The whitespace isn't ignored in XML, although many forms of processing XML ignore it.
You have a couple choices:
Change your xpath expression so it will only select element nodes, as suggested by Ted Dziuba, or
Loop over the five nodes returned and only select the non-text nodes.
You could do something like this:
for (int i = 0; i < nodes.getLength(); i++) {
if (nodes.item(i).getNodeType() != Node.TEXT_NODE) {
System.out.println(nodes.item(i).getNodeValue());
}
}
You can use the node type to select only element nodes, or to remove text nodes.
so the xpath looks like:
/config/a/b/*/text().
And the output for :
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getNodeValue());
}
would be as expected: p1 and p2
How about
/config/a/b/*/text()/..
?
import org.w3c.dom.*;
import javax.xml.xpath.*;
import javax.xml.parsers.*;
import java.io.IOException;
import org.xml.sax.SAXException;
public class TestClient_XPath {
public static void main(String[] args) throws ParserConfigurationException,
SAXException, IOException, XPathExpressionException {
DocumentBuilderFactory domFactory = DocumentBuilderFactory
.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("yourfile.xml");
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression xPathExpression = xpath.compile("/a/b/c");
Object res = xPathExpression.evaluate(doc);
System.out.println(res.toString());
}
}
Xalan and Xerces appear to be embedded in rt.jar.
Don't include xerces and xalan libs.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4624775
I am not sure but shouldn't /config/a/b just return b? /config/a/b/param should return the two param nodes...
Could the view on the problem be the problem? Of course you get back the resulting node AND all its children. So you just have to look at the first element and not at its children.
But I can be totally wrong, because I am usually just use Xpath to navigate on DOM trees (HtmlUnit).