Parsing XML in Java - java

I have a response structure that I want to parse in Java. Can anyone help me with this?
<message_response xmlns="">
<action name="GETCIL">
<param name="bookingNote" value="" require="" read-only=""><![CDATA[bookingNote]]></param>
<param name="CarrierLinkType" value="" require="" read-only=""><![CDATA[True]]></param>
<param name="Carrier" value="" require="" read-only=""><![CDATA[SK185]]></param>
<param_list name="ViaAddressList" id="GETCIL">
<value>
<param_list name="ViaAddressId" id="ViaAddressList">
<value><![CDATA[877765050_5511]]></value>
</param_list>
<param_list name="AddressDate" id="ViaAddressList">
<value><![CDATA[10/12/2010]]></value>
</param_list>
<param_list name="AddressTime" id="ViaAddressList">
<value><![CDATA[12:12]]></value>
</param_list>
</value>
</param_list>
</action>
</message_response>

The easiest way to extract specific values from an XML document (as opposed to parsing the complete document with SAX) is to use XPath as follows:
// 1. Create a DocumentBuilder to load the document into memory.
DocumentBuilder documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
// 2. Create an XPath.
XPath xpath = XPathFactory.newInstance().newXPath();
// 3. Parse the file and evaluate the XPath expression (@name selects the name attribute).
String actionName = xpath.evaluate("/message_response/action/@name", documentBuilder.parse(xmlFile));
There's not much more to it, other than that the XPath.evaluate method is overloaded so that nodes and node lists can be returned as well (see javax.xml.xpath.XPathConstants for the return types).
Then you just need to read up on the XPath syntax (http://www.w3schools.com/xpath/xpath_syntax.asp).
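As a minimal sketch of the overloads mentioned above, the following reads the action name as a string and the param elements as a node list. The class name ResponseXPathDemo, its helper methods, and the trimmed XML constant are assumptions made for illustration; only the standard JAXP classes come from the answer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original answer.
class ResponseXPathDemo {
    // Trimmed copy of the response from the question (assumption: two params only).
    static final String XML =
        "<message_response>"
      + "<action name=\"GETCIL\">"
      + "<param name=\"Carrier\" value=\"\"><![CDATA[SK185]]></param>"
      + "<param name=\"CarrierLinkType\" value=\"\"><![CDATA[True]]></param>"
      + "</action></message_response>";

    private static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // String overload: returns the string value of the first matching node.
    static String actionName(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate("/message_response/action/@name", parse(xml));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // NODESET overload: returns every matching param element as "name=text".
    static List<String> params(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                "/message_response/action/param", parse(xml), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element param = (Element) nodes.item(i);
                // getTextContent() also returns text wrapped in CDATA sections.
                result.add(param.getAttribute("name") + "=" + param.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(actionName(XML));
        System.out.println(params(XML));
    }
}
```

The same pattern should extend to the nested param_list elements by changing the path expression.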

Why the CDATA sections around the data?
You can use SAX or DOM to parse XML.
There are also libraries wrapping SAX and DOM parsers that make your life easier for common tasks. Two that come to mind for Java are JDOM and DOM4J. Google for them - there are tutorials and examples available that will show you what you need to know.

Related

How to use tokenize function in Xpath

Can we use the tokenize function in XPath?
The general Java code I use to process XSLT and XML files is:
XPath xPath = XPathFactory.newInstance().newXPath();
InputSource inputXML = new InputSource(new StringReader(xml));
String expression = "/root/customer/personalDetails[age=tokenize('20|30','|')]/name";
boolean evaluate1 = (boolean) xPath.compile(expression).evaluate(inputXML, XPathConstants.BOOLEAN);
XML :-
<?xml version="1.0" encoding="ISO-8859-15"?>
<root>
<customer>
<personalDetails>
<name>ABC</name>
<value>20</value>
</personalDetails>
<personalDetails>
<name>XYZ</name>
<value>21</value>
</personalDetails>
<personalDetails>
<name>PQR</name>
<value>30</value>
</personalDetails>
</customer>
</root>
Expected Response :- ABC,PQR
Yes, you can use the tokenize() function in XPath, provided your XPath processor supports XPath 2.0 or later.
For Java, the popular choice of XPath 2.0+ processor is Saxon.
You can use the JAXP API with Saxon, however, it's not really designed to work well with XPath 2.0+, so it's preferable to use Saxon's own API (called s9api).
For this particular example, you don't need tokenize(). In XPath 2.0+ you can write
[age=('20', '30')]

How to get specific field values from XML Response in Java?

When I am printing my API response, which gives me below xml as Response:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<BugInfo xmlns="ctessng" xmlns:ns2="http://www.w3.org/1999/xlink">
<Bug id="CSCvz53137">
<Field name="Assigned Date">09/01/2021 21:12:25</Field>
<Field name="Archived">N</Field>
<Field name="Assigner">James Vilson</Field>
<Field name="Status">V</Field>
<Field name="Submitter">Spark Mery</Field>
<Field name="Reason">Technically Inaccurate</Field>
<Field name="Regression">Y</Field>
<Field name="Resolved Date">09/02/2021 02:12:37</Field>
<Field name="Version">001.010</Field>
</Bug>
</BugInfo>
I want to fetch only specific values from this XML: Assigned Date, Assigner, Submitter and Resolved Date.
Assigned Date --> 09/01/2021 21:12:25
Assigner --> James Vilson
Submitter --> Spark Mery
Resolved Date --> 09/02/2021 02:12:37
What is the best/simplest way to read in values from this xml?
Regex
The most versatile would be plain text-filtering (match/find, extract) using a regular expression:
<Field name=\"(Assigned Date|Assigner|Submitter|Resolved Date)\">(.*)<
Iterating with find() then group(1) and group(2) can give you the desired strings.
See this regex demo
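A small sketch of the find()/group() loop described above. The class name FieldRegexDemo is an assumption, and the pattern is a slight variation on the one in the answer: it uses a reluctant (.*?) and an explicit closing tag so that it still works when several Field elements share a line. Like any regex over XML, it depends on the exact formatting of the response.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original answer.
class FieldRegexDemo {
    // group(1) is the field name, group(2) is the field text.
    private static final Pattern FIELD = Pattern.compile(
        "<Field name=\"(Assigned Date|Assigner|Submitter|Resolved Date)\">(.*?)</Field>");

    static Map<String, String> extract(String xml) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher matcher = FIELD.matcher(xml);
        // find() advances to the next match on every call.
        while (matcher.find()) {
            result.put(matcher.group(1), matcher.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<Bug id=\"CSCvz53137\">\n"
            + "<Field name=\"Assigned Date\">09/01/2021 21:12:25</Field>\n"
            + "<Field name=\"Archived\">N</Field>\n"
            + "<Field name=\"Assigner\">James Vilson</Field>\n"
            + "</Bug>";
        extract(xml).forEach((name, value) -> System.out.println(name + " --> " + value));
    }
}
```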
XPath
The pure XML-parsing way would be to use an XML parser, such as a DocumentBuilder obtained from DocumentBuilderFactory, to read the XML into a document, and then find the desired XML nodes (Field elements) via an XPath expression:
/BugInfo/Bug/Field[@name="Assigner"]|//Field[@name="Assigned Date"]|//Field[@name="Submitter"]|//Field[@name="Resolved Date"]
Iterating over the found nodes, we can extract each node's child as its text value.
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xPath.compile(xPathExpression).evaluate(xmlDocument, XPathConstants.NODESET);
See:
Filtering XML Document using XPATH in java
XPath OR operator for different nodes
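The XPath approach could be sketched end to end as follows. Two things here are assumptions rather than part of the answer: the class name BugFieldXPathDemo, and the way the namespace is handled. The response declares a default namespace (xmlns="ctessng"), so this sketch parses namespace-aware and matches elements with local-name() instead of registering a NamespaceContext.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original answer.
class BugFieldXPathDemo {
    // Matches Field elements in any namespace whose name attribute is one of the wanted four.
    private static final String EXPRESSION =
        "//*[local-name()='Field']"
      + "[@name='Assigned Date' or @name='Assigner' or @name='Submitter' or @name='Resolved Date']";

    static Map<String, String> fields(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.compile(EXPRESSION)
                    .evaluate(document, XPathConstants.NODESET);

            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element field = (Element) nodes.item(i);
                result.put(field.getAttribute("name"), field.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<BugInfo xmlns=\"ctessng\"><Bug id=\"CSCvz53137\">"
            + "<Field name=\"Assigner\">James Vilson</Field>"
            + "<Field name=\"Status\">V</Field>"
            + "<Field name=\"Submitter\">Spark Mery</Field>"
            + "</Bug></BugInfo>";
        fields(xml).forEach((name, value) -> System.out.println(name + " --> " + value));
    }
}
```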
XML mapping
The object-oriented way would use an XML mapper like Jackson to deserialize (unmarshall) the XML to an object.
Similar to the OkHTTP Recipe: Parse a JSON Response With Moshi (.kt, .java)
Then you would need a class where you can map the XML nodes to.
class Bug {
String submitter;
String assigner;
Date assignedOn;
Date resolvedOn;
}
The mapping can be a bit tricky here, because from the XML model's point of view a Bug node contains a collection of child Field elements. The target type, however, is semantically not a list of fields but a Bug object with differently typed properties.
This is probably the cleanest because it will be easy to parse: Bug bug = new XmlMapper().readValue(xmlString, Bug.class).

XML parsing /JAVA

I have an XML, for example
<root>
<config x="xxx" y="yyyy" z="zzz" />
<properties>blah blah blah </properties>
<example>
<name>...</name>
<descr>...</descr>
</example>
<example>
<name>...</name>
<descr>...</descr>
</example>
</root>
and I need to get the config and properties nodes and all the values in them.
Thank you
XPath can fetch the data in the config tag. You need to create an expression first, like this:
expression="//root/config/@x" to get the value of x; use @y and @z in the same way for y and z.
For properties, follow this thread :
Parsing XML with XPath in Java
Hope this helps
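A minimal sketch of that idea, assuming a hypothetical class ConfigXPathDemo: the expression /root/config/@* selects every attribute of config in one go, and /root/properties evaluated as a string gives the element's text.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original answer.
class ConfigXPathDemo {
    private static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Collects all attributes of <config>, sorted by attribute name.
    static Map<String, String> configAttributes(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList attributes = (NodeList) xpath.evaluate(
                "/root/config/@*", parse(xml), XPathConstants.NODESET);
            Map<String, String> result = new TreeMap<>();
            for (int i = 0; i < attributes.getLength(); i++) {
                Node attribute = attributes.item(i);
                result.put(attribute.getNodeName(), attribute.getNodeValue());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the trimmed text content of <properties>.
    static String propertiesText(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate("/root/properties", parse(xml)).trim();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><config x=\"xxx\" y=\"yyyy\" z=\"zzz\" />"
            + "<properties>blah blah blah </properties></root>";
        System.out.println(configAttributes(xml));
        System.out.println(propertiesText(xml));
    }
}
```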
DOM,DOM4J,SAX..
If the XML file is small, you can use DOM or DOM4J; if it is big, use SAX.
If you directly want to query or fetch data XPath can help, but if you want the data as Java Objects so that you can use it further then use JAXB
You can use a SAX parser to read the XML; its event-based parsing keeps memory use low, but it does not let you manipulate the document.
If your XML requires a lot of manipulation and fits in memory, go for DOM or DOM4J; either is good. DOM4J is the newer of the two, and DOM is widely used in industry.
Based on your requirement go for good parser.
Thanks,
Pavan

XML Parent & child Attributes reading in java?

I have the following data in my XML file.
<main>
<Team Name="Development" ID="10">
<Emp Source="Business" Total="130" Active="123" New="12" />
<Emp Source="Business" Total="131" Active="124" New="13" />
</Team>
<Team Name="Testing" ID="10">
<Emp Source="Business" Total="133" Active="125" New="14" />
</Team>
</main>
I want to read the above data and store the values in arrays. Can anyone help with this?
I'm not sure why you need to convert the XML into arrays, but you can read and parse XML in several ways. Normally we use a DOM or StAX parser; tutorials are available for both, as well as for Java SAX parsing.
I hope this helps you achieve your goal. Update your question if you get stuck anywhere.
You can use a parser in Java to parse the XML document. The package for this purpose is javax.xml.parsers. DocumentBuilder parses XML into a Document, a tree-structured representation known as the DOM (Document Object Model). Its nodes can be traversed, changed, and accessed through DOM methods.
Here is a very good tutorial on XML DOM: http://www.roseindia.net/xml/dom/
and more specifically: http://www.roseindia.net/xml/dom/accessing-xml-file-java.shtml
also you can always refer to w3school for more theory on DOM!
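For the Team/Emp data in the question, the DOM traversal could look roughly like this. The class name TeamDomDemo and the row layout (team name, then the Emp attributes) are assumptions chosen for illustration; the asker may want a different array shape.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original answers.
class TeamDomDemo {
    // One row per Emp element: {team name, Source, Total, Active, New}.
    static List<String[]> rows(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String[]> rows = new ArrayList<>();
            NodeList teams = document.getElementsByTagName("Team");
            for (int t = 0; t < teams.getLength(); t++) {
                Element team = (Element) teams.item(t);
                // getElementsByTagName on an element searches only its descendants.
                NodeList emps = team.getElementsByTagName("Emp");
                for (int e = 0; e < emps.getLength(); e++) {
                    Element emp = (Element) emps.item(e);
                    rows.add(new String[] {
                        team.getAttribute("Name"),
                        emp.getAttribute("Source"),
                        emp.getAttribute("Total"),
                        emp.getAttribute("Active"),
                        emp.getAttribute("New")
                    });
                }
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<main>"
            + "<Team Name=\"Development\" ID=\"10\">"
            + "<Emp Source=\"Business\" Total=\"130\" Active=\"123\" New=\"12\" />"
            + "<Emp Source=\"Business\" Total=\"131\" Active=\"124\" New=\"13\" />"
            + "</Team>"
            + "<Team Name=\"Testing\" ID=\"10\">"
            + "<Emp Source=\"Business\" Total=\"133\" Active=\"125\" New=\"14\" />"
            + "</Team></main>";
        for (String[] row : rows(xml)) {
            System.out.println(String.join(",", row));
        }
    }
}
```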

XML Parsing - DOM or SAX - Complex xml with attributes as conditions to access hierarchy in java

<playingTestCodeDetails classCode="ENT" determinerCode="INSTANCE" >
<realmCode code="QD" />
<id assigningAuthorityName="PRMORDCODE" extension="16494" />
<id assigningAuthorityName="TESTNUMINBOOK" extension="16494" />
<code code="16494" codeSystemName="QTIM" displayName="SureSwab Candidiasis" />
<name use=""></name>
<asSeeAlsoCode classCode="ROL" > <!-- Have repeated Seealsocode section for multiple see also codes and stripped names -->
<realmCode code="QD" />
<code code="7600" displayName="Sample See Also Name" ></code>
</asSeeAlsoCode>
<asSeeAlsoCode classCode="ROL" >
<realmCode code="QD" />
<code code="6496" displayName="Sample See Also Name" ></code>
</asSeeAlsoCode>
</playingTestCodeDetails>
<subjectOf typeCode="SBJ">
<realmCode code="QD" />
<order classCode="OBS" moodCode="EVN" >
<realmCode code="QD" />
<performer nullFlavor="" typeCode="PRF"><!-- Have added this to accomodate the UnitCode-->
<performingLocatedEntity classCode="LOCE" nullFlavor="">
<locatedPerformingSite classCode="ORG" determinerCode="INSTANCE">
<id assigningAuthorityName="ASORDERED" extension="16494" />
</locatedPerformingSite>
</performingLocatedEntity>
</performer>
<origin nullFlavor="" typeCode="ORG"> <!-- Have added this to accomodate the Ordering Lab Code-->
<orderingLocatedEntity classCode="LOCE" >
<locatedOrderingSite classCode="ORG" determinerCode="INSTANCE">
<id assigningAuthorityName="PRMORDCODE" extension="16494"/>
<code code="SJC" codeSystemName="QTIM" codeSystem="ORDERINGLABCODE"/>
</locatedOrderingSite>
</orderingLocatedEntity>
</origin>
<pertinentInformation1 typeCode="PERT">
<realmCode code="QD" />
<clinicalInfo classCode="ACT" moodCode="EVN">
<realmCode code="QD" />
<title>Specialitysample1</title>
<text>Conditionsample1</text>
</clinicalInfo>
</pertinentInformation1>
<subjectOf typeCode="SUBJ">
<realmCode code="QD" />
<annotation classCode="ACT" moodCode="EVN" >
<realmCode code="QD" />
<code code="DOSCATNAME"></code>
<text><![CDATA[SureSwab<sup>®</sup>, <em>Candidiasis</em>, PCR]]></text>
</annotation>
</subjectOf>
</subjectOf>
I have XML that looks like the above and want to parse it. What is the best way to parse it: DOM or SAX? (I have heard of JAXB and XSLT but am not sure about those two.) Can we combine DOM and SAX to parse one XML document?
One simple scenario is to get a tag's value based on its "code" attribute:
for example, when code=DOSCATNAME in a tag, we need to take the data from the corresponding tag.
The other scenario is to walk the hierarchy and read the extension attribute of a tag when its assigningAuthorityName attribute has the value PRMORDCODE.
Can these two scenarios be achieved using a parser?
I am a newbie; please consider what I need to parse and suggest an approach. Thanks in advance.
Use JAXB. Create class model and annotate your classes appropriately. The environment will do the rest.
For example you should create class PlayingTestCodeDetails with properties classCode, determinerCode etc.
I will tell you more: you can kindly ask JAXB to generate the classes for you. Start learning from this article: http://www.roseindia.net/jaxb/r/jaxb.shtml
It will take a couple of hours to get started, but then you will be done in 15 minutes. With DOM you can get started after 15 minutes of learning, but coding the parser for your XML will take a couple of days.
It depends on your need which to use.
Both SAX and DOM are used to parse XML documents. Both have advantages and disadvantages, and either can be used depending on the situation.
SAX
• Parses node by node
• Doesn’t store the XML in memory
• We can't insert or delete a node
• SAX is an event based parser
• SAX is a Simple API for XML
• doesn’t preserve comments
• SAX generally runs a little faster than DOM
DOM
• Stores the entire XML document into memory before processing
• Occupies more memory
• We can insert or delete nodes
• Traverse in any direction.
• DOM is a tree model parser
• Document Object Model (DOM) API
• Preserves comments
• Generally runs a little slower than SAX
If we only need to find a node and do not need to insert or delete anything, we can go with SAX; otherwise use DOM, provided we have enough memory.
These are few parsers:-
woodstox
dom4j
In addition to SAX and DOM, there is StAX parsing, available through XMLStreamReader, which is an XML pull parser.
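The two scenarios from the question could be sketched with XMLStreamReader as below. The class name StaxScenarioDemo and both method names are assumptions. The sketch also assumes the fragment is wrapped in a single root element, and it simplifies the first scenario to "the first text element after a code element with the given code".

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; not part of the original answers.
class StaxScenarioDemo {
    // Scenario 2: extension of every <id> whose assigningAuthorityName matches.
    static List<String> extensionsFor(String xml, String authority) {
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "id".equals(reader.getLocalName())
                        && authority.equals(reader.getAttributeValue(null, "assigningAuthorityName"))) {
                    result.add(reader.getAttributeValue(null, "extension"));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    // Scenario 1 (simplified): text of the first <text> element that follows
    // a <code> element with the given code attribute; null if none is found.
    static String textAfterCode(String xml, String code) {
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            boolean codeSeen = false;
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                if ("code".equals(reader.getLocalName())
                        && code.equals(reader.getAttributeValue(null, "code"))) {
                    codeSeen = true;
                } else if (codeSeen && "text".equals(reader.getLocalName())) {
                    // getElementText() also collects CDATA content.
                    return reader.getElementText();
                }
            }
            return null;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<wrapper>"
            + "<id assigningAuthorityName=\"PRMORDCODE\" extension=\"16494\" />"
            + "<id assigningAuthorityName=\"TESTNUMINBOOK\" extension=\"16494\" />"
            + "<annotation><code code=\"DOSCATNAME\"></code>"
            + "<text><![CDATA[SureSwab, PCR]]></text></annotation>"
            + "</wrapper>";
        System.out.println(extensionsFor(xml, "PRMORDCODE"));
        System.out.println(textAfterCode(xml, "DOSCATNAME"));
    }
}
```

Because the reader only moves forward, nothing is held in memory beyond the current event, which is the same trade-off the SAX list describes.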
