How can I parse CDATA? - java

How can I find and iterate over all the nodes inside the CDATA section, where each node starts with (<) and ends with (>)?
Also, how should I iterate over all the child nodes and read their values, as in the child nodes below? I want to retrieve the values.
Input XML
<SOURCE TransactionId="1" ProviderName="ABCDD"><RESPONSE><![CDATA[<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><NetworkResponse xmlns="http://www.example.com/"><NetworkResult><Network offering_id="13" transaction_id="2" submission_id="3" timestamp="20140828 16010683 GMT" customer_id="NETTest">
<Network_List>
<Network_Info att0="Y" att1="N" att2="N" att3="Y" att4="Y">
<SIM_DATA>
<SIM><![CDATA[1100040101]]></SIM>
</SIM_DATA>
<NetworkResponseInfo k_status="C">
<KEY1>269</KEY1>
<PARENTNODE>
<CHILDNODE1>
<KEY2>XXXXXXX</KEY2>
<KEY3>YYYYYYY</KEY3>
</CHILDNODE1>
<CHILDNODE2>
<KEY4>N</KEY4>
<KEY5>I</KEY5>
</CHILDNODE2>
<CHILDNODE3>
<KEY6>1</KEY6>
<KEY7>3</KEY7>
</CHILDNODE3>
</PARENTNODE>
<KEY8><![CDATA[some image not visible]]></KEY8>
<KEY9>N</KEY9>
<KEY10>15</KEY10>
</NetworkResponseInfo>
</Network_Info>
</Network_List>
<response_message_list transaction_status_code="000" transaction_status_text="Successful"/>
</Network></NetworkResult></NetworkResponse></soap:Body></soap:Envelope>]]></RESPONSE></SOURCE>
Output XML
<ns3:NetworkResponse>
<Networks_OF_List>
<NetCharSeq>
<Nrep>
<type>Some Image</type>
<data> Data Coming from KEY8 CDATA section</data>
</Nrep>
<Nrep>
<type>ANYTHING</type>
<data>VALUE INSIDE SIM CDATA</data>
</Nrep>
<NetDetail>
<MYKEY1>Value present inside KEY4</MYKEY1>
<MYKEY2>Value present inside KEY5</MYKEY2>
</NetDetail>
<SystemID>Value of KEY2</SystemID>
<SystemPath>Value of KEY3</SystemPath>
</NetCharSeq>
</Networks_OF_List>
</ns3:NetworkResponse>

(Welcome to SO. Please note that some users downvoted you because you do not show what you have done so far. Have a look at the How To Ask section to learn how to ask questions that can actually be answered and are considered proper questions in the SO format.)
If you can use XSLT 3.0, you can consider using the new fn:parse-xml function, which will take a document-as-a-string.
However, your CDATA section itself contains escaped data, which means that after you apply fn:parse-xml, you will have to apply it once more to the text node that is the child of NetworkResult.
A better solution is often to fix this at the source and create an XML format that allows other XML in certain elements (you can allow this with a proper XSD). It will save you a lot of trouble, and at least your XML can then be pre-validated.
If you are stuck with XSLT 2.0 or 1.0, you can use disable-output-escaping (google it, there is a lot of info around on how to use it), but you will have to re-process your output once more because of the double-escape that is used. You may want to consider an XProc pipeline to ease the process.
You wrote: Also, how should I iterate over all the child nodes and get the values like in below child node
That is what XSLT is all about, please read this XSLT Tutorial, or any other tutorial you can find, it will be explained to you in the first minutes.
Update: as suggested by michael.hor257k in the comments, you can also parse the escaped data by hand using string manipulation functions. As he already says in the comments, this is laborious and error-prone, but sometimes, esp. if the XML is not really XML after unescaping, but something like XML, then this may be your only option.
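Since the question is tagged java, the same "parse it once more" idea can also be sketched with the JDK's DOM parser instead of XSLT. This is only a minimal sketch under assumptions: the class and method names (CdataReparse, parseEmbedded, firstText) are made up for illustration, and the input is a shortened version of the sample without the inner CDATA sections, because XML does not allow one CDATA section to be nested inside another.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class CdataReparse {

    // Parses a string into a namespace-aware DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Returns the text of the first element with the given local name
    // (in any namespace), or null if there is no such element.
    static String firstText(Document doc, String localName) {
        NodeList hits = doc.getElementsByTagNameNS("*", localName);
        return hits.getLength() == 0 ? null : hits.item(0).getTextContent();
    }

    // Parses the outer document, then parses the text held in RESPONSE
    // (the CDATA content) as a second, independent document.
    static Document parseEmbedded(String outerXml) throws Exception {
        Document outer = parse(outerXml);
        String embedded = firstText(outer, "RESPONSE");
        return parse(embedded);
    }

    public static void main(String[] args) throws Exception {
        // Shortened, assumed input modelled on the question's sample.
        String input = "<SOURCE TransactionId=\"1\"><RESPONSE><![CDATA["
            + "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
            + "<NetworkResponse xmlns=\"http://www.example.com/\">"
            + "<CHILDNODE2><KEY4>N</KEY4><KEY5>I</KEY5></CHILDNODE2>"
            + "</NetworkResponse>]]></RESPONSE></SOURCE>";
        Document inner = parseEmbedded(input);
        System.out.println("KEY4 = " + firstText(inner, "KEY4"));
        System.out.println("KEY5 = " + firstText(inner, "KEY5"));
    }
}
```

If the embedded document in turn carries escaped XML in one of its text nodes, the same parse step would have to be applied to that text as well.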

Related

Comparing xml objects java

I have an xml bundle file which I would like to read through and compare the objects within the bundle. The start position would be the mo tag until the next mo tag.
I have tried XMLUnit, but it compares two XML files. I would like to be able to compare the objects within a single XML bundle file.
Don't know if this makes sense. If more info is needed, I can try explain more.
Sample of the xml file:
<mo>FIELD</mo>
<pk1>DM_READEXTRACT</pk1>
<bo>F1-FieldPhysicalBO</bo>
<boData>
<field>DM_READEXTRACT</field>
<dataType>CHAR</dataType>
<isSigned>false</isSigned>
<isWorkField>false</isWorkField>
<version>9</version>
</boData>
<entities>
<processingSequence>560</processingSequence>
<sequence>560</sequence>
</entities>
<mo>FIELD</mo>
<pk1>DM_READEXTRACT</pk1>
<bo>F1-FieldPhysicalBO</bo>
<boData>
<field>DM_READEXTRACT</field>
<dataType>CHAR</dataType>
<isSigned>false</isSigned>
<isWorkField>false</isWorkField>
<version>2</version>
</boData>
<entities>
<processingSequence>30</processingSequence>
<sequence>3</sequence>
</entities>
Maybe try to unmarshal the XML to Java objects and then compare?
http://www.mkyong.com/java/jaxb-hello-world-example/
XMLUnit works on Nodes as well - at least 2.x does.
Looking at your example, what you want to compare is not a proper tree but a forest: there is no root element that all the others are children of.
What you can do is create a DocumentFragment for each forest you want to compare (on both the test and control sides), add all roots of your forest to it, and then tell XMLUnit to work on the DocumentFragments. You can obtain a DocumentFragment by first loading the DOM Document and then calling createDocumentFragment on it.
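A rough sketch of building one DocumentFragment per object, using only the JDK's DOM classes. The names BundleFragments and split, and the synthetic bundle wrapper element, are assumptions for illustration. It also assumes the bundle text has no XML declaration, otherwise wrapping it would not parse. To stay runnable without extra libraries it compares the fragments with the DOM's own isEqualNode as a stand-in; the fragments are the nodes one would hand to XMLUnit as described above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of XMLUnit or the JDK.
class BundleFragments {

    // Wraps the root-less bundle in a synthetic <bundle> element so that it
    // parses, then starts a new DocumentFragment at every <mo> element.
    static List<DocumentFragment> split(String forestXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader("<bundle>" + forestXml + "</bundle>")));
        List<DocumentFragment> fragments = new ArrayList<>();
        DocumentFragment current = null;
        for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace between top-level elements
            }
            if ("mo".equals(n.getNodeName()) || current == null) {
                current = doc.createDocumentFragment();
                fragments.add(current);
            }
            // Append a deep copy so the sibling iteration is not disturbed.
            current.appendChild(n.cloneNode(true));
        }
        return fragments;
    }

    public static void main(String[] args) throws Exception {
        // Shortened, assumed input modelled on the question's sample.
        String bundle =
            "<mo>FIELD</mo><pk1>DM_READEXTRACT</pk1><boData><version>9</version></boData>"
          + "<mo>FIELD</mo><pk1>DM_READEXTRACT</pk1><boData><version>2</version></boData>";
        List<DocumentFragment> objects = split(bundle);
        System.out.println("objects: " + objects.size());
        System.out.println("equal: " + objects.get(0).isEqualNode(objects.get(1)));
    }
}
```

Note that isEqualNode also compares whitespace text nodes inside nested elements, so two objects that differ only in indentation would be reported as different.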

xml to jaxb in xml cyclic references

How to convert following XML to java using jaxb
<work>
<subwork id="sub">
<ret="it">
</subwork>
<ret id="it">
<time>9</time>
</ret>
</work>
It is a bit tough since the ret tag is outside the subwork tag.
First, you need to start with valid XML. I've made assumptions in correcting the XML:
<work>
<subwork id="sub">
<ret id="it"/>
</subwork>
<ret id="it">
<time>9</time>
</ret>
</work>
Second (and there are other ways of doing this), you need to create a schema that describes this XML. Without doing it for you, I'll say that the trick is to define an element, ret, and then refer to that element within the work element and again within the subwork element.
Third, you then feed that schema file (.xsd) into a tool that generates the JAXB classes. Typically this is xjc (bundled with the JDK up to Java 10; JAXB and its tools were removed from the JDK in Java 11 and are now distributed separately).

DOM Parser Example for Objects within Objects

So say I have an XML file that looks like this:
<Object1s>
<Object1>
<Field1></Field1>
<Object2s>
<Object2>
<Field1a></Field1a>
<Field1b></Field1b>
</Object2>
<Object2>
<Field1a></Field1a>
<Field1b></Field1b>
</Object2>
</Object2s>
</Object1>
<Object1>
<Field1></Field1>
<Object2s>
<Object2>
<Field1a></Field1a>
<Field1b></Field1b>
</Object2>
</Object2s>
</Object1>
</Object1s>
The DOM tutorials I've found have not worked when I try and do the same sort of thing. For instance, I want to be able to separate the Object2s by the Object1 that they are in. When following the example given by DOM tutorials where this type of thing doesn't exist in their XML files, I get all the Object2s that are in any Object1 when I try to find them.
Can someone show me an example that handles something like this?
Okay, figured it out. I take the Element I get for each outer element and call .getElementsByTagName() on it, which returns only the elements within that element.
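A minimal sketch of that approach, using the element names from the question. The class and method names (NestedObjects, countObject2PerObject1) are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration.
class NestedObjects {

    // Returns, for each Object1 in document order, the number of Object2
    // elements nested inside that particular Object1.
    static List<Integer> countObject2PerObject1(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        List<Integer> counts = new ArrayList<>();
        NodeList object1s = doc.getElementsByTagName("Object1");
        for (int i = 0; i < object1s.getLength(); i++) {
            Element object1 = (Element) object1s.item(i);
            // Search from this Object1 only, not from the whole document.
            NodeList object2s = object1.getElementsByTagName("Object2");
            counts.add(object2s.getLength());
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Object1s>"
            + "<Object1><Field1/><Object2s><Object2/><Object2/></Object2s></Object1>"
            + "<Object1><Field1/><Object2s><Object2/></Object2s></Object1>"
            + "</Object1s>";
        System.out.println(countObject2PerObject1(xml)); // prints [2, 1]
    }
}
```

Calling getElementsByTagName on the Document searches the whole tree, which is why the tutorials' approach returned every Object2; calling it on an Element restricts the search to that element's descendants.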

Retrieve value of attribute using XPath

I am trying to retrieve the value of an attribute from an XML file using XPath and I am not sure where I am going wrong.
This is the XML File
<soapenv:Envelope>
<soapenv:Header>
<common:TestInfo testID="PI1" />
</soapenv:Header>
</soapenv:Envelope>
And this is the code I am using to get the value. Both of these return nothing..
XPathBuilder getTestID = new XPathBuilder("local-name(/*[local-name(.)='Envelope']/*[local-name(.)='Header']/*[local-name(.)='TestInfo'])");
XPathBuilder getTestID2 = new XPathBuilder("Envelope/Header/TestInfo/@testID");
Object doc2 = getTestID.evaluate(context, sourceXML);
Object doc3 = getTestID2.evaluate(context, sourceXML);
How can I retrieve the value of testID?
However you're iterating within the java, your context node is probably not what you think, so remove the "." specifier in your local-name(.) like so:
/*[local-name()='Header']/*[local-name()='TestInfo']/@testID worked fine for me with your XML, although as akaIDIOT says, there isn't an <Envelope> tag to be seen.
The XML file you provided does not contain an <Envelope> element, so an expression that requires it will never match.
Post-edit edit
As can be seen from your XML snippet, the document uses specific namespaces for the elements you're trying to match. An XPath engine is namespace-aware, meaning you'll have to ask it for exactly what you need. Also keep in mind that a namespace is defined by its URI, not by its prefix (so /namespace:element doesn't do much unless you tell the XPath engine which URI the prefix namespace refers to).
Your first XPath has an extra local-name() wrapped around the whole thing:
local-name(/*[local-name(.)='Envelope']/*[local-name(.)='Header']
/*[local-name(.)='TestInfo'])
The result of this XPath will either be the string value "TestInfo" if the TestInfo node is found, or a blank string if it is not.
If your XML is structured like you say it is, then this should work:
/*[local-name()='Envelope']/*[local-name()='Header']/*[local-name()='TestInfo']/@testID
But preferably, you should be working with namespaces properly instead of (ab)using local-name(). I have a post here that shows how to do this in Java.
If you don't care about the namespaces and use an XPath 2.0 compatible engine, use the * wildcard in place of the prefix.
//*:Header/*:TestInfo/@testID
will return the desired result.
It will probably be more elegant to register the needed namespaces (not covered here, depends on your XPath engine) and query using these:
//soapenv:Header/common:TestInfo/@testID
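Registering namespaces depends on the engine. As one possible sketch, this is how it could look with the JDK's own javax.xml.xpath API (XPath 1.0) rather than the XPathBuilder from the question. The class name TestIdLookup is made up, and both namespace URIs are assumptions, because the question's snippet does not show its xmlns declarations: the soapenv URI is the usual SOAP 1.1 envelope namespace and the common URI is purely hypothetical.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration.
class TestIdLookup {

    // Assumed prefix-to-URI bindings; replace with the URIs your document declares.
    static final Map<String, String> PREFIXES = Map.of(
        "soapenv", "http://schemas.xmlsoap.org/soap/envelope/",
        "common", "http://example.com/common");

    // Evaluates an XPath 1.0 expression against the XML and returns its string value.
    static String evaluate(String xml, String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for prefixed XPath steps to match
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return PREFIXES.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
            }
            @Override
            public String getPrefix(String uri) {
                return null; // reverse lookup is not needed for evaluation
            }
            @Override
            public Iterator<String> getPrefixes(String uri) {
                return Collections.emptyIterator();
            }
        });
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<soapenv:Envelope"
            + " xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\""
            + " xmlns:common=\"http://example.com/common\">"
            + "<soapenv:Header><common:TestInfo testID=\"PI1\"/></soapenv:Header>"
            + "</soapenv:Envelope>";
        // Namespace-aware query, then the local-name() variant; both print PI1.
        System.out.println(evaluate(xml, "//soapenv:Header/common:TestInfo/@testID"));
        System.out.println(evaluate(xml,
            "/*[local-name()='Envelope']/*[local-name()='Header']/*[local-name()='TestInfo']/@testID"));
    }
}
```

The prefixes used in the expression only need to match the ones registered in the NamespaceContext; they do not have to be the same prefixes the document uses, as long as the URIs agree.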

Distributing the XML files

I am totally new to XML and its capabilities.
I have a file say xyz.xml.
It contains content like this:
<system-config>
<business-model>
<agent-category key="operator">
<singular-name>Operator</singular-name>
<plural-name>Operators</plural-name>
<attribute>agent-attribute.reference</attribute>
</agent-category>
Next I have
<agent-attribute id="agent-attribute.reference">
<name>Reference</name>
<description>A unique identifier for this agent, typically an MSISDN.</description>
<mandatory>true</mandatory>
<editable>false</editable>
<deletable>false</deletable>
<sensitive>false</sensitive>
<system-generated>false</system-generated>
<input-method xsi:type="AgentReferenceInputMethod"></input-method>
<storage-location xsi:type="AgentRefStorage" field="reference"></storage-location>
</agent-attribute>
</business-model>
Now I want to move the agent-attribute elements to a different file named agentAttr.xml.
Is it possible to do so (mind it <agent-attribute> is under <system-config><business-model>), if so how?
So you want to extract the agent-attribute portions? You can do that with a simple XSLT transformation (using e.g. Xalan). Other options are jsoup, parsing with DOM, or extracting them manually.
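As a sketch of the DOM option, the following copies every agent-attribute element into a new document and serializes it with the JDK's identity Transformer. The class name AgentAttributeExtractor and the agent-attributes root element are assumptions for illustration, and the input is a shortened, well-formed version of the question's sample.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration.
class AgentAttributeExtractor {

    // Copies every <agent-attribute> element into a new document under an
    // assumed <agent-attributes> root and returns that document as a string.
    static String extract(String configXml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document source = builder.parse(new InputSource(new StringReader(configXml)));
        Document target = builder.newDocument();
        Element root = target.createElement("agent-attributes");
        target.appendChild(root);

        NodeList attributes = source.getElementsByTagName("agent-attribute");
        for (int i = 0; i < attributes.getLength(); i++) {
            // importNode makes a deep copy owned by the target document;
            // the source document is left unchanged.
            root.appendChild(target.importNode(attributes.item(i), true));
        }

        // A Transformer without a stylesheet performs an identity transformation.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(target), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String config = "<system-config><business-model>"
            + "<agent-category key=\"operator\"><singular-name>Operator</singular-name></agent-category>"
            + "<agent-attribute id=\"agent-attribute.reference\"><name>Reference</name></agent-attribute>"
            + "</business-model></system-config>";
        System.out.println(extract(config));
    }
}
```

To write agentAttr.xml directly, the StringWriter could be replaced with new StreamResult(new java.io.File("agentAttr.xml")). If the elements should also disappear from xyz.xml, they would additionally have to be removed from the source document and that document saved again.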
