JAVA XML find node without knowing parent

JAVA & XML
I have an xml document like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<elezione>
<codice>6</codice>
<descrizione>EUROPEE</descrizione>
<data>25 MAGGIO 2014</data>
<enti-partecipanti>
<italia>
<circ-europea>
<codice>2</codice>
<nome>II : ITALIA NORD-ORIENTALE</nome>
<regione> ..... </regione>
<regione> ..... </regione>
<regione>
<codice>4</codice>
<nome>TRENTINO-ALTO ADIGE</nome>
<provincia>
<codice>14</codice>
<nome>BOLZANO</nome>
.. Whole load of sub nodes and stuff
</provincia>
<provincia>
<codice>14</codice>
<nome>BOLZANO</nome>
.. Whole load of sub nodes and stuff
</provincia>
..
..
</regione>
<regione> ... </regione>
</circ-europea>
</italia>
</enti-partecipanti>
</elezione>
I need to start examining from the <regione> node with "codice" = 14
Unfortunately the structure ABOVE the list of "<regione>" nodes changes continuously (the supplier of the xml is pretty CRAZY), but below that node, things are pretty standard.
Currently I'm using classic "DocumentBuilder ... " code.
The main problem is that I start my search for regione INSIDE the <elezione> node, not from the document itself, and I don't know how to use XPath starting from a node instead of a document!

Use an XPath expression starting with "//".
"//regione" will match all the "regione" elements in the document, however deeply they are nested.

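On the "starting from a node" part of the question: with the JDK's javax.xml.xpath API, XPath.evaluate accepts any Node as the context item, not only a Document. Evaluated against a node, "//regione" still searches the whole document, while ".//regione" only looks at descendants of that node. The following is a minimal sketch under some assumptions: the XML is a trimmed-down, made-up stand-in for the question's document, the class and method names are hypothetical, and the predicate uses codice 4 because that is the regione code in the sample (14 is a provincia code there).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class RegioneFinder {

    // Trimmed-down stand-in for the question's document (hypothetical sample data).
    static final String SAMPLE =
        "<elezione><enti-partecipanti><italia><circ-europea>"
      + "<codice>2</codice>"
      + "<regione><codice>3</codice><nome>SAMPLE REGION</nome></regione>"
      + "<regione><codice>4</codice><nome>TRENTINO-ALTO ADIGE</nome>"
      + "<provincia><codice>14</codice><nome>BOLZANO</nome></provincia>"
      + "</regione>"
      + "</circ-europea></italia></enti-partecipanti></elezione>";

    // Returns the <nome> of the <regione> whose direct <codice> child matches,
    // searching below the root element whatever the intermediate nesting is.
    // Returns null when no such <regione> exists.
    static String regionName(String xml, String codice) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Element start = doc.getDocumentElement(); // the <elezione> node

            XPath xpath = XPathFactory.newInstance().newXPath();
            // The leading "." makes the search relative to the context node.
            // Building the expression by concatenation is only acceptable
            // for trusted values of codice.
            Node regione = (Node) xpath.evaluate(
                ".//regione[codice='" + codice + "']", start, XPathConstants.NODE);
            if (regione == null) {
                return null;
            }
            // Continue with relative paths from the node that was found.
            return xpath.evaluate("nome", regione);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(regionName(SAMPLE, "4"));
    }
}
```

Once the regione node has been found, everything below it can be read with short relative expressions, so the changing structure above it no longer matters.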
Related

Comparing xml objects java

I have an XML bundle file that I would like to read through, comparing the objects within the bundle. Each object starts at an mo tag and runs until the next mo tag.
I have tried XMLUnit, but it compares two XML files. I would like to be able to compare the objects within a single XML bundle file.
I don't know if this makes sense. If more info is needed, I can try to explain more.
Sample of the xml file:
<mo>FIELD</mo>
<pk1>DM_READEXTRACT</pk1>
<bo>F1-FieldPhysicalBO</bo>
<boData>
<field>DM_READEXTRACT</field>
<dataType>CHAR</dataType>
<isSigned>false</isSigned>
<isWorkField>false</isWorkField>
<version>9</version>
</boData>
<entities>
<processingSequence>560</processingSequence>
<sequence>560</sequence>
</entities>
<mo>FIELD</mo>
<pk1>DM_READEXTRACT</pk1>
<bo>F1-FieldPhysicalBO</bo>
<boData>
<field>DM_READEXTRACT</field>
<dataType>CHAR</dataType>
<isSigned>false</isSigned>
<isWorkField>false</isWorkField>
<version>2</version>
</boData>
<entities>
<processingSequence>30</processingSequence>
<sequence>3</sequence>
</entities>

Maybe try to unmarshal the XML into Java objects and then compare them?
http://www.mkyong.com/java/jaxb-hello-world-example/

XMLUnit works on Nodes as well - at least 2.x does.
Judging by your example, what you want to compare is not a proper tree but a forest - there is no root element that all the others are children of.
What you can do is create a DocumentFragment for each forest you want to compare (on both the test and control sides), add all roots of your forest to it, and then tell XMLUnit to work on the DocumentFragments. You can obtain a DocumentFragment by first loading the DOM Document and then calling createDocumentFragment on it.
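The grouping step could look roughly like the sketch below, which uses only the JDK's DOM classes. Several things here are assumptions rather than part of the original answer: the root-less bundle is wrapped in a synthetic <bundle> element so that it parses, the class and method names are made up, and Node.isEqualNode is used as a simple yes/no stand-in for the XMLUnit comparison (XMLUnit would report the actual differences).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class MoFragments {

    // Splits the bundle into one DocumentFragment per run of elements that
    // starts at an <mo> element and ends before the next <mo> element.
    static List<DocumentFragment> split(String bundle) {
        try {
            // Synthetic root (an assumption): the bundle itself has no root element.
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader("<bundle>" + bundle + "</bundle>")));
            List<DocumentFragment> groups = new ArrayList<>();
            DocumentFragment current = null;
            Node child = doc.getDocumentElement().getFirstChild();
            while (child != null) {
                Node next = child.getNextSibling(); // remember it before moving the node
                // Whitespace text nodes between the top-level elements are skipped.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    if ("mo".equals(child.getNodeName())) {
                        current = doc.createDocumentFragment();
                        groups.add(current);
                    }
                    if (current != null) {
                        current.appendChild(child); // moves the node into the fragment
                    }
                }
                child = next;
            }
            return groups;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse bundle", e);
        }
    }

    // True when the first two groups are structurally identical.
    static boolean firstTwoEqual(String bundle) {
        List<DocumentFragment> groups = split(bundle);
        return groups.size() >= 2 && groups.get(0).isEqualNode(groups.get(1));
    }
}
```

Each fragment then holds one mo object, and any pair of fragments can be handed to a comparison of your choice.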

DOM Parser wrong childNodes Count

This is strange, but let me try my best to put it across.
I have an XML file which I read from my desktop in the normal way and parse with a DOM parser.
<?xml version="1.0" encoding="UTF-8"?>
<Abase
xmlns="www.abc.com/Events/Abase.xsd">
<FVer>0</FVer>
<DV>abc App</DV>
<DP>abc Wallet</DP>
<Dversion>11</Dversion>
<sigID>Ss22</sigID>
<activity>Adding New cake</activity>
</Abase>
Reading the XML to get the children:
Document doc = docBuilder.parse("C://Users//Desktop//abc.xml");
Node root = doc.getElementsByTagName("Abase").item(0);
NodeList listOfNodes = root.getChildNodes(); //Sysout Prints 13
So here my logic works well. But when I push the same XML to a queue, read it back and get the child nodes, the number of child nodes is 6.
Document doc=docBuilder.parse(new InputSource(new ByteArrayInputStream(msg.getBytes("UTF-8"))));
Node root = doc.getElementsByTagName("Abase").item(0);
NodeList listOfNodes = root.getChildNodes(); //Sysout Prints 6
This breaks my logic for parsing the XML. Can anyone help me out?
UPDATE
Adding sending logic :
javax.jms.TextMessage tmsg = session.createTextMessage();
tmsg.setText(inp);
sender.send(tmsg);
PROBLEM
If I read this XML from the desktop, it reports 13 children: 6 element nodes and 7 text nodes. The common logic is:
Read all the children and iterate through the list of child items.
If a node is NOT a text node, enter the if block, create one parent element with two children and append it to the existing root. Then take the node name and the text content of the element node and set them as the text content of the two children respectively.
So I now have a fresh element node with two children. As I no longer need the original element nodes, which are still children of the root, I remove them at the end.
This logic breaks completely if I push the XML to a queue and read it back before applying the same steps.
The output XML below comes out fine when I read from the desktop, but reading from the queue is a problem, because it messes up the complete tree.
<Abase
xmlns="www.abc.com/Events/Abase.xsd">
<Prop>
<propName>FVer</propName>
<propName>0</propName> //similarly for other nodes
</Prop>
</Abase>
Thanks

Well, there are 13 children if whitespace text nodes are included, but only 6 if whitespace text nodes are dropped. So there's some difference in the way the tree has been built between the two cases, that affects whether whitespace text nodes are retained or not.
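One way to make the logic independent of that difference is to count (and process) only the element children. A minimal sketch with the JDK's DOM parser follows; the class and method names are hypothetical, and the namespace declaration from the question is left out for brevity.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildCounter {

    // Returns { number of all child nodes, number of element child nodes }
    // for the document element of the given XML text.
    static int[] counts(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            int elements = 0;
            for (int i = 0; i < children.getLength(); i++) {
                // Whitespace between tags shows up as TEXT_NODE children.
                if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                    elements++;
                }
            }
            return new int[] { children.getLength(), elements };
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        int[] pretty = counts("<Abase>\n<FVer>0</FVer>\n<DV>abc App</DV>\n</Abase>");
        int[] compact = counts("<Abase><FVer>0</FVer><DV>abc App</DV></Abase>");
        System.out.println(pretty[0] + " nodes, " + pretty[1] + " elements");
        System.out.println(compact[0] + " nodes, " + compact[1] + " elements");
    }
}
```

The total differs between the indented and the compact form, but the element count is the same in both, so a loop that checks getNodeType() behaves identically for the file and for the queue message.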

The document under "Output XML" means that there is something wrong on the sender side. My guess would be that inp isn't a String but some kind of object, and that setText(inp) doesn't call inp.toString() but instead triggers some kind of serialization code which produces the odd XML that you're seeing.

How can I parse CDATA?

How can I find and iterate through all the nodes present under CDATA, where those nodes are opened by (<) and closed by (>)?
Also, how should I iterate over all the child nodes and get the values, as in the child node below? I want to retrieve the value.
Input XML
<SOURCE TransactionId="1" ProviderName="ABCDD"><RESPONSE><![CDATA[<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><NetworkResponse xmlns="http://www.example.com/"><NetworkResult><Network offering_id="13" transaction_id="2" submission_id="3" timestamp="20140828 16010683 GMT" customer_id="NETTest">
<Network_List>
<Network_Info att0="Y" att1="N" att2="N" att3="Y" att4="Y">
<SIM_DATA>
<SIM><![CDATA[1100040101]]></SIM>
</SIM_DATA>
<NetworkResponseInfo k_status="C">
<KEY1>269</KEY1>
<PARENTNODE>
<CHILDNODE1>
<KEY2>XXXXXXX</KEY2>
<KEY3>YYYYYYY</KEY3>
</CHILDNODE1>
<CHILDNODE2>
<KEY4>N</KEY4>
<KEY5>I</KEY5>
</CHILDNODE2>
<CHILDNODE3>
<KEY6>1</KEY6>
<KEY7>3</KEY7>
</CHILDNODE3>
</PARENTNODE>
<KEY8><![CDATA[some image not visible]]></KEY8>
<KEY9>N</KEY9>
<KEY10>15</KEY10>
</NetworkResponseInfo>
</Network_Info>
</Network_List>
<response_message_list transaction_status_code="000" transaction_status_text="Successful"/>
</Network></NetworkResult></NetworkResponse></soap:Body></soap:Envelope>]]></RESPONSE></SOURCE>
Output XML
<ns3:NetworkResponse>
<Networks_OF_List>
<NetCharSeq>
<Nrep>
<type>Some Image</type>
<data> Data Coming from KEY8 CDATA section</data>
</Nrep>
<Nrep>
<type>ANYTHING</type>
<data>VALUE INSIDE SIM CDATA</data>
</Nrep>
<NetDetail>
<MYKEY1>Value present inside KEY4</MYKEY1>
<MYKEY2>Value present inside KEY5</MYKEY2>
</NetDetail>
<SystemID>Value of KEY2</SystemID>
<SystemPath>Value of KEY3</SystemPath>
</NetCharSeq>
</Networks_OF_List>
</ns3:NetworkResponse>

(Welcome to SO. Please note that some users have downvoted you because you do not show what you have done so far. Have a look at the How To Ask section to learn how to ask questions that can actually be answered and are considered proper questions in the SO format.)
If you can use XSLT 3.0, you can consider using the new fn:parse-xml function, which takes a document-as-a-string.
However, your CDATA section itself contains escaped data, which means that, after you apply fn:parse-xml, you will have to do it once again for the text node that is the child of NetworkResult.
A better solution is often to fix this at the source and create an XML format that allows other XML in certain elements (you can allow this with a proper XSD). It will save you a lot of trouble, and at least your XML can then be pre-validated.
If you are stuck with XSLT 2.0 or 1.0, you can use disable-output-escaping (google it, there is a lot of info around on how to use it), but you will have to re-process your output once more because of the double-escape that is used. You may want to consider an XProc pipeline to ease the process.
You wrote: Also, how should I iterate over all the child nodes and get the values like in below child node
That is what XSLT is all about, please read this XSLT Tutorial, or any other tutorial you can find, it will be explained to you in the first minutes.
Update: as suggested by michael.hor257k in the comments, you can also parse the escaped data by hand using string manipulation functions. As he already says in the comments, this is laborious and error-prone, but sometimes, esp. if the XML is not really XML after unescaping, but something like XML, then this may be your only option.
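Outside XSLT, the same two-step idea (read the CDATA text, then parse that text as a document of its own) can be sketched in Java with the JDK's DOM parser. This is only an illustration under assumptions: the input is a heavily simplified, made-up version of the question's XML, the class and method names are hypothetical, and it only works when the embedded text is well-formed XML. If the embedded document contains further escaped markup, the second step has to be repeated on that text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EmbeddedXml {

    // Simplified stand-in for the question's input (hypothetical sample data).
    static final String SAMPLE =
        "<SOURCE TransactionId=\"1\"><RESPONSE><![CDATA["
      + "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
      + "<NetworkResponseInfo k_status=\"C\">"
      + "<KEY1>269</KEY1><KEY9>N</KEY9>"
      + "</NetworkResponseInfo>"
      + "]]></RESPONSE></SOURCE>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Step 1: parse the outer document and read the text inside <RESPONSE>.
    // Step 2: parse that text again and look up the requested element.
    // Returns null when the embedded document has no such element.
    static String embeddedValue(String outerXml, String tagName) {
        Document outer = parse(outerXml);
        String embedded = outer.getElementsByTagName("RESPONSE").item(0).getTextContent();
        Document inner = parse(embedded);
        NodeList hits = inner.getElementsByTagName(tagName);
        return hits.getLength() == 0 ? null : hits.item(0).getTextContent();
    }

    public static void main(String[] args) {
        System.out.println(embeddedValue(SAMPLE, "KEY1"));
    }
}
```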

How to use XPath to get attributes of BPMN nodes in java?

I have tried to use XPath with XML files and it works fine. Now I want to use it with BPMN files.
My BPMN file looks something like this:
<bpmn2:startEvent id="StartEvent_1" name="StartProcess">
<bpmn2:outgoing>SequenceFlow_1</bpmn2:outgoing>
</bpmn2:startEvent>
I try to get the value of the id attribute of the bpmn2:startEvent node using this line of code:
startEventID = xml.getParameterString("(//bpmn2:startEvent/@id)");
System.out.println(startEventID);
But it prints a blank line instead of the id: StartEvent_1
Any suggestions, please?

You can use this expression: "//*[local-name()='startEvent']/@id".
Note that this may be tricky if you have the same tag names in different namespaces.
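Both variants can be tried with the standard javax.xml.xpath API, as in the sketch below. It does not use the question's xml.getParameterString helper, whose implementation is not shown. The assumptions are labeled in the code: the bpmn2 prefix is taken to be bound to the BPMN 2.0 model namespace, the surrounding definitions and process elements are added so that the fragment parses, and the class and method names are made up. Registering a NamespaceContext lets the prefixed expression work and avoids the ambiguity of local-name().

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class StartEventId {

    // Assumption: the file binds the bpmn2 prefix to the BPMN 2.0 model namespace.
    static final String BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL";

    // The question's fragment, wrapped in assumed parent elements.
    static final String SAMPLE =
        "<bpmn2:definitions xmlns:bpmn2=\"" + BPMN_NS + "\">"
      + "<bpmn2:process id=\"Process_1\">"
      + "<bpmn2:startEvent id=\"StartEvent_1\" name=\"StartProcess\">"
      + "<bpmn2:outgoing>SequenceFlow_1</bpmn2:outgoing>"
      + "</bpmn2:startEvent></bpmn2:process></bpmn2:definitions>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for namespace-based matching
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Variant 1: ignore the namespace and match on the local name only.
    static String byLocalName(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate("//*[local-name()='startEvent']/@id", parse(xml));
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    // Variant 2: tell XPath what the bpmn2 prefix means.
    static String byPrefix(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "bpmn2".equals(prefix) ? BPMN_NS : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String uri) {
                    return BPMN_NS.equals(uri) ? "bpmn2" : null;
                }
                @Override public Iterator<String> getPrefixes(String uri) {
                    return BPMN_NS.equals(uri)
                        ? Collections.singletonList("bpmn2").iterator()
                        : Collections.<String>emptyIterator();
                }
            });
            return xpath.evaluate("//bpmn2:startEvent/@id", parse(xml));
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(byLocalName(SAMPLE));
        System.out.println(byPrefix(SAMPLE));
    }
}
```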

Reducing code redundancy while creating XML with XOM

I am using XOM as my XML parsing library, and I am also using it to create XML. The scenario is described below with an example.
Scenario:
Code:
Element root = new Element("atom:entry", "http://www.w3c.org/Atom");
Element city = new Element("info:city", "http://www.myinfo.com/Info");
city.appendChild("My City");
root.appendChild(city);
Document d = new Document(root);
System.out.println(d.toXML());
Generated XML:
<?xml version="1.0"?>
<atom:entry xmlns:atom="http://www.w3c.org/Atom">
<info:city xmlns:info="http://www.myinfo.com/Info">
My City
</info:city>
</atom:entry>
Notice that in the XML the info namespace is declared on the node itself. But I need it to be declared on the root element, like below:
<?xml version="1.0"?>
<atom:entry xmlns:atom="http://www.w3c.org/Atom" xmlns:info="http://www.myinfo.com/Info">
<info:city>
My City
</info:city>
</atom:entry>
To do that, I just need the following piece of code:
Element root = new Element("atom:entry", "http://www.w3c.org/Atom");
=> root.addNamespaceDeclaration("info", "http://www.myinfo.com/Info");
Element city = new Element("info:city", "http://www.myinfo.com/Info");
... ... ...
The problem is that I had to write http://www.myinfo.com/Info twice. In my case there are hundreds of namespaces, so there will be too much redundancy. Is there any way to get rid of this redundancy?

No, there is no way to get rid of this redundancy and that's a deliberate decision. In XOM the namespace is a fundamental part of the element itself, not a function of its position in the document.
Of course you could always declare a named constant for the namespace URI.
