Find difference between xml file contents

I am comparing my XML files using the sample code from acdcjunior's answer to "Best way to compare 2 XML documents in Java" (so this may be a possible duplicate).
The assert in the test fails with the error below.
Expected presence of doctype declaration 'null' but was 'not null' - comparing at to <!DOCTYPE plist PSECTOR " ..........
Can someone please tell me what I can do to fix this?

Okay, I found the solution here - http://xmlunit.sourceforge.net/userguide/XMLUnit-Java.pdf
For efficiency reasons a Diff stops the comparison process as soon as the first difference is found. To get all the differences
between two pieces of XML an instance of the DetailedDiff class, a subclass of Diff, is required. Note that a DetailedDiff
is constructed using an existing Diff instance.
For future readers, here is the solution (also in the link - Pg 9) -
DifferenceListener myDifferenceListener = new IgnoreTextAndAttributeValuesDifferenceListener();
Diff myDiff = new Diff(expectedXML, actualXML);
myDiff.overrideDifferenceListener(myDifferenceListener);
Assert.assertTrue("test XML matches control skeleton XML", myDiff.similar());
From the link again,
The DifferenceEngine class generates the events that are passed to a DifferenceListener implementation as two
pieces of XML are compared. Using recursion it navigates through the nodes in the control XML DOM, and determines which
node in the test XML DOM qualifies for comparison to the current control node. The qualifying test node will match the control
node’s node type, as well as the node name and namespace (if defined for the control node).
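For readers who only need to get past the DOCTYPE mismatch, here is a minimal JDK-only sketch (not XMLUnit, and not taken from the user guide). It relies on the fact that the DOCTYPE is a child of the Document rather than of the root element, so comparing the two root elements with DOM's isEqualNode never looks at it. The class and method names are made up for illustration. Note that isEqualNode is a strict comparison: unlike the listener-based approach above, whitespace, text and attribute values all have to match.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, JDK only: compares two XML strings while ignoring any DOCTYPE.
final class DoctypeInsensitiveCompare {

    /** Parses XML text into a DOM Document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /**
     * Compares only the root element subtrees. The DOCTYPE node is a sibling of
     * the root element, so it plays no part in this comparison.
     */
    static boolean sameContentIgnoringDoctype(String expectedXml, String actualXml) {
        Document expected = parse(expectedXml);
        Document actual = parse(actualXml);
        return expected.getDocumentElement().isEqualNode(actual.getDocumentElement());
    }

    public static void main(String[] args) {
        String withDoctype = "<!DOCTYPE plist><plist><key>a</key></plist>";
        String withoutDoctype = "<plist><key>a</key></plist>";
        System.out.println(sameContentIgnoringDoctype(withDoctype, withoutDoctype));
    }
}
```

If the real files reference an external DTD, the parser may try to fetch it, which this sketch does not handle.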

Related

Is there any way to get the AST (Abstract Syntax Tree) of a block of code in Java rather than of the entire class?

I tried using the javalang module available in Python to get the AST of Java source code, but it requires an entire class to generate the AST. Passing a block of code like an 'if' statement throws an error. Is there any other way of doing it?
PS: I am preferably looking for a Python module to do the task.
Thanks

Javalang can parse snippets of Java code:
>>> tokens = javalang.tokenizer.tokenize('System.out.println("Hello " + "world");')
>>> parser = javalang.parser.Parser(tokens)
>>> parser.parse_expression()
MethodInvocation

OP is interested in a non-Python answer.
Our DMS Software Reengineering Toolkit with its Java Front End can accomplish this.
DMS is a general-purpose tool for parsing/analyzing/transforming code, parameterized by language definitions (including grammars). Given a language definition, DMS can easily be invoked on a source file/stream representing the goal symbol for a grammar by calling the Parse method offered by the language parameter, and DMS will build a tree for the parsed string. Special support is provided for parsing source files/streams for arbitrary nonterminals as defined by the language grammar; DMS will build an AST whose root is that nonterminal, parsing the source according to the subgrammar defined by that nonterminal.
Once you have the AST, DMS provides lots of support for visiting the AST, inspecting/modifying nodes, and carrying out source-to-source transformations on the AST using surface-syntax rewrite rules. Finally, you can prettyprint the modified AST and get back valid source code. (If you have only parsed a code fragment for a nonterminal, what you get back is valid code for that nonterminal.)
If OP is willing to compare complete files instead of snippets, our Smart Differencer might be useful out of the box. SmartDifferencer builds ASTs of its two input files, finds the smallest set of conceptual edits (insert, delete, move, copy, rename) over structured code elements that explains the differences, and reports that difference.

Finding an XML node by its attribute value and updating in Java

Let's say I have the following XML document:
<Offices>
<Office name="P">
<Counter>1000</Counter>
</Office>
<Office name="K">
<Counter>1006</Counter>
</Office>
</Offices>
With that document I need to perform the following in Java:
1. Parse the XML.
2. Get the value of Counter given a certain value for a name attribute.
3. Update the XML with a new value for Counter for exactly this Office.
For step 2 I have considered using XPath, but editing/updating the XML does not seem that easy this way.
How could I go through the XML finding a certain office name and update its counter? The XML itself won't be large, only something like 20 office entries max.

You can try looking at this answer:
https://stackoverflow.com/a/5059411/1571550
It seems like a pretty straightforward and generic solution.
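For what it's worth, the three steps from the question can be sketched with nothing but the JDK's DOM and XPath APIs. This is only one possible approach, not necessarily what the linked answer does; the class and method names (OfficeCounters, findCounter and so on) are invented for this example. The XPath result is a live DOM node, so updating it is just a setTextContent call.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
final class OfficeCounters {

    /** Step 1: parse the XML text into a DOM tree. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Returns the Counter element of the Office with the given name, or null if there is none. */
    static Node findCounter(Document doc, String officeName) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind the name as an XPath variable so quotes in it cannot break the expression.
            xpath.setXPathVariableResolver(variable -> officeName);
            return (Node) xpath.evaluate(
                    "/Offices/Office[@name = $name]/Counter", doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    /** Step 2: read the counter value, or null if the office does not exist. */
    static String getCounter(Document doc, String officeName) {
        Node counter = findCounter(doc, officeName);
        return counter == null ? null : counter.getTextContent();
    }

    /** Step 3: update the counter in the DOM; returns false if the office does not exist. */
    static boolean setCounter(Document doc, String officeName, String newValue) {
        Node counter = findCounter(doc, officeName);
        if (counter == null) {
            return false;
        }
        counter.setTextContent(newValue);
        return true;
    }

    /** Serializes the (possibly modified) DOM back to XML text. */
    static String toXml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<Offices><Office name=\"P\"><Counter>1000</Counter></Office>"
                + "<Office name=\"K\"><Counter>1006</Counter></Office></Offices>");
        System.out.println(getCounter(doc, "K"));
        setCounter(doc, "K", "1007");
        System.out.println(toXml(doc));
    }
}
```

With only about 20 office entries, holding the whole document in a DOM like this should be perfectly adequate.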

How to conditionally select a node in XPath

The XSD schema I am working with calls for either an international or a domestic address:
"/mns:PhysicalAddress/mns:DomesticAddress/mns:City"
or
"/mns:PhysicalAddress/mns:InternationalAddress/mns:City"
It is being used as a parameter in a Java method as in XMLUtils.BuildField(Document doc, String xpath).
I know I can go straight to the Java object that created that doc and use the auto-generated beans to query elements, but I prefer remaining within the concise realm of XPath. Is this possible?
If so, how do I write an XPath expression that selects mns:City regardless of whether it is an international or a domestic address?
Note: This is in Java, not JavaScript, HTML or XSLT, so I don't think <xsl:if> is relevant here.

You could go with finding all Cities that have either parent:
//mns:City[(parent::mns:DomesticAddress|parent::mns:InternationalAddress)]
If you need to also ensure that the address is in the physical address:
//mns:City[(parent::mns:DomesticAddress|parent::mns:InternationalAddress)[parent::mns:PhysicalAddress]]
Alternatively, instead of reversing the hierarchy, you can match any child with * and check its name:
/mns:PhysicalAddress/*[name()="mns:DomesticAddress" or name()="mns:InternationalAddress"]/mns:City

Depending on the precise structure of your XML,
/mns:PhysicalAddress/*/mns:City
may be enough. If that pulls in too much, then the clearest option is probably just to use the two alternatives you already have, separated by a |:
/mns:PhysicalAddress/mns:DomesticAddress/mns:City | /mns:PhysicalAddress/mns:InternationalAddress/mns:City
Or slightly more concise but (in my opinion) less clear:
/mns:PhysicalAddress/*[self::mns:DomesticAddress | self::mns:InternationalAddress]/mns:City
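Since the question is about calling this from Java, here is a rough sketch of evaluating the union expression with the JDK's javax.xml.xpath API. The namespace URI "urn:example:mns" and the class name CityLookup are placeholders I made up; substitute the URI your schema really binds to mns. The two things that tend to trip people up are that the parser must be namespace-aware and that the XPath object needs a NamespaceContext for the mns prefix.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; class name and namespace URI are assumptions for illustration.
final class CityLookup {

    // Placeholder URI: use the one your schema actually declares for "mns".
    static final String MNS_URI = "urn:example:mns";

    // The union (|) selects City under whichever address type is present.
    static final String CITY_XPATH =
            "/mns:PhysicalAddress/mns:DomesticAddress/mns:City"
            + " | /mns:PhysicalAddress/mns:InternationalAddress/mns:City";

    /** Returns the text of the City element, or an empty string if neither path matches. */
    static String findCity(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for prefixed XPath steps to match
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Maps the "mns" prefix used in the expression to the namespace URI.
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return "mns".equals(prefix) ? MNS_URI : XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String uri) {
                    return MNS_URI.equals(uri) ? "mns" : null;
                }

                @Override
                public Iterator<String> getPrefixes(String uri) {
                    return MNS_URI.equals(uri)
                            ? Collections.singletonList("mns").iterator()
                            : Collections.<String>emptyIterator();
                }
            });
            return xpath.evaluate(CITY_XPATH, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<mns:PhysicalAddress xmlns:mns=\"" + MNS_URI + "\">"
                + "<mns:InternationalAddress><mns:City>Oslo</mns:City></mns:InternationalAddress>"
                + "</mns:PhysicalAddress>";
        System.out.println(findCity(xml));
    }
}
```

The prefix in the expression only has to agree with the NamespaceContext, not with the prefix used in the document itself, which is one reason the self:: and union forms are safer than comparing name() strings.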

Unexpected difference found when comparing elements with empty elements in xmlUnit

Comparing these two snippets of XML:
testXml:
<ELEMENT1>
<CHILD1></CHILD1>
</ELEMENT1>
actualXml:
<ELEMENT1>
<CHILD1>notEmpty</CHILD1>
</ELEMENT1>
using:
Diff diff = new Diff(testXml, actualXml);
DetailedDiff detailedDiff = new DetailedDiff(diff);
Now detailedDiff.getAllDifferences(); will return a DifferenceConstants.HAS_CHILD_NODES_ID difference and if you print the difference to the console it looks like this:
Expected presence of child nodes to be 'false' but was 'true' - comparing <CHILD1...> at /ELEMENT1[1]/CHILD1[1] to <CHILD1...> at /ELEMENT1[1]/CHILD1[1]
My question is, why is the difference of type DifferenceConstants.HAS_CHILD_NODES_ID and not DifferenceConstants.TEXT_VALUE_ID? The structure of the two XML snippets is the same, but the text values differ.
So, why doesn't that trigger a difference?

Try to use this ElementQualifier:
Diff diff = new Diff(testXml, actualXml);
diff.overrideElementQualifier(new RecursiveElementNameAndTextQualifier() );
DetailedDiff detailedDiff = new DetailedDiff(diff);
Here is the description from the Javadoc:
public RecursiveElementNameAndTextQualifier()
Uses element names and the text nested an arbitrary level of child
elements deeper into the element to compare elements. Checks all
nodes, not just first child element.
Does not ignore empty text nodes.
The interesting thing here is the "does not ignore empty text nodes".
It seems that the default ElementQualifier treats an empty node as a missing node, and only checks for the first error related to a given node. So in your case, possibly only "HAS_CHILD_NODES_ID" is reported, without "TEXT_VALUE_ID" being included as well.
At least, RecursiveElementNameAndTextQualifier goes deeper.
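A possible way to see where the child-node difference comes from, independent of XMLUnit: in the DOM, an empty element such as <CHILD1></CHILD1> has no text node at all, while <CHILD1>notEmpty</CHILD1> has one text child. The sketch below uses only the JDK parser to show that; the class name is made up, and I am assuming (not asserting) that this is the property XMLUnit inspects before it ever gets to comparing text values.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, JDK only: inspects the DOM shape of CHILD1.
final class EmptyElementChildren {

    /** Returns whether the first CHILD1 element in the given XML has any child nodes. */
    static boolean child1HasChildNodes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node child1 = doc.getElementsByTagName("CHILD1").item(0);
            return child1.hasChildNodes();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Empty element: the parser creates no text node, so there are no children.
        System.out.println(child1HasChildNodes("<ELEMENT1><CHILD1></CHILD1></ELEMENT1>"));
        // Non-empty element: the text becomes a single text-node child.
        System.out.println(child1HasChildNodes("<ELEMENT1><CHILD1>notEmpty</CHILD1></ELEMENT1>"));
    }
}
```

So from the DOM's point of view the two snippets do differ structurally, which would be consistent with the "Expected presence of child nodes to be 'false' but was 'true'" message.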

XSD: Index of sequence in Element name

I'm building an XSD to generate JAXB objects in Java. Then I ran into this:
<TotalBugs>
<Bug1>...</Bug1>
<Bug2>...</Bug2>
...
<BugN>...</BugN>
</TotalBugs>
How do I build a sequence of elements where the index of the sequence is in the element name? Specifically, how do I get the 1 in Bug1?

You don't want to do it this way. XML has a top-down order by nature, so you don't have to number the elements yourself:
<totalBugs>
<bug><!-- Here comes 1st bug --></bug>
<bug><!-- Here comes 2nd bug --></bug>
...
<bug><!-- Here comes last bug --></bug>
</totalBugs>
You can access the 1st bug node in the list by the XPath expression:
/totalBugs/bug[1]
Note that, per the W3C standard, indexes start at 1. Please refer to w3schools for further reading.
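To illustrate the 1-based indexing from Java, here is a small sketch using the JDK's XPath API. The class and method names are hypothetical; the expression is the same /totalBugs/bug[n] form shown above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; shows positional access instead of numbered element names.
final class BugIndex {

    /** Returns the text of the bug at the given 1-based position, or "" if there is none. */
    static String bugAt(String xml, int position) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // XPath positions start at 1, so bug[1] is the first bug element.
            String expression = "/totalBugs/bug[" + position + "]";
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<totalBugs><bug>first</bug><bug>second</bug></totalBugs>";
        System.out.println(bugAt(xml, 1));
        System.out.println(bugAt(xml, 2));
    }
}
```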

I'm pretty sure XSD won't support what you need. However, you can use <xsd:any> for that bit of the schema, then use something lower-level than JAXB to generate the XML for that particular part. (I think your generated classes will have fields like protected List<Element> any; which you can fill in using DOM.)
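As a rough idea of the DOM part, the sketch below builds a list of elements named Bug1, Bug2, ... that could then be added to such a List<Element> field. It only uses the JDK; the class name NumberedBugElements and the method build are invented here, and how the list is attached to the JAXB-generated class depends on what your generated code actually looks like.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper; builds DOM elements whose names carry the sequence index.
final class NumberedBugElements {

    /** Creates one element per description, named Bug1, Bug2, ... in list order. */
    static List<Element> build(List<String> descriptions) {
        Document doc;
        try {
            // An empty document acts as the factory for the new elements.
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
        List<Element> elements = new ArrayList<>();
        for (int i = 0; i < descriptions.size(); i++) {
            Element bug = doc.createElement("Bug" + (i + 1)); // names are 1-based
            bug.setTextContent(descriptions.get(i));
            elements.add(bug);
        }
        return elements;
    }

    public static void main(String[] args) {
        List<String> descriptions = new ArrayList<>();
        descriptions.add("NullPointerException on save");
        descriptions.add("Typo in dialog title");
        for (Element bug : build(descriptions)) {
            System.out.println(bug.getTagName() + ": " + bug.getTextContent());
        }
    }
}
```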
