How to compare 2 xml strings?


I have a very specific requirement: comparing two XML strings in Java. I have two XML strings, the original and the modified. I need to compare the original XML string with the modified one and find out what has been modified.
For example:
Original xml is
<Mycontacts>
  <contact>
    <firstName>Robert</firstName>
    <PhoneNumber>9053428756</PhoneNumber>
    <lastName>Bobbling</lastName>
    <mobile>4168014523</mobile>
  </contact>
  <contact>
    <firstName>Lily</firstName>
    <PhoneNumber>9053428756</PhoneNumber>
    <lastName>Bobbling</lastName>
    <mobile>4168014523</mobile>
  </contact>
</Mycontacts>
Modified xml:
<Mycontacts>
  <contact>
    <firstName>Robert</firstName>
    <PhoneNumber>40454321333</PhoneNumber>
    <lastName>Bobbling</lastName>
    <mobile>4168014523</mobile>
  </contact>
</Mycontacts>
Since one contact is modified here and one is deleted, I want to build two XML trees: one is the modify XML and the other is the delete XML.
modify xml:
<contact>
  <firstName>Robert</firstName>
  <PhoneNumber>40454321333</PhoneNumber>
  <lastName>Bobbling</lastName>
  <mobile>4168014523</mobile>
</contact>
delete xml:
<contact>
  <name>Lily</name>
</contact>
How can this be done using Java APIs? Is parsing each node and creating a map for each contact entry a good option?

http://xmlunit.sourceforge.net/

I would parse the XML files to Java objects and compare those, assuming that the XML layout is not changing over time. You can use XStream or JAXB to do that.

Very difficult problem in the general case, for example if you want to detect that the element names have changed but the values have stayed the same, or if you want to detect that two elements are both still present but the order has been reversed. It's a lot easier if you know something about the structure of your data, and for example you are able to distinguish which values act as identifiers, so the problem reduces to finding an element in the other file with the same identifier and then asking which of its non-identifying properties have changed.
The essential point is that you need to say a lot more about the requirements before one can attempt a design.
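As a minimal sketch of the identifier-based approach described above, the following assumes that firstName uniquely identifies a contact and that every contact has one. Neither assumption is stated in the question. The class name ContactDiff and its methods are hypothetical, and only the JDK's built-in DOM parser is used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ContactDiff {

    /** Parses the XML and indexes every <contact> by its <firstName> text (assumed unique). */
    static Map<String, Element> indexByFirstName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Map<String, Element> byName = new LinkedHashMap<>();
            NodeList contacts = doc.getElementsByTagName("contact");
            for (int i = 0; i < contacts.getLength(); i++) {
                Element contact = (Element) contacts.item(i);
                String key = contact.getElementsByTagName("firstName")
                        .item(0).getTextContent().trim();
                byName.put(key, contact);
            }
            return byName;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Flattens the child elements of a contact into an element-name -> text map. */
    static Map<String, String> fieldsOf(Element contact) {
        Map<String, String> fields = new LinkedHashMap<>();
        NodeList children = contact.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                fields.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return fields;
    }

    /** Identifiers present in the original but absent from the modified document. */
    static List<String> deleted(Map<String, Element> original, Map<String, Element> modified) {
        List<String> result = new ArrayList<>();
        for (String key : original.keySet()) {
            if (!modified.containsKey(key)) {
                result.add(key);
            }
        }
        return result;
    }

    /** Identifiers present in both documents whose non-identifying fields differ. */
    static List<String> changed(Map<String, Element> original, Map<String, Element> modified) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Element> entry : original.entrySet()) {
            Element other = modified.get(entry.getKey());
            if (other != null && !fieldsOf(entry.getValue()).equals(fieldsOf(other))) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

With the two documents from the question, changed would contain Robert and deleted would contain Lily. The matching Element objects in the modified map can then be copied into the modify and delete trees. If the identifier itself can change, this approach no longer works and the requirements need to be stated more precisely.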

Related

XmlUnit order based on value [duplicate]

I am comparing a saved example XML with a live marshalled XML in my JUnit tests, validating the presence of a key/value pair in the XML.
I am using XmlUnit 2.1.0 specifically.
My xml is as follows:
<entries>
  <entry>
    <key>delete</key>
    <value>ENABLED</value>
  </entry>
  <entry>
    <key>view</key>
    <value>DISABLED</value>
  </entry>
  <entry>
    <key>create</key>
    <value>DISABLED</value>
  </entry>
</entries>
The order of the entries can vary. I'm unsure how to get it to validate correctly since it sees a different key value as a difference in the xml even though it's just an order change.
I am asserting similarity with the following assertion in JUnit:
assertThat(marshalledXml, isSimilarTo(Input.fromFile("path/to/example.xml")).ignoreWhitespace().ignoreComments());
I suspect I may need to make use of XPath matchers or the DefaultNodeMatchers with an ElementSelector.

Yes, you need to provide an ElementSelector that "knows" which nodes to pick for comparison in your specific case.
For most of the document, the name of the element seems to be what you should use. At least that's true for entries, key and value. For entry elements, you want to compare those elements that have matching nested text in the key element that is their immediate child, right?
I think this translates to
ElementSelectors.conditionalBuilder()
    .whenElementIsNamed("entry")
    .thenUse(ElementSelectors.byXPath("./key", ElementSelectors.byNameAndText))
    .elseUse(ElementSelectors.byName)
    .build();
See https://github.com/xmlunit/user-guide/wiki/SelectingNodes for a more detailed discussion of the ElementSelector options. Your XML is pretty close to the table example used in the introduction and discussed in the next sections.

Compare strings using HashMap

I have a few XML files like this:
<?xml version = "1.0"?>
<note>
  <to>Tim</to>
  <from>Joe</from>
  <head>About Job</head>
</note>
<?xml version = "1.0"?>
<note>
  <to>Tim</to>
  <from>Joe</from>
  <head>How are u?</head>
</note>
<?xml version = "1.0"?>
<note>
  <to>Marry</to>
  <from>Pit</from>
  <head>Welcome to home</head>
</note>
I parse these files and store the data in a text file like this:
FROM:
Tim
Tim
Pit
TO:
Joe
Joe
Marry
HEAD:
About Job
How are u?
Welcome to home
I want the names not to be repeated.
How can I do this with a HashMap? Please help me! :)

If you only need a unique Collection of names, use a HashSet<String>. Only one instance would be stored for each unique name. HashMap would make sense if you want each unique name to serve as key. In that case you must decide what you want to store as the value for each name key.

From your question, it looks like you only need to store a collection of unique objects. Implementations of Set will help. You can use a HashSet of String objects:
Set<String> uniqueNames = new HashSet<>();

Set<String> from = new HashSet<String>();
You simply add new records with the add method. Remember that a HashSet does not keep the order in which elements were inserted. But I suppose in your example you will sort the names before printing, so that's OK.
If you want keep the iteration order use LinkedHashSet instead.
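A small sketch of the difference, using the names from the question. The class name UniqueNames and the helper unique are hypothetical. A LinkedHashSet drops the duplicates while keeping the first-seen order.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class UniqueNames {

    /** Removes duplicates while preserving the order of first occurrence. */
    static Set<String> unique(List<String> names) {
        return new LinkedHashSet<>(names);
    }

    public static void main(String[] args) {
        List<String> from = Arrays.asList("Tim", "Tim", "Pit");
        System.out.println(unique(from)); // prints [Tim, Pit]
    }
}
```

Swapping LinkedHashSet for HashSet would still remove the duplicate, but the iteration order would no longer be guaranteed.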

Read the text file and store the names in a HashSet like this:
Set<String> names = new HashSet<String>();
names.add(nameString);
The above solution is case-sensitive, which means Tim and tim will be treated as different names. If you want case-insensitive behavior, convert the name strings to lower case using nameString.toLowerCase() before adding them to the HashSet, like this:
names.add(nameString.toLowerCase()); // now Tim and tim will be treated equally
Hope this helps.

Comparing two sets of XML data without loading all the comparison data into memory

So I have two XML files that are being parsed for information. I'm trying to think of a way to determine which elements from one XML file are missing from the other XML file. Currently the results for both XML files are loaded into two different arrays, but this is not good because it's a lot of data to hold on to.
I need to somehow figure out what is missing from one file without loading all the data permanently into memory, since the XML files in question can be very, very large.
Here is an example of the XML. Just imagine the other file is missing one of the weaknesses. I'm already using the SAX parser to get the actual data.
<weaknesses>
  <weakness status="new" severity="low" id="14876">
    <cwe id="133" href="http://cwevis.org">Title1</cwe>
    <tool code="STRING" category="PERFORMANCE" name="aaa"/>
    <rule name="Method invokes inefficient new String(String) constructor"/>
    <locations>
      <location path="Catcher.java" type="file">
        <line end="93" start="93"/>
        <description>stuff</description>
      </location>
    </locations>
  </weakness>
  <weakness status="new" severity="low" id="14877">
    <cwe id="138" href="http://cwevis.org">Title2</cwe>
    <tool code="PARAMETER" category="SECURITY" name="bbb"/>
    <rule name="Servlet parameters unsafe"/>
    <locations>
      <location path="Catcher.java" type="file"/>
    </locations>
  </weakness>
  <weakness status="new" severity="low" id="14878">
    <cwe id="500" href="http://cwevis.org">Title3</cwe>
    <tool code="FINAL" category="asd" name="vvv"/>
    <rule name="Field isn't final and can't be protected from malicious code"/>
    <locations>
      <location path="Course.java" type="file">
        <line end="56" start="56"/>
        <description>stuff </description>
      </location>
    </locations>
  </weakness>
</weaknesses>
Note: I'm programming this in Java, and you can assume that the elements are not sorted. The two ideas that come to mind are the easy way of loading both sets and comparing one against the other, which doesn't solve the memory problem, and parsing the XML over and over without storing anything, which is very inefficient in processing time.

Let's say you compare XML file A against B. You first fill a set X with all the elements of A while parsing file A. Then, while you parse file B, you try to remove from set X whatever elements you find. If remove returns true (the element was in X and has been removed), you move on. If it returns false (the element was not in X), you save it in a set Y. At the end of parsing file B, set X will contain all elements that are in A and not in B, and set Y will contain all elements that are in B and not in A.
This requires you to implement an entity class representing the weakness object, which implements equals and hashCode (for the remove call to work) and possibly the Comparable interface (a sorted collection may be a better fit for some dimensions of this problem).
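The two-set idea above can be sketched as follows. To keep it short, this sketch makes an assumption that is not in the answer: the id attribute alone identifies a weakness and is unique within each file, so only id strings are held in memory instead of full entity objects. The class name WeaknessDiff and its methods are hypothetical, and strings stand in for the large files, which would normally be passed to the SAX parser as streams.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class WeaknessDiff {

    /** Streams the XML and passes the id attribute of every <weakness> to the callback. */
    static void forEachWeaknessId(String xml, Consumer<String> onId) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes attrs) {
                            if ("weakness".equals(qName)) {
                                onId.accept(attrs.getValue("id"));
                            }
                        }
                    });
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Returns two sets: ids only in A (index 0) and ids only in B (index 1). */
    static List<Set<String>> diff(String xmlA, String xmlB) {
        Set<String> onlyInA = new HashSet<>(); // set X in the description above
        Set<String> onlyInB = new HashSet<>(); // set Y in the description above
        forEachWeaknessId(xmlA, onlyInA::add);
        forEachWeaknessId(xmlB, id -> {
            // remove returns false when the id was never seen in A
            if (!onlyInA.remove(id)) {
                onlyInB.add(id);
            }
        });
        return Arrays.asList(onlyInA, onlyInB);
    }
}
```

Each file is parsed once, and memory is bounded by the identifiers of file A plus the mismatches from file B. If more than the id has to match, the string key could be replaced by an entity class with equals and hashCode, as the answer suggests.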

Finding an XML node by its attribute value and updating in Java

Let's say I have the following XML document:
<Offices>
  <Office name="P">
    <Counter>1000</Counter>
  </Office>
  <Office name="K">
    <Counter>1006</Counter>
  </Office>
</Offices>
With that document I need to perform the following in Java:
1. Parse the XML.
2. Get the value of Counter given a certain value for a name attribute.
3. Update the XML with a new value for Counter for exactly this Office.
For 2. I have considered using XPath but editing/updating the XML seems to be not that easy this way.
How could I go through the XML finding a certain office name and update its counter? The XML itself won't be large, only something like 20 office entries max.

You can try looking at this answer:
https://stackoverflow.com/a/5059411/1571550
It seems to be a pretty straightforward and generic solution.
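For a document this small, one possible approach is to load it into a DOM tree, locate the Counter node with XPath, and change its text in place. The sketch below is not taken from the linked answer. The class name OfficeCounter and its methods are hypothetical, and the office name is passed as an XPath variable so that quotes in the name cannot break the expression.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class OfficeCounter {

    /** Step 1: parse the XML string into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Finds the <Counter> node of the office with the given name, or null if there is none. */
    static Node findCounter(Document doc, String officeName) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Resolve $name to the requested office name.
            xpath.setXPathVariableResolver(variable -> officeName);
            return (Node) xpath.evaluate(
                    "/Offices/Office[@name=$name]/Counter", doc, XPathConstants.NODE);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Invalid XPath expression", e);
        }
    }

    /** Step 2: read the counter value for one office. */
    static String getCounter(Document doc, String officeName) {
        Node counter = findCounter(doc, officeName);
        return counter == null ? null : counter.getTextContent().trim();
    }

    /** Step 3: update the counter for exactly this office; returns false if it was not found. */
    static boolean setCounter(Document doc, String officeName, String newValue) {
        Node counter = findCounter(doc, officeName);
        if (counter == null) {
            return false;
        }
        counter.setTextContent(newValue);
        return true;
    }
}
```

The modified Document can then be written back out, for example with a javax.xml.transform.Transformer using a DOMSource and a StreamResult.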

XSD: Index of sequence in Element name

I'm building an XSD to generate JAXB objects in Java. Then I ran into this:
<TotalBugs>
  <Bug1>...</Bug1>
  <Bug2>...</Bug2>
  ...
  <BugN>...</BugN>
</TotalBugs>
How do I build a sequence of elements where the index of the sequence is in the element name? Specifically, how do I get the 1 in Bug1?

You don't want to do it this way. XML elements are ordered top-down by nature, so you don't have to number them yourself:
<totalBugs>
  <bug><!-- Here comes 1st bug --></bug>
  <bug><!-- Here comes 2nd bug --></bug>
  ...
  <bug><!-- Here comes last bug --></bug>
</totalBugs>
You can access the 1st bug node in the list by the XPath expression:
/totalBugs/bug[1]
Note that, per the W3C standard, indexes start at 1. See w3schools for further reading.

I'm pretty sure XSD won't support what you need. However you can use <xsd:any> for that bit of the schema, then use something lower-level than JAXB to generate the XML for that particular part. (I think your generated classes will have fields like protected List<Element> any; which you can fill in using DOM).
