Stuck with JDOM Parsing? - java

I have a complex JDOM element like the following (A). I need to change its structure to (B) so that it works with JAXB (the classes already exist,
so the only thing I can change is the structure of the XML). Can I do this using the JDOM API?
As I am a beginner in Java, this is very difficult for me; if anyone can point out a solution, it would be very helpful.
Existing element (A)
<DETAILS>
<ROWSET name="OPTIONS">
<ROW num="1">
<Uniqueno>1000</Uniqueno>
<ROWSET name="SUSPENCE">
<ROW num="1">
<Uniqueno>1001</Uniqueno>
<ROWSET name="PERSONS">
<ROW num="1">
<Name>60821894</Name>
<Age>44</Age>
</ROW>
<ROW num="2">
<Name>60821894</Name>
<Age>44</Age>
</ROW>
</ROWSET>
<ROWSET name="PERSONS">
<ROW num="1">
<Name>60821894</Name>
<Age>55</Age>
</ROW>
<ROW num="2">
<Name>60821894</Name>
<Age>55</Age>
</ROW>
<ROW num="3">
<Name>60821894</Name>
<Age>55</Age>
</ROW>
</ROWSET>
</ROW>
</ROWSET>
</ROW>
</ROWSET>
</DETAILS>
Required element (B)
<DETAILS>
<OPTIONS>
<Uniqueno>1000</Uniqueno>
<SUSPENCE>
<Uniqueno>1001</Uniqueno>
<PERSONS>
<Name>60821894</Name>
<Age>44</Age>
<Name>60821894</Name>
<Age>44</Age>
</PERSONS>
<PERSONS>
<Name>60821894</Name>
<Age>55</Age>
<Name>60821894</Name>
<Age>55</Age>
<Name>60821894</Name>
<Age>55</Age>
</PERSONS>
</SUSPENCE>
</OPTIONS>
</DETAILS>

May I suggest using XSLT instead? It is much easier. Start with something like this:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- DETAILS itself is copied by the identity template below, so only OPTIONS is emitted here -->
<xsl:template match="DETAILS/ROWSET[@name='OPTIONS']">
<OPTIONS>
<xsl:apply-templates />
</OPTIONS>
</xsl:template>
<xsl:template match="ROW">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
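For completeness, here is a minimal sketch of running such a stylesheet from Java with the JDK's built-in javax.xml.transform API. The class name RowsetFlattener and the method flatten are made up for illustration. The embedded stylesheet is also an assumption on my part: it generalises the starting point above so that every ROWSET is renamed after its name attribute, which is what structure (B) seems to require.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: applies an XSLT 1.0 stylesheet that turns
// <ROWSET name="X"> into <X> and unwraps every <ROW>.
class RowsetFlattener {
    static final String XSLT = String.join("\n",
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>",
        "  <xsl:strip-space elements='*'/>",
        // Assumption: each ROWSET becomes an element named after its name attribute.
        "  <xsl:template match='ROWSET'>",
        "    <xsl:element name='{@name}'><xsl:apply-templates/></xsl:element>",
        "  </xsl:template>",
        // ROW elements are dropped, their children are kept.
        "  <xsl:template match='ROW'><xsl:apply-templates/></xsl:template>",
        // Identity template: copies everything else unchanged.
        "  <xsl:template match='@*|node()'>",
        "    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    static String flatten(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

If you already hold a JDOM Document, you would first have to serialise it (or wrap it in a JDOM source object) before handing it to the transformer; that part is left out here.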

Looking at the two documents, they have entirely different structures. You need to build an XML structure similar to (B) dynamically. The following link will help you with that.
http://www.ibm.com/developerworks/java/library/j-jdom/
Hope this will help you.

You have been asking essentially the same question multiple times.
Remove XML attribute using JDOM API?
Issues in Parsing XML
If you have not yet been able to get the previous questions right, you need to take a step back and work with more basic examples before you ramp up to doing multiple element moves.
While I agree with forty-two that XSL will be a better solution in the long run, I don't think you are in a place yet where that will make things easier (for you). If you have JDOM Elements available with your data, you should figure out your Java debugger and inspect the Elements as you add and remove them. You need to 'play' a bit to get a better understanding of how Java, XML, and JDOM work. Right now you are asking a whole bunch of related questions that show a basic misunderstanding of what are, in effect, 'foundation' concepts. You need to get those foundations right before you tackle these more complex concepts.
How about you start with something simple:
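// Imports assume JDOM 2 (org.jdom2); JDOM 1 uses the org.jdom package instead.
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.output.Format;
import org.jdom2.output.XMLOutputter;

XMLOutputter xout = new XMLOutputter(Format.getPrettyFormat());
Document doc = new Document();
Element root = new Element("DETAILS");
doc.addContent(root);
xout.output(doc, System.out);  // only the empty root; output() throws IOException
Element row = new Element("ROW");
root.addContent(row);
xout.output(doc, System.out);  // ROW is now a child of DETAILS
row.detach();
xout.output(doc, System.out);  // ROW has been removed again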
You can use the above to see how content is added to, and detached from, a JDOM document.
Then, when you have that figured out, you can put it in loops, scans, etc. so that you can detach, and re-add content from other places in the Document hierarchy.

Related

If regEx is bad for matching XML, what is the correct way?

I was trying to do a simple string delete in XML.
I want to delete something like the following.
<A>
<B>Test Name</B>
</A>
Has to work with all possible XML, though.
<Test><A><B>Test Name</B></A></Test>
<Test ><A ><B >Test Name</B ></A ></Test >
<Test>
<A>
<B>Test Name</B>
</A>
</Test>
etc, etc.
The regular expression I have so far is simply:
<A>\s*(\r\n|\r|\n)*\s*<B>Test Name<\/B>\s*(\r\n|\r|\n)*\s*<\/A>
Everyone always says regex is bad for matching XML, which it clearly is. So what should I use instead?
GC_
The best approach for this case would be XSLT. Even with XSLT 1.0 this is simple (you can use the Java XSLT processor, Linux's xsltproc, or any other XSLT processor; every XSLT processor supports at least XSLT 1.0):
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*" />
<!-- identity template - matches everything except the things matched by other templates -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- Removes the elements you do not want -->
<xsl:template match="A[B[normalize-space(.)='Test Name']]" />
</xsl:stylesheet>
The output of your sample case (with a hypothetical root element) would be
<Test/>
<Test/>
<Test/>
Trying to use regex would be error-prone and not good practice at all.
Why would you make it complicated if it could be so easy?
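If you would rather stay in Java than write a stylesheet, the same XPath predicate can be used with the JDK's DOM and XPath APIs. The following is only a sketch: the class name ElementRemover and the method removeMatches are hypothetical, and it loads the whole document into memory.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: removes every node selected by an XPath expression.
class ElementRemover {
    static String removeMatches(String xml, String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        // The parser has already normalised tag spacing, so "<A >" and "<A>" look the same here.
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(xpath, doc, XPathConstants.NODESET);
        for (int i = 0; i < hits.getLength(); i++) {
            Node hit = hits.item(i);
            hit.getParentNode().removeChild(hit);
        }

        // Serialise the modified tree with an identity transformer.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

It would be called with the same expression as the template match, for example removeMatches(xml, "//A[B[normalize-space(.)='Test Name']]"). Whitespace-only text nodes around the removed element are left in place.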

How to flatten a multilevel xml into List<String xpath > in java?

The objective is to flatten a multilevel XML document with repeated elements into XPaths at leaf level, so that we can store it in a key-value store and retrieve it. The assumption is that every repeating node has a UID.
Generate a list of pairs where the key is an XPath and the value is the actual value of that leaf.
It should be possible to assemble the pairs back into XML.
The XML is backed by an XSD (is there a JAXB solution?).
Edited and replaced the previous xml with a simpler one.
a sample xml looks like below
<?xml version="1.0" encoding="UTF-8"?>
<cars>
<car uid="WxiMr123">
<carDoor uid="WRP2">
<location uid="loc-1">
<width uom="ft">2</width>
<height uom="ft">3</height>
</location>
<location uid="loc-2">
<width uom="m">5</width>
<height uom="m">7</height>
</location>
</carDoor>
<commonData>
<timeCreated>2001-04-30T08:15:00.000Z</timeCreated>
</commonData>
</car>
</cars>
The xpath K,V pairs I'm looking at should look like
/cars/car[@uid="WxiMr123"]@uid , "WxiMr123"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]@uid, "WRP2"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-1"]@uid, "loc-1"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-1"]/width[@uom="ft"]@uom, "ft"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-1"]/width[@uom="ft"]/text(), "2"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-1"]/height[@uom="ft"]@uom, "ft"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-1"]/height[@uom="ft"]/text(), "3"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-2"]@uid, "loc-2"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-2"]/width[@uom="m"]@uom, "m"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-2"]/width[@uom="m"]/text(), "5"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-2"]/height[@uom="m"]@uom, "m"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-2"]/height[@uom="m"]/text(), "7"
/cars/car[@uid="WxiMr123"]/commonData/timeCreated/text(), "2001-04-30T08:15:00.000Z"
Any help is highly appreciated.
Producing the required document is not that difficult. For example, the following stylesheet:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:for-each select="//text() | //@*[string()]">
<xsl:for-each select="ancestor::*">
<xsl:value-of select="concat('/', name())" />
<xsl:for-each select="@*">
<xsl:value-of select="concat('[@', name(), '=&quot;', ., '&quot;]')" />
</xsl:for-each>
</xsl:for-each>
<xsl:choose>
<xsl:when test="name()">
<xsl:value-of select="concat('/@', name())" />
</xsl:when>
<xsl:otherwise>/text()</xsl:otherwise>
</xsl:choose>
<xsl:value-of select="concat(', &quot;', ., '&quot;&#10;')" />
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
when applied to your example input, will produce the following result:
/cars/car[@uid="WxiMr123"]/@uid, "WxiMr123"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/@uid, "WRP2"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-1"]/@uid, "loc-1"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-1"]/width[@uom="ft"]/@uom, "ft"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-1"]/width[@uom="ft"]/text(), "2"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-1"]/height[@uom="ft"]/@uom, "ft"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-1"]/height[@uom="ft"]/text(), "3"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-2"]/@uid, "loc-2"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-2"]/width[@uom="m"]/@uom, "m"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-2"]/width[@uom="m"]/text(), "5"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-2"]/height[@uom="m"]/@uom, "m"
/cars/car[@uid="WxiMr123"]/carDoor[@uid="WRP2"]/location[@uid="loc-2"]/height[@uom="m"]/text(), "7"
/cars/car[@uid="WxiMr123"]/commonData/timeCreated/text(), "2001-04-30T08:15:00.000Z"
Note:
This may not be the most efficient method; with large XML documents it might be better to apply templates that traverse the entire tree recursively;
Elements may have multiple attributes; your example does not show how to handle these when building the paths;
Elements may be in namespaces; your example does not show how to handle these;
Empty nodes are excluded; if you try to reconstruct the original XML document from the result of this transformation (not that I have any idea how one would go about that), these nodes will be missing.
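Since the question asks for Java, here is a rough sketch of the recursive traversal mentioned in the first note, written against the JDK's DOM API instead of XSLT. The names XPathFlattener, flatten and walk are invented for this example. It assumes, as the question does, that the generated keys are unique, and it shares the limitations listed above (namespaces are ignored and empty elements produce no entry). Unlike the stylesheet, it also emits attributes whose value is empty.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: flattens an XML document into XPath-like key/value pairs.
class XPathFlattener {
    static Map<String, String> flatten(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        Map<String, String> pairs = new LinkedHashMap<>(); // keeps insertion order
        walk(root, "", pairs);
        return pairs;
    }

    private static void walk(Element element, String parentPath, Map<String, String> pairs) {
        NamedNodeMap attrs = element.getAttributes();

        // Build this element's step: /name[@attr="value"]...
        // Note: DOM does not guarantee attribute order, so multi-attribute steps may vary.
        StringBuilder step = new StringBuilder(parentPath).append('/').append(element.getNodeName());
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            step.append("[@").append(attr.getNodeName())
                .append("=\"").append(attr.getNodeValue()).append("\"]");
        }
        String path = step.toString();

        // One pair per attribute.
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            pairs.put(path + "/@" + attr.getNodeName(), attr.getNodeValue());
        }

        // Recurse into child elements; record non-blank text as a leaf value.
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                walk((Element) child, path, pairs);
            } else if (child.getNodeType() == Node.TEXT_NODE
                    && !child.getNodeValue().trim().isEmpty()) {
                pairs.put(path + "/text()", child.getNodeValue().trim());
            }
        }
    }
}
```

A map is used here because the target is a key-value store; if an element can contain several separate text nodes, later ones would overwrite earlier ones under the same key.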

Remove data from XML file in DOM?

Is there an easy way (perhaps using the DOM api, or other) where I could remove the actual data from an XML file, leaving behind just a kind of template of its schema, so that we can see what potential information it can hold.
I will give an example, to make this clear.
Suppose the user inputs the following XML file:
<photos page="2" pages="89" perpage="10" total="881">
<photo id="2636" owner="47058503995@N01"
secret="a123456" server="2" title="test_04"
ispublic="1" isfriend="0" isfamily="0" />
<photo id="2635" owner="47058503995@N01"
secret="b123456" server="2" title="test_03"
ispublic="0" isfriend="1" isfamily="1" />
<photo id="2633" owner="47058503995@N01"
secret="c123456" server="2" title="test_01"
ispublic="1" isfriend="0" isfamily="0" />
<photo id="2610" owner="12037949754@N01"
secret="d123456" server="2" title="00_tall"
ispublic="1" isfriend="0" isfamily="0" />
</photos>
Then I want to transform this into:
<photos page="..." pages="..." perpage="..." total="...">
<photo id="..." owner="..."
secret="..." server="..." title="..."
ispublic="..." isfriend="..." isfamily="..." />
</photos>
I'm sure this could be written manually, but what would be the best, most efficient and reliable way of doing this (preferably in Java)?
Thanks!
There are plenty of possibilities:
DOM API (included in JDK)
SAX API (included in JDK)
JDOM (easy to use, but external)
XSLT (transforming XML with prepared XSL stylesheet, JDK supports XSLT 1.0)
I think that XSLT is the most reliable and universal way to transform XML into other XML. Here is a quick example:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:strip-space elements="*"/>
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()[position()=1]"/>
</xsl:copy>
</xsl:template>
<xsl:template match="@*">
<xsl:attribute name="{name()}">...</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
Result:
<photos page="..." pages="..." perpage="..." total="...">
<photo id="..." owner="..." secret="..." server="..." title="..." ispublic="..."
isfriend="..."
isfamily="..."/>
</photos>
Rather than use the DOM API, in which you'd have to iterate across the structure yourself, take a look at the SAX API, which iterates itself and calls you back for each element, text node etc. For each element you get called back for, you'll get the set of attributes too.
You'd still have to determine what to output, reduce duplicates etc. But you get a callback for an end-of-element as well, so perhaps record everything you get given, and then for your end-of-element callback, just determine the unique set of data you wish to output.
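To make the callback idea concrete, here is one possible sketch of such a handler using the JDK's SAX parser. The class name StructureCollector and the method collect are hypothetical. It records, for each distinct element path, the set of attribute names seen, which is enough information to print a template afterwards. Producing the template output itself is left out.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects element paths and their attribute names, ignoring all values.
class StructureCollector extends DefaultHandler {
    private final Deque<String> openPaths = new ArrayDeque<>();
    final Map<String, Set<String>> attributesByPath = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        String parent = openPaths.isEmpty() ? "" : openPaths.peek();
        String path = parent + "/" + qName;
        openPaths.push(path);
        // Duplicates (e.g. the second, third ... <photo>) merge into the same entry.
        Set<String> names = attributesByPath.computeIfAbsent(path, k -> new LinkedHashSet<>());
        for (int i = 0; i < atts.getLength(); i++) {
            names.add(atts.getQName(i));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openPaths.pop();
    }

    static Map<String, Set<String>> collect(String xml) throws Exception {
        StructureCollector handler = new StructureCollector();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.attributesByPath;
    }
}
```

Because SAX never builds a tree, this approach also works for files that are too large to hold in memory.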
There are heaps of XML parsers available that you can use to do this job. If you are interested in learning, try XmlBeans or JAXB. These APIs give you a great deal of control and validation. You also get to learn XSD and how to generate Java classes from an XSD. Parsing and writing XML files is fairly easy with these APIs. Here are some useful links:
XmlBeans
JAXB 2.0

XPath Trouble with getting attributes

I'm having a bit of trouble with some XML in Java. The following is the result of an API call to EVE Online. How can I get the "name" and "characterID" for each row?
Frankly I just have no idea where to start with this one, so please don't ask for extra information. I just gotta know how to get those attributes.
<?xml version='1.0' encoding='UTF-8'?>
<eveapi version="1">
<currentTime>2007-12-12 11:48:50</currentTime>
<result>
<rowset name="characters" key="characterID" columns="name,characterID,corporationName,corporationID">
<row name="Mary" characterID="150267069"
corporationName="Starbase Anchoring Corp" corporationID="150279367" />
<row name="Marcus" characterID="150302299"
corporationName="Marcus Corp" corporationID="150333466" />
<row name="Dieinafire" characterID="150340823"
corporationName="Center for Advanced Studies" corporationID="1000169" />
</rowset>
</result>
<cachedUntil>2007-12-12 12:48:50</cachedUntil>
</eveapi>
Try
/eveapi/result/rowset/row/@name
and
/eveapi/result/rowset/row/@characterID
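As a starting point in Java, the following sketch evaluates an XPath expression with the JDK's javax.xml.xpath API and reads both attributes from each row. The class name CharacterReader and the method readCharacters are made up for this example, and the XML is passed in as a string for simplicity.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: returns {name, characterID} for every <row>.
class CharacterReader {
    static List<String[]> readCharacters(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        // Select the row elements once, then read the attributes from each one.
        NodeList rows = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate("/eveapi/result/rowset/row", doc, XPathConstants.NODESET);

        List<String[]> characters = new ArrayList<>();
        for (int i = 0; i < rows.getLength(); i++) {
            Element row = (Element) rows.item(i);
            characters.add(new String[] {
                row.getAttribute("name"),
                row.getAttribute("characterID")
            });
        }
        return characters;
    }
}
```

Selecting the row elements rather than the attributes directly keeps each name paired with its characterID.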

Dynamic xml filtering and transform (in Java)

I have an XML file that looks like
<?xml version='1.0' encoding='UTF-8'?>
<root>
<node name="foo1" value="bar1" />
<node name="foo2" value="bar2" />
</root>
I have a method
String processBar(String bar)
and I want to end up with
<?xml version='1.0' encoding='UTF-8'?>
<root>
<node name="foo1" value="processBar("bar1")" />
<node name="foo2" value="processBar("bar2")" />
</root>
Is there an easy way to do this? Preferably in Java. Note that the file is too large to safely load completely into memory. The data in the XML is roughly arbitrary and processBar may be complex, so I don't want to use regular expressions.
Assuming you mean replacing the attribute values with the result of calling processBar on said attribute values...
Use the JDK's XSLT API to run the following:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:java="http://xml.apache.org/xalan/java"
extension-element-prefixes="java">
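<!-- identity template: copies everything that is not matched below -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/root/node/@value">
<xsl:attribute name="value">
<xsl:value-of select="java:com.example.yourclass.processBar(string(.))"/>
</xsl:attribute>
</xsl:template>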
</xsl:stylesheet>
This uses the Xalan-Java extensions and assumes a static method. You can get an instance of an object and store it in an xsl:variable, like this:
<xsl:variable name="frobber" select="java:com.example.Frobber.new()"/>
<xsl:value-of select="java:processBar($frobber, string(.))"/>
Or somesuch.
This only works with Xalan, but since that's the XSLT processor distributed with the JDK, I doubt it will be onerous to use Xalan.
You can either parse the whole thing with a Java XML parser, or just read the file content into a string and then do a regexp replace on it (using e.g. http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#replaceAll%28java.lang.String,%20java.lang.String%29).
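Because the question says the file is too large to load into memory, a streaming parser may be a better fit than either option. The sketch below uses the JDK's StAX API (javax.xml.stream) to copy events from input to output and rewrite the value attribute of each node element on the way. The class name StreamingAttributeRewriter and the method rewrite are hypothetical, and processBar is passed in as a function since its real implementation is not shown in the question.

```java
import java.io.Reader;
import java.io.Writer;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.UnaryOperator;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: streams XML through, rewriting node/@value with processBar.
class StreamingAttributeRewriter {
    static void rewrite(Reader in, Writer out, UnaryOperator<String> processBar)
            throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()
                    && "node".equals(event.asStartElement().getName().getLocalPart())) {
                StartElement start = event.asStartElement();
                List<Attribute> attributes = new ArrayList<>();
                Iterator<?> it = start.getAttributes();
                while (it.hasNext()) {
                    Attribute attribute = (Attribute) it.next();
                    if ("value".equals(attribute.getName().getLocalPart())) {
                        // Replace the attribute with one holding the processed value.
                        attribute = events.createAttribute(
                            attribute.getName(), processBar.apply(attribute.getValue()));
                    }
                    attributes.add(attribute);
                }
                event = events.createStartElement(
                    start.getName(), attributes.iterator(), start.getNamespaces());
            }
            writer.add(event); // every other event is copied through unchanged
        }
        writer.close(); // flushes buffered output; does not close the underlying Writer
        reader.close();
    }
}
```

Only one event is held at a time, so memory use should stay roughly constant regardless of file size. For real files you would pass a FileReader and FileWriter (or streams with an explicit encoding) instead of in-memory strings.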
