Selective XML parsing - java

This is the XML file I have:
<?xml version="1.0" encoding="UTF-8"?>
<Bank>
<Account type="saving">
<Id>1001</Id>
<Name>Jack Robinson</Name>
<Amt>10000</Amt>
</Account>
<Account type="current">
<Id>1002</Id>
<Name>Sony Corporation</Name>
<Amt>1000000</Amt>
</Account>
</Bank>
I need to parse this XML and get the contents between <Bank> and </Bank>. My output XML should be:
<Account type="saving">
<Id>1001</Id>
<Name>Jack Robinson</Name>
<Amt>10000</Amt>
</Account>
<Account type="current">
<Id>1002</Id>
<Name>Sony Corporation</Name>
<Amt>1000000</Amt>
</Account>
Any ideas on how to achieve this using Java?

First of all, your desired output is not well-formed XML.
An XML document must have exactly one root element, and that is the element you are trying to remove.
As @Seelenvirtuose said, there are many ways to do what you want, at many levels:
from simply manipulating the original XML as a String, up to using the DOM, JAXB, XPath/XQuery, or XSLT. It is a matter of choice.
As an example, with Apache Commons Lang:
String resultString = org.apache.commons.lang.StringUtils.substringBetween(originalXMLString,"<Bank>","</Bank>").trim();
Of course, your output can only be a String, because it is not a well-formed XML document. You can then do whatever you want with that String: print it, store it in a file or a database, and so on.
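If you would rather not depend on the exact text of the <Bank> tags, the DOM route mentioned above could look roughly like the sketch below. It parses the document, then serializes each child element of the root with an identity Transformer and the XML declaration switched off. The class name BankContents and the method innerXml are made up for this illustration; only JDK classes are used.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class BankContents {

    /** Serializes every child element of the root element, leaving the root itself out. */
    static String innerXml(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Identity transformer used purely as a serializer.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace-only text nodes between the child elements.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                serializer.transform(new DOMSource(child), new StreamResult(out));
                out.write(System.lineSeparator());
            }
        }
        return out.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<Bank>"
                + "<Account type=\"saving\"><Id>1001</Id><Name>Jack Robinson</Name><Amt>10000</Amt></Account>"
                + "<Account type=\"current\"><Id>1002</Id><Name>Sony Corporation</Name><Amt>1000000</Amt></Account>"
                + "</Bank>";
        System.out.println(innerXml(xml));
    }
}
```

The result is still just a String holding several sibling elements, so the point about well-formedness applies here as well.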

Related

Process single element array when converting XML to JSON (<?xml-multiple?>)

I'm using org.json.XML.toJSONObject() method to convert an XML string to JSON. Here is a sample XML string that I need to convert.
<?xml version="1.0" encoding="UTF-8"?>
<jsonObject>
<data>
<?xml-multiple accounts?>
<accounts>
<Id>123</Id>
<creationDate>2021-10-21T15:43:00.12345Z</creationDate>
<displayName>account_x</displayName>
</accounts>
</data>
<links>
<self>self</self>
<first>first</first>
<prev>prev</prev>
<next>next</next>
<last>last</last>
</links>
<meta>
<totalRecords>10</totalRecords>
<totalPages>10</totalPages>
</meta>
</jsonObject>
Here, 'accounts' is meant to be an array that happens to contain only a single element, but the org.json library cannot detect this. It produces an array only when there are multiple 'accounts' elements.
My question is: is there a library that can detect a single-element array using the <?xml-multiple?> processing instruction available in the XML string?

Append data after root element without loading the whole XML file into memory

I have to output an XML file that can contain a huge amount of data, and I am using a DOM parser to write it. It is also possible to append data to an existing XML file.
My requirement is to add data to the root element.
Is it possible to append data without reading the entire XML document (that is, without loading the XML into memory)?
Example Data:
Current XML file:
<employees>
<employee>
<name>jon</name>
<age> 22</age>
<address> address1 </address>
</employee>
</employees>
Required file:
<employees>
<employee>
<name>jon</name>
<age> 22</age>
<address> address1 </address>
</employee>
<employee>
<name>jon1</name>
<age> 24</age>
<address> address2 </address>
</employee>
</employees>
It would be hard if you don't want to load the entire XML into memory.
You can achieve this by manipulating the raw String (substring, etc.), but I don't recommend it.
Or you can try a SAX reader, http://www.saxproject.org/apidoc/org/xml/sax/XMLReader.html, which lets you process XML "on the go". (Sorry: although a SAX parser can process XML without reading its whole content, you cannot edit with it.)
EDIT:
On second thought, you can copy the existing XML using a SAX parser and, by adding an event listener on e.g. the root element, add a child along the way. It might be a good solution if your concern is memory (a big XML file).
You could use dom4j for that.
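The copy-and-add idea from the EDIT can be sketched with the JDK's StAX event API (javax.xml.stream). Note that this swaps StAX in for SAX because it has a matching writer. The sketch copies events one at a time and writes the new <employee> just before the root's closing tag, so only the current event is held in memory. The class EmployeeAppender and its method names are invented for this example, and for real files you would pass a FileReader and a FileWriter on a different output file instead of the in-memory streams used here.

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, not part of any library.
class EmployeeAppender {
    private static final XMLEventFactory EVENTS = XMLEventFactory.newInstance();

    /** Copies the XML from in to out, adding one employee as the last child of the root. */
    static void append(Reader in, Writer out, String name, String age, String address)
            throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        int depth = 0;
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                depth++;
            } else if (event.isEndElement() && --depth == 0) {
                // The root is about to be closed: emit the new child first.
                writer.add(EVENTS.createStartElement("", "", "employee"));
                writeLeaf(writer, "name", name);
                writeLeaf(writer, "age", age);
                writeLeaf(writer, "address", address);
                writer.add(EVENTS.createEndElement("", "", "employee"));
            }
            writer.add(event); // copy the original event unchanged
        }
        writer.flush();
        writer.close();
        reader.close();
    }

    private static void writeLeaf(XMLEventWriter writer, String tag, String text)
            throws XMLStreamException {
        writer.add(EVENTS.createStartElement("", "", tag));
        writer.add(EVENTS.createCharacters(text));
        writer.add(EVENTS.createEndElement("", "", tag));
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<employees><employee><name>jon</name><age> 22</age>"
                + "<address> address1 </address></employee></employees>";
        StringWriter out = new StringWriter();
        append(new StringReader(xml), out, "jon1", "24", "address2");
        System.out.println(out);
    }
}
```

The whole file is still read and rewritten once, so this saves memory rather than I/O; the output has to go to a new file that replaces the old one afterwards.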

Best way to bind a XML File in Java (NetBeans)

I want to bind this simple XML file in my Java project:
<?xml version="1.0" encoding="UTF-8"?>
<itg>
<reader>
<chapter id="1">
<subchapter id="1"></subchapter>
<subchapter id="2"></subchapter>
</chapter>
<chapter id="2">
<subchapter id="1"></subchapter>
<subchapter id="2"></subchapter>
</chapter>
<chapter id="3"></chapter>
</reader>
<questions>
</questions>
</itg>
I use NetBeans, and currently I bind the XML file by parsing it into an ArrayList and binding the list.
It works, but is it possible to bind the XML file in a better way?
Thanks!
For this small XML file (and for larger ones too) I would recommend that you take a look at JAXB. The two basic operations are marshalling (converting Java objects to XML data) and unmarshalling (converting XML data to Java objects), but validation and more are also provided.

Is it possible to create an object from XML file using DOM?

I use DOM parser to read data from XML file. I know how to read, modify and write back the data. However, I would like to know if it is possible to create an object from an XML file.
I have an XML file which looks like this:
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE people SYSTEM "validator.dtd">
<people>
<student>
<name>John</name>
<course>Computer Technology</course>
<semester>6</semester>
<scheme>E</scheme>
</student>
<student>
<name>Foo</name>
<course>Industrial Electronics</course>
<semester>6</semester>
<scheme>E</scheme>
</student>
</people>
and I would like to create objects from it so I can pass them around. Does a solution exist?
Yes. This is possible through JAXB (Java Architecture for XML Binding).
All JAXB implementations provide a tool called a binding compiler, which binds an XML schema in order to generate the corresponding Java classes.
For details, refer to: http://www.oracle.com/technetwork/articles/javase/index-140168.html#xmp1
You could have a look at the XMLBeans or JAXB libraries. In case you don't have a schema file but do have a sample XML file, you could create one using the inst2xsd tool of XMLBeans: http://xmlbeans.apache.org/docs/2.0.0/guide/tools.html This could get you started with the schema.
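Since the question asks about DOM specifically, here is a minimal sketch of doing the mapping by hand with the DOM parser, without JAXB or a schema. The Student class, its field names, and the fromXml method are assumptions made up for this example. The DOCTYPE line is left out of the sample input so that the parser does not need to find validator.dtd.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical value class mirroring one <student> element.
class Student {
    final String name;
    final String course;
    final int semester;
    final String scheme;

    Student(String name, String course, int semester, String scheme) {
        this.name = name;
        this.course = course;
        this.semester = semester;
        this.scheme = scheme;
    }

    /** Builds one Student per <student> element found in the document. */
    static List<Student> fromXml(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("student");
        List<Student> students = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            students.add(new Student(
                    text(e, "name"),
                    text(e, "course"),
                    Integer.parseInt(text(e, "semester")),
                    text(e, "scheme")));
        }
        return students;
    }

    // Text of the first descendant with the given tag; assumes the tag is present.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<people>"
                + "<student><name>John</name><course>Computer Technology</course>"
                + "<semester>6</semester><scheme>E</scheme></student>"
                + "<student><name>Foo</name><course>Industrial Electronics</course>"
                + "<semester>6</semester><scheme>E</scheme></student>"
                + "</people>";
        for (Student s : fromXml(xml)) {
            System.out.println(s.name + ", " + s.course + ", semester " + s.semester);
        }
    }
}
```

Hand-written mapping like this is fine for a handful of fields; with a larger or changing schema, generated classes are probably less work to maintain.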

Preserving the CDATA format with a SAX parser

I'm trying to parse an XML file and insert some attributes in my database. I'm developing in Java and using SAX to parse the XML file.
My problem is that when I read a value in CDATA format, I only get what the CDATA section contains. However, I want to keep the CDATA markup.
For example, with the XML below:
<?xml version="1.0" encoding="UTF-8"?>
<Bank>
<Account type="saving">
<Id>1001</Id>
<Name><![CDATA[<Jack> <Robinson>]]></Name>
<Amt>10000</Amt>
</Account>
<Account type="current">
<Id>1002</Id>
<Name>Sony Corporation</Name>
<Amt>1000000</Amt>
</Account>
</Bank>
I would like to get the Name and have it like this <![CDATA[<Jack> <Robinson>]]> and not only <Jack> <Robinson> which is what I am getting.
Can anyone help me with this issue, please?
PS: Sorry for my English, I'm French.
Best regards,
Like @Quentin asked, I am curious why you care about the markup.
Did you consider appending <![CDATA[ and ]]> manually to your output, using a StringBuffer?
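To know where the markers belong, SAX can report CDATA boundaries through org.xml.sax.ext.LexicalHandler, which is registered via the http://xml.org/sax/properties/lexical-handler property. The sketch below extends DefaultHandler2 (which implements both ContentHandler and LexicalHandler) and re-adds the markers around the text of each <Name>. The class name CdataAwareNames and the method parseNames are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical handler that collects <Name> values with their CDATA markup restored.
class CdataAwareNames extends DefaultHandler2 {
    private final List<String> names = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();
    private boolean inName;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("Name".equals(qName)) {
            inName = true;
            current.setLength(0);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("Name".equals(qName)) {
            inName = false;
            names.add(current.toString());
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inName) {
            current.append(ch, start, length);
        }
    }

    // LexicalHandler callbacks: fired around the characters of a CDATA section.
    @Override
    public void startCDATA() {
        if (inName) {
            current.append("<![CDATA[");
        }
    }

    @Override
    public void endCDATA() {
        if (inName) {
            current.append("]]>");
        }
    }

    static List<String> parseNames(String xml) throws Exception {
        CdataAwareNames handler = new CdataAwareNames();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // Without this property the parser never calls startCDATA/endCDATA.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Bank><Account type=\"saving\"><Id>1001</Id>"
                + "<Name><![CDATA[<Jack> <Robinson>]]></Name><Amt>10000</Amt></Account></Bank>";
        System.out.println(parseNames(xml));
    }
}
```

This only restores the markers in the collected String; whether you actually need them in the database is the same question @Quentin raised.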
