Read XML in JAVA

I would like to read the following XML in Java:
<?xml version="1.0" encoding="UTF-8"?>
<myapp version="1.0">
<photo_information>
<date>2016/08/20</date>
<time>17:21:59</time>
<user_data></user_data>
<prints>1</prints>
<photos>
<photo image="1">IMG_0001.JPG</photo>
<photo image="2">IMG_0002.JPG</photo>
<photo image="3">IMG_0003.JPG</photo>
<photo image="4">IMG_0004.JPG</photo>
<output>prints\160820_172159.jpg</output>
</photos>
</photo_information>
</myapp>
I need the following information:
prints
All images (IMG_0001.JPG, IMG_0002.JPG, IMG_0003.JPG, IMG_0004.JPG)
Output (prints\160820_172159.jpg)
I tried the following code, but it's not working:
package my.app.test;
import java.io.File;
import java.io.IOException;
import java.util.List;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;
import org.jdom.output.XMLOutputter;
public class TestXML {
public static void main(String[] args) {
Document doc = null;
String filePath = "/myPath/IMG_0001.xml";
File f = new File(filePath);
try {
SAXBuilder builder = new SAXBuilder();
doc = builder.build(f);
XMLOutputter fmt = new XMLOutputter();
fmt.output(doc, System.out);
Element element = doc.getRootElement();
System.out.println("\nRoot element: " + element);
System.out.println("Root element name: " + element.getName());
List alleKinder = (List) element.getChildren();
System.out.println("First child element: "
+ ((Element) alleKinder.get(0)).getName());
List benannteKinder = element.getChildren("photos");
System.out.println("Named child element: "
+ ((Element) benannteKinder.get(0)).getName());
Element kind = element.getChild("bw_mode");
System.out.println("Photo: " + kind.getValue());
Element kind2 = element.getChild("photo");
System.out.println("Photo: " + kind2.getAttributeValue("name"));
} catch (JDOMException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
}

The best way is to create a schema and use JAXB to unmarshal the incoming XML into Java objects, which you can then read like any POJO.
If you don't know how to create a schema, an online tool can help: http://xmlgrid.net/xml2xsd.html
You will then have to generate the Java classes from the schema using Ant.
Yes, that is a lot of work for such a simple problem, but this is how XML should be parsed in Java.
Remember: XML is like a war. If it isn't helping you, then most probably you are not using it enough.

As an alternative to your approach, you could take a look at the XStream library.
This library lets you serialize objects to XML and deserialize them again.
Your first step is to model a class that contains all fields of your photo information. A natural name for it is PhotoInformation:
class PhotoInformation {
LocalDate date;
LocalTime time;
UserData userData;
int prints;
Photos photos;
}
In addition, you need to create a few other classes: UserData and Photos.
In the next step, you set up the XStream parser to fill your objects with the content of the XML file.
The XStream documentation has a tutorial for that, and a separate one if you prefer annotations.

Parsing with JAXB would be a better option. To parse this document you have to create classes that mirror its structure.
//Add this to your pom in plugins
<groupId>org.codehaus.mojo</groupId>
<artifactId>jaxb2-maven-plugin</artifactId>
<version>1.6</version>
Create the following classes:
@XmlRootElement(name = "myapp")
class MyApp {
    private PhotoInformation photoInformation;
    @XmlElement(name = "photo_information")
    public PhotoInformation getPhotoInformation() {
        return photoInformation;
    }
    public void setPhotoInformation(PhotoInformation photoInformation) {
        this.photoInformation = photoInformation;
    }
}
@XmlRootElement(name = "photo_information")
class PhotoInformation {
    LocalDate date;
    LocalTime time;
    UserData userData;
    int prints;
    Photos[] photos;
    // add getters and setters for the fields above, annotated with @XmlElement and the correct tag name
}
@XmlRootElement(name = "photo")
class Photo {
    String photo;
    // add a getter and setter for the field above, annotated with @XmlElement and the correct tag name
}
To unmarshal (parse) the file:
JAXBContext jaxbContext = JAXBContext.newInstance(MyApp.class);
Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
File xmlFile = new File(folderPath);
MyApp myapp = (MyApp) jaxbUnmarshaller.unmarshal(xmlFile);
Through the myapp object you can access the elements mapped in the MyApp class, then go on to photo_information and from there to the photos.
I hope this helps.
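For comparison with the binding approaches above, here is a minimal sketch that pulls the three requested values with the DOM parser bundled in the JDK, so no extra library is needed. The class name PhotoXmlReader, its helper methods, and the shortened inline copy of the XML are assumptions made for illustration; they are not part of the original question or answers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class PhotoXmlReader {

    // Shortened copy of the XML from the question, inlined so the sketch runs without a file.
    static final String SAMPLE =
          "<myapp version=\"1.0\"><photo_information>"
        + "<prints>1</prints><photos>"
        + "<photo image=\"1\">IMG_0001.JPG</photo>"
        + "<photo image=\"2\">IMG_0002.JPG</photo>"
        + "<output>prints\\160820_172159.jpg</output>"
        + "</photos></photo_information></myapp>";

    /** Parses XML text into a DOM tree; parser exceptions are rethrown unchecked. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Returns the text of the first element with the given tag name, or null if there is none. */
    static String firstText(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    /** Collects the text of every <photo> element in document order. */
    static List<String> photoNames(Document doc) {
        List<String> names = new ArrayList<>();
        NodeList photos = doc.getElementsByTagName("photo");
        for (int i = 0; i < photos.getLength(); i++) {
            names.add(photos.item(i).getTextContent());
        }
        return names;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("prints: " + firstText(doc, "prints"));
        System.out.println("photos: " + photoNames(doc));
        System.out.println("output: " + firstText(doc, "output"));
    }
}
```

To read the real file instead of the inline string, the same DocumentBuilder also accepts a java.io.File. Looking elements up by tag name works here because the names are unique in this document; for larger documents, walking the tree or using XPath is probably safer.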

Related

What is the model of using Smooks and Freemarker to transform Java Objects to XML?

I am having trouble finding clear documentation on how to do the following transformation:
Java Object -> Smooks/Freemarker Template -> XML Output
Here is the example I am trying:
Java POJO (I have a separate DAO class that populates this object):
package Transformer;
public class JavaObject {
String name;
}
Main transformer class:
package Transformer;
import java.io.IOException;
import java.io.StringWriter;
import javax.xml.transform.stream.StreamResult;
import org.milyn.Smooks;
import org.milyn.container.ExecutionContext;
import org.milyn.payload.JavaSource;
import org.xml.sax.SAXException;
public class Transformer {
protected static String runSmooksTransform(Object javaObject) throws IOException, SAXException {
Smooks smooks = new Smooks("smooks-config.xml");
try {
ExecutionContext executionContext = smooks.createExecutionContext();
StringWriter writer = new StringWriter();
smooks.filterSource(executionContext, new JavaSource("smooks-config.xml"), new StreamResult(writer));
return writer.toString();
} finally {
smooks.close();
}
}
public static void main(String args[]) {
try {
Transformer.runSmooksTransform(javaObject);
} catch(Throwable ex){
System.err.println("Uncaught exception - " + ex.getMessage());
ex.printStackTrace(System.err);
}
}
}
This is the point where I am confused: I have seen a few different ways to "map" the template.
Here are some examples I have seen:
A .ftl template file with mapping like this:
<Nm> ${Name} </Nm>
An XML mapping like this:
<medi:segment minOccurs="0" maxOccurs="1" segcode="" xmltag="Group">
<medi:field xmltag="Name" />
</medi:segment>
Mapping in the smooks-config.xml itself:
<?xml version="1.0"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.0.xsd"
xmlns:ftl="http://www.milyn.org/xsd/smooks/freemarker-1.1.xsd">
<resource-config selector="global-parameters">
<param name="stream.filter.type">SAX</param>
</resource-config>
<reader mappingModel="example.xml" />
<ftl:freemarker applyOnElement="order">
<ftl:template>
<Nm>${name}</Nm>
</ftl:template>
</ftl:freemarker>
</smooks-resource-list>
So can anyone please explain the correct way to use Smooks + a Freemarker template to convert a java object to a specified XML output?
Or point me to documentation/example specific to this use case?
Thank you

I don't know how this is done in Smooks, but it's very likely that you need to add a public String getName() { return name; } to the JavaObject class, or else the value won't be visible from the FreeMarker template. It actually depends on the FreeMarker configuration settings (and I don't know how Smooks configures them), so anything is possible in theory. Most likely you need a getter method; if not, then at least the field needs to be public.
Also, you don't pass javaObject to Smooks in your example code, though I guess that is not the real code.
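The getter point can be checked without Smooks or FreeMarker. The sketch below uses the JDK's own JavaBeans introspection to list which properties of a class are readable through getters, which is the same convention the answer expects the template engine to rely on. Whether Smooks configures FreeMarker that way is an assumption here, as it is in the answer; the class GetterVisibilityDemo and its helper are hypothetical names.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, for illustration only.
public class GetterVisibilityDemo {

    /** The POJO from the question with the suggested getter added. */
    public static class JavaObject {
        private final String name;

        public JavaObject(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    /** Same shape as the original POJO: a package-private field and no getter. */
    public static class FieldOnlyObject {
        String name;
    }

    /** Lists the properties that JavaBeans introspection can read through a getter. */
    static List<String> readableProperties(Class<?> type) {
        try {
            // Stop at Object so the inherited "class" property is not reported.
            BeanInfo info = Introspector.getBeanInfo(type, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null) {
                    names.add(pd.getName());
                }
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("with getter:    " + readableProperties(JavaObject.class));
        System.out.println("without getter: " + readableProperties(FieldOnlyObject.class));
    }
}
```

Only the class with the getter reports a name property, which is consistent with a ${name} placeholder resolving for one class and not the other under a bean-based configuration.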

JDOM XPath Getting Inner Element without Namespace

I have an xml like this:
<root
xmlns:gl-bus="http://www.xbrl.org/int/gl/bus/2006-10-25"
xmlns:gl-cor="http://www.xbrl.org/int/gl/cor/2006-10-25" >
<gl-cor:entityInformation>
<gl-bus:accountantInformation>
...............
</gl-bus:accountantInformation>
</gl-cor:entityInformation>
</root>
All I want is to extract the element "gl-cor:entityInformation" from the root together with its child elements. However, I do not want the namespace declarations to come with it.
The code is like this:
XPathExpression<Element> xpath = XPathFactory.instance().compile("gl-cor:entityInformation", Filters.element(), null, NAMESPACES);
Element innerElement = xpath.evaluateFirst(xmlDoc.getRootElement());
The problem is that the inner element holds the namespace declarations now. Sample output:
<gl-cor:entityInformation xmlns:gl-cor="http://www.xbrl.org/int/gl/cor/2006-10-25">
<gl-bus:accountantInformation xmlns:gl-bus="http://www.xbrl.org/int/gl/bus/2006-10-25">
</gl-bus:accountantInformation>
</gl-cor:entityInformation>
This is how I get xml as string:
public static String toString(Element element) {
Format format = Format.getPrettyFormat();
format.setTextMode(Format.TextMode.NORMALIZE);
format.setEncoding("UTF-8");
XMLOutputter xmlOut = new XMLOutputter();
xmlOut.setFormat(format);
return xmlOut.outputString(element);
}
As you see the namespace declarations are passed into the inner elements. Is there a way to get rid of these declarations without losing the prefixes?
I want this because later on I will be merging these inner elements inside another parent element and this parent element has already those namespace declarations.

JDOM by design insists that the in-memory model of the XML is well structured at all times. The behaviour you are seeing is exactly what I would expect from JDOM and I consider it to be "right". JDOM's XMLOutputter also outputs well structured and internally consistent XML and XML fragments.
Changing the behaviour of the internal in-memory model is not an option with JDOM, but customizing the XMLOutputter to change its behaviour is relatively easy. The XMLOutputter is structured to have an "engine" supplied as a constructor argument: XMLOutputter(XMLOutputProcessor). In addition, JDOM supplies an easy-to-customize default XMLOutputProcessor called AbstractXMLOutputProcessor.
You can get the behaviour you want by doing the following:
private static final XMLOutputProcessor noNamespaces = new AbstractXMLOutputProcessor() {
@Override
protected void printNamespace(final Writer out, final FormatStack fstack,
final Namespace ns) throws IOException {
// do nothing with printing Namespaces....
}
};
Now, when you create your XMLOutputter to print your XML element fragment, you can do the following:
public static String toString(Element element) {
Format format = Format.getPrettyFormat();
format.setTextMode(Format.TextMode.NORMALIZE);
format.setEncoding("UTF-8");
XMLOutputter xmlOut = new XMLOutputter(noNamespaces);
xmlOut.setFormat(format);
return xmlOut.outputString(element);
}
Here's a full program working with your input XML:
import java.io.IOException;
import java.io.Writer;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.Namespace;
import org.jdom2.filter.Filters;
import org.jdom2.input.SAXBuilder;
import org.jdom2.output.Format;
import org.jdom2.output.XMLOutputter;
import org.jdom2.output.support.AbstractXMLOutputProcessor;
import org.jdom2.output.support.FormatStack;
import org.jdom2.output.support.XMLOutputProcessor;
import org.jdom2.xpath.XPathExpression;
import org.jdom2.xpath.XPathFactory;
public class JDOMEray {
public static void main(String[] args) throws JDOMException, IOException {
Document eray = new SAXBuilder().build("eray.xml");
Namespace[] NAMESPACES = {Namespace.getNamespace("gl-cor", "http://www.xbrl.org/int/gl/cor/2006-10-25")};
XPathExpression<Element> xpath = XPathFactory.instance().compile("gl-cor:entityInformation", Filters.element(), null, NAMESPACES);
Element innerElement = xpath.evaluateFirst(eray.getRootElement());
System.out.println(toString(innerElement));
}
private static final XMLOutputProcessor noNamespaces = new AbstractXMLOutputProcessor() {
@Override
protected void printNamespace(final Writer out, final FormatStack fstack,
final Namespace ns) throws IOException {
// do nothing with printing Namespaces....
}
};
public static String toString(Element element) {
Format format = Format.getPrettyFormat();
format.setTextMode(Format.TextMode.NORMALIZE);
format.setEncoding("UTF-8");
XMLOutputter xmlOut = new XMLOutputter(noNamespaces);
xmlOut.setFormat(format);
return xmlOut.outputString(element);
}
}
For me the above program outputs:
<gl-cor:entityInformation>
<gl-bus:accountantInformation>...............</gl-bus:accountantInformation>
</gl-cor:entityInformation>

Moxy's getValueByXPath returns null for all but root element

See my sscce.
Judging by the examples, I should be able to use MOXy's getValueByXPath to access a child element of an unmarshalled XML object. Instead, I always get null back. Attributes on the root object are accessible.
When I run the example in this question's answer, it works fine :/ Here's what I'm doing:
xml:
<?xml version="1.0" encoding="UTF-8"?>
<OTA_HotelInvCountNotifRQ xmlns="http://www.opentravel.org/OTA/2003/05" AltLangID="alt lang id fnord">
<Inventories AreaID="areaID_fnord">
<Inventory>
<UniqueID ID="inventory unique id fnord"/>
</Inventory>
</Inventories>
</OTA_HotelInvCountNotifRQ>
java:
import org.eclipse.persistence.jaxb.JAXBContext;
import org.eclipse.persistence.jaxb.JAXBContextFactory;
....
OTAHotelInvCountNotifRQ rq = ...
JAXBContext ctx = (JAXBContext) JAXBContextFactory.createContext("org.opentravel.ota._2003._05", Main.class.getClassLoader());
String altLangId = ctx.getValueByXPath(rq, "@AltLangID", null, String.class);
assertThat("rq's altlang attr", altLangId, is(ALT_LANG_ID));
InvCountType inventories = ctx.getValueByXPath(rq, "Inventories", null, InvCountType.class);
assertThat("inventories", inventories, is(not(nullValue())));
I have a runnable simple self-contained complete example (mvn exec:java). I'm not able to change the OTA classes (I generated them from xsd and included them for convenience).
Any ideas why this is returning null instead of the expected object?

Since your XML document is namespace qualified, you need to namespace qualify your XPath as well. You then provide the prefix-to-namespace mappings through an instance of NamespaceResolver, which is passed as a parameter to the getValueByXPath method. The attribute lookup worked without this because an unprefixed attribute is not in the default namespace.
import java.io.File;
import javax.xml.bind.*;
import org.eclipse.persistence.jaxb.JAXBHelper;
import org.eclipse.persistence.oxm.NamespaceResolver;
import org.opentravel.ota._2003._05.*;
public class Demo {
public static void main(String[] args) throws Exception {
JAXBContext jc = JAXBContext.newInstance("org.opentravel.ota._2003._05", ObjectFactory.class.getClassLoader(), null);
Unmarshaller unmarshaller = jc.createUnmarshaller();
File xml = new File("input.xml");
OTAHotelInvCountNotifRQ rq = (OTAHotelInvCountNotifRQ) unmarshaller.unmarshal(xml);
NamespaceResolver nsResolver = new NamespaceResolver();
nsResolver.put("ns", "http://www.opentravel.org/OTA/2003/05");
InvCountType inventories = JAXBHelper.getJAXBContext(jc).getValueByXPath(rq, "ns:Inventories", nsResolver, InvCountType.class);
System.out.println(inventories);
}
}
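The same rule can be observed with the XPath engine that ships with the JDK, without MOXy. The sketch below is an analogy under stated assumptions, not MOXy's implementation: it evaluates an unqualified and a qualified path against a shortened copy of the document from the question. The class NamespaceXPathDemo, the prefix ns, and the select helper are hypothetical names.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, for illustration only.
public class NamespaceXPathDemo {

    static final String OTA_NS = "http://www.opentravel.org/OTA/2003/05";

    // Shortened copy of the document from the question.
    static final String XML =
          "<OTA_HotelInvCountNotifRQ xmlns=\"" + OTA_NS + "\" AltLangID=\"alt lang id fnord\">"
        + "<Inventories AreaID=\"areaID_fnord\"/>"
        + "</OTA_HotelInvCountNotifRQ>";

    // Maps the prefix "ns" (any prefix would do) to the OTA namespace.
    static final NamespaceContext NS = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            return "ns".equals(prefix) ? OTA_NS : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String uri) {
            return OTA_NS.equals(uri) ? "ns" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String uri) {
            String prefix = getPrefix(uri);
            return prefix == null
                    ? Collections.<String>emptyIterator()
                    : Collections.singleton(prefix).iterator();
        }
    };

    /** Evaluates the expression relative to the root element; returns null if nothing matches. */
    static Node select(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // without this, namespaces are not tracked at all
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(NS);
            return (Node) xpath.evaluate(expression, doc.getDocumentElement(), XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("unqualified element found: " + (select("Inventories") != null));
        System.out.println("qualified element found:   " + (select("ns:Inventories") != null));
        System.out.println("unqualified attribute:     " + select("@AltLangID").getNodeValue());
    }
}
```

The unqualified element path finds nothing, the prefixed one finds the element, and the attribute resolves without a prefix, which matches the behaviour described in the question.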

xstream deserialization returns package.class@7dccc2. I assume my aliasing is incorrect?

I am trying to use a simple XML file to store data. I know it's overkill, but I thought I could learn a bit about XML at the same time.
I am trying to read the value 1, in the following xml file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?><invoiceNo>1</invoiceNo>
The getter/setter class for the XML data is:
package com.InvoiceToAccounts.swt;
public class XML_Log {
public String invoiceNo;
public void setNewInvoiceNo(String invoiceNo) {
this.invoiceNo = invoiceNo;
}
public String getNewInvoiceNo() {
return invoiceNo;
}
}
My class to read it is:
package com.InvoiceToAccounts.swt;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import com.thoughtworks.xstream.*;
import com.thoughtworks.xstream.io.xml.DomDriver;
public class XML_Reader {
public String XRead(String logLocation) {
XStream xs = new XStream(new DomDriver());
XML_Log only1 = new XML_Log();
xs.alias("invoiceNo", XML_Log.class);//Alias
try {
FileInputStream fis = new FileInputStream(logLocation);//e.g."c:/temp/employeedata.txt"
xs.fromXML(fis, only1);
//print the data from the object that has been read
System.out.println(only1.toString());
} catch (FileNotFoundException ex) {
ex.printStackTrace();
}
return (only1.toString());
}
}
And finally I call the xml reader from a main with:
//read log
XML_Reader log1 = new XML_Reader();
String last_Invoice_No=log1.XRead("I:\\Invoice_Log.xml");
System.out.println("last_Invoice_no: " + last_Invoice_No);
My problem is the output that last_Invoice_No receives is:
last_Invoice_no: com.InvoiceToAccounts.swt.XML_Log@7dccc2
Therefore I assume it is something I am doing in the XML_Reader class?
I have read the tutorials on aliases and assumed I had it correct?
Thanks for any help in advance.

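com.InvoiceToAccounts.swt.XML_Log@7dccc2
This is the output of the default Object.toString(): the class name, an "@", and the hash code in hexadecimal. It means you haven't given your XML_Log class a meaningful toString implementation. You cannot infer anything else from it.
Try adding this method to XML_Log: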
public String toString() {
return "XML_Log{ invoiceNo=" + invoiceNo + "}";
}
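A small, self-contained sketch of the difference, independent of XStream. The two nested classes are hypothetical stand-ins for XML_Log before and after adding the method:

```java
// Hypothetical demo class, for illustration only.
public class ToStringDemo {

    /** Stand-in for XML_Log as posted: inherits Object.toString(). */
    public static class WithoutToString {
        public String invoiceNo = "1";
    }

    /** Stand-in for XML_Log with the suggested method added. */
    public static class WithToString {
        public String invoiceNo = "1";

        @Override
        public String toString() {
            return "XML_Log{ invoiceNo=" + invoiceNo + "}";
        }
    }

    public static void main(String[] args) {
        // Prints the class name, '@', and a hex hash code that varies between runs.
        System.out.println(new WithoutToString());
        // Prints the field value through the overridden method.
        System.out.println(new WithToString());
    }
}
```

In the original reader class, returning only1.getNewInvoiceNo() instead of only1.toString() would be another way to get at the value itself.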

How to generate CDATA block using JAXB?

I am using JAXB to serialize my data to XML. The class code is simple as given below. I want to produce XML that contains CDATA blocks for the value of some Args. For example, current code produces this XML:
<command>
<args>
<arg name="test_id">1234</arg>
<arg name="source">&lt;html&gt;EMAIL&lt;/html&gt;</arg>
</args>
</command>
I want to wrap the "source" arg in CDATA such that it looks like below:
<command>
<args>
<arg name="test_id">1234</arg>
<arg name="source"><![CDATA[<html>EMAIL</html>]]></arg>
</args>
</command>
How can I achieve this in the below code?
@XmlRootElement(name="command")
public class Command {
    @XmlElementWrapper(name="args")
    protected List<Arg> arg;
}
@XmlRootElement(name="arg")
public class Arg {
    @XmlAttribute
    public String name;
    @XmlValue
    public String value;
    public Arg() {}
    static Arg make(final String name, final String value) {
        Arg a = new Arg();
        a.name = name;
        a.value = value;
        return a;
    }
}

Note: I'm the EclipseLink JAXB (MOXy) lead and a member of the JAXB (JSR-222) expert group.
If you are using MOXy as your JAXB provider then you can leverage the @XmlCDATA extension:
package blog.cdata;
import javax.xml.bind.annotation.XmlRootElement;
import org.eclipse.persistence.oxm.annotations.XmlCDATA;
@XmlRootElement(name="c")
public class Customer {
    private String bio;
    @XmlCDATA
    public void setBio(String bio) {
        this.bio = bio;
    }
    public String getBio() {
        return bio;
    }
}
For More Information
http://bdoughan.blogspot.com/2010/07/cdata-cdata-run-run-data-run.html
http://blog.bdoughan.com/2011/05/specifying-eclipselink-moxy-as-your.html

Use JAXB's Marshaller#marshal(ContentHandler) to marshal into a ContentHandler object. Simply override the characters method on the ContentHandler implementation you are using (e.g. JDOM's SAXHandler, Apache's XMLSerializer, etc):
public class CDataContentHandler extends (SAXHandler|XMLSerializer|Other...) {
// see http://www.w3.org/TR/xml/#syntax
private static final Pattern XML_CHARS = Pattern.compile("[<>&]");
public void characters(char[] ch, int start, int length) throws SAXException {
boolean useCData = XML_CHARS.matcher(new String(ch,start,length)).find();
if (useCData) super.startCDATA();
super.characters(ch, start, length);
if (useCData) super.endCDATA();
}
}
This is much better than using the XMLSerializer.setCDataElements(...) method because you don't have to hardcode any list of elements. It automatically outputs CDATA blocks only when one is required.

Solution Review:
The answer of fred is just a workaround. It fails when the content is validated because the Marshaller is linked to a Schema: you only modify the string literal and do not create real CDATA sections. So if you merely rewrite the String from foo to <![CDATA[foo]]>, Xerces sees a string of length 15 instead of 3.
The MOXy solution is implementation specific and does not work with the JDK classes alone.
The getSerializer solution refers to the deprecated XMLSerializer class.
The LSSerializer solution is just a pain.
I modified the solution of a2ndrade by using an XMLStreamWriter implementation. This solution works very well.
XMLOutputFactory xof = XMLOutputFactory.newInstance();
XMLStreamWriter streamWriter = xof.createXMLStreamWriter( System.out );
CDataXMLStreamWriter cdataStreamWriter = new CDataXMLStreamWriter( streamWriter );
marshaller.marshal( jaxbElement, cdataStreamWriter );
cdataStreamWriter.flush();
cdataStreamWriter.close();
This is the CDataXMLStreamWriter implementation. The DelegatingXMLStreamWriter base class simply delegates all method calls to the given XMLStreamWriter implementation.
import java.util.regex.Pattern;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;
/**
* Implementation which is able to decide to use a CDATA section for a string.
*/
public class CDataXMLStreamWriter extends DelegatingXMLStreamWriter
{
private static final Pattern XML_CHARS = Pattern.compile( "[&<>]" );
public CDataXMLStreamWriter( XMLStreamWriter del )
{
super( del );
}
@Override
public void writeCharacters( String text ) throws XMLStreamException
{
boolean useCData = XML_CHARS.matcher( text ).find();
if( useCData )
{
super.writeCData( text );
}
else
{
super.writeCharacters( text );
}
}
}
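The DelegatingXMLStreamWriter base class is not shown in this answer. As a stand-in, the sketch below gets the same delegation with a java.lang.reflect.Proxy, so that only writeCharacters(String) needs special handling. It is an illustration under assumptions, not the answer author's class: CDataProxyDemo, cdataAware and render are hypothetical names, and it writes one element by hand instead of going through a JAXB Marshaller.

```java
import java.io.StringWriter;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.regex.Pattern;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class, for illustration only.
public class CDataProxyDemo {

    private static final Pattern XML_CHARS = Pattern.compile("[&<>]");

    /** Wraps a writer so that text containing &, < or > is emitted as a CDATA section. */
    static XMLStreamWriter cdataAware(XMLStreamWriter delegate) {
        return (XMLStreamWriter) Proxy.newProxyInstance(
                CDataProxyDemo.class.getClassLoader(),
                new Class<?>[] { XMLStreamWriter.class },
                (proxy, method, args) -> {
                    // args is null for no-argument methods such as flush().
                    boolean isTextCall = method.getName().equals("writeCharacters")
                            && args != null && args.length == 1 && args[0] instanceof String;
                    try {
                        if (isTextCall && XML_CHARS.matcher((String) args[0]).find()) {
                            delegate.writeCData((String) args[0]);
                            return null;
                        }
                        // Everything else is passed through unchanged.
                        return method.invoke(delegate, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // surface the writer's own exception
                    }
                });
    }

    /** Writes <arg>text</arg> through the wrapped writer and returns the resulting XML. */
    static String render(String text) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer =
                    cdataAware(XMLOutputFactory.newInstance().createXMLStreamWriter(out));
            writer.writeStartElement("arg");
            writer.writeCharacters(text);
            writer.writeEndElement();
            writer.flush();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("1234"));
        System.out.println(render("<html>EMAIL</html>"));
    }
}
```

One caveat that applies to this sketch and probably to the class above as well: text that itself contains ]]> cannot go into a single CDATA section and would need to be split or escaped.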

Here is the code sample referenced by the site mentioned above:
import java.io.File;
import java.io.StringWriter;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.apache.xml.serialize.OutputFormat;
import org.apache.xml.serialize.XMLSerializer;
import org.w3c.dom.Document;
public class JaxbCDATASample {
public static void main(String[] args) throws Exception {
// unmarshal a doc
JAXBContext jc = JAXBContext.newInstance("...");
Unmarshaller u = jc.createUnmarshaller();
Object o = u.unmarshal(...);
// create a JAXB marshaller
Marshaller m = jc.createMarshaller();
// get an Apache XMLSerializer configured to generate CDATA
XMLSerializer serializer = getXMLSerializer();
// marshal using the Apache XMLSerializer
m.marshal(o, serializer.asContentHandler());
}
private static XMLSerializer getXMLSerializer() {
// configure an OutputFormat to handle CDATA
OutputFormat of = new OutputFormat();
// specify which of your elements you want to be handled as CDATA.
// The use of the '^' between the namespaceURI and the localname
// seems to be an implementation detail of the xerces code.
// When processing xml that doesn't use namespaces, simply omit the
// namespace prefix as shown in the third CDataElement below.
of.setCDataElements(
new String[] { "ns1^foo", // <ns1:foo>
"ns2^bar", // <ns2:bar>
"^baz" }); // <baz>
// set any other options you'd like
of.setPreserveSpace(true);
of.setIndenting(true);
// create the serializer
XMLSerializer serializer = new XMLSerializer(of);
serializer.setOutputByteStream(System.out);
return serializer;
}
}

For the same reasons as Michael Ernst I wasn't that happy with most of the answers here. I could not use his solution as my requirement was to put CDATA tags in a defined set of fields - as in raiglstorfer's OutputFormat solution.
My solution is to marshal to a DOM document, and then do a null XSL transform to do the output. Transformers allow you to set which elements are wrapped in CDATA tags.
Document document = ...
jaxbMarshaller.marshal(jaxbObject, document);
Transformer nullTransformer = TransformerFactory.newInstance().newTransformer();
nullTransformer.setOutputProperty(OutputKeys.INDENT, "yes");
nullTransformer.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "myElement {myNamespace}myOtherElement");
nullTransformer.transform(new DOMSource(document), new StreamResult(writer/stream));
Further info here: http://javacoalface.blogspot.co.uk/2012/09/outputting-cdata-sections-with-jaxb.html
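A runnable sketch of the null-transform step using only JDK classes. To stay self-contained it parses a small XML string into the DOM document instead of marshalling a JAXB object into it; that substitution, the element names, and the class name CDataTransformDemo are assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, for illustration only.
public class CDataTransformDemo {

    // Stands in for the DOM document that the JAXB marshaller would have filled.
    static final String INPUT =
          "<command><test_id>1234</test_id>"
        + "<source>&lt;html&gt;EMAIL&lt;/html&gt;</source></command>";

    /** Serializes the document with an identity transform, wrapping <source> text in CDATA. */
    static String render() {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(INPUT)));

            // A Transformer created without a stylesheet copies its input unchanged.
            Transformer nullTransformer = TransformerFactory.newInstance().newTransformer();
            nullTransformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            // Whitespace-separated list of element names whose text becomes CDATA.
            nullTransformer.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "source");

            StringWriter writer = new StringWriter();
            nullTransformer.transform(new DOMSource(document), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

Only the listed element is affected; test_id is written as ordinary text. The selection is by element name, so it cannot single out one arg element by its name attribute as in the original question.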

The following simple method adds CDATA support to JAXB, which does not support CDATA natively:
Declare a custom simple type CDataString that extends string, to identify the fields that should be handled via CDATA.
Create a custom CDataAdapter that parses and prints the content of a CDataString.
Use JAXB bindings to link CDataString to your CDataAdapter. The CDataAdapter adds the <![CDATA[ ]]> wrapper to CDataStrings at marshal time and removes it at unmarshal time.
Declare a custom character escape handler that does not escape characters when printing CDATA strings, and set it as the Marshaller's CharacterEscapeHandler.
Et voilà: any CDataString element is wrapped in <![CDATA[ ]]> at marshal time, and the wrapper is removed automatically at unmarshal time.

A supplement to @a2ndrade's answer.
I found a class to extend in JDK 8. Note, however, that it lives in a com.sun package. You may want to keep a copy of its code in case the class is removed from a future JDK.
public class CDataContentHandler extends com.sun.xml.internal.txw2.output.XMLWriter {
public CDataContentHandler(Writer writer, String encoding) throws IOException {
super(writer, encoding);
}
// see http://www.w3.org/TR/xml/#syntax
private static final Pattern XML_CHARS = Pattern.compile("[<>&]");
public void characters(char[] ch, int start, int length) throws SAXException {
boolean useCData = XML_CHARS.matcher(new String(ch, start, length)).find();
if (useCData) {
super.startCDATA();
}
super.characters(ch, start, length);
if (useCData) {
super.endCDATA();
}
}
}
How to use:
JAXBContext jaxbContext = JAXBContext.newInstance(...class);
Marshaller marshaller = jaxbContext.createMarshaller();
StringWriter sw = new StringWriter();
CDataContentHandler cdataHandler = new CDataContentHandler(sw,"utf-8");
marshaller.marshal(gu, cdataHandler);
System.out.println(sw.toString());
Result example:
<?xml version="1.0" encoding="utf-8"?>
<genericUser>
<password><![CDATA[dskfj>><<]]></password>
<username>UNKNOWN::UNKNOWN</username>
<properties>
<prop2>v2</prop2>
<prop1><![CDATA[v1><]]></prop1>
</properties>
<timestamp/>
<uuid>cb8cbc487ee542ec83e934e7702b9d26</uuid>
</genericUser>

As of Xerces-J 2.9, XMLSerializer has been deprecated. The suggestion is to replace it with the DOM Level 3 LSSerializer or JAXP's Transformation API for XML. Has anyone tried this approach?

Just a word of warning: according to the documentation of javax.xml.transform.Transformer.setOutputProperty(...), you should use the syntax of qualified names when indicating an element from another namespace. According to the JavaDoc (Java 1.6 rt.jar):
"(...) For example, if a URI and local name were obtained from an element defined with <xyz:foo xmlns:xyz="http://xyz.foo.com/yada/baz.html"/>, then the qualified name would be "{http://xyz.foo.com/yada/baz.html}foo". Note that no prefix is used."
Well, this doesn't work. The implementing class in the Java 1.6 rt.jar, com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl, interprets elements belonging to a different namespace correctly only when they are declared as "http://xyz.foo.com/yada/baz.html:foo", because the implementation parses the value by looking for the last colon. So instead of invoking:
transformer.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "{http://xyz.foo.com/yada/baz.html}foo")
which should work according to JavaDoc, but ends up being parsed as "http" and "//xyz.foo.com/yada/baz.html", you must invoke
transformer.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "http://xyz.foo.com/yada/baz.html:foo")
At least in Java 1.6.

The following code prevents the marshaller from escaping the content, so CDATA markup placed in the values is written out as-is:
Marshaller marshaller = context.createMarshaller();
marshaller.setProperty(Marshaller.JAXB_ENCODING, "UTF-8");
marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
StringWriter stringWriter = new StringWriter();
PrintWriter printWriter = new PrintWriter(stringWriter);
DataWriter dataWriter = new DataWriter(printWriter, "UTF-8", new CharacterEscapeHandler() {
@Override
public void escape(char[] buf, int start, int len, boolean b, Writer out) throws IOException {
out.write(buf, start, len);
}
});
marshaller.marshal(data, dataWriter);
System.out.println(stringWriter.toString());
It will also keep UTF-8 as your encoding.
