Jackson XmlMapper map from child element

I'm using the Jackson XmlMapper to map an XML document to a POJO, but I have the following problem:
My XML looks like this (not the original one, only an example):
<?xml version="1.0" encoding="UTF-8"?>
<result>
<pojo>
<name>test</name>
</pojo>
</result>
The problem is, I don't want to parse the "result" object. I want to parse the pojo element as its own object. Can I do this with XmlMapper?
Thank you!
Artur

You can do it, but you must write some boilerplate code.
You must create an instance of XMLStreamReader to be able to customize how your XML input is read. The next() method advances the reader to the next parsing event. Its behavior depends on the internal rules of the reader, so read the documentation to understand its particularities:
From the Javadoc:
int javax.xml.stream.XMLStreamReader.next() throws XMLStreamException
Get next parsing event - a processor may return all contiguous
character data in a single chunk, or it may split it into several
chunks. If the property javax.xml.stream.isCoalescing is set to true
element content must be coalesced and only one CHARACTERS event must
be returned for contiguous element content or CDATA Sections. By
default entity references must be expanded and reported transparently
to the application. An exception will be thrown if an entity reference
cannot be expanded. If element content is empty (i.e. content is "")
then no CHARACTERS event will be reported.
Given the following XML:
<foo><!--description-->content text<![CDATA[<greeting>Hello</greeting>]]>other content</foo>
The behavior of calling next() when being on foo will be:
1- the comment (COMMENT)
2- then the characters section (CHARACTERS)
3- then the CDATA section (another CHARACTERS)
4- then the next characters section (another CHARACTERS)
5- then the END_ELEMENT
NOTE: empty element (such as <tag/>) will be reported with two
separate events: START_ELEMENT, END_ELEMENT - This preserves parsing
equivalency of empty element to <tag></tag>. This method will throw an
IllegalStateException if it is called after hasNext() returns false.
Returns: the integer code corresponding to the current parse event
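Let me illustrate the way to proceed with a unit test:
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import org.codehaus.stax2.XMLInputFactory2;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;
import org.junit.Assert;
import org.junit.Test;

@Test
public void mapXmlToPojo() throws Exception {
    XMLInputFactory factory = XMLInputFactory2.newFactory();
    InputStream inputFile = MapXmlToPojo.class.getResourceAsStream("pojo.xml");
    XMLStreamReader xmlStreamReader = factory.createXMLStreamReader(inputFile);
    XmlMapper xmlMapper = new XmlMapper();
    // Advance twice: past the document start and past the <result> start tag,
    // so that the reader is at (or just before) <pojo> when Jackson takes over.
    xmlStreamReader.next();
    xmlStreamReader.next();
    Pojo pojo = xmlMapper.readValue(xmlStreamReader, Pojo.class);
    Assert.assertEquals("test", pojo.getName());
}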
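Counting next() calls depends on exactly which events the reader reports before the child element. As a rough sketch of an alternative, the reader can be advanced until it reaches a start tag with a given name. It uses only the JDK's StAX API; the class and method names (ChildElementLocator, skipToElement, readerFor) are made up for this illustration.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;

public class ChildElementLocator {

    /**
     * Advances the reader until it is positioned on the START_ELEMENT event of
     * the first element whose local name equals localName.
     * Returns false if the document ends before such an element is found.
     */
    static boolean skipToElement(XMLStreamReader reader, String localName)
            throws XMLStreamException {
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && localName.equals(reader.getLocalName())) {
                return true;
            }
        }
        return false;
    }

    /** Opens a reader over the given XML text using the JDK's default StAX factory. */
    static XMLStreamReader readerFor(String xml) throws XMLStreamException {
        return XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
    }
}
```

Once skipToElement(reader, "pojo") has returned true, the reader could be handed to xmlMapper.readValue(reader, Pojo.class) in the same way as in the test above.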

Just to add more on this (in order to make it generic): I had a scenario where I had to extract a specific element and map it to a Java object. In this case we can add a conditional check, so that whenever that tag is encountered we pull it out and map it.
I have disabled DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES so that fields which are not needed are ignored, in case our POJO has fewer fields to map than what we are getting from the source.
Below is the code:
// Ignore XML fields that have no counterpart in the POJO.
objectMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
while (xmlStreamReader.hasNext()) {
    // next() already returns the event type; calling nextTag() as well would skip events.
    if (xmlStreamReader.next() == XMLStreamConstants.START_ELEMENT) {
        QName name = xmlStreamReader.getName();
        if ("specific_name".equalsIgnoreCase(name.getLocalPart())) {
            result = objectMapper.readValue(xmlStreamReader, Pojo.class);
            break;
        }
    }
}

Related

Two Xmls input, One output with XSL transform

I'm trying to write an XSL stylesheet that needs to take some values from one XML file and others from another, and output a single XML document. I searched online for a solution and found that I have to put this <xsl:variable name='file' select="'file:///C:/Users/file.xml'"> inside my input XML, which is supposed to load another XML file and store it in a variable, but from there I don't know how to get the tag values of the document.
The file.xml is this one
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<silosMediaObject>
<canBeDeleted>-1</canBeDeleted>
<checkedOut>-1</checkedOut>
<checkedOutBy>-1</checkedOutBy>
<deleted>-1</deleted>
<description>Traccia audio migrata da ASCN</description>
<externalResourcePath>TEST/ASCN/lq/3763_2015-05-05.mp3</externalResourcePath>
<fileName>3763_2015-05-05.mp3</fileName>
<framesPerSecond>-1</framesPerSecond>
<hasScheduledIngestion>false</hasScheduledIngestion>
<isArchived>-1</isArchived>
<isArchiving>-1</isArchiving>
<isAvailable>-1</isAvailable>
<isEncoding>-1</isEncoding>
<isRestoring>-1</isRestoring>
<isVerified>-1</isVerified>
<mediaObjectId>-1</mediaObjectId>
<mediaTypeId>-1</mediaTypeId>
<mosId>4347</mosId>
<resourceIsExternal>-1</resourceIsExternal>
<sourceMediaObjectId>-1</sourceMediaObjectId>
<state>AVAILABLE</state>
<versionLinkId>-1</versionLinkId>
</silosMediaObject>
The Java class I'm using to transform the file is this one:
public class TestMain {
    public static void main(String[] args) throws IOException, URISyntaxException, TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        Source xslt = new StreamSource(new File("C:\\Users\\xmltemplate_transformer.xsl"));
        Transformer transformer = factory.newTransformer(xslt);
        Source text = new StreamSource(new File("C:\\Users\\tobe_transformed.xml"));
        transformer.transform(text, new StreamResult(new File("C:\\Users\\out.xml")));
    }
}

I've searched online for some solution and I found that I've to put this <xsl:variable name='file' select="'file:///C:/Users/file.xml'"> inside my input XML which is supposed to load another XML and store it into a variable
I don't know where you got that idea, but you're confused. The select value is interpreted as an XPath expression. Yours is a string literal containing a URL with the file scheme. As far as XPath or XSLT is concerned, it is just a string. One might do something further to cause the file designated by that URL to be parsed, but what you've presented has no such effect.
In particular, you might have wanted to do this:
<xsl:variable name='file' select="document('file:///C:/Users/file.xml')"/>
The document() function is the secret sauce that actually causes the designated file to be read and parsed (if possible); when used as shown, its result is a node set containing the root node of the resulting document, or an empty node set if the designated document cannot be parsed and the processor elects not to signal an error.
Note: when you say you put the xsl:variable "inside my input XML", I presume you mean at an appropriate place inside your (XML-based) XSL stylesheet. If you actually mean that you have placed it in a different XML data file that you are processing, then it will have no direct effect there, other than to be included, as itself, in the input tree.
but from this I dont know how to get the tags value of the document.
Having successfully parsed the file, you can use the resulting node set anywhere that XSLT expects an expression that evaluates to a node set. In particular, within its scope, you can use a reference to the variable you've defined ($file) as an argument to XPath functions, or as a whole expression, such as the select expression of an xsl:apply-templates. Since you haven't said what, specifically, you want to do with the contents, I cannot be any more specific myself. See what you can do, and if you can't figure out the details then that could be a suitable topic for a new question.
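As a rough illustration of the above, here is a minimal sketch that runs a stylesheet combining values from the transformed document and from a second document loaded with document(). The primary document's shape (/input/title), the output element names, and the class and method names are assumptions made up for the example; only silosMediaObject/fileName comes from the file shown in the question. The second document is written to a temporary file so that the sketch is self-contained.

```java
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class DocumentFunctionSketch {

    /**
     * Transforms primaryXml with a stylesheet that also reads secondaryXml
     * through document(). The XPath expressions are illustrative only.
     */
    static String transform(String primaryXml, String secondaryXml) throws Exception {
        // document() needs a URI, so the second document is written to a temp file.
        Path secondary = Files.createTempFile("secondary", ".xml");
        try {
            Files.write(secondary, secondaryXml.getBytes(StandardCharsets.UTF_8));
            String xsl =
                "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
              + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
              // The variable holds the root node of the second document.
              + "<xsl:variable name='file' select=\"document('" + secondary.toUri() + "')\"/>"
              + "<xsl:template match='/'>"
              + "<out>"
              + "<title><xsl:value-of select='/input/title'/></title>"
              + "<fileName><xsl:value-of select='$file/silosMediaObject/fileName'/></fileName>"
              + "</out>"
              + "</xsl:template>"
              + "</xsl:stylesheet>";
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(primaryXml)),
                    new StreamResult(out));
            return out.toString();
        } finally {
            Files.deleteIfExists(secondary);
        }
    }
}
```

In a real stylesheet the URI would normally be fixed or passed in as a parameter; building it into the stylesheet text here only keeps the example short.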

XML File loses its format after reading and writing in Java

I'm writing a program in Java that reads an XML file, makes some modifications, and then writes the file back in the same format.
The following is the code block that reads and writes the XML file:
final Document fileDocument = parseFileAsDocument(file);
final OutputFormat format = new OutputFormat(fileDocument);
try {
    final FileWriter out = new FileWriter(file);
    final XMLSerializer serializer = new XMLSerializer(out, format);
    serializer.serialize(fileDocument);
}
catch (final IOException e) {
    System.out.println(e.getMessage());
}
This is the method used to parse the file:
private Document parseFileAsDocument(final File file) {
    Document inputDocument = null;
    try {
        inputDocument = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(file);
    }//catching some exceptions{}
    return inputDocument;
}
I'm noticing two changes after the file is written:
Before I had a node similar to this:
<instance ref='filter'>
<value></value>
</instance>
After reading and writing, the node looks like this:
<instance ref="filter">
<value/>
</instance>
As you can see above, 'filter' has been changed to "filter", with double quotes.
The second change is that <value></value> has been changed to <value/>. This happens across the whole XML file wherever there is a node like <tag></tag> with no value in between. A node like <tag>somevalue</tag> is not affected.
Any thoughts on how to keep the format of the XML nodes the same after writing?
I'd appreciate it!

You can't, and you shouldn't try. It's a bit like complaining that when you add 0123 and 0234, you get 357 without the leading zeroes. Leading zeroes in integers aren't considered significant, so arithmetic operations don't preserve them. The same happens to insignificant details of your XML, like the distinction between double quotes and single quotes, and the distinction between a self-closing tag and a start/end tag pair for an empty element. If any consumer of the XML is depending on these details, they need to be sent for retraining.
The most usual reason for asking for lexical details to be preserved is that you want to detect changes. But this means you are doing your comparisons the wrong way: you should be comparing at the logical level, not the physical level. One way to do comparisons is to canonicalize the XML, so whenever there is an arbitrary choice to be made between equivalent representations, it is made the same way.
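To make the idea of comparing at the logical level concrete, here is a small sketch. It does not perform canonicalization; as a simpler stand-in it parses both inputs with the JDK's DOM parser and compares the trees with Node.isEqualNode. The class and method names are made up for the example.

```java
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.StringReader;

public class LogicalXmlCompare {

    /** Parses both strings and compares the resulting trees node by node. */
    static boolean logicallyEqual(String left, String right) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document a = builder.parse(new InputSource(new StringReader(left)));
        Document b = builder.parse(new InputSource(new StringReader(right)));
        // isEqualNode compares names, attributes, values and children,
        // not the original characters of the file.
        return a.getDocumentElement().isEqualNode(b.getDocumentElement());
    }
}
```

With this, the single-quoted <value></value> form and the double-quoted <value/> form from the question compare as equal. Whitespace between elements is still part of the tree, so differently indented files would need whitespace handling on top of this.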

Is startDocument() required with Android's XmlSerializer?

For a certain application (serializing and deserializing an object for transport via XMPP PubSub item payload), I need to create XML fragments, that is, I have to omit the document declaration.
I'm using the org.xmlpull.v1.XmlSerializer class; unfortunately there doesn't seem to be much documentation available on the correct usage of it. At least all documentation I've found on its startDocument() method leaves it unclear whether I can or cannot skip calling this method. At least all examples I've found call this method (but all of them explained just how to create complete XML documents, no fragments).
To give a code example:
XmlSerializer xmlSerializer = Xml.newSerializer();
StringWriter xmlStringWriter = new StringWriter();
try {
    xmlSerializer.setFeature("http://xmlpull.org/v1/doc/features.html#indent-output", true);
    xmlSerializer.setOutput(xmlStringWriter);
    // xmlSerializer.startDocument("UTF-8", true);
    xmlSerializer.startTag(null, "tag-name");
    // ...
    xmlSerializer.endTag(null, "tag-name");
    // xmlSerializer.endDocument();
    xmlSerializer.flush();
} catch (IOException e) {
    // Handle exception
}
String xmlOutputString = xmlStringWriter.toString();
Is this allowed? And if not, is there any other way to generate fragments with XMLSerializer without parsing the output string in order to manually remove the document declaration (e.g. calling startDocument only with null parameters)?

Here is the short answer: no, calling startDocument() is not required, and omitting the call simply skips generating the document declaration.

Parsing 'pseudo' XML (that is, not well formed) in java?

I have some xml that looks like this:
<xml><name>oscar</name><race>puppet</race><class>grouch</class></xml>
The tags change and are variable, so there won't always be a 'name' tag.
I've tried 3 or 4 parsers and they all seem to choke on it. Any hints?

Just because it doesn't have a defined schema, doesn't mean it isn't "valid" XML - your sample XML is "well formed".
The dom4j library will do it for you. Once parsed (your XML will parse OK) you can iterate through child elements, no matter what their tag name, and work with your data.
Here's an example of how to use it:
import java.util.Iterator;
import org.dom4j.*;
String text = "<xml><name>oscar</name><race>puppet</race><class>grouch</class></xml>";
Document document = DocumentHelper.parseText(text);
Element root = document.getRootElement();
for (Iterator i = root.elementIterator(); i.hasNext(); ) {
    Element element = (Element) i.next();
    String tagName = element.getName(); // getQName() returns a QName, not a String
    String contents = element.getText();
    // do something
}
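If adding dom4j is not an option, the same iteration can be sketched with the DOM parser that ships with the JDK. This is only an illustration of the idea, not part of the answer above; the class and method names are made up.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

public class ChildElementReader {

    /** Returns the root's child elements as tag name -> text, in document order. */
    static Map<String, String> readChildren(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Map<String, String> values = new LinkedHashMap<>();
        for (Node child = document.getDocumentElement().getFirstChild();
                child != null; child = child.getNextSibling()) {
            // Skip text and comment nodes; only elements carry a tag name.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                values.put(child.getNodeName(), child.getTextContent());
            }
        }
        return values;
    }
}
```

A map keeps the sketch short, but it means that a repeated tag name overwrites the earlier value; a list of pairs would be needed if tags can repeat.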

This is valid xml; try adding an XML Schema that allows for optional elements. If you can write an xml schema, you can use JAXB to parse it. XML allows for having optional elements; it isn't too "strict" about it.

Your XML sample is well-formed XML, and if anything "chokes" on it then it would be useful for us to know exactly what the symptoms of the "choking" are.

Parsing an XML file without root in Java

I have this XML file which doesn't have a root node. Other than manually adding a "fake" root element, is there any way I would be able to parse an XML file in Java? Thanks.

I suppose you could create a new implementation of InputStream that wraps the one you'll be parsing from. This implementation would return the bytes of the opening root tag before the bytes from the wrapped stream and the bytes of the closing root tag afterwards. That would be fairly simple to do.
I may be faced with this problem too. Legacy code, eh?
Ian.
Edit: You could also look at java.io.SequenceInputStream which allows you to append streams to one another. You would need to put your prefix and suffix in byte arrays and wrap them in ByteArrayInputStreams but it's all fairly straightforward.
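A minimal sketch of the SequenceInputStream idea, assuming the fragment is UTF-8 compatible and does not start with its own XML declaration (which would no longer be at the very beginning once the fake root is prepended). The class name, method name and root element name are chosen for the example.

```java
import org.w3c.dom.Document;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;

public class RootWrapper {

    /** Parses a root-less fragment by surrounding it with <rootName>...</rootName>. */
    static Document parseFragment(InputStream fragment, String rootName) throws Exception {
        InputStream prefix = new ByteArrayInputStream(
                ("<" + rootName + ">").getBytes(StandardCharsets.UTF_8));
        InputStream suffix = new ByteArrayInputStream(
                ("</" + rootName + ">").getBytes(StandardCharsets.UTF_8));
        // SequenceInputStream reads the three streams one after another.
        try (InputStream wrapped = new SequenceInputStream(
                Collections.enumeration(Arrays.asList(prefix, fragment, suffix)))) {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(wrapped);
        }
    }
}
```

The original elements then appear as children of the added root element, so any later processing has to account for that extra level.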

Your XML document needs a root xml element to be considered well formed. Without this you will not be able to parse it with an xml parser.

One way is to provide your own dummy wrapper without touching the original 'xml' (the one that is not well-formed), by declaring it as an external entity and referencing that entity inside the wrapper:
Syntax
<!DOCTYPE some_root_elem SYSTEM "/home/ego/some.dtd"
[
<!ENTITY entity-name "Some value to be inserted at the entity">
]>
Example:
<!DOCTYPE dummy [
<!ENTITY data SYSTEM "http://wherever-my-data-is">
]>
<dummy>
&data;
</dummy>

You could use another parser like Jsoup. It can parse XML without a root.

I think even if any API would have an option for this, it will only return you the first node of the "XML" which will look like a root and discard the rest.
So the answer is probably to do it yourself. Scanner or StringTokenizer might do the trick.
Maybe some html parsers might help, they are usually less strict.

Here's what I did:
There's an old java.io.SequenceInputStream class, which is so old that it takes Enumeration rather than List or such.
With it, you can prepend and append the root element tags (<div> and </div> in my case) around your no-root XML stream. (You shouldn't do it by concatenating Strings due to performance and memory reasons.)
public void tryExtractHighestHeader(ParserContext context)
{
    String xhtmlString = context.getBody();
    if (xhtmlString == null || "".equals(xhtmlString))
        return;
    // The XHTML needs to be wrapped, because it has no root element.
    ByteArrayInputStream divStart = new ByteArrayInputStream("<div>".getBytes(StandardCharsets.UTF_8));
    ByteArrayInputStream divEnd = new ByteArrayInputStream("</div>".getBytes(StandardCharsets.UTF_8));
    ByteArrayInputStream is = new ByteArrayInputStream(xhtmlString.getBytes(StandardCharsets.UTF_8));
    // IteratorEnumeration is not a JDK class; it adapts an Iterator to the Enumeration that SequenceInputStream expects.
    Enumeration<InputStream> streams = new IteratorEnumeration(Arrays.asList(new InputStream[]{divStart, is, divEnd}).iterator());
    try (SequenceInputStream wrapped = new SequenceInputStream(streams)) {
        DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = builderFactory.newDocumentBuilder();
        Document xmlDocument = builder.parse(wrapped);
        // From here you can do whatever you like, but keep in mind the extra element.
        XPath xPath = XPathFactory.newInstance().newXPath();
    }
    catch (Exception e) {
        // Keep the original exception as the cause.
        throw new RuntimeException("Failed parsing XML: " + e.getMessage(), e);
    }
}
