Doing DOM Node-to-String transformation, but with namespace issues

So we have an XML Document with custom namespaces. (The XML is generated by software we don't control. It's parsed by a namespace-unaware DOM parser; standard Java7SE/Xerces stuff, but also outside our effective control.) The input data looks like this:
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<MainTag xmlns="http://BlahBlahBlah" xmlns:CustomAttr="http://BlitherBlither">
.... 18 blarzillion lines of XML ....
<Thing CustomAttr:gibberish="borkborkbork" ... />
.... another 27 blarzillion lines ....
</MainTag>
The Document we get is usable and xpath-queryable and traversable and so on.
Converting this Document into a text format for writing out to a data sink uses the standard Transformer approach described in a hundred SO "how do I change my XML Document into a Java string?" questions:
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
StringWriter stringwriter = new StringWriter();
transformer.transform (new DOMSource(theXMLDocument), new StreamResult(stringwriter));
return stringwriter.toString();
and it works perfectly.
But now I'd like to transform individual arbitrary Nodes from that Document into strings. A DOMSource constructor accepts Node pointers just the same as it accepts a Document (and in fact Document is just a subclass of Node, so it's the same API as far as I can tell). So passing in an individual Node in the place of "theXMLDocument" in the snippet above works great... until we get to the Thing.
At that point, transform() throws an exception:
java.lang.RuntimeException: Namespace for prefix 'CustomAttr' has not been declared.
at com.sun.org.apache.xml.internal.serializer.SerializerBase.getNamespaceURI(Unknown Source)
at com.sun.org.apache.xml.internal.serializer.SerializerBase.addAttribute(Unknown Source)
at com.sun.org.apache.xml.internal.serializer.ToUnknownStream.addAttribute(Unknown Source)
......
That makes sense. (The "com.sun.org.apache" is weird to read, but whatever.) It makes sense, because the namespace for the custom attribute was declared at the root node, but now the transformer is starting at a child node and can't see the declarations "above" it in the tree. So I think I understand the problem, or at least the symptom, but I'm not sure how to solve it though.
If this were a String-to-Document conversion, we'd be using a DocumentBuilderFactory instance and could call .setNamespaceAware(false), but this is going in the other direction.
None of the available properties for transformer.setOutputProperty() affect the namespaceURI lookup, which makes sense.
There is no such corresponding setInputProperty or similar function.
The input parser wasn't namespace aware, which is how the "upstream" code got as far as creating its Document to hand to us. I don't know how to hand that particular status flag on to the transforming code, which is what I really would like to do, I think.
I believe it's possible to (somehow) add a xmlns:CustomAttr="http://BlitherBlither" attribute to the Thing node, the same as the root MainTag had. But at that point the output is no longer identical XML to what was read in, even if it "means" the same thing, and the text strings are eventually going to be compared in the future. We wouldn't know if it were needed until the exception got thrown, then we could add it and try again... ick. For that matter, changing the Node would alter the original Document, and this really ought to be a read-only operation.
Advice? Is there some way of telling the Transformer, "look, don't stress your dimwitted little head over whether the output is legit XML in isolation, it's not going to be parsed back in on its own (but you don't know that), just produce the text and let us worry about its context"?

Given your quoted error message "Namespace for prefix 'CustomAttr' has not been declared.",
I'm assuming that your pseudo code is along the lines of:
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<MainTag xmlns="http://BlahBlahBlah" xmlns:CustomAttr="http://BlitherBlither">
.... 18 blarzillion lines of XML ....
<Thing CustomAttr:attributeName="borkborkbork" ... />
.... another 27 blarzillion lines ....
</MainTag>
With that assumption, here's my suggestion:
So you want to extract the "Thing" node from the "big" XML. The standard approach is to use a little XSLT to do that. You prepare the XSL transformation with:
Transformer transformer = transformerFactory.newTransformer(new StreamSource(new File("isolate-the-thing-node.xslt")));
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
transformer.setParameter("elementName", stringWithCurrentThing); // parameterize transformation for each Thing
...
EDIT: @Ti, please note the parameterization instruction above (and below in the XSLT).
The file 'isolate-the-thing-node.xslt' could be a flavour of the following:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:custom0="http://BlahBlahBlah"
xmlns:custom1="http://BlitherBlither"
version="1.0">
<xsl:param name="elementName">to-be-parameterized</xsl:param>
<xsl:output encoding="utf-8" indent="yes" method="xml" omit-xml-declaration="no" />
<xsl:template match="/*" priority="2" >
<!--<xsl:apply-templates select="//custom0:Thing" />-->
<!-- changed to parameterized selection: -->
<xsl:apply-templates select="custom0:*[local-name()=$elementName]" />
</xsl:template>
<xsl:template match="node() | @*" priority="1">
<xsl:copy>
<xsl:apply-templates select="node() | @*" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Hope that gets you over the "Thing" thing :)
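For anyone who wants to try the parameterized stylesheet without creating files, here is a minimal sketch that keeps both the XML and the XSLT in strings. The class name IsolateElementDemo, the extra Other element and the shortened input are made up for illustration; only the standard javax.xml.transform API is used.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class IsolateElementDemo {

    // Hypothetical miniature of the input document from the question.
    static final String XML =
        "<MainTag xmlns=\"http://BlahBlahBlah\" xmlns:CustomAttr=\"http://BlitherBlither\">"
        + "<Other/>"
        + "<Thing CustomAttr:gibberish=\"borkborkbork\"/>"
        + "</MainTag>";

    // Same idea as isolate-the-thing-node.xslt above, inlined as a string.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:custom0=\"http://BlahBlahBlah\">"
        + "<xsl:param name=\"elementName\"/>"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/*\" priority=\"2\">"
        + "<xsl:apply-templates select=\"custom0:*[local-name()=$elementName]\"/>"
        + "</xsl:template>"
        + "<xsl:template match=\"node() | @*\" priority=\"1\">"
        + "<xsl:copy><xsl:apply-templates select=\"node() | @*\"/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Returns the serialized child elements of the root whose local name matches.
    static String isolate(String elementName) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        t.setParameter("elementName", elementName);
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isolate("Thing"));
    }
}
```

Because the stylesheet reads the source namespace-aware, the copied Thing element carries its own namespace declarations, so the text will not be byte-identical to the fragment in the original file.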

I have managed to parse the provided document, get the Thing node and print it without issues.
Take a look at the Working Example:
Node rootElement = d.getDocumentElement();
System.out.println("Whole document: \n");
System.out.println(nodeToString(rootElement));
Node thing = rootElement.getChildNodes().item(1);
System.out.println("Just Thing: \n");
System.out.println(nodeToString(thing));
nodeToString:
private static String nodeToString(Node node) {
StringWriter sw = new StringWriter();
try {
Transformer t = TransformerFactory.newInstance().newTransformer();
t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
t.setOutputProperty(OutputKeys.INDENT, "yes");
t.transform(new DOMSource(node), new StreamResult(sw));
} catch (TransformerException te) {
System.out.println("nodeToString Transformer Exception");
}
return sw.toString();
}
Output:
Whole document:
<?xml version="1.0" encoding="UTF-8"?><MainTag xmlns="http://BlahBlahBlah" xmlns:CustomAttr="http://BlitherBlither">
<Thing CustomAttr="borkborkbork"/>
</MainTag>
Just Thing:
<?xml version="1.0" encoding="UTF-8"?><Thing CustomAttr="borkborkbork"/>
When I try the same code with CustomAttr:attributeName, as suggested by @marty, it fails with the original exception, so it looks like somewhere in your original XML you are prefixing an attribute or node with the CustomAttr namespace prefix.
In that case you can work around the problem by parsing with setNamespaceAware(true), which makes the serializer include the namespace declarations on the Thing node itself:
<?xml version="1.0" encoding="UTF-8"?><Thing xmlns:CustomAttr="http://BlitherBlither" CustomAttr:attributeName="borkborkbork" xmlns="http://BlahBlahBlah"/>
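A compact version of that namespace-aware variant might look like the sketch below. It only helps if you can influence how the Document is parsed, which the question says may not be the case; the class name and the shortened XML are assumptions for the sake of a self-contained example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NamespaceAwareNodeToString {

    // Hypothetical stand-in for the real input document.
    static final String XML =
        "<MainTag xmlns=\"http://BlahBlahBlah\" xmlns:CustomAttr=\"http://BlitherBlither\">"
        + "<Thing CustomAttr:attributeName=\"borkborkbork\"/>"
        + "</MainTag>";

    static String serializeThing() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // The key difference from the failing setup: each node now knows its namespace URI.
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));

        // No whitespace in the sample, so the first child is the Thing element.
        Node thing = doc.getDocumentElement().getFirstChild();

        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter sw = new StringWriter();
        t.transform(new DOMSource(thing), new StreamResult(sw));
        return sw.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serializeThing());
    }
}
```

The serializer re-declares the prefixes on the fragment, so the result is standalone XML but not textually identical to the original line.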

Related

Change xml namespace url in Java

I have a java REST API and we recently changed domain. The api is versioned although up to now this has involved adding removing elements across the versions.
I would like to change the namespaces if someone goes back to previous versions but I am struggling. I have realised now, after some hacking about, that it is probably because I am changing the namespace of the xml that is actually being referenced. I was thinking of it as a text document but I guess the tool is not ?
So looking at this XML with the namespace URL veg.com ->
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<ns2:apple xmlns:ns2="http://veg.com/app/api/apple" xmlns:ns1="http://veg.com/app/api" xmlns:ns3="http://veg.com/app/api/apple/red"
xmlns:ns4="http://veg.com/app/banana" xmlns:ns5="http://veg.com/app/api/pear" xmlns:ns6="http://veg.com/app/api/orange"
ns1:created="2016-05-23T16:47:55+01:00" ns1:href="http://falseserver:8080/app/api/apple/1" ns1:id="1">
<ns2:name>granny smith</ns2:name>
<ns2:flavour>sweet</ns2:flavour>
<ns2:origin>southwest region</ns2:origin>
...
</ns2:apple>
I would like to change the namespaces to fruit.com. This is a very hacky unit test which shows the broad approach that I have been trying...
@Test
public void testNamespaceChange() throws Exception {
Document appleDoc = load("apple.xml");
XPath xpath = XPathFactory.newInstance().newXPath();
org.w3c.dom.Node node = (org.w3c.dom.Node) xpath.evaluate("//*[local-name()='apple']", appleDoc , XPathConstants.NODE);
NamedNodeMap nodeMap = node.getAttributes();
for (int i = 0; i < nodeMap.getLength(); i++) {
if (nodeMap.item(i).getNodeName().startsWith("xmlns:ns")) {
nodeMap.item(i).setTextContent( nodeMap.item(i).getNodeValue().replace( "veg.com", "fruit.com"));
}
}
//Check values have been set
for (int i = 0; i < nodeMap.getLength(); i++) {
System.out.println(nodeMap.item(i).getNodeName());
System.out.println(nodeMap.item(i).getNodeValue());
System.out.println("----------------");
}
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.transform(new DOMSource(node), result);
System.out.println("XML IN String format is: \n" +
writer.toString());
}
So the result of this is that the loop of nodeMap items shows the updates taking hold
i.e. all updated along these lines
xmlns:ns1
http://fruit.com/app/api
-------------------------------------------
xmlns:ns2
http://fruit.com/app/api/apple
-------------------------------------------
xmlns:ns3
http://fruit.com/app/api/apple/red
-------------------------------------------
...
but when I print out the transformed document I get what I see in the API response...
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<ns2:apple xmlns:ns2="http://veg.com/app/api/apple" xmlns:ns1="http://veg.com/app/api" xmlns:ns3="http://fruit.com/app/api/apple/red"
xmlns:ns4="http://fruit.com/app/banana" xmlns:ns5="http://fruit.com/app/api/pear" xmlns:ns6="http://fruit.com/app/api/orange"
ns1:created="2016-05-23T16:47:55+01:00" ns1:href="http://falseserver:8080/app/api/apple/1" ns1:id="1">
The sibling (and further down the hierarchy) namespaces have been changed but ns1 and ns2 have remained unchanged.
Can anyone tell me why and whether there is a simple way for me to update them ? I guess the next step for me might be to stream the xml doc into a string, update them as text and then reload it as an xml document but I'm hoping I'm being defeatist and there is a more elegant solution ?

I would solve it with an XSLT like this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="*[namespace-uri()='http://veg.com/app/api/apple']" priority="1">
<xsl:element name="{local-name()}" namespace="http://fruit.com/app/api/apple">
<xsl:apply-templates select="@*|node()"/>
</xsl:element>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
This stylesheet combines the identity transform with a template which changes namespace of elements in http://veg.com/app/api/apple to http://fruit.com/app/api/apple.
I think it is much simpler than the Java code that you have. You'd also be more flexible, should you find out you have more differences between versions of your XML apart from just namespaces.
Please consider this to be a rough sketch. I wrote a book on XSLT some 15 years ago, but did not use XSLT for more than 6 or 7 years.
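To try the idea from Java, a sketch along these lines should do. It is a cut-down illustration, not the full API document: the class name and the two-element sample are made up, only one of the six namespaces is remapped, and the element name is written as the attribute value template {name()} so that the original ns2 prefix is kept.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class NamespaceRenameDemo {

    // Hypothetical reduced sample with a single namespace.
    static final String XML =
        "<ns2:apple xmlns:ns2=\"http://veg.com/app/api/apple\">"
        + "<ns2:name>granny smith</ns2:name>"
        + "</ns2:apple>";

    // Identity transform plus one template that re-creates elements in the new namespace.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"*[namespace-uri()='http://veg.com/app/api/apple']\">"
        + "<xsl:element name=\"{name()}\" namespace=\"http://fruit.com/app/api/apple\">"
        + "<xsl:apply-templates select=\"@*|node()\"/>"
        + "</xsl:element>"
        + "</xsl:template>"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String rename() throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rename());
    }
}
```

For the real document you would need one such template per namespace (or a lookup from old to new URI), and attributes in the ns1 namespace would need an analogous xsl:attribute template, since the identity template copies them with their old namespace.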

Accessing unparsed entities in XSLT with a SAXTransformerFactory and TransformerHandlers

I have some trouble while retrieving unparsed entity URIs, with the XPath function unparsed-entity-uri().
I'm using a SAXTransformerFactory as in the "Efficient XSLT pipeline in Java" question, because I need to perform a chain of transformations (i.e. apply several XSLT transformations, using the result of one transformation as the input for the next).
I discovered I'm unable to retrieve unparsed entities with the code below. Actually it works well with Xalan, but not with Saxon-HE (version 9.7.0), and I need Saxon because I'd rather use XSLT 2.0 (even though there's nothing specific to XSLT 2 in the code below; it's only for the sake of providing an example). It also works with Saxon if I don't use a TransformerHandler, e.g. stf.newTransformer(new StreamSource("transfo.xsl")).transform(new StreamSource("input.xml"), new StreamResult(System.out)) will produce the desired output.
Is there a configuration step that I forgot?
// use "org.apache.xalan.processor.TransformerFactoryImpl" for Xalan
String transformerFactoryClassName = "net.sf.saxon.TransformerFactoryImpl";
SAXTransformerFactory stf = (SAXTransformerFactory) TransformerFactory.newInstance(transformerFactoryClassName,
LaunchSimpleTransformationUnparsedEntities.class.getClassLoader());
try {
TransformerHandler thTransf = stf
.newTransformerHandler(new StreamSource("transfo.xsl"));
// output the result in console
thTransf.setResult(new StreamResult(System.out));
// Launch transformation of input.xml
Transformer t = stf.newTransformer();
t.transform(new StreamSource("input.xml"),
new SAXResult(thTransf));
} catch (TransformerConfigurationException e) {
e.printStackTrace();
} catch (TransformerException e) {
e.printStackTrace();
}
In input, I have (for input.xml):
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE book
[<!ENTITY cover_hadrien SYSTEM "images/covers/cover_hadrien.jpg" NDATA jpeg>]>
<book>
<title>Les mémoires d'Hadrien</title>
<author>Marguerite Yourcenar</author>
<cover imgref="cover_hadrien" />
</book>
and a sample XSLT (for transfo.xsl):
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:template match="cover">
<xsl:copy>
<xsl:value-of select="unparsed-entity-uri(@imgref)"/>
</xsl:copy>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
as a result, I would expect something like:
<?xml version="1.0" encoding="UTF-8"?><book>
<title>Les mémoires d'Hadrien</title>
<author>Marguerite Yourcenar</author>
<cover>images/covers/cover_hadrien.jpg</cover>
</book>
but <cover> is empty when performing the transformation with Saxon.

Interesting observation. The issue in fact is not with Saxon's TransformerHandler, but rather with the "identity transformer" obtained using SAXTransformerFactory.newTransformer(): the identity transformer is not passing unparsed entities down the line. This is essentially because Saxon's identity transformer is reusing parts of the XSLT engine, and XSLT does not provide any way for a transformation to output unparsed entities in the result. If you sent the SAX parser output directly to the TransformerHandler, rather than going via an identity transformer, then I think it would all work.
As with all things JAXP-related, the specification of SAXTransformerFactory.newTransformer() is infuriatingly vague. All it says is that the returned Transformer performs a copy of the Source to the Result. i.e. the "identity transform". What exactly counts as a copy? I think Saxon's interpretation has been that it is equivalent to the effect of doing an XSLT identity transform - which would lose unparsed entities (as well as other things like CDATA sections, the DTD, etc).
Incidentally XSLT 2.0 specifies that the result of unparsed-entity-uri() should be an absolute URI (XSLT 1.0 doesn't say anything on the subject) so even if this is fixed, the Saxon output will be different.
Entered as a Saxon issue here: https://saxonica.plan.io/issues/3201 I think we need to be a bit careful about passing unparsed entities to a SAXResult if we don't pass all the other events expected by a SAX DTDHandler - and we're certainly not going to change the Saxon identity transformer to retain things (like DTD declarations) that aren't modelled in XDM.

Indeed, following @MichaelKay's details, launching the transformation that way works properly:
// launch transformation of input.xml
XMLReader reader = XMLReaderFactory.createXMLReader();
reader.setContentHandler(thTransf);
reader.setDTDHandler(thTransf);
reader.parse(new InputSource("input.xml"));
(This replaces the following lines:
// Launch transformation of input.xml
Transformer t = stf.newTransformer();
t.transform(new StreamSource("input.xml"),
new SAXResult(thTransf));
which were used initially.)
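Put together as one self-contained program, the direct parser-to-handler wiring could look roughly like this. To keep it runnable without extra jars it uses whatever TransformerFactory the JDK provides (not Saxon), gets the XMLReader from SAXParserFactory, and keeps input and stylesheet in strings; the class name and those substitutions are my own assumptions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class DirectSaxPipelineDemo {

    static final String INPUT =
        "<!DOCTYPE book [<!ENTITY cover_hadrien SYSTEM"
        + " \"images/covers/cover_hadrien.jpg\" NDATA jpeg>]>"
        + "<book><title>Les m\u00e9moires d'Hadrien</title>"
        + "<author>Marguerite Yourcenar</author>"
        + "<cover imgref=\"cover_hadrien\"/></book>";

    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        + "<xsl:template match=\"cover\">"
        + "<xsl:copy><xsl:value-of select=\"unparsed-entity-uri(@imgref)\"/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String run() throws Exception {
        // Assumes the default factory supports the SAX extensions (the JDK one does).
        SAXTransformerFactory stf = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler =
            stf.newTransformerHandler(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        // Content events and DTD events (unparsed entity declarations) go
        // straight to the handler, with no identity transformer in between.
        reader.setContentHandler(handler);
        reader.setDTDHandler(handler);
        reader.parse(new InputSource(new StringReader(INPUT)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```

The cover element should then contain the entity's system identifier; depending on the parser and processor it may be resolved to an absolute URI rather than the relative path written in the DTD.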

In JAXB or Xstream is it possible to Filter out certain Child Elements on Type/value during unmarshall

Hope everyone is well, quick question to see if anyone has any feedback.
I was experimenting with both JaxB and Xstream over last two days. I was basically using the XML libraries to marshal and unmarshal XML to / from Java objects. Now this was a very simple task which I got working very quickly. However, the XML I want to unmarshal into a list of Java objects is very long and contains many child elements that could be ignored and not put into the list of java objects.
For example the xml would look similar to:
<?xml version="1.0" encoding="UTF-8"?>
<Tables>
<Table1>
<TYPE>Test1</TYPE>
<DATE>2014-01-16</DATE>
<FLAG>True</FLAG>
</Table1>
<Table1>
<TYPE>Test2</TYPE>
<DATE>2014-01-15</DATE>
<FLAG>False</FLAG>
</Table1>
<Table1>
<TYPE>Test1</TYPE>
<DATE>2014-01-14</DATE>
<FLAG>True</FLAG>
</Table1>
</Tables>
So I would like the library to iterate through all the xml elements and unmarshal into a list of java objects which so far works, however as it iterates I would like to add additional functionality to check the Type and Flag element values, if TYPE value equals Test2 and or if Flag value equals False to ignore this child element all together and not include it in the finished list of Java objects. Does anyone know if this is possible with either JaxB or Xstream? Alternatively, can anyone suggest maybe a better approach to accomplish this which requires minimum code and manual parsing.
I have been looking at ValidationEventHandler and XmlAdapter in JaxB but I do not think these will allow me to do what I want. I got close with the XmlAdapter; however, the unmarshal has to return either null or an object for each xml child element it processes. It also changed the xml syntax to attribute form, i.e. TYPE = "Test1" etc., which I did not see any way of altering.
Xstream allows you to implement a Converter which has a canConvert method; however, this only works on Class type, and not on the child element type which I want to check for each child element. I had a look at the MapperWrapper wrapMapper method, which can be overridden in Xstream, but it only shows the element attribute name, i.e. FLAG, and does not show the value. Even if it did show the value, I do not see any way of telling the function to ignore the child root element and all attributes for said child.
Anyway, that's my two cents. Any advice?

If you choose EclipseLink MOXy as your JAXB implementation (rather than the default implementation), you can use annotations on your Java classes for unmarshalling that employ XPath expressions. This could be used to filter out certain input. Here is the link: http://www.eclipse.org/eclipselink/moxy.php
Alternatively, and probably more simple, would be to use the XML transformation API with a stylesheet that has templates which filter out the unwanted content. Please check class javax.xml.bind.util.JAXBResult, which allows you to transform from one source (for example an InputStream or InputReader) directly to Java objects. Think of it as unmarshalling with a transformer in between.
EDIT:
I'll give you a hand with a basic XSLT and some code.
Here's the stylesheet that would do what you describe:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
<xsl:template match="Table1[TYPE = 'Test2' or FLAG = 'False']">
<!-- Don't do anything, since we want to filter these Table1 elements out -->
</xsl:template>
</xsl:stylesheet>
And a code excerpt that can serve as a basis:
//Obtain a TransformerFactory
//Obtain a Source for your stylesheet, like a StreamSource
Transformer transformer = transformerFactory.newTransformer(source);
//Next, create an Unmarshaller from a JAXBContext
Unmarshaller unmarshaller = context.createUnmarshaller();
//Create a JAXBResult with the Unmarshaller
JAXBResult result = new JAXBResult(unmarshaller);
//Obtain a Source for your input XML, and transform
transformer.transform(inputSource, result);
//Get the JAXBElement from the result
final JAXBElement<?> jaxbEl = (JAXBElement<?>)result.getResult();
//And now your unmarshalled Java bean from the JAXBElement
Object bean = jaxbEl.getValue();
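Before wiring in JAXB, it can be worth checking the filtering stylesheet on its own with a plain StreamResult. The sketch below does that with the sample data from the question, assuming the goal stated there (drop Table1 entries whose TYPE is Test2 or whose FLAG is False). The class name is made up, and JAXBResult is deliberately left out so the example needs nothing beyond the JDK.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class FilterTablesDemo {

    static final String XML =
        "<Tables>"
        + "<Table1><TYPE>Test1</TYPE><DATE>2014-01-16</DATE><FLAG>True</FLAG></Table1>"
        + "<Table1><TYPE>Test2</TYPE><DATE>2014-01-15</DATE><FLAG>False</FLAG></Table1>"
        + "<Table1><TYPE>Test1</TYPE><DATE>2014-01-14</DATE><FLAG>True</FLAG></Table1>"
        + "</Tables>";

    // Identity transform plus an empty template that swallows unwanted entries.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"node()|@*\">"
        + "<xsl:copy><xsl:apply-templates select=\"node()|@*\"/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match=\"Table1[TYPE = 'Test2' or FLAG = 'False']\"/>"
        + "</xsl:stylesheet>";

    static String filter() throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(filter());
    }
}
```

Once the output looks right, the StreamResult can be swapped for the JAXBResult shown above so the filtered XML is unmarshalled directly.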

Append an element in a XML using DOM keeping the format

I have an XML file like this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Empleado>
<ConsultorTecnico>
<Nombre>Pablo</Nombre>
<Legajo antiguedad="4 meses">7778</Legajo>
</ConsultorTecnico>
<CNC>
<Nombre>Brian</Nombre>
<Legajo antiguedad="1 año, 7 meses">2134</Legajo>
<Sueldo>4268.0</Sueldo>
</CNC>
</Empleado>
What I want is to read the XML and append "Sueldo" at the same level as "Nombre" and "Legajo" in the element "CNC". "Sueldo" must be "Legajo" x 2.
The code I have appends "Sueldo", as you can see in the XML above, but it does not indent it as it should. I'm using the output properties to indent (this XML is created the same way, using DOM).
public class Main
{
public static void main(String[] args)
{
try
{
File xml = new File("C:\\Empleado.xml");
if (xml.exists() == true)
{
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(xml);
String legajo = doc.getElementsByTagName("Legajo").item(1).getFirstChild().getNodeValue();
Element sueldo = doc.createElement("Sueldo");
Node valorSueldo = doc.createTextNode(String.valueOf(Float.valueOf(legajo)*2));
sueldo.appendChild(valorSueldo);
Node cnc = doc.getElementsByTagName("CNC").item(0);
cnc.appendChild(sueldo);
DOMSource source = new DOMSource(doc);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
t.setOutputProperty(OutputKeys.INDENT, "yes");
t.setOutputProperty("{http://xml.apache.org/xslt}indent-amount","2");
FileOutputStream fos = new FileOutputStream("C:\\Empleado.xml");
StreamResult sr = new StreamResult(fos);
t.transform(source,sr);
}
else
throw new Exception("No hay archivo XML con ese nombre en el directorio");
}
catch (Exception e)
{
System.out.println(e.getMessage());
}
}
}
Thank you in advance guys, I'll appreciate the help here!

Assuming your input file is the same as the output you've shown but without the Sueldo element, then the initial CNC element has five child nodes as far as the DOM is concerned:
The whitespace text node (newline and four spaces) between <CNC> and <Nombre>
The Nombre element node
The whitespace text node (newline and four spaces) between </Nombre> and <Legajo>
The Legajo element node
The whitespace text node (newline and two spaces) between </Legajo> and </CNC>
You are inserting the Sueldo element after this final text node, which produces
<CNC>
<Nombre>Brian</Nombre>
<Legajo antiguedad="1 año, 7 meses">2134</Legajo>
<Sueldo>4268.0</Sueldo></CNC>
and the INDENT output property simply moves the closing </CNC> tag to the next line, aligned with the opening one. To get the auto indentation to do the right thing you would need to remove all the whitespace-only text nodes from the initial tree.
Alternatively, forget the auto-indentation and do it yourself - instead of adding Sueldo as the last child of CNC (after that final text node), instead add a newline-and-four-spaces text node immediately after the Legajo (i.e. before the last text node) and then add the Sueldo element after that.
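The do-it-yourself variant could be sketched like this. It assumes the last child of CNC is the whitespace text node described above and that the file uses two-space indentation steps; the class name and the trimmed-down sample (no attributes, CNC only) are just for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ManualIndentDemo {

    // Hypothetical reduced input, indented the same way as the original file.
    static final String XML =
        "<Empleado>\n"
        + "  <CNC>\n"
        + "    <Nombre>Brian</Nombre>\n"
        + "    <Legajo>2134</Legajo>\n"
        + "  </CNC>\n"
        + "</Empleado>";

    static String addSueldo() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));

        Element cnc = (Element) doc.getElementsByTagName("CNC").item(0);
        String legajo = cnc.getElementsByTagName("Legajo").item(0).getTextContent();

        Element sueldo = doc.createElement("Sueldo");
        sueldo.setTextContent(String.valueOf(Float.parseFloat(legajo) * 2));

        // The last child is the "newline + two spaces" text node before </CNC>.
        Node trailingWhitespace = cnc.getLastChild();
        // Insert our own "newline + four spaces", then the new element, before it.
        cnc.insertBefore(doc.createTextNode("\n    "), trailingWhitespace);
        cnc.insertBefore(sueldo, trailingWhitespace);

        // Serialize without INDENT: the whitespace already in the tree does the job.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(addSueldo());
    }
}
```

This keeps every existing whitespace node untouched, so the rest of the file is written back exactly as it was read.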
As an alternative approach entirely, I would consider doing the transformation in XSLT rather than using the DOM APIs
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<!-- ignore whitespace-only text nodes in the input -->
<xsl:strip-space elements="*"/>
<!-- and re-indent the output -->
<xsl:output method="xml" indent="yes" />
<!-- Copy everything verbatim except where otherwise specified -->
<xsl:template match="@*|node()">
<xsl:copy><xsl:apply-templates select="@*|node()" /></xsl:copy>
</xsl:template>
<!-- For CNC elements, add a Sueldo as the last child -->
<xsl:template match="CNC">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
<Sueldo><xsl:value-of select="Legajo * 2" /></Sueldo>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
which you could either run using the TransformerFactory API from Java code or using a standalone command-line XSLT processor.

XML does not intrinsically define any indentation or pretty-form. If you want it to be "indented", you need to insert content with newlines and spaces. In this case, you need content immediately after element Legajo and before element Sueldo.
To my taste, the best strategy is to ignore all formatting from XML files and use generalized prettyfiers immediately before human consumption. Or, better, give them good XML editors. If you have every program that manipulates XML files concerned about this detail, most of the benefits of XML are gone (and a lot of effort misused).
UPDATE: Just noticed that you are using element CNC to "position" the insert, not Legajo. The space-and-newlines content needs to go immediately before element CNC (and after element Sueldo).

How to make javax Transformer output HTML (no self-closing tags)?

I'm using a javax.xml.transform.Transformer to convert an XML file into an HTML file. It can happen that a div will have no content, which causes the Transformer to output <div/>, which breaks rendering.
I've searched and found that "You can change the xslt output to html instead of xml to avoid the problem with self closing tags", but that was for a different tool and I'm left wondering: how do I do that with a javax Transformer?

It looks like you create the transformer as normal, and then use Transformer.setOutputProperty to set the METHOD property (OutputKeys.METHOD) to "html".
For example:
private static final DocumentBuilderFactory sDocumentFactory;
private static DocumentBuilder sDocumentBuilder;
private static DOMImplementation sDomImplementation;
private static final TransformerFactory sTransformerFactory =
TransformerFactory.newInstance();
private static Transformer sTransformer;
static {
sDocumentFactory = DocumentBuilderFactory.newInstance();
sDocumentFactory.setNamespaceAware( true );
sDocumentFactory.setIgnoringComments( true );
sDocumentFactory.setIgnoringElementContentWhitespace( true );
try {
sDocumentBuilder = sDocumentFactory.newDocumentBuilder();
sDomImplementation = sDocumentBuilder.getDOMImplementation();
sTransformer = sTransformerFactory.newTransformer();
sTransformer.setOutputProperty( OutputKeys.OMIT_XML_DECLARATION, "yes" );
sTransformer.setOutputProperty( OutputKeys.INDENT, "no" );
sTransformer.setOutputProperty( OutputKeys.METHOD, "html" );
} catch( final Exception ex ) {
ex.printStackTrace();
}
}
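A stripped-down way to see the effect is to serialize a document that contains nothing but an empty div. This is only a sketch under the assumption that the JDK's built-in transformer is used; the class name is made up.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

class HtmlOutputDemo {

    static String emptyDivAsHtml() throws Exception {
        // Build a document whose only element is an empty <div>.
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement("div"));

        Transformer t = TransformerFactory.newInstance().newTransformer();
        // With the "html" method, empty non-void elements get an explicit end tag.
        t.setOutputProperty(OutputKeys.METHOD, "html");
        t.setOutputProperty(OutputKeys.INDENT, "no");

        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(emptyDivAsHtml());
    }
}
```

If a different XSLT processor is on the classpath, TransformerFactory.newInstance() may pick it up instead, and its HTML serializer may behave differently.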

The way to output valid HTML with XSLT is to use the <xsl:output> instruction with its method attribute set to html.
Here is a small example:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<div>
<xsl:apply-templates select="x/y/z"/>
</div>
</xsl:template>
</xsl:stylesheet>
when this transformation is applied on the following XML document:
<t/>
the wanted result is produced (the same result is produced by 8 different XSLT processors I am working with):
<div></div>
In case the unwanted output happens only with a specific XSLT processor, then this is an implementation issue with this particular processor and more an "xsltprocessors" than "xslt" question.

This answer in another thread doesn't seem to work in my case; even if I specify <xsl:output method="html"...> it still produces <div/> instead of <div></div>.
I don't know if my IDE or compiler is broken (IBM Rational Application Developer), but I'm using a work-around of detecting blank nodes and inserting single spaces in them. Less clean, but effective...
