Cutting XML into parts with StAX and processing them with XSLT - java

I'm trying to cut an XML document into parts and then apply some transformations to each part. Currently I have this code:
public class XMLStax_xslt {
static boolean allowStream = false;
public static void main(String[] args) throws Exception {
FileInputStream fis = new FileInputStream("SourceExternalFile.xml");
XMLInputFactory xmlif = null;
xmlif = XMLInputFactory.newInstance();
Source xslt = new StreamSource(new File("myTransformFile.xslt"));
StreamFilter filter = new StreamFilter() {
@Override
public boolean accept(XMLStreamReader reader) {
int eventType = reader.getEventType();
if ( eventType == XMLEvent.START_ELEMENT )
{
String currentTag = reader.getLocalName();
if (currentTag.equals("wantedTag"))
{
allowStream = true;
}
}
if ( eventType == XMLEvent.END_ELEMENT )
{
String currentTag = reader.getLocalName();
if (currentTag.equals("wantedTag"))
{
allowStream = false;
}
}
return allowStream;
}
};
XMLStreamReader xmlR = xmlif.createFilteredReader(xmlif.createXMLStreamReader(fis),filter);
while (xmlR.hasNext())
{
TransformerFactory transformerXSLT = TransformerFactory.newInstance();
Transformer currentXslt = transformerXSLT.newTransformer(xslt);
currentXslt.transform(new StAXSource(xmlR), new StreamResult("targetFile.xml"));
}
fis.close();
}
}
This works when the line return allowStream; is changed to return true;. What I need is to send only the relevant part to the transformation, because sending the whole XML is not an option.
How can I achieve that?
Thanks.

The trouble was that I was passing the string to the transformer instead of the whole node. Replacing XMLStreamReader with XMLEventReader does the trick.
Here's the change:
public static void main(String[] args) throws Exception {
FileInputStream fis = new FileInputStream("SourceExternalFile.xml");
XMLInputFactory xmlif = null;
xmlif = XMLInputFactory.newInstance();
Source xslt = new StreamSource(new File("myTransformFile.xslt"));
XMLEventReader xmlR = xmlif.createXMLEventReader(xmlif.createXMLStreamReader(fis));
TransformerFactory transformerXSLT = TransformerFactory.newInstance();
Transformer currentXslt = transformerXSLT.newTransformer(xslt);
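while (xmlR.hasNext())
{
// Peek instead of consuming: StAXSource expects the reader's next event to be a start tag.
XMLEvent xmlEvent = xmlR.peek();
if ( xmlEvent.isStartElement()
&& "wantedTag".equals(xmlEvent.asStartElement().getName().getLocalPart()) )
{
currentXslt.transform(new StAXSource(xmlR), new StreamResult("targetFile.xml"));
}
else
{
xmlR.nextEvent();
}
}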
xmlR.close();
fis.close();
}
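For comparison, here is a minimal sketch of a different way to hand only one element at a time to the stylesheet: copy the events of each wantedTag element into a small in-memory buffer and transform that buffer. This is not the poster's method; it trades some memory per fragment for not depending on where the transformer leaves the reader. The class name, the sample XML and the sample stylesheet are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.XMLEvent;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of the original post.
class FragmentTransformSketch {

    // Made-up input: two wantedTag elements and one element to skip.
    static final String SAMPLE_XML =
        "<root><other><name>x</name></other>"
        + "<wantedTag><name>a</name></wantedTag>"
        + "<wantedTag><name>b</name></wantedTag></root>";

    // Made-up stylesheet: emits the text of the <name> child of each fragment.
    static final String SAMPLE_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/wantedTag'><xsl:value-of select='name'/></xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms every element named 'tag' on its own and returns one result per element. */
    static List<String> transformEach(String xml, String xslt, String tag) throws Exception {
        XMLEventReader reader =
            XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
        List<String> results = new ArrayList<>();

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (!event.isStartElement()
                    || !tag.equals(event.asStartElement().getName().getLocalPart())) {
                continue; // not the start of a wanted element
            }
            // Copy this element and everything nested in it into a buffer.
            StringWriter fragment = new StringWriter();
            XMLEventWriter writer = outputFactory.createXMLEventWriter(fragment);
            int depth = 0;
            while (true) {
                writer.add(event);
                if (event.isStartElement()) {
                    depth++;
                } else if (event.isEndElement() && --depth == 0) {
                    break; // matching end tag reached
                }
                event = reader.nextEvent();
            }
            writer.flush();
            writer.close();

            // Transform only the buffered fragment.
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(fragment.toString())),
                new StreamResult(out));
            results.add(out.toString());
        }
        reader.close();
        return results;
    }

    public static void main(String[] args) throws Exception {
        for (String result : transformEach(SAMPLE_XML, SAMPLE_XSLT, "wantedTag")) {
            System.out.println(result);
        }
    }
}
```

Only one fragment is held in memory at a time, so this should still cope with a large source file as long as each wantedTag element is reasonably small.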

Related

XML transformation based on XSLT: no exception thrown by Transformer

I convert an XML file to PDF through XSL-FO.
This process involves a Transformer based on an XSLT file.
This is how I get the Transformer:
private void prepareTransformer(final TransformerFactory tFactory)
throws TransformerConfigurationException {
xmlTransformer = tFactory.newTransformer(
new StreamSource(getClass().getResourceAsStream(xsltPath)));
}
My problem is that if the XSLT is not well formed, the code above logs an exception but does not throw it. This is the console output:
System-ID unbekannt; Zeilennummer328; Spaltennummer69; xsl:when ist an dieser Position in der Formatvorlage nicht zulässig!
System-ID unbekannt; Zeilennummer63; Spaltennummer54; org.xml.sax.SAXException: javax.xml.transform.TransformerException: ElemTemplateElement-Fehler: testcaseDetails
javax.xml.transform.TransformerException: ElemTemplateElement-Fehler: testcaseDetails
(Position des Fehlers unbekannt)org.apache.fop.fo.ValidationException: Dem Element "fo:simple-page-master" fehlt ein verlangtes Property "master-name"! (Keine Kontextinformationen verfügbar)
As no exception is thrown by the library, I cannot easily handle this situation.
Expected behaviour: #newTransformer(Source source) would throw an exception. Is there a way to achieve that?
EDIT: here is the full class:
@Log
public class Report {
private final String xsltPath;
private FopFactory fopFactory;
private Transformer xmlTransformer;
private Report(final String xsltPath) {
this.xsltPath = xsltPath;
try {
prepareTransformer(createFobTransformer());
} catch (final SAXException | IOException | ConfigurationException |
TransformerConfigurationException ex) {
log.log(Level.SEVERE, null, ex);
throw new PdfReportException(ex);
}
}
public static Report getInstance() {
return new Report("/report/report-regd.xslt");
}
private void prepareTransformer(final TransformerFactory tFactory)
throws TransformerConfigurationException {
xmlTransformer = tFactory.newTransformer(
new StreamSource(getClass().getResourceAsStream(xsltPath)));
}
private TransformerFactory createFobTransformer()
throws SAXException, IOException, ConfigurationException {
final InputStream inFop = getClass().getResourceAsStream("/report/fop.xconf");
final var fopBuilder = new FopFactoryBuilder(new File(".").toURI(),
new ClasspathResolverURIAdapter());
final var cfgBuilder = new DefaultConfigurationBuilder();
final Configuration cfg = cfgBuilder.build(inFop);
fopBuilder.setConfiguration(cfg);
fopFactory = fopBuilder.build();
final TransformerFactory tFactory = TransformerFactory.newInstance();
tFactory.setURIResolver(new ClasspathURIResolver());
return tFactory;
}
public byte[] generatePDFReport(final String xml) {
final var out = new ByteArrayOutputStream();
final var agent = fopFactory.newFOUserAgent();
try {
final var fop = agent.newFop(MimeConstants.MIME_PDF, out);
final var result = new SAXResult(fop.getDefaultHandler());
final var src = new StreamSource(new StringReader(xml));
xmlTransformer.transform(src, result);
} catch (final FOPException | TransformerException e) {
throw new PdfReportException(e);
}
return out.toByteArray();
}
public static class PdfReportException extends RuntimeException {
private static final long serialVersionUID = 6009368304491510184L;
public PdfReportException(final Throwable cause) {
super(cause);
}
}
}
No exception is caught in the Report constructor.
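The thread does not include an answer. One possible approach, offered as a sketch only, is to register a JAXP ErrorListener on the TransformerFactory, collect whatever the processor reports while compiling the stylesheet, and throw if anything was reported. Whether a given XSLT processor routes stylesheet problems through the listener, throws by itself, or does both is implementation-dependent, so this is an assumption to verify against the processor actually in use. The class name and the sample stylesheets are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of the original post.
class StrictStylesheetLoader {

    static final String VALID_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/'><ok/></xsl:template>"
        + "</xsl:stylesheet>";

    // Deliberately broken: the template and stylesheet elements are never closed.
    static final String BROKEN_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/'>";

    /** Records errors reported by the processor instead of letting them go to the console. */
    static final class CollectingListener implements ErrorListener {
        final List<TransformerException> problems = new ArrayList<>();

        @Override
        public void warning(TransformerException e) {
            // Warnings are tolerated in this sketch.
        }

        @Override
        public void error(TransformerException e) {
            problems.add(e);
        }

        @Override
        public void fatalError(TransformerException e) {
            problems.add(e);
        }
    }

    /** Returns a Transformer, or throws if the processor reported any error. */
    static Transformer load(String xslt) throws TransformerConfigurationException {
        TransformerFactory factory = TransformerFactory.newInstance();
        CollectingListener listener = new CollectingListener();
        factory.setErrorListener(listener);

        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xslt)));

        if (transformer == null || !listener.problems.isEmpty()) {
            Throwable cause = listener.problems.isEmpty() ? null : listener.problems.get(0);
            throw new TransformerConfigurationException(
                "Stylesheet reported " + listener.problems.size() + " problem(s)", cause);
        }
        return transformer;
    }

    public static void main(String[] args) {
        try {
            load(BROKEN_XSLT);
            System.out.println("Stylesheet loaded");
        } catch (TransformerConfigurationException e) {
            System.out.println("Stylesheet rejected: " + e.getMessage());
        }
    }
}
```

In the Report class above, the same listener could be set on the factory returned by createFobTransformer() before prepareTransformer() is called, so that the existing catch block in the constructor sees the failure.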

How to improve performance of JAXB/StAX XML output

I am attempting to write out a very large XML object, using the code below. I am processing 200K-350K objects/nodes, and the output-to-file is unbearably slow.
Any suggestions on how to improve the performance of the output implementation? I understand that the IndentingXMLStreamWriter may be one of the culprits, but I really need the output to be human readable (even if it is likely not going to be read due to size).
driver implementation...
public class SomeClient {
public static void main(String args[]) {
TransactionXmlWriter txw = new TransactionXmlWriter();
TransactionType tranType = getNextTransaction();
try {
txw.openXmlOutput("someFileName.xml");
while(tranType != null) {
txw.processObject(tranType);
tranType = getNextTransaction();
}
txw.closeXmlOutput();
} catch(JAXBException e) {
} catch(FileNotFoundException e) {
} catch(XMLStreamException e) {
}
}
}
implementation class...
public class TransactionXmlWriter {
private final QName root = new QName("ipTransactions");
private Marshaller marshaller = null;
private FileOutputStream fileOutputStream = null;
private XMLOutputFactory xmlOutputFactory = null;
private XMLStreamWriter xmlStreamWriter = null;
// constructor
public TransactionXmlWriter() throws JAXBException{
JAXBContext jaxbContext = JAXBContext.newInstance(TransactionType.class);
xmlOutputFactory = XMLOutputFactory.newFactory();
marshaller = jaxbContext.createMarshaller();
marshaller.setProperty(Marshaller.JAXB_FRAGMENT, true);
}
// write out "body" of XML
public void processObject(TransactionType transaction) {
JAXBElement<TransactionType> transactionJaxB = null;
try {
transactionJaxB = new JAXBElement<>(root, TransactionType.class, transaction);
marshaller.marshal(transactionJaxB, xmlStreamWriter);
} catch(JAXBException e) {
// TO DO : some kind of error handling
System.out.println(e.getMessage());
System.out.println(e.getStackTrace());
}
}
// open file to write XML into
public void openXmlOutput(String fileName) throws FileNotFoundException,
XMLStreamException {
fileOutputStream = new FileOutputStream(fileName);
xmlStreamWriter = new IndentingXMLStreamWriter(xmlOutputFactory.createXMLStreamWriter(fileOutputStream));
writeXmlHeader();
}
// write XML footer and close the stream/file
public void closeXmlOutput() throws XMLStreamException {
writeXmlFooter();
xmlStreamWriter.close();
}
private void writeXmlHeader() throws XMLStreamException {
xmlStreamWriter.writeStartDocument("UTF-8", "1.0");
xmlStreamWriter.writeStartElement("ipTransactions");
}
private void writeXmlFooter() throws XMLStreamException {
xmlStreamWriter.writeEndElement();
xmlStreamWriter.writeEndDocument();
}
}
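No answer is recorded for this question. One thing that is cheap to try, shown here as a sketch, is to put a BufferedOutputStream between the XMLStreamWriter and the raw output stream, since the code above writes straight to a FileOutputStream. Whether this helps noticeably depends on how much buffering the StAX implementation already does, so it should be measured rather than assumed. The class name, the element name "transaction" and the 64 KB buffer size are illustrative choices, and JAXB and the indenting writer are left out to keep the example self-contained.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper, not part of the original post.
class BufferedXmlOutputSketch {

    /** Writes 'count' small elements through a buffered stream onto 'rawTarget'. */
    static void writeTransactions(OutputStream rawTarget, int count) throws Exception {
        // The buffer batches many small writes into fewer large ones.
        try (OutputStream buffered = new BufferedOutputStream(rawTarget, 64 * 1024)) {
            XMLStreamWriter writer =
                XMLOutputFactory.newFactory().createXMLStreamWriter(buffered, "UTF-8");
            writer.writeStartDocument("UTF-8", "1.0");
            writer.writeStartElement("ipTransactions");
            for (int i = 0; i < count; i++) {
                writer.writeStartElement("transaction");
                writer.writeCharacters(Integer.toString(i));
                writer.writeEndElement();
            }
            writer.writeEndElement();
            writer.writeEndDocument();
            writer.close(); // closes the writer only; the stream is closed by try-with-resources
        }
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        writeTransactions(target, 3);
        System.out.println(new String(target.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

In TransactionXmlWriter the equivalent change would be to wrap fileOutputStream in a BufferedOutputStream inside openXmlOutput() and to close that stream in closeXmlOutput().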

Reading and writing an xml in java

I am reading an XML file using a StAX parser and writing it using DOM in Java. I am not getting the desired XML output. I read the following XML file:
<config>
<Sensor1>
<name>abc</name>
<range>100</range>
</Sensor1>
<sensor2>
<name>xyz</name>
<range>100</range>
</sensor2>
</config>
I parse the above XML file using the StAX parser as follows:
public void readConfig(String configFile) {
boolean sensor1 = false;
boolean sensor2 = false;
try
{
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
InputStream in = new FileInputStream(configFile);
XMLEventReader eventReader = inputFactory.createXMLEventReader(in);
// Read the XML document
while (eventReader.hasNext()) {
XMLEvent event = eventReader.nextEvent();
if (event.isStartElement()) {
StartElement startElement = event.asStartElement();
if (startElement.getName().getLocalPart() == (sensor1)) {
sensor1 = true;
Sensor1 Obj1 = new Sensor1();
}
if (startElement.getName().getLocalPart() == (sensor2)) {
sensor2 = true;
Sensor2 Obj2 = new Sensor2();
}
if (sensor1) {
if (event.asStartElement().getName().getLocalPart().equals(name)) {
event = eventReader.nextEvent();
Obj1.set_Sensor_Name(event.asCharacters().getData());
continue;
}
if (event.asStartElement().getName().getLocalPart().equals(range)) {
event = eventReader.nextEvent();
Obj1.set_Sensor_Range(event.asCharacters().getData());
continue;
}
}
if (sensor2) {
if (event.asStartElement().getName().getLocalPart().equals(name)) {
event = eventReader.nextEvent();
Obj2.set_Sensor_Name(event.asCharacters().getData());
continue;
}
if (event.asStartElement().getName().getLocalPart().equals(range)) {
event = eventReader.nextEvent();
Obj1.set_Sensor_Range(event.asCharacters().getData());
continue;
}
}
if (event.isEndElement()) {
EndElement endElement = event.asEndElement();
if (endElement.getName().getLocalPart() == (sensor1)) {
sensor1.addToArray();
}
if (endElement.getName().getLocalPart() == (sensor2)) {
sensor2.addToArray();
}
}
}
In the "Sensor1" and "Sensor2" classes I add extra information depending on some condition.
class Sensor1 {
public ArrayList<Object> list = new ArrayList<Object>();
String name;
double range;
public void set_Sensor_Name(String name) {
this.name = name;
}
public void set_Sensor_Range(double range) {
this.range = range;
}
public void addToArray(){
double distance =50;
if(distance<range){
list.add("TITANIC");
list.add(123456);
}
WriteFile fileObj = new WriteFile();
fileObj.writeXmlFile(list);
}
}
This is the class that writes the XML:
public class WriteFile {
public void writeXmlFile(ArrayList<Object> list) {
try {
DocumentBuilderFactory dFact = DocumentBuilderFactory.newInstance();
DocumentBuilder build = dFact.newDocumentBuilder();
Document doc = build.newDocument();
Element root = doc.createElement("SensorTracks");
doc.appendChild(root);
Element sensorInfo = doc.createElement("SensorDetails");
root.appendChild(sensorInfo);
Element vesselInfo = doc.createElement("VesselDetails");
root.appendChild(vesselInfo);
for(int i=0; i<list.size(); i +=4 ) {
Element name = doc.createElement("SensorName");
name.appendChild(doc.createTextNode(String.valueOf(list.get(i))));
sensorInfo.appendChild(name);
Element range = doc.createElement("SensorRange");
range.appendChild(doc.createTextNode(String.valueOf(list.get(i+1))));
sensorInfo.appendChild(range);
Element mmi = doc.createElement("shipname");
mmi.appendChild(doc.createTextNode(String.valueOf(list.get(i+2))));
vesselInfo.appendChild(mmi);
Element license = doc.createElement("license");
license.appendChild(doc.createTextNode(String.valueOf(list.get(i+3))));
vesselInfo.appendChild(license);
}
// Save the document to the disk file
TransformerFactory tranFactory = TransformerFactory.newInstance();
Transformer aTransformer = tranFactory.newTransformer();
// format the XML nicely
aTransformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
aTransformer.setOutputProperty(
"{http://xml.apache.org/xslt}indent-amount", "4");
aTransformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(doc);
try {
FileWriter fos = new FileWriter("/home/ros.xml");
StreamResult result = new StreamResult(fos);
aTransformer.transform(source, result);
} catch (IOException e) {
e.printStackTrace();
}
} catch (TransformerException ex) {
System.out.println("Error outputting document");
} catch (ParserConfigurationException ex) {
System.out.println("Error building document");
}
When I execute it, I get the following XML:
<SensorTracks>
<sensorDetails>
<SensorName>xyz</SensorName>
<SensorRange>100</SensorRange>
</sensorDetails>
<VesselDetails>
<shipname>TITANIC</shipname>
<license>123456</license>
</vesselDetails>
My final output must be:
<config>
<SensorTracks>
<sensorDetails>
<SensorName>xyz</SensorName>
<SensorRange>100</SensorRange>
<SensorName>abc</SensorName>
<SensorRange>100</SensorRange>
</sensorDetails>
<VesselDetails>
<shipname>TITANIC</shipname>
<license>123456</license>
</vesselDetails>
What am I doing wrong in my code? Any help is appreciated. Thanks in advance.
I am answering my own question again. The problem is very simple. To get the desired output mentioned above, just make the following change to the "WriteFile" class:
FileWriter fos = new FileWriter("/home/ros.xml" ,true);
Finally, I am learning Java :)
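To illustrate what the second constructor argument changes, here is a small sketch that writes two chunks to the same temporary file, once with append set to false and once with it set to true. The class name and the chunk contents are made up for the example.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original post.
class AppendModeSketch {

    /** Opens the same file twice, writes one chunk each time, and returns the final content. */
    static String writeTwice(boolean append) throws IOException {
        Path file = Files.createTempFile("ros", ".xml");
        try {
            for (String chunk : new String[] {"<first/>", "<second/>"}) {
                // append == false truncates the file on open; append == true keeps what is there.
                try (FileWriter writer = new FileWriter(file.toFile(), append)) {
                    writer.write(chunk);
                }
            }
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("overwrite: " + writeTwice(false)); // prints "overwrite: <second/>"
        System.out.println("append:    " + writeTwice(true));  // prints "append:    <first/><second/>"
    }
}
```

Note that appending the output of several separate transformations puts more than one top-level element into the file, so the result is a sequence of fragments rather than a single well-formed XML document.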
Frankly speaking, the example looks cumbersome. Have you considered using Apache Digester or JAXB?
http://commons.apache.org/digester/
http://www.oracle.com/technetwork/articles/javase/index-140168.html

Parsing an XML in Java using STax

This seems like a silly question even to me, but it is one of those I can't find an answer to.
I'm trying to parse an XML file using StAX in Java, and the XML I'm trying to parse looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<Macros>
<MacroDefinition>
<MacroName>
<string>Macro1</string>
</MacroName>
</MacroDefinition>
</Macros>
Now I have a Macro class as follows:
public class Macro {
private String name;
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
}
I also have a parser class in which I try to convert the XML into an object of the 'Macro' class. The parser class snippet is as follows:
public class StaxParser {
static final String MACRODEFINITION = "MacroDefinition";
static final String MACRONAME = "MacroName";
static final String STRING = "string";
@SuppressWarnings({ "unchecked", "null" })
public List<Item> readMacro(String configFile) {
List<Macro> macroList = new ArrayList<Macro>();
try {
// First create a new XMLInputFactory
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
// Setup a new eventReader
InputStream in = new FileInputStream(configFile);
XMLEventReader eventReader = inputFactory.createXMLEventReader(in);
// Read the XML document
Macro macro = null;
while (eventReader.hasNext()) {
XMLEvent event = eventReader.nextEvent();
if (event.isStartElement()) {
StartElement startElement = event.asStartElement();
if (startElement.getName().getLocalPart() == (MACRODEFINITION)) {
macro = new Macro();
}
if (event.isStartElement()) {
if (event.asStartElement().getName().getLocalPart()
.equals(MACRONAME)) {
Iterator<Attribute> attributes = event
.asStartElement().getAttributes();
while (attributes.hasNext()) {
Attribute attribute = attributes.next();
if (attribute.getName().toString()
.equals(STRING)) {
macro.setMacroName(event.asCharacters()
.getData());
}
}
event = eventReader.nextEvent();
continue;
}
}
}
// If we reach the end of an item element we add it to the list
if (event.isEndElement()) {
EndElement endElement = event.asEndElement();
if (endElement.getName().getLocalPart() == (MACRODEFINITION)) {
macroList.add(macro);
}
}
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (XMLStreamException e) {
e.printStackTrace();
}
return macroList;
}
}
The problem I'm facing is that the parser is not able to read the child nodes of 'MacroName'. I think getAttributes is what is causing it not to work, but I have no clue which method I should call to get the child nodes of a particular node.
Any help with this would be greatly appreciated.
Thanks
p1nG
Sorry to say this, but your code has many issues and doesn't even compile.
First of all, the return type should be List<Macro>, since the Macro class neither extends nor implements Item.
Second, you should ensure safe nesting that follows the schema of your XML, rather than arbitrarily testing for event name equality and creating Macro objects here and there along the way. If you also plan to retrieve other data besides the macro name, you can't get away with just checking for the occurrence of the STRING event.
Third, it's useless to nest the same checks, e.g. event.isStartElement().
Fourth, you should pass a Source, a Reader, or a Stream to a class such as StaxParser rather than a filename, but I didn't include this change to avoid breaking your API.
class StaxParser {
static final String MACRODEFINITION = "MacroDefinition";
static final String MACRONAME = "MacroName";
static final String STRING = "string";
@SuppressWarnings({ "unchecked", "null" })
public List<Macro> readMacro(final String configFile) {
final List<Macro> macroList = new ArrayList<Macro>();
try {
// First create a new XMLInputFactory
final XMLInputFactory inputFactory = XMLInputFactory.newInstance();
// Setup a new eventReader
final InputStream in = new FileInputStream(configFile);
final XMLEventReader eventReader = inputFactory.createXMLEventReader(in);
// Read the XML document
final Template template = getTemplate(eventReader);
macroList.addAll(template.process(null, getMacrosProcessor(template)));
} catch (final FileNotFoundException e) {
e.printStackTrace();
} catch (final XMLStreamException e) {
e.printStackTrace();
}
return macroList;
}
interface Template {
<T> T process(String parent, EventProcessor<T> ep) throws XMLStreamException;
}
static Template getTemplate(final XMLEventReader eventReader) {
return new Template() {
@Override
public <T> T process(final String parent, final EventProcessor<T> ep) throws XMLStreamException {
T t = null;
boolean process = true;
while (process && eventReader.hasNext()) {
final XMLEvent event = eventReader.nextEvent();
if (ep.acceptsEvent(event)) {
t = ep.processEvent(event);
}
if (event.isEndElement()) {
if (null != parent && parent.equals(event.asEndElement().getName().getLocalPart())) {
process = false;
}
}
}
return t;
}
};
}
interface EventProcessor<T> {
boolean acceptsEvent(XMLEvent event);
T processEvent(XMLEvent event) throws XMLStreamException;
}
static EventProcessor<List<Macro>> getMacrosProcessor(final Template template) {
final List<Macro> macroList = new ArrayList<Macro>();
return new EventProcessor<List<Macro>>() {
@Override
public boolean acceptsEvent(final XMLEvent event) {
return event.isStartElement()
&& MACRODEFINITION.equals(event.asStartElement().getName().getLocalPart());
}
@Override
public List<Macro> processEvent(final XMLEvent event) throws XMLStreamException {
macroList.add(template.process(MACRODEFINITION, getMacroDefinitionProcessor(template)));
return macroList;
}
};
}
static EventProcessor<Macro> getMacroDefinitionProcessor(final Template template) {
return new EventProcessor<Macro>() {
@Override
public boolean acceptsEvent(final XMLEvent event) {
return event.isStartElement() && MACRONAME.equals(event.asStartElement().getName().getLocalPart());
}
@Override
public Macro processEvent(final XMLEvent event) throws XMLStreamException {
final Macro macro = new Macro();
macro.setName(template.process(MACRONAME, getMacroNameProcessor(template)));
return macro;
}
};
}
static EventProcessor<String> getMacroNameProcessor(final Template template) {
return new EventProcessor<String>() {
@Override
public boolean acceptsEvent(final XMLEvent event) {
return event.isStartElement() && STRING.equals(event.asStartElement().getName().getLocalPart());
}
@Override
public String processEvent(final XMLEvent event) throws XMLStreamException {
return template.process(STRING, getStringProcessor());
}
};
}
static EventProcessor<String> getStringProcessor() {
return new EventProcessor<String>() {
@Override
public boolean acceptsEvent(final XMLEvent event) {
return event.isCharacters();
}
@Override
public String processEvent(final XMLEvent event) throws XMLStreamException {
return event.asCharacters().getData();
}
};
}
}
First, notice that Macro1 is not an XML attribute, so the event's attributes will be empty. The code after my changes (I have only shown the lines that may be of interest):
if (event.isStartElement()
&& event.asStartElement().getName().getLocalPart().equals(STRING)) {
if (macro == null) {
macro = new Macro();
}
macro.setName(eventReader.getElementText());
}
A few tips: never compare strings using ==; use the equals method. If you need a full working example I could post my solution, but it is a bit more complicated.
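Putting those two tips together, a compact self-contained version could look like the sketch below. It is not the poster's full solution; it only collects the text of every string element found inside a MacroName element, using equals for the name checks and XMLEventReader.getElementText() for the text. The class name and the returned List<String> are choices made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper, not part of the original post.
class MacroNameReader {

    static final String SAMPLE_XML =
        "<Macros>\n"
        + "  <MacroDefinition>\n"
        + "    <MacroName>\n"
        + "      <string>Macro1</string>\n"
        + "    </MacroName>\n"
        + "  </MacroDefinition>\n"
        + "</Macros>";

    /** Returns the text of every <string> element nested inside a <MacroName> element. */
    static List<String> readMacroNames(String xml) throws XMLStreamException {
        XMLEventReader reader =
            XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        boolean insideMacroName = false;

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                String local = event.asStartElement().getName().getLocalPart();
                if ("MacroName".equals(local)) {          // equals, never ==
                    insideMacroName = true;
                } else if (insideMacroName && "string".equals(local)) {
                    // Reads the text content and consumes the matching end tag.
                    names.add(reader.getElementText());
                }
            } else if (event.isEndElement()
                    && "MacroName".equals(event.asEndElement().getName().getLocalPart())) {
                insideMacroName = false;
            }
        }
        reader.close();
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(readMacroNames(SAMPLE_XML)); // prints [Macro1]
    }
}
```

getElementText() only works on text-only elements, which is why it is called on string and not on MacroName.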
You have to change
macro.setMacroName(event.asCharacters().getData());
to
macro.setMacroName(attribute.getValue().toString());

StAX issue parsing a document which has an end element and a start element on the same line

I have the following code for converting the elements of an XML file into a String using StAX:
private static XMLStreamReader getReader(InputStream inputStream) throws XMLStreamException {
XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
xmlInputFactory.setProperty("javax.xml.stream.isValidating", false);
xmlInputFactory.setProperty("javax.xml.stream.supportDTD", false);
XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(inputStream);
return xmlStreamReader;
}
private static String readElement(XMLStreamReader reader) throws XMLStreamException, TransformerException {
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
StAXSource source = new StAXSource(reader);
t.transform(source, new StreamResult(outputStream));
return outputStream.toString();
}
public static void main(String[] args) throws Exception {
InputStream inputStream = new FileInputStream("c:\\temp\\test.xml");
XMLStreamReader xmlStreamReader = getReader(inputStream);
int count = 0;
while (xmlStreamReader.hasNext()) {
int eventType = xmlStreamReader.next();
if (eventType == XMLEvent.START_ELEMENT) {
String elementName = xmlStreamReader.getName().getLocalPart();
if (!elementName.toLowerCase().equals("element")) {
continue;
}
String productStr = readElement(xmlStreamReader);
System.out.println(productStr);
}
}
}
}
This works fine on the following XML fragment:
<testDoc>
<element>
<a>hello world</a>
<b>hello world again</b>
</element>
<element>
<a>foo</a>
<b>foo bar</b>
</element>
</testDoc>
However, there are problems with this fragment where the </element> and <element> are on the same line:
<testDoc>
<element>
<a>hello world</a>
<b>hello world again</b>
</element><element>
<a>foo</a>
<b>foo bar</b>
</element>
</testDoc>
In the second example it only seems to process the first element and not the second one. Any ideas?
Update:
I got it to work with the following code:
public static void main(String[] args) throws Exception {
InputStream inputStream = new FileInputStream("c:\\temp\\test.xml");
XMLStreamReader xmlStreamReader = getReader(inputStream);
int count = 0;
while (xmlStreamReader.hasNext()) {
int eventType = xmlStreamReader.getEventType();
if (eventType == XMLEvent.START_ELEMENT) {
String elementName = xmlStreamReader.getName().getLocalPart();
if (!elementName.toLowerCase().equals("element")) {
xmlStreamReader.next();
continue;
}
System.out.println(readElement(xmlStreamReader));
} else {
xmlStreamReader.next();
}
}
}
Looks like a bug to me. You don't say which StAX parser you are using: some of them are pretty ropey. Woodstox is the most reliable.
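The updated loop can be condensed into the sketch below. The thread does not explain why the first version skipped an element; the sketch assumes that the transformer may leave the reader either on the element's end tag or on the event right after it, and avoids depending on which one by re-inspecting the current event after every transform instead of calling next() first. The class name and the in-memory sample documents are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

// Hypothetical helper, not part of the original post.
class ElementSplitterSketch {

    static final String SPACED =
        "<testDoc>\n<element><a>hello world</a></element>\n"
        + "<element><a>foo</a></element>\n</testDoc>";

    // Same content, but the end tag and the next start tag touch each other.
    static final String ADJACENT =
        "<testDoc>\n<element><a>hello world</a></element><element><a>foo</a></element>\n</testDoc>";

    /** Serializes every element named 'tag' (case-insensitive) into its own string. */
    static List<String> splitElements(String xml, String tag) throws Exception {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        inputFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLStreamReader reader = inputFactory.createXMLStreamReader(new StringReader(xml));

        // Identity transformer: copies the element as it is.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> parts = new ArrayList<>();
        while (reader.hasNext()) {
            boolean atWantedStart =
                reader.getEventType() == XMLStreamConstants.START_ELEMENT
                && tag.equalsIgnoreCase(reader.getLocalName());
            if (atWantedStart) {
                StringWriter out = new StringWriter();
                identity.transform(new StAXSource(reader), new StreamResult(out));
                parts.add(out.toString());
                // No next() here: the transform already moved the cursor, and the
                // current event is checked again at the top of the loop.
            } else {
                reader.next();
            }
        }
        reader.close();
        return parts;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(splitElements(SPACED, "element").size());
        System.out.println(splitElements(ADJACENT, "element").size());
    }
}
```

The key difference from the first version in the question is that next() is only called when the current event was not handed to the transformer.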
