Issue loading .XML files with docx4j in Java - java

I cannot load even the sample word2003xml.xml that docx4j provides for tests in docx4j-samples-docx4j-8.3.1.zip, found here: https://www.docx4java.org/downloads.html
I tried loading the file with two different overloads of the static load method, but the result is the same.
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new FileInputStream(new File("C:\\Mine\\project4tests\\word2003xml.xml")));
WordprocessingMLPackage wordMLPackage2 = WordprocessingMLPackage.load(new java.io.File("C:\\Mine\\project4tests\\word2003xml.xml"));
Here is the exception that I am getting:
Exception in thread "main" org.docx4j.openpackaging.exceptions.Docx4JException: Couldn't load xml from stream
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:641)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:418)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:376)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:341)
at org.docx4j.openpackaging.packages.WordprocessingMLPackage.load(WordprocessingMLPackage.java:182)
at Main.main(Main.java:13)
Caused by: javax.xml.bind.UnmarshalException
with linked exception:
[com.sun.istack.internal.SAXParseException2; lineNumber: 3; columnNumber: 827; unexpected element (uri:"http://schemas.microsoft.com/office/word/2003/wordml", local:"wordDocument"). Expected elements are <{http://schemas.microsoft.com/office/2006/xmlPackage}package>,<{http://schemas.microsoft.com/office/2006/xmlPackage}part>,<{http://schemas.microsoft.com/office/2006/xmlPackage}xmlData>]
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.handleStreamException(UnmarshallerImpl.java:468)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:402)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:371)
at org.docx4j.convert.in.FlatOpcXmlImporter.<init>(FlatOpcXmlImporter.java:132)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:638)
... 5 more
Caused by: com.sun.istack.internal.SAXParseException2; lineNumber: 3; columnNumber: 827; unexpected element (uri:"http://schemas.microsoft.com/office/word/2003/wordml", local:"wordDocument"). Expected elements are <{http://schemas.microsoft.com/office/2006/xmlPackage}package>,<{http://schemas.microsoft.com/office/2006/xmlPackage}part>,<{http://schemas.microsoft.com/office/2006/xmlPackage}xmlData>
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.handleEvent(UnmarshallingContext.java:726)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:247)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:242)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportUnexpectedChildElement(Loader.java:109)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext$DefaultRootLoader.childElement(UnmarshallingContext.java:1131)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext._startElement(UnmarshallingContext.java:556)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.startElement(UnmarshallingContext.java:538)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.InterningXmlVisitor.startElement(InterningXmlVisitor.java:60)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StAXStreamConnector.handleStartElement(StAXStreamConnector.java:231)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StAXStreamConnector.bridge(StAXStreamConnector.java:165)
at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:400)
... 8 more
Caused by: javax.xml.bind.UnmarshalException: unexpected element (uri:"http://schemas.microsoft.com/office/word/2003/wordml", local:"wordDocument"). Expected elements are <{http://schemas.microsoft.com/office/2006/xmlPackage}package>,<{http://schemas.microsoft.com/office/2006/xmlPackage}part>,<{http://schemas.microsoft.com/office/2006/xmlPackage}xmlData>
... 19 more
Loading a .DOCX file works without issue. However, what I need the docx4j library for is converting an old .DOC file (actually WordprocessingML, so closer to an .XML file) into a .DOCX.
Similar to what is done here https://coderanch.com/t/721499/java/Word-XML-DOCX
Does anybody know why I cannot load the file properly?

See https://github.com/plutext/docx4j/blob/master/docx4j-core/src/main/java/org/docx4j/convert/in/word2003xml/Word2003XmlConverter.java for 2003 XML files.
Note that .doc is the old binary format; it's not XML, it is something different again.
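
The stack trace above shows the loader expecting a Flat OPC root element ({http://schemas.microsoft.com/office/2006/xmlPackage}package) but finding a Word 2003 XML root (wordDocument). As a rough illustration, the sketch below uses only the JDK's StAX API to check which of the two formats a file is before choosing how to load it. The class and method names (WordXmlFlavor, detect) are hypothetical and are not part of docx4j.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not a docx4j class: classifies an XML file by its root element.
final class WordXmlFlavor {
    // Both namespace URIs are taken from the exception message in the question.
    static final String FLAT_OPC_NS = "http://schemas.microsoft.com/office/2006/xmlPackage";
    static final String WORD_2003_NS = "http://schemas.microsoft.com/office/word/2003/wordml";

    /** Returns "flat-opc", "word-2003-xml", or "unknown" based on the root element. */
    static String detect(InputStream in) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // no DTD processing needed
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            try {
                while (reader.hasNext()) {
                    // Skip the XML declaration, comments, and processing instructions.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        String ns = reader.getNamespaceURI();
                        String local = reader.getLocalName();
                        if (FLAT_OPC_NS.equals(ns) && "package".equals(local)) {
                            return "flat-opc";
                        }
                        if (WORD_2003_NS.equals(ns) && "wordDocument".equals(local)) {
                            return "word-2003-xml";
                        }
                        return "unknown";
                    }
                }
                return "unknown";
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<w:wordDocument xmlns:w=\"" + WORD_2003_NS + "\"/>";
        System.out.println(detect(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))));
    }
}
```

A file reported as "word-2003-xml" would go through the Word2003XmlConverter linked above rather than WordprocessingMLPackage.load.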

The functionality mentioned by JasonPlutext is here: https://github.com/plutext/docx4j/blob/master/docx4j-core/src/main/java/org/docx4j/convert/in/word2003xml/Word2003XmlConverter.java
This was later fixed with this commit here: https://github.com/plutext/docx4j/commit/2c846e7c633d0264757521d15a3f5f37b037b815

Related

Parse error while converting edi to java?

I am trying to convert EDI data format to java.
The EDI data is as follows
HDR*1*0*59.97*64.92*4.95*Wed Nov 15 13:45:28 EST 2006
CUS*user1*Harry^Fletcher*SD
ORD*1*1*364*The 40-Year-Old Virgin*29.98
ORD*2*1*299*Pulp Fiction*29.99
I referred to the following link while implementing this.
When I run the project, I get the error below:
Caused by: org.smooks.api.SmooksException: Parse Error: Failed to populate order-item[2]. Cause: Parse Error: Terminator '%NL;' not found
I tried running the mentioned project, expecting the data to be mapped to Java objects,
but ended up with the error below:
Caused by: org.smooks.api.SmooksException: Parse Error: Failed to populate order-item[2]. Cause: Parse Error: Terminator '%NL;' not found
You are missing a newline at the end of the EDI document hence the error. For some reason, the newline isn't rendered when viewing the example file from GitHub but it's present when you view it locally.
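
If the EDI text is assembled or copied in code, a small guard can make sure the last segment is terminated before the text is handed to the parser. This is a minimal sketch under that assumption; EdiNewline and withTrailingNewline are hypothetical names and are not part of Smooks.

```java
// Hypothetical helper, not part of Smooks: ensures the final EDI segment is newline-terminated.
final class EdiNewline {
    /** Returns the input unchanged if it already ends with a line break, else appends "\n". */
    static String withTrailingNewline(String edi) {
        if (edi.endsWith("\n") || edi.endsWith("\r")) {
            return edi;
        }
        return edi + "\n";
    }

    public static void main(String[] args) {
        String edi = "ORD*1*1*364*The 40-Year-Old Virgin*29.98\n"
                   + "ORD*2*1*299*Pulp Fiction*29.99"; // last segment lacks its terminator
        System.out.print(withTrailingNewline(edi));
    }
}
```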

Apache POI XmlException for theme (http://purl.oclc.org/ooxml/drawingml)

Currently, I'm using Apache POI to read an Excel file (.xlsx), but I get an exception when instantiating XSSFWorkbook from the data stream. The exception is shown below.
Apache Poi version: 4.0.1
Exception in thread "main" org.apache.poi.ooxml.POIXMLException: error: The document is not a theme#http://schemas.openxmlformats.org/drawingml/2006/main: document element namespace mismatch expected "http://schemas.openxmlformats.org/drawingml/2006/main" got "http://purl.oclc.org/ooxml/drawingml/main"
at org.apache.poi.ooxml.POIXMLFactory.createDocumentPart(POIXMLFactory.java:66)
at org.apache.poi.ooxml.POIXMLDocumentPart.read(POIXMLDocumentPart.java:657)
at org.apache.poi.ooxml.POIXMLDocument.load(POIXMLDocument.java:180)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:286)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:307)
at com.wl.dni.excel.parser.Test.main(Test.java:47)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:56)
at org.apache.poi.ooxml.POIXMLFactory.createDocumentPart(POIXMLFactory.java:63)
... 5 more
Caused by: java.io.IOException: error: The document is not a theme#http://schemas.openxmlformats.org/drawingml/2006/main: document element namespace mismatch expected "http://schemas.openxmlformats.org/drawingml/2006/main" got "http://purl.oclc.org/ooxml/drawingml/main"
at org.apache.poi.xssf.model.ThemesTable.<init>(ThemesTable.java:88)
... 11 more
Caused by: org.apache.xmlbeans.XmlException: error: The document is not a theme#http://schemas.openxmlformats.org/drawingml/2006/main: document element namespace mismatch expected "http://schemas.openxmlformats.org/drawingml/2006/main" got "http://purl.oclc.org/ooxml/drawingml/main"
at org.apache.xmlbeans.impl.store.Locale.verifyDocumentType(Locale.java:454)
at org.apache.xmlbeans.impl.store.Locale.autoTypeDocument(Locale.java:359)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1275)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1259)
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345)
at org.openxmlformats.schemas.drawingml.x2006.main.ThemeDocument$Factory.parse(Unknown Source)
at org.apache.poi.xssf.model.ThemesTable.<init>(ThemesTable.java:86)
... 11 more
Any idea how to fix this kind of issue, or which library can be used instead? Thanks.
Apache POI does not support xlsx files saved with Strict OOXML format (which uses the http://purl.oclc.org/ooxml/drawingml namespace).
Try to save the file using standard (transitional) OOXML format.
https://github.com/pjfanning/ooxml-strict-converter might help if you need to convert the file yourself.
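
To tell in advance whether a given file was saved as Strict OOXML, one option is to look for the purl.oclc.org namespace inside the package's XML parts. The sketch below is a heuristic that uses only java.util.zip; StrictOoxmlCheck and looksStrict are hypothetical names, not POI API, and each XML part is read fully into memory, which may be too slow for very large workbooks.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Apache POI: heuristic check for Strict OOXML packages.
final class StrictOoxmlCheck {
    // Namespace prefix taken from the exception message in the question.
    static final String STRICT_NS_PREFIX = "http://purl.oclc.org/ooxml/";

    /** Returns true if any .xml part in the zip mentions the Strict namespace prefix. */
    static boolean looksStrict(InputStream xlsx) {
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().endsWith(".xml")) {
                    continue; // skip media, relationships, and other non-XML parts
                }
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                String text = new String(buf.toByteArray(), StandardCharsets.UTF_8);
                if (text.contains(STRICT_NS_PREFIX)) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the check returns true, re-saving the file in the transitional format or running it through the converter linked above are the options the answer describes.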

Log4j2 not working with json config file even if configured properly

This is a duplicate of this question but the solution proposed there doesn't work for me and I cannot yet comment.
The issue is explained in the title itself: Log4j2 is not working with .json config file even if configured properly with log4j.configurationFactory=org.apache.logging.log4j.core.config.json.JsonConfigurationFactory in log4j2.component.properties file.
The full error stack trace is:
[Fatal Error] log4j2.json:1:1: Content is not allowed in prolog.
ERROR StatusLogger Error parsing /Users/sm/cdss-scala/risk-stratification/src/main/resources/log4j2.json
org.xml.sax.SAXParseException; systemId: file:///Users/sm/cdss-scala/risk-stratification/src/main/resources/log4j2.json; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
at org.apache.logging.log4j.core.config.xml.XmlConfiguration.<init>(XmlConfiguration.java:95)
at org.apache.logging.log4j.core.config.xml.XmlConfigurationFactory.getConfiguration(XmlConfigurationFactory.java:46)
at org.apache.logging.log4j.core.config.ConfigurationFactory$Factory.getConfiguration(ConfigurationFactory.java:491)
at org.apache.logging.log4j.core.config.ConfigurationFactory$Factory.getConfiguration(ConfigurationFactory.java:420)
at org.apache.logging.log4j.core.config.ConfigurationFactory.getConfiguration(ConfigurationFactory.java:265)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:613)
at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:634)
at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:229)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:152)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:45)
at org.apache.logging.log4j.LogManager.getContext(LogManager.java:194)
at org.apache.logging.log4j.scala.Logger$.apply(Logger.scala:39)
at org.apache.logging.log4j.scala.Logging$class.$init$(Logging.scala:28)
at eu.connecare.cdss.hadrian.HadrianService$.<init>(HadrianService.scala:11)
at eu.connecare.cdss.hadrian.HadrianService$.<clinit>(HadrianService.scala)
at eu.connecare.cdss.hadrian.HadrianService.runEngine(HadrianService.scala)
at eu.connecare.cdss.hadrian.HadrianLaunchable.main(HadrianLaunchable.java:6)
ERROR StatusLogger No logging configuration
There's no need to show your JSON config; in my experience, this kind of error points to a problem with the BOM in your JSON file. Try adding the BOM to the file or removing it.
For what a BOM is and how to remove it, see here:
Byte order mark
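
For reference, a UTF-8 BOM is the three-byte sequence EF BB BF at the very start of a file. The following is a minimal sketch of removing it from a byte array with plain JDK code; BomStripper and stripUtf8Bom are hypothetical names chosen for illustration.

```java
import java.util.Arrays;

// Hypothetical helper: removes a leading UTF-8 byte order mark (EF BB BF) if present.
final class BomStripper {
    /** Returns a copy without the BOM, or the same array if no BOM is present. */
    static byte[] stripUtf8Bom(byte[] data) {
        boolean hasBom = data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
        return hasBom ? Arrays.copyOfRange(data, 3, data.length) : data;
    }
}
```

The bytes of the config file could be read with java.nio.file.Files.readAllBytes, passed through this method, and written back.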

Getting an exception while doing an XSLT transformation through Xalan in Java

I am transforming an XML document into a new XML document by running it through an XSLT stylesheet with the Xalan API. Below is how I do the transformation:
String mess = "C:\\wer\\erty.xml";
mess = mess.trim().replaceFirst("^([\\W]+)<","<");
// perform XSL transformation
xsltTransformer.transform(msgStreamSource, xmlOutput);
I am getting the error below:
ERROR: 'Content is not allowed in prolog.'
ERROR: 'com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Content is not allowed in prolog.'
javax.xml.transform.TransformerException: javax.xml.transform.TransformerException: com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Content is not allowed in prolog.
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(Unknown Source)

SAXParseException when printing a PDF from XSLT and XML

I am trying to generate a PDF from XML and XSLT.
[Fatal Error] :89:14: Invalid byte 1 of 1-byte UTF-8 sequence.
file:////; Line #89; Column #14; org.xml.sax.SAXParseException; systemId: file:////; lineNumber: 89; columnNumber: 14; Invalid byte 1 of 1-byte UTF-8 sequence.
severe:XSLT Transformation failed null
JBAS014134: EJB Invocation failed on component PDFGenerationBean for
method public abstract java.util.List
au.com.copl.dbaccesslayer.session.PDFGenerationRemote.getPDFs(java.util.List,java.util.List,java.lang.Integer,java.lang.Integer)
throws java.lang.Exception: javax.ejb.EJBException:
java.lang.RuntimeException: java.lang.NullPointerException
Line 89 is
Part 1 – contains information about us and the services we can provide
to you; and.
It turned out the – sign was causing the problem. I have removed it from this line, and the PDF is now generated successfully.
The line now reads:
Part 1 contains information about us and the services we can provide
to you; and.
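
The source does not show the exact bytes in the file, but one plausible cause, stated here as an assumption, is that the en dash was stored as the single Windows-1252 byte 0x96, which is not valid UTF-8. The sketch below shows how such input can be detected with the JDK's strict UTF-8 decoder before it reaches the XML parser; Utf8Check and isValidUtf8 are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reports whether a byte sequence is well-formed UTF-8.
final class Utf8Check {
    static boolean isValidUtf8(byte[] bytes) {
        try {
            // REPORT makes the decoder throw instead of silently substituting U+FFFD.
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

If the check fails, declaring the file's real encoding in the XML declaration or re-saving the file as UTF-8 would keep the dash instead of deleting it.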
