I have a problem using Apache POI to read the contents of some .docx files and show the result as an unformatted preview. I'm using POI version 3.11.
Code:
private static String POI2Text(File file) {
    POITextExtractor extractor = null;
    try {
        extractor = ExtractorFactory.createExtractor(file);
        return extractor.getText();
    } catch (Exception ex) {
        logger.warn("Error:", ex);
    } finally {
        if (extractor != null) {
            try {
                extractor.close();
            } catch (Exception ex) {
                logger.warn("Error:", ex);
            }
        }
    }
    return "";
}
The following Exception is thrown in the finally block (extractor.close()):
org.apache.poi.openxml4j.exceptions.OpenXML4JRuntimeException: Fail to save: an error occurs while saving the package : part
at org.apache.poi.openxml4j.opc.ZipPackage.saveImpl(ZipPackage.java:503) ~[agent.jar:na]
at org.apache.poi.openxml4j.opc.OPCPackage.save(OPCPackage.java:1425) ~[agent.jar:na]
at org.apache.poi.openxml4j.opc.OPCPackage.save(OPCPackage.java:1412) ~[agent.jar:na]
at org.apache.poi.openxml4j.opc.ZipPackage.closeImpl(ZipPackage.java:353) ~[agent.jar:na]
at org.apache.poi.openxml4j.opc.OPCPackage.close(OPCPackage.java:425) ~[agent.jar:na]
at org.apache.poi.POIXMLTextExtractor.close(POIXMLTextExtractor.java:87) ~[agent.jar:na]
....
Caused by: java.lang.IllegalArgumentException: part
at org.apache.poi.openxml4j.opc.OPCPackage.addPackagePart(OPCPackage.java:873) ~[agent.jar:na]
at org.apache.poi.openxml4j.opc.ZipPackage.saveImpl(ZipPackage.java:448) ~[agent.jar:na]
... 15 common frames omitted
Any ideas how to prevent this exception? The biggest problem is that POI doesn't release the file handle after the exception is thrown, and I need to be able to move or edit the file outside of my app.
Quick feedback: I resolved the issue by opening a read-only InputStream and passing that stream, rather than the File, to ExtractorFactory:
try (InputStream is = Files.newInputStream(path, StandardOpenOption.READ);
     POITextExtractor extractor = ExtractorFactory.createExtractor(is)) {
    return extractor.getText();
} catch (Exception ex) {
    logger.warn("Error in file {}", path, ex);
}
I'm currently developing a tool that uses the Apache Tika parser to extract content from different files. In most cases everything works fine, but there are some files where Tika logs the following exception:
Mar 09, 2020 11:21:58 AM org.apache.poi.ss.format.CellFormat <init>
WARNING: Invalid format: "_([$€-2]\ * "-"_);"
java.lang.IllegalArgumentException: Unsupported [] format block '[' in '_([$€-2]\ * "-"_)' with c2: null
at org.apache.poi.ss.format.CellFormatPart.formatType(CellFormatPart.java:373)
at org.apache.poi.ss.format.CellFormatPart.getCellFormatType(CellFormatPart.java:287)
at org.apache.poi.ss.format.CellFormatPart.<init>(CellFormatPart.java:191)
at org.apache.poi.ss.format.CellFormat.<init>(CellFormat.java:193)
at org.apache.poi.ss.format.CellFormat.getInstance(CellFormat.java:167)
at org.apache.poi.ss.usermodel.DataFormatter.getFormat(DataFormatter.java:343)
at org.apache.poi.ss.usermodel.DataFormatter.formatRawCellContents(DataFormatter.java:901)
at org.apache.poi.ss.usermodel.DataFormatter.formatRawCellContents(DataFormatter.java:873)
at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.formatNumberDateCell(FormatTrackingHSSFListener.java:143)
at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener$TikaFormatTrackingHSSFListener.formatNumberDateCell(ExcelExtractor.java:673)
at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.internalProcessRecord(ExcelExtractor.java:447)
at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processRecord(ExcelExtractor.java:340)
at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.processRecord(FormatTrackingHSSFListener.java:92)
at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener$TikaFormatTrackingHSSFListener.processRecord(ExcelExtractor.java:666)
at org.apache.poi.hssf.eventusermodel.HSSFRequest.processRecord(HSSFRequest.java:109)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:178)
at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:135)
at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processFile(ExcelExtractor.java:316)
at org.apache.tika.parser.microsoft.ExcelExtractor.parse(ExcelExtractor.java:169)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:183)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:131)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:188)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
at attproc.processors.AttachmentProcessor.run(AttachmentProcessor.java:68)
at attproc.Main.lambda$main$0(Main.java:89)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
I'm trying to catch this exception with the following code:
try {
    byte[] content = Files.readAllBytes(path);
    try {
        Metadata metadata = new Metadata();
        BodyContentHandler handler = new BodyContentHandler(-1);
        ParseContext parseContext = new ParseContext();
        parseContext.set(PDFParserConfig.class, tikaConfig.pdfConfig);
        try {
            tikaConfig.autoDetectParser.parse(new ByteArrayInputStream(content), handler, metadata, parseContext);
            text = Optional.ofNullable(handler.toString()).orElse("");
        } catch (Exception ignored) {}
    } catch (Exception ignored) {
    }
} catch (IOException ignored) {
}
"tikaConfig" is a singleton object:
public class TikaConfiguration {
    private final TikaConfig tikaConfig;
    public final PDFParserConfig pdfConfig;
    public final Parser autoDetectParser;
    private static TikaConfiguration instance;

    private TikaConfiguration() throws Exception {
        ClassLoader classLoader = getClass().getClassLoader();
        InputStream stream = classLoader.getResourceAsStream("tikaconfig.xml");
        this.tikaConfig = new TikaConfig(stream);
        this.pdfConfig = new PDFParserConfig();
        pdfConfig.setExtractInlineImages(false);
        tikaConfig.getDetector();
        autoDetectParser = new AutoDetectParser(tikaConfig);
    }

    public static TikaConfiguration setConfiguration() {
        if (TikaConfiguration.instance == null) {
            try {
                TikaConfiguration.instance = new TikaConfiguration();
            } catch (Exception ignored) {}
        }
        return TikaConfiguration.instance;
    }
}
What do I have to do to catch this exception?
Take a look at this somewhat old thread. What you are seeing looks very similar. It suggests that the POI library, used by Tika for parsing Excel, is throwing a warning, not an error (and your log output reflects that also). The warning happens to include a stack trace in its logging (caught by POI I assume, then passed on to Tika).
The warning would therefore not be caught by your code (it's not a thrown exception).
As one commenter mentions in the JIRA:
I'm not sure this is even a bug. This is the output of the POILogger, not, e.g. printStackTrace().
Regardless of its status as a bug, a work-around is also proposed in the JIRA: When running the application, redirect the err stream to null (an example is provided).
I downloaded the spreadsheet attached to the JIRA and I was able to recreate their version of your message:
WARNING: Invalid format: "_([$Ç-2]\ * #,##0.00_);"
java.lang.IllegalArgumentException: Unsupported [] format block '[' in '_([$Ç-2]\ * #,##0.00_)' with c2: null
at org.apache.poi.ss.format.CellFormatPart.formatType(CellFormatPart.java:373)
at org.apache.poi.ss.format.CellFormatPart.getCellFormatType(CellFormatPart.java:287)
at org.apache.poi.ss.format.CellFormatPart.<init>(CellFormatPart.java:191)
at org.apache.poi.ss.format.CellFormat.<init>(CellFormat.java:193)
...
However, my program completed successfully. It went on to generate its output correctly.
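The timestamp line followed by "WARNING:" looks like the default java.util.logging console format, so if this POI version logs through java.util.logging, raising the logger level may be an alternative to redirecting the err stream. Below is a minimal JDK-only sketch of that idea. The logger name "org.apache.poi.ss.format" is an assumption taken from the package in the log line, and logLikeLibrary() is a made-up stand-in for the library's logging call:

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class QuietFormatWarnings {
    // Assumed logger name (the package seen in the log line). Held in a static
    // field because java.util.logging only keeps weak references to loggers.
    static final Logger FORMAT_LOGGER = Logger.getLogger("org.apache.poi.ss.format");

    // After this call, WARNING records from loggers below that package are dropped.
    static void silence() {
        FORMAT_LOGGER.setLevel(Level.SEVERE);
    }

    // Stand-in for the library: logs an exception instead of throwing it and
    // returns how many records actually reached a handler.
    static int logLikeLibrary() {
        Logger child = Logger.getLogger("org.apache.poi.ss.format.Demo");
        final int[] published = {0};
        Handler counter = new Handler() {
            @Override public void publish(LogRecord record) { published[0]++; }
            @Override public void flush() { }
            @Override public void close() { }
        };
        child.addHandler(counter);
        try {
            // Nothing is thrown here, so a surrounding try/catch would see nothing
            child.log(Level.WARNING, "Invalid format", new IllegalArgumentException("demo"));
        } finally {
            child.removeHandler(counter);
        }
        return published[0];
    }

    public static void main(String[] args) {
        System.out.println("records before silence(): " + logLikeLibrary());
        silence();
        System.out.println("records after silence(): " + logLikeLibrary());
    }
}
```

Calling silence() once before parsing should be enough. Because the warning is logged rather than thrown, no try/catch around parse() will ever see it.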
I am coding a file download controller.
Sometimes the user closes the browser window before the file is fully written, which is fine.
The problem is that my logs are full of this error:
org.apache.catalina.connector.ClientAbortException: java.io.IOException: An established connection was aborted by the software in your host machine
at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:333)
at org.apache.catalina.connector.OutputBuffer.flushByteBuffer(OutputBuffer.java:758)
at org.apache.catalina.connector.OutputBuffer.append(OutputBuffer.java:663)
at org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:368)
at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:346)
at org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:96)
at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2147)
at org.apache.commons.io.IOUtils.copy(IOUtils.java:2102)
at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2123)
at org.apache.commons.io.IOUtils.copy(IOUtils.java:2078)
When I try to catch only this specific exception, Eclipse says:
ClientAbortException cannot be resolved to a type
I have the project set up and running correctly, so is it possible to catch only this specific exception?
org.apache.catalina.connector.ClientAbortException
I would like to keep the IOException in case of another catastrophe.
ClientAbortException is derived from IOException, so you can catch IOException and inspect the runtime class of e (and of its cause, if any) by name:
...
} catch (FileNotFoundException fnfe) {
    // ... handle FileNotFoundException
} catch (IOException e) {
    // Check e itself and its cause; getCause() may return null
    Throwable cause = e.getCause();
    boolean clientAbort = "ClientAbortException".equals(e.getClass().getSimpleName())
            || (cause != null && "ClientAbortException".equals(cause.getClass().getSimpleName()));
    if (clientAbort) {
        // ... handle ClientAbortException
    } else {
        // ... handle general IOException or another cause
    }
}
return null;
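The name-based check can be pulled into a small helper that walks the whole cause chain, which also covers the case where the container's exception is wrapped by something else. This is only a sketch using the JDK; the nested ClientAbortException class is a hypothetical stand-in for the Tomcat class so that the example runs without the container on the classpath:

```java
import java.io.IOException;

final class ClientAborts {
    // True if the throwable, or anything in its cause chain, is an instance of a
    // class whose simple name is ClientAbortException. Comparing names keeps the
    // servlet container's jar off the compile classpath.
    static boolean isClientAbort(Throwable t) {
        // The depth limit guards against a cyclic cause chain
        for (int depth = 0; t != null && depth < 16; depth++, t = t.getCause()) {
            if ("ClientAbortException".equals(t.getClass().getSimpleName())) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical stand-in for org.apache.catalina.connector.ClientAbortException
    static final class ClientAbortException extends IOException {
        ClientAbortException(Throwable cause) {
            super(cause);
        }
    }

    public static void main(String[] args) {
        IOException aborted = new ClientAbortException(new IOException("Connection reset by peer"));
        System.out.println("aborted download: " + isClientAbort(aborted));
        System.out.println("disk failure: " + isClientAbort(new IOException("disk full")));
    }
}
```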
Instead of looking for a certain class name (that ties your application to a specific servlet container) I usually handle IOExceptions on write differently than IOExceptions on read, like so (very pseudo-ish code):
try {
    byte[] buffer = ...
    in.read(buffer);
    try {
        out.write(buffer);
    } catch (IOException writeException) {
        // client aborted request
    }
} catch (IOException readException) {
    // something went wrong -> signal 50x or something else
}
Worked out quite fine so far.
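A runnable version of that idea might look like the sketch below. It uses only the JDK; the class name, the buffer size, and the choice to report a failed write through a boolean return value are assumptions made for illustration, not part of the answer above:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class DownloadCopy {
    // Copies in to out. Returns true when everything was written and false when
    // a write failed (treated here as "the client went away"). Read failures
    // propagate to the caller as IOException.
    static boolean copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int n;
        // An IOException thrown by read() is a real server-side problem
        while ((n = in.read(buffer)) != -1) {
            try {
                out.write(buffer, 0, n);
            } catch (IOException writeException) {
                // Assumed: the client aborted the request, nothing more to send
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        boolean complete = copy(new ByteArrayInputStream(new byte[] {1, 2, 3}), sink);
        System.out.println("complete=" + complete + ", bytes=" + sink.size());
    }
}
```

The caller can log a false result at debug level and still let a read-side IOException surface as a server error.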
(Regarding @Nikolas Charalambidis's answer)
To ignore "org.apache.catalina.connector.ClientAbortException: java.io.IOException: Connection reset by peer", note that for this exception:
e.getCause().getClass().getSimpleName() returns "IOException"
e.getMessage() returns "java.io.IOException: Connection reset by peer"
So check the message instead, and rethrow any other IOException:
...
} catch (IOException e) {
    String message = e.getMessage(); // may be null
    if (message == null || !message.contains("Connection reset by peer")) {
        throw e;
    }
} finally {
    close(output);
    close(input);
}
...
Here is the code sample; I want to catch the exception thrown by MyBatis:
String resource = "com/sureone/server/db/mybatis-config.xml";
Reader reader = null;
try {
    reader = Resources.getResourceAsReader(resource);
} catch (IOException e) {
    e.printStackTrace();
}
SqlSessionFactory factory = new SqlSessionFactoryBuilder().build(reader);
sqlSession = factory.openSession(true);
tUserMapper = sqlSession.getMapper(TUserMapper.class);
if (tUserMapper.insert(user) > 0) { // <=== exception thrown here on a duplicate entry
    return verifyLogin(user.getAccount(), user.getPassword());
}
return null;
The exception I want to catch:
org.apache.ibatis.exceptions.PersistenceException:
### Error updating database. Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry 'userName' for key 'account_UNIQUE'
You can catch the PersistenceException as you usually would:
try {
    ...
} catch (PersistenceException pe) {
}
But don't forget that this exception wraps the real one. From the MyBatis code:
} catch (Exception e) {
    throw ExceptionFactory.wrapException("Error committing transaction. Cause: " + e, e);
}
So if you want to get at the cause of the PersistenceException, call .getCause() on it.
Be aware that MyBatis also throws its own PersistenceException subclasses (TooManyResultsException, BindingException, ...); those won't wrap a cause exception.
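Since the cause may be absent or nested more than one level deep, a small helper that searches the cause chain for a given type can be handy. The sketch below uses only the JDK: RuntimeException stands in for PersistenceException, and it assumes that the MySQL driver's MySQLIntegrityConstraintViolationException extends java.sql.SQLIntegrityConstraintViolationException, which is worth verifying for the driver version in use. The class and method names are made up for illustration:

```java
import java.sql.SQLIntegrityConstraintViolationException;

final class Causes {
    // Returns the first throwable in the cause chain (starting with t itself)
    // that is an instance of type, or null if there is none.
    static <T extends Throwable> T findCause(Throwable t, Class<T> type) {
        // The depth limit guards against a cyclic cause chain
        for (int depth = 0; t != null && depth < 16; depth++, t = t.getCause()) {
            if (type.isInstance(t)) {
                return type.cast(t);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // RuntimeException stands in for MyBatis's PersistenceException here
        RuntimeException wrapped = new RuntimeException("Error updating database.",
                new SQLIntegrityConstraintViolationException("Duplicate entry"));
        SQLIntegrityConstraintViolationException duplicate =
                findCause(wrapped, SQLIntegrityConstraintViolationException.class);
        System.out.println(duplicate != null
                ? "constraint violation: " + duplicate.getMessage()
                : "some other failure");
    }
}
```

In the catch block, a null result from findCause means the failure was not a constraint violation and should probably be rethrown.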
You can catch the MyBatis exception by adding a try/catch block around the statements that invoke a MyBatis query/insert. For instance, if you use the SqlSessionTemplate and the selectList() method, you can do this:
try {
    myResults = mySqlSessionTemplate.selectList("getInfoList", parameterMap);
} catch (final org.apache.ibatis.exceptions.PersistenceException ex) {
    logger.error("Problem accessing database");
    throw ex;
}
Whether you re-throw the exception or consume and deal with it here is your choice. However, beware of "eating" it and not dealing with the problem, since this will allow calling code to progress without knowing about the underlying data access problem.
If you want more info on the error, the full source can be downloaded here
Hey, I'm reading an ini file using java.util.Properties, and I've run into a strange issue. When I try to load a specific file, it throws this exception, which I've spent about a day trying to eliminate.
java.io.IOException: Read error
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(Unknown Source)
at java.util.Properties$LineReader.readLine(Unknown Source)
at java.util.Properties.load0(Unknown Source)
at java.util.Properties.load(Unknown Source)
at IniReader.load(IniReader.java:20)
at plane.<init>(plane.java:22)
at renderingArea.<init>(flight_optimizer.java:93)
at flight_optimizer_GUI.<init>(flight_optimizer.java:159)
at flight_optimizer.main(flight_optimizer.java:46)
I had previously been reading this file with no problems. I then changed how I was calling it a bit and had to add an extra line at the bottom. If I remove that line, the problem does not occur.
The txt file is:
x=0
y=0
max_velocity=.1
passengers=100
num_planes=1
If I remove the num_planes=1 line, the file gets read fine.
Relevant code:
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
public class IniReader {
    //global vars
    public IniReader(){
        // initializing stuff
    }
    public void load(InputStream inStream) throws IOException {
        this.inStream = inStream;
        this.properties.load(this.inStream);
        this.keys = this.properties.propertyNames();
        inStream.close();
    }
}
class renderingArea extends JPanel {
    //Global vars
    private IniReader ini;
    public renderingArea(){
        super();
        // Initializing some things
        files = new fileManager();
        ini = new IniReader();
        FileInputStream planeStream;
        FileInputStream cityStream;
        try {
            planeStream = files.getIni("plane.ini");
            ini.load(planeStream);
            //extraneous code
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e1) {
            e1.printStackTrace();
        } catch (NumberFormatException e1) {
            e1.printStackTrace();
        }
    }
    //moar extraneous code
}
Here is why.
Your code (flight_optimizer.java, line 82 onward):
FileInputStream planeStream;
...
planeStream = files.getIni("plane.ini");
ini.load(planeStream);
...
for( int i=0; i<planes.length; i++ ){
    planes[i] = new plane(planeStream);
}
Both the ini.load(planeStream) call and every loop iteration lead here (IniReader.java, line 17):
public void load(InputStream inStream) throws IOException {
    this.inStream = inStream;
    this.properties.load(this.inStream);
    this.keys = this.properties.propertyNames();
    inStream.close();
}
You are using the same InputStream multiple times, and after the first load() it has already been closed. That also explains why the extra num_planes line triggers the error: it makes the loop run. You will need to recreate the stream for each read or, preferably, read the configuration once and reuse it.
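"Read the configuration once and reuse it" could look roughly like the following sketch. Plane, loadOnce and createPlanes are hypothetical names, and the keys are taken from the plane.ini shown in the question; the real plane class presumably does more than this:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class PlaneConfigDemo {
    // Hypothetical stand-in for the question's plane class: it takes already
    // parsed settings instead of a stream.
    static final class Plane {
        final double maxVelocity;
        final int passengers;

        Plane(Properties config) {
            maxVelocity = Double.parseDouble(config.getProperty("max_velocity", "0"));
            passengers = Integer.parseInt(config.getProperty("passengers", "0"));
        }
    }

    // Reads and closes the stream exactly once
    static Properties loadOnce(InputStream in) throws IOException {
        try (InputStream stream = in) {
            Properties props = new Properties();
            props.load(stream);
            return props;
        }
    }

    // Every plane is built from the same in-memory Properties object
    static Plane[] createPlanes(Properties config) {
        int count = Integer.parseInt(config.getProperty("num_planes", "1"));
        Plane[] planes = new Plane[count];
        for (int i = 0; i < count; i++) {
            planes[i] = new Plane(config);
        }
        return planes;
    }

    public static void main(String[] args) throws IOException {
        String ini = "x=0\ny=0\nmax_velocity=.1\npassengers=100\nnum_planes=2\n";
        Properties config = loadOnce(new ByteArrayInputStream(ini.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println("planes created: " + createPlanes(config).length);
    }
}
```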
As a side note, the recommended way to use streams in Java is the following:
InputStream is = ...;
try {
    // Reading from the stream
} finally {
    is.close();
}
This will make sure that the system resources associated with the stream will always be released.
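On Java 7 and later, try-with-resources expresses the same guarantee more compactly. The sketch below is JDK-only; TrackingStream is a made-up wrapper that exists purely to make the close() call observable:

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

final class TryWithResourcesDemo {
    // Wrapper that records whether close() was called, for demonstration only
    static final class TrackingStream extends FilterInputStream {
        boolean closed;

        TrackingStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Reads the first byte; the stream is closed when the block exits,
    // even if read() throws.
    static int firstByte(InputStream source) throws IOException {
        try (InputStream is = source) {
            return is.read();
        }
    }

    public static void main(String[] args) throws IOException {
        TrackingStream stream = new TrackingStream(new ByteArrayInputStream(new byte[] {42}));
        int value = firstByte(stream);
        System.out.println("value=" + value + ", closed=" + stream.closed);
    }
}
```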
I had the same problem. Turns out that my underlying InputStream was already closed. That became obvious when I ran my test under Linux, where a more meaningful error message was emitted by the operating system.
While upgrading Sun Application Server 8.2 to a new patch level, an exception occurred and I don't know why. Here is a code snippet from a servlet:
public void init() throws ServletException {
    Properties reqProperties = new Properties();
    try {
        reqProperties.load(this.getClass().getResourceAsStream(
                "/someFile.properties"));
    } catch (IOException e) {
        e.printStackTrace();
    }
    ...
}
The file does exist on the classpath, and with previous patch versions it worked just fine, but now deploying results in an exception. The stack trace:
[#|2010-04-14T16:43:48.208+0200|WARNING|sun-appserver-ee8.2|javax.enterprise.system.core.classloading|_ThreadID=11;|loader.InputStreams with no valid reference is closed
java.lang.Throwable
at com.sun.enterprise.loader.EJBClassLoader$SentinelInputStream.<init>(EJBClassLoader.java:1172)
at com.sun.enterprise.loader.EJBClassLoader.getResourceAsStream(EJBClassLoader.java:858)
at java.lang.Class.getResourceAsStream(Class.java:1998)
at a.package.TestServlet.init(TestServlet.java:44)
at javax.servlet.GenericServlet.init(GenericServlet.java:261)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:592)
at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:249)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAsPrivileged(Subject.java:517)
at org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:282)
at org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:165)
at org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:118)
at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1093)
at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:931)
at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4183)
at org.apache.catalina.core.StandardContext.start(StandardContext.java:4535)
at com.sun.enterprise.web.WebModule.start(WebModule.java:241)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1086)
at org.apache.catalina.core.StandardHost.start(StandardHost.java:847)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1086)
at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:483)
at org.apache.catalina.startup.Embedded.start(Embedded.java:894)
at com.sun.enterprise.web.WebContainer.start(WebContainer.java:741)
at com.sun.enterprise.web.HttpServiceWebContainer.startInstance(HttpServiceWebContainer.java:963)
at com.sun.enterprise.web.HttpServiceWebContainerLifecycle.onStartup(HttpServiceWebContainerLifecycle.java:50)
at com.sun.enterprise.server.ApplicationServer.onStartup(ApplicationServer.java:300)
at com.sun.enterprise.server.PEMain.run(PEMain.java:308)
at com.sun.enterprise.server.PEMain.main(PEMain.java:221)
|#]
I have no idea what the problem could be. Does anyone have an idea?
(note that I changed some names in the code and stacktrace)
Are you sure it throws an exception? We get warnings like this in Glassfish all the time. The EJBClassLoader uses a throwable to dump the stack trace so it may look like an exception to you.
EJBClassLoader wraps all streams with sentinels. This warning simply tells you that your stream is not closed. You can safely ignore it. To get rid of the warning, you have to close the stream after you use it.
You should always close input streams after using them:
public void init() throws ServletException {
    InputStream str = null;
    Properties reqProperties = new Properties();
    try {
        str = this.getClass().getResourceAsStream("/someFile.properties");
        reqProperties.load(str);
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        if (str != null) {
            try {
                str.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
By the way, the finally clause can be made a lot simpler using Apache Commons IO:
finally {
    IOUtils.closeQuietly(str);
}
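Without any extra library, try-with-resources gives the same cleanup, and it also makes it easy to handle a missing resource: getResourceAsStream returns null when nothing is found, and passing null to Properties.load would fail. A minimal sketch follows; the class name and the policy of returning empty Properties for a missing file are assumptions, not something the answers above prescribe:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

final class ResourceProperties {
    // Loads a classpath resource into Properties and always closes the stream.
    // Returns empty Properties when the resource is missing (one possible policy).
    static Properties load(Class<?> owner, String resourceName) throws IOException {
        Properties props = new Properties();
        // try-with-resources accepts a null resource and simply skips close()
        try (InputStream in = owner.getResourceAsStream(resourceName)) {
            if (in != null) {
                props.load(in);
            }
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        Properties props = load(ResourceProperties.class, "/someFile.properties");
        System.out.println("loaded " + props.size() + " entries");
    }
}
```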