Read xlsx file using POIFSFileSystem - java

I need to unprotect a password-protected xlsx file, e.g. Book1.xlsx.
The code below runs fine the first time: it reads Book1.xlsx, decrypts it and writes it back to the same filename.
public static void unprotectXLSXSheet(String fileName, String password) {
try{
POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream(fileName));
EncryptionInfo info = new EncryptionInfo(fs);
Decryptor d = Decryptor.getInstance(info);
d.verifyPassword(password);
InputStream is = d.getDataStream(fs);
System.out.println(is.available());
XSSFWorkbook wb = new XSSFWorkbook(OPCPackage.open(is));
FileOutputStream fileOut;
fileOut = new FileOutputStream(fileName);
wb.write(fileOut);
fileOut.flush();
fileOut.close();
}catch(FileNotFoundException ex){
ex.printStackTrace();
}catch(IOException ex){
ex.printStackTrace();
}
}
But when the same code tries to open the newly created unprotected Book1.xlsx (or any other unprotected xlsx file), it fails with:
Exception in thread "main" org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:131)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:104)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:138)
at com.wolseley.Excel.TestMainDummy.unprotectXLSXSheet(TestMainDummy.java:113)
at com.wolseley.Excel.TestMainDummy.main(TestMainDummy.java:52)
I need help reading an xlsx file and, when it is protected, unlocking it with a password as done above.

Basically the following line of code doesn't work for Office 2007+ XML documents:
POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream(fileName));
So you'll first need to check whether the header of the input stream is one that POIFS supports, by calling this:
POIFSFileSystem.hasPOIFSHeader(is)
and only decrypting if the above returns true. The hasPOIFSHeader method requires an input stream that supports mark/reset, so check that as well and wrap it in a PushbackInputStream if not.
Putting it all together then becomes something like this:
public static void unprotectXLSXSheet(String fileName, String password) throws Exception {
InputStream is = null;
FileOutputStream fileOut = null;
try {
is = new FileInputStream(fileName);
if (!is.markSupported()) {
is = new PushbackInputStream(is, 8);
}
if (POIFSFileSystem.hasPOIFSHeader(is)) {
POIFSFileSystem fs = new POIFSFileSystem(is);
EncryptionInfo info = new EncryptionInfo(fs);
Decryptor d = Decryptor.getInstance(info);
d.verifyPassword(password);
is = d.getDataStream(fs);
}
System.out.println(is.available());
XSSFWorkbook wb = new XSSFWorkbook(OPCPackage.open(is));
fileOut = new FileOutputStream(fileName);
wb.write(fileOut);
fileOut.flush();
} finally {
if (is != null) {
is.close();
}
if (fileOut != null) {
fileOut.close();
}
}
}
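To make the header check less of a black box, here is a minimal plain-JDK sketch of the same idea: peek at the first 8 bytes, compare them with the OLE2 compound-document signature, then rewind the stream. It is not POI's implementation; the class name HeaderPeek and the method looksLikeOle2 are made up for this illustration. An encrypted .xlsx is wrapped in an OLE2 container, while an unprotected .xlsx is a plain ZIP that starts with "PK".

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class HeaderPeek {
    // Signature at the start of every OLE2 compound document
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    /**
     * Hypothetical helper: peeks at the first 8 bytes and rewinds the stream.
     * The stream must support mark/reset (e.g. a BufferedInputStream).
     */
    public static boolean looksLikeOle2(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(OLE2_MAGIC.length);
        byte[] header = new byte[OLE2_MAGIC.length];
        int read = 0;
        try {
            // read() may return fewer bytes than requested, so loop
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    break;
                }
                read += n;
            }
        } finally {
            in.reset(); // leave the stream positioned at the start again
        }
        if (read < header.length) {
            return false;
        }
        for (int i = 0; i < header.length; i++) {
            if ((header[i] & 0xFF) != OLE2_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] ole2 = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
                       (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1, 0, 0};
        byte[] zip = {'P', 'K', 3, 4, 0, 0, 0, 0, 0, 0};
        System.out.println(looksLikeOle2(new BufferedInputStream(new ByteArrayInputStream(ole2))));
        System.out.println(looksLikeOle2(new BufferedInputStream(new ByteArrayInputStream(zip))));
    }
}
```

Wrapping a FileInputStream in a BufferedInputStream is one way to get mark/reset support, since FileInputStream itself does not provide it.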

The older Stack Overflow answer below may help with this.
Reading property sets from Office 2007+ documents with java poi
The class you'll want is POIXMLProperties, something like:
OPCPackage pkg = OPCPackage.open(new File("file.xlsx"));
POIXMLProperties props = new POIXMLProperties(pkg);
System.out.println("The title is " + props.getCorePart().getTitle());
From POIXMLProperties you can get access to all the built-in properties, and the custom ones too!

Related

Unable to create HSSFWorkbook workbook for a .xls file that is within a ZIP file

My requirement is that there is a .xls file within a zip file, which can be downloaded using a URL. As I want to read this Excel file in memory (for later processing) without downloading the zip locally, I have used ZipInputStream, and this is what the main part of my code looks like:
String finalUrl = "https://server/myZip.zip";
URL url = new URL(finalUrl);
InputStream inputStream = new BufferedInputStream(url.openStream());
ZipInputStream zis = new ZipInputStream(inputStream);
ZipEntry file;
try {
while ((file = zis.getNextEntry()) != null) {
if (file.getName().endsWith(".xls")) {
log.info("xls file found");
log.info("file name : {}", file.getName());
byte excelBytes[] = new byte[(int)file.getSize()];
zis.read(excelBytes);
InputStream excelInputStream = new ByteArrayInputStream(excelBytes);
HSSFWorkbook wb = new HSSFWorkbook(excelInputStream);
HSSFSheet sheet = wb.getSheetAt(8);
log.info("sheet : {}", sheet.getSheetName());
}
else {
log.info("xls file not found");
}
}
}
finally{
zis.close();
}
But unfortunately I am receiving the following error:
java.lang.ArrayIndexOutOfBoundsException: Index -3 out of bounds for length 3247
Note:
The .xls file is around 2MB and the zip file does not have any complex structure such as sub-directories or multiple files.
Any help here would be highly appreciated. Thanks!
Thanks to @PJ Fanning for highlighting this.
The problem was in zis.read(excelBytes), which is not guaranteed to read all the bytes in a single call. After using IOUtils.toByteArray instead, the problem was resolved. The correct code is:
String finalUrl = "https://server/myZip.zip";
URL url = new URL(finalUrl);
InputStream inputStream = new BufferedInputStream(url.openStream());
ZipInputStream zis = new ZipInputStream(inputStream);
ZipEntry file;
try {
while ((file = zis.getNextEntry()) != null) {
if (file.getName().endsWith(".xls")) {
log.info("xls file found");
log.info("file name : {}", file.getName());
byte excelBytes[] = IOUtils.toByteArray(zis);
InputStream excelInputStream = new ByteArrayInputStream(excelBytes);
HSSFWorkbook wb = new HSSFWorkbook(excelInputStream);
HSSFSheet sheet = wb.getSheetAt(8);
log.info("sheet : {}", sheet.getSheetName());
}
else {
log.info("xls file not found");
}
}
}
finally{
zis.close();
}
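If pulling in IOUtils is not an option, the same fix can be sketched with the plain JDK: keep calling read() until it returns -1, which for a ZipInputStream marks the end of the current entry. The class ZipEntryReader and its methods are hypothetical names for this illustration. The loop also avoids sizing the array from ZipEntry.getSize(), which is documented to return -1 when the size is unknown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipEntryReader {
    /** Reads until end of stream; for a ZipInputStream that is the end of the current entry. */
    public static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        // A single read() may return fewer bytes than the buffer holds, so loop until -1
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    /** Returns the bytes of the first entry whose name ends with the suffix, or null if none. */
    public static byte[] firstEntryEndingWith(InputStream zipData, String suffix) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(zipData)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.getName().endsWith(suffix)) {
                    return readFully(zis);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Build a small zip in memory so the sketch runs without a network or files
        byte[] payload = new byte[100_000];
        ByteArrayOutputStream zipped = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(zipped)) {
            zos.putNextEntry(new ZipEntry("report.xls"));
            zos.write(payload);
            zos.closeEntry();
        }
        byte[] bytes = firstEntryEndingWith(new ByteArrayInputStream(zipped.toByteArray()), ".xls");
        System.out.println("bytes read: " + bytes.length);
    }
}
```

The resulting byte array can then be wrapped in a ByteArrayInputStream and handed to HSSFWorkbook as in the code above.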

Domino Java extract attachment and read excel using Apache POI

I am trying to extract an "Excel template" from an existing Notes document to the file system and read it using Apache POI. I extract the template only if it does not already exist in the directory; otherwise I read it from the same location, but it throws a null exception on the second run.
Here is my code:
//If it has an attachment
if (doc.hasEmbedded()) {
RichTextItem body = (RichTextItem)docTmpl.getFirstItem("Body");
Vector atts = body.getEmbeddedObjects();
EmbeddedObject att = (EmbeddedObject)atts.elementAt(0);
if ( att.getType() == EmbeddedObject.EMBED_ATTACHMENT ) {
String templatePath = "C:\\externalfiles\\excel\\" + att.getSource();
File f = new File( templatePath );
if(f.exists())
{
System.out.println("Template already exist,use existing one." + templatePath);
} else{
att.extractFile(templatePath);
}
XSSFWorkbook wb = new XSSFWorkbook(new FileInputStream(templatePath));
FileOutputStream fileOut = new FileOutputStream(fileName);
XSSFSheet sheet1 = wb.getSheet("General");

How to read embedded objects from a doc file?

I am trying to read embedded documents in a .doc file using POIFS file system. According to this there is an ObjectPool directory which contains all embedded documents especially for a doc file.
I found the directory but don't know the way to read these documents.
Please suggest any way to read these documents. If POIFS is not the suitable way then please suggest any other library.
My code is:
public static void ReadCSV(String fileName) throws IOException{
FileInputStream myInput = new FileInputStream(fileName);
POIFSFileSystem fs = new POIFSFileSystem(myInput);
HSSFWorkbook workbook = new HSSFWorkbook(fs);
for (HSSFObjectData obj : workbook.getAllEmbeddedObjects()) {
//the OLE2 Class Name of the object
System.out.println("Objects : "+ obj.getOLE2ClassName()+ " 2 .");
String oleName = obj.getOLE2ClassName();
if (oleName.equals("Worksheet")) {
// some code to process embedded excel file;
} else if (oleName.equals("Document")) {
System.out.println("Document");
DirectoryNode dn = (DirectoryNode) obj.getDirectory();
HWPFDocument embeddedWordDocument = new HWPFDocument(dn,fs);
String text = embeddedWordDocument.getRange().text();
System.out.println("Doc : " + text);
// want to extract document not text into a doc file
//************************
FileOutputStream fos = new FileOutputStream("E:\\log.txt");
fos.write(text.getBytes());
//************************
} else if (oleName.equals("Presentation")) {
// some code to process embedded power point file;
} else {
// some code to process other kind of embedded files;
}
}
}

Unable to open excel using ApachePOI - Getting Exception

While trying to open an excel using ApachePOI I get
org.apache.poi.openxml4j.exceptions.InvalidOperationException: Can't open the specified file: 'C:\Users\mdwaipay\AppData\Local\Temp\poifiles\poi-ooxml-1570030023.tmp'
I checked: no such folder is being created. I am using Apache POI version 3.6.
Any help? Similar code was running fine in a different workspace. I am at a loss here.
Code:
public Xls_Reader(String path) {
this.path=path;
try {
fis = new FileInputStream(path);
workbook = new XSSFWorkbook(fis);
sheet = workbook.getSheetAt(0);
fis.close();
}
catch (Exception e)
{ e.printStackTrace();
}
}
Why are you taking a perfectly good file, wrapping it in an InputStream, then asking POI to have to buffer the whole lot for you so it can do random access? Life is much better if you just pass the File to POI directly, so it can skip about as needed!
If you want to work with both XSSF (.xlsx) and HSSF (.xls), change your code to be
public Xls_Reader(String path) {
this.path = path;
try {
File f = new File(path);
workbook = WorkbookFactory.create(f);
sheet = workbook.getSheetAt(0);
} catch (Exception e) {
e.printStackTrace();
}
}
If you only want XSSF support, and/or you need full control of when the resources get closed, instead do something like
OPCPackage pkg = OPCPackage.open(path);
Workbook wb = new XSSFWorkbook(pkg);
// use the workbook
// When you no longer need it, close it immediately to release the file resources
pkg.close();

Preserve images in Excel headers using Apache POI

I am trying to generate Excel reports using Apache POI 3.6 (latest).
Since POI has limited support for header and footer generation (text only), I decided to start from a blank excel file with the header already prepared and fill the Excel cells using POI (cf. question 714172).
Unfortunately, when opening the workbook with POI and writing it immediately to disk (without any cell manipulation), the header seems to be lost.
Here is the code I used to test this behavior:
public final class ExcelWorkbookCreator {
public static void main(String[] args) {
FileOutputStream outputStream = null;
try {
outputStream = new FileOutputStream(new File("dump.xls"));
InputStream inputStream = ExcelWorkbookCreator.class.getResourceAsStream("report_template.xls");
HSSFWorkbook workbook = new HSSFWorkbook(inputStream, true);
workbook.write(outputStream);
} catch (Exception exception) {
throw new RuntimeException(exception);
} finally {
if (outputStream != null) {
try {
outputStream.close();
} catch (IOException exception) {
// Nothing much to do
}
}
}
}
}
HSSFWorkbook workbook = new HSSFWorkbook();
HSSFSheet sheet = workbook.createSheet(); // sheets are created through the workbook
Header header = sheet.getHeader(); // get the header from the workbook's sheet
header.setCenter(HSSFHeader.font("COURIER", "Normal") + HSSFHeader.fontSize((short) 15) + "Hello world" + new Date()); // set the header text with the desired font style
FileOutputStream fileOut = new FileOutputStream("C:\\book.xls");
workbook.write(fileOut);
fileOut.close();
The headers of the Excel file are preserved as long as those headers are supported in Excel 97-2003. For example, images are supported (I just tried it), but colored text is not.
The tricky part of this is that your Excel template file "report_template.xls" must be in Excel 97-2003 format. Please note: this is not about the file extension, but the actual contents of the file. The newest Excel will happily save the newest formatting in a .xls file, which POI cannot read.
To test this, save your Excel file as an .xls file. Important - If you receive a compatibility warning, then you must click the "Correct" link in the dialog to correct the Excel. Just clicking Proceed makes the Excel file invalid to POI.
Once you have a real .xls file (with compatible contents) then your code works. I just tested it myself:
public static void main(String[] args) throws Exception {
try (FileInputStream fis = new FileInputStream("./report_template.xls");
FileOutputStream fos = new FileOutputStream("./dump.xls")) {
HSSFWorkbook wb = new HSSFWorkbook(fis);
wb.write(fos);
}
}
