Embed files into XSSF sheets in Excel, using Apache POI - java

I found kiwiwings' answer to the question of how to embed files into Excel using Apache POI. Unfortunately, that answer only covers HSSF spreadsheets (the XLS format), and we are using the newer XSSF format (XLSX), where the solution proposed for HSSF does not work. I tried porting it, but the final nail in the coffin is that there is no HSSFObjectData equivalent in the XSSF world.
This is what I have done so far - I have found a way to attach the files to the Excel file. This code does it:
private PackagePart packageNotebook(
final OPCPackage pkg,
final String notebookTable,
final String taskId,
final String notebookName,
final byte[] contents
) throws InvalidFormatException, IOException
{
final PackagePartName partName =
PackagingURIHelper.createPartName( "/notebook/" + notebookTable + "/" + taskId + "/" + notebookName );
pkg.addRelationship( partName, TargetMode.INTERNAL, PackageRelationshipTypes.CUSTOM_XML );
final PackagePart part = pkg.createPart( partName, "text/xml" );
IOUtils.write( contents, part.getOutputStream() );
return part;
}
I was also able to create the image I wanted to use as the anchor in the Excel file. What I am unable to do, however, is to "link" that image to the embedded content, as kiwiwings was able to do in his reply.
My end goal is an XLSX file with embedded objects, so that the user can double-click the anchor image in a cell and edit the embedded file, just as if the file had been embedded using the Excel client.
Does anyone have a working example on how to do that?

I've applied a patch via #60586, so embedding is now much easier. The following snippet is taken from the corresponding JUnit test.
Workbook wb1 = new XSSFWorkbook();
Sheet sh = wb1.createSheet();
int picIdx = wb1.addPicture(getSamplePng(), Workbook.PICTURE_TYPE_PNG);
byte samplePPTX[] = getSamplePPT(true);
int oleIdx = wb1.addOlePackage(samplePPTX, "dummy.pptx", "dummy.pptx", "dummy.pptx");
Drawing<?> pat = sh.createDrawingPatriarch();
ClientAnchor anchor = pat.createAnchor(0, 0, 0, 0, 1, 1, 3, 6);
pat.createObjectData(anchor, oleIdx, picIdx);

Compared to the HSSF/Package Manager approach, this was more straightforward. :)
As usual, I started by creating the required file in Excel 2016 and checking what's inside the XML parts. Office likes to add a lot of AlternateContent tags. In the solution below I've removed those wrappers and directly provided the elements from the original "Choice" elements; at least with Excel 2016 this works. Be aware that the embeddings in the original Excel 2016 files couldn't be opened in LibreOffice 5 or Excel Viewer, so your users need a regular Office installation.
As I've implemented it against the full developer codebase of POI, you might need to use the full schemas.
The preview pictures will be replaced when the user opens the embedded objects. It would be nice if POI's WMF packages could be used to generate preview images on the fly, but so far I have only implemented them read-only :(
If the embedded elements can't be opened, please let me know which Office version your users have installed, and I'll try to adapt the solution accordingly.
package org.apache.poi.xssf;
import java.awt.geom.Rectangle2D;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.HashSet;
import java.util.Set;
import javax.xml.namespace.QName;
import org.apache.poi.POIXMLDocument;
import org.apache.poi.hpsf.ClassID;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.openxml4j.opc.PackagePart;
import org.apache.poi.openxml4j.opc.PackagePartName;
import org.apache.poi.openxml4j.opc.PackageRelationship;
import org.apache.poi.openxml4j.opc.PackageRelationshipTypes;
import org.apache.poi.openxml4j.opc.PackagingURIHelper;
import org.apache.poi.openxml4j.opc.TargetMode;
import org.apache.poi.poifs.filesystem.Ole10Native;
import org.apache.poi.poifs.filesystem.Ole10NativeException;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFTextBox;
import org.apache.poi.xssf.usermodel.XSSFClientAnchor;
import org.apache.poi.xssf.usermodel.XSSFDrawing;
import org.apache.poi.xssf.usermodel.XSSFPicture;
import org.apache.poi.xssf.usermodel.XSSFRelation;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.xmlbeans.XmlCursor;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;
import org.junit.Test;
import org.openxmlformats.schemas.drawingml.x2006.main.CTOfficeArtExtension;
import org.openxmlformats.schemas.drawingml.x2006.main.CTOfficeArtExtensionList;
import org.openxmlformats.schemas.drawingml.x2006.spreadsheetDrawing.CTPicture;
import org.openxmlformats.schemas.drawingml.x2006.spreadsheetDrawing.CTTwoCellAnchor;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTOleObject;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTOleObjects;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTWorksheet;
public class TestEmbed {
static final String drawNS = "http://schemas.microsoft.com/office/drawing/2010/main";
static final String relationshipsNS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
// write some embedded objects to sheet
@Test
public void write() throws IOException, InvalidFormatException {
XSSFWorkbook wb = new XSSFWorkbook();
XSSFSheet sh = wb.createSheet();
int imgPptId = addImageToWorkbook(wb, "ppt-icon.jpg", Workbook.PICTURE_TYPE_JPEG);
int imgPckId = addImageToWorkbook(wb, "PackageIcon.png", Workbook.PICTURE_TYPE_PNG);
String imgPckRelId = addImageToSheet(sh, imgPckId, Workbook.PICTURE_TYPE_PNG);
String imgPptRelId = addImageToSheet(sh, imgPptId, Workbook.PICTURE_TYPE_JPEG);
// embed two different HTML pages via package manager
XSSFClientAnchor imgAnchor1 = new XSSFClientAnchor(0, 0, 0, 0, 1, 1, 3, 3);
String oleRelId1 = addHtml(sh, 1);
int shapeId1 = addImageToShape(sh, imgAnchor1, imgPckId);
addObjectToShape(sh, imgAnchor1, shapeId1, oleRelId1, imgPckRelId, "Objekt-Manager-Shellobjekt");
XSSFClientAnchor imgAnchor2 = new XSSFClientAnchor(0, 0, 0, 0, 5, 1, 7, 3);
String oleRelId2 = addHtml(sh, 2);
int shapeId2 = addImageToShape(sh, imgAnchor2, imgPckId);
addObjectToShape(sh, imgAnchor2, shapeId2, oleRelId2, imgPckRelId, "Objekt-Manager-Shellobjekt");
// embed a slideshow (no package manager needed)
XSSFClientAnchor imgAnchor3 = new XSSFClientAnchor(0, 0, 0, 0, 1, 5, 7, 10);
String oleRelId3 = addSlideShow(sh, 1);
int shapeId3 = addImageToShape(sh, imgAnchor3, imgPptId);
addObjectToShape(sh, imgAnchor3, shapeId3, oleRelId3, imgPptRelId, "Presentation");
FileOutputStream fos = new FileOutputStream("bla.xlsx");
wb.write(fos);
fos.close();
wb.close();
}
// read Ole10Native objects from workbook
@Test
public void read() throws IOException, XmlException, Ole10NativeException {
XSSFWorkbook wb = new XSSFWorkbook(new FileInputStream("bla.xlsx"));
XSSFSheet sheet = wb.getSheetAt(0);
CTWorksheet cws = sheet.getCTWorksheet();
if (!cws.isSetOleObjects()) {
System.out.println("sheet has no ole objects");
} else {
Set<Integer> processedShapes = new HashSet<Integer>();
for (XmlObject xOleObj : cws.getOleObjects().selectPath("declare namespace p='"+XSSFRelation.NS_SPREADSHEETML+"' .//p:oleObject")) {
XmlCursor cur = xOleObj.newCursor();
String shapeId = cur.getAttributeText(new QName("shapeId"));
String relId = cur.getAttributeText(new QName(relationshipsNS, "id"));
cur.dispose();
if (processedShapes.contains(Integer.valueOf(shapeId))) {
continue;
}
processedShapes.add(Integer.valueOf(shapeId));
PackagePart pp = sheet.getRelationById(relId).getPackagePart();
if ("application/vnd.openxmlformats-officedocument.oleObject".equals(pp.getContentType())) {
InputStream is = pp.getInputStream();
POIFSFileSystem poifs = new POIFSFileSystem(is);
is.close();
Ole10Native ole10 = Ole10Native.createFromEmbeddedOleObject(poifs);
poifs.close();
System.out.println("Filename: "+ole10.getFileName()+" - content length: "+ole10.getDataSize());
}
}
}
wb.close();
}
// add a dummy html to the embeddings folder
private static String addHtml(XSSFSheet sh, int oleId) throws IOException, InvalidFormatException {
String html10 = "<html><body><marquee>This is the end. Html-id: "+oleId+"</marquee></body></html>";
Ole10Native ole10 = new Ole10Native("html"+oleId+".html", "html"+oleId+".html", "html"+oleId+".html", html10.getBytes("ISO-8859-1"));
ByteArrayOutputStream bos = new ByteArrayOutputStream(500);
ole10.writeOut(bos);
POIFSFileSystem poifs = new POIFSFileSystem();
poifs.getRoot().createDocument(Ole10Native.OLE10_NATIVE, new ByteArrayInputStream(bos.toByteArray()));
poifs.getRoot().setStorageClsid(ClassID.OLE10_PACKAGE);
final PackagePartName pnOLE = PackagingURIHelper.createPartName( "/xl/embeddings/oleObject"+oleId+".bin" );
final PackagePart partOLE = sh.getWorkbook().getPackage().createPart( pnOLE, "application/vnd.openxmlformats-officedocument.oleObject" );
PackageRelationship prOLE = sh.getPackagePart().addRelationship( pnOLE, TargetMode.INTERNAL, POIXMLDocument.OLE_OBJECT_REL_TYPE );
OutputStream os = partOLE.getOutputStream();
poifs.writeFilesystem(os);
os.close();
poifs.close();
return prOLE.getId();
}
// add a dummy slideshow to the embeddings folder
private static String addSlideShow(XSSFSheet sh, int pptId) throws IOException, InvalidFormatException {
XMLSlideShow ppt = new XMLSlideShow();
XSLFTextBox tb = ppt.createSlide().createTextBox();
tb.setText("this is the end - PPT-ID: "+pptId);
tb.setAnchor(new Rectangle2D.Double(100,100,100,100));
final PackagePartName pnPPT = PackagingURIHelper.createPartName( "/xl/embeddings/sample"+pptId+".pptx" );
final PackagePart partPPT = sh.getWorkbook().getPackage().createPart( pnPPT, "application/vnd.openxmlformats-officedocument.presentationml.presentation" );
PackageRelationship prPPT = sh.getPackagePart().addRelationship( pnPPT, TargetMode.INTERNAL, POIXMLDocument.PACK_OBJECT_REL_TYPE );
OutputStream os = partPPT.getOutputStream();
ppt.write(os);
os.close();
ppt.close();
return prPPT.getId();
}
private static int addImageToWorkbook(XSSFWorkbook wb, String fileName, int pictureType) throws IOException {
FileInputStream fis = new FileInputStream(fileName);
int imgId = wb.addPicture(fis, pictureType);
fis.close();
return imgId;
}
private static String addImageToSheet(XSSFSheet sh, int imgId, int pictureType) throws InvalidFormatException {
final PackagePartName pnIMG = PackagingURIHelper.createPartName( "/xl/media/image"+(imgId+1)+(pictureType == Workbook.PICTURE_TYPE_PNG ? ".png" : ".jpeg") );
PackageRelationship prIMG = sh.getPackagePart().addRelationship( pnIMG, TargetMode.INTERNAL, PackageRelationshipTypes.IMAGE_PART );
return prIMG.getId();
}
private static int addImageToShape(XSSFSheet sh, XSSFClientAnchor imgAnchor, int imgId) {
XSSFDrawing pat = sh.createDrawingPatriarch();
XSSFPicture pic = pat.createPicture(imgAnchor, imgId);
CTPicture cPic = pic.getCTPicture();
int shapeId = (int)cPic.getNvPicPr().getCNvPr().getId();
cPic.getNvPicPr().getCNvPr().setHidden(true);
CTOfficeArtExtensionList extLst = cPic.getNvPicPr().getCNvPicPr().addNewExtLst();
// https://msdn.microsoft.com/en-us/library/dd911027(v=office.12).aspx
CTOfficeArtExtension ext = extLst.addNewExt();
ext.setUri("{63B3BB69-23CF-44E3-9099-C40C66FF867C}");
XmlCursor cur = ext.newCursor();
cur.toEndToken();
cur.beginElement(new QName(drawNS, "compatExt", "a14"));
cur.insertAttributeWithValue("spid", "_x0000_s"+shapeId);
// release the cursor once the extension element has been written
cur.dispose();
return shapeId;
}
private static void addObjectToShape(XSSFSheet sh, XSSFClientAnchor imgAnchor, int shapeId, String objRelId, String imgRelId, String progId) {
CTWorksheet cwb = sh.getCTWorksheet();
CTOleObjects oo = cwb.isSetOleObjects() ? cwb.getOleObjects() : cwb.addNewOleObjects();
CTOleObject ole1 = oo.addNewOleObject();
ole1.setProgId(progId);
ole1.setShapeId(shapeId);
ole1.setId(objRelId);
XmlCursor cur1 = ole1.newCursor();
cur1.toEndToken();
cur1.beginElement("objectPr", XSSFRelation.NS_SPREADSHEETML);
cur1.insertAttributeWithValue("id", relationshipsNS, imgRelId);
cur1.insertAttributeWithValue("defaultSize", "0");
cur1.beginElement("anchor", XSSFRelation.NS_SPREADSHEETML);
cur1.insertAttributeWithValue("moveWithCells", "1");
CTTwoCellAnchor anchor = CTTwoCellAnchor.Factory.newInstance();
anchor.setFrom(imgAnchor.getFrom());
anchor.setTo(imgAnchor.getTo());
XmlCursor cur2 = anchor.newCursor();
cur2.copyXmlContents(cur1);
cur2.dispose();
cur1.toParent();
cur1.toFirstChild();
cur1.setName(new QName(XSSFRelation.NS_SPREADSHEETML, "from"));
cur1.toNextSibling();
cur1.setName(new QName(XSSFRelation.NS_SPREADSHEETML, "to"));
cur1.dispose();
}
}
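Since an XLSX file is a ZIP package, a quick way to follow the "check what's inside" step is to list the parts under xl/embeddings/ with the JDK alone. The following is only a minimal sketch: the class name EmbeddingLister and the helper fakePackage are hypothetical, and the helper builds a tiny in-memory ZIP so the example runs without a real workbook.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class EmbeddingLister {

    // Returns the names of all parts stored under xl/embeddings/ in the given package stream.
    static List<String> listEmbeddings(InputStream xlsx) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory() && entry.getName().startsWith("xl/embeddings/")) {
                    names.add(entry.getName());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    // Hypothetical helper: builds a small in-memory ZIP that only mimics the part names of a workbook.
    static byte[] fakePackage(String... entryNames) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bos)) {
            for (String name : entryNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(new byte[] {0});
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] pkg = fakePackage("xl/workbook.xml",
                "xl/embeddings/oleObject1.bin",
                "xl/embeddings/sample1.pptx");
        System.out.println(listEmbeddings(new ByteArrayInputStream(pkg)));
    }
}
```

With a real file, passing a FileInputStream for the workbook (for example the bla.xlsx written by the test above) to listEmbeddings should show whether the oleObject and pptx parts actually ended up in the package.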

Unable to call the parameters from one class-method to another class-method

I have two classes: TestData, with the method excel(), and AdminLoginAction, with the method Admin_Login().
In Admin_Login() I need to use the values UID and PWD that excel() reads from the spreadsheet, but the script reports an error and cannot access them. How can I solve this? Is this the right approach, or is there a better way?
import java.io.IOException;
import java.util.Date;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.annotations.*;
import com.aventstack.extentreports.ExtentReports;
import com.aventstack.extentreports.ExtentTest;
import com.aventstack.extentreports.Status;
import com.aventstack.extentreports.reporter.ExtentSparkReporter;
public class AdminLoginAction extends TestData{
WebDriver d;
Date currentdate = new Date();
String Screenshotdate = currentdate.toString().replace(" ", "-").replace(":", "-");
ExtentSparkReporter spark = new ExtentSparkReporter("ExtentReport.html");
ExtentReports extent = new ExtentReports();
@Test()
public void Admin_Login() throws InterruptedException, IOException {
TestData excel = new TestData();
extent.attachReporter(spark);
ExtentTest test = extent.createTest("Launch browswer and access the WeClean Login page");
test.log(Status.PASS, "Launch browser success...!!!");
test.pass("Verified launching browser");
System.setProperty("webdriver.chrome.driver", "D:\\backup\\selenium\\chromedriver.exe");
d = new ChromeDriver();
d.manage().window().maximize();
d.get(URL);
Utils.CaptureScreenshot(d, Screenshotdate + "_Login.png");
d.findElement(By.id("loginUser")).sendKeys(UID);
d.findElement(By.id("password")).sendKeys(PWD);
d.findElement(By.id("loginButton")).click();
Thread.sleep(5000);
Utils.CaptureScreenshot(d, Screenshotdate + "_HomePage.png");
test.log(Status.PASS, "Admin Logged in Successful");
test.pass("Verified Admin logged in");
}
}
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Scanner;
import org.apache.poi.xssf.usermodel.XSSFCell;
import org.apache.poi.xssf.usermodel.XSSFRow;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.testng.annotations.Test;
public class TestData{
public void excel() throws IOException {
String filePath = System.getProperty("user.dir")+"\\Inputfiles";
File file = new File(filePath + "\\TestData.xlsx");
FileInputStream inputStream = new FileInputStream(file);
XSSFWorkbook wb=new XSSFWorkbook(inputStream);
XSSFSheet sheet=wb.getSheet("Admin_inputs");
XSSFRow row2=sheet.getRow(1);
XSSFCell cell=row2.getCell(0);
double UID= cell.getNumericCellValue();
XSSFCell cell2 = row2.getCell(1);
String PWD = cell2.getStringCellValue();
}
}
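UID and PWD are local variables inside excel(), so they are not visible outside that method. Declare them as fields of TestData instead. You can try it like this: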
public class TestData {
double UID;
String PWD;
public void excel() throws IOException {
String filePath = System.getProperty("user.dir") + "\\Inputfiles";
File file = new File(filePath + "\\TestData.xlsx");
FileInputStream inputStream = new FileInputStream(file);
XSSFWorkbook wb = new XSSFWorkbook(inputStream);
XSSFSheet sheet = wb.getSheet("Admin_inputs");
XSSFRow row2 = sheet.getRow(1);
XSSFCell cell = row2.getCell(0);
UID = cell.getNumericCellValue();
XSSFCell cell2 = row2.getCell(1);
PWD = cell2.getStringCellValue();
}
}
In the Admin_Login() method I am just printing UID and PWD; change the code according to your requirements:
public class AdminLoginAction extends TestData {
@Test()
public void Admin_Login() throws InterruptedException, IOException {
// TestData excel = new TestData(); // you don't need to create this object because you are already inheriting `TestData` class.
excel(); // before using UID and PWD you have to call excel() method.
System.out.println(UID);
System.out.println(PWD);
}
}
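As an alternative to inheritance, the loader can return the values in a small holder object, which avoids sharing mutable fields between the test and the data class. This is only a sketch under assumptions: Credentials, CredentialLoader and fromRow are hypothetical names, and the String array stands in for the cell values that the POI code above reads from the "Admin_inputs" sheet.

```java
// Immutable holder for the two values read from one spreadsheet row.
final class Credentials {
    final String userId;
    final String password;

    Credentials(String userId, String password) {
        this.userId = userId;
        this.password = password;
    }
}

class CredentialLoader {
    // Hypothetical stand-in for the POI lookup: row[0] is the user id, row[1] the password.
    static Credentials fromRow(String[] row) {
        if (row.length < 2) {
            throw new IllegalArgumentException("expected user id and password columns");
        }
        return new Credentials(row[0], row[1]);
    }
}

class LoginDemo {
    public static void main(String[] args) {
        // The caller receives the values explicitly instead of reading inherited fields.
        Credentials c = CredentialLoader.fromRow(new String[] {"admin", "secret"});
        System.out.println(c.userId + " / " + c.password);
    }
}
```

In the test method, the returned object would then be used directly, for example by passing c.userId and c.password to sendKeys.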

Apache poi : setting custom properties at worksheet level using apache poi

public static void main(String[] args) {
try {
FileInputStream file = new FileInputStream(new File("D://New Microsoft Excel Worksheet.xlsx"));
XSSFWorkbook wb = new XSSFWorkbook(file);
XSSFSheet sheet = wb.createSheet("newsheet5");
CTWorksheet ctSheet = sheet.getCTWorksheet();
CTCustomProperties props = ctSheet.addNewCustomProperties();
props.addNewCustomPr().setId("APACHE POI");
props.addNewCustomPr().setName("Tender no = 48");
props.addNewCustomPr().setId("APACHE POI 2");
props.addNewCustomPr().setName("tender no = 58");
ctSheet.setCustomProperties(props);
FileOutputStream out = new FileOutputStream("D://New Microsoft Excel Worksheet.xlsx");
wb.write(out);
out.close();
wb.close();
} catch (Exception e) {
e.printStackTrace();
}
}
The XLSX file is corrupted after writing custom properties at sheet level.
When I try to open the file, Excel reports: 'Excel cannot open the file because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file.'
Sheet custom properties are only usable from VBA. They are stored in the Excel file, but the values live in binary document parts (customProperty1.bin, customProperty2.bin, ...). Apache POI does not provide access to these so far.
Using XSSF, you need to create the binary document part and then get the relation ID of that part. After that, set CTCustomProperties and CTCustomProperty: the Id points to the binary document part containing the value, and the name is the property name.
The following complete example shows this. It is tested and works with Apache POI 4.1.2. It needs ooxml-schemas-1.4.jar on the class path, because the default poi-ooxml-schemas-4.1.2.jar does not contain all of the required low-level CT* classes.
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.IOException;
import org.apache.poi.xssf.usermodel.*;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.*;
import org.apache.poi.openxml4j.opc.*;
import org.apache.poi.ooxml.POIXMLDocumentPart;
import java.nio.charset.StandardCharsets;
class CreateExcelSheetCustomProperties {
static void setSheetCustomProperty(XSSFSheet sheet, String customPropertyName, String customPropertyValue) throws Exception {
OPCPackage opcpackage = sheet.getWorkbook().getPackage();
int i = opcpackage.getUnusedPartIndex("/customProperty#.bin");
PackagePartName partname = PackagingURIHelper.createPartName("/customProperty" + i + ".bin");
PackagePart part = opcpackage.createPart(partname, "application/vnd.openxmlformats-officedocument.spreadsheetml.customProperty");
POIXMLDocumentPart customProperty = new POIXMLDocumentPart(part) {
@Override
protected void commit() throws IOException {
PackagePart part = getPackagePart();
OutputStream out = part.getOutputStream();
try {
out.write(customPropertyValue.getBytes(StandardCharsets.UTF_16LE));
out.close();
} catch (Exception ex) {
ex.printStackTrace();
};
}
};
String rId = sheet.addRelation(null, XSSFRelation.CUSTOM_PROPERTIES, customProperty).getRelationship().getId();
CTWorksheet ctSheet = sheet.getCTWorksheet();
CTCustomProperties props = ctSheet.getCustomProperties();
if (props == null) props = ctSheet.addNewCustomProperties();
CTCustomProperty prop = props.addNewCustomPr();
prop.setId(rId);
prop.setName(customPropertyName);
}
public static void main(String[] args) throws Exception {
try (XSSFWorkbook workbook = new XSSFWorkbook();
FileOutputStream fileout = new FileOutputStream("./Excel.xlsx") ) {
XSSFSheet sheet = workbook.createSheet();
setSheetCustomProperty(sheet, "APACHE POI", "Tender no = 48");
setSheetCustomProperty(sheet, "APACHE POI 2", "tender no = 58");
workbook.write(fileout);
}
}
}
I've been struggling with the same issue and found a way to make it work, but it's far from optimal. Here it is anyway and hopefully you or someone else can come up with a better method.
package temp.temp;
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import org.apache.poi.ooxml.POIXMLDocumentPart;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTCustomProperties;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTCustomProperty;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTWorksheet;
public class Temp2 {
public static void main(String[] args) {
File inputFile = new File("C:\\myspreadsheet.xlsx");
try (BufferedInputStream fis = new BufferedInputStream(new FileInputStream(inputFile))) {
XSSFWorkbook wb = new XSSFWorkbook(fis);
for (int i = 0; i < wb.getNumberOfSheets(); i++) {
XSSFSheet sheet = wb.getSheetAt(i);
System.out.println("\nSheetName=" + sheet.getSheetName());
CTWorksheet ctSheet = sheet.getCTWorksheet();
CTCustomProperties props = ctSheet.getCustomProperties();
if (props != null) {
List<CTCustomProperty> propList = props.getCustomPrList();
propList.stream().forEach((prop) -> {
POIXMLDocumentPart rel = sheet.getRelationById(prop.getId());
if (rel != null) {
try (InputStream inp = rel.getPackagePart().getInputStream()) {
byte[] inBytes = inp.readAllBytes();
// By experimentation, byte array has two bytes per character with least
// significant in first byte which is UTF-16LE encoding. Don't know why!
String value = new String(inBytes, "UTF-16LE");
System.out.println(" " + prop.getName() + "=" + value);
} catch (IOException ioe) {
//Error
}
}
});
}
}
wb.close();
} catch (Exception e) {
System.out.println(e);
}
System.out.println("End");
}
}
Note that CTWorksheet comes from poi-ooxml-schemas-xx.jar and CTCustomProperties from ooxml-schemas-yy.jar, so both have to be on the classpath. If you're using modules (as I am), this causes big problems! Good luck.
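Both answers above treat the property value as UTF-16LE bytes, that is, two bytes per character with the least significant byte first. The round trip can be checked with the JDK alone; CustomPropertyCodec is a hypothetical name used only for this sketch.

```java
import java.nio.charset.StandardCharsets;

class CustomPropertyCodec {

    // Encodes a property value the way the writing example above stores it.
    static byte[] encode(String value) {
        return value.getBytes(StandardCharsets.UTF_16LE);
    }

    // Decodes the bytes of a customProperty part back into a string.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] bytes = encode("Tender no = 48");
        // Each character of this ASCII-only value takes two bytes, low byte first.
        System.out.println(bytes.length + " bytes, first two: " + bytes[0] + ", " + bytes[1]);
        System.out.println(decode(bytes));
    }
}
```

Using StandardCharsets.UTF_16LE instead of the charset name "UTF-16LE" avoids the checked UnsupportedEncodingException.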

Getting an exception while reading an Excel file with duplicate column names using the spark-excel library. How to overcome this issue?

I am using the spark-excel library (com.crealytics.spark.excel) to read an Excel file. If the file has no duplicate column names, the library works fine. If a column name occurs more than once, it throws the exception below.
How can I overcome this error?
Is there a workaround?
Exception in thread "main" org.apache.spark.sql.AnalysisException: Found duplicate column(s) in the data schema: `net territory`;
at org.apache.spark.sql.util.SchemaUtils$.checkColumnNameDuplication(SchemaUtils.scala:85)
The exception occurs when using the spark-excel API as follows:
StructType schema = DataTypes.createStructType(new StructField[]{DataTypes.createStructField("CGISAI", DataTypes.StringType, true), DataTypes.createStructField("SALES TERRITORY", DataTypes.StringType, true)});
SQLContext sqlcxt = new SQLContext(jsc);
Dataset<Row> df = sqlcxt.read()
.format("com.crealytics.spark.excel")
.option("path", "file:///"+siteinfofile)
.option("useHeader", "true")
.option("spark.read.simpleMode", "true")
.option("treatEmptyValuesAsNulls", "true")
.option("inferSchema", "false")
.option("addColorColumns", "False")
.option("sheetName", "sheet1")
.option("startColumn", 22)
.option("endColumn", 23)
//.schema(schema)
.load();
return df;
This is the code I am using, with the spark-excel library from com.crealytics.spark.excel.
I want a way to identify whether the Excel file has duplicate columns and, if it does, to rename or eliminate them.
A workaround is as follows:
Convert the .xlsx file into .csv, then read it with Spark's default CSV API, which can handle duplicate column names by renaming them automatically.
Below is the code to convert an xlsx file to a csv file.
package com.huawei.java.tools;
/**
*
* @author Nanaji Jonnadula
*/
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackageAccess;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.util.CellAddress;
import org.apache.poi.ss.util.CellReference;
import org.apache.poi.util.SAXHelper;
import org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler;
import org.apache.poi.xssf.model.StylesTable;
import org.apache.poi.xssf.usermodel.XSSFComment;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import javax.xml.parsers.ParserConfigurationException;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import static org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.SheetContentsHandler;
public class ExcelXlsx2Csv {
private static class SheetToCSV implements SheetContentsHandler {
private boolean firstCellOfRow = false;
private int currentRow = -1;
private int currentCol = -1;
private StringBuffer lineBuffer = new StringBuffer();
/** * Destination for data */
private FileOutputStream outputStream;
public SheetToCSV(FileOutputStream outputStream) {
this.outputStream = outputStream;
}
@Override
public void startRow(int rowNum) {
/** * If there were gaps, output the missing rows: * outputMissingRows(rowNum - currentRow - 1); */
// Prepare for this row
firstCellOfRow = true;
currentRow = rowNum;
currentCol = -1;
lineBuffer.delete(0, lineBuffer.length()); //clear lineBuffer
}
@Override
public void endRow(int rowNum) {
lineBuffer.append('\n');
try {
outputStream.write(lineBuffer.substring(0).getBytes());
} catch (IOException e) {
System.out.println("save data to file error at row number: " + currentRow);
throw new RuntimeException("save data to file error at row number: " + currentRow, e);
}
}
@Override
public void cell(String cellReference, String formattedValue, XSSFComment comment) {
if (firstCellOfRow) {
firstCellOfRow = false;
} else {
lineBuffer.append(',');
}
// gracefully handle missing CellRef here in a similar way as XSSFCell does
if (cellReference == null) {
cellReference = new CellAddress(currentRow, currentCol).formatAsString();
}
int thisCol = (new CellReference(cellReference)).getCol();
int missedCols = thisCol - currentCol - 1;
// emit one separator per skipped (empty) column so later values stay aligned
for (int i = 0; i < missedCols; i++) {
lineBuffer.append(',');
}
currentCol = thisCol;
// strip embedded line breaks and double any embedded quotes so the quoted field stays valid CSV
formattedValue = formattedValue.replace("\n", "").replace("\"", "\"\"");
formattedValue = "\"" + formattedValue + "\"";
lineBuffer.append(formattedValue);
}
@Override
public void headerFooter(String text, boolean isHeader, String tagName) {
// Skip, no headers or footers in CSV
}
}
private static void processSheet(StylesTable styles, ReadOnlySharedStringsTable strings,
SheetContentsHandler sheetHandler, InputStream sheetInputStream) throws Exception {
DataFormatter formatter = new DataFormatter();
InputSource sheetSource = new InputSource(sheetInputStream);
try {
XMLReader sheetParser = SAXHelper.newXMLReader();
ContentHandler handler = new XSSFSheetXMLHandler(
styles, null, strings, sheetHandler, formatter, false);
sheetParser.setContentHandler(handler);
sheetParser.parse(sheetSource);
} catch (ParserConfigurationException e) {
throw new RuntimeException("SAX parser appears to be broken - " + e.getMessage());
}
}
public static void process(String srcFile, String destFile,String sheetname_) throws Exception {
File xlsxFile = new File(srcFile);
OPCPackage xlsxPackage = OPCPackage.open(xlsxFile.getPath(), PackageAccess.READ);
ReadOnlySharedStringsTable strings = new ReadOnlySharedStringsTable(xlsxPackage);
XSSFReader xssfReader = new XSSFReader(xlsxPackage);
StylesTable styles = xssfReader.getStylesTable();
XSSFReader.SheetIterator iter = (XSSFReader.SheetIterator) xssfReader.getSheetsData();
int index = 0;
while (iter.hasNext()) {
InputStream stream = iter.next();
String sheetName = iter.getSheetName();
System.out.println(sheetName + " [index=" + index + "]");
if(sheetName.equals(sheetname_)){
FileOutputStream fileOutputStream = new FileOutputStream(destFile);
processSheet(styles, strings, new SheetToCSV(fileOutputStream), stream);
fileOutputStream.flush();
fileOutputStream.close();
}
stream.close();
++index;
}
xlsxPackage.close();
}
public static void main(String[] args) throws Exception {
ExcelXlsx2Csv.process("D:\\data\\latest.xlsx", "D:\\data\\latest.csv","sheet1"); //source , destination, sheetname
}
}
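The question also asks how to detect and rename duplicate columns. One option is to fix the header row yourself before the data reaches Spark, for example while writing the CSV. The sketch below uses a made-up naming scheme (appending "_2", "_3", ...) and compares names case-insensitively; both choices are assumptions and not what Spark or spark-excel do internally. HeaderDeduplicator is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class HeaderDeduplicator {

    // Returns a copy of the header list in which repeated names get a numeric suffix.
    static List<String> rename(List<String> headers) {
        Map<String, Integer> seen = new HashMap<>();
        List<String> result = new ArrayList<>();
        for (String header : headers) {
            // Assumption: names that differ only in case count as duplicates.
            String key = header.toLowerCase(Locale.ROOT);
            int count = seen.merge(key, 1, Integer::sum);
            result.add(count == 1 ? header : header + "_" + count);
        }
        return result;
    }

    // True if at least one name occurs more than once.
    static boolean hasDuplicates(List<String> headers) {
        return !rename(headers).equals(headers);
    }

    public static void main(String[] args) {
        List<String> headers = Arrays.asList("net territory", "site", "net territory");
        System.out.println(hasDuplicates(headers));
        System.out.println(rename(headers));
    }
}
```

Note that this simple scheme does not check whether a generated name such as "net territory_2" already exists as a real column; a production version would need to handle that collision.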

Can iText 7 display fonts with IVS (Ideographic Variation Sequence)?

I am using iText 7 (v7.1.1) to create a PDF file.
Environment: java version "1.7.0_45".
For background on IVS (Ideographic Variation Sequence), see:
http://blogs.adobe.com/CCJKType/files/2017/09/iuc32-lunde-s5t3.pdf
Sample code:
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.layout.Document;
import com.itextpdf.layout.element.Table;
import com.itextpdf.layout.element.Paragraph;
import com.itextpdf.kernel.font.PdfFont;
import com.itextpdf.kernel.font.PdfFontFactory;
import com.itextpdf.io.font.PdfEncodings;
import java.io.File;
public class SimpleTableIVS {
public static final String DEST = "SimpleTableIVS.pdf";
public static void main(String[] args) throws Exception {
File file = new File(DEST);
new SimpleTableIVS().manipulatePdf(DEST);
}
protected void manipulatePdf(String dest) throws Exception {
PdfDocument pdfDoc = new PdfDocument(new PdfWriter(dest));
Document doc = new Document(pdfDoc);
// UTF-8 encoding table and Unicode characters
// http://www.utf8-chartable.com/unicode-utf8-table.pl?start=131072&unicodeinhtml=hex&htmlent=1
// http://www.utf8-chartable.com/unicode-utf8-table.pl?start=33792&number=1024
byte[] bUtfA = {(byte)0xd8, (byte)0x40, (byte)0xdc, (byte)0x0b}; // U+2000B [IVS:2000B_E0103]
byte[] bUtfB = {(byte)0x84, (byte)0x5b}; // U+845B, [IVS: 845B_E0103]
// After Add "Ideographic Variation Selector"
byte[] bUtfC = {(byte)0xdb, (byte)0x40, (byte)0xdd, (byte)0x01}; // U+E0101
byte[] bUtfD = {(byte)0xdb, (byte)0x40, (byte)0xdd, (byte)0x02}; // U+E0102
//PdfFont font = PdfFontFactory.createFont("C:/Windows/Fonts/msmincho.ttc,0", PdfEncodings.IDENTITY_H);
//PdfFont font = PdfFontFactory.createFont("C:/Windows/Fonts/meiryo.ttc,0", PdfEncodings.IDENTITY_H);
PdfFont font = PdfFontFactory.createFont("C:/Program Files/ipamjm/ipamjm.ttf", PdfEncodings.IDENTITY_H);
String strUtfA = new String(bUtfA, "UTF-16");
String strUtfB = new String(bUtfB, "UTF-16");
String strUtfC = strUtfA + (new String(bUtfC, "UTF-16"));
String strUtfD = strUtfB + (new String(bUtfD, "UTF-16"));
Table table = new Table(4);
table.addCell(new Paragraph("\u200d" + strUtfA).setFont(font).setFontSize(12));
table.addCell(new Paragraph("\u200d" + strUtfB).setFont(font).setFontSize(12));
table.addCell(new Paragraph("\u200d" + strUtfC).setFont(font).setFontSize(12));
table.addCell(new Paragraph("\u200d" + strUtfD).setFont(font).setFontSize(12));
doc.add(table);
doc.close();
}
}
Unfortunately, iText does not currently support Ideographic Variation Sequences. Adding support is feasible but not trivial, so it is not a high priority at the moment.
An internal development ticket has been created for it, and it would be good to see this feature implemented in a future version.
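Independently of the rendering limitation, the test strings in the question can be built from code points rather than hand-encoded UTF-16 bytes. The following is a minimal sketch using only the JDK; the class name `IvsStrings` and the helper `ivs` are hypothetical names chosen for illustration, and the code does not make iText render the variants.

```java
/** Builds Ideographic Variation Sequences from code points (hypothetical helper). */
final class IvsStrings {

    /** Returns the base character followed by a variation selector in U+E0100..U+E01EF. */
    static String ivs(int baseCodePoint, int variationSelector) {
        if (variationSelector < 0xE0100 || variationSelector > 0xE01EF) {
            throw new IllegalArgumentException("not an ideographic variation selector");
        }
        // appendCodePoint emits a surrogate pair for supplementary code points
        return new StringBuilder()
                .appendCodePoint(baseCodePoint)
                .appendCodePoint(variationSelector)
                .toString();
    }

    public static void main(String[] args) {
        String a = ivs(0x2000B, 0xE0101); // same content as strUtfC in the question
        String b = ivs(0x845B, 0xE0102);  // same content as strUtfD in the question
        // prints "4 chars, 2 code points"
        System.out.println(a.length() + " chars, " + a.codePointCount(0, a.length()) + " code points");
        // prints "3 chars, 2 code points"
        System.out.println(b.length() + " chars, " + b.codePointCount(0, b.length()) + " code points");
    }
}
```

Working with code points avoids surrogate arithmetic by hand and makes it easier to see which selector is attached to which base character.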

Read embedded Object in excel using java

I want to create an Excel sheet that I send to other people to fill in.
In the sheet, the other person fills in their information and can also attach a text/doc file to it.
I need to access that text/doc file. Please suggest a solution.
I am using the Apache POI HSSF API.
Thanks in advance.
package excelExchange;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Iterator;
import java.util.Vector;
import org.apache.poi.hslf.HSLFSlideShow;
import org.apache.poi.hslf.usermodel.SlideShow;
import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFObjectData;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.poifs.filesystem.DirectoryNode;
//import org.apache.poi.h;
import org.apache.poi.poifs.filesystem.*;
public class ReadEmbeddedObject {
public static void main(String[] args) throws IOException {
String fileName = "C:\\Mayur\\NewsLetter\\files\\projectInfo.xls";
//Vector dataHolder =
ReadCSV(fileName);
}
public static void ReadCSV(String fileName) throws IOException{
Vector cellVectorHolder = new Vector();
FileInputStream myInput = new FileInputStream(fileName);
// myFileSystem=fs
//myWorkBook=workbook
POIFSFileSystem fs = new POIFSFileSystem(myInput);
HSSFWorkbook workbook = new HSSFWorkbook(fs);
for (HSSFObjectData obj : workbook.getAllEmbeddedObjects()) {
//the OLE2 Class Name of the object
System.out.println("Objects : "+ obj.getOLE2ClassName()+ " 2 .");
String oleName = obj.getOLE2ClassName();
if (oleName.equals("Worksheet")) {
System.out.println("Worksheet");
DirectoryNode dn = (DirectoryNode) obj.getDirectory();
HSSFWorkbook embeddedWorkbook = new HSSFWorkbook(dn, fs, false);
System.out.println(oleName+": " + embeddedWorkbook.getNumberOfSheets());
System.out.println("Information :--- ");
System.out.println(" name " + embeddedWorkbook.getSheetName(0));
//System.out.println(entry.getName() + ": " + embeddedWorkbook.getNumberOfSheets());
} else if (oleName.equals("Document")) {
System.out.println("Document");
DirectoryNode dn = (DirectoryNode) obj.getDirectory();
HWPFDocument embeddedWordDocument = new HWPFDocument(dn,fs);
System.out.println("Doc : " + embeddedWordDocument.getRange().text());
} else if (oleName.equals("Presentation")) {
System.out.println("Presentation");
DirectoryNode dn = (DirectoryNode) obj.getDirectory();
SlideShow embeddedPowerPointDocument = new SlideShow(new HSLFSlideShow(dn, fs));
//Entry entry = (Entry) entries.next();
System.out.println(": " + embeddedPowerPointDocument.getSlides().length);
} else {
System.out.println("Else part ");
if(obj.hasDirectoryEntry()){
// The DirectoryEntry is a DocumentNode. Examine its entries to find out what it is
DirectoryNode dn = (DirectoryNode) obj.getDirectory();
for (Iterator entries = dn.getEntries(); entries.hasNext();) {
Entry entry = (Entry) entries.next();
System.out.println(oleName + "." + entry.getName());
}
} else {
// There is no DirectoryEntry
// Recover the object's data from the HSSFObjectData instance.
byte[] objectData = obj.getObjectData();
}
}
}
}
}
POI has APIs to iterate over embedded objects (HSSFWorkbook.getAllEmbeddedObjects or XSSFWorkbook.getAllEmbedds). Examples are at http://poi.apache.org/spreadsheet/quick-guide.html#Embedded
import java.io.File;
import java.io.FileInputStream;
import java.util.ArrayList;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.openxml4j.opc.PackagePart;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
/**
 * Demonstrates how you can extract embedded data from a .xlsx file
 */
public class GetEmbedded {
    public static void main(String[] args) throws Exception {
        String path = "SomeExcelFile.xlsx";
        XSSFWorkbook workbook = new XSSFWorkbook(new FileInputStream(new File(path)));
        // getAllEmbedds() is the older method name; newer POI releases name it getAllEmbeddedParts()
        for (PackagePart pPart : workbook.getAllEmbedds()) {
            String contentType = pPart.getContentType();
            if (contentType.equals("application/vnd.ms-excel")) {
                // Reads an .xls workbook embedded in the .xlsx file
                HSSFWorkbook embeddedWorkbook = new HSSFWorkbook(pPart.getInputStream());
                int countOfSheetXls = embeddedWorkbook.getNumberOfSheets();
            } else if (contentType.equals("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")) {
                // Reads an .xlsx workbook embedded in the .xlsx file
                if (pPart.getPartName().getName().equals("/xl/embeddings/Microsoft_Excel_Worksheet12.xlsx")) {
                    // Matching on the part name picks out one particular embedded workbook,
                    // here the one embedded in sheet 12 of the parent workbook
                    XSSFWorkbook embeddedWorkbook = new XSSFWorkbook(pPart.getInputStream());
                    int countOfSheetXlsx = embeddedWorkbook.getNumberOfSheets();
                    ArrayList<String> sheetNames = new ArrayList<String>();
                    for (int i = 0; i < countOfSheetXlsx; i++) {
                        // Read the names from the embedded workbook, not from the parent
                        String name = embeddedWorkbook.getSheetName(i);
                        sheetNames.add(name);
                    }
                }
            }
        }
    }
}
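To find out which part names a given file actually contains before matching on one, as the code above does with "/xl/embeddings/Microsoft_Excel_Worksheet12.xlsx", it can help to list them first. An .xlsx file is a ZIP package, so the sketch below does this with the JDK alone and without POI. It assumes the embedded parts sit under `xl/embeddings/`, as that part name suggests; the class `EmbeddedPartLister` and its method are hypothetical names, not part of POI.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Lists embedded parts of an .xlsx package by reading it as a plain ZIP (hypothetical helper). */
final class EmbeddedPartLister {

    /** Returns the names of all ZIP entries below xl/embeddings/ (assumed location). */
    static List<String> listEmbeddedParts(InputStream xlsx) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                // ZIP entry names have no leading slash, unlike OPC part names
                if (!e.isDirectory() && e.getName().startsWith("xl/embeddings/")) {
                    names.add(e.getName());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: EmbeddedPartLister <file.xlsx>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            listEmbeddedParts(in).forEach(System.out::println);
        }
    }
}
```

The names it prints correspond to the POI part names with a leading slash added, so they can be used directly in a comparison like the one in the answer's code.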
