I'm writing a little tool that converts slides in PPT files to PNGs. The problem I'm having is with hidden slides: how can I change a slide to be visible in Java? I'm currently using Apache POI for the conversion to PNGs, although this doesn't work for clipart, so I am tempted to export to PDF using unoconv first and then manipulate that. But that approach doesn't take the hidden slides into account. So how could I programmatically change the hidden slides to be visible?
This is kind of a hack and has been tested only with a PPT from LibreOffice with POI 3.9 / POI-Scratchpad 3.8.
The spec ([MS-PPT].pdf / version 3.0 / page 201) says that bit 3 (fHidden) of byte 18 specifies whether the corresponding slide is hidden and not displayed during the slide show. The code below clears that bit in the SSSlideInfoAtom record of every slide:
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.lang.reflect.Field;
import org.apache.poi.hslf.model.Slide;
import org.apache.poi.hslf.record.Record;
import org.apache.poi.hslf.record.RecordTypes;
import org.apache.poi.hslf.record.UnknownRecordPlaceholder;
import org.apache.poi.hslf.usermodel.SlideShow;
public class UnhidePpt {
public static void main(String[] args) throws Exception {
FileInputStream fis = new FileInputStream("hiddenslide.ppt");
SlideShow ppt = new SlideShow(fis);
fis.close();
Field f = UnknownRecordPlaceholder.class.getDeclaredField("_contents");
f.setAccessible(true);
for (Slide slide : ppt.getSlides()) {
for (Record record : slide.getSlideRecord().getChildRecords()) {
if (record instanceof UnknownRecordPlaceholder
&& record.getRecordType() == RecordTypes.SSSlideInfoAtom.typeID) {
UnknownRecordPlaceholder urp = (UnknownRecordPlaceholder)record;
byte contents[] = (byte[])f.get(urp);
// Clear the fHidden bit (mask 4) and keep all other bits of byte 18 unchanged
contents[18] &= (255-4);
f.set(urp, contents);
}
}
}
FileOutputStream fos = new FileOutputStream("unhidden.ppt");
ppt.write(fos);
fos.close();
}
}
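The core of the hack above is a single bit operation. As a minimal sketch that needs no POI classes, the following isolates that step. The class name `HiddenFlag`, its helper methods, and the 24-byte dummy array are illustrative assumptions; only the offset 18 and the mask 4 are taken from the code above.

```java
/** Illustrative helper (not part of POI): isolates the bit operation used above. */
final class HiddenFlag {
    // Offset and mask as used in the answer's code: contents[18] &= (255 - 4)
    static final int FLAG_OFFSET = 18;
    static final int HIDDEN_MASK = 0x04;

    /** Returns true if the hidden bit is set in the raw record bytes. */
    static boolean isHidden(byte[] contents) {
        return (contents[FLAG_OFFSET] & HIDDEN_MASK) != 0;
    }

    /** Clears the hidden bit in place, leaving all other bits untouched. */
    static void unhide(byte[] contents) {
        contents[FLAG_OFFSET] &= (byte) ~HIDDEN_MASK;
    }

    public static void main(String[] args) {
        byte[] record = new byte[24];       // dummy data, not a real record
        record[FLAG_OFFSET] = (byte) 0x15;  // 0001 0101: hidden bit plus two other bits
        unhide(record);
        System.out.println(isHidden(record));                                    // false
        System.out.println(Integer.toBinaryString(record[FLAG_OFFSET] & 0xFF));  // 10001
    }
}
```

Writing the mask as `~HIDDEN_MASK` is equivalent to `255 - 4` for a single byte and makes the intent (clear one bit, keep the rest) easier to see.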
I have read the examples in the "merging PDF documents" section, but I couldn't come up with a better solution for the following task:
I would like to merge a series of PDF and image files coming in any order (original post). The inefficiency comes from the fact that I need to create a dummy one-page PDF file for each image using PdfWriter and then read it back from a byte array using PdfReader.
Question: Is there a more efficient way of doing the same (maybe via PdfCopy#addPage())?
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Image;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfSmartCopy;
import com.itextpdf.text.pdf.PdfWriter;
/**
* Helper class that creates PDF from given image(s) (JPEG, PNG, ...) or PDFs.
*/
public class MergeToPdf {
public static void main(String[] args) throws IOException, DocumentException {
if (args.length < 2) {
System.err.println("At least two arguments are required: in1.pdf [, image2.jpg ...], out.pdf");
System.exit(1);
}
Document mergedDocument = new Document();
PdfSmartCopy pdfCopy = new PdfSmartCopy(mergedDocument, new FileOutputStream(args[args.length - 1]));
mergedDocument.open();
for (int i = 0; i < args.length - 1; i++) {
PdfReader reader;
if (args[i].toLowerCase().endsWith(".pdf")) {
System.out.println("Adding PDF " + args[i] + "...");
// Copy PDF document:
reader = new PdfReader(args[i]);
}
else {
System.out.println("Adding image " + args[i] + "...");
final ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
final Document imageDocument = new Document();
PdfWriter.getInstance(imageDocument, byteStream);
imageDocument.open();
// Create single page with the dimensions as source image and no margins:
Image image = Image.getInstance(args[i]);
image.setAbsolutePosition(0, 0);
imageDocument.setPageSize(image);
imageDocument.newPage();
imageDocument.add(image);
imageDocument.close();
// Copy PDF document with only one page carrying the image:
reader = new PdfReader(byteStream.toByteArray());
}
pdfCopy.addDocument(reader);
reader.close();
}
mergedDocument.close();
}
}
I am having an issue with some code I'm writing in Java using PDFBox. I am attempting to populate a PDF with particular forms based on values read from an Excel spreadsheet. Below is my class file.
import java.io.FileInputStream;
import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageContentStream.AppendMode;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.hssf.usermodel.*;
/**
* This is a test file for reading and populating a PDF with specific forms
*/
public class JU_TestFile {
PDPage Stick_Form;
PDPage IKE_Form;
PDPage BO_Form;
/**
* Constructor.
*/
public JU_TestFile() throws IOException
{
this.BO_Form = (PDPage) PDDocument.load(new File("C:\\Users\\saf\\Desktop\\JavaTest\\BO Pole Form.pdf")).getPage(0);
this.IKE_Form = (PDPage) PDDocument.load(new File("C:\\Users\\saf\\Desktop\\JavaTest\\IKE Form.pdf")).getPage(0);
this.Stick_Form = (PDPage) PDDocument.load(new File("C:\\Users\\saf\\Desktop\\JavaTest\\Sticking Form.pdf")).getPage(0);
}
public void buildFile(String fileName, String excelSheet) throws IOException {
// Create a Blank PDF Document and load in JU Excel Spreadsheet
PDDocument workingDocument = new PDDocument();
FileInputStream fis = new FileInputStream(new File(excelSheet));
// Load in the workbook
HSSFWorkbook JU_XML = new HSSFWorkbook(fis);
int sheetNumber = 0;
int rowNumber = 0;
String cellValue = "Starting Value";
HSSFSheet currentSheet = JU_XML.getSheetAt(sheetNumber);
// While we have not reached the 25th row in our current sheet
while (rowNumber <= 24) {
// Get the value in the current row, on the 8th column in the xls file
cellValue = currentSheet.getRow(rowNumber + 6).getCell(7).getStringCellValue();
// If it has stuff in it,
if (!cellValue.isEmpty()) {
// Check if it has the letters "IKE" and append the IKE form to our PDF
// (strings must be compared with equals(); != only compares references)
if ("IKE".equals(cellValue)) {
workingDocument.importPage(IKE_Form);
// If it is anything else (other than empty), append the Stick Form to our PDF
} else {
workingDocument.importPage(Stick_Form);
}
// Let's move on to the next row
rowNumber++;
// If the next row number is the "26th" row, we know we need to move on to the
// next sheet, and also reset the rows to the first row of that next sheet
if (rowNumber == 25) {
rowNumber = 0;
currentSheet = JU_XML.getSheetAt(++sheetNumber);
}
// if the 9th row is empty, we should break out of the loop and save/close our PDF, we are done
} else {
break;
}
}
workingDocument.save(fileName);
workingDocument.close();
}
}
I am getting the following error:
Exception in thread "main" java.io.IOException: COSStream has been closed and cannot be read. Perhaps its enclosing PDDocument has been closed?
I've done research and it seems like a PDDocument is closing before I run the workingDocument.save(fileName) command. I'm not quite sure how to fix this, and I'm also a bit lost on how to find a workaround. I'm a bit rusty on my programming, so any help would be super appreciated! Also any feedback on how to make future posts more informative would be great.
Thanks in advance
Please try this:
PDFMergerUtility merger = new PDFMergerUtility();
PDDocument combine = PDDocument.load(file);
merger.appendDocument(getDocument(), combine);
merger.mergeDocuments();
combine.close();
Update:
Since merger.mergeDocuments() is deprecated in recent APIs, use one of the following overloads instead:
merger.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());
or
merger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
Depending on your memory usage, you can fine-tune this further by passing a different MemoryUsageSetting object.
I am trying to view a Word file in my editor pane.
I tried this code:
import java.awt.Dimension;
import java.awt.GridLayout;
import java.io.File;
import java.io.FileInputStream;
import javax.swing.JEditorPane;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
public class editorpane extends JEditorPane
{
public editorpane(File file)
{
try
{
FileInputStream fis = new FileInputStream(file.getAbsolutePath());
HWPFDocument hwpfd = new HWPFDocument(fis);
WordExtractor we = new WordExtractor(hwpfd);
String[] array = we.getParagraphText();
for (int i = 0; i < array.length; i++)
{
this.setPage(array[i]);
}
} catch (Exception e)
{
e.printStackTrace();
}
but it gives me
org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:131)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:104)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:138)
at org.apache.poi.hwpf.HWPFDocumentCore.verifyAndBuildPOIFS(HWPFDocumentCore.java:106)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:174)
at frame1.editorpane.<init>(editorpane.java:24)
on this line:
HWPFDocument hwpfd = new HWPFDocument(fis);
How can I solve that?
Besides, I am not sure about these lines:
for (int i = 0; i < array.length; i++)
{
this.setPage(array[i]);
}
Can I get them confirmed?
You are trying to open a .docx file (XWPF) with code for .doc (HWPF) files. You can use XWPFWordExtractor for .docx files.
There is also an ExtractorFactory, which lets POI decide which of these applies and use the correct class to open the file. However, you then cannot iterate over the individual paragraphs, as only the generic getText() method is available.
Use it like this:
POITextExtractor extractor = ExtractorFactory.createExtractor(file);
extractor.getText();
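To see why the exception occurs, it can help to look at the first bytes of the file yourself. The following is a rough sketch using only the JDK, not POI's own detection logic: the class name `WordFormatSniffer` and its methods are made up for illustration. It relies on two well-known signatures: OLE2 files (such as .doc) start with the bytes D0 CF 11 E0 A1 B1 1A E1, and OOXML files (such as .docx) are ZIP archives starting with "PK" 03 04. Note that the check cannot tell a .docx from any other ZIP file, or a .doc from an .xls or .ppt.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Illustrative helper (not part of POI): guesses the container format from header bytes. */
final class WordFormatSniffer {
    enum Format { OLE2, ZIP_BASED, UNKNOWN }

    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 3, 4 };

    /** Classifies a header that holds the first bytes of a file. */
    static Format detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return Format.OLE2;       // .doc and friends
        if (startsWith(header, ZIP_MAGIC)) return Format.ZIP_BASED;   // .docx and other ZIPs
        return Format.UNKNOWN;
    }

    /** Reads up to 8 bytes from the file and classifies them. */
    static Format detectFile(File file) throws IOException {
        byte[] header = new byte[8];
        int read = 0;
        try (InputStream in = new FileInputStream(file)) {
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) break;  // file is shorter than 8 bytes
                read += n;
            }
        }
        return detect(Arrays.copyOf(header, read));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            System.out.println(detectFile(new File(args[0])));
        }
    }
}
```

A result of `ZIP_BASED` for a Word file suggests the XWPF classes are needed, while `OLE2` suggests HWPF. In practice, letting ExtractorFactory choose is simpler.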
Can anyone explain why this happens? I read an image and render it into an output writer. If it is a color file (or black and white), it renders fine. However, if the source image is grayscale, all I get is a black box.
Sample files available at https://www.dropbox.com/sh/kyfsh5curobwxrw/AACfWr1NhX8lPUZpzVGWIPQia?dl=0
My pom dependency snippets follow.
<dependency>
<groupId>javax.media</groupId>
<artifactId>jai_core</artifactId>
<version>1.1.3</version>
</dependency>
<dependency>
<groupId>com.sun.media</groupId>
<artifactId>jai_imageio</artifactId>
<version>1.1</version>
</dependency>
A test program. I understand that this bit of code in itself is really of no value, but in reality it is part of a larger suite of operations. This code represents my efforts to narrow down the issue to a small piece of code.
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.ImageOutputStream;
import java.awt.*;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
public class GrayScaleImaging {
public static void main(String[] args)
throws IOException {
//works
// final File inputFile = new File("/home/vinayb/Downloads/page1_color.tif");
// final File outputFile = new File("/home/vinayb/Downloads/page1_color_mod.tif");
//doesn't work
final File inputFile = new File("/home/vinayb/Downloads/page1_grayscale.tif");
final File outputFile = new File("/home/vinayb/Downloads/page1_grayscale_mod.tif");
if (outputFile.exists()) {
outputFile.delete();
}
ImageReader imageReader = null;
ImageWriter imageWriter = null;
Graphics2D g = null;
try (final ImageInputStream imageInputStream = ImageIO.createImageInputStream(inputFile);
final ImageOutputStream imageOutputStream = ImageIO.createImageOutputStream(outputFile);) {
//setup reader
imageReader = ImageIO.getImageReaders(imageInputStream).next();
imageReader.setInput(imageInputStream);
//read image
final BufferedImage initialImage = imageReader.read(0);
//prepare graphics for the output
final BufferedImage finalImage = new BufferedImage(initialImage.getWidth(), initialImage.getHeight(), imageType(initialImage));
g = finalImage.createGraphics();
//do something to the image
//doSomething(g)
//draw image
g.drawImage(initialImage, 0, 0, initialImage.getWidth(), initialImage.getHeight(), null);
//setup writer based on reader
imageWriter = ImageIO.getImageWriter(imageReader);
imageWriter.setOutput(imageOutputStream);
//write
imageWriter.write(null, new IIOImage(initialImage, null, imageReader.getImageMetadata(0)), imageWriter.getDefaultWriteParam());
} finally {
//cleanup
if (imageWriter != null) {
imageWriter.dispose();
}
if (imageReader != null) {
imageReader.dispose();
}
if (g != null) {
g.dispose();
}
}
}
private static int imageType(BufferedImage bufferedImage) {
return bufferedImage.getType() == 0 ? BufferedImage.TYPE_INT_ARGB : bufferedImage.getType();
}
}
Okay, I have now fixed my decoder, thanks for the sample file! :-)
Anyway, after some research, I have come to the conclusion that the problem is definitely the sample file, not your code nor the library you are using.
The issue with this file is that the TIFF metadata contains PhotometricInterpretation == 3/Palette and ColorMap tags. I.e., the image uses an indexed color model/palette. If the image is read as it should be according to (my understanding of) the spec, using the supplied color map, the image comes out all black. If instead I ignore this and read it as grayscale (assuming PhotometricInterpretation == 1/BlackIsZero), it comes out as black text on a white (light gray) background.
Edit:
A better explanation is that the values in the color map are all 8-bit quantities (using the low 8 bits of each color entry) instead of using the full 16 bits as they should. If I detect this while reading and create a palette using only the low 8 bits, the image comes out as intended (as in Dropbox). This is still a bad image according to the spec, but detectable.
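The detection described in the edit could look roughly like the sketch below. This is not the actual decoder code; the class `TiffPalette` and its methods are hypothetical. It assumes the ColorMap has already been read into an `int[]` of 16-bit values laid out as in the TIFF spec (all reds, then all greens, then all blues). The heuristic has an obvious limit: a valid but extremely dark palette would also be treated as 8-bit data.

```java
import java.awt.image.IndexColorModel;

/** Illustrative helper: builds a palette from a TIFF ColorMap, tolerating 8-bit-only entries. */
final class TiffPalette {
    /** True if no entry uses the upper 8 bits, i.e. the map looks like 8-bit data. */
    static boolean looksEightBit(int[] colorMap) {
        for (int value : colorMap) {
            if ((value & 0xFF00) != 0) return false;
        }
        return true;
    }

    /** Converts the map to {reds, greens, blues}, one byte per entry. */
    static byte[][] toRgbBytes(int[] colorMap) {
        int size = colorMap.length / 3;
        boolean eightBit = looksEightBit(colorMap);
        byte[][] rgb = new byte[3][size];
        for (int channel = 0; channel < 3; channel++) {
            for (int i = 0; i < size; i++) {
                int value = colorMap[channel * size + i];
                // Well-formed maps: keep the high byte. Malformed maps: keep the low byte.
                rgb[channel][i] = (byte) (eightBit ? value : value >>> 8);
            }
        }
        return rgb;
    }

    /** Builds a color model usable for an indexed BufferedImage. */
    static IndexColorModel toColorModel(int bitsPerSample, int[] colorMap) {
        byte[][] rgb = toRgbBytes(colorMap);
        return new IndexColorModel(bitsPerSample, rgb[0].length, rgb[0], rgb[1], rgb[2]);
    }

    public static void main(String[] args) {
        // Two-entry palette (black, white) written with 8-bit values, as in the bad file
        int[] malformed = { 0, 255, 0, 255, 0, 255 };
        System.out.println(looksEightBit(malformed));
        IndexColorModel model = toColorModel(1, malformed);
        System.out.println(Integer.toHexString(model.getRGB(1)));
    }
}
```

Read strictly by the spec, the entry 255 would be scaled down from 16 bits to 0, which matches the all-black output described above.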
I am attempting to modify the background-color of a single page of a multi-page PDF document created using iText.
The easiest way to do this appeared to be creating a Rectangle the entire size of the page, with the specified background color, and applying it to the page in question using PdfContentByte. (I explored using the Document API, but this did not seem to be the best option, since it applied the styling to ALL pages in the document, which I did not want.)
When run, on close inspection, I can see that there is a single pixel along the upper, right and bottom margins, which remains white, the rest of the page being the correct color. I have played with the rectangle to ensure no margins were created, but to no avail. Find the code I am using below.
Rectangle r = new Rectangle(0, 0, helper.getPageWidth(), helper.getPageHeight());
r.setBackgroundColor(Constants.GREEN);
PdfContentByte cb = helper.getWriter().getDirectContent();
cb.rectangle(r);
cb.setColorFill(Constants.GREEN);
cb.setColorStroke(Constants.GREEN);
cb.fillStroke();
It seems whatever I try, I cannot get rid of the single white pixel row along these 3 sides of the page. Does anyone have any idea how to bleed to the VERY edge of an iText page?
First, please mention the iText version you are using. I took your code snippet, made some changes, and it worked well. Posting the full code would help me find out what's wrong with yours.
(My prime suspect is this line: Rectangle r = new Rectangle(0, 0, helper.getPageWidth(), helper.getPageHeight()).)
I've attached the output and the code I used.
package com.pra.itext;
import com.lowagie.text.DocumentException;
import com.lowagie.text.Rectangle;
import com.lowagie.text.pdf.PdfContentByte;
import com.lowagie.text.pdf.PdfReader;
import com.lowagie.text.pdf.PdfStamper;
import java.awt.Color;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
/**
*
* @author Prajit
*/
public class ItextRect {
public static void main(String[] args) {
PdfReader rdrPdf = null;
PdfStamper stmprPdf = null;
try {
rdrPdf = new PdfReader("E:/Head.First.Servlets&Jsp.pdf");
stmprPdf = new PdfStamper(rdrPdf, new FileOutputStream(new File("D:/Example.pdf")));
for (int pgCnt = 1; pgCnt <= rdrPdf.getNumberOfPages(); pgCnt++) {
if (pgCnt == 1) {
PdfContentByte pdfCntntByt = stmprPdf.getUnderContent(pgCnt);
Rectangle r = new Rectangle(rdrPdf.getPageSize(pgCnt));
r.setBackgroundColor(Color.red);
pdfCntntByt.rectangle(r);
pdfCntntByt.setColorFill(Color.red);
pdfCntntByt.setColorStroke(Color.red);
pdfCntntByt.fillStroke();
}
}
stmprPdf.close();
rdrPdf.close();
} catch (DocumentException de) {
System.err.println(de.getMessage());
} catch (IOException ioe) {
System.err.println(ioe.getMessage());
}
}
}