How to insert an PDPage within another PDPage with pdfbox

How to insert an PDPage within another PDPage with pdfbox - java

I use different tools like processing to create vector plots. These plots are written as single or multi-page pdfs. I would like to include these plots in a single report-like pdf using pdfbox.
My current workflow includes these pdfs as images with the following pseudo code
PDDocument inFile = PDDocument.load(file);
PDPage firstPage = (PDPage) inFile.getDocumentCatalog().getAllPages().get(0);
BufferedImage image = firstPage.convertToImage(BufferedImage.TYPE_INT_RGB, 300);
PDXObjectImage ximage = new PDPixelMap(document, image);
PDPageContentStream contentStream = new PDPageContentStream(document, page);
contentStream.drawXObject(ximage, 0, 0, ximage.getWidth(), ximage.getHeight());
contentStream.close();
While this works it looses the benefits of the vector file formats, espectially file/size vs. printing qualitity.
Is it possible to use pdfbox to include other pdf pages as embedded objects within a page (Not added as a separate page)? Could I e.g. use a PDStream? I would prefer a solution like pdflatex is able to embed pdf figures into a new pdf document.
What other Java libraries can you recommend for that task?

Is it possible to use pdfbox to include other pdf pages as embedded objects within a page
It should be possible. The PDF format allows the use of so called form xobjects to serve as such embedded objects. I don't see an explicit implementation for that, though, but the procedure is similar enough to what PageExtractor or PDFMergerUtility do.
A proof of concept derived from PageExtractor using the current SNAPSHOT of the PDFBox 2.0.0 development version:
PDDocument source = PDDocument.loadNonSeq(SOURCE, null);
List<PDPage> pages = source.getDocumentCatalog().getAllPages();
PDDocument target = new PDDocument();
PDPage page = new PDPage();
PDRectangle cropBox = page.findCropBox();
page.setResources(new PDResources());
target.addPage(page);
PDFormXObject xobject = importAsXObject(target, pages.get(0));
page.getResources().addXObject(xobject, "X");
PDPageContentStream content = new PDPageContentStream(target, page);
AffineTransform transform = new AffineTransform(0, 0.5, -0.5, 0, cropBox.getWidth(), 0);
content.drawXObject(xobject, transform);
transform = new AffineTransform(0.5, 0.5, -0.5, 0.5, 0.5 * cropBox.getWidth(), 0.2 * cropBox.getHeight());
content.drawXObject(xobject, transform);
content.close();
target.save(TARGET);
target.close();
source.close();
This code imports the first page of a source document to a target document as XObject and puts it twice onto a page there with different scaling and rotation transformations, e.g. for this source
it creates this
The helper method importAsXObject actually doing the import is defined like this:
PDFormXObject importAsXObject(PDDocument target, PDPage page) throws IOException
{
final PDStream src = page.getContents();
if (src != null)
{
final PDFormXObject xobject = new PDFormXObject(target);
OutputStream os = xobject.getPDStream().createOutputStream();
InputStream is = src.createInputStream();
try
{
IOUtils.copy(is, os);
}
finally
{
IOUtils.closeQuietly(is);
IOUtils.closeQuietly(os);
}
xobject.setResources(page.findResources());
xobject.setBBox(page.findCropBox());
return xobject;
}
return null;
}
As mentioned above this is only a proof of concept, corner cases have not yet been taken into account.

To update this question:
There is already a helper class in org.apache.pdfbox.multipdf.LayerUtility to do the import.
Example to show superimposing a PDF page onto another PDF: SuperimposePage.
This class is part of the Apache PDFBox Examples and sample transformations as shown by #mkl were added to it.

As mkl appropriately suggested, PDFClown is among the Java libraries which provide explicit support for page embedding (so-called Form XObjects (see PDF Reference 1.7, § 4.9)).
In order to let you get a taste of the way PDFClown works, the following code represents the equivalent of mkl's PDFBox solution (NOTE: as mkl later stated, his code sample was by no means optimised, so this comparison may not correspond to the actual status of PDFBox -- comments are welcome to clarify this):
Document source = new File(SOURCE).getDocument();
Pages sourcePages = source.getPages();
Document target = new File().getDocument();
Page targetPage = new Page(target);
target.getPages().add(targetPage);
XObject xobject = sourcePages.get(0).toXObject(target);
PrimitiveComposer composer = new PrimitiveComposer(targetPage);
Dimension2D targetSize = targetPage.getSize();
Dimension2D sourceSize = xobject.getSize();
composer.showXObject(xobject, new Point2D.Double(targetSize.getWidth() * .5, targetSize.getHeight() * .35), new Dimension(sourceSize.getWidth() * .6, sourceSize.getHeight() * .6), XAlignmentEnum.Center, YAlignmentEnum.Middle, 45);
composer.showXObject(xobject, new Point2D.Double(targetSize.getWidth() * .35, targetSize.getHeight()), new Dimension(sourceSize.getWidth() * .4, sourceSize.getHeight() * .4), XAlignmentEnum.Left, YAlignmentEnum.Top, 90);
composer.flush();
target.getFile().save(TARGET, SerializationModeEnum.Standard);
source.getFile().close();
Comparing this code to PDFBox's equivalent you can notice some relevant differences which show PDFClown's neater style (it would be nice if some PDFBox expert could validate my assertions):
Page-to-FormXObject conversion: PDFClown natively supports a dedicated method (Page.toXObject()), so there's no need for additional heavy-lifting such as the helper method importAsXObject();
Resource management: PDFClown automatically (and transparently) allocates page resources, so there's no need for explicit calls such as page.getResources().addXObject(xobject, "X");
XObject drawing: PDFClown supports both high-level (explicit scale, translation and rotation anchors) and low-level (affine transformations) methods to place your FormXObject into the page, so there's no need to necessarily deal with affine transformations.
The whole point is that PDFClown features a rich architecture made up of multiple abstraction layers: according to your requirements, you can choose the most appropriate coding style (either to delve into PDF's low-level basic structures or to leverage its convenient and elegant high-level model). PDFClown lets you tweak every single byte and solve complex tasks with a ridiculously simple method call, at your will.
DISCLOSURE: I'm the lead developer of PDFClown.

Related

Remove or Update added Image icon from pdf page using OpenPdf based on iText Core

I have added an icon as Image object into PDF page with OpenPdf that is based on iText core. Here is my code
// inout stream from file
InputStream inputStream = new FileInputStream(file);
// we create a reader for a certain document
PdfReader reader = new PdfReader(inputStream);
// we create a stamper that will copy the document to a new file
PdfStamper stamp = new PdfStamper(reader, new FileOutputStream(file));
// adding content to each page
PdfContentByte over;
// get watermark icon
Image img = Image.getInstance(PublicFunction.getByteFromDrawable(context, R.drawable.ic_chat_lawone_new));
img.setAnnotation(new Annotation(0, 0, 0, 0, "https://github.com/LibrePDF/OpenPDF"));
img.setAbsolutePosition(pointF.x, pointF.y);
img.scaleAbsolute(50, 50);
// get page file number count
int pageNumbers = reader.getNumberOfPages();
if (pageNumbers < pageIndex) {
// closing PdfStamper will generate the new PDF file
stamp.close();
throw new PDFException("page index is out of pdf file page numbers", new Throwable());
}
// annotation added into target page
over = stamp.getOverContent(pageIndex);
if (over == null) {
stamp.close();
throw new PDFException("getUnderContent is null", new Throwable());
}
over.addImage(img);
// closing PdfStamper will generate the new PDF file
stamp.close();
// close reader
reader.close();
now I need to delete or update the color of added image object on user click, I have the click function that returns MotionEvent, now I need to delete or update or replace added image object.
Any Idea?!

In your parallel OpenPDF issue 464 you posted additionally:
Here my progress
Now I can achieve the XObjects added into pdf file, and I can remove them from pdf page this way:
// inout stream from file
InputStream inputStream = new FileInputStream(file);
// we create a reader for a certain document
PdfReader pdfReader = new PdfReader(inputStream);
// get page file number count
int pageNumbers = pdfReader.getNumberOfPages();
// we create a stamper that will copy the document to a new file
PdfStamper pdfStamper = new PdfStamper(pdfReader, new FileOutputStream(file));
// get page
PdfDictionary page = pdfReader.getPageN(currPage);
PdfDictionary resources = page.getAsDict(PdfName.RESOURCES);
// get page resources
PdfArray annots = resources.getAsArray(PdfName.ANNOTS);
PdfDictionary xobjects = resources.getAsDict(PdfName.XOBJECT);
// remove Xobjects
for (PdfName key : xobjects.getKeys()) {
xobjects.remove(key);
}
// remove annots
for (PdfObject element : annots.getElements()) {
annots.remove(0);
}
// close pdf stamper
pdfStamper.close();
// close pdf reader
pdfReader.close();
So the XObjects will remove from the screen, but still there is a problem!!!
When I remove them and try to add a new one, last deleted object appears and add into the pdf page! REALLY!!! :))
I think there should be another place that these objects should be removed from.
What happens here is that you indeed do remove the bitmap image resources from the page:
for (PdfName key : xobjects.getKeys()) {
xobjects.remove(key);
}
but you don't remove the instructions for drawing these resources from the content stream. This has two consequences:
Your PDF strictly speaking becomes invalid as a resource is referenced from the content stream which is not defined in the page resources. Depending on the viewer in question this might result in warning messages.
If the PDF is further processed and some new XObject is added to the same page with the same resource name, the original image drawing instruction now again has a resource to draw and makes it appear at the original position once more.
This explains your observation:
When I remove them and try to add a new one, last deleted object appears and add into the pdf page! REALLY!!! :))
I assume you used the same source image in your test, so it looked like the original image appeared again at the original position when actually the new image appeared there.
Thus, instead of merely removing the image XObject, you have the choice of either
also removing the XObject drawing instruction from the content stream or
replacing the image XObject by an invisible XObject instead.
The former option in general is non-trivial, in particular if your PDF-tool-to-be also allows other changes of the page content. In case of iText 5 or iText 7 I'd propose using the PdfContentStreamEditor (see here) / PdfCanvasEditor (see here) class to find and remove Do operations from the page content streams but I have no OpenPDF version of that class yet.
What you can do quite easily, though, is replacing the image resources by form XObjects without any content:
PdfTemplate pdfTemplate = PdfTemplate.createTemplate(pdfStamper.getWriter(), 1, 1);
// replace Xobjects
for (PdfName key : xobjects.getKeys()) {
xobjects.put(key, pdfTemplate.getIndirectReference());
}
(RemoveImage test testRemoveImageAddedLikeHamidReza)
Beware, replacing all XObjects by empty XObjects has an obvious disadvantage, it replaces all XObjects, not merely the ones your tool created before! Thus, if the original PDFs processed by your tool also drew XObjects in their immediate content streams, those XObjects also are rendered invisible. If you don't want that, you need some specific criteria to recognize the image XObjects you added and only replace them.
Furthermore, there are other problems afoot: Each time you process the OverContent of a page in a PdfStamper, the pre-existing content of that page is wrapped into a q / Q (save-graphics-state / restore-graphics-state) envelope to prevent changes of the graphics state in that previous content bleed through and mix up your OverContent additions. Thus, if you manipulate a file many times in your tool, the original page content may be wrapped in a fairly deep nesting of such envelopes. Unfortunately PDF readers may support only a limited nesting depth, e.g. ISO 32000-1 mentions a maximum depth of 28 envelopes.
Thus, if you still have the chance to overhaul your design, you should consider putting the images into annotation appearances instead of into the page content. After all, you already do generate annotations, currently merely to transport a link, so you could also generate annotations with appearances.

Add named destinations to an existing PDF document with iText

I have a PDF previously created with FOP, and I need to add some named destinations to it so later another program can open and navigate the document with the Adobe PDF open parameters, namely the #namedest=destination_name parameter.
I don't need to add bookmarks or other dynamic content but just some destinations with a name and thus injecting a /Dests collection with names defined in the resulting PDF.
I use iText 5.3.0 and I read the chapter 7 of iText in Action (2nd edition), but still I cannot figure it out how to add the destinations and so use them with #nameddest in a browser.
I'm reading and manipulating the document with PdfReader and PdfStamper. I already know in advance where to put every destination after having parsed the document with a customized Listener and a PdfContentStreamProcessor, searching for a specific text marker on each page.
This is a shortened version of my code:
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, new BufferedOutputStream(dest));
// search text markers for destinations, page by page
for (int i=1; i<reader.getNumberOfPages(); i++) {
// get a list of markers for this page, as obtained with a custom Listener and a PdfContentStreamProcessor
List<MyDestMarker> markers = ((MyListener)listener).getMarkersForThisPage();
// add a destination for every text marker in the current page
Iterator<MyDestMarker> it = markers.iterator();
while(it.hasNext()) {
MyDestMarker marker = it.next();
String name = marker.getName();
String x = marker.getX();
String y = marker.getY();
// create a new destination
PdfDestination dest = new PdfDestination(PdfDestination.FITH, y); // or XYZ
// add as a named destination -> does not work, only for new documents?
stamper.getWriter().addNamedDestination(name, i /* current page */, dest);
// alternatives
PdfContentByte content = stamper.getOverContent(i);
content.localDestination(name, dest); // doesn't work either -> no named dest found
// add dest name to a list for later use with Pdf Open Parameters
destinations.add(name);
}
}
stamper.close();
reader.close();
I also tried creating a PdfAnnotation with PdfFormField.createLink() but still, I just manage to get the annotation but with no named destination defined it does not work.
Any solution for this? Do I need to add some "ghost" content over the existing one with Chunks or something else?
Thanks in advance.
edit 01-27-2016:
I recently found an answer to my question in the examples section of iText website, here.
Unfortunately the example provided does not work for me if I test it with a pdf without destinations previously defined in it, as it is the case with the source primes.pdf which already contains a /Dests array. This behaviour appears to be consistent with the iText code, since the writer loads the destinations in a map attribute of PdfDocument which is not "inherited" by the stamper on closing.
That said, I got it working using the method addNamedDestination() of PdfStamper added with version 5.5.7; this method loads a named destination in a local map attribute of the class which is later processed and consolidated in the document when closing the stamper.
This approach reaised a new issue though: the navigation with Pdf Open Parameters (#, #nameddest=) works fine with IE but not with Chrome v47 (and probably Firefox, too). I tracked the problem down to the order in which the dests names are defined and referenced inside the document; the stamper uses a HashMap as the container for the destinations, which of course does not guarantee the order of its objects and for whatever reason Chrome refuse to recognise destinations not listed in "natural" order. So, the only way I got it to work is replacing the namedDestinations HashMap with a natural-ordered TreeMap.
Hope this help others with the same issue.

I 've been in the same need for my project previously. Had to display and navigate pdf document with acrobat.jar viewer. To navigate i needed the named destinations in the pdf. I have looked around the web for a possible solution, but no fortunate for me. Then I this idea strikes my mind.
I tried to recreate the existing pdf with itext, navigating through each page and adding localdestinations to each page and i got what I wanted. below is the snip of my code
OutputStream outputStream = new FileOutputStream(new File(filename));
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, outputStream);
document.open();
PdfContentByte cb = writer.getDirectContent();
PdfOutline pol = cb.getRootOutline();
PdfOutline oline1 = null;
InputStream in1 = new FileInputStream(new File(inf1));
PdfReader reader = new PdfReader(in1);
for (int i = 1; i <= reader.getNumberOfPages(); i++)
{
document.newPage();
document.setMargins(0.0F, 18.0F, 18.0F, 18.0F);
PdfImportedPage page = writer.getImportedPage(reader, i);
document.add(new Chunk(new Integer(i).toString()).setLocalDestination(new Integer(i).toString()));
System.out.println(i);
cb.addTemplate(page, 0.0F, 0.0F);
}
outputStream.flush();
document.close();
outputStream.close();
Thought it would help you.

How to underlay a content stream with using PDPageContentStream?

I am trying to create a watermark with using PDPageContentStream. This is what I have right now
PDPageContentStream contentStream = new PDPageContentStream(doc,page, true,true);
contentStream.beginText();
contentStream.setFont(font,40);
contentStream.setTextRotation(Math.PI/4,page.getMediaBox().getWidth()/4,page.getMediaBox().getHeight()/4);
contentStream.setNonStrokingColor(210,210,210); //light grey
contentStream.drawString(_text);
contentStream.endText();
contentStream.close();
What happens is it creates a 45 degree angled text with the light grey color. But -of course- it overlays the actual page content beneath it and it is not possible to see some of the content.
Is it possible to create the contentStream first and then append the page content?
I found this example. It uses PDExtendedGraphicsState and PDResources. I am new to pdfbox and almost have no graphics experience.
Are these what I need and What is a resource in pdfbox?
Thanks in advance.
p.s. I am aware that I can use overlay utility with a jpeg. But I am trying to figure out this problem with PDPageContentStream for now.

If you use PDFBox 2.0+, then its slightly easier now:
PDExtendedGraphicsState extendedGraphicsState = new PDExtendedGraphicsState();
extendedGraphicsState.setNonStrokingAlphaConstant((float) alpha);
contents.saveGraphicsState();
contents.setGraphicsStateParameters(extendedGraphicsState);
// do your stuff
contents.restoreGraphicsState();

I figured it out. The answer is actually here. But I had to walk it around a little bit. Here's my code:
PDExtendedGraphicsState extendedGraphicsState = new PDExtendedGraphicsState();
// Set the transparency/opacity
extendedGraphicsState.setNonStrokingAlphaConstant(0.5f);
PDResources resources = page.findResources();// Get the page resources.
// Get the defined graphic states.
Map graphicsStateDictionary = resources.getGraphicsStates();
if (graphicsStateDictionary != null){
graphicsStateDictionary.put("TransparentState", extendedGraphicsState);
resources.setGraphicsStates(graphicsStateDictionary);
}
PDPageContentStream contentStream = new PDPageContentStream(document, page,true,true);
contentStream.appendRawCommands("/TransparentState gs\n");
contentStream.setNonStrokingColor(210,210,210);
This code snippet creates a PDExtendedGraphicsState object. My understanding is 'Resources' is some kind of dictionary and holds attributes of the different PD objects like PDPage or PDGraphics.
At the beginning there is no state such as 'transparent state'. We create it with
extendedGraphicsState.setNonStrokingAlphaConstant(0.5f);
then we name the graphicsState object as TransparentState. This is how we use it in the AppendRawCommands.
This explanation might be insufficient or wrong. Please leave your comments. I'd be happy to understand it better.

As mentioned in a comment to the OP's own answer, that answer did not exactly what he had asked for (how to underlay an existing stream) but instead overlaying with transparency.
Thus, here an answer showing how one can do the originally asked for, underlaying an existing stream.
How to actually prepend a content stream
Underlaying essentially means prepending the new content to the existing content because the later content covers the former.
Unfortunately the PDFBox class PDPageContentStream does only provide constructors to append or replace everything with a new content stream but none to prepend a new content stream.
It is possible to cheat a bit, though: One can first append a new stream, fill it as if it was prepended, and finally reorder the streams:
void transformPage(PDDocument document, PDPage page) throws IOException, COSVisitorException
{
PDPageContentStream stream = new PDPageContentStream(document, page, true, true);
// add any content to stream as if it was the first stream
stream.close();
COSBase contents = page.getCOSDictionary().getDictionaryObject(COSName.CONTENTS);
if (contents instanceof COSStreamArray)
{
COSStreamArray contentsArray = (COSStreamArray) contents;
COSArray newArray = new COSArray();
newArray.add(contentsArray.get(contentsArray.getStreamCount() - 1));
for (int i = 0; i < contentsArray.getStreamCount() - 1; i++)
{
newArray.add(contentsArray.get(i));
}
COSStreamArray newStreamArray = new COSStreamArray(newArray);
page.getCOSDictionary().setItem(COSName.CONTENTS, newStreamArray);
}
}
Which is better, prepending or appending with transparency
The answer is... it depends. ;)
On one hand the solution using transparency often is nearer to the expected appearance than the solution using underlaying: If e.g. you underlay beneath a content that first fills the whole area in white, you won't see anything of the underlayed content!
On the other hand there are multiple context forbidding the use of transparency, especially in case of PDF/A variants. As overlaying with transparency in such context is not permitted, one either has to overlay using very thin lines only or simply underlay.

Add page as layer from separate pdf(different page size) using pdfbox

How can I add a page from external pdf doc to destination pdf if pages have different sizes?
Here is what I'd like to accomplish:
I tried to use LayerUtility (like in this example PDFBox LayerUtility - Importing layers into existing PDF), but once I import page from external pdf the process hangs:
PDDocument destinationPdfDoc = PDDocument.load(fileInputStream);
PDDocument externalPdf = PDDocument.load(EXTERNAL PDF);
List<PDPage> destinationPages = destinationPdfDoc.getDocumentCatalog().getAllPages();
LayerUtility layerUtility = new LayerUtility(destinationPdfDoc);
// process hangs here
PDXObjectForm firstForm = layerUtility.importPageAsForm(externalPdf, 0);
AffineTransform affineTransform = new AffineTransform();
layerUtility.appendFormAsLayer(destinationPages.get(0), firstForm, affineTransform, "external page");
destinationPdfDoc.save(resultTempFile);
destinationPdfDoc.close();
externalPdf.close();
What I'm doing wrong?

PDFBox dependencies
The main issue was that PDFBox has three core components and one required dependency. One core component was missing.
In comments the OP clarified that
Actually process doesn't hangs, the file is just not created at all.
As this sounds like there might have been an exception or error, trying to envelope the code as a try { ... } catch (Throwable t) { t.printStackTrace(); } block has been proposed in chat. And indeed,
java.lang.NoClassDefFoundError: org/apache/fontbox/util/BoundingBox
at org.apache.pdfbox.util.LayerUtility.importPageAsForm(LayerUtility.java:203)
at org.apache.pdfbox.util.LayerUtility.importPageAsForm(LayerUtility.java:135)
at ...
As it turned out, fontbox.jar was missing from the OP's setup.
The PDFBox version 1.8.x dependencies are described here. Especially there are the three core components pdfbox, fontbox, and jempbox all of which shall be present in the same version, and there is the required dependency commons-logging.
As soon as the missing component had been added, the sample worked properly.
Positioning the imported page
The imported page can be positioned on the target page by means of a translation in the AffineTransform parameter. This parameter also allows for other transformations, e.g. to scale, rotate, mirror, skew,...*
For the original sample files this PDF page
was added onto onto this page
which resulted in this page
The OP then wondered
how to position the imported layer
The parameter for that in the layerUtility.appendFormAsLayer call is the AffineTransform affineTransform. The OP used new AffineTransform() here which creates an identity matrix which in turn causes the source page to be added at the origin of coordinate system, in this case at the bottom.
By using a translation instead of the identity, e.g
PDRectangle destCrop = destinationPages.get(0).findCropBox();
PDRectangle sourceBox = firstForm.getBBox();
AffineTransform affineTransform = AffineTransform.getTranslateInstance(0, destCrop.getUpperRightY() - sourceBox.getHeight());
one can position the source page elsewhere, e.g. at the top:
PDFBox LayerUtility's expectations
Unfortunately it turns out that layerUtility.appendFormAsLayer appends the form to the page without resetting the graphics context.
layerUtility.appendFormAsLayer uses this code to add an additional content stream:
PDPageContentStream contentStream = new PDPageContentStream(
targetDoc, targetPage, true, !DEBUG);
Unfortunately a content stream generated by this constructor inherits the graphics state as is at the end of the existing content of the target page. This especially means that the user space coordinate system may not be in its default state anymore. Some software e.g. mirrors the coordinate system to have y coordinates increasing downwards.
If instead
PDPageContentStream contentStream = new PDPageContentStream(
targetDoc, targetPage, true, !DEBUG, true);
had been used, the graphics state would have been reset to its default state and, therefore, be known.
By itself, therefore, this method is not usable in a controlled manner for arbitrary input.
Fortunately, though, the LayerUtility also has a method wrapInSaveRestore(PDPage) to overcome this weakness by manipulating the content of the given page to have the default graphics state at the end.
Thus, one should replace
layerUtility.appendFormAsLayer(destinationPages.get(0), firstForm, affineTransform, "external page");
by
PDPage destPage = destinationPages.get(0);
layerUtility.wrapInSaveRestore(destPage);
layerUtility.appendFormAsLayer(destPage, firstForm, affineTransform, "external page");

PDFBox LayerUtility - Importing layers into existing PDF

I am using pdfbox to manipulate PDF content. I have a big PDF file (say 500 pages). I also have a few other single page PDF files containing only a single image which are around 8-15kb per file at the max. What I need to do is to import these single page pdf's like an overlay onto certain pages of the big PDF file.
I have tried the LayerUtility of pdfbox where I've succeeded but it creates a very large sized file as the output. The source pdf is about 1MB before processing and when added with the smaller pdf files, the size goes upto 64MB. And sometimes I need to include two smaller PDF's onto the bigger one.
Is there a better way to do this or am I just doing this wrong? Posting code below trying to add two layers onto a single page:
...
...
..
overlayDoc[pCounter] = PDDocument.load("data\\" + overlay + ".pdf");
outputPage[pCounter] = (PDPage) overlayDoc[pCounter].getDocumentCatalog().getAllPages().get(0);
LayerUtility lu = new LayerUtility( overlayDoc[pCounter] );
form[pCounter] = lu.importPageAsForm( bigPDFDoc, Integer.parseInt(pageNo)-1);
lu.appendFormAsLayer( outputPage[pCounter], form[pCounter], aTrans, "OVERLAY_"+pCounter );
outputDoc.addPage(outputPage[pCounter]);
mOverlayDoc[pCounter] = PDDocument.load("data\\" + overlay2 + ".pdf");
mOutputPage[pCounter] = (PDPage) mOverlayDoc[pCounter].getDocumentCatalog().getAllPages().get(0);
LayerUtility lu2 = new LayerUtility( mOverlayDoc[pCounter] );
mForm[pCounter] = lu2.importPageAsForm(outputDoc, outputDoc.getNumberOfPages()-1);
lu.appendFormAsLayer( mOutputPage[pCounter], mForm[pCounter], aTrans, "OVERLAY_2"+pCounter );
outputDoc.removePage(outputPage[pCounter]);
outputDoc.addPage(mOutputPage[pCounter]);
...
...

With code like the following I don't see any unepected growth of size:
PDDocument bigDocument = PDDocument.load(BIG_SOURCE_FILE);
LayerUtility layerUtility = new LayerUtility(bigDocument);
List bigPages = bigDocument.getDocumentCatalog().getAllPages();
// import each page to superimpose only once
PDDocument firstSuperDocument = PDDocument.load(FIRST_SUPER_FILE);
PDXObjectForm firstForm = layerUtility.importPageAsForm(firstSuperDocument, 0);
PDDocument secondSuperDocument = PDDocument.load(SECOND_SUPER_FILE);
PDXObjectForm secondForm = layerUtility.importPageAsForm(secondSuperDocument, 0);
// These things can easily be done in a loop, too
AffineTransform affineTransform = new AffineTransform(); // Identity... your requirements may differ
layerUtility.appendFormAsLayer((PDPage) bigPages.get(0), firstForm, affineTransform, "Superimposed0");
layerUtility.appendFormAsLayer((PDPage) bigPages.get(1), secondForm, affineTransform, "Superimposed1");
layerUtility.appendFormAsLayer((PDPage) bigPages.get(2), firstForm, affineTransform, "Superimposed2");
bigDocument.save(BIG_TARGET_FILE);
As you see I superimposed the first page of FIRST_SUPER_FILE on two pages of the target file but I only imported the page once. Thus, also the resources of that imported page are imported only once.
This approach is open for loops, too, but don't import the same page multiple times! Instead import all required template pages once up front as forms and in the later loop reference those forms again and again.
(I hope this solves your issue. If not, supply more code and the sample PDFs to reproduce your issue.)

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

How to insert an PDPage within another PDPage with pdfbox - java

Related

Remove or Update added Image icon from pdf page using OpenPdf based on iText Core

Add named destinations to an existing PDF document with iText

How to underlay a content stream with using PDPageContentStream?

Add page as layer from separate pdf(different page size) using pdfbox

PDFBox LayerUtility - Importing layers into existing PDF

Categories

Resources