I am using the following code to shrink the content of every page (top and bottom) of an existing PDF with the iText library.
The code works fine.
But when I process the resulting PDF, I get a rotation of 0 for every page, while the original PDF also has pages with other rotations (e.g. 90°).
I want to keep the rotation as it is, but I have been unable to do so.
This is the code I use to shrink the pages:
public void shrinkPDFPages() throws Exception {
PdfReader reader = new PdfReader("D:/testpdfs/test.pdf");
Document doc = new Document();
PdfWriter writer = PdfWriter.getInstance(doc, new FileOutputStream(
"D://testpdfs/result.pdf"));
doc.open();
PdfContentByte cb = writer.getDirectContent();
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
PdfImportedPage page = writer.getImportedPage(reader, i);
float pageHeight = reader.getPageSizeWithRotation(i).getHeight();
float pageWidth = reader.getPageSizeWithRotation(i).getWidth();
int rotation = reader.getPageRotation(i);
Rectangle pageRectangle = reader.getPageSizeWithRotation(i);
Rectangle PageRect = null;
System.out.println(rotation);
switch (rotation) {
case 0:
PageRect = new Rectangle(pageRectangle.getWidth(), pageRectangle
.getHeight());
doc.setPageSize(PageRect);
doc.newPage();
AffineTransform af = new AffineTransform();
af.scale(1, 0.84f);
af.translate(1, 50);
cb.addTemplate(page, af);
break;
case 90:
PageRect = new Rectangle(pageRectangle.getWidth(), pageRectangle
.getHeight());
doc.setPageSize(PageRect);
doc.newPage();
cb.addTemplate(page, 0, -1f, 0.84f, 0, 50, pageHeight);
break;
case 270:
PageRect = new Rectangle(pageRectangle.getWidth(), pageRectangle
.getHeight());
doc.setPageSize(PageRect);
doc.newPage();
cb.addTemplate(page, 0, 1f, -0.84f, 0, pageWidth - 50, 0);
break;
case 180:
PageRect = new Rectangle(pageRectangle.getWidth(), pageRectangle
.getHeight());
doc.setPageSize(PageRect);
doc.newPage();
cb.addTemplate(page, -1f, 0, 0, -0.84f, pageWidth,
pageHeight - 50);
break;
default:
break;
}
}
doc.close();
}
What should I do so that the rotation is preserved?
One more problem I am facing: I am unable to preserve internal hyperlinks.
Actual pdf page:
After Shrink(Scale Down Content):
Scaling a PDF to make pages larger is easy. This is shown in the ScaleRotate example. It's only a matter of changing the value of the user unit. By default, this unit is 1, meaning that 1 user unit equals 1 point. You can increase this value up to 75,000. Larger values of the user unit result in larger pages.
Unfortunately, the user unit can never be smaller than 1, so you can't use this technique to shrink a page. If you want to shrink a page, you need to introduce a transformation (resulting in a new CTM).
This is shown in the ShrinkPdf example. In this example, I take a PDF named hero.pdf that measures 8.26 by 11.69 inches and shrink it by 50%, resulting in a PDF named hero_shrink.pdf that measures 4.13 by 5.85 inches.
To achieve this, we need a dirty hack:
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
int n = reader.getNumberOfPages();
PdfDictionary page;
PdfArray crop;
PdfArray media;
for (int p = 1; p <= n; p++) {
page = reader.getPageN(p);
media = page.getAsArray(PdfName.CROPBOX);
if (media == null) {
media = page.getAsArray(PdfName.MEDIABOX);
}
crop = new PdfArray();
crop.add(new PdfNumber(0));
crop.add(new PdfNumber(0));
crop.add(new PdfNumber(media.getAsNumber(2).floatValue() / 2));
crop.add(new PdfNumber(media.getAsNumber(3).floatValue() / 2));
page.put(PdfName.MEDIABOX, crop);
page.put(PdfName.CROPBOX, crop);
stamper.getUnderContent(p).setLiteral("\nq 0.5 0 0 0.5 0 0 cm\nq\n");
stamper.getOverContent(p).setLiteral("\nQ\nQ\n");
}
stamper.close();
reader.close();
}
We loop over every page and we take the crop box of each page. If there is no crop box, we take the media box. These boxes are stored as arrays of 4 numbers. In this example, I assume that the first two numbers are zero and I divide the next two values by 2 to shrink the page to 50% (if the first two values are not zero, you'll need a more elaborate formula).
Once I have the new array, I change the media box and the crop box to this array, and I introduce a CTM that scales all content down to 50%. I need to use the setLiteral() method to fool iText.
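For boxes whose lower-left corner is not at (0, 0), the "more elaborate formula" can be sketched in plain Java without any iText calls. Because a transformation of the form `s 0 0 s 0 0 cm` scales around the origin of the coordinate system, all four box entries are multiplied by the same factor. The class and method names below are illustrative assumptions, not part of iText or of the ShrinkPdf example.

```java
import java.util.Arrays;

/** Plain-Java sketch (no iText): scaling a page box together with its content. */
class BoxShrinkSketch {

    /**
     * Scales all four entries [llx, lly, urx, ury] by the same factor.
     * This matches a content transformation "s 0 0 s 0 0 cm", which scales
     * around the origin of the coordinate system, not around the box corner.
     */
    static float[] scaleBox(float[] box, float factor) {
        float[] scaled = new float[4];
        for (int i = 0; i < 4; i++) {
            scaled[i] = box[i] * factor;
        }
        return scaled;
    }

    /** Builds the literal that would be handed to setLiteral() for that factor. */
    static String scaleOperator(float factor) {
        // %s formats the float via Float.toString(), so the decimal separator is always a dot.
        return String.format("\nq %s 0 0 %s 0 0 cm\nq\n", factor, factor);
    }

    public static void main(String[] args) {
        float[] media = {20f, 40f, 615f, 882f}; // hypothetical box with a shifted origin
        System.out.println(Arrays.toString(scaleBox(media, 0.5f)));
        System.out.println(scaleOperator(0.5f));
    }
}
```

With a box that starts at (0, 0) this reduces to the example above: the first two entries stay zero and the last two are halved.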
Based on your feedback in the comments section, it appears that you do not understand the mechanics explained in my example. Hence I have made a second example, named ShrinkPdf2:
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
int n = reader.getNumberOfPages();
float percentage = 0.8f;
for (int p = 1; p <= n; p++) {
float offsetX = (reader.getPageSize(p).getWidth() * (1 - percentage)) / 2;
float offsetY = (reader.getPageSize(p).getHeight() * (1 - percentage)) / 2;
stamper.getUnderContent(p).setLiteral(
String.format("\nq %s 0 0 %s %s %s cm\nq\n", percentage, percentage, offsetX, offsetY));
stamper.getOverContent(p).setLiteral("\nQ\nQ\n");
}
stamper.close();
reader.close();
}
In this example, I don't change the page size (no changes to the media or crop box), I only shrink the content (in this case to 80%) and I center the shrunken content on the page, leaving bigger margins to the top, bottom, left and right.
As you can see, this is only a matter of applying the correct math. This second example is even simpler than the first, so I introduced some extra complexity: now you can easily change the percentage. In the first example I hardcoded 50%; in this case I introduced the percentage variable and defined it as 80%.
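The arithmetic behind the centering can be checked with a small plain-Java sketch that does not touch iText. It computes the offset that keeps the center of the page fixed and then applies the transformation to a few coordinates. The helper names are hypothetical; for a page whose lower-left corner is (0, 0) the offset equals the width * (1 - percentage) / 2 used in the example.

```java
/** Plain-Java sketch (no iText): the arithmetic behind a centered shrink. */
class CenteredShrinkSketch {

    /** Offset that keeps the center between lower and upper fixed when scaling by percentage. */
    static float offsetFor(float lower, float upper, float percentage) {
        float center = (lower + upper) / 2;
        return center * (1 - percentage);
    }

    /** Applies "scale, then translate" to a single coordinate, as the cm operator does. */
    static float transform(float value, float percentage, float offset) {
        return value * percentage + offset;
    }

    public static void main(String[] args) {
        float percentage = 0.8f;
        // A4 page with its lower-left corner at (0, 0): 595 x 842 points.
        float offsetX = offsetFor(0f, 595f, percentage);
        float offsetY = offsetFor(0f, 842f, percentage);
        System.out.println("offsets: " + offsetX + ", " + offsetY);
        System.out.println("left edge maps to x = " + transform(0f, percentage, offsetX));
        System.out.println("right edge maps to x = " + transform(595f, percentage, offsetX));
        System.out.println("center maps to x = " + transform(297.5f, percentage, offsetX));
    }
}
```

The left and right edges move inward by the same amount, which is what produces equal margins on both sides.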
Note that I apply the scaling in the X direction as well as in the Y direction to preserve the aspect ratio. Looking at your screen shots, it looks like you only want to shrink the content in the Y direction. For instance like this:
String.format("\nq 1 0 0 %s 0 %s cm\nq\n", percentage, offsetY)
Feel free to adapt the code to meet your need, but... the result will be ugly: all text will look funny and if you apply it to photos of people standing up vertically, you'll make them look fat.
Related
I have a problem with the document size. I pass in a PDF in vertical orientation (595.0x842.0) and change its orientation to horizontal (842.0x595.0), but when I return the ByteArrayOutputStream as a byte array and read that byte array back, I get the vertical orientation (595.0x842.0) instead of the horizontal one (842.0x595.0).
To write the new content I use a PdfWriter, passing a Document with the new size and the ByteArrayOutputStream as second parameter.
PdfReader reader = new PdfReader(pdf.getFile());
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
Document document = new Document(PageSize.A4.rotate(), 0, 0, 0, 0);
PdfWriter writer = PdfWriter.getInstance(document, outputStream);
I expect the output to be horizontal (842.0x595.0) instead of vertical (595.0x842.0).
I need this dimension in the following code:
PdfReader reader = new PdfReader(pdf.getFile());
int n = reader.getNumberOfPages();
PdfStamper stamper = new PdfStamper(reader,outputStream);
PdfContentByte pageContent;
for (int i = 0; i < n;) {
pageContent = stamper.getOverContent(++i);
System.out.println(reader.getPageSize(i));
ColumnText.showTextAligned(pageContent, Element.ALIGN_RIGHT,
new Phrase(String.format("page %s of %s", i, n)),
reader.getPageSize(i).getWidth()- 20, 20, 0);
}
The variable pdf.getFile() is the byte array (from the ByteArrayOutputStream) produced by the previous code.
I expect the page number in the lower right corner (x near 842), but it actually appears near x = 595.
For better understanding I attach a screenshot.
You use the PdfReader method getPageSize but you should use getPageSizeWithRotation, i.e. you should replace
System.out.println(reader.getPageSize(i));
ColumnText.showTextAligned(pageContent, Element.ALIGN_RIGHT,
new Phrase(String.format("page %s of %s", i, n)),
reader.getPageSize(i).getWidth()- 20, 20, 0);
by
System.out.println(reader.getPageSizeWithRotation(i));
ColumnText.showTextAligned(pageContent, Element.ALIGN_RIGHT,
new Phrase(String.format("page %s of %s", i, n)),
reader.getPageSizeWithRotation(i).getWidth()- 20, 20, 0);
The Backgrounds
There are two properties of a page responsible for the final displayed page dimension: MediaBox and Rotate. (Let's for the moment ignore the crop box and all the other boxes which also exist.)
A landscape A4 page, therefore, can be created in two conceptually different ways, either as a 842x595 media box with 0° (or 180°) rotation or as a 595x842 media box with 90° (or 270°) rotation.
If you instantiate a Document with PageSize.A4.rotate(), iText uses the latter way. (If you had instantiated it with new RectangleReadOnly(842,595), iText would have used the former way.)
PdfReader.getPageSize only inspects the media box. Thus, it returns a rectangle with width 595 for your PDF page with 595x842 media box and 90° rotation.
PdfReader.getPageSizeWithRotation also inspects the page rotation. Thus, it returns a rectangle with width 842 for your PDF page with 595x842 media box and 90° rotation.
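The relationship between the two properties can be sketched in plain Java. This is only an illustration of the idea (a quarter turn swaps width and height), not iText's actual implementation; the class and method names are assumptions made for this example.

```java
/** Plain-Java sketch (no iText): how MediaBox and Rotate combine into the displayed size. */
class DisplayedPageSizeSketch {

    /** Returns {width, height} as a viewer would show the page. */
    static float[] displayedSize(float mediaWidth, float mediaHeight, int rotation) {
        // Normalize so that e.g. -90 and 450 are handled like 270 and 90.
        int normalized = ((rotation % 360) + 360) % 360;
        if (normalized == 90 || normalized == 270) {
            return new float[] {mediaHeight, mediaWidth}; // a quarter turn swaps the sides
        }
        return new float[] {mediaWidth, mediaHeight};
    }

    public static void main(String[] args) {
        float[] plain = displayedSize(595f, 842f, 0);
        float[] rotated = displayedSize(595f, 842f, 90);
        System.out.println("rotation 0:  " + plain[0] + " x " + plain[1]);
        System.out.println("rotation 90: " + rotated[0] + " x " + rotated[1]);
    }
}
```

Both ways of creating a landscape A4 page (842x595 with rotation 0, or 595x842 with rotation 90) yield the same displayed size in this sketch, which is why the media box alone is not enough to position the page number.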
I am trying to add page numbers in the top right corner of the pages of merged PDF files using iText, but the content size of my PDFs differs. After merging, printing the page sizes gives approximately the same height and width for each page, yet I cannot see the page numbers because of the difference in content size. Please see the code below, and the attached PDFs, which I use for merging the PDFs and adding page numbers.
public class PageNumber {
public static void main(String[] args) {
PageNumber number = new PageNumber();
try {
String DOC_ONE_PATH = "C:/Users/Admin/Downloads/codedetailsforartwork/elebill.pdf";
String DOC_TWO_PATH = "C:/Users/Admin/Downloads/codedetailsforartwork/PP-P0109916.pdf";
String DOC_THREE_PATH = "C:/Users/Admin/Downloads/codedetailsforartwork/result.pdf";
String[] files = { DOC_ONE_PATH, DOC_TWO_PATH };
Document document = new Document();
PdfCopy copy = new PdfCopy(document, new FileOutputStream(DOC_THREE_PATH));
document.open();
PdfReader reader;
int n;
for (int i = 0; i < files.length; i++) {
reader = new PdfReader(files[i]);
n = reader.getNumberOfPages();
for (int page = 0; page < n; ) {
copy.addPage(copy.getImportedPage(reader, ++page));
}
copy.freeReader(reader);
reader.close();
}
// step 5
document.close();
number.manipulatePdf(
"C:/Users/Admin/Downloads/codedetailsforartwork/result.pdf",
"C:/Users/Admin/Downloads/codedetailsforartwork/PP-P0109916_1.pdf");
} catch (IOException | DocumentException | APIException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
public static void manipulatePdf(String src, String dest)
throws IOException, DocumentException, APIException {
PdfReader reader = new PdfReader(src);
int n = reader.getNumberOfPages();
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
PdfContentByte pagecontent;
for (int i = 0; i < n;) {
pagecontent = stamper.getOverContent(++i);
System.out.println(i);
com.itextpdf.text.Rectangle pageSize = reader.getPageSize(i);
pageSize.normalize();
float height = pageSize.getHeight();
float width = pageSize.getWidth();
System.out.println(width + " " + height);
ColumnText.showTextAligned(pagecontent, Element.ALIGN_CENTER,
new Phrase(String.format("page %d of %d", i, n)),
width - 200, height-85, 0);
}
stamper.close();
reader.close();
}
}
PDF files Zip
@Bruno's answer explains, or references answers that explain, all relevant facts on the issue at hand.
In a nutshell, the two issues of the OP's code are:
he uses reader.getPageSize(i); while this indeed returns the page size, PDF viewers do not display the whole page size but merely the crop box on it. Thus, the OP should use reader.getCropBox(i) instead. According to the PDF specification, "the crop box defines the region to which the contents of the page shall be clipped (cropped) when displayed or printed. ... The default value is the page’s media box."
he uses pageSize.getWidth() and pageSize.getHeight() to determine the upper right corner but should use pageSize.getRight() and pageSize.getTop() instead. The boxes defining the PDF coordinate system may not have the origin in their lower left corner.
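The second point can be illustrated with plain arithmetic. The sketch below uses a hypothetical crop box whose lower-left corner is not at the origin and compares a position derived from the right and top coordinates with one derived from width and height. The class name, method name, and sample numbers are assumptions for illustration only.

```java
/** Plain-Java sketch (no iText): anchoring text to the upper right corner of a box. */
class UpperRightAnchorSketch {

    /** Returns {x, y} placed marginX left of and marginY below the upper right corner. */
    static float[] anchor(float right, float top, float marginX, float marginY) {
        return new float[] {right - marginX, top - marginY};
    }

    public static void main(String[] args) {
        // Hypothetical crop box whose lower-left corner is (100, 50) instead of (0, 0).
        float left = 100f, bottom = 50f, right = 695f, top = 892f;
        float width = right - left;   // 595
        float height = top - bottom;  // 842

        float[] fromCorner = anchor(right, top, 20f, 20f);
        System.out.println("using right/top:    " + fromCorner[0] + ", " + fromCorner[1]);
        // Using width/height instead lands 100 points too far left and 50 points too low.
        System.out.println("using width/height: " + (width - 20f) + ", " + (height - 20f));
    }
}
```

The two results agree only when the box starts at (0, 0).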
I don't understand why you are defining the position of the page number like this:
com.itextpdf.text.Rectangle pageSize = reader.getPageSize(i);
pageSize.normalize();
float height = pageSize.getHeight();
float width = pageSize.getWidth();
where you use
x = width - 200;
y = height - 85;
How does that make sense?
If you have an A4 page in portrait with (0,0) as the coordinate of the lower-left corner, the page number will be added at position x = 395; y = 757. However, (0,0) isn't always the coordinate of the lower-left corner, so the first A4 page with the origin at another position will already put the page number at another position. If the page size is different, the page number will move to other places.
It's as if you're totally unaware of previously answered questions such as How should I interpret the coordinates of a rectangle in PDF? and Where is the Origin (x,y) of a PDF page?
I know, I know, finding these specific answers on StackOverflow is hard, but I've spent many weeks organizing the best iText questions on StackOverflow on the official web site. See for instance: How should I interpret the coordinates of a rectangle in PDF? and Where is the origin (x,y) of a PDF page?
These Q&As are even available in a free ebook! If you take a moment to educate yourself by reading the documentation, you'll find the answer to the question How to position text relative to page? that was already answered on StackOverflow in 2013: How to position text relative to page using iText?
For instance, if you want to position your page number at the bottom and in the middle, you need to define your coordinates like this:
float x = pageSize.getLeft() + pageSize.getWidth() / 2;
float y = pageSize.getBottom() + 10;
ColumnText.showTextAligned(pagecontent, Element.ALIGN_CENTER,
new Phrase(String.format("page %d of %d", i, n)), x, y, 0);
I hope this answer will inspire you to read the documentation. I've spent weeks of work on organizing that documentation and it's frustrating when I discover that people don't read it.
I have a bunch of PDF documents in a folder and I want to augment them with a watermark. What are my options from a Java serverside context?
Preferably the watermark will support transparency. Support for both vector and raster watermarks is desirable.
Please take a look at the TransparentWatermark2 example. It adds transparent text on each odd page and a transparent image on each even page of an existing PDF document.
This is how it's done:
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
PdfReader reader = new PdfReader(src);
int n = reader.getNumberOfPages();
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
// text watermark
Font f = new Font(FontFamily.HELVETICA, 30);
Phrase p = new Phrase("My watermark (text)", f);
// image watermark
Image img = Image.getInstance(IMG);
float w = img.getScaledWidth();
float h = img.getScaledHeight();
// transparency
PdfGState gs1 = new PdfGState();
gs1.setFillOpacity(0.5f);
// properties
PdfContentByte over;
Rectangle pagesize;
float x, y;
// loop over every page
for (int i = 1; i <= n; i++) {
pagesize = reader.getPageSizeWithRotation(i);
x = (pagesize.getLeft() + pagesize.getRight()) / 2;
y = (pagesize.getTop() + pagesize.getBottom()) / 2;
over = stamper.getOverContent(i);
over.saveState();
over.setGState(gs1);
if (i % 2 == 1)
ColumnText.showTextAligned(over, Element.ALIGN_CENTER, p, x, y, 0);
else
over.addImage(img, w, 0, 0, h, x - (w / 2), y - (h / 2));
over.restoreState();
}
stamper.close();
reader.close();
}
As you can see, we create a Phrase object for the text and an Image object for the image. We also create a PdfGState object for the transparency. In our case, we go for 50% opacity (change the 0.5f into something else to experiment).
Once we have these objects, we loop over every page. We use the PdfReader object to get information about the existing document, for instance the dimensions of every page. We use the PdfStamper object when we want to stamp extra content on the existing document, for instance adding a watermark on top of each single page.
When changing the graphics state, it is always safe to perform a saveState() before you start and a restoreState() once you're finished. Your code will probably also work if you don't do this, but believe me: it can save you plenty of debugging time if you adopt this discipline, as you can get really strange effects if the graphics state is out of balance.
We apply the transparency using the setGState() method and depending on whether the page is an odd page or an even page, we add the text (using ColumnText and an (x, y) coordinate calculated so that the text is added in the middle of each page) or the image (using the addImage() method and the appropriate parameters for the transformation matrix).
Once you've done this for every page in the document, you have to close() the stamper and the reader.
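The coordinate calculations used for the centered text and image can be isolated in a small plain-Java sketch. It computes the center of a page box and the six matrix values that place an image of a given scaled size so that its middle sits on that center. The helper names are hypothetical; only the formulas are taken from the example above.

```java
import java.util.Arrays;

/** Plain-Java sketch (no iText): the placement math used for the centered watermark. */
class WatermarkPlacementSketch {

    /** Center {x, y} of a box given by its left, bottom, right and top coordinates. */
    static float[] center(float left, float bottom, float right, float top) {
        return new float[] {(left + right) / 2, (top + bottom) / 2};
    }

    /** Matrix values (a, b, c, d, e, f) for an unrotated w x h image centered on (cx, cy). */
    static float[] imageMatrix(float w, float h, float cx, float cy) {
        return new float[] {w, 0, 0, h, cx - w / 2, cy - h / 2};
    }

    public static void main(String[] args) {
        float[] c = center(0f, 0f, 595f, 842f);              // A4 portrait page
        float[] m = imageMatrix(200f, 100f, c[0], c[1]);      // hypothetical 200 x 100 image
        System.out.println("center: " + Arrays.toString(c));
        System.out.println("matrix: " + Arrays.toString(m));
    }
}
```

The last two matrix values are the lower-left corner of the image, which is why half the width and half the height are subtracted from the center.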
Caveat:
You'll notice that pages 3 and 4 are in landscape, yet there is a difference between those two pages that isn't visible to the naked eye. Page 3 is actually a page of which the size is defined as if it were a page in portrait, but it is rotated by 90 degrees. Page 4 is a page of which the size is defined in such a way that the width > the height.
This can have an impact on the way you add a watermark, but if you use getPageSizeWithRotation(), iText will adapt. This may not be what you want: maybe you want the watermark to be added differently.
Take a look at TransparentWatermark3:
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
PdfReader reader = new PdfReader(src);
int n = reader.getNumberOfPages();
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
stamper.setRotateContents(false);
// text watermark
Font f = new Font(FontFamily.HELVETICA, 30);
Phrase p = new Phrase("My watermark (text)", f);
// image watermark
Image img = Image.getInstance(IMG);
float w = img.getScaledWidth();
float h = img.getScaledHeight();
// transparency
PdfGState gs1 = new PdfGState();
gs1.setFillOpacity(0.5f);
// properties
PdfContentByte over;
Rectangle pagesize;
float x, y;
// loop over every page
for (int i = 1; i <= n; i++) {
pagesize = reader.getPageSize(i);
x = (pagesize.getLeft() + pagesize.getRight()) / 2;
y = (pagesize.getTop() + pagesize.getBottom()) / 2;
over = stamper.getOverContent(i);
over.saveState();
over.setGState(gs1);
if (i % 2 == 1)
ColumnText.showTextAligned(over, Element.ALIGN_CENTER, p, x, y, 0);
else
over.addImage(img, w, 0, 0, h, x - (w / 2), y - (h / 2));
over.restoreState();
}
stamper.close();
reader.close();
}
In this case, we don't use getPageSizeWithRotation() but simply getPageSize(). We also tell the stamper not to compensate for the existing page rotation: stamper.setRotateContents(false);
Take a look at the difference in the resulting PDFs:
In the first screen shot (showing page 3 and 4 of the resulting PDF of TransparentWatermark2), the page to the left is actually a page in portrait rotated by 90 degrees. iText however, treats it as if it were a page in landscape just like the page to the right.
In the second screen shot (showing page 3 and 4 of the resulting PDF of TransparentWatermark3), the page to the left is a page in portrait rotated by 90 degrees and we add the watermark as if the page is in portrait. As a result, the watermark is also rotated by 90 degrees. This doesn't happen with the page to the right, because that page has a rotation of 0 degrees.
This is a subtle difference, but I thought you'd want to know.
If you want to read this answer in French, please read Comment créer un filigrane transparent en PDF?
Best option is iText. Check a watermark demo here
The important part of the code (where the watermark is inserted) is this:
public class Watermark extends PdfPageEventHelper {
@Override
public void onEndPage(PdfWriter writer, Document document) {
// insert here your watermark
}
}
Read carefully the example.
onEndPage() method will be something like (in my logo-watermarks I use com.itextpdf.text.Image;):
Image image = Image.getInstance(this.getClass().getResource("/path/to/image.png"));
// set transparency
image.setTransparency(transparency);
// set position
image.setAbsolutePosition(absoluteX, absoluteY);
// put into document
document.add(image);
I have a PDF which consists of some data, followed by some whitespace. I don't know how large the data is, but I'd like to trim off the whitespace following the data.
PdfReader reader = new PdfReader(PDFLOCATION);
Rectangle rect = new Rectangle(700, 2000);
Document document = new Document(rect);
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(SAVELCATION));
document.open();
int n = reader.getNumberOfPages();
PdfImportedPage page;
for (int i = 1; i <= n; i++) {
document.newPage();
page = writer.getImportedPage(reader, i);
Image instance = Image.getInstance(page);
document.add(instance);
}
document.close();
Is there a way to clip/trim the whitespace for each page in the new document?
This PDF contains vector graphics.
I'm using iTextPDF, but can switch to any Java library (mavenized, Apache license preferred).
As no actual solution has been posted, here are some pointers from the accompanying itext-questions mailing list thread:
As you want to merely trim pages, this is not a case of PdfWriter + getImportedPage usage but instead of PdfStamper usage. Your main code using a PdfStamper might look like this:
PdfReader reader = new PdfReader(resourceStream);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream("target/test-outputs/test-trimmed-stamper.pdf"));
// Go through all pages
int n = reader.getNumberOfPages();
for (int i = 1; i <= n; i++)
{
Rectangle pageSize = reader.getPageSize(i);
Rectangle rect = getOutputPageSize(pageSize, reader, i);
PdfDictionary page = reader.getPageN(i);
page.put(PdfName.CROPBOX, new PdfArray(new float[]{rect.getLeft(), rect.getBottom(), rect.getRight(), rect.getTop()}));
stamper.markUsed(page);
}
stamper.close();
As you see I also added another argument to your getOutputPageSize method to-be. It is the page number. The amount of white space to trim might differ on different pages after all.
If the source document did not contain vector graphics, you could simply use the iText parser package classes. There even already is a TextMarginFinder based on them. In this case the getOutputPageSize method (with the additional page parameter) could look like this:
private Rectangle getOutputPageSize(Rectangle pageSize, PdfReader reader, int page) throws IOException
{
PdfReaderContentParser parser = new PdfReaderContentParser(reader);
TextMarginFinder finder = parser.processContent(page, new TextMarginFinder());
Rectangle result = new Rectangle(finder.getLlx(), finder.getLly(), finder.getUrx(), finder.getUry());
System.out.printf("Text/bitmap boundary: %f,%f to %f, %f\n", finder.getLlx(), finder.getLly(), finder.getUrx(), finder.getUry());
return result;
}
Using this method with your file test.pdf results in:
As you see the code trims according to text (and bitmap image) content on the page.
To find the bounding box respecting vector graphics, too, you essentially have to do the same but you have to extend the parser framework used here to inform its listeners (the TextMarginFinder essentially is a listener to drawing events sent from the parser framework) about vector graphics operations, too. This is non-trivial, especially if you don't know PDF syntax by heart yet.
If your PDFs to trim are not too generic but can be forced to include some text or bitmap graphics in relevant positions, though, you could use the sample code above (probably with minor changes) anyways.
E.g. if your PDFs always start with text on top and end with text at the bottom, you could change getOutputPageSize to create the result rectangle like this:
Rectangle result = new Rectangle(pageSize.getLeft(), finder.getLly(), pageSize.getRight(), finder.getUry());
This only trims top and bottom empty space:
Depending on your input data pool and requirements this might suffice.
Or you can use some other heuristics depending on your knowledge on the input data. If you know something about the positioning of text (e.g. the heading to always be centered and some other text to always start at the left), you can easily extend the TextMarginFinder to take advantage of this knowledge.
Recent (April 2015, iText 5.5.6-SNAPSHOT) improvements
The current development version, 5.5.6-SNAPSHOT, extends the parser package to also include vector graphics parsing. This allows for an extension of iText's original TextMarginFinder class implementing the new ExtRenderListener methods like this:
@Override
public void modifyPath(PathConstructionRenderInfo renderInfo)
{
List<Vector> points = new ArrayList<Vector>();
if (renderInfo.getOperation() == PathConstructionRenderInfo.RECT)
{
float x = renderInfo.getSegmentData().get(0);
float y = renderInfo.getSegmentData().get(1);
float w = renderInfo.getSegmentData().get(2);
float h = renderInfo.getSegmentData().get(3);
points.add(new Vector(x, y, 1));
points.add(new Vector(x+w, y, 1));
points.add(new Vector(x, y+h, 1));
points.add(new Vector(x+w, y+h, 1));
}
else if (renderInfo.getSegmentData() != null)
{
for (int i = 0; i < renderInfo.getSegmentData().size()-1; i+=2)
{
points.add(new Vector(renderInfo.getSegmentData().get(i), renderInfo.getSegmentData().get(i+1), 1));
}
}
for (Vector point: points)
{
point = point.cross(renderInfo.getCtm());
Rectangle2D.Float pointRectangle = new Rectangle2D.Float(point.get(Vector.I1), point.get(Vector.I2), 0, 0);
if (currentPathRectangle == null)
currentPathRectangle = pointRectangle;
else
currentPathRectangle.add(pointRectangle);
}
}
@Override
public Path renderPath(PathPaintingRenderInfo renderInfo)
{
if (renderInfo.getOperation() != PathPaintingRenderInfo.NO_OP)
{
if (textRectangle == null)
textRectangle = currentPathRectangle;
else
textRectangle.add(currentPathRectangle);
}
currentPathRectangle = null;
return null;
}
@Override
public void clipPath(int rule)
{
}
(Full source: MarginFinder.java)
Using this class to trim the white space results in
which is pretty much what one would hope for.
Beware: The implementation above is far from optimal. It is not even correct as it includes all curve control points which is too much. Furthermore it ignores stuff like line width or wedge types. It actually merely is a proof-of-concept.
All test code is in TestTrimPdfPage.java.
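The core idea of the margin finder, transforming each path point by the current transformation matrix and growing a bounding rectangle around the results, can be tried out with java.awt.geom alone. The following is a simplified sketch under that assumption; it does not use the iText parser classes, and the class and method names are made up for this illustration.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;

/** Plain-Java sketch (java.awt.geom only): growing a bounding box from transformed points. */
class BoundingBoxSketch {

    private Rectangle2D.Float bounds; // null until the first point arrives

    /** Transforms a point by the given matrix and extends the bounding box to include it. */
    void addPoint(float x, float y, AffineTransform ctm) {
        Point2D.Float p = new Point2D.Float(x, y);
        ctm.transform(p, p);
        if (bounds == null) {
            bounds = new Rectangle2D.Float(p.x, p.y, 0, 0);
        } else {
            bounds.add(p);
        }
    }

    /** Adds the four corners of a rectangle, as a rectangle path operation describes it. */
    void addRectangle(float x, float y, float w, float h, AffineTransform ctm) {
        addPoint(x, y, ctm);
        addPoint(x + w, y, ctm);
        addPoint(x, y + h, ctm);
        addPoint(x + w, y + h, ctm);
    }

    Rectangle2D.Float getBounds() {
        return bounds;
    }

    public static void main(String[] args) {
        BoundingBoxSketch finder = new BoundingBoxSketch();
        // Hypothetical content: one rectangle drawn at half scale, one unscaled.
        finder.addRectangle(100, 100, 200, 50, AffineTransform.getScaleInstance(0.5, 0.5));
        finder.addRectangle(300, 400, 40, 40, new AffineTransform());
        System.out.println(finder.getBounds());
    }
}
```

Like the proof of concept above, this treats every supplied point as part of the drawing and ignores line width, so the resulting box is an approximation.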
I am currently in the process of writing an app that "formats" PDF's to our requirements, using iText 2.1.7.
We basically take a portrait PDF, and scale down the pages, so we can fit 2 pages of the original PDF, on one landscape page of the new PDF. We also leave some space at the bottom of the page which is used for post processing.
This process works 90% of the time as it should.
However, we received a PDF that has been cropped/trimmed by the content department, and when we view this PDF in Acrobat, it looks as expected. However, when we process it, the new PDF includes the entire original MediaBox, and the crop lines.
Here is the code we use, and how a problem output looks.
File tempFile = new File(tempFilename);
PdfReader reader = new PdfReader(originalPdfFile);
Document doc = new Document(new RectangleReadOnly(842f, 595f), 0, 0, 0, 0);
PdfWriter writer = PdfWriter.getInstance(doc, new FileOutputStream(tempFile));
doc.open();
for (int i = 1; i < reader.getNumberOfPages(); i = i + 2) {
doc.newPage();
PdfContentByte cb = writer.getDirectContent();
PdfImportedPage page = writer.getImportedPage(reader, i); // page #1
float documentWidth = doc.getPageSize().getWidth() / 2;
float documentHeight = doc.getPageSize().getHeight() - 65f;
float pageWidth = page.getWidth();
float pageHeight = page.getHeight();
float widthScale = documentWidth / pageWidth;
float heightScale = documentHeight / pageHeight;
float scale = Math.min(widthScale, heightScale);
float offsetX = (documentWidth - (pageWidth * scale)) / 2;
float offsetY = 65f; //100f
cb.addTemplate(page, scale, 0, 0, scale, offsetX, offsetY);
PdfImportedPage page2 = writer.getImportedPage(reader, i+1); // page #2
pageWidth = page2.getWidth();
pageHeight = page2.getHeight();
widthScale = documentWidth / pageWidth;
heightScale = documentHeight / pageHeight;
scale = Math.min(widthScale, heightScale);
offsetX = ((documentWidth - (pageWidth * scale)) / 2) + documentWidth;
offsetY = 65f; //100f
cb.addTemplate(page2, scale, 0, 0, scale, offsetX, offsetY);//430f
}
doc.close();
original in acrobat:
modified in acrobat, showing unwanted pretrim content:
Although it's hard to be sure without seeing the PDF itself, I suspect your problem is that this PDF specifies a CropBox on at least some of its pages. If that's the case, then I think that you would want to do something like page.setBoundingBox(reader.getCropBox(i)); right after you get the page reference.
Note that the default value of a page's CropBox is its MediaBox, so the addition of the above line shouldn't negatively impact the layout of PDF pages that do not specify a CropBox.
(I am not an iText user, so this is a bit of speculation on my part...)
Good luck!
After a lot of frustration, I finally got this to work by "hard cropping" the PDF before doing my scaling and layout processing.
The hard cropping takes an Acrobat cropped PDF (cropped = hidden), and uses PdfStamper to create a new PDF only containing the contents from inside the crop box.
public String cropPdf(String pdfFilePath) throws DocumentException, IOException {
String filename = FilenameUtils.getBaseName(pdfFilePath) + "_cropped." + FilenameUtils.getExtension(pdfFilePath);
filename = FilenameUtils.concat(System.getProperty("java.io.tmpdir"), filename);
PdfReader reader = new PdfReader(pdfFilePath);
try {
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(filename));
try {
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
PdfDictionary pdfDictionary = reader.getPageN(i);
PdfArray cropArray = new PdfArray();
Rectangle cropbox = reader.getCropBox(i);
cropArray.add(new PdfNumber(cropbox.getLeft()));
cropArray.add(new PdfNumber(cropbox.getBottom()));
cropArray.add(new PdfNumber(cropbox.getLeft() + cropbox.getWidth()));
cropArray.add(new PdfNumber(cropbox.getBottom() + cropbox.getHeight()));
pdfDictionary.put(PdfName.CROPBOX, cropArray);
pdfDictionary.put(PdfName.MEDIABOX, cropArray);
pdfDictionary.put(PdfName.TRIMBOX, cropArray);
pdfDictionary.put(PdfName.BLEEDBOX, cropArray);
}
return filename;
} finally {
stamper.close();
}
} finally {
reader.close();
}
}
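One note on Kabal's answer: a PDF box array holds corner coordinates (lower-left x, lower-left y, upper-right x, upper-right y), so Kabal's code is correct as written. Only when the crop box starts at (0, 0) do the last two entries coincide with the width and height, in which case this shorter form is equivalent: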
...
cropArray.add(new PdfNumber(cropbox.getLeft()));
cropArray.add(new PdfNumber(cropbox.getBottom()));
cropArray.add(new PdfNumber(cropbox.getWidth()));
cropArray.add(new PdfNumber(cropbox.getHeight()));
...
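For reference, the PDF specification defines a rectangle as an array of four numbers giving the coordinates of two diagonally opposite corners, normally lower-left and upper-right. The plain-Java sketch below (no iText; the names are hypothetical) shows how such an array follows from a corner plus width and height, and that the upper-right entries equal the width and height only for a box that starts at the origin.

```java
import java.util.Arrays;

/** Plain-Java sketch (no iText): the four numbers of a PDF page box. */
class BoxArraySketch {

    /** Builds [llx, lly, urx, ury] from a lower-left corner plus width and height. */
    static float[] toBoxArray(float left, float bottom, float width, float height) {
        return new float[] {left, bottom, left + width, bottom + height};
    }

    public static void main(String[] args) {
        // Box starting at the origin: the upper-right corner equals (width, height).
        System.out.println(Arrays.toString(toBoxArray(0f, 0f, 595f, 842f)));
        // Hypothetical box with a shifted origin: the two forms no longer agree.
        System.out.println(Arrays.toString(toBoxArray(30f, 40f, 535f, 762f)));
    }
}
```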