Absolute position with itext7 - java

I have a problem adding an image at an absolute position relative to the page size in iText 7.
In iText 5 I used the code below to determine the image position relative to the page that I'm adding it to:
for (int i = 0; i < numberOfPages;) {
page = copy.getImportedPage(reader, ++i);
if(page.getBoundingBox().getWidth() != 595.00f) {
img.setAbsolutePosition(page.getBoundingBox().getWidth() - (595-img.getAbsoluteX()),img.getAbsoluteY());
}
if(page.getBoundingBox().getHeight() != 842.00f) {
img.setAbsolutePosition(img.getAbsoluteX(), page.getBoundingBox().getHeight() - (842-img.getAbsoluteY()));
}
stamp = copy.createPageStamp(page);
stamp.getOverContent().addImage(img);
stamp.alterContents();
copy.addPage(page);
}
Now for itext7 I'm using
public static void addImageToPDF(String inputFilePath, Image img) throws IOException, DocumentException {
File inFile = new File(inputFilePath);
File outFile = new File(inputFilePath + "_image.pdf");
PdfDocument pdfDoc = new PdfDocument(new PdfReader(inFile), new PdfWriter(outFile));
Document document = new Document(pdfDoc);
int numberOfPages = pdfDoc.getNumberOfPages();
Rectangle pageSize;
// Loop over the pages of document
for (int i = 1; i <= numberOfPages; i++) {
pageSize = pdfDoc.getPage(i).getPageSize();
if(pageSize.getWidth() != 595.00f) {
img.setFixedPosition(pageSize.getWidth() - (595-img.getImageWidth()),img.getImageHeight());
}
if(pageSize.getHeight() != 842.00f) {
img.setFixedPosition(img.getImageWidth(), pageSize.getHeight() - (842-img.getImageHeight()));
}
document.add(img);
}
}
I need the image to be added in the top right corner relative to the page, but now it is added in the middle of the page on the right.
Is there a way to set an absolute position in iText 7 when adding an image? The image is not always at the same position relative to the exact width and height, so using a fixed position is a problem for me.

I don't get why you need two cases in your for loop. If your goal is to place the image in the top right corner of the page and you know the image width and height as well as the page width and height, all you need to do is calculate the coordinates to pass to the setFixedPosition method.
setFixedPosition accepts x and y coordinates, which are the coordinates of the image's bottom left corner in the PDF coordinate system, i.e. x grows from left to right and y grows from bottom to top.
So you need to subtract the image width from the page width and the image height from the page height, which results in the following one-liner:
img.setFixedPosition(pageSize.getWidth() - img.getImageWidth(), pageSize.getHeight() - img.getImageHeight());
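To make the arithmetic explicit, here is a minimal sketch in plain Java with no iText calls. The class TopRightPosition and its method topRight are hypothetical names used for illustration only. It also adds the page's left and bottom offsets, on the assumption that the page box might not start at (0, 0).

```java
// Hypothetical helper, not part of iText: computes where the lower-left
// corner of an image must go so that the image touches the top right
// corner of the page.
final class TopRightPosition {

    /** Returns {x, y} of the image's lower-left corner in PDF user space. */
    static float[] topRight(float pageLeft, float pageBottom,
                            float pageWidth, float pageHeight,
                            float imageWidth, float imageHeight) {
        // x grows to the right, so move left from the right edge by the image width.
        float x = pageLeft + pageWidth - imageWidth;
        // y grows upwards, so move down from the top edge by the image height.
        float y = pageBottom + pageHeight - imageHeight;
        return new float[] { x, y };
    }

    public static void main(String[] args) {
        // A4 page (595 x 842) with its origin at (0, 0) and a 100 x 50 image.
        float[] position = topRight(0f, 0f, 595f, 842f, 100f, 50f);
        System.out.println(position[0] + ", " + position[1]); // prints "495.0, 792.0"
    }
}
```

The two resulting values are what you would pass as x and y to setFixedPosition. If the image is scaled before it is added, the scaled dimensions should go into the calculation instead of the original ones.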

Related

Why does PDFBox read the image width/height wrong? (always assumes "width" is the bigger one)

I'm using the PDFBox library (see here) to convert an image to PDF. The goal is to have an image scaled to a full A4 page in the PDF file. It works well, except for one thing:
The image height and width seem to be mixed up. The width is always assumed to be the bigger of the two values. I have 2 images: one has the dimensions (according to the Windows file details) 4032x2268 (landscape) and the other one 2268x4032 (portrait).
When I load the images in PDFBox, the width is always 4032 and the height 2268. The goal is to create a landscape PDF for one and a portrait PDF for the other one. This weird "bug" (?) causes the portrait image to be converted to a landscape PDF, which of course causes the image to be rotated (which is inconvenient).
Here's the relevant part of my code:
public byte[] imageToPDF(MultipartFile file) throws IOException {
PDDocument pdf = new PDDocument();
PDImageXObject pdImage = PDImageXObject.createFromByteArray(pdf, file.getBytes(), file.getOriginalFilename());
// scale image to fit the full page
PDPage page;
int imageWidth;
int imageHeight;
if (pdImage.getWidth() > pdImage.getHeight()) {
// landscape pdf
float pageHeight = PDRectangle.A4.getWidth();
float pageWidth = PDRectangle.A4.getHeight();
page = new PDPage(new PDRectangle(pageWidth, pageHeight));
imageWidth = (int)pageWidth;
imageHeight = (int)(((double)imageWidth / (double)pdImage.getWidth()) * (double)pdImage.getHeight());
} else {
// portrait pdf
float pageHeight = PDRectangle.A4.getHeight();
float pageWidth = PDRectangle.A4.getWidth();
page = new PDPage(new PDRectangle(pageWidth, pageHeight));
imageHeight = (int)pageHeight;
imageWidth = (int)(((double)imageHeight / (double)pdImage.getHeight()) * (double)pdImage.getWidth());
}
...
}
pdImage.getWidth() is always greater than pdImage.getHeight(), no matter which of the two images I use. Does anyone have an idea?

Unable to extract values from PDF for specific coordinates using java apache pdfbox

My task is to extract text from a PDF at specific coordinates.
I have used the Apache PDFBox client for data extraction.
To get the x, y, height and width coordinates from the PDF I am using the PDF-XChange tool, which reports them in millimeters. When I pass the values to the rectangle, I get an empty value.
public String getTextUsingPositionsUsingPdf(String pdfLocation, int pageNumber, double x, double y, double width,
double height) throws IOException {
String extractedText = "";
// PDDocument Creates an empty PDF document. You need to add at least
// one page for the document to be valid.
// Using load method we can load a PDF document
PDDocument document = null;
PDPage page = null;
try {
if (pdfLocation.endsWith(".pdf")) {
document = PDDocument.load(new File(pdfLocation));
int getDocumentPageCount = document.getNumberOfPages();
System.out.println(getDocumentPageCount);
// Get a specific page. The parameter is the page index, which starts at 0;
// to access the first page, the page index is 0.
if (getDocumentPageCount > 0) {
page = document.getPage(pageNumber + 1);
} else if (getDocumentPageCount == 0) {
page = document.getPage(0);
}
// To create a rectangle by passing the x axis, y axis, width and height
Rectangle2D rect = new Rectangle2D.Double(x, y, width, height);
String regionName = "region1";
// Strip the text from PDF using PDFTextStripper Area with the
// help of Rectangle and named need to given for the rectangle
PDFTextStripperByArea stripper = new PDFTextStripperByArea();
stripper.setSortByPosition(true);
stripper.addRegion(regionName, rect);
stripper.extractRegions(page);
System.out.println("Region is " + stripper.getTextForRegion("region1"));
extractedText = stripper.getTextForRegion("region1");
} else {
System.out.println("No data return");
}
} catch (IOException e) {
System.out.println("The file not found" + "");
} finally {
document.close();
}
// Return the extracted text and this can be used for assertion
return extractedText;
}
Please suggest whether my approach is correct or not.
I have used this PDF: tutorialspoint.com/uipath/uipath_tutorial.pdf. I am trying to find the text "a part of contests", which has x = 55.6 mm, y = 168.8, width = 210.0 mm and height = 297.0, but I am getting an empty value.
I tested your method with those inputs:
System.out.println("Extracting like Venkatachalam Neelakantan from uipath_tutorial.pdf\n");
float MM_TO_UNITS = 1/(10*2.54f)*72;
String text = getTextUsingPositionsUsingPdf("src/test/resources/mkl/testarea/pdfbox2/extract/uipath_tutorial.pdf",
0, 55.6 * MM_TO_UNITS, 168.8 * MM_TO_UNITS, 210.0 * MM_TO_UNITS, 297.0 * MM_TO_UNITS);
System.out.printf("\n---\nResult:\n%s\n", text);
(ExtractText test testUiPathTutorial)
and got the result
part of contents of this e-book in any manner without written consent
te the contents of our website and tutorials as timely and as precisely as
, the contents may contain inaccuracies or errors. Tutorials Point (I) Pvt.
guarantee regarding the accuracy, timeliness or completeness of our
tents including this tutorial. If you discover any errors on our website or
ease notify us at contact@tutorialspoint.com
i
Assuming you actually were looking for "a part of contents", not "a part of contests", merely the 'a' is missing; probably when measuring you looked for the beginning of the visible letter drawing but the actual glyph origin is a bit before that. If you choose a slightly smaller x, e.g. 54.6 mm, you'll also get the 'a'.
It obviously is no surprise that you get more than "a part of contents", considering the width and height of your rectangle.
Should you wonder about the MM_TO_UNITS constant, have a look at this answer.
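For reference, the conversion behind that constant can be sketched in plain Java as follows. The class MmToUnits and the method toUnits are hypothetical names, not PDFBox API. The sketch assumes the default PDF user space, in which one unit is 1/72 inch and one inch is 25.4 mm.

```java
import java.util.Locale;

// Hypothetical helper, not part of PDFBox: converts millimeters to PDF
// default user space units (1 unit = 1/72 inch, 1 inch = 25.4 mm).
final class MmToUnits {

    static final double UNITS_PER_MM = 72.0 / 25.4;

    static double toUnits(double mm) {
        return mm * UNITS_PER_MM;
    }

    public static void main(String[] args) {
        // The region from the question, measured in millimeters.
        double[] regionMm = { 55.6, 168.8, 210.0, 297.0 };
        for (double mm : regionMm) {
            System.out.printf(Locale.ROOT, "%.1f mm = %.2f units%n", mm, toUnits(mm));
        }
    }
}
```

Passing the millimeter values directly to the rectangle describes a region that is smaller by a factor of roughly 2.8 in each direction, which would explain the empty result in the question.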

create a one page PDF from two PDFs using PDFBOX

I have a small (quarter inch) one page PDF I created with PDFBOX with text (A). I want to put that small one page PDF (A) on the top of an existing PDF page (B), preserving the existing content of the PDF page (B). In the end, I will have a one page PDF, representing the small PDF on top(A), and the existing PDF intact making up the rest (B). How can I accomplish this with PDFBOX?
To join two pages one atop the other onto one target page, you can make use of the PDFBox LayerUtility for importing pages as form XObjects in a fashion similar to PDFBox SuperimposePage example, e.g. with this helper method:
void join(PDDocument target, PDDocument topSource, PDDocument bottomSource) throws IOException {
LayerUtility layerUtility = new LayerUtility(target);
PDFormXObject topForm = layerUtility.importPageAsForm(topSource, 0);
PDFormXObject bottomForm = layerUtility.importPageAsForm(bottomSource, 0);
float height = topForm.getBBox().getHeight() + bottomForm.getBBox().getHeight();
float width, topMargin, bottomMargin;
if (topForm.getBBox().getWidth() > bottomForm.getBBox().getWidth()) {
width = topForm.getBBox().getWidth();
topMargin = 0;
bottomMargin = (topForm.getBBox().getWidth() - bottomForm.getBBox().getWidth()) / 2;
} else {
width = bottomForm.getBBox().getWidth();
topMargin = (bottomForm.getBBox().getWidth() - topForm.getBBox().getWidth()) / 2;
bottomMargin = 0;
}
PDPage targetPage = new PDPage(new PDRectangle(width, height));
target.addPage(targetPage);
PDPageContentStream contentStream = new PDPageContentStream(target, targetPage);
if (bottomMargin != 0)
contentStream.transform(Matrix.getTranslateInstance(bottomMargin, 0));
contentStream.drawForm(bottomForm);
contentStream.transform(Matrix.getTranslateInstance(topMargin - bottomMargin, bottomForm.getBBox().getHeight()));
contentStream.drawForm(topForm);
contentStream.close();
}
(JoinPages method join)
You use it like this:
try ( PDDocument document = new PDDocument();
PDDocument top = ...;
PDDocument bottom = ...) {
join(document, top, bottom);
document.save("joinedPage.pdf");
}
(JoinPages test testJoinSmallAndBig)
The result looks like this:
Just as an additional point to @mkl's answer:
If anybody is looking to scale the PDFs before placing them on the page, use
contentStream.transform(Matrix.getScaleInstance(<scaling factor in x axis>, <scaling factor in y axis>)); // a scaling factor of 1 keeps the original size
This way you can rescale your PDFs.
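If you are unsure which factor to pick, one option is to compute a single factor that fits the source page into the available space while keeping its aspect ratio. The following is only a sketch of that arithmetic in plain Java. FitScale and uniformScale are hypothetical names, not PDFBox API.

```java
// Hypothetical helper, not part of PDFBox: computes one scaling factor that
// fits a source box into a target box while keeping the aspect ratio.
final class FitScale {

    static float uniformScale(float srcWidth, float srcHeight,
                              float maxWidth, float maxHeight) {
        if (srcWidth <= 0 || srcHeight <= 0) {
            throw new IllegalArgumentException("source dimensions must be positive");
        }
        // The smaller of the two ratios is the largest factor that fits both ways.
        return Math.min(maxWidth / srcWidth, maxHeight / srcHeight);
    }

    public static void main(String[] args) {
        // Fit a 612 x 792 page into a 306 x 792 area.
        float factor = uniformScale(612f, 792f, 306f, 792f);
        System.out.println(factor); // prints "0.5"
    }
}
```

The resulting value would then be used for both arguments of the Matrix.getScaleInstance call quoted above.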

Add page numbers to Merged PDF with different Pages sizes using IText API

I am trying to add page numbers in the top right corner of the pages of merged PDF files using iText, but the content size of my PDFs differs. After merging the PDFs, printing the page sizes gives approximately the same height and width for each page, but I am not able to see the page numbers because of the difference in content size. Please see the code below and the attached PDFs that I am using for merging the PDFs and adding page numbers.
public class PageNumber {
public static void main(String[] args) {
PageNumber number = new PageNumber();
try {
String DOC_ONE_PATH = "C:/Users/Admin/Downloads/codedetailsforartwork/elebill.pdf";
String DOC_TWO_PATH = "C:/Users/Admin/Downloads/codedetailsforartwork/PP-P0109916.pdf";
String DOC_THREE_PATH = "C:/Users/Admin/Downloads/codedetailsforartwork/result.pdf";
String[] files = { DOC_ONE_PATH, DOC_TWO_PATH };
Document document = new Document();
PdfCopy copy = new PdfCopy(document, new FileOutputStream(DOC_THREE_PATH));
document.open();
PdfReader reader;
int n;
for (int i = 0; i < files.length; i++) {
reader = new PdfReader(files[i]);
n = reader.getNumberOfPages();
for (int page = 0; page < n; ) {
copy.addPage(copy.getImportedPage(reader, ++page));
}
copy.freeReader(reader);
reader.close();
}
// step 5
document.close();
number.manipulatePdf(
"C:/Users/Admin/Downloads/codedetailsforartwork/result.pdf",
"C:/Users/Admin/Downloads/codedetailsforartwork/PP-P0109916_1.pdf");
} catch (IOException | DocumentException | APIException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
public static void manipulatePdf(String src, String dest)
throws IOException, DocumentException, APIException {
PdfReader reader = new PdfReader(src);
int n = reader.getNumberOfPages();
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
PdfContentByte pagecontent;
for (int i = 0; i < n;) {
pagecontent = stamper.getOverContent(++i);
System.out.println(i);
com.itextpdf.text.Rectangle pageSize = reader.getPageSize(i);
pageSize.normalize();
float height = pageSize.getHeight();
float width = pageSize.getWidth();
System.out.println(width + " " + height);
ColumnText.showTextAligned(pagecontent, Element.ALIGN_CENTER,
new Phrase(String.format("page %d of %d", i, n)),
width - 200, height-85, 0);
}
stamper.close();
reader.close();
}
}
PDF files Zip
@Bruno's answer explains and/or references answers with explanations for all relevant facts on the issue at hand.
In a nutshell, the two issues of the OP's code are:
he uses reader.getPageSize(i); while this indeed returns the page size, PDF viewers do not display the whole page size but merely the crop box on it. Thus, the OP should use reader.getCropBox(i) instead. According to the PDF specification, "the crop box defines the region to which the contents of the page shall be clipped (cropped) when displayed or printed. ... The default value is the page’s media box."
he uses pageSize.getWidth() and pageSize.getHeight() to determine the upper right corner but should use pageSize.getRight() and pageSize.getTop() instead. The boxes defining the PDF coordinate system may not have the origin in their lower left corner.
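To illustrate the second point, here is a minimal sketch in plain Java. PageBox is a hypothetical stand-in for a page rectangle, not an iText class, and the numbers are made up: an A4-sized box whose lower-left corner is at (100, 50) instead of (0, 0).

```java
import java.util.Arrays;

// Hypothetical minimal box type, not an iText class: a rectangle given by
// its lower-left and upper-right corners in PDF user space.
final class PageBox {
    final float left, bottom, right, top;

    PageBox(float left, float bottom, float right, float top) {
        this.left = left;
        this.bottom = bottom;
        this.right = right;
        this.top = top;
    }

    float width()  { return right - left; }
    float height() { return top - bottom; }

    /** Position measured from the box's actual upper right corner. */
    float[] fromUpperRight(float dx, float dy) {
        return new float[] { right - dx, top - dy };
    }

    /** The naive variant; only correct if the lower-left corner is (0, 0). */
    float[] fromWidthAndHeight(float dx, float dy) {
        return new float[] { width() - dx, height() - dy };
    }

    public static void main(String[] args) {
        PageBox shifted = new PageBox(100f, 50f, 695f, 892f);
        System.out.println(Arrays.toString(shifted.fromWidthAndHeight(200f, 85f))); // [395.0, 757.0]
        System.out.println(Arrays.toString(shifted.fromUpperRight(200f, 85f)));     // [495.0, 807.0]
    }
}
```

In this made-up case the two results differ by exactly the offset of the lower-left corner, so the naive position lands 100 units too far left and 50 units too low.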
I don't understand why you are defining the position of the page number like this:
com.itextpdf.text.Rectangle pageSize = reader.getPageSize(i);
pageSize.normalize();
float height = pageSize.getHeight();
float width = pageSize.getWidth();
where you use
x = width - 200;
y = height - 85;
How does that make sense?
If you have an A4 page in portrait with (0,0) as the coordinate of the lower-left corner, the page number will be added at position x = 395; y = 757. However, (0,0) isn't always the coordinate of the lower-left corner, so the first A4 page with the origin at another position will already put the page number at another position. If the page size is different, the page number will move to other places.
It's as if you're totally unaware of previously answered questions such as How should I interpret the coordinates of a rectangle in PDF? and Where is the Origin (x,y) of a PDF page?
I know, I know, finding these specific answers on StackOverflow is hard, but I've spent many weeks organizing the best iText questions on StackOverflow on the official web site. See for instance: How should I interpret the coordinates of a rectangle in PDF? and Where is the origin (x,y) of a PDF page?
These Q&As are even available in a free ebook! If you take a moment to educate yourself by reading the documentation, you'll find the answer to the question How to position text relative to page? that was already answered on StackOverflow in 2013: How to position text relative to page using iText?
For instance, if you want to position your page number at the bottom and in the middle, you need to define your coordinates like this:
float x = pageSize.getLeft() + pageSize.getWidth() / 2;
float y = pageSize.getBottom() + 10;
ColumnText.showTextAligned(pagecontent, Element.ALIGN_CENTER,
new Phrase(String.format("page %d of %d", i, n)), x, y, 0);
I hope this answer will inspire you to read the documentation. I've spent weeks of work on organizing that documentation and it's frustrating when I discover that people don't read it.

How to watermark PDFs using text or images?

I have a bunch of PDF documents in a folder and I want to augment them with a watermark. What are my options from a Java serverside context?
Preferably the watermark will support transparency. Both vector and raster is desirable.
Please take a look at the TransparentWatermark2 example. It adds transparent text on each odd page and a transparent image on each even page of an existing PDF document.
This is how it's done:
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
PdfReader reader = new PdfReader(src);
int n = reader.getNumberOfPages();
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
// text watermark
Font f = new Font(FontFamily.HELVETICA, 30);
Phrase p = new Phrase("My watermark (text)", f);
// image watermark
Image img = Image.getInstance(IMG);
float w = img.getScaledWidth();
float h = img.getScaledHeight();
// transparency
PdfGState gs1 = new PdfGState();
gs1.setFillOpacity(0.5f);
// properties
PdfContentByte over;
Rectangle pagesize;
float x, y;
// loop over every page
for (int i = 1; i <= n; i++) {
pagesize = reader.getPageSizeWithRotation(i);
x = (pagesize.getLeft() + pagesize.getRight()) / 2;
y = (pagesize.getTop() + pagesize.getBottom()) / 2;
over = stamper.getOverContent(i);
over.saveState();
over.setGState(gs1);
if (i % 2 == 1)
ColumnText.showTextAligned(over, Element.ALIGN_CENTER, p, x, y, 0);
else
over.addImage(img, w, 0, 0, h, x - (w / 2), y - (h / 2));
over.restoreState();
}
stamper.close();
reader.close();
}
As you can see, we create a Phrase object for the text and an Image object for the image. We also create a PdfGState object for the transparency. In our case, we go for 50% opacity (change the 0.5f into something else to experiment).
Once we have these objects, we loop over every page. We use the PdfReader object to get information about the existing document, for instance the dimensions of every page. We use the PdfStamper object when we want to stamp extra content on the existing document, for instance adding a watermark on top of each single page.
When changing the graphics state, it is always safe to perform a saveState() before you start and a restoreState() once you're finished. Your code will probably also work if you don't do this, but believe me: it can save you plenty of debugging time if you adopt the discipline to do this, as you can get really strange effects if the graphics state is out of balance.
We apply the transparency using the setGState() method and depending on whether the page is an odd page or an even page, we add the text (using ColumnText and an (x, y) coordinate calculated so that the text is added in the middle of each page) or the image (using the addImage() method and the appropriate parameters for the transformation matrix).
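The coordinate arithmetic for the image case can be sketched in plain Java as follows. CenteredImageMatrix and centered are hypothetical names, not iText API. The sketch only computes the six values that appear in the addImage() call above, in the same order, for an unrotated image of width w and height h.

```java
import java.util.Arrays;

// Hypothetical helper, not part of iText: builds the six matrix values
// (a, b, c, d, e, f) that center an unrotated w x h image on a page box.
final class CenteredImageMatrix {

    static float[] centered(float left, float bottom, float right, float top,
                            float w, float h) {
        // Center of the page box.
        float x = (left + right) / 2;
        float y = (top + bottom) / 2;
        // a and d scale the unit square to the image size; e and f move the
        // image's lower-left corner so that its center lands on (x, y).
        return new float[] { w, 0, 0, h, x - w / 2, y - h / 2 };
    }

    public static void main(String[] args) {
        // A4 page with origin (0, 0) and a 100 x 50 image.
        float[] m = centered(0f, 0f, 595f, 842f, 100f, 50f);
        System.out.println(Arrays.toString(m)); // [100.0, 0.0, 0.0, 50.0, 247.5, 396.0]
    }
}
```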
Once you've done this for every page in the document, you have to close() the stamper and the reader.
Caveat:
You'll notice that pages 3 and 4 are in landscape, yet there is a difference between those two pages that isn't visible to the naked eye. Page 3 is actually a page of which the size is defined as if it were a page in portrait, but it is rotated by 90 degrees. Page 4 is a page of which the size is defined in such a way that the width > the height.
This can have an impact on the way you add a watermark, but if you use getPageSizeWithRotation(), iText will adapt. This may not be what you want: maybe you want the watermark to be added differently.
Take a look at TransparentWatermark3:
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
PdfReader reader = new PdfReader(src);
int n = reader.getNumberOfPages();
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
stamper.setRotateContents(false);
// text watermark
Font f = new Font(FontFamily.HELVETICA, 30);
Phrase p = new Phrase("My watermark (text)", f);
// image watermark
Image img = Image.getInstance(IMG);
float w = img.getScaledWidth();
float h = img.getScaledHeight();
// transparency
PdfGState gs1 = new PdfGState();
gs1.setFillOpacity(0.5f);
// properties
PdfContentByte over;
Rectangle pagesize;
float x, y;
// loop over every page
for (int i = 1; i <= n; i++) {
pagesize = reader.getPageSize(i);
x = (pagesize.getLeft() + pagesize.getRight()) / 2;
y = (pagesize.getTop() + pagesize.getBottom()) / 2;
over = stamper.getOverContent(i);
over.saveState();
over.setGState(gs1);
if (i % 2 == 1)
ColumnText.showTextAligned(over, Element.ALIGN_CENTER, p, x, y, 0);
else
over.addImage(img, w, 0, 0, h, x - (w / 2), y - (h / 2));
over.restoreState();
}
stamper.close();
reader.close();
}
In this case, we don't use getPageSizeWithRotation() but simply getPageSize(). We also tell the stamper not to compensate for the existing page rotation: stamper.setRotateContents(false);
Take a look at the difference in the resulting PDFs:
In the first screen shot (showing page 3 and 4 of the resulting PDF of TransparentWatermark2), the page to the left is actually a page in portrait rotated by 90 degrees. iText however, treats it as if it were a page in landscape just like the page to the right.
In the second screen shot (showing page 3 and 4 of the resulting PDF of TransparentWatermark3), the page to the left is a page in portrait rotated by 90 degrees and we add the watermark as if the page is in portrait. As a result, the watermark is also rotated by 90 degrees. This doesn't happen with the page to the right, because that page has a rotation of 0 degrees.
This is a subtle difference, but I thought you'd want to know.
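As a rough model of the difference between the two page types, consider the following plain Java sketch. It is a simplification for illustration, not iText's implementation, and RotatedPageSize and displayedSize are hypothetical names. It assumes the rotation is a multiple of 90 degrees.

```java
import java.util.Arrays;

// Simplified model, not iText's implementation: the page size as it is
// displayed, given the stored size and a rotation in multiples of 90 degrees.
final class RotatedPageSize {

    static float[] displayedSize(float width, float height, int rotation) {
        // Normalize to the range 0..359, also for negative input.
        int normalized = ((rotation % 360) + 360) % 360;
        if (normalized % 90 != 0) {
            throw new IllegalArgumentException("rotation must be a multiple of 90");
        }
        // A quarter turn swaps width and height; a half turn does not.
        boolean swap = normalized == 90 || normalized == 270;
        return swap ? new float[] { height, width } : new float[] { width, height };
    }

    public static void main(String[] args) {
        // Like page 3: stored as portrait, rotated by 90 degrees.
        System.out.println(Arrays.toString(displayedSize(595f, 842f, 90))); // [842.0, 595.0]
        // Like page 4: stored as landscape, not rotated.
        System.out.println(Arrays.toString(displayedSize(842f, 595f, 0)));  // [842.0, 595.0]
    }
}
```

Both pages are displayed at the same size, which is why the difference is not visible to the naked eye, yet only the first one has a stored size that differs from the displayed one.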
If you want to read this answer in French, please read Comment créer un filigrane transparent en PDF?
Best option is iText. Check a watermark demo here
The important part of the code (where the watermark is inserted) is this:
public class Watermark extends PdfPageEventHelper {
@Override
public void onEndPage(PdfWriter writer, Document document) {
// insert here your watermark
}
}
Read the example carefully.
The onEndPage() method will be something like this (in my logo watermarks I use com.itextpdf.text.Image):
Image image = Image.getInstance(this.getClass().getResource("/path/to/image.png"));
// set transparency
image.setTransparency(transparency);
// set position
image.setAbsolutePosition(absoluteX, absoluteY);
// put into document
document.add(image);
