I have an application that opens a PDF file with dimensions of 1700 x 2200 pixels. From it I get the dimensions of a rectangle drawn over the PDF.
When I try to create the same rectangle on a new PDF, the page dimensions are different.
I am using PDFBox, which creates a PDF page with these dimensions:
System.out.println(page.getMediaBox().getHeight());
System.out.println(page.getMediaBox().getWidth());
results in :
612
792
How do I convert the PDF coordinates from 1700*2200 to 612*792?
Your output
612 792
of
System.out.println(page.getMediaBox().getHeight()); System.out.println(page.getMediaBox().getWidth());
seems to indicate that you create that PDPage using the default constructor, i.e. using new PDPage() as that constructor sets the page size to the US Letter page format.
If you want pages in a different format, you should use the constructor PDPage(PDRectangle), e.g.:
PDRectangle rec = new PDRectangle(1700, 2200);
PDDocument document = new PDDocument();
PDPage page = new PDPage(rec);
document.addPage(page);
This creates a PDF with a page whose size is 1700x2200 user space units, i.e. about 23.6"x30.6".
BTW, you talk about a pdf file in the dimensions 1700pixels*2200pixels - PDFs don't know the unit 'pixel'. They know the default user space unit which defaults to 1/72" and, therefore, more or less corresponds to the unit point. This especially does not imply a resolution.
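If you would rather keep the default 612x792 page and scale the rectangle instead, the conversion is a plain proportion. The following is only a sketch under two assumptions that are not stated in the question: the 1700x2200 raster covers the whole Letter page (which would correspond to 200 pixels per inch), and the rectangle is measured from the top-left corner with y growing downwards, while PDF user space has its origin at the lower-left. The class and method names are made up for illustration.

```java
/** Scales a rectangle measured on a pixel raster into PDF user space units. */
final class PixelToUserSpace {

    /** Returns {x, y, width, height} in user space units with a lower-left origin. */
    static float[] convert(float px, float py, float pw, float ph,
                           float rasterWidth, float rasterHeight,
                           float pageWidth, float pageHeight) {
        float sx = pageWidth / rasterWidth;   // 612 / 1700 = 0.36
        float sy = pageHeight / rasterHeight; // 792 / 2200 = 0.36
        float width = pw * sx;
        float height = ph * sy;
        float x = px * sx;
        // Assumption: the raster origin is top-left with y growing downwards,
        // so the y axis is flipped to obtain the lower edge of the rectangle.
        float y = pageHeight - (py + ph) * sy;
        return new float[] {x, y, width, height};
    }

    public static void main(String[] args) {
        float[] r = convert(100, 200, 500, 300, 1700, 2200, 612, 792);
        System.out.printf("x=%.1f y=%.1f w=%.1f h=%.1f%n", r[0], r[1], r[2], r[3]);
    }
}
```

If your rectangle already uses a lower-left origin, drop the flip and use py * sy for y.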
In iText 5, PdfWriter used to have a method called getCurrentPageNumber().
iText 7 does not have it anymore.
How can I get the current page number, given that the entire PDF uses the same document instance (PdfDocument pdfDocument = new PdfDocument(writer)) from start to end?
Use case: An application written using iText 5 needs to be rewritten using iText 7.
Our app generates PDF files for our customer's monthly bills.
Structure of the pdf:
1. Create document:
PdfWriter writer = new PdfWriter(new FileOutputStream(filename));
PdfDocument pdfDocument = new PdfDocument(writer);
Document document = new Document(pdfDocument, PageSize.A4);
document.setMargins(235, 35, 0, 38);
buildBody(pdfData, document, writer);
2. For every page I use an IEventHandler for the start and another one for the end of the page to add some data to the header and footer.
pdfDocument.addEventHandler(PdfDocumentEvent.START_PAGE, new StartPageHandler<>(pdfData, document, this));
pdfDocument.addEventHandler(PdfDocumentEvent.END_PAGE, new EndPageHandler<>(pdfData, document, this));
3. Each IEventHandler instance has this format (except for the rectangle values):
Rectangle rectangle = new Rectangle(51, 50, 500, 60); // this one is for END_PAGE
Rectangle[] columns = new Rectangle[] {rectangle};
document.setRenderer(new ColumnDocumentRenderer(document, columns));
document.add(lineSeparatorHeader);
document.add(paragraphInfo); // information used in the Header/Footer
The buildBody(pdfData, document, writer) method (from step 1) used writer.getCurrentPageNumber() to check whether the content being added had exceeded the space on the page and flowed onto another page. If it had, I had to add some extra information (something like a title in the body of the page) before the rest of the content was added. (This used to work with iText 5.)
Considering that the code you want to port apparently used the iText 5 PdfDocument/PdfWriter pattern, the current page number in that code always was the currently last page number, i.e. the current number of pages in the document being created.
In iText 7 these numbers are not automatically the same; if you port the code keeping the general architecture, though, those numbers most likely will remain the same in your iText 7 code, too.
In iText 7 you can get the current number of pages from the PdfDocument in question:
PdfDocument pdfDocument = ...;
...
int currentNumberOfPages = pdfDocument.getNumberOfPages();
Currently your buildBody method signature does not include the PdfDocument (which made sense in iText 5). For iText 7 you should consider changing the signature to include the PdfDocument. Most likely, though, you won't need the PdfWriter anymore which you then can exclude.
If you need to keep that method signature for some obscure reason, you can retrieve the PdfDocument from the Document using the getPdfDocument method.
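To detect that content has flowed onto a new page, you can compare the page count before and after adding it. The following is a library-independent sketch, not iText API: PageBreakDetector is a hypothetical helper that takes the page count as an IntSupplier. It assumes that pages are created while the content is being added, which the original answer only describes as most likely.

```java
import java.util.function.IntSupplier;

/** Reports whether the page count has grown since the previous check. */
final class PageBreakDetector {
    private final IntSupplier pageCount;
    private int lastSeen;

    PageBreakDetector(IntSupplier pageCount) {
        this.pageCount = pageCount;
        this.lastSeen = pageCount.getAsInt();
    }

    /** True if at least one page was added since the last call (or construction). */
    boolean movedToNewPage() {
        int now = pageCount.getAsInt();
        boolean moved = now > lastSeen;
        lastSeen = now;
        return moved;
    }

    public static void main(String[] args) {
        int[] pages = {1}; // stand-in for a real document's page count
        PageBreakDetector detector = new PageBreakDetector(() -> pages[0]);
        System.out.println(detector.movedToNewPage());
        pages[0] = 2;      // simulates content overflowing onto a second page
        System.out.println(detector.movedToNewPage());
    }
}
```

In buildBody you would construct it as new PageBreakDetector(pdfDocument::getNumberOfPages) and call movedToNewPage() after each document.add(...) to decide whether the extra title is needed.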
I am writing a Java program that reads a CSV file and writes the information in a certain format into a PDF. I am using Java 12.0.2+10 and PDFBOX-app-2.0.22 in Apache NetBeans 12.0. After much study I have learned the basics, and my program works well for one-page documents. The problem comes when my content exceeds the space on page 1. Then I need to create a new page, add it to the document, and create a new content stream for page 2. The name of the content stream must be different from the one used on page 1, because that variable name has already been used. This requires me to duplicate the page 1 code that formats and displays the text, lines, etc., but with the new content stream name. If my content exceeds two pages, I have to create a third page and again duplicate all the code that formats and shows the text. This makes for very large and difficult-to-maintain code. Isn't there a way to make the content stream name a variable, so that I can have one code block that writes to the different pages in my document?
PDDocument doc = new PDDocument(); // creating instance of PDF document
PDPage page1 = new PDPage(PDRectangle.LETTER);
doc.addPage(page1);
PDPageContentStream content = new PDPageContentStream(doc, page1);
content.beginText();
for (int i = 0; i < lineCount; i++) {
    // code to format text
    content.setFont(fontBold, fontSize10);
    content.newLineAtOffset(leftMargin, pageHeight - marginTop - page1HeaderHeight - (pageLineCount * lineSpacing10));
    content.showText("text to display"); // display the text
    // etc...
}
content.endText();
content.close(); // closes content stream for page 1
// page 2 code block
PDPage page2 = new PDPage(PDRectangle.LETTER); // set page size to 8.5 x 11.0" = US letter
doc.addPage(page2); // adding a page in PDF file
PDPageContentStream content2 = new PDPageContentStream(doc, page2, AppendMode.APPEND, true, true);
content2.beginText();
content2.setFont(fontBold, fontSize10);
content2.newLineAtOffset(leftMargin, pageHeight - marginTop - page1HeaderHeight - (pageLineCount * lineSpacing10));
content2.showText("text to display"); // display the text
// etc...
content2.endText();
content2.close(); // closes content stream for page 2
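One way to avoid the duplication is to compute, for every line, which page it belongs to and where its baseline sits, and then write all lines in a single loop. A content stream variable is an ordinary Java variable, so once the old stream is closed it can be reassigned to a new stream for the next page. The sketch below shows only the layout arithmetic and is independent of PDFBox. LinePaginator, Slot and the numbers in main are hypothetical; topY and bottomY stand for the highest and lowest allowed baselines.

```java
import java.util.ArrayList;
import java.util.List;

/** Assigns each text line to a page and a baseline, independent of any PDF library. */
final class LinePaginator {

    /** Zero-based page index and baseline y for one line of text. */
    static final class Slot {
        final int page;
        final float y;

        Slot(int page, float y) {
            this.page = page;
            this.y = y;
        }
    }

    static List<Slot> layout(int lineCount, float topY, float bottomY, float lineSpacing) {
        if (lineSpacing <= 0 || topY < bottomY) {
            throw new IllegalArgumentException("invalid page geometry");
        }
        // Number of baselines that fit between topY and bottomY, both inclusive.
        int linesPerPage = (int) ((topY - bottomY) / lineSpacing) + 1;
        List<Slot> slots = new ArrayList<>();
        for (int i = 0; i < lineCount; i++) {
            int page = i / linesPerPage;
            int row = i % linesPerPage;
            slots.add(new Slot(page, topY - row * lineSpacing));
        }
        return slots;
    }

    public static void main(String[] args) {
        int currentPage = -1;
        for (Slot slot : layout(5, 700f, 676f, 12f)) {
            if (slot.page != currentPage) {
                // With PDFBox, this is where the previous stream would be closed,
                // a new page added, and the SAME variable reassigned:
                // content = new PDPageContentStream(doc, newPage);
                currentPage = slot.page;
                System.out.println("-- page " + (currentPage + 1));
            }
            System.out.println("line at y=" + slot.y);
        }
    }
}
```

The formatting code then exists once, inside the loop, and works for any number of pages.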
I'm using Apache PDFBox.
I want to convert an RGB PDF file to a grayscale file without rendering the pages to images, because that produces a huge file.
These are my steps:
Export an A4 First.pdf from Adobe InDesign, containing images, text, and vector objects.
I read the First.pdf file. Done!
Using LayerUtility, I copy the pages from First.pdf, rotate them, and put them into a new A4 PDF file, Second.pdf. Done!
I prefer this method because I need to keep the vector objects to reduce the file size.
Then I want to save this as a grayscale PDF file (Second-grayscale.pdf).
This is my code (not all of it):
PDDocument documentFirst = PDDocument.load(new File("First.pdf"));
// Second.pdf is always empty
PDDocument documentSecond = PDDocument.load(new File("Second.pdf"));
for (int page = 0; page < documentSecond.getNumberOfPages(); page++) {
    // get current page from documentSecond
    PDPage tempPage = documentSecond.getPage(page);
    // create the content stream
    PDPageContentStream contentStream = new PDPageContentStream(documentSecond, tempPage);
    // create layerUtility
    LayerUtility layerUtility = new LayerUtility(documentSecond);
    // importPageAsForm from documentFirst
    PDFormXObject form = layerUtility.importPageAsForm(documentFirst, page);
    // saveGraphicsState
    contentStream.saveGraphicsState();
    // rotate the page by 90 degrees around the origin
    Matrix matrix = Matrix.getRotateInstance(Math.toRadians(90), 0, 0);
    contentStream.transform(matrix);
    // draw the rotated page from documentFirst to documentSecond
    contentStream.drawForm(form);
    contentStream.restoreGraphicsState();
    contentStream.close();
}
// save the new document
documentSecond.save("Second.pdf");
documentSecond.close();
documentFirst.close();
// now convert it to GRAYSCALE or do it in the loop above!
I only started using Apache PDFBox this week. I have followed some examples, but most are old and no longer work. So far I have done what I need; only the grayscale conversion is missing.
I am also open to other solutions in Java using an open-source library or free tools. (I found some using Ghostscript and Python.)
I read this example, but I didn't understand it, and it uses deprecated functions:
https://github.com/lencinhaus/pervads/blob/master/libs/pdfbox/src/java/org/apache/pdfbox/ConvertColorspace.java
It is about the PDF specification and changing the color space.
As far as I understood, you would also be interested in a Ghostscript-based solution.
If you are able to call gs from your command line, you can convert color to grayscale with this command:
gs -sDEVICE=pdfwrite -sProcessColorModel=DeviceGray -sColorConversionStrategy=Gray -dOverrideICC -o out.pdf -f input.pdf
My answer is taken from: How to convert a PDF to grayscale from command line avoiding to be rasterized?
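Since you are working in Java, the same command can be started with ProcessBuilder. This is a minimal sketch: it assumes a gs executable is on the PATH and uses only the flags from the command above. The class name and file names are placeholders.

```java
import java.io.IOException;
import java.util.List;

/** Builds and runs the Ghostscript grayscale conversion command. */
final class GhostscriptGrayscale {

    /** Returns the argument list for the gs call, using the flags quoted above. */
    static List<String> command(String input, String output) {
        return List.of(
                "gs",
                "-sDEVICE=pdfwrite",
                "-sProcessColorModel=DeviceGray",
                "-sColorConversionStrategy=Gray",
                "-dOverrideICC",
                "-o", output,
                "-f", input);
    }

    /** Runs Ghostscript (assumed to be on the PATH) and returns its exit code. */
    static int convert(String input, String output) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(input, output))
                .inheritIO() // show Ghostscript's messages on this program's console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Only prints the command; call convert(...) to actually run it.
        System.out.println(String.join(" ", command("Second.pdf", "Second-grayscale.pdf")));
    }
}
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.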
My web application signs PDF documents. I would like to let users download the original PDF document (not signed) but adding an image and the signers in the left margin of the pdf document.
I've seen this idea in another web application, and I would like to do the same. Of course, I would like to do it using the iText library.
I have attached two images, the original PDF document (not signed) and the modified PDF document.
First this: it is important to change the document before you digitally sign it. Once digitally signed, these changes will break the signature.
I will break up the question in two parts and I'll skip the part about the actual watermarking as this is already explained here: How to watermark PDFs using text or images?
This question is not a duplicate of that question, because of the extra requirement to add an extra margin to the left.
Take a look at the primes.pdf document. This is the source file we are going to use in the AddExtraMargin example with the following result: primes_extra_margin.pdf. As you can see, a half-inch margin was added to the left of each page.
This is how it's done:
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    int n = reader.getNumberOfPages();
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    // properties
    PdfContentByte over;
    PdfDictionary pageDict;
    PdfArray mediabox;
    float llx, lly, ury;
    // loop over every page
    for (int i = 1; i <= n; i++) {
        pageDict = reader.getPageN(i);
        mediabox = pageDict.getAsArray(PdfName.MEDIABOX);
        llx = mediabox.getAsNumber(0).floatValue();
        lly = mediabox.getAsNumber(1).floatValue();
        ury = mediabox.getAsNumber(3).floatValue();
        // move the left edge of the page 36 user units (half an inch) outwards
        mediabox.set(0, new PdfNumber(llx - 36));
        over = stamper.getOverContent(i);
        over.saveState();
        over.setColorFill(new GrayColor(0.5f));
        // fill the new strip; its height is the page height, ury - lly
        over.rectangle(llx - 36, lly, 36, ury - lly);
        over.fill();
        over.restoreState();
    }
    stamper.close();
    reader.close();
}
The PdfDictionary we get with the getPageN() method is called the page dictionary. It has plenty of information about a specific page in the PDF. We are only looking at one entry: the /MediaBox. This is only a proof of concept. If you want to write a more robust application, you should also look at the /CropBox and the /Rotate entry. Incidentally, I know that these entries don't exist in primes.pdf, so I am omitting them here.
The media box of a page is an array with four values that represent a rectangle defined by the coordinates of its lower-left and upper-right corner (usually, I refer to them as llx, lly, urx and ury).
In my code sample, I change the value of llx by subtracting 36 user units. If you compare the page size of both PDFs, you'll see that we've added half an inch.
We also use these coordinates to draw a rectangle that covers the extra half inch. Now switch to the other watermark examples to find out how to add text or other content to each page.
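The arithmetic on the media box can be checked without iText. The following sketch uses plain float arrays in the order llx, lly, urx, ury. ExtraMargin and the sample box are made up for illustration; the box deliberately has a lower-left corner that is not at the origin, to show that the strip height is ury minus lly.

```java
import java.util.Arrays;

/** Media box arithmetic for adding a margin on the left side of a page. */
final class ExtraMargin {

    /** Returns a copy of {llx, lly, urx, ury} with the left edge moved out by margin. */
    static float[] widenLeft(float[] mediaBox, float margin) {
        float[] widened = mediaBox.clone();
        widened[0] = mediaBox[0] - margin;
        return widened;
    }

    /** Returns {x, y, width, height} of the strip that covers the new margin. */
    static float[] marginStrip(float[] mediaBox, float margin) {
        float llx = mediaBox[0];
        float lly = mediaBox[1];
        float ury = mediaBox[3];
        return new float[] {llx - margin, lly, margin, ury - lly};
    }

    public static void main(String[] args) {
        float[] box = {10f, 20f, 605f, 812f}; // hypothetical page, not primes.pdf
        System.out.println(Arrays.toString(widenLeft(box, 36f)));
        System.out.println(Arrays.toString(marginStrip(box, 36f)));
    }
}
```

The four strip values are what you would pass to over.rectangle(x, y, width, height).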
Update:
if you need to scale down the existing pages, please read Fix the orientation of a PDF in order to scale it
I am using PDFBox to manipulate PDF content. I have a big PDF file (say 500 pages). I also have a few other single-page PDF files, each containing only a single image, which are around 8-15 KB per file at most. What I need to do is import these single-page PDFs as overlays onto certain pages of the big PDF file.
I have tried the LayerUtility of PDFBox and succeeded, but it creates a very large output file. The source PDF is about 1 MB before processing, and when the smaller PDF files are added, the size goes up to 64 MB. And sometimes I need to put two smaller PDFs onto the bigger one.
Is there a better way to do this or am I just doing this wrong? Posting code below trying to add two layers onto a single page:
...
...
..
overlayDoc[pCounter] = PDDocument.load("data\\" + overlay + ".pdf");
outputPage[pCounter] = (PDPage) overlayDoc[pCounter].getDocumentCatalog().getAllPages().get(0);
LayerUtility lu = new LayerUtility( overlayDoc[pCounter] );
form[pCounter] = lu.importPageAsForm( bigPDFDoc, Integer.parseInt(pageNo)-1);
lu.appendFormAsLayer( outputPage[pCounter], form[pCounter], aTrans, "OVERLAY_"+pCounter );
outputDoc.addPage(outputPage[pCounter]);
mOverlayDoc[pCounter] = PDDocument.load("data\\" + overlay2 + ".pdf");
mOutputPage[pCounter] = (PDPage) mOverlayDoc[pCounter].getDocumentCatalog().getAllPages().get(0);
LayerUtility lu2 = new LayerUtility( mOverlayDoc[pCounter] );
mForm[pCounter] = lu2.importPageAsForm(outputDoc, outputDoc.getNumberOfPages()-1);
lu.appendFormAsLayer( mOutputPage[pCounter], mForm[pCounter], aTrans, "OVERLAY_2"+pCounter );
outputDoc.removePage(outputPage[pCounter]);
outputDoc.addPage(mOutputPage[pCounter]);
...
...
With code like the following I don't see any unexpected growth in size:
PDDocument bigDocument = PDDocument.load(BIG_SOURCE_FILE);
LayerUtility layerUtility = new LayerUtility(bigDocument);
List bigPages = bigDocument.getDocumentCatalog().getAllPages();
// import each page to superimpose only once
PDDocument firstSuperDocument = PDDocument.load(FIRST_SUPER_FILE);
PDXObjectForm firstForm = layerUtility.importPageAsForm(firstSuperDocument, 0);
PDDocument secondSuperDocument = PDDocument.load(SECOND_SUPER_FILE);
PDXObjectForm secondForm = layerUtility.importPageAsForm(secondSuperDocument, 0);
// These things can easily be done in a loop, too
AffineTransform affineTransform = new AffineTransform(); // Identity... your requirements may differ
layerUtility.appendFormAsLayer((PDPage) bigPages.get(0), firstForm, affineTransform, "Superimposed0");
layerUtility.appendFormAsLayer((PDPage) bigPages.get(1), secondForm, affineTransform, "Superimposed1");
layerUtility.appendFormAsLayer((PDPage) bigPages.get(2), firstForm, affineTransform, "Superimposed2");
bigDocument.save(BIG_TARGET_FILE);
As you see I superimposed the first page of FIRST_SUPER_FILE on two pages of the target file but I only imported the page once. Thus, also the resources of that imported page are imported only once.
This approach works in loops, too, but don't import the same page multiple times! Instead, import all required template pages once up front as forms, and reference those forms again and again in the later loop.
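If the set of templates is not known up front, a small cache gives the same import-once behavior inside a loop. This is a generic sketch and not PDFBox API: ImportOnceCache is a hypothetical helper, and the importer function stands in for the call to importPageAsForm.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Loads each template at most once and hands out the same instance afterwards. */
final class ImportOnceCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> importer;

    ImportOnceCache(Function<K, V> importer) {
        this.importer = importer;
    }

    /** Returns the cached value for key, importing it on first use only. */
    V get(K key) {
        return cache.computeIfAbsent(key, importer);
    }

    public static void main(String[] args) {
        int[] imports = {0};
        ImportOnceCache<String, String> forms = new ImportOnceCache<>(name -> {
            imports[0]++; // stand-in for the expensive page import
            return "form:" + name;
        });
        for (String overlay : new String[] {"first", "second", "first"}) {
            forms.get(overlay);
        }
        System.out.println("imports performed: " + imports[0]);
    }
}
```

With PDFBox the importer would have to wrap the checked IOException thrown by importPageAsForm, for example in an UncheckedIOException.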
(I hope this solves your issue. If not, supply more code and the sample PDFs to reproduce your issue.)