I am trying to fill in a PDF form with PDFBox, and it works well with a portrait-oriented document. But I have a problem when filling in a document in landscape mode. The fields are filled in, but the text orientation is wrong: it appears vertically, as if the page were still in portrait but rotated by 90 degrees.
Here is my simplified code:
PDDocument pdfDoc = PDDocument.load(MY_FILE);
PDDocumentCatalog docCatalog = pdfDoc.getDocumentCatalog();
PDAcroForm acroForm = docCatalog.getAcroForm();
acroForm.getField("aAddressLine1").setValue("ADDRESS1_HERE");
acroForm.getField("aAddressLine2").setValue("ADDRESS1_HERE");
acroForm.getField("country").setValue("COUNTRY_HERE");
pdfDoc.save(PATH_HERE);
pdfDoc.close();
Has anyone managed to fill in a PDF form in landscape mode?
Thanks for your help.
The short answer
I'm afraid PDFBox does not yet (as of version 1.8.2) allow you to fill in landscape PDFs like the one you provided, because it does not seem to query and factor in information about the page the form field is located on.
The long answer
There are different ways you can define a page to be A4 landscape:
You can define it to have the A4 landscape dimensions directly by means of a media box definition:
/MediaBox [0 0 842 595]
In this case the coordinates of your aAddressLine1 would be
/Rect[23.1711 86.8914 292.121 100.132]
or you can define it to have the A4 portrait dimensions and be rotated by 90° (or 270°, obviously):
/MediaBox [0 0 595 842]
/Rotate 90
In this case the coordinates of your aAddressLine1 are
/Rect[86.8914 23.1711 100.132 292.121]
Your example document uses the latter method.
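To make the two definitions concrete, here is a minimal sketch in plain Java (no PDF library involved; the class and method names are made up for illustration) that computes the page size a viewer displays from a /MediaBox and a /Rotate value. It assumes /Rotate is a multiple of 90. Both definitions above end up as an 842 x 595 page.

```java
// Hypothetical helper, not part of PDFBox or any other library.
class DisplayedPageSize {

    /**
     * Returns {width, height} of the page as a viewer displays it.
     * mediaBox is {llx, lly, urx, ury}; rotate is the /Rotate value in degrees
     * (assumed to be a multiple of 90).
     */
    static double[] displayedSize(double[] mediaBox, int rotate) {
        double width = Math.abs(mediaBox[2] - mediaBox[0]);
        double height = Math.abs(mediaBox[3] - mediaBox[1]);
        // Bring the angle into the range 0..359 so negative values work too.
        int normalized = ((rotate % 360) + 360) % 360;
        boolean swapped = normalized == 90 || normalized == 270;
        return swapped ? new double[] {height, width} : new double[] {width, height};
    }

    public static void main(String[] args) {
        // Landscape dimensions given directly in the media box.
        double[] direct = displayedSize(new double[] {0, 0, 842, 595}, 0);
        // Portrait media box plus /Rotate 90.
        double[] rotated = displayedSize(new double[] {0, 0, 595, 842}, 90);
        System.out.println(direct[0] + " x " + direct[1]);   // 842.0 x 595.0
        System.out.println(rotated[0] + " x " + rotated[1]); // 842.0 x 595.0
    }
}
```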
Now PDFBox, when creating an appearance stream for that field, only looks at the rectangle defining the field and ignores the properties of the page. Thus, PDFBox sees a very narrow and very tall text field and fills it in just like that. It is completely unaware that the result will be rotated in a PDF viewer.
What it should have done is to also look at the page the field is located on. If that page has a /Rotate entry, it should create an appearance stream for the field which displays the text rotated in the opposite direction.
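The page lookup described here can be sketched in plain Java. This is only an illustration under stated assumptions, not what PDFBox or any other library does internally: the class and method names are hypothetical, /Rotate is assumed to be a multiple of 90, and the compensating angle is expressed in the same clockwise sense as /Rotate.

```java
// Hypothetical helper, not part of PDFBox or any other library.
class RotatedFieldGeometry {

    /** Brings a /Rotate value into the range 0..359. */
    static int normalize(int rotate) {
        return ((rotate % 360) + 360) % 360;
    }

    /** Returns {width, height} of the field as it appears on the displayed (rotated) page. */
    static double[] viewedSize(double[] rect, int pageRotate) {
        double width = Math.abs(rect[2] - rect[0]);
        double height = Math.abs(rect[3] - rect[1]);
        int r = normalize(pageRotate);
        return (r == 90 || r == 270) ? new double[] {height, width} : new double[] {width, height};
    }

    /** Clockwise angle that, added to the page rotation, brings the text back upright. */
    static int compensatingRotation(int pageRotate) {
        return (360 - normalize(pageRotate)) % 360;
    }

    public static void main(String[] args) {
        double[] rect = {86.8914, 23.1711, 100.132, 292.121}; // /Rect of aAddressLine1
        double[] size = viewedSize(rect, 90);
        // The raw rectangle is narrow and tall; on the displayed page it is wide and low.
        System.out.printf("raw:    %.2f x %.2f%n", rect[2] - rect[0], rect[3] - rect[1]);
        System.out.printf("viewed: %.2f x %.2f%n", size[0], size[1]);
        System.out.println("compensating rotation: " + compensatingRotation(90));
    }
}
```

A library that generates appearance streams would lay out the text for the viewed size and then rotate the appearance by the compensating angle.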
Alternatives
In a comment you also asked
Do you know another library I could use if PDFBox can't do what I want?
I have tested the feat with iText 5.4.2:
PdfReader reader = new PdfReader(MY_FILE);
OutputStream os = new FileOutputStream(PATH_HERE);
PdfStamper stamper = new PdfStamper(reader, os);
AcroFields acroFields = stamper.getAcroFields();
acroFields.setField("aAddressLine1", "ADDRESS1_HERE");
acroFields.setField("aAddressLine2", "ADDRESS1_HERE");
stamper.close();
(The free iText version is licensed under the AGPL; you have to decide whether that's ok for your project. There is a commercial license, too, if it's not ok.)
I'm sure other PDF libraries can also do that; it's not too exotic a feature after all...
But I also tested PDF Clown 0.1.3 (trunk version), which did not work either:
File file = new File(MY_FILE);
Document document = file.getDocument();
Form form = document.getForm();
form.getFields().get("aAddressLine1").setValue("ADDRESS1_HERE");
form.getFields().get("aAddressLine2").setValue("ADDRESS1_HERE");
file.save(new java.io.File(PATH_HERE), SerializationModeEnum.Incremental);
file.close();
Related
I am using PdfFormXObject pageCopy = sourcePage.CopyAsFormXObject(pdf); to then insert pageCopy into a new PDF page using pdfCanvas.AddXObjectFittedIntoRectangle. The copied page is visible in the new PDF as expected, but it now has its 'hidden' OCGs visible.
The reason I am doing this is to be able to take a PDF page, scale and crop it and add it to a new PDF where it may be collated with other contents.
Is there a way to remove OCG PDF content prior to creating the XObject, or is there a different way of achieving my goal, without using the XObject route, that allows me to maintain the 'off' status of hidden OCGs?
OCG removal functionality is not yet available in iText 7.
There is, however, a workaround that you can try to apply: we can copy all the information about OCGs from your source document to the target document which should create the same OCGs in the target document and preserve default on/off states.
To copy the OCGs, you can copy a page from one document to another one (which is going to copy all the OCGs) and then remove that page.
When the OCG removal functionality becomes available in iText, the approach will become cleaner, but for now you can use code similar to the following:
PdfDocument sourceDocument = new PdfDocument(new PdfReader(sourcePdfPath));
PdfDocument targetDocument = new PdfDocument(new PdfWriter(targetPdfPath));
PdfFormXObject pageCopy = sourceDocument.getFirstPage().copyAsFormXObject(targetDocument);
PdfPage page = targetDocument.addNewPage();
PdfCanvas canvas = new PdfCanvas(page);
canvas.addXObject(pageCopy);
// Workaround: copying the page from source document to destination document also copies OCGs
sourceDocument.copyPagesTo(1, 1, targetDocument);
// Workaround: remove the page that we only copied to make sure OCGs are copied
targetDocument.removePage(targetDocument.getNumberOfPages());
sourceDocument.close();
targetDocument.close();
I have to fill pdf fields.
What I have done so far is to open my PDF as a form with Adobe Acrobat and then save it. It turns into a PDF form, and with Apache PDFBox I can achieve what I want to do.
Unfortunately, I must not open it with an external program. And if I don't do this little trick, I get an empty list of fields with:
PDDocument pdfDoc = PDDocument.load(new File(path));
PDDocumentCatalog docCatalog = pdfDoc.getDocumentCatalog();
PDAcroForm acroForm = docCatalog.getAcroForm();
acroForm.getFields(); // empty
Is it possible to create my future fields dynamically in Java from a simple PDF?
Thanks in advance
I'm having an issue with PDFBox flattening a PDF generated by Adobe Acrobat DC.
The Adobe Acrobat text field I created is absolutely the default text field.
In my example below, I have a PatientName field with the text value "Douglas McDouggelman".
When I flatten the PDF, here's what it looks like:
Anyone know what's up with this bizarre spacing?
It appears that the space + next character are combined. This is what it looks like when you try to select that character.
Code:
try (PDDocument document = PDDocument.load(pdfFormInputStream)) {
    PDDocumentCatalog catalog = document.getDocumentCatalog();
    PDAcroForm acroForm = catalog.getAcroForm();
    acroForm.getField("PatientName").setValue("Douglas McDouggelman");
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    if (flattenPdfs) {
        acroForm.flatten();
    }
    document.save(byteArrayOutputStream);
}
I realized this PDF came from some other group, and who knows how they made it. So I found the source Word document, created the form from it again in Adobe Acrobat DC, and added the fields back to the document. After that it was totally fine.
PDFBox was not the problem. It was some unknown incorrect step taken by the person who originally prepared the PDF.
I have a sample of scanned PDFs that I need to edit and re-export. I use PDFBox to render the PDF into a series of images (one image per page), I perform some OpenCV calculations on the rasterized jpegs and then I intend to insert them back into a new pdf file.
Example: PDF is 423kb, Page 1 is 313kb, Page 2 is 287kb, Page 3 is 319kb, Page 4 is 485kb, and Page 5 is 470kb.
Problem is that the output images are greater in size than the PDF itself. This results in my OCR efforts taking much longer than is acceptable (5 minutes vs 30 seconds per document). The only way to keep the jpegs from inflating in size is to leave them with a default DPI of 72. This produces poor quality images that cannot be used.
Why is this happening? I should be able to get back images that have a size less than or equal to the PDF in question (without sacrificing quality). I'm not doing anything weird to the images, just removing watermarks.
Here's some code illustrating how I'm extracting the jpegs from the PDF.
File file = new File(fileName);
PDDocument document = PDDocument.load(file);
PDFRenderer renderer = new PDFRenderer(document);
BufferedImage[] pageArray = new BufferedImage[document.getNumberOfPages()];
// Render every page (0-based index) at 160 DPI.
for (int pageIndex = 0; pageIndex < pageArray.length; pageIndex++) {
    pageArray[pageIndex] = renderer.renderImageWithDPI(pageIndex, 160);
}
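One factor behind the size increase is plain arithmetic: PDF units are 1/72 inch, so the pixel dimensions of a rendered page grow linearly with the DPI and the amount of raw pixel data grows with its square. The following sketch estimates this in plain Java. The class name, the A4 page size, and the 3 bytes per pixel are assumptions made for illustration, the exact rounding a renderer applies may differ, and JPEG compression will reduce the stored size by an amount that depends on the image content.

```java
// Hypothetical helper for back-of-the-envelope estimates; not part of PDFBox.
class RenderSizeEstimate {

    /** Pixels along one side for a length in PDF points (1/72 inch) at the given DPI. */
    static long pixels(double points, int dpi) {
        return Math.round(points * dpi / 72.0);
    }

    /** Uncompressed size in bytes of an RGB bitmap (3 bytes per pixel) of the page. */
    static long rawRgbBytes(double widthPt, double heightPt, int dpi) {
        return pixels(widthPt, dpi) * pixels(heightPt, dpi) * 3;
    }

    public static void main(String[] args) {
        double widthPt = 595;  // assumed A4 page; the scanned pages may differ
        double heightPt = 842;
        for (int dpi : new int[] {72, 160, 300}) {
            System.out.println(dpi + " DPI: " + pixels(widthPt, dpi) + " x " + pixels(heightPt, dpi)
                    + " px, " + rawRgbBytes(widthPt, heightPt, dpi) + " bytes uncompressed");
        }
    }
}
```

Under these assumptions, going from 72 to 160 DPI multiplies the pixel count by roughly (160/72)^2, which is about 4.9.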
I have a use case that I cannot figure out how to implement.
I'm using headless Chrome to export the contents of a rich text editor as a PDF. I then need to cut out a part of the created PDF and embed it as a PDF annotation in another, parent PDF, such that the annotation looks exactly the same as the section I cut out of the created PDF.
I'm able to correctly calculate and cut the precise area I need from the created PDF using instructions provided by:
https://developers.itextpdf.com/examples/stamping-content-existing-pdfs-itext5/cut-and-paste-content-page
PdfTemplate template2 = cb.createTemplate(pageSize.getWidth(), pageSize.getHeight());
template2.rectangle(toMove.getLeft(), toMove.getBottom(), toMove.getWidth(), toMove.getHeight());
template2.clip();
template2.newPath();
template2.addTemplate(page, 0, 0);
cb.addTemplate(template1, 0, 0);
cb.addTemplate(template2, -20, -2);
I would like to add the PdfTemplate via a PdfStamper.
Is this possible? If not, how can I achieve this with another method?
In the example you refer to, you obtain cb like this:
PdfContentByte cb = writer.getDirectContent();
When using PdfStamper, you can obtain cb like this:
PdfContentByte cb = stamper.getUnderContent(p);
Or like this:
PdfContentByte cb = stamper.getOverContent(p);
The former method will add the new content under the existing content; the latter method will add the new content on top of the existing content. In these lines p is a page number (from 1 to the total number of pages of the existing document). See How to superimpose pages from existing documents into another document? for an example.
If you want to add new pages to an existing document, use the insertPage() method as explained in How to add blank pages to an existing PDF in java? Once you have added a blank page, you can add a PdfTemplate to it.