Changing page zoom of an existing pdf with PDFBox

Changing page zoom of an existing pdf with PDFBox - java

I have a pdf which I'm iterating through using PDFBox as below:
PDDocument doc = PDDocument.load(new ByteArrayInputStream(bytearray));
PDDocumentCatalog catalog = doc.getDocumentCatalog();
for(PDPage page : catalog.getPages()){
...
}
I want to set the default magnification for the pages so that when it is opened through a pdf reader, it opens at 75% zoom by default. Is this possible? I've seen few posts where the zoom is set using PDPageXYZDestination, but I'm not sure whether that is applicable in my case.
Thanks.

Do this, it applies to the first seen page only, i.e. when opening:
PDDocumentCatalog catalog = doc.getDocumentCatalog();
PDPage page = doc.getPage(0); // zero-based; you can also put another number to jump to a specific existing page
PDPageXYZDestination dest = new PDPageXYZDestination();
dest.setPage(page);
dest.setZoom(0.75f);
dest.setLeft((int) page.getCropBox().getLowerLeftX());
dest.setTop((int) page.getCropBox().getUpperRightY());
PDActionGoTo action = new PDActionGoTo();
action.setDestination(dest);
catalog.setActions(null);
catalog.setOpenAction(action);
doc.save(...);

Related

itext7 - How to copy page as form XObject while keeping hidden OCGs hidden

I am using PdfFormXObject pageCopy = sourcePage.CopyAsFormXObject(pdf); to then insert pageCopy into a new PDF page using pdfCanvas.AddXObjectFittedIntoRectangle. The copied page is visible in the new PDF as expected, but it how has it's 'hidden' OCGs visible.
The reason I am doing this is to be able to take a PDF page, scale and crop it and add it to a new PDF where it may be collated with other contents.
Is there a way to remove OCG PDF content prior to create the XObject, or is there a different way of achieving my goal without using the XObject route that allows me to maintain the 'off' status of hidden OCGs

OCG removal functionality is not yet available in iText 7.
There is, however, a workaround that you can try to apply: we can copy all the information about OCGs from your source document to the target document which should create the same OCGs in the target document and preserve default on/off states.
To copy the OCGs, you can copy a page from one document to another one (which is going to copy all the OCGs) and then remove that page.
When the OCG removal functionality becomes available in iText the approach would become cleaner but for now you can use the code similar to the following:
PdfDocument sourceDocument = new PdfDocument(new PdfReader(sourcePdfPath));
PdfDocument targetDocument = new PdfDocument(new PdfWriter(targetPdfPath));
PdfFormXObject pageCopy = sourceDocument.getFirstPage().copyAsFormXObject(targetDocument);
PdfPage page = targetDocument.addNewPage();
PdfCanvas canvas = new PdfCanvas(page);
canvas.addXObject(pageCopy);
// Workaround: copying the page from source document to destination document also copies OCGs
sourceDocument.copyPagesTo(1, 1, targetDocument);
// Workaround: remove the page that we only copied to make sure OCGs are copied
targetDocument.removePage(targetDocument.getNumberOfPages());
sourceDocument.close();
targetDocument.close();

Convert html to pdf in landscape mode using iText

I'm trying to convert html to pdf using iText.
Here is the simple code that is working fine :
ByteArrayOutputStream pdfStream = new ByteArrayOutputStream();
HtmlConverter.convertToPdf(htmlAsStringToConvert, pdfStream)
Now, I want to convert the pdf to LANDSCAPE mode, so I've tried :
ConverterProperties converterProperties = new ConverterProperties();
MediaDeviceDescription mediaDeviceDescription = new MediaDeviceDescription(MediaType.SCREEN);
mediaDeviceDescription.setOrientation(LANDSCAPE);
converterProperties.setMediaDeviceDescription(mediaDeviceDescription);
HtmlConverter.convertToPdf(htmlAsStringToConvert, pdfStream, converterProperties);
and also :
PdfDocument pdfDoc = new PdfDocument(writer);
pdfDoc.setDefaultPageSize(PageSize.A4.rotate());
HtmlConverter.convertToPdf(htmlAsStringToConvert, pdfDoc, new ConverterProperties()).
I've also mixed both, but the result remains the same, the final PDF is still in default mode.

The best way to achieve landscape page size when converting HTML to PDF is to provide the corresponding CSS instruction for the page to become landscape.
This is done with the following CSS:
#page {
size: landscape;
}
Now, if you have your input HTML document in htmlAsStringToConvert variable then you can process it as an HTML using Jsoup library which iText embeds. Basically we are just adding the necessary CSS instruction into our <head>:
Document htmlDoc = Jsoup.parse(htmlAsStringToConvert);
htmlDoc.head().append("<style>" +
"#page { size: landscape; } "
+ "</style>");
HtmlConverter.convertToPdf(htmlDoc.outerHtml(), new FileOutputStream(outPdf));
Beware that if you already have #page declarations in your HTML then the one you append might be in conflict with the ones you already have - in this case, you need to make sure you insert your declaration as the latest one (this should be the case with the code above as long as all of your declaration are in <head> element).

PDF Box flatten PDF causes weird spacing

I'm having an issue with PDF box flattening a PDF generated by Adobe Acrobat DC.
The Adobe Acrobat text field I created is absolutely the default text field.
In my example below, I have a PatientName field with the text value "Douglas McDouggelman".
When I flatten the PDF, here's what it looks like:
Anyone know what's up with this bizarre spacing?
It appears that the space + next character are combined. This is what it looks like when you try to select that character.
Code:
try (PDDocument document = PDDocument.load(pdfFormInputStream)) {
PDDocumentCatalog catalog = document.getDocumentCatalog();
PDAcroForm acroForm = catalog.getAcroForm();
acroForm.getField("PatientName").setValue("Douglas McDouggelman");
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
if (flattenPdfs) {
acroForm.flatten();
}
document.save(byteArrayOutputStream);
}

I realized this PDF was from some other group who made it and who knows what they did. So I found the source word document, repeated the creation of the form from Adobe DC, added the fields back to the document, then it was totally fine.
PDF box was not the problem... it was some unknown incorrect step that the person who originally prepared the pdf did.

Add hyperlink to pdf files using pdfbox

I need to create a small tool that adds a hyperlink on the first page of a PDF file. I'm using Apache PDFBox to read the pdf files.
Any ideas how to add a hyperlink on a page using this library?
I found this question: how to set hyperlink in content using pdfbox, but this doesn't work.
I just want to add a hyperlink on the first page of a pdf file.
File file = new File(filename);
PDDocument doc = PDDocument.load(file);
PDPage page = doc.getPage(0);
...
I have at least 2 problems with the solution that I found on this question:
The method drawString(String) in the type PDPageContentStream is not applicable for the arguments (PDAnnotationLink)
colourBlue is not initialized
I would prefer to add the hyperlink at the bottom of the page with the URL centered. But for the moment any suggestion will help

First of all, you need to create a PDAnnotationLink like this:
PDAnnotationLink link = new PDAnnotationLink();
The link should have an action:
PDActionURI actionURI = new PDActionURI();
actionUri.setURI("http://www.Google.com");
link.setAction(action);
Finally, you need to define a rectangle at the desired position and finally add the link to the page's annotations.
PDRectangle pdRectangle = new PDRectangle();
pdRectangle.setLowerLeftX(...);
pdRectangle.setLowerLeftY(...);
pdRectangle.setUpperRightX(...);
pdRectangle.setUpperRightY(...);
link.setRectangle(pdRectangle);
page.getAnnotations().add(link);
If you want you can also add an underline for the link by calling the setBorderStyle(...) methid.
Hope this works for you !
If you want to add some text, then you need to create a PDPageContentStream like this:
PDPageContentStream contentStream = new PDPageContentStream(doc, page);
contentStream.beginText();
contentStream.newLineAtOffset(..., ...);
contentStream.showText(...);
contentStream.endText();
contentStream.close();
The newLineAtOffset(..., ...) method is used for positioning the text at the desired location.
P.S. Sorry for the bad indentation but it is pretty hard to write on the mobile. If you need any further help you can even write me a private message in Romanian.

How to use source document with form as a template for multiple page target document in PDFBox

I am using PDFBox 1.2.1 in Java and I am trying to use single page pdf document which has an acro form in it as a template for making multiple page target pdf.
PDDocument sourceDocument = PDDocument.load(fileStream);
PDDocument targetDocument = new PDDocument();
PDDocumentCatalog sourceDocCatalog = sourceDocument.getDocumentCatalog();
PDAcroForm acroFormFromSource = sourceDocCatalog.getAcroForm();
targetDocument.getDocumentCatalog().setAcroForm(acroFormFromSource);
PDPage templatePdfPage = (PDPage) sourceDocument.getDocumentCatalog().getAllPages().get(0);
for (int i = 0; i < 5; i++) {
targetDocument.addPage(templatePdfPage);
PDDocumentCatalog targetDocumentsDocumentCatalog = targetDocument.getDocumentCatalog();
PDAcroForm acroForm = targetDocumentsDocumentCatalog.getAcroForm();
acroForm.getField("Text1").setValue("Car " + i);
}
Unfortunately the generated target pdf contains 5 pages but every page has Text1 field with same value "Car 4". So every page is the same acro form. Is it somehow possible to generate new unique acro form for every page or is there other possible solution for my use case?

I think the problem is that you're using the same Java object acroFormFromSource for all the pages, so when you set the "Text1" field on page 4 (the last page of pages 0 through 4) it's setting it for all 5 pages.
I think you need to make a new copy the original PDAcroForm for each page. I suspect the easiest way to make a copy is to use the copy constructor of CosDictionary (COSDictionary( COSDictionary dict )). Be aware that this makes a shallow copy, though!

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Changing page zoom of an existing pdf with PDFBox - java

Related

itext7 - How to copy page as form XObject while keeping hidden OCGs hidden

Convert html to pdf in landscape mode using iText

PDF Box flatten PDF causes weird spacing

Add hyperlink to pdf files using pdfbox

How to use source document with form as a template for multiple page target document in PDFBox

Categories

Resources