How to add text watermark to pdf in Java using Apache PDFBox? - java

I am not getting any tutorial for adding a text watermark in a PDF file? Can you all please guide me, I am very new to PDFBOX.
Its not duplicate, the link in the comment didn't help me. I want to add text, not an image to the pdf.

Here is an example using PDFBox 2.0.2. This will load a PDF and write some text in the bottom right corner in a red transparent font. If it is a multiple page PDF the watermark will appear on every page. It might not be production ready, as I am not sure if there are some additional null conditions that need to be checked, but it should get you running in the right direction.
Keep in mind that this particular block of code will not modify the original PDF, but will create a new PDF using the Tmp_(filename) as the output.
private static void watermarkPDF (File fileStored) {
File tmpPDF;
PDDocument doc;
tmpPDF = new File(fileStored.getParent() + System.getProperty("file.separator") +"Tmp_"+fileStored.getName());
doc = PDDocument.load(fileStored);
for(PDPage page:doc.getPages()){
PDPageContentStream cs = new PDPageContentStream(doc, page, AppendMode.APPEND, true, true);
String ts = "Some sample text";
PDFont font = PDType1Font.HELVETICA_BOLD;
float fontSize = 14.0f;
PDResources resources = page.getResources();
PDExtendedGraphicsState r0 = new PDExtendedGraphicsState();
r0.setNonStrokingAlphaConstant(0.5f);
cs.setGraphicsStateParameters(r0);
cs.setNonStrokingColor(255,0,0);//Red
cs.beginText();
cs.setFont(font, fontSize);
cs.setTextMatrix(Matrix.getTranslateInstance(0f,0f));
cs.showText(ts);
cs.endText();
}
cs.close();
}
doc.save(tmpPDF);
}

Related

pdfbox: how to load a font once and use it many times?

I am trying to create a lot of pdf files in a loop.
for(int i=0; i<10000; ++i){
PDDocument doc = PDDocument.load(inputstream);
PDPage page = doc.getPage(0);
PDPageContentStream content = new PDPageContentStream(doc, page, PDPageContentStream.AppendMode.APPEND, true, true);
content.beginText();
//what happens here?
PDFont font = PDType0Font.load(doc, Thread.currentThread().getContextClassLoader().getResourceAsStream("font/simsun.ttf") );
content.setFont(font, 10);
//...
doc.save(outstream);
doc.close();
}
what does it happen by calling PDType0Font.load... ? Because the ttf file is large (10M), will it create ephemeral big objects of font 10000 times? If so, is there a way to make the font as embedded as PDType1Font, so I can just load it once and use it many times in the loop?
I encountered a full GC problem here, and I'm trying to figure it out.
Create the font at the fontbox level:
TrueTypeFont ttf = new TTFParser().parse(...);
You can now reuse ttf in different PDDocument objects like this:
PDFont font = PDType0Font.load(doc, ttf, true);
When done with all documents, don't forget to close ttf.
See also PDFontTest.testPDFBox3826() in the source code.

iText/Java When converting HTML to PDF, the text positioning does not match

I use iText to convert HTML to PDF. In general, everything works as it should, but there is a problem with some font families. After the conversion, the height positioning of fonts in a PDF document does not match what it was in HTML. An example in the picture below (on the left is the generated PDF, on the right is a preview of the PDF document):
In the image:
(1) - the font Arial - with it all right.
(2) - the font Times New Roman - in PDF it rose to 2px.
(3) - the Dancing Script font - in PDF it dropped to 5px.
(4) - the Tangerine font - it’s all right.
(5) - the Charmonman font - in PDF it dropped to 5px.
For HTML template, which is generated in PDF specified CSS styles exactly the same as used to preview PDF document.
PDF is generated as follows:
private byte[] getPdf(StringBuilder fileText) throws IOException {
FontProvider fontProvider = new FontProvider(addCertificateFonts(), "");
fontProvider.addStandardPdfFonts();
ConverterProperties properties = new ConverterProperties().setFontProvider(fontProvider);
properties.setBaseUri(getBaseProjectUri());
try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
HtmlConverter.convertToPdf(fileText.toString(), out, properties);
return out.toByteArray();
}
}
private FontSet addCertificateFonts() {
FontSet set = new FontSet();
List<String> fontPaths = //здесь список путей к файлам шрифтов
ExternalContext context = FacesContext.getCurrentInstance().getExternalContext();
for (String path : fontPaths) {
set.addFont(context.getRealPath(path));
}
return set;
}
Is there any way to fix the font positioning?
Is it possible that the fonts themselves have some configurations that may affect the position of the text with this font?

write on existing form-pdf with pdfbox

I am relativly new to Java and I want to replace an existing iText based Javascript with pdfbox. (Java 2.0)
I have a pdf-Formsheet (but this sheet has no Acroform entries) and I want to fill it with information (Name, Birthdate and so on). The pdf is in a rectangular special size (like a contact card).
My code so far:
File file = new File("ToBeFilled.pdf");
PDDocument document = PDDocument.load(file);
System.out.println("PDF loaded");
//Retrieving the page
PDPage page = (PDPage)document.getPages().get( 0 );
PDFont font = PDType1Font.HELVETICA_BOLD;
PDPageContentStream content = new PDPageContentStream(document, page, PDPageContentStream.AppendMode.APPEND, true);
content.beginText();
//Setting the font to the Content stream
content.setFont(font, 30);
//Setting the position for the line (float x, float y), (0,0) = lower left corner
content.newLineAtOffset(100, 400);
String text = "This is the sample document and we are adding content to it.";
String text1 = "This is an example of adding text to a page in the pdf document. we can add as many lines";
String text2 = "as we want like this using the ShowText() method of the ContentStream class";
//Adding text in the form of string
content.showText(text);
//Adding text in the form of string
content.newLine();
content.showText(text1);
content.newLine();
content.showText(text2);
//Ending the content stream
content.endText();
System.out.println("Text added");
content.close();
//Saving the document
document.save("newPrint.pdf");
//Closing the document
document.close();
The text does not show. What am I missing here? I thought with the correct text-positions I could simply write on the pdf?
The source is working.
Maybe your content.newLineAtOffset(100, 400); is too huge - out of bounds - for your little card.
By the way, you have to setLeading(float) to use newLine() meaningfuly.

PDFBox incorrect text appearance after copy/paste

I’m using PDFBox 2.0.4 to create PDF documents with acroForms. Here is my test code example:
PDDocument document = new PDDocument();
PDPage page = new PDPage(PDRectangle.A4);
document.addPage(page);
PDAcroForm acroForm = new PDAcroForm(document);
document.getDocumentCatalog().setAcroForm(acroForm);
String dir = "../testPdfBox/src/main/resources/fonts/";
PDType0Font font = PDType0Font.load(document, new File(dir + "Roboto-Regular.ttf"));
PDResources resources = new PDResources();
String fontName = resources.add(font).getName();
acroForm.setDefaultResources(resources);
String defaultAppearanceString = format("/%s 12 Tf 0 g", fontName);
acroForm.setDefaultAppearance(defaultAppearanceString);
PDTextField field = new PDTextField(acroForm);
field.setPartialName("SampleField");
field.setDefaultAppearance(defaultAppearanceString);
acroForm.getFields().add(field);
PDAnnotationWidget widget = field.getWidgets().get(0);
PDRectangle rect = new PDRectangle(50, 750, 200, 50);
widget.setRectangle(rect);
widget.setPage(page);
widget.setPrinted(true);
page.getAnnotations().add(widget);
field.setValue("Sample field 123456");
acroForm.flatten();
document.save("target/SimpleForm.pdf");
document.close();
Everything works fine. But when I try to copy text from the created document and paste it to the NotePad or Word it becomes squares.
􀀷􀁅􀁑􀁔􀁐􀁉􀀄􀁊􀁍􀁉􀁐􀁈􀀄􀀕􀀖􀀗􀀘􀀙􀀚
I search a lot about this problem. The most popular answer is that there is no toUnicode cmap in created PDF. So I explore my document with CanOpener for Acrobat:
Yes, there is no toUnicode cmap, but everything works properly, if not to use acroForm.flatten(). When form fields are not flattened, I can copy/paste text from the document and it looks correct. Nevertheless I need all fields to be flattened.
So, I have two questions:
Why there is a problem with copy/pasting text in flattened form, and everything is ok in non-flattened?
What can I do to avoid problem with text copy/pasting?
Is there only one solution - to create toUnicode CMap by my own, like in this example?
My test pdf files are available here.
Please replace
PDType0Font font = PDType0Font.load(document, new File(dir + "Roboto-Regular.ttf"));
with
PDType0Font font = PDType0Font.load(document, new FileInputStream(dir + "Roboto-Regular.ttf"), false);
This makes sure that the font is embedded in full and not just as a subset.

Adding Header to existing PDF File using PDFBox

I am trying to add a Header to an existing PDF file. It works but the table header in the existing PDF are messed up by the change in the font. If I remove setting the font then the header doesn't show up. Here is my code:
// the document
PDDocument doc = null;
try
{
doc = PDDocument.load( file );
List allPages = doc.getDocumentCatalog().getAllPages();
//PDFont font = PDType1Font.HELVETICA_BOLD;
for( int i=0; i<allPages.size(); i++ )
{
PDPage page = (PDPage)allPages.get( i );
PDRectangle pageSize = page.findMediaBox();
PDPageContentStream contentStream = new PDPageContentStream(doc, page, true, true,true);
PDFont font = PDType1Font.TIMES_ROMAN;
float fontSize = 15.0f;
contentStream.beginText();
// set font and font size
contentStream.setFont( font, fontSize);
contentStream.moveTextPositionByAmount(700, 1150);
contentStream.drawString( message);
contentStream.endText();
//contentStream.
contentStream.close();}
doc.save( outfile );
}
finally
{
if( doc != null )
{
doc.close();
}
}
}`
Essentially you are running into a PDFBox bug in the current version 1.8.2.
A workaround:
Add a getFonts call of the page resources after creating the new content stream before using a font:
PDPage page = (PDPage)allPages.get( i );
PDRectangle pageSize = page.findMediaBox();
PDPageContentStream contentStream = new PDPageContentStream(doc, page, true, true,true);
page.getResources().getFonts(); // <<<<<<<<
PDFont font = PDType1Font.TIMES_ROMAN;
float fontSize = 15.0f;
contentStream.beginText();
The bug itself:
The bug is in the method PDResources.addFont which is called from PDPageContentStream.setFont:
public String addFont(PDFont font)
{
return addFont(font, MapUtil.getNextUniqueKey( fonts, "F" ));
}
It uses the current content of the fonts member variable to determine a unique name for the new font resource on the page at hand. Unfortunately this member variable still can be (and in your case is) uninitialized at this time. This results in the MapUtil.getNextUniqueKey( fonts, "F" ) call to always return F0.
The font variable then is initialized implicitly during the addFont(PDFont, String) call later.
Thus, if unfortunately there already existed a font named F0 on that page, it is replaced by the new font.
Having tested with your PDF this is exactly what happens in your case. As the existing font F0 uses some custom encoding while your replacement font uses a standard one, the text originally written using F0 now looks like gibberish.
The work-around mentioned above implicitly initializes that member variable and, thus, prevents the font replacement.
If you plan to use PDFBox in production for this task, you might want to report the bug.
PS: As mentioned in the comments above there is another bug to observe in context with inherited resources. It should be brought to the PDFBox development's attention, too.
PPS: The issue at hand meanwhile has been fixed in PDFBox for versions 1.8.3 and 2.0.0, cf. PDFBOX-1753.

Categories

Resources