Setting multi-line text to form fields in PDFBox - java

I'm using PDFBox to fill form fields in a PDF using the code below:
PDField nameField = form.getField("name");
if(null != nameField){
nameField.setValue(data.get("name")); // data is a hashmap
nameField.setReadonly(true);
}
The problem is, if the text is long it doesn't split to multiple lines, even though I have enabled the "multi-line" option for the field in the pdf. Do I have to do anything from the code as well to enable this?
Thanks.

Remember to do the following:
Set the resources for the fonts to be used in the text field.
Associate the resources with the PDAcroForm of the PDDocument.
Get a widget for the PDTextField.
Get a rectangle for the widget.
Set the width and the height of the widget's rectangle.
That should solve it. In my case, I use a height of 20 for a single-line text field and a height of 80 for a multi-line text field; you can see them as the last argument of the PDRectangle constructor. The PDRectangle class specifies the position and the dimensions of the widget whose rectangle is set to it, and the text field widget will appear as specified by the PDRectangle.
public static PDTextField addTextField(PDDocument pdDoc,PDAcroForm pda,String value,
String default_value,Boolean multiline,float txtfieldsyposition,float pagesheight)
{
int page = (int) (txtfieldsyposition/pagesheight);
if(page+1> pdDoc.getNumberOfPages())
{
ensurePageCapacity(pdDoc,page+1);//add 1 page to doc if needed
}
PDTextField pdtff = new PDTextField(pda);
PDFont font = new PDType1Font(FontName.TIMES_ROMAN);
String appearance = "/TIMES 10 Tf 0 0 0 rg";
try
{
PDFont font_ = new PDType1Font(FontName.HELVETICA);
PDResources resources = new PDResources();
resources.put(COSName.getPDFName("Helv"), font_);
resources.put(COSName.getPDFName("TIMES"), font);
pda.setDefaultResources(resources);
org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationWidget widget = pdtff.getWidgets().get(0);
PDRectangle rect = null;
if(!multiline)
rect = new PDRectangle(80, (pagesheight - (txtfieldsyposition % pagesheight)), 450, 20);
else
rect = new PDRectangle(80,(pagesheight-(txtfieldsyposition%pagesheight)),450,80);
PDPage pd_page = pdDoc.getPage(page);
System.out.println(pd_page.getBBox().getHeight());
widget.setRectangle(rect);
widget.setPage(pd_page);
PDAppearanceCharacteristicsDictionary fieldAppearance = new PDAppearanceCharacteristicsDictionary(new COSDictionary());
fieldAppearance.setBorderColour(new PDColor(new float[]{0,0,0}, PDDeviceRGB.INSTANCE));
fieldAppearance.setBackground(new PDColor(new float[]{1,1,1}, PDDeviceRGB.INSTANCE)); // DeviceRGB components range from 0 to 1, so white is 1,1,1
widget.setAppearanceCharacteristics(fieldAppearance);
widget.setPrinted(true);
pd_page.getAnnotations().add(widget);
System.out.println("before appearance " +pdtff.getDefaultAppearance());
pdtff.setDefaultAppearance(appearance);
System.out.println("after appearance "+pdtff.getDefaultAppearance());
if(multiline)
{
pdtff.setMultiline(true);
}
pdtff.setDefaultValue("");
pdtff.setValue(value.replaceAll("\u202F"," "));
pdtff.setPartialName( page +""+(int)txtfieldsyposition);
}
catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
catch(IllegalArgumentException e)
{
e.printStackTrace();
}
return pdtff;
}
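The positioning arithmetic in addTextField can be separated from PDFBox and checked on its own. The sketch below is an illustration only: the class WidgetPlacement and its method names are hypothetical, and it assumes, as the answer's code appears to, that the y position is a running offset measured downwards from the top of the first page across equally tall pages.

```java
/** Hypothetical helper mirroring the arithmetic used in addTextField above. */
final class WidgetPlacement {
    // Heights taken from the answer: 20 for a single-line field, 80 for a multi-line one.
    static final float SINGLE_LINE_HEIGHT = 20f;
    static final float MULTI_LINE_HEIGHT = 80f;

    /** Zero-based index of the page that contains the running y position. */
    static int pageIndex(float yPosition, float pageHeight) {
        return (int) (yPosition / pageHeight);
    }

    /** Lower-left y of the widget rectangle; PDF coordinates start at the bottom of the page. */
    static float rectangleY(float yPosition, float pageHeight) {
        return pageHeight - (yPosition % pageHeight);
    }

    /** Height passed as the last PDRectangle argument. */
    static float rectangleHeight(boolean multiline) {
        return multiline ? MULTI_LINE_HEIGHT : SINGLE_LINE_HEIGHT;
    }

    public static void main(String[] args) {
        float pageHeight = 792f; // US Letter height in points, an assumed value
        float y = 1000f;         // running position measured from the top of page 1
        System.out.println("page index:  " + pageIndex(y, pageHeight));
        System.out.println("rect y:      " + rectangleY(y, pageHeight));
        System.out.println("rect height: " + rectangleHeight(true));
    }
}
```

Because PDRectangle(x, y, width, height) takes the lower-left corner, the field extends upwards from the computed y, so a taller multi-line field reaches further towards the top of the page.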

Related

Apache PDFBox: Get alignment and font from a PDAnnotationWidget or PDTextField

I have an existing PDF file with form fields, which can be filled in by a user. These form fields have a font and text alignment which were defined when creating the PDF file.
I use Apache PDFBox to find the form field in the pdf:
PDDocument document = PDDocument.load(pdfFile);
PDAcroForm form = document.getDocumentCatalog().getAcroForm();
PDTextField textField = (PDTextField)form.getField("anyFieldName");
if (textField == null) {
textField = (PDTextField)form.getField("fieldsContainer.anyFieldName");
}
List<PDAnnotationWidget> widgets = textField.getWidgets();
PDAnnotationWidget annotation = null;
if (widgets != null && !widgets.isEmpty()) {
annotation = widgets.get(0);
/* font and alignment needed here */
}
If I set the content of the form field with
textField.setValue("This is the text");
then the text in the form field has the same font and alignment as predefined for this field.
But I need the alignment and the font for a second field (which is not a form field btw.).
How can I find out which alignment (left, center, right) and which font (I need a PDType1Font and its size in points) is defined for this form field? Something like font = annotation.getFont() and alignment = annotation.getAlignment(), neither of which exists.
How to get font and alignment?
Edit:
Where I need the font is this:
PDPageContentStream content = new PDPageContentStream(document, page, AppendMode.APPEND, false);
content.setFont(font, size); /* Here I need font and size from the text field above */
content.beginText();
content.showText("My very nice text");
content.endText();
I need the font for the setFont() call.
To get the PDFont, do this:
String defaultAppearance = textField.getDefaultAppearance(); // usually like "/Helv 12 Tf 0 0 1 rg"
Pattern p = Pattern.compile("\\/(\\w+)\\s(\\d+)\\s.*");
Matcher m = p.matcher(defaultAppearance);
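if (!m.find() || m.groupCount() < 2)
{
// the default appearance has no "/name size" prefix, so there is nothing to look up
throw new IllegalStateException("Unexpected default appearance: " + defaultAppearance);
}
String fontName = m.group(1);
int fontSize = Integer.parseInt(m.group(2));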
PDAnnotationWidget widget = textField.getWidgets().get(0);
PDResources res = widget.getAppearance().getNormalAppearance().getAppearanceStream().getResources();
PDFont fieldFont = res.getFont(COSName.getPDFName(fontName));
if (fieldFont == null)
{
fieldFont = form.getDefaultResources().getFont(COSName.getPDFName(fontName)); // "form" is the PDAcroForm from the question's code
}
System.out.println(fieldFont + "; " + fontSize);
This retrieves the font object from the resource dictionary of the normal appearance stream of the first widget of your field. If the font isn't there, the default resource dictionary of the AcroForm is checked. Note that there are no null checks; you need to add them. At the bottom of the code you'll have a PDFont object and a number.
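The pattern used above only accepts font names made of word characters and whole-number sizes, so a default appearance such as "/Helvetica-Bold 10.5 Tf" would not match. A more tolerant parse might look like the following sketch, which uses only the JDK so it can be tried without PDFBox. The class DefaultAppearance and its members are hypothetical names, not part of any library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical holder for the font name and size found in a default appearance string. */
final class DefaultAppearance {
    // "/<name> <size> Tf": the name runs up to the next whitespace, the size may be fractional.
    private static final Pattern FONT_OP =
            Pattern.compile("/(\\S+)\\s+(\\d+(?:\\.\\d+)?|\\.\\d+)\\s+Tf");

    final String fontName;
    final float fontSize;

    private DefaultAppearance(String fontName, float fontSize) {
        this.fontName = fontName;
        this.fontSize = fontSize;
    }

    /** Extracts the operands of the Tf operator, or throws if there is none. */
    static DefaultAppearance parse(String da) {
        if (da == null) {
            throw new IllegalArgumentException("default appearance is null");
        }
        Matcher m = FONT_OP.matcher(da);
        if (!m.find()) {
            throw new IllegalArgumentException("no Tf operator in: " + da);
        }
        return new DefaultAppearance(m.group(1), Float.parseFloat(m.group(2)));
    }

    public static void main(String[] args) {
        DefaultAppearance da = parse("/Helvetica-Bold 10.5 Tf 0 g");
        System.out.println(da.fontName + "; " + da.fontSize);
    }
}
```

A size of 0 in a default appearance means the viewer auto-sizes the text, so a caller that needs a concrete size for setFont() has to pick its own fallback in that case.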
For the alignment, call getQ() on the text field.
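getQ() returns the field's quadding value as an int; in the PDF specification 0 means left-justified, 1 centered and 2 right-justified. To reuse that alignment for text drawn outside the form, the value can be turned into a start position. The enum below is a sketch with hypothetical names (TextAlignment, fromQuadding, startX); the text width is assumed to have been measured already with the field's font and size.

```java
/** Hypothetical mapping from a field's quadding value to a horizontal start position. */
enum TextAlignment {
    LEFT, CENTER, RIGHT;

    /** Maps the quadding value (0, 1 or 2) to an alignment. */
    static TextAlignment fromQuadding(int q) {
        switch (q) {
            case 0: return LEFT;
            case 1: return CENTER;
            case 2: return RIGHT;
            default: throw new IllegalArgumentException("Unknown quadding value: " + q);
        }
    }

    /** x coordinate at which text of the given width starts inside a box. */
    float startX(float boxLeft, float boxWidth, float textWidth) {
        switch (this) {
            case LEFT: return boxLeft;
            case CENTER: return boxLeft + (boxWidth - textWidth) / 2f;
            default: return boxLeft + boxWidth - textWidth; // RIGHT
        }
    }

    public static void main(String[] args) {
        TextAlignment a = fromQuadding(1);
        System.out.println(a + " starts at x = " + a.startX(100f, 200f, 50f));
    }
}
```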

Setting a text style to underlined in PDFBox

I'm trying to add underlined text to a blank pdf page using PDFBox, but I haven't been able to find any examples online. All questions on stackoverflow point to extracting underlined text, but not creating it. Has this function not been implemented for PDFBox? Looking at the PDFBox documentation, it seems that fonts are pre-rendered as bold, italic, and regular.
For example, Times New Roman Regular is denoted as:
PDFont font = PDType1Font.TIMES_ROMAN.
Times New Roman Bold is denoted as:
PDFont font = PDType1Font.TIMES_BOLD
Italicized is denoted as:
PDFont font = PDType1Font.TIMES_ITALIC
There seems to be no underlined option. Is there any way to underline text, or is this not a feature?
I'm not sure if this is a better alternative or not, but I followed Tilman Hausherr's suggestion and drew a line positioned relative to my text. For instance, I have the following:
public void processPDF(int xOne, int yOne, int xTwo, int yTwo)
{
//create pdf and its contents for one page
PDDocument document = new PDDocument();
File file = new File("hello.pdf");
PDPage page = new PDPage();
//the page has to be added to the document, otherwise the saved file is empty
document.addPage(page);
PDFont font = PDType1Font.HELVETICA_BOLD;
//font size; an assumed value, it was not defined in the original snippet
float largeTitle = 18f;
PDPageContentStream contentStream;
try {
//create content stream
contentStream = new PDPageContentStream(document, page);
//begin to create our text for our page
contentStream.beginText();
contentStream.setFont( font, largeTitle);
//position of text
contentStream.moveTextPositionByAmount(xOne, yOne);
contentStream.drawString("Hello");
contentStream.endText();
//begin to draw our line, half a unit below the text position
contentStream.drawLine(xOne, yOne - .5f, xTwo, yTwo - .5f);
//close the stream, then save and close the document
contentStream.close();
document.save(file);
document.close();
} catch (Exception e) {
e.printStackTrace();
}
}
where our parameters xOne, yOne, xTwo, and yTwo are our locations of the text. The line has us subtract .5 from yOne and yTwo to move it a pinch below our text location, ultimately setting it to look like underlined text.
There may be better ways, but this was the route I went.
I use the function below to underline a string.
public class UnderlineText {
PDFont font = PDType1Font.HELVETICA_BOLD;
float fontSize = 10f;
String str = "Hello";
public static void main(String[] args) {
new UnderlineText().generatePDF(20, 200);
}
public void generatePDF(int sX, int sY)
{
//create pdf and its contents for one page
PDDocument document = new PDDocument();
File file = new File("underlinePdfbox.pdf");
PDPage page = new PDPage();
PDPageContentStream contentStream;
try {
document.addPage(page);
//create content stream
contentStream = new PDPageContentStream(document, page);
//being text for our page
contentStream.beginText();
contentStream.setFont( font, fontSize);
contentStream.newLineAtOffset(sX, sY);
contentStream.showText(str);
contentStream.endText();
//Draw Underline
drawLine(contentStream, str, 1, sX, sY, -2);
//close and save document
contentStream.close();
document.save(file);
document.close();
} catch (Exception e) {
e.printStackTrace();
}
}
public void drawLine(PDPageContentStream contentStream, String text, float lineWidth, float sx, float sy, float linePosition) throws IOException {
//Calculate String width
float stringWidth = fontSize * font.getStringWidth(text) / 1000; // measure the text that was passed in, not the field
float lineEndPoint = sx + stringWidth;
//begin to draw our line
contentStream.setLineWidth(lineWidth);
contentStream.moveTo(sx, sy + linePosition);
contentStream.lineTo(lineEndPoint, sy + linePosition);
contentStream.stroke();
}
}
drawLine is a function I wrote to draw a line for a specific string. You can adjust the line through the linePosition argument.
A negative value places the line below the text position (an underline), while a positive value moves it up, giving a strike-through or an overline. For the code above, for example, -2 gives an underline, 2 a strike-through and 10 an overline.
You can also control the thickness of the line through lineWidth.
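The geometry behind drawLine can be checked without PDFBox. The sketch below is a minimal illustration with hypothetical names (UnderlineGeometry, of); it assumes the string width is given in 1/1000 text-space units, which is what the division by 1000 in the code above relies on. The width of 2500 units in main is a made-up figure, not a real font metric.

```java
/** Hypothetical value object holding the end points of an underline. */
final class UnderlineGeometry {
    final float startX;
    final float endX;
    final float y;

    private UnderlineGeometry(float startX, float endX, float y) {
        this.startX = startX;
        this.endX = endX;
        this.y = y;
    }

    /**
     * @param widthUnits string width in 1/1000 text-space units
     * @param fontSize   font size in points
     * @param textX      x where the text starts
     * @param baselineY  y of the text baseline
     * @param offset     vertical shift of the line; negative values go below the baseline
     */
    static UnderlineGeometry of(float widthUnits, float fontSize,
                                float textX, float baselineY, float offset) {
        float width = fontSize * widthUnits / 1000f; // convert to points
        return new UnderlineGeometry(textX, textX + width, baselineY + offset);
    }

    public static void main(String[] args) {
        // 2500 units is an illustrative width only
        UnderlineGeometry u = of(2500f, 10f, 20f, 200f, -2f);
        System.out.println("line from (" + u.startX + ", " + u.y + ") to (" + u.endX + ", " + u.y + ")");
    }
}
```

The two x values and the y value correspond to the arguments of the moveTo and lineTo calls in drawLine.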
Try this answer:
highlight text using pdfbox when it's location in the pdf is known
This method uses PDAnnotationTextMarkup, which has four subtypes:
/**
* The types of annotation.
*/
public static final String SUB_TYPE_HIGHLIGHT = "Highlight";
/**
* The types of annotation.
*/
public static final String SUB_TYPE_UNDERLINE = "Underline";
/**
* The types of annotation.
*/
public static final String SUB_TYPE_SQUIGGLY = "Squiggly";
/**
* The types of annotation.
*/
public static final String SUB_TYPE_STRIKEOUT = "StrikeOut";
Hope it helps

itext 5.5 Portrait orientation not working from new page

I run the following code for an HTML file with Tables.
I am able to convert HTML to PDF for first page with margins all sides.
But when I call document.newPage(); and then document.setPageSize();, it doesn't work: the margins are not present.
The PDF is borderless, without any margins.
Please advise.
Code:
public class Potrait_ParseHtmlObjects {
public static final String HTML = "C:/h.html";
public static final String DEST = "C:/test33.pdf";
public void createPdf(String file) {
// Parse HTML into Element list
try{
XMLWorkerHelper helper = XMLWorkerHelper.getInstance();
// CSS
CSSResolver cssResolver = helper.getDefaultCssResolver(true);
CssFile cssFile = helper.getCSS(new FileInputStream("D:\\Itext_Test\\Test\\src\\test.css"));
cssResolver.addCss(cssFile);
// HTML
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
htmlContext.autoBookmark(false);
//mycode starts
FontFactory.registerDirectories();
//mycode ends
// Pipelines
ElementList elements = new ElementList();
ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlContext, end);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
// XML Worker
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
//mycode starts
p.parse(new FileInputStream(HTML),Charset.forName("UTF-8"));//changed for Charset Encoding
// step 1
Document document = new Document();
// step 2
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(file));
writer.setInitialLeading(12.5f);
// step 3
document.open();
// step 4
Rectangle left = new Rectangle(33,33,550,770);
document.setPageSize(left);
System.out.println("1"+document.getPageSize());
ColumnText column = new ColumnText(writer.getDirectContent());
column.setSimpleColumn(left);
int runDirection = PdfWriter.RUN_DIRECTION_LTR;
column.setRunDirection(runDirection);
int status = ColumnText.START_COLUMN;
for (Element e : elements) {
if (e instanceof PdfPTable) {
PdfPTable table = (PdfPTable) e;
for (PdfPRow row : table.getRows()) {
for (PdfPCell cell : row.getCells()) {
if(cell!=null)
cell.setRunDirection(runDirection);
}
}
}
if (ColumnText.isAllowedElement(e)) {
column.addElement(e);
status = column.go();
while (ColumnText.hasMoreText(status)) {
Rectangle left1 = new Rectangle(50,50,500,700);
document.newPage();
document.setPageSize(left1);
column.setSimpleColumn(left1);
status = column.go();
}
}
}
// step 5
document.close();
}catch(Exception ex)
{ex.printStackTrace();}
}
/**
* Main method
*/
public static void main(String[] args) throws IOException, DocumentException {
File file = new File(DEST);
file.getParentFile().mkdirs();
new Potrait_ParseHtmlObjects().createPdf(DEST);
}
}
You initialize all page parameters when you do document.newPage(), hence changing the page size or margins doesn't make sense after triggering document.newPage(). If you want a different page size (or orientation, or margins), you need to set the values for the page size, orientation and margins before invoking document.newPage() (and before document.open() if you want to change the first page).
For instance: in your case, you should create your document like this:
Document document = new Document(new Rectangle(33,33,550,770));
And you should change the page size like this:
document.setPageSize(left1);
document.newPage();
column.setSimpleColumn(left1);
You don't have any margins because you use the same Rectangle for the page size as for the column. You are creating a PDF of which the coordinate of the lower-left corner is not equal to (0, 0). This isn't illegal, but it's unusual. My guess is that you want to do something like this:
document.setPageSize(new Rectangle(0, 0, 550, 750));
document.newPage();
column.setSimpleColumn(new Rectangle(50,50,500,700));
This will result in a page size of 7.64 by 10.42 inches (550 by 750 pt) and you'll have a margin of 0.69 inches on every side (50 pt).
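The figures follow from the default PDF unit of 1/72 inch per point and from the difference between the page rectangle and the column rectangle. The sketch below only checks that arithmetic; PageMetrics and its methods are hypothetical names, and each rectangle is written as {llx, lly, urx, ury} in the same order as the arguments of iText's Rectangle constructor.

```java
import java.util.Arrays;

/** Hypothetical helper for checking page sizes and margins in points. */
final class PageMetrics {
    static final double POINTS_PER_INCH = 72.0;

    static double toInches(double points) {
        return points / POINTS_PER_INCH;
    }

    /** Margins as {left, bottom, right, top}; both rectangles are {llx, lly, urx, ury}. */
    static float[] margins(float[] page, float[] column) {
        return new float[] {
            column[0] - page[0],
            column[1] - page[1],
            page[2] - column[2],
            page[3] - column[3]
        };
    }

    public static void main(String[] args) {
        float[] page = {0, 0, 550, 750};
        float[] column = {50, 50, 500, 700};
        System.out.println("margins (pt): " + Arrays.toString(margins(page, column)));
        System.out.println(String.format("page: %.2f x %.2f in", toInches(550), toInches(750)));

        // The question used the same rectangle for page and column, hence no margins.
        float[] same = {33, 33, 550, 770};
        System.out.println("question's margins (pt): " + Arrays.toString(margins(same, same)));
    }
}
```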

How to make text invisible in an existing PDF

I want to make all the text in an existing PDF transparent.
Option 1: select all the text, find a color property and change it to "colorless"
Or, if there is no such property
Option 2: Parse the page content stream and all Form XObjects for that page, detect text blocks (BT/ET), and set the render mode to invisible.
This seems to be a complex operation.
Here is my example file
The following code generates the PDF (the example file above):
Document document = new Document(new Rectangle(width, height));
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(filename));
document.open();
PdfContentByte picCanvas = null;
PdfContentByte txtCanvas = null;
if (isUnderPic) {
txtCanvas = writer.getDirectContentUnder();
picCanvas = writer.getDirectContent();
} else {
txtCanvas = writer.getDirectContent();
picCanvas = writer.getDirectContentUnder();
}
BaseFont bf = null;
if (null != pageList) {
int[] dpi = { 0, 0 };
if (dpiType == 1) {
dpi[0] = 300;
dpi[1] = 300;
} else if (dpiType == 2) {
dpi[0] = 600;
dpi[1] = 600;
}
for (int i = 0; i < pageList.size(); i++) {
PDFPage page = pageList.get(i);
Image pageImage = null;
if (pdfType == 3) {
pageImage = Image.getInstance(page.getBinImage());
} else {
pageImage = Image.getInstance(page.getOriImage());
}
if (pageImage.getWidth() > 0) {
pageImage.scaleAbsolute(page.getWidth(), page.getHeight());
}
pageImage.setAbsolutePosition(0, 0);
picCanvas.addImage(pageImage);
if (pdfType == 2 || pdfType == 3) {
for (PageElement ele : page.getElementList()) {
if (ele.getType().equals(PDFConstant.ElementType.PDF_ELEMENT_CHAR)) {
txtCanvas.beginText();
if (isColor) {
txtCanvas.setTextRenderingMode(PdfContentByte.TEXT_RENDER_MODE_FILL);
txtCanvas.setColorFill(BaseColor.RED);
} else {
txtCanvas.setTextRenderingMode(PdfContentByte.TEXT_RENDER_MODE_INVISIBLE);
}
String font = ele.getFont();
try {
bf = fonts.get(font);
if (null == bf) {
bf = BaseFont.createFont(font, "UniGB-UCS2-H", BaseFont.NOT_EMBEDDED);
fonts.put(font, bf);
}
} catch (Exception e) {
bf = BaseFont.createFont("STSong-Light", "UniGB-UCS2-H", BaseFont.NOT_EMBEDDED);
fonts.put(font, bf);
}
txtCanvas.setFontAndSize(bf, ele.getFontSize());
txtCanvas.setTextMatrix(ele.getPageX(), ele.getPageY(page.getRcInPage()));
txtCanvas.showText(ele.getCode());
txtCanvas.endText();
}
}
}
if (StringUtils.isNotBlank(cutPath)) {
for (PageElement ele : page.getElementList()) {
if (ele.getType().equals(PDFConstant.ElementType.PDF_ELEMENT_PIC) && StringUtils.isNotBlank(ele.getCutPicSrc())) {
ImageTools.cutPic(ele.getRcInImage(), page.getOriImage(), ele.getCutPicSrc(), dpi);
}
}
}
if (pdfType == 3) {
logger.debug("pdfType == 3");
for (PageElement ele : page.getElementList()) {
if (ele.getType().equals(PDFConstant.ElementType.PDF_ELEMENT_PIC) && StringUtils.isNotBlank(ele.getCutPicSrc())) {
if (new File(ele.getCutPicSrc()).exists()) {
Image cutCover = Image.getInstance(ImageTools.drawImage((int) ele.getWidth(), (int) ele.getHeight()));
if (cutCover.getWidth() > 0) {
cutCover.scaleAbsolute(ele.getWidth(), ele.getHeight());
}
cutCover.setAbsolutePosition(ele.getPageX(), ele.getPageY(page.getRcInPage()));
picCanvas.addImage(cutCover);
Image pic = Image.getInstance(ele.getCutPicSrc());
if (pic.getWidth() > 0) {
pic.scaleAbsolute(ele.getWidth(), ele.getHeight());
}
pic.setAbsolutePosition(ele.getPageX(), ele.getPageY(page.getRcInPage()));
picCanvas.addImage(pic);
}
}
}
}
if (i + 1 < pageList.size()) {
document.setPageSize(new Rectangle(pageList.get(i + 1).getWidth(), pageList.get(i + 1).getHeight()));
} else {
document.setPageSize(new Rectangle(pageList.get(i).getWidth(), pageList.get(i).getHeight()));
}
document.newPage();
}
}
document.close();
I've taken a look at your PDF and I see that the PDF is a scanned image. The text isn't really text: it consists of an image. Your question is invalid because it assumes that the text consists of vector data (defined using PDF syntax, such as BT and ET). In reality, the text is a bunch of pixels and any pixel doesn't know whether it belongs to a text glyph or an image. In short: you're using the wrong approach. You are trying to solve a problem using PDF software whereas you should be using a tool that manipulates raster images.
This is the image I extracted from the PDF:
The OP claims that there are two layers: one with an image, one with text. That may very well be true, but the image also contains rasterized text and it is impossible to remove that text from the image by changing the PDF syntax.
You may be able to cover the text if you know the coordinates, but that will largely depend on the accuracy of the OCR operation.
If your requirement is not to cover the text in the image, but the text of the vector layer, it's sufficient to add the syntax that adds the image after the syntax that adds the vector text. If the image is opaque, it will cover all the text. This is done in the RepeatImage example:
PdfReader reader = new PdfReader(src);
// We assume that there's a single large picture on the first page
PdfDictionary page = reader.getPageN(1);
PdfDictionary resources = page.getAsDict(PdfName.RESOURCES);
PdfDictionary xobjects = resources.getAsDict(PdfName.XOBJECT);
PdfName imgName = xobjects.getKeys().iterator().next();
Image img = Image.getInstance((PRIndirectReference)xobjects.getAsIndirectObject(imgName));
img.setAbsolutePosition(0, 0);
img.scaleAbsolute(reader.getPageSize(1));
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
stamper.getOverContent(1).addImage(img);
stamper.close();
reader.close();
Take a look at the resulting PDF; now you can still select the vector text, but it's no longer visible.

Don't know how to create a centered "image + text" watermark in pdf files with iText (Java)

I'm using the iText library, and I'm trying to add a watermark at the bottom of the page. The watermark is simple: it has to be centered and has an image on the left and text on the right.
At this point, I have the image AND the text in a png format. I can calculate the position where I want to put the image (centered) calculating the page size and image size, but now I want to include the text AS text (better legibility, etc.).
Can I embed the image and the text in some component and then calculate the position like I'm doing now? Any other solutions or ideas?
Here is my current code:
try {
PdfReader reader = new PdfReader("example.pdf");
int numPages = reader.getNumberOfPages();
PdfStamper stamp = new PdfStamper(reader, new FileOutputStream("pdfWithWatermark.pdf"));
int i = 0;
Image watermark = Image.getInstance("watermark.png");
PdfContentByte addMark;
while (i < numPages) {
i++;
float x = reader.getPageSizeWithRotation(i).getWidth() - watermark.getWidth();
watermark.setAbsolutePosition(x/2, 15);
addMark = stamp.getUnderContent(i);
addMark.addImage(watermark);
}
stamp.close();
}
catch (Exception i1) {
logger.info("Exception adding watermark.");
i1.printStackTrace();
}
Thank you in advance!
You'd better check this:
import com.lowagie.text.*;
import java.io.*;
import com.lowagie.text.pdf.*;
import java.util.*;
class pdfWatermark
{
public static void main(String args[])
{
try
{
PdfReader reader = new PdfReader("text.pdf");
int n = reader.getNumberOfPages();
// Create a stamper that will copy the document to a new file
PdfStamper stamp = new PdfStamper(reader,
new FileOutputStream("text1.pdf"));
int i = 1;
PdfContentByte under;
PdfContentByte over;
Image img = Image.getInstance("watermark.jpg");
BaseFont bf = BaseFont.createFont(BaseFont.HELVETICA,
BaseFont.WINANSI, BaseFont.EMBEDDED);
img.setAbsolutePosition(200, 400);
while (i <= n) // pages are numbered 1..n, so include the last page
{
// Watermark under the existing page
under = stamp.getUnderContent(i);
under.addImage(img);
// Text over the existing page
over = stamp.getOverContent(i);
over.beginText();
over.setFontAndSize(bf, 18);
over.showText("page " + i);
over.endText();
i++;
}
stamp.close();
}
catch (Exception de)
{
de.printStackTrace(); // don't swallow the error silently
}
}
}
(source)
It's a bit ugly, but can't you add the image and the text to a table and then center it?
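An alternative to a table is to extend the centering calculation from the question so that it covers the image and the text together. The sketch below shows only that arithmetic, with hypothetical names (WatermarkLayout, center); it assumes the text width has already been measured, for example with BaseFont.getWidthPoint(text, fontSize) in iText 5, and that the gap between image and text is a value of your choosing.

```java
/** Hypothetical layout result: x positions for an image followed by text, centered as one unit. */
final class WatermarkLayout {
    final float imageX;
    final float textX;

    private WatermarkLayout(float imageX, float textX) {
        this.imageX = imageX;
        this.textX = textX;
    }

    /**
     * @param pageWidth  width of the page (use the rotated page size, as in the question)
     * @param imageWidth width of the image as it will be drawn
     * @param gap        horizontal space between image and text
     * @param textWidth  width of the text at the chosen font and size
     */
    static WatermarkLayout center(float pageWidth, float imageWidth, float gap, float textWidth) {
        float total = imageWidth + gap + textWidth;
        float startX = (pageWidth - total) / 2f;
        return new WatermarkLayout(startX, startX + imageWidth + gap);
    }

    public static void main(String[] args) {
        // Illustrative numbers only: a 595 pt wide page, 40 pt image, 5 pt gap, 150 pt of text.
        WatermarkLayout layout = center(595f, 40f, 5f, 150f);
        System.out.println("image x = " + layout.imageX + ", text x = " + layout.textX);
    }
}
```

imageX would then go into watermark.setAbsolutePosition(imageX, 15), and textX into the text matrix of the content that draws the text.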
