We have a program that modifies a PDF for mailing. We noticed an issue with DocuSign envelope fields and signatures: they do not stay on the PDF when we modify it. I have tried everything I can think of in PDFBox.
The only workarounds I have found are Foxit's flatten-signature option when printing and Adobe's "Save as PDF"; both create a flattened PDF that works. I cannot render the whole PDF to an image because I need the PDF text fields to be evaluated programmatically.
Is there any way to create the same flattening that Foxit and Adobe do, but with PDFBox? I am so stumped :(
Not sure if this helps, but I am able to access the item in the document this way:
PDDocument doc = PDDocument.load( myFile );
PDPageTree allPages = doc.getDocumentCatalog().getPages();
PDPage page1 = allPages.get(1);
COSDictionary pageDict = page1.getCOSObject();
COSDictionary newPageDict = new COSDictionary(pageDict);
COSDictionary test = newPageDict.getCOSDictionary(COSName.RESOURCES);
test = test.getCOSDictionary(COSName.XOBJECT);
test = test.getCOSDictionary(COSName.F);
test = test.getCOSDictionary(COSName.RESOURCES);
test = test.getCOSDictionary(COSName.XOBJECT);
The item exists under COSName{X0}, but PDFBox does not appear to be able to access it, so I cannot flatten it. I'd like to iterate through the entire document for any non-identifiable COSNames and render each one to an image (rendering does work), then use that image. Is there any way to do this?
The problem:
I have a PDF that I created using Adobe Acrobat Pro. Inside the PDF I have several form buttons ("fields"), and I want to dynamically attach a web URL to each of them so that clicking the button opens the link.
What I've tried so far:
PDDocument doc = PDDocument.load(pdf);
PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
PDPushButton b1 = (PDPushButton) acroForm.getField("Button1");
PDActionURI uri = new PDActionURI();
uri.setURI("https://stackoverflow.com");
PDFormFieldAdditionalActions actions = new PDFormFieldAdditionalActions();
actions.setF(uri);
b1.setActions(actions);
and:
PDDocument doc = PDDocument.load(pdf);
PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
PDPushButton b1 = (PDPushButton) acroForm.getField("Button1");
PDActionURI uri = new PDActionURI();
uri.setURI("https://stackoverflow.com");
b1.getWidget().setAction(uri);
I haven't found any documentation on how to do it, so I tried the above code based on reading the PDFBox source code and what I understood from it.
I've also tried an action called PDActionGoTo, but it has a 'setDestination' method that takes a 'PDDestination' object, and I didn't understand how to initialize that with a URL.
I also found someone with a similar problem, except that he created the button in code instead of getting it from the PDF file. I didn't manage to adapt it, but here is the code (he wrote that it eventually worked for him):
PDPushButton pb = new PDPushButton(acroForm);
pb.setPartialName("sbtn");
COSDictionary cosPush = pb.getCOSObject();
COSDictionary cosA = new COSDictionary();
cosPush.setInt(COSName.F, 4);
cosPush.setItem(COSName.A, cosA);
cosPush.setItem(COSName.P, page);
cosA.setInt(COSName.FLAGS, 256);
cosA.setName(COSName.S, "SubmitForm");
COSDictionary cosF = new COSDictionary();
cosA.setItem(COSName.F, cosF);
cosF.setString(COSName.F, "https://stackoverflow.com");
cosF.setName("FS", "URL");
// add the field to the acroform
acroForm.getFields().add(pb);
To see the original question go here
Also, does anyone know what constants such as COSName.F, COSName.A, etc. mean? Are they part of the PDF AcroForm specification? I tried to look them up but didn't find any helpful information, except that everyone who has asked about PDFBox before somehow already seems to know.
About the environment:
Java 8
PDFBox version 2.0.22
Adobe Acrobat Pro version 2022.001.20112
Thanks for the help :)
What I am trying to achieve is to replace text in a PDF file. I have the following code:
PdfReader reader = new PdfReader("test.pdf");
PdfDictionary dict = reader.getPageN(1);
PdfObject object = dict.getDirectObject(PdfName.CONTENTS);
if (object instanceof PRStream)
{
    PRStream stream = (PRStream) object;
    byte[] data = PdfReader.getStreamBytes(stream);
    System.out.println(new String(data));
    stream.setData(new String(data).replace("application", "HELLO WORLD").getBytes());
}
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream("test-output.pdf"));
stamper.close();
reader.close();
When I print the data (System.out.println(new String(data))), "application" shows up as "ap)-4(plica)-3(tion", which is why the replacement fails. Any idea, or is there another method that can achieve what I am trying to do?
You will not be able to do this with iText.
Believe me, this is one of the most frustrating discoveries about PDFs: you can build them with iText, but you cannot go back later and replace text with something else, as you have in your example.
There really is not much you can do about it. Once text is there, you can't modify it.
All that notwithstanding, you can usually ADD new content (text, images, etc.) to an existing PDF. So... if you can alter the universe slightly and create a PDF with empty space in the correct size, you can go back later and use the PdfStamper class to "stamp" on another layer of graphical content.
More on this can be found in the iText documentation, and in this fine question:
How to add Content to a PDF using iText PdfStamper
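For what it's worth, the "ap)-4(plica)-3(tion" in the question's output looks like a TJ array: pieces of literal text in parentheses with kerning adjustments between them. A minimal sketch of recovering the visible text from one such array, using only the JDK, might look like this. The class and method names (TjArrayText, visibleText) are made up for illustration, and the sketch assumes a simple single-byte font encoding with literal strings; hex strings, subset fonts with custom encodings, and escapes such as \n are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of iText: joins the literal strings of one TJ array.
final class TjArrayText {
    // One literal string "( ... )"; a backslash escapes the character that follows it.
    private static final Pattern LITERAL =
            Pattern.compile("\\(((?:\\\\.|[^\\\\()])*)\\)");

    static String visibleText(String tjArray) {
        StringBuilder text = new StringBuilder();
        Matcher m = LITERAL.matcher(tjArray);
        while (m.find()) {
            // Drop the backslash of simple escapes such as \( \) and \\
            text.append(m.group(1).replaceAll("\\\\(.)", "$1"));
        }
        // The numbers between the strings are kerning offsets and are ignored.
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(visibleText("[(ap)-4(plica)-3(tion)] TJ")); // prints "application"
    }
}
```

This only helps with finding a word. Writing a replacement back into the content stream still has to deal with fonts, encodings and kerning, which is why stamping new content is usually the more practical route.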
I have a PDF previously created with FOP, and I need to add some named destinations to it so that another program can later open and navigate the document with the Adobe PDF open parameters, namely the #nameddest=destination_name parameter.
I don't need to add bookmarks or other dynamic content, just some destinations with a name, thus injecting a /Dests collection with the defined names into the resulting PDF.
I use iText 5.3.0 and I have read chapter 7 of iText in Action (2nd edition), but I still cannot figure out how to add the destinations and then use them with #nameddest in a browser.
I'm reading and manipulating the document with PdfReader and PdfStamper. I already know in advance where to put every destination after having parsed the document with a customized Listener and a PdfContentStreamProcessor, searching for a specific text marker on each page.
This is a shortened version of my code:
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, new BufferedOutputStream(dest));
// search text markers for destinations, page by page
// (iText page numbers are 1-based, so the last page is getNumberOfPages())
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
    // get a list of markers for this page, as obtained with a custom Listener and a PdfContentStreamProcessor
    List<MyDestMarker> markers = ((MyListener) listener).getMarkersForThisPage();
    // add a destination for every text marker in the current page
    Iterator<MyDestMarker> it = markers.iterator();
    while (it.hasNext()) {
        MyDestMarker marker = it.next();
        String name = marker.getName();
        String x = marker.getX();
        String y = marker.getY();
        // create a new destination
        PdfDestination dest = new PdfDestination(PdfDestination.FITH, y); // or XYZ
        // add as a named destination -> does not work, only for new documents?
        stamper.getWriter().addNamedDestination(name, i /* current page */, dest);
        // alternatives
        PdfContentByte content = stamper.getOverContent(i);
        content.localDestination(name, dest); // doesn't work either -> no named dest found
        // add dest name to a list for later use with Pdf Open Parameters
        destinations.add(name);
    }
}
stamper.close();
reader.close();
I also tried creating a PdfAnnotation with PdfFormField.createLink() but still, I just manage to get the annotation but with no named destination defined it does not work.
Any solution for this? Do I need to add some "ghost" content over the existing one with Chunks or something else?
Thanks in advance.
edit 01-27-2016:
I recently found an answer to my question in the examples section of iText website, here.
Unfortunately the example provided does not work for me if I test it with a pdf without destinations previously defined in it, as it is the case with the source primes.pdf which already contains a /Dests array. This behaviour appears to be consistent with the iText code, since the writer loads the destinations in a map attribute of PdfDocument which is not "inherited" by the stamper on closing.
That said, I got it working using the method addNamedDestination() of PdfStamper added with version 5.5.7; this method loads a named destination in a local map attribute of the class which is later processed and consolidated in the document when closing the stamper.
This approach raised a new issue, though: navigation with PDF open parameters (#, #nameddest=) works fine with IE but not with Chrome v47 (and probably not with Firefox either). I tracked the problem down to the order in which the destination names are defined and referenced inside the document. The stamper uses a HashMap as the container for the destinations, which of course does not guarantee the order of its entries, and for whatever reason Chrome refuses to recognise destinations that are not listed in "natural" order. So the only way I got it to work was to replace the namedDestinations HashMap with a naturally ordered TreeMap.
Hope this helps others with the same issue.
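To illustrate the ordering point with plain JDK collections (this is not iText code): copying the entries into a TreeMap yields the names in natural String order, regardless of how the source HashMap happens to iterate. The class name, the destination names and the Integer page values are placeholders standing in for the stamper's internal map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; the map values stand in for real destination objects.
final class DestinationOrder {
    // Returns the destination names in natural (lexicographic) String order.
    static List<String> sortedNames(Map<String, Integer> destinations) {
        return new ArrayList<>(new TreeMap<>(destinations).keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> dests = new HashMap<>();
        dests.put("chapter2", 7);
        dests.put("appendix", 12);
        dests.put("chapter1", 3);
        // HashMap iteration order is unspecified; the TreeMap copy is always sorted.
        System.out.println(sortedNames(dests)); // prints [appendix, chapter1, chapter2]
    }
}
```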
I had the same need in a previous project: I had to display and navigate a PDF document with the acrobat.jar viewer, and to navigate I needed named destinations in the PDF. I looked around the web for a possible solution, but had no luck. Then this idea struck me.
I tried recreating the existing PDF with iText, going through each page and adding a local destination to each one, and I got what I wanted. Below is a snippet of my code:
OutputStream outputStream = new FileOutputStream(new File(filename));
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, outputStream);
document.open();
PdfContentByte cb = writer.getDirectContent();
PdfOutline pol = cb.getRootOutline();
PdfOutline oline1 = null;
InputStream in1 = new FileInputStream(new File(inf1));
PdfReader reader = new PdfReader(in1);
for (int i = 1; i <= reader.getNumberOfPages(); i++)
{
    document.newPage();
    document.setMargins(0.0F, 18.0F, 18.0F, 18.0F);
    PdfImportedPage page = writer.getImportedPage(reader, i);
    // name each page's local destination after its page number
    document.add(new Chunk(String.valueOf(i)).setLocalDestination(String.valueOf(i)));
    System.out.println(i);
    cb.addTemplate(page, 0.0F, 0.0F);
}
outputStream.flush();
document.close();
outputStream.close();
Thought it would help you.
I am trying to generate a PDF document from a *.doc document.
So far, thanks to Stack Overflow, I have succeeded in generating it, but with some problems.
My sample code below generates the PDF without formatting or images, just the text.
The source document includes blank spaces and images, which are not included in the PDF.
Here is the code:
in = new FileInputStream(sourceFile.getAbsolutePath());
out = new FileOutputStream(outputFile);
// WordExtractor returns the document's plain text only
WordExtractor wd = new WordExtractor(in);
String text = wd.getText();
Document pdf = new Document(PageSize.A4);
PdfWriter.getInstance(pdf, out);
pdf.open();
pdf.add(new Paragraph(text));
// closing the document finishes writing the PDF
pdf.close();
docx4j includes code for creating a PDF from a docx using iText. It can also use POI to convert a doc to a docx.
There was a time when we supported both methods equally (as well as PDF via XHTML), but we decided to focus on XSL-FO.
If it's an option, you'd be much better off using docx4j to convert a docx to PDF via XSL-FO and FOP.
Use it like so:
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new java.io.File(inputfilepath));
// Set up font mapper
Mapper fontMapper = new IdentityPlusMapper();
wordMLPackage.setFontMapper(fontMapper);
// Example of mapping missing font Algerian to installed font Comic Sans MS
PhysicalFont font
    = PhysicalFonts.getPhysicalFonts().get("Comic Sans MS");
fontMapper.getFontMappings().put("Algerian", font);
org.docx4j.convert.out.pdf.PdfConversion c
    = new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
    // = new org.docx4j.convert.out.pdf.viaIText.Conversion(wordMLPackage);
OutputStream os = new java.io.FileOutputStream(inputfilepath + ".pdf");
c.output(os);
Update July 2016
As of docx4j 3.3.0, Plutext's commercial PDF renderer is docx4j's default option for docx to PDF conversion. You can try an online demo at converter-eval.plutext.com
If you want to use the existing docx to XSL-FO to PDF (or other target supported by Apache FOP) approach, then just add the docx4j-export-FO jar to your classpath.
Either way, to convert docx to PDF, you can use the Docx4J facade's toPDF method.
The old docx to PDF via iText code can be found at https://github.com/plutext/docx4j-export-FO/.../docx4j-extras/PdfViaIText/
WordExtractor just grabs the plain text, nothing else. That's why all you're seeing is the plain text.
What you'll need to do is get each paragraph individually, then grab each run, fetch the formatting, and generate the equivalent in PDF.
One option may be to find some code that turns XHTML into a PDF. Then, use Apache Tika to turn your word document into XHTML (it uses POI under the hood, and handles all the formatting stuff for you), and from the XHTML on to PDF.
Otherwise, if you're going to do it yourself, take a look at the code in Apache Tika for parsing word files. It's a really great example of how to get at the images, the formatting, the styles etc.
I have successfully used Apache FOP to convert a 'WordML' document to PDF. WordML is the Office 2003 way of saving a Word document as XML. XSLT stylesheets can be found on the web to transform this XML to XSL-FO, which in turn can be rendered by FOP into PDF (among other outputs).
It's not so different from the solution plutext offered, except that it doesn't read a .doc document, whereas docx4j apparently does. If your requirements are flexible enough to have WordML style documents as input, this might be worth looking into.
Good luck with your project!
Wim
Use OpenOffice/LibreOffice and JODConverter
This also mostly works for .doc to .docx. There are problems with graphics that I have not yet worked out, though.
private static void transformDocXToPDFUsingJOD(File in, File out)
{
    OfficeDocumentConverter converter = new OfficeDocumentConverter(officeManager);
    DocumentFormat pdf = converter.getFormatRegistry().getFormatByExtension("pdf");
    converter.convert(in, out, pdf);
}
private static OfficeManager officeManager;
@BeforeClass
public static void setupStatic() throws IOException {
    /*officeManager = new DefaultOfficeManagerConfiguration()
    .setOfficeHome("C:/Program Files/LibreOffice 3.6")
    .buildOfficeManager();
    */
    // connect to an office instance that is already listening on port 8100
    officeManager = new ExternalOfficeManagerConfiguration().setConnectOnStart(true).setPortNumber(8100).buildOfficeManager();
    officeManager.start();
}
@AfterClass
public static void shutdownStatic() throws IOException {
    officeManager.stop();
}
You need to be running LibreOffice as a server to make this work.
From the command line you can do this using:
"C:\Program Files\LibreOffice 3.6\program\soffice.exe" -accept="socket,host=0.0.0.0,port=8100;urp;LibreOffice.ServiceManager" -headless -nodefault -nofirststartwizard -nolockcheck -nologo -norestore
Another option I came across recently is using the OpenOffice (or LibreOffice) API (see here). I have not been able to get into this but it should be able to open documents in various formats and output them in a pdf format. If you look into this, let me know how it worked!
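If the LibreOffice server from the JODConverter answer has to be started from Java rather than by hand, the same argument list could be assembled for a ProcessBuilder. This is only a sketch: SofficeCommand and build are hypothetical names, the flags are copied from the command line quoted above (LibreOffice 3.6), and newer LibreOffice releases may expect double-dash options instead.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that assembles the soffice server command shown earlier.
final class SofficeCommand {
    static List<String> build(String sofficePath, int port) {
        return Arrays.asList(
                sofficePath,
                // no shell is involved, so the value needs no surrounding quotes
                "-accept=socket,host=0.0.0.0,port=" + port + ";urp;LibreOffice.ServiceManager",
                "-headless", "-nodefault", "-nofirststartwizard",
                "-nolockcheck", "-nologo", "-norestore");
    }

    public static void main(String[] args) {
        List<String> command = build("soffice", 8100);
        System.out.println(String.join(" ", command));
        // To actually launch the server (assumes soffice is on the PATH):
        // new ProcessBuilder(command).inheritIO().start();
    }
}
```

The port passed here has to match the one given to setPortNumber in the converter setup.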
I'm using iText to dynamically generate PDF docs. Now I'm trying to dynamically create a barcode in this PDF. Adobe LiveCycle has a built-in barcode function: you can just drag the barcode text box onto the page and it's created.
Problem:
I placed the barcode field in the PDF, then passed a number to the barcode field from the JSP page. But only the number appears; the barcode lines never display.
The number, 20099002, is visible on the PDF doc, but the barcode lines fail to appear. I tried several other barcode options in LiveCycle but they all give the same result.
OurJavaPage.java
public class ExampleForm extends BaseOutput {
    private static final Log LOG = LogFactory.getLog(ExampleForm.class);
    public OutputStream generate() throws IOException, DocumentException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        PdfReader reader;
        reader = new PdfReader(BASEDIR + "MailingExample.pdf");
        PdfStamper stamper = new PdfStamper(reader, baos);
        Participant participantHome = home.getParticipant();
        Set<Location> homeLocs = participantHome.getLocations();
        final AcroFields form = stamper.getAcroFields();
        setFormField(form, "addrNumber[0]", addrMaster.getStreetNum());
        setFormField(form, "dateMiddle[0]", formatDate("MM-dd-yyyy", new Date()));
        // *********** Here's the problem *****************************
        setFormField(form, "Code128ABarcode1[0]", "20099002");
        // ************************************************************
        debugAcrobatForm("ExampleForm", form);
        stamper.setFormFlattening(true);
        stamper.close();
        return baos;
    }
}
Operating System: Linux
Programming: Java, .jsp, iText
Software: Adobe Live Cycle Designer ES 8.1
Problem solved!!!
I contacted iText and they suggested that I change this line.
From:
stamper.setFormFlattening(true);
To:
stamper.setFormFlattening(false);
It worked.
Does your software embed the barcode as a graphic or as a font representation of characters?
If the latter, is it embedding the font into the PDF?
Did you ask on the very active iText mailing list?
If this was a dynamic XFA form created with LiveCycle, then using form flattening will cause you to lose your form fields. Static XFA forms should work though.
Reference: http://itext.ugent.be/library/question.php?id=30
XFA support in iText is improving but spotty at best.
You have to add a barcode font to your system's font library; it will then be visible in the font drop-down. Use a text AcroField and set that font on it. That should solve your problem.
Also, use setFormFlattening(true), as it will make the PDF uneditable.
It works just fine for me. Maybe it has to do with the way you made the template in LiveCycle Designer (static or dynamic).
see sample here
http://1t3xt.info/examples/browse/?page=example&id=433
Regards
Raghavendra Samant