Replace a placeholder with image in word?

I need to replace a placeholder with an image in a Word document using Apache POI. I am able to insert a picture into the document with Apache POI, but I don't know how to replace a placeholder with an image. Can anyone please help with this?
I know it would be easy with docx4j or some other API, but I am only allowed to use Apache POI.

It can be done, but I believe you currently have to insert raw XML to accomplish it. The linked question "Insert picture in word document" has the basic idea. You can do it using only the libraries that POI requires, not dom4j. If you look at the source of the XWPFRun method that adds a picture, it too tries to add raw XML. But if you use that method, it renders your document unreadable when written back to disk. So you have to add the picture using the XWPFDocument-level method, which returns a generated ID for the picture, and then add raw XML containing that ID to the run, as the linked example does.
We solved the problem by having our users insert a placeholder image, rather than text, into their Word document. We then: add the replacement image at the document level, find the run that contains the placeholder image using the size of the image as the criterion, then get the XML for that run and replace it with the new image's ID swapped in. As long as the placeholder and the replacement image are the same size, this works. If you need to adjust the size of the image after replacing it, you could manipulate the XML size values in the same manner. I like our solution better because it is less susceptible to changes in the Word XML format than inserting your own full XML for the picture. Cheers
InputStream newImageIS = getImageForCorporation(corporationID);
String relationID = run.getParagraph().getDocument().addPictureData(newImageIS, Document.PICTURE_TYPE_GIF);
replaceRunImageData(run, relationID);
private void replaceRunImageData(XWPFRun run, String relationID) {
    CTGraphicalObjectData graph = run.getCTR().getDrawingArray(0).getInlineArray(0).getGraphic()
            .getGraphicData();
    String currentGraphicXML = graph.toString();
    String originalID = RegularExpressionUtil.capture("<a:blip r:embed=\"(\\w+)\"", currentGraphicXML);
    String newXML = StringUtils.replace(currentGraphicXML, originalID, relationID);
    try {
        graph.set(XmlToken.Factory.parse(newXML));
    } catch (XmlException e) {
        throw new RuntimeException(e);
    }
    replaced = true;
}
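The ID swap itself is plain string work, so it can be tried without POI. The following is only a sketch using the JDK's own regex classes in place of the RegularExpressionUtil and StringUtils helpers; the class name BlipIdSwapper and the sample XML are made up for illustration. Unlike a global replace, it rewrites only the r:embed attribute of the blip, so another attribute that happens to contain the same text is left alone.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of POI: swaps the relationship ID of the first <a:blip>.
final class BlipIdSwapper {
    // Captures the value of the r:embed attribute on an <a:blip> element.
    private static final Pattern BLIP_EMBED =
            Pattern.compile("<a:blip\\b[^>]*?\\br:embed=\"([^\"]+)\"");

    /** Returns graphicXml with the first blip's r:embed value replaced by newId. */
    static String swapEmbedId(String graphicXml, String newId) {
        Matcher m = BLIP_EMBED.matcher(graphicXml);
        if (!m.find()) {
            throw new IllegalArgumentException("no <a:blip r:embed=\"...\"> found");
        }
        // Splice the new ID into the captured attribute value only.
        return graphicXml.substring(0, m.start(1)) + newId + graphicXml.substring(m.end(1));
    }

    public static void main(String[] args) {
        // Made-up fragment; real graphic XML carries namespaces and more elements.
        String xml = "<pic:cNvPr id=\"0\" name=\"rId5\"/><a:blip r:embed=\"rId5\"/>";
        System.out.println(swapEmbedId(xml, "rId12"));
    }
}
```

The result would then be parsed and set back on the graphic data exactly as in replaceRunImageData above.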
We identified the run of the image to replace by searching each run's list of embedded pictures for one that met the criteria below. We tried using the name of the image as the criterion, but we found that if the placeholder image was copied from one Word doc to another on a different PC, the name was lost.
private boolean isRunForExistingImage(XWPFPicture pic) {
    if (pic == null || pic.getCTPicture() == null || pic.getCTPicture().getSpPr() == null
            || pic.getCTPicture().getSpPr().getXfrm() == null
            || pic.getCTPicture().getSpPr().getXfrm().getExt() == null) {
        return false;
    }
    long x = pic.getCTPicture().getSpPr().getXfrm().getExt().getCx();
    long y = pic.getCTPicture().getSpPr().getXfrm().getExt().getCy();
    return x == 2066925 && y == 590550;
}
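The two magic numbers are extents in EMU (English Metric Units), the unit DrawingML uses for sizes: there are 914400 EMU per inch, which is 9525 EMU per pixel at 96 DPI. A small sketch for deriving such constants from a pixel size follows; the class and method names are invented for this example, and the 96 DPI figure is an assumption about how the placeholder was sized.

```java
// Hypothetical helper for converting between pixels and EMU at an assumed 96 DPI.
final class EmuUnits {
    static final long EMU_PER_INCH = 914_400L;
    static final long EMU_PER_PIXEL_96DPI = EMU_PER_INCH / 96; // 9525

    /** Converts a pixel length to EMU, assuming 96 DPI. */
    static long pixelsToEmu(int pixels) {
        return pixels * EMU_PER_PIXEL_96DPI;
    }

    /** Converts an EMU length to inches. */
    static double emuToInches(long emu) {
        return (double) emu / EMU_PER_INCH;
    }

    /** True if the extent (cx, cy) in EMU equals the given pixel size at 96 DPI. */
    static boolean matchesPixelSize(long cx, long cy, int widthPx, int heightPx) {
        return cx == pixelsToEmu(widthPx) && cy == pixelsToEmu(heightPx);
    }
}
```

Under that assumption, the extent 2066925 x 590550 checked above corresponds to a 217 x 62 pixel placeholder.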

Related

Fill out field for pdf revision when signing with PDFbox

I am trying to fill in an existing field in a PDF that should be signed at the end. I used the library from Swisscom (https://github.com/SwisscomTrustServices/pdfbox-ais-client/blob/8d52c759ade267b0c443fcd6f15bc9635c745d72/src/main/java/com/swisscom/ais/client/impl/PdfDocument.java#L97) with PDFBox (v2.0.24) and added these lines:
try {
PDAcroForm acroForm = pdDocument.getDocumentCatalog().getAcroForm();
acroForm.setSignaturesExist(true);
acroForm.setAppendOnly(true);
acroForm.getCOSObject().setDirect(true);
acroForm.getCOSObject().setNeedToBeUpdated(true);
FieldInput[] fields = new FieldInput[1];
COSObject pdfFields = acroForm.getCOSObject().getCOSObject(COSName.FIELDS);
if (pdfFields != null) {
pdfFields.setNeedToBeUpdated(true);
}
fields[0] = new FieldInput("1", "foobar");
for (int i = 0; i < fields.length; i++) {
PDField field = acroForm.getField(fields[i].id);
if (field != null) {
field.setValue(fields[i].value);
Log.info("set field: " + field.getFullyQualifiedName());
}
}
pdDocument.getDocumentCatalog().getCOSObject().setNeedToBeUpdated(true);
} catch (Exception e) {
Log.warn(e);
}
I get the log output that the field was set, but in the final document the field is still empty. I had no luck with the answer from "PDFBox 2.0 create signature field and save incremental with already signed document". I think I messed up the form handling, since pdfFields is null.
Update:
I added the suggestion from Tilman, now the entry gets set with
field.getCOSObject().setNeedToBeUpdated(true);
But when looking at the signature I have no information which fields are filled out:
Is it possible with PDFBox to achieve the same output as Adobe Sign, where more detailed information is stored in the revision metadata? I am not able to open the file with iText RUPS because the PDF gets locked with a password at the end.
And what would be the best way to lock the fields when everyone is done (so the fields are no longer shown as editable fields):
setting a lock like in Adobe
setting some protection after the field was filled

The update flag must be set on the field itself
field.getCOSObject().setNeedToBeUpdated(true);
and on the appearance of the widget of the field
field.getWidgets().get(0).getAppearance().getCOSObject().setNeedToBeUpdated(true);
(this assumes that the field is its own widget and that there is only one)
It might still work if the second code part is missing because Adobe Reader updates the appearance when displaying.

How to add a hyperlink to image in a Word document using Apache POI?

In Word, you can insert a hyperlink to an image by right-clicking the image, and selecting "Link..." as follows:
How can I do this programmatically using Apache POI?

As of this writing, there is no API available via the latest available version (4.1.2) of the Apache POI library to add a hyperlink to an image.
Therefore, the only approach is to use the underlying objects to manipulate the XML structure of the document directly.
Hyperlinks exist as a relationship on the document object, so the first thing to do is to create a new relationship on the document object:
String relationshipId = paragraph.getDocument().getPackagePart()
.addExternalRelationship(url, XWPFRelation.HYPERLINK.getRelation()).getId();
After that, retrieve the CTDrawing object from the XWPFRun, and insert a new CTHyperlink to set the hyperlink on the image:
if (run.getCTR().getDrawingList() != null && !run.getCTR().getDrawingList().isEmpty()) {
    CTDrawing ctDrawing = run.getCTR().getDrawingList().get(0);
    if (ctDrawing.getInlineList() != null && !ctDrawing.getInlineList().isEmpty()) {
        CTInline ctInline = ctDrawing.getInlineList().get(0);
        CTNonVisualDrawingProps docPr = ctInline.getDocPr();
        if (docPr != null) {
            org.openxmlformats.schemas.drawingml.x2006.main.CTHyperlink hlinkClick = docPr.addNewHlinkClick();
            hlinkClick.setId(relationshipId);
        }
    }
}
If the CTHyperlink object already exists, you can set the id on the object to point it to a new hyperlink.

Keep dimension of new image when replacing old image using docx4j

I need to add an image to my docx file. The image is a PNG of a signature that is to be placed behind the text in the signature line of a certificate, which the user can download as a docx, a PDF or a JPG. The first problem I encountered is that the latest version of docx4j (v6.1.2) can only add inline images; creating an image Anchor is currently disabled (see BinaryPartAbstractImage.java: line 1029). That's a problem because the signature image is not inline: it is supposed to appear behind the name on the signature line. Instead of inserting an anchor myself, my workaround is to put placeholder images in the template:
These images are mapped as image1.png and image2.png, respectively, on /word/media directory of the docx uncompressed version. The program then replaces these with the name, position, and actual png of the signature every time a certificate is generated.
The problem is that the images are scaled to the same dimensions as the placeholder image, when in fact the result should look like this:
How can I keep the dimensions of the new image after replacing, or at least its aspect ratio? Here is how I replace the placeholder image with the new image:
File approveBySignatureImage = new File(...);
final String approvedByImageNodeId = "rId5";
replaceImageById(approvedByImageNodeId,
"image1.png", approveBySignatureImage);
This is the actual method where the replacing happens:
public void replaceImageById(String id, String placeholderImageName, File newImage) throws Exception {
Relationship rel = document.getMainDocumentPart().getRelationshipsPart().getRelationshipByID(id);
BinaryPartAbstractImage imagePart;
if(FilenameUtils.getExtension(placeholderImageName).toLowerCase() == ContentTypes.EXTENSION_BMP) {
imagePart = new ImageBmpPart(new PartName("/word/media/" + placeholderImageName));
}
else if([ContentTypes.EXTENSION_JPG_1, ContentTypes.EXTENSION_JPG_2].contains(FilenameUtils.getExtension(placeholderImageName).toLowerCase())) {
imagePart = new ImageJpegPart(new PartName("/word/media/" + placeholderImageName));
}
else if(FilenameUtils.getExtension(placeholderImageName).toLowerCase() == ContentTypes.EXTENSION_PNG) {
imagePart = new ImagePngPart(new PartName("/word/media/" + placeholderImageName));
}
InputStream stream = new FileInputStream(newImage);
imagePart.setBinaryData(stream);
if(FilenameUtils.getExtension(newImage.getName()).toLowerCase() == ContentTypes.EXTENSION_BMP) {
imagePart.setContentType(new ContentType(ContentTypes.IMAGE_BMP));
}
else if([ContentTypes.EXTENSION_JPG_1, ContentTypes.EXTENSION_JPG_2].contains(FilenameUtils.getExtension(newImage.getName()).toLowerCase())) {
imagePart.setContentType(new ContentType(ContentTypes.IMAGE_JPEG));
}
else if(FilenameUtils.getExtension(newImage.getName()).toLowerCase() == ContentTypes.EXTENSION_PNG) {
imagePart.setContentType(new ContentType(ContentTypes.IMAGE_PNG));
}
imagePart.setRelationshipType(Namespaces.IMAGE);
final String embedId = rel.getId();
rel = document.getMainDocumentPart().addTargetPart(imagePart);
rel.setId(embedId);
}

You'll need to set the dimensions (or possibly just remove what you have?) on your placeholder image.
For help in doing that:-
docx4j inspects the image to work that out at https://github.com/plutext/docx4j/blob/master/src/main/java/org/docx4j/openpackaging/parts/WordprocessingML/BinaryPartAbstractImage.java#L512 using org.apache.xmlgraphics ImageInfo.
See also CxCy: https://github.com/plutext/docx4j/blob/master/src/main/java/org/docx4j/openpackaging/parts/WordprocessingML/BinaryPartAbstractImage.java#L1164
https://github.com/plutext/docx4j/blob/master/src/main/java/org/docx4j/openpackaging/parts/WordprocessingML/BinaryPartAbstractImage.java#L815 shows scaling to maintain aspect ratio.
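The arithmetic behind keeping the aspect ratio is independent of docx4j. As a rough sketch (this is not docx4j's own code; AspectFit and fitInside are names invented here), the new extent can be computed by scaling the replacement image's pixel size so that it fits inside the placeholder's extent, and the result written to the cx and cy values of the placeholder.

```java
// Hypothetical helper: computes an extent that preserves the source aspect ratio.
final class AspectFit {
    /**
     * Returns {cx, cy}: the largest size with the source's aspect ratio that
     * fits inside the box. The box and the result share one unit (for example EMU);
     * the source size may be in any unit, since only its ratio matters.
     */
    static long[] fitInside(long srcWidth, long srcHeight, long boxCx, long boxCy) {
        if (srcWidth <= 0 || srcHeight <= 0) {
            throw new IllegalArgumentException("source dimensions must be positive");
        }
        // The smaller of the two ratios is the one that keeps both sides inside the box.
        double scale = Math.min((double) boxCx / srcWidth, (double) boxCy / srcHeight);
        return new long[] { Math.round(srcWidth * scale), Math.round(srcHeight * scale) };
    }
}
```

For a 200 x 100 image and a 1000 x 1000 box this gives 1000 x 500, so a wide signature keeps its shape instead of being stretched to the placeholder's height.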

Convert embedded pictures in database

I have a 'small' problem. The documents in a database contain a rich text field, which holds the profile picture of a contact. The problem is that this content is not saved as MIME, and therefore I cannot calculate the URL of the image.
I'm using a POJO to retrieve data from the person profile and use it in my XPage control to display its contents. I need to build a conversion agent that takes the content of the rich text item and converts it to MIME, so that I can calculate a URL something like
http://host/database.nsf/($users)/D40FE4181F2B86CCC12579AB0047BD22/Photo/M2?OpenElement
Could someone help me with converting the contents of the rich text item to MIME? When I check for embedded objects in the rich text field, there are none. I tried getting the content of the field as a stream and saving it to a new rich text field using the following code, but somehow the new field is not created.
System.out.println("check if document contains a field with name "+fieldName);
if(!doc.hasItem(fieldName)){
throw new PictureConvertException("Could not locate richtextitem with name"+fieldName);
}
RichTextItem pictureField = (RichTextItem) doc.getFirstItem(fieldName);
System.out.println("Its a richtextfield..");
System.out.println("Copy field to backup field");
if(doc.hasItem("old_"+fieldName)){
doc.removeItem("old_"+fieldName);
}
pictureField.copyItemToDocument(doc, "old_"+fieldName);
// Vector embeddedPictures = pictureField.getEmbeddedObjects();
// System.out.println(doc.hasEmbedded());
// System.out.println("Retrieved embedded objects");
// if(embeddedPictures.isEmpty()){
// throw new PictureConvertException("No embedded objects could be found.");
// }
//
// EmbeddedObject photo = (EmbeddedObject) embeddedPictures.get(0);
System.out.println("Create inputstream");
//s.setConvertMime(false);
InputStream iStream = pictureField.getInputStream();
System.out.println("Create notesstream");
Stream nStream = s.createStream();
nStream.setContents(iStream);
System.out.println("Create mime entity");
MIMEEntity mEntity = doc.createMIMEEntity("PictureTest");
MIMEHeader cdheader = mEntity.createHeader("Content-Disposition");
System.out.println("Set header withfilename picture.gif");
cdheader.setHeaderVal("attachment;filename=picture.gif");
System.out.println("Setcontent type header");
MIMEHeader cidheader = mEntity.createHeader("Content-ID");
cidheader.setHeaderVal("picture.gif");
System.out.println("Set content from stream");
mEntity.setContentFromBytes(nStream, "application/gif", mEntity.ENC_IDENTITY_BINARY);
System.out.println("Save document..");
doc.save();
//s.setConvertMime(true);
System.out.println("Done");
// Clean up if we are done..
//doc.removeItem(fieldName);

It's been a little while now, and I didn't go down the route of converting existing data to MIME. I could not get it to work, and after some more research it seemed to be unnecessary. Because the issue is about displaying images bound to a rich text box, I did some research on how to compute the URL for an image and came up with the following lines of code:
function getImageURL(doc:NotesDocument, strRTItem,strFileType){
if(doc!=null && !"".equals(strRTItem)){
var rtItem = doc.getFirstItem(strRTItem);
if(rtItem!=null){
var personelDB = doc.getParentDatabase();
var dbURL = getDBUrl(personelDB);
var imageURL:java.lang.StringBuffer = new java.lang.StringBuffer(dbURL);
if("file".equals(strFileType)){
var embeddedObjects:java.util.Vector = rtItem.getEmbeddedObjects();
if(!embeddedObjects.isEmpty()){
var file:NotesEmbeddedObject = embeddedObjects.get(0);
imageURL.append("(lookupView)\\");
imageURL.append(doc.getUniversalID());
imageURL.append("\\$File\\");
imageURL.append(file.getName());
imageURL.append("?Open");
}
}else{
imageURL.append(doc.getUniversalID());
imageURL.append("/"+strRTItem+"/");
if(rtItem instanceof lotus.domino.local.RichTextItem){
imageURL.append("0.C4?OpenElement");
}else{
imageURL.append("M2?OpenElement");
}
}
return imageURL.toString()
}
}
}
It checks whether the given rich text field is present. If it is, it assumes a few things:
If there are files in the rich text field, the first file is the picture to display.
Otherwise, it builds one URL if the item is a rich text item, and a different URL if it is not, on the assumption that the item is then a MIME entity.

Not sure if this is an answer but I can't seem to add comments yet. Have you verified that there is something in your stream?
if (stream.getBytes() != 0) {

The issue cannot be resolved "ideally" in Java.
1) If you convert to MIME, you screw up the original Notes rich text. MIME allows only a sad approximation of the original content; this might or might not matter.
If it matters, it's possible to convert a copy of the original field to MIME and use it only for display purposes, or to scrape the image out using DXL and store it separately. However, either approach means a synchronization issue every time somebody changes the image in the original RT item.
2) Computing the URL as in the OP's code in the accepted self-answer is not possible in general, because the constant 0.C4 in that example relates to the offset of the image in the binary data of the RT item. Any other design of the rich text field, manually entered images, or content created by a different version of Notes all influence the offset.
3) The URL can be computed correctly only by using the C API, which allows you to investigate the binary data in a rich text item. IMO this cannot be done from Java (without building JNI bridges etc.).

How to use iText to add a watermark using an embedded font

I've several pdf/a documents with some embedded fonts, now I've to post-process these documents using iText to add a watermark.
I know that it's possible to embed a font with iText:
BaseFont bf = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.WINANSI, BaseFont.EMBEDDED);
but I would like to use a font that is already embedded in the document to add the textual watermark (something like "SAMPLE COPY").
How can I do it?

What you want is a DocumentFont. You cannot create them directly, the constructor is package private, but BaseFont.createFont(PRIndirectReference) will do the trick.
So you just need to get a PRIndirectReference to the font you want. "PR"s come from a PdfReader. There are two ways to find the font you're looking for and get a reference to it:
1) Enumerate every object in the PdfReader, keep only the dictionaries whose /Type is /Font, and look for a font with the correct name.
public PRIndirectReference findNamedFont(PdfReader myReader, String desiredFontName) {
    // Object 0 is never a real object, so start at 1.
    for (int objNum = 1; objNum < myReader.getXrefSize(); objNum++) {
        // The "Release" version doesn't keep a reference
        // to the object so it can be GC'd later. Quite Handy
        // when dealing with Really Big PDFs.
        PdfObject curObj = myReader.getPdfObjectRelease(objNum);
        // Unused xref slots come back as null; font objects are dictionaries.
        if (curObj == null || !curObj.isDictionary()) {
            continue;
        }
        PdfDictionary dict = (PdfDictionary) curObj;
        if (PdfName.FONT.equals(dict.getAsName(PdfName.TYPE))) {
            // /BaseFont is a name, not a string; decodeName drops the leading slash.
            PdfName fontName = dict.getAsName(PdfName.BASEFONT);
            if (fontName != null && desiredFontName.equals(PdfName.decodeName(fontName.toString()))) {
                return new PRIndirectReference(myReader, objNum);
            }
        }
    }
    return null;
}
2) Examine your pages' resource dictionaries /Font <<>> dicts, looking for a font with the correct name. Keep in mind that XObject Form resources have resources of their own that you'll have to check too:
public PRIndirectReference findFontInPage(PdfReader reader, String desiredName, int i) {
    PdfDictionary page = reader.getPageN(i);
    return findFontInResources(page.getAsDict(PdfName.RESOURCES), desiredName);
}
public PRIndirectReference findFontInResources(PdfDictionary resources, String desiredName) {
    if (resources != null) {
        PdfDictionary fonts = resources.getAsDict(PdfName.FONT);
        if (fonts != null) {
            for (PdfName curFontName : fonts.getKeys()) {
                PdfDictionary curFont = fonts.getAsDict(curFontName);
                PdfName baseFont = curFont == null ? null : curFont.getAsName(PdfName.BASEFONT);
                // get() returns the raw entry, i.e. the reference rather than the resolved dictionary.
                PdfObject fontRef = fonts.get(curFontName);
                if (baseFont != null && fontRef instanceof PRIndirectReference
                        && desiredName.equals(PdfName.decodeName(baseFont.toString()))) {
                    return (PRIndirectReference) fontRef;
                }
            }
        }
        PdfDictionary xobjs = resources.getAsDict(PdfName.XOBJECT);
        if (xobjs != null) {
            for (PdfName curXObjName : xobjs.getKeys()) {
                PdfStream curXObj = xobjs.getAsStream(curXObjName);
                if (curXObj != null && PdfName.FORM.equals(curXObj.getAsName(PdfName.SUBTYPE))) {
                    // Form XObjects carry their own resources, so recurse into them.
                    PdfDictionary formResources = curXObj.getAsDict(PdfName.RESOURCES);
                    PRIndirectReference ref = findFontInResources(formResources, desiredName);
                    if (ref != null) {
                        return ref;
                    }
                }
            }
        }
    }
    return null;
}
Either one of those will get you the PRIndirectReference you're after. Then you call BaseFont.createFont(myPRRef) and you'll have the DocumentFont you need. The first method will find any font in the PDF, while the second will only find fonts That Are Actually Used.
Also, subsetted fonts are supposed to have a "6-random-letters-plus-sign" tag prepended to the font name. DO NOT use a font subset. The characters you're using may not be in the subset, leading to what I call the " arry ole" problem. It sounds nice and dirty, but it was really just our sales guy's name: "Harry Vole" missing the upper case letters because I'd subsetted some font I shouldn't have Many Moons Ago.
PS: never embed subsets of fonts you intend to be used in a form field. No Bueno.
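To steer clear of subsets when picking a font by name, the tag can be detected from the BaseFont name alone: six uppercase letters followed by a plus sign. A minimal sketch using only the JDK follows; the class and method names are invented for this example and are not part of iText.

```java
import java.util.regex.Pattern;

// Hypothetical helper: recognizes the subset tag on a PDF BaseFont name.
final class SubsetFontNames {
    // Six uppercase letters, a plus sign, then the real font name.
    private static final Pattern SUBSET_TAG = Pattern.compile("[A-Z]{6}\\+.+");

    /** True if the name looks like "ABCDEF+FontName". */
    static boolean isSubset(String baseFontName) {
        return SUBSET_TAG.matcher(baseFontName).matches();
    }

    /** Returns the name without the subset tag, or unchanged if there is none. */
    static String stripSubsetTag(String baseFontName) {
        return isSubset(baseFontName) ? baseFontName.substring(7) : baseFontName;
    }
}
```

A font finder like the ones above could skip any candidate for which isSubset returns true and keep looking for a fully embedded font.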
The usual "I wrote all that code in the answer box here" disclaimer applies, but I've written a LOT of this sort of code, so it just might work out of the box. Cross your fingers. ;)

An entirely different approach: Use Line Art instead of Text.
If you create a "line art only" PdfGraphics2D object from the page's overContent, you can use an AWT font and need not worry about embedding at all. With a relatively short string you don't have to worry about the PDF's size exploding either.
PdfContentByte overcontent = stamper.getOverContent(1);
Graphics2D g2d = overcontent.createGraphicsShapes(pageWid, pageHei);
drawStuffToTheGraphic(g2d);
g2d.dispose();
This will result in "text" that is actually line art. It cannot be selected, searched, etc. That could be good or bad depending on what you're after.

Using plain jPod (BSD, SourceForge), you could base your code on the "Watermark" example. With
PDFontTools.getFonts(doc)
you can enumerate the fonts and then use one of them in the "createForm" method...
