Docx4j - Images in the document

How can we remove an image from a document with docx4j?
Say I have 10 images: I want to replace 8 of them with my own byte array/binary data and delete the remaining 2.
I am also having trouble locating the images.
Is it somehow possible to replace text placeholders in the document with images?

Refer to this post: http://vixmemon.blogspot.com/2013/04/docx4j-replace-text-placeholders-with.html
// getAllElementFromObject, createInlineImage and addInlineImageToParagraph
// are helper methods from the linked post.
for (Object obj : elements) {
    if (obj instanceof Tbl) {
        Tbl table = (Tbl) obj;
        List rows = getAllElementFromObject(table, Tr.class);
        for (Object trObj : rows) {
            Tr tr = (Tr) trObj;
            List cols = getAllElementFromObject(tr, Tc.class);
            for (Object tcObj : cols) {
                Tc tc = (Tc) tcObj;
                List texts = getAllElementFromObject(tc, Text.class);
                for (Object textObj : texts) {
                    Text text = (Text) textObj;
                    if (text.getValue().equalsIgnoreCase("${MY_PLACE_HOLDER}")) {
                        File file = new File("C:\\image.jpeg");
                        P paragraphWithImage = addInlineImageToParagraph(createInlineImage(file));
                        // Replace the cell's first block (the placeholder paragraph) with the image paragraph.
                        tc.getContent().remove(0);
                        tc.getContent().add(paragraphWithImage);
                    }
                }
            }
        }
    }
}
wordMLPackage.save(new java.io.File("C:\\result.docx"));

See "docx4j checking checkboxes" for the two approaches to finding things (XPath, or non-XPath traversal).
VariableReplace allows you to replace text placeholders, but not with images. I think there may be code floating around (in the docx4j forums?) which extends it to do that.
But I'd suggest you use content control data binding instead. See "how to create a new word from template with docx4j".
You can use base64-encoded images in your XML data, and docx4j and/or Word will do the rest.
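As a rough illustration of the base64 step mentioned above, the sketch below encodes image bytes with java.util.Base64 and wraps them in an XML element. The class name ImageXmlData and the element name "photo" are hypothetical; the element your content control binds to depends on your own XML data and XPath.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper: turns image bytes into base64 text for the XML data part. */
final class ImageXmlData {

    // Encode raw image bytes as a single-line base64 string.
    static String toBase64(byte[] imageBytes) {
        return Base64.getEncoder().encodeToString(imageBytes);
    }

    // Wrap the base64 text in an XML element. The element name is an assumption:
    // use whichever element your content control's binding points at.
    static String toXmlElement(String elementName, byte[] imageBytes) {
        return "<" + elementName + ">" + toBase64(imageBytes) + "</" + elementName + ">";
    }

    public static void main(String[] args) {
        // Stand-in bytes; real code would pass the bytes of an actual image.
        byte[] fakeImage = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toXmlElement("photo", fakeImage));
    }
}
```

In practice the bytes would come from your image file or byte array (for example via Files.readAllBytes), and the resulting XML is the data you bind to the content controls.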

Related

How to replace date field with some text in the ViewMaster (Vertical) for word/pdf using Aspose?

The Aspose code inserts the ViewMaster (Vertical) with the default date selector as its text. I want to replace it with some text, as shown in the image.
I followed the code mentioned in "ViewMaster(vertical) using Aspose" to generate the ViewMaster (Vertical) in the Word/PDF output. Can someone help with the right code to replace the date with text?

The date is set in a structured document tag (SDT). You can use code like this to get and modify the value of this SDT:
// Get structured document tags from footer.
NodeCollection tags = doc.FirstSection.HeadersFooters[HeaderFooterType.FooterPrimary].GetChildNodes(NodeType.StructuredDocumentTag, true);
foreach (StructuredDocumentTag tag in tags)
{
    if (tag.Title.Equals("Date") && tag.SdtType == SdtType.Date)
    {
        tag.IsShowingPlaceholderText = false;
        tag.FullDate = DateTime.Now;
        // By default the SDT is mapped to XML. Simply remove the mapping to use the value set in the FullDate property.
        tag.XmlMapping.Delete();
    }
}
If you do not need the date but want to insert some custom text, you can remove the tag and insert a simple paragraph with text instead. For example:
// The builder is used below to write text into the new paragraph.
DocumentBuilder builder = new DocumentBuilder(doc);
// Get structured document tags from footer.
NodeCollection tags = doc.FirstSection.HeadersFooters[HeaderFooterType.FooterPrimary].GetChildNodes(NodeType.StructuredDocumentTag, true);
foreach (StructuredDocumentTag tag in tags)
{
    if (tag.Title.Equals("Date") && tag.SdtType == SdtType.Date)
    {
        // Put an empty paragraph after the structured document tag.
        Paragraph p = new Paragraph(doc);
        tag.ParentNode.InsertAfter(p, tag);
        // Remove the tag.
        tag.Remove();
        // Move the DocumentBuilder to the newly inserted paragraph and insert some text.
        builder.MoveTo(p);
        builder.Write("This is my custom vertical text");
    }
}

Remove paragraph style with docx4j

In my word template file I have some tables and sometimes the second column of them is formatted as an enumeration.
Using docx4j I'm filling it with dynamic content and if there's only one entry I need to get rid of the enumeration style.
I found a place deep down in the structure that has a value for enumeration but when setting it to null, I don't see any changes in my template.
//This value is "Listenabsatz" (German) and I want to get rid of it
//Setting this value to "" or setting pStyle to null didn't help
((PStyle)((PPr)((P)((java.util.ArrayList)((Tc)((JAXBElement)templateRow.content.get(1)).value).content).get(0)).pPr).pStyle).val
In my actual code this is the place where I'm trying to change it:
Tr templateRow = (Tr) rows.get(0);
Tc cell = (Tc) ((javax.xml.bind.JAXBElement) templateRow.getContent().get(1)).getValue();
P par = (P) (cell.getContent().get(0));
PPr parStyle = par.getPPr();
if (parStyle.getPStyle() != null && parStyle.getPStyle().getVal() != null) {
    parStyle.setPStyle(null);
    //parStyle.getPStyle().setVal("");
}
How can I remove that enumeration style successfully?

Write text and tables in to word, with whitespaces/enters

I'm writing text and text from tables into a word document.
With the following code the tables are placed under the right paragraphs.
Iterator<IBodyElement> iter = xdoc.getBodyElementsIterator();
while (iter.hasNext()) {
    IBodyElement elem = iter.next();
    if (elem instanceof XWPFParagraph) {
        relevantText.setText(((XWPFParagraph) elem).getText());
    } else if (elem instanceof XWPFTable) {
        tabellen.setText(((XWPFTable) elem).getText());
    }
}
Now when I try to add whitespace or a line break with addBreak() or addCarriageReturn(), the order of my document is wrong: the table text is placed after all the other text.
Does anyone have a solution for this?

I had the same problem a couple of days ago. Did you create 2 different runs for the paragraphs and the tables?
Because I did, and when I changed it to 1 run it worked for me.
Like this:
XWPFRun text = paragraph.createRun();

In Apache POI, is there a way to access XWPF elements by their id?

I have a Word document (docx, so XML based) and I want to find a table in it and populate it programmatically. I am using the Apache POI XWPF API.
Is there a way to access XWPF elements by their id?
How can I make XWPF elements uniquely identifiable and then alter them using Java?
Thanks

What I have implemented is a find/replace feature (from here).
In my template docx file I use "id-like" texts such as __heading1__ and __subjectname__, then replace them using the code below. For tables, @axel-richter's solution may be suitable.
private void findReplace(String a, String b, CustomXWPFDocument document) {
    for (XWPFParagraph p : document.getParagraphs()) {
        List<XWPFRun> runs = p.getRuns();
        if (runs != null) {
            for (XWPFRun r : runs) {
                String text = r.getText(0);
                if (text != null && text.contains(a)) {
                    text = text.replace(a, b);
                    r.setText(text, 0);
                }
            }
        }
    }
}
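One caveat with the per-run approach: Word may split a placeholder such as __subjectname__ across several runs, in which case no single run contains it. Below is a minimal sketch of one workaround, with runs modelled as plain strings (in POI each entry would correspond to r.getText(0)). The class and method names are hypothetical, and collapsing the text into the first run discards the separate formatting of the remaining runs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: replace a placeholder that spans several runs of one paragraph. */
final class SplitPlaceholderSketch {

    // runTexts holds the text of each run in document order; entries may be null.
    // Returns a new list of the same size with the replacement applied.
    static List<String> replaceAcrossRuns(List<String> runTexts, String placeholder, String value) {
        // Join the run texts so a placeholder split across runs becomes visible.
        StringBuilder joined = new StringBuilder();
        for (String t : runTexts) {
            if (t != null) {
                joined.append(t);
            }
        }
        String full = joined.toString();
        if (!full.contains(placeholder)) {
            return new ArrayList<>(runTexts); // nothing to replace
        }
        // Put the replaced text into the first run and blank out the others.
        String replaced = full.replace(placeholder, value);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < runTexts.size(); i++) {
            result.add(i == 0 ? replaced : "");
        }
        return result;
    }
}
```

With POI, the returned strings would be written back with r.setText(text, 0) on the corresponding runs.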

Extracting heading and paragraphs from doc and docx files using apache-poi

I am trying to read Microsoft Word documents via apache-poi and found that there are a couple of convenient methods for scanning through a document, such as getText(), getParagraphList(), etc. But my use case is slightly different: while scanning a document, I want events/information such as heading, paragraph and table in the same sequence as they appear in the document. That would help me prepare a document structure like,
<content>
    <section>
        <heading> ABC </heading>
        <paragraph>xyz </paragraph>
        <paragraph>scanning through APIs</paragraph>
    </section>
    .
    .
    .
</content>
The main intent is to maintain the relationship between headings and paragraphs as in the original document. I'm not sure, but can something like this work for me?
Iterator<IBodyElement> itr = doc.getBodyElementsIterator();
while (itr.hasNext()) {
    IBodyElement ele = itr.next();
    System.out.println(ele.getElementType());
}
I was able to get the paragraph list but not the heading information using this code. Just to mention, I am interested in all headings, whether they are explicitly marked as headings via a style or only set in a large font size.

Headers aren't stored inline in the main document, they live elsewhere, which is why you're not getting them as body elements. Body elements are things like sections, paragraphs and tables, not headers, so you have to fetch them yourself.
If you look at this code in Apache Tika, you'll see an example of how to do so. Assuming you're iterating over the body elements, and want headers / footers of paragraphs, you'll want code something like this (based on the Tika code):
for (IBodyElement element : bodyElement.getBodyElements()) {
    if (element instanceof XWPFParagraph) {
        XWPFParagraph paragraph = (XWPFParagraph) element;
        XWPFHeaderFooterPolicy headerFooterPolicy = null;
        if (paragraph.getCTP().getPPr() != null) {
            CTSectPr ctSectPr = paragraph.getCTP().getPPr().getSectPr();
            if (ctSectPr != null) {
                headerFooterPolicy = new XWPFHeaderFooterPolicy(document, ctSectPr);
                // Handle Header
            }
        }
        // Handle paragraph
        if (headerFooterPolicy != null) {
            // Handle footer
        }
    }
    if (element instanceof XWPFTable) {
        XWPFTable table = (XWPFTable) element;
        // Handle table
    }
    if (element instanceof XWPFSDT) {
        XWPFSDT sdt = (XWPFSDT) element;
        // Handle SDT
    }
}
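For the heading/paragraph grouping the question describes, a minimal sketch of the grouping step might look like the following. It assumes each body paragraph has already been reduced to a {styleId, text} pair (in POI the style ID can be read with XWPFParagraph.getStyle(), which may return null) and that heading styles have IDs starting with "Heading". That prefix is an assumption: style IDs vary by template and language, and headings marked only by a large font size would need a separate check. The class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: group paragraphs under the heading that precedes them. */
final class SectionGrouper {

    static final class Section {
        final String heading;                       // null for text before the first heading
        final List<String> paragraphs = new ArrayList<>();

        Section(String heading) {
            this.heading = heading;
        }
    }

    // Each item is {styleId, text}, in document order; styleId may be null.
    static List<Section> group(List<String[]> items) {
        List<Section> sections = new ArrayList<>();
        Section current = null;
        for (String[] item : items) {
            String style = item[0];
            String text = item[1];
            // Assumption: heading style IDs start with "Heading" (e.g. "Heading1").
            boolean isHeading = style != null && style.startsWith("Heading");
            if (isHeading) {
                current = new Section(text);
                sections.add(current);
            } else {
                if (current == null) {
                    // Paragraphs before any heading go into an untitled section.
                    current = new Section(null);
                    sections.add(current);
                }
                current.paragraphs.add(text);
            }
        }
        return sections;
    }
}
```

Each resulting Section maps onto one section element of the structure in the question, with its heading and the paragraphs that follow it.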
