I have a table that contains items. I want to write the item names into a Word document, each one on a new line.
So I wrote the method below.
When a run's text contains "P01", I replace it with the name, add a line break, and append a new "P01" placeholder:
public void findAndRemplaceString(XWPFDocument doc, String champs) throws IOException {
for (XWPFParagraph p : doc.getParagraphs()) {
java.util.List<XWPFRun> runs = p.getRuns();
if (runs != null) {
for (XWPFRun r : runs) {
String text = r.getText(0);
if (text != null && text.contains("P01")) {
text = text.replace("P01", champs);
System.out.println("text replaced");
r.setText(text, 0);
//add new line
r.addBreak();
//new "P01" added
r.setText("P01");
}
}
}
}
}
The idea is that the next item name then replaces the new placeholder on the following line.
@FXML
void endButton(ActionEvent event) {
String file = "model";
for (Person item : table.getItems()) {
//get the name of item
String a = item.getName();
// get the index of item
int ind0 = table.getItems().indexOf(item);
int ind1 = table.getItems().indexOf(item) + 1;
try {
XWPFDocument doc = new XWPFDocument(new FileInputStream(new File(file + ind0 + ".docx")));
findAndRemplaceString(doc, a);
FileOutputStream fileOutputStream = new FileOutputStream(new File(file + ind1 + ".docx"));
doc.write(fileOutputStream);
fileOutputStream.close();
doc.close();
} catch (Exception e) {
System.out.println("erreur " + e);
}
}
}
The problem is:
Only the first item name is replaced, not the others. The code does not pick up the new "P01" that I added.
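A likely explanation, sketched here without POI: as far as I know, XWPFRun.setText(String) without an index appends a new text element to the run, while getText(0) only returns the first text element. The RunModel class below is a hypothetical stand-in for a run (a plain list of text elements), not POI code; it only illustrates why the appended "P01" is never seen by a check on getText(0).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a run: an ordered list of text elements.
final class RunModel {
    private final List<String> texts = new ArrayList<>();

    // Mirrors getText(pos): returns only the text element at that index.
    String getText(int pos) {
        return pos < texts.size() ? texts.get(pos) : null;
    }

    // Mirrors setText(value, pos): overwrites the element at pos, or creates it.
    void setText(String value, int pos) {
        if (pos < texts.size()) {
            texts.set(pos, value);
        } else {
            texts.add(value);
        }
    }

    // Mirrors setText(value): appends a new text element at the end.
    void setText(String value) {
        texts.add(value);
    }

    int size() {
        return texts.size();
    }

    public static void main(String[] args) {
        RunModel run = new RunModel();
        run.setText("P01", 0);
        // First pass: replace the placeholder, then append a fresh one.
        run.setText(run.getText(0).replace("P01", "Alice"), 0);
        run.setText("P01");
        // Second pass: element 0 no longer contains the placeholder,
        // the new one lives in element 1.
        System.out.println(run.getText(0).contains("P01"));
        System.out.println(run.getText(1));
    }
}
```

Under that assumption, a second call that only inspects getText(0) would skip the run, which matches the behaviour described above.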
I found an answer. It is not the best, but it works.
I changed the parameter type from String to String[], so that I can do it this way:
public void findAndRemplaceString(XWPFDocument doc,String champs[]){
for (XWPFParagraph p : doc.getParagraphs()) {
java.util.List<XWPFRun> runs = p.getRuns();
if (runs != null) {
for (XWPFRun r : runs) {
String text = r.getText(0);
if (text != null && text.contains("P01") ) {
for (int i=0;i<champs.length;i++){
text = text.replace("P01","");
r.setText(text,0); //Replace the old text
r.setText(champs[i]);//add the new text
r.addBreak(); //new line
}
}
}
}
}
}
With this change, findAndRemplaceString is called only once when I click the button instead of once per item, so I collect all the item names in a list first:
@FXML
void endButton(ActionEvent event) {
List<String> list = new ArrayList<String>();
for (Person item : table.getItems()) {
String a = item.getName();
list.add(a);
}
String[] simpleArray = new String[list.size()];
list.toArray(simpleArray);
try{
XWPFDocument doc = new XWPFDocument(new FileInputStream(new File("input.docx")));
findAndRemplaceString(doc,simpleArray);
FileOutputStream fileOutputStream = new FileOutputStream(new File("output.docx"));
doc.write(fileOutputStream);
fileOutputStream.close();
doc.close();
}catch (Exception e) {
System.out.println("erreur " + e);
}
}
My goal is to read a docx file, find the text "#name#" and "#surname#", and replace it with other text.
This is my docx file:
I do this:
XWPFDocument docx = new XWPFDocument(OPCPackage.open("..."));
for (XWPFParagraph p : docx.getParagraphs()) {
List<XWPFRun> runs = p.getRuns();
if (runs != null) {
for (XWPFRun r : runs) {
String text = r.getText(0);
if (text != null && text.startsWith("#") && text.endsWith("#")) {
text = text.replace("#", "new ");
r.setText(text, 0);
}
}
}
}
for (XWPFTable tbl : docx.getTables()) {
for (XWPFTableRow row : tbl.getRows()) {
for (XWPFTableCell cell : row.getTableCells()) {
for (XWPFParagraph p : cell.getParagraphs()) {
for (XWPFRun r : p.getRuns()) {
String text = r.getText(0);
if (text != null && text.startsWith("#") && text.endsWith("#")) {
text = text.replace("#", "new ");
r.setText(text,0);
}
}
}
}
}
The problem is that my code reads all the labels in the docx file except "#surname#" and "#name#". Can anyone help me?
From your screenshot it looks like "#name#" and "#surname#" are not directly in the document body but in a drawing (a text box or a shape, for example). Such elements are not covered by XWPFDocument.getParagraphs, getTables, or any other high-level method in Apache POI. So your main problem is that the paragraphs containing your text are simply never traversed by your code.
The only way to get really all paragraphs out of the document's body is to use an XmlCursor that selects all w:p elements from the XML directly.
The code below shows that. It traverses all XWPFParagraphs in the document's body using an XmlCursor and replaces text where found.
For the replacement itself I prefer the TextSegment approach shown in "Apache POI: ${my_placeholder} is treated as three different runs". This is necessary because, even if the containing paragraph gets traversed, the text may be split across several text runs because of formatting, spell checking, or other reasons. Microsoft Word has a nearly endless supply of reasons to split text into separate runs.
import java.io.*;
import org.apache.poi.xwpf.usermodel.*;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;
import org.apache.xmlbeans.XmlObject;
import org.apache.xmlbeans.XmlCursor;
import java.util.Map;
import java.util.HashMap;
import java.util.List;
import java.util.ArrayList;
public class WordReplaceTextSegment {
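/**
* Parses the paragraph and searches for the string searched.
* If the string is found, returns a TextSegment describing where it begins
* and ends (run, text and character positions); otherwise returns null.
* The search starts at the position given by startPos.
*
* @param paragraph the paragraph to search
* @param searched the text to find
* @param startPos the position to start searching from
*/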
static TextSegment searchText(XWPFParagraph paragraph, String searched, PositionInParagraph startPos) {
int startRun = startPos.getRun(),
startText = startPos.getText(),
startChar = startPos.getChar();
int beginRunPos = 0, candCharPos = 0;
boolean newList = false;
//CTR[] rArray = paragraph.getRArray(); //This does not contain all runs. It lacks hyperlink runs for ex.
java.util.List<XWPFRun> runs = paragraph.getRuns();
int beginTextPos = 0, beginCharPos = 0; //must be outside the for loop
//for (int runPos = startRun; runPos < rArray.length; runPos++) {
for (int runPos = startRun; runPos < runs.size(); runPos++) {
//int beginTextPos = 0, beginCharPos = 0, textPos = 0, charPos; //int beginTextPos = 0, beginCharPos = 0 must be outside the for loop
int textPos = 0, charPos;
//CTR ctRun = rArray[runPos];
CTR ctRun = runs.get(runPos).getCTR();
XmlCursor c = ctRun.newCursor();
c.selectPath("./*");
try {
while (c.toNextSelection()) {
XmlObject o = c.getObject();
if (o instanceof CTText) {
if (textPos >= startText) {
String candidate = ((CTText) o).getStringValue();
if (runPos == startRun) {
charPos = startChar;
} else {
charPos = 0;
}
for (; charPos < candidate.length(); charPos++) {
if ((candidate.charAt(charPos) == searched.charAt(0)) && (candCharPos == 0)) {
beginTextPos = textPos;
beginCharPos = charPos;
beginRunPos = runPos;
newList = true;
}
if (candidate.charAt(charPos) == searched.charAt(candCharPos)) {
if (candCharPos + 1 < searched.length()) {
candCharPos++;
} else if (newList) {
TextSegment segment = new TextSegment();
segment.setBeginRun(beginRunPos);
segment.setBeginText(beginTextPos);
segment.setBeginChar(beginCharPos);
segment.setEndRun(runPos);
segment.setEndText(textPos);
segment.setEndChar(charPos);
return segment;
}
} else {
candCharPos = 0;
}
}
}
textPos++;
} else if (o instanceof CTProofErr) {
c.removeXml();
} else if (o instanceof CTRPr) {
//do nothing
} else {
candCharPos = 0;
}
}
} finally {
c.dispose();
}
}
return null;
}
static void replaceTextSegment(XWPFParagraph paragraph, String textToFind, String replacement) {
TextSegment foundTextSegment = null;
PositionInParagraph startPos = new PositionInParagraph(0, 0, 0);
//while((foundTextSegment = paragraph.searchText(textToFind, startPos)) != null) { // search all text segments having text to find
while((foundTextSegment = searchText(paragraph, textToFind, startPos)) != null) { // search all text segments having text to find
System.out.println(foundTextSegment.getBeginRun()+":"+foundTextSegment.getBeginText()+":"+foundTextSegment.getBeginChar());
System.out.println(foundTextSegment.getEndRun()+":"+foundTextSegment.getEndText()+":"+foundTextSegment.getEndChar());
// maybe there is text before textToFind in begin run
XWPFRun beginRun = paragraph.getRuns().get(foundTextSegment.getBeginRun());
String textInBeginRun = beginRun.getText(foundTextSegment.getBeginText());
String textBefore = textInBeginRun.substring(0, foundTextSegment.getBeginChar()); // we only need the text before
// maybe there is text after textToFind in end run
XWPFRun endRun = paragraph.getRuns().get(foundTextSegment.getEndRun());
String textInEndRun = endRun.getText(foundTextSegment.getEndText());
String textAfter = textInEndRun.substring(foundTextSegment.getEndChar() + 1); // we only need the text after
if (foundTextSegment.getEndRun() == foundTextSegment.getBeginRun()) {
textInBeginRun = textBefore + replacement + textAfter; // if we have only one run, we need the text before, then the replacement, then the text after in that run
} else {
textInBeginRun = textBefore + replacement; // else we need the text before followed by the replacement in begin run
endRun.setText(textAfter, foundTextSegment.getEndText()); // and the text after in end run
}
beginRun.setText(textInBeginRun, foundTextSegment.getBeginText());
// runs between begin run and end run needs to be removed
for (int runBetween = foundTextSegment.getEndRun() - 1; runBetween > foundTextSegment.getBeginRun(); runBetween--) {
paragraph.removeRun(runBetween); // remove not needed runs
}
}
}
static List<XmlObject> getCTPObjects(XWPFDocument doc) {
List<XmlObject> result = new ArrayList<XmlObject>();
//create cursor selecting all paragraph elements
XmlCursor cursor = doc.getDocument().newCursor();
cursor.selectPath("declare namespace w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' .//*/w:p");
while(cursor.hasNextSelection()) {
cursor.toNextSelection();
XmlObject obj = cursor.getObject();
// add only if the paragraph contains at least a run containing text
if (obj.selectPath("declare namespace w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' ./w:r/w:t").length > 0) {
result.add(obj);
}
}
return result;
}
static void traverseAllParagraphsAndReplace(XWPFDocument doc, Map<String, String> replacements) throws Exception {
//This gets all XWPFParagraphs out of the stored XML and does the replacing
//first get all CTP objects
List<XmlObject> allCTPObjects = getCTPObjects(doc);
//then traverse them and create XWPFParagraphs from them and do the replacing
for (XmlObject obj : allCTPObjects) {
XWPFParagraph paragraph = null;
if (obj instanceof CTP) {
CTP p = (CTP)obj;
paragraph = new XWPFParagraph(p, doc);
} else {
CTP p = CTP.Factory.parse(obj.xmlText());
paragraph = new XWPFParagraph(p, doc);
}
if (paragraph != null) {
for (String textToFind : replacements.keySet()) {
String replacement = replacements.get(textToFind);
if (paragraph.getText().contains(textToFind)) replaceTextSegment(paragraph, textToFind, replacement);
}
}
obj.set(paragraph.getCTP());
}
}
public static void main(String[] args) throws Exception {
XWPFDocument doc = new XWPFDocument(new FileInputStream("source.docx"));
Map<String, String> replacements;
replacements = new HashMap<String, String>();
replacements.put("#name#", "Axel");
replacements.put("#surname#", "Richter");
traverseAllParagraphsAndReplace(doc, replacements);
FileOutputStream out = new FileOutputStream("result.docx");
doc.write(out);
out.close();
doc.close();
}
}
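To illustrate why selecting every w:p element finds text that a body-level traversal misses, here is a minimal sketch that uses only the JDK's XML and XPath classes rather than POI's XmlCursor. The sample XML is a heavily simplified, hypothetical stand-in for a real document.xml (real text boxes sit inside more wrapper elements), and the class and method names are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AllParagraphsSketch {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Simplified sample: two body paragraphs, the second holding a text box paragraph.
    static final String SAMPLE =
        "<w:document xmlns:w='" + W + "'><w:body>"
        + "<w:p><w:r><w:t>plain text</w:t></w:r></w:p>"
        + "<w:p><w:r><w:drawing><w:txbxContent>"
        + "<w:p><w:r><w:t>#name#</w:t></w:r></w:p>"
        + "</w:txbxContent></w:drawing></w:r></w:p>"
        + "</w:body></w:document>";

    // Matches a w:p element regardless of the prefix used in the file.
    private static final String P = "*[local-name()='p' and namespace-uri()='" + W + "']";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    private static int count(Document doc, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expression, e);
        }
    }

    // Only paragraphs that are direct children of the body.
    static int countBodyParagraphs(Document doc) {
        return count(doc, "/*[local-name()='document']/*[local-name()='body']/" + P);
    }

    // Every paragraph at any depth, including the one inside the text box.
    static int countAllParagraphs(Document doc) {
        return count(doc, "//" + P);
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("body-level paragraphs: " + countBodyParagraphs(doc));
        System.out.println("all paragraphs: " + countAllParagraphs(doc));
    }
}
```

The difference between the two counts is the paragraph nested in the text box, which is the kind of paragraph a body-level loop never reaches.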
I am building an Excel file by merging many Excel files. In the generated file, a blank row appears after the header row.
Below is my code, which reads data from multiple files and writes it to a blank file that has pivot formulas set.
I have also tried:
1. Calling createRow(0) for the headers and filling data from the next row.
2. Maintaining an int row counter, which still didn't work.
3. Incrementing the getLastRowNum() count, with no success.
public class DCSReadImpl implements ReadBehavior {
Logger log = Logger.getLogger(DCSReadImpl.class.getName());
@SuppressWarnings("resource")
@Override
public Sheet readReport(Workbook workbook,Map<String,String> masterMap, Properties properties) {
//int firstRow = 0;
int outRowCounter = 0;
String fileToMove= "";
boolean headers = true;
Row outputRow = null;
Sheet outputSheet = null;
Workbook wb = new XSSFWorkbook();
try {
outputSheet = wb.createSheet("Data");
log.info("**** Set headers start"); // this used to be different method
int cellNo = 0;
outputRow = outputSheet.createRow(0);
for(String headerName : ReportConstants.DCS_OUTPUT_HEADER){
outputRow.createCell(cellNo).setCellValue(headerName);
cellNo++;
}
//outRowCounter++;
log.info("**** Set headers completed");
log.info("Read input file(s) for DCS report");
log.info("Input File Path : " + properties.getProperty(ReportConstants.DCS_INPUT_PATH));
File inputDir = new File(properties.getProperty(ReportConstants.DCS_INPUT_PATH));
File[] dirListing = inputDir.listFiles();
if (0 == dirListing.length) {
throw new Exception(properties.getProperty(ReportConstants.DCS_INPUT_PATH) + " is empty");
}
for (File file : dirListing) {
log.info("Processing : " + file.getName());
fileToMove = file.getName();
XSSFWorkbook inputWorkbook = null;
try {
inputWorkbook = new XSSFWorkbook(new FileInputStream(file));
} catch (Exception e) {
throw new Exception("File is already open, please close the file");
}
XSSFSheet inputsheet = inputWorkbook.getSheet("Sheet1");
Iterator<Row> rowItr = inputsheet.iterator();
int headItr = 0;
//log.info("Validating headers : " + file.getName());
while (rowItr.hasNext()) {
Row irow = rowItr.next();
Iterator<Cell> cellItr = irow.cellIterator();
int cellIntItr = 0;
String key = "";
int rowN = outputSheet.getLastRowNum() + 1;
outputRow = outputSheet.createRow(rowN);
Cell outCell = null;
while (cellItr.hasNext()) {
Cell inputCell = cellItr.next();
if (0 == inputCell.getRowIndex()) {
if (!FileUtility.checkHeaders(headItr, inputCell.getStringCellValue().trim(),
ReportConstants.DCS_INPUT_HEADER)) {
throw new Exception("Incorrect header(s) present in Input File, Expected : "
+ ReportConstants.DCS_INPUT_HEADER[headItr]);
}
headItr++;
} else {
//outCell = outputRow.createCell(cellIntItr);
if (0 == inputCell.getColumnIndex()) {
key = inputCell.getStringCellValue().trim();
} else if (2 == inputCell.getColumnIndex()) {
key = key + ReportConstants.DEL + inputCell.getStringCellValue().trim();
}
if (7 == cellIntItr){
outCell = outputRow.createCell(cellIntItr);
outCell.setCellValue(getValue(masterMap, key, 0));
cellIntItr++;
outCell = outputRow.createCell(cellIntItr);
outCell.setCellValue(getValue(masterMap, key, 1));
cellIntItr++;
outCell = outputRow.createCell(cellIntItr);
outCell.setCellValue(getValue(masterMap, key, 2));
cellIntItr++;
}
// Check the cell type and format accordingly
switch (inputCell.getCellType()) {
case Cell.CELL_TYPE_NUMERIC:
outCell = outputRow.createCell(cellIntItr);
outCell.setCellValue(inputCell.getNumericCellValue());
break;
case Cell.CELL_TYPE_STRING:
outCell = outputRow.createCell(cellIntItr);
outCell.setCellValue(inputCell.getStringCellValue().trim());
break;
}
cellIntItr++;
}
}
//outRowCounter ++ ;
}
if(!fileToMove.isEmpty()){
FileUtility.checkDestinationDir(""+properties.get(ReportConstants.DCS_ARCHIVE_PATH));
FileUtility.moveFile(properties.get(ReportConstants.DCS_INPUT_PATH) + fileToMove,
properties.get(ReportConstants.DCS_ARCHIVE_PATH)+fileToMove+FileUtility.getPattern());
}
}
} catch (Exception e) {
log.error("Exception occured : ", e);
}
FileOutputStream outputStream;
try {
outputStream = new FileOutputStream("D:\\DCS\\Output\\Krsna_"+FileUtility.getPattern()+".xlsx");
wb.write(outputStream);
} catch (Exception e) {
e.printStackTrace();
}
return outputSheet;
}
private String getValue(Map<String, String> masterMap, String cellKey, int index) {
String value = masterMap.get(cellKey);
if (null != value) {
String cellValue[] = value.split("\\" + ReportConstants.DEL);
return cellValue[index];
} else {
return "";
}
}
}
There should be no blank row after the header row, that is, between row 0 and row 1 (I hope my understanding of row indexing is correct). I know this is a very basic question :-(
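A possible cause, judging only from the posted code: outputSheet.createRow(rowN) runs for every input row before the check for row index 0, so the header row of each input file still produces an output row that then stays empty. The sketch below models sheets as plain lists of rows (no POI involved, all names hypothetical) and skips each input header before any output row is created.

```java
import java.util.ArrayList;
import java.util.List;

final class MergeRowsSketch {
    // Writes one header row, then appends only the data rows of each input.
    static List<List<String>> merge(List<String> header, List<List<List<String>>> inputs) {
        List<List<String>> out = new ArrayList<>();
        out.add(new ArrayList<>(header));
        for (List<List<String>> input : inputs) {
            for (int rowIndex = 0; rowIndex < input.size(); rowIndex++) {
                if (rowIndex == 0) {
                    // Header row of the input: validate it here if needed,
                    // but do not create an output row for it.
                    continue;
                }
                out.add(new ArrayList<>(input.get(rowIndex)));
            }
        }
        return out;
    }
}
```

In the POI version this would correspond to moving the createRow call into the branch that handles data rows, or skipping the first row of each input sheet before the cell loop starts.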
I am trying to copy a table from one docx document to another. The last cell and the last row of the table get duplicated, along with the content of the last cell. Everything else works: the complete table is copied, and only the last cell and last row come out twice. It would be very helpful if someone could point out where I am going wrong; I have been trying for two full days without success.
This is my complete code. I am testing with a file that contains only one table and nothing else.
public class ExtractTables {
public static void main(String args[]) throws JAXBException, IOException, Docx4JException {
try {
WordprocessingMLPackage wordMLPackage;
wordMLPackage = WordprocessingMLPackage.load(new java.io.File("D://Table.docx"));
MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();
ObjectFactory factory = Context.getWmlObjectFactory();
ExtractTables wordExt = new ExtractTables();
wordExt.extractTable(factory, documentPart);
} catch (Docx4JException DOE) {
System.out.println("The exception is : ");
DOE.printStackTrace();
}
}
private void extractTable(ObjectFactory objFact, MainDocumentPart documentPart) throws Docx4JException {
try {
WordprocessingMLPackage wml = WordprocessingMLPackage.createPackage();
int noTbls = 0;
int noRows = 0;
int noCells = 0;
int noParas = 0;
int noTexts = 0;
List<Object> allTables = getAllElementFromObject(documentPart, Tbl.class);
for (Object table : allTables) {
noTbls++;
Tbl tbl = (Tbl) table;
Tr tr = null;
Tc tc = null;
P p = null;
R r = null;
Text txt = null;
// Get all the Rows in the table
List<Object> allRows = getAllElementFromObject(tbl, Tr.class);
for (Object row : allRows) {
tr = (Tr) row;
noRows++;
// Get all the Cells in the Row
List<Object> allCells = getAllElementFromObject(tr, Tc.class);
for (Object cell : allCells) {
tc = (Tc) cell;
noCells++;
// Get all the Paragraph's in the Cell
List<Object> allParas = getAllElementFromObject(tc, P.class);
for (Object par : allParas) {
p = (P) par;
noParas++;
// Get all the Run's in the Paragraph
List<Object> allRuns = getAllElementFromObject(p, R.class);
for (Object runs : allRuns) {
r = (R) runs;
// Get the Text in the Run
List<Object> allText = getAllElementFromObject(r, Text.class);
for (Object text : allText) {
noTexts++;
txt = (Text) text;
}
System.out.println("No of Text in Para No: " + noParas + "are: " + noTexts);
}
}
System.out.println("No of Paras in Cell No: " + noCells + "are: " + noParas);
}
System.out.println("No of Cells in Row No: " + noRows + "are: " + noCells);
}
System.out.println("No of Rows in Table No: " + noTbls + "are: " + noRows);
r.getContent().add(txt);
p.getContent().add(r);
tc.getContent().add(p);
tr.getContent().add(tc);
tbl.getContent().add(tr);
wml.getMainDocumentPart().addObject(tbl);
wml.save(new File("D://TestTable.docx"));
}
System.out.println("Total no of Tables: " + noTbls);
}
catch (Docx4JException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
private static List getAllElementFromObject(Object obj, Class toSearch) {
List<Object> result = new ArrayList<Object>();
if (obj instanceof JAXBElement)
obj = ((JAXBElement<?>) obj).getValue();
if (obj.getClass().equals(toSearch))
result.add(obj);
else if (obj instanceof ContentAccessor) {
List<?> children = ((ContentAccessor) obj).getContent();
for (Object child : children) {
result.addAll(getAllElementFromObject(child, toSearch));
}
}
return result;
}
}
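Looking at the posted code, the lines after the loops add the last visited run, paragraph, cell and row to parents that already contain them, which would explain the duplicated last cell and last row. The sketch below reproduces that effect on a hypothetical Node tree standing in for docx4j's Tbl/Tr/Tc objects; it does not use docx4j itself.

```java
import java.util.ArrayList;
import java.util.List;

final class DuplicateChildSketch {
    // Hypothetical stand-in for a docx4j ContentAccessor (Tbl, Tr, Tc, ...).
    static final class Node {
        final String name;
        final List<Node> content = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    // Builds a table with the given number of rows and cells per row.
    static Node buildTable(int rows, int cells) {
        Node tbl = new Node("tbl");
        for (int r = 0; r < rows; r++) {
            Node tr = new Node("tr" + r);
            for (int c = 0; c < cells; c++) {
                tr.content.add(new Node("tc" + r + "_" + c));
            }
            tbl.content.add(tr);
        }
        return tbl;
    }

    // Mimics the posted loops: remember the last row and cell, then add them again.
    static void reAddLastVisited(Node tbl) {
        Node tr = null;
        Node tc = null;
        for (Node row : tbl.content) {
            tr = row;
            for (Node cell : row.content) {
                tc = cell;
            }
        }
        if (tr != null && tc != null) {
            tr.content.add(tc);  // the last cell now appears twice in the last row
            tbl.content.add(tr); // the last row now appears twice in the table
        }
    }
}
```

If this is indeed the cause, dropping the five getContent().add(...) calls and adding only the table to the new package should be enough; docx4j's XmlUtils.deepCopy may be worth a look if the table must not be shared between the two packages.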
I have a docx file with the following text:
"My name is santhanam"
"I'm from India"
"I love docx4j"
I bookmarked these three paragraphs with the bookmark names para0, para1 and para2. I need a text file as output with the following content:
{para0}My name is santhanam{para0}
{para1}I'm from India{para1}
{para2}I love docx4j{para2}
I have already achieved this with the following code.
public class GetBookMark {
public static void main(String[] args) throws Exception {
String docString = "";
String outputfilepath = "5.txt";
String inputfilepath = "bookmark.docx";
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new java.io.File(inputfilepath));
MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();
// String bookmark[] = new String[100000];
GetBookMark gb = new GetBookMark();
ClassFinder finder = new ClassFinder(CTBookmark.class); // <----- change
// this to suit
new TraversalUtil(documentPart.getContent(), finder);
for (Object o : finder.results)
{
CTBookmark BookMkStart = (CTBookmark) o;
String BookMarkName = BookMkStart.getName();
if (BookMarkName.startsWith("para")) {
P p = gb.findBookmarkedParagraphInMainDocumentPart(BookMarkName, documentPart);
List<Object> texts = getAllElementFromObject(p, Text.class);
if (texts.size() == 0) {
} else {
Text t1st = (Text) texts.get(0);
t1st.setValue("<" + BookMarkName + ">" + t1st.getValue());
Text tLast = (Text) texts.get(texts.size() - 1);
tLast.setValue(tLast.getValue() + "</" + BookMarkName + ">");
}
for (Object o1 : texts) {
Text t = (Text) o1;
docString += t.getValue();
}
docString += "\r\n";
}
}
// System.out.println("Document\n---------------\n" + docString);
try {
// BufferedWriter bw = new BufferedWriter(new
// FileWriter(outputfilepath));
Writer writer = new OutputStreamWriter(new FileOutputStream(outputfilepath), "UTF-8");
BufferedWriter bw = new BufferedWriter(writer);
bw.write(docString);
bw.close();
} catch (Exception e) {
System.out.println("Exception while writing to file : " + e);
}
}
public static List<Object> getAllElementFromObject(Object obj, Class<?> toSearch) {
List<Object> result = new ArrayList<Object>();
if (obj instanceof JAXBElement)
obj = ((JAXBElement<?>) obj).getValue();
if (obj.getClass().equals(toSearch))
result.add(obj);
else if (obj instanceof ContentAccessor) {
List<?> children = ((ContentAccessor) obj).getContent();
for (Object child : children) {
result.addAll(getAllElementFromObject(child, toSearch));
}
}
return result;
}
private P findBookmarkedParagraphInMainDocumentPart(String name, MainDocumentPart documentPart)
throws JAXBException, Docx4JException {
final String xpath = "//w:bookmarkStart[@w:name='" + name + "']/..";
List<Object> objects = documentPart.getJAXBNodesViaXPath(xpath, false);
return (org.docx4j.wml.P) XmlUtils.unwrap(objects.get(0));
}
// No xpath implementation for other parts than main document; traverse
// manually
private P findBookmarkedParagraphInPart(Object parent, String bookmark) {
P p = traversePartForBookmark(parent, bookmark);
return p;
}
// Used internally by findBookmarkedParagrapghInPart().
private P traversePartForBookmark(Object parent, String bookmark) {
P p = null;
List children = TraversalUtil.getChildrenImpl(parent);
if (children != null) {
for (Object o : children) {
o = XmlUtils.unwrap(o);
if (o instanceof CTBookmark) {
if (((CTBookmark) o).getName().toLowerCase().equals(bookmark)) {
return (P) parent; // If bookmark found, the surrounding
// P is what is interesting.
}
}
p = traversePartForBookmark(o, bookmark);
if (p != null) {
break;
}
}
}
return p;
}
}
Now I have a docx file containing tables, bookmarked as table0, table1 and so on, with bookmarked rows (Tr) and cells (Tc). Is it possible to get output like this?
{table0}{row0}{cell0}{para0}text string{para0}{para1}text string{para1}{cell0}{row0}
{row1}{cell1}{para2}text string{para2}{para3}text string{para3}{cell1}{row1}{table0}
Thanks in advance.
UPDATE
I'm now halfway there with the following code:
public class GetBookMark {
public static void main(String[] args) throws Exception {
String docString = "";
String outputfilepath = "BMChapter 14.txt";
String inputfilepath = "BMTable.docx";
String rowbm = null;
String tblbm = null;
String parabm = null;
String cellbm = null;
String tblparabm = null;
List<Object> tblTexts = null;
String partDocString = null;
String prtblbm = null;
String prrowbm = null;
String prcellbm = null;
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new java.io.File(inputfilepath));
MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();
GetBookMark gb = new GetBookMark();
ClassFinder finder = new ClassFinder(CTBookmark.class); // <----- change
new TraversalUtil(documentPart.getContent(), finder);
for (Object o : finder.results) {
CTBookmark BookMkStart = (CTBookmark) o;
String BookMarkName = BookMkStart.getName();
if (BookMarkName.startsWith("para")) {
P p = gb.findBookmarkedParagraphInMainDocumentPart(BookMarkName, documentPart);
List<Object> texts = getAllElementFromObject(p, Text.class);
if (texts.size() == 0) {
} else {
Text t1st = (Text) texts.get(0);
t1st.setValue("<" + BookMarkName + ">" + t1st.getValue());
Text tLast = (Text) texts.get(texts.size() - 1);
tLast.setValue(tLast.getValue() + "</" + BookMarkName + ">");
}
for (Object o1 : texts) {
Text t = (Text) o1;
docString += t.getValue();
}
docString += "\r\n";
} else {
if (BookMarkName.startsWith("table")) {
// rowbm = "</"+BookMarkName+">";
// tblbm = "<"+BookMarkName+">";
tblbm = BookMarkName;
}
if (BookMarkName.startsWith("row")) {
// rowbm = "</"+BookMarkName+">" +rowbm;
// tblbm +="<"+BookMarkName+">";
rowbm = BookMarkName;
}
if (BookMarkName.startsWith("cell")) {
// rowbm = "</"+BookMarkName+">" +rowbm;
// tblbm+="<"+BookMarkName+">";
cellbm = BookMarkName;
}
if (BookMarkName.startsWith("tble")) {
// rowbm = "</"+BookMarkName+">" +rowbm;
// tblbm+="<"+BookMarkName+">";
tblparabm = BookMarkName;
P p = gb.findBookmarkedParagraphInMainDocumentPart(BookMarkName, documentPart);
List<Object> texts = getAllElementFromObject(p, Text.class);
if (texts.size() == 0) {
} else {
if (prtblbm != tblbm) {
docString += "<" + tblbm + ">";
}
if (prrowbm != rowbm) {
docString += "<" + rowbm + ">";
}
if (prcellbm != cellbm) {
docString += "<" + cellbm + ">";
}
Text t1st = (Text) texts.get(0);
t1st.setValue("<" + tblparabm + ">" + t1st.getValue());
Text tLast = (Text) texts.get(texts.size() - 1);
tLast.setValue(tLast.getValue() + "</" + tblparabm + ">");
}
prtblbm = tblbm;
prrowbm = rowbm;
prcellbm = cellbm;
}
for (Object o1 : texts) {
Text t = (Text) o1;
docString += t.getValue();
}
docString += "\r\n";
}
}
try {
Writer writer = new OutputStreamWriter(new FileOutputStream(outputfilepath), "UTF-8");
BufferedWriter bw = new BufferedWriter(writer);
bw.write(docString);
bw.close();
} catch (Exception e) {
System.out.println("Exception while writing to file : " + e);
}
System.out.println(docString);
}
}
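One way to get properly nested output is to walk the structure top-down (table, then rows, then cells, then paragraphs) instead of iterating over the flat list of bookmarks. The sketch below shows only the rendering step on a hypothetical Node tree; building that tree from docx4j's Tbl, Tr, Tc and P objects and their bookmark names is left out and would need to be adapted.

```java
import java.util.ArrayList;
import java.util.List;

final class NestedTagSketch {
    // Hypothetical tree node: a bookmark name, optional text, and child nodes.
    static final class Node {
        final String bookmark;
        final String text; // null for containers such as tables, rows and cells
        final List<Node> children = new ArrayList<>();

        Node(String bookmark, String text) {
            this.bookmark = bookmark;
            this.text = text;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Emits {name}, then the node's own text and children, then {name} again.
    static String render(Node node) {
        StringBuilder sb = new StringBuilder();
        sb.append('{').append(node.bookmark).append('}');
        if (node.text != null) {
            sb.append(node.text);
        }
        for (Node child : node.children) {
            sb.append(render(child));
        }
        sb.append('{').append(node.bookmark).append('}');
        return sb.toString();
    }
}
```

Because each node closes its own tag after its children, the closing tags come out in the reverse order of the opening ones without any bookkeeping of "previous" table, row or cell names.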
I am using Apache POI 3.7 and trying to replace the value of a table column in a Word document (docx). However, my code keeps appending the new value to the current value in the document. Only when a table cell is empty does it end up with just the new value. Can you give me some ideas on how to resolve this? Below is the code I have so far.
Thanks in advance.
package test.doc;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFTable;
import org.apache.poi.xwpf.usermodel.XWPFTableCell;
import org.apache.poi.xwpf.usermodel.XWPFTableRow;
public class POIDocXTableTest {
public static void main(String[] args)throws IOException {
String fileName = "C:\\Test.docx";
InputStream fis = new FileInputStream(fileName);
XWPFDocument document = new XWPFDocument(fis);
List<XWPFParagraph> paragraphs = document.getParagraphs();
for (int x=0; x<paragraphs.size();x++)
{
XWPFParagraph paragraph = paragraphs.get(x);
System.out.println(paragraph.getParagraphText());
}
List<XWPFTable> tables = document.getTables();
for (int x=0; x<tables.size();x++)
{
XWPFTable table = tables.get(x);
List<XWPFTableRow> tableRows = table.getRows();
tableRows.remove(x);
for (int r=0; r<tableRows.size();r++)
{
System.out.println("Row "+ (r+1)+ ":");
XWPFTableRow tableRow = tableRows.get(r);
List<XWPFTableCell> tableCells = tableRow.getTableCells();
for (int c=0; c<tableCells.size();c++)
{
System.out.print("Column "+ (c+1)+ ": ");
XWPFTableCell tableCell = tableCells.get(c);
//tableCell.setText("TAE");
String tableCellVal = tableCell.getText();
if ((c+1)==2){
if (tableCellVal!=null){
if (tableCellVal.length()>0){
char c1 = tableCellVal.charAt(0);
String s2 = "-TEST";
char c2 = s2.charAt(0);
String test = tableCell.getText().replace(tableCellVal,s2);
tableCell.setText(test);
}else{
//tableCell.setText("NULL");
}
}
}
System.out.println("tableCell.getText(" + (c) + "):" + tableCellVal);
}
}
System.out.println("\n");
}
OutputStream out = new FileOutputStream(fileName);
document.write(out);
out.close();
}
}
The best solution I have found for preserving styles in paragraphs and for finding search strings that are split across differently styled runs is this method:
private long replaceInParagraphs(Map<String, String> replacements, List<XWPFParagraph> xwpfParagraphs) {
long count = 0;
for (XWPFParagraph paragraph : xwpfParagraphs) {
List<XWPFRun> runs = paragraph.getRuns();
for (Map.Entry<String, String> replPair : replacements.entrySet()) {
String find = replPair.getKey();
String repl = replPair.getValue();
TextSegment found = paragraph.searchText(find, new PositionInParagraph());
if ( found != null ) {
count++;
if ( found.getBeginRun() == found.getEndRun() ) {
// whole search string is in one Run
XWPFRun run = runs.get(found.getBeginRun());
String runText = run.getText(0); // text element 0; getTextPosition() is the vertical text offset, not an index
String replaced = runText.replace(find, repl);
run.setText(replaced, 0);
} else {
// The search string spans over more than one Run
// Put the Strings together
StringBuilder b = new StringBuilder();
for (int runPos = found.getBeginRun(); runPos <= found.getEndRun(); runPos++) {
XWPFRun run = runs.get(runPos);
b.append(run.getText(0)); // text element 0 of each run in the match
}
String connectedRuns = b.toString();
String replaced = connectedRuns.replace(find, repl);
// The first Run receives the replaced String of all connected Runs
XWPFRun partOne = runs.get(found.getBeginRun());
partOne.setText(replaced, 0);
// Removing the text in the other Runs.
for (int runPos = found.getBeginRun()+1; runPos <= found.getEndRun(); runPos++) {
XWPFRun partNext = runs.get(runPos);
partNext.setText("", 0);
}
}
}
}
}
return count;
}
This method works with search strings spanning over more than one Run. The replaced part gets the style from the first found Run.
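The multi-run case can be illustrated without POI by treating each run as a plain string. This is only a sketch under that assumption: the list of strings stands in for the runs of one paragraph, and the method name is made up. It joins the runs covered by the match, replaces inside the joined text, stores the result in the first run and blanks the rest, so the replacement ends up with the first run's formatting.

```java
import java.util.List;

final class MultiRunReplaceSketch {
    // Replaces find with repl where the first match may span several runs.
    // Returns true if a match was found; the list is modified in place.
    static boolean replaceAcrossRuns(List<String> runs, String find, String repl) {
        if (find.isEmpty()) {
            return false;
        }
        String all = String.join("", runs);
        int start = all.indexOf(find);
        if (start < 0) {
            return false;
        }
        int end = start + find.length() - 1; // index of the last matched character

        // Map the character range back to the runs that contain it.
        int beginRun = -1;
        int endRun = -1;
        int offset = 0;
        for (int i = 0; i < runs.size(); i++) {
            int next = offset + runs.get(i).length();
            if (beginRun < 0 && start < next) {
                beginRun = i;
            }
            if (end < next) {
                endRun = i;
                break;
            }
            offset = next;
        }

        // Join the affected runs, replace, and keep the result in the first one.
        StringBuilder joined = new StringBuilder();
        for (int i = beginRun; i <= endRun; i++) {
            joined.append(runs.get(i));
        }
        runs.set(beginRun, joined.toString().replace(find, repl));
        for (int i = beginRun + 1; i <= endRun; i++) {
            runs.set(i, "");
        }
        return true;
    }
}
```

Like the method above, this handles one match per call; calling it in a loop until it returns false would cover repeated placeholders, provided the replacement text does not itself contain the search string.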
Well, I have done something like that to replace markers in a Word template with specified words:
public DotxTemplateFiller() {
String filename = "/poi/ls_Template_modern_de.dotx";
String outputPath = "/poi/output/output" + new Date().getTime()
+ ".dotx";
OutputStream out = null;
try {
File file = new File(filename);
XWPFDocument template = new XWPFDocument(new FileInputStream(file));
List<XWPFParagraph> xwpfParagraphs = template.getParagraphs();
replaceInParagraphs(xwpfParagraphs);
List<XWPFTable> tables = template.getTables();
for (XWPFTable xwpfTable : tables) {
List<XWPFTableRow> tableRows = xwpfTable.getRows();
for (XWPFTableRow xwpfTableRow : tableRows) {
List<XWPFTableCell> tableCells = xwpfTableRow
.getTableCells();
for (XWPFTableCell xwpfTableCell : tableCells) {
xwpfParagraphs = xwpfTableCell.getParagraphs();
replaceInParagraphs(xwpfParagraphs);
}
}
}
out = new FileOutputStream(new File(outputPath));
template.write(out);
out.flush();
out.close();
//System.exit(0);
} catch (IOException e) {
e.printStackTrace();
} finally {
if (out != null) {
try {
out.close();
} catch (IOException e) {
// nothing to do ....
}
}
}
}
/**
* @param xwpfParagraphs
*/
private void replaceInParagraphs(List<XWPFParagraph> xwpfParagraphs) {
for (XWPFParagraph xwpfParagraph : xwpfParagraphs) {
List<XWPFRun> xwpfRuns = xwpfParagraph.getRuns();
for (XWPFRun xwpfRun : xwpfRuns) {
String xwpfRunText = xwpfRun.getText(xwpfRun
.getTextPosition());
for (Map.Entry<String, String> entry : replacements
.entrySet()) {
if (xwpfRunText != null
&& xwpfRunText.contains(entry.getKey())) {
xwpfRunText = xwpfRunText.replaceAll(
entry.getKey(), entry.getValue());
}
}
xwpfRun.setText(xwpfRunText, 0);
}
}
}
public static void main(String[] args) {
new DotxTemplateFiller();
}
First I did it for regular paragraphs in the MS Word template and then for paragraphs inside table cells.
Hope it is helpful for you, and I hope I understood your problem correctly :-)
Best wishes.
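One detail worth keeping in mind with the replaceAll call above: String.replaceAll treats its first argument as a regular expression and gives "$" and "\" special meaning in the replacement, so markers such as "$name$" or "${name}" will not behave as literal text. A minimal sketch using String.replace instead, which takes both arguments literally (the class and method names are made up for this example):

```java
import java.util.Map;

final class LiteralReplaceSketch {
    // Applies every key -> value pair as a plain-text substitution.
    static String applyAll(String text, Map<String, String> replacements) {
        String result = text;
        for (Map.Entry<String, String> entry : replacements.entrySet()) {
            // String.replace does no regex matching and no group expansion.
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }
}
```

If regex matching is actually wanted, Pattern.quote for the key and Matcher.quoteReplacement for the value are the usual way to keep replaceAll while still treating the marker literally.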
Adding on to Josh's solution, the map I am building has ended up with over a thousand tags and continues to grow. To cut down on processing, I decided to build a small subset of the tags that I know appear in the paragraph, typically ending up with a map of only one or two tags that I then pass as the Map to the replaceInParagraphs method provided above. Also, using the Substitution object to store the substitution text, allows me to add methods into that object (such as formatting) that I can call once the substitution has been completed. Using the subset Map also allows me to know what replacements have been made in any paragraph.
private Map<String, Substitution> buildTagList(Map<String, Substitution> replacements, List<XWPFParagraph> xwpfParagraphs, String start, String end) {
Map<String, Substitution> returnMap = new HashMap<String, Substitution> ();
for (XWPFParagraph paragraph : xwpfParagraphs) {
List<XWPFRun> runs = paragraph.getRuns();
// Check if there is a tag in the paragraph
TextSegment found = paragraph.searchText(start, new PositionInParagraph());
String runText = "";
XWPFRun run = null;
if ( found != null ) {
StringBuilder b = new StringBuilder();
for (int runPos = found.getBeginRun(); runPos < runs.size(); runPos++) {
run = runs.get(runPos);
b.append(run.getText(run.getTextPosition()));
runText = b.toString();
}
// Now we need to find all tags in the run
boolean finished = false;
int tagStart = 0;
int tagEnd = 0;
while ( ! finished ) {
// get the first tag
tagStart = runText.indexOf(start,tagStart);
tagEnd = runText.indexOf(end, tagEnd);
if ( tagStart >= 0 ) {
String tag = runText.substring(tagStart, tagEnd + end.length());
Substitution s = replacements.get(tag);
if (s != null) {
returnMap.putIfAbsent(tag,s);
}
}
else
finished = true;
tagStart = tagEnd + end.length();
tagEnd = tagStart;
}
}
}
return returnMap;
}
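The tag-scanning part of buildTagList can be tried out on plain strings. The sketch below is a simplified, POI-free variant with a made-up method name: it searches for the end delimiter only after the matching start delimiter and stops when a tag is not terminated, two cases where the indexOf/substring combination above could otherwise run into a negative index.

```java
import java.util.ArrayList;
import java.util.List;

final class TagScanSketch {
    // Returns every substring that begins with start and ends with the next end.
    static List<String> findTags(String text, String start, String end) {
        if (start.isEmpty() || end.isEmpty()) {
            throw new IllegalArgumentException("delimiters must not be empty");
        }
        List<String> tags = new ArrayList<>();
        int from = 0;
        while (true) {
            int tagStart = text.indexOf(start, from);
            if (tagStart < 0) {
                break; // no further tags
            }
            int tagEnd = text.indexOf(end, tagStart + start.length());
            if (tagEnd < 0) {
                break; // unterminated tag: ignore the rest
            }
            tags.add(text.substring(tagStart, tagEnd + end.length()));
            from = tagEnd + end.length();
        }
        return tags;
    }
}
```

The returned tags could then be looked up in the full replacement map to build the small per-paragraph subset described above.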