How do you properly use PDPageContentStream::setTextRise? - java

Using PDFBox, given data notated like this: [G]Glory be to [D]God [Em]the [C]Father,\n[G]And to [A]Christ the [D]Son, I am creating a guitar chord sheet in which each chord is printed on the line above the lyric it belongs to.
My approach was to iterate through each character in the song and check the current index against the map. Whenever the map has an entry for that character index, we "jump" to the line above, write the chord, then jump back down.
The method setTextRise looked promising, but the horizontal spacing still comes out wrong: each chord is drawn inline, so it advances the text position and pushes the following lyrics to the right.
Here's an SSCCE (needs PDFBox libraries) that produces the PDF above:
public static void main(String[] args) {
try {
String extracted_text = "Capo 1\n\n1\n[G]Glory be to [D]God [Em]the [C]Father,\n[G]And to [A]Christ the [D]Son,\n[B7]Glory to the [Em]Holy [C]Spirit—\n[D-D7]Ever [ G]One.\n\n2\nAs we view the vast creation,\nPlanned with wondrous skill,\nSo our hearts would move to worship,\nAnd be still.\n\n3\nBut, our God, how great Thy yearning\nTo have sons who love\nIn the Son e’en now to praise Thee,\nLove to prove!\n\n4\n’Twas Thy thought in revelation,\nTo present to men\nSecrets of Thine own affections,\nTheirs to win.\n\n5\nSo in Christ, through His redemption\n(Vanquished evil powers!)\nThou hast brought, in new creation,\nWorshippers!\n\n6\nGlory be to God the Father,\nAnd to Christ the Son,\nGlory to the Holy Spirit—\nEver One.\n".replaceAll("\n", "\r");
String[] lines = extracted_text.split("\\r");
ArrayList<SongLine> songlines = new ArrayList<>();
for(String s : lines) {
LinkedHashMap<Integer, String> chords = new LinkedHashMap<>();
StringBuilder line = new StringBuilder();
StringBuilder currentchord = null;
int index = 0;
for(char c : s.toCharArray()) {
if(currentchord != null) {
if(c == ']') {
chords.put(index, currentchord.toString());
currentchord = null;
} else {
currentchord.append(c);
}
} else {
if(c == '[') {
currentchord = new StringBuilder();
} else {
line.append(c);
index++;
}
}
}
SongLine sl = new SongLine();
if(chords.size() > 0)
sl.char_index_to_chords = chords;
sl.line = line.toString();
songlines.add(sl);
}
try (PDDocument doc = new PDDocument()) {
PDPage page = new PDPage();
PDPageContentStream pcs = new PDPageContentStream(doc, page);
int firstLineX = 25;
int firstLineY = 700;
boolean first = true;
float leading = 14.5f;
pcs.beginText();
pcs.newLineAtOffset(firstLineX, firstLineY);
pcs.setFont(PDType1Font.TIMES_ROMAN, 12);
pcs.setLeading(leading);
for(SongLine line : songlines) {
if(line.char_index_to_chords != null)
System.out.println(line.char_index_to_chords.toString());
System.out.println(line.line);
if(!first) {
pcs.newLine();
}
first = false;
if(line.char_index_to_chords != null) {
pcs.newLine();
}
for(int i = 0; i < line.line.length(); i++) {
pcs.showText(String.valueOf(line.line.charAt(i)));
if(line.char_index_to_chords != null && line.char_index_to_chords.containsKey(i)) {
pcs.setTextRise(12);
pcs.showText(line.char_index_to_chords.get(i));
pcs.setTextRise(0);
}
}
}
pcs.endText();
pcs.close();
doc.addPage(page);
String path = "0001.pdf";
doc.save(path);
Desktop.getDesktop().open(new File(path));
}
} catch (Exception e) {
e.printStackTrace();
}
}
static class SongLine {
Map<Integer, String> char_index_to_chords;
String line;
}
What would you do in PDFBox to create the text aligned with chords (like in the first image)?

I got it. The answer was not setTextRise, but rather newLineAtOffset, using getStringWidth to calculate the width of the lyrics that precede each chord:
for(SongLine line : songlines) {
if(!first) {
pcs.newLine();
}
first = false;
if(line.char_index_to_chords != null) {
float offset = 0;
for(Entry<Integer, String> entry : line.char_index_to_chords.entrySet()) {
float offsetX = font.getStringWidth(line.char_index_to_leading_lyrics.get(entry.getKey())) / (float)1000 * fontSize;
pcs.newLineAtOffset(offsetX, 0);
offset += offsetX;
pcs.showText(entry.getValue());
}
pcs.newLineAtOffset(-offset, -leading);
}
pcs.showText(line.line);
}
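The snippet above reads from a char_index_to_leading_lyrics map that is not part of the SongLine class shown in the question. Because newLineAtOffset moves relative to the start of the current line, each entry presumably holds the lyrics between the previous chord and the current one. Here is a minimal sketch of how such a map could be built; the class and method names are hypothetical and the chord positions are assumed to be in ascending order, as the parser in the question produces them:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LeadingLyrics {
    /**
     * For each chord position, returns the lyric text between the previous
     * chord position (or the start of the line) and this one.
     * Assumes chordIndexes is sorted in ascending order.
     */
    static Map<Integer, String> leadingLyrics(String lyrics, List<Integer> chordIndexes) {
        Map<Integer, String> result = new LinkedHashMap<>();
        int previous = 0;
        for (int index : chordIndexes) {
            result.put(index, lyrics.substring(previous, index));
            previous = index;
        }
        return result;
    }

    public static void main(String[] args) {
        String lyrics = "Glory be to God the Father,";
        // Chord positions for G, D, Em, C as computed by the question's parser.
        Map<Integer, String> leading = leadingLyrics(lyrics, List.of(0, 12, 16, 20));
        leading.forEach((i, text) -> System.out.println(i + " -> \"" + text + "\""));
    }
}
```

The width of each of these substrings (getStringWidth / 1000 * fontSize) is then the horizontal offset from one chord to the next.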

Related

The value "name" and "surname" aren't read apache poi

My purpose is to read a file docx and take this text "#name#" and "#surname#" and change the value with another casual text:
This is my docx file:
I do this:
XWPFDocument docx = new XWPFDocument(OPCPackage.open("..."));
for (XWPFParagraph p : docx.getParagraphs()) {
List<XWPFRun> runs = p.getRuns();
if (runs != null) {
for (XWPFRun r : runs) {
String text = r.getText(0);
if (text != null && text.startsWith("#") && text.endsWith("#")) {
text = text.replace("#", "new ");
r.setText(text, 0);
}
}
}
}
for (XWPFTable tbl : docx.getTables()) {
for (XWPFTableRow row : tbl.getRows()) {
for (XWPFTableCell cell : row.getTableCells()) {
for (XWPFParagraph p : cell.getParagraphs()) {
for (XWPFRun r : p.getRuns()) {
String text = r.getText(0);
if (text != null && text.startsWith("#") && text.endsWith("#")) {
text = text.replace("#", "new ");
r.setText(text,0);
}
}
}
}
}
The problem is that my code reads all the labels in the docx file except "#surname#" and "#name#". Can anyone help me?
From your screenshot it looks like "#name#" and "#surname#" are not in the document body directly but in a drawing (a text box or a shape, for example). Such elements are not covered by XWPFDocument.getParagraphs, .getTables or any other high-level method in Apache POI. So your main problem is that the paragraphs which contain your text are simply not traversed by your code.
The only way to really get all paragraphs out of the document's body is to use an XmlCursor that selects all w:p elements from the XML directly.
The code below shows this. It traverses all XWPFParagraphs in the document's body using an XmlCursor and replaces text where found.
For the replacement process I prefer the TextSegment replacement approach shown in "Apache POI: ${my_placeholder} is treated as three different runs". This is necessary because, even if the containing paragraph gets traversed, the text could be split across different text runs because of formatting, spell checking or other reasons. Microsoft Word has a nearly infinite number of reasons to split text into separate text runs.
import java.io.*;
import org.apache.poi.xwpf.usermodel.*;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;
import org.apache.xmlbeans.XmlObject;
import org.apache.xmlbeans.XmlCursor;
import java.util.Map;
import java.util.HashMap;
import java.util.List;
import java.util.ArrayList;
public class WordReplaceTextSegment {
/**
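* This method parses the paragraph and searches for the given string.
* If the string is found, a TextSegment describing its position is returned;
* otherwise null is returned. The search starts at startPos.
*
* @param paragraph the paragraph to search
* @param searched the text to search for
* @param startPos the position at which to start searching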
*/
static TextSegment searchText(XWPFParagraph paragraph, String searched, PositionInParagraph startPos) {
int startRun = startPos.getRun(),
startText = startPos.getText(),
startChar = startPos.getChar();
int beginRunPos = 0, candCharPos = 0;
boolean newList = false;
//CTR[] rArray = paragraph.getRArray(); //This does not contain all runs. It lacks hyperlink runs for ex.
java.util.List<XWPFRun> runs = paragraph.getRuns();
int beginTextPos = 0, beginCharPos = 0; //must be outside the for loop
//for (int runPos = startRun; runPos < rArray.length; runPos++) {
for (int runPos = startRun; runPos < runs.size(); runPos++) {
//int beginTextPos = 0, beginCharPos = 0, textPos = 0, charPos; //int beginTextPos = 0, beginCharPos = 0 must be outside the for loop
int textPos = 0, charPos;
//CTR ctRun = rArray[runPos];
CTR ctRun = runs.get(runPos).getCTR();
XmlCursor c = ctRun.newCursor();
c.selectPath("./*");
try {
while (c.toNextSelection()) {
XmlObject o = c.getObject();
if (o instanceof CTText) {
if (textPos >= startText) {
String candidate = ((CTText) o).getStringValue();
if (runPos == startRun) {
charPos = startChar;
} else {
charPos = 0;
}
for (; charPos < candidate.length(); charPos++) {
if ((candidate.charAt(charPos) == searched.charAt(0)) && (candCharPos == 0)) {
beginTextPos = textPos;
beginCharPos = charPos;
beginRunPos = runPos;
newList = true;
}
if (candidate.charAt(charPos) == searched.charAt(candCharPos)) {
if (candCharPos + 1 < searched.length()) {
candCharPos++;
} else if (newList) {
TextSegment segment = new TextSegment();
segment.setBeginRun(beginRunPos);
segment.setBeginText(beginTextPos);
segment.setBeginChar(beginCharPos);
segment.setEndRun(runPos);
segment.setEndText(textPos);
segment.setEndChar(charPos);
return segment;
}
} else {
candCharPos = 0;
}
}
}
textPos++;
} else if (o instanceof CTProofErr) {
c.removeXml();
} else if (o instanceof CTRPr) {
//do nothing
} else {
candCharPos = 0;
}
}
} finally {
c.dispose();
}
}
return null;
}
static void replaceTextSegment(XWPFParagraph paragraph, String textToFind, String replacement) {
TextSegment foundTextSegment = null;
PositionInParagraph startPos = new PositionInParagraph(0, 0, 0);
//while((foundTextSegment = paragraph.searchText(textToFind, startPos)) != null) { // search all text segments having text to find
while((foundTextSegment = searchText(paragraph, textToFind, startPos)) != null) { // search all text segments having text to find
System.out.println(foundTextSegment.getBeginRun()+":"+foundTextSegment.getBeginText()+":"+foundTextSegment.getBeginChar());
System.out.println(foundTextSegment.getEndRun()+":"+foundTextSegment.getEndText()+":"+foundTextSegment.getEndChar());
// maybe there is text before textToFind in begin run
XWPFRun beginRun = paragraph.getRuns().get(foundTextSegment.getBeginRun());
String textInBeginRun = beginRun.getText(foundTextSegment.getBeginText());
String textBefore = textInBeginRun.substring(0, foundTextSegment.getBeginChar()); // we only need the text before
// maybe there is text after textToFind in end run
XWPFRun endRun = paragraph.getRuns().get(foundTextSegment.getEndRun());
String textInEndRun = endRun.getText(foundTextSegment.getEndText());
String textAfter = textInEndRun.substring(foundTextSegment.getEndChar() + 1); // we only need the text after
if (foundTextSegment.getEndRun() == foundTextSegment.getBeginRun()) {
textInBeginRun = textBefore + replacement + textAfter; // if we have only one run, we need the text before, then the replacement, then the text after in that run
} else {
textInBeginRun = textBefore + replacement; // else we need the text before followed by the replacement in begin run
endRun.setText(textAfter, foundTextSegment.getEndText()); // and the text after in end run
}
beginRun.setText(textInBeginRun, foundTextSegment.getBeginText());
// runs between begin run and end run needs to be removed
for (int runBetween = foundTextSegment.getEndRun() - 1; runBetween > foundTextSegment.getBeginRun(); runBetween--) {
paragraph.removeRun(runBetween); // remove not needed runs
}
}
}
static List<XmlObject> getCTPObjects(XWPFDocument doc) {
List<XmlObject> result = new ArrayList<XmlObject>();
//create cursor selecting all paragraph elements
XmlCursor cursor = doc.getDocument().newCursor();
cursor.selectPath("declare namespace w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' .//*/w:p");
while(cursor.hasNextSelection()) {
cursor.toNextSelection();
XmlObject obj = cursor.getObject();
// add only if the paragraph contains at least a run containing text
if (obj.selectPath("declare namespace w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' ./w:r/w:t").length > 0) {
result.add(obj);
}
}
return result;
}
static void traverseAllParagraphsAndReplace(XWPFDocument doc, Map<String, String> replacements) throws Exception {
//This gets all XWPFParagraphs out of the stored XML and replaces text in them
//first get all CTP objects
List<XmlObject> allCTPObjects = getCTPObjects(doc);
//then traverse them and create XWPFParagraphs from them and do the replacing
for (XmlObject obj : allCTPObjects) {
XWPFParagraph paragraph = null;
if (obj instanceof CTP) {
CTP p = (CTP)obj;
paragraph = new XWPFParagraph(p, doc);
} else {
CTP p = CTP.Factory.parse(obj.xmlText());
paragraph = new XWPFParagraph(p, doc);
}
if (paragraph != null) {
for (String textToFind : replacements.keySet()) {
String replacement = replacements.get(textToFind);
if (paragraph.getText().contains(textToFind)) replaceTextSegment(paragraph, textToFind, replacement);
}
}
obj.set(paragraph.getCTP());
}
}
public static void main(String[] args) throws Exception {
XWPFDocument doc = new XWPFDocument(new FileInputStream("source.docx"));
Map<String, String> replacements;
replacements = new HashMap<String, String>();
replacements.put("#name#", "Axel");
replacements.put("#surename#", "Richter");
traverseAllParagraphsAndReplace(doc, replacements);
FileOutputStream out = new FileOutputStream("result.docx");
doc.write(out);
out.close();
doc.close();
}
}
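To see why checking each run on its own misses a placeholder, here is a small POI-free sketch that models a paragraph as a plain list of run strings and locates the runs that a search string spans. The class and method names are hypothetical; this only illustrates the idea behind TextSegment (begin run and end run) and is not the POI implementation:

```java
import java.util.List;

class RunSpan {
    /**
     * Returns {beginRun, endRun} for the first occurrence of a non-empty
     * needle in the concatenated run texts, or null if it does not occur.
     */
    static int[] locate(List<String> runs, String needle) {
        StringBuilder all = new StringBuilder();
        for (String r : runs) {
            all.append(r);
        }
        int start = all.indexOf(needle);
        if (start < 0) {
            return null;
        }
        int end = start + needle.length() - 1; // index of the last matched character
        int begin = -1;
        int finish = -1;
        int offset = 0;                        // start index of the current run
        for (int i = 0; i < runs.size(); i++) {
            int next = offset + runs.get(i).length(); // exclusive end of run i
            if (begin < 0 && start < next) {
                begin = i;
            }
            if (end < next) {
                finish = i;
                break;
            }
            offset = next;
        }
        return new int[] {begin, finish};
    }

    public static void main(String[] args) {
        // No single run contains "#name#", but the concatenation does.
        List<String> runs = List.of("Hello #na", "me", "#!");
        int[] span = locate(runs, "#name#");
        System.out.println("begin run " + span[0] + ", end run " + span[1]);
    }
}
```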

How to change size and color of appended text in java

I am working with a block of code that contains multiple lines of appended text. Is there a simple method to change the font color and size? I am specifically looking to change the font size of the appended text like sb.append(record.getSegText())?
Below is the code.
@Override
public String inspect(ReferencedCoordinate coord) throws VizException {
if (resourceData.hideSampling) {
return "";
}
// check if we are in the last frame
boolean lastFrame = false;
FramesInfo framesInfo = this.descriptor.getFramesInfo();
if (time != null) {
try {
Point point = gf.createPoint(coord.asLatLon());
for (String key : entryMap.keySet()) {
WarningEntry entry = entryMap.get(key);
AbstractWarningRecord record = entry.record;
if (matchesFrame(entry, time, framePeriod, lastFrame)
&& record.getGeometry() != null) {
Geometry recordGeom = record.getGeometry();
for (int i = 0; i < recordGeom.getNumGeometries(); i++) {
PreparedGeometry prepGeom = pgf.create(recordGeom
.getGeometryN(i));
if (prepGeom.contains(point)) {
StringBuffer sb = new StringBuffer();
String[] textToPrint = getText(record, 0);
for (String text : textToPrint) {
if (sb.length() > 0) {
sb.append(" ");
}
sb.append("\n\n\n\n\n\n");
sb.append(text);
sb.append(record.getSegText());
}
return sb.toString();
}
}
}
}
}
return "NO DATA";
}

Apache POI Word to HTML conversion - word boundary

I am using below code to convert word to html file
public Map convert(String wordDocPath, String htmlPath,
Map conversionParams)
{
log.info("Converting word file "+wordDocPath)
try
{
String workingFolder = "C:\\temp" // backslash must be escaped, otherwise "\t" is a tab
File workingFolderFile = new File(workingFolder)
FileInputStream fis = new FileInputStream(wordDocPath);
XWPFDocument document = new XWPFDocument(fis);
XHTMLOptions options = XHTMLOptions.create().URIResolver(new FileURIResolver(workingFolderFile));
options.setExtractor(new FileImageExtractor(workingFolderFile))
File htmlFile = new File(htmlPath);
OutputStream out = new FileOutputStream(htmlFile)
XHTMLConverter.getInstance().convert(document, out, options);
log.info("Converted to HTML file "+htmlPath)
}
catch(Exception e)
{
log.error("Exception :"+e.getMessage(),e)
}
}
The code generates the HTML output correctly.
I need to put some parameters in the document, like [[AGENT_NAME]], which I will later replace in code using a regex. But Apache POI does not treat this pattern as a single word: it sometimes splits it into "[[", "AGENT_NAME" and "]]" and inserts styled tags in between. Because of this I cannot write a regex to replace the parameters.
How does Apache POI decide word boundaries? Is there a way to control it?
After much effort, I finally decided to write code that parses the Word document and merges the split runs. Here is the code; I hope it helps someone else.
Note: I have used ${pattern} as the placeholder pattern.
void mergeSplittedPatterns(XWPFDocument document)
{
List<XWPFParagraph> paragraphs = document.paragraphs
for(XWPFParagraph paragraph : paragraphs)
{
List<XWPFRun> runs = paragraph.getRuns()
int firstCharRun,closingCharRun
boolean firstCharFound = false;
boolean secondCharFoundImmediately = false;
boolean closingCharFound = false;
boolean gotoNextRun = true
boolean scan = (runs!=null && runs.size()>0)
int index = 0
while(scan)
{
gotoNextRun = true;
XWPFRun run = runs.get(index)
String runText = run.getText(0)
if(runText!=null)
for (int i = 0; i < runText.length(); i++)
{
char character = runText.charAt(i);
if(secondCharFoundImmediately)
{
closingCharFound = (character=="}")
if(closingCharFound)
{
closingCharRun = index
if(firstCharRun==closingCharRun)
{
firstCharFound = secondCharFoundImmediately = closingCharFound = false
continue;
}
else
{
String mergedText= ""
for(int j=firstCharRun;j<=closingCharRun;j++)
{
mergedText += runs.get(j).getText(0)
}
runs.get(firstCharRun).setText(mergedText,0)
for(int j=closingCharRun;j>firstCharRun;j--)
{
paragraph.removeRun(j)
}
firstCharFound = secondCharFoundImmediately = closingCharFound = gotoNextRun = false
index = firstCharRun
break;
}
}
}
else if(firstCharFound)
{
secondCharFoundImmediately = (character=="{")
if(!secondCharFoundImmediately)
{
firstCharFound = secondCharFoundImmediately = closingCharFound = false
}
}
else if(character=="\$")
{
firstCharFound = true;
firstCharRun = index
}
}
if(gotoNextRun)
{
index++;
}
if(index>=runs.size())
{
scan = false;
}
}
}
}
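Once a placeholder sits in a single piece of text, the regex replacement step mentioned in the question could look roughly like the sketch below. It uses the [[NAME]] syntax from the question; the class name, the method name and the choice to leave unknown placeholders untouched are assumptions made for illustration:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderFill {
    // Matches [[NAME]] where NAME consists of letters, digits or underscores.
    private static final Pattern TAG = Pattern.compile("\\[\\[(\\w+)\\]\\]");

    /** Replaces every known [[NAME]] with its value; unknown names are kept as-is. */
    static String fill(String text, Map<String, String> values) {
        Matcher m = TAG.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.getOrDefault(m.group(1), m.group());
            // quoteReplacement keeps '$' and '\' in the value from being
            // interpreted as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<p>Dear [[AGENT_NAME]], your id is [[AGENT_ID]].</p>";
        System.out.println(fill(html, Map.of("AGENT_NAME", "Smith")));
    }
}
```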

How to find size of ArrayList<String> in my map?

I want to find the size of each value from the key-value pair in Map<Integer, ArrayList<String>>. Simply writing list.size() does not work.
Here's my code:
public void getF() throws Exception {
BufferedReader br2 =
new BufferedReader(
new FileReader("/home/abc/NetBeansProjects/network1.txt"));
System.out.println("hello" +r.usr);
while ((s= br2.readLine()) != null) {
String F[]= s.split(":");
for (String uid : F) {
if (uid == F[0]) {
user.add(uid);
} else {
li = followee.get(Integer.valueOf(F[0]));
if (li == null) {
followee.put(Integer.valueOf(F[0]), li= new ArrayList<String>());
}
li.add(uid);
}
System.out.println(followee);
int g = li.size();
System.out.println("g:" +g);
[...]
}
}
}
Why am I not getting the correct size on the last line?
Try to keep variable declarations as close as possible to where they are used; that makes the data structures easier to follow.
(I know in other languages the convention is to declare them at the top.)
Here li should be declared inside each iteration of the while loop. It is also more natural to handle F[0] outside the inner loop rather than with for + if; I think that combination is what tripped you up.
Set<String> user = new HashSet<>();
Map<Integer, List<String>> followee = new HashMap<>();
String s;
while ((s = br2.readLine()) != null) {
// s has the format "key:value value value"
String keyAndValues[] = s.split(":", 2);
if (keyAndValues.length != 2) {
continue;
}
Integer key = Integer.valueOf(keyAndValues[0]);
String values = keyAndValues[1];
user.add(keyAndValues[0]);
List<String> li = followee.get(key);
if (li == null) {
li = new ArrayList<>();
followee.put(key, li);
}
Collections.addAll(li, values.split(" +"));
System.out.println(followee);
int g = li.size();
System.out.println("g:" + g);
//[...]
}
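For reference, here is a self-contained variant of the loop above that can be run without the input file. It assumes the same "key:value value value" line format as the answer and uses Map.computeIfAbsent in place of the explicit null check; the class name and sample lines are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FolloweeSizes {
    /** Parses lines of the assumed form "key:value value value" into a map of lists. */
    static Map<Integer, List<String>> parse(List<String> lines) {
        Map<Integer, List<String>> followee = new LinkedHashMap<>();
        for (String s : lines) {
            String[] keyAndValues = s.split(":", 2);
            if (keyAndValues.length != 2) {
                continue; // skip lines without a key
            }
            Integer key = Integer.valueOf(keyAndValues[0].trim());
            // Creates the list on first use, otherwise returns the existing one.
            List<String> li = followee.computeIfAbsent(key, k -> new ArrayList<>());
            String values = keyAndValues[1].trim();
            if (!values.isEmpty()) {
                li.addAll(Arrays.asList(values.split(" +")));
            }
        }
        return followee;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> followee = parse(List.of("1:a b c", "2:d", "1:e"));
        // The size of each value is simply the size of the list stored under that key.
        followee.forEach((key, list) -> System.out.println(key + " -> " + list.size()));
    }
}
```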

Replace table column value in Apache POI

I am using Apache POI 3.7. I am trying to replace the value of a table column in a Word document (docx). However, my code keeps appending the new value to the value that is already in the cell; only when the cell is empty does it end up containing just the new value. Can you give me some ideas on how to resolve this? Below is the code I have so far.
Thanks in advance.
package test.doc;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFTable;
import org.apache.poi.xwpf.usermodel.XWPFTableCell;
import org.apache.poi.xwpf.usermodel.XWPFTableRow;
public class POIDocXTableTest {
public static void main(String[] args)throws IOException {
String fileName = "C:\\Test.docx";
InputStream fis = new FileInputStream(fileName);
XWPFDocument document = new XWPFDocument(fis);
List<XWPFParagraph> paragraphs = document.getParagraphs();
for (int x=0; x<paragraphs.size();x++)
{
XWPFParagraph paragraph = paragraphs.get(x);
System.out.println(paragraph.getParagraphText());
}
List<XWPFTable> tables = document.getTables();
for (int x=0; x<tables.size();x++)
{
XWPFTable table = tables.get(x);
List<XWPFTableRow> tableRows = table.getRows();
tableRows.remove(x);
for (int r=0; r<tableRows.size();r++)
{
System.out.println("Row "+ (r+1)+ ":");
XWPFTableRow tableRow = tableRows.get(r);
List<XWPFTableCell> tableCells = tableRow.getTableCells();
for (int c=0; c<tableCells.size();c++)
{
System.out.print("Column "+ (c+1)+ ": ");
XWPFTableCell tableCell = tableCells.get(c);
//tableCell.setText("TAE");
String tableCellVal = tableCell.getText();
if ((c+1)==2){
if (tableCellVal!=null){
if (tableCellVal.length()>0){
char c1 = tableCellVal.charAt(0);
String s2 = "-TEST";
char c2 = s2.charAt(0);
String test = tableCell.getText().replace(tableCellVal,s2);
tableCell.setText(test);
}else{
//tableCell.setText("NULL");
}
}
}
System.out.println("tableCell.getText(" + (c) + "):" + tableCellVal);
}
}
System.out.println("\n");
}
OutputStream out = new FileOutputStream(fileName);
document.write(out);
out.close();
}
}
The best solution to preserve the styles in paragraphs and still find search strings whose parts have different styles is this method:
private long replaceInParagraphs(Map<String, String> replacements, List<XWPFParagraph> xwpfParagraphs) {
long count = 0;
for (XWPFParagraph paragraph : xwpfParagraphs) {
List<XWPFRun> runs = paragraph.getRuns();
for (Map.Entry<String, String> replPair : replacements.entrySet()) {
String find = replPair.getKey();
String repl = replPair.getValue();
TextSegment found = paragraph.searchText(find, new PositionInParagraph());
if ( found != null ) {
count++;
if ( found.getBeginRun() == found.getEndRun() ) {
// whole search string is in one Run
XWPFRun run = runs.get(found.getBeginRun());
String runText = run.getText(run.getTextPosition());
String replaced = runText.replace(find, repl);
run.setText(replaced, 0);
} else {
// The search string spans over more than one Run
// Put the Strings together
StringBuilder b = new StringBuilder();
for (int runPos = found.getBeginRun(); runPos <= found.getEndRun(); runPos++) {
XWPFRun run = runs.get(runPos);
b.append(run.getText(run.getTextPosition()));
}
String connectedRuns = b.toString();
String replaced = connectedRuns.replace(find, repl);
// The first Run receives the replaced String of all connected Runs
XWPFRun partOne = runs.get(found.getBeginRun());
partOne.setText(replaced, 0);
// Removing the text in the other Runs.
for (int runPos = found.getBeginRun()+1; runPos <= found.getEndRun(); runPos++) {
XWPFRun partNext = runs.get(runPos);
partNext.setText("", 0);
}
}
}
}
}
return count;
}
This method works with search strings spanning over more than one Run. The replaced part gets the style from the first found Run.
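The multi-run branch of that method can be illustrated without POI by treating the runs as plain strings: the texts are joined, the replacement is applied to the joined text, the result goes into the first run and the remaining runs are emptied. This is only a sketch of the idea with hypothetical names, not POI code:

```java
import java.util.ArrayList;
import java.util.List;

class RunReplace {
    /**
     * Joins the run texts from beginRun to endRun (inclusive), replaces find
     * with repl in the joined text, stores the result in the first run and
     * blanks the others. Returns a new list; the input is not modified.
     */
    static List<String> replaceAcross(List<String> runs, int beginRun, int endRun,
                                      String find, String repl) {
        List<String> result = new ArrayList<>(runs);
        StringBuilder joined = new StringBuilder();
        for (int i = beginRun; i <= endRun; i++) {
            joined.append(runs.get(i));
        }
        result.set(beginRun, joined.toString().replace(find, repl));
        for (int i = beginRun + 1; i <= endRun; i++) {
            result.set(i, ""); // text moved into the first run
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> runs = List.of("Hello #na", "me", "#!");
        System.out.println(replaceAcross(runs, 0, 2, "#name#", "Axel"));
    }
}
```

Because all the text ends up in the first run, it takes on that run's formatting, which matches the behaviour described above.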
Well, I have done something like that to replace marks in a Word template with specified words:
public DotxTemplateFiller() {
String filename = "/poi/ls_Template_modern_de.dotx";
String outputPath = "/poi/output/output" + new Date().getTime()
+ ".dotx";
OutputStream out = null;
try {
File file = new File(filename);
XWPFDocument template = new XWPFDocument(new FileInputStream(file));
List<XWPFParagraph> xwpfParagraphs = template.getParagraphs();
replaceInParagraphs(xwpfParagraphs);
List<XWPFTable> tables = template.getTables();
for (XWPFTable xwpfTable : tables) {
List<XWPFTableRow> tableRows = xwpfTable.getRows();
for (XWPFTableRow xwpfTableRow : tableRows) {
List<XWPFTableCell> tableCells = xwpfTableRow
.getTableCells();
for (XWPFTableCell xwpfTableCell : tableCells) {
xwpfParagraphs = xwpfTableCell.getParagraphs();
replaceInParagraphs(xwpfParagraphs);
}
}
}
out = new FileOutputStream(new File(outputPath));
template.write(out);
out.flush();
out.close();
//System.exit(0);
} catch (IOException e) {
e.printStackTrace();
} finally {
if (out != null) {
try {
out.close();
} catch (IOException e) {
// nothing to do ....
}
}
}
}
/**
* @param xwpfParagraphs
*/
private void replaceInParagraphs(List<XWPFParagraph> xwpfParagraphs) {
for (XWPFParagraph xwpfParagraph : xwpfParagraphs) {
List<XWPFRun> xwpfRuns = xwpfParagraph.getRuns();
for (XWPFRun xwpfRun : xwpfRuns) {
String xwpfRunText = xwpfRun.getText(xwpfRun
.getTextPosition());
for (Map.Entry<String, String> entry : replacements
.entrySet()) {
if (xwpfRunText != null
&& xwpfRunText.contains(entry.getKey())) {
xwpfRunText = xwpfRunText.replaceAll(
entry.getKey(), entry.getValue());
}
}
xwpfRun.setText(xwpfRunText, 0);
}
}
}
public static void main(String[] args) {
new DotxTemplateFiller();
}
First I did it for regular paragraphs in the MS Word template and then for paragraphs inside table cells.
I hope this is helpful and that I understood your problem correctly. :-)
Best wishes.
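One thing to watch for in the code above: String.replaceAll treats its first argument as a regular expression and gives '$' and '\' in the second argument a special meaning, so marks such as ${name} or values containing a dollar sign will not behave as literal text. Here is a minimal sketch of two literal alternatives; the class and method names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralReplace {
    /** Literal replacement: neither the key nor the value is interpreted. */
    static String replaceLiteral(String text, String key, String value) {
        return text.replace(key, value);
    }

    /** Same result via replaceAll, with both arguments quoted explicitly. */
    static String replaceQuoted(String text, String key, String value) {
        return text.replaceAll(Pattern.quote(key), Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        String text = "Price: ${amount}";
        System.out.println(replaceLiteral(text, "${amount}", "$5"));
        System.out.println(replaceQuoted(text, "${amount}", "$5"));
    }
}
```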
Adding on to Josh's solution, the map I am building has ended up with over a thousand tags and continues to grow. To cut down on processing, I decided to build a small subset of the tags that I know appear in the paragraph, typically ending up with a map of only one or two tags that I then pass as the Map to the replaceInParagraphs method provided above. Also, using the Substitution object to store the substitution text, allows me to add methods into that object (such as formatting) that I can call once the substitution has been completed. Using the subset Map also allows me to know what replacements have been made in any paragraph.
private Map<String, Substitution> buildTagList(Map<String, Substitution> replacements, List<XWPFParagraph> xwpfParagraphs, String start, String end) {
Map<String, Substitution> returnMap = new HashMap<String, Substitution> ();
for (XWPFParagraph paragraph : xwpfParagraphs) {
List<XWPFRun> runs = paragraph.getRuns();
// Check if there is a tag in the paragraph
TextSegment found = paragraph.searchText(start, new PositionInParagraph());
String runText = "";
XWPFRun run = null;
if ( found != null ) {
StringBuilder b = new StringBuilder();
for (int runPos = found.getBeginRun(); runPos < runs.size(); runPos++) {
run = runs.get(runPos);
b.append(run.getText(run.getTextPosition()));
runText = b.toString();
}
// Now we need to find all tags in the run
boolean finished = false;
int tagStart = 0;
int tagEnd = 0;
while ( ! finished ) {
// get the first tag
tagStart = runText.indexOf(start,tagStart);
tagEnd = runText.indexOf(end, tagEnd);
if ( tagStart >= 0 ) {
String tag = runText.substring(tagStart, tagEnd + end.length());
Substitution s = replacements.get(tag);
if (s != null) {
returnMap.putIfAbsent(tag,s);
}
}
else
finished = true;
tagStart = tagEnd + end.length();
tagEnd = tagStart;
}
}
}
return returnMap;
}
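The tag-scanning loop in buildTagList can be tried out on a plain string. The sketch below is a variant rather than the code above: it searches for the end delimiter only after the start delimiter and stops on an unterminated tag. The names are hypothetical:

```java
import java.util.LinkedHashSet;
import java.util.Set;

class TagScan {
    /**
     * Returns the distinct tags (including delimiters) found in text,
     * in order of first appearance. Stops at the first unterminated tag.
     */
    static Set<String> findTags(String text, String start, String end) {
        Set<String> tags = new LinkedHashSet<>();
        int from = 0;
        while (true) {
            int tagStart = text.indexOf(start, from);
            if (tagStart < 0) {
                break; // no more tags
            }
            int tagEnd = text.indexOf(end, tagStart + start.length());
            if (tagEnd < 0) {
                break; // start delimiter without a matching end
            }
            tags.add(text.substring(tagStart, tagEnd + end.length()));
            from = tagEnd + end.length();
        }
        return tags;
    }

    public static void main(String[] args) {
        String paragraphText = "Dear ${first} ${last}, regarding ${first}";
        System.out.println(findTags(paragraphText, "${", "}"));
    }
}
```

The resulting set can then be used to pick the matching entries out of the full replacements map, as the answer describes.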
