Remove FixedLeading at the first line on each page - java

I want to remove the effect of setFixedLeading on the first line of each page (the document has 100+ pages).
I add a fair amount of text (more than 100 pages) in a while loop. I set padding and margin to 0, but I still get a top indent. Why? How can I remove it?
public static final String DEST = "PDF.pdf";
public static void main(String[] args) throws FileNotFoundException {
PdfDocument pdfDoc = new PdfDocument(new PdfWriter(DEST));
Document doc = new Document(pdfDoc);
doc.setMargins(0,0,0,0);
for (int i = 0; i <20 ; i++) {
Paragraph element = new Paragraph("p " + i);
element.setPadding(0);
element.setMargin(0);
element.setFixedLeading(55);
doc.add(element);
}
doc.close();
}
PDF file:
https://pdfhost.io/v/Byt9LHJcy_PDFpdf.pdf

At the time of element creation you don't know the page it will end up on nor its resultant position. I don't think there is a property that allows you to configure the behavior depending on whether it's the top element on a page (such property would be too custom and tied to a specific workflow).
Fortunately, the layout mechanism is quite flexible and you can implement the desired behavior in a couple of lines of code.
First off, let's not use setFixedLeading and set the top margin for all paragraphs instead:
Document doc = new Document(pdfDocument);
doc.setMargins(0, 0, 0, 0);
for (int i = 0; i < 20; i++) {
Paragraph element = new Paragraph("p " + i);
element.setPadding(0);
element.setMargin(0);
element.setMarginTop(50);
doc.add(element);
}
doc.close();
This changes almost nothing in the visual result; it is just another way of producing the same spacing.
Now we need a custom renderer to tweak the behavior of a paragraph when it is rendered at the top of a page. We are going to override the layout method and check whether the area we are given starts at the top of the page. If so, we will not apply the top margin:
private static class CustomParagraphRenderer extends ParagraphRenderer {
Document document;
public CustomParagraphRenderer(Paragraph modelElement, Document document) {
super(modelElement);
this.document = document;
}
@Override
public IRenderer getNextRenderer() {
return new ParagraphRenderer((Paragraph) modelElement);
}
@Override
public LayoutResult layout(LayoutContext layoutContext) {
if (layoutContext.getArea().getBBox().getTop() == document.getPdfDocument().getDefaultPageSize().getHeight()) {
((Paragraph)getModelElement()).setMarginTop(0);
}
return super.layout(layoutContext);
}
}
Now the only thing left to do is to set a custom renderer instance on each paragraph in the loop:
element.setNextRenderer(new CustomParagraphRenderer(element, doc));
Visual result: (screenshot not reproduced here)
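The top-of-page rule can also be illustrated without iText. The following is a minimal sketch, not the library's layout algorithm: `TopMarginSketch`, `Placement` and `layout` are hypothetical names, and the model assumes equal-height paragraphs stacked from the top of each page. It only shows where paragraphs land when the top margin is skipped for the first element on a page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a layout engine: stacks equal-height paragraphs
// top to bottom and starts a new page when the next one no longer fits.
final class TopMarginSketch {

    // Where one paragraph ended up: page number and distance of its top edge from the page top.
    static final class Placement {
        final int page;
        final float offsetFromTop;

        Placement(int page, float offsetFromTop) {
            this.page = page;
            this.offsetFromTop = offsetFromTop;
        }
    }

    static List<Placement> layout(int count, float pageHeight, float paragraphHeight, float marginTop) {
        List<Placement> placements = new ArrayList<>();
        int page = 1;
        float used = 0f; // vertical space already consumed on the current page
        for (int i = 0; i < count; i++) {
            // Same rule as the custom renderer: no top margin for the first element on a page.
            float margin = (used == 0f) ? 0f : marginTop;
            if (used + margin + paragraphHeight > pageHeight) {
                // Does not fit: move to a fresh page, where the margin is dropped again.
                page++;
                used = 0f;
                margin = 0f;
            }
            placements.add(new Placement(page, used + margin));
            used += margin + paragraphHeight;
        }
        return placements;
    }
}
```

In this model every page starts with a paragraph at offset 0, while all later paragraphs on the page keep their top margin. In the real renderer the check compares two floats with `==`, so comparing with a small tolerance may be more robust if the page area is ever computed rather than taken directly from the page size.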


IText 7 How To Add Div or Paragraph in Header Without Overlapping Page Content?

I am facing a problem for which I haven't found a solution yet. I am implementing a platform for a medical laboratory. For every incident they want to write the report in the system and then generate and print it from the system. I am using iText 7 to accomplish this.
They have a very strange template. At the beginning of the first page they want to print a specific table, while at the beginning of every other page they want to print something else. So I need to know when the page changes in order to print the corresponding table at the top of the page.
After reading various sources I ended up creating the first page normally and then adding a header event handler that checks the page number and runs on every page except page 1.
public class VariableHeaderEventHandler implements IEventHandler {
@Override
public void handleEvent(Event event) {
System.out.println("THIS IS ME: HEADER EVENT HANDLER STARTED.....");
PdfDocumentEvent documentEvent = (PdfDocumentEvent) event;
PdfDocument pdfDoc = documentEvent.getDocument();
PdfPage page = documentEvent.getPage();
Rectangle pageSize = page.getPageSize();
int pageNumber = pdfDoc.getPageNumber(page);
if (pageNumber == 1) return; //Do nothing in the first page...
System.out.println("Page size: " + pageSize.getHeight());
Rectangle rectangle = new Rectangle(pageSize.getLeft() + 30, pageSize.getHeight()-234, pageSize.getWidth() - 60, 200);
PdfCanvas pdfCanvas = new PdfCanvas(page.newContentStreamBefore(), page.getResources(), pdfDoc);
pdfCanvas.rectangle(rectangle);
pdfCanvas.setFontAndSize(FontsAndStyles.getRegularFont(), 10);
Canvas canvas = new Canvas(pdfCanvas, pdfDoc, rectangle);
Div header = new Div();
Paragraph paragraph = new Paragraph();
Text text = new Text("Διαγνωστικό Εργαστήριο Ιστοπαθολογίας και Μοριακής Παθολογοανατομικής").addStyle(FontsAndStyles.getBoldStyle());
paragraph.add(text);
paragraph.add(new Text("\n"));
text = new Text("Μοριακή Διάγνωση σε Συνεργασία με").addStyle(FontsAndStyles.getBoldStyle());
paragraph.add(text);
paragraph.add(new Text("\n"));
text = new Text("Γκούρβας Βίκτωρας, M.D., Ph.D.").addStyle(FontsAndStyles.getBoldStyle());
paragraph.add(text);
paragraph.add(new Text("\n"));
text = new Text("Τσιμισκή 33, Τ.Κ. 54624, ΘΕΣΣΑΛΟΝΙΚΗ").addStyle(FontsAndStyles.getNormalStyle());
paragraph.add(text);
paragraph.add(new Text("\n"));
text = new Text("Τήλ/Φάξ: 2311292924 Κιν.: 6932104909 e-mail: vgourvas@gmail.com").addStyle(FontsAndStyles.getNormalStyle());
paragraph.add(text);
header.add(paragraph);
// =============Horizontal Line BOLD============
SolidLine solidLine = new SolidLine((float) 1.5);
header.add(new LineSeparator(solidLine));
// ========Horizontal Line BOLD End==========
text = new Text("ΠΑΘΟΛΟΓΟΑΝΑΤΟΜΙΚΗ ΕΞΕΤΑΣΗ").addStyle(FontsAndStyles.getBoldStyle());
paragraph = new Paragraph().add(text);
header.add(paragraph);
header.setTextAlignment(TextAlignment.CENTER);
canvas.add(header);
canvas.close();
}
}
However, the problem I am facing now is that the header overlaps the content, and I can't figure out how to set different margins per page. For example, from page 2 onwards I would like a different top margin.
Has anyone faced this problem before and found a working solution? Is my implementation correct? Is there a better way of accomplishing the same result?
Thanks in advance,
Toutoudakis Michail
You should create your own custom document renderer and decrease the area which would be used to place content for each page except for the first one.
Please look at the snippet below and updateCurrentArea method in particular.
class CustomDocumentRenderer extends DocumentRenderer {
public CustomDocumentRenderer(Document document) {
super(document);
}
@Override
public IRenderer getNextRenderer() {
return new CustomDocumentRenderer(this.document);
}
@Override
protected LayoutArea updateCurrentArea(LayoutResult overflowResult) {
LayoutArea area = super.updateCurrentArea(overflowResult);
if (currentPageNumber > 1) {
area.setBBox(area.getBBox().decreaseHeight(200));
}
return area;
}
}
Then just set the renderer on your document:
Document doc = new Document(pdfDoc);
doc.setRenderer(new CustomDocumentRenderer(doc));
The resultant PDF which I get for your document looks as follows: (screenshot not reproduced here)
There is another solution however. Once you've added at least one element to your document, you can change the default document's margins. The change will be applied on all pages created afterwards (and in your case these are pages 2, 3, ...)
doc.add(new Paragraph("At least one element should be added. Otherwise the first page wouldn't be created and changing of the default margins would affect it."));
doc.setMargins(200, 36, 36, 36);
// now you can be sure that all the next pages would have new margins
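To check whether a given reservation actually clears the header, the arithmetic behind both approaches can be sketched in plain Java. This is only an illustration under stated assumptions: `HeaderAreaSketch` and its methods are hypothetical names, the page is assumed to be A4 (842 points high), and the header rectangle is the one from the question, whose bottom edge sits at `pageHeight - 234`.

```java
// Hypothetical helper: computes where the content area starts on a page,
// measured in points from the bottom of the page (PDF coordinates).
final class HeaderAreaSketch {

    // Top edge of the content area; the header reservation only applies after page 1.
    static float contentTop(int pageNumber, float pageHeight, float topMargin, float reservedForHeader) {
        float top = pageHeight - topMargin;
        return pageNumber > 1 ? top - reservedForHeader : top;
    }

    // Content runs into the header if it starts above the header's bottom edge.
    static boolean overlapsHeader(float contentTop, float headerBottom) {
        return contentTop > headerBottom;
    }
}
```

With a 36 point top margin and 200 points reserved, content on page 2 starts at 842 - 36 - 200 = 606, just below the header's bottom edge at 608. With a top margin of 200 and nothing else reserved, content starts at 642, which would still overlap that header by 34 points, so a larger top margin (for example 236) may be needed if the header rectangle is kept as in the question.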

PDF Generation using iText7 with Java

I am trying to add content to an existing PDF using iText7. I have been able to create new PDFs and add content to them using Paragraphs and Tables. However, once I go to reopen a PDF that I have created and attempt to write more content to it, the new content starts overwriting the old content. I want the new content to be appended to the Document after the old content. How can I achieve this?
Edit
This is the Class which sets up some common methods that will be executed with each change done to a PDF document.
public class PDFParent {
private static Document document;
private static PdfWriter writer;
private static PdfReader reader;
private static PageSize ps;
private static PdfDocument pdfDoc;
public static Document getDocument() {
return document;
}
public static void setDocument(Document document) {
PDFParent.document = document;
}
public static void setupPdf(byte[] inParamInPDFBinary){
writer = new PdfWriter(new ByteArrayOutputStream());
try {
reader = new PdfReader(new ByteArrayInputStream(inParamInPDFBinary));
} catch (IOException e) {
e.printStackTrace();
}
pdfDoc = new PdfDocument(reader, writer);
ps = PageSize.A4;
document = new Document(pdfDoc, ps);
}
public static byte[] writePdf(){
ByteArrayOutputStream stream = (ByteArrayOutputStream) writer.getOutputStream();
return stream.toByteArray();
}
public static void closePdf(){
pdfDoc.close();
}
And this is how I am adding the content to the pdf
public class ActAddParagraphToPDF extends PDFParent{
// output parameters
public static byte[] outParamOutPDFBinary;
public static ActAddParagraphToPDF mosAddParagraphToPDF(byte[] inParamInPDFBinary, String inParamParagraph) throws IOException {
ActAddParagraphToPDF result = new ActAddParagraphToPDF();
setupPdf(inParamInPDFBinary);
//---------------------begin content-------------------//
getDocument().add((Paragraph) new Paragraph(inParamParagraph));
//---------------------end content-------------------//
closePdf();
outParamOutPDFBinary = writePdf();
return result;
}
When I go to execute this second class, it appears to be treating the original document as if it is blank. Then writes the new Paragraph on top of the original content. I know that I am missing something, just not sure what that is.
Is reopening the document every time a requirement? If you keep the document open, you can append as much content as you want and you won't have to deal with content overlapping problems.
If it is a requirement, then you will have to track the last free content position yourself and restore it on the new DocumentRenderer.
A Rectangle would be enough to store the free area that is left on the last page. Right before closing the document, save the free area in some Rectangle in the following way:
Rectangle savedBbox = document.getRenderer().getCurrentArea().getBBox();
After that, when you have to reopen the document, first jump to the last page:
document.add(new AreaBreak(AreaBreakType.LAST_PAGE));
And then restore the free area left over from the previous time you dealt with the document:
document.getRenderer().getCurrentArea().setBBox(savedBbox);
After that you are free to add new content to the document and it will appear at the saved position:
document.add(new Paragraph("Hello again"));
Please note that this approach works if you know which documents you are dealing with (i.e. you can associate last "free" position with the document's ID) and this document is not changed outside of your environment. If this is not the case, I recommend that you look into content extraction and in particular PdfDocumentContentParser. It can help you to extract the content you have on the page and determine which positions it occupies. Then you can calculate the free area on a page and use document.getRenderer().getCurrentArea().setBBox approach I described above to point DocumentRenderer to the correct place to write content to.
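Associating the saved free area with a document ID could look roughly like the sketch below. It is deliberately library-free: `FreeAreaStore` is a hypothetical class, the ID scheme is an assumption, and the rectangle is stored as plain `{x, y, width, height}` values that would be copied from and back into the iText Rectangle by the calling code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory store for the free area left on the last page of each document.
final class FreeAreaStore {

    // Values are {x, y, width, height} in points.
    private final Map<String, float[]> areas = new HashMap<>();

    // Call right before closing the document.
    void save(String documentId, float x, float y, float width, float height) {
        areas.put(documentId, new float[] {x, y, width, height});
    }

    // Call after reopening; empty if this document was never saved here.
    Optional<float[]> restore(String documentId) {
        float[] saved = areas.get(documentId);
        return saved == null ? Optional.empty() : Optional.of(saved.clone());
    }
}
```

An empty result is the signal to fall back to content extraction, since nothing is known about the free space of a document that was produced or changed elsewhere.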

iText add Watermark to selected pages

I need to add a watermark to every page that has certain text, such as "PROCEDURE DELETED".
Based on Bruno Lowagie's suggestion in "Adding watermark directly to the stream", I so far have a PdfWatermark class with:
protected Phrase watermark = new Phrase("DELETED", new Font(FontFamily.HELVETICA, 60, Font.NORMAL, BaseColor.PINK));
ArrayList<Integer> arrPages = new ArrayList<Integer>();
boolean pdfChecked = false;
@Override
public void onEndPage(PdfWriter writer, Document document) {
if(pdfChecked == false) {
detectPages(writer, document);
pdfChecked = true;
}
int pageNum = writer.getPageNumber();
if(arrPages.contains(pageNum)) {
PdfContentByte canvas = writer.getDirectContentUnder();
ColumnText.showTextAligned(canvas, Element.ALIGN_CENTER, watermark, 298, 421, 45);
}
}
And this works fine if I add, say, the number 3 to the arrPages ArrayList in my custom detectPages method - it shows the desired watermark on page 3.
What I am having trouble with is how to search through the document for the text string, which I have access to here, only from the PdfWriter writer or the com.itextpdf.text.Document document sent to onEndPage method.
Here is what I have tried, unsuccessfully:
private void detectPages(PdfWriter writer, Document document) {
try {
//arrPages.add(3);
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
PdfWriter.getInstance(document, byteArrayOutputStream);
// the following code does not work
PdfReader reader = new PdfReader(writer.getDirectContent().getPdfDocument());
PdfContentByte canvas = writer.getDirectContent();
PdfImportedPage page;
for (int i = 0; i < reader.getNumberOfPages(); ) {
page = writer.getImportedPage(reader, ++i);
canvas.addTemplate(page, 1f, 0, 0.4f, 0.4f, 72, 50 * i);
canvas.beginText();
canvas.showTextAligned(Element.ALIGN_CENTER,
//search String
String.valueOf((char)(181 + i)), 496, 150 + 50 * i, 0);
//if(detected) arrPages.add(i);
canvas.endText();
}
Am I on the right track with this as a solution or do I need to back out?
Can anyone supply the missing link needed to scan the doc and pick out "PROCEDURE DELETED" pages?
EDIT: I am using iText 5.0.4 - cannot upgrade to 5.5.X at this time, but could probably upgrade to latest version below that.
EDIT2: More information: This is approach to adding text to the document (doc):
String processed = processText(template);
List<Element> objects = HTMLWorker.parseToListLog(new StringReader(processed),
styles, interfaceProps, errors);
for (Element elem : objects) {
doc.add(elem);
}
That is called in an addText method I control. The template is simply html from a database LOB. The processText checks the html for custom markers contained by curlies as in ${replaceMe}.
This seems to be the place to identify the "PROCEDURE DELETED" string during generation of the document, but I don't see the path to Chunk.setGenericTags().
EDIT3: Table difficulties
List<Element> objects = HTMLWorker.parseToListLog(new StringReader(processed),
styles, interfaceProps, errors);
for (Element elem : objects) {
// the code below does not work
if (elem instanceof PdfPTable) {
PdfPTable table = (PdfPTable) elem;
ArrayList<Chunk> chks = table.getChunks();
for(Chunk chk : chks){
if(chk.toString().contains("TEST DELETED")) {
chk.setGenericTag("delete_tag");
}
}
}
doc.add(elem);
}
Commenters mkl and Bruno suggested detecting the "PROCEDURE DELETED" keywords at the time they are added to the doc. However, since the keywords are necessarily inside a table, they have to be detected through PdfPTable rather than the simpler Element.
I could not do it with the code above. Any suggestions exactly how to find text inside a table cell and do a string comparison on it?
EDIT4: Based on some experimentation, I would like to make some assertions and please show me the way through them:
Using Chunk.setGenericTag() is required to trigger the handler onGenericTag
For some reason (PdfPTable) table.getChunks() does not return chunks, at least that my system picks up. This is counterintuitive and possibly there is a setup, version, or code bug causing this behavior.
Therefore, a selection text string inside a table cannot be used to trigger a watermark.
Thanks to all the help from @mkl and @Bruno, I finally got a workable solution. Anybody interested who can give a more elegant approach is welcome - there is certainly room.
Two classes were in play in all the snippets given in the original question. Below is partial code that embodies the working solution.
public class PdfExporter
//a custom class to build content
//create some content to add to pdf
String text = "<div>Lots of important content here</div>";
//assume the section is known to be deleted
if(section_deleted) {
//add a special marker to the text, in white-on-white
text = "<span style=\"color:white;\">.!^DEL^?.</span>" + text;
}
//based on some indicator, we know we want to force a new page for a new section
boolean ensureNewPage = true;
//use an instance of PdfDocHandler to add the content
pdfDocHandler.addText(text, ensureNewPage);
public class PdfDocHandler extends PdfPageEventHelper
private boolean isDEL;
public boolean addText(String text, boolean ensureNewPage)
if (ensureNewPage) {
//turn isDEL off: a forced pagebreak indicates a new section
isDEL = false;
}
//attempt to find the special DELETE marker in first chunk
//NOTE: this can be done several ways
ArrayList<Chunk> chks = elem.getChunks();
if(chks.size()>0) {
if(chks.get(0).getContent().contains(".!^DEL^?.")) {
//special delete marker found in text
isDEL = true;
}
}
//doc is an iText Document
doc.add(elem);
public void onEndPage(PdfWriter writer, Document document) {
if(isDEL) {
//set the watermark
Phrase watermark = new Phrase("DELETED", new Font(Font.FontFamily.HELVETICA, 60, Font.NORMAL, BaseColor.PINK));
PdfContentByte canvas = writer.getDirectContentUnder();
ColumnText.showTextAligned(canvas, Element.ALIGN_CENTER, watermark, 298, 421, 45);
}
}
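The flag handling in this solution can be isolated into a small state holder, which makes the rule easier to test on its own. The sketch below is an assumption-laden illustration, not part of the original solution: `DeletedSectionTracker` is a hypothetical class, and it only mirrors the flag logic; it says nothing about when iText fires onEndPage.

```java
// Hypothetical helper mirroring the isDEL logic: a forced page break starts a
// new section and clears the flag, the marker in the first chunk sets it.
final class DeletedSectionTracker {

    static final String MARKER = ".!^DEL^?.";

    private boolean deleted;

    // Call once per added text block, before it is added to the document.
    void onText(String firstChunkContent, boolean ensureNewPage) {
        if (ensureNewPage) {
            deleted = false;
        }
        if (firstChunkContent != null && firstChunkContent.contains(MARKER)) {
            deleted = true;
        }
    }

    // What a page event handler would consult when a page ends.
    boolean shouldWatermark() {
        return deleted;
    }
}
```

The flag stays set across blocks that do not force a new page, so every page of a deleted section is marked until the next section begins.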

Using iTextPDF to trim a page's whitespace

I have a PDF which consists of some data, followed by some whitespace. I don't know how large the data is, but I'd like to trim off the whitespace following the data.
PdfReader reader = new PdfReader(PDFLOCATION);
Rectangle rect = new Rectangle(700, 2000);
Document document = new Document(rect);
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(SAVELCATION));
document.open();
int n = reader.getNumberOfPages();
PdfImportedPage page;
for (int i = 1; i <= n; i++) {
document.newPage();
page = writer.getImportedPage(reader, i);
Image instance = Image.getInstance(page);
document.add(instance);
}
document.close();
Is there a way to clip/trim the whitespace for each page in the new document?
This PDF contains vector graphics.
I'm using iTextPDF, but can switch to any Java library (mavenized, Apache license preferred).
As no actual solution has been posted, here are some pointers from the accompanying itext-questions mailing list thread:
As you want to merely trim pages, this is not a case of PdfWriter + getImportedPage usage but instead of PdfStamper usage. Your main code using a PdfStamper might look like this:
PdfReader reader = new PdfReader(resourceStream);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream("target/test-outputs/test-trimmed-stamper.pdf"));
// Go through all pages
int n = reader.getNumberOfPages();
for (int i = 1; i <= n; i++)
{
Rectangle pageSize = reader.getPageSize(i);
Rectangle rect = getOutputPageSize(pageSize, reader, i);
PdfDictionary page = reader.getPageN(i);
page.put(PdfName.CROPBOX, new PdfArray(new float[]{rect.getLeft(), rect.getBottom(), rect.getRight(), rect.getTop()}));
stamper.markUsed(page);
}
stamper.close();
As you see I also added another argument to your getOutputPageSize method to-be. It is the page number. The amount of white space to trim might differ on different pages after all.
If the source document did not contain vector graphics, you could simply use the iText parser package classes. There even already is a TextMarginFinder based on them. In this case the getOutputPageSize method (with the additional page parameter) could look like this:
private Rectangle getOutputPageSize(Rectangle pageSize, PdfReader reader, int page) throws IOException
{
PdfReaderContentParser parser = new PdfReaderContentParser(reader);
TextMarginFinder finder = parser.processContent(page, new TextMarginFinder());
Rectangle result = new Rectangle(finder.getLlx(), finder.getLly(), finder.getUrx(), finder.getUry());
System.out.printf("Text/bitmap boundary: %f,%f to %f, %f\n", finder.getLlx(), finder.getLly(), finder.getUrx(), finder.getUry());
return result;
}
Using this method with your file test.pdf results in: (screenshot not reproduced here)
As you see the code trims according to text (and bitmap image) content on the page.
To find the bounding box respecting vector graphics, too, you essentially have to do the same but you have to extend the parser framework used here to inform its listeners (the TextMarginFinder essentially is a listener to drawing events sent from the parser framework) about vector graphics operations, too. This is non-trivial, especially if you don't know PDF syntax by heart yet.
If your PDFs to trim are not too generic but can be forced to include some text or bitmap graphics in relevant positions, though, you could use the sample code above (probably with minor changes) anyways.
E.g. if your PDFs always start with text on top and end with text at the bottom, you could change getOutputPageSize to create the result rectangle like this:
Rectangle result = new Rectangle(pageSize.getLeft(), finder.getLly(), pageSize.getRight(), finder.getUry());
This only trims top and bottom empty space: (screenshot not reproduced here)
Depending on your input data pool and requirements this might suffice.
Or you can use some other heuristics depending on your knowledge on the input data. If you know something about the positioning of text (e.g. the heading to always be centered and some other text to always start at the left), you can easily extend the TextMarginFinder to take advantage of this knowledge.
Recent (April 2015, iText 5.5.6-SNAPSHOT) improvements
The current development version, 5.5.6-SNAPSHOT, extends the parser package to also include vector graphics parsing. This allows for an extension of iText's original TextMarginFinder class implementing the new ExtRenderListener methods like this:
@Override
public void modifyPath(PathConstructionRenderInfo renderInfo)
{
List<Vector> points = new ArrayList<Vector>();
if (renderInfo.getOperation() == PathConstructionRenderInfo.RECT)
{
float x = renderInfo.getSegmentData().get(0);
float y = renderInfo.getSegmentData().get(1);
float w = renderInfo.getSegmentData().get(2);
float h = renderInfo.getSegmentData().get(3);
points.add(new Vector(x, y, 1));
points.add(new Vector(x+w, y, 1));
points.add(new Vector(x, y+h, 1));
points.add(new Vector(x+w, y+h, 1));
}
else if (renderInfo.getSegmentData() != null)
{
for (int i = 0; i < renderInfo.getSegmentData().size()-1; i+=2)
{
points.add(new Vector(renderInfo.getSegmentData().get(i), renderInfo.getSegmentData().get(i+1), 1));
}
}
for (Vector point: points)
{
point = point.cross(renderInfo.getCtm());
Rectangle2D.Float pointRectangle = new Rectangle2D.Float(point.get(Vector.I1), point.get(Vector.I2), 0, 0);
if (currentPathRectangle == null)
currentPathRectangle = pointRectangle;
else
currentPathRectangle.add(pointRectangle);
}
}
@Override
public Path renderPath(PathPaintingRenderInfo renderInfo)
{
if (renderInfo.getOperation() != PathPaintingRenderInfo.NO_OP)
{
if (textRectangle == null)
textRectangle = currentPathRectangle;
else
textRectangle.add(currentPathRectangle);
}
currentPathRectangle = null;
return null;
}
@Override
public void clipPath(int rule)
{
}
(Full source: MarginFinder.java)
Using this class to trim the white space gives a result (screenshot not reproduced here) which is pretty much what one would hope for.
Beware: The implementation above is far from optimal. It is not even correct as it includes all curve control points which is too much. Furthermore it ignores stuff like line width or wedge types. It actually merely is a proof-of-concept.
All test code is in TestTrimPdfPage.java.
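The core of the margin finder is just the accumulation of a bounding box over points. A stripped-down sketch of that step, without any iText types, might look like this. `BoundingBoxSketch` is a hypothetical class, and the transformation of points by the current transformation matrix is left out entirely, so coordinates are assumed to be in page space already.

```java
// Hypothetical accumulator: grows a bounding box point by point and
// reports it as crop box values {llx, lly, urx, ury}.
final class BoundingBoxSketch {

    private boolean empty = true;
    private float minX, minY, maxX, maxY;

    void addPoint(float x, float y) {
        if (empty) {
            minX = maxX = x;
            minY = maxY = y;
            empty = false;
            return;
        }
        minX = Math.min(minX, x);
        minY = Math.min(minY, y);
        maxX = Math.max(maxX, x);
        maxY = Math.max(maxY, y);
    }

    // A rectangle contributes its four corners; this also copes with negative width or height.
    void addRect(float x, float y, float width, float height) {
        addPoint(x, y);
        addPoint(x + width, y);
        addPoint(x, y + height);
        addPoint(x + width, y + height);
    }

    boolean isEmpty() {
        return empty;
    }

    float[] toCropBox() {
        if (empty) {
            throw new IllegalStateException("no points were added");
        }
        return new float[] {minX, minY, maxX, maxY};
    }
}
```

The four values are in the same order as the array written to the CROPBOX entry in the PdfStamper loop earlier in this answer.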

iText: Setting image interpolation for images on a page

I want to iterate through the pages of a PDF and write a new PDF where all images have interpolation set to false. I was expecting to be able to do something like the following, but I cannot find a method of accessing the Images or Rectangles on the PDF page.
PdfCopy copy = new PdfCopy(document, new FileOutputStream(outFileName));
copy.newPage();
PdfReader reader = new PdfReader(inFileName);
for(int i = 1; i <= reader.getNumberOfPages(); i++) {
PdfImportedPage importedPage = copy.getImportedPage(reader, i);
for(Image image : importedPage.images())
image.isInterpolated(false);
copy.addPage(importedPage);
}
reader.close();
There is, however, no PdfImportedPage.images(). Any suggestions on how I might otherwise do the same?
Cheers
Nik
It won't be that easy. There's no high-level way of doing what you want. You'll have to enumerate the resources looking for XObject Images, and clear their /Interpolate flag.
And you'll have to do it before creating the PdfImportedPage because there's no public way to access their resources. Grr.
void removeInterpolation( int pageNum ) {
PdfDictionary page = someReader.getPageN(pageNum);
PdfDictionary resources = page.getAsDict(PdfName.RESOURCES);
enumResources(resources);
}
void enumResources( PdfDictionary resources) {
PdfDictionary xobjs = resources.getAsDict(PdfName.XOBJECT);
Set<PdfName> xobjNames = xobjs.getKeys();
for (PdfName name : xobjNames) {
PdfStream xobjStream = xobjs.getAsStream(name);
if (PdfName.FORM.equals( xobjStream.getAsName(PdfName.SUBTYPE))) {
// xobject forms have their own nested resources.
PdfDictionary nestedResources = xobjStream.getAsDict(PdfName.RESOURCES);
enumResources(nestedResources);
} else {
xobjStream.remove(PdfName.INTERPOLATE);
}
}
}
There's quite a bit of null-checking that's skipped in the above code. A page doesn't have to have a resource dictionary, though they almost always do. Ditto for XObject Forms. All the getAs* functions will return null if the given key is missing or of a different type... you get the idea.
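To show where those null checks would go, here is a library-free sketch of the same traversal. It is only a model: nested `Map<String, Object>` instances stand in for PDF dictionaries, the string keys stand in for PdfName constants, and `InterpolateSketch` is a hypothetical class. Unlike the code above, it only touches entries whose subtype is Image.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the traversal: maps stand in for PDF dictionaries.
final class InterpolateSketch {

    // Removes "Interpolate" from every image XObject reachable from the given
    // resources and returns how many entries were removed.
    @SuppressWarnings("unchecked")
    static int clearInterpolate(Map<String, Object> resources) {
        if (resources == null) {
            return 0; // a page does not have to have a resource dictionary
        }
        Object xobjs = resources.get("XObject");
        if (!(xobjs instanceof Map)) {
            return 0; // missing key or wrong type
        }
        int removed = 0;
        for (Object value : ((Map<String, Object>) xobjs).values()) {
            if (!(value instanceof Map)) {
                continue;
            }
            Map<String, Object> xobj = (Map<String, Object>) value;
            Object subtype = xobj.get("Subtype");
            if ("Form".equals(subtype)) {
                // Form XObjects may carry their own nested resources, or none at all.
                Object nested = xobj.get("Resources");
                if (nested instanceof Map) {
                    removed += clearInterpolate((Map<String, Object>) nested);
                }
            } else if ("Image".equals(subtype)) {
                if (xobj.remove("Interpolate") != null) {
                    removed++;
                }
            }
        }
        return removed;
    }
}
```

A real document can also reuse the same form from several places, so tracking already visited dictionaries may be worth adding before using this pattern on arbitrary files.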
