Transposing a matrix from a PDF stripper - java

I am reading text from a PDF file using PDFBox. Extracting the regions with Rectangle2D works, but each region's text is printed on its own line. Instead, I want all of it on a single line (a transposed view).
Current view
PO-00145678
Vendor : AQ-00067
Date...................................: 5/10/2021
Expected DeliveryDate...: 6/10/2021
Expected results
In one line
PO-00145678 Vendor : AQ-00067 Date...................................: 5/10/2021 Expected DeliveryDate...: 6/10/2021
Code used
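// Imports assume PDFBox 2.x (PDDocument.load(File) and getPages() are 2.x API)
import java.awt.Rectangle;
import java.awt.geom.Rectangle2D;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.text.PDFTextStripperByArea;

public class PDFBoxReadFromFile {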
public static void main(String[] args) throws Exception {
try (PDDocument document = PDDocument.load(new File("C:\\Users\\ed\\Documents\\test2.pdf"))) {
if (!document.isEncrypted()) {
PDFTextStripperByArea stripper = new PDFTextStripperByArea();
stripper.setSortByPosition(true);
Rectangle2D rect4 = new Rectangle2D.Double(210, 160, 230, 25);
Rectangle rect1 = new Rectangle(55, 290, 225, 17);
Rectangle2D rect2 = new Rectangle2D.Double(281, 255, 255, 20);
Rectangle2D rect3 = new Rectangle2D.Double(2, 365, 660, 1900);
stripper.addRegion("class2", rect1);
stripper.addRegion("class3", rect2);
stripper.addRegion("class4", rect3);
stripper.addRegion("class5", rect4);
PDPage firstPage = document.getPages().get(0);
stripper.extractRegions(firstPage);
System.out.println(stripper.getTextForRegion("class5"));
System.out.println(stripper.getTextForRegion("class2"));
System.out.println(stripper.getTextForRegion("class3"));
System.out.println(stripper.getTextForRegion("class4"));
File file = new File("C:/Users/ed/eclipse-workspace/pdfboxreadfromfile/file.txt");
FileWriter fw = new FileWriter(file);
PrintWriter pw = new PrintWriter(fw);
pw.println(stripper.getTextForRegion("class5"));
pw.println(stripper.getTextForRegion("class2"));
pw.println(stripper.getTextForRegion("class3"));
pw.println(stripper.getTextForRegion("class4"));
pw.close();
}
} catch (IOException e) {
System.err.println("Exception while trying to read pdf document - " + e);
}
}
}
Second attempt, also without success:
System.out.print(stripper.getTextForRegion("class5") + stripper.getTextForRegion("class2"));
The result should look like this:
PO-003334823 Vendor : WL-00051
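One likely reason the concatenation still prints on separate lines is that each string returned by getTextForRegion ends with a line break, so joining the strings keeps the breaks. The following is a minimal sketch of one way around that, using plain strings in place of the stripper output. The class name RegionJoiner and the method joinOnOneLine are made up for illustration, and the trailing line breaks in the sample data are an assumption about what the stripper returns.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class RegionJoiner {

    // Collapses every run of whitespace (including line breaks) inside each
    // region's text to a single space, drops empty regions, and joins the
    // rest with single spaces so the result sits on one line.
    static String joinOnOneLine(List<String> regionTexts) {
        return regionTexts.stream()
                .map(text -> text.replaceAll("\\s+", " ").trim())
                .filter(text -> !text.isEmpty())
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Stand-ins for stripper.getTextForRegion("class5") and so on; the
        // trailing line breaks are assumed, not taken from a real PDF.
        List<String> regions = Arrays.asList(
                "PO-00145678\r\n",
                "Vendor : AQ-00067\n",
                "Date...................................: 5/10/2021\n",
                "Expected DeliveryDate...: 6/10/2021\n");
        System.out.println(joinOnOneLine(regions));
    }
}
```

In the program above, the list would be filled with the four getTextForRegion results, and the joined string passed to both System.out.println and pw.println.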

Related

Merging PDF Files in-between another PDF File in java using IText or PDFBox

I have two PDF files, A and B, and I need to merge them based on a condition: if a string such as "attach PDF" appears in A, B has to be merged into A starting at the page where the string was found. For example, if the string is found on page 3 of A, B must be merged in from page 3. I'm using iText 5.5.10. Is this possible with iText or PDFBox? Here is what I have tried so far.
public static void mergePdf() throws IOException, DocumentException
{
PdfReader reader1 = new PdfReader("C:\\Users\\user1\\Downloads\\generatedSample.pdf");
PdfReader reader2 = new PdfReader("C:\\Users\\user1\\Desktop\\sample1.pdf");
Document document = new Document();
document.addHeader("Header Text", "");
FileOutputStream fos = new FileOutputStream("C:\\Users\\user1\\Downloads\\MergeFile.pdf");
PdfCopy copy = new PdfCopy(document, fos);
document.open();
PdfImportedPage page;
PdfCopy.PageStamp stamp;
Phrase phrase;
BaseFont bf = BaseFont.createFont();
Font font = new Font(bf, 9);
int n = reader1.getNumberOfPages();
for (int i = 1; i <= reader1.getNumberOfPages(); i++)
{
page = copy.getImportedPage(reader1, i);
stamp = copy.createPageStamp(page);
ColumnText.showTextAligned(stamp.getOverContent(), Element.ALIGN_CENTER, null, 520, 5, 0);
stamp.alterContents();
copy.addPage(page);
}
for (int i = 1; i <= reader2.getNumberOfPages(); i++)
{
page = copy.getImportedPage(reader2, i);
stamp = copy.createPageStamp(page);
ColumnText.showTextAligned(stamp.getOverContent(), Element.ALIGN_CENTER, null, 520, 5, 0);
stamp.alterContents();
copy.addPage(page);
}
document.close();
reader1.close();
reader2.close();
}
Thanks for the solution in advance !!
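Independent of the library, the merge comes down to deciding the order in which pages are copied. Below is a library-free sketch of that ordering, assuming B should be inserted right after the page of A that contains the marker text (the question could also be read as inserting B before that page). The names MergePlan, PageRef and planMerge are hypothetical, and locating the marker page is left out.

```java
import java.util.ArrayList;
import java.util.List;

public class MergePlan {

    // One page to copy: the source file ("A" or "B") and its 1-based page number.
    static final class PageRef {
        final String source;
        final int page;

        PageRef(String source, int page) {
            this.source = source;
            this.page = page;
        }

        @Override
        public String toString() {
            return source + ":" + page;
        }
    }

    // Returns the copy order: pages 1..markerPage of A, then all pages of B,
    // then the remaining pages of A. markerPage is 1-based.
    static List<PageRef> planMerge(int pagesInA, int pagesInB, int markerPage) {
        if (markerPage < 1 || markerPage > pagesInA) {
            throw new IllegalArgumentException("markerPage out of range: " + markerPage);
        }
        List<PageRef> order = new ArrayList<>();
        for (int i = 1; i <= markerPage; i++) {
            order.add(new PageRef("A", i));
        }
        for (int i = 1; i <= pagesInB; i++) {
            order.add(new PageRef("B", i));
        }
        for (int i = markerPage + 1; i <= pagesInA; i++) {
            order.add(new PageRef("A", i));
        }
        return order;
    }

    public static void main(String[] args) {
        // A has 5 pages, B has 2 pages, marker text found on page 3 of A.
        System.out.println(planMerge(5, 2, 3));
    }
}
```

With the PdfCopy approach shown in the question, each entry in the returned list would correspond to one copy.getImportedPage(reader, page) call followed by copy.addPage(page), using reader1 for "A" and reader2 for "B".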

Use different Table Header on first page

I'm using iText to create a dynamic table with a recurring header, set up in the method createTabularHeader:
PdfPTable table = new PdfPTable(6);
// fill it with some basic information
table.setHeaderRows(1);
On the first page, however, I would like to display different information (the table structure and size remain the same).
Because the dynamic content is obtained in a different method, I can't tell when a new page starts.
I first tried the most primitive variant: drawing a white rectangle over the text and inserting the different text. Since this only concerns the first page, all I have to do is create that rectangle between the two methods.
But the white rectangle isn't opaque and can't cover anything.
By experimenting I found writer.getDirectContent().setColorStroke(BaseColor.WHITE); which sets the text to white. Afterwards I set the BaseColor of my cells back to black manually. However, even though the new text is applied after my createTabularHeader method is called, its layer ends up under the layer of the original text, and the original letters partly cover the new text.
The answer to "How to insert invisible text into a PDF?" gave me the idea of using myPdfContentByte.setTextRenderMode(PdfContentByte.TEXT_RENDER_MODE_INVISIBLE); but that was not much help: the mode only resets on the second page regardless of what I do, and the regular text on the first page stays invisible.
I'm unable to find a proper solution. How can the table header be modified only on the first page?
The solution is not really nice, but it works. As a bonus, I also want to show how you can modify the margins on the first page.
public void createPdf() {
document = new Document();
try {
PdfWriter writer = PDFHead.getWriter(document);
//If it's a letter we have a different indention on the top
if (letterPDF) {
document.setMargins(36, 36, 100, 36);
} else {
document.setMargins(36, 36, 36, 36);
}
document.open();
document.add(createTabularContent());
document.close();
} catch (DocumentException | FileNotFoundException ex) {
try {
document = new Document();
PdfWriter.getInstance(document, new FileOutputStream(FILENAME));
document.open();
document.add(new Phrase(ex.getLocalizedMessage()));
document.close();
Logger.getLogger(Etikette.class.getName()).log(Level.SEVERE, null, ex);
} catch (FileNotFoundException | DocumentException ex1) {
Logger.getLogger(Etikette.class.getName()).log(Level.SEVERE, null, ex1);
}
}
}
The PDFHead is used to create a regular header (the one which appears on every page, not only on pages with the table):
public static PdfWriter getWriter(Document document) throws FileNotFoundException, DocumentException {
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(filename));
HeaderFooter event = new HeaderFooter("Ing. Mario J. Schwaiger", type + " " + DDMMYYYY.format(new java.util.Date()), 835, isLetterPDF(), customerNumber);
writer.setBoxSize("art", new Rectangle(36, 54, 559, 788));
writer.setPageEvent(event);
return writer;
}
In that HeaderFooter event I rely on the fact that the function is called after the PDF is basically created (which is useful for the page number, for instance):
@Override
public void onEndPage(PdfWriter writer, Document document) {
if (isLetter) {
//That's only for the first page, apparently 1 is too late
//I'm open for improvements but that works fine for me
if (writer.getPageNumber() == 0) {
//If it's a letter we use the different margins
document.setMargins(36, 36, 100, 36);
}
if (writer.getPageNumber() == 1) {
PdfContentByte canvas = writer.getDirectContent();
float llx = 460;
float lly = 742;
float urx = 36;
float ury = 607;
//As I add the rectangle in the event here it's
//drawn over the table-header. Seems the tableheader
//is rendered afterwards
Rectangle rect1 = new Rectangle(llx, lly, urx, ury);
rect1.setBackgroundColor(BaseColor.WHITE);
rect1.setBorder(Rectangle.NO_BORDER);
rect1.setBorderWidth(1);
canvas.rectangle(rect1);
ColumnText ct = new ColumnText(canvas);
ct.setSimpleColumn(rect1);
PdfPTable minitable = new PdfPTable(1);
PdfPCell cell = PDFKopf.getKundenCol(PDFHeader.getCustomer(customerNumber));
cell.setBorder(Rectangle.NO_BORDER);
minitable.addCell(cell);
//A single cell is not accepted as an "Element"
//But a table including only a single cell is
ct.addElement(minitable);
try {
ct.go();
} catch (DocumentException ex) {
Logger.getLogger(HeaderFooter.class.getName()).log(Level.SEVERE, null, ex);
}
//In any other case we reset the margins back to normal
//This could be solved in a more intelligent way, feel free
} else {
document.setMargins(36, 36, 36, 36);
}
}
//The regular header of any page...
PdfPTable table = new PdfPTable(4);
try {
table.setWidths(new int[]{16, 16, 16, 2});
table.setWidthPercentage(100);
table.setTotalWidth(527);
table.setLockedWidth(true);
table.getDefaultCell().setFixedHeight(20);
table.getDefaultCell().setBorder(Rectangle.BOTTOM);
table.addCell(header);
PdfPCell cell;
cell = new PdfPCell(new Phrase(mittelteil));
cell.setHorizontalAlignment(Element.ALIGN_CENTER);
cell.setBorder(Rectangle.BOTTOM);
table.addCell(cell);
table.getDefaultCell().setHorizontalAlignment(Element.ALIGN_RIGHT);
table.addCell(String.format("Page %d of ", writer.getPageNumber()));
cell = new PdfPCell(Image.getInstance(total));
cell.setBorder(Rectangle.BOTTOM);
table.addCell(cell);
table.writeSelectedRows(0, -1, 34, y, writer.getDirectContent());
} catch (DocumentException de) {
throw new ExceptionConverter(de);
}
}

The type java.awt.geom.AffineTransform cannot be resolved. It is indirectly referenced from required .class files

I am generating a PDF file using iText, but I get an error on the line canvas.addImage(background, width, 0, 0, height, 20, 430); namely: "The type java.awt.geom.AffineTransform cannot be resolved. It is indirectly referenced from required .class files". On this line I am trying to set a background image. How can I get rid of this error?
public void createPDF() throws NumberFormatException, ParseException
{
list1.add("I-Tax Number : ");
list1.add("Category : ");
list1.add("Service : ");
list1.add("Number : ");
list1.add("Amount : ");
list1.add("Status : ");
list2.add(iTaxNumber);
list2.add("Bill Payment");
list2.add("Idea Postapid");
list2.add("9644212111");
list2.add("100");
list2.add("SUCCESS");
Font trfont = new Font(FontFamily.TIMES_ROMAN, 12, Font.BOLDITALIC,
new BaseColor(130, 130, 140));
Font otherfont = new Font(FontFamily.TIMES_ROMAN, 12, Font.NORMAL,
new BaseColor(160, 160, 160));
Font datefont = new Font(FontFamily.TIMES_ROMAN, 12, Font.BOLD,
new BaseColor(130, 130, 140));
Font thanksFont = new Font(FontFamily.TIMES_ROMAN, 14, Font.BOLDITALIC,
new BaseColor(130, 130, 140));
Document doc = new Document(new Rectangle(792, 612));
try {
String path = Environment.getExternalStorageDirectory().getAbsolutePath() + "/PDF";
File dir = new File(path);
if(!dir.exists())
dir.mkdirs();
Log.d("PDFCreator", "PDF Path: " + path);
File file = new File(dir, "demo98989.pdf");
FileOutputStream fOut = new FileOutputStream(file);
PdfWriter docPdfWriter = PdfWriter.getInstance(doc, fOut);
Paragraph fromTotoDate = new Paragraph("Date : 25-oct-2015", datefont);
fromTotoDate.setAlignment(Element.ALIGN_RIGHT);
fromTotoDate.setIndentationRight(5);
doc.addAuthor("betterThanZero");
doc.addCreationDate();
doc.addProducer();
doc.addCreator("www.xyz.com");
doc.setPageSize(PageSize.A4);
doc.open();
PdfPTable table = setTable(list1, list2);
Paragraph trId = new Paragraph("Transaction Id : 889879899", trfont);
trId.setAlignment(Element.ALIGN_RIGHT);
trId.setIndentationRight(65);
Paragraph p = new Paragraph("\n\n\n\n");
Paragraph nextline = new Paragraph("\n");// for blank line
doc.add(fromTotoDate);
doc.add(p);
doc.add(trId);
int list1size = list1.size();
String size = String.valueOf(list1size);
Image trDetails_Icon;
Bitmap bmp = BitmapFactory.decodeResource(getBaseContext().getResources(),R.drawable.trreceipt);
ByteArrayOutputStream streamTrReceipt = new ByteArrayOutputStream();
bmp.compress(Bitmap.CompressFormat.PNG, 100, streamTrReceipt);
trDetails_Icon = Image.getInstance(streamTrReceipt.toByteArray());
trDetails_Icon.scaleAbsolute(445f, 238f);
trDetails_Icon.setAbsolutePosition(76, 516);
doc.add(trDetails_Icon);
doc.add(nextline);
doc.add(table);
Paragraph thanktouMessage = new Paragraph("Thanks for Being with Us ! ", thanksFont);
thanktouMessage.setAlignment(Element.ALIGN_CENTER);
doc.add(nextline);
doc.add(thanktouMessage);
Font contFont = new Font(FontFamily.TIMES_ROMAN, 10, Font.NORMAL,
new BaseColor(130, 130, 140));
doc.add(nextline);
Paragraph cont = new Paragraph("For more info contact us", contFont);
cont.setAlignment(Element.ALIGN_RIGHT);
cont.setIndentationRight(20);
doc.add(cont);
System.out.println("list2.get(1) = "+list2.get(1));
float width;
float height;
Image background;
Bitmap bmp1 = BitmapFactory.decodeResource(getBaseContext().getResources(),R.drawable.trans);
ByteArrayOutputStream streamTrReceipt1 = new ByteArrayOutputStream();
bmp1.compress(Bitmap.CompressFormat.PNG, 100, streamTrReceipt1);
System.out.println("list2.get(1) = "+list2.get(1)+"ELSE");
width = PageSize.A4.getWidth()-40;
height = (PageSize.A4.getHeight()/2)-25;
background = Image.getInstance(streamTrReceipt1.toByteArray());
PdfContentByte canvas = docPdfWriter.getDirectContentUnder();
canvas.addImage(background, width, 0,0, height, 20, 430);
Toast.makeText(getApplicationContext(), "Created...", Toast.LENGTH_LONG).show();
} catch (DocumentException de) {
Log.e("PDFCreator", "DocumentException:" + de);
} catch (IOException e) {
Log.e("PDFCreator", "ioException:" + e);
}
finally
{
doc.close();
}
}
You are using the wrong iText version. You should use iTextG instead of the "plain Java" iText version. As an Android developer, you know that it's forbidden to use java.awt (and javax.nio,...) classes on Android.
The "plain Java" iText uses classes that aren't whitelisted on Android (e.g. in the PdfGraphics2D class). That's why we've created iTextG. iTextG is essentially identical to iText, except that we've removed all dependencies on the "forbidden classes" (and java.awt.geom.AffineTransform is one of those classes).
There is slightly less functionality in iTextG (we had to drop PdfGraphics2D), but at first sight, I don't see anything that isn't supported in iTextG in your code.
Long story short: replace iText with its Android port iTextG and your problem will be solved.

PdfBox: issues when creating pdf from bmp

When generating a PDF from a BMP, the result is always strange.
Input: "hellowworld.bmp"
Output (only the relevant part)
Why is there a loss of quality?
Why is the image repeated three times?
Why is there a black square (green frame)?
Here's how I test it:
@Test
public final void testWriteSingleBMPtoPDF() throws IOException {
Assert.assertTrue("File exists", TestFileHelper.getBMP(BMPS.HELLOWORLD).exists());
Assert.assertTrue("File readable", TestFileHelper.getBMP(BMPS.HELLOWORLD).canRead());
// The element type must match the declared ArrayList<File>
ArrayList<File> doc = new ArrayList<File>();
doc.add(createPage(BMPS.HELLOWORLD));
File result = null;
try {
result = ConvertPDF.bmpToPDF(doc);
} catch (COSVisitorException e) {
e.printStackTrace();
}
Assert.assertTrue("File exists", result.exists());
Assert.assertTrue("File readable", result.canRead());
System.out.println("Please Check >"+result+"<");
}
Here's the relevant part of my Java implementation:
public static File bmpToPDF(ArrayList<File> inputDoc) throws IOException, COSVisitorException {
PDDocument document = new PDDocument();
String saveTo = "C:\\temp\\" + System.currentTimeMillis() + ".pdf";
for (File bmpPage : inputDoc) {
// One PDF page per input bitmap
PDPage page = new PDPage();
document.addPage(page);
BufferedImage awtImage = ImageIO.read(bmpPage);
PDXObjectImage ximage = new PDPixelMap(document, awtImage);
PDPageContentStream content = new PDPageContentStream(document, page);
// Draw the image at the lower-left corner of the page
content.drawImage(ximage, 0, 0);
content.close();
}
document.save(saveTo);
document.close();
return new File(saveTo);
}
Version of Apache PDFBox is 1.7.1
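One thing that may be worth ruling out is the pixel format of the decoded BMP. ImageIO can return an indexed or 1-bit BufferedImage, and it is only an assumption here, not a confirmed diagnosis, that PDPixelMap in this PDFBox version handles such images badly. The sketch below redraws the image into a plain TYPE_INT_RGB buffer before it is embedded. The class name RgbConverter and the method toRgb are made up for illustration.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class RgbConverter {

    // Returns a copy of the source image in TYPE_INT_RGB with the same
    // dimensions; images already in that format are returned unchanged.
    static BufferedImage toRgb(BufferedImage source) {
        if (source.getType() == BufferedImage.TYPE_INT_RGB) {
            return source;
        }
        BufferedImage rgb = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        try {
            // Drawing into the RGB buffer converts the pixel format.
            g.drawImage(source, 0, 0, null);
        } finally {
            g.dispose();
        }
        return rgb;
    }

    public static void main(String[] args) {
        // A 1-bit image stands in for the decoded BMP.
        BufferedImage binary = new BufferedImage(4, 2, BufferedImage.TYPE_BYTE_BINARY);
        BufferedImage rgb = toRgb(binary);
        System.out.println(rgb.getType() == BufferedImage.TYPE_INT_RGB);
    }
}
```

In bmpToPDF, this would wrap the result of ImageIO.read(bmpPage) before it is passed to PDPixelMap. If the output is unchanged, the pixel format is probably not the cause.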

how to clear this error iTextPdf Document error?

I am getting a "The document has no pages." runtime error in this program:
public class Windows {
public static void main(String[] args) throws FileNotFoundException, DocumentException {
java.io.File f = new java.io.File("c:/temp/text.pdf");
java.io.FileOutputStream fo = new java.io.FileOutputStream(f);
com.itextpdf.text.Document d = new com.itextpdf.text.Document(PageSize.A5, 50, 50, 50, 50);
PdfWriter pw = PdfWriter.getInstance(d, fo);
d.open();
Boolean b0 = d.newPage();
Boolean b1 = d.addAuthor("Tamil Selvan");
d.addCreator("Tamil Selvan");
d.addHeader("Tamil Selvan Header name", "Header Content");
d.addKeywords("These are the keywords for the document");
d.addSubject("These are the subjects for the Document");
d.addTitle("The Title Of the Document");
d.close();
System.out.println("Is the Documnet is Opened "+b0);
System.out.println("Is the Documnet is Working "+b1);
};
}
How can I run this?
I believe the problem is that you have provided metadata for the PDF but no actual body content.
For example, you can try
d.add(new Paragraph("Some random text"));
and see whether this resolves the error you are facing.
