Making more than one PDF document in Java

This is my code:
try {
    dozen = magazijn.getFfd().vraagDozenOp();
    for (int i = 0; i < dozen.size(); i++) {
        PdfWriter.getInstance(doc, new FileOutputStream("Order" + x + ".pdf"));
        System.out.println("Writer instance created");
        doc.open();
        System.out.println("doc open");
        Paragraph ordernummer = new Paragraph(order.getOrdernummer());
        doc.add(ordernummer);
        doc.add(Chunk.NEWLINE);
        for (String t : text) {
            Paragraph klant = new Paragraph(t);
            doc.add(klant);
        }
        doc.add(Chunk.NEWLINE);
        Paragraph datum = new Paragraph(order.getDatum());
        doc.add(datum);
        doc.add(Chunk.NEWLINE);
        artikelen = magazijn.getFfd().vraagArtikelenOp(i);
        for (Artikel a : artikelen) {
            artikelnr.add(a.getArtikelNaam());
        }
        for (String nr : artikelnr) {
            Paragraph Artikelnr = new Paragraph(nr);
            doc.add(Artikelnr);
        }
        doc.close();
        artikelnr.clear();
        x++;
        System.out.println("doc closed");
    }
} catch (Exception e) {
    System.out.println(e);
}
I get this exception: com.itextpdf.text.DocumentException: The document has been closed. You can't add any Elements.
Can someone help me fix this so that the other PDFs can be created and paragraphs added to them?

Your intent is not entirely clear from your code and question, so I'm going to operate under the following assumptions:
1. You are creating a report for each box you're processing.
2. Each report needs to be a separate PDF file.
You're getting a DocumentException on the second iteration of the loop because you're trying to add content to a Document that was closed in the previous iteration via doc.close();. doc.close() finalizes the Document and writes everything still pending to any linked PdfWriter.
If you wish to create a separate PDF for each box, you need to create a separate Document inside your loop as well, since creating a new PdfWriter via PdfWriter.getInstance(doc, new FileOutputStream("Order" + x + ".pdf")); will not create a new Document on its own.
If I'm wrong about assumption 2 and you wish to add everything to a single PDF, move doc.close(); outside of the loop and create only a single PdfWriter.
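
The lifecycle problem described above can be sketched without iText on the classpath. The sketch below uses a hypothetical stand-in class, OneShotReport (not an iText class), that refuses additions after close(), much like the Document in the question. ReportPerBox and buildReports are likewise illustrative names. In real code, the line that creates the stand-in would correspond to creating a new Document and calling PdfWriter.getInstance(...) on it inside the loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a one-shot document. This is NOT an iText class;
// it only mimics the "closed means closed" behaviour seen in the question.
final class OneShotReport {
    private final String fileName;
    private final List<String> lines = new ArrayList<>();
    private boolean closed = false;

    OneShotReport(String fileName) {
        this.fileName = fileName;
    }

    void add(String line) {
        if (closed) {
            throw new IllegalStateException("The document has been closed. You can't add any Elements.");
        }
        lines.add(line);
    }

    void close() {
        closed = true;
    }

    String fileName() {
        return fileName;
    }

    List<String> lines() {
        return lines;
    }
}

final class ReportPerBox {
    // Creates a fresh report inside the loop, one per box, instead of reusing a closed one.
    static List<OneShotReport> buildReports(List<List<String>> boxes) {
        List<OneShotReport> reports = new ArrayList<>();
        int x = 1;
        for (List<String> box : boxes) {
            OneShotReport report = new OneShotReport("Order" + x + ".pdf"); // new object per iteration
            for (String article : box) {
                report.add(article);
            }
            report.close();
            reports.add(report);
            x++;
        }
        return reports;
    }

    public static void main(String[] args) {
        List<List<String>> boxes = Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c"));
        for (OneShotReport r : buildReports(boxes)) {
            System.out.println(r.fileName() + " -> " + r.lines());
        }
    }
}
```

Reusing one report across iterations would throw on the first add() of the second iteration, which is the situation the exception in the question describes.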

You can try something like this using Apache PDFBox:
File outputFile = new File(path);
outputFile.createNewFile();
PDDocument newDoc = new PDDocument();
Then create a PDPage and write whatever you want to write on that page. Once your page is ready, add it to newDoc, and at the end save and close the document:
newDoc.save(outputFile);
newDoc.close();
Repeat this dozen.size() times, changing the file name in path for every new document.

Related

iText Fill Form / Copy Page to new Document

I'm using iText to fill a template PDF which contains an AcroForm.
Now I want to use this template to create a new PDF with a dynamic number of pages.
My idea is to fill the template PDF, copy the page with the filled-in fields and add it to a new file. The main problem is that our customer wants to design the template themselves, so I'm not sure whether this is the right way to solve the problem.
I've written the code below, but it doesn't work yet: I get the error com.itextpdf.io.IOException: PDF header not found.
My code:
x = 1;
try (PdfDocument finalDoc = new PdfDocument(new PdfWriter("C:\\Users\\...Final.pdf"))) {
    for (HashMap<String, String> map : testValues) {
        String path1 = "C:\\Users\\.....Temp.pdf";
        InputStream template = templateValues.get("Template");
        PdfWriter writer = new PdfWriter(path1);
        try (PdfDocument pdfDoc = new PdfDocument(new PdfReader(template), writer)) {
            PdfAcroForm form = PdfAcroForm.getAcroForm(pdfDoc, true);
            for (HashMap.Entry<String, String> map2 : map.entrySet()) {
                if (form.getField(map2.getKey()) != null) {
                    Map<String, PdfFormField> fields = form.getFormFields();
                    fields.get(map2.getKey()).setValue(map2.getValue());
                }
            }
        } catch (IOException | PdfException ex) {
            System.err.println("Ex2: " + ex.getMessage());
        }
        if (x != 0 && (x % 5) == 0) {
            try (PdfDocument tempDoc = new PdfDocument(new PdfReader(path1))) {
                PdfPage page = tempDoc.getFirstPage();
                finalDoc.addPage(page.copyTo(finalDoc));
            } catch (IOException | PdfException ex) {
                System.err.println("Ex3: " + ex.getMessage());
            }
        }
        x++;
    }
} catch (IOException | PdfException ex) {
    System.err.println("Ex: " + ex.getMessage());
}

Part 1 - PDF Header is Missing
This appears to be caused by re-reading, within a loop, an InputStream that has already been read (and, depending on the configuration of the PdfReader, closed). The fix depends on the specific type of InputStream being used. If you want to leave it as a simple InputStream (vs. a more specific and more capable InputStream type), you'll need to first read the bytes from the stream into memory (e.g. into a ByteArrayOutputStream) and then create your PdfReaders from those bytes.
For example:
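// read the template into memory once, before the loop
ByteArrayOutputStream templateBuffer = new ByteArrayOutputStream();
int c;
// read() returns -1 at end of stream; a byte value of 0 is valid data
while ((c = template.read()) != -1) templateBuffer.write(c);
byte[] templateBytes = templateBuffer.toByteArray();
for (/* your loop */) {
    ...
    // each iteration gets a fresh stream over the same bytes
    PdfDocument filledInAcroFormTemplate = new PdfDocument(new PdfReader(new ByteArrayInputStream(templateBytes)), new PdfWriter(tmp));
    ...
}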
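
As a stand-alone illustration of the buffering idea, the following sketch uses only java.io and makes no iText calls. TemplateBuffer and readFully are hypothetical names, and the six placeholder bytes merely stand in for a real template. It shows that the original stream is exhausted after one pass, while the buffered bytes can back a fresh stream on every iteration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class TemplateBuffer {
    // Reads the whole stream into memory so the content can be replayed any number of times.
    static byte[] readFully(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        try {
            // read(byte[]) returns -1 at end of stream
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Placeholder bytes, not a valid PDF; the zero byte is included on purpose.
        ByteArrayInputStream template = new ByteArrayInputStream(new byte[] {'%', 'P', 'D', 'F', 0, 1});
        byte[] bytes = readFully(template);
        System.out.println("buffered bytes: " + bytes.length);
        System.out.println("next read from the original stream: " + template.read());
        for (int i = 0; i < 3; i++) {
            ByteArrayInputStream fresh = new ByteArrayInputStream(bytes);
            System.out.println("iteration " + i + " starts with: " + (char) fresh.read());
        }
    }
}
```

A reader handed the exhausted original stream sees no bytes at all, which is consistent with a "PDF header not found" error on later iterations.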
Part 2 - other problems
A couple of things:
- Make sure to grab the recently released 7.0.1 version of iText, since it includes a couple of fixes for AcroForm handling.
- You can probably get away with using ByteArrayOutputStreams for your temporary PDFs (vs. writing them out to files). I'll use this approach in the example below.
- PdfDocument/PdfPage is in the "kernel" module, yet AcroForms are in the "form" module (meaning PdfPage is intentionally unaware of AcroForms). IPdfPageExtraCopier is the bridge between the modules. In order to properly copy AcroForms, you need to use the two-arg copyTo() version, passing an instance of PdfPageFormCopier.
- Field names must be unique in the document (the "absolute" field name, that is; I'll skip field hierarchies for now). Since we're looping through and adding the fields from the template multiple times, we need a strategy for renaming the fields to ensure uniqueness (the current API is actually a little bit clunky in this area).
File acroFormTemplate = new File("someTemplate.pdf");
Map<String, String> someMapOfFieldToValues = new HashMap<>();
try (
    PdfDocument finalOutput = new PdfDocument(new PdfWriter(new FileOutputStream(new File("finalOutput.pdf"))));
) {
    for (/* some looping condition */ int x = 0; x < 5; x++) {
        // for each iteration of the loop, create a temporary in-memory
        // PDF to handle form field edits.
        ByteArrayOutputStream tmp = new ByteArrayOutputStream();
        try (
            PdfDocument filledInAcroFormTemplate = new PdfDocument(new PdfReader(new FileInputStream(acroFormTemplate)), new PdfWriter(tmp));
        ) {
            PdfAcroForm acroForm = PdfAcroForm.getAcroForm(filledInAcroFormTemplate, true);
            for (PdfFormField field : acroForm.getFormFields().values()) {
                if (someMapOfFieldToValues.containsKey(field.getFieldName())) {
                    field.setValue(someMapOfFieldToValues.get(field.getFieldName()));
                }
            }
            // NOTE that because we're adding the template multiple times
            // we need to adopt a field renaming strategy to ensure field
            // uniqueness in the final document. For demonstration's sake
            // we'll just rename them prefixed w/ our loop counter
            List<String> fieldNames = new ArrayList<>();
            fieldNames.addAll(acroForm.getFormFields().keySet()); // avoid ConcurrentModificationException
            for (String fieldName : fieldNames) {
                acroForm.renameField(fieldName, x + "_" + fieldName);
            }
        }
        // the temp PDF needs to be "closed" for all the PDF finalization
        // magic to happen...so open up new read-only version to act as
        // the source for the merging from our in-memory bucket-o-bytes
        try (
            PdfDocument readOnlyFilledInAcroFormTemplate = new PdfDocument(new PdfReader(new ByteArrayInputStream(tmp.toByteArray())));
        ) {
            // although PdfPage.copyTo will probably work for simple pages, PdfDocument.copyPagesTo
            // is a more comprehensive copy (wider support for copying Outlines and Tagged content)
            // so it's more suitable for general page-copy use. Also, since we're copying AcroForm
            // content, we need to use the PdfPageFormCopier
            readOnlyFilledInAcroFormTemplate.copyPagesTo(1, 1, finalOutput, new PdfPageFormCopier());
        }
    }
}

Close your PdfDocuments when you are done with adding content to them.

addNamedDestination not inserting destinations into new PDF

I'm manipulating a PDF available from https://www.census.gov/content/dam/Census/library/publications/2015/econ/g13-aspef.pdf. Part of the manipulation is to copy the pages from the original PDF to a new PDF, and also to copy the named destinations. The iText Java API method addNamedDestinations isn't inserting the destinations into the new PDF.
Below is my code segment which is based on the example in the book iText in Action, 2nd edition.
try {
    PdfReader reader1 = new PdfReader("C:\\Temp\\g13-aspef.pdf");
    Document doc = new Document();
    PdfCopy copy2 = new PdfCopy(doc, fileout);
    doc.open();
    reader1.consolidateNamedDestinations();
    int n = reader1.getNumberOfPages();
    for (int i = 0; i < n;) {
        copy2.addPage(copy2.getImportedPage(reader1, ++i));
    }
    /* myDests indeed includes all 23 destinations appearing in the original PDF. */
    HashMap<String,String> myDests = SimpleNamedDestination.getNamedDestination(reader1, false);
    /* Use addNamedDestinations to insert the original destinations into the new PDF. */
    copy2.addNamedDestinations(myDests, 0);
    doc.close();
} catch (IOException e) {
    System.out.println("Could not copy");
}
However, when I open the created PDF, the pages are there, but not the destinations. Why don't I see the destinations in the new PDF?
Thank you in advance!

continuing text to next page

I want to generate a PDF of questions and their options using iText. I am able to generate the PDF, but the problem is that sometimes a question gets printed at the end of a page and its options go to the next page.
How can I determine that a question and its options will not fit on the same page?
In other words, if a question and its options will not fit together on the current page, they must both be placed on the next page.
UPDATED
com.itextpdf.text.Document document = new com.itextpdf.text.Document(PageSize.A4,50,50,15,15);
ByteArrayOutputStream OutputStream = new ByteArrayOutputStream();
PdfWriter writer = PdfWriter.getInstance(document, OutputStream);
document.open();
Paragraph paragraph = new Paragraph("Paper Name Here",new Font(FontFamily.TIMES_ROMAN,15,Font.BOLD));
paragraph.setAlignment(Element.ALIGN_CENTER);
document.add(paragraph);
document.addTitle("Paper Name Here");
document.addAuthor("corp");
com.itextpdf.text.List list = new com.itextpdf.text.List(true);
for (long i = 1; i <= 20; i++) {
    List<MultipleChoiceSingleCorrect> multipleChoiceSingleCorrects = new MultipleChoiceSingleCorrectServicesImpl().getItemDetailsByItemID(i);
    for (MultipleChoiceSingleCorrect multipleChoiceSingleCorrect : multipleChoiceSingleCorrects) {
        list.add(multipleChoiceSingleCorrect.getItemText());
        RomanList oplist = new RomanList();
        oplist.setIndentationLeft(20);
        for (OptionSingleCorrect optionSingleCorrect : multipleChoiceSingleCorrect.getOptionList()) {
            oplist.add(optionSingleCorrect.getOptionText());
        }
        list.add(oplist);
    }
}
document.add(list);
document.close();
After this I'm getting abnormal page breaks: sometimes a question is at the end of a page and its options jump to the next page.

What you are interested in are the setKeepTogether(boolean) methods:
- for Paragraph
- for PdfPTable
This will keep the object on one page, forcing the creation of a new page if the content doesn't fit in the space remaining on the current page.

With the help of Alexis Pigeon I got this working, so special thanks to him.
I add the question text to a Paragraph and put all options in a list. The option list oplist is added to that paragraph, the paragraph is added to a ListItem, and the ListItem is added to a master list.
This resolves the problem of a question being split across two pages, but now I'm not getting question numbers, even though I already created the master list as numbered: com.itextpdf.text.List list = new com.itextpdf.text.List(true);
Code:
try {
    String Filename="PaperName.pdf";
    com.itextpdf.text.Document document = new com.itextpdf.text.Document(PageSize.A4,50,50,15,15);
    ByteArrayOutputStream OutputStream = new ByteArrayOutputStream();
    PdfWriter writer = PdfWriter.getInstance(document, OutputStream);
    document.open();
    Paragraph paragraph = new Paragraph("Paper Name Here",new Font(FontFamily.TIMES_ROMAN,15,Font.BOLD));
    paragraph.setAlignment(Element.ALIGN_CENTER);
    paragraph.setSpacingAfter(20);
    document.add(paragraph);
    document.addTitle("Paper Name Here");
    document.addAuthor("crop");
    document.addCreator("crop");
    com.itextpdf.text.List list = new com.itextpdf.text.List(true);
    for (long i = 1; i <= 20; i++) {
        List<MultipleChoiceSingleCorrect> multipleChoiceSingleCorrects = new MultipleChoiceSingleCorrectServicesImpl().getItemDetailsByItemID(i);
        for (MultipleChoiceSingleCorrect multipleChoiceSingleCorrect : multipleChoiceSingleCorrects) {
            Paragraph paragraph2 = new Paragraph();
            paragraph2.setKeepTogether(true);
            paragraph2.add(multipleChoiceSingleCorrect.getItemText());
            paragraph2.add(Chunk.NEWLINE);
            RomanList oplist = new RomanList();
            oplist.setIndentationLeft(20);
            for (OptionSingleCorrect optionSingleCorrect : multipleChoiceSingleCorrect.getOptionList()) {
                oplist.add(optionSingleCorrect.getOptionText());
            }
            paragraph2.add(oplist);
            paragraph2.setSpacingBefore(20);
            ListItem listItem = new ListItem();
            listItem.setKeepTogether(true);
            listItem.add(paragraph2);
            list.add(listItem);
        }
    }
    document.add(list);
    document.close();
    response.setContentLength(OutputStream.size());
    response.setContentType("application/pdf");
    response.setHeader("Content-disposition", "attachment; filename=" + Filename);
    ServletOutputStream out = response.getOutputStream();
    OutputStream.writeTo(out);
    out.flush();
} catch (Exception e) {
    e.printStackTrace();
}

Remove page from PDF

I'm currently using iText and I'm wondering if there is a way to delete a page from a PDF file?
I have opened it up with a reader etc., and I want to remove a page before it is then saved back to a new file; how can I do that?

The 'better' way to 'delete' pages is doing
reader.selectPages("1-5,10-12");
Which means we only select pages 1-5, 10-12 effectively 'deleting' pages 6-9.
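
If the pages to drop are known as numbers rather than as a range string, the string for selectPages can be derived from them. The helper below is only a sketch: PageRanges and keepRanges are hypothetical names, not part of iText, and it assumes 1-based page numbers and the comma-and-hyphen range syntax shown above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class PageRanges {
    // Builds a range string of the pages that remain after removing pagesToRemove.
    // Consecutive kept pages are collapsed into "start-end"; single pages are written alone.
    static String keepRanges(int totalPages, Set<Integer> pagesToRemove) {
        StringBuilder sb = new StringBuilder();
        int start = -1; // first page of the current run of kept pages, or -1 if no run is open
        // iterate one past the last page so that a run ending on the last page gets flushed
        for (int page = 1; page <= totalPages + 1; page++) {
            boolean keep = page <= totalPages && !pagesToRemove.contains(page);
            if (keep && start == -1) {
                start = page;
            } else if (!keep && start != -1) {
                int end = page - 1;
                if (sb.length() > 0) {
                    sb.append(',');
                }
                sb.append(start);
                if (end > start) {
                    sb.append('-').append(end);
                }
                start = -1;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Set<Integer> remove = new HashSet<>(Arrays.asList(6, 7, 8, 9));
        System.out.println(keepRanges(12, remove));
    }
}
```

With 12 pages and pages 6 to 9 removed, keepRanges returns "1-5,10-12", the string used above. If every page is removed the result is an empty string, which the caller should handle before calling selectPages.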

Create a reader for the existing PDF file:
PdfReader pdfReader = new PdfReader("source pdf file path");
Now restrict the reader to the pages you want to keep:
pdfReader.selectPages("1-5,15-20");
Then create a PdfStamper to write the changes to a file:
PdfStamper pdfStamper = new PdfStamper(pdfReader,
        new FileOutputStream("destination pdf file path"));
Close the PdfStamper:
pdfStamper.close();
It will close the PdfReader too.
Cheers.

For iText 7 I found this example:
PdfReader pdfReader = new PdfReader(PATH + name + ".pdf");
PdfDocument srcDoc = new PdfDocument(pdfReader);
PdfDocument resultDoc = new PdfDocument(new PdfWriter(PATH + name + "_cut.pdf"));
resultDoc.initializeOutlines();
srcDoc.copyPagesTo(1, 2, resultDoc);
resultDoc.close();
srcDoc.close();
See also here: clone-reordering-pages
and here: clone-splitting-pdf-file

You can use a PdfStamper in combination with PdfCopy.
In this answer it is explained how to copy a whole document. If you change the criteria for the loop in the sample code you can remove the pages you don't need.

Here is a page-removal function ready for real-life usage. It is proven to work with iText 2.1.7, and it avoids "stringly typed" page ranges.
/**
 * Removes given pages from a document.
 *
 * @param reader document
 * @param pagesToRemove pages to remove; 1-based
 */
public static void removePages(PdfReader reader, int... pagesToRemove) {
    int pagesTotal = reader.getNumberOfPages();
    List<Integer> allPages = new ArrayList<>(pagesTotal);
    for (int i = 1; i <= pagesTotal; i++) {
        allPages.add(i);
    }
    for (int page : pagesToRemove) {
        // box the value so that remove(Object) is called, not remove(int index)
        allPages.remove(Integer.valueOf(page));
    }
    reader.selectPages(allPages);
}

Apache POI HWPF - problem in convert doc file to pdf

I am currently working on a Java project that uses Apache POI.
In this project I want to convert a .doc file to a PDF file. The conversion succeeds, but I only get the text in the PDF, without any text style or text colour.
My PDF file comes out black and white, while my .doc file is coloured and uses different text styles.
This is my code:
POIFSFileSystem fs = null;
Document document = new Document();
try {
    System.out.println("Starting the test");
    fs = new POIFSFileSystem(new FileInputStream("/document/test2.doc"));
    HWPFDocument doc = new HWPFDocument(fs);
    WordExtractor we = new WordExtractor(doc);
    OutputStream file = new FileOutputStream(new File("/document/test.pdf"));
    PdfWriter writer = PdfWriter.getInstance(document, file);
    Range range = doc.getRange();
    document.open();
    writer.setPageEmpty(true);
    document.newPage();
    writer.setPageEmpty(true);
    String[] paragraphs = we.getParagraphText();
    for (int i = 0; i < paragraphs.length; i++) {
        org.apache.poi.hwpf.usermodel.Paragraph pr = range.getParagraph(i);
        // CharacterRun run = pr.getCharacterRun(i);
        // run.setBold(true);
        // run.setCapitalized(true);
        // run.setItalic(true);
        paragraphs[i] = paragraphs[i].replaceAll("\\cM?\r?\n", "");
        System.out.println("Length:" + paragraphs[i].length());
        System.out.println("Paragraph" + i + ": " + paragraphs[i].toString());
        // add the paragraph to the document
        document.add(new Paragraph(paragraphs[i]));
    }
    System.out.println("Document testing completed");
} catch (Exception e) {
    System.out.println("Exception during test");
    e.printStackTrace();
} finally {
    // close the document
    document.close();
}
}
Please help me.
Thanks in advance.

If you look at Apache Tika, there's a good example of reading some style information from a HWPF document. The code in Tika generates HTML based on the HWPF contents, but you should find that something very similar works for your case.
The Tika class is
https://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/WordExtractor.java
One thing to note about word documents is that everything in any one Character Run has the same formatting applied to it. A Paragraph is therefore made up of one or more Character Runs. Some styling is applied to a Paragraph, and other parts are done on the runs. Depending on what formatting interests you, it may therefore be on the paragraph or the run.

If you use WordExtractor, you will get text only. Try using the CharacterRun class instead; it gives you the style along with the text. Please refer to the following sample code.
Range range = doc.getRange();
for (int i = 0; i < range.numParagraphs(); i++) {
    org.apache.poi.hwpf.usermodel.Paragraph poiPara = range.getParagraph(i);
    int j = 0;
    while (true) {
        CharacterRun run = poiPara.getCharacterRun(j++);
        System.out.println("Color " + run.getColor());
        System.out.println("Font size " + run.getFontSize());
        System.out.println("Font Name " + run.getFontName());
        System.out.println(run.isBold() + " " + run.isItalic() + " " + run.getUnderlineCode());
        System.out.println("Text is " + run.text());
        if (run.getEndOffset() == poiPara.getEndOffset()) {
            break;
        }
    }
}
