how to get a whole content of file using apache-poi?

how to get a whole content of file using apache-poi? - java

I try to read file .docx with help java api Apache POI. I use:
public static String view(String nameDoc){
String text = null;
try{
XWPFDocument docx = new XWPFDocument(
new FileInputStream(nameDoc));
XWPFWordExtractor we = new XWPFWordExtractor(docx);
text = we.getText();
we.close();
docx.close();
}catch (Exception e){
e.printStackTrace();
}
return text;
}
In this case i get only a text of file, but my file includes a text, table, pictures... How can i get full content of file?

String contents = "";
try {
System.out.println("Starting the test");
POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream("D:/Resume.doc"));
HWPFDocument doc = new HWPFDocument(fs);
WordExtractor we = new WordExtractor(doc);
OutputStream file = new FileOutputStream(new File("D:/test.pdf"));
PdfWriter parser = PdfWriter.getInstance(doc, file);
parser.parse();
PDDocument pdfDocument = parser.getPDDocument();
PDFTextStripper stripper = new PDFTextStripper();
contents = stripper.getText(pdfDocument);
pdfDocument.close();
} catch (Exception e) {
logger.error(e.getMessage());
}
In contents you get full content of file.

Related

How to merge or concat to byte [ ] arrays of PDF's type in java but am getting pdf1 as output [duplicate]

I want to merge many PDF files into one using PDFBox and this is what I've done:
PDDocument document = new PDDocument();
for (String pdfFile: pdfFiles) {
PDDocument part = PDDocument.load(pdfFile);
List<PDPage> list = part.getDocumentCatalog().getAllPages();
for (PDPage page: list) {
document.addPage(page);
}
part.close();
}
document.save("merged.pdf");
document.close();
Where pdfFiles is an ArrayList<String> containing all the PDF files.
When I'm running the above, I'm always getting:
org.apache.pdfbox.exceptions.COSVisitorException: Bad file descriptor
Am I doing something wrong? Is there any other way of doing it?

Why not use the PDFMergerUtility of pdfbox?
PDFMergerUtility ut = new PDFMergerUtility();
ut.addSource(...);
ut.addSource(...);
ut.addSource(...);
ut.setDestinationFileName(...);
ut.mergeDocuments();

A quick Google search returned this bug: "Bad file descriptor while saving a document w. imported PDFs".
It looks like you need to keep the PDFs to be merged open, until after you have saved and closed the combined PDF.

This is a ready to use code, merging four pdf files with itext.jar from http://central.maven.org/maven2/com/itextpdf/itextpdf/5.5.0/itextpdf-5.5.0.jar, more on http://tutorialspointexamples.com/
import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfWriter;
/**
* This class is used to merge two or more
* existing pdf file using iText jar.
*/
public class PDFMerger {
static void mergePdfFiles(List<InputStream> inputPdfList,
OutputStream outputStream) throws Exception{
//Create document and pdfReader objects.
Document document = new Document();
List<PdfReader> readers =
new ArrayList<PdfReader>();
int totalPages = 0;
//Create pdf Iterator object using inputPdfList.
Iterator<InputStream> pdfIterator =
inputPdfList.iterator();
// Create reader list for the input pdf files.
while (pdfIterator.hasNext()) {
InputStream pdf = pdfIterator.next();
PdfReader pdfReader = new PdfReader(pdf);
readers.add(pdfReader);
totalPages = totalPages + pdfReader.getNumberOfPages();
}
// Create writer for the outputStream
PdfWriter writer = PdfWriter.getInstance(document, outputStream);
//Open document.
document.open();
//Contain the pdf data.
PdfContentByte pageContentByte = writer.getDirectContent();
PdfImportedPage pdfImportedPage;
int currentPdfReaderPage = 1;
Iterator<PdfReader> iteratorPDFReader = readers.iterator();
// Iterate and process the reader list.
while (iteratorPDFReader.hasNext()) {
PdfReader pdfReader = iteratorPDFReader.next();
//Create page and add content.
while (currentPdfReaderPage <= pdfReader.getNumberOfPages()) {
document.newPage();
pdfImportedPage = writer.getImportedPage(
pdfReader,currentPdfReaderPage);
pageContentByte.addTemplate(pdfImportedPage, 0, 0);
currentPdfReaderPage++;
}
currentPdfReaderPage = 1;
}
//Close document and outputStream.
outputStream.flush();
document.close();
outputStream.close();
System.out.println("Pdf files merged successfully.");
}
public static void main(String args[]){
try {
//Prepare input pdf file list as list of input stream.
List<InputStream> inputPdfList = new ArrayList<InputStream>();
inputPdfList.add(new FileInputStream("..\\pdf\\pdf_1.pdf"));
inputPdfList.add(new FileInputStream("..\\pdf\\pdf_2.pdf"));
inputPdfList.add(new FileInputStream("..\\pdf\\pdf_3.pdf"));
inputPdfList.add(new FileInputStream("..\\pdf\\pdf_4.pdf"));
//Prepare output stream for merged pdf file.
OutputStream outputStream =
new FileOutputStream("..\\pdf\\MergeFile_1234.pdf");
//call method to merge pdf files.
mergePdfFiles(inputPdfList, outputStream);
} catch (Exception e) {
e.printStackTrace();
}
}
}

Multiple pdf merged method using org.apache.pdfbox:
public void mergePDFFiles(List<File> files,
String mergedFileName) {
try {
PDFMergerUtility pdfmerger = new PDFMergerUtility();
for (File file : files) {
PDDocument document = PDDocument.load(file);
pdfmerger.setDestinationFileName(mergedFileName);
pdfmerger.addSource(file);
pdfmerger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
document.close();
}
} catch (IOException e) {
logger.error("Error to merge files. Error: " + e.getMessage());
}
}
From main program, call mergePDFFiles method using list of files and target file name.
String mergedFileName = "Merged.pdf";
mergePDFFiles(files, mergedFileName);
After calling mergePDFFiles, load merged file
File mergedFile = new File(mergedFileName);

package article14;
import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.util.PDFMergerUtility;
public class Pdf
{
public static void main(String args[])
{
new Pdf().createNew();
new Pdf().combine();
}
public void combine()
{
try
{
PDFMergerUtility mergePdf = new PDFMergerUtility();
String folder ="pdf";
File _folder = new File(folder);
File[] filesInFolder;
filesInFolder = _folder.listFiles();
for (File string : filesInFolder)
{
mergePdf.addSource(string);
}
mergePdf.setDestinationFileName("Combined.pdf");
mergePdf.mergeDocuments();
}
catch(Exception e)
{
}
}
public void createNew()
{
PDDocument document = null;
try
{
String filename="test.pdf";
document=new PDDocument();
PDPage blankPage = new PDPage();
document.addPage( blankPage );
document.save( filename );
}
catch(Exception e)
{
}
}
}

If you want to combine two files where one overlays the other (example: document A is a template and document B has the text you want to put on the template), this works:
after creating "doc", you want to write your template (templateFile) on top of that -
PDDocument watermarkDoc = PDDocument.load(getServletContext()
.getRealPath(templateFile));
Overlay overlay = new Overlay();
overlay.overlay(watermarkDoc, doc);

Using iText (existing PDF in bytes)
public static byte[] mergePDF(List<byte[]> pdfFilesAsByteArray) throws DocumentException, IOException {
ByteArrayOutputStream outStream = new ByteArrayOutputStream();
Document document = null;
PdfCopy writer = null;
for (byte[] pdfByteArray : pdfFilesAsByteArray) {
try {
PdfReader reader = new PdfReader(pdfByteArray);
int numberOfPages = reader.getNumberOfPages();
if (document == null) {
document = new Document(reader.getPageSizeWithRotation(1));
writer = new PdfCopy(document, outStream); // new
document.open();
}
PdfImportedPage page;
for (int i = 0; i < numberOfPages;) {
++i;
page = writer.getImportedPage(reader, i);
writer.addPage(page);
}
}
catch (Exception e) {
e.printStackTrace();
}
}
document.close();
outStream.close();
return outStream.toByteArray();
}

How can I append an existing PDF to a generated PDF with flying saucer for Java [duplicate]

I want to merge many PDF files into one using PDFBox and this is what I've done:
PDDocument document = new PDDocument();
for (String pdfFile: pdfFiles) {
PDDocument part = PDDocument.load(pdfFile);
List<PDPage> list = part.getDocumentCatalog().getAllPages();
for (PDPage page: list) {
document.addPage(page);
}
part.close();
}
document.save("merged.pdf");
document.close();
Where pdfFiles is an ArrayList<String> containing all the PDF files.
When I'm running the above, I'm always getting:
org.apache.pdfbox.exceptions.COSVisitorException: Bad file descriptor
Am I doing something wrong? Is there any other way of doing it?

Why not use the PDFMergerUtility of pdfbox?
PDFMergerUtility ut = new PDFMergerUtility();
ut.addSource(...);
ut.addSource(...);
ut.addSource(...);
ut.setDestinationFileName(...);
ut.mergeDocuments();

A quick Google search returned this bug: "Bad file descriptor while saving a document w. imported PDFs".
It looks like you need to keep the PDFs to be merged open, until after you have saved and closed the combined PDF.

This is a ready to use code, merging four pdf files with itext.jar from http://central.maven.org/maven2/com/itextpdf/itextpdf/5.5.0/itextpdf-5.5.0.jar, more on http://tutorialspointexamples.com/
import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfWriter;
/**
* This class is used to merge two or more
* existing pdf file using iText jar.
*/
public class PDFMerger {
static void mergePdfFiles(List<InputStream> inputPdfList,
OutputStream outputStream) throws Exception{
//Create document and pdfReader objects.
Document document = new Document();
List<PdfReader> readers =
new ArrayList<PdfReader>();
int totalPages = 0;
//Create pdf Iterator object using inputPdfList.
Iterator<InputStream> pdfIterator =
inputPdfList.iterator();
// Create reader list for the input pdf files.
while (pdfIterator.hasNext()) {
InputStream pdf = pdfIterator.next();
PdfReader pdfReader = new PdfReader(pdf);
readers.add(pdfReader);
totalPages = totalPages + pdfReader.getNumberOfPages();
}
// Create writer for the outputStream
PdfWriter writer = PdfWriter.getInstance(document, outputStream);
//Open document.
document.open();
//Contain the pdf data.
PdfContentByte pageContentByte = writer.getDirectContent();
PdfImportedPage pdfImportedPage;
int currentPdfReaderPage = 1;
Iterator<PdfReader> iteratorPDFReader = readers.iterator();
// Iterate and process the reader list.
while (iteratorPDFReader.hasNext()) {
PdfReader pdfReader = iteratorPDFReader.next();
//Create page and add content.
while (currentPdfReaderPage <= pdfReader.getNumberOfPages()) {
document.newPage();
pdfImportedPage = writer.getImportedPage(
pdfReader,currentPdfReaderPage);
pageContentByte.addTemplate(pdfImportedPage, 0, 0);
currentPdfReaderPage++;
}
currentPdfReaderPage = 1;
}
//Close document and outputStream.
outputStream.flush();
document.close();
outputStream.close();
System.out.println("Pdf files merged successfully.");
}
public static void main(String args[]){
try {
//Prepare input pdf file list as list of input stream.
List<InputStream> inputPdfList = new ArrayList<InputStream>();
inputPdfList.add(new FileInputStream("..\\pdf\\pdf_1.pdf"));
inputPdfList.add(new FileInputStream("..\\pdf\\pdf_2.pdf"));
inputPdfList.add(new FileInputStream("..\\pdf\\pdf_3.pdf"));
inputPdfList.add(new FileInputStream("..\\pdf\\pdf_4.pdf"));
//Prepare output stream for merged pdf file.
OutputStream outputStream =
new FileOutputStream("..\\pdf\\MergeFile_1234.pdf");
//call method to merge pdf files.
mergePdfFiles(inputPdfList, outputStream);
} catch (Exception e) {
e.printStackTrace();
}
}
}

Multiple pdf merged method using org.apache.pdfbox:
public void mergePDFFiles(List<File> files,
String mergedFileName) {
try {
PDFMergerUtility pdfmerger = new PDFMergerUtility();
for (File file : files) {
PDDocument document = PDDocument.load(file);
pdfmerger.setDestinationFileName(mergedFileName);
pdfmerger.addSource(file);
pdfmerger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
document.close();
}
} catch (IOException e) {
logger.error("Error to merge files. Error: " + e.getMessage());
}
}
From main program, call mergePDFFiles method using list of files and target file name.
String mergedFileName = "Merged.pdf";
mergePDFFiles(files, mergedFileName);
After calling mergePDFFiles, load merged file
File mergedFile = new File(mergedFileName);

package article14;
import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.util.PDFMergerUtility;
public class Pdf
{
public static void main(String args[])
{
new Pdf().createNew();
new Pdf().combine();
}
public void combine()
{
try
{
PDFMergerUtility mergePdf = new PDFMergerUtility();
String folder ="pdf";
File _folder = new File(folder);
File[] filesInFolder;
filesInFolder = _folder.listFiles();
for (File string : filesInFolder)
{
mergePdf.addSource(string);
}
mergePdf.setDestinationFileName("Combined.pdf");
mergePdf.mergeDocuments();
}
catch(Exception e)
{
}
}
public void createNew()
{
PDDocument document = null;
try
{
String filename="test.pdf";
document=new PDDocument();
PDPage blankPage = new PDPage();
document.addPage( blankPage );
document.save( filename );
}
catch(Exception e)
{
}
}
}

If you want to combine two files where one overlays the other (example: document A is a template and document B has the text you want to put on the template), this works:
after creating "doc", you want to write your template (templateFile) on top of that -
PDDocument watermarkDoc = PDDocument.load(getServletContext()
.getRealPath(templateFile));
Overlay overlay = new Overlay();
overlay.overlay(watermarkDoc, doc);

Using iText (existing PDF in bytes)
public static byte[] mergePDF(List<byte[]> pdfFilesAsByteArray) throws DocumentException, IOException {
ByteArrayOutputStream outStream = new ByteArrayOutputStream();
Document document = null;
PdfCopy writer = null;
for (byte[] pdfByteArray : pdfFilesAsByteArray) {
try {
PdfReader reader = new PdfReader(pdfByteArray);
int numberOfPages = reader.getNumberOfPages();
if (document == null) {
document = new Document(reader.getPageSizeWithRotation(1));
writer = new PdfCopy(document, outStream); // new
document.open();
}
PdfImportedPage page;
for (int i = 0; i < numberOfPages;) {
++i;
page = writer.getImportedPage(reader, i);
writer.addPage(page);
}
}
catch (Exception e) {
e.printStackTrace();
}
}
document.close();
outStream.close();
return outStream.toByteArray();
}

convert Word document to .pdf

I am not able to read Microsoft Word page by page and write into PDF file.
I am having Apache poi jar 3.15 and itext jar 5.0.6. In Apache poi,i am having 5 jars namely poi-3.14.jar,poi-ooxml-3.14.jar,poi-ooxml-schemas-3.14.jar ,XmlBeans 2.6.0.jar.Poi Scratchpad 3.14.jar My code is shown below.
public class Script
{
public void WordToPdf()
{
POIFSFileSystem fs = null;
Document document = new Document();
try
{
System.out.println("Starting the test");
fs = new POIFSFileSystem(new FileInputStream("C:\\Users\\admin\\Desktop\\Srieedher.doc"));
HWPFDocument doc = new HWPFDocument(fs);
WordExtractor we = new WordExtractor(doc);
OutputStream file = new FileOutputStream(new File("C:\\Users\\admin\\Desktop\\test.pdf"));
PdfWriter writer = PdfWriter.getInstance(document, file);
}
catch (Exception e) {
System.out.println("Exception during test");
e.printStackTrace();
}
finally
{
document.close();
}
}
public static void main(String args[])
{
Script1 s=new Script();
s1.WordToPdf();
}
}

JSF Primefaces p:fileDownload file name contains UTF-8 characters

I am working on Java 8, JSF 2, Primefaces 5.1.
Conversation to PDF or Docx works, but when I am displaying file name, it just skips UTF-8 encoded letters, in my case, Lithuanian letters like ą,č,ę,ė,į,š,ų,ū
What I have tried so farm is :
<h:form enctype="multipart/form-data;charset=UTF-8">
Charset.forName("UTF-8").encode(myString)
or
byte[] bytes = templateTitle.getBytes(Charset.forName("UTF-8"));
String title = new String(bytes, Charset.forName("UTF-8"));
or
UTF-8 text is garbled when form is posted as multipart/form-data
checked some tuttorials about encoding, still, no use,
also checked this, but I just do not understand this example...
Primefaces fileDownload non-english file names corrupt
my code:
Download file as docx
public void downloadTemplateAsDocx() throws Exception {
try {
InputStream content = null;
String objID = this.actData.getMainActs().get(0).getId();
ContentStream cmisStream = folderCatalogue.getDocumentContentStream(objID);
content = cmisStream.getStream();
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
AlternativeFormatInputPart afiPart = new AlternativeFormatInputPart(new PartName("/hw.html"));
afiPart.setBinaryData(content);
afiPart.setContentType(new ContentType("text/html"));
Relationship altChunkRel = wordMLPackage.getMainDocumentPart().addTargetPart(afiPart);
CTAltChunk ac = Context.getWmlObjectFactory().createCTAltChunk();
ac.setId(altChunkRel.getId());
wordMLPackage.getMainDocumentPart().addObject(ac);
wordMLPackage.getContentTypeManager().addDefaultContentType("html", "text/html");
File fileTmp = File.createTempFile("tempDocFile", "docx");
wordMLPackage.save(fileTmp);
streamedContent = new DefaultStreamedContent(new FileInputStream(fileTmp), cmisStream.getMimeType(),
templateTitle + ".docx", "UTF-8");
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (InvalidFormatException eInv) {
eInv.printStackTrace();
} catch (IOException ioEx) {
ioEx.printStackTrace();
} catch (Docx4JException docxEx) {
docxEx.printStackTrace();
}
}
code for .Pdf file download.
public void downloadTemplateAsPdf() {
try {
InputStream content = null;
String objID = this.actData.getMainActs().get(0).getId();
ContentStream cmisStream = folderCatalogue.getDocumentContentStream(objID);
content = cmisStream.getStream();
File fileTmp = File.createTempFile("tempFile", "pdf");
OutputStream fileStream = new FileOutputStream(fileTmp);
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, fileStream);
document.open();
XMLWorkerHelper worker = XMLWorkerHelper.getInstance();
worker.parseXHtml(writer, document, content, Charset.forName("UTF-8"));
document.close();
fileStream.close();
streamedContent = new DefaultStreamedContent(new FileInputStream(fileTmp), cmisStream.getMimeType(),
templateTitle + ".pdf");
} catch (FileNotFoundException e) {
e.printStackTrace();
System.out.println("File was not found");
} catch (IOException ex) {
ex.printStackTrace();
} catch (Exception exeption) {
exeption.printStackTrace();
}
}
EDIT:
<p:fileDownload value="#{controller.streamedContent}" />
private StreamedContent streamedContent;

Solution,
String title = URLEncoder.encode(templateTitle, "UTF-8");
StringBuilder fileName = new StringBuilder(title);
if (title.contains("+")) {
for (int i = 0; i < title.length(); i++) {
if (title.charAt(i) == '+') {
fileName.setCharAt(i, ' ');
}
}
}
This Encoding works fine, just it replaces all spaces to + that's why I loop over it.

How to merge two PDF files into one in Java?

I want to merge many PDF files into one using PDFBox and this is what I've done:
PDDocument document = new PDDocument();
for (String pdfFile: pdfFiles) {
PDDocument part = PDDocument.load(pdfFile);
List<PDPage> list = part.getDocumentCatalog().getAllPages();
for (PDPage page: list) {
document.addPage(page);
}
part.close();
}
document.save("merged.pdf");
document.close();
Where pdfFiles is an ArrayList<String> containing all the PDF files.
When I'm running the above, I'm always getting:
org.apache.pdfbox.exceptions.COSVisitorException: Bad file descriptor
Am I doing something wrong? Is there any other way of doing it?

Why not use the PDFMergerUtility of pdfbox?
PDFMergerUtility ut = new PDFMergerUtility();
ut.addSource(...);
ut.addSource(...);
ut.addSource(...);
ut.setDestinationFileName(...);
ut.mergeDocuments();

A quick Google search returned this bug: "Bad file descriptor while saving a document w. imported PDFs".
It looks like you need to keep the PDFs to be merged open, until after you have saved and closed the combined PDF.

This is a ready to use code, merging four pdf files with itext.jar from http://central.maven.org/maven2/com/itextpdf/itextpdf/5.5.0/itextpdf-5.5.0.jar, more on http://tutorialspointexamples.com/
import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfWriter;
/**
* This class is used to merge two or more
* existing pdf file using iText jar.
*/
public class PDFMerger {
static void mergePdfFiles(List<InputStream> inputPdfList,
OutputStream outputStream) throws Exception{
//Create document and pdfReader objects.
Document document = new Document();
List<PdfReader> readers =
new ArrayList<PdfReader>();
int totalPages = 0;
//Create pdf Iterator object using inputPdfList.
Iterator<InputStream> pdfIterator =
inputPdfList.iterator();
// Create reader list for the input pdf files.
while (pdfIterator.hasNext()) {
InputStream pdf = pdfIterator.next();
PdfReader pdfReader = new PdfReader(pdf);
readers.add(pdfReader);
totalPages = totalPages + pdfReader.getNumberOfPages();
}
// Create writer for the outputStream
PdfWriter writer = PdfWriter.getInstance(document, outputStream);
//Open document.
document.open();
//Contain the pdf data.
PdfContentByte pageContentByte = writer.getDirectContent();
PdfImportedPage pdfImportedPage;
int currentPdfReaderPage = 1;
Iterator<PdfReader> iteratorPDFReader = readers.iterator();
// Iterate and process the reader list.
while (iteratorPDFReader.hasNext()) {
PdfReader pdfReader = iteratorPDFReader.next();
//Create page and add content.
while (currentPdfReaderPage <= pdfReader.getNumberOfPages()) {
document.newPage();
pdfImportedPage = writer.getImportedPage(
pdfReader,currentPdfReaderPage);
pageContentByte.addTemplate(pdfImportedPage, 0, 0);
currentPdfReaderPage++;
}
currentPdfReaderPage = 1;
}
//Close document and outputStream.
outputStream.flush();
document.close();
outputStream.close();
System.out.println("Pdf files merged successfully.");
}
public static void main(String args[]){
try {
//Prepare input pdf file list as list of input stream.
List<InputStream> inputPdfList = new ArrayList<InputStream>();
inputPdfList.add(new FileInputStream("..\\pdf\\pdf_1.pdf"));
inputPdfList.add(new FileInputStream("..\\pdf\\pdf_2.pdf"));
inputPdfList.add(new FileInputStream("..\\pdf\\pdf_3.pdf"));
inputPdfList.add(new FileInputStream("..\\pdf\\pdf_4.pdf"));
//Prepare output stream for merged pdf file.
OutputStream outputStream =
new FileOutputStream("..\\pdf\\MergeFile_1234.pdf");
//call method to merge pdf files.
mergePdfFiles(inputPdfList, outputStream);
} catch (Exception e) {
e.printStackTrace();
}
}
}

Multiple pdf merged method using org.apache.pdfbox:
public void mergePDFFiles(List<File> files,
String mergedFileName) {
try {
PDFMergerUtility pdfmerger = new PDFMergerUtility();
for (File file : files) {
PDDocument document = PDDocument.load(file);
pdfmerger.setDestinationFileName(mergedFileName);
pdfmerger.addSource(file);
pdfmerger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
document.close();
}
} catch (IOException e) {
logger.error("Error to merge files. Error: " + e.getMessage());
}
}
From main program, call mergePDFFiles method using list of files and target file name.
String mergedFileName = "Merged.pdf";
mergePDFFiles(files, mergedFileName);
After calling mergePDFFiles, load merged file
File mergedFile = new File(mergedFileName);

package article14;
import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.util.PDFMergerUtility;
public class Pdf
{
public static void main(String args[])
{
new Pdf().createNew();
new Pdf().combine();
}
public void combine()
{
try
{
PDFMergerUtility mergePdf = new PDFMergerUtility();
String folder ="pdf";
File _folder = new File(folder);
File[] filesInFolder;
filesInFolder = _folder.listFiles();
for (File string : filesInFolder)
{
mergePdf.addSource(string);
}
mergePdf.setDestinationFileName("Combined.pdf");
mergePdf.mergeDocuments();
}
catch(Exception e)
{
}
}
public void createNew()
{
PDDocument document = null;
try
{
String filename="test.pdf";
document=new PDDocument();
PDPage blankPage = new PDPage();
document.addPage( blankPage );
document.save( filename );
}
catch(Exception e)
{
}
}
}

If you want to combine two files where one overlays the other (example: document A is a template and document B has the text you want to put on the template), this works:
after creating "doc", you want to write your template (templateFile) on top of that -
PDDocument watermarkDoc = PDDocument.load(getServletContext()
.getRealPath(templateFile));
Overlay overlay = new Overlay();
overlay.overlay(watermarkDoc, doc);

Using iText (existing PDF in bytes)
public static byte[] mergePDF(List<byte[]> pdfFilesAsByteArray) throws DocumentException, IOException {
ByteArrayOutputStream outStream = new ByteArrayOutputStream();
Document document = null;
PdfCopy writer = null;
for (byte[] pdfByteArray : pdfFilesAsByteArray) {
try {
PdfReader reader = new PdfReader(pdfByteArray);
int numberOfPages = reader.getNumberOfPages();
if (document == null) {
document = new Document(reader.getPageSizeWithRotation(1));
writer = new PdfCopy(document, outStream); // new
document.open();
}
PdfImportedPage page;
for (int i = 0; i < numberOfPages;) {
++i;
page = writer.getImportedPage(reader, i);
writer.addPage(page);
}
}
catch (Exception e) {
e.printStackTrace();
}
}
document.close();
outStream.close();
return outStream.toByteArray();
}

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

how to get a whole content of file using apache-poi? - java

Related

How to merge or concat to byte [ ] arrays of PDF's type in java but am getting pdf1 as output [duplicate]

How can I append an existing PDF to a generated PDF with flying saucer for Java [duplicate]

convert Word document to .pdf

JSF Primefaces p:fileDownload file name contains UTF-8 characters

How to merge two PDF files into one in Java?

Categories

Resources