I've PDF Statements getting generated using Apache FOP 2.6. For those statements when I go to the properties, it tells the producer is Apache FOP like the one in the below screen
Now I want to merge those statements to create a single PDF File. For that I used PdfMerger from Itext7. But when I see the properties, the producer property changes to Itext7 like the one in the below screen
Here's the code I've used.
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.kernel.utils.PdfMerger;
public class ApacheFOPExample {
public static void main(String[] args) {
mergePdfs();
}
public static void mergePdfs() {
try {
PdfDocument pdf = new PdfDocument(new PdfWriter(new File("D:\\Workspace\\Itext\\xsl_employee_merged.pdf")));
PdfMerger merger = new PdfMerger(pdf);
//Add pages from the first document
PdfDocument firstSourcePdf = new PdfDocument(new PdfReader(new File("D:\\Workspace\\Itext\\xsl_employee.pdf")));
merger.merge(firstSourcePdf, 1, firstSourcePdf.getNumberOfPages());
//Add pages from the second pdf document
PdfDocument secondSourcePdf = new PdfDocument(new PdfReader(new File("D:\\Workspace\\Itext\\xsl_employee_another.pdf")));
merger.merge(secondSourcePdf, 1, secondSourcePdf.getNumberOfPages());
firstSourcePdf.close();
secondSourcePdf.close();
pdf.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
}
I want to merge or copy multiple multiple statements as a single PDF but with the properties intact. How can I achieve it? To merge, is there any alternative way that I can follow to achieve this?
Note: I can't use Itext5 classes to do this.
Thank You
I have a java program named wordToPdf that converts the *.docx file to the *.pdf file. The program runs pretty well with Apache POI 4.1.2 along with POI OOXML 4.1.2 and the fr.opensagres.xdocreport 2.0.2. The *.pdf result is created successfully.
Below is the java program
import fr.opensagres.poi.xwpf.converter.pdf.PdfConverter;
import fr.opensagres.poi.xwpf.converter.pdf.PdfOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import java.io.*;
public void wordToPdf(String inDocFile, String outPdfFile) {
try {
File src = new File(inDocFile);
InputStream doc = new FileInputStream(src);
XWPFDocument document = new XWPFDocument(doc);
PdfOptions options = null;
OutputStream out = new FileOutputStream(new File(outPdfFile));
PdfConverter.getInstance().convert(document, out, options);
} catch (Exception e) {
e.printStackTrace();
}
}
However, when the program is called from an HttpServletRequest, it doesn't work the same as the above scenario. In stead, the NoClassDefFoundError exception of org.apache.poi.POIXMLDocumentPart is returned.
Has anyone experienced this issue previously?
Please help me. Thanks so much, guys.
I have the following code that works fine as an application. I am converting the code to an applet for web use. I am running this class statically from the main applet.
The code compiles without any errors. The code creates the output file. But when it attempts to create the Document it throws a runtime error java.lang.ClassNotFoundException com.itextpdf.text.DocListener
I have included the DocListener in the imports even though it is unnecessary. I have put all of the class files into a self signed jar file. I have turned down the java security to allow self signed applets to run after prompting (test environment). The applet runs fine until I try to create the PDF document then it generates the runtime error.
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.File;
import com.itextpdf.text.Document;
import com.itextpdf.text.DocListener;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.text.html.simpleparser.HTMLWorker;
public class WFRPPDFWrite {
public static void WritePDF() throws DocumentException, IOException {
String htmltext = "<!DOCTYPE html><HTML><HEAD><TITLE>Test HTML</TITLE></HEAD><BODY></BODY></HTML>";
System.out.println("HTML: " + htmltext);
FileOutputStream PDFOut = null;
try {
System.out.println("Attempting to Write PDF");
PDFOut = new FileOutputStream(new File("HTMLOut.pdf"));
} catch (Exception fio){
fio.printStackTrace();
}
try {
System.out.println("PDF Write: 2");
Document doc = new Document();
System.out.println("PDF Write: 3");
PdfWriter.getInstance(doc,PDFOut);
System.out.println("PDF Write: 4");
doc.open();
System.out.println("PDF Write: 6");
HTMLWorker work = new HTMLWorker(doc);
System.out.println("PDF Write: 7");
work.parse(new StringReader(htmltext));
System.out.println("PDF Write: 8");
doc.close();
} catch(Exception doce) {
doce.printStackTrace();
}
System.out.println("PDF Write: 9");
PDFOut.close();
}
}
You need to append your additional jar (itextpdf.jar) to your archive attribute in the applet tag like in an example above:
<applet ... archive="yourApplet.jar,itextpdf.jar,otherjar.jar" />
I have done reading doc file now i'm trying to read docx file content. when i searched for sample code i found many, nothing worked. check the code for reference...
import java.io.*;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.text.Document;
import com.itextpdf.text.Paragraph;
public class createPdfForDocx {
public static void main(String[] args) {
InputStream fs = null;
Document document = new Document();
XWPFWordExtractor extractor = null ;
try {
fs = new FileInputStream("C:\\DATASTORE\\test.docx");
//XWPFDocument hdoc=new XWPFDocument(fs);
XWPFDocument hdoc=new XWPFDocument(OPCPackage.open(fs));
//XWPFDocument hdoc=new XWPFDocument(fs);
extractor = new XWPFWordExtractor(hdoc);
OutputStream fileOutput = new FileOutputStream(new File("C:/DATASTORE/test.pdf"));
PdfWriter.getInstance(document, fileOutput);
document.open();
String fileData=extractor.getText();
System.out.println(fileData);
document.add(new Paragraph(fileData));
System.out.println(" pdf document created");
} catch(IOException e) {
System.out.println("IO Exception");
e.printStackTrace();
} catch(Exception ex) {
ex.printStackTrace();
}finally {
document.close();
}
}//end of main()
}//end of class
For the above code i'm getting following Exception:
org.apache.poi.POIXMLException: java.lang.reflect.InvocationTargetException
at org.apache.poi.xwpf.usermodel.XWPFFactory.createDocumentPart(XWPFFactory.java:60)
at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:277)
at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:186)
at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:107)
at pagecode.createPdfForDocx.main(createPdfForDocx.java:20)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:67)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:521)
at org.apache.poi.xwpf.usermodel.XWPFFactory.createDocumentPart(XWPFFactory.java:58)
... 4 more
Caused by: java.lang.NoSuchMethodError: org/openxmlformats/schemas/wordprocessingml/x2006/main/CTStyles.getStyleList()Ljava/util/List;
at org.apache.poi.xwpf.usermodel.XWPFStyles.onDocumentRead(XWPFStyles.java:78)
at org.apache.poi.xwpf.usermodel.XWPFStyles.<init>(XWPFStyles.java:59)
... 9 more
Please help
Thank you
This is covered in the Apache POI FAQ! The entry you want is I'm using the poi-ooxml-schemas jar, but my code is failing with "java.lang.NoClassDefFoundError: org/openxmlformats/schemas/something"
The short answer is to switch the poi-ooxml-schemas jar for the full ooxml-schemas-1.1 jar. The full answer is given in the FAQ
For reading excels or docx file if you want to solve errors you need to add all jars then you wont get any error.
I need to use DOCX files (actually the XML contained in them) in a Java software I'm currently developing, but some people in my company still use the DOC format.
Do you know if there is a way to convert a DOC file to the DOCX format using Java ? I know it's possible using C#, but that's not an option
I googled it, but nothing came up...
Thanks
You may try Aspose.Words for Java. It allows you to load a DOC file and save it as DOCX format. The code is very simple as shown below:
// Open a document.
Document doc = new Document("input.doc");
// Save document.
doc.save("output.docx");
Please see if this helps in your scenario.
Disclosure: I work as developer evangelist at Aspose.
Check out JODConverter to see if it fits the bill. I haven't personally used it.
Use newer versions of jars jodconverter-core-4.2.2.jar and jodconverter-local-4.2.2.jar
String inputFile = "*.doc";
String outputFile = "*.docx";
LocalOfficeManager localOfficeManager = LocalOfficeManager.builder()
.install()
.officeHome(getDefaultOfficeHome()) //your path to openoffice
.build();
try {
localOfficeManager.start();
final DocumentFormat format
= DocumentFormat.builder()
.from(DefaultDocumentFormatRegistry.DOCX)
.build();
LocalConverter
.make()
.convert(new FileInputStream(new File(inputFile)))
.as(DefaultDocumentFormatRegistry.getFormatByMediaType("application/msword"))
.to(new File(outputFile))
.as(format)
.execute();
} catch (OfficeException ex) {
Logger.getLogger(Main.class.getName()).log(Level.SEVERE, null, ex);
} catch (FileNotFoundException ex) {
Logger.getLogger(Main.class.getName()).log(Level.SEVERE, null, ex);
} finally {
OfficeUtils.stopQuietly(localOfficeManager);
}
JODConvertor calls OpenOffice/LibreOffice via a network protocol. It can therefore 'do anything you can do in OpenOffice'. This includes converting formats. But it only does as good a job as whatever version of OpenOffice you are running. I have some art in one of my docs, and it doesn't convert them as I hoped.
JODConvertor is no longer supported, according to the google code web site for v3.
To get JOD to do the job you need to do something like
private static void transformBinaryWordDocToDocX(File in, File out)
{
OfficeDocumentConverter converter = new OfficeDocumentConverter(officeManager);
DocumentFormat docx = converter.getFormatRegistry().getFormatByExtension("docx");
docx.setStoreProperties(DocumentFamily.TEXT,
Collections.singletonMap("FilterName", "MS Word 2007 XML"));
converter.convert(in, out, docx);
}
private static void transformBinaryWordDocToW2003Xml(File in, File out)
{
OfficeDocumentConverter converter = new OfficeDocumentConverter(officeManager);;
DocumentFormat w2003xml = new DocumentFormat("Microsoft Word 2003 XML", "xml", "text/xml");
w2003xml.setInputFamily(DocumentFamily.TEXT);
w2003xml.setStoreProperties(DocumentFamily.TEXT, Collections.singletonMap("FilterName", "MS Word 2003 XML"));
converter.convert(in, out, w2003xml);
}
private static OfficeManager officeManager;
#BeforeClass
public static void setupStatic() throws IOException {
/*officeManager = new DefaultOfficeManagerConfiguration()
.setOfficeHome("C:/Program Files/LibreOffice 3.6")
.buildOfficeManager();
*/
officeManager = new ExternalOfficeManagerConfiguration().setConnectOnStart(true).setPortNumber(8100).buildOfficeManager();
officeManager.start();
}
#AfterClass
public static void shutdownStatic() throws IOException {
officeManager.stop();
}
For this to work you need to be running LibreOffice as a networked server ( I could not get the 'run on demand' part of JODConvertor to work under windows with LO 3.6 very well )
To convert DOC file to HTML look at this
(Convert Word doc to HTML programmatically in Java)
Use this: http://poi.apache.org/
Or use this :
XWPFDocument docx = new XWPFDocument(OPCPackage.openOrCreate(new File("hello.docx")));
XWPFWordExtractor wx = new XWPFWordExtractor(docx);
String text = wx.getText();
System.out.println("text = "+text);
I needed the same conversion ,after researching a lot found Jodconvertor can be useful in it , you can download the jar from
https://code.google.com/p/jodconverter/downloads/list
Add jodconverter-core-3.0-beta-4-sources.jar file to your project lib
//1) Create OfficeManger Object
OfficeManager officeManager = new DefaultOfficeManagerConfiguration()
.setOfficeHome(new File("/opt/libreoffice4.4"))
.buildOfficeManager();
officeManager.start();
// 2) Create JODConverter converter
OfficeDocumentConverter converter = new OfficeDocumentConverter(
officeManager);
// 3)Create DocumentFormat for docx
DocumentFormat docx = converter.getFormatRegistry().getFormatByExtension("docx");
docx.setStoreProperties(DocumentFamily.TEXT,
Collections.singletonMap("FilterName", "MS Word 2007 XML"));
//4)Call convert funtion in converter object
converter.convert(new File("doc/AdvancedTable.doc"), new File(
"docx/AdvancedTable.docx"), docx);
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import com.lowagie.text.Document;
import com.lowagie.text.DocumentException;
import com.lowagie.text.Paragraph;
import com.lowagie.text.pdf.PdfWriter;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
public class TestCon {
/**
* #param args
*/
public static void main(String[] args) {
// TODO Auto-generated method stub
POIFSFileSystem fs = null;
Document document = new Document();
try {
System.out.println("Starting the test");
fs = new POIFSFileSystem(new FileInputStream("C:/Users/312845/Desktop/a.doc"));
HWPFDocument doc = new HWPFDocument(fs);
WordExtractor we = new WordExtractor(doc);
OutputStream file = new FileOutputStream(new File("C:/Users/312845/Desktop/test.docx"));
System.out.println("Document testing completed");
} catch (Exception e) {
System.out.println("Exception during test");
e.printStackTrace();
} finally {
// close the document
document.close();
}
}
}