Converting a docx containing a chart to PDF

I've got a docx4j generated file which contains several tables, titles and, finally, an excel-generated curve chart.
I have tried many approaches to convert this file to PDF, but none of them succeeded.
Docx4j with XSL-FO did not work: most of the elements in the docx file are not yet implemented and show up in red text as "not implemented".
JODConverter did not work either: the resulting PDF was mostly fine (only minor formatting/styling issues), but the chart did not show up.
Finally, the closest approach was Apache POI: the resulting PDF was identical to my docx file, but still without the chart.
I already know Aspose would solve this pretty easily, but I am looking for an open source, free solution.
The code I am using with Apache POI is as follows:
public static void convert(String inputPath, String outputPath)
throws XWPFConverterException, IOException {
PdfConverter converter = new PdfConverter();
converter.convert(new XWPFDocument(new FileInputStream(new File(
inputPath))), new FileOutputStream(new File(outputPath)),
PdfOptions.create());
}
I do not know how to get the chart into the PDF. Could anybody tell me how to proceed?
Thanks in advance.

I don't know if this helps, but you could use JACOB (I don't know whether this is possible with Apache POI or docx4j).
With this solution you drive Word itself and have it export the document as PDF.
Note: Word needs to be installed on the computer!
Here's the download page: http://sourceforge.net/projects/jacob-project/
try {
if (System.getProperty("os.arch").contains("64")) {
System.load(DLL_64BIT_PATH);
} else {
System.load(DLL_32BIT_PATH);
}
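} catch (UnsatisfiedLinkError e) {
//TODO: the JACOB DLL is missing or does not match the JVM architecture
// (System.load does not throw IOException, so no further catch block is needed)
}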
ActiveXComponent oleComponent = new ActiveXComponent("Word.Application");
oleComponent.setProperty("Visible", false);
Variant var = Dispatch.get(oleComponent, "Documents");
Dispatch document = var.getDispatch();
Dispatch activeDoc = Dispatch.call(document, "Open", fileName).toDispatch();
// https://msdn.microsoft.com/EN-US/library/office/ff845579.aspx
// 17 = wdExportFormatPDF; false = do not open the PDF afterwards; 0 = optimize for print
Dispatch.call(activeDoc, "ExportAsFixedFormat", new Object[] { "path to pdfFile.pdf", new Integer(17), false, 0 });
Object args[] = { new Integer(0) };//private static final int DO_NOT_SAVE_CHANGES = 0;
Dispatch.call(activeDoc, "Close", args);
Dispatch.call(oleComponent, "Quit");

Related

How to append a PDF file to an existing one with iText?

In an application I am trying to append multiple PDF files to a single, already existing file. Using iText I found a tutorial which, in my case, doesn't seem to work.
Here are some ways I've tried to make it work.
String path = "path/to/destination.pdf";
PdfCopy mergedFile = new PdfCopy(pdf, new FileOutputStream(path));
PdfReader reader;
for(String toMergePath : toMergePaths){
reader = new PdfReader(toMergePath);
mergedFile.addDocument(reader);
mergedFile.freeReader(reader);
reader.close();
}
mergedFile.close();
When I try to add the document, logcat tells me that the document is not open.
But pdf (the original document) has already been opened by other methods and is closed only after this one. And mergedFile is created exactly as in the tutorial, which, I believe, must be right.
Has anyone experienced the same problem? Otherwise, does anyone know a better way to do what I want?
I've seen other solutions that copy the bytes from each page and append them to a new file, but I'm afraid this will "compile" the annotations, which I need.
Thank you for your help,
Cordially,
Matthieu Meunier

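I hope this code will help you. Note that here the Document is opened after the PdfCopy has been created and before the first addDocument() call, which is likely what your version is missing.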
public static void mergePdfs(){
try {
String[] files = { "D:\\1.pdf" ,"D:\\2.pdf" ,"D:\\3.pdf" ,"D:\\4.pdf"};
Document pDFCombineUsingJava = new Document();
PdfCopy copy = new PdfCopy(pDFCombineUsingJava , new FileOutputStream("D:\\CombinedFile.pdf"));
// The document must be open before any pages are added to the copy
pDFCombineUsingJava.open();
PdfReader ReadInputPDF;
for (int i = 0; i < files.length; i++) {
ReadInputPDF = new PdfReader(files[i]);
copy.addDocument(ReadInputPDF);
copy.freeReader(ReadInputPDF);
}
pDFCombineUsingJava.close();
}
catch (Exception i)
{
System.out.println(i);
}
}

Java: parse dxf file using Ycad / Kabeja or any other similar library

I'm pretty new to programming so any help would be much appreciated.
I'm trying to parse a .dxf file in order to get the coordinates of the entities and plot them to a JPanel. Basically I would need a graphical presentation of the dxf file.
So far I've only found some examples of how to use the Ycad or Kabeja libraries, but it's still not clear to me how to get the entities or even how the libraries work. The libraries also seem to be incomplete: practically every example I tried failed because of missing classes.
Also old questions on SO don't give me many answers. If anybody has any experience with the libraries mentioned above or any other method that would help me to resolve my problem, it would be greatly appreciated.

Use the Kabeja library; it converts DXF to PDF/SVG/JPEG.
Working example:
private static void parseFile(String sourceFile, String index)
throws FileNotFoundException, ParseException, SAXException {
InputStream in = new FileInputStream("C:\\Users\\z003kebe\\Downloads\\DWGAndDxf\\dwg\\"+sourceFile);
// Parser dxfParser = DXFParserBuilder.createDefaultParser();
Parser dxfParser = ParserBuilder.createDefaultParser();
// reuse the stream opened above instead of opening the file a second time
dxfParser.parse(in, "UTF-8");
DXFDocument doc = dxfParser.getDocument();
SAXGenerator generator = new SVGGenerator();
// the generator emits SVG as SAX events; the serializer turns them into the output format
// PDF output
SAXSerializer out = new SAXPDFSerializer();
// alternative serializers:
// pdf
// org.kabeja.xml.SAXSerialzer out =
// org.kabeja.batik.tools.SAXPDFSerializer();
// tiff
// org.kabeja.xml.SAXSerialzer out =
// org.kabeja.batik.tools.SAXTIFFSerializer();
// png
// org.kabeja.xml.SAXSerialzer out =
// org.kabeja.batik.tools.SAXPNGSerializer();
// jpg
// org.kabeja.xml.SAXSerialzer out =
// org.kabeja.batik.tools.SAXJEPGSerializer();
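// outputFile is not declared in this snippet; it is assumed to be a field holding the output path prefix
OutputStream fileo = new FileOutputStream(outputFile+index+".PDF");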
// out.setOutputStream(response.getOutputStream()) //write direct to
// ServletResponse
out.setOutput(fileo);
// generate
generator.generate(doc, out, new HashMap());
}
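
If only the coordinates of simple entities are needed for drawing on a JPanel, an ASCII DXF file can also be read without a library. The sketch below is not Kabeja or Ycad code; the class and method names (DxfLineSketch, readLines) are made up for illustration. It relies on the ASCII DXF layout of alternating group-code and value lines, where group code 0 starts a new entity and, for a LINE, codes 10/20 hold the start point and 11/21 the end point. It ignores every other entity type, does not handle binary DXF, and throws NumberFormatException on malformed input.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class DxfLineSketch {

    /** One LINE entity: start point (x1, y1) and end point (x2, y2). */
    public static final class Line {
        public double x1, y1, x2, y2;
    }

    /** Collects all LINE entities from an ASCII DXF stream. */
    public static List<Line> readLines(Reader source) throws IOException {
        List<Line> lines = new ArrayList<>();
        Line current = null;
        BufferedReader in = new BufferedReader(source);
        String codeText;
        // ASCII DXF is a sequence of pairs: a group code line, then a value line
        while ((codeText = in.readLine()) != null) {
            String value = in.readLine();
            if (value == null) {
                break; // truncated file: group code without a value
            }
            int code = Integer.parseInt(codeText.trim());
            value = value.trim();
            if (code == 0) {
                // group code 0 starts a new entity or a section marker
                current = null;
                if (value.equals("LINE")) {
                    current = new Line();
                    lines.add(current);
                }
            } else if (current != null) {
                switch (code) {
                    case 10: current.x1 = Double.parseDouble(value); break;
                    case 20: current.y1 = Double.parseDouble(value); break;
                    case 11: current.x2 = Double.parseDouble(value); break;
                    case 21: current.y2 = Double.parseDouble(value); break;
                    default: break; // layer, colour, z values etc. are ignored
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // A tiny hand-written DXF fragment containing a single LINE entity
        String sample = String.join("\n",
                "0", "SECTION", "2", "ENTITIES",
                "0", "LINE", "8", "0",
                "10", "1.0", "20", "2.0",
                "11", "3.5", "21", "4.5",
                "0", "ENDSEC", "0", "EOF");
        for (Line l : readLines(new StringReader(sample))) {
            System.out.println(l.x1 + "," + l.y1 + " -> " + l.x2 + "," + l.y2);
        }
    }
}
```

The resulting list can be drawn in paintComponent with Graphics2D.draw(new Line2D.Double(...)); keep in mind that DXF's y axis points up while Swing's points down.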

Specific characters not rendering properly in Java

I have an issue when displaying strings received from a server in a JTable. Some specific characters appear as little white squares instead of "é" or "à" etc. I tried a lot of things but none of them fixed my problem. I'm working with Eclipse under Windows. The server was developed using Visual Studio 2010.
The server sends an XML file using tinyXML2, the client uses JDom to read it. The font used is "Dialog". The server takes the strings from an Oracle database.
I assume this is an encoding problem, but I haven't been able to fix it yet.
Does anyone have an idea?
Thx
Arnaud
EDIT : As requested, this is how I use JDom
public static Player fromXML(Element e)
{
Player result = new Player();
String e_text = null;
try
{
e_text = e.getChildText(XMLTags.XML_Player_playerId);
if (e_text != null) result.setID(Integer.parseInt(e_text));
e_text = e.getChildText(XMLTags.XML_Player_lastName);
if (e_text != null) result.setName(e_text);
e_text = e.getChildText(XMLTags.XML_Player_point_scored);
if (e_text != null) result.addSpecial(STAT_SCORED, Double.parseDouble(e_text));
e_text = e.getChildText(XMLTags.XML_Player_point_scored_last);
if (e_text != null) result.addSpecial(STAT_SCORED_LAST, Double.parseDouble(e_text));
}
catch (Exception ex) {
ex.printStackTrace();
}
return result;
}
public static Document load(String filename) {
File XMLFile = new File(CLIENT_TO_SERVER, filename);
SAXBuilder sxb = new SAXBuilder();
Document document = new Document();
try
{
document = sxb.build(new File(XMLFile.getPath()));
} catch(Exception e){e.printStackTrace();}
return document;
}

Read the file using the correct encoding, something like:
document = sxb.build(new BufferedReader(new InputStreamReader(new FileInputStream(XMLFile.getPath()), "UTF8")));
Notes:
1. First determine which character encoding the file uses, and specify that charset instead of UTF8 above.
2. If the encoding is not known, or the file is generated by various systems with different encodings, you can use Mozilla's encoding detector library; see https://code.google.com/p/juniversalchardet/
3. You need to handle UnsupportedEncodingException.
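
To see why the charset passed to the reader matters, here is a small self-contained sketch using only the JDK (no JDom). The class name CharsetMismatchSketch and the sample text are made up for illustration, and ISO-8859-1 is only an assumed example of what the server might be sending. It decodes the same bytes once with the matching charset and once with a mismatched one; with the mismatch, the accented characters come back as the Unicode replacement character U+FFFD, which many fonts render as a box or a question mark.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchSketch {

    /** Reads all characters from the bytes, decoding them with the given charset. */
    public static String decode(byte[] bytes, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // "été", written with escapes so the encoding of this source file does not matter
        String original = "\u00e9t\u00e9";
        // Pretend the server wrote the text as ISO-8859-1 (an assumption for this example)
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);

        String right = decode(latin1, StandardCharsets.ISO_8859_1);
        String wrong = decode(latin1, StandardCharsets.UTF_8);

        // With the matching charset the text round-trips unchanged
        System.out.println("matching charset round-trips: " + right.equals(original));
        // With the wrong charset the accented letters are replaced by U+FFFD
        System.out.println("mismatch contains U+FFFD: " + wrong.contains("\uFFFD"));
    }
}
```

Using a Charset constant such as StandardCharsets.UTF_8 instead of the string "UTF8" also avoids the checked UnsupportedEncodingException.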

How to create image from PDF using PDFBox in JAVA

I want to create an image from the first page of a PDF. I am using PDFBox. After searching the web, I found the following snippet:
public class ExtractImages
{
public static void main(String[] args)
{
ExtractImages obj = new ExtractImages();
try
{
obj.read_pdf();
}
catch (IOException ex)
{
System.out.println("" + ex);
}
}
void read_pdf() throws IOException
{
// Let a load failure propagate to main() instead of continuing with a null document
PDDocument document = PDDocument.load("H:\\ct1_answer.pdf");
List<PDPage> pages = document.getDocumentCatalog().getAllPages();
Iterator iter = pages.iterator();
int i =1;
String name = null;
while (iter.hasNext())
{
PDPage page = (PDPage) iter.next();
PDResources resources = page.getResources();
Map pageImages = resources.getImages();
if (pageImages != null)
{
Iterator imageIter = pageImages.keySet().iterator();
while (imageIter.hasNext()) {
String key = (String) imageIter.next();
PDXObjectImage image = (PDXObjectImage) pageImages.get(key);
image.write2file("H:\\image" + i);
i ++;
}
}
}
// release the PDF once all pages have been processed
document.close();
}
}
The code runs without errors, but it produces no output. I expected it to write a series of images to the H drive, but no image is created. Why?

Without trying to be rude, here is what the code you posted does inside its main working loop:
PDPage page = (PDPage) iter.next();
PDResources resources = page.getResources();
Map pageImages = resources.getImages();
It's getting each page from the PDF file, getting the resources from the page, and extracting the embedded images. It then writes those to disk.
If you are to be a competent software developer you need to be able to research and read documentation. With Java, that means Javadocs. Googling PDPage (or explicitly going to the apache site) turns up the Javadoc for PDPage.
On that page you find two versions of the method convertToImage() for converting the PDPage to an image. Problem solved.
Except ...
Unfortunately, they return a java.awt.image.BufferedImage which, based on other questions you have asked, is a problem because it is not supported on Android, the platform you're working on.
In short, you can't use Apache's PDFBox on Android to do what you're trying to do.
Searching on StackOverflow you find this same question posed several times in different forms, which will lead you to this: https://stackoverflow.com/questions/4665957/pdf-parsing-library-for-android/4766335#4766335 with the following answer that would be of interest to you: https://stackoverflow.com/a/4779852/302916
Unfortunately, even the one that the aforementioned answer says will work is not very user friendly; there's no "How to" or docs that I can find. It's also labeled as "alpha". This is probably not something for the faint-hearted, as it's going to require reading and understanding their code to even start using it.

I copied your code above and added the following libs to my build path in Eclipse. It works.
Apache PDFBox 1.7.1 libs
Commons Logging 1.1.1 libs
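
For the original goal (an image of the first page rather than the embedded images), on a desktop JVM the BufferedImage returned by the convertToImage() method mentioned in the answer above can be written to disk with the JDK's ImageIO class. A minimal sketch of that last step only: the class name SaveImageSketch is made up, and a blank image stands in for the PDFBox result so the example has no PDFBox dependency.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class SaveImageSketch {

    /** Writes the image as PNG; returns false if no PNG writer accepted it. */
    public static boolean savePng(BufferedImage image, File target) throws IOException {
        return ImageIO.write(image, "png", target);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the image a PDF library would return: a blank 200x100 canvas
        BufferedImage page = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        File target = File.createTempFile("page", ".png");
        boolean written = savePng(page, target);
        System.out.println("written=" + written + " file=" + target);
    }
}
```

This does not help on Android, where java.awt is not available, as the first answer points out.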

How do I generate RTF from Java?

I work on a web-based tool where we offer customized prints.
Currently we build an XML structure with Java, feed it to the XMLmind XSL-FO Converter along with customized XSL-FO, which then produces an RTF document.
This works fine for simple layouts, but there are some problem areas where I'd like greater control, or where I can't do what I want at all. For example: tables in headers and footers (e.g., page numbers), columns, having a separate column setup or different page number info on the first page, etc.
Do any of you know of better alternatives, either to XMLmind or to the way we get from data to RTF, i.e., Java-> XML, XML+XSL-> RTF? (The only practical limitation for us is the JVM.)

You can take a look at a new library called jRTF. It allows you to create new RTF documents and to fill RTF templates.

Have you had a look at the iText library? It's touted primarily as a PDF generator, though it can also generate RTF. I haven't had cause to use it personally, but the general feeling I get is that it's good, and the interface looks comprehensive and easy to work with in the abstract. Whether it would fit in well with your existing data model is another question.

If you could afford spending some money, you could use Aspose.Words, a professional library for creating Word and RTF documents for Java and .NET.

iText supports RTF. An example that converts an HTML file to RTF:

import com.lowagie.text.*;
import com.lowagie.text.html.simpleparser.HTMLWorker;
import com.lowagie.text.html.simpleparser.StyleSheet;
import com.lowagie.text.rtf.*;
import java.io.*;
import java.util.ArrayList;
public class HTMLtoRTF {
public static void main(String[] args) throws DocumentException {
Document document = new Document();
try {
Reader htmlreader = new BufferedReader((new InputStreamReader((new FileInputStream("C:\\Users\\asrikantan\\Desktop\\sample.htm")))));
RtfWriter2 rtfWriter = RtfWriter2.getInstance(document, new FileOutputStream(("C:\\Users\\asrikantan\\Desktop\\sample12.rtf")));
document.open();
document.add(new Paragraph("Testing simple paragraph addition."));
//ByteArrayOutputStream out = new ByteArrayOutputStream();
StyleSheet styles = new StyleSheet();
styles.loadTagStyle("body", "font", "Bitstream Vera Sans");
ArrayList htmlParser = HTMLWorker.parseToList(htmlreader, styles);
//fetch HTML line by line
for (int htmlDatacntr = 0; htmlDatacntr < htmlParser.size(); htmlDatacntr++) {
Element htmlDataElement = (Element) htmlParser.get(htmlDatacntr);
document.add((htmlDataElement));
}
htmlreader.close();
document.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (Exception e) {
System.out.println(e);
}
}
}