Converting HTML to PDF using iText

Converting HTML to PDF using iText - java

I am posting this question because many developers ask more or less the same question in different forms. I will answer this question myself (I am the Founder/CTO of iText Group), so that it can be a "Wiki-answer." If the Stack Overflow "documentation" feature still existed, this would have been a good candidate for a documentation topic.
The source file:
I am trying to convert the following HTML file to PDF:
<html>
<head>
<title>Colossal (movie)</title>
<style>
.poster { width: 120px;float: right; }
.director { font-style: italic; }
.description { font-family: serif; }
.imdb { font-size: 0.8em; }
a { color: red; }
</style>
</head>
<body>
<img src="img/colossal.jpg" class="poster" />
<h1>Colossal (2016)</h1>
<div class="director">Directed by Nacho Vigalondo</div>
<div class="description">Gloria is an out-of-work party girl
forced to leave her life in New York City, and move back home.
When reports surface that a giant creature is destroying Seoul,
she gradually comes to the realization that she is somehow connected
to this phenomenon.
</div>
<div class="imdb">Read more about this movie on
IMDB
</div>
</body>
</html>
In a browser, this HTML looks like this:
The problems I encountered:
HTMLWorker doesn't take CSS into account at all
When I used HTMLWorker, I need to create an ImageProvider to avoid an error that informs me that the image can't be found. I also need to create a StyleSheet instance to change some of the styles:
public static class MyImageFactory implements ImageProvider {
public Image getImage(String src, Map<String, String> h,
ChainedProperties cprops, DocListener doc) {
try {
return Image.getInstance(
String.format("resources/html/img/%s",
src.substring(src.lastIndexOf("/") + 1)));
} catch (DocumentException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
}
public static void main(String[] args) throws IOException, DocumentException {
Document document = new Document();
PdfWriter.getInstance(document, new FileOutputStream("results/htmlworker.pdf"));
document.open();
StyleSheet styles = new StyleSheet();
styles.loadStyle("imdb", "size", "-3");
HTMLWorker htmlWorker = new HTMLWorker(document, null, styles);
HashMap<String,Object> providers = new HashMap<String, Object>();
providers.put(HTMLWorker.IMG_PROVIDER, new MyImageFactory());
htmlWorker.setProviders(providers);
htmlWorker.parse(new FileReader("resources/html/sample.html"));
document.close();
}
The result looks like this:
For some reason, HTMLWorker also shows the content of the <title> tag. I don't know how to avoid this. The CSS in the header isn't parsed at all, I have to define all the styles in my code, using the StyleSheet object.
When I look at my code, I see that plenty of objects and methods I'm using are deprecated:
So I decided to upgrade to using XML Worker.
Images aren't found when using XML Worker
I tried the following code:
public static final String DEST = "results/xmlworker1.pdf";
public static final String HTML = "resources/html/sample.html";
public void createPdf(String file) throws IOException, DocumentException {
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(file));
document.open();
XMLWorkerHelper.getInstance().parseXHtml(writer, document,
new FileInputStream(HTML));
document.close();
}
This resulted in the following PDF:
Instead of Times-Roman, the default font Helvetica is used; this is typical for iText (I should have defined a font explicitly in my HTML). Otherwise, the CSS seems to be respected, but the image is missing, and I didn't get an error message.
With HTMLWorker, an exception was thrown, and I was able to fix the problem by introducing an ImageProvider. Let's see if this works for XML Worker.
Not all CSS styles are supported in XML Worker
I adapted my code like this:
public static final String DEST = "results/xmlworker2.pdf";
public static final String HTML = "resources/html/sample.html";
public static final String IMG_PATH = "resources/html/";
public void createPdf(String file) throws IOException, DocumentException {
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(file));
document.open();
CSSResolver cssResolver =
XMLWorkerHelper.getInstance().getDefaultCssResolver(true);
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
htmlContext.setImageProvider(new AbstractImageProvider() {
public String getImageRootPath() {
return IMG_PATH;
}
});
PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.parse(new FileInputStream(HTML));
document.close();
}
My code is much longer, but now the image is rendered:
The image is larger than when I rendered it using HTMLWorker which tells me that the CSS attribute width for the poster class is taken into account, but the float attribute is ignored. How do I fix this?
The remaining question:
So the question boils down to this: I have a specific HTML file that I try to convert to PDF. I have gone through a lot of work, fixing one problem after the other, but there is one specific problem that I can't solve: how do I make iText respect CSS that defines the position of an element, such as float: right?
Additional question:
When my HTML contains form elements (such as <input>), those form elements are ignored.

Why your code doesn't work
As explained in the introduction of the HTML to PDF tutorial, HTMLWorker has been deprecated many years ago. It wasn't intended to convert complete HTML pages. It doesn't know that an HTML page has a <head> and a <body> section; it just parses all the content. It was meant to parse small HTML snippets, and you could define styles using the StyleSheet class; real CSS wasn't supported.
Then came XML Worker. XML Worker was meant as a generic framework to parse XML. As a proof of concept, we decided to write some XHTML to PDF functionality, but we didn't support all of the HTML tags. For instance: forms weren't supported at all, and it was very hard to support CSS that is used to position content. Forms in HTML are very different from forms in PDF. There was also a mismatch between the iText architecture and the architecture of HTML + CSS. Gradually, we extended XML Worker, mostly based on requests from customers, but XML Worker became a monster with many tentacles.
Eventually, we decided to rewrite iText from scratch, with the requirements for HTML + CSS conversion in mind. This resulted in iText 7. On top of iText 7, we created several add-ons, the most important one in this context being pdfHTML.
How to solve the problem
Using the latest version of iText (iText 7.1.0 + pdfHTML 2.0.0) the code to convert the HTML from the question to PDF is reduced to this snippet:
public static final String SRC = "src/main/resources/html/sample.html";
public static final String DEST = "target/results/sample.pdf";
public void createPdf(String src, String dest) throws IOException {
HtmlConverter.convertToPdf(new File(src), new File(dest));
}
The result looks like this:
As you can see, this is pretty much the result you'd expect. Since iText 7.1.0 / pdfHTML 2.0.0, the default font is Times-Roman. The CSS is being respected: the image is now floating on the right.
Some additional thoughts.
Developers often feel opposed to upgrade to a newer iText version when I give the advice to upgrade to iText 7 / pdfHTML 2. Allow me to answer to the top 3 of arguments I hear:
I need to use the free iText, and iText 7 isn't free / the pdfHTML add-on is closed source.
iText 7 is released using the AGPL, just like iText 5 and XML Worker. The AGPL allows free use in the sense of free of charge in the context of open source projects. If you are distributing a closed source / proprietary product (e.g. you use iText in a SaaS context), you can't use iText for free; in that case, you have to purchase a commercial license. This was already true for iText 5; this is still true for iText 7. As for versions prior to iText 5: you shouldn't use these at all. Regarding pdfHTML: the first versions were indeed only available as closed source software. We have had heavy discussion within iText Group: on the one hand, there were the people who wanted to avoid the massive abuse by companies who don't listen to their developers when those developers tell the powers that be that open source isn't the same as free. Developers were telling us that their boss forced them to do the wrong thing, and that they couldn't convince their boss to purchase a commercial license. On the other hand, there were the people who argued that we shouldn't punish developers for the wrong behavior of their bosses. Eventually, the people in favor of open sourcing pdfHTML, that is: the developers at iText, won the argument. Please prove that they weren't wrong, and use iText correctly: respect the AGPL if you're using iText for free; make sure that your boss purchases a commercial license if you're using iText in a closed source context.
I need to maintain a legacy system, and I have to use an old iText version.
Seriously? Maintenance also involves applying upgrades and migrating to new versions of the software you're using. As you can see, the code needed when using iText 7 and pdfHTML is very simple, and less error-prone than the code needed before. A migration project shouldn't take too long.
I've only just started and I didn't know about iText 7; I only found out after I finished my project.
That's why I'm posting this question and answer. Think of yourself as an eXtreme Programmer. Throw away all of your code, and start anew. You'll notice that it's not as much work as you imagined, and you'll sleep better knowing that you've made your project future-proof because iText 5 is being phased out. We still offer support to paying customers, but eventually, we'll stop supporting iText 5 altogether.

Use iText 7 and this code:
public void generatePDF(String htmlFile) {
try {
//HTML String
String htmlString = htmlFile;
//Setting destination
FileOutputStream fileOutputStream = new FileOutputStream(new File(dirPath + "/USER-16-PF-Report.pdf"));
PdfWriter pdfWriter = new PdfWriter(fileOutputStream);
ConverterProperties converterProperties = new ConverterProperties();
PdfDocument pdfDocument = new PdfDocument(pdfWriter);
//For setting the PAGE SIZE
pdfDocument.setDefaultPageSize(new PageSize(PageSize.A3));
Document document = HtmlConverter.convertToDocument(htmlFile, pdfDocument, converterProperties);
document.close();
}
catch (Exception e) {
e.printStackTrace();
}
}

Convert a static HTML page take also any CSS Style:
HtmlConverter.convertToPdf(new File("./pdf-input.html"),new File("demo-html.pdf"));
For spring Boot user: Convert a dynamic HTML page using SpringBoot and Thymeleaf:
#RequestMapping(path = "/pdf")
public ResponseEntity<?> getPDF(HttpServletRequest request, HttpServletResponse response) throws IOException {
/* Do Business Logic*/
Order order = OrderHelper.getOrder();
/* Create HTML using Thymeleaf template Engine */
WebContext context = new WebContext(request, response, servletContext);
context.setVariable("orderEntry", order);
String orderHtml = templateEngine.process("order", context);
/* Setup Source and target I/O streams */
ByteArrayOutputStream target = new ByteArrayOutputStream();
ConverterProperties converterProperties = new ConverterProperties();
converterProperties.setBaseUri("http://localhost:8080");
/* Call convert method */
HtmlConverter.convertToPdf(orderHtml, target, converterProperties);
/* extract output as bytes */
byte[] bytes = target.toByteArray();
/* Send the response as downloadable PDF */
return ResponseEntity.ok()
.header(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=order.pdf")
.contentType(MediaType.APPLICATION_PDF)
.body(bytes);
}

Related

Itext Error - "The method add(AreaBreak) in the type Document is not applicable for the arguments (PdfTable)"

I have created an accessible pdf through iText. However now i trying to take input from userlike Name, Address etc in JSP and place the input somewhere in the pdf.
User gives the input in the text area (like on SO) with the ability to mark the text as Bold or Italics or create lists (I am using widgEditor for this)
I am using PdfHtml to parse the input to the pdf. As far as i know there are 2 mehtods to make this work - convertToDocument() method and convertToElements() method.
I am using conconvertToElements() methods since convertToDocument() does not give us the ablity to place parsed input to a specific position in the pdf it simply puts the input at the top of Pdf.
I have refereed to C01E08_HelloWorld example
But while adding pdfptable to the document i am getting the following error.
Error - "The method add(AreaBreak) in the type Document is not applicable for the arguments (PdfTable)"
public void createPdf(String baseUri, String src, String dest) throws IOException {
ConverterProperties properties = new ConverterProperties();
properties.setBaseUri(baseUri);
List<IElement> elements = HtmlConverter.convertToElements(HTML+HTML2, properties);
PdfDocument pdf = new PdfDocument(new PdfWriter(dest));
Document document = new Document(pdf);
for (IElement element : elements) {
document.add(new Paragraph(element.getClass().getName()));
document.add((IBlockElement)element);
}
PdfPTable t = new PdfPTable(new float[] {1,1});
document.add(t);
document.close();
}

You are mixing iText 7 with iText 5 elements. PdfPTable is an iText 5 element and can't be used with the Document class of iText 7. Please use the com.itextpdf.layout.element.Table class.
Also, check your dependencies to remove the iText 5 dependency to avoid further confusion.

How to insert an PDPage within another PDPage with pdfbox

I use different tools like processing to create vector plots. These plots are written as single or multi-page pdfs. I would like to include these plots in a single report-like pdf using pdfbox.
My current workflow includes these pdfs as images with the following pseudo code
PDDocument inFile = PDDocument.load(file);
PDPage firstPage = (PDPage) inFile.getDocumentCatalog().getAllPages().get(0);
BufferedImage image = firstPage.convertToImage(BufferedImage.TYPE_INT_RGB, 300);
PDXObjectImage ximage = new PDPixelMap(document, image);
PDPageContentStream contentStream = new PDPageContentStream(document, page);
contentStream.drawXObject(ximage, 0, 0, ximage.getWidth(), ximage.getHeight());
contentStream.close();
While this works it looses the benefits of the vector file formats, espectially file/size vs. printing qualitity.
Is it possible to use pdfbox to include other pdf pages as embedded objects within a page (Not added as a separate page)? Could I e.g. use a PDStream? I would prefer a solution like pdflatex is able to embed pdf figures into a new pdf document.
What other Java libraries can you recommend for that task?

Is it possible to use pdfbox to include other pdf pages as embedded objects within a page
It should be possible. The PDF format allows the use of so called form xobjects to serve as such embedded objects. I don't see an explicit implementation for that, though, but the procedure is similar enough to what PageExtractor or PDFMergerUtility do.
A proof of concept derived from PageExtractor using the current SNAPSHOT of the PDFBox 2.0.0 development version:
PDDocument source = PDDocument.loadNonSeq(SOURCE, null);
List<PDPage> pages = source.getDocumentCatalog().getAllPages();
PDDocument target = new PDDocument();
PDPage page = new PDPage();
PDRectangle cropBox = page.findCropBox();
page.setResources(new PDResources());
target.addPage(page);
PDFormXObject xobject = importAsXObject(target, pages.get(0));
page.getResources().addXObject(xobject, "X");
PDPageContentStream content = new PDPageContentStream(target, page);
AffineTransform transform = new AffineTransform(0, 0.5, -0.5, 0, cropBox.getWidth(), 0);
content.drawXObject(xobject, transform);
transform = new AffineTransform(0.5, 0.5, -0.5, 0.5, 0.5 * cropBox.getWidth(), 0.2 * cropBox.getHeight());
content.drawXObject(xobject, transform);
content.close();
target.save(TARGET);
target.close();
source.close();
This code imports the first page of a source document to a target document as XObject and puts it twice onto a page there with different scaling and rotation transformations, e.g. for this source
it creates this
The helper method importAsXObject actually doing the import is defined like this:
PDFormXObject importAsXObject(PDDocument target, PDPage page) throws IOException
{
final PDStream src = page.getContents();
if (src != null)
{
final PDFormXObject xobject = new PDFormXObject(target);
OutputStream os = xobject.getPDStream().createOutputStream();
InputStream is = src.createInputStream();
try
{
IOUtils.copy(is, os);
}
finally
{
IOUtils.closeQuietly(is);
IOUtils.closeQuietly(os);
}
xobject.setResources(page.findResources());
xobject.setBBox(page.findCropBox());
return xobject;
}
return null;
}
As mentioned above this is only a proof of concept, corner cases have not yet been taken into account.

To update this question:
There is already a helper class in org.apache.pdfbox.multipdf.LayerUtility to do the import.
Example to show superimposing a PDF page onto another PDF: SuperimposePage.
This class is part of the Apache PDFBox Examples and sample transformations as shown by #mkl were added to it.

As mkl appropriately suggested, PDFClown is among the Java libraries which provide explicit support for page embedding (so-called Form XObjects (see PDF Reference 1.7, § 4.9)).
In order to let you get a taste of the way PDFClown works, the following code represents the equivalent of mkl's PDFBox solution (NOTE: as mkl later stated, his code sample was by no means optimised, so this comparison may not correspond to the actual status of PDFBox -- comments are welcome to clarify this):
Document source = new File(SOURCE).getDocument();
Pages sourcePages = source.getPages();
Document target = new File().getDocument();
Page targetPage = new Page(target);
target.getPages().add(targetPage);
XObject xobject = sourcePages.get(0).toXObject(target);
PrimitiveComposer composer = new PrimitiveComposer(targetPage);
Dimension2D targetSize = targetPage.getSize();
Dimension2D sourceSize = xobject.getSize();
composer.showXObject(xobject, new Point2D.Double(targetSize.getWidth() * .5, targetSize.getHeight() * .35), new Dimension(sourceSize.getWidth() * .6, sourceSize.getHeight() * .6), XAlignmentEnum.Center, YAlignmentEnum.Middle, 45);
composer.showXObject(xobject, new Point2D.Double(targetSize.getWidth() * .35, targetSize.getHeight()), new Dimension(sourceSize.getWidth() * .4, sourceSize.getHeight() * .4), XAlignmentEnum.Left, YAlignmentEnum.Top, 90);
composer.flush();
target.getFile().save(TARGET, SerializationModeEnum.Standard);
source.getFile().close();
Comparing this code to PDFBox's equivalent you can notice some relevant differences which show PDFClown's neater style (it would be nice if some PDFBox expert could validate my assertions):
Page-to-FormXObject conversion: PDFClown natively supports a dedicated method (Page.toXObject()), so there's no need for additional heavy-lifting such as the helper method importAsXObject();
Resource management: PDFClown automatically (and transparently) allocates page resources, so there's no need for explicit calls such as page.getResources().addXObject(xobject, "X");
XObject drawing: PDFClown supports both high-level (explicit scale, translation and rotation anchors) and low-level (affine transformations) methods to place your FormXObject into the page, so there's no need to necessarily deal with affine transformations.
The whole point is that PDFClown features a rich architecture made up of multiple abstraction layers: according to your requirements, you can choose the most appropriate coding style (either to delve into PDF's low-level basic structures or to leverage its convenient and elegant high-level model). PDFClown lets you tweak every single byte and solve complex tasks with a ridiculously simple method call, at your will.
DISCLOSURE: I'm the lead developer of PDFClown.

Convert HTML/MXML file to Word doc programmatically in Java

I would like to convert either an HTML or MXML file document to Microsoft .doc and/or .docx format.
Please provide an example for doing this?

I've found that by far the best (free) option to do conversions like this is to use the OpenOffice API. It has a very robust conversion facility. It's a bit of a pain to initially get working because of how abstract the API is, but once you do, it's powerful. This API wrapper helps to simplify it somewhat.

You can also use docx4j.jar which simply converts xhtml to docx.
You can save your format information as xhtml template and place input from form (like name,age,address etc) into the template at runtime.
This is a sample code to refer from this link
public static void main(String[] args) throws Exception
{
String xhtml=
"<table border=\"1\" cellpadding=\"1\" cellspacing=\"1\" style=\"width:100%;\"><tbody><tr><td>test</td><td>test</td></tr><tr><td>test</td><td>test</td></tr><tr><td>test</td><td>test</td></tr></tbody></table>";
// To docx, with content controls
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
XHTMLImporterImpl XHTMLImporter = new XHTMLImporterImpl(wordMLPackage);
wordMLPackage.getMainDocumentPart().getContent().addAll(
XHTMLImporter.convert( xhtml, null) );
wordMLPackage.save(new java.io.File("D://sample.docx"));
}

You can use both iText and Apache POI to handle and convert MS doc in Java.

You can convert HTML to DOCX using Aspose.Words Cloud SDK for Java. Its free pricing plan offers 150 free API calls per month.
P.S: I am a developer evangelist at Aspose
//Get Client ID and Client Key from https://dashboard.aspose.cloud/
WordsApi wordsApi = new WordsApi("xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx","xxxxxxxxxxxxxxxxxxxxxxx","https://api.aspose.cloud");
ApiClient client = wordsApi.getApiClient();
client.setConnectTimeout(12*60*1000);
client.setReadTimeout(12*60*1000);
client.setWriteTimeout(12*60*1000);
try {
ConvertDocumentRequest request = new ConvertDocumentRequest(
Files.readAllBytes(Paths.get("C:/Temp/02_pages.html").toAbsolutePath()),
"docx",
null,
null,
null,
null
);
File result = wordsApi.convertDocument(request);
System.out.println("api request completed...");
File dest = new File("C:/Temp/02_pages_java.docx");
Files.copy(result.toPath(), dest.toPath(),
java.nio.file.StandardCopyOption.REPLACE_EXISTING);
} catch (Exception e) {
System.out.println(e.getMessage());
}

Creating PDF from Word (DOC) using Apache POI and iText in JAVA

I am trying to generate a PDF document from a *.doc document.
Till now and thanks to stackoverflow I have success generating it but with some problems.
My sample code below generates the pdf without formatations and images, just the text.
The document includes blank spaces and images which are not included in the PDF.
Here is the code:
in = new FileInputStream(sourceFile.getAbsolutePath());
out = new FileOutputStream(outputFile);
WordExtractor wd = new WordExtractor(in);
String text = wd.getText();
Document pdf= new Document(PageSize.A4);
PdfWriter.getInstance(pdf, out);
pdf.open();
pdf.add(new Paragraph(text));

docx4j includes code for creating a PDF from a docx using iText. It can also use POI to convert a doc to a docx.
There was a time when we supported both methods equally (as well as PDF via XHTML), but we decided to focus on XSL-FO.
If its an option, you'd be much better off using docx4j to convert a docx to PDF via XSL-FO and FOP.
Use it like so:
wordMLPackage = WordprocessingMLPackage.load(new java.io.File(inputfilepath));
// Set up font mapper
Mapper fontMapper = new IdentityPlusMapper();
wordMLPackage.setFontMapper(fontMapper);
// Example of mapping missing font Algerian to installed font Comic Sans MS
PhysicalFont font
= PhysicalFonts.getPhysicalFonts().get("Comic Sans MS");
fontMapper.getFontMappings().put("Algerian", font);
org.docx4j.convert.out.pdf.PdfConversion c
= new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
// = new org.docx4j.convert.out.pdf.viaIText.Conversion(wordMLPackage);
OutputStream os = new java.io.FileOutputStream(inputfilepath + ".pdf");
c.output(os);
Update July 2016
As of docx4j 3.3.0, Plutext's commercial PDF renderer is docx4j's default option for docx to PDF conversion. You can try an online demo at converter-eval.plutext.com
If you want to use the existing docx to XSL-FO to PDF (or other target supported by Apache FOP) approach, then just add the docx4j-export-FO jar to your classpath.
Either way, to convert docx to PDF, you can use the Docx4J facade's toPDF method.
The old docx to PDF via iText code can be found at https://github.com/plutext/docx4j-export-FO/.../docx4j-extras/PdfViaIText/

WordExtractor just grabs the plain text, nothing else. That's why all you're seeing is the plain text.
What you'll need to do is get each paragraph individually, then grab each run, fetch the formatting, and generate the equivalent in PDF.
One option may be to find some code that turns XHTML into a PDF. Then, use Apache Tika to turn your word document into XHTML (it uses POI under the hood, and handles all the formatting stuff for you), and from the XHTML on to PDF.
Otherwise, if you're going to do it yourself, take a look at the code in Apache Tika for parsing word files. It's a really great example of how to get at the images, the formatting, the styles etc.

I have succesfully used Apache FOP to convert a 'WordML' document to PDF. WordML is the Office 2003 way of saving a Word document as xml. XSLT stylesheets can be found on the web to transform this xml to xml-fo which in turn can be rendered by FOP into PDF (among other outputs).
It's not so different from the solution plutext offered, except that it doesn't read a .doc document, whereas docx4j apparently does. If your requirements are flexible enough to have WordML style documents as input, this might be worth looking into.
Good luck with your project!
Wim

Use OpenOffice/LbreOffice and JODConnector
This also mostly works for .doc to .docx. Problems with graphics that I have not yet worked out though.
private static void transformDocXToPDFUsingJOD(File in, File out)
{
OfficeDocumentConverter converter = new OfficeDocumentConverter(officeManager);
DocumentFormat pdf = converter.getFormatRegistry().getFormatByExtension("pdf");
converter.convert(in, out, pdf);
}
private static OfficeManager officeManager;
#BeforeClass
public static void setupStatic() throws IOException {
/*officeManager = new DefaultOfficeManagerConfiguration()
.setOfficeHome("C:/Program Files/LibreOffice 3.6")
.buildOfficeManager();
*/
officeManager = new ExternalOfficeManagerConfiguration().setConnectOnStart(true).setPortNumber(8100).buildOfficeManager();
officeManager.start();
}
#AfterClass
public static void shutdownStatic() throws IOException {
officeManager.stop();
}
You need to be running LibreOffice as a serverto make this work.
From the command line you can do this using;
"C:\Program Files\LibreOffice 3.6\program\soffice.exe" -accept="socket,host=0.0.0.0,port=8100;urp;LibreOffice.ServiceManager" -headless -nodefault -nofirststartwizard -nolockcheck -nologo -norestore

Another option I came across recently is using the OpenOffice (or LibreOffice) API (see here). I have not been able to get into this but it should be able to open documents in various formats and output them in a pdf format. If you look into this, let me know how it worked!

Barcode doesn't display in PDF using iText

I'm using iText to dynamically generate PDF docs. Now I'm trying to dynamically create a barcode in this PDF. Adobe Live Cycle has a barcode function built-in. You can just drag the barcode text box on the page and it's created.
Problem:
I placed the barcode field in the PDF. Then pass a number to the barcode field from the JSP page. But only the number appears. The barcode lines never display
The number, 20099002, is visible on the PDF doc, but the barcode lines fail to appear. I tried several other barcode options in LiveCycle but the all give the same result.
OurJavaPage.java
public class ExampleForm extends BaseOutput {
private static final Log LOG = LogFactory.getLog(ExampleForm.class);
public OutputStream generate() throws IOException, DocumentException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PdfReader reader;
reader = new PdfReader(BASEDIR + "MailingExample.pdf");
PdfStamper stamper = new PdfStamper(reader, baos);
Participant participantHome = home.getParticipant();
Set<Location> homeLocs = participantHome.getLocations();
final AcroFields form = stamper.getAcroFields();
setFormField(form, "addrNumber[0]", addrMaster.getStreetNum());
setFormField(form, "dateMiddle[0]", formatDate("MM-dd-yyyy", new Date()));
// *********** Here's the problem *****************************
setFormField(form, "Code128ABarcode1[0]", "20099002");
// ************************************************************
debugAcrobatForm("ExampleForm", form);
stamper.setFormFlattening(true);
stamper.close();
return baos;
}
}
Operating System: Linux
Programming: Java, .jsp, iText
Software: Adobe Live Cycle Designer ES 8.1

Problem solved!!!
I contacted iText and they suggested that I change this line.
From:
stamper.setFormFlattening(true);
To:
stamper.setFormFlattening(false);
It worked.

Does your software embed the barcode as a graphic or as a font representation of characters?
If the latter, is it embedding the font into the PDF?

Did you ask on the very active IText mailing list?

If this was a dynamic XFA form created with LiveCycle, then using form flattening will cause you to lose your form fields. Static XFA forms should work though.
Reference: http://itext.ugent.be/library/question.php?id=30
XFA support in iText is improving but spotty at best.

u have to add barcode font in ur system (font library), then it will be visible in ur font drop-down. Use acro-field (text) and set that font in this acro-field. Ur problem will be solved.
also, use setformflattening=true as it will make pdf uneditable

it works just fine for me.. maybe its got to do with the way you make the template in livecycle designer (static or dynamic)...
see sample here
http://1t3xt.info/examples/browse/?page=example&id=433
Regards
Raghavendra Samant

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Converting HTML to PDF using iText - java

Related

Itext Error - "The method add(AreaBreak) in the type Document is not applicable for the arguments (PdfTable)"

How to insert an PDPage within another PDPage with pdfbox

Convert HTML/MXML file to Word doc programmatically in Java

Creating PDF from Word (DOC) using Apache POI and iText in JAVA

Barcode doesn't display in PDF using iText

Categories

Resources