OpenHtmlToPdf results in Error when using PDF/A conformance

I am facing an issue while creating a PDF using OpenHtmlToPdf. Creating a normal PDF works fine, but once PDF/A conformance is enabled, the application exits with a NullPointerException.
The PDF I am trying to create has to be PDF/A conformant. However, without the conformance, the resulting PDF does not contain the fonts.
The process of creating a PDF starts with an HTML template, which I process using Thymeleaf in Spring Boot. The rendered PDF is finally returned as a ByteArray and served by a controller as a blob download. This part is working fine. The code generating the PDF looks something like this:
val model: Model = modelFactory.buildModel(id)
val context = Context(model.locale)
val outputStream = ByteArrayOutputStream()
val builder = PdfRendererBuilder()
val conformance = PdfRendererBuilder.PdfAConformance.NONE
context.setVariable("model", model)
// Other variables
val htmlContent: String = templateEngine.process("content", context)
builder.useFastMode().usePdfAConformance(conformance)
    .withHtmlContent(htmlContent, "")
    .usePdfVersion(if (conformance.part == 1) 1.4f else 1.5f)
    .toStream(outputStream)
    .useFont({ ClassPathResource("fonts/first-font.ttf").inputStream }, "first-font")
    .useFont({ ClassPathResource("fonts/second-font.ttf").inputStream }, "second-font")
    .useFont({ ClassPathResource("fonts/third-font.ttf").inputStream }, "third-font")
    .useFont({ ClassPathResource("fonts/fourth-font.ttf").inputStream }, "fourth-font")
    .run()
return outputStream.toByteArray()
The code has been slightly altered, but the logic is kept the same.
The HTML contains CSS, which references the fonts like this:
@page {
    size: a4;
    @bottom-center {
        font-size: 8pt;
        border-top: 1px solid black;
        content: 'Generated on [[${formatters.formatDateTime(model.generatedAt)}]]';
        font-family: "first-font"; /* Font provided in code */
    }
    /* ... */
}
html {
    font-size: 9pt;
    font-family: "second-font"; /* Font provided in code */
}
/* ... */
With PdfAConformance.NONE, everything works fine and the PDF is rendered as expected.
However, once I change it to any PDF/A conformance, it no longer works. A very long list of logs is written to System.err, consisting of the following two lines repeated over and over again:
com.openhtmltopdf.exception WARNING:: Font metrics not available. Probably a bug.
com.openhtmltopdf.render WARNING:: Font is null.
Finally, the process ends with the following three info logs and a NullPointerException:
com.openhtmltopdf.general INFO:: Using fast-mode renderer. Prepare to fly.
com.openhtmltopdf.general INFO:: Specified fonts don't contain a space character!
com.openhtmltopdf.general INFO:: Specified fonts don't contain a space character!
java.lang.NullPointerException
at org.apache.pdfbox.pdmodel.PDPageContentStream.setFont(PDPageContentStream.java:409)
at com.openhtmltopdf.pdfboxout.PdfContentStreamAdapter.setFont(PdfContentStreamAdapter.java:262)
at com.openhtmltopdf.pdfboxout.PdfBoxFastOutputDevice.drawStringFast(PdfBoxFastOutputDevice.java:462)
at com.openhtmltopdf.pdfboxout.PdfBoxFastOutputDevice.drawString(PdfBoxFastOutputDevice.java:408)
at com.openhtmltopdf.pdfboxout.PdfBoxTextRenderer.drawString(PdfBoxTextRenderer.java:51)
at com.openhtmltopdf.render.AbstractOutputDevice.drawText(AbstractOutputDevice.java:107)
at com.openhtmltopdf.render.InlineText.paint(InlineText.java:171)
at com.openhtmltopdf.render.InlineLayoutBox.paintInline(InlineLayoutBox.java:279)
at com.openhtmltopdf.render.simplepainter.SimplePainter.paintInlineContent(SimplePainter.java:170)
at com.openhtmltopdf.render.simplepainter.SimplePainter.paintLayer(SimplePainter.java:72)
at com.openhtmltopdf.render.PageBox.paintMarginAreas(PageBox.java:430)
at com.openhtmltopdf.pdfboxout.PdfBoxRenderer.paintPageFast(PdfBoxRenderer.java:886)
at com.openhtmltopdf.pdfboxout.PdfBoxRenderer.writePDFFast(PdfBoxRenderer.java:619)
at com.openhtmltopdf.pdfboxout.PdfBoxRenderer.createPdfFast(PdfBoxRenderer.java:554)
at com.openhtmltopdf.pdfboxout.PdfBoxRenderer.createPDF(PdfBoxRenderer.java:472)
at com.openhtmltopdf.pdfboxout.PdfBoxRenderer.createPDF(PdfBoxRenderer.java:409)
at com.openhtmltopdf.pdfboxout.PdfBoxRenderer.createPDF(PdfBoxRenderer.java:391)
at com.openhtmltopdf.pdfboxout.PdfRendererBuilder.run(PdfRendererBuilder.java:42)
....
This exception is thrown within Apache PdfBox and I cannot control the input values, so this appears to be an issue within OpenHtmlToPdf and not another duplicate of "What is a NullPointerException, and how do I fix it?". Nothing is allowed to be null in this Kotlin code either.
The exception is thrown after calling run on the builder.
I have tried a lot of possible combinations using the conformance: all 3 PDF/A conformance settings, each with and without the fast mode, because this post suggested it might be an issue. I made sure to [...] [use] a name that is not sans-serif, serif or monospace, as this question might have suggested. Adding "-fs-font-subset: complete-font" as well as the url to the CSS entries did not help either, although it apparently did in this question. If possible, I want to include the fonts in code and not in CSS, since resolving those URLs has been tricky and finicky in the past. I also made sure that the fonts are not broken by trying out different TTFs, and every one I tried resulted in the same NullPointerException.
However, those answers and all other answers to issues on GitHub as well as here did not help in my case. I am lost.
Is there anything I am missing? Do I have to do something different to get PDF/A conformance?

Related

ITEXT dataElements Loop Performance

Hi, I have recently been working on a project in which one of the reporting modules uses the iText 2.0.8 library. Everything worked fine until the amount of data became huge (around 50,000+ rows). I would really appreciate suggestions from the experts on Stack Overflow on how to improve my code.
My code logic: I build the HTML code with all the data contained inside. Once the full HTML is done, it is stored in a variable called "content". I then convert the "content" variable into an IElement list and use a for loop to add each element to the document. I realised this loop performs badly (CPU usage is high) and the report is generated very slowly (it even caused a connection timeout).
The following is the part of the code that causes very high CPU usage for the Java process.
// The **content** String variable contains the HTML code of the report
// (from <head> to <body>, with a <table> as the main content to structure the data rows).
// I didn't include it here because the code is huge.
String PDFFileName = "123.pdf";
PdfDocument pdf = new PdfDocument(new PdfWriter(new FileOutputStream(PDFFileName)));
Document document = new Document(pdf);
List<IElement> dataElements = HtmlConverter.convertToElements(content.toString(), converterProperties);
for (IElement element : dataElements) {
    if (element instanceof IBlockElement) {
        document.add((IBlockElement) element);
    }
}
I know the loop is the issue, but I don't know of any better, more efficient way for my case. I hope someone can help me with this! Thank you. Please comment below if you need extra information (sorry, I can't really include all the code since it is huge).
Specification: iText 2.0.8, Java 8.0, HTML, CSS.

How to convert html with svg tags and css in to image in java

We can successfully convert an SVG into an image with Batik. However, I need to convert a whole HTML div, with SVG embedded in it, along with its CSS presentation code, into an image.
Are there any modules or support within Batik, or some other Java API, for achieving this?

The Selenium library for Java may help you. It can run a browser (IE, Chrome, Firefox, etc.) in background mode, and you can load an HTML page and take a snapshot of the content.
Although it is designed for testing and automation, it is the only approach I can suggest.
Hope it helps.
http://www.seleniumhq.org/

We had the same problem, and we solved it by spawning a PhantomJS process.
Phantom takes a JavaScript file that instructs its headless browser what to do.
You can wait until the page is fully loaded and then print the output to the console as a data URI.
Below is a very simple example from my PhantomJS scripts:
var page = require( "webpage" ).create();
var options = JSON.parse( phantom.args[ 0 ] );
page.open( options.url, "POST", decodeURIComponent( options.payload || "" ), function( status ) {
    if ( status === "fail" ) {
        phantom.exit( 1 );
    }
    var contents = page.renderBase64( "png" );
    require( "system" ).stderr.write( contents );
});
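On the Java side, spawning that process could look roughly like the sketch below. It uses only the standard library. The class name, the script path, and a `phantomjs` binary on the PATH are assumptions for illustration, and it also assumes the script calls `phantom.exit()` once it has written the image, otherwise reading the stream never finishes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

public class PhantomScreenshot {

    // Builds the command line. The script reads its first argument as JSON
    // with a "url" field; "phantomjs" being on the PATH is an assumption.
    static List<String> buildCommand(String scriptPath, String url) {
        String escaped = url.replace("\\", "\\\\").replace("\"", "\\\"");
        String options = "{\"url\":\"" + escaped + "\"}";
        return Arrays.asList("phantomjs", scriptPath, options);
    }

    // Decodes the base64 text written by the script into raw PNG bytes.
    // The MIME decoder tolerates line breaks and trailing whitespace.
    static byte[] decodePng(String base64) {
        return Base64.getMimeDecoder().decode(base64.trim());
    }

    // Runs PhantomJS and collects everything it writes to stderr,
    // because the script writes the image there.
    static byte[] render(String scriptPath, String url) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(buildCommand(scriptPath, url));
        builder.redirectOutput(ProcessBuilder.Redirect.INHERIT); // keep stdout from filling up
        Process process = builder.start();
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream err = process.getErrorStream()) {
            byte[] chunk = new byte[8192];
            int read;
            while ((read = err.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        }
        if (process.waitFor() != 0) {
            throw new IOException("PhantomJS exited with an error");
        }
        return decodePng(new String(buffer.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

A call such as `PhantomScreenshot.render("render.js", "http://localhost:8080/page")` (both values hypothetical) would then return the PNG bytes, ready to be written to a file or passed to ImageIO.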

This is not an easy task, as what you are asking for is the process called "HTML rendering", which is basically what browsers have been trying to implement correctly for over two decades.
If the CSS you need rendered is fairly simple (no CSS3, no fancy stuff, etc.), then there is a high chance that one of the open-source renderers will be able to handle it (PhantomJS, for example). See @gustavohenke's answer for more details.
If the CSS is moderately complex and you are able to modify it if needed, then there are some fast but non-free renderers, like PrinceXML and DocRaptor.
If the CSS could be very complex and you are not able to make it simpler, then the only option is to render it in a real browser. You can use Selenium for that, as it can run the browser, render your HTML in it and "screenshot" the result, all in an automated fashion. See @Jorgeblom's answer for more details.

vaadin legacy-styles.css doesn't load in IE8

I am using Vaadin to create a web app. I want to import legacy-styles.css into my styles.css.
My styles.css is as follows:
@import "../reindeer/legacy-styles.css";
.v-app {
    background: yellow;
}
Then I use Modernizr to target IE8:
Element head = response.getDocument().head();
Element meta = head.appendElement("meta");
meta.attr("name", "viewport");
meta.attr("content", "width=device-width, initial-scale=1, maximum-scale=1");
// Meta tag to force IE8 to standard mode
String ie8Meta = "<meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge,chrome=1\">";
head.prepend(ie8Meta);
// some other stuffs ...
// Adding modernizr library to target ie8
// http://modernizr.com/docs/
String modernizr = "<script type=\"text/javascript\" src=\"//cdnjs.cloudflare.com/ajax/libs/modernizr/2.6.2/modernizr.min.js\"></script>";
head.prepend(modernizr);
Strangely, legacy-styles.css loads properly in IE10, but it doesn't load in IE8.
The error reported is
com.vaadin.client.VConsole
SEVERE: CSS files may have not loaded properly.
I have tried moving modernizr.js (using append instead of prepend), but that didn't work.

Before we dig into the problem at hand, using @import is a super bad idea. While it is convenient for you, it means that every single user who comes to your site has to download one file (the document itself) in order to download another file (the one containing the @import rule), just to download yet another file. In addition to DNS resolution and TCP handshakes, you are looking at possibly up to a dozen round trips between your server and their computer just to get this one single document.
It would be much better to use a CSS precompiler, such as Sass or Less, and concatenate the older file with the new one into a single, compressed file.
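To make the concatenation idea concrete, here is a minimal Java sketch of what such a build step boils down to. It is not Sass or Less, only plain string handling with a hypothetical `CssBundler` class: it joins the stylesheets in order and drops every import line, on the assumption that each imported file is already part of the bundle. It does not compress anything.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CssBundler {

    // Joins the stylesheets in the given order and drops @import lines.
    // Naive on purpose: it assumes every imported file is in the list.
    static String bundle(List<String> stylesheets) {
        return stylesheets.stream()
                .flatMap(css -> Arrays.stream(css.split("\\R")))
                .filter(line -> !line.trim().startsWith("@import"))
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Placeholder content standing in for the real legacy-styles.css.
        String legacy = ".legacy { color: red; }";
        String styles = "@import \"../reindeer/legacy-styles.css\";\n"
                + ".v-app {\n"
                + "    background: yellow;\n"
                + "}";
        System.out.println(bundle(Arrays.asList(legacy, styles)));
    }
}
```

The browser then fetches one stylesheet instead of following a chain of imports.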
Next, Modernizr shouldn't be used to target IE8, or any browser for that matter. That is actually completely against the purpose of Modernizr, which specializes in feature detection rather than browser detection. That means that rather than saying you want to "target IE8", you would think more along the lines of wanting to "target browsers lacking feature X" (SVG, geolocation, rounded corners, etc.).
That being said, there are a number of reasons to specifically target versions of Internet Explorer, but Modernizr is not the way to do it. The more correct way would be to use IE's conditional comments:
<!--[if IE 8]>
//IE specific styles here
<![endif]-->
Nothing you have shown has anything to do with IE-specific code, and since Modernizr doesn't add any features itself, it isn't really clear why IE 10 would behave any differently at all.

sencha GXT css classes

I'm having problems styling a grid in GXT. The elements in the grid get (I don't know exactly how) a CSS class named ".GKA1XC4LIC", and this class overrides the settings provided by my own CSS class (in my own CSS file). However, I am able to change some properties (like font-size) with my class, so my CSS file is being loaded.
I guess this .GKA1XC4LIC class is generated somewhere, but I don't know where. Why is it done this way? Am I doing this completely wrong?
I set the class name like this:
codeColumnConfig.setColumnTextClassName("smk-grid-text");
Thanks

I assume you are using GXT3. You said some properties can be set by changing the CSS. That is because GXT3 has not set them itself, and so your values apply.
To use the GXT3 Appearances correctly, it may be best to read the section "Styling a GXT 3 application" in the migration guide. It's about the middle of the page.
It explains the two ways to modify the Appearance pattern that GXT3 uses:
via configuration (in the GWT module XML file)
via constructor arguments
There is another explanation in the Sencha docs for Appearances.
That said, this is pretty involved, depending on how much you need to change.
To do it quickly, I sometimes use a cell to render the content how I need it.
For example, to render a cell in a grid in a particular way, I would do:
ColumnConfig<Users, String> userCol = new ColumnConfig<SelectUserDialog.Users, String>(selectUserProperties.userName(), 240);
AbstractCell<String> c2 = new AbstractCell<String>() {
    @Override
    public void render(com.google.gwt.cell.client.Cell.Context context, String value, SafeHtmlBuilder sb) {
        // Wrap the value in a div carrying the inline styles for this column
        value = "<div style=\"font-size:2.5EM; line-height: 30px; height: 40px\" >" + value + "</div>";
        sb.appendHtmlConstant(value);
    }
};
userCol.setCell(c2);
If you are not using ColumnConfig already, you may need to look at ValueProvider and PropertyAccess.

Unknown error when calling Java applet from JavaScript

Here's the JavaScript (on an aspx page):
function WriteDocument(clientRef, system, branch, category, pdfXML)
{
    AppletReturnValue = document.DocApplet.WriteDocument(clientRef, apmBROOMS, branch, category, pdfXML);
    if (AppletReturnValue.length > 0) {
        document.getElementById('pdfData').value = "";
        CallServer(AppletReturnValue,'');
    }
    PostBackAndDisplayPDF();
}
pdfXML is taken from pdfData, a hidden field on the page holding the XML with the base64-encoded PDF data that is passed to the Java applet. All the other values being passed are sensible and within range.
The XML looks like this:
<Documents>
    <FileName>AFileName</FileName>
    <PDF>JVBERiDAzOTY1NzMwIDAwMDAwIG4NCjAwMDM5NjU4NDcgMDAwMDAgbg0KMDAwMzk2NTk2</PDF>
</Documents>
The content of the PDF element is a lot bigger than displayed here.
The signature of the Java method is:
public String WriteDocument(String clientPolicyReference,
                            int systemType,
                            int branch,
                            String category,
                            String PDFData) throws Exception
It seems that when the PDF data gets large, the applet call fails and the error 'Unknown Error' is thrown in the JS.
The PDF document whose data produces this error is about 4 MB in size.
Many thanks in advance for any help.
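For a sense of scale, the base64 text handed to the applet is about a third larger than the PDF itself. A small sketch of the arithmetic, assuming "4 MB" means 4 * 1024 * 1024 bytes and standard padded base64 without line breaks (the class name is only for illustration):

```java
import java.util.Base64;

public class Base64Size {

    // Padded base64 turns every started group of 3 bytes into 4 characters.
    static long encodedLength(long rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    public static void main(String[] args) {
        long raw = 4L * 1024 * 1024;            // 4,194,304 bytes of PDF data
        System.out.println(encodedLength(raw)); // prints 5592408

        // Cross-check the formula against the JDK encoder on a small input.
        int actual = Base64.getEncoder().encodeToString(new byte[10]).length();
        System.out.println(actual == encodedLength(10)); // prints true
    }
}
```

So the string argument carries roughly 5.6 million characters, before counting the surrounding XML.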

Thanks for responding, chaps, but I've sorted the problem.
How? I took JRE 1.6 update 12 off and stuck update 7 (which is the version we recommend to those who use our website) on my machine.
Why update 12 stopped working, I don't know. Why update 7 is stable, I don't know. [sigh]
It's things like this that make me glad I work mostly with a 'long time between releases' framework like .NET.
