HTML -> PDF rendering issues - java

I'm generating PDFs from HTML pages in a Java application. Sometimes the PDF is formatted correctly (with styles); other times the styles are missing.
In the log file I can see the message "Error in rendering".
We build the HTML markup in a StringBuffer and convert it to a PDF file. I'm not sure why the formatting is sometimes missing from the generated PDF.

So sometimes the CSS is applied when the HTML is converted, and sometimes it isn't.
I'm guessing that you use an external CSS file. If I were you, I would put the CSS directly in the HTML file, inside the <head> element, like this:
<style>
body {background-color:#fff}
h1 {color:#eee}
</style>
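If the HTML is assembled in Java before conversion, the same idea can be applied in code. This is only a minimal sketch: the class and method names (CssInliner, inlineCss) are made up for illustration, and it assumes you already have the stylesheet text as a String.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CssInliner {
    private static final Pattern HEAD_END =
            Pattern.compile("</head>", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: embeds the stylesheet text in a <style> block
    // just before </head>, so the converter never has to fetch an external file.
    static String inlineCss(String html, String css) {
        String styleBlock = "<style>\n" + css + "\n</style>\n";
        Matcher m = HEAD_END.matcher(html);
        if (!m.find()) {
            // No </head> found: prepend the styles rather than drop them.
            return styleBlock + html;
        }
        return html.substring(0, m.start()) + styleBlock + html.substring(m.start());
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Report</title></head>"
                + "<body><h1>Hi</h1></body></html>";
        String css = "body {background-color:#fff}\nh1 {color:#eee}";
        System.out.println(inlineCss(html, css));
    }
}
```

The result can then be handed to whatever converter you use in place of the original HTML string.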

Related

Keep CKEditor formatting when exporting to MS Word

I'm trying to export a text area (for which I use ckeditor) into a Word document. I'm using JSP, and setting HTTP headers of a target page to receive the textarea value in request scope:
<%@ page contentType="application/vnd.ms-word" %>
<% response.setHeader("Content-Disposition", "attachment;filename=responseLetter.doc"); %>
...
<%=textAreaReqScopeValue%>
However, the formatting and styles from my CKEditor source (example below) are lost once the Word document has been generated:
<p>Dear Anonymous,</p><p>This is in response to your <strong><em><u>request regarding your continued ...
Is there any way to keep the formatting, either by generating the Word document or through CKEditor?
Using googoose.js or html-doc.js solved my problem. An Open XML library should have been used to process the HTML tags for the MS Word output.

parse text from xml

I have the following link:
https://hero.epa.gov/hero/ws/swift.cfc?method=getProjectRIS&project_id=993&getallabstracts=true
I want to parse this XML to get only the text, like:
Provider: HERO - 2.xx
DBvendor=EPA
Text-encoding=UTF-8
How can I parse it?
Well, it's not a text file, it's an HTML file. If you open the link in a browser and select View Source, you will see the text enclosed in <char> tags.
When it's opened in a browser, these tags and the other HTML content are interpreted and the output is rendered on the page (that's why it looks like plain text). If you want to implement similar behavior in Java, you should look into PhantomJS and/or JSoup examples.
It looks like a text file but it is an XML file and the browser just displays its text content.
To verify, right-click and view the page source.
You can use a library like Jsoup for parsing the file and getting the contents.
https://jsoup.org/cookbook/introduction/parsing-a-document
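If you would rather not add a dependency, the JDK's built-in DOM parser can do the same job as Jsoup for well-formed XML. The sketch below swaps in that parser instead of Jsoup. The sample document and its <ris> root element are invented for illustration (only the <char> tag name comes from the answer above), and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XmlTextExtractor {
    // Returns the concatenated text content of every element named tagName.
    static String extractText(String xml, String tagName) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            StringBuilder text = new StringBuilder();
            for (int i = 0; i < nodes.getLength(); i++) {
                text.append(nodes.item(i).getTextContent());
            }
            return text.toString();
        } catch (Exception e) {
            // Parser configuration, I/O and syntax errors all end up here.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample; the real response may be structured differently.
        String sample = "<ris><char>Provider: HERO - 2.xx\n</char>"
                + "<char>DBvendor=EPA\n</char></ris>";
        System.out.print(extractText(sample, "char"));
    }
}
```

For the real service you would first download the response body (for example with java.net.http.HttpClient) and pass it to extractText.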

Adding Header or Footer on every page using ITextRenderer from HTML

I'm creating an HTML report using FreeMarker, and I produce a PDF from that HTML using ITextRenderer.
ITextRenderer renderer = new ITextRenderer();
renderer.setDocumentFromString(html);
renderer.layout();
renderer.createPDF(baosPDF);
I have a table in that html, with a header that successfully shows on every page using css classes:
thead { display:table-header-group }
Is it possible to do the same trick for an arbitrary section of my document (let's say, a div)? I'd like to keep my HTML vanilla, and identify the "header" and "footer" I want to see on every page using CSS.
Is it possible with CSS only?
Perhaps you should have a look at
http://developers.itextpdf.com/content/itext-7-examples/converting-html-pdf
It gives a few examples of converting HTML to PDF, including loading an external stylesheet.
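For the CSS-only route asked about in the question, one thing worth trying is CSS paged-media "running elements", which Flying Saucer (the library that provides ITextRenderer) is generally reported to support. I have not checked this against your version, so treat the following as a sketch under that assumption. The class names header and footer and the helper RunningHeaderExample are made up for illustration.

```java
class RunningHeaderExample {
    // Paged-media rules: take div.header / div.footer out of the normal flow
    // and repeat them in the page margin boxes of every page.
    static final String PAGE_CSS =
            "div.header { position: running(header); }\n"
          + "div.footer { position: running(footer); }\n"
          + "@page {\n"
          + "  margin: 25mm 15mm;\n"
          + "  @top-center { content: element(header); }\n"
          + "  @bottom-center { content: element(footer); }\n"
          + "}\n";

    // Builds a document with the running elements placed first in <body>,
    // so that they are already defined when the first page is laid out.
    static String wrap(String headerHtml, String footerHtml, String bodyHtml) {
        return "<html><head><style>\n" + PAGE_CSS + "</style></head><body>\n"
             + "<div class=\"header\">" + headerHtml + "</div>\n"
             + "<div class=\"footer\">" + footerHtml + "</div>\n"
             + bodyHtml + "\n</body></html>";
    }

    public static void main(String[] args) {
        System.out.println(wrap("Monthly report", "Confidential", "<p>Body</p>"));
    }
}
```

The string returned by wrap would be passed to renderer.setDocumentFromString(...) exactly as in the question. The markup has to be well-formed XHTML for that call to accept it.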

how to place HTML text into OpenOffice document using OpenOffice API

Let's look at this example:
I've got HTML-tagged text:
<font size="100">Example text</font>
I have an *.odt (OpenDocument Text) document where I want to place this HTML text, with formatting that depends on the HTML tags (in this example the font tag should be omitted, and the text Example text should have a 100-point font size in the resulting *.odt file).
I prefer (but this is not a strong requirement) to use the OpenOffice UNO API for Java to achieve that. Is there any way to inject this HTML text into the body of the *.odt document with a simple built-in UNO API HTML-to-ODT converter or something similar? Or do I have to go through the HTML tags in the text manually and then use the OO UNO API to place the text with specific formatting (e.g. font size)?
OK, this is what I've done to achieve this (using the OpenOffice UNO API with Java):
1. Load the odt document where you want to place the HTML text.
2. Go to the place where you want to insert the HTML text.
3. Save the HTML text in a temp file on the system (maybe it is possible without saving, using an http URL, but I haven't tested it).
4. Insert the HTML into the odt by following these instructions and passing the URL of the temp HTML file (remember to convert the system path to an OO path).
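The temp-file step and the path conversion from the list above need nothing beyond the JDK. Here is a rough sketch of just that part. The class and method names are hypothetical, it assumes a file:/// URL is what the office API expects, and the UNO insertion call itself is left out because it needs a running office instance.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class HtmlTempFile {
    // Writes the HTML fragment to a temp file and returns its file URL,
    // e.g. file:///tmp/fragment-123.html on Linux.
    static String writeToTempFileUrl(String html) {
        try {
            Path tmp = Files.createTempFile("fragment-", ".html");
            tmp.toFile().deleteOnExit(); // clean up when the JVM exits
            Files.write(tmp, html.getBytes(StandardCharsets.UTF_8));
            // Path.toUri() yields the file:/// form rather than a raw OS path.
            return tmp.toUri().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
                writeToTempFileUrl("<font size=\"100\">Example text</font>"));
    }
}
```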
Maybe you can use JODConverter, or you can use the XSLT from xhtml2odt.

Printing HTML file out to printer in java

I need to print an HTML file to the printer programmatically. I do not want to print the HTML tags; I want the HTML parsed before it is printed.
This code adds HTML features and data to an HTML document named document. I then send the output to a file named itext.html:
HtmlWriter writer2 = HtmlWriter.getInstance(document,new FileOutputStream("itext.html"));
I now need to somehow parse that HTML file and print it without having to open it in a browser and go to File and Print.
Cobra will render HTML to a Swing-compatible panel. You should be able to print that using the standard Print APIs/services.
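If the HTML is simple, Swing's own JEditorPane may be enough, so here is a sketch that swaps it in for Cobra. JEditorPane only understands fairly old HTML and limited CSS, so this is not a replacement for a real browser engine. The class and method names are made up, and the code assumes a default printer is configured.

```java
import java.awt.print.PrinterException;
import javax.print.PrintService;
import javax.print.PrintServiceLookup;
import javax.swing.JEditorPane;

class HtmlPrinter {
    // Parses the markup into a Swing text component; tags are interpreted,
    // not shown literally.
    static JEditorPane render(String html) {
        JEditorPane pane = new JEditorPane();
        pane.setContentType("text/html");
        pane.setText(html);
        return pane;
    }

    // Sends the rendered HTML to the default printer without any dialog.
    // Returns false if no default printer is available.
    static boolean printToDefaultPrinter(String html) throws PrinterException {
        PrintService service = PrintServiceLookup.lookupDefaultPrintService();
        if (service == null) {
            return false;
        }
        return render(html).print(null, null, false, service, null, false);
    }

    public static void main(String[] args) throws PrinterException {
        boolean printed = printToDefaultPrinter(
                "<html><body><h1>Report</h1><p>Hello</p></body></html>");
        System.out.println(printed ? "Sent to printer" : "Nothing printed");
    }
}
```

To print the file from the question, read itext.html into a String first and pass that to printToDefaultPrinter.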
