UTF-8 emoji problem in PDF for Spring Boot - java

I am using Spring Boot to create and return a PDF. When my string content contains emoji and other Unicode characters, such as "This is d£escript😭ion section😢😤😠😡🤬", they are skipped in the downloaded PDF. Can someone please help me resolve this issue?
My code is below:
ITextRenderer renderer = new ITextRenderer();
ResourceLoaderUserAgent callback = new ResourceLoaderUserAgent(renderer.getOutputDevice());
callback.setSharedContext(renderer.getSharedContext());
renderer.getSharedContext().setUserAgentCallback(callback);
renderer.setDocumentFromString(pdfContent(templateId, pdfData));
renderer.layout();
renderer.createPDF(outputStream);
}
String pdfContent(TemplateId templateId, Map<String, Object> pdfData) throws TemplateException,
IOException {
return FreeMarkerTemplateUtils
.processTemplateIntoString(freemarkerMailConfiguration.getTemplate(templateId.getValue()), pdfData);
}

The problem is that the font you use doesn't contain emojis, so they can't be rendered in the PDF. Unfortunately, I could not find a font that covers all emojis. The best I could find is DejaVu, which covers some of the emojis in your example.
To use it:
Download the DejaVu font (you will find it easily on the internet).
Include it in the rendering process (make sure you match the exact path of the file):
ITextRenderer renderer = new ITextRenderer();
renderer.getFontResolver().addFont("font/dejavu-sans/DejaVuSans.ttf", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
Set the font in the HTML:
<html>
<head>
<meta charset="utf-8" />
<style>
body{font-family:"DejaVu Sans", sans-serif;}
</style>
</head>
<body>
<p>This is descript😭ion section😢😤😠😡🤬.</p>
</body>
</html>
Here is the result in the PDF:
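To find out in advance which characters a given font cannot render, one option is to walk the string by code point and ask the font about each one. The sketch below is not part of the original answer: GlyphCoverage and missingGlyphs are hypothetical names, and the coverage check is passed in as a predicate so the helper does not depend on any particular font library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical helper (not from the original answer).
final class GlyphCoverage {

    // Returns the code points of 'text' that the predicate rejects, formatted as "U+XXXX".
    // Iterating by code point (not by char) keeps emoji outside the BMP in one piece.
    static List<String> missingGlyphs(String text, IntPredicate canDisplay) {
        List<String> missing = new ArrayList<>();
        text.codePoints()
            .filter(cp -> !canDisplay.test(cp))
            .forEach(cp -> missing.add(String.format("U+%04X", cp)));
        return missing;
    }
}
```

With AWT, for example, the predicate could be font::canDisplay on a java.awt.Font loaded with Font.createFont(Font.TRUETYPE_FONT, new File("font/dejavu-sans/DejaVuSans.ttf")). Any code points reported this way would probably be dropped from the PDF unless another registered font supplies them.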

Emoji symbols are problematic as symbols. If we use one font with two styles (upper left), the symbols are not matched well even within that one font: in the upper style one is missing, and in the lower style two look identical.
Converted to PDF (upper middle), they look reasonable as a graphic image on the surface. However, when the text is extracted (upper right), the font styling is lost, and only one glyph is possible for each valid font character.
The lower row, on the left, shows the same text in a modern Notepad. Here the same system font applies the other style, and if we extract those we get
😭😢😤😠😡🤬 as
Thus a font and its style of emoji symbols are generally not well supported by a font system. If we go via HTML the result is much more consistent; however, the text is then no longer text.
The best we might get is a poor hybrid of images of undefined CID characters, which can be confusing because the characters are all the same.
������
������
So if you export the PDF as symbols with an image overlay, there is no visual equivalence.

Related

Java, wkhtmltopdf, HTML to PDF: not all fonts work correctly

I am using wkhtmltopdf to generate a PDF from HTML (a string, not a file).
Before I start creating the PDF, I add all the fonts to the HTML:
htmlTemplate = htmlTemplate.replaceAll("\\$\\{fontsPlaceholder}", ResourcesCache.getInstance().getFontsCSSCache());
All the fonts are embedded in the HTML and look like this:
@font-face {
font-family: 'Abril_Fatface-Regular';
src: url(data:font/ttf;base64,AAEAA....
But when I use font properties such as bold or italic and then generate the PDF, they are not applied correctly: the field uses the 'regular' font in the PDF even though it is set to bold in the HTML.
So why don't all fonts work properly in wkhtmltopdf? Has anyone fixed an issue like this?
I have solved this issue. The problem was with the fonts, not with the wkhtmltopdf library.
For font properties to work correctly, you need to make sure that 'Preferred Family' is set in your font. You can check or set it with the FontForge app.
Open FontForge, import your font, click 'Element/FontInfo/TTF Names', and change 'Preferred Family' (it needs to be unique for each font).

How to display Chinese characters in java web applications?

I use iText 5 to create a PDF file. I followed https://developers.itextpdf.com/examples/itext-action-second-edition/chapter-1 and got a PDF. When I open it, the Chinese characters display normally.
But when I develop a web application as described in https://developers.itextpdf.com/examples/itext-action-second-edition/chapter-9, the Chinese characters are blank when the PDF is shown in the browser.
My font code is
String chFontPath = "c:\\fonts\\xxx.ttf";
BaseFont chBaseFont = BaseFont.createFont(chFontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
Font font = new Font(chBaseFont, 12);
Does anybody know?
If you embed the font using an absolute path, the path will probably break once the web application is deployed. Use a relative path instead for anything you embed (fonts, images, etc.) so you can place the files on the server without any trouble.
I think Bruno's answer about a relative anchor could help you set up a relative path for your font: https://stackoverflow.com/a/27064142/4048864
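One way to avoid file system paths altogether is to ship the font inside the application and read it from the classpath. The sketch below is an assumption-laden illustration, not taken from the answer: FontResources and loadFromClasspath are hypothetical names, and the resource location is whatever you choose when packaging the font.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not from the original answer).
final class FontResources {

    // Reads a classpath resource such as "/fonts/xxx.ttf" fully into memory.
    // Fails with a clear message instead of a NullPointerException when the resource is absent.
    static byte[] loadFromClasspath(String resourcePath) throws IOException {
        try (InputStream in = FontResources.class.getResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new FileNotFoundException("Font not found on classpath: " + resourcePath);
            }
            return in.readAllBytes(); // requires Java 9 or later
        }
    }
}
```

The resulting bytes can then be handed to the iText 5 overload BaseFont.createFont(name, encoding, embedded, cached, ttfAfm, pfb), passing the byte array as ttfAfm and null as pfb, so the code no longer depends on where the application is deployed.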

Adding Header or Footer on every page using ITextRenderer from HTML

I'm creating an HTML report using FreeMarker, and I produce a PDF from that HTML using ITextRenderer.
ITextRenderer renderer = new ITextRenderer();
renderer.setDocumentFromString(html);
renderer.layout();
renderer.createPDF(baosPDF);
I have a table in that html, with a header that successfully shows on every page using css classes:
thead { display:table-header-group }
Is it possible to do the same trick for an arbitrary section of my document (let's say, a div)? I'd like to keep my HTML vanilla and identify the "header" and "footer" I want to see on every page using CSS.
Is it possible with CSS only?
Perhaps you should have a look at
http://developers.itextpdf.com/content/itext-7-examples/converting-html-pdf
It gives a few examples of converting HTML to PDF, including loading an external stylesheet.

Croatian letters in Java Program

I need your help with Croatian letters in my program. On the website (Play Framework) you can enter names. The name is saved and a PDF file is created (with iText) showing the string the user typed in. I want to use the font Lucida Bright. The problem is that the non-German letters in the names are not shown. I also tried converting them to Unicode escapes (\u----), but that doesn't work either. I tried to use UTF-8 like this in the iText doc:
String name = new String(e.getName().getBytes("UTF-8"));
// e is the object where the name and some other infos are saved
and in the html where the user can type in the name
<meta name="language" content="cr">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
but it doesn't work completely.
In Lucida Bright, only Š and š are shown correctly, and in Times New Roman only Š, Ž, š and ž. How can I solve this problem?
If you want to use a font in iText's PDF generation, you have to add it, as in:
Font font = FontFactory.getFont("Times-Roman");
document.add(new Paragraph("Times-Roman", font));
For more information, see the iText documentation.
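The letters that do show up in Times New Roman (Š, Ž, š, ž) are exactly the Croatian letters that exist in the Western Windows code page, which suggests, though the thread does not confirm it, that the font is being used with a Cp1252 encoding. The sketch below checks which characters a given charset can encode; EncodingCoverage and unencodable are hypothetical names, and it assumes the JDK provides the windows-1252 and windows-1250 charsets, as common builds do.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper (not from the original answer).
final class EncodingCoverage {

    // Returns the characters of 'text' that cannot be represented in the named charset.
    static String unencodable(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        StringBuilder missing = new StringBuilder();
        text.codePoints()
            .filter(cp -> !encoder.canEncode(new String(Character.toChars(cp))))
            .forEach(missing::appendCodePoint);
        return missing.toString();
    }
}
```

For the Croatian letters ŠŽšžČčĆćĐđ, windows-1252 leaves ČčĆćĐđ unencodable while windows-1250 covers them all. If that is the cause here, creating the font with a Central European encoding or with BaseFont.IDENTITY_H and a TrueType font that contains these glyphs may be worth trying.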

Issue rendering Basic HTML (without inline CSS styles) using Flying Saucer ITextRenderer

I am very new to Flying Saucer.
I am generating PDFs using the ITextRenderer class in Java.
The problem I am facing is that the HTML I need to convert contains basic HTML tags WITHOUT INLINE CSS STYLES.
For example:
<p><b>hello</b> <i>this</i> is a <u>sample</u>
<font color="#FF6600">text for HTML</font> to pdf <font size="18">gen</font></p>
What I notice is that, in the above HTML, the attributes of the font tag (size, color, etc.) have no effect in the PDF, whereas hardcoded HTML with inline CSS styles works perfectly fine.
But I need the above HTML attributes to work, for several reasons.
Any helpful pointers will be appreciated.
Thanks,
Mangirish
Flying saucer doesn't support attributes on the <font> tag -- you need to use inline styles, like <font style="...">.
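If the HTML cannot be changed at its source, one workaround is to rewrite the attribute into an inline style before handing the markup to the renderer. The following is only a rough sketch under narrow assumptions: FontTagRewriter and colorAttributeToStyle are hypothetical names, and the pattern handles just the simple case where color is the first attribute and is double-quoted. A real converter would more safely use an HTML parser.

```java
import java.util.regex.Pattern;

// Hypothetical pre-processing helper (not part of Flying Saucer).
final class FontTagRewriter {

    // Matches <font color="..."> where color is the first, double-quoted attribute.
    private static final Pattern FONT_COLOR =
        Pattern.compile("<font\\s+color=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Turns <font color="X"> into <font style="color:X">; other tags are left untouched.
    static String colorAttributeToStyle(String html) {
        return FONT_COLOR.matcher(html).replaceAll("<font style=\"color:$1\"");
    }
}
```

The size attribute is deliberately left alone here, because mapping HTML font sizes to CSS values needs a lookup table rather than a direct copy.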
