I am trying to create a PDF file with Java and I need to use Lithuanian letters in the file. I tried writing the text as HTML and parsing it with HTMLWorker, but that works for only some of the letters. If anyone could help me with this, I would appreciate it.
try {
Document document = new Document();
PdfWriter.getInstance(document, new FileOutputStream(pathy));
document.open();
HTMLWorker htmlWorker = new HTMLWorker(document);
String s = ("<html>ĄąĘęŪūČŠ"
+ "ŽčšžĖėĮįŲų</html>");
htmlWorker.parse(new StringReader(s));
document.close();
}
catch (Exception e2) {
e2.printStackTrace(); // report errors instead of swallowing them silently
}
I solved my issue by adding the text directly with an embedded Unicode font. I'm still not sure why it doesn't work with the HTML code...
Document document = new Document();
try {
PdfWriter.getInstance(document, new FileOutputStream(pathy));
document.open();
BaseFont bfComic = BaseFont.createFont("c:\\windows\\fonts\\arial.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
document.add(new Paragraph("ąĄčČęĘėĖįĮšŠųŲūŪžŽ", new Font(bfComic, 12)));
} catch (Exception e2) {
System.err.println(e2.getMessage());
}
document.close();
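A possible explanation for why only some letters survived the HTML route, offered as a sketch rather than a confirmed diagnosis: assuming HTMLWorker falls back to iText's default font with the WinAnsi (Cp1252) encoding, only characters that exist in Cp1252 can be written. The standard-library check below (class and method names are made up for illustration) lists which characters of a string Cp1252 cannot encode. The sample uses Unicode escapes for Ą, Š, Č and Ž.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

final class WinAnsiCoverage {

    /** Returns the characters of text that windows-1252 (Cp1252) cannot encode. */
    static String unencodable(String text) {
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();
        StringBuilder missing = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (!encoder.canEncode(c)) {
                missing.append(c);
            }
        }
        return missing.toString();
    }

    public static void main(String[] args) {
        // A-ogonek, S-caron, C-caron, Z-caron
        String sample = "\u0104\u0160\u010C\u017D";
        System.out.println("Not encodable in Cp1252: " + unencodable(sample));
    }
}
```

Š, š, Ž and ž are part of Cp1252, while the other Lithuanian letters are not, which would match the "works on some" symptom. Using BaseFont.IDENTITY_H with an embedded TrueType font, as in the code above, avoids that limitation.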
Hi, I'm creating a PDF report using iText 5. The report contains non-Latin characters; some of them show up and some do not.
I'm adding a minimal code example to reproduce the issue. I think it may be the font, but the same font works with Apache POI when generating a .doc file.
File file = new File(Util.FONTS, "calibri.ttf");
FontFactory.register(file.getAbsolutePath(), "MY_FONT");
try {
Document document = new Document();
PdfWriter.getInstance(document, new FileOutputStream("Hello.pdf"));
document.open();
document.add(new Paragraph("สวัสดี ภาษาไทย",
FontFactory.getFont("Tahoma", BaseFont.IDENTITY_H, BaseFont.EMBEDDED)));
document.add(new Paragraph("Hello, Καλημέρα, Grüßgott, Привіт",
FontFactory.getFont("MY_FONT", BaseFont.IDENTITY_H, BaseFont.EMBEDDED)));
document.close();
} catch (DocumentException | IOException e1) {
e1.printStackTrace();
}
[Font File use][1]
[1]: https://www.dropbox.com/sh/2ca0cc8nvrviido/AAAAcrHj4DMehZm8no0ImR89a?dl=1
Thank you.
I'm trying to create a PDF document, but I'm not able to create a document with two paragraphs.
It shows only the first one added.
Here is my code to reproduce it:
public void createBillingDocument(List<PDFData> datas) {
datas.forEach(data -> {
try {
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
PdfWriter writer = new PdfWriter(outputStream);
PdfDocument pdf = new PdfDocument(writer);
Document document = new Document(pdf,PageSize.A4);
document.add(new Paragraph("Muh"));
document.add(new Paragraph("Kuh"));
document.close();
pdf.close();
writer.close();
outputStream.close();
fileAccess.storeFile(outputStream.toString(), "test/" + "Name.pdf");
} catch (IOException e) {
throw new RuntimeException(e);
}
});
}
Has anyone had the same problem and found a solution?
Regards
Edit:
One strange thing: if I put a break between them, both paragraphs are shown, each on its own page.
try {
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
PdfWriter writer = new PdfWriter(outputStream);
PdfDocument pdf = new PdfDocument(writer);
Document document = new Document(pdf,PageSize.A4);
document.add(new Paragraph("Muh"));
document.add(new AreaBreak());
document.add(new Paragraph("Kuh"));
document.close();
pdf.close();
writer.close();
outputStream.close();
fileAccess.storeFile(outputStream.toString(), "test/" + "Name.pdf");
} catch (IOException e) {
throw new RuntimeException(e);
}
OK, found it. @mkl was right: the problem was in saving the document as a String. I changed it to a byte array and voilà, it worked :)
Thanks for your time!
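For anyone hitting the same thing, here is a small standard-library sketch of why going through a String is lossy for binary output such as a PDF. The class and method names are made up for illustration. It decodes with UTF-8 explicitly, whereas outputStream.toString() would use the platform default charset, so the exact damage in a real setup may differ.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class BinaryRoundTrip {

    /** Round-trips the bytes through a String, similar to storing outputStream.toString(). */
    static byte[] viaString(byte[] original) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(original, 0, original.length);
        String asText = new String(out.toByteArray(), StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    /** Keeps the data as bytes, as toByteArray() does. */
    static byte[] viaBytes(byte[] original) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(original, 0, original.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // A PDF-like header followed by a comment line with bytes above 127.
        byte[] pdfLike = {'%', 'P', 'D', 'F', '-', '1', '.', '7', '\n',
                '%', (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3, '\n'};
        System.out.println("via String intact: " + Arrays.equals(pdfLike, viaString(pdfLike)));
        System.out.println("via bytes intact:  " + Arrays.equals(pdfLike, viaBytes(pdfLike)));
    }
}
```

Byte sequences that are not valid in the chosen charset get replaced during decoding, so compressed content streams can end up corrupted even though the file still opens.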
I'm executing this code both from Eclipse and on Tomcat in a webapp:
FileInputStream is = new FileInputStream("C:/Users/admin/Desktop/dummy.txt");
try {
FontFactory.register("C:/Workspace/Osmosit/ReportManager/testSvn/ReportManagerCommon/src/main/java/com/osmosit/reportmanager/common/itext/fonts/ARIALUNI.TTF");
} catch (Exception e) {
e.printStackTrace();
}
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream(1024);
Document document = new Document(PageSize.A4);
PdfWriter writer;
writer = PdfWriter.getInstance(document, byteArrayOutputStream);
document.open();
XMLWorkerHelper.getInstance().parseXHtml(writer, document, is);
document.close();
byteArrayOutputStream.close();
FileOutputStream fos = new FileOutputStream("C:/Users/admin/Desktop/prova-web.pdf");
fos.write(byteArrayOutputStream.toByteArray());
fos.close();
dummy.txt is a simple HTML file with Arabic and Latin characters:
<div style="font-family: Arial Unicode MS;" ><p>كما. أي مدن العدّ وقام test latin</p><br /></div>
When I run it under Eclipse I get a correct PDF; when it runs on Tomcat I get this:
كما. أي مدن العدّ وقام test latin
PS: I'm using itextpdf version 5.5.8.
You have an encoding problem. Either you saved dummy.txt using the wrong encoding (e.g. as Latin-1 instead of as UTF-8), or you are reading dummy.txt using the wrong encoding.
See "html to pdf convert, cyrillic characters not displayed properly" and adapt the line in which you use parseXHtml():
XMLWorkerHelper.getInstance().parseXHtml(writer, document,
is, null, Charset.forName("UTF-8"), fontImp);
Take a look at the ParseHtml11 example to find out what fontImp is about.
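To illustrate the encoding point, here is a standard-library sketch (names are made up for illustration) showing what happens when UTF-8 bytes are decoded with a different charset. It also prints the JVM default charset, which can differ between an Eclipse launch and a Tomcat process; that this is what differs in your setup is an assumption, not something I can verify from here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMismatch {

    /** Decodes raw bytes using the given charset. */
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Arabic kaf, meem, alef followed by Latin text
        String original = "\u0643\u0645\u0627 test latin";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("Decoded as UTF-8 matches:   "
                + decode(utf8, StandardCharsets.UTF_8).equals(original));
        System.out.println("Decoded as Latin-1 matches: "
                + decode(utf8, StandardCharsets.ISO_8859_1).equals(original));
    }
}
```

Passing the charset explicitly, as in the parseXHtml() call above, removes the dependency on whatever the default happens to be.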
You are also making another mistake: Arabic is read from right to left, and in your code, you aren't defining the run direction. See "Arabic characters from html content to pdf using iText".
In your case, I would put the Arabic text in a table and I would follow the ParseHtml7 example from the official documentation:
public void createPdf(String file) throws IOException, DocumentException {
// step 1
Document document = new Document();
// step 2
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(file));
// step 3
document.open();
// step 4
// Styles
CSSResolver cssResolver = new StyleAttrCSSResolver();
XMLWorkerFontProvider fontProvider = new XMLWorkerFontProvider(XMLWorkerFontProvider.DONTLOOKFORFONTS);
fontProvider.register("resources/fonts/NotoNaskhArabic-Regular.ttf");
CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
// HTML
HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
// Pipelines
ElementList elements = new ElementList();
ElementHandlerPipeline pdf = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
// XML Worker
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.parse(new FileInputStream(HTML), Charset.forName("UTF-8"));
PdfPTable table = new PdfPTable(1);
PdfPCell cell = new PdfPCell();
cell.setRunDirection(PdfWriter.RUN_DIRECTION_RTL);
for (Element e : elements) {
cell.addElement(e);
}
table.addCell(cell);
document.add(table);
// step 5
document.close();
}
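If the HTML content varies and you only want to apply the right-to-left cell when it is actually needed, one option is to test the text with java.text.Bidi from the standard library. This is a minimal sketch with a made-up class name; it is not part of the ParseHtml7 example.

```java
import java.text.Bidi;

final class RunDirectionCheck {

    /** Returns true if the text contains characters that require bidirectional layout. */
    static boolean needsRtlHandling(String text) {
        char[] chars = text.toCharArray();
        return Bidi.requiresBidi(chars, 0, chars.length);
    }

    public static void main(String[] args) {
        // Arabic kaf, meem, alef followed by Latin text
        System.out.println(needsRtlHandling("\u0643\u0645\u0627 test latin"));
        System.out.println(needsRtlHandling("test latin"));
    }
}
```

When the check returns true, the cell.setRunDirection(PdfWriter.RUN_DIRECTION_RTL) call shown above is the part that matters.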
I was trying to use a font I installed called "Tengwar Quenya-1 Regular", but it didn't work: the PDF document keeps being written with the default font. So I tried to use the downloaded font file and embed it, and it still prints with the default font. I'm wondering if anyone has tried this before and could tell me what I am doing wrong. Here is the code:
public void testePdf(){
Document document = new Document();
String filename = "C:\\Users\\Marcelo\\Downloads\\tengwar_quenya\\QUENCAP1.TFF";
FontFactory.register(filename);
Font fonte = FontFactory.getFont(filename, BaseFont.CP1252, BaseFont.EMBEDDED);
try {
PdfWriter.getInstance(document,
new FileOutputStream(filename+ "HelloWorld.pdf"));
document.open();
document.add(new Paragraph("A Hello World PDF document.", fonte));
document.close(); // no need to close PDFwriter?
} catch (DocumentException e) {
e.printStackTrace();
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
It compiles fine, but the output does not use the font I selected. If it is a glyph instead of a character, will that be a problem?
I have Googled for the font you mention. I have downloaded it and I have made a small SSCCE that you can download here: TengwarQuenya1
This is the code:
public static final String FONT = "resources/fonts/QUENCAP1.TTF";
public void createPdf(String dest) throws IOException, DocumentException {
Document document = new Document();
PdfWriter.getInstance(document, new FileOutputStream(dest));
document.open();
Font f1 = FontFactory.getFont(FONT, BaseFont.WINANSI, BaseFont.EMBEDDED, 12);
document.add(new Paragraph("A Hello World PDF document.", f1));
document.close();
}
This is the result: tengwarquenya1.pdf
I'm not sure what the resulting text means, but it doesn't look like the default font to me.
In other words: I can't reproduce the problem. Note that you don't need to register a font if you pass its file path to the FontFactory. Obviously, my font path is different from yours. I think that yours is wrong. Try putting the ".TTF" file in another location.
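One way to rule out a path problem before handing the file to FontFactory is a quick standard-library check like the sketch below. The class and method names are hypothetical, and the extension list is an assumption based on the font file types I know iText 5 to accept; note that the path in the question ends in ".TFF" rather than ".TTF".

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;

final class FontPathCheck {

    /** Returns true if the path ends with a TrueType/OpenType extension (case-insensitive). */
    static boolean hasFontExtension(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        return lower.endsWith(".ttf") || lower.endsWith(".otf") || lower.endsWith(".ttc");
    }

    /** Returns true if the path has a font extension and points to a readable file. */
    static boolean isUsableFontFile(String path) {
        return hasFontExtension(path) && Files.isReadable(Paths.get(path));
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "resources/fonts/QUENCAP1.TTF";
        System.out.println(path + " usable: " + isUsableFontFile(path));
    }
}
```

If the check fails, fixing the path or extension is worth trying before looking at the encoding or at the font itself.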
I have an HTML file with external CSS. I want to create a PDF from the HTML file, but the encoding doesn't work. The HTML file displays fine, but after conversion to PDF some characters (č, ř, ě, ...) are missing. It happens even if I set the Charset, as in the code below.
How do I solve this, please?
public void createPDF() {
try {
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(username + ID + ".pdf"));
document.open();
String hovinko = username + ID + ".html";
XMLWorkerHelper.getInstance().parseXHtml(writer, document, new FileInputStream(hovinko), Charset.forName("UTF-8"));
document.close();
System.out.println("PDF Created!");
} catch (Exception ex) {
ex.printStackTrace();
}
}
Did you try to convert your special characters before writing them to your PDF?
yourHTMLString.replaceAll(oldChar, newChar);
ć = &#263;
ř = &#345;
ě = &#283;
If you need more special characters, visit this link.
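Instead of replacing characters one by one, the conversion can be done generically. The sketch below (class and method names are made up for illustration) turns every non-ASCII code point into a numeric character reference. Whether the PDF then shows the characters still depends on the font having the glyphs.

```java
final class HtmlEntityEscaper {

    /** Replaces every non-ASCII code point with a numeric character reference such as &#269;. */
    static String toNumericEntities(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 128) {
                sb.appendCodePoint(cp);
            } else {
                sb.append("&#").append(cp).append(';');
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // c-caron, r-caron, e-caron inside a paragraph tag
        System.out.println(toNumericEntities("<p>\u010D\u0159\u011B</p>"));
    }
}
```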
EDIT: Then try this out, it worked for me:
BaseFont basefont = BaseFont.createFont("C:/Windows/Fonts/ARIAL.TTF", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
Font font = new Font(basefont, 12);
document.add(new Paragraph("čřě", font));
Try the logic below. It worked for me:
InputStream is = new ByteArrayInputStream(hovinko.getBytes(Charset.forName("UTF-8")));
XMLWorkerHelper.getInstance().parseXHtml(writer, document, is, Charset.forName("UTF-8"));
I used xmlworker version 5.5.12 and itextpdf version 5.5.12.
I was struggling with the same problem (Polish special characters).
For me, the solution was to specify a suitable font-family in the HTML code.