docx conversion to pdf in korean font - java

Hope someone can help me. It's about docx to pdf conversion having korean sign in docx document.
I'm able to convert a docx document to pdf with docx4j.
In pdf document, I can see the result. But if my docx document contains korean font, I can't see any korean font in my pdf document except the latin numbers.
What do I have to do to get korean font in my pdf from the docx document?
Here is my code:
File docXFile ="E:/contract2Files/test.docx";
WordprocessingMLPackage wordprocessingMLPackage = WordprocessingMLPackage.load(docXFile);
String out = docXFile .replace("docx","pdf");
File pdfFile = new File(out);
OutputStream pdfFileOs = new java.io.FileOutputStream(pdfFile);
org.docx4j.convert.out.pdf.PdfConversion c = new JanoPdfConversion(wordprocessingMLPackage);
c.output(pdfFileOs);

Please try http://www.docx4java.org/docx4j/docx4j-3_0-beta2.zip (link updated 15 Nov)
You might need to configure your font mapper, though things work out of the box with the Identity mapper on my Windows box, since I have the relevant font installed.
If this doesn't help, please put a sample docx somewhere StackOverflow users can see it.

thanks again. I tried the newest jar file with this code passage and it worked!!! Now I get the korean letters into pdf. Thank again.
ThemePart themePart =
wordprocessingMLPackage.getMainDocumentPart().getThemePart();
org.docx4j.dml.BaseStyles.FontScheme fontScheme = themePart.getFontScheme();
org.docx4j.dml.TextFont textFont = fontScheme.getMinorFont().getLatin();
textFont.setTypeface("Malgun Gothic");

Related

PDFBox returns missing descendant font dictionary

when extracting the first page of a PDF I get java.io.IOException: Missing descendant font dictionary.
The extraction code is the following:
PDDocument pdDocument = PDDocument.load(file);
PageExtractor pageExtractor = new PageExtractor(pdDocument, 1, 1);
PDDocument singlePageDocument = pageExtractor.extract();
It only happens with few PDFs and the error points to the Fonts definitions, but I am unclear on how fonts in PDF are processed by Apache PDFBox (using version v2.0.18).
Any tip?
Thanks

PDF Box flatten PDF causes weird spacing

I'm having an issue with PDF box flattening a PDF generated by Adobe Acrobat DC.
The Adobe Acrobat text field I created is absolutely the default text field.
In my example below, I have a PatientName field with the text value "Douglas McDouggelman".
When I flatten the PDF, here's what it looks like:
Anyone know what's up with this bizarre spacing?
It appears that the space + next character are combined. This is what it looks like when you try to select that character.
Code:
try (PDDocument document = PDDocument.load(pdfFormInputStream)) {
PDDocumentCatalog catalog = document.getDocumentCatalog();
PDAcroForm acroForm = catalog.getAcroForm();
acroForm.getField("PatientName").setValue("Douglas McDouggelman");
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
if (flattenPdfs) {
acroForm.flatten();
}
document.save(byteArrayOutputStream);
}
I realized this PDF was from some other group who made it and who knows what they did. So I found the source word document, repeated the creation of the form from Adobe DC, added the fields back to the document, then it was totally fine.
PDF box was not the problem... it was some unknown incorrect step that the person who originally prepared the pdf did.

showing emoji in pdf or excel

I have the data containing emoji in database. I want to display in the generated document such as pdf or in excel format.
I am using spring boot application. Please suggest any java library for generating either PDF or excel which supports emoji.
iText supports this. Assuming
your emoji is a unicode character
you use a font that contains the correct glyph for this unicode character
Best way to test this is to try it.
This is how to get started with iText:
https://developers.itextpdf.com/content/itext-7-jump-start-tutorial/installing-itext-7
And this is a small code-snippet that adds text to a document with different fonts:
PdfDocument pdf = new PdfDocument(new PdfWriter(dest));
Document document = new Document(pdf);
PdfFont font = PdfFontFactory.createFont(FontConstants.TIMES_ROMAN);
PdfFont bold = PdfFontFactory.createFont(FontConstants.TIMES_BOLD);
Text title =
new Text("The Strange Case of Dr. Jekyll and Mr. Hyde").setFont(bold);
Text author = new Text("Robert Louis Stevenson").setFont(font);
Paragraph p = new Paragraph().add(title).add(" by ").add(author);
document.add(p);
document.close();
For more information check out the tutorials.
https://developers.itextpdf.com/content/itext-7-building-blocks/chapter-1

Save russian text to pdf

I try save text to pdf via http://code.google.com/p/droidtext/ . But I have problem with saving russian text. In created pdf I see all latin letters and symbols. But russian letters I don't see. If write text like: "dfыва-:", in pdf I see: "df-:". I use font which have russian letters. If somebody had same problem, help please.
Code:
Document doc = new Document();
PdfWriter.getInstance(doc, new FileOutputStream("/sdcard/test.pdf");
doc.open();
BaseFont times = baseFont.createFont("/sdcard/test/TIMES.TTF", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
doc.add( new Paragraph("dfыва-:", new Font(times, 14) ) );
doc.close();
You need to set proper encoding.
This example will probably help you out.

how to create pdf file which contains marathi font using itext?

I have to create report using itext but the language should be hindi or marathi.
Is it possible to make pdf file which contains marathi font like mangal,shruti,shree-dev...etc if yes plz reply me
Thank you!
Adding a marathi font in itext pdf is :
BaseFont kruti_Dev = BaseFont.createFont("c:/WINDOWS/Font/Kruti_Dev_010.ttf"
,BaseFont.CP1252,BaseFont.EMBEDDED);
Font font = new Font(kruti_Dev, 12, Font.NORMAL);
Try to use Aspose PDF Generator. Make request and response to utf-8.

Categories

Resources