iText PDF: displaying Marathi, Hindi, and other languages on Android

How can I display text in languages such as Marathi or Hindi (or any other language) in a PDF using the iText PDF library on Android and in Java?

You can use any custom font that supports your language, but I recommend Google's Noto fonts, which cover a wide variety of languages. You can get them from https://www.google.com/get/noto/.
Here are font links for some languages:
Marathi & Hindi - https://www.google.com/get/noto/#sans-deva
Telugu - https://www.google.com/get/noto/#sans-telu
Arabic - https://www.google.com/get/noto/#sans-arab
You can search for any language there and download its .ttf file. Put the file in your resources folder. On Android, put it in the assets folder and refer to the font file as assets/filename.ttf.
Here is sample code that sets a Marathi font:
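// iText 7 classes from com.itextpdf.kernel.pdf, com.itextpdf.layout,
// com.itextpdf.layout.element and com.itextpdf.layout.font
File pdfFile = new File("marathi.pdf");
try {
    PdfWriter pdfWriter = new PdfWriter(pdfFile);
    PdfDocument pdfDocument = new PdfDocument(pdfWriter);
    Document document = new Document(pdfDocument);
    // Register the downloaded font file with a font provider.
    // The file name below assumes the Devanagari download from the link above;
    // use the name of the .ttf file you actually placed in assets.
    final FontSet set = new FontSet();
    set.addFont("assets/NotoSansDevanagari-Regular.ttf");
    document.setFontProvider(new FontProvider(set));
    // Must match the family name of the registered font (assumed here).
    document.setProperty(Property.FONT, new String[]{"Noto Sans Devanagari"});
    Paragraph paragraph = new Paragraph("अंतरिक्ष यान से दूर नीचे पृथ्वी शानदार ढंग से जगमगा रही थी ।");
    document.add(paragraph);
    // Closing the Document also closes the underlying PdfDocument and PdfWriter.
    document.close();
} catch (IOException e) {
    e.printStackTrace();
}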
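If a document mixes several languages, one possible way to decide which font files to register is to look at the Unicode script of each character. The sketch below uses only the JDK and is not part of iText; the class name `NotoFontPicker` and the font file names in the map are assumptions for illustration and must be adjusted to the files you actually downloaded.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class NotoFontPicker {
    // Hypothetical script-to-file mapping; the file names are assumptions.
    private static final Map<UnicodeScript, String> FONT_FILES =
            new EnumMap<>(UnicodeScript.class);

    static {
        FONT_FILES.put(UnicodeScript.LATIN, "assets/NotoSans-Regular.ttf");
        FONT_FILES.put(UnicodeScript.DEVANAGARI, "assets/NotoSansDevanagari-Regular.ttf");
        FONT_FILES.put(UnicodeScript.TELUGU, "assets/NotoSansTelugu-Regular.ttf");
        FONT_FILES.put(UnicodeScript.ARABIC, "assets/NotoSansArabic-Regular.ttf");
    }

    /** Returns the font files needed for the scripts that occur in the text. */
    static Set<String> fontsFor(String text) {
        Set<String> fonts = new LinkedHashSet<>();
        text.codePoints().forEach(cp -> {
            // Spaces, digits and punctuation belong to COMMON and are skipped.
            String file = FONT_FILES.get(UnicodeScript.of(cp));
            if (file != null) {
                fonts.add(file);
            }
        });
        return fonts;
    }

    public static void main(String[] args) {
        // Each returned path could then be passed to FontSet.addFont(...).
        for (String file : fontsFor("Hello नमस्ते")) {
            System.out.println(file);
        }
    }
}
```

Each path returned by `fontsFor` could be passed to `set.addFont(...)` from the sample above, so only the fonts a given text needs are registered.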

Related

Non-Latin characters are not showing in iText 5 using Java

Hi, I'm creating a PDF report using iText 5. The report contains non-Latin characters; some characters are shown and some are not.
I'm adding a minimal code example that reproduces the issue. I think the font may be the cause, but the same font works with Apache POI when generating a DOC file.
File file = new File(Util.FONTS, "calibri.ttf");
FontFactory.register(file.getAbsolutePath(), "MY_FONT");
try {
    Document document = new Document();
    PdfWriter.getInstance(document, new FileOutputStream("Hello.pdf"));
    document.open();
    // Thai sample text
    document.add(new Paragraph("สวัสดีชาวโลก",
            FontFactory.getFont("Tahoma", BaseFont.IDENTITY_H, BaseFont.EMBEDDED)));
    document.add(new Paragraph("Hello, Καλημέρα, Grüßgott, Привіт",
            FontFactory.getFont("MY_FONT", BaseFont.IDENTITY_H, BaseFont.EMBEDDED)));
    document.close();
} catch (DocumentException | IOException e1) {
    e1.printStackTrace();
}
Font file used: https://www.dropbox.com/sh/2ca0cc8nvrviido/AAAAcrHj4DMehZm8no0ImR89a?dl=1
Thank you.

How to export a PDF from WebView where I can select text, inside the new PDF

I'm trying to convert a WebView's text content to PDF using the code below.
PdfDocument.Page page = document.StartPage(new PdfDocument.PageInfo.Builder(webpage.Width, webpage.Height, 1).Create());
webpage.Draw(page.Canvas);
The PDF is generated properly, but I can't select text in it. It looks as if the WebView content was converted into an image.
However, if I print the same WebView from the print menu and save it as a PDF, text selection works and the PDF is smaller.
So how can I create a PDF from a WebView in which the text is also selectable?

What you did was draw the web page as an image onto the PDF canvas. That is why the file is larger and you cannot select text: the text is not present as a text object, only as an image.
The reason is that you draw the WebView (and not the web page) onto the PDF canvas. That is just like outputting the view itself as a bitmap to the PDF.
What you might want to do instead is use an XHTML-to-PDF rendering library, such as iText or Flying Saucer, to render the XHTML web page (and not the WebView!) to a PDF.
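XHTML renderers such as Flying Saucer expect well-formed markup, so it can help to check the page source before handing it to a renderer. The following is a minimal sketch using only the JDK's XML parser; the class name `XhtmlCheck` is hypothetical, and the example assumes markup without a DOCTYPE declaration (a DOCTYPE may cause the parser to try to load an external DTD).

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class XhtmlCheck {
    /** Returns true if the markup parses as well-formed XML. */
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException | ParserConfigurationException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<html><body><p>Hello</p></body></html>"));
        // An unclosed <br> is valid HTML but not well-formed XHTML.
        System.out.println(isWellFormed("<html><body><p>Hello<br></p></body></html>"));
    }
}
```

If the check fails, the page source would need to be cleaned up (for example, closing void elements as `<br/>`) before an XHTML renderer can process it.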

You can use the iText library for this. Add this dependency in Gradle:
compile 'com.itextpdf:itextg:5.5.10'
Now convert the WebView's text to PDF like this:
try {
    File mFolder = new File(getExternalFilesDir(null) + "/sample");
    File imgFile = new File(mFolder.getAbsolutePath() + "/Test.pdf");
    if (!mFolder.exists()) {
        mFolder.mkdir();
    }
    if (!imgFile.exists()) {
        imgFile.createNewFile();
    }
    String webviewText = "<html><body>Your webview's text content </body></html>";
    OutputStream file = new FileOutputStream(imgFile);
    Document document = new Document();
    PdfWriter.getInstance(document, file);
    document.open();
    HTMLWorker htmlWorker = new HTMLWorker(document);
    htmlWorker.parse(new StringReader(webviewText));
    document.close();
    file.close();
} catch (Exception e) {
    e.printStackTrace();
}

Write Cyrillic characters into PDF form fields with PDFBox

I am using PDFBox 2.0.5 to fill out the form fields of a PDF document with this code:
doc = PDDocument.load(inputStream);
PDDocumentCatalog catalog = doc.getDocumentCatalog();
PDAcroForm form = catalog.getAcroForm();
for (PDField field : form.getFieldTree()) {
    field.setValue("должен");
}
I get this error: U+0434 ('afii10069') is not available in this font Times-Roman (generic: TimesNewRomanPSMT) encoding: StandardEncoding with differences
The PDF document itself contains Cyrillic text, which is displayed fine. I have tried using different fonts. For "Arial Unicode MS" it wants to download a 50 MB "Adobe Acrobat Reader DC Font Pack". Is this a requirement for Cyrillic characters?
Which font do I have to specify in the text field to handle Cyrillic (or Asian) characters?
Thanks,
Ropo

Adobe handles that by reusing the embedded font file in the {/Ubuntu} font and creates a new font resource from that. Here is a quick hack which can serve as a guide of how to achieve something similar. The code is specific to a sample I've got.
PDDocument doc = PDDocument.load(new File(...));
PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
PDResources formResources = acroForm.getDefaultResources();
PDTrueTypeFont font = (PDTrueTypeFont) formResources.getFont(COSName.getPDFName("Ubuntu"));
// here is the 'magic' to reuse the font as a new font resource
TrueTypeFont ttFont = font.getTrueTypeFont();
PDFont font2 = PDType0Font.load(doc, ttFont, true);
ttFont.close();
formResources.put(COSName.getPDFName("F0"), font2);
PDTextField formField = (PDTextField) acroForm.getField("Text2");
formField.setDefaultAppearance("/F0 0 Tf 0 g");
formField.setValue("öäüинформацию");
doc.save(...);
doc.close();
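As a rough illustration of why the original error appears: a simple font such as Times-Roman is used through a single-byte encoding that has no slot for Cyrillic letters, whereas the Type 0 font loaded above is not limited in that way. The sketch below uses only the JDK, with ISO-8859-1 as a stand-in for such a single-byte Latin encoding (an assumption; it is not identical to the PDF encodings), and the class name `EncodingCheck` is hypothetical.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class EncodingCheck {
    /** Returns the characters of the text that ISO-8859-1 cannot encode. */
    static String unencodable(String text) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder missing = new StringBuilder();
        text.codePoints()
            .filter(cp -> !encoder.canEncode(new String(Character.toChars(cp))))
            .forEach(missing::appendCodePoint);
        return missing.toString();
    }

    public static void main(String[] args) {
        // The umlauts fit into the single-byte encoding; the Cyrillic letters do not.
        System.out.println(unencodable("öäüинформацию"));
    }
}
```

A check like this could be run on a field value before calling `setValue`, to decide whether the field needs a Unicode-capable font resource first.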

The solution was trivial:
form.setNeedAppearances(true);
And then I remove the blue box of the field with:
field.setReadOnly(true);

Convert docx file into PDF with Java

I am looking for a "stable" method to convert DOCX files from MS Word into PDF. Until now I have used OpenOffice installed as a listener, but it often hangs. The problem is that we have situations where many users want to convert SXW and DOCX files into PDF at the same time. Is there some other possibility? I tried the examples from this site: https://angelozerr.wordpress.com/2012/12/06/how-to-convert-docxodt-to-pdfhtml-with-java/ but the output is not good (the converted documents have errors and the layout is noticeably changed).
The document converted with docx4j has some exception text inside it, and the text in the upper right corner is missing.
The PDF created with OpenOffice as the DOCX-to-PDF converter is also missing some text in the upper right corner.
Is there some other option to convert DOCX into PDF with Java?

There are many ways to do the conversion.
One commonly used method uses POI and docx4j:
InputStream is = new FileInputStream(new File("your Docx PAth"));
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(is);
List sections = wordMLPackage.getDocumentModel().getSections();
for (int i = 0; i < sections.size(); i++) {
    wordMLPackage.getDocumentModel().getSections().get(i).getPageDimensions();
}
Mapper fontMapper = new IdentityPlusMapper();
PhysicalFont font = PhysicalFonts.getPhysicalFonts().get("Comic Sans MS"); // set your desired font
fontMapper.getFontMappings().put("Algerian", font);
wordMLPackage.setFontMapper(fontMapper);
PdfSettings pdfSettings = new PdfSettings();
org.docx4j.convert.out.pdf.PdfConversion conversion =
        new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
// To turn off logger
List<Logger> loggers = Collections.<Logger>list(LogManager.getCurrentLoggers());
loggers.add(LogManager.getRootLogger());
for (Logger logger : loggers) {
    logger.setLevel(Level.OFF);
}
OutputStream out = new FileOutputStream(new File("Your OutPut PDF path"));
conversion.output(out, pdfSettings);
System.out.println("DONE!!");
This works well; I have tried it on multiple DOCX files.

How to add text watermark to pdf in Java using Apache PDFBox?

I can't find any tutorial on adding a text watermark to a PDF file. Can you please guide me? I am very new to PDFBox.
It's not a duplicate; the link in the comment didn't help me. I want to add text, not an image, to the PDF.

Here is an example using PDFBox 2.0.2. It loads a PDF and writes some text in the bottom left corner (the page origin) in a red, semi-transparent font. If the PDF has multiple pages, the watermark appears on every page. It might not be production ready, as I am not sure whether some additional null conditions need to be checked, but it should get you going in the right direction.
Keep in mind that this block of code does not modify the original PDF; it creates a new PDF named Tmp_(filename) as the output.
private static void watermarkPDF(File fileStored) throws IOException {
    // Output goes to a sibling file named Tmp_<original name>.
    File tmpPDF = new File(fileStored.getParent(), "Tmp_" + fileStored.getName());
    // try-with-resources closes the document even if an exception is thrown
    try (PDDocument doc = PDDocument.load(fileStored)) {
        String ts = "Some sample text";
        PDFont font = PDType1Font.HELVETICA_BOLD;
        float fontSize = 14.0f;
        for (PDPage page : doc.getPages()) {
            // Append to the existing page content; the stream is closed per page.
            try (PDPageContentStream cs =
                    new PDPageContentStream(doc, page, AppendMode.APPEND, true, true)) {
                // 50% opacity for non-stroking operations, which include text
                PDExtendedGraphicsState r0 = new PDExtendedGraphicsState();
                r0.setNonStrokingAlphaConstant(0.5f);
                cs.setGraphicsStateParameters(r0);
                cs.setNonStrokingColor(255, 0, 0); // red
                cs.beginText();
                cs.setFont(font, fontSize);
                cs.setTextMatrix(Matrix.getTranslateInstance(0f, 0f)); // page origin
                cs.showText(ts);
                cs.endText();
            }
        }
        doc.save(tmpPDF);
    }
}
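The code above positions the text at (0, 0), which is the page origin. To push the watermark toward the right edge instead, the x offset can be derived from the page width and the width of the text. The sketch below is plain arithmetic with no PDFBox dependency; the class name `WatermarkPosition` and the margin parameter are assumptions for illustration. In PDFBox the inputs would typically come from `page.getMediaBox().getWidth()` and `font.getStringWidth(ts)`, the latter being expressed in 1/1000 of a text unit.

```java
final class WatermarkPosition {
    /**
     * Computes the x coordinate at which text must start so that it ends
     * `margin` points before the right edge of the page.
     *
     * @param pageWidth        page width in points
     * @param stringWidthUnits text width in 1/1000 text units
     * @param fontSize         font size in points
     * @param margin           distance to keep from the right edge, in points
     */
    static float rightAlignedX(float pageWidth, float stringWidthUnits,
                               float fontSize, float margin) {
        float textWidth = stringWidthUnits / 1000f * fontSize;
        return pageWidth - margin - textWidth;
    }

    public static void main(String[] args) {
        // Letter-size page (612 pt wide), 14 pt font, 20 pt margin,
        // and a text width of 8000 units (an invented value).
        System.out.println(rightAlignedX(612f, 8000f, 14f, 20f)); // prints 480.0
    }
}
```

The result would replace the first argument of `Matrix.getTranslateInstance(0f, 0f)`; a small positive y value keeps the text off the bottom edge.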
