How to create a Persian PDF with iText - Java

I know many people have asked this question before. I've read almost all of those questions, but none of them helped me solve my problem.
I'm using the iText Java library to generate a Persian PDF (see also: how to use PdfWriter.RUN_DIRECTION_RTL). I'm using the following code:
String ruta = txtruta.getText();
String contenido = txtcontenido.getText();
try {
    FileOutputStream archivo = new FileOutputStream(ruta + ".pdf");
    Document doc = new Document(PageSize.A4, 50, 50, 50, 50);
    PdfWriter.getInstance(doc, archivo);
    doc.open();
    // Embed the Persian font with Unicode (IDENTITY_H) encoding
    BaseFont bfComic = BaseFont.createFont("D:\\Font\\B Lotus.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
    Font font = new Font(bfComic, 12, Font.NORMAL);
    doc.add(new Paragraph(contenido, font));
    doc.close();
    JOptionPane.showMessageDialog(null, "ok");
} catch (Exception e) {
    System.out.println("Error: " + e);
}
Output: [screenshot of the problem]

Document.add() doesn't support RTL text. You'll have to use ColumnText.setRunDirection or PdfPTable.setRunDirection.
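Before routing text through an RTL-aware container, it can help to check whether a string contains right-to-left characters at all. The following is a minimal sketch using only the JDK's java.text.Bidi; the class name RtlCheck and the method needsRtl are hypothetical helpers for illustration and are not part of iText.

```java
import java.text.Bidi;

final class RtlCheck {
    /** Returns true if the text contains characters that need bidirectional (RTL) handling. */
    static boolean needsRtl(String text) {
        char[] chars = text.toCharArray();
        // requiresBidi scans the given range for right-to-left characters
        return Bidi.requiresBidi(chars, 0, chars.length);
    }

    public static void main(String[] args) {
        // Plain Latin text
        System.out.println(needsRtl("Hello"));
        // The Persian word "salam", written with Unicode escapes
        System.out.println(needsRtl("\u0633\u0644\u0627\u0645"));
    }
}
```

When needsRtl returns true, the text is a candidate for the ColumnText or PdfPTable route described above rather than a plain Document.add().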

I haven't worked with the Persian language, but I think your problem is with the font you used (B Lotus.ttf). In most cases, using a registered Unicode font solves the problem. Try again with a different font.
You can also set a text phrase to RTL using the following code.
PdfPCell pdfCell = new PdfPCell(new Phrase(contenido, myUnicodePersianFont));
pdfCell.setRunDirection(PdfWriter.RUN_DIRECTION_RTL);
You can find a similar question here.

I succeeded with the following code:
private void jButton1ActionPerformed(java.awt.event.ActionEvent evt) {
    // Let the user choose where to save the PDF
    JFileChooser dlg = new JFileChooser();
    int option = dlg.showSaveDialog(this);
    if (option == JFileChooser.APPROVE_OPTION) {
        File f = dlg.getSelectedFile();
        txtaddress.setText(f.toString());
    }
}

private void jButton2ActionPerformed(java.awt.event.ActionEvent evt) {
    String ruta = txtaddress.getText();
    String con = content.getText();
    try {
        FileOutputStream archivo = new FileOutputStream(ruta + ".pdf");
        Document doc = new Document(PageSize.A4, 50, 50, 50, 50);
        PdfWriter writer = PdfWriter.getInstance(doc, archivo);
        doc.open();
        // Shape the Arabic-script text before it is added to the document
        LanguageProcessor al = new ArabicLigaturizer();
        writer.setRunDirection(PdfWriter.RUN_DIRECTION_RTL);
        BaseFont bfComic = BaseFont.createFont("D:\\Font\\titr.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
        Font font = new Font(bfComic, 12, Font.NORMAL);
        Paragraph p = new Paragraph(al.process(con), font);
        p.setAlignment(Element.ALIGN_RIGHT);
        doc.add(p);
        doc.close();
        JOptionPane.showMessageDialog(null, "Yes");
    } catch (Exception e) {
        System.out.println("Error: " + e);
    }
}

Related

iText does not extract Shruti text correctly using Java

I want to extract Shruti text from a PDF file and write it to a new PDF. I am using the iText 5.4 library, but it does not extract the text properly. What is the solution?
In the new PDF, iText shows ',', '-', '_' and blanks instead of the Shruti font text.
The code I am using is:
// Extract text from the PDF
try {
    // filePath and password stand in for the "file path" and "password" placeholders of the original post
    PdfReader pdfreader = new PdfReader(filePath, password);
    String iTextContent = PdfTextExtractor.getTextFromPage(pdfreader, 1);
} catch (IOException ex) {
    Logger.getLogger(JFileChooserDemo.class.getName()).log(Level.SEVERE, null, ex);
}
// Write the new PDF file
try {
    Document docNew = new Document();
    PdfWriter writer = PdfWriter.getInstance(docNew, new FileOutputStream("D:\\demo.pdf"));
    docNew.open();
    BaseFont bf = BaseFont.createFont("D:\\DeskTop\\Pdf Box jar\\shruti.ttf", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
    Font f = new Font(bf, 5);
    // newStr is not defined in the posted snippet; presumably it holds the extracted text
    docNew.add(new Paragraph(newStr, f));
    docNew.close();
    writer.close();
} catch (Exception e) {
    e.printStackTrace();
}

Java : BOLD with iText in PDF Generation doesn't work correctly

I use iText to generate a PDF from an XML file whose content is HTML. Everything works except for one small thing.
When a block of text contains a part in bold, the bold doesn't appear in the resulting PDF file. If a complete phrase is in bold, it works fine.
Examples:
<DIV><FONT face='Arial' size='10'><B>The BOLD for this phrase works</B></FONT></DIV>
<DIV><FONT face='Arial' size='10'>The BOLD for <B>this part of the phrase </B> doesn't work</FONT></DIV>
With italic or underline I can run the same test and I don't have the problem; it works.
One more detail: if I combine a <B> tag with a <U> or <I> tag for part of a block of text, it works too.
Example:
<DIV><FONT face='Arial' size='10'>The combination of <B><I>BOLD and something else (U or I)</I></B> works fine.</FONT></DIV>
For context: this is a web app using Struts; the PDF is not saved as a file but sent to the browser as a response. As suggested by an answer, I updated my version of iText from 1.4.8 to 5.5.7.
For the HTML code saved in an XML file, see the examples above.
Here is the Java code (I extracted it from several long methods; I hope I haven't forgotten anything):
ByteArrayOutputStream baoutLettre = new ByteArrayOutputStream();
Document document = new Document();
PdfWriter myWriter = PdfWriter.getInstance(document, baoutLettre);
handleHeaderFooter(request, response, document, Constantes.Type_LETTRE);
document.open();
String lettreContent = FileHelper.readFile("myLetter.xml");
XmlParser.parse(document, new ByteArrayInputStream(lettreContent.getBytes("UTF-8")), getTagMap());
document.close();
ByteArrayOutputStream outTmp = new ByteArrayOutputStream(64000);
PdfCopyMerge pdfCM = new PdfCopyMerge(outTmp);
pdfCM.addDocument(baoutLettre.toByteArray());
pdfCM.close();
ByteArrayOutputStream outPDF = addPageNumber(outTmp.toByteArray(), soc, dicoEdition, request);
outPDF.writeTo(response.getOutputStream());
And for the class PdfCopyMerge :
public class PdfCopyMerge {
    private ByteArrayOutputStream outStream = new ByteArrayOutputStream();
    private Document document = null;
    private PdfCopy writer = null;

    public PdfCopyMerge(ByteArrayOutputStream stream) {
        super();
        outStream = stream;
    }

    public int addDocument(byte[] pdfByteArray) {
        int numberOfPages = 0;
        try {
            PdfReader reader = new PdfReader(pdfByteArray);
            numberOfPages = reader.getNumberOfPages();
            if (this.document == null) {
                // Lazily create the target document, sized like the first source page
                this.document = new Document(reader.getPageSizeWithRotation(1));
                this.writer = new PdfCopy(this.document, this.getOutputStream());
                this.document.open();
            }
            // Copy every page; PDF page numbers are 1-based
            for (int i = 1; i <= numberOfPages; i++) {
                PdfImportedPage page = this.writer.getImportedPage(reader, i);
                this.writer.addPage(page);
            }
            PRAcroForm form = reader.getAcroForm();
            if (form != null) {
                this.writer.copyAcroForm(reader);
            }
        } catch (Exception e) {
            logger.error(e.getMessage(), e);
        }
        return numberOfPages;
    }
    // getOutputStream(), close() and the logger field are not shown in the original post
}
Has anybody faced the same problem? I'd welcome any ideas.
Thanks.
Try the latest version, 5.5.7. Everything works fine.
https://github.com/itext/itextpdf/tags

Java iText using Lithuanian letters

I am trying to create a PDF file with Java and I need to use Lithuanian letters in it. I tried to put them in HTML and parse it with HTMLWorker, but that does not work for most of the letters (it works for some). If anyone could help me with this, I would greatly appreciate it.
try {
    Document document = new Document();
    PdfWriter.getInstance(document, new FileOutputStream(pathy));
    document.open();
    HTMLWorker htmlWorker = new HTMLWorker(document);
    String s = ("<html>ĄąĘęŪūČŠ"
            + "ŽčšžĖėĮįŲų</html>");
    htmlWorker.parse(new StringReader(s));
    document.close();
} catch (Exception e2) {
    // Report failures instead of silently swallowing them
    e2.printStackTrace();
}
I solved my issue by using a Unicode font (IDENTITY_H encoding) directly. I'm still not sure why it doesn't work with the HTML code.
Document document = new Document();
try {
    PdfWriter.getInstance(document, new FileOutputStream(pathy));
    document.open();
    BaseFont bfComic = BaseFont.createFont("c:\\windows\\fonts\\arial.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
    document.add(new Paragraph("ąĄčČęĘėĖįĮšŠųŲūŪžŽ", new Font(bfComic, 12)));
} catch (Exception e2) {
    System.err.println(e2.getMessage());
}
document.close();

PdfWriter doesn't translate special characters

I have an HTML file with an external CSS. I want to create a PDF from the HTML file, but the encoding doesn't work. The HTML file displays fine, but after conversion to PDF some characters (č, ř, ě, ...) are missing. It happens even if I specify the Charset, as in the code below.
How do I solve this, please?
public void createPDF() {
    try {
        Document document = new Document();
        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(username + ID + ".pdf"));
        document.open();
        String hovinko = username + ID + ".html";
        XMLWorkerHelper.getInstance().parseXHtml(writer, document, new FileInputStream(hovinko), Charset.forName("UTF-8"));
        document.close();
        System.out.println("PDF Created!");
    } catch (Exception ex) {
        ex.printStackTrace();
    }
}
Did you try to convert your special characters before writing them to your PDF?
yourHTMLString.replaceAll(oldChar, newChar);
ć = &#263;
ř = &#345;
ě = &#283;
If you need more special characters, visit this link.
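The replacement idea can be sketched without hard-coding each character. This assumes the replacement targets were meant to be HTML numeric character references, which the post does not state. The class HtmlEscaper and its method are hypothetical names for illustration, and whether the HTML-to-PDF parser then renders the referenced glyphs still depends on the font it uses.

```java
final class HtmlEscaper {
    /** Replaces every non-ASCII code point with a decimal numeric character reference. */
    static String toNumericReferences(String html) {
        StringBuilder out = new StringBuilder(html.length());
        html.codePoints().forEach(cp -> {
            if (cp < 128) {
                // ASCII (including markup characters) is copied unchanged
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // c with caron, r with caron, e with caron, written as Unicode escapes
        System.out.println(toNumericReferences("<p>\u010D\u0159\u011B</p>"));
    }
}
```

Because the result is pure ASCII, it no longer depends on the charset used when the HTML is read.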
EDIT: Then try this out, it worked for me:
BaseFont basefont = BaseFont.createFont("C:/Windows/Fonts/ARIAL.TTF", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
Font font = new Font(basefont, 12);
document.add(new Paragraph("čřě", font));
Try the logic below; it worked for me. Note that here hovinko must hold the HTML content itself, not the file name:
InputStream is = new ByteArrayInputStream(hovinko.getBytes(Charset.forName("UTF-8")));
XMLWorkerHelper.getInstance().parseXHtml(writer, document, is, Charset.forName("UTF-8"));
I used xmlworker version 5.5.12 and itextpdf version 5.5.12.
I was struggling with the same problem (Polish special characters).
For me, the solution was to specify a suitable font-family in the HTML code.

Java writing PDF - Font not supported

Below is the code to write PDF using Java.
Code
public class PDFTest {
    public static void main(String args[]) {
        Document document = new Document(PageSize.A4, 50, 50, 50, 50);
        try {
            File file = new File("C://test//itext-test.pdf");
            FileOutputStream fileout = new FileOutputStream(file);
            PdfWriter.getInstance(document, fileout);
            document.addAuthor("Me");
            document.addTitle("My iText Test");
            document.open();
            Chunk chunk = new Chunk("iText Test");
            Paragraph paragraph = new Paragraph();
            String test = "și";
            String test1 = "şi";
            if (test.equalsIgnoreCase(test1)) {
                // System.out.println("equal ignore case true");
                paragraph.add(test + " New Font equal with Old Font");
            } else {
                // System.out.println("equal ignore case X true");
                paragraph.add(test1 + " New Font Not equal with Old Font");
            }
            paragraph.setAlignment(Element.ALIGN_CENTER);
            document.add(paragraph);
            document.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
When I test with Romanian text, I find that "ș" is missing in the created PDF.
The document looks like this: [image]
Any advice or reference links regarding this issue would be highly appreciated.
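The two test strings look alike but are different characters, which is why the comparison takes the else branch. The sketch below assumes the first literal is U+0219 (s with comma below) and the second is U+015F (s with cedilla). The class name CommaVsCedilla is a hypothetical helper and uses only the JDK.

```java
final class CommaVsCedilla {
    /** Formats each code point of the text as U+XXXX, separated by spaces. */
    static String describe(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        String commaBelow = "\u0219i"; // s with comma below, then i
        String cedilla = "\u015Fi";    // s with cedilla, then i
        System.out.println(describe(commaBelow));
        System.out.println(describe(cedilla));
        // The two strings differ by more than case
        System.out.println(commaBelow.equalsIgnoreCase(cedilla));
    }
}
```

A font may contain a glyph for one of these characters and not the other, so it is worth checking which code point the text actually uses.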
**EDITED**
I've used a Unicode example like the one below and the output is still the same: "ș" is still missing.
Code
static String RESULT = "C://test/itext-unicode4.pdf";
static String FONT = "C://Users//PenangIT//Desktop//Arial Unicode.ttf";

public static void main(String args[]) {
    try {
        Document doc = new Document();
        PdfWriter.getInstance(doc, new FileOutputStream(RESULT));
        doc.open();
        BaseFont bf;
        bf = BaseFont.createFont(FONT, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
        doc.add(new Paragraph("Font : " + bf.getPostscriptFontName() + " with encoding: " + bf.getEncoding()));
        doc.add(new Paragraph(" TESTING "));
        doc.add(new Paragraph(" TESTING 1 și "));
        doc.add(new Paragraph(" TESTING 2 şi "));
        doc.add(Chunk.NEWLINE);
        doc.close();
    } catch (Exception ex) {
        // Print the failure instead of swallowing it
        ex.printStackTrace();
    }
}
The output looks like this: [image]
It's the same with the encoding example as well: "ș" is still missing.
Please take a look at this PDF: encoding_example.pdf (*)
It contains all kinds of characters that aren't present in the default font Helvetica (which is the default font you're using as you're not defining any other font).
In the EncodingExample source, we use arialbd.ttf with a specific encoding, resulting in the use of a simple font in the PDF. In the UnicodeExample source, we use IDENTITY_H as encoding, resulting in the use of a composite font in the PDF.
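To see why a simple font with a single-byte encoding can drop a character, one can probe which characters a one-byte charset is able to represent. This sketch uses ISO-8859-1 purely as a stand-in for such an encoding; it is an assumption for illustration, not necessarily the encoding iText applies to Helvetica. EncodingProbe is a hypothetical helper class.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class EncodingProbe {
    /** Returns the characters of text that ISO-8859-1 (one byte per character) cannot represent. */
    static String unencodable(String text) {
        CharsetEncoder latin1 = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder missing = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (!latin1.canEncode(c)) {
                missing.append(c);
            }
        }
        return missing.toString();
    }

    public static void main(String[] args) {
        // "\u0219" is s with comma below; it has no slot in ISO-8859-1
        String missing = unencodable(" TESTING 1 \u0219i ");
        System.out.println("Unencodable characters: " + missing.length());
    }
}
```

A composite font with IDENTITY_H avoids this limit because glyphs are addressed directly instead of through a 256-entry encoding table.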
I've adapted your code, because I see that you didn't understand my answer:
BaseFont bf = BaseFont.createFont(FONT,BaseFont.IDENTITY_H,BaseFont.EMBEDDED);
doc.add(new Paragraph(" TESTING 1 și ", new Font(bf, 12)));
doc.add(new Paragraph(" TESTING 2 \u015Fi ", new Font(bf, 12)));
Do you see the difference? In your code, you create bf, but you aren't using that object anywhere.
(*) Note: pdf.js can't interpret some glyphs because pdf.js doesn't support simple fonts with a special encoding; these glyphs show up correctly in Adobe Reader and the Chrome PDF viewer. If you want to be safe, use composite fonts, because pdf.js can render those glyphs correctly: unicode_example.pdf
