Writing and showing Hebrew letters on pdf - java

I am writing content to a PDF file.
When I write Hebrew letters ("שלום"), the letters don't appear in the PDF.
Maybe it's an encoding issue. Either way, how can I write Hebrew to a PDF file?

It could be an encoding issue, but it is difficult to tell without knowing how you are writing to the PDF file (which library, which encoding, etc.).
Another thing to look at is the fonts embedded in the PDF: by default there wouldn't be any, and you would need to embed a Hebrew font to be used for the Hebrew text. You would also need to ensure that you have the rights to embed and distribute such a font before doing so.
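To rule out the encoding side first, a small check like the one below can help. It is only a sketch using the standard library: the class and method names are made up for illustration, and it says nothing about what your PDF library does internally. It shows that a single-byte Latin charset such as ISO-8859-1 cannot represent Hebrew at all, while UTF-8 can.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any PDF library.
final class HebrewEncodingCheck {
    // The Hebrew word "shalom", written with Unicode escapes so that the
    // encoding of the source file itself cannot corrupt it.
    static final String SHALOM = "\u05E9\u05DC\u05D5\u05DD";

    /** Returns true if every character of text can be encoded in charset. */
    static boolean fitsIn(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    /** Encodes and decodes with the same charset to show what survives. */
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        System.out.println(fitsIn(SHALOM, StandardCharsets.UTF_8));      // true
        System.out.println(fitsIn(SHALOM, StandardCharsets.ISO_8859_1)); // false
        // Unmappable characters are replaced, so this prints "????".
        System.out.println(roundTrip(SHALOM, StandardCharsets.ISO_8859_1));
    }
}
```

If the text survives the round trip in the encoding you actually use and still does not show up in the PDF, the font is the more likely culprit.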

You need a font that contains the Hebrew letters (glyphs), and it has to be embedded in the PDF.

Related

Java DOCX file Viewer

I'm currently developing an application that lets users create a template and generate a DOCX file from it. The application needs to show users the changes to the template as they are creating it.
The approach I tried uses the docx4j library (which allows manipulation of DOCX files) together with ICEpdf, which displays the document in a Swing component after it has been converted to a PDF file. The problem with this approach is that it loads quite slowly, and some of the changes made in the DOCX file are not reflected in the PDF conversion (for example, dashed underlines and font changes). When I open the DOCX output in MS Word, the file is displayed correctly, so I know the changes do occur, but it seems that ICEpdf just can't show them properly.
So I was wondering if anyone knows a Java library that allows DOCX files to be viewed directly in a Swing component instead of converting them to PDF first.
You can try docx4all or DocxEditorKit. Both of these are built around docx4j.

Displaying embedded fonts with PDFBox and Swing

I am using PDFBox to display PDF files inside a JInternalFrame. When opening a PDF, I get lots of warnings like this:
Changing font on <m> from <Tahoma Negrita> to the default font
I am aware that the fonts being reported are not part of the standard set of 14 fonts. So I decided to check whether those fonts are embedded in the PDF file (thinking that there shouldn't be a problem loading embedded fonts, right?).
So I opened the file in different readers and checked properties/fonts. I am not sure whether this section reports fonts required by the document or fonts actually embedded in the document.
The information that I get is as follows:
BAAAA+Tahoma-Bold (embedded Subset), type:TrueType, Encoding:
CAAAA+Tahoma (Embedded Subset), type:TrueType, Encoding:
Confused about this, I researched how to embed fonts from OpenOffice and found that the PDF/A-1a option should be checked. So I made another PDF using this option (in case it was not used when making the original PDF file), yet I got the same results.
I would like your guidance in understanding how this works. I would like to be able to open PDF files just as PDF readers do. I also read about PDFBox_External_Fonts.properties, but I am guessing this file shouldn't be modified since I am dealing with embedded fonts.
Thanks.
PDFBox is not able to parse embedded subsets of TrueType fonts.
As far as I understand it, embedded TrueType subsets are missing some of the font file metadata that PDFBox needs.
The bug is known but not easy to solve. Right now I can only advise using embedded Type 1 fonts if possible; PDFBox can deal with them.
You can also try setting the path to your complete font files in your pdfbox.jar under org/apache/pdfbox/resources/PDFBox_External_Fonts.properties, so that if PDFBox cannot parse the subset, it can at least find a full path to the original font file. Maybe that works, but I have not tested it.
Good Luck!

Problems with special characters in Microsoft Excel

In a Java portlet I'm offering files to download through the serveResource(...) method.
I'm calling
response.getPortletOutputStream().write(byteArray);
This byte array contains some special characters in German, for example Ä, Ü or ö. The file format of the resulting file is csv.
When I open the file in a text editor, the special characters are displayed correctly.
However, when I open it in Microsoft Excel, they're displayed as ü or ß.
Do you have any ideas of what could be the cause of this problem?
Notepad++ displays the file's encoding as "ANSI as UTF-8".
This might help you: Microsoft Excel mangles Diacritics in .csv files?
Basically, you'd need to add a byte order mark (BOM) to your CSV file.
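A minimal sketch of that idea, assuming the CSV content is available as a String before it is turned into the byte array. The class name, method name and sample row are made up for illustration; only the three BOM bytes (EF BB BF) are fixed by UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are for illustration only.
final class CsvWithBom {
    // The UTF-8 byte order mark: the three bytes EF BB BF.
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Returns the CSV text encoded as UTF-8, preceded by a BOM. */
    static byte[] toCsvBytes(String csv) {
        byte[] body = csv.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[UTF8_BOM.length + body.length];
        System.arraycopy(UTF8_BOM, 0, out, 0, UTF8_BOM.length);
        System.arraycopy(body, 0, out, UTF8_BOM.length, body.length);
        return out;
    }

    public static void main(String[] args) {
        // Sample row with umlauts, written as Unicode escapes.
        byte[] bytes = toCsvBytes("Name;Ort\nJ\u00FCrgen;K\u00F6ln\n");
        System.out.printf("%02X %02X %02X%n", bytes[0], bytes[1], bytes[2]);
    }
}
```

In the portlet, the result would then be written the same way as before, for example `response.getPortletOutputStream().write(CsvWithBom.toCsvBytes(csvText));`, where `csvText` stands for whatever String your code builds.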

International characters with Java

I am building an app that takes information from Java and builds an Excel spreadsheet. Some of the information contains international characters. My issue is that Russian characters, for example, are rendered correctly in Java, but when I send them to Excel they are not rendered properly. I initially thought this was an encoding problem, but I am now thinking that the problem is simply not having the Russian language pack loaded on my Windows 7 machine.
I need to know if there is a way for a Java application to "force" Excel to show international characters.
Thanks
Check the file encoding you're using if characters don't show up. Java defaults to the platform's native encoding (windows-1252 on many Windows installations) instead of UTF-8. You can explicitly set your writers to use UTF-8 or a Cyrillic encoding.
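Setting the writer's encoding explicitly could look roughly like this. It is a sketch that assumes the data is written out as text (for example a CSV file that Excel opens later); the class name, method name and sample word are invented for the example. On JDK 18 and later the default charset is already UTF-8, but passing it explicitly keeps the behavior the same on older runtimes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are for illustration only.
final class Utf8WriterDemo {
    /** Writes text to out as UTF-8, regardless of the platform default charset. */
    static void writeUtf8(OutputStream out, String text) {
        try {
            Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            writer.write(text);
            writer.flush(); // flush instead of close: the caller owns the stream
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // The Russian word "privet" in Cyrillic, written as Unicode escapes.
        writeUtf8(buffer, "\u041F\u0440\u0438\u0432\u0435\u0442");
        // Each of the six Cyrillic letters takes two bytes in UTF-8.
        System.out.println(buffer.size()); // prints 12
    }
}
```

The same constructor works with a `FileOutputStream` or a servlet output stream. Whether Excel then shows the characters still depends on Excel detecting the encoding and on a font with Cyrillic glyphs being available.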

Exporting a JasperReport to PDF, Characters Missing

I have a Java application that is generating JasperReports. It will create as many as three JasperPrints from a single report: one prints on the printer, one is serialized and saved to the database, and the third is exported to PDF using Jasper's built-in export capability.
The problem is that when exporting to PDF, characters containing 8 or more bits (i.e. not 7-bit ASCII) are showing up as empty squares, meaning Acrobat Reader is not able to display that character. The print version is correct, and loading the database version and printing it shows up correctly. If I change the PDF exported version to a different format, e.g. XML, the character shows up fine in a web browser.
Based on the evidence, I believe the issue is something specific to font handling in PDFs, but I am not sure what.
The font used is Lucida Sans Typewriter, a Unicode monospaced font. The Windows "font" directory is listed in the Java classpath: without this step, PDF exporting fails miserably with zero text at all, so I know it is finding the font.
The specific characters not displayed are the accented characters used in Spanish text: á, é, í, ó, and ú. I haven't checked ñ, but I am guessing that won't work either.
Any ideas what the problem is, areas of the system to check, or maybe parameters I need to send to the export process?
The PDF encoding used for exporting was UTF-8, and apparently the font didn't support that properly. When I changed it to ISO-8859-1, every character showed up correctly in the PDF output.
In iReport, try setting the Pdf Embedded property of your TextFields to true.
I'm using JasperReports 6. My team spent a few days trying to display Khmer Unicode text. I finally found a solution, and everything works as expected.
Follow this guide: https://community.jaspersoft.com/wiki/custom-font-font-extension
After exporting, upload the resulting JAR file to the lib folder and restart your Jasper server.
