I wrote a program that reads a file containing Arabic text encoded as ANSI.
I made a runnable jar of that program.
It runs perfectly on my laptop; however, when I run it on another laptop, the Arabic characters turn into garbled symbols.
What should I do?
Make sure the target system has the fonts required to display those letters; if it does not, bundle them with your application.
Check whether you are reading the file content as UTF-8 (or whichever encoding the file actually uses).
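A minimal sketch of reading the file with an explicit charset rather than the platform default, which is the usual reason a jar behaves differently on two machines. The class and method names are made up for illustration, and windows-1256 is only an assumption about what "ANSI" means for Arabic text saved on Windows; check the actual file before relying on it.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReadWithCharset {
    // Decodes the whole file with the charset the caller names,
    // so the result does not depend on the JVM's default encoding.
    static String readAll(Path file, Charset charset) throws IOException {
        return new String(Files.readAllBytes(file), charset);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the "ANSI" Arabic file was saved as windows-1256.
        Charset assumedCharset = Charset.forName("windows-1256");
        System.out.println(readAll(Paths.get(args[0]), assumedCharset));
    }
}
```

If the text is still wrong after decoding with the right charset, the remaining suspects are the font and the encoding of whatever displays the output.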
Related
I am writing a java application on Ubuntu Linux that reads in a text file and creates an xml file from the data. Some of the text contains curly apostrophes and quotes that I convert to straight apostrophes and quotes using the following code:
dataLine = dataLine.replaceAll( "[\u2018|\u2019]", "\u0027" ).replaceAll( "[\u201C|\u201D]", "\u005c\u0022" );
This works fine, but when I port the jar file to a Mac OSX machine, I get three question marks where I should get straight apostrophes and quotes. I created a test application on the Mac using the same line of code to do the conversion and the same test file for input and it worked fine. Why doesn't the jar file created on the Linux machine work correctly on a Mac? I thought java was supposed to be cross platform compatible.
Chances are you're not reading the file correctly to start with. You haven't shown how you're reading the file, but my guess is that you're just using FileReader, or an InputStreamReader without specifying the encoding. In that case, the default platform encoding is used - and if that's not the actual encoding of the file, you won't be reading the right characters. You should be able to detect that without doing any replacement at all.
Instead, you should use a FileInputStream and wrap it in an InputStreamReader with the correct encoding - which is likely to be UTF-8 as it's XML. (You should be able to check this easily.)
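Something along these lines, as a sketch only: it assumes the input really is UTF-8, and the class and method names are invented for the example. The `straighten` helper is the replacement from the question with the `|` removed, because inside a character class `|` is a literal pipe rather than alternation.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class Utf8LineReader {
    // Reads all lines, decoding the bytes as UTF-8 regardless of
    // the platform default encoding.
    static List<String> readLines(String path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Converts curly apostrophes and quotes to their straight forms.
    static String straighten(String s) {
        return s.replaceAll("[\u2018\u2019]", "'")
                .replaceAll("[\u201C\u201D]", "\"");
    }
}
```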
I am building an app that takes information from Java and builds an Excel spreadsheet. Some of the information contains international characters. I am having an issue where Russian characters, for example, are rendered correctly in Java, but when I send those characters to Excel, they are not rendered properly. I initially thought the issue was an encoding problem, but I am now thinking that the problem is simply not having the Russian language pack loaded on my Windows 7 machine.
I need to know if there is a way for a Java application to "force" Excel to show international characters.
Thanks
If characters don't show up, check the file encoding you're using. Before JDK 18, Java defaults to the platform's native encoding (Windows-1252 on a Western-locale Windows machine) instead of UTF-8. You can explicitly set the writers to use UTF-8 or a Cyrillic encoding.
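For the writer side, a sketch of setting the encoding explicitly might look like this. The class and method names are hypothetical, and the question does not say which file format is being handed to Excel, so this only covers the "which bytes does Java write" half of the problem.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExplicitWriter {
    // Writes text using the charset the caller names instead of
    // whatever the platform default happens to be.
    static void write(Path file, String text, Charset charset) throws IOException {
        try (Writer writer = new OutputStreamWriter(Files.newOutputStream(file), charset)) {
            writer.write(text);
        }
    }
}
```

Calling `ExplicitWriter.write(path, text, StandardCharsets.UTF_8)` gives the same bytes on every machine; whether Excel then interprets them as UTF-8 depends on how the file is opened.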
I have a Java application that is generating JasperReports. It will create as many as three JasperPrints from a single report: one prints on the printer, one is serialized and saved to the database, and the third is exported to PDF using Jasper's built-in export capability.
The problem is that when exporting to PDF, characters containing 8 or more bits (i.e. not 7-bit ASCII) are showing up as empty squares, meaning Acrobat Reader is not able to display that character. The print version is correct, and loading the database version and printing it shows up correctly. If I change the PDF exported version to a different format, e.g. XML, the character shows up fine in a web browser.
Based on the evidence, I believe the issue is something specific to font handling in PDFs, but I am not sure what.
The font used is Lucida Sans Typewriter, a Unicode monospaced font. The Windows "font" directory is listed in the Java classpath: without this step, PDF exporting fails miserably with zero text at all, so I know it is finding the font.
The specific characters not displayed are accented characters used in Spanish text: á, é, í, ó, and ú. I haven't checked ñ but I am guessing that won't work either.
Any ideas what the problem is, areas of the system to check, or maybe parameters I need to send to the export process?
The PDF encoding used for exporting was UTF-8, and apparently the font didn't support that properly. When I changed it to ISO-8859-1, every character showed up correctly in the PDF output.
In iReport, try setting the Pdf Embedded property of your TextFields to true.
I'm using JasperReports 6. My team spent a few days trying to display Khmer Unicode. I finally found a solution, and everything works as expected.
Follow this: https://community.jaspersoft.com/wiki/custom-font-font-extension
After you export the font extension, upload the jar file to the lib folder and restart your Jasper server.
I have a Java project that connects to a C# program that prints Turkish words. Printing Turkish characters in C# using console is not causing any problems. However, the main issue is that when this C# program is called from Java, the Turkish characters are printed weirdly.
What I would like to do is to get the output printed on console and reprint it using Java GUI without having any problems with Turkish characters.
I really appreciate any kind of help.
Many thanks in advance
The issue is likely to be that the C# application is encoding its character data in one encoding while the Java application is decoding the data as another. Assuming Windows, it is possibly an ANSI/OEM mismatch.
You need to identify the encoding the C# application is emitting. In the Java application, read each byte and check its hex value. Check to see if the bytes are Windows-1254, OEM-857 or whatever and then decode them appropriately using a reader with the appropriate encoding.
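A rough sketch of that inspection step: capture the child process's raw bytes, print them as hex, then try a few candidate charsets. The class name and the way the executable path is passed in are made up for the example, and the candidate list is just the encodings mentioned above plus UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

public class ProcessOutputDecoder {
    // Drains the stream into a byte array without decoding anything yet.
    static byte[] readAllBytes(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    // Formats bytes as space-separated two-digit hex values.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical: args[0] is the path to the C# executable.
        Process process = new ProcessBuilder(args[0]).redirectErrorStream(true).start();
        byte[] raw = readAllBytes(process.getInputStream());
        System.out.println(toHex(raw));
        // Decode with each candidate and see which one reads correctly.
        // Hint: the letter s-cedilla is 0xFE in windows-1254 and 0x9F in IBM857.
        for (String name : new String[] {"windows-1254", "IBM857", "UTF-8"}) {
            System.out.println(name + ": " + new String(raw, Charset.forName(name)));
        }
    }
}
```

Once the right charset is known, wrap the process's input stream in an InputStreamReader with that charset and pass the decoded strings to the GUI.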
I'm experimenting with internationalization by making a Hello World program that uses properties files + ResourceBundle to get different strings.
Specifically, I have a file "messages_en_US.properties" that stores "hello.world=Hello World!", which works fine of course.
I then have a file "messages_ja_JP.properties" which I've tried all sorts of things with, but it always appears as some type of garbled string when printed to the console or in Swing. The problem is obviously with the reading of the content into a Java string, as a Java string in Japanese typed directly into the source can print fine.
Things I've tried:
The .properties file in UTF-8 encoding with the Japanese string as-is for the value. Something I read indicates that Java expects a properties file to be in the native encoding of the system...? It didn't work either way.
The file in default encoding (ISO-8859-1) and the value stored as escaped Unicode created by the native2ascii program included with Java. Tried with a source file in various Japanese encodings... SHIFT-JIS, EUC-JP, ISO-2022-JP.
Edit:
I actually figured this out while I was typing this, but I figured I'd post it anyway and answer it in case it helps anyone.
I realized that native2ascii was assuming (surprise) that it was converting from my operating system's default encoding each time, and as such not producing the correct escaped Unicode string.
Running native2ascii with the "-encoding encoding_name" option where encoding_name was the name of the source file's encoding (SHIFT-JIS in this case) produced the correct result and everything works fine.
Ant also has a native2ascii task that runs native2ascii on a set of input files and sends output files wherever you want, so I was able to add a builder that does that in Eclipse so that my source folder has the strings in their original encoding for easy editing and building automatically puts converted files of the same name in the output folder.
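To show why the source encoding matters here, this is a rough Java sketch of the kind of conversion native2ascii performs. It is not the tool's actual implementation, and the class and method names are invented. The bytes are decoded with a named charset first, and only then is each non-ASCII character written as a \uXXXX escape, so naming the wrong charset yields well-formed escapes for the wrong characters.

```java
import java.nio.charset.Charset;

public class EscapeSketch {
    // Decodes the bytes with the given source charset, then replaces every
    // non-ASCII UTF-16 code unit with a \uXXXX escape.
    static String toAsciiEscapes(byte[] source, Charset sourceCharset) {
        String text = new String(source, sourceCharset);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }
}
```

For a Shift-JIS source file you would pass `Charset.forName("Shift_JIS")`, assuming that charset is available in your runtime.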
As of JDK 1.6, Properties has a load() method that accepts a Reader. That means you can save all the property files as UTF-8 and read them all directly by passing an InputStreamReader to load(). I think that's the most elegant solution, but it requires your app to run on a Java 6 runtime.
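A minimal sketch of that approach, with a made-up class name. Note that `StandardCharsets` only arrived in Java 7; on a Java 6 runtime you would pass the charset name "UTF-8" to the InputStreamReader constructor instead.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class Utf8Properties {
    // Loads a properties stream as UTF-8 by going through a Reader,
    // bypassing the ISO-8859-1 decoding of load(InputStream).
    static Properties load(InputStream in) throws IOException {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            props.load(reader);
        }
        return props;
    }
}
```

This reads a plain Properties object; to get the same effect through ResourceBundle you would need something like the PropertyResourceBundle constructor that takes a Reader, which was also added in Java 6.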
Historically, load() only accepted an InputStream, and the stream was decoded as ISO-8859-1. Not the system default encoding, always ISO-8859-1. That's important, because it makes a certain hack possible. Say your property file is stored as UTF-8. After you retrieve a property, you can re-encode it as ISO-8859-1 and decode it again as UTF-8, like this:
String realProp = new String(prop.getBytes("ISO-8859-1"), "UTF-8");
It's ugly and fragile, but it does work. But I think the best solution, at least for the next few years, is the one you found: bulk-convert the files with native2ascii using a build tool like Ant.
An alternative way to handle the properties files is:
http://www.unipad.org/main/
This is an editor that can read and write files in the \u Unicode escape format, which is the format native2ascii creates.
I don't know how well it works with Japanese; I've used it for Hungarian.