Encoding issue in my web application - Java

I am facing an encoding issue in my web application.
I am using a properties file (resource bundle) to store language text.
If I check the encoding of my properties file in Notepad, it is UTF-8, and I see the proper Arabic characters when I open it there:
LOGIN=دخول
When I build my application using JDeveloper, the Arabic characters in the properties file under the classes folder are converted like this:
LOGIN=\u062f\u062e\u0648\u0644
Notepad also shows the encoding of this file as ANSI.
Surprisingly, the characters appear perfectly fine in the browser (دخول).
Now, when I build my application using Ant, I have a copy task that copies this properties file from the src folder to the classes folder.
After running the build script, the properties file under the classes folder is still UTF-8 and the characters are still in Arabic.
However, the characters don't appear properly in the browser.
As far as I know, UTF-8 is supposed to cater for all languages, but in my case something is wrong somewhere.
I also tried the following attributes on the copy task:
encoding="UTF-8" outputencoding="UTF-8"
However, still no luck.
Does anyone know where I am going wrong?
Thanks.

Well, the comment and link provided by Edwin did help.
I moved my translations to XML bundles (also called XLF or XLIFF bundles in ADF) and now everything is working fine.
Thanks.

Right-click the properties file > Preferences > Environment > Encoding > UTF-8
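The difference between the two builds can be reproduced with plain java.util.Properties. The sketch below is only an illustration, assuming the bundle is ultimately read through Properties.load(InputStream), which decodes bytes as ISO-8859-1 and understands backslash-u escapes. That would explain why the escaped file produced by JDeveloper renders correctly while the raw UTF-8 copy produced by Ant does not. The class and method names are hypothetical. Note that since Java 9, PropertyResourceBundle tries UTF-8 first, so this applies mainly to older runtimes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEncodingDemo {

    // Properties.load(InputStream) always decodes the bytes as ISO-8859-1.
    static String loadFromStream(byte[] fileBytes) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("LOGIN");
    }

    // Properties.load(Reader) lets the caller choose the charset explicitly.
    static String loadFromUtf8Reader(byte[] fileBytes) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("LOGIN");
    }

    public static void main(String[] args) {
        // The Arabic word from the question, written with escapes so this
        // source file compiles the same way under any javac encoding.
        String arabic = "\u062f\u062e\u0648\u0644";

        // What the Ant copy produces: raw UTF-8 bytes.
        byte[] rawUtf8 = ("LOGIN=" + arabic + "\n").getBytes(StandardCharsets.UTF_8);
        // What the JDeveloper build produces: pure ASCII with escapes.
        byte[] escaped = "LOGIN=\\u062f\\u062e\\u0648\\u0644\n"
                .getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(loadFromStream(rawUtf8).equals(arabic));     // false
        System.out.println(loadFromUtf8Reader(rawUtf8).equals(arabic)); // true
        System.out.println(loadFromStream(escaped).equals(arabic));     // true
    }
}
```

On such runtimes, either escape the file during the build (the JDK's native2ascii tool and Ant's native2ascii task exist for this) or make sure the code that loads the bundle reads it through a UTF-8 Reader.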

Related

Tomcat restart or redeploy breaks unicode characters from java source files

I have developed a Liferay 6.2 application using JSF and PrimeFaces 4. I have Unicode characters both in XHTML files and in Java source files. There is a strange behaviour: my characters break after a Tomcat restart or a redeploy of the application, and the problem affects only the characters coming from the source files. The other Unicode characters on the page are displayed correctly. The behaviour is also not always reproducible.
I have read posts about setting the JVM's or Tomcat's default encoding. The main suggestion was to set -Dfile.encoding=UTF-8, but that didn't help.
I am using Tomcat 7.0.42.
If you are using Eclipse, try setting the text encoding in your project under Properties > Resource.
The problem was how the source files were decoded when the class files were compiled. The solution was to set the correct encoding for javac. I finally discovered that in Eclipse I had to edit the build.user.properties file to set javac.encoding = UTF-8
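To see why the compiler's source encoding matters, here is a minimal sketch of what happens when the bytes of a string literal saved as UTF-8 are decoded with a different charset. This is the mismatch that javac's -encoding option avoids. The class name, the Greek sample text and the use of ISO-8859-1 as the "wrong" charset are assumptions for illustration, not details from the question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SourceEncodingDemo {

    // Decodes the bytes of a source file the way a compiler would
    // if it assumed the given charset.
    static String decodeAs(byte[] sourceBytes, Charset assumed) {
        return new String(sourceBytes, assumed);
    }

    public static void main(String[] args) {
        // A Greek word standing in for a literal in a .java file.
        String literal = "\u03ba\u03b1\u03bb\u03b7\u03bc\u03ad\u03c1\u03b1";

        // The editor saved the file as UTF-8.
        byte[] onDisk = literal.getBytes(StandardCharsets.UTF_8);

        // Compiler told the right encoding: the literal survives.
        System.out.println(decodeAs(onDisk, StandardCharsets.UTF_8).equals(literal));      // true

        // Compiler falling back to a single-byte default: the literal is
        // baked into the class file as mojibake.
        System.out.println(decodeAs(onDisk, StandardCharsets.ISO_8859_1).equals(literal)); // false
    }
}
```

Because the wrong decoding happens at compile time, runtime settings such as -Dfile.encoding on Tomcat cannot repair strings that were already compiled incorrectly.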

Can source code files with different encoding coexist in (the same) Java (project in Eclipse)?

I know Java uses UTF-16 internally and expects .properties files to be in ISO-8859-1 by default.
I'm currently working on a project that was written in Eclipse, whose default encoding on our systems is cp-1252. I think UTF-8 would be a much more sensible option going forward.
However, given the scale of the project (it's split up into modules and uses libraries from all over the place), I can't just batch-convert all source code files in one go.
Will Java have a problem with some files in a project being in one encoding and some in another? (Clearly, having entire libraries written in encodings that are different from one another doesn't seem to be a problem - probably because they are all UTF-16 once compiled, anyway.)
Would Eclipse be able to handle that (i.e. different encodings per file) correctly?
Yes you can.
You can choose the default encoding for the whole Eclipse project:
Right-click the project and open Properties.
Go to the Resource section.
Under Text file encoding, select Other and choose UTF-8 (or whatever you want) from the combo box.
You can also change the encoding for a particular file:
Right-click the file and open Properties.
Go to the Resource section.
Under Text file encoding, select Other and choose UTF-8 (or whatever you want) from the combo box.
Preferences are stored in the hidden .settings folder in your project. File encoding preferences are stored in .settings/org.eclipse.core.resources.prefs.
These preferences can be committed to your favorite source control system and shared with other developers.

Is there a standard character encoding for WAR files?

Suppose I create a text file on an operating system that uses the Latin code page ISO/IEC 8859-1. If I then package the text file in a .war file using the Java jar tool, will it be stored with the same character encoding it had on the source operating system, or with some standard encoding such as UTF-8?
The character set encoding for JAR/WAR/EAR is UTF-8. Note, however, that this applies only to the entry names, not to the file contents, e.g. class file data.
A WAR file is basically a ZIP archive with a .war extension, and it has nothing to do with encodings.
It seems that jar takes the bytes from the text file and stores exactly those bytes in the jar/war file without storing any encoding information. This is gleaned from comments on the question as well as the other answers. The answers do not state this clearly, so I am answering my own question. Please correct me if I am wrong.
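The byte-for-byte behaviour described above can be checked with a small in-memory round trip. This sketch uses java.util.zip directly rather than the jar tool, on the assumption that both store entry contents the same way. The class name, entry name and sample text are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ArchiveBytesDemo {

    // Writes one entry into an in-memory ZIP archive and reads it back.
    static byte[] roundTrip(String entryName, byte[] content) {
        try {
            ByteArrayOutputStream archive = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(archive)) {
                zip.putNextEntry(new ZipEntry(entryName));
                zip.write(content);
                zip.closeEntry();
            }

            ByteArrayOutputStream restored = new ByteArrayOutputStream();
            try (ZipInputStream zip = new ZipInputStream(
                    new ByteArrayInputStream(archive.toByteArray()))) {
                zip.getNextEntry();
                byte[] buffer = new byte[4096];
                int read;
                while ((read = zip.read(buffer)) != -1) {
                    restored.write(buffer, 0, read);
                }
            }
            return restored.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "café" saved as ISO-8859-1 is 4 bytes; as UTF-8 it would be 5.
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        byte[] restored = roundTrip("WEB-INF/notes.txt", latin1);

        System.out.println(Arrays.equals(latin1, restored)); // true
        System.out.println(restored.length);                 // 4
    }
}
```

Whoever reads the file from the archive therefore has to know, or be told, which encoding it was written in.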

Pootle - issues exporting properties files

I am using Pootle for localization.
I imported the source translations from Java properties files.
The properties file looks like this:
STR_TEXT1 = Hello
Then I imported the other language files and paid translators to translate them.
Now I need to export those translations from Pootle back to Java properties files. The problem is that, seemingly at random, it sometimes exports them as UTF-8 and other times with \uXXXX escapes, and there is no way to set the encoding for the exported files. The second problem is that the exported files are corrupted: most rows come out fine, but some rows are cut off, like this:
STR_TEXT1 = HELLSTR_TEXT2 = bye
Then I accidentally deleted the properties files from the /po/my_project directory. After that, exports stopped working, but all translations are still visible in the Pootle web interface. So I suppose those translations are saved in some other files, maybe .mo files. Is there a way to get those translations into Java properties files? How can I force Pootle to replace the original texts in the original properties files with the fresh texts from Pootle?
Pootle should export your files as Latin-1, not UTF-8; it escapes non-Latin characters using the \uXXXX syntax. Newer versions of Pootle allow you to export in UTF-8.
The best approach is to attach your source and translated files to a bug over at bugs.locamotion.org so that the Pootle developers can look at your source files.
The translations are kept in a database. Pootle uses your template file, usually en.properties, to create the translated versions. I haven't tested what happens if you delete this template file.
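For comparison, this is what the Latin-1 plus escapes format looks like when Java itself writes a properties file. The sketch uses java.util.Properties.store, not Pootle's exporter, so it only shows the target format the answer refers to. The class name and the sample value are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class EscapedExportDemo {

    // Serializes a single key/value pair the way Properties.store(OutputStream)
    // does: ISO-8859-1 bytes, with characters outside that range written as
    // backslash-u escapes.
    static String export(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // An Arabic sample value standing in for a translated string.
        String exported = export("STR_TEXT1", "\u062f\u062e\u0648\u0644");

        // store() also writes a timestamp comment line; skip it here.
        for (String line : exported.split("\\R")) {
            if (!line.startsWith("#")) {
                System.out.println(line);
            }
        }
    }
}
```

A file in this form is pure ASCII, so it loads identically whichever encoding the consuming tool assumes.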

Strange file name character encoding problem with Ubuntu / Java / Glassfish

I have a Java application deployed on a Glassfish web server on an Ubuntu Server Edition PC.
One of the services this application has to provide is to mount an ISO image in a specific folder and copy all the contents of that folder to another destination.
When my Java method encountered a Cyrillic file name, it crashed. The file name appears as "???????????????.txt" in the server application logs.
At first I thought this was a Linux problem, because the file name also appeared incorrectly in the terminal. After I added the CP1251 locale, the problem in the Linux terminal was solved, but my application still threw an error.
Someone at UbuntuForums (http://ubuntuforums.org/showthread.php?t=1813920) suggested converting the bad file name with the "convmv" utility, but the utility's output said that the file name was already UTF-8.
After that I created a test application with the same methods and ran it on the same PC, simply as "java Test $arguments$".
And it worked!
A simple System.out.println call displayed the file name correctly, and the test application successfully copied the problem file to another folder.
This left me no choice but to blame Glassfish for being the gap between my class, Java and Linux (though I'm not sure how that is possible).
Are there any character-encoding settings in Glassfish that I could change to fix this error, or am I missing something and the problem isn't really there?
Thanks in advance!
Andrew
Try changing the default charset reported by Charset.defaultCharset(). See "Setting the default Java character encoding?" for more details.
Also check the Glassfish configuration.
In sun-web.xml you should see something like this:
<locale-charset-info default-locale="">
<parameter-encoding default-charset="UTF-8"/>
</locale-charset-info>
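Since the same code works when launched from a shell but fails inside Glassfish, it can help to compare what the two JVMs believe about their environment. The sketch below is a diagnostic only, under the assumption (not confirmed in the thread) that the server process starts with a different locale than the interactive shell. The class name is hypothetical, and sun.jnu.encoding is an implementation-specific property of HotSpot/OpenJDK JVMs that may be absent elsewhere.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingDiagnostics {

    // Collects the settings that usually influence how a JVM decodes
    // text and file names. Values may be null when a setting is absent.
    static Map<String, String> snapshot() {
        Map<String, String> info = new LinkedHashMap<>();
        info.put("defaultCharset", Charset.defaultCharset().name());
        info.put("file.encoding", System.getProperty("file.encoding"));
        // Implementation-specific: used by OpenJDK for file name conversion.
        info.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        // Locale variables inherited from whatever started the process.
        info.put("LANG", System.getenv("LANG"));
        info.put("LC_ALL", System.getenv("LC_ALL"));
        return info;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : snapshot().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

Run it once as "java EncodingDiagnostics" from the shell and once from inside the deployed application (for example by logging snapshot() at startup). If the two outputs differ, the difference points at the server's startup environment rather than at the application code.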
