Accents not shown on j2me - java

I'm building a J2ME app in French, but it doesn't display certain strings correctly. For instance: "Cette page donne un aperçu des dernières nouvelles" becomes "Cette page donne un eperÃϨ"res nouvelles".
Does anyone know why this is?

That funny ÃϨ string is what UTF-8 looks like when displayed in a program that's expecting ASCII or Windows-1252. Are you sure the software on the phone is set to UTF-8, and/or the encoding header on the data stream matches the actual encoding?
For example, if this is XML and the header said Windows-1252 but the actual encoding was UTF-8, and the phone software respected the header, this would be the result.
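
The effect is easy to reproduce on a desktop JVM. The sketch below is only an illustration: it encodes the word from the question as UTF-8 and then decodes those bytes with a single-byte charset. ISO-8859-1 is used as a stand-in for Windows-1252 because every Java SE runtime is guaranteed to provide it; the class and method names are made up for the example, and a J2ME runtime may not offer java.nio.charset at all.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how UTF-8 bytes turn into two-character garbage.
class MojibakeDemo {

    // Encode as UTF-8, then (wrongly) decode the bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Decode the same bytes with the charset they were written in.
    static String decodeCorrectly(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The c-cedilla is written as an escape so the source file encoding does not matter.
        String original = "aper\u00E7u";
        System.out.println(misdecode(original));       // c-cedilla shows up as two characters
        System.out.println(decodeCorrectly(original)); // round-trips unchanged
    }
}
```

The single character ç occupies two bytes in UTF-8, so a single-byte decoder turns it into two unrelated characters, which is the pattern visible in the question.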

Related

Convert String from AS400 to Java

To communicate with an AS400 I use a Java web service with the jt400 library; this web service runs under Linux.
The text returned after calling an AS400 program contains accented characters such as é à è…, but in my XHTML page the text isn't displayed correctly; for example, é is replaced by {.
The AS400 is configured like this: ccsid: 65535 and encoding: 297.
When the same web service runs under Windows, the accented characters are displayed correctly.
Thanks for your help.
You seem to have run into mojibake, caused by interpreting the bytes of the text in the wrong encoding. You mention é being replaced by {; the code point for é in CCSID 297 is 0xC0, which in CCSID 37 is {, so this makes sense.
I'm not sure where the data is coming from, but if you're using AS400Text to convert the data into a Java String object, you'll need to specify the correct CCSID, or it will pick a CCSID based on the current locale. You can either specify the CCSID from AS400.getCcsid or the associated encoding string value from AS400.getJobCCSIDEncoding.
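
The byte-level claim can be checked without an AS400. The following is a minimal sketch that uses the JDK's own EBCDIC charsets rather than jt400; it assumes the runtime ships the optional extended charsets (IBM297 and IBM037 are the JDK names for CCSID 297 and CCSID 37), and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical demo class: decodes one EBCDIC byte under two different CCSIDs.
class EbcdicDemo {

    // Decode a single byte with the named charset.
    static String decodeByte(int value, String charsetName) {
        return new String(new byte[] { (byte) value }, Charset.forName(charsetName));
    }

    // The EBCDIC charsets are optional in a JDK, so check before using them.
    static boolean charsetsAvailable() {
        return Charset.isSupported("IBM297") && Charset.isSupported("IBM037");
    }

    public static void main(String[] args) {
        if (!charsetsAvailable()) {
            System.out.println("EBCDIC charsets are not available in this runtime");
            return;
        }
        System.out.println(decodeByte(0xC0, "IBM297")); // French EBCDIC
        System.out.println(decodeByte(0xC0, "IBM037")); // US EBCDIC
    }
}
```

The same byte yields different characters depending only on which CCSID the decoder assumes, which is why a locale-dependent default can behave differently on Linux and Windows.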

encoding trouble while sending info on server

I'm trying to send data to a server. The charset encoding is set to UTF-8, and the JSP page encoding is also set to UTF-8. I use Spring MVC.
I build JSON and try to send it to the server, but when I get the response body I see strange symbols between words: attributeCategory%5B0%5D%5Batt.
I searched, and every suggestion was to use UTF-8 encoding to resolve this kind of problem.
UPDATE
When I add the line URLDecoder.decode(body, "ISO-8859-1"); on the server side, everything is decoded correctly. So my question is: what do I need to change in my JSON, or elsewhere, to make my program work with UTF-8 encoding?
%5B = [ (hexadecimal code 5B)
%5D = ] (hexadecimal code 5D)
This might stem from HTML INPUT fields with the same name, so the intended text was probably attributeCategory[0][att (likely cut off when copied here).
It could also come from JavaScript.
This is URL encoding (percent-encoding), used to transmit characters such as [ and ] over HTTP. It has nothing to do with the character encoding.
I hope this points to the cause of the error.
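
To make the distinction concrete, here is a small sketch using the standard java.net.URLDecoder and URLEncoder classes on the exact string from the question. The class and method names are hypothetical. Because [ and ] are plain ASCII, the charset passed to the decoder makes no difference for them, so UTF-8 works just as well as ISO-8859-1 here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class: percent-decoding and percent-encoding with a chosen charset.
class UrlDecodeDemo {

    // Percent-decode a string, interpreting escaped bytes in the given charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Percent-encode a string as UTF-8.
    static String encode(String plain) {
        try {
            return URLEncoder.encode(plain, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String body = "attributeCategory%5B0%5D%5Batt";
        System.out.println(decode(body, "UTF-8"));      // brackets restored
        System.out.println(decode(body, "ISO-8859-1")); // same result for ASCII characters
        System.out.println(encode("attributeCategory[0]"));
    }
}
```

Percent-escapes in the body usually mean the data was submitted as a URL-encoded form rather than as raw JSON, which may be worth checking on the sending side.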

Spanish character óé display error in Java properties

When I process a properties file containing the Spanish characters ó and é, they are displayed as ?. I tried different ways to fix this, but all of them failed:
I tried to use \uxxxx
I tried to use InputStreamReader with encoding UTF-8
I tried to convert string to bytes and then create a new String from those bytes:
new String( val.getBytes("UTF-8"), "UTF-8")
Nothing worked. What should I do next to fix this issue? Japanese and Russian are still OK.
The properties file needs to be saved in the proper encoding. By default, some IDEs such as Eclipse save content as CP1252, while you are reading the file as UTF-8. The same applies to your Java source code.
If you use \uxxxx escapes but your application works with CP1252 by default, the conversion of the escape code results in a bad character.
If you use an InputStreamReader to force reading as UTF-8 but your code and/or your file are not actually UTF-8, the result is a bad character.
If you apply a UTF-8 conversion to a string but your source code is CP1252, you will have the same problem.
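
Two of the attempts from the question can be simulated in memory. This is a sketch under stated assumptions: ISO-8859-1 stands in for CP1252 (both store ó as the single byte 0xF3), and the class and method names are invented for the example. It shows that encoding and decoding with the same charset changes nothing, and that single-byte text read as UTF-8 yields the replacement character U+FFFD, which many fonts and consoles render as ?.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for the properties-file symptoms described above.
class PropertiesEncodingDemo {

    // Encoding and decoding with the same charset is a no-op; it cannot repair a string.
    static String roundTrip(String val) {
        return new String(val.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    // Simulates a file saved in a single-byte encoding but read as UTF-8.
    static String readLatin1AsUtf8(String fileContent) {
        byte[] onDisk = fileContent.getBytes(StandardCharsets.ISO_8859_1);
        return new String(onDisk, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String word = "canci\u00F3n"; // contains o-acute, written as an escape
        System.out.println(roundTrip(word).equals(word)); // true
        System.out.println(readLatin1AsUtf8(word));       // o-acute replaced by U+FFFD
    }
}
```

Once the replacement character has appeared, the original byte is gone from the string, so the fix has to happen where the file is first decoded.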
Related previous answer about source code: Should source code be saved in UTF-8 format
Notepad++ has a "Format" menu that shows the file's current encoding. From there you can either view the file as if it were in another encoding, or convert it to another encoding such as UTF-8.

UTF-8 characters not displayed correctly in

We are working on a project for school. The project must be trilingual (Dutch, English and French), so the answer "Change to English" will not do.
All our classes and resource files are encoded in UTF-8, and all non-English characters are displayed correctly in the classes themselves.
The problem is that once we try to display our text, all non-English characters are distorted.
We hear a lot that this is due to an encoding issue, but I sincerely doubt that, since our whole project is encoded in UTF-8.
Here is an extract from the French resource bundle:
VIDEOSETTINGS = Réglages du Vidéo
SOUNDSETTINGS = Réglages du son
KEYBINDSETTINGS = Keybind Paramètres
LANGUAGESETTINGS = Paramètres de langue
DIFFICULTYSETTINGS = Paramètres de Difficulté
EXITSETTINGS = Sortie les paramètres
This results in the following displayed strings:
[Image: display result for the provided resource bundle extract]
I would be most grateful for a solution to this problem.
EDIT
For extra info: we are building a desktop app using Swing.
This is due to an encoding issue.
You are using the wrong decoder (probably ISO-8859-1) on UTF-8 encoded bytes.
Are these strings stored in a file? How are you loading the file? Via the Properties class? The Properties class always applies ISO-8859-1 decoding when loading the plain text format from an InputStream. If you are using Properties, use the load(Reader) overload, switch to the XML format, or rewrite the file with the matching encoding. Also, if you are using ResourceBundle.getBundle() to load a properties file, you must write that file in ISO-8859-1, escaping any characters outside Latin-1 (this applies to Java 8 and earlier; from Java 9 on, properties resource bundles are read as UTF-8 by default).
Since this is an encoding issue, it would be most helpful if you posted the code you have used to select the character encoding.
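
The difference between the two load overloads can be shown in memory with one line of the bundle from the question. This is a minimal sketch, not the asker's code: the class and method names are hypothetical, and a ByteArrayInputStream stands in for the UTF-8 file on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class: the same UTF-8 bytes loaded through both Properties.load overloads.
class BundleLoadDemo {

    // One line of the French bundle, as the bytes of a UTF-8 file.
    static byte[] sampleFile() {
        return "VIDEOSETTINGS = R\u00E9glages du Vid\u00E9o\n".getBytes(StandardCharsets.UTF_8);
    }

    // load(InputStream) always decodes the bytes as ISO-8859-1.
    static String loadViaStream() {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(sampleFile()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("VIDEOSETTINGS");
    }

    // load(Reader) lets the caller choose the decoder, here UTF-8.
    static String loadViaReader() {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(sampleFile()), StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("VIDEOSETTINGS");
    }

    public static void main(String[] args) {
        System.out.println(loadViaStream()); // each e-acute appears as two characters
        System.out.println(loadViaReader()); // text is intact
    }
}
```

In a real application the ByteArrayInputStream would be replaced by the stream for the properties file, for example one obtained from the class loader.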
You didn't show the code that reads the resource files. But if you use PropertyResourceBundle with an InputStream in the constructor, the InputStream must be encoded in ISO-8859-1. In that case, characters that cannot be represented in ISO-8859-1 must be written as Unicode escapes.
You can use tools such as native2ascii or AnyEdit to convert properties files to Unicode escapes;
see Use cyrillic .properties file in eclipse project
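
For readers who want to see what such a conversion produces, here is a rough sketch of the idea. It is not the native2ascii implementation, only an assumed equivalent for illustration with a made-up class name: every UTF-16 code unit outside ASCII is rewritten as a \uXXXX escape, so a character beyond the Basic Multilingual Plane comes out as two escapes.

```java
// Hypothetical demo class: rewrites non-ASCII characters as Unicode escapes
// of the kind accepted in .properties files.
class UnicodeEscapeDemo {

    // Keep ASCII as-is; replace every other UTF-16 code unit by a four-digit hex escape.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Accented letters are written as escapes so the source file encoding does not matter.
        System.out.println(escapeNonAscii("Param\u00E8tres de Difficult\u00E9"));
    }
}
```

The escaped output contains only ASCII, so it reads the same under ISO-8859-1, CP1252 and UTF-8.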

How to check encoding in java?

I am facing a problem with encoding.
For example, I have an XML message whose declared encoding is "UTF-8".
<message>
<product_name>apple</product_name>
<price>1.3</price>
<product_name>orange</product_name>
<price>1.2</price>
.......
</message>
Now, this message supports multiple languages:
Traditional Chinese (big5),
Simplified Chinese (gb),
English (utf-8)
The encoding changes only in specific fields.
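For example (Traditional Chinese),
<message>
<product_name>蘋果</product_name>
<price>1.3</price>
<product_name>橙</product_name>
<price>1.2</price>
.......
</message>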
Only "蘋果" and "橙" are using big5, "<product_name>" and "</product_name>" are still using utf-8.
<price>1.3</price> and <price>1.2</price> are using utf-8.
How do I know which word is using different encoding?
It looks like whoever is providing the XML is providing incorrect XML. They should be using a consistent encoding.
http://sourceforge.net/projects/jchardet/files/ is a pretty good heuristic charset detector.
It's a port of the one used in Firefox to detect the encoding of pages that are missing a charset in content-type or a BOM.
You could use that to try and figure out the encoding for substrings in a malformed XML file if you can't get the provider to fix their output.
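
If an external detector is not an option, one heuristic that needs only the JDK is to test whether a field's raw bytes are well-formed UTF-8 and fall back to another charset when they are not. The sketch below rests on assumptions: it presumes you can get at the raw bytes of each field before anything has decoded them, the class and method names are hypothetical, and the fallback charset is whatever the caller supplies (for the case in the question that would be Charset.forName("Big5"), if the runtime provides it). Some byte sequences are valid in both encodings, so this can guess wrong.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: guesses a field's encoding by strict UTF-8 validation.
class FieldEncodingGuess {

    // True if the bytes form well-formed UTF-8; a strict decoder reports errors instead of replacing.
    static boolean isValidUtf8(byte[] raw) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Decode as UTF-8 when the bytes are valid UTF-8, otherwise with the caller's fallback charset.
    static String decodeField(byte[] raw, Charset fallback) {
        return isValidUtf8(raw)
                ? new String(raw, StandardCharsets.UTF_8)
                : new String(raw, fallback);
    }

    public static void main(String[] args) {
        byte[] utf8Field = "\u6A59".getBytes(StandardCharsets.UTF_8); // a CJK character as UTF-8
        byte[] legacyField = { (byte) 0xE9 };                         // not valid UTF-8 on its own
        System.out.println(isValidUtf8(utf8Field));   // true
        System.out.println(isValidUtf8(legacyField)); // false
        // ISO-8859-1 is used as the fallback here only because every JVM provides it.
        System.out.println(decodeField(legacyField, StandardCharsets.ISO_8859_1));
    }
}
```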
You should use only one encoding in a single XML file. The Big5 characters all have counterparts in UTF-8.
Because I cannot get the provider to fix the output, I have to handle it myself, and I cannot use an external library in this project.
The only way I could solve it is like this,
String str = new String(big5String.getBytes("UTF-8"));
before displaying the message.
