I am automating test cases for a web application using selenium 2.0 and Java. My application supports multiple languages. Some of the test cases require me to validate the text that appears in the UI like success/error messages etc.I am using a properties file to store whatever text I am referring in my tests from the UI, currently only english. For example there is locale_english.properties(see below) that contains all references in english. I am going to have multiple properties files like this for different locales like locale_chinese.properties,locale_french.properties and so on. For locales other than english, their corresponding properties file would have UTF-8 characters (e.g \u30ed) representing the native characters(see below). So If I want to test say Chinese UI, I would load "locale_chinese.properties" instead of "locale_english.properties". I am going to convert the native characters for non-english locale using perhaps native2ascii from JDK or some other way.I tested that Selenium API works well with UTF-8 characters for non-english locales
---locale_english.properties------
user.login.error= Please verify username/password
---locale_chinese.properties------
user.login.error= \u30ed\u30ef\u30eg\u30eh\u30ed
and so on.
The problem is that my locale_english.properties is growing and going out of control. It is becoming hard to manage a single properties file for one locale let alone for multiple locales. Is there a better way of handling localization in Java, particularly in situations like I am in?
Thanks!
You're right that there is a problem managing the files, but you're also right that this is the best approach. Some things are just hard :-(
Selenium (at least the Selenium RC API) does indeed support Unicode input and output, we have lots of tests that enter and confirm Cyrillic and Simple Chinese characters from C#. Since Java strings are Unicode at the core (just like C#), I expect you could simply create the file in a UTF-8-friendly editor like Notepad++ and read them straight into strings and use them directly in the Selenium API.
This is how I solved the issue for those who are interested.
a database would work better for many reasons, like growth, central location, kept outside of app and can be edited and maintained outside of app. We used a table with columns:
id (int) auto increment
id_text -- this and other columns are varchar ... except for date time for last 2
lang
translation
created_by
updated_by
created_date
updated_date
An id is a short english description of the text - like 'hello' or 'error1msg', the key in your map.
In java had a function to get the text for a particular text ... and a app level property - default language (usually en but good to keep it configurable)
Function would scan already loaded hashmap for language asked for - say "ch"
If corresponding translation was not found for this language we would return the default language translation and if that was not founf then we would return "[" + id "]" so the tester knows something is missing in data base - can go to web screen to edit translation table and add it.
Related
Is it possible to get the styles of a paragraph in a particular langage ?. For example: on my personal computer I happen to have a dutch installation of microsoft windows. this is resulting in the paragraph.getStyles() method returning the dutch values of the styles, instead of a normal value of "heading1", "heading2" etc I am receiving values such as"Kop1", "kop2".
I am creating a parser for word based documents which selects certain parts on style. does anyone have any experience with this ?
I would take a look at the data in the .docx file (it's a zip-file) to verify if the data is written this way by Word already or "transposed" by POI or some local functionality.
If the data is already written by Word you will need to check how you can create the document in a different language in Word.
If not, then if you are using POI 3.13 or newer, you can try to set a different locale via LocaleUtil.setUserLocale() and see if that affects the results.
I am saving text in other language like Urdu, Punjabi, Spanish etc in DB from HTML form. It gets saved in DB and I am able to display same text in same language via PHP on UI (HTML page). Now I want to do same thing in JAVA. In PHP I am using html_entity_decode(). How can I do same in JAVA.
Thanks in advance for your help. Please let me know if my question is not clear to you.
Here is an example to explain the necessary steps to use Arabic language font in Android app development.You can customize this according to your requirement. Same procedure is applicable to other languages including Persian or urdu.
Follow this link for complete example
Another helpful API would be JAVA Internationalization.Check Its documentation for further help
The usual way to do it with Java is through property files:
one file per language
each file is filled with a list of pairs key/message
the files are managed with an instance of ResourceBundle class
See for ex. Backing a ResourceBundle with Properties Files
About using it in html pages, it depends how you generate them.
UPDATE
About storing data filled by a user in any language.
It is strongly recommended to encode the data with a unique charset, usually UTF-8, all along the way, i.e.:
from the html form where the user enters data
to the html displaying them
through the db used to store them.
For html, the charset must be set up in all the pages.
For db, check what is the default charset used, and if needed changed it (if it's possible!)
If at any step of the process a change of charset is required, then explicit conversion will be needed.
Most of the time, Java deals with encoding without requiring to explicit any charset. But it not always true, like e.g. with copy from String to byte array.
I have a Java Desktop App, the users of the application have the availability to set the aplications language.
By now i manage it in the database, i call the value of a field called - userLanguage - which is an Integer, and when the user has logged in depending on this value i set the corresponding text to each element on the app by using a switch ( case 1: set labels text ENGLISH, case 2: set labels text SPANISH ... etc)
But i've heard that control the language from the database is an insult, and i would like to know which's a nice way to do it, or what's the best way to do so, it doesn't matter how difficult it would be but the efficence of the method to internationallize an app is what metters for me.
I would actually handle this problem using the Java Preferences. It keeps the preferences for each user separately in a system independent way (for you at least). If you use XML you need to create a SAX/DOM parser or if you use a DB you need to use jdbc. Neither XML or the DB is a bad or a tough solution, I just think the preferences are the easiest.
For internationalization, I would use a ResourceBundle that localized for different Locales. It is a pretty big topic see The Java internationalization (I18n) tutorial
Java Preferences is what you are looking for then.
Or, instead of using XML file you can use Properties.
...i've heard that control the language from the database is an insult...
I do not agree with that. I think it is scenario dependent, and in your case I think you should keep it the way it is to avoid unnecessary work, unless there is an absolute need for keeping the preferred idiom outside your DB.
You've received two answers, both of which are plainly wrong. If you have Java Desktop Application, you should this code:
Locale locale = Locale.getDefault(Locale.Category.DISPLAY);
This will give you valid User Interface language for your application - the one user set in his OS preferences. If you want to keep the language in a database or in some kind of preferences, you'll be forcing users to chose language. What for? I've already set what language I want. If you don't have it, let Java fall back to your application's default.
In case you wonder, if you use ResourceBundle, the default would be the one without a Locale in its name. That is unless you override this process by using custom ResourceBundle.Control.
I've been given the (rather daunting) task of introducing i18n to a J2EE web application using the 2.3 servlet specification. The application is very large and has been in active development for over 8 years.
As such, I want to get things right the first time so I can limit the amount of time I need to scrawl through JSPs, JavaScript files, servlets and wherever else, replacing hard-coded strings with values from message bundles.
There is no framework being used here. How can I approach supporting i18n. Note that I want to have a single JSP per view that loads text from (a) properties file(s) and not a different JSP for each supported locale.
I guess my main question is whether I can set the locale somewhere in the 'backend' (i.e. read locale from user profile on login and store value in session) and then expect that the JSP pages will be able to correctly load the specified string from the correct properties file (i.e. from messages_fr.properties when the locale is to French) as opposed to adding logic to find the correct locale in each JSP.
Any ideas how I can approach this?
There are a lot of things that need to be taken care of while internationalizing application:
Locale detection
The very first thing you need to think about is to detect end-user's Locale. Depending on what you want to support it might be easy or a bit complicated.
As you surely know, web browsers tend to send end-user's preferred language via HTTP Accept-Language header. Accessing this information in the Servlet might be as simple as calling request.getLocale(). If you are not planning to support any fancy Locale Detection workflow, you might just stick to this method.
If you have User Profiles in your application, you might want to add Preferred Language and Preferred Formatting Locale to it. In such case you would need to switch Locale after user logs in.
You might want to support URL-based language switching (for example: http://deutsch.example.com/ or http://example.com?lang=de). You would need to set valid Locale based on URL information - this could be done in various ways (i.e. URL Filter).
You might want to support language switching (selecting it from drop-down menu, or something), however I would not recommend it (unless it is combined with point 3).
JSTL approach could be sufficient if you just want to support first method or if you are not planning to add any additional dependencies (like Spring Framework).
While we are at Spring Framework it has quite a few nice features that you can use both to detect Locale (like CookieLocaleResolver, AcceptHeaderLocaleResolver, SessionLocaleResolver and LocaleChangeInterceptor) and externalizing strings and formatting messages (see spring:message tab).
Spring Framework would allow you to quite easily implement all the scenarios above and that is why I prefer it.
String externalization
This is something that should be easy, right? Well, mostly it is - just use appropriate tag. The only problem you might face is when it comes to externalizing client-side (JavaScript) texts. There are several possible approaches, but let me mention these two:
Have each JSP written array of translated strings (with message tag) and simply access that array in client code. This is easier approach but less maintainable - you would need to actually write valid strings from valid pages (the ones that actually reference your client-side scripts). I have done that before and believe me, this is not something you want to do in large application (but it is probably the best solution for small one).
Another approach may sound hard in principle but it is actually way easier to handle in the future. The idea is to centralize strings on client side (move them to some common JavaScript file). After that you would need to implement your own Servlet that will return this script upon request - the contents should be translated. You won't be able to use JSTL here, you would need to get strings from Resource Bundles directly.
It is much easier to maintain, because you would have one, central point to add translatable strings.
Concatenations
I hate to say that, but concatenations are really painful from Localizability perspective. They are very common and most people don't realize it.
So what is concatenation then?
On the principle, each English sentence need to be translated to target language. The problem is, it happens many times that correctly translated message uses different word order than its English counterpart (so English "Security policy" is translated to Polish "Polityka bezpieczeństwa" - "policy" is "polityka" - the order is different).
OK, but how it is related to software?
In web application you could concatenate Strings like this:
String securityPolicy = "Security " + "policy";
or like this:
<p><span style="font-weight:bold">Security</span> policy</p>
Both would be problematic. In the first case you would need to use MessageFormat.format() method and externalize strings as (for example) "Security {0}" and "policy", in the latter you would externalize the contents of the whole paragraph (p tag), including span tag. I know that this is painful for translators but there is really no better way.
Sometimes you have to use dynamic content in your paragraph - JSTL fmt:format tag will help you here as well (it works lime MessageFormat on the backend side).
Layouts
In localized application, it often happens that translated strings are way longer than English ones. The result could look very ugly. Somehow, you would need to fix styles. There are again two approaches:
Fix issues as they happen by adjusting common styles (and pray that it won't break other languages). This is very painful to maintain.
Implement CSS Localization Mechanism. The mechanism I am talking about should serve default, language-independent CSS file and per-language overrides. The idea is to have override CSS file for each language, so that you can adjust layouts on-demand (just for one language). In order to do that, default CSS file, as well as JSP pages must not contain !important keyword next to any style definitions. If you really have to use it, move them to language-based en.css - this would allow other languages to modify them.
Culture specific issues
Avoid using graphics, colors and sounds that might be specific for western culture. If you really need it, please provide means of Localization. Avoid direction-sensitive graphics (as this would be a problem when you try to localize to say Arabic or Hebrew). Also, do not assume that whole world is using the same numbers (i.e. not true for Arabic).
Dates and time zones
Handling dates in times in Java is to say the least not easy. If you are not going to support anything else than Gregorian Calendar, you could stick to built-in Date and Calendar classes.
You can use JSTL fmt:timeZone, fmt:formatDate and fmt:parseDate to correctly set time zone, format and parse date in JSP.
I strongly suggest to use fmt:formatDate like this:
<fmt:formatDate value="${someController.somedate}"
timeZone="${someController.detectedTimeZone}"
dateStyle="default"
timeStyle="default" />
It is important to covert date and time to valid (end user's) time zone. Also it is quite important to convert it to easily understandable format - that is why I recommend default formatting style.
BTW. Time zone detection is not something easy, as web browsers are not so nice to send anything. Instead, you can either add preferred time zone field to User preferences (if you have one) or get current time zone offset from web browser via client side script (see Date object's methods)
Numbers and currencies
Numbers as well as currencies should be converted to local format. It is done in the similar way to formatting dates (parsing is also done similarly):
<fmt:formatNumber value="1.21" type="currency"/>
Compound messages
You already have been warned not to concatenate strings. Instead you would probably use MessgageFormat. However, I must state that you should minimize use of compound messages. That is just because target grammar rules are quite commonly different, so translators might need not only to re-order the sentence (this would be resolved by using placeholders and MessageFormat.format()), but translate the whole sentence in different way based on what will be substituted. Let me give you some examples:
// Multiple plural forms
English: 4 viruses found.
Polish: Znaleziono 4 wirusy. **OR** Znaleziono 5 wirusów.
// Conjugation
English: Program encountered incorrect character | Application encountered incorrect character.
Polish: Program napotkał nieznaną literę | Aplikacja napotkała nieznaną literę.
Character encoding
If you are planning to Localize into languages that does not support ISO 8859-1 code page, you would need to support Unicode - the best way is to set page encoding to UTF-8. I have seen people doing it like this:
<%# page contentType="text/html; charset=UTF-8" %>
I must warn you: this is not enough. You actually need this declaration:
<%#page pageEncoding="UTF-8" %>
Also, you would still need to declare encoding in the page header, just to be on the safe side:
<META http-equiv="Content-Type" content="text/html;charset=UTF-8">
The list I gave you is not exhaustive but this is good starting point. Good luck :)
You can do exactly this using JSTL standard tag library with the tag. Grab a copy of the JSTL specification, read the i8N chapters, which discuss general text + date, time, currency. Very clearly written and shows you how you can do it all with tags. You can also set things like Locale programmatically
You dont(and shouldnt) need to have a separate JSP file per locale. The hard task is to figure out the keys that arent i18n-ed and move them to a file per locale, say, messages_en.properties, messages_fr.properties and so on.
Locale calculation can happen in multiple places depending on your logic. We support user locales stored in a database as well as the browser locale. Every request that comes into your application will have a "Accept-Language" header that indicates what are the languages your browser has been configured with , with preferences, i.e. Japanese first and then English. If thats the case, the application should read the messages_ja.properties and for keys that are not in that file, fallback to messages_en.properties. The same can hold true for user locales that are stored inside the database. Please note that the standard is just to switch the language in the browser and expect the content to be i18n-ed. (We initially started with storing locale in the database and then moved to support locales from the browser). Also you will need a default anyway as translators miss copying keys and values from english (main language file) to other languages, so you will need to default to english for values that are not in other files.
Ive also found mygengo very useful when giving translation job to other people who know a particular language, its saved us a lot of time.
I have a Java based web-application and a new requirement to allow Users to place variables into text fields that are replaced when a document or other output is produced. How have others gone about this?
I was thinking of having a pre-defined set of variables such as :
#BOOKING_NUMBER#
#INVOICE_NUMBER#
Then when a user enters some text they can specify a variable inline (select it from a modal or similar). For example:
"This is some text for Booking #BOOKING_NUMBER# that is needed by me"
When producing some output (eg. PDF) that uses this text, I would do a regex and find all variables and replace them with the correct value:
"This is some text for Booking 10001 that is needed by me"
My initial thought was something like Freemarker but I think that is too complex for my Users and would require them to know my DataModel (eww).
Thanks for reading!
D.
Have a look at java.text.MessageFormat - particularly the format method - as this is designed for exactly what you are looking for.
i.e.
MessageFormat.format("This is some text for booking {0} that is needed by me, for use with invoice {1}", bookingNumber, invoiceNumber);
You may even want to get the template text from a resource bundle, to allow for support of multiple languages, with the added ability to cope with the fact that {0} and {1} may appear in a different order in some languages.
UPDATE:
I just read your original post properly, and noticed the comment about the PDF.
This suggest that the template text is going to be significantly larger than a line or two.
In such cases, you may want to explore something like StringTemplate which seems better suited for this purpose - this comment is based solely on initial investigations, as I've not used it in anger.
I have used a similiar replacement token system before. I personally like something like.
[MYVALUE]
As it is easy for the user to type, and then I just use replacements to swap out the tokens for the real data.