Hello Guys
I am developing a dictionary application where users can search between Arabic and Turkish. I'm getting the data from Firebase, no problem there. In my algorithm, I read the user's keyboard language when the user taps the search view. If that language is Turkish, I treat the text the user enters as Turkish, search the Turkish entries, and list the results in the RecyclerView. If the language is Arabic, I list the Arabic entries instead. By the way, you can think of the data as key & value pairs: the Turkish equivalent of each Arabic word is on the same line. So far the app works fine for me because I am using my phone's default keyboard and can get the keyboard language.
The problem starts here:
I can't get the keyboard language when the user uses a custom keyboard published in the Play Store, so I can't list the results. I opened a thread on Stack Overflow but was told that I can't access the language of these custom keyboards in any way. So, how can I tell whether the user is searching in Arabic or Turkish, without reading the keyboard language and without asking the user which language to search in? Thanks in advance and good work.
You will have to maintain the translations on your server. When the user searches in one language, the corresponding meaning in the other language should be searched as well; the corresponding meanings can be stored on the server (or on the client).
If you can't reliably get the keyboard's locale, that seems like a no-go for what you want to do. But even if someone's using a Turkish keyboard, that doesn't mean they're typing Turkish text, right? Since it basically covers the Latin character set, they could even be typing in romanised Arabic! (I don't know how likely that is, but it's possible.)
You might want to look into a library that detects languages - from a quick search there are a few, and ML-Kit is a Google library that people seem to recommend for it.
I think whatever you do, you probably want the user to be able to set their input language explicitly - give them the final say (and responsibility for ensuring it's correct!). Similar to how Google Translate does it - you can type and it can guess what language you're using (and it says something like (automatically detected) next to it), but the user gets to choose explicitly.
Edit: since you really want this to be automatic (I'd really recommend giving the user control over this, just in case), could you do something like checking if the characters they've entered are Arabic script?
Doesn't help with romanised Arabic (I don't know if that's really used much at all!) - but if you can assume Arabic uses Arabic script, and Turkish is anything else (or you could do the same with the Turkish characters) then maybe you could take a guess just by comparing their input to a set of potential characters. There might even be a convenient Unicode grouping you can check, but I'm not sure off-hand. Might be worth looking into
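To make the script-check idea concrete, here is a minimal sketch in plain Java. It relies on `Character.UnicodeScript`, which is part of the standard library (on Android, check that it is available at your minimum API level). The class name `ScriptGuess`, the `Lang` enum and the "majority of letters" rule are my own assumptions for illustration, not something from the thread, and it does nothing for romanised Arabic.

```java
// Hypothetical helper: guesses the search language from the script of the typed letters.
final class ScriptGuess {
    enum Lang { ARABIC, TURKISH, UNKNOWN }

    static Lang guess(String query) {
        int arabic = 0;
        int other = 0;
        for (int i = 0; i < query.length(); ) {
            int cp = query.codePointAt(i);
            i += Character.charCount(cp);
            if (!Character.isLetter(cp)) {
                continue; // ignore digits, spaces and punctuation
            }
            if (Character.UnicodeScript.of(cp) == Character.UnicodeScript.ARABIC) {
                arabic++;
            } else {
                other++;
            }
        }
        if (arabic == 0 && other == 0) {
            return Lang.UNKNOWN; // nothing to go on yet
        }
        // Assumption: anything that is not mostly Arabic script is treated as Turkish.
        return arabic >= other ? Lang.ARABIC : Lang.TURKISH;
    }
}
```

You could call `ScriptGuess.guess(text)` from the search view's text-change callback and still let the user override the guess, as suggested above.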
Related
So I'm working with last.fm API. Sometimes, the query results in tracks that contain characters like these:
Æther, é, Hṛṣṭa
or non-English characters like these:
水鏡.
When debugging in Eclipse, I see them just fine (as-is) but printing on console prints these as ??? - which is OK for me.
Now, how do I handle these? At first I thought I could remove every song that has any character other than the ones in the English language. I used the regex ^\\w+$ but it didn't work. I also tried \\w+. That didn't work either.
Then I thought further about how to handle these properly. Can anyone help me out? I am perfectly fine with leaving these tracks out of the equation, i.e. I'm fine with having only English-character tracks.
Another question: What is the best way to display these characters on the console and/or in a Swing GUI?
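For the filtering idea in the question, one possible sketch is below. A likely reason the `\\w+` attempts failed is that in Java `\w` only covers `[a-zA-Z_0-9]` by default, so a pattern that has to match the whole title also rejects ordinary English titles containing spaces or punctuation. Asking the US-ASCII encoder whether it can encode the string avoids that. The class and method names here are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: keeps only titles made up entirely of ASCII characters.
final class AsciiTrackFilter {
    static boolean isAsciiOnly(String title) {
        // A fresh encoder per call keeps this thread-safe.
        return StandardCharsets.US_ASCII.newEncoder().canEncode(title);
    }

    static List<String> keepAscii(List<String> titles) {
        List<String> kept = new ArrayList<>();
        for (String title : titles) {
            if (isAsciiOnly(title)) {
                kept.add(title);
            }
        }
        return kept;
    }
}
```

Note that this drops titles such as "Æther" as well, which may or may not be what you want.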
First, you must ensure that you use the correct encoding when reading your input.
Second, ensure that the font used in Eclipse on the platform you are developing on is able to display all these characters. Swing will display Unicode characters if you read them correctly.
You will likely want to use UTF-8 everywhere.
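A rough sketch of what "UTF-8 everywhere" can look like in code, assuming the API response really is UTF-8 encoded (check the response headers). The class `Utf8Io` and its method names are invented for this example; the constructors used are standard JDK ones.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for reading and printing text with an explicit encoding.
final class Utf8Io {
    // Decode the whole stream as UTF-8 instead of relying on the platform default.
    static String readAll(InputStream in) {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A console stream that writes UTF-8 bytes; whether the characters show up
    // correctly still depends on the terminal's encoding and font.
    static PrintStream utf8Console() {
        try {
            return new PrintStream(System.out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

In Eclipse you would also need the console encoding of the run configuration set to UTF-8, otherwise you may still see ??? there.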
I have a completed Java desktop application and now need it to accept entries for data fields, input boxes, etc. made in their native language by non-English-speaking workers.
So, essentially I need a less invasive approach to internationalizing the application so that it is usable in more than one localized region. I'll ask about my hurdles step by step:
What would be the ways to make a text input box show non-English characters as they are typed from the keyboard?
Is there any way to directly use the Unicode/hex code of the entered text strings?
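On the second point, a small sketch: whatever a Swing text component returns from `getText()` is already a Unicode Java `String`, so the code points can be read off directly. The helper name `CodePoints` is just for illustration.

```java
// Hypothetical helper: renders each code point of a string as U+XXXX.
final class CodePoints {
    static String toHex(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
        });
        return out.toString();
    }
}
```

For example, you could pass `textField.getText()` to `CodePoints.toHex` to log exactly what was typed.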
I have several devices that install as HID keyboard devices in most any operating system and, when used, send a string of text back, just like a keyboard. Is there any way in a Swing app to listen only to a chosen device, ignoring the standard keyboard, and do it without a TextComponent to capture the data? Thanks!
For anyone who comes across via google, etc, here is the solution I finally found.
(This is a solution to the second part of my question, how to capture input without a TextComponent).
I followed this tutorial and attached a KeyListener to my program. This allowed me to capture and parse input, albeit rather awkwardly. I have yet to find a smoother solution to this.
I may come back and add code to this. Please leave a comment if I have not yet done so and you would find it helpful
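For readers landing here, this is a minimal sketch of the KeyListener approach described above, not the poster's actual code. It assumes the device sends printable characters followed by Enter; the class name `LineCapture` and the callback shape are invented for the example.

```java
import java.awt.event.KeyAdapter;
import java.awt.event.KeyEvent;
import java.util.function.Consumer;

// Hypothetical listener: buffers typed characters and hands over a line on Enter.
final class LineCapture extends KeyAdapter {
    private final StringBuilder buffer = new StringBuilder();
    private final Consumer<String> onLine;

    LineCapture(Consumer<String> onLine) {
        this.onLine = onLine;
    }

    @Override
    public void keyTyped(KeyEvent e) {
        accept(e.getKeyChar());
    }

    // Kept separate from keyTyped so the parsing can be exercised without a GUI.
    void accept(char c) {
        if (c == '\n' || c == '\r') {
            if (buffer.length() > 0) {
                onLine.accept(buffer.toString());
                buffer.setLength(0);
            }
        } else if (c != KeyEvent.CHAR_UNDEFINED && !Character.isISOControl(c)) {
            buffer.append(c);
        }
    }
}
```

Attach it with something like `frame.addKeyListener(new LineCapture(System.out::println))`; the component has to be focusable and have focus. AWT key events carry no device identifier, so this alone does not separate the HID device from the normal keyboard.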
I am automating test cases for a web application using Selenium 2.0 and Java. My application supports multiple languages. Some of the test cases require me to validate text that appears in the UI, such as success/error messages. I am using a properties file to store whatever UI text I refer to in my tests, currently only in English. For example, there is locale_english.properties (see below), which contains all references in English. I am going to have multiple properties files like this for different locales, such as locale_chinese.properties, locale_french.properties and so on. For locales other than English, the corresponding properties file would contain Unicode escapes (e.g. \u30ed) representing the native characters (see below). So if I want to test, say, the Chinese UI, I would load "locale_chinese.properties" instead of "locale_english.properties". I am going to convert the native characters for non-English locales using perhaps native2ascii from the JDK or some other way. I tested that the Selenium API works well with UTF-8 characters for non-English locales.
---locale_english.properties------
user.login.error= Please verify username/password
---locale_chinese.properties------
user.login.error= \u30ed\u30ef\u30eg\u30eh\u30ed
and so on.
The problem is that my locale_english.properties is growing and going out of control. It is becoming hard to manage a single properties file for one locale let alone for multiple locales. Is there a better way of handling localization in Java, particularly in situations like I am in?
Thanks!
You're right that there is a problem managing the files, but you're also right that this is the best approach. Some things are just hard :-(
Selenium (at least the Selenium RC API) does indeed support Unicode input and output; we have lots of tests that enter and confirm Cyrillic and Simplified Chinese characters from C#. Since Java strings are Unicode at the core (just like C#), I expect you could simply create the file in a UTF-8-friendly editor like Notepad++, read the values straight into strings, and use them directly in the Selenium API.
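One way to read such a UTF-8 file, sketched under the assumption that the files are saved as UTF-8: `Properties.load(InputStream)` decodes as ISO-8859-1 (which is why native2ascii escapes are needed on that route), while `Properties.load(Reader)` lets you pick the encoding yourself. The class name `LocaleTexts` is hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical loader: reads a .properties file that was saved as UTF-8.
final class LocaleTexts {
    static Properties load(InputStream in) {
        Properties texts = new Properties();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            texts.load(reader); // the Reader overload honours the chosen charset
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return texts;
    }
}
```

In a test you might open the stream with `Files.newInputStream(Paths.get("locale_chinese.properties"))` and pass `texts.getProperty("user.login.error")` to your Selenium assertion.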
This is how I solved the issue for those who are interested.
A database would work better for many reasons: it handles growth, gives you a central location, is kept outside of the app, and can be edited and maintained outside of the app. We used a table with these columns:
id (int) auto increment
id_text (this and the other columns are varchar, except for the last two, which are datetime)
lang
translation
created_by
updated_by
created_date
updated_date
The text id (id_text) is a short English description of the text, like 'hello' or 'error1msg'; it is the key in your map.
In Java we had a function to get the translation for a particular text id, and an app-level property for the default language (usually en, but it is good to keep it configurable).
The function would scan an already loaded HashMap for the requested language, say "ch".
If no translation was found for that language, we would return the default-language translation, and if that was not found either, we would return "[" + id + "]" so the tester knows something is missing in the database and can go to the web screen to edit the translation table and add it.
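The lookup with its two fallbacks could be sketched roughly like this. It is not the original code; the class name `Translations` and the in-memory map layout (language, then text id) are assumptions, and loading the rows from the table is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory cache of the translation table: lang -> (id_text -> translation).
final class Translations {
    private final Map<String, Map<String, String>> byLang = new HashMap<>();
    private final String defaultLang;

    Translations(String defaultLang) {
        this.defaultLang = defaultLang;
    }

    void put(String lang, String idText, String translation) {
        byLang.computeIfAbsent(lang, k -> new HashMap<>()).put(idText, translation);
    }

    String get(String idText, String lang) {
        String text = lookup(lang, idText);
        if (text == null) {
            text = lookup(defaultLang, idText); // fall back to the default language
        }
        // A bracketed id makes a missing row obvious to the tester.
        return text != null ? text : "[" + idText + "]";
    }

    private String lookup(String lang, String idText) {
        Map<String, String> texts = byLang.get(lang);
        return texts == null ? null : texts.get(idText);
    }
}
```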
I have a Java based web-application and a new requirement to allow Users to place variables into text fields that are replaced when a document or other output is produced. How have others gone about this?
I was thinking of having a pre-defined set of variables such as :
#BOOKING_NUMBER#
#INVOICE_NUMBER#
Then when a user enters some text they can specify a variable inline (select it from a modal or similar). For example:
"This is some text for Booking #BOOKING_NUMBER# that is needed by me"
When producing some output (eg. PDF) that uses this text, I would do a regex and find all variables and replace them with the correct value:
"This is some text for Booking 10001 that is needed by me"
My initial thought was something like Freemarker but I think that is too complex for my Users and would require them to know my DataModel (eww).
Thanks for reading!
D.
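The regex pass described in the question could be sketched like this. The token syntax `#NAME#` comes from the question; the class name `TokenReplacer`, the restriction to upper-case letters and underscores, and the choice to leave unknown tokens untouched are assumptions made for the example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: swaps #NAME# tokens for values from a map.
final class TokenReplacer {
    private static final Pattern TOKEN = Pattern.compile("#([A-Z_]+)#");

    static String fill(String template, Map<String, String> values) {
        Matcher m = TOKEN.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Unknown tokens are left as typed so the gap is visible in the output.
            String replacement = value != null ? value : m.group();
            // quoteReplacement stops '$' and '\' in values being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```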
Have a look at java.text.MessageFormat - particularly the format method - as this is designed for exactly what you are looking for.
i.e.
MessageFormat.format("This is some text for booking {0} that is needed by me, for use with invoice {1}", bookingNumber, invoiceNumber);
You may even want to get the template text from a resource bundle, to allow for support of multiple languages, with the added ability to cope with the fact that {0} and {1} may appear in a different order in some languages.
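A small sketch of the MessageFormat route, with two things worth knowing before trying it. Numeric arguments are formatted with the locale's number format, so an Integer booking number picks up grouping separators unless you pass it as a String or use a pattern such as `{0,number,#}`. Single quotes in the pattern are also special and have to be doubled. The wrapper class `BookingMessage` is only for illustration.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical wrapper that pins the locale so the output is predictable.
final class BookingMessage {
    static String render(String pattern, Object... args) {
        return new MessageFormat(pattern, Locale.ENGLISH).format(args);
    }

    public static void main(String[] unused) {
        // Integer argument with the default number format (grouping applied).
        System.out.println(render("Booking {0}", 10001));
        // Explicit number pattern without grouping.
        System.out.println(render("Booking {0,number,#}", 10001));
        // Placeholders may appear in any order, which helps with translations.
        System.out.println(render("Invoice {1} belongs to booking {0}", "10001", "INV-7"));
    }
}
```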
UPDATE:
I just read your original post properly, and noticed the comment about the PDF.
This suggests that the template text is going to be significantly larger than a line or two.
In such cases, you may want to explore something like StringTemplate which seems better suited for this purpose - this comment is based solely on initial investigations, as I've not used it in anger.
I have used a similar replacement token system before. I personally like something like:
[MYVALUE]
As it is easy for the user to type, and then I just use replacements to swap out the tokens for the real data.
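With bracket tokens the swap can be done with plain literal replacement, no regex needed. A minimal sketch, with a made-up class name; one caveat is that a replacement value which itself contains a token may get replaced by a later pass of the loop.

```java
import java.util.Map;

// Hypothetical helper: replaces [NAME] tokens using literal string replacement.
final class BracketTokens {
    static String fill(String text, Map<String, String> values) {
        String result = text;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            // String.replace treats both arguments literally, so '[' needs no escaping.
            result = result.replace("[" + entry.getKey() + "]", entry.getValue());
        }
        return result;
    }
}
```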