How can I detect unknown/unassigned Unicode characters in my Java program?

I want to write a Java program that prints Unicode characters. I want to detect and skip unknown/unassigned characters (the ones shown as a rectangle). I have tried isDefined and isISOControl from the Character class, but that does not work.
Does anybody know a solution? It would be a big help.
Thanks.

The characters that are shown as a rectangle (on Windows) are ones that aren't available in the font you're using. While you could filter out a lot of them by filtering out undefined and control characters, it's entirely possible that the problem you're running into is that your font doesn't support certain ranges of valid characters (which is typical -- very few fonts define glyphs for all defined Unicode characters).
If your goal is really to remove characters that render as a rectangle, you can use the canDisplay method in Font.
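A minimal sketch of that idea, assuming you want to drop undefined code points, control characters, and anything the chosen font has no glyph for. The class name DisplayableFilter and the helper keepDisplayable are made up for illustration; Font.canDisplay(int), Character.isDefined(int) and Character.isISOControl(int) are the real API calls. The font check is passed in as a predicate so the filtering logic can be tried without a display.

```java
import java.awt.Font;
import java.util.function.IntPredicate;

class DisplayableFilter {

    /** Keeps only code points that are defined, not control characters, and accepted by canDisplay. */
    static String keepDisplayable(String text, IntPredicate canDisplay) {
        StringBuilder sb = new StringBuilder();
        text.codePoints()
            .filter(cp -> Character.isDefined(cp) && !Character.isISOControl(cp))
            .filter(canDisplay)
            .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Dialog" is a logical font name; substitute the font your program actually uses.
        Font font = new Font("Dialog", Font.PLAIN, 12);
        String filtered = keepDisplayable("A\u0007\u2663", cp -> font.canDisplay(cp));
        System.out.println(filtered);
    }
}
```

Working on code points rather than char values matters for characters outside the Basic Multilingual Plane, which occupy two chars in a Java String.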

Related

Rendering a string in Java that mixes different languages: English, Chinese and Indic

I need to render the below String on a JTable cell. How do we do this?
testä漢字1ગુજરાતી2
Update:
It looks like my question was not clear. The above string is the name of a file. I need to display this and many other file names in a JTable. The data comes in dynamically. I may need a custom renderer to display this string exactly as it is. Currently, it displays junk characters. Simply by changing the table font from Calibri to MS Gothic, I can see the Chinese characters, but not the Indic letters. But as the data comes in dynamically, we won't know in advance which font to use.
So I want to know whether there is a way to check the string programmatically and render it with different fonts as appropriate.
The simplest solution would be to use a font which is more complete. I believe the DejaVu family is decent; others may be able to suggest better. The Oracle manual suggests that the logical fonts Dialog, DialogInput, Monospaced, Serif and SansSerif are also likely to be more complete. Potentially they could map to multiple underlying fonts depending on the specific characters which need to be rendered. Oracle also mentions the Lucida family, which is distributed with Oracle's JRE, as another possibility which is fairly complete, though it doesn't have Chinese, Korean or Japanese characters.
A more convoluted solution would be to run Character.UnicodeBlock.of(c) on the characters in each string, assemble a Set<Character.UnicodeBlock> and guess which font is most appropriate based on the blocks present in the string, or even write a custom renderer to render each character (or sequence of characters) with a font appropriate to the Unicode block they belong to. Unicode blocks tend to be categorized according to the script they contain.
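The first half of that approach can be sketched as follows. The class name BlockScanner and the method blocksOf are hypothetical; Character.UnicodeBlock.of(int) is the real API and returns null for code points outside any defined block. Choosing a font from the resulting set is left to the application.

```java
import java.lang.Character.UnicodeBlock;
import java.util.LinkedHashSet;
import java.util.Set;

class BlockScanner {

    /** Collects the Unicode blocks that occur in text, in order of first appearance. */
    static Set<UnicodeBlock> blocksOf(String text) {
        Set<UnicodeBlock> blocks = new LinkedHashSet<>();
        text.codePoints().forEach(cp -> {
            UnicodeBlock block = UnicodeBlock.of(cp);
            if (block != null) {          // null means the code point is in no defined block
                blocks.add(block);
            }
        });
        return blocks;
    }

    public static void main(String[] args) {
        // The file name from the question, written with Unicode escapes.
        String name = "test\u00e4\u6f22\u5b571\u0a97\u0ac1\u0a9c\u0ab0\u0abe\u0aa4\u0ac02";
        System.out.println(blocksOf(name));
    }
}
```

A renderer could then map each block (for example CJK_UNIFIED_IDEOGRAPHS or GUJARATI) to a font known to cover it, and fall back to a logical font such as Dialog for everything else.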

How to add two Fonts to java swing components, and make them automatically switch when user switches keyboard input language?

I'm working on an app where the user should be able to input text that contains both English and Persian (similar to Arabic: almost the same characters, written from right to left).
Currently I'm using fonts like Courier New, which support both languages but look really ugly. I want to use better-looking fonts, but those fonts support only one of the languages and show nonsense characters for the other. So I need to choose between them based on the language of the text.
So, in general, how can I make Java components (especially swing.JTextField, swing.JListBox and swing.JTextComponent) accept two fonts and switch between them appropriately to get a good-looking GUI?
Edit: Here is an example of what I need. Say the user should input something like (FPGA استفاده از), all in a single swing.JTextField. It means (Using FPGA); FPGA is an abbreviation, so there is no Persian translation. I need to set a better-looking font, and all fonts which support Unicode are ugly for the Persian part.
Now if I set the font to something like Times New Roman, which only supports Latin, the Persian characters show as empty squares. Likewise, if I set the font to something like B Nazanin, which only supports Persian, the Latin characters show as empty squares. How can I have both fonts in a single Java component at the same time?
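import java.awt.im.InputContext;
import java.util.Locale;

// textField, englishFont and persianFont stand in for your own component and fonts.
Locale locale = InputContext.getInstance().getLocale(); // may be null if the input method reports no locale
if (locale != null && locale.toString().equals("en_GB")) {
    textField.setFont(englishFont); // font for the English keyboard
} else if (locale != null && locale.toString().equals("fa_IR")) {
    textField.setFont(persianFont); // font for the Persian keyboard
}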
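Switching on the keyboard locale changes the font of the whole component at once, so text that mixes both scripts would still show squares for one of them. One possible alternative, sketched here under assumptions, is to split the text into runs by Unicode script and style each run separately. The class name ScriptRuns and the method split are hypothetical; Character.UnicodeScript.of(int) is the real API. Spaces, digits and punctuation (scripts COMMON and INHERITED) simply stay with the run they follow.

```java
import java.lang.Character.UnicodeScript;
import java.util.ArrayList;
import java.util.List;

class ScriptRuns {

    /** Splits text into consecutive runs; a new run starts whenever the script changes. */
    static List<String> split(String text) {
        List<String> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        UnicodeScript currentScript = null;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            UnicodeScript script = UnicodeScript.of(cp);
            boolean neutral = script == UnicodeScript.COMMON || script == UnicodeScript.INHERITED;
            if (!neutral) {
                if (currentScript != null && script != currentScript) {
                    runs.add(current.toString());   // script changed: close the previous run
                    current.setLength(0);
                }
                currentScript = script;
            }
            current.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            runs.add(current.toString());
        }
        return runs;
    }

    public static void main(String[] args) {
        // The example input from the question: "FPGA" followed by two Persian words.
        String input = "FPGA \u0627\u0633\u062a\u0641\u0627\u062f\u0647 \u0627\u0632";
        for (String run : split(input)) {
            System.out.println(run.length() + " chars: " + run);
        }
    }
}
```

A JTextField draws all of its text with a single font, so applying a different font per run would need a styled component such as a JTextPane, where a font family can be set on a range of the document (for example with StyleConstants.setFontFamily).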

Japanese in JTextArea

I have a database with Japanese words. Additionally, I have an algorithm that reads these words and puts them into a JTextArea.
The problem is that I see rectangles instead of Japanese characters.
But when I copy such a set of rectangles (Ctrl+C) from the JTextArea and paste them into e.g. the command input of Total Commander or a Winword document, the characters are displayed properly. But only under Win7.
Because I run Eclipse in a virtual machine under WinXP, I can also copy the rectangles into the command input of Total Commander under WinXP. There they remain rectangles, as in my Java app.
This means that the JTextArea holds the information about the particular characters, but cannot render it.
Of course I have installed a proper font.
I've tried many things with fonts:
textArea.setFont(new Font(blablabla));
and similar, but without effect.
What should I do?
The problem with your JTextArea is most probably that the font you're using isn't suitable for Japanese. A font has to provide a mapping table from Unicode code points to glyphs. For example, 0x41 is the letter 'A' in ASCII, in UTF-8 and even in Shift-JIS, but the font you linked resolves 0x41 to a Kanji character. The font also doesn't contain any Hiragana or Katakana characters at all. See also the comments section on the site where you got this font:
After using ChapMap it has a WSIfonts TAG and does NOT! support ALL the Chinese characters it only has 90 characters and assigns 1 character per Char except Caps.
It's a Chinese font, not a Japanese one. It doesn't even provide all Chinese characters and has no useful mapping table included, so it's pretty useless.
Try another font. That should work just fine if it really contains Japanese characters and provides a proper Unicode mapping table.
You can find fonts that would work e.g. here
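To check which installed fonts could work before setting one on the JTextArea, something like the following sketch may help. The class name FontFinder and the method fontsFor are made up for illustration; Font.canDisplayUpTo(String) is the real API and returns -1 when the font has a glyph for every character in the string.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.List;

class FontFinder {

    /** Returns the names of the candidate fonts that can display every character of text. */
    static List<String> fontsFor(String text, Font[] candidates) {
        List<String> names = new ArrayList<>();
        for (Font font : candidates) {
            if (font.canDisplayUpTo(text) == -1) {   // -1: no undisplayable character found
                names.add(font.getFontName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Font[] installed = GraphicsEnvironment.getLocalGraphicsEnvironment().getAllFonts();
        // Sample text with Kanji, Hiragana and Katakana, written with Unicode escapes.
        String sample = "\u65e5\u672c\u8a9e \u3072\u3089\u304c\u306a \u30ab\u30bf\u30ab\u30ca";
        for (String name : fontsFor(sample, installed)) {
            System.out.println(name);
        }
    }
}
```

Any name it prints could then be tried with textArea.setFont(new Font(name, Font.PLAIN, 14)), where the size 14 is just an example value.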

Can we insert Unicode Characters using Robot Class in Java?

I am developing a real-time English-Sinhala Unicode translator in Java. I have done the translation part, but now I want to send the final output Unicode characters to the currently active window (like a web browser). There is a way to send characters via the Java Robot class with the Robot.keyPress(//keyInput) method. But is there any way to do this with Unicode characters given as a hex value like u0200? If it can't be done this way, what other solutions do I have? Can anyone please help me?
Yes, you can simulate key presses using Robot, as suggested here. No, Robot can't see what's printed on the user's key caps. You're probably going to have to develop a virtual keyboard. When available, Unicode glyphs make usable button labels, as shown here.
Addendum: Note that a KeyEvent represents a keystroke, while Unicode encodes graphemes represented by glyphs. The mapping depends on the keyboard layout, e.g. Sinhala.

Print unique ascii characters in eclipse console

Kind of a strange question but... here it goes.
Recently my application threw an IOException whose message text contained only a clubs symbol (like the suit in cards). I know this is probably because a number was cast to a char and printed to the screen, and I've found where that might have happened. The only problem is that I can't recreate it in Eclipse, because the Eclipse console doesn't want to print those characters for me. All I get are boxes.
I figure this is an encoding issue or something, but I need Eclipse to print out those characters just like the Windows console would. Is there a setting I can change to do this?
The respective Unicode character is U+2663. Just print "\u2663" and you should be fine. This has nothing to do with ASCII, though.
If you get boxes, it may also be a font issue. If the font you selected for the console view in Eclipse does not have a glyph for that code point, you'll usually get boxes. The character might still be printed correctly, though. Monospaced fonts usually have that character, since it was historically part of the glyphs for the control characters below character code 32 (not that control characters were ever intended to have a visual appearance, but they could be in the screen buffer, so someone thought it would be a good idea to display them as well).
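A small sketch of the difference, assuming the stray symbol came from a small number cast to char. In the IBM PC code page 437 character set used by the Windows console, code 5 is drawn as a club, whereas in Unicode it is the control character U+0005 and the club symbol is U+2663. The class name ClubSymbol and the helper describe are hypothetical.

```java
import java.io.PrintStream;

class ClubSymbol {

    /** Describes a char as "U+XXXX" followed by its Unicode name, or a note that it is a control character. */
    static String describe(char c) {
        String kind = Character.isISOControl(c) ? "control character" : Character.getName(c);
        return String.format("U+%04X %s", (int) c, kind);
    }

    public static void main(String[] args) throws Exception {
        // Write UTF-8 explicitly; the console must be set to UTF-8 as well for this to show correctly.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(describe((char) 5));   // what a cast number actually is in Unicode
        out.println(describe('\u2663'));   // the real club symbol
        out.println("\u2663");             // shows a box if the console font lacks the glyph
    }
}
```

If the last line still shows a box even with the console encoding set to UTF-8, the font selected for the console view is the more likely culprit.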
