I'm writing an applet that's supposed to show both English and Japanese (unicode) characters on a JLabel. The Japanese characters show up fine when I run the applet on my system, but all I get is mojibake when I run it from the web page. The page can display Japanese characters if they're hard-coded into the HTML, but not in the applet. I'm pretty sure I've seen this sort of thing working before. Is there anything I can do in the Java code to fix this?
My first guess would be that the servlet container is not sending back the right character set for your webapp resources. Have a look at the response in an HTTP sniffer to see what character set is included - if the response says that the charset is e.g. CP-1252, then Japanese characters would not be decoded correctly.
You may be able to fix this in code by explicitly setting a Content-Type header with the right charset; but I'd argue it's more appropriate to fix the servlet container's config to return the correct character set for the relevant resources.
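To see why a wrong charset header produces mojibake, here is a minimal, self-contained sketch (plain Java SE, not tied to any servlet API) that decodes UTF-8 bytes as if they were ISO-8859-1, which is essentially what the browser does when the server advertises the wrong charset:

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    public static void main(String[] args) {
        String original = "\u65e5\u672c\u8a9e"; // "日本語" via escapes, keeping this file ASCII-safe

        // Bytes as the server would send them, encoded as UTF-8.
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        // A browser told "charset=ISO-8859-1" decodes those same bytes wrongly.
        String mojibake = new String(utf8Bytes, StandardCharsets.ISO_8859_1);
        System.out.println(mojibake); // garbled Latin-1 characters

        // Decoding with the correct charset recovers the text.
        String recovered = new String(utf8Bytes, StandardCharsets.UTF_8);
        System.out.println(recovered.equals(original)); // true
    }
}
```

The bytes never change in transit; only the charset label the receiver trusts determines whether the text survives.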
Well I'm not sure what was causing the problem, but I set EVERYTHING to read in and display out in UTF-8 and it seems to work now.
Related
I have written an HTTP web proxy in Java. But when I fetch an HTML file, only SOME of the images display normally and the others do not. How can I fix it?
Make sure the points below are covered:
The Content-Type is set correctly, e.g. "image/jpeg"
The character encoding is handled correctly while proxying
You might also want to try this existing proxy before writing one yourself, unless your problem is specific to your own implementation:
https://github.com/mitre/HTTP-Proxy-Servlet
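A common cause of the "some images break" symptom is proxying the response body as text instead of raw bytes: any byte sequence that is not valid in the assumed charset gets replaced during decoding, corrupting the image. A small self-contained sketch of the failure mode (the byte values below are just the JPEG magic number, used as sample binary data):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ProxyBytesDemo {
    public static void main(String[] args) {
        // Fake "image" data containing byte values that are invalid in UTF-8.
        byte[] image = { (byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0 };

        // WRONG: round-tripping binary data through a String corrupts it,
        // because invalid bytes are replaced with U+FFFD.
        byte[] corrupted = new String(image, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(image, corrupted)); // false

        // RIGHT: copy the raw bytes through unchanged (stream-to-stream in a real proxy).
        byte[] copied = Arrays.copyOf(image, image.length);
        System.out.println(Arrays.equals(image, copied));    // true
    }
}
```

In a real proxy this means copying the body with InputStream/OutputStream and never going through Reader/Writer for non-text content types.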
I'm completely puzzled. I've set my Apache and Tomcat config files, my Java servlet project, and my JSP pages, ALL OF IT, to "UTF-8" to support Spanish characters (á, í, ó, etc.). I've systematically followed all the guidelines found in the documentation and forums. I know I could use Latin-1, but UTF-8 seemed easier to use; still, after 4 days of trial and error, I've decided (since my servlet will only support Spanish characters) to switch to "ISO-8859-1", which is now mostly working.
The only problem is that my JSP pages still say "UTF-8" when I right-click --> Properties. The page directive and meta tag are correct (ISO-8859-1), but when I open a page in the browser it says "Windows-1252".
I have no idea why this is happening. If I switch everything to "UTF-8" (including the server config, the Java project, etc.), the characters appear garbled in the browser, e.g. "puntuación" instead of "puntuación".
So, to iron this question out...
Does anyone know how to implement UTF-8 correctly and make Spanish characters work everywhere?
or
Does anyone know how to change the JSP pages to be "ISO-8859-1" everywhere? Right now, it's UTF-8 in the Properties window, ISO-8859-1 in the @page directive (contentType charset and pageEncoding), and Windows-1252 in the browser.
As always, I'm more than grateful in advance for your patience and support.
After adding the "CharsetFilter" class to my project, everything worked just fine.
Follow these guidelines: How to get UTF-8 working in Java webapps?
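For reference, a minimal sketch of such a CharsetFilter (the exact class from the linked guidelines may differ; this version uses the javax.servlet API and forces UTF-8 before any request parameter is read or any output is written, mapped to /* in web.xml):

```java
import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class CharsetFilter implements Filter {
    @Override
    public void init(FilterConfig config) {
        // nothing to configure
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain) throws IOException, ServletException {
        // Only set the request encoding if the client didn't declare one.
        if (request.getCharacterEncoding() == null) {
            request.setCharacterEncoding("UTF-8");
        }
        // Ensure the response is written and advertised as UTF-8.
        response.setCharacterEncoding("UTF-8");
        chain.doFilter(request, response);
    }

    @Override
    public void destroy() {
        // nothing to clean up
    }
}
```

Register it in web.xml with a filter-mapping whose url-pattern is /* so it runs for every resource.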
PS: I've completely removed all lines mentioning:
response.setCharacterEncoding("UTF-8");
response.setContentType("text/html;charset=UTF-8");
But I kept the page directive on the JSPs.
Happy coding!
I am writing a program in Java with the Eclipse IDE, and I want to write my comments in Greek. So I changed the encoding under Window -> Preferences -> General -> Content Types -> Text -> Java Source File to UTF-8. The comments in my code are now OK, but when I run my program some words contain weird characters, e.g. San Germ�n (San Germán). If I change the encoding to ISO-8859-1, everything is OK when I run the program, but the comments in my code are not (weird characters!). So, what is going wrong?
Edit: My program uses Java Swing, and the weird characters under UTF-8 appear in Strings in cells of a JTable.
EDIT (2): OK, I solved my problem: I keep the UTF-8 encoding for the Java file but change the encoding of the strings: String k = new String(myStringInByteArray, "ISO-8859-1");
This is most likely due to the compiler not using the correct character encoding when reading your source. This is a very common source of error when moving between systems.
The typical way to solve it is to use plain ASCII (which is identical in both Windows-1252 and UTF-8) together with the "\u1234" escape scheme (Unicode character 0x1234), but this is a bit cumbersome to handle, as Eclipse (last time I looked) did not transparently support it.
The property file editor does, though, so a reasonable suggestion could be that you put all your strings in a property file, and load the strings as resources when needing to display them. This is also an excellent introduction to Locales which are needed when you want to have your application be able to speak more than one language.
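One concrete pitfall when loading property files: Properties.load(InputStream) always assumes ISO-8859-1, while load(Reader), available since Java 6, lets you state the file's real encoding. A self-contained sketch using an in-memory "file" (the \u00E1 escape is "á", keeping this source ASCII-safe):

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEncodingDemo {
    public static void main(String[] args) throws Exception {
        // "city=San Germán" as it would sit on disk in a UTF-8 properties file.
        byte[] fileBytes = "city=San Germ\u00E1n\n".getBytes(StandardCharsets.UTF_8);

        // load(InputStream) decodes as ISO-8859-1 and garbles the accented value.
        Properties wrong = new Properties();
        wrong.load(new ByteArrayInputStream(fileBytes));

        // load(Reader) honors the encoding the file was actually written in.
        Properties right = new Properties();
        right.load(new InputStreamReader(new ByteArrayInputStream(fileBytes),
                StandardCharsets.UTF_8));

        System.out.println(right.getProperty("city"));                           // San Germán
        System.out.println(wrong.getProperty("city").equals("San Germ\u00E1n")); // false
    }
}
```

The older \uXXXX-escaped format (as produced by native2ascii or the property file editor mentioned above) sidesteps the issue entirely, since the file stays pure ASCII.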
I am building a web application using JSF and Spring in Eclipse Indigo, with the GlassFish 3.1.2 server. Everything is going great, but Firebug is showing me an "illegal character" error in 2 JavaScript files.
When I check those files I can't find any illegal character in them, but Firebug still reports the error.
I have used these same files in an ASP.NET project, where they worked fine, so I compared their content types in the two projects and found that in the ASP.NET project the files are served with
Content-Type = application/x-javascript
while in my JSP/Spring (Java) project they are served with
Content-Type = text/javascript;charset=ISO-8859-1
So you can see that the same files have a different content type in each project. I found that this can be changed via configuration in the GlassFish server, so I want to change my JS files' Content-Type to match the one from the ASP.NET project.
If anyone has any other solution, please share it, because I haven't found anything other than changing the scheme in the GlassFish server. Thanks
Those strange characters you are seeing are the UTF-8 byte order mark (BOM). It is a special sequence of bytes that indicates how a document is encoded. Unfortunately, when interpreted as ISO-8859-1, those bytes produce exactly the problem you have. There are two ways to resolve this.
The first way is to change the output character set to UTF-8. I believe this can be done in your server configuration, in your web.xml, or by setting the character set on the HTTP response object in code; for example: yourServletResponse.setCharacterEncoding("UTF-8");
The second way is to remove the BOM from your JavaScript files. You can do this by opening them in Notepad++, going to Encoding > Convert to ANSI, and then saving them. Alternatively, open them in Notepad, go to Save As, and ensure that the Encoding option is set to ANSI before saving. Note that this may cause issues if you have non-ISO-8859-1 text in your JavaScript files, although that is unlikely.
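If you would rather strip the BOM programmatically than re-save every file by hand, the marker is just the three bytes EF BB BF at the start of the file. A small sketch (the helper name is my own, not from any library):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BomStripDemo {
    private static final byte[] UTF8_BOM = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };

    // Return the content without a leading UTF-8 BOM, if one is present;
    // otherwise return the array unchanged.
    static byte[] stripBom(byte[] content) {
        if (content.length >= 3
                && content[0] == UTF8_BOM[0]
                && content[1] == UTF8_BOM[1]
                && content[2] == UTF8_BOM[2]) {
            return Arrays.copyOfRange(content, 3, content.length);
        }
        return content;
    }

    public static void main(String[] args) {
        byte[] withBom = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'v', 'a', 'r' };
        System.out.println(new String(stripBom(withBom), StandardCharsets.UTF_8)); // var
    }
}
```

Read each file with Files.readAllBytes, pass it through stripBom, and write the result back.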
I'm running into an encoding issue that has stumped me for a few weeks and nothing seems to work. I have a website that works fine on my local machine, but when I push the jsp files to a Linux box for review, characters that previously rendered fine are now displaying as funky characters.
For some reason, some characters display just fine, but other characters will not encode properly. All text on the page is being read from java .properties files and output to the page using beans.
I've added a meta tag to the page to set the encoding, which did nothing. I also added <%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8" %>, but this did nothing on the Linux box and actually made the encoding errors appear on my local Windows machine.
Any help would be greatly appreciated.
Check that the method loading the properties uses the character encoding that the property files are actually written in.
Without explicitly setting one, the platform's default encoding is used, and that is ISO-Latin-1 (Windows-1252) on Windows and UTF-8 on many Linux distributions.
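You can check which default each JVM is actually using; if the printout below differs between your Windows machine and the Linux box, that is the mismatch:

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    public static void main(String[] args) {
        // Typically windows-1252 on Windows and UTF-8 on recent Linux distros.
        System.out.println(Charset.defaultCharset());
        // The system property the default is usually derived from.
        System.out.println(System.getProperty("file.encoding"));
    }
}
```

Running the app server with -Dfile.encoding=UTF-8 on both machines is one way to make the defaults agree, though explicit encodings in code are more robust.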
The following need to play together for character encoding to work properly in Nixes and Nuxes:
file system encoding
database encoding (does not seem to apply)
database connector encoding
Java-internal string encoding (UTF-16, if I remember correctly)
Java output encoding
HTML page encoding
With your page directive, you have only addressed the last bullet. In other words, you are instructing the browser to decode the page as UTF-8, but that's not what you are sending.
Take a look at this (admittedly a few years old) paper, chapter 11 in particular.
Also, check the physical files on both machines. I've seen several FTP clients muck up files during transfer. A quick check is to push your file as html instead of jsp. You'll get garbage for all the <% %> sequences, but the other text should show up unchanged. You've also taken the app server out of the equation. If the text is still funky, it's your FTP or WebDAV client trying to "help".
Look at the HTTP headers sent by the server. That is the first place the browser looks for the encoding, before anything else.