Good afternoon,
I'm trying to resolve the classic encoding error in Java, but I don't know what to do...
I have tried:
adding to the JSP: <%@page contentType="text/html" pageEncoding="UTF-8"%>
using "SQL_Latin1_General_CP1_CI_AS" in the SELECT (SQL Server)
adding "CharacterSet=UTF-8" to the JDBC connection string
adding response.setContentType("application/json"); and response.setCharacterEncoding("utf-8"); in the servlet
but nothing works!
DBMS: SQL Server
Server: GlassFish
Example record in the database: "Está"
What can I do?
It seems that you have the jTDS parameter sendStringParametersAsUnicode=false.
One solution is to change it to true. If that is not an option:
SQL_Latin1_General_CP1_CI_AS uses the CP-1252 (Windows-1252) encoding, so to search the database you need to convert your Unicode string to Windows-1252:
new String(value.getBytes("UTF-8"), "Windows-1252")
And vice versa after reading from the database:
new String(value.getBytes("Windows-1252"), "UTF-8")
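To see what these two conversions actually do to a value, here is a minimal sketch using the "Está" record from the question. The class and method names are made up for illustration, and it only checks this one value, so treat it as a demonstration rather than a general-purpose fix.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Cp1252RoundTrip {
    // Windows-1252 ships with mainstream JDKs, but it is not one of the
    // charsets the Java specification guarantees, hence the lookup by name.
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Same effect as: new String(value.getBytes("UTF-8"), "Windows-1252")
    static String utf8BytesReadAsCp1252(String value) {
        return new String(value.getBytes(StandardCharsets.UTF_8), CP1252);
    }

    // Same effect as: new String(value.getBytes("Windows-1252"), "UTF-8")
    static String cp1252BytesReadAsUtf8(String value) {
        return new String(value.getBytes(CP1252), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Est\u00e1"; // "Está", written with an escape so the source file encoding does not matter
        String converted = utf8BytesReadAsCp1252(original);
        String restored = cp1252BytesReadAsUtf8(converted);

        // The two UTF-8 bytes of 'á' (0xC3 0xA1) become two separate characters.
        System.out.println(converted.length());        // 5 instead of 4
        System.out.println(restored.equals(original)); // true
    }
}
```

The first conversion turns the 4-character string into a 5-character one, because each UTF-8 byte is read as its own Windows-1252 character; the second conversion reverses that.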
Related
I am trying to understand the difference between, and the importance of, the different charsets available when encoding and decoding text.
I have a scenario where I want to call a REST API. The API has a base URL, for example: https://myrestapiurl.com. To perform a GET request, the URL is formed by appending the id of the entity that I want to fetch, like: https://myrestapiurl.com('id')
id : It has no limitations on valid characters!
I have encountered this id: باقی ریسورس. So before calling the REST API, I need to encode it. Using Java's URLEncoder, I tried the following:
String s ="باقی ریسورس";
String encodedID = URLEncoder.encode(s, StandardCharsets.UTF_8.name());
Using the encodedID, I try to make a request using Postman. The request fails with 404 or 400 when I use a different charset. It only succeeds when I encode using ISO_8859_1 as follows:
String encodedID = URLEncoder.encode(s, StandardCharsets.ISO_8859_1.name());
String URL = "https://myrestapiurl.com('" + encodedID + "')";
This works fine, both through code and through Postman. My questions are:
How can I decide which charset to use before encoding? Or should I have fallbacks? That is, if it fails with UTF_8, then try UTF_16, and so on... but this is very inefficient. If the entity doesn't actually exist, these retries would be pure overhead.
Also, when I visit https://www.w3schools.com/tags/ref_urlencode.ASP and enter the text to be encoded, it provides the valid encoded string with ISO_8859_1. How does it manage to do so?
How can this be done in Java without using any extra libraries such as Apache's? We are not allowed to add extra dependencies!
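For reference, here is a minimal sketch of what can be done with the JDK alone. It assumes the server expects UTF-8 percent-encoding (which the question does not confirm), and the class and method names are hypothetical. It also checks whether a charset can represent an id at all: ISO-8859-1 has no Arabic-script characters, so an id like the one above cannot be carried through it intact.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class IdEncoding {
    // True only if every character of the id exists in the given charset.
    static boolean representable(String id, Charset charset) {
        return charset.newEncoder().canEncode(id);
    }

    // Percent-encodes the id as UTF-8 bytes. URLEncoder targets HTML form
    // encoding and writes a space as '+', so '+' is rewritten to "%20".
    // A literal '+' in the input has already become "%2B" at that point.
    static String encodeUtf8(String id) {
        try {
            return URLEncoder.encode(id, StandardCharsets.UTF_8.name()).replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        // The id from the question, written with escapes so the source file
        // encoding does not matter.
        String id = "\u0628\u0627\u0642\u06cc \u0631\u06cc\u0633\u0648\u0631\u0633";

        System.out.println(representable(id, StandardCharsets.ISO_8859_1)); // false
        System.out.println(representable(id, StandardCharsets.UTF_8));      // true

        // Base URL and ('...') wrapper taken from the question.
        String url = "https://myrestapiurl.com('" + encodeUtf8(id) + "')";
        System.out.println(url);
    }
}
```

A check like representable(...) is cheaper than retrying requests with one charset after another, but which charset the server decodes with still has to come from the API's documentation or its owner.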
I am sending sensitive data, encrypted, when the user triggers the onclick event. This encrypted data sometimes contains a plus sign (+). When I retrieve this request variable on the server, the + has been converted to a space. This causes the decryption to fail.
Example:
xrUxHtYpO2Yu3Z31ve+KNA==
gets converted to:
xrUxHtYpO2Yu3Z31ve KNA==
Is there a way to escape the string so it is sent as is?
The function you're looking for is "encodeURIComponent()":
var encoded = encodeURIComponent("nasty string");
You shouldn't need any code at all on the server side; URL encoding will almost certainly be implicitly un-done by your web framework. (Edit - ah, if you're using some Java/JSP web framework, then you definitely don't have to do anything fancy on the server side.)
Try replacing the + with %2B. That came from HTML URL Encoding Reference at W3Schools. Hope this helps!
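To illustrate why the + turns into a space, here is a small Java sketch of what form-style URL decoding does on the server side, using the example value from the question. The class and method names are invented for the example; a real framework performs this decoding for you.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class PlusSignDemo {
    // Form-style encoding: '+' becomes %2B and '=' becomes %3D.
    static String encode(String raw) {
        try {
            return URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Form-style decoding: a bare '+' is read as a space.
    static String decode(String wire) {
        try {
            return URLDecoder.decode(wire, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        String cipherText = "xrUxHtYpO2Yu3Z31ve+KNA==";

        // Sent without escaping: the '+' is lost on decoding.
        System.out.println(decode(cipherText));         // xrUxHtYpO2Yu3Z31ve KNA==

        // Escaped first: the value survives the round trip.
        System.out.println(encode(cipherText));         // xrUxHtYpO2Yu3Z31ve%2BKNA%3D%3D
        System.out.println(decode(encode(cipherText))); // xrUxHtYpO2Yu3Z31ve+KNA==
    }
}
```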
I'm facing a really annoying problem: I created a form with Spring's form tags, and when I insert text with non-Latin characters I get a sequence of question marks. I've used the CharacterEncodingFilter in my web.xml, but I'm still facing the same problem. I've set characterEncoding to UTF-8 in the formBackingObject method of my controller, and I've set the page encoding, charset and enctype to UTF-8, with no result. I know there are similar posts here and I've tried the suggested solutions, but nothing changed! Any suggestions? Thank you in advance.
A sequence of question marks is typical when either the DB encoding or the HTTP response encoding cannot accept the obtained bytes for the encoding it was instructed to use.
Since you've set the page encoding to UTF-8, the HTTP response encoding part is fine (assuming that all you did was put <%@page pageEncoding="UTF-8" %> at the top of the JSP).
So, the DB encoding is suspect. You need to ensure that the DB has been instructed to use the proper encoding to store the characters. You're supposed to do this in the CREATE DATABASE and CREATE TABLE statements. With some JDBC drivers you also need to pass an extra argument in the JDBC connection string to specify the encoding in which the bytes are transferred. The details depend on the DB and JDBC driver used, so it's up to you to consult the appropriate manuals. If you get stuck, update your question to include the DB make/version used.
See also:
Unicode - How to get the characters right? - Section about Databases
I have a servlet which receives some parameters from the client and then does some work.
The parameters from the client are in Chinese, so I often get invalid characters in the servlet.
For example:
If I enter
http://localhost:8080/Servlet?q=中文&type=test
Then in the servlet, the 'type' parameter is correct (test), but the 'q' parameter is not decoded correctly: it becomes invalid characters that cannot be parsed.
However, if I press Enter in the address bar again, the URL changes to:
http://localhost:8080/Servlet?q=%D6%D0%CE%C4&type=test
Now my servlet gets the right value for the 'q' parameter.
What is the problem?
UPDATE
BTW, it works well when I send the form with POST.
When I send them with Ajax, for example:
url="http://..q='中文',
xmlhttp.open("POST",url,true);
Then the server side also gets the invalid characters.
It seems that the server side only gets the right result when the Chinese characters are encoded like %xx.
That is to say, http://.../q=中文 does not work,
while http://.../q=%D6%D0%CE%C4 works.
But why does "http://www.google.com.hk/search?hl=zh-CN&newwindow=1&safe=strict&q=%E4%B8%AD%E6%96%87&btnG=Google+%E6%90%9C%E7%B4%A2&aq=f&aqi=&aql=&oq=&gs_rfai=" work?
Ensure that the encoding of the page with the form itself is also UTF-8 and ensure that the browser is instructed to read the page as UTF-8. Assuming that it's JSP, just put this at the very top of the page to achieve that:
<%@ page pageEncoding="UTF-8" %>
Then, to process the GET query string as UTF-8, ensure that the servlet container in question is configured to do so. It's unclear which one you're using, so here's a Tomcat example: set the URIEncoding attribute of the <Connector> element in /conf/server.xml to UTF-8.
<Connector URIEncoding="UTF-8">
If you'd like to use POST, then you need to ensure that the HttpServletRequest is instructed to parse the POST request body using UTF-8.
request.setCharacterEncoding("UTF-8");
Call this before you access the first parameter. A Filter is the best place for this.
See also:
Unicode - How to get the characters right?
Using non-ASCII characters as GET parameters (i.e. in URLs) is generally problematic. RFC 3986 recommends using UTF-8 and then percent encoding, but that's AFAIK not an official standard. And what you are using in the case where it works isn't UTF-8!
It would probably be safest to switch to POST requests.
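To make the point about the two URLs concrete: %D6%D0%CE%C4 looks like the GBK (GB2312) bytes of 中文, while Google's %E4%B8%AD%E6%96%87 is the UTF-8 form. That the browser used GBK is an assumption, since the question doesn't say so. A minimal sketch (class and method names are made up) that decodes both:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;

class QueryDecoding {
    // Decodes a percent-encoded value using the named charset.
    static String decode(String percentEncoded, String charsetName) {
        try {
            return URLDecoder.decode(percentEncoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String expected = "\u4e2d\u6587"; // 中文, written with escapes

        // The Google URL from the question carries UTF-8 bytes.
        System.out.println(decode("%E4%B8%AD%E6%96%87", "UTF-8").equals(expected)); // true

        // The bytes from the address bar are not valid UTF-8 for this text.
        System.out.println(decode("%D6%D0%CE%C4", "UTF-8").equals(expected));       // false

        // GBK is an extended charset and may be missing from a trimmed-down
        // runtime, so check before using it.
        if (Charset.isSupported("GBK")) {
            System.out.println(decode("%D6%D0%CE%C4", "GBK").equals(expected));     // true
        }
    }
}
```

So the same text has two different percent-encoded forms, and the server only gets the right characters when it decodes with the charset the sender used.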
I believe that the problem is on the sending side. As I understand from your description, if you type the URL into the browser you get a "correctly" encoded request. This job is done by the browser: it knows how to convert Unicode characters to a sequence of codes like %xx.
So, check how you send the request. It should be encoded when it is sent.
Another possibility is to use the POST method instead of GET.
Do read this article on the URL encoding format: "www.blooberry.com/indexdot/html/topics/urlencoding.htm".
If you want, you could convert the characters to hex or Base64 and put them in the parameters of the URL.
I think it's better to put them in the body (POST) than in the URL (GET).
I have written an application that parses the HTML code of some web pages. My problem is with inserting that data into my MySQL database. For example, I want to insert ľščťžýáíé, and when I look into the table I get ?š??žýáíé.
I guess the problem could be that the HTML pages I'm downloading are encoded in cp1250, but the database is utf8.
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(),"cp1250"));
And this is how I download the data.
Do you have any ideas how to fix this problem? Because I've already run out.
Edit: oh, and when I write the data out to the console (with System.out, I know I shouldn't use it... :) ) every character shows up correctly.
Issue a SET NAMES CP1251; right after you connect to MySQL and before any inserts.
So I found out what works.
As I'm connecting to MySQL via JDBC, I used the following connection string:
conString = "jdbc:mysql://"+host+"/"+database+"?useUnicode=true&characterEncoding=utf8";
And this did the trick. I had to force JDBC to use utf8 for the connection by adding ?useUnicode=true&characterEncoding=utf8
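One plausible explanation of the ?š??žýáíé pattern, as a sketch: if the connection was defaulting to MySQL's latin1 (which, as far as I know, corresponds to Windows-1252), then only the characters that exist in that charset would survive and the rest would turn into question marks. That default is an assumption, not something the thread confirms, and the class and method names below are made up. Pushing the string through Windows-1252 in plain Java reproduces exactly that pattern:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class LossyCharsetDemo {
    // Encodes the text to bytes and decodes it again with the same charset.
    // Characters the charset cannot represent come back as '?'.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // "ľščťžýáíé" from the question, written with escapes so the source
        // file encoding does not matter.
        String text = "\u013e\u0161\u010d\u0165\u017e\u00fd\u00e1\u00ed\u00e9";

        // Windows-1252 has š, ž, ý, á, í, é but not ľ, č, ť.
        String viaCp1252 = roundTrip(text, Charset.forName("windows-1252"));
        System.out.println(viaCp1252.equals("?\u0161??\u017e\u00fd\u00e1\u00ed\u00e9")); // true

        // UTF-8 can represent every character, so nothing is lost.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text));        // true
    }
}
```

This would also explain why the console output looked fine: the string was correct in Java and only got damaged on its way through the connection.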