How to "activate" UTF8 with Mysql - java

I have problems when I insert words with accent marks into tables, so I think I have to "activate" UTF-8 to fix that error.
I'm not using Class.forName. This is my code:
miInitialContext = new InitialContext();
miDS = (DataSource) miInitialContext.lookup(InformacionProperties.getStrDataSource());
Connection conexion = miDS.getConnection();
Statement myStatement = conexion.createStatement();
myStatement.executeUpdate("INSERT INTO table values ......");
How can I "activate" that UTF8 with my code?

Use SET NAMES to set your connection charset. However, that won't matter much if the columns, tables, or databases you interact with aren't configured with a compatible charset. For instance, with a latin1 column testcol, inserting utf8 data will result in an error like:
INSERT INTO `test`.`table` (`testcol`) VALUES ('test_val'), ('√çdata');
ERROR 1366 (HY000): Incorrect string value: '\xE2\x88\x9A\xC3\xA7d...' for column 'testcol' at row 2
So you'll need to update the table structure
ALTER TABLE t MODIFY testcol CHAR(50) CHARACTER SET utf8;
Which then fixes the issue:
INSERT INTO `test`.`table` (`testcol`) VALUES ('test_val'), ('√çdata');
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
(See the mysql docs for details).
This post has good documentation on finding the character sets of various structures.
(Thanks to Shadow for the SET NAMES part)
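To see why the latin1 column rejects the value, it can help to decode the bytes quoted in the error message and check which character has no latin1 equivalent. The following is only a rough sketch: the class and method names are made up for illustration, and Java's windows-1252 charset is used as an assumed stand-in for MySQL's latin1.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of MySQL or Connector/J.
final class IncorrectStringValueSketch {

    // Converts the escaped prefix from the error message (backslash-x pairs) into raw bytes.
    static byte[] parseEscapedBytes(String escaped) {
        String hex = escaped.replace("\\x", "");
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return bytes;
    }

    // Returns the first code point the target charset cannot represent, or -1 if everything fits.
    static int firstUnencodable(String text, Charset target) {
        CharsetEncoder encoder = target.newEncoder();
        for (int codePoint : text.codePoints().toArray()) {
            if (!encoder.canEncode(new String(Character.toChars(codePoint)))) {
                return codePoint;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 approximates MySQL's latin1 for this check.
        Charset latin1Like = Charset.forName("windows-1252");
        // The byte prefix quoted in the ERROR 1366 message above.
        String sent = new String(parseEscapedBytes("\\xE2\\x88\\x9A\\xC3\\xA7"), StandardCharsets.UTF_8);
        int offending = firstUnencodable(sent, latin1Like);
        System.out.println(offending < 0
                ? "all characters fit"
                : String.format("does not fit: U+%04X", offending));
    }
}
```

If a check like this reports a character that does not fit, changing the connection charset alone will not help; the column itself has to use utf8, as described above.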

Related

mysql character_set_results is not changed

I'm using MySQL with Hibernate and having problems with all languages other than English. I'm getting an exception saying that the language is not utf8, though the language is utf8 (Hebrew).
I ran show variables like '%character%'; and this is what I got:
I think maybe character_set_server is the problem? It is latin1 and I can't change it to utf8. How do I do that? I'm using Amazon RDS, and under the parameter group I see utf8 for character_set_server, so I don't understand why it's not utf8 above.
On the other hand, maybe it's not the problem at all. Any other suggestions are welcome.
EDIT:
I managed to change the values in the attached image to utf8 for everything, but I'm still getting the following exception:
2016-02-21 08:46:05 DEBUG SqlExceptionHelper:139 - could not execute statement [n/a]
java.sql.SQLException: Incorrect string value: '\xD7\xAA\xD7\xA9\xD7\x95...' for column 'text' at row 1
...
...
...
2016-02-21 08:46:05 WARN SqlExceptionHelper:144 - SQL Error: 1366, SQLState: HY000
2016-02-21 08:46:05 ERROR SqlExceptionHelper:146 - Incorrect string value: '\xD7\xAA\xD7\xA9\xD7\x95...' for column 'text' at row 1
2016-02-21 08:46:05 INFO AbstractBatchImpl:208 - HHH000010: On release of batch it still contained JDBC statements
2016-02-21 08:46:05 DEBUG SqlExceptionHelper:225 - SQL Warning
java.sql.SQLWarning: Incorrect string value: '\xD7\xAA\xD7\xA9\xD7\x95...' for column 'text' at row 1
EDIT 2:
So I managed also to fix the exception. It is now saved fine in the DB.
I fixed it by calling the following command for each column:
ALTER TABLE <table_name> MODIFY <column_name> VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci;
My problem now is that the results are returned from DB with question marks.
I still see empty value for character_set_results when calling show variables like '%character%';

I don't think character_set_server is the problem.
\xD7\xAA\xD7\xA9\xD7\x95 is hex for the utf8-encoding for 'תשו'. If it were interpreted as latin1, it would be '×ª×©×•'.
A strange setting is the empty value (empty string? NULL?) for character_set_results, which controls transliteration during SELECT.
Please provide the output of SELECT col, HEX(col) FROM ... -- If you get hex of D7AAD7A9D795 for that Hebrew string, then the data is stored correctly, and we should look at the output side. If not, then the data is stored incorrectly. Or that ALTER messed things up.
Hebrew, in utf8, displayed as hex, is mostly 'D7xx'.
You need utf8 in several places:
The bytes you are inserting need to be encoded in utf8.
The connection needs to be in utf8. <property name="url" value="jdbc:mysql://...&characterSetResults=utf8&characterEncoding=utf-8"/>
The table definition needs to say CHARACTER SET utf8 (or utf8mb4). Do SHOW CREATE TABLE.
The html output needs <meta charset="utf-8" />.
If data is coming from an HTML form: <form accept-charset="UTF-8">
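The hex check described above can also be reproduced on the Java side, to compare against what SELECT col, HEX(col) returns. This is a minimal sketch with made-up class and method names; it uses the three Hebrew letters from the exception message (written as Unicode escapes) and Java's windows-1252 charset as an assumed stand-in for latin1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for comparing Java-side bytes with MySQL's HEX(col) output.
final class HexCheckSketch {

    // Upper-case hex of the UTF-8 encoding, in the same shape as HEX(col).
    static String utf8Hex(String text) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    // What the same UTF-8 bytes look like when wrongly interpreted as a single-byte charset.
    static String misreadAsCp1252(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), Charset.forName("windows-1252"));
    }

    public static void main(String[] args) {
        // U+05EA U+05E9 U+05D5: the Hebrew letters behind the bytes D7 AA D7 A9 D7 95.
        String hebrew = "\u05EA\u05E9\u05D5";
        System.out.println(utf8Hex(hebrew));
        System.out.println(misreadAsCp1252(hebrew));
    }
}
```

If HEX(col) matches the Java-side hex, the stored data is fine and the problem is on the output path (connection or page encoding).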

character_set_results is empty in mysql

I'm using a MySQL DB, hosted on Amazon RDS.
When I execute this query: show global variables like '%character%';
I get the following result:
But when I execute this query: show variables like '%character%'; I get this result:
As you can see character_set_results is empty. I tried following queries and nothing changes the empty value:
ALTER DATABASE myDB CHARACTER SET utf8;
ALTER DATABASE myDB CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER DATABASE myDB DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
SET CHARACTER_SET_RESULTS=UTF8;
SET NAMES 'utf8';
SET CHARACTER SET 'utf8';
SET session CHARACTER_SET_RESULTS = utf8;
As I understand the second query returns session parameters. Might this affect the results I'm getting?
I have two machines. Development machine where I run the application with eclipse (this is a java app with hibernate) and another one, the deployment machine. Everything works fine on the development machine but on the deployment machine sometimes I'm getting ???? or other strange characters when getting data from DB.
Both machines connect to the same DB and data is stored fine in the db itself.
Also the connection url is jdbc:mysql://myDB:3306/myApp?autoReconnect=true&useUnicode=true&createDatabaseIfNotExist=true&characterEncoding=utf-8.
Any ideas what can cause this?

Multiple question marks usually come from:
You are INSERTing Chinese (or any non-Western-European text).
You said SET NAMES utf8 to declare that the client bytes are utf8-encoded (correct).
But the table columns are declared CHARSET latin1. <-- This is the problem.
Since there is no way to convert Chinese characters into latin1, '?' is stored.
Please provide SHOW CREATE TABLE to confirm the above hypothesis.
If you are getting "other strange characters", please provide:
SELECT col, HEX(col) FROM ...
to show an example of what is stored for the "strange characters".
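The "one question mark per character" effect can be imitated in plain Java. This is only an analogy, assuming Java's ISO-8859-1 encoder behaves like the server-side conversion to latin1; the class name is made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo: Java's encoder substitutes '?' for characters latin1 cannot hold.
final class QuestionMarkSketch {

    // Encodes to ISO-8859-1 and decodes again; unmappable characters come back as '?'.
    static String throughLatin1(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+4E2D U+6587 are two Chinese characters; neither exists in latin1.
        System.out.println(throughLatin1("\u4E2D\u6587"));
        // U+00EB (e with diaeresis) exists in latin1 and survives unchanged.
        System.out.println(throughLatin1("Belgi\u00EB"));
    }
}
```

Once the '?' has been stored, the original characters are gone, so the fix has to happen before the INSERT (column charset), not when reading.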

How to save special characters in database

I'm using a MySQL database and inserting like this:
try (Connection connection = DbConnector.connectToDb();
PreparedStatement stm = connection.prepareStatement("INSERT INTO Country (name) VALUES (?)")) {
stm.setString(1, name);
if (stm.executeUpdate() > 0) {
result = true;
}
} catch (Exception e) {
logStackTrace(e);
}
Now when I insert België, it is saved in a weird way in the database: the ë isn't saved. How can I solve this?
EDIT:
I just changed the table via:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
But still when I add a new one it isn't displayed correctly on the webpage.

There are two points in the database to check in order to correctly set the UTF-8 charset.
Database Level
This is set when creating it:
CREATE DATABASE `db` CHARACTER SET utf8;
Table Level
All of the tables need to be in UTF-8 also (which seems to be the case for you)
CREATE TABLE `Table1` (
[...]
) DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
The important part being DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
Finally, if your code isn't handling utf8 correctly, you can force your JVM to use utf8 encoding by changing the settings on startup:
java -Dfile.encoding=UTF-8 [...]
or by setting the environment variable
JAVA_TOOL_OPTIONS to -Dfile.encoding=UTF-8
or programmatically by using:
System.setProperty("file.encoding", "UTF-8");
(this last one may not have the desired effect, since the JVM caches the default character encoding on startup)
Hope that helped.
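An alternative to relying on the JVM default is to name the charset explicitly wherever the code converts between strings and bytes, so the result does not depend on file.encoding. A minimal sketch, with made-up class and method names:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: explicit charsets make conversions independent of the JVM default.
final class ExplicitCharsetSketch {

    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Shows what the JVM picked up at startup (affected by -Dfile.encoding).
        System.out.println("JVM default charset: " + Charset.defaultCharset());

        // "Belgi" followed by U+00EB: six characters, seven bytes in UTF-8.
        byte[] encoded = toUtf8("Belgi\u00EB");
        System.out.println(encoded.length + " bytes");
        System.out.println(fromUtf8(encoded).equals("Belgi\u00EB"));
    }
}
```

This only covers conversions done in your own code; the JDBC connection encoding and the table charset still need to be set as described above.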

String Converting Issue

I've been stuck working with Java and MySQL these days.
The problem is that I have a MySQL database with a column in one table that holds Chinese city names. A colleague changed the database to utf8 for every character setting (connection, db, results, server and system). As a consequence, the data from before the change no longer displays correctly unless I set the %character% variables back to latin1. With either character set I can only retrieve half the data correctly. Could you please help me solve the problem?
I've tried to use Java to solve the problem, but it doesn't work.
String sql = "SELECT * FROM customer_addresses";
ResultSet result = query.executeQuery(sql);
while (result.next()) {
byte b[] = result.getBytes("city");
c = new String(result.getBytes("city"), "UTF-8");
}
For example: there is one city in db like this ä¹Œé²æœ¨é½å¸‚
the java print: 乌�?木�?市
it should be:乌鲁木齐市
Thanks in advance

Default charset of your MySQL server is probably not UTF8. Try to execute the following SQL queries before getting data from the database:
SET NAMES utf8
and
SET CHARACTER SET utf8

Add characterEncoding=UTF-8 to the connection string, where you connect to the database. For example:
"jdbc:mysql://servername:3306/databasename?characterEncoding=UTF-8"
Incidentally, the data in the database appears to be broken. If you want the database to store 乌鲁木齐市, that's what should be in the table, not ä¹Œé²æœ¨é½å¸.
Update: The problem with how the data is stored in the database is easier to solve using the database's own tools, not Java. For each table that stores text, do this:
ALTER TABLE tablename CONVERT TO CHARACTER SET binary;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8;
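For illustration only, here is roughly what that kind of damage looks like from Java and why it is reversible in principle: UTF-8 bytes were read as a single-byte charset, and re-encoding with the same single-byte charset gives the original bytes back. The names are made up, and the sketch assumes ISO-8859-1 because it maps all 256 byte values. Java's windows-1252 leaves a few byte values (such as 0x81 and 0x90) undefined, so a round trip through it can lose characters.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo of UTF-8 bytes misread as latin1, and the reverse step.
final class MojibakeRepairSketch {

    // Simulates the damage: UTF-8 bytes interpreted one byte per character.
    static String misreadUtf8AsLatin1(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses it: recover the original bytes, then decode them as UTF-8.
    static String repair(String mojibake) {
        return new String(mojibake.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The five-character city name from the question, written as Unicode escapes.
        String city = "\u4E4C\u9C81\u6728\u9F50\u5E02";
        String damaged = misreadUtf8AsLatin1(city);
        System.out.println(damaged.length() + " chars after misreading");
        System.out.println(repair(damaged).equals(city));
    }
}
```

Applying such a repair blindly would corrupt rows that are already stored correctly, which is one more reason to prefer the database-side conversion above when old and new data are mixed.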

Can't store UTF-8 Content in MySQL Using Java PreparedStatement

For some strange reason I can't seem to add UTF-8 data to my MySQL database. When I enter a non-latin character, it's stored as ?????. Everything else is stored fine. So for example, "this is an example®™" is stored fine, but "和英辞典" is stored as "????".
The connection url is fine:
private DataSource getDB() throws PropertyVetoException {
ComboPooledDataSource db = new ComboPooledDataSource();
db.setDriverClass("com.mysql.jdbc.Driver");
db.setJdbcUrl("jdbc:mysql://domain.com:3306/db?useUnicode=true&characterEncoding=UTF-8");
db.setUser("...");
db.setPassword("...");
return db;
}
I'm using PreparedStatement as you would expect, I even tried entering "set names utf8" as someone suggested.
Connection conn = null;
PreparedStatement stmt = null;
ResultSet rs = null;
try {
conn = db.getConnection();
stmt = conn.prepareStatement("set names utf8");
stmt.execute();
stmt = conn.prepareStatement("set character set utf8");
stmt.execute();
... set title...
stmt = conn.prepareStatement("INSERT INTO Table (title) VALUES (?)");
stmt.setString(1,title);
stmt.execute();
} catch (final SQLException e) {
...
The table itself seems to be fine.
Default Character Set: utf8
Default Collation: utf8_general_ci
...
Field title:
Type text
Character Set: utf8
Collation: utf8_unicode_ci
I tested it by entering in Unicode ("和英辞典" specifically) through a GUI editor and then selecting from the table -- and it was returned just fine. So this seems to be an issue with JDBC.
What am I missing?

In your JDBC connection string, you just need to set the charset encoding like this:
jdbc:mysql://localhost:3306/dbname?characterEncoding=utf8
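The same setting can also be passed as a driver property instead of in the URL. The sketch below is an assumption-laden illustration, not the asker's code: the class name, the articles table and its title column are hypothetical, and it needs the MySQL JDBC driver on the classpath plus a reachable database to do anything beyond printing usage.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical sketch: pass characterEncoding as a connection property.
final class Utf8ConnectionSketch {

    // Builds driver properties; characterEncoding is the property named in the answer above.
    static Properties utf8Properties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("characterEncoding", "UTF-8");
        return props;
    }

    // Inserts one value with a bound parameter; table and column names are placeholders.
    static int insertTitle(String jdbcUrl, Properties props, String title) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl, props);
             PreparedStatement stmt = conn.prepareStatement("INSERT INTO articles (title) VALUES (?)")) {
            stmt.setString(1, title);
            return stmt.executeUpdate();
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 4) {
            System.out.println("usage: <jdbcUrl> <user> <password> <title>");
            return;
        }
        int rows = insertTitle(args[0], utf8Properties(args[1], args[2]), args[3]);
        System.out.println(rows + " row(s) inserted");
    }
}
```

Whether this alone fixes the problem depends on the rest of the chain (column charset, server settings), as the other answers point out.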

There are two points in the mysql server to check in order to correctly set the UTF-8 charset.
Database Level
This is set when creating it:
CREATE DATABASE `db` CHARACTER SET utf8;
Table Level
All of the tables need to be in UTF-8 also (which seems to be the case for you)
CREATE TABLE `Table1` (
[...]
) DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
The important part being DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
Finally, if your code isn't handling utf8 correctly, you can force your JVM to use utf8 encoding by changing the settings on startup:
java -Dfile.encoding=UTF-8 [...]
or by setting the environment variable
JAVA_TOOL_OPTIONS to -Dfile.encoding=UTF-8
or programmatically by using:
System.setProperty("file.encoding", "UTF-8");
(this last one may not have the desired effect, since the JVM caches the default character encoding on startup)
Hope that helped.

Use stmt.setNString(...) instead of stmt.setString(...).
Also, don't forget to check the column collation on the database side.

If you log in to your mysql database and run show variables like 'character%';
this might provide some insight.
Since you're getting a one-to-one ratio of multi-byte characters to question marks then it's likely that the connection is doing a character set conversion and replacing the Chinese characters with the replacement character for the single-byte set.

Also check locale -a. By default, Ubuntu works with the en_US locale and doesn't have other locales installed.
You must specify characterEncoding=utf8 when connecting through JDBC.

Add ?characterEncoding=utf8 at the end of your DB connection URL (nothing else needed).
For example:
spring.datasource.url = jdbc:mysql://localhost:3306/dbname?characterEncoding=utf8
