How to save special characters in database - java

I'm using a MySQL database and inserting like this:
try (Connection connection = DbConnector.connectToDb();
PreparedStatement stm = connection.prepareStatement("INSERT INTO Country (name) VALUES (?)")) {
stm.setString(1, name);
if (stm.executeUpdate() > 0) {
result = true;
}
} catch (Exception e) {
logStackTrace(e);
}
Now when I insert: België it is saved in a weird way in the database the ë isn't saved. How can I solve this?
EDIT:
I just changed the table via:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
But still when I add a new one it isn't displayed correctly on the webpage.

There is 2 points in the database to check in order to correctly set the UTF-8 charset.
Database Level
This is obtained by creating it :
CREATE DATABASE 'db' CHARACTER SET 'utf8';
Table Level
All of the tables need to be in UTF-8 also (which seems to be the case for you)
CREATE TABLE `Table1` (
[...]
) DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
The important part being DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
Finally, if your code weren't handling utf8 correctly, you could have forced your JVM to use utf8 encoding by changing the settings by on startup :
java -Dfile.encoding=UTF-8 [...]
or changing the environment variable
"**JAVA_TOOLS_OPTIONS**" to -Dfile.encoding="UTF-8"
or programmatically by using :
System.setProperty("file.encoding" , "UTF-8");
(this last one may not have the desire effect since the JVM caches value of default character encoding on startup)
Hope that helped.

Related

How to "activate" UTF8 with Mysql - java

I have problems when I insert into tables words with accent marks. So I think that I have to "activate" UTF-8 to fix that error.
I'm not using Class for name. That's my code:
miInitialContext = new InitialContext();
miDS = (DataSource) miInitialContext.lookup(InformacionProperties.getStrDataSource());
Connection conexion = miDS.getConnection();
Statement myStatement = conexion.createStatement();
myStatement.executeUpdate("INSERT INTO table values ......)
How can I "activate" that UTF8 with my code?
Use set names to set your connection charset. However, that won't matter much if the columns/tables/databases you interact with aren't configured with a compatible charset. For instance, with a latin1 column testcol, inserting utf8 data will result in an error like
INSERT INTO `test`.`table` (`testcol`) VALUES ('test_val'), ('Ídata');
ERROR 1366 (HY000): Incorrect string value: '\xE2\x88\x9A\xC3\xA7d...' for column 'testcol' at row 2
So you'll need to update the table structure
ALTER TABLE t MODIFY testcol CHAR(50) CHARACTER SET utf8;
Which then fixes the issue:
INSERT INTO `test`.`table` (`testcol`) VALUES ('test_val'), ('Ídata');
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
(See the mysql docs for details).
This post has good documentation on finding the character sets of various structures.
(Thanks to Shadow for the SET NAMES part)

character_result_set is empty in mysql

I'm using mysql DB. The DB is amazone RDS DB.
When I execute this query: show global variables like '%character%';
I get the following result:
But when I execute this query: show variables like '%character%'; I get this result:
As you can see character_set_results is empty. I tried following queries and nothing changes the empty value:
ALTER DATABASE myDB CHARACTER SET utf8;
ALTER DATABASE myDB CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER DATABASE myDB DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
SET CHARACTER_SET_RESULTS=UTF8
SET NAMES 'utf8';
SET CHARACTER SET 'utf8';
SET session CHARACTER_SET_RESULTS = utf8;
As I understand the second query returns session parameters. Might this affect the results I'm getting?
I have two machines. Development machine where I run the application with eclipse (this is a java app with hibernate) and another one, the deployment machine. Everything works fine on the development machine but on the deployment machine sometimes I'm getting ???? or other strange characters when getting data from DB.
Both machines connect to the same DB and data is stored fine in the db itself.
Also the connection url is jdbc:mysql://myDB:3306/myApp?autoReconnect=true&useUnicode=true&createDatabaseIfNotExist=true&characterEncoding=utf-8.
Any ideas what can cause this?
Multiple question marks usually come from:
You are INSERTing Chinese (or any non-western-Europe text).
You said SET NAMES utf8 to declare that the client bytes are utf8-encoded (correct).
But the table columns are declared CHARSET latin1. <-- This is the problem.
Since there is no way to convert Chinese characters into latin1, '?' stored.
Please provide SHOW CREATE TABLE to confirm the above hypothesis.
If you are getting "other strange characters", please provide:
SELECT col, HEX(col) FROM ...
to show an example of what is stored for the "strange characters".

ResultSet.updateRow() produces "Illegal mix of collations (latin1_bin,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '<=>'"

I have the following table with name being LATIN1 and the rest being UTF8.
CREATE TABLE `test_names` (
`name` varchar(500) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
`other_stuff_1` int DEFAULT NULL,
`other_stuff_2` varchar(45) DEFAULT NULL,
PRIMARY KEY (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
I encounter the following problem in Java:
I SELECT ... FOR UPDATE. Then I call updateInt(2, 1) and updateRow() on its ResultSet and get Illegal mix of collations (latin1_bin,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '<=>'.
How can I make this work without changing the table’s / connection’s charset?
Thanks a lot.
--- UPDATE ---
I use SELECT name, other_stuff_1 FROM test_names LIMIT 1 FOR UPDATE; and the connection string is DriverManager.getConnection("jdbc:mysql://" + host + ":" + port + "/" + db + "?allowMultiQueries=true", user, password);.
The exact stack trace is:
java.sql.SQLException: Illegal mix of collations (latin1_bin,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '<=>'
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1086)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4237)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4169)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2617)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2778)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2834)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2156)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2441)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2366)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2350)
at com.mysql.jdbc.UpdatableResultSet.updateRow(UpdatableResultSet.java:2405)
From my side, I can give you some suggestions
First update your Connector/J latest version.
Run this query
SET NAMES='utf8'
Add &characterEncoding=UTF-8 onto your JDBC connect string. Hope you have done it already.
Use convert() for insert or update and cast() for select query. For more details http://dev.mysql.com/doc/refman/5.7/en/charset-convert.html
For "Illegal mix of collations" related issue, you can follow up Troubleshooting "Illegal mix of collations" error in mysql
The problem is that you are trying to select from a table which has different charsets on different columns.
You have to convert the columns that have different charsets in the query to the proper charset using CONVERT().
http://dev.mysql.com/doc/refman/5.7/en/charset-convert.html
In your case, you should modify your query to:
SELECT CONVERT(name USING latin1), other_stuff_1 FROM test_names LIMIT 1 FOR UPDATE;
IMHO you can't use updatable resultset in this case when the encoding in the table is not uniform (for example the primary key is in latin1 but the other columns are in utf8).
As Andras pointed out, you can convert the encoding on the sql side but then you won't be able to update the resultset.
Why don't you simply update the table with executeUpdate(...)? You can narrow down the target line(s) with a simple select then iterate over the resulting primary key list and call executeUpdate.
For me that is working for mixed encoded columns. Just an example:
conn.setAutoCommit(false);
ResultSet rs = st.executeQuery("SELECT name other_stuff_1 FROM test_names");
List<String> keys = new ArrayList<String>();
while(rs.next()) {
keys.add(rs.getString(1));
}
rs.close();
for(String key : keys) {
st.executeUpdate("update test_names set other_stuff_1='"+key.length()+"', other_stuff_2='" + key.toUpperCase() + "' where name='" + key + "'");
}
conn.commit();

String Converting Issue

Ive been stunk in working with java and mysql these days.
The problem is, ive got a mysql database. There is a column in one table which shows the chinese city names. One collegue changed the db to utf8 for every character(connection, db, results, server and system) The consequence is that the data before the change didn't show correctly any more only if i set the %character% back to latin1. In either character set i can only retrive half the data correctly. Could you please help me how to solve the problem?
Ive tried to use java to solve the problem but it doesn't work.
String sql = "SELECT * FROM customer_addresses";
ResultSet result = query.executeQuery(sql);
while (result.next()) {
byte b[] = result.getBytes("city");
c = new String(result.getBytes("city"), "UTF-8");
}
For example: there is one city in db like this 乌é²æœ¨é½å¸‚
the java print: 乌�?木�?市
it should be:乌鲁木齐市
Thanks in advance
Default charset of your MySQL server is probably not UTF8. Try to execute the following SQL queries before getting data from the database:
SET NAMES utf8
and
SET CHARACTER SET utf8
Add characterEncoding=UTF-8 to the connection string, where you connect to the database. For example:
"jdbc:mysql://servername:3306/databasename?characterEncoding=UTF-8"
Incidentally, the data in the database appears to be broken. If you want the database to store 乌鲁木齐市, that's what should be in the table, not 乌é²æœ¨é½å¸.
Update: The problem with the how the data is stored in the database is easier to solve using database's own tools, not Java. For each table that stores text do this:
ALTER TABLE tablename CONVERT TO CHARACTER SET binary;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8;

Can't store UTF-8 Content in MySQL Using Java PreparedStatement

For some strange reason I can't seem to add UTF-8 data to my MySQL database. When I enter a non-latin character, it's stored as ?????. Everything else is stored fine. So for example, "this is an example®™" is stored fine, but "和英辞典" is stored as "????".
The connection url is fine:
private DataSource getDB() throws PropertyVetoException {
ComboPooledDataSource db = new ComboPooledDataSource();
db.setDriverClass("com.mysql.jdbc.Driver");
db.setJdbcUrl("jdbc:mysql://domain.com:3306/db?useUnicode=true&characterEncoding=UTF-8");
db.setUser("...");
db.setPassword("...");
return db;
}
I'm using PreparedStatement as you would expect, I even tried entering "set names utf8" as someone suggested.
Connection conn = null;
PreparedStatement stmt = null;
ResultSet rs = null;
try {
conn = db.getConnection();
stmt = conn.prepareStatement("set names utf8");
stmt.execute();
stmt = conn.prepareStatement("set character set utf8");
stmt.execute();
... set title...
stmt = conn.prepareStatement("INSERT INTO Table (title) VALUES (?)");
stmt.setString(1,title);
stmt.execute();
} catch (final SQLException e) {
...
The table itself seems to be fine.
Default Character Set: utf8
Default Collation: utf8_general_ci
...
Field title:
Type text
Character Set: utf8
Collation: utf8_unicode_ci
I tested it by entering in Unicode ("和英辞典" specifically) through a GUI editor and then selecting from the table -- and it was returned just fine. So this seems to be an issue with JDBC.
What am I missing?
On your JDBC connection string, you just need set the charset encoding like this:
jdbc:mysql://localhost:3306/dbname?characterEncoding=utf8
There is 2 points in the mysql server to check in order to correctly set the UTF-8 charset.
Database Level
This is obtained by creating it :
CREATE DATABASE 'db' CHARACTER SET 'utf8';
Table Level
All of the tables need to be in UTF-8 also (which seems to be the case for you)
CREATE TABLE `Table1` (
[...]
) DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
The important part being DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
Finally, if your code weren't handling utf8 correctly, you could have forced your JVM to use utf8 encoding by changing the settings by on startup :
java -Dfile.encoding=UTF-8 [...]
or changing the environment variable
"**JAVA_TOOLS_OPTIONS**" to -Dfile.encoding="UTF-8"
or programmatically by using :
System.setProperty("file.encoding" , "UTF-8");
(this last one may not have the desire effect since the JVM caches value of default character encoding on startup)
Hope that helped.
Use stmt.setNString(...) instead of stmt.setString(...).
Also don't forget to check column collation in database side.
If you log in to your mysql database and run show variables like 'character%';
this might provide some insight.
Since you're getting a one-to-one ratio of multi-byte characters to question marks then it's likely that the connection is doing a character set conversion and replacing the Chinese characters with the replacement character for the single-byte set.
Also check locale -a on ubuntu default Ubuntu works with en_us locale and doesn't have other locale installed.
must specify characterEncoding=utf8 while connecting through JDBC.
add at the end of your DB connection url - (nothing else needed)
ex.
spring.datasource.url = jdbc:mysql://localhost:3306/dbname?characterEncoding=utf8

Categories

Resources