JOOQ emoji (utf8mb4) support

We are trying to store and read emoji in our MySQL 5.6 database with JOOQ.
The database, table and column are using character set utf8mb4 and collation utf8mb4_unicode_ci. With MySQL Workbench I can create and select emojis. So the database should be ready.
But when I store an emoji with JOOQ I get:
Incorrect string value: '\xF0\x9F\x98\x80' for column 'test' at row 1
DSLContext dslContext = DSL.using(dataSource, SQLDialect.MYSQL);
dslContext.insertInto(table)
    .set(testRecord)
    .returning()
    .fetchOne();
Retrieving an emoji I stored with MySQL Workbench works fine.

To use utf8mb4 from the application, make sure it is set either at the server level or on the connection before the query runs.
There are two ways of doing it:
Server level: add character_set_server=utf8mb4 to my.cnf, or run "SET GLOBAL character_set_server=utf8mb4".
Before running the query: "SET NAMES utf8mb4".
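
For context, the bytes in the error message are the UTF-8 encoding of the emoji itself (U+1F600). It takes four bytes, which MySQL's older 3-byte utf8 character set cannot hold. A minimal Java sketch to check whether a string needs utf8mb4 follows; the class and method names are made up for illustration and are not part of JOOQ or the MySQL driver:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class Utf8Width {
    /** Formats the UTF-8 bytes of s in the \xNN style MySQL uses in error 1366. */
    static String hexEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** True if s contains a code point above U+FFFF, i.e. one that needs 4 UTF-8 bytes. */
    static boolean needsUtf8mb4(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        String grinningFace = "\uD83D\uDE00"; // U+1F600 as a surrogate pair
        System.out.println(hexEscape(grinningFace));    // \xF0\x9F\x98\x80
        System.out.println(needsUtf8mb4(grinningFace)); // true
    }
}
```

If needsUtf8mb4 returns true for the value being inserted, both the column and the connection character set have to be utf8mb4 for the insert to succeed.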

Related

How to change the way Talend formulates SQL queries in a JDBC connection?

In Talend Data Quality, I have configured a JDBC connection to an OpenEdge database and it's working fine.
I can pull the list of tables and select columns to analyse, but when executing the analysis, I get this:
Table "DBGSS.SGSSGSS" cannot be found.
This is because it does not specify a schema, only the database name, DBGSS.
How can I make it specify the database, the schema and then the table name? Or just the table name; that would work too.
Thanks!

You can use a tDBConnection component, which lets you specify a schema.
Then use it with the option "Use Existing connection".
See the documentation: https://help.talend.com/r/en-US/7.3/db-generic/tdbconnection

Unable to Save en-dash to DB2 column with CCSID 37 through Springboot

I am migrating from CA-Gen to Spring Boot. When I insert an en-dash (–) into a DB2 column through CA-Gen, it works fine. But when I insert the same data from Spring Boot, it is saved as '$(26)'. If I fetch the value in Java, DB2 returns an empty value. I also tried a SQL query in AQT and the same issue occurs. The DB2 column has CCSID 37 and its type is varchar(2000).

utf8mb4 in MySQL Workbench and JDBC

I've been working with a UTF-8 encoded MySQL DB that now needs to be able to store 4-byte emojis, so I decided to change from utf8 encoding to utf8mb4:
ALTER DATABASE bstdb CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
ALTER TABLE HISTORY CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
ALTER TABLE HISTORY CHANGE SOURCE_CONTEXT SOURCE_CONTEXT VARCHAR(2000) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci;
I also changed "character-set-server = utf8" to "character-set-server = utf8mb4" in mysql.conf.d.
After these steps, I am able to store emojis (such as 💢), but only when the SQL query is executed in the MySQL console. When I try to run the query from MySQL Workbench or from a Wildfly webapp, I get this error:
Error Code: 1366. Incorrect string value: '\xF0\x9F\x92\xA2' for column 'SOURCE_CONTEXT' at row 1
I assume I need to change the way the clients connect to the DB, but I have no clue how. I've read something about using "useUnicode=yes" in JDBC, but it does not work.
${bdpath:3306/bstdb?useUnicode=yes}
Edit:
As suggested in comments, I tried with:
${bdpath:3306/bstdb?characterEncoding=UTF-8}
but no luck, I am getting the same "Incorrect string value: '\xF0\x9F\x92\xA2'" error.
Also tried
${bdpath:3306/bstdb?useUnicode=true&characterEncoding=utf8mb4&}
but it refuses to establish a connection.
Any idea on how to configure MySQL workbench and/or JDBC/Wildfly?
MySQL version is 5.7.18
MySQL WorkBench version is 6.0.8
JDBC driver version is 5.1.34
Thanks!

Use characterEncoding=utf8 in the JDBC URL:
jdbc:mysql://x.x.x.x:3306/db?useUnicode=true&characterEncoding=utf8
Also check that you have configured MySQL itself to work with utf8mb4:
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
See here

Starting from MySQL Connector/J 5.1.47, when UTF-8 is used for characterEncoding in the connection string, it maps to the MySQL character set name utf8mb4.
You can check docs here
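
A likely reason why characterEncoding=utf8mb4 refused to connect in the question is that this parameter expects a Java charset name, while utf8mb4 is a MySQL character set name that Java does not know. A small sketch to illustrate this follows. The class name, the method names and the localhost/bstdb values are only examples, and whether the resulting connection really uses utf8mb4 still depends on the driver version and server settings discussed above:

```java
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only.
final class EncodingNameCheck {
    /** True if the JVM recognises the name as a charset (names are case-insensitive). */
    static boolean isJavaCharset(String name) {
        return Charset.isSupported(name);
    }

    /** Builds a Connector/J URL that uses the Java charset name UTF-8. */
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        System.out.println(isJavaCharset("UTF-8"));   // true
        System.out.println(isJavaCharset("utf8mb4")); // false: a MySQL name, not a Java one
        System.out.println(jdbcUrl("localhost", 3306, "bstdb"));
    }
}
```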

Finally, it works. It was an issue with the stored procedures, which were still utf8 instead of utf8mb4 after the migration.
It was a two-step solution.
As suggested by #mike-adamenko, set my.cnf to have the following:
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
Execute in mysql:
SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci;
Drop the procedures involved and create them again. They will then be created as utf8mb4.
This can be checked with:
SHOW PROCEDURE STATUS where name LIKE 'procedure_name';

You can follow the MySQL documentation to resolve your problem.
Basically, change your ALTER TABLE scripts as described in that documentation, and then use the following parameters in your connection string for the changes to take effect:
jdbc:mysql://localhost/yourdatabasename?useUnicode=true&characterEncoding=UTF-8
Please don't forget to restart your MySQL services after making the character set and the encoding changes.

character_result_set is empty in mysql

I'm using a MySQL DB hosted on Amazon RDS.
When I execute this query: show global variables like '%character%';
I get the following result:
But when I execute this query: show variables like '%character%'; I get this result:
As you can see character_set_results is empty. I tried following queries and nothing changes the empty value:
ALTER DATABASE myDB CHARACTER SET utf8;
ALTER DATABASE myDB CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER DATABASE myDB DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
SET CHARACTER_SET_RESULTS=UTF8;
SET NAMES 'utf8';
SET CHARACTER SET 'utf8';
SET session CHARACTER_SET_RESULTS = utf8;
As I understand it, the second query returns session parameters. Might this affect the results I'm getting?
I have two machines: a development machine where I run the application from Eclipse (it is a Java app with Hibernate), and a deployment machine. Everything works fine on the development machine, but on the deployment machine I sometimes get ???? or other strange characters when reading data from the DB.
Both machines connect to the same DB and data is stored fine in the db itself.
Also the connection url is jdbc:mysql://myDB:3306/myApp?autoReconnect=true&useUnicode=true&createDatabaseIfNotExist=true&characterEncoding=utf-8.
Any ideas what can cause this?

Multiple question marks usually come from the following situation:
You are INSERTing Chinese (or any other non-Western-European) text.
You said SET NAMES utf8 to declare that the client bytes are utf8-encoded (correct).
But the table columns are declared CHARSET latin1. <-- This is the problem.
Since there is no way to convert Chinese characters into latin1, '?' is stored instead.
Please provide SHOW CREATE TABLE to confirm the above hypothesis.
If you are getting "other strange characters", please provide:
SELECT col, HEX(col) FROM ...
to show an example of what is stored for the "strange characters".
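
The same loss can be reproduced in plain Java without a database, as a rough illustration of the hypothesis above. ISO-8859-1 stands in for MySQL's latin1 here (an assumption, the two are not identical), and the class and method names are made up:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class Latin1Loss {
    /** Encodes text as ISO-8859-1 and decodes it again, as a latin1 column would. */
    static String storeAsLatin1(String text) {
        // Characters with no ISO-8859-1 mapping are replaced by '?' during encoding.
        byte[] stored = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(stored, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String city = "\u4E4C\u9C81\u6728\u9F50\u5E02"; // 乌鲁木齐市
        System.out.println(storeAsLatin1(city));        // ?????
        System.out.println(storeAsLatin1("caf\u00E9")); // café survives, é exists in latin1
    }
}
```

Once the '?' has been written, the original character is gone. No later change of connection settings can bring it back.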

String Converting Issue

I've been stuck working with Java and MySQL these days.
The problem is this: I have a MySQL database with a column in one table that holds Chinese city names. A colleague changed the DB to utf8 for every character setting (connection, db, results, server and system). As a result, the data stored before the change no longer displays correctly unless I set the %character% variables back to latin1. With either character set I can only retrieve half of the data correctly. Could you please help me solve this problem?
I've tried to solve it in Java, but it doesn't work.
String sql = "SELECT * FROM customer_addresses";
ResultSet result = query.executeQuery(sql);
while (result.next()) {
    // Read the raw column bytes and decode them as UTF-8
    byte[] b = result.getBytes("city");
    String c = new String(b, "UTF-8");
    System.out.println(c);
}
For example, one city is stored in the DB like this: ä¹Œé²æœ¨é½å¸‚
Java prints: 乌�?木�?市
It should be: 乌鲁木齐市
Thanks in advance.

Default charset of your MySQL server is probably not UTF8. Try to execute the following SQL queries before getting data from the database:
SET NAMES utf8
and
SET CHARACTER SET utf8

Add characterEncoding=UTF-8 to the connection string, where you connect to the database. For example:
"jdbc:mysql://servername:3306/databasename?characterEncoding=UTF-8"
Incidentally, the data in the database appears to be broken. If you want the database to store 乌鲁木齐市, that's what should be in the table, not ä¹Œé²æœ¨é½å¸‚.
Update: The problem with how the data is stored in the database is easier to solve using the database's own tools, not Java. For each table that stores text, do this:
ALTER TABLE tablename CONVERT TO CHARACTER SET binary;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8;
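
To see what went wrong with the stored data, the garbling can be reproduced in Java. The sketch below assumes the old rows are UTF-8 bytes that were decoded as latin1, with ISO-8859-1 standing in for MySQL's latin1. The class and method names are hypothetical, and this is not a replacement for the ALTER TABLE statements above:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class MojibakeRepair {
    /** Misreads UTF-8 bytes as ISO-8859-1, producing one character per byte. */
    static String garble(String correct) {
        return new String(correct.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    /** Reverses garble(): recovers the original bytes and decodes them as UTF-8. */
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String city = "\u4E4C\u9C81\u6728\u9F50\u5E02"; // 乌鲁木齐市
        String garbled = garble(city);
        System.out.println(garbled.length());           // 15: one char per UTF-8 byte
        System.out.println(repair(garbled).equals(city)); // true
    }
}
```

The repair only works while every original byte survives. In the example from the question, two of the bytes (0x81 and 0x90) have no printable character in Windows-1252, which would explain why exactly the second and fourth characters come back as �? if the text passed through such a conversion.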
