Can't store UTF-8 Content in MySQL Using Java PreparedStatement

For some strange reason I can't seem to add UTF-8 data to my MySQL database. When I enter a non-Latin character, it's stored as ?????. Everything else is stored fine. So for example, "this is an example®™" is stored fine, but "和英辞典" is stored as "????".
The connection URL is fine:
private DataSource getDB() throws PropertyVetoException {
    ComboPooledDataSource db = new ComboPooledDataSource();
    db.setDriverClass("com.mysql.jdbc.Driver");
    db.setJdbcUrl("jdbc:mysql://domain.com:3306/db?useUnicode=true&characterEncoding=UTF-8");
    db.setUser("...");
    db.setPassword("...");
    return db;
}
I'm using PreparedStatement as you would expect. I even tried executing "set names utf8", as someone suggested.
Connection conn = null;
PreparedStatement stmt = null;
ResultSet rs = null;
try {
    conn = db.getConnection();
    stmt = conn.prepareStatement("set names utf8");
    stmt.execute();
    stmt = conn.prepareStatement("set character set utf8");
    stmt.execute();
    ... set title...
    stmt = conn.prepareStatement("INSERT INTO Table (title) VALUES (?)");
    stmt.setString(1,title);
    stmt.execute();
} catch (final SQLException e) {
    ...
The table itself seems to be fine.
Default Character Set: utf8
Default Collation: utf8_general_ci
...
Field title:
Type text
Character Set: utf8
Collation: utf8_unicode_ci
I tested it by entering in Unicode ("和英辞典" specifically) through a GUI editor and then selecting from the table -- and it was returned just fine. So this seems to be an issue with JDBC.
What am I missing?

In your JDBC connection string, you just need to set the charset encoding like this:
jdbc:mysql://localhost:3306/dbname?characterEncoding=utf8
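As a minimal sketch of this suggestion, the helper below appends the encoding parameters to a JDBC URL only when they are missing. The class name JdbcUrls and the method withUtf8 are hypothetical (they are not part of any driver); the parameter names useUnicode and characterEncoding are the ones used in the question's own URL.

```java
import java.util.Objects;

/** Hypothetical helper, not part of Connector/J: adds UTF-8 parameters to a MySQL JDBC URL. */
final class JdbcUrls {
    private JdbcUrls() {}

    /** Returns the URL unchanged if characterEncoding is already set, otherwise appends it. */
    static String withUtf8(String baseUrl) {
        Objects.requireNonNull(baseUrl, "baseUrl");
        if (baseUrl.contains("characterEncoding=")) {
            return baseUrl; // the caller already chose an encoding, leave it alone
        }
        String separator;
        if (baseUrl.endsWith("?") || baseUrl.endsWith("&")) {
            separator = "";          // the URL already ends with a separator
        } else if (baseUrl.contains("?")) {
            separator = "&";         // other parameters are present
        } else {
            separator = "?";         // first parameter
        }
        return baseUrl + separator + "useUnicode=true&characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        System.out.println(withUtf8("jdbc:mysql://localhost:3306/dbname"));
        System.out.println(withUtf8("jdbc:mysql://localhost/Spinning?user=root"));
    }
}
```

The resulting string can be passed to whatever creates the connection, for example db.setJdbcUrl(...) in the question's pooled data source.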

There are two places on the MySQL server to check so that the UTF-8 charset is set correctly.
Database Level
Set the charset when creating the database:
CREATE DATABASE `db` CHARACTER SET 'utf8';
Table Level
All of the tables need to be in UTF-8 as well (which seems to be the case for you):
CREATE TABLE `Table1` (
[...]
) DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
The important part is DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci.
Finally, if your code isn't handling UTF-8 correctly, you can force the JVM to use UTF-8 as its default encoding by passing a setting on startup:
java -Dfile.encoding=UTF-8 [...]
or by setting the environment variable
JAVA_TOOL_OPTIONS to -Dfile.encoding="UTF-8"
or programmatically by using:
System.setProperty("file.encoding" , "UTF-8");
(This last one may not have the desired effect, since the JVM caches the default character encoding on startup.)
Hope that helped.

Use stmt.setNString(...) instead of stmt.setString(...).
Also, don't forget to check the column collation on the database side.

If you log in to your MySQL database and run
SHOW VARIABLES LIKE 'character%';
it might provide some insight.
Since you're getting a one-to-one ratio of multi-byte characters to question marks then it's likely that the connection is doing a character set conversion and replacing the Chinese characters with the replacement character for the single-byte set.
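The one-to-one replacement described above can be reproduced in plain Java, without a database. This is only an illustration of the mechanism, assuming ISO-8859-1 as a stand-in for whatever single-byte charset the connection is converting to; the class name QuestionMarkDemo is made up for the example. The string literal uses Unicode escapes for "和英辞典" so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

/** Illustration only: shows how a single-byte charset turns unmappable characters into '?'. */
final class QuestionMarkDemo {
    private QuestionMarkDemo() {}

    /** Encodes the text to ISO-8859-1 and decodes it again, as a lossy connection would. */
    static String throughLatin1(String text) {
        // getBytes(Charset) substitutes the charset's replacement byte ('?') for unmappable characters
        byte[] singleByte = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(singleByte, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String kanji = "\u548c\u82f1\u8f9e\u5178"; // the four characters from the question
        System.out.println(throughLatin1(kanji));       // one '?' per original character
        System.out.println(throughLatin1("example\u00ae")); // the registered sign survives
    }
}
```

Four characters in, four question marks out: the same ratio the question reports, which points at a charset conversion somewhere between the driver and the column rather than at the table definition.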

Also check locale -a. By default, Ubuntu works with the en_US locale and doesn't have other locales installed.
You must also specify characterEncoding=utf8 when connecting through JDBC.

Add characterEncoding=utf8 at the end of your DB connection URL (nothing else is needed).
For example:
spring.datasource.url = jdbc:mysql://localhost:3306/dbname?characterEncoding=utf8

Related

How to "activate" UTF8 with Mysql - java

I have problems when I insert words with accent marks into tables, so I think I have to "activate" UTF-8 to fix that error.
I'm not using Class.forName. This is my code:
miInitialContext = new InitialContext();
miDS = (DataSource) miInitialContext.lookup(InformacionProperties.getStrDataSource());
Connection conexion = miDS.getConnection();
Statement myStatement = conexion.createStatement();
myStatement.executeUpdate("INSERT INTO table values ......");
How can I "activate" that UTF8 with my code?

Use set names to set your connection charset. However, that won't matter much if the columns/tables/databases you interact with aren't configured with a compatible charset. For instance, with a latin1 column testcol, inserting utf8 data will result in an error like
INSERT INTO `test`.`table` (`testcol`) VALUES ('test_val'), ('√çdata');
ERROR 1366 (HY000): Incorrect string value: '\xE2\x88\x9A\xC3\xA7d...' for column 'testcol' at row 2
So you'll need to update the table structure
ALTER TABLE t MODIFY testcol CHAR(50) CHARACTER SET utf8;
Which then fixes the issue:
INSERT INTO `test`.`table` (`testcol`) VALUES ('test_val'), ('√çdata');
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
(See the mysql docs for details).
This post has good documentation on finding the character sets of various structures.
(Thanks to Shadow for the SET NAMES part)
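Before altering a table, it can help to confirm on the Java side that a value really falls outside what a single-byte column can hold. The sketch below uses CharsetEncoder.canEncode for that. The class name ColumnCharsetCheck and the method fits are hypothetical, and Java's ISO-8859-1 is only an approximation of MySQL's latin1, which differs from it in a few code points.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical pre-check: can this value be represented in the column's charset at all? */
final class ColumnCharsetCheck {
    private ColumnCharsetCheck() {}

    /** Returns true when every character of the value can be encoded in the given charset. */
    static boolean fits(String value, Charset columnCharset) {
        // A fresh encoder per call keeps the method thread-safe
        return columnCharset.newEncoder().canEncode(value);
    }

    public static void main(String[] args) {
        String accented = "Belgi\u00eb";              // accented Latin text
        String kanji = "\u548c\u82f1\u8f9e\u5178";    // CJK text
        System.out.println(fits(accented, StandardCharsets.ISO_8859_1));
        System.out.println(fits(kanji, StandardCharsets.ISO_8859_1));
        System.out.println(fits(kanji, StandardCharsets.UTF_8));
    }
}
```

A value that does not fit the column's charset is a sign that the column definition, not the connection, is what needs to change.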

JDBC Clob and NClob

Is there any difference between java.sql.Clob and java.sql.NClob? There is no new method for java.sql.NClob interface. I tried the following:
The setup SQL:
create table tab(id number(2), clobcol clob, nclobcol nclob)
insert into tab values (1, to_clob('你好'), to_nclob('你好'))
JDBC code:
conn = getConnection();
stmt = conn.createStatement();
rs = stmt.executeQuery("select * from tab");
rs.next();
Clob c = rs.getClob(2);
NClob nc = rs.getNClob(3);
InputStream inputStream1 = c.getAsciiStream();
InputStream inputStream2 = nc.getAsciiStream();
System.out.println(inputStream1.available());
System.out.println(inputStream2.available());
c.free();
nc.free();
I have also tried some other methods, and the output looks no different. Is there a specific case where I can see a difference?
Added the supported character sets in the database:
SELECT parameter, value
FROM v$nls_parameters
WHERE parameter LIKE '%CHARACTERSET';
PARAMETER                VALUE
------------------------ ----------
NLS_CHARACTERSET         AL32UTF8
NLS_NCHAR_CHARACTERSET   AL16UTF16

In the old days (80s) many Databases were created using US7ASCII (in the US) or ISOLATIN1 (in Europe) as the character set. For these Databases that still exist today (after many upgrades), the only way to store non-ASCII character String data is to use the special types NVARCHAR or NCLOB. These Nxxx types are not used by newer Databases that were created directly using UTF8 (now the default in Oracle) as the encoding.

How to convert DB2 binary data to UTF-8 at query level

I am connected to an IBM DB2 database from Java, but the data is stored in binary format, so any value I fetch comes back in binary or hexadecimal form. How can I convert this binary data to UTF-8 at the query level?
Sample code to fetch data -
String sql = "SELECT poMast.ORDNO from AMFLIBL.POMAST AS poMast ";
Class.forName("com.ddtek.jdbc.db2.DB2Driver");
String url = "jdbc:datadirect:db2://hostname:port;DatabaseName=dbName;";
Connection con = DriverManager.getConnection(url, "username","password");
PreparedStatement preparedStatement = con.prepareStatement(sql);
ResultSet rs = preparedStatement.executeQuery();
System.out.println("ResultSet : \n");
System.out.println(" ORDNO");
while (rs.next())
{
System.out.println(rs.getString("ORDNO"));
}

You probably need to use the CAST expression:
SELECT CAST(poMast.ORDNO as VARCHAR(50)) from AMFLIBL.POMAST AS poMast
Adjust the VARCHAR length to your needs. The string is in the database codepage (often UTF-8 these days) and converted to the client/application codepage when fetched.
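If changing the query is not an option, the same conversion can be done on the client after fetching. The sketch below is an alternative to the CAST, not what this answer proposes: it decodes a hexadecimal string into text with an explicit charset. It assumes the driver hands back the column as a string of hex digits and that the underlying bytes are UTF-8; a database that stores another codepage would need a different Charset. The class name HexText is made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical client-side fallback: turns a hex dump of a column value into readable text. */
final class HexText {
    private HexText() {}

    /** Decodes pairs of hex digits into bytes, then interprets the bytes in the given charset. */
    static String decode(String hex, Charset charset) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near index " + (2 * i));
            }
            bytes[i] = (byte) ((hi << 4) | lo);
        }
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // 4F 52 44 30 31 are the UTF-8 (and ASCII) bytes of "ORD01"
        System.out.println(decode("4F52443031", StandardCharsets.UTF_8));
    }
}
```

In the question's loop this would wrap the fetched value, for example HexText.decode(rs.getString("ORDNO"), StandardCharsets.UTF_8), provided the assumptions above hold for that column.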

You can cast the result of your select to Unicode as shown below.
String sql = "SELECT poMast.ORDNO, CAST(poMast.ORDNO AS VARCHAR(255) CCSID UNICODE) FROM AMFLIBL.POMAST AS poMast ";
src: cast db2

In my case, bad UTF-8 data had somehow gotten into varchars in a 1208/UTF-8 DB. Before the conversion, querying such data through the JDBC driver returned error -4220. This is fixable at the JDBC driver level by adding this property:
java -Ddb2.jcc.charsetDecoderEncoder=3 MyApp
see:
https://www.ibm.com/support/pages/sqlexception-message-caught-javaiocharconversionexception-and-errorcode-4220
The Db2 LUW Command Line Processor fixed it long ago as an APAR, so this error is only seen via the JDBC driver when the above property is not set.
But, if you want to fix the data in the db, this works:
update <table_name> set <bad_data_col> = cast(cast( <bad_data_col> as vargraphic) as varchar);
First, Db2 treats (casts) the bad data as binary, where "anything goes", and then converts (casts) it back to valid UTF-8. After the casts, the JDBC driver shows the same result with or without the special property set and returns no errors.

How to save special characters in database

I'm using a MySQL database and inserting like this:
try (Connection connection = DbConnector.connectToDb();
PreparedStatement stm = connection.prepareStatement("INSERT INTO Country (name) VALUES (?)")) {
stm.setString(1, name);
if (stm.executeUpdate() > 0) {
result = true;
}
} catch (Exception e) {
logStackTrace(e);
}
Now when I insert België, it is saved in a weird way in the database: the ë isn't saved. How can I solve this?
EDIT:
I just changed the table via:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
But still when I add a new one it isn't displayed correctly on the webpage.

There are two places in the database to check so that the UTF-8 charset is set correctly.
Database Level
Set the charset when creating the database:
CREATE DATABASE `db` CHARACTER SET 'utf8';
Table Level
All of the tables need to be in UTF-8 as well (which seems to be the case for you):
CREATE TABLE `Table1` (
[...]
) DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
The important part is DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci.
Finally, if your code isn't handling UTF-8 correctly, you can force the JVM to use UTF-8 as its default encoding by passing a setting on startup:
java -Dfile.encoding=UTF-8 [...]
or by setting the environment variable
JAVA_TOOL_OPTIONS to -Dfile.encoding="UTF-8"
or programmatically by using:
System.setProperty("file.encoding" , "UTF-8");
(This last one may not have the desired effect, since the JVM caches the default character encoding on startup.)
Hope that helped.
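To narrow down where the ë gets damaged, it helps to know what the typical corruption looks like. The sketch below reproduces one common pattern in plain Java: UTF-8 bytes read back as a single-byte charset. It is an illustration under the assumption that ISO-8859-1 is the wrong charset involved, not a diagnosis of this particular setup; the class name MojibakeDemo is made up, and Unicode escapes are used so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

/** Illustration only: what UTF-8 text looks like when a later stage decodes it as ISO-8859-1. */
final class MojibakeDemo {
    private MojibakeDemo() {}

    /** Encodes as UTF-8, then decodes the same bytes with the wrong single-byte charset. */
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String country = "Belgi\u00eb"; // the value from the question
        // The two UTF-8 bytes of the accented letter (0xC3 0xAB) each become their own character
        System.out.println(misread(country));
    }
}
```

If the stored or displayed value shows two odd characters in place of the ë, the bytes were written as UTF-8 and read as a single-byte charset; a single question mark instead suggests the conversion happened in the other direction.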

Java: insert accented characters in mysql

If I have this query from java:
String query="insert into user (..., name, ...) values (..., 'à', ...)";
Class.forName("com.mysql.jdbc.Driver").newInstance();
Connection con = DriverManager.getConnection("jdbc:mysql://localhost/Spinning?user=root");
PreparedStatement prest = con.prepareStatement(query);
prest.executeUpdate();
In the db I will have a strange character: a diamond with a question mark inside.
Is there any solution to this problem?

Change your connection url to the following:
jdbc:mysql://localhost/Spinning?user=root&useUnicode=true&characterEncoding=utf8

Verify the character set you are using in the MySQL DB. You can run "SHOW CREATE TABLE xxxx" to print the table DDL with the charset being used.
Verify the character set you are using in the JDBC driver. If you are using MySQL Connector/J, you can set the charset in the JDBC URL.
