I am having a bit of trouble inserting Arabic characters into a MySQL database using Java.
I am using utf8 for all my tables and my database, as confirmed in MySQL Workbench.
The string that I tried to insert is: عماد
It's worth mentioning that each Arabic letter takes 2 bytes in UTF-8, so this is clearly not the usual problem of MySQL's 3-byte utf8 being unable to store 4-byte characters.
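To check the byte-length remark above, here is a minimal Java sketch that prints the UTF-8 bytes of the string. The class and method names are made up for illustration. The string is written with Unicode escapes so that the source file encoding cannot interfere.

```java
import java.nio.charset.StandardCharsets;

class ArabicUtf8Bytes {
    // Returns the UTF-8 encoding of the text as upper-case hex pairs separated by spaces.
    static String utf8Hex(String text) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // The four Arabic letters of the keyword, as Unicode escapes.
        String name = "\u0639\u0645\u0627\u062F";
        System.out.println(name.length() + " chars, "
                + name.getBytes(StandardCharsets.UTF_8).length + " bytes");
        System.out.println(utf8Hex(name)); // D8 B9 D9 85 D8 A7 D8 AF
    }
}
```

The first six bytes are the same ones quoted in the exception message, which suggests the string reaches the server as valid UTF-8 and the rejection happens on the MySQL side.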
Code for connecting to the database:
String url = "jdbc:mysql://127.0.0.1:3306/mydatabase";
String user = "root";
String passwd = ".........";
String unicode= "?useUnicode=yes&characterEncoding=UTF-8";
setConnection((Connection) DriverManager.getConnection(url+unicode, user, passwd));
Code for inserting the value:
query3 = "INSERT INTO keyword (idkeyword, keyword) VALUES ("+keyWord.getId()+",'عماد')";
Statement state7 = (Statement) connection.createStatement();
state7.executeUpdate(query3);
The exception that I'm receiving:
java.sql.SQLException: Incorrect string value: '\xD8\xB9\xD9\x85\xD8\xA7...' for column 'keyword' at row 1
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:965)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3976)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3912)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2530)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2683)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2482)
at com.mysql.jdbc.StatementImpl.executeUpdateInternal(StatementImpl.java:1552)
at com.mysql.jdbc.StatementImpl.executeLargeUpdate(StatementImpl.java:2607)
at com.mysql.jdbc.StatementImpl.executeUpdate(StatementImpl.java:1480)
at controller.DataBaseAccess.saveProject(DataBaseAccess.java:285)
All help is greatly appreciated!
Assuming you are using Java 8 or greater, you could use Base64 to encode the Arabic string and store the encoded form in the database. This would also prevent similar errors with strings in other languages. When reading the value back from the database, you just Base64-decode it.
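A minimal sketch of that idea, using java.util.Base64 from the JDK. The class and method names are hypothetical, and the database calls are left out on purpose.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64Text {
    // Encodes the text as Base64 over its UTF-8 bytes; the result is plain ASCII.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    // Reverses encode(): Base64-decodes and interprets the bytes as UTF-8.
    static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String keyword = "\u0639\u0645\u0627\u062F"; // the Arabic keyword from the question
        String stored = encode(keyword);             // value that would go into the column
        System.out.println(stored);
        System.out.println(decode(stored).equals(keyword));
    }
}
```

Keep in mind that the stored value is about a third longer than the raw bytes and can no longer be searched or sorted as text in SQL.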
Related
I am connected to an IBM DB2 database with Java, but the data is stored in binary format in the database, so whenever I fetch a value it comes back in binary or hexadecimal format. How can I convert this binary data to UTF-8 at the query level?
Sample code to fetch data -
String sql = "SELECT poMast.ORDNO from AMFLIBL.POMAST AS poMast ";
Class.forName("com.ddtek.jdbc.db2.DB2Driver");
String url = "jdbc:datadirect:db2://hostname:port;DatabaseName=dbName;";
Connection con = DriverManager.getConnection(url, "username","password");
PreparedStatement preparedStatement = con.prepareStatement(sql);
ResultSet rs = preparedStatement.executeQuery();
System.out.println("ResultSet : \n");
System.out.println(" VNDNO");
while (rs.next())
{
System.out.println(rs.getString("ORDNO"));
}
You probably need to use the CAST expression:
SELECT CAST(poMast.ORDNO as VARCHAR(50)) from AMFLIBL.POMAST AS poMast
Adjust the VARCHAR length to your needs. The string is in the database codepage (often UTF-8 these days) and converted to the client/application codepage when fetched.
You can "cast" the result of your select to Unicode like below.
String sql = "SELECT poMast.ORDNO, CAST(poMast.ORDNO AS VARCHAR(255) CCSID UNICODE) FROM AMFLIBL.POMAST AS poMast ";
src: cast db2
In my case, bad UTF-8 data had somehow gotten into varchars in a 1208/UTF-8 DB. Before the data was fixed, querying it via the JDBC driver returned error -4220. This is fixable at the JDBC driver level by adding this property:
java -Ddb2.jcc.charsetDecoderEncoder=3 MyApp
see:
https://www.ibm.com/support/pages/sqlexception-message-caught-javaiocharconversionexception-and-errorcode-4220
The Db2 LUW Command Line Processor fixed it long ago as an APAR, so this error is only seen via the JDBC driver when the above property is not set.
But, if you want to fix the data in the db, this works:
update <table_name> set <bad_data_col> = cast(cast( <bad_data_col> as vargraphic) as varchar);
First, Db2 treats (casts) the bad data as binary, where "anything goes", and then converts (casts) it back to valid UTF-8. After the casts, the JDBC driver shows the same result with or without the special property set and returns no errors.
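For intuition about what "bad UTF-8 data" means on the Java side, here is a standalone sketch that uses only the JDK's own decoder. It is not Db2 or JDBC driver code, and the class and method names are mine. It decodes a deliberately malformed byte sequence strictly (which fails) and leniently (which substitutes U+FFFD). My reading of the linked IBM page is that the driver property selects a comparably lenient behavior, but treat that as an assumption and check the page.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class MalformedUtf8 {
    // Strict check: a decoder configured with REPORT throws on malformed input.
    static boolean isWellFormed(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Lenient decode: this String constructor replaces malformed input with U+FFFD.
    static String decodeLenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xC3 starts a two-byte sequence but is followed by 'B', so the input is malformed.
        byte[] bad = {'A', (byte) 0xC3, 'B'};
        System.out.println(isWellFormed(bad));
        System.out.println(decodeLenient(bad));
    }
}
```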
I'm trying to fill an SQLite database with data from my Java program.
The data is read from an Excel file using Apache POI. I have no trouble inserting the data into the db using normal methods.
However, when I check the database manually with the shell, the Norwegian characters æ, ø and å are not displayed correctly. Whenever I fill out the database manually through the shell, they are displayed as they should be.
Also, when I print a Java string containing these characters to the console, they are displayed correctly.
The problem must occur when an action like this is performed:
String sql = "insert into db(name) values (æøå)";
stmt.executeUpdate(sql);
I have tried
byte[] b = sql.getBytes("utf-8");
sql = new String(b, "utf-8");
to no avail.
Any idea how to remedy the situation?
Thanks!
There is a very simple solution for you: Let Java and the SQLite driver do everything for you. You don't have to care about encodings and escaping of parameters.
How that is possible: Use a PreparedStatement:
String name = "æøå";
PreparedStatement prepStmt = conn.prepareStatement("insert into db(name) values (?)");
prepStmt.setString(1, name);
prepStmt.executeUpdate();
Furthermore this code fragment is secure against SQL injection attacks.
BTW: The second code fragment you posted is totally useless, it does nothing. Converting a String to byte[] and back to String does not change a single bit of the String.
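A small sketch of that last point, with a made-up class name: encoding a String to UTF-8 bytes and decoding the same bytes as UTF-8 yields an equal String, so it cannot repair anything.

```java
import java.nio.charset.StandardCharsets;

class RoundTripNoOp {
    // Encodes to UTF-8 and decodes straight back, as in the question's snippet.
    static String roundTrip(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The three Norwegian letters, written as Unicode escapes.
        String sql = "insert into db(name) values ('\u00E6\u00F8\u00E5')";
        System.out.println(roundTrip(sql).equals(sql)); // true
    }
}
```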
Hi, I am using an Oracle DB to store a string in a VARCHAR2 column
via EclipseLink. Here is my code:
pdescription = new String(this.description.getBytes("ISO-8859-9"));
Sometimes the value is stored correctly, but sometimes it is stored as question marks only.
For example, it is stored as "door" or as "????".
I have another column that is also a string, and there is a problem with that one; both columns have the same type, VARCHAR2.
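One possible explanation, offered as an assumption because the question shows neither the data nor the platform settings: getBytes("ISO-8859-9") replaces every character that ISO-8859-9 cannot represent with '?', and new String(bytes) without a charset decodes with the platform default, which may not be ISO-8859-9. The sketch below isolates the first effect by decoding with the same charset. The class name and sample words are made up.

```java
import java.nio.charset.Charset;

class Latin5RoundTrip {
    static final Charset ISO_8859_9 = Charset.forName("ISO-8859-9");

    // Encodes to ISO-8859-9 and decodes with the same charset, so the only
    // possible loss comes from characters that ISO-8859-9 cannot represent.
    static String roundTrip(String text) {
        return new String(text.getBytes(ISO_8859_9), ISO_8859_9);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("door"));                            // plain ASCII, unchanged
        System.out.println(roundTrip("kap\u0131"));                       // Turkish dotless i, unchanged
        System.out.println(roundTrip("\u0434\u0432\u0435\u0440\u044C"));  // Cyrillic, becomes ?????
    }
}
```

If this is what is happening, passing the original String to the persistence layer unchanged and making sure the database character set can hold the text is likely the cleaner route.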
I've been stuck working with Java and MySQL these days.
The problem is, I have a MySQL database with a table column that holds Chinese city names. A colleague changed every character set setting (connection, db, results, server and system) to utf8. As a consequence, the data stored before the change no longer displays correctly unless I set the %character% variables back to latin1. With either character set, I can only retrieve half of the data correctly. Could you please help me solve the problem?
I've tried to use Java to solve the problem, but it doesn't work.
String sql = "SELECT * FROM customer_addresses";
ResultSet result = query.executeQuery(sql);
while (result.next()) {
byte b[] = result.getBytes("city");
c = new String(result.getBytes("city"), "UTF-8");
}
For example: there is one city in db like this 乌é²æœ¨é½å¸‚
the java print: 乌�?木�?市
it should be:乌鲁木齐市
Thanks in advance
Default charset of your MySQL server is probably not UTF8. Try to execute the following SQL queries before getting data from the database:
SET NAMES utf8
and
SET CHARACTER SET utf8
Add characterEncoding=UTF-8 to the connection string, where you connect to the database. For example:
"jdbc:mysql://servername:3306/databasename?characterEncoding=UTF-8"
Incidentally, the data in the database appears to be broken. If you want the database to store 乌鲁木齐市, that's what should be in the table, not 乌é²æœ¨é½å¸.
Update: The problem with how the data is stored in the database is easier to solve using the database's own tools, not Java. For each table that stores text, do this:
ALTER TABLE tablename CONVERT TO CHARACTER SET binary;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8;
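To see where text like the broken city name comes from, here is a minimal Java sketch of the mechanism: UTF-8 bytes interpreted as a single-byte Latin charset, and the reverse step that recovers them. It uses ISO-8859-1 as a stand-in for MySQL's latin1 (which is closer to windows-1252, hence different symbols than in the question), and the class and method names are hypothetical. It illustrates the idea only and is not a replacement for the ALTER TABLE statements.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Simulates UTF-8 bytes being read as if they were a single-byte Latin charset.
    static String garble(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Undoes garble(): recovers the original bytes and decodes them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The five-character city name from the question, as Unicode escapes.
        String city = "\u4E4C\u9C81\u6728\u9F50\u5E02";
        String garbled = garble(city);
        System.out.println(garbled.length());             // 15: one char per UTF-8 byte
        System.out.println(repair(garbled).equals(city)); // true
    }
}
```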
For some strange reason I can't seem to add UTF-8 data to my MySQL database. When I enter a non-latin character, it's stored as ?????. Everything else is stored fine. So for example, "this is an example®™" is stored fine, but "和英辞典" is stored as "????".
The connection url is fine:
private DataSource getDB() throws PropertyVetoException {
ComboPooledDataSource db = new ComboPooledDataSource();
db.setDriverClass("com.mysql.jdbc.Driver");
db.setJdbcUrl("jdbc:mysql://domain.com:3306/db?useUnicode=true&characterEncoding=UTF-8");
db.setUser("...");
db.setPassword("...");
return db;
}
I'm using PreparedStatement as you would expect, I even tried entering "set names utf8" as someone suggested.
Connection conn = null;
PreparedStatement stmt = null;
ResultSet rs = null;
try {
conn = db.getConnection();
stmt = conn.prepareStatement("set names utf8");
stmt.execute();
stmt = conn.prepareStatement("set character set utf8");
stmt.execute();
... set title...
stmt = conn.prepareStatement("INSERT INTO Table (title) VALUES (?)");
stmt.setString(1,title);
stmt.execute();
} catch (final SQLException e) {
...
The table itself seems to be fine.
Default Character Set: utf8
Default Collation: utf8_general_ci
...
Field title:
Type text
Character Set: utf8
Collation: utf8_unicode_ci
I tested it by entering in Unicode ("和英辞典" specifically) through a GUI editor and then selecting from the table -- and it was returned just fine. So this seems to be an issue with JDBC.
What am I missing?
On your JDBC connection string, you just need to set the charset encoding like this:
jdbc:mysql://localhost:3306/dbname?characterEncoding=utf8
There are two places in the MySQL server to check to make sure the UTF-8 charset is set correctly.
Database Level
This is set when the database is created:
CREATE DATABASE `db` CHARACTER SET 'utf8';
Table Level
All of the tables need to be in UTF-8 also (which seems to be the case for you)
CREATE TABLE `Table1` (
[...]
) DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
The important part being DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
Finally, if your code weren't handling utf8 correctly, you could force your JVM to use utf8 encoding by changing the settings on startup:
java -Dfile.encoding=UTF-8 [...]
or by setting the environment variable
JAVA_TOOL_OPTIONS to -Dfile.encoding=UTF-8
or programmatically by using :
System.setProperty("file.encoding" , "UTF-8");
(this last one may not have the desired effect, since the JVM caches the value of the default character encoding on startup)
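To see which default your JVM actually picked up, a small sketch like the following can help. The class name is made up. It prints the file.encoding property and the default charset, and compares encoding with the default against encoding with an explicit charset, which does not depend on any startup setting.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DefaultCharsetCheck {
    // Encoding with an explicit charset gives the same bytes on every platform.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        // The four-character title from the question, as Unicode escapes.
        String title = "\u548C\u82F1\u8F9E\u5178";
        System.out.println("explicit UTF-8 bytes = " + encodeUtf8(title).length);
        System.out.println("default bytes        = " + title.getBytes().length);
    }
}
```

If the two byte counts differ, some code path that relies on the default charset is a likely suspect.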
Hope that helped.
Use stmt.setNString(...) instead of stmt.setString(...).
Also, don't forget to check the column collation on the database side.
If you log in to your mysql database and run show variables like 'character%';
this might provide some insight.
Since you're getting a one-to-one ratio of multi-byte characters to question marks then it's likely that the connection is doing a character set conversion and replacing the Chinese characters with the replacement character for the single-byte set.
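The one-to-one ratio can be reproduced in plain Java. This sketch uses windows-1252 as a stand-in for a single-byte connection charset such as MySQL's latin1; that choice, and the class and method names, are assumptions for illustration. Characters the charset cannot represent each become a single '?', while ® and ™ survive.

```java
import java.nio.charset.Charset;

class QuestionMarkRatio {
    static final Charset CP1252 = Charset.forName("windows-1252");

    // True if every character of the text can be represented in the charset.
    static boolean survives(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    // Encodes and decodes with the same charset; unmappable characters come back as '?'.
    static String through(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String title = "\u548C\u82F1\u8F9E\u5178"; // the four CJK characters from the question
        String symbols = "\u00AE\u2122";           // registered sign and trade mark sign
        System.out.println(survives(title, CP1252));                // false
        System.out.println(through(title, CP1252));                 // ????
        System.out.println(through(symbols, CP1252).equals(symbols)); // true
    }
}
```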
Also check locale -a on Ubuntu: by default, Ubuntu uses the en_US locale and has no other locales installed.
You must specify characterEncoding=utf8 when connecting through JDBC.
Add it at the end of your DB connection URL (nothing else is needed).
For example:
spring.datasource.url = jdbc:mysql://localhost:3306/dbname?characterEncoding=utf8