Special chars in JAVA - java

สวัสดี Mr.Java Sp'e c'i'a'l'' '
I tried to process the String using the code below, but I couldn't get it to work;
it simply shows the wrong value.
String s = "สวัสดี Mr.Java Sp'e c'i'a'l'' '";
s = s.replaceAll("'", "&apos;");
//s = s.replaceAll("'", "''");
StringEscapeUtils.escapeHtml(s);
I am trying to read the value from a JSP, save it in a SQL Server DB, then show it again in a JSP and allow it to be updated.
But sometimes the JSP shows the converted &apos; literally instead of the special
characters.
Put simply: I have posted this String (สวัสดี Mr.Java Sp'e c'i'a'l'' ') here on StackOverflow; they
save it in their DB, show it, and allow me to update it. That is what I
want.

OK. So let's look at what your code does:
// line 1
String s = "สวัสดี Mr.Java Sp'e c'i'a'l'' '";
We have a String with various international characters in it ... and some "'" characters.
// line 2
s = s.replaceAll("'", "'");
Assuming that those really are "'" characters, we will replace all instances of "'" with an XML / HTML character entity, giving us:
"สวัสดี Mr.Java Sp&apos;e c&apos;i&apos;a&apos;l&apos;&apos; &apos;"
And so ...
// line 3
s = StringEscapeUtils.escapeHtml(s);
This replaces any active HTML / XML characters with character references. This includes the ampersand characters "&" that you previously inserted. The result is this:
"&#xxxx;&#xxxx;&#xxxx;&#xxxx; Mr.Java Sp&amp;apos;e c&amp;apos;i&amp;apos;a&amp;apos;l&amp;apos;&amp;apos; &amp;apos;"
(The &#xxxx; numeric character references encode those Thai (?) characters.)
When you embed that in an HTML document and display it, you will see "สวัสดี Mr.Java Sp&apos;e c&apos;i&apos;a&apos;l&apos;&apos; &apos;"
See what has happened? You have HTML-escaped your HTML-escaped apostrophes!!
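The double escaping can be reproduced with a small sketch. The escapeHtml method here is a hand-written stand-in, not the commons-lang implementation: it only handles &, <, > and ", and the class name and the ASCII-only sample string are assumptions made for illustration.

```java
class DoubleEscapeDemo {
    // Minimal stand-in for an HTML escaper (NOT the commons-lang code):
    // it escapes only the characters &, <, > and ".
    static String escapeHtml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        String s = "Mr.Java Sp'e";
        // Step corresponding to "line 2": apostrophes become entities.
        String once = s.replace("'", "&apos;");
        // Step corresponding to "line 3": the ampersand of each entity is escaped again.
        String twice = escapeHtml(once);
        System.out.println(once);
        System.out.println(twice);
    }
}
```

The second printed line contains &amp;apos;, which a browser renders as the literal text &apos; rather than as an apostrophe.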
So what do you really need to do?
There is no need to replace apostrophes with &apos;. Apostrophes are legal in HTML text.
There should be no need to add HTML escapes so that you can store text in a database:
Any modern database will allow you to store Unicode strings without any special encoding.
If you are trying to prevent the database's SQL parser getting confused by quotes in the text you are storing, you are doing it the wrong way. The right way to do this is to use a PreparedStatement, add parameter placeholders to the query, and use the PreparedStatement.setXxx methods to provide the parameter values. The execute (or whatever) will take care of any SQL escaping that needs to be done.
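A minimal sketch of that approach follows. The table name messages, the column body, and the class and method names are hypothetical; the Connection is assumed to come from your existing JDBC setup.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class MessageDao {
    // Hypothetical table and column, used only for illustration.
    static final String INSERT_SQL = "INSERT INTO messages (body) VALUES (?)";

    // Stores the text exactly as given: no manual quote doubling and no HTML escaping.
    static int saveMessage(Connection con, String body) throws SQLException {
        try (PreparedStatement stmt = con.prepareStatement(INSERT_SQL)) {
            stmt.setString(1, body); // the driver handles quotes in the value
            return stmt.executeUpdate();
        }
    }
}
```

With this shape the apostrophes in the text never become part of the SQL statement itself, so there is nothing to escape by hand. HTML escaping, if needed at all, would then be applied only when the value is written into the page.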

Related

How can i write values in a column using comma

As per my JMeter test plan, I am saving the following information into a CSV file using a Beanshell PostProcessor:
username = vars.get("username");
password = vars.get("password");
f = new FileOutputStream("/path/user_details.csv", true);
p = new PrintStream(f);
this.interpreter.setOut(p);
print(username + "," + password);
f.close();
How can I save those values into a single column, separated by a comma (username,password)?
Put double quotes around the entire string, so that the comma will be part of a single data item's value, rather than a value separator.
In practice, both the column separator and the quote character are configurable in a CSV library (which should almost always be used instead of trying to get the syntax details right on your own).
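As a rough sketch of the quoting rule described above, assuming the common convention of double quotes as the quote character and a doubled quote for an embedded one (the class name, method name and sample values are made up for illustration):

```java
class CsvQuoteDemo {
    // Wraps a value in double quotes and doubles any embedded double quotes,
    // so separators inside the value are read as data.
    static String quoteCsvField(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        String username = "alice";  // hypothetical sample values
        String password = "s3cret";
        // Written as one quoted field, so a CSV reader sees a single column.
        System.out.println(quoteCsvField(username + "," + password));
    }
}
```

In the Beanshell script, the same idea amounts to printing the quoted string instead of the bare username + "," + password.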

Why I can't use the org.apache.commons.lang.StringEscapeUtils to convert this String containing character as &apos and &egrave?

I am experimenting with the org.apache.commons.lang.StringEscapeUtils class but I am running into some difficulties.
I have the following situation in my code:
String notNormalized = "c&apos;&egrave;";
System.out.println("NOT NORMALIZED: " + notNormalized);
System.out.println("NORMALIZED: " + StringEscapeUtils.escapeJava(notNormalized));
So first I declare the notNormalized field, which (at least in my head) represents a non-normalized string containing an apostrophe represented by &apos; and an accented vowel represented by &egrave; (which should be the è character).
Then I print it without normalization, expecting the c&apos;&egrave; string, followed by its normalized version, for which I expect the converted c'è string.
But the problem is that I still obtain the same output; in fact, this is what I get in the console:
NOT NORMALIZED: c&apos;&egrave;
NORMALIZED: c&apos;&egrave;
Why? What am I missing? What is wrong? How can I perform this test and correctly convert a string that contains entities such as &apos;?
What you're looking to do is unescapeHtml4.
So
System.out.println("NORMALIZED: " + StringEscapeUtils.unescapeHtml4(notNormalized));
which prints
NORMALIZED: c&apos;è
Unfortunately, &apos; is not an HTML 4 entity and therefore can't be unescaped with this tool. You can use unescapeXml for the &apos; but not for the &egrave;. You'll have to mix and match.

Retrieve french characters from database in java

My sample input is "Mickaël".
When I query the database by adding a criterion in Hibernate, the code snippet is as follows:
pCriteria.add(Restrictions.ilike("lastName", lLastName.toLowerCase() + "%"));
I get results only for "mickaël".
But my requirement is to fetch both "Mickaël" and "Mickael".
Can somebody help out with this? TIA
You have to set an accent-insensitive collation, for example COLLATE French_CI_AI, either on the table or in the query.
Try it like this:
pCriteria.add(Restrictions.ilike("lastName", lLastName.toLowerCase().replaceAll("[ë]", "_")));
This turns the search string into Micka_l. The sign '_' stands for any single character, so if you have something like Micka^l or Mickaql in your table you will see those records too. You can add more special French characters in the square brackets, like [ëöèù], to cover other accented characters.
If you don't want to have other characters try this:
pCriteria.add(Restrictions.or(
    Restrictions.eq("lastName", lLastName.toLowerCase()),
    Restrictions.eq("lastName", lLastName.toLowerCase().replaceAll("[ë]", "e"))
));
I assume that you input the whole word; if you don't, change eq to ilike and add a percent sign at the end of each pattern, like this:
pCriteria.add(Restrictions.or(
    Restrictions.ilike("lastName", lLastName.toLowerCase() + "%"),
    Restrictions.ilike("lastName", lLastName.toLowerCase().replaceAll("[ë]", "e") + "%")
));
If there are other characters to handle, you have to work around them as above.
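Listing every accented character by hand gets tedious. One possible alternative for the search term, which is not what the answer above uses, is to strip accents with java.text.Normalizer. This is only a sketch: the class and method names are made up, and it normalizes the Java-side string only, so matching the accented value stored in the database still depends on the collation or on the OR condition shown above.

```java
import java.text.Normalizer;

class AccentFoldDemo {
    // Decomposes accented letters into base letter + combining mark (NFD),
    // then removes the combining marks.
    static String stripAccents(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // The name from the question, with the accented letter written as a Unicode escape.
        String lastName = "Micka\u00EBl";
        System.out.println(stripAccents(lastName).toLowerCase()); // prints "mickael"
    }
}
```

The result could replace the replaceAll("[ë]", "e") call in the second restriction.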

Exception "Illegal character in query at index -" in Android

I am trying to send data to a server using the following link.
**WEBSERVICE LINK:**
http://75.125.237.76/post_reviews.php?data=text1
If I set the data field to a single string (e.g. data=text1), the try block in my source code works fine, without any exception.
But when I set the data field to multiple strings separated by spaces (e.g. data=text1 text2 text3), an exception is thrown: Illegal character in query.
**EXCEPTION:**
Illegal character in query at index 75: http://75.125.237.76/post_reviews.php?data=text1 text2 text3
My question is: why is the exception thrown when the value contains multiple strings (like data=My name is xyz)?
If I replace the data field with a single string, it works fine (data=xyz).
Encode each space as %20. Other characters that are not allowed in a URL have their own percent-encodings.
Encode your URI string so the spaces are represented as %20.
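One way to get that encoding, sketched here with the standard java.net.URI class (the class and method names are made up for illustration), is to build the URI from its parts and let the multi-argument constructor percent-encode characters that are illegal in a query:

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriEncodeDemo {
    // Builds the request URI from scheme, authority, path and query.
    // The multi-argument constructor quotes illegal characters such as spaces.
    static String buildReviewUrl(String data) throws URISyntaxException {
        URI uri = new URI("http", "75.125.237.76", "/post_reviews.php", "data=" + data, null);
        return uri.toASCIIString();
    }

    public static void main(String[] args) throws URISyntaxException {
        // prints http://75.125.237.76/post_reviews.php?data=text1%20text2%20text3
        System.out.println(buildReviewUrl("text1 text2 text3"));
    }
}
```

Note that this constructor leaves characters that are legal in a query, such as & and =, untouched, so a value containing those would still need to be encoded separately.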

Java POST data to mySQL UTF-8 encoding issue

I have POST data that contains the Japanese string AKB48 ネ申テレビ シーズン3, defined in jQuery as data.
$("#some_div").load("someurl", { data : "AKB48 ネ申テレビ シーズン3"})
The post data is sent to Java Servlet:
String data = new String(this.request.getParameter("data").getBytes("ISO-8859-1"), "UTF-8");
My program saves it to MySQL, but after the data is saved to the database it becomes:
AKB48 u30CDu7533u30C6u30ECu30D3 u30B7u30FCu30BAu30F33
What should I do if I want to save it as it is in UTF-8? All my files are in UTF-8.
MySQL encoding is utf8 and here is the code
String sql = "INSERT INTO Inventory (uid, item_id, item_data, ctime) VALUES ("
+ inventory.getUid() + ",'"
+ inventory.getItemId() + "','"
+ StringEscapeUtils.escapeJava(inventory.getItemData()) + "',CURRENT_TIMESTAMP)";
Statement stmt = con.createStatement();
int cnt = stmt.executeUpdate(sql);
From your example above, I can verify that the Japanese string is getting saved to your MySQL database correctly, but as escaped Unicode.
I would check these items in order:
Are your tables and columns all set to have a character set and collation for utf8? I.e.,
CHARACTER SET utf8 COLLATE utf8_general_ci
Are you explicitly setting the character encoding before reading the POST parameters? request.setCharacterEncoding("UTF-8");
Are you setting the character encoding for your db connections? I.e., jdbc:mysql://localhost:3306/YOURDB?useUnicode=true&characterEncoding=UTF8
As the others have pointed out, you should not use that getBytes trick. It will surely mess up the POSTed values.
EDIT
Do not use StringEscapeUtils.escapeJava, since that will turn your string into escaped Unicode. That is what is transforming AKB48 ネ申テレビ シーズン3 into AKB48 u30CDu7533u30C6u30ECu30D3 u30B7u30FCu30BAu30F33.
Why don't you just extract the value of the parameter with this.request.getParameter("data")?
Your data is sent correctly using URL encoding, where each non-ASCII character is replaced by its code. Then you have to get the value of the parameter. When you request bytes using ISO-8859-1 you are actually corrupting your data, because the string is represented as a sequence of codes in textual form.
Java strings are stored in UTF-16. So, this code:
String data = new String(this.request.getParameter("data").getBytes("ISO-8859-1"), "UTF-8");
encodes a UTF-16 string (which has already been decoded from the UTF-8 bytes in the HTTP request) into a byte array using the ISO-8859-1 charset, and then decodes that byte array using the UTF-8 charset. This is almost certainly not what you want.
What happens when you use this?
String data = this.request.getParameter("data");
System.out.println(data);
If the second line prints bad data, then your problem is likely in jQuery. Check which charset the jQuery request actually declares:
System.out.println(this.request.getHeader("Content-Type"));
If it does not print bad data, but the data doesn't get stored correctly in MySQL, your problem is at the database level. Make sure your column type supports Unicode strings.
What's the point of the line
String data = new String(this.request.getParameter("data").getBytes("ISO-8859-1"), "UTF-8");
You're transforming Japanese (or at least non-Western) characters into bytes using the ISO-8859-1 encoding. Of course this can't work, since these characters are not supported by the ISO-8859-1 encoding. And then you're constructing a new String from bytes that are supposed to represent ISO-8859-1-encoded characters, using the UTF-8 encoding. This, once again, doesn't make any sense. UTF-8 and ISO-8859-1 are not the same thing, and only a small set of chars have the same encoding in both formats.
Just use
String data = this.request.getParameter("data");
and everything should be OK, provided that the column in the MySQL table uses an encoding that supports these characters.
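To see what the round trip does, here is a small sketch under the assumption that the container has already decoded the parameter correctly. The class and method names are made up, and the Japanese sample string is written with Unicode escapes so the source file stays plain ASCII.

```java
import java.nio.charset.StandardCharsets;

class RoundTripDemo {
    // Repeats the getBytes round trip from the question on a correctly decoded string.
    static String roundTrip(String decoded) {
        // Characters that ISO-8859-1 cannot represent are replaced by '?' here.
        byte[] latin1 = decoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The sample value from the question, written with Unicode escapes.
        String data = "AKB48 \u30CD\u7533\u30C6\u30EC\u30D3 \u30B7\u30FC\u30BA\u30F33";
        System.out.println(roundTrip(data)); // prints "AKB48 ????? ????3"
    }
}
```

Every Japanese character is lost in the first step, and nothing in the second step can bring it back.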
EDIT:
Now that you've shown us the code used to insert the data into the database, I know where all this comes from (the preceding points are still valid, though). You're doing
StringEscapeUtils.escapeJava(inventory.getItemData())
What's the point? escapeJava is used to take a String and escape special characters in order to make it a valid Java String literal. It has nothing to do with SQL. Use a prepared statement:
String sql = "INSERT INTO Inventory (uid, item_id, item_data, ctime) VALUES (?, ?, ?, CURRENT_TIMESTAMP)";
PreparedStatement stmt = con.prepareStatement(sql);
stmt.setInt(1, inventory.getUid()); // or setLong, depending on the type
stmt.setString(2, inventory.getItemId());
stmt.setString(3, inventory.getItemData());
int cnt = stmt.executeUpdate();
The PreparedStatement will take care of escaping special SQL characters correctly. Prepared statements are the best tool against SQL injection attacks, and should always be used when a query has parameters, especially if the parameters come from the end user. See http://docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html.
