Retrieve French characters from database in Java

I have the sample input "Mickaël".
When I query the database by adding a criterion in Hibernate, the code snippet is as follows:
pCriteria.add(Restrictions.ilike("lastName", lLastName.toLowerCase() + "%"));
I get results only for "mickaël".
But my requirement is to fetch both "Mickaël" and "Mickael".
Can somebody help with this? TIA

You have to set an accent-insensitive collation, for example COLLATE French_CI_AI, either on the table or in the query (see this post).

Try like this:
pCriteria.add(Restrictions.ilike("lastName", lLastName.toLowerCase().replaceAll("[ë]", "_")));
This turns the string into Micka_l. In a LIKE pattern the sign '_' stands for any single character, so if you have something like Micka^l or Mickaql in your table you will see those records too. You can add more special French characters inside the square brackets, like [ëöèù], to replace other special characters as well.
If you don't want to match other characters, try this:
pCriteria.add(Restrictions.or(
Restrictions.eq("lastName", lLastName.toLowerCase()),
Restrictions.eq("lastName", lLastName.toLowerCase().replaceAll("[ë]", "e"))
));
I assume that you input the whole word. If you don't, change eq to ilike and add a percent sign at the end of each pattern, like this:
pCriteria.add(Restrictions.or(
Restrictions.ilike("lastName", lLastName.toLowerCase() + "%"),
Restrictions.ilike("lastName", lLastName.toLowerCase().replaceAll("[ë]", "e") + "%")
));
If there are other characters to replace, you have to handle them in the same way as above.
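If listing every accented character by hand becomes unwieldy, the accent-free variant of the input can also be derived generically. This is only a minimal sketch and not part of the answers above: it assumes the standard java.text.Normalizer class is acceptable, and the names AccentFolding and stripAccents are made up for illustration. It folds the search input only; the stored values are still compared as they are in the database.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of Hibernate: folds accented letters to their base letters.
final class AccentFolding {

    // Decompose each character (an e with diaeresis becomes e + combining diaeresis),
    // then drop the combining marks so that only the base letters remain.
    static String stripAccents(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // The name is written with a Unicode escape so the source file encoding does not matter.
        System.out.println(stripAccents("Micka\u00EBl"));
    }
}
```

The folded value could then be used as the second alternative in the Restrictions.or(...) call, in place of the hand-written replaceAll.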

Related

How to match java regexp between some '#'?

I am facing an issue with the String.replaceFirst method.
I have the following String :
String content = "select * from queries "
+ "where update_date >= to_timestamp('#date|Date debut|dd/MM/yyyy# 00:00:00','DD/MM/YYYY HH24:MI:SS') "
+ "and update_date <= to_timestamp('#date|Date fin|dd/MM/yyyy# 23:59:59','DD/MM/YYYY HH24:MI:SS')";
(The two expressions between '#' are dynamically defined.)
And I have two dates too:
String begin = "28/05/2018";
String end = "29/05/2018";
Then I would like to replace the first expression with begin, and the second with end.
I use:
content = content.replaceFirst("#(date)\\|(.*)\\|(.*)#", begin);
content = content.replaceFirst("#(date)\\|(.*)\\|(.*)#", end);
However, replaceFirst matches up to the last '#' of the entire String, and I obtain:
select * from queries where update_date >= to_timestamp('28/05/2018 23:59:59','DD/MM/YYYY HH24:MI:SS');
I understand the error, but I need help finding a solution.
Thanks a lot! Axel.
If you are looking for a generic regex for both replacements, as the code in your question seems to want, this is how to make it work:
The .* in your regex, which captures any characters, is greedy by default: it tries to capture as many characters as it can. This is why your first replacement replaces everything up to the last '#'.
You can add the lazy quantifier ? to specify that you want to capture as few characters as possible instead of as many as possible.
Try:
#(date)\|(.*?)\|(.*?)#
(or the escaped version for your code: "#(date)\\|(.*?)\\|(.*?)#")
See the regex on regex101.
When reading your question, I was not sure whether the text between the #s (here I mean "date|Date debut|dd/MM/yyyy" and "date|Date fin|dd/MM/yyyy") was dynamically defined, or whether you were just explaining that you wanted to replace the fixed contents above with your dynamically defined dates.
So I will give you two answers (and both should work).
If the text is fixed:
#date\|Date debut\|dd/MM/yyyy# - for the first range
#date\|Date fin\|dd/MM/yyyy# - for the second range
If the text between the #s is not fixed:
#[^#]*#
The regex above means: find a range of characters that starts with a #, then contains any character except a # (this is what [^#] means) zero or more times (the *), and ends with a #.
I hope it helps!
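To see the difference between the greedy and the lazy pattern side by side, here is a small sketch. The class name GreedyVsLazyDemo, the helper fill, and the shortened template string are made up for illustration; the two regexes are the ones discussed above.

```java
// Illustrative only: compares the greedy pattern from the question with the lazy one.
final class GreedyVsLazyDemo {

    static final String GREEDY = "#(date)\\|(.*)\\|(.*)#";
    static final String LAZY = "#(date)\\|(.*?)\\|(.*?)#";

    // Applies replaceFirst twice, as in the question: first with begin, then with end.
    static String fill(String regex, String template, String begin, String end) {
        return template.replaceFirst(regex, begin).replaceFirst(regex, end);
    }

    public static void main(String[] args) {
        String template = "from '#date|Date debut|dd/MM/yyyy#' to '#date|Date fin|dd/MM/yyyy#'";
        // Greedy: the first match runs up to the last '#', so the second placeholder is swallowed.
        System.out.println(fill(GREEDY, template, "28/05/2018", "29/05/2018"));
        // Lazy: each match stops at the nearest '#', so both placeholders are replaced.
        System.out.println(fill(LAZY, template, "28/05/2018", "29/05/2018"));
    }
}
```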
Try this:
String content = "select * from queries " +
"where update_date >= to_timestamp('#date|Date debut|dd/MM/yyyy# 00:00:00','DD/MM/YYYY HH24:MI:SS') " +
"and update_date <= to_timestamp('#date|Date fin|dd/MM/yyyy# 23:59:59','DD/MM/YYYY HH24:MI:SS') ;";
String begin = "28/05/2018";
String end = "29/05/2018";
content = content.replaceFirst( "#date\\|[^\\|]*\\|[^#]*#", begin );
content = content.replaceFirst( "#date\\|[^\\|]*\\|[^#]*#", end );
System.out.println( content );
Here we don't need the () groups; each part simply matches up to the next delimiter character, | or #.
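If the number of placeholders can vary, calling replaceFirst once per value gets repetitive. One possible alternative, sketched here under assumptions, is to walk the matches with Matcher.find and substitute one value per match. PlaceholderFiller and fill are hypothetical names, and the pattern is a slight variation of the negated-class regex above. Matcher.quoteReplacement keeps characters such as $ in a value from being read as group references.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces the n-th #date|...|...# placeholder with the n-th value.
final class PlaceholderFiller {

    private static final Pattern PLACEHOLDER = Pattern.compile("#date\\|[^|#]*\\|[^#]*#");

    static String fill(String template, List<String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        Iterator<String> next = values.iterator();
        StringBuffer out = new StringBuffer();
        // Stop when either the placeholders or the values run out.
        while (matcher.find() && next.hasNext()) {
            matcher.appendReplacement(out, Matcher.quoteReplacement(next.next()));
        }
        // Copy whatever follows the last replaced placeholder unchanged.
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "from '#date|Date debut|dd/MM/yyyy#' to '#date|Date fin|dd/MM/yyyy#'";
        System.out.println(fill(template, Arrays.asList("28/05/2018", "29/05/2018")));
    }
}
```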

neo4j escaping for regular expressions

I receive a user input keyword and want to use it to search my database. I built a query that looks something like this:
db.execute("MATCH (n:User) WHERE n.firstname CONTAINS {keyword} OR n.lastname CONTAINS {keyword} RETURN n.username", params);
But this is case-sensitive, so I thought of manually building the expression and using regular expressions, roughly as follows:
db.execute("MATCH (n:User) WHERE n.firstname =~ '(?i).*" + keyword + ".*' OR n.lastname =~ '(?i).*" + keyword + ".*' RETURN n.username");
I'm looking either for a function for escaping the regex or a better solution for making the query case-insensitive. Any ideas?
I would suggest storing the properties as all lowercase (or uppercase) and then using the Cypher lower() function to convert user input to lowercase for comparison.
Add lowercase name properties
MATCH (n:User)
SET n.lowerFirstName = lower(n.firstname),
n.lowerLastName = lower(n.lastname)
Find lower case matches based on user input
db.execute("MATCH (n:User) WHERE n.lowerFirstName CONTAINS lower({keyword}) OR n.lowerLastName CONTAINS lower({keyword}) RETURN n.username", params);
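On the escaping part of the question: if you do want to keep the regex approach, one option is to quote the keyword on the Java side and pass the finished pattern as a parameter instead of concatenating it into the query. This is a sketch under the assumption that Cypher's =~ operator accepts Java regex syntax, including the \Q...\E quoting that Pattern.quote produces. KeywordRegex, containsIgnoreCase, params and the parameter name pattern are made-up names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: builds a case-insensitive "contains" regex from raw user input.
final class KeywordRegex {

    // Same {param} placeholder style as the queries above; the pattern is passed as a parameter.
    static final String QUERY =
        "MATCH (n:User) WHERE n.firstname =~ {pattern} OR n.lastname =~ {pattern} RETURN n.username";

    // Pattern.quote wraps the keyword so that characters like . or * are matched literally.
    static String containsIgnoreCase(String keyword) {
        return "(?i).*" + Pattern.quote(keyword) + ".*";
    }

    static Map<String, Object> params(String keyword) {
        Map<String, Object> params = new HashMap<>();
        params.put("pattern", containsIgnoreCase(keyword));
        return params;
    }

    public static void main(String[] args) {
        System.out.println(QUERY);
        System.out.println(params("o.neil"));
    }
}
```

The map would then be passed as the second argument of db.execute, as in the parameterised query from the question.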

Java - Capture optional field with regexp?

I have a regex that correctly captures values from a string.
The regex looks like this:
intGetHatSaatRenk_v22=anyType{SiraNo=(.*?); HatKodu=(.*?) ; GunTipi=(.*?); Gidis=(.*?); ? };
But the problem is that the source looks like this:
intGetHatSaatRenk_v22=anyType{SiraNo=54; HatKodu=502 ; GunTipi=C; Gidis=12:00; RenkGidis=0000FF; };
intGetHatSaatRenk_v22=anyType{SiraNo=55; HatKodu=502 ; GunTipi=C; Gidis=12:07; }; intGetHatSaatRenk_v22=anyType{SiraNo=56; HatKodu=502 ; GunTipi=C; Gidis=12:14; };
As you can see, there is an optional field named RenkGidis. How can I get the value of RenkGidis when it is present?
With the regex I wrote above, when RenkGidis exists, group(4) is 12:00; RenkGidis=0000FF, but group(4) should be only 12:00.
I hope I have explained my problem clearly.
You might want to make the last group optional:
intGetHatSaatRenk_v22=anyType\{SiraNo=([^;\s]*);\s+HatKodu=([^;\s]*)\s*;\s+GunTipi=([^;\s]*);\s+Gidis=([^;\s]*);(?:\s+RenkGidis=([^;\s]*);)?
As a Java string:
"intGetHatSaatRenk_v22=anyType\\{SiraNo=([^;\\s]*);\\s+HatKodu=([^;\\s]*)\\s*;\\s+GunTipi=([^;\\s]*);\\s+Gidis=([^;\\s]*);(?:\\s+RenkGidis=([^;\\s]*);)?"
In the last group, (?: prevents the outer group from being captured; the ( ) inside it is captured as usual.
I also changed .*? to [^;\s]* (the negation of [;\s], i.e. any characters that are neither whitespace nor ;).
As Alan mentioned in the comments, to avoid getting a null match for the optional part, you can make only the RenkGidis= prefix optional and wrap the value in an alternation with nothing: ([^;\s]*|)
intGetHatSaatRenk_v22=anyType\{SiraNo=([^;\s]*);\s+HatKodu=([^;\s]*)\s*;\s+GunTipi=([^;\s]*);\s+Gidis=([^;\s]*);(?:\s+RenkGidis=)?([^;\s]*|)
As a Java string:
"intGetHatSaatRenk_v22=anyType\\{SiraNo=([^;\\s]*);\\s+HatKodu=([^;\\s]*)\\s*;\\s+GunTipi=([^;\\s]*);\\s+Gidis=([^;\\s]*);(?:\\s+RenkGidis=)?([^;\\s]*|)"
The regex could look like this
intGetHatSaatRenk_v22=anyType\{SiraNo=(.*?); HatKodu=(.*?) ; GunTipi=(.*?); Gidis=(.*?);( RenkGidis=.*?;\s*|\s*)\};
Group 5 will then be either " RenkGidis=0000FF;" or " ". You can then use a second regex to get 0000FF.
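For reference, this is roughly how the first variant (optional non-capturing group around RenkGidis) behaves when run over the sample input. It is a sketch: HatSaatParser and parse are hypothetical names, and each row is returned as a plain String array for brevity. When RenkGidis is missing, group(5) is null, so the caller has to check for that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the four mandatory fields plus the optional RenkGidis.
final class HatSaatParser {

    private static final Pattern ROW = Pattern.compile(
        "intGetHatSaatRenk_v22=anyType\\{SiraNo=([^;\\s]*);\\s+HatKodu=([^;\\s]*)\\s*;"
        + "\\s+GunTipi=([^;\\s]*);\\s+Gidis=([^;\\s]*);(?:\\s+RenkGidis=([^;\\s]*);)?");

    // Each array holds SiraNo, HatKodu, GunTipi, Gidis and RenkGidis (null when absent).
    static List<String[]> parse(String source) {
        List<String[]> rows = new ArrayList<>();
        Matcher matcher = ROW.matcher(source);
        while (matcher.find()) {
            rows.add(new String[] {
                matcher.group(1), matcher.group(2), matcher.group(3),
                matcher.group(4), matcher.group(5)
            });
        }
        return rows;
    }

    public static void main(String[] args) {
        String source =
            "intGetHatSaatRenk_v22=anyType{SiraNo=54; HatKodu=502 ; GunTipi=C; Gidis=12:00; RenkGidis=0000FF; }; "
            + "intGetHatSaatRenk_v22=anyType{SiraNo=55; HatKodu=502 ; GunTipi=C; Gidis=12:07; };";
        for (String[] row : parse(source)) {
            System.out.println(row[3] + " -> " + (row[4] == null ? "(no RenkGidis)" : row[4]));
        }
    }
}
```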

Special chars in JAVA

สวัสดี Mr.Java Sp'e c'i'a'l'' '
I tried to process the String using the code below, but I couldn't make it work; it simply shows the wrong value.
String s = "สวัสดี Mr.Java Sp'e c'i'a'l'' '";
s = s.replaceAll("'", "&apos;");
//s = s.replaceAll("'", "''");
s = StringEscapeUtils.escapeHtml(s);
I am trying to read the value from a JSP, save it in a SQL Server DB, display it in a JSP, and update it.
But sometimes the JSP shows the converted &apos; literally instead of the special characters.
Put simply: I have posted this String (สวัสดี Mr.Java Sp'e c'i'a'l'' ') here on Stack Overflow; they save it in their DB, display it, and allow me to update it. This is what I want.
OK. So let's look at what your code does:
// line 1
String s = "สวัสดี Mr.Java Sp'e c'i'a'l'' '";
We have a String with various international characters in it ... and some "'" characters.
// line 2
s = s.replaceAll("'", "&apos;");
Assuming that those are really "'" characters, we will replace all instances of "'" with an XML / HTML character entity, giving us:
"สวัสดี Mr.Java Sp&apos;e c&apos;i&apos;a&apos;l&apos;&apos; &apos;"
And so ...
// line 3
s = StringEscapeUtils.escapeHtml(s);
This replaces any active HTML / XML characters with character references. This includes the ampersand characters "&" that you previously inserted. The result is this:
"&#xxxx;&#xxxx;&#xxxx;&#xxxx; Mr.Java Sp&amp;apos;e c&amp;apos;i&amp;apos;a&amp;apos;l&amp;apos;&amp;apos; &amp;apos;"
(The &#xxxx; numeric character references encode those Thai (?) characters.)
When you embed that in an HTML document and display it, you will see "สวัสดี Mr.Java Sp&apos;e c&apos;i&apos;a&apos;l&apos;&apos; &apos;"
See what has happened? You have HTML escaped your HTML escaped apostrophes!!
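The double escaping can be reproduced in a few lines. Note that the escapeHtml method below is a deliberately simplified stand-in written for this illustration (it handles only &, <, > and the double quote); it is not the Commons Lang implementation, and DoubleEscapeDemo is a made-up name.

```java
// Illustrative only: shows how an already-inserted entity gets escaped a second time.
final class DoubleEscapeDemo {

    // Simplified stand-in for an HTML escaper; the ampersand must be handled first.
    static String escapeHtml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        String manual = "Sp'e".replace("'", "&apos;"); // hand-inserted entity
        String escaped = escapeHtml(manual);           // the & of that entity is escaped again
        System.out.println(manual);
        System.out.println(escaped);
        // Without the manual step the apostrophe passes through untouched.
        System.out.println(escapeHtml("Sp'e"));
    }
}
```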
So what do you really need to do?
There is no need to replace apostrophes with &apos;. Apostrophes are legal in HTML text.
There should be no need to add HTML escapes so that you can store text in a database:
Any modern database will allow you to store Unicode strings without any special encoding.
If you are trying to prevent the database's SQL parser getting confused by quotes in the text you are storing, you are doing it the wrong way. The right way to do this is to use a PreparedStatement, add parameter placeholders to the query, and use the PreparedStatement.setXxx methods to provide the parameter values. The execute (or whatever) will take care of any SQL escaping that needs to be done.
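A minimal sketch of the PreparedStatement approach follows. The table name person, the column display_name and the class PersonDao are hypothetical; the Connection is assumed to come from whatever data source the application already uses, and the column is assumed to have a type that can hold Unicode text.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical DAO: stores the raw text, with no manual quoting or HTML escaping.
final class PersonDao {

    // The ? placeholder is filled in by the driver, so quotes in the value are harmless.
    static final String INSERT_SQL = "INSERT INTO person (display_name) VALUES (?)";

    static int insertName(Connection connection, String name) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(INSERT_SQL)) {
            statement.setString(1, name); // e.g. "Sp'e c'i'a'l" is passed through unchanged
            return statement.executeUpdate();
        }
    }
}
```

HTML escaping then happens only once, at the point where the stored text is written into the page.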

Java String Replace Regex

I am doing some string replace in SQL on the fly.
MySQLString = " a.account=b.account ";
MySQLString = " a.accountnum=b.accountnum ";
Now if I do this
MySQLString.replaceAll("account", "account_enc");
the result will be
a.account_enc=b.account_enc
(This is good)
But look at the second result:
a.account_encnum=b.account_encnum
(This is not good; it should be a.accountnum_enc=b.accountnum_enc.)
Please advise how I can achieve what I want with Java String replace.
Many Thanks.
From your comment:
Is there any way to tell the regex to replace only a.account=b.account or a.accountnum=b.accountnum? I do not want accountname to be replaced with _enc.
If I understand correctly, you want to add the _enc part only to account or accountnum. To do this you can use
MySQLString = MySQLString.replaceAll("\\baccount(num)?\\b", "$0_enc");
(num)? means that num is optional, so the regex will accept account or accountnum.
\\b at the start means that there can be no letters, digits or "_" before account, so it won't match (or affect) something like myaccount or my_account.
\\b at the end prevents other letters, digits or "_" from following account or accountnum.
It's hard to extrapolate from so few examples, but maybe what you want is:
MySQLString = MySQLString.replaceAll("account\\w*", "$0_enc");
which will append _enc to any sequence of letters, digits, and underscores that starts with account.
Try
String s = " a.accountnum=b.accountnum ".replaceAll("(account[^ =]*)", "$1_enc");
It replaces any sequence of characters that starts with the word "account" and contains no ' ' or '=' with the sequence found + "_enc".
$1 is a reference to group 1 in the regex; group 1 is the expression in parentheses, (account[^ =]*), i.e. our sequence.
See http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html for details
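To compare the suggestions, here is a small sketch that runs the word-boundary pattern and the account\w* pattern over the two inputs from the question plus an accountname column. AccountSuffix, strict and loose are names made up for this example.

```java
// Illustrative only: the strict pattern touches account/accountnum, the loose one any account* word.
final class AccountSuffix {

    // Appends _enc only to the whole words account and accountnum.
    static String strict(String sql) {
        return sql.replaceAll("\\baccount(num)?\\b", "$0_enc");
    }

    // Appends _enc to any run of word characters that starts with account.
    static String loose(String sql) {
        return sql.replaceAll("account\\w*", "$0_enc");
    }

    public static void main(String[] args) {
        String[] inputs = {
            " a.account=b.account ",
            " a.accountnum=b.accountnum ",
            " a.accountname=b.accountname "
        };
        for (String input : inputs) {
            System.out.println("strict:" + strict(input) + "| loose:" + loose(input));
        }
    }
}
```

With the strict pattern the accountname column is left alone, which is what the comment quoted above asks for.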
