java regular expression and replace all occurrences

java regular expression and replace all occurrences - java

I want to replace one string in a big string, but my regular expression is not proper I guess. So it's not working.
Main string is
Some sql part which is to be replaced
cond = emp.EMAIL_ID = 'xx#xx.com' AND
emp.PERMANENT_ADDR LIKE('%98n%')
AND hemp.EMPLOYEE_NAME = 'xxx' and is_active='Y'
String to find and replace is
Based on some condition sql part to be replaced
hemp.EMPLOYEE_NAME = 'xxx'
I have tried this with
Pattern and Matcher class is used and
Pattern pat1 = Pattern.compile("/^hemp.EMPLOYEE_NAME\\s=\\s\'\\w\'\\s[and|or]*/$", Pattern.CASE_INSENSITIVE);
Matcher mat = pat1.matcher(cond);
while (mat.find()) {
System.out.println("Match: " + mat.group());
cond = mat.replaceFirst("xx "+mat.group()+"x");
mat = pat1.matcher(cond);
}
It's not working, not entering the loop at all. Any help is appreciated.

Obviously not - your regexp pattern doesn't make any sense.
The opening /: In some languages, regexps aren't strings and start with an opening slash. Java is not one of those languages, and it has nothing to do with regexps itself. So, this looks for a literal slash in that SQL, which isn't there, thus, failure.
^ is regexpese for 'start of string'. Your string does not start with hemp.EMPLOYEE_NAME, so that also doesn't work. Get rid of both / and ^ here.
\\s is one whitespace character (there are many whitespace characters - this matches any one of them, exactly one though). Your string doesn't have any spaces. Your intent, surely, was \\s* which matches 0 to many of them, i.e.: \\s* is: "Whitespace is allowed here". \\s is: There must be exactly one whitespace character here. Make all the \\s in your regexp an \\s*.
\\w is exactly one 'word' character (which is more or less a letter or digit), you obviously wanted \\w*.
[and|or] this is regexpese for: "An a, or an n, or a d, or an o, or an r, or a pipe symbol". Clearly you were looking for (and|or) which is regexpese for: Either the sequence "and", or the sequence "or".
* - so you want 0 to many 'and' or 'or', which makes no sense.
closing slash: You don't want this.
closing $: You don't want this - it means 'end of string'. Your string didn't end here.
The code itself:
replaceFirst, itself, also does regexps. You don't want to double apply this stuff. That's not how you replace a found result.
This is what you wanted:
Matcher mat = pat1.matcher(cond);
mat.replaceFirst("replacement goes here");
where replacement can include references to groups in the match if you want to take parts of what you matched (i.e. don't use mat.group(), use those references).
More generally did you read any regexp tutorial, did any testing, or did any reading of the javadoc of Pattern and Matcher?
I've been developing for a few years. It's just personal experience, perhaps, but, reading is pretty fundamental.

Instead of the anchors ^ and $, you can use word boundaries \b to prevent a partial match.
If you want to match spaces on the same line, you can use \h to match horizontal whitespace char, as \s can also match a newline.
You can use replaceFirst on the string using $0 to get the full match, and an inline modifier (?i) for a case insensitive match.
Note that using [and|or] is a character class matching one of the listed chars and escape the dot to match it literally, or else . matches any char except a newline.
(?i)\bhemp\.EMPLOYEE_NAME\h*=\h*'\w+'\h+(?:and|or)\b
See a regex demo or a Java demo
For example
String regex = "\\bhemp\\.EMPLOYEE_NAME\\h*=\\h*'\\w+'\\h+(?:and|or)\\b";
String string = "cond = emp.EMAIL_ID = 'xx#xx.com' AND\n"
+ "emp.PERMANENT_ADDR LIKE('%98n%') \n"
+ "AND hemp.EMPLOYEE_NAME = 'xxx' and is_active='Y'";
System.out.println(string.replaceFirst(regex, "xx$0x"));
Output
cond = emp.EMAIL_ID = 'xx#xx.com' AND
emp.PERMANENT_ADDR LIKE('%98n%')
AND xxhemp.EMPLOYEE_NAME = 'xxx' andx is_active='Y'

Related

Regex pattern matching with multiple strings

Forgive me. I am not familiarized much with Regex patterns.
I have created a regex pattern as below.
String regex = Pattern.quote(value) + ", [NnoneOoff0-9\\-\\+\\/]+|[NnoneOoff0-9\\-\\+\\/]+, "
+ Pattern.quote(value);
This regex pattern is failing with 2 different set of strings.
value = "207e/160";
Use Case 1 -
When channelStr = "207e/160, 149/80"
Then channelStr.matches(regex), returns "true".
Use Case 2 -
When channelStr = "207e/160, 149/80, 11"
Then channelStr.matches(regex), returns "false".
Not able to figure out why? As far I can understand it may be because of the multiple spaces involved when more than 2 strings are present with separated by comma.
Not sure what should be correct pattern I should write for more than 2 strings.
Any help will be appreciated.

If you print your pattern, it is:
\Q207e/160\E, [NnoneOoff0-9\-\+\/]+|[NnoneOoff0-9\-\+\/]+, \Q207e/160\E
It consists of an alternation | matching a mandatory comma as well on the left as on the right side.
Using matches(), should match the whole string and that is the case for 207e/160, 149/80 so that is a match.
Only for this string 207e/160, 149/80, 11 there are 2 comma's, so you do get a partial match for the first part of the string, but you don't match the whole string so matches() returns false.
See the matches in this regex demo.
To match all the values, you can use a repeating pattern:
^[NnoeOf0-9+/-]+(?:,\h*[NnoeOf0-90+/-]+)*$
^ Start of string
[NnoeOf0-9\\+/-]+
(?: Non capture group
,\h* Match a comma and optional horizontal whitespace chars
[NnoeOf0-90-9\\+/-]+ Match 1+ any of the listed in the character class
)* Close the non capture group and optionally repeat it (if there should be at least 1 comma, then the quantifier can be + instead of *)
$ End of string
Regex demo
Example using matches():
String channelStr1 = "207e/160, 149/80";
String channelStr2 = "207e/160, 149/80, 11";
String regex = "^[NnoeOf0-9+/-]+(?:,\\h*[NnoeOf0-90+/-]+)*$";
System.out.println(channelStr1.matches(regex));
System.out.println(channelStr2.matches(regex));
Output
true
true
Note that in the character class you can put - at the end not having to escape it, and the + and / also does not have to be escaped.

You can use regex101 to test your RegEx. it has a description of everything that's going on to help with debugging. They have a quick reference section bottom right that you can use to figure out what you can do with examples and stuff.
A few things, you can add literals with \, so \" for a literal double quote.
If you want the pattern to be one or more of something, you would use +. These are called quantifiers and can be applied to groups, tokens, etc. The token for a whitespace character is \s. So, one or more whitespace characters would be \s+.
It's difficult to tell exactly what you're trying to do, but hopefully pointing you to regex101 will help. If you want to provide examples of the current RegEx you have, what you want to match and then the strings you're using to test it I'll be happy to provide you with an example.

^(?:[NnoneOoff0-9\\-\\+\\/]+ *(?:, *(?!$)|$))+$
^ Start
(?: ... ) Non-capturing group that defines an item and its separator. After each item, except the last, the separator (,) must appear. Spaces (one, several, or none) can appear before and after the comma, which is specified with *. This group can appear one or more times to the end of the string, as specified by the + quantifier after the group's closing parenthesis.
Regex101 Test

Regex Example Confusion

I am preparing for Oracle Certified Java Programmer. I am looking into regular expressions. I was going through javaranch Regular Expression and i am not able to understand the regular expression present in the example. Please help me in understanding it. I am adding source code for reference here. Thanks.
class Test
{
static Map props = new HashMap();
static
{
props.put("key1", "fox");
props.put("key2", "dog");
}
public static void main(String[] args)
{
String input = "The quick brown ${key1} jumps over the lazy ${key2}.";
Pattern p = Pattern.compile("\\$\\{([^}]+)\\}");
Matcher m = p.matcher(input);
StringBuffer sb = new StringBuffer();
while (m.find())
{
m.appendReplacement(sb, "");
sb.append(props.get(m.group(1)));
}
m.appendTail(sb);
System.out.println(sb.toString());
}
}

An illustration of your regex:
\$\{([^}]+)\}
Edit live on Debuggex

Here is a very good tutorial on regular expressions you might want to check out. The article on quantifiers has two sections "Laziness instead of Greediness" and "An Alternative to Laziness", that should explain this particular example really well.
Anyway, here is my explanation. First, you need to realize that there are two compilation steps in Java. One compiles the string literal in your code to an actual string. This step already interprets some of the backslashes, so that the string Java receives looks like
\$\{([^}]+)\}
Now let's pick that apart in free-spacing mode:
\$ # match a literal $
\{ # match a literal {
( # start capturing group 1
[^}] # match any single character except } - note the negation by ^
+ # repeat one or more times
) # end of capturing group 1
\} # match a literal }
So this really matches all occurrences of ${...}, where ... can be anything except closing }. The contents of the braces (i.e. the ...) can later be accessed via m.group(1), as it's the first set of parentheses in the expression.
Here are some more relevant articles of the above tutorial (but you should really read it in its entirety - it's definite worth it):
Character classes (including how to negate them with ^)
Repetition/quantifiers
Grouping and capturing
Java's regex peculiarities

\\$: matches a literal dollar sign. Without the backslashes, it matches the end of a string.
\\{: matches a literal opening curly brace.
(: start of a capturing group
[^}]: matches any character that isn't a closing curly brace.
+: repeats the last character set, which will match one or more characters that aren't curly braces.
): closing capturing group.
\\}: matches a literal closing curly brace.
It matches stuff that looks like ${key1}.

Explanation:
\\$ literal $ (must be escaped since it is a special character that
means "end of the string"
\\{ literal { (i m not sure this must be escaped but it doesn't matter)
( open the capture group 1
[^}]+ character class containing all chars but }
) close the capture group 1
\\} literal }

How can i add multiple match conditions in a regex

I have a String like this : String x = "return function ('ABC','DEF')";
I am using this:
Pattern pattern = Pattern.compile("'(.*?)'");
Matcher matcher = pattern.matcher(formula);
while (matcher.find()) {
System.out.println("------> " + matcher.group();
}
to retrieve strings between single quotes.
My question is: how can i adapt this regex so that it will check for strings between single quotes AND strings like " ,'DEF' " (meaning which start with ,' and end with ')?

You can use this pattern:
'[^']+'|"[^"]+"
Just to match with empty quoted string change '+' to '*'.
See test.

This pattern should do what you want:
"(?:,\s*)?'[^']*'"
The ? means the first group will match zero or one times.
I used (?:...) because this is a non-capturing group. It is better to use when you don't need to capture that portion of the match.
Also, I replaced .*? with [^']*, meaning the single-quoted string contains anything that is not a single quote. This is more efficient and less likely to lead to mistakes in your regex than .*?.
(Note: this regex allows there to be space between the comma and the start of the string. At first looking at your example, I thought that was true of your example. But now I see that it is not. Still, that might be useful depending on what your data looks like).

You could use the regex pattern:
Pattern.compile(",?'(.*?)'");
,? means 0 or 1 commas. The ? is greedy, so if there is a comma, it will be included in the match.
So: This will match:
A comma, followed by a string enclosed in single quotes
OR.. only a string enclosed in single quotes

Java Regex for username

I'm looking for a regex in Java, java.util.regex, to accept only letters ’, -, and . and a range of Unicode characters such as umlauts, eszett, diacritic and other valid letters from European languages.
What I don't want is numbers, spaces like “ ” or “ Tom”, or special characters like !”£$% etc.
So far I'm finding it very confusing.
I started with this
[A-Za-z.\\s\\-\\.\\W]+$
And ended up with this:
[A-Za-z.\\s\\-\\.\\D[^!\"£$%\\^&*:;##~,/?]]+$
Using the cavat to say none of the inner square brackets, according to the documentation
Anyone have any suggestions for a new regex or reasons why the above isn't working?

For my answer, I want to use a simpler regex similar to yours: [A-Z[^!]]+, which means "At least once: (a character from A to Z) or (a character that is not '!').
Note that "not '!'" already includes A to Z. So everything in the outer character group([A-Z...) is pointless.
Try [\p{Alpha}'-.]+ and compile the regex with the Pattern.UNICODE_CHARACTER_CLASS flag.

Use: (?=.*[##$%&\s]) - Return true when atleast one special character (from set) and also if username contain space.
you can add more special character as per your requirment. For Example:
String str = "k$shor";
String regex = "(?=.*[##$%&\\s])";
Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(str);
System.out.println(matcher.find()); => gives true

Problem matching regex pattern in Android

I am trying to search this string:
,"tt" : "ABC","r" : "+725.00","a" : "55.30",
For:
"r" : "725.00"
And here is my current code:
Pattern p = Pattern.compile("([r]\".:.\"[+|-][0-9]+.[0-9][0-9]\")");
Matcher m = p.matcher(raw_string);
I've been trying multiple variations of the pattern, and a match is never found. A second set of eyes would be great!

Your regexp actually works, it's almost correct
Pattern p = Pattern.compile("\"[r]\".:.\"[+|-][0-9]+.[0-9][0-9]\"");
Matcher m = p.matcher(raw_string);
if (m.find()){
String res = m.toMatchResult().group(0);
}

The next line should read:
if ( m.find() ) {
Are you doing that?
A few other issues: You're using . to match the spaces surrounding the colon; if that's always supposed to be whitespace, you should use + (one or more spaces) or \s+ (one or more whitespace characters). On the other hand, the dot between the digits is supposed to match a literal ., so you should escape it: \. Of course, since this is a Java String literal, you need to escape the backslashes: \\s+, \\..
You don't need the square brackets around the r, and if you don't want to match a | in front of the number you should change [+|-] to [+-].
While some of these issues I've mentioned could result in false positives, none of them would prevent it from matching valid input. That's why I suspect you aren't actually applying the regex by calling find(). It's a common mistake.

First thing try to escape your dot symbol: ...[0-9]+\.[0-9][0-9]...
because the dot symbol match any character...
Second thing: the [+|-]define a range of characters but it's mandatory...
try [+|-]?
Alban.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

java regular expression and replace all occurrences - java

Related

Regex pattern matching with multiple strings

Regex Example Confusion

How can i add multiple match conditions in a regex

Java Regex for username

Problem matching regex pattern in Android

Categories

Resources