Cleaning text for regex [duplicate] - java

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to escape text for regular expression in Java
Is there a built-in way or a standard library for cleaning arbitrary strings for use in a regex?
For example, given the string something .* foo, I want to produce a regex like ^something \.\* foo$. Is there an easy way to do that?

You can use Pattern.quote(String) for this purpose. From the docs:
Returns a literal pattern String for the specified String.
This method produces a String that can be used to create a Pattern that would
match the string s as if it were a literal pattern.
Metacharacters or escape sequences in the input sequence will be given
no special meaning.
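As a minimal sketch of how that could look for the string from the question (the class and method names here are made up for illustration): note that Pattern.quote does not produce \.\* style escapes. It wraps the text in \Q...\E instead, which has the same matching effect.

```java
import java.util.regex.Pattern;

final class QuoteDemo {
    // Hypothetical helper: builds a pattern that matches `text` literally,
    // anchored at both ends of the input.
    static Pattern literalWholeString(String text) {
        return Pattern.compile("^" + Pattern.quote(text) + "$");
    }

    public static void main(String[] args) {
        Pattern p = literalWholeString("something .* foo");
        // The ".*" is treated as two ordinary characters, not as "anything".
        System.out.println(p.matcher("something .* foo").matches());   // true
        System.out.println(p.matcher("something else foo").matches()); // false
    }
}
```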

Related

Check dot(.) present in the string without considering it as regular expression [duplicate]

This question already has answers here:
Java RegEx meta character (.) and ordinary dot?
(9 answers)
Closed 2 years ago.
I am new to Java and am currently practicing String methods. I wanted to check whether the email contains "gmail.com" or not.
This is the code I came up with:
System.out.println(domain.matches(".*gmail.com(.*)"));
But the dot (.) here means any character, so even if I pass the string "xyz#gmailpcom" it returns true. Basically, I want to check for a literal dot in the string without it being treated as a regular expression metacharacter.
If you just want to check whether it contains gmail.com, then just use contains:
System.out.println(domain.contains("gmail.com"));
If at a later stage you want to use a regular expression but escape a dot, do that with a backslash in the regular expression, which needs to be escaped again for use in a Java string literal:
domain.matches(".*gmail\\.com(.*)")
You can also check by using indexOf or lastIndexOf:
System.out.println(domain.indexOf("gmail.com") != -1);
System.out.println(domain.lastIndexOf("gmail.com") != -1);
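To see the difference between these options side by side, here is a small sketch using the input from the question. The class and method names are only for illustration.

```java
final class DotCheckDemo {
    // Unescaped dot: "." matches any character, so "gmailpcom" slips through.
    static boolean unescaped(String s) {
        return s.matches(".*gmail.com(.*)");
    }

    // Escaped dot: "\\." in the Java literal is the regex "\.", a literal dot.
    static boolean escaped(String s) {
        return s.matches(".*gmail\\.com(.*)");
    }

    public static void main(String[] args) {
        String bad = "xyz#gmailpcom";
        System.out.println(unescaped(bad));             // true
        System.out.println(escaped(bad));               // false
        System.out.println(bad.contains("gmail.com"));  // false
    }
}
```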

Replacing a specific identifier with regex [duplicate]

This question already has answers here:
Regex match entire words only
(7 answers)
Closed 5 years ago.
I have a Java regex for replacing all instances of a specific identifier in a script.
This is the search regex that searches for the "foo" identifier:
([^\w_]|^)foo([^\w\d_]|$)
And this is the replacement:
$1bar$2
Doing a replaceAll in something like
for foo: [1,2,3];foo&&foo;
works well, it outputs
for bar: [1,2,3];bar&&bar;
However, when we apply this to a string with two instances of the identifier separated by a single character, it only replaces the first:
foo&foo
outputs
bar&foo
This happens, I think, because the first match is "foo&", which consumes the "&", so when the rest of the string is analyzed there is no character left in front of the second foo and no other match is found.
Is there a way to fix this by changing the regex only?
You are probably looking for \bfoo\b as your regex. Otherwise, use lookarounds: (?<=\W|^)foo(?=\W|$). Either way, the replacement string is just bar, because word boundaries and lookarounds do not consume the neighbouring characters.
Note: \d and _ are subsets of \w and [^\w] is equal to \W
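Here is a rough sketch comparing the original pattern with the two suggestions on the foo&foo input. The class and method names are invented for the example.

```java
final class IdentifierReplaceDemo {
    // The pattern from the question: the separator is consumed by the match.
    static String original(String s) {
        return s.replaceAll("([^\\w_]|^)foo([^\\w\\d_]|$)", "$1bar$2");
    }

    // Word boundaries are zero-width, so nothing around "foo" is consumed.
    static String wordBoundary(String s) {
        return s.replaceAll("\\bfoo\\b", "bar");
    }

    // Lookbehind and lookahead only inspect the neighbours without consuming them.
    static String lookaround(String s) {
        return s.replaceAll("(?<=\\W|^)foo(?=\\W|$)", "bar");
    }

    public static void main(String[] args) {
        System.out.println(original("foo&foo"));     // bar&foo
        System.out.println(wordBoundary("foo&foo")); // bar&bar
        System.out.println(lookaround("foo&foo"));   // bar&bar
    }
}
```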

This pattern matches for input 123456789.2.2.2 , which it should not [duplicate]

This question already has answers here:
Java RegEx meta character (.) and ordinary dot?
(9 answers)
Closed 6 years ago.
I am trying to solve the following task:
Match the pattern abc.def.ghi.jkl, where each variable a,b,c,d,e,f,g,h,i,j,k,l can be any single character except the newline.
For the above task I am matching the input against this regex:
"([^\\n]{3}(.)){3}([^\\n]{3})"
// this is the regex pattern I am using currently
What am I doing wrong? Please help me correct the regex above so that it does not match the incorrect input given in the title, which it currently does. Although I specified a count of 3, it apparently matches more than 3 characters.
. has a special meaning in regular expression patterns.
If you want to get a "simple dot", you need to quote/escape it (as "\\.").
And that special meaning is (under normal configuration) "any character except line breaks", which exactly matches your other condition, so you can simplify this to
"(...)\\.(...)\\.(...)\\.(...)"
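A quick sketch to check both patterns against the input from the title (class and method names are just for illustration). One small caveat: by default . in Java also excludes other line terminators such as \r, not only \n, so the simplified pattern is slightly stricter than [^\n].

```java
import java.util.regex.Pattern;

final class DottedGroupsDemo {
    // Pattern from the question: "(.)" accepts any character as the separator.
    static final Pattern UNESCAPED = Pattern.compile("([^\\n]{3}(.)){3}([^\\n]{3})");

    // Suggested fix: "\\." requires a literal dot between the groups.
    static final Pattern ESCAPED = Pattern.compile("(...)\\.(...)\\.(...)\\.(...)");

    static boolean matchesUnescaped(String s) {
        return UNESCAPED.matcher(s).matches();
    }

    static boolean matchesEscaped(String s) {
        return ESCAPED.matcher(s).matches();
    }

    public static void main(String[] args) {
        String wrong = "123456789.2.2.2";
        System.out.println(matchesUnescaped(wrong));          // true
        System.out.println(matchesEscaped(wrong));            // false
        System.out.println(matchesEscaped("abc.def.ghi.jkl")); // true
    }
}
```

The unescaped pattern accepts the title input because it only requires 15 non-newline characters in total; it never checks that positions 4, 8 and 12 are actual dots.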

Regular Expression using Java [duplicate]

This question already has answers here:
Java string split with "." (dot) [duplicate]
(4 answers)
Closed 8 years ago.
I am using a regular expression to break up a string, but I think my pattern is missing something. Can anyone tell me where I went wrong?
String betweenstring="['Sheet 1$'].[DEPTNO] AS [DEPTNO]";
System.out.println("betweenstring: "+betweenstring);
Pattern pattern = Pattern.compile("\\w+[.]\\w+");
Matcher matchers = pattern.matcher(betweenstring);
while (matchers.find()) {
    String filtereddata = matchers.group(0);
    System.out.println("filtereddata: " + filtereddata);
}
I need to break like this:
['Sheet 1$']
[DEPTNO] AS [DEPTNO]
Given your very specific input, this regex works.
([\w\[\]' $]+)\.([\w\[\]' $]+)
Capture group one is before the period, capture group 2, after. To escape this for a Java string:
Pattern pattern = Pattern.compile("([\\w\\[\\]' $]+)\\.([\\w\\[\\]' $]+)");
However, it would be much easier to split the string on the literal dot, if this is what you are trying to achieve:
String[] pieces = betweenstring.split("\\.");
System.out.println(pieces[0]);
System.out.println(pieces[1]);
Output:
['Sheet 1$']
[DEPTNO] AS [DEPTNO]
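For completeness, a small sketch showing how the two capture groups could be read out, next to the split approach. The class and method names are hypothetical, and the limit of 2 passed to split is my own addition so that any later dots stay in the second piece.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SplitOnDotDemo {
    // Group 1 is the text before the literal dot, group 2 the text after it.
    private static final Pattern PARTS =
            Pattern.compile("([\\w\\[\\]' $]+)\\.([\\w\\[\\]' $]+)");

    // Returns the two groups, or an empty array if the input does not fit.
    static String[] viaGroups(String input) {
        Matcher m = PARTS.matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        return new String[] { m.group(1), m.group(2) };
    }

    // Splits on the first literal dot only.
    static String[] viaSplit(String input) {
        return input.split("\\.", 2);
    }

    public static void main(String[] args) {
        String betweenstring = "['Sheet 1$'].[DEPTNO] AS [DEPTNO]";
        for (String piece : viaGroups(betweenstring)) {
            System.out.println(piece);
        }
    }
}
```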

Convert java string to string compatible with a regex in replaceAll [duplicate]

This question already has answers here:
How to escape text for regular expression in Java?
(8 answers)
Closed 9 years ago.
Is there a library or any easy way to convert a string so that it is safe to use as a regex to look for and replace in another string? So if the string is "$money" it would get converted to "\$money". I tried using StringEscapeUtil.escape but it doesn't work with characters such as $.
You can use Pattern.quote("$money").
Alternatively, put \\Q in front of the string and \\E at the end:
"\\Q$money\\E"
This tells the regex engine that the string between \Q and \E must be interpreted verbatim, ignoring any metacharacters that it may contain.
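Since the question is about replaceAll, here is a sketch of a literal search-and-replace. The helper name is made up. It also quotes the replacement with Matcher.quoteReplacement, because $ and \ have a special meaning in replacement strings too. Pattern.quote is generally the safer choice over wrapping by hand, since it also copes with input that itself contains \E.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LiteralReplaceDemo {
    // Hypothetical helper: treats both the search text and the replacement literally.
    static String replaceLiteral(String input, String target, String replacement) {
        return input.replaceAll(Pattern.quote(target),
                                Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // "$money" is searched for as plain text; "$5" is inserted as plain text.
        System.out.println(replaceLiteral("cost: $money", "$money", "$5")); // cost: $5
    }
}
```

If no regex features are needed at all, String.replace(CharSequence, CharSequence) already replaces every occurrence literally and avoids the quoting entirely.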
