To check if a pattern exists in a String - java

I tried searching but could not find anything that made any sense to me! I am noob at regex :)
Trying to see if a particular word "some_text" exists in another string.
String s = "This is a test() function";
String s2 = "This is a test () function";
Assuming the above two strings I can search this using the following pattern at RegEx Tool
[^\w]test[ ]*[(]
But unable to get a positive match in Java using
System.out.println(s.matches("[^\\w]test[ ]*[(]"));
I have tried with double \ and even four \\ as escape characters but nothing really works.
The requirement is to see the word starts with space or is the first word of a line and has an open bracket "(" after that particular word, so that all these "test (), test() or test ()" should get a positive match.
Using Java 1.8
Cheers,
Faisal.

The point you are missing is that Java's matches() must match the entire string, as if it put a ^ at the start and a $ at the end of the regex for you. So your expression actually is seen as:
^[^\w]test[ ]*[(]$
which is never going to match your input.
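The difference can be seen in a minimal sketch that runs the asker's original pattern both ways. The class name MatchesVsFind and its two helper methods are made up for illustration; only Pattern and Matcher come from the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answers.
class MatchesVsFind {
    // The asker's pattern: one non-word char, "test", optional spaces, "(".
    static final Pattern CALL = Pattern.compile("[^\\w]test[ ]*[(]");

    // True only if the ENTIRE input matches, which is what String.matches() checks.
    static boolean wholeStringMatches(String input) {
        return CALL.matcher(input).matches();
    }

    // True if the pattern matches anywhere inside the input.
    static boolean containsMatch(String input) {
        return CALL.matcher(input).find();
    }

    public static void main(String[] args) {
        String s = "This is a test() function";
        String s2 = "This is a test () function";
        System.out.println(wholeStringMatches(s)); // false
        System.out.println(containsMatch(s));      // true
        System.out.println(containsMatch(s2));     // true
    }
}
```

Note that find() with this pattern still misses a string that starts with test(, because [^\w] needs a character to consume.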
Going from your requirement description, I suggest reworking your regex expression to something like this (assuming by "particular word" you meant test):
(?:.*)(?<=\s)(test(?:\s+)?\()(?:.*)
See the regex at work here.
Explanation:
^ Start of line - added by matches()
(?:.*) Non-capturing group - match anything before the word, but don't capture into a group
(?<=\s) Positive lookbehind - match if word preceded by space, but don't match the space
( Capturing group $1
test(?:\s+)? Match word test and any following spaces, if they exist
\( Match opening bracket
)
(?:.*) Non-capturing group - match rest of string, but don't capture in group
$ End of line - added by matches()
Code sample:
public class Main {
public static void main(String[] args) {
String s = "This is a test() function";
String s2 = "This is a test () function";
System.out.println(s.matches("(?:.*)(?<=\\s)(test(?:\\s+)?\\()(?:.*)"));
//true
}
}

I believe this should be enough (String has no find() method, so go through Pattern and Matcher):
Pattern.compile("\\btest\\s*\\(").matcher(s).find()

Try this regex: \btest\b(?= *\() (as a Java string literal: "\\btest\\b(?= *\\()").
And don't use matches(), use find() on a Matcher. matches() tries to match the whole string.
https://regex101.com/r/xaPCyp/1

The matches() method tells whether or not the whole string matches the given regular expression. Since that's not the case here, it returns false.
If you are just interested in whether your lookup value exists within the string, I found the following useful:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class Main {
public static void main(String[] args) {
String s = "This is a test () function";
Pattern p = Pattern.compile("\\btest *\\(");
Matcher m = p.matcher(s);
if (m.find())
System.out.println("Found a match");
else
System.out.println("Did not find a match");
}
}
I went with the following pattern: \\btest *\\(
\\b - Match word-boundary (will also catch if first word).
test - Literally match your lookup-value.
" *" - Zero or more literal spaces (a space followed by *).
\\( - Escaped open parenthesis to match literally.
Debuggex Demo

The .matches method will match the whole string where your pattern would only get a partial match.
In the pattern that you tried, the negated character class [^\\w] could also match more than a whitespace boundary as it matches any char except a word character. It could for example also match a ( or a newline.
As per the comments, test() function should also match, but [^\\w] and (?<=\s) both expect a character to be there on the left, so they fail at the start of the string.
Instead you could make use of (?<!\\S) to assert a whitespace boundary on the left.
.*(?<!\S)test\h*\(.*
Explanation
.* Match 0+ times any char except a newline
(?<!\S) Assert a whitespace boundary on the left
test\h* Match test and 0+ horizontal whitespace chars
\( Match a ( char
.* Match 0+ times any char except a newline
Regex demo | Java demo
In Java
System.out.println(s.matches(".*(?<!\\S)test\\h*\\(.*"));
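To compare the whitespace boundary (?<!\S) with the word boundary \b used in the other answers, here is a small sketch. The class WhitespaceBoundary and the sample inputs such as obj.test() are assumptions added for illustration, not taken from the question.

```java
import java.util.regex.Pattern;

// Hypothetical comparison of the two boundary styles discussed in this thread.
class WhitespaceBoundary {
    // \b only requires a word/non-word transition before "test".
    static final Pattern WORD_BOUNDARY = Pattern.compile("\\btest\\h*\\(");
    // (?<!\S) requires start of input or a whitespace char before "test".
    static final Pattern SPACE_BOUNDARY = Pattern.compile("(?<!\\S)test\\h*\\(");

    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found(SPACE_BOUNDARY, "test() function"));            // true
        System.out.println(found(SPACE_BOUNDARY, "This is a test () function")); // true
        System.out.println(found(WORD_BOUNDARY, "obj.test()"));                  // true
        System.out.println(found(SPACE_BOUNDARY, "obj.test()"));                 // false
        System.out.println(found(WORD_BOUNDARY, "contest()"));                   // false
    }
}
```

Which boundary is right depends on whether something like obj.test() should count as the word test; the question's wording ("starts with space or is the first word") suggests the stricter whitespace boundary.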

Related

java regular expression and replace all occurrences

I want to replace one string in a big string, but my regular expression is not proper I guess. So it's not working.
Main string is
Some sql part which is to be replaced
cond = emp.EMAIL_ID = 'xx#xx.com' AND
emp.PERMANENT_ADDR LIKE('%98n%')
AND hemp.EMPLOYEE_NAME = 'xxx' and is_active='Y'
String to find and replace is
Based on some condition sql part to be replaced
hemp.EMPLOYEE_NAME = 'xxx'
I have tried this with
Pattern and Matcher class is used and
Pattern pat1 = Pattern.compile("/^hemp.EMPLOYEE_NAME\\s=\\s\'\\w\'\\s[and|or]*/$", Pattern.CASE_INSENSITIVE);
Matcher mat = pat1.matcher(cond);
while (mat.find()) {
System.out.println("Match: " + mat.group());
cond = mat.replaceFirst("xx "+mat.group()+"x");
mat = pat1.matcher(cond);
}
It's not working, not entering the loop at all. Any help is appreciated.
Obviously not - your regexp pattern doesn't make any sense.
The opening /: In some languages, regexps aren't strings and start with an opening slash. Java is not one of those languages, and it has nothing to do with regexps itself. So, this looks for a literal slash in that SQL, which isn't there, thus, failure.
^ is regexpese for 'start of string'. Your string does not start with hemp.EMPLOYEE_NAME, so that also doesn't work. Get rid of both / and ^ here.
\\s is one whitespace character (there are many whitespace characters - this matches any one of them, exactly one though). Your string happens to have exactly one space on each side of the =, but relying on that is fragile. Your intent, surely, was \\s* which matches 0 to many of them, i.e.: \\s* is: "Whitespace is allowed here". \\s is: There must be exactly one whitespace character here. Make all the \\s in your regexp an \\s*.
\\w is exactly one 'word' character (which is more or less a letter or digit), you obviously wanted \\w*.
[and|or] this is regexpese for: "An a, or an n, or a d, or an o, or an r, or a pipe symbol". Clearly you were looking for (and|or) which is regexpese for: Either the sequence "and", or the sequence "or".
* - so you want 0 to many 'and' or 'or', which makes no sense.
closing slash: You don't want this.
closing $: You don't want this - it means 'end of string'. Your string didn't end here.
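The difference between the character class [and|or] and the group (and|or) is easy to check directly. This is a small sketch; the class ClassVsAlternation and its method names are invented for the demonstration, and matches() is used on purpose here because each test string should match as a whole.

```java
// Hypothetical demo: character class versus alternation.
class ClassVsAlternation {
    // [and|or] matches exactly ONE character out of: a, n, d, |, o, r.
    static boolean isOneListedChar(String s) {
        return s.matches("[and|or]");
    }

    // (and|or) matches the sequence "and" or the sequence "or".
    static boolean isAndOrOr(String s) {
        return s.matches("(and|or)");
    }

    public static void main(String[] args) {
        System.out.println(isOneListedChar("|"));   // true
        System.out.println(isOneListedChar("and")); // false
        System.out.println(isAndOrOr("and"));       // true
        System.out.println(isAndOrOr("d"));         // false
    }
}
```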
The code itself:
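replaceFirst, itself, already runs the regexp search: it resets the matcher and replaces the first match it finds. You don't want to double apply this stuff by calling find() and group() first. That's not how you replace a found result.
This is what you wanted (strings are immutable, so assign the result back):
Matcher mat = pat1.matcher(cond);
cond = mat.replaceFirst("replacement goes here");
where the replacement can include references to groups in the match, such as $0 for the whole match or $1 for the first group, if you want to take parts of what you matched (i.e. don't use mat.group(), use those references).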
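As a sketch of a replacement that refers to a group, the following swaps only the quoted value and keeps the rest of the condition. The pattern, the class ReplaceWithGroups and the replacement value 'yyy' are assumptions made for the example; the asker's real replacement text is not known.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper showing a $1 group reference in a replacement string.
class ReplaceWithGroups {
    // Group 1 captures the column name, the '=' and any whitespace around it;
    // the quoted value after it is matched but left outside the group.
    static final Pattern NAME_CONDITION = Pattern.compile(
            "(hemp\\.EMPLOYEE_NAME\\s*=\\s*)'\\w*'", Pattern.CASE_INSENSITIVE);

    static String replaceName(String cond, String newName) {
        Matcher m = NAME_CONDITION.matcher(cond);
        // $1 re-inserts group 1; quoteReplacement keeps any '$' or '\' in
        // newName from being read as a group reference or escape.
        return m.replaceFirst("$1'" + Matcher.quoteReplacement(newName) + "'");
    }

    public static void main(String[] args) {
        String cond = "AND hemp.EMPLOYEE_NAME = 'xxx' and is_active='Y'";
        // prints: AND hemp.EMPLOYEE_NAME = 'yyy' and is_active='Y'
        System.out.println(replaceName(cond, "yyy"));
    }
}
```

If the pattern does not occur, replaceFirst returns the input unchanged, so no separate find() check is needed.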
More generally did you read any regexp tutorial, did any testing, or did any reading of the javadoc of Pattern and Matcher?
I've been developing for a few years. It's just personal experience, perhaps, but, reading is pretty fundamental.
Instead of the anchors ^ and $, you can use word boundaries \b to prevent a partial match.
If you want to match spaces on the same line, you can use \h to match horizontal whitespace char, as \s can also match a newline.
You can use replaceFirst on the string using $0 to get the full match, and an inline modifier (?i) for a case insensitive match.
Note that [and|or] is a character class matching one of the listed chars; use a group such as (?:and|or) instead. Also escape the dot to match it literally, or else . matches any char except a newline.
(?i)\bhemp\.EMPLOYEE_NAME\h*=\h*'\w+'\h+(?:and|or)\b
See a regex demo or a Java demo
For example
String regex = "(?i)\\bhemp\\.EMPLOYEE_NAME\\h*=\\h*'\\w+'\\h+(?:and|or)\\b";
String string = "cond = emp.EMAIL_ID = 'xx#xx.com' AND\n"
+ "emp.PERMANENT_ADDR LIKE('%98n%') \n"
+ "AND hemp.EMPLOYEE_NAME = 'xxx' and is_active='Y'";
System.out.println(string.replaceFirst(regex, "xx$0x"));
Output
cond = emp.EMAIL_ID = 'xx#xx.com' AND
emp.PERMANENT_ADDR LIKE('%98n%')
AND xxhemp.EMPLOYEE_NAME = 'xxx' andx is_active='Y'

Regex: Is it possible to skip repeating negative lookbehinds?

I've been trying to fix a simple regex that:
Matches all characters from beginning of line (^) to the first & character or to the end of line ($).
The match cannot start with a &.
Examples:
test should match test.
one&two should match one.
&test shouldn't match anything.
My current regex is the following:
^(?<!\&)(.+?)(?=\&|$)
(Regex101)
Currently, this regex fails example 3, where if I gave this regex &test it matches &test, but it shouldn't match anything.
I think it may be a problem with the negative lookbehind (?<!\&) and that &test matches because the character before it is not a &, but it doesn't account for any following & characters.
Is modifying the negative lookbehind to account for repeating & characters possible, and if so, how could I fix this regex?
(I know that Regex101 is using Python's Regex, but this question's Regex is intended to work with Java.)
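The lookbehind is the problem: at the start of the string there is nothing before the current position, so (?<!\&) always succeeds there. You would need a look-ahead instead of a look-behind, but rather than lazy dot matching with a lookahead, it is simpler to use a negated character class: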
^[^&]+
See demo (note that \n is added just for a demo, if you test strings without newline characters, it won't be necessary).
Here, ^ asserts the position at the start of the string, and the negated character class [^&]+ matches 1 or more characters other than &. There is no need for the (?=\&|$) look-ahead: if the line contains no &, the whole line is matched.
See IDEONE demo
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Main
{
    public static void main (String[] args) throws java.lang.Exception
    {
        System.out.println(fetchMatch("test", 0));
        System.out.println(fetchMatch("one&test", 0));
        System.out.println(fetchMatch("&test", 0));
    }
    // Returns the requested group of the first match, or an error marker if nothing matched.
    public static String fetchMatch(String s, int groupId)
    {
        Pattern pattern = Pattern.compile("^[^&]+");
        Matcher matcher = pattern.matcher(s);
        if (matcher.find()){
            return matcher.group(groupId);
        }
        return "ERROR: NOT MATCHED";
    }
}
Output:
test
one
ERROR: NOT MATCHED
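To see why the original lookbehind let &test through, the three variants can be run side by side. This is only a sketch: the class LeadingSegment, the constant names and the null return for "no match" are choices made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical side-by-side comparison of the patterns from this thread.
class LeadingSegment {
    // Asker's pattern: at position 0 nothing precedes, so (?<!&) always passes.
    static final Pattern LOOKBEHIND = Pattern.compile("^(?<!&)(.+?)(?=&|$)");
    // Same pattern with a lookahead, which inspects the NEXT character instead.
    static final Pattern LOOKAHEAD = Pattern.compile("^(?!&)(.+?)(?=&|$)");
    // The negated character class from the answer.
    static final Pattern NEGATED_CLASS = Pattern.compile("^[^&]+");

    // Returns the matched text, or null when the pattern does not match.
    static String firstMatch(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(LOOKBEHIND, "&test"));      // &test
        System.out.println(firstMatch(LOOKAHEAD, "&test"));       // null
        System.out.println(firstMatch(NEGATED_CLASS, "&test"));   // null
        System.out.println(firstMatch(NEGATED_CLASS, "one&two")); // one
    }
}
```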

Regular expression in java

I know it's a simple problem but i'm blocked on it : i want to retrieve all strings written in this form :
$F{ETIQX}
Where X is a number. i wrote this regular expression but i'm getting errors :
if (textField.getText().matches("$F{ETIQ\d}")){
System.out.println("matches!!");
}
Any help will be appreciated.
"i want to retrieve all strings"
Then you shouldn't be using .matches() in the first place, but a Matcher and .find(). .matches() is a misnomer: it will succeed only if the whole input matches the regex (in contradiction with the definition of regex matching, which can occur anywhere in the input).
Also, your regex should be:
"\\$F\\{ETIQ\\d\\}"
(you need to escape backslashes in a Java string)
$, { and } are regex metacharacters; the first is an anchor matching the end of input, the two latter are bounds for a repetition quantifier.
Your code should read:
private static final Pattern PATTERN = Pattern.compile("\\$F\\{ETIQ\\d\\}");
// ...
final Matcher m = PATTERN.matcher(textField.getText());
while (m.find()) {
// work with m.group(), for example:
System.out.println(m.group());
}
\$F\{ETIQ\d\}
Escape the characters which have a meaning in regex:
$ means end of string
{ means start of a quantifier
} means end of a quantifier
To match these characters literally, you must escape them.
here is a demo http://regex101.com/r/xT4mR6
In a Java string literal, \$ and \d are not valid escape sequences and will cause a compile error, so we need to escape each \ with another \.
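Putting both answers together, a sketch of collecting every occurrence could look like this. The class EtiqFinder, the method findAll and the sample text are hypothetical; the pattern is the one given above, so it accepts exactly one digit after ETIQ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that gathers all $F{ETIQn} tokens from a text.
class EtiqFinder {
    // Regex \$F\{ETIQ\d\} written as a Java literal: every backslash is doubled.
    static final Pattern ETIQ = Pattern.compile("\\$F\\{ETIQ\\d\\}");

    static List<String> findAll(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = ETIQ.matcher(text);
        while (m.find()) {
            hits.add(m.group()); // the full matched token
        }
        return hits;
    }

    public static void main(String[] args) {
        // prints: [$F{ETIQ1}, $F{ETIQ7}]
        System.out.println(findAll("a $F{ETIQ1} b $F{ETIQ7} c"));
    }
}
```

If X may have several digits, \\d+ would be needed instead of \\d; the question does not say which is intended.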

Regex Example Confusion

I am preparing for Oracle Certified Java Programmer. I am looking into regular expressions. I was going through javaranch Regular Expression and i am not able to understand the regular expression present in the example. Please help me in understanding it. I am adding source code for reference here. Thanks.
class Test
{
static Map props = new HashMap();
static
{
props.put("key1", "fox");
props.put("key2", "dog");
}
public static void main(String[] args)
{
String input = "The quick brown ${key1} jumps over the lazy ${key2}.";
Pattern p = Pattern.compile("\\$\\{([^}]+)\\}");
Matcher m = p.matcher(input);
StringBuffer sb = new StringBuffer();
while (m.find())
{
m.appendReplacement(sb, "");
sb.append(props.get(m.group(1)));
}
m.appendTail(sb);
System.out.println(sb.toString());
}
}
An illustration of your regex:
\$\{([^}]+)\}
Here is a very good tutorial on regular expressions you might want to check out. The article on quantifiers has two sections "Laziness instead of Greediness" and "An Alternative to Laziness", that should explain this particular example really well.
Anyway, here is my explanation. First, you need to realize that there are two compilation steps in Java. The first compiles the string literal in your code to an actual string; the second is the regex engine compiling that string into a pattern. The first step already interprets some of the backslashes, so that the string the regex engine receives looks like
\$\{([^}]+)\}
Now let's pick that apart in free-spacing mode:
\$ # match a literal $
\{ # match a literal {
( # start capturing group 1
[^}] # match any single character except } - note the negation by ^
+ # repeat one or more times
) # end of capturing group 1
\} # match a literal }
So this really matches all occurrences of ${...}, where ... can be anything except closing }. The contents of the braces (i.e. the ...) can later be accessed via m.group(1), as it's the first set of parentheses in the expression.
Here are some more relevant articles of the above tutorial (but you should really read it in its entirety - it's definitely worth it):
Character classes (including how to negate them with ^)
Repetition/quantifiers
Grouping and capturing
Java's regex peculiarities
\\$: matches a literal dollar sign. Without the backslashes, it matches the end of a string.
\\{: matches a literal opening curly brace.
(: start of a capturing group
[^}]: matches any character that isn't a closing curly brace.
+: repeats the preceding character class, so it will match one or more characters that aren't closing curly braces.
): closing capturing group.
\\}: matches a literal closing curly brace.
It matches stuff that looks like ${key1}.
Explanation:
\\$ literal $ (must be escaped since it is a special character that
means "end of the string")
\\{ literal { (must be escaped too: in Java an unescaped { here throws a PatternSyntaxException for an illegal repetition)
( open the capture group 1
[^}]+ one or more characters from the class containing all chars but }
) close the capture group 1
\\} literal }
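For reference, the same substitution can be written as a reusable method. This is a sketch rather than the javaranch code: the class PlaceholderExpander, the typed Map<String, String> and the choice to leave unknown keys untouched are all assumptions. Matcher.quoteReplacement is used so that a value containing $ or \ is inserted literally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant of the example from the question.
class PlaceholderExpander {
    // \$\{([^}]+)\} : group 1 is the key between "${" and "}".
    static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    static String expand(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            // Assumption: an unknown key keeps its original ${...} text.
            String value = values.containsKey(key) ? values.get(key) : m.group();
            // Copies the text before the match, then the literal replacement.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out); // copies whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("key1", "fox");
        props.put("key2", "dog");
        // prints: The quick brown fox jumps over the lazy dog.
        System.out.println(expand(
                "The quick brown ${key1} jumps over the lazy ${key2}.", props));
    }
}
```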

Java regex: Negative lookahead

I'm trying to craft two regular expressions that will match URIs. These URIs are of the format: /foo/someVariableData and /foo/someVariableData/bar/someOtherVariableData
I need two regexes. Each needs to match one but not the other.
The regexes I originally came up with are:
/foo/.+ and /foo/.+/bar/.+ respectively.
I think the second regex is fine. It will only match the second string. The first regex, however, matches both. So, I started playing around (for the first time) with negative lookahead. I designed the regex /foo/.+(?!bar) and set up the following code to test it
public static void main(String[] args) {
String shouldWork = "/foo/abc123doremi";
String shouldntWork = "/foo/abc123doremi/bar/def456fasola";
String regex = "/foo/.+(?!bar)";
System.out.println("ShouldWork: " + shouldWork.matches(regex));
System.out.println("ShouldntWork: " + shouldntWork.matches(regex));
}
And, of course, both of them resolve to true.
Anybody know what I'm doing wrong? I don't need to use Negative lookahead necessarily, I just need to solve the problem, and I think that negative lookahead might be one way to do it.
Thanks,
Try
String regex = "/foo/(?!.*bar).+";
or possibly
String regex = "/foo/(?!.*\\bbar\\b).+";
to avoid failures on paths like /foo/baz/crowbars which I assume you do want that regex to match.
Explanation: (without the double backslashes required by Java strings)
/foo/ # Match "/foo/"
(?! # Assert that it's impossible to match the following regex here:
.* # any number of characters
\b # followed by a word boundary
bar # followed by "bar"
\b # followed by a word boundary.
) # End of lookahead assertion
.+ # Match one or more characters
\b, the "word boundary anchor", matches the empty space between an alphanumeric character and a non-alphanumeric character (or between the start/end of the string and an alnum character). Therefore, it matches before the b or after the r in "bar", but it fails to match between w and b in "crowbar".
Protip: Take a look at http://www.regular-expressions.info - a great regex tutorial.
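The original attempt /foo/.+(?!bar) fails because .+ is greedy: it consumes the rest of the input, and at the very end nothing follows, so the negative lookahead trivially succeeds. The sketch below compares it with the suggested patterns; the class FooUriMatcher and its constant names are invented for the example.

```java
// Hypothetical comparison of the asker's regex and the suggested fixes.
class FooUriMatcher {
    // Lookahead is checked only after .+ has already swallowed everything.
    static final String BROKEN = "/foo/.+(?!bar)";
    // Lookahead is checked right after "/foo/" and scans the whole remainder.
    static final String NO_BAR = "/foo/(?!.*bar).+";
    // Same idea, but "bar" must be a whole word to cause a rejection.
    static final String NO_BAR_WORD = "/foo/(?!.*\\bbar\\b).+";

    public static void main(String[] args) {
        String shouldWork = "/foo/abc123doremi";
        String shouldntWork = "/foo/abc123doremi/bar/def456fasola";
        System.out.println(shouldntWork.matches(BROKEN));               // true
        System.out.println(shouldWork.matches(NO_BAR_WORD));            // true
        System.out.println(shouldntWork.matches(NO_BAR_WORD));          // false
        System.out.println("/foo/baz/crowbars".matches(NO_BAR));        // false
        System.out.println("/foo/baz/crowbars".matches(NO_BAR_WORD));   // true
    }
}
```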
