Regular expression in java - java

I know it's a simple problem but i'm blocked on it : i want to retrieve all strings written in this form :
$F{ETIQX}
Where X is a number. i wrote this regular expression but i'm getting errors :
if (textField.getText().matches("$F{ETIQ\d}")){
System.out.println("matches!!");
}
Any help will be appreciated.

i want to retrieve all strings
Then you shouldn't be using .matches() in the first place. but a Matcher and .find(). .matches() is a misnomer. It will succeed only if the whole input matches the regex (in contradiction with the definiton of regex matching which can occur anywhere in the input).
Also, your regex should be:
"\\$F\\{ETIQ\\d\\}"
(you need to escape backslashes in a Java string)
$, { and } are regex metacharacters; the first is an anchor matching the end of input, the two latter are bounds for a repetition quantifier.
Your code should read:
private static final Pattern PATTERN = Pattern.compile("\\$F\\{ETIQ\\d\\}");
// ...
final Matcher m = PATTERN.matcher(textField.getText());
while (m.find())
// work with m.group()

\$F\{ETIQ\d\}
escape character which have meaning in regex.
$ means end of string
{ means start of a quantifier
} means end of a quantifier
for matching these you must escape them to match them literally.
here is a demo http://regex101.com/r/xT4mR6
In java \ has no meaning and will throuw an error , so we need to escape \ with \.

Related

Regular expression for matching texts before and after string

I would like to match URL strings which can be specified in the following manner.
xxx.yyy.com (For example, the regular expression should match all strings like 4xxx.yyy.com, xxx4.yyy.com, xxx.yyy.com, 4xxx4.yyy.com, 444xxx666.yyy.com, abcxxxdef.yyy.com etc).
I have tried to use
([a-zA-Z0-9]+$)xxx([a-zA-Z0-9]+$).yyy.com
([a-zA-Z0-9]*)xxx([a-zA-Z0-9]*).yyy.com
But they don't work. Please help me write a correct regular expression. Thanks in advance.
Note: I'm trying to do this in Java.
If you want to make sure there is xxx and you want to allow all non whitespace chars before and after. If you want to match the whole string, you could add anchors at the start and end.
Note to escape the dot to match it literally.
^\S*xxx\S*\.yyy\.com$
^ Start of string
\S*xxx\S* Match xxx between optional non whitespace chars
\.yyy Match .yyy
\.com Match .com
$ End of string
Regex demo
In Java double escape the backslash
String regex = "^\\S*xxx\\S*\\.yyy\\.com$";
Or specify the characters on the left and right that you would allow to match in the character class:
^[0-9A-Za-z!##$%^&*()_+]*xxx[0-9A-Za-z!##$%^&*()_+]*\.yyy\.com$
Regex demo

remove part of matcher after the match in regex pattern

I need to help in writing regex pattern to remove only part of the matcher from original string.
Original String: 2017-02-15T12:00:00.268+00:00
Expected String: 2017-02-15T12:00:00+00:00
Expected String removes everything in milliseconds.
My regex pattern looks like this: (:[0-5][0-9])\.[0-9]{1,3}
i need this regex to make sure i am removing only the milliseconds from some time field, not everything that comes after dot. But using above regex, I am also removing the minute part. Please suggest and help.
You have defined a capturing group with (...) in your pattern, and you want to have that part of string to be present after the replacement is performed. All you need is to use a backreference to the value stored in this capture. It can be done with $1:
String s = "2017-02-15T12:00:00.268+00:00";
String res = s.replaceFirst("(:[0-5][0-9])\\.[0-9]{1,3}", "$1");
System.out.println(res); // => 2017-02-15T12:00:00+00:00
See the Java demo and a regex demo.
The $1 in the replacement pattern tells the regex engine it should look up the captured group with ID 1 in the match object data. Since you only have one pair of unescaped parentheses (1 capturing group) the ID of the group is 1.
Change your pattern to (?::[0-5][0-9])(\.[0-9]{1,3}), run the find in the matcher and remove all it finds in the group(1).
The backslash will force the match with the '.' char, instead of any char, which is what the dot represents in a regex.
The (?: defines a non-capturing group, so it will not be considered in the group(...) on the matcher.
And adding a parenthesis around what you want will make it show up as group in the matcher, and in this case, the first group.
A good reference is the Pattern javadoc: http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html
Use $1 and $2 variable for replace
string.replaceAll("(.*)\\.\\d{1,3}(.*)","$1$2");

Java: Regular Expression not matching?

I am trying to extract a special sequence out of a String using the following Regular Expression:
[(].*[)]
My Pattern should only match if the String contains () with text between them.
Somehow, i I create a new Pattern using Pattern#compile(myString) and then match the String using Matcher matcher = myPattern.matcher(); it doesn't find anything, even though I tried it on regexr.com and it worked there.
My Pattern is a static final Pattern object in another class (I directly used Pattern#compile(myString).
Example String to match:
save (xxx,yyy)
The likely problem here is your quantifier.
Since you're using greedy * with a combination of . for any character, your match will not delimit correctly as . will also match closing ).
Try using reluctant [(].*?[)].
See quantifiers in docs.
You can also escape parenthesis instead of using custom character classes, like so: \\( and \\), but that has nothing to do with your issue.
Also note (thanks esprittn)
The * quantifier will match 0+ characters, so if you want to restrict your matches to non-empty parenthesis, use .+? instead - that'll guarantee at least one character inside your parenthesis.
Hope the below code helps : its extracts the data between '(' & ')' including them .
String pattern = "\\(.*\\)";
String line = "save(xx,yy)";
Pattern TokenPattern = Pattern.compile(pattern);
Matcher m = TokenPattern.matcher(line);
while (m.find()) {
int start = m.start(0);
int end = m.end(0);
System.out.println(line.substring(start, end));
}
to remove the brackets change 'start' to 'start+1' and 'end' to 'end-1' to change the bounding indexes of the sub-string being taken.

Regex Example Confusion

I am preparing for Oracle Certified Java Programmer. I am looking into regular expressions. I was going through javaranch Regular Expression and i am not able to understand the regular expression present in the example. Please help me in understanding it. I am adding source code for reference here. Thanks.
class Test
{
static Map props = new HashMap();
static
{
props.put("key1", "fox");
props.put("key2", "dog");
}
public static void main(String[] args)
{
String input = "The quick brown ${key1} jumps over the lazy ${key2}.";
Pattern p = Pattern.compile("\\$\\{([^}]+)\\}");
Matcher m = p.matcher(input);
StringBuffer sb = new StringBuffer();
while (m.find())
{
m.appendReplacement(sb, "");
sb.append(props.get(m.group(1)));
}
m.appendTail(sb);
System.out.println(sb.toString());
}
}
An illustration of your regex:
\$\{([^}]+)\}
Edit live on Debuggex
Here is a very good tutorial on regular expressions you might want to check out. The article on quantifiers has two sections "Laziness instead of Greediness" and "An Alternative to Laziness", that should explain this particular example really well.
Anyway, here is my explanation. First, you need to realize that there are two compilation steps in Java. One compiles the string literal in your code to an actual string. This step already interprets some of the backslashes, so that the string Java receives looks like
\$\{([^}]+)\}
Now let's pick that apart in free-spacing mode:
\$ # match a literal $
\{ # match a literal {
( # start capturing group 1
[^}] # match any single character except } - note the negation by ^
+ # repeat one or more times
) # end of capturing group 1
\} # match a literal }
So this really matches all occurrences of ${...}, where ... can be anything except closing }. The contents of the braces (i.e. the ...) can later be accessed via m.group(1), as it's the first set of parentheses in the expression.
Here are some more relevant articles of the above tutorial (but you should really read it in its entirety - it's definite worth it):
Character classes (including how to negate them with ^)
Repetition/quantifiers
Grouping and capturing
Java's regex peculiarities
\\$: matches a literal dollar sign. Without the backslashes, it matches the end of a string.
\\{: matches a literal opening curly brace.
(: start of a capturing group
[^}]: matches any character that isn't a closing curly brace.
+: repeats the last character set, which will match one or more characters that aren't curly braces.
): closing capturing group.
\\}: matches a literal closing curly brace.
It matches stuff that looks like ${key1}.
Explanation:
\\$ literal $ (must be escaped since it is a special character that
means "end of the string"
\\{ literal { (i m not sure this must be escaped but it doesn't matter)
( open the capture group 1
[^}]+ character class containing all chars but }
) close the capture group 1
\\} literal }

Problem matching regex pattern in Android

I am trying to search this string:
,"tt" : "ABC","r" : "+725.00","a" : "55.30",
For:
"r" : "725.00"
And here is my current code:
Pattern p = Pattern.compile("([r]\".:.\"[+|-][0-9]+.[0-9][0-9]\")");
Matcher m = p.matcher(raw_string);
I've been trying multiple variations of the pattern, and a match is never found. A second set of eyes would be great!
Your regexp actually works, it's almost correct
Pattern p = Pattern.compile("\"[r]\".:.\"[+|-][0-9]+.[0-9][0-9]\"");
Matcher m = p.matcher(raw_string);
if (m.find()){
String res = m.toMatchResult().group(0);
}
The next line should read:
if ( m.find() ) {
Are you doing that?
A few other issues: You're using . to match the spaces surrounding the colon; if that's always supposed to be whitespace, you should use + (one or more spaces) or \s+ (one or more whitespace characters). On the other hand, the dot between the digits is supposed to match a literal ., so you should escape it: \. Of course, since this is a Java String literal, you need to escape the backslashes: \\s+, \\..
You don't need the square brackets around the r, and if you don't want to match a | in front of the number you should change [+|-] to [+-].
While some of these issues I've mentioned could result in false positives, none of them would prevent it from matching valid input. That's why I suspect you aren't actually applying the regex by calling find(). It's a common mistake.
First thing try to escape your dot symbol: ...[0-9]+\.[0-9][0-9]...
because the dot symbol match any character...
Second thing: the [+|-]define a range of characters but it's mandatory...
try [+|-]?
Alban.

Categories

Resources