How to use groups in regular expressions in Java?

I have this code and I want to find both 1234 and 4321 but currently I can only get 4321. How could I fix this problem?
String a = "frankabc123 1234 frankabc frankabc123 4321 frankabc";
String rgx = "frank.* ([0-9]*) frank.*";
Pattern patternObject = Pattern.compile(rgx);
Matcher matcherObject = patternObject.matcher(a);
while (matcherObject.find()) {
System.out.println(matcherObject.group(1));
}

Your regex is too greedy. Make it non-greedy.
String rgx = "frank.*? ([0-9]+) frank";

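Your regex is incorrect. The first part, frank.*, is greedy: it matches everything to the end of the input and then backtracks only until the rest of the pattern succeeds, which is why the group ends up holding the last number. Try this instead: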
String rgx = "frank.*? ([0-9]*) frank";
The ? after the quantifier makes it reluctant, so it matches as few characters as necessary for the rest of the pattern to match. The trailing .* also causes problems (as nhahtdh pointed out in a comment): it consumes the rest of the input, so find() has nothing left to search for a second match.
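As a minimal sketch of the fix described in these answers, the following collects every captured number using the reluctant pattern without the trailing .*. The class and method names (ReluctantGroupDemo, extractNumbers) are made up for illustration; only the input string and the pattern come from the posts above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
class ReluctantGroupDemo {
    // Reluctant ".*?" and no trailing ".*": each match stops at the
    // nearest " <digits> frank" instead of running to the end of the input.
    private static final Pattern RECORD = Pattern.compile("frank.*? ([0-9]+) frank");

    static List<String> extractNumbers(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher m = RECORD.matcher(input);
        while (m.find()) {
            numbers.add(m.group(1)); // group(1) is the parenthesised digits
        }
        return numbers;
    }

    public static void main(String[] args) {
        String a = "frankabc123 1234 frankabc frankabc123 4321 frankabc";
        System.out.println(extractNumbers(a)); // prints [1234, 4321]
    }
}
```

Each call to find() resumes where the previous match ended, so the pattern must not consume more of the input than one record needs.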

Related

What is the Regex for decimal numbers in Java?

I am not quite sure of what is the correct regex for the period in Java. Here are some of my attempts. Sadly, they all meant any character.
String regex = "[0-9]*[.]?[0-9]*";
String regex = "[0-9]*['.']?[0-9]*";
String regex = "[0-9]*["."]?[0-9]*";
String regex = "[0-9]*[\.]?[0-9]*";
String regex = "[0-9]*[\\.]?[0-9]*";
String regex = "[0-9]*.?[0-9]*";
String regex = "[0-9]*\.?[0-9]*";
String regex = "[0-9]*\\.?[0-9]*";
But what I want is the actual "." character itself. Anyone have an idea?
What I'm trying to do actually is to write out the regex for a non-negative real number (decimals allowed). So the possibilities are: 12.2, 3.7, 2., 0.3, .89, 19
String regex = "[0-9]*['.']?[0-9]*";
Pattern pattern = Pattern.compile(regex);
String x = "5p4";
Matcher matcher = pattern.matcher(x);
System.out.println(matcher.find());
The last line is supposed to print false but prints true anyway. I think my regex is wrong though.
Update
To match a non-negative decimal number that contains a decimal point, you need this regex:
^\d*\.\d+|\d+\.\d*$
or in Java syntax: "^\\d*\\.\\d+|\\d+\\.\\d*$"
String regex = "^\\d*\\.\\d+|\\d+\\.\\d*$";
String string = "123.43253";
if(string.matches(regex))
System.out.println("true");
else
System.out.println("false");
Explanation for your original regex attempts:
[0-9]*\.?[0-9]*
With Java escaping it becomes:
"[0-9]*\\.?[0-9]*";
If you need to make the dot mandatory, remove the ?:
[0-9]*\.[0-9]*
But this also accepts a lone dot with no digits. If you want digits to be mandatory as well, use + (one or more) instead of * (zero or more). In that case it becomes:
[0-9]+\.[0-9]+
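To make the difference between the three variants concrete, here is a small sketch that runs each one against a few inputs with matches(), which requires the whole string to match. The class name, field names, and sample inputs are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical comparison of the three patterns discussed above.
class DotQuantifierDemo {
    static final Pattern OPTIONAL_DOT = Pattern.compile("[0-9]*\\.?[0-9]*");
    static final Pattern REQUIRED_DOT = Pattern.compile("[0-9]*\\.[0-9]*");
    static final Pattern DIGITS_BOTH_SIDES = Pattern.compile("[0-9]+\\.[0-9]+");

    // matches() succeeds only if the pattern covers the entire input.
    static boolean fullMatch(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {"", ".", "2.", ".89", "12.2", "19"};
        for (String s : inputs) {
            System.out.println("\"" + s + "\""
                    + " optional=" + fullMatch(OPTIONAL_DOT, s)
                    + " required=" + fullMatch(REQUIRED_DOT, s)
                    + " both=" + fullMatch(DIGITS_BOTH_SIDES, s));
        }
    }
}
```

The optional-dot version accepts even the empty string, the required-dot version accepts a lone ".", and the last version rejects "2.", ".89", and "19", so none of the three covers every case listed in the question on its own.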
If you are on Kotlin, you can use an extension function:
fun String.findDecimalDigits() =
Pattern.compile("^[0-9]*\\.?[0-9]*").matcher(this).run { if (find()) group() else "" }!!
Your initial understanding was probably right, but you were being thrown off because matcher.find() looks for the first valid match anywhere within the string, and all of your examples can match a zero-length string.
I would suggest "^([0-9]+\\.?[0-9]*|\\.[0-9]+)$"
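The following sketch illustrates that point: an unanchored pattern made entirely of optional parts always "finds" something, while the anchored suggestion only succeeds on a well-formed number. The class and method names are hypothetical; the two patterns are taken from the posts above.

```java
import java.util.regex.Pattern;

// Hypothetical demo of find() on an all-optional pattern versus an anchored one.
class FindVersusMatchesDemo {
    // Every part is optional, so this can match the empty string anywhere.
    static final Pattern LOOSE = Pattern.compile("[0-9]*\\.?[0-9]*");
    // Anchored alternative suggested in the answer above.
    static final Pattern STRICT = Pattern.compile("^([0-9]+\\.?[0-9]*|\\.[0-9]+)$");

    static boolean looseFind(String s) {
        return LOOSE.matcher(s).find();
    }

    static boolean strictFind(String s) {
        return STRICT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(looseFind("5p4"));  // true: finds "5" at the start
        System.out.println(looseFind("abc"));  // true: zero-length match at index 0
        System.out.println(strictFind("5p4")); // false
        for (String s : new String[] {"12.2", "3.7", "2.", "0.3", ".89", "19"}) {
            System.out.println(s + " -> " + strictFind(s)); // true for each
        }
    }
}
```

Calling matches() instead of find() on the loose pattern would also reject "5p4", but it would still accept "" and ".", which is why the anchored alternation is the safer choice here.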
There are actually two ways to match a literal dot. One is backslash-escaping, as you did with \\.; the other is to enclose it in a character class (square brackets), as in [.]. Most special characters, including the dot, become literal inside square brackets. Using \\. shows your intention more clearly than [.] if all you want is to match a literal dot. Use [] when you need to match one of several alternatives; for example, the regex [\\d.] matches a single digit or a literal dot.
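A short sketch comparing the two forms with an unescaped dot may help. The class name and sample strings are illustrative assumptions; Pattern.quote is included as a third standard option for building a literal.

```java
import java.util.regex.Pattern;

// Hypothetical demo: escaped dot, character-class dot, and unescaped dot.
class LiteralDotDemo {
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("a\\.b", "a.b"));   // true: escaped dot is literal
        System.out.println(matches("a\\.b", "axb"));   // false
        System.out.println(matches("a[.]b", "a.b"));   // true: dot in a class is literal
        System.out.println(matches("a[.]b", "axb"));   // false
        System.out.println(matches("a.b", "axb"));     // true: bare dot matches any character
        System.out.println(matches("[\\d.]+", "3.14")); // true: digits or literal dots
        // Pattern.quote wraps the text so every character in it is literal.
        System.out.println(matches("a" + Pattern.quote(".") + "b", "axb")); // false
    }
}
```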
I have tested all the cases.
public static boolean isDecimal(String input) {
return Pattern.matches("^[-+]?\\d*[.]?\\d+|^[-+]?\\d+[.]?\\d*", input);
}

Java Regex Find All Between But Last Character Not Preceded Or Followed By

I have a string: "stuffhere{# name="productViewer" vars="productId={{id}}"}morestuff"
How can I find everything between the opening {# and the last }?
Pattern.compile("\\{#(.*?)\\}" + from, Pattern.DOTALL); //Finds {# name="productViewer" vars="productId={{id}
How can I verify that the ending } is not preceded or followed by another }? The string may also be surrounded by other characters.
I'd like the regex to return only: name="productViewer" vars="productId={{id}}"
You can use this pattern:
\\{#(.*)(?<!\\})\\}
(?<!..) is a negative lookbehind that checks your condition (not preceded by })
Note that closing curly brackets don't need to be escaped, you can write:
\\{#(.*)(?<!})}
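Here is a hedged sketch of the lookbehind pattern applied to the string from the question. The class and method names are hypothetical, and the trim() call is an addition of this sketch: the capture group starts right after {#, so it would otherwise include the leading space.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of the negative-lookbehind pattern from the answer above.
class LastBraceDemo {
    // Greedy (.*) runs to the end and backtracks to the last '}' whose
    // preceding character is not another '}'.
    private static final Pattern TAG =
            Pattern.compile("\\{#(.*)(?<!\\})\\}", Pattern.DOTALL);

    // Returns the captured text without surrounding whitespace, or null if nothing matches.
    static String extract(String input) {
        Matcher m = TAG.matcher(input);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String s = "stuffhere{# name=\"productViewer\" vars=\"productId={{id}}\"}morestuff";
        System.out.println(extract(s));
    }
}
```

Because the group is greedy, the match extends to the last qualifying } in the whole input, so this approach assumes one {# ... } block per string.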
Try this:
^[^{]*\\{#(.*?)\\}[^}]*$
how about
(?<=[{])[^{}]+
i have never used java, but regex is international isn't it :)
EDIT:
wait... regex has errors...
try this:
String s = "stuffhere{# name=\"productViewer\" vars=\"productId={{id}}\"}morestuff";
Pattern p = Pattern.compile("\\{#\\s+(.*)\\}");
Matcher m = p.matcher(s);
if(m.find()){
System.out.println(m.group(1));
}

extract a string with regex

I'm a beginner with regex. I have strings like string1 = "DELIVERY 'text1' 'text2'" and string2 = "DELIVERY 'text1'", and I want to extract text1. I tried this pattern:
Pattern p = Pattern.compile("^DELIVERY\\s'(.*)'");
Matcher m2 = p.matcher(string);
if (m2.find()) {
System.out.println(m2.group(1));
}
The result was text1' 'text2 for the first string and text1 for the second.
I tried this too:
Pattern p = Pattern.compile("^DELIVERY\\s'(.*)'\\s'(.*)'");
Matcher m2 = p.matcher(string);
if (m2.find()) {
System.out.println(m2.group(1));
}
It returns a result only for string1.
Your first attempt was almost right. Just replace:
.*
With:
.*?
This makes the operator "non-greedy", so it will "swallow up" as little matched text as possible.
Your regex .* is "greedy": it consumes as much input as possible while still matching, so it consumes everything from the first quote to the last.
Instead, use the reluctant version by adding ?, i.e. .*?, which consumes as little as possible while still matching and so won't skip over a quote.
Combine this change with some java Kung Fu and you can do it all in one line:
String quoted = str.replaceAll(".*DELIVERY\\s'(.*?)'.*", "$1");
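The sketch below puts the greedy pattern, the reluctant pattern, and the one-line replaceAll version side by side on the strings from the question. Class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison of greedy and reluctant capture between quotes.
class FirstQuotedDemo {
    private static final Pattern GREEDY = Pattern.compile("^DELIVERY\\s'(.*)'");
    private static final Pattern RELUCTANT = Pattern.compile("^DELIVERY\\s'(.*?)'");

    // Returns group 1 of the first match, or null if the pattern does not match.
    private static String firstGroup(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    static String greedy(String input) {
        return firstGroup(GREEDY, input);
    }

    static String reluctant(String input) {
        return firstGroup(RELUCTANT, input);
    }

    // One-line variant from the answer; returns the input unchanged if nothing matches.
    static String oneLiner(String input) {
        return input.replaceAll(".*DELIVERY\\s'(.*?)'.*", "$1");
    }

    public static void main(String[] args) {
        String string1 = "DELIVERY 'text1' 'text2'";
        System.out.println(greedy(string1));    // prints text1' 'text2
        System.out.println(reluctant(string1)); // prints text1
        System.out.println(oneLiner(string1));  // prints text1
    }
}
```

One caveat with the replaceAll form: when the input does not match at all, it returns the original string rather than signalling a failure, so the Matcher version is easier to check.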
if you only want to have 'text1', try this regex:
"DELIVERY '([^']*)"
or without grouping:
"(?<=DELIVERY ')[^']*"

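The negated-class and lookbehind patterns from the last answer can be tried out with a sketch like this one. The class and method names are hypothetical; both patterns are copied from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of the two alternatives: a capture group with a
// negated class, and a lookbehind with no group at all.
class NegatedClassDemo {
    private static final Pattern WITH_GROUP = Pattern.compile("DELIVERY '([^']*)");
    private static final Pattern WITH_LOOKBEHIND = Pattern.compile("(?<=DELIVERY ')[^']*");

    static String viaGroup(String input) {
        Matcher m = WITH_GROUP.matcher(input);
        return m.find() ? m.group(1) : null; // text is in capture group 1
    }

    static String viaLookbehind(String input) {
        Matcher m = WITH_LOOKBEHIND.matcher(input);
        return m.find() ? m.group() : null; // the whole match is the text
    }

    public static void main(String[] args) {
        System.out.println(viaGroup("DELIVERY 'text1' 'text2'"));      // prints text1
        System.out.println(viaLookbehind("DELIVERY 'text1' 'text2'")); // prints text1
    }
}
```

[^']* cannot cross a quote, so no reluctant quantifier is needed; the match stops at the first closing quote on its own.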
Regex - to accept latin/ucs2 characters

I am trying to write a regex to accept Latin/UCS2 characters, but I am getting an error while doing that. In the following code, text1 should pass the pattern. I am still working on this. Can anyone please help me fix this?
String text1 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz !\"#$%&'()*+,-./:;<=>?#"
+ "{|}~¡ ";
String pattern = "^[a-zA-Z0-9\\*\\?\\$\\[\\]\\(\\)\\|\\{\\}\\/\\'\\#\\~\\.,;\"\\<=\\>-#%&!+:~¡ ]+$";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(text1);
if (m.find()) {
System.out.println("true");
}
What is not working? Is the pattern not matching or is there an error message?
The first thing I see is that you have escaped many characters that don't need to be escaped, while an important one is not escaped.
Inside a character class only a few characters have a special meaning: [, ], \, -, and ^ when it is in the first position. You haven't escaped the -, so >-# is read as a range, which can cause an error. Try:
String pattern = "^[a-zA-Z0-9*?$\\[\\]()|{}/'#~.,;\"<=>\\-#%&!+:~¡ £¤¥ §¿ ÄÅÆÇÉÑÖØÜßàäåæ èéìñòöøùü ]+$";
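To see why the unescaped hyphen matters, this sketch compiles a few small character classes and reports which ones are accepted. The class name, helper method, and shortened classes are assumptions for illustration; they are reduced from the pattern in the question.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo: an unescaped '-' between two characters forms a range.
class HyphenInClassDemo {
    // Returns true if the regex compiles, false if Pattern.compile rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // '>' has a higher code point than '#', so ">-#" is a reversed range.
        System.out.println(compiles("[<=>-#%]"));   // false
        System.out.println(compiles("[<=>\\-#%]")); // true: hyphen escaped
        System.out.println(compiles("[-<=>#%]"));   // true: leading hyphen is literal
        System.out.println(Pattern.matches("[<=>\\-#%]+", ">-#")); // true
    }
}
```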
The next thing is: Have a look at Unicode Properties/Scripts. You can e.g. use \\p{L} to match a letter in any language.
String pattern = "^[\\p{L}\\p{M}0-9*?$\\[\\]()|{}/'#~.,;\"<=>\\-#%&!+:~¡ £¤¥ §¿]+$";
This would match all the letters you had in your class, and more!
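A minimal sketch of the Unicode-property approach, reduced to just letters and combining marks so the effect is easy to see. The class and method names are hypothetical, and the non-ASCII test strings are written as \u escapes so the example does not depend on the source file encoding.

```java
import java.util.regex.Pattern;

// Hypothetical demo of \p{L} (any letter) and \p{M} (combining marks).
class UnicodeLetterDemo {
    private static final Pattern LETTERS = Pattern.compile("^[\\p{L}\\p{M}]+$");

    static boolean isLetters(String s) {
        return LETTERS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isLetters("abc"));                // true
        System.out.println(isLetters("\u00C4\u00D1\u00F8")); // true: A-umlaut, N-tilde, o-slash
        System.out.println(isLetters("e\u0301"));            // true: e plus combining acute accent
        System.out.println(isLetters("abc1"));               // false: digit is not a letter
        System.out.println(isLetters("\u00A1"));             // false: inverted exclamation mark is punctuation
    }
}
```

Digits and punctuation are not covered by \p{L}, which is why the full pattern above still lists them explicitly.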

Java regex validating special chars

This seems like a well known title, but I am really facing a problem in this.
Here is what I have and what I've done so far.
I have to validate an input string; these characters are not allowed:
&%$###!~
So I coded it like this:
String REGEX = "^[&%$###!~]";
String username= "jhgjhgjh.#";
Pattern pattern = Pattern.compile(REGEX);
Matcher matcher = pattern.matcher(username);
if (matcher.matches()) {
System.out.println("matched");
}
Change your first line of code like this
String REGEX = "[^&%$##!~]*";
And it should work fine. ^ outside the character class denotes start of line. ^ inside a character class [] means a negation of the characters inside the character class. And, if you don't want to match empty usernames, then use this regex
String REGEX = "[^&%$##!~]+";
i think you want this:
[^&%$###!~]*
To match a valid input:
String REGEX = "[^&%$##!~]*";
To match an invalid input:
String REGEX = ".*[&%$##!~]+.*";
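Both approaches from these answers can be sketched as follows: either require the whole string to consist of allowed characters, or search for a single forbidden one. The class and method names are hypothetical, and the forbidden set is assumed to be exactly the characters shown in the answers (a repeated # inside a character class is redundant, so it appears once here).

```java
import java.util.regex.Pattern;

// Hypothetical username check based on the character set in the answers above.
class UsernameCheckDemo {
    // Matches one forbidden character anywhere in the input.
    private static final Pattern FORBIDDEN = Pattern.compile("[&%$#!~]");

    // Approach 1: valid if non-empty and no forbidden character is found.
    static boolean isValidByFind(String username) {
        return !username.isEmpty() && !FORBIDDEN.matcher(username).find();
    }

    // Approach 2: valid if the whole string is made of allowed characters.
    static boolean isValidByMatches(String username) {
        return username.matches("[^&%$#!~]+");
    }

    public static void main(String[] args) {
        System.out.println(isValidByFind("jhgjhgjh.#"));    // false: contains '#'
        System.out.println(isValidByMatches("jhgjhgjh.#")); // false
        System.out.println(isValidByFind("jhgjhgjh"));      // true
        System.out.println(isValidByMatches("jhgjhgjh"));   // true
    }
}
```

The original code failed because ^ outside the brackets is an anchor and matches() requires the entire input to match, so "^[...]" could only ever accept a one-character string made of a forbidden character.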
