Java: Regex: + quantifier not working

So "XXXXX**".matches("[X{9,11}\\*{2,3}]") returns false as expected...
But, "XXXXX**".matches("[X{9,11}\\*{2,3}]+") returns true. Am I using the + quantifier correctly? (I want the second one to also return false)

[...] matches any character defined in the character class, so
[X{9,11}\\*{2,3}] actually means a single character that is X, or an open brace, or 9, or a comma, or 1 (yes, you have it duplicated), or a close brace, or an asterisk (the \\* inside the class is just an escaped asterisk), or 2, or 3.
Since the string to be matched has more than one character, such a pattern will not match.
When you add a +, it means matching a string of one or more [X or asterisk or ...] characters, so it matches.
I believe what you really want is a group.
So the regex looks like (X{9,11}\*{2,3})+

"XXXXXXXXX**".matches("(X{9,11}\\*{2,3})+")
"XXXXXXXX**".matches("(X{9,11}\\*{2,3})+")
return true and false, respectively.
The whole thing of (Xes and *s) has to occur at least once (+).
No character class is involved, so there is no need for '[]'.
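As a minimal sketch of the difference between the character class and the group, the two patterns from this question can be run side by side. The class name ClassVsGroup and its helper methods are made up for illustration.

```java
public class ClassVsGroup {
    // Character class: each repetition matches ONE of the listed characters.
    static final String CHAR_CLASS = "[X{9,11}\\*{2,3}]+";
    // Group: the whole sequence "9 to 11 X's, then 2 to 3 asterisks" repeats.
    static final String GROUP = "(X{9,11}\\*{2,3})+";

    static boolean matchesClass(String s) {
        return s.matches(CHAR_CLASS);
    }

    static boolean matchesGroup(String s) {
        return s.matches(GROUP);
    }

    public static void main(String[] args) {
        System.out.println(matchesClass("XXXXX**"));     // true: every char is in the class
        System.out.println(matchesGroup("XXXXX**"));     // false: only 5 X's
        System.out.println(matchesGroup("XXXXXXXXX**")); // true: 9 X's, 2 asterisks
    }
}
```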

Related

Check if String ends with two digits after a dot in Regular Expression?

I'm trying to test if a String ends with EXACTLY two digits after a dot in Java using a Regular Expression. How can I achieve this?
Something like "500.23" should return true, while "50.3" or "50" should return false.
I tried things like "500.00".matches("/^[0-9]{2}$/") but it returns false.
Here is a RegEx that might help you:
^\d+\.\d{2,2}$
it may neither be perfect nor the most efficient, but it should lead you in the right direction.
^ says that the expression should start here
\d looks for any digit
+ says, that the leading \d can appear as often as necessary (1–infinity)
\. means you are expecting a dot(.) at one point
\d{2,2} that's the trick: it says you want exactly 2 digits (no fewer, no more); \d{2} is equivalent
$ tells you that the expression ends there (after the 2 digits)
In Java the \ needs to be escaped, so it would be:
^\\d+\\.\\d{2,2}$
Edit
If you don't need digits before the dot (.), or if you really don't care what comes before the dot, then you can replace the first \d+ with a .* as in Bohemian's answer. The (unescaped) dot means that the expression can contain any character (not only digits). Then even the leading ^ might no longer be necessary.
.*\\.\\d{2,2}$
Use this regex:
String s="987234.42";
if(Pattern.matches("^\\d+(\\.\\d{2})$", s)){ // the string must start with one or more digits, followed by a dot, then exactly two digits
....
}
Firstly, forward slashes are no part of regular expressions whatsoever. They are however used by some languages to delimit regular expressions - but not java, so don't use them.
Secondly, in java matches() must match the whole string to return true (so ^ and $ are implied in the regex).
Try this:
if (str.matches(".*\\.\\d\\d"))
// it ends with dot then 2 digits
Note that in Java a backslash in a regex requires escaping by a further backslash in a string literal.
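A short sketch of the check discussed above, assuming digits are required before the dot. The class and method names are hypothetical; because matches() tests the whole string, no ^ or $ is needed.

```java
public class TwoDecimals {
    // One or more digits, a literal dot, then exactly two digits.
    // matches() anchors at both ends implicitly.
    static boolean endsWithTwoDecimals(String s) {
        return s.matches("\\d+\\.\\d{2}");
    }

    public static void main(String[] args) {
        System.out.println(endsWithTwoDecimals("500.23")); // true
        System.out.println(endsWithTwoDecimals("50.3"));   // false
        System.out.println(endsWithTwoDecimals("50"));     // false
    }
}
```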

Regexp: Specific characters in the text

My goal is to check for specific characters (*,^,+,?,$,[],[^]) in some text, like:
?test.test => true
test.test => false
test^test => true
test:test => false
test-test$ => true
test-test => false
I've already created a regex for the requirement above, but I am not sure about it.
^(.*)([\[\]\^\$\?\*\+])(.*)$
It would be good to know whether it can be optimized in some way.
Your regex is already quite optimized, as it's very simple. You can only make it simpler or more readable.
Also if you use the matches() method of Java's String class then you'll not require the ^ and $ at the both ends.
.*([\\[\\]^$?*+]).*
Use double backslashes (\\) for Java; otherwise use a single backslash (\).
Look, I have removed the captures () along with escape character \ for the characters ^$?*+ as they are inside the character class [].
TL;DR
The quickest regex to do the job is
# ^[^\]\[^$?*+]*+([\]\[^$?*+])
^ #start of the string
[^ #any character BUT...
\]\[^$?*+ #...these ones (^$?*+ aren't special inside a character class)
]*+ #zero or more times (possessive quantifier)
([ #capture any of...
\]\[^$?*+ #...these characters
])
Be careful that in a java string, you need to escape the \ as well, so you should transform every \ into \\.
Discussion
At first, two regexes come to mind:
[\]\[^$?*+], which will match only the character you want inside the string.
^.*[\]\[^$?*+], which will match your string up to the desired character.
It's actually important performance-wise to understand the difference between the case with .* at the beginning and the one with no wildcard at all.
When searching for the pattern, the first .* will make the regex engine eat all the string, then backtrack character by character to see if it's a match for your character range [...]. So the regex will actually search from the end of the string.
This is an advantage when the character you want is near the end, and a disadvantage when it is at the beginning.
On the other case, the regex engine will try every character, beginning from the left, until it matches what you want.
You can see what I mean with these two examples from the excellent regex101.com:
with the .*, a match is found in 26 steps when it's near the beginning, 8 when it's near the end: http://regex101.com/r/oI3pS1/#debugger
without it, it is found in 5 steps when near the beginning and 23 when near the end
Now, if you want to combine these two approaches you can use the tl;dr answer: you eat everything that isn't your character, then you match your character (or fail if there isn't one).
On our example, it takes 7 steps wherever your character is in the string (and 7 steps even if there is no character, thanks to the possessive quantifier).
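Here is a rough sketch of how the possessive pattern from the TL;DR might be used in Java. It uses Matcher.find() rather than matches(), since the pattern deliberately stops at the first special character. The class name FirstSpecialChar and the method firstSpecial are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstSpecialChar {
    // Skip everything that is not ] [ ^ $ ? * + (possessively, so no
    // backtracking), then capture the first such character.
    private static final Pattern FIRST_SPECIAL =
            Pattern.compile("^[^\\]\\[^$?*+]*+([\\]\\[^$?*+])");

    // Returns the first special character as a string, or null if none.
    static String firstSpecial(String s) {
        Matcher m = FIRST_SPECIAL.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstSpecial("?test.test")); // ?
        System.out.println(firstSpecial("test.test"));  // null
        System.out.println(firstSpecial("test-test$")); // $
    }
}
```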
That should also work:
String regex = ".*[\\[\\]^$?*+].*";
String test1 = "?test.test";
String test2 = "test.test";
String test3 = "test^test";
String test4 = "test:test";
String test5 = "test-test$";
String test6 = "test-test";
System.out.println(test1.matches(regex));
System.out.println(test2.matches(regex));
System.out.println(test3.matches(regex));
System.out.println(test4.matches(regex));
System.out.println(test5.matches(regex));
System.out.println(test6.matches(regex));

Regular Expression for a string that contains one or more letters somewhere in it

What would be a regular expression that would evaluate to true if the string has one or more letters anywhere in it.
For example:
1222a3999 would be true
a222aZaa would be true
aaaAaaaa would be true
but:
1111112())-- would be false
I tried: ^[a-zA-Z]+$ and [a-zA-Z]+ but neither work when there are any numbers and other characters in the string.
.*[a-zA-Z].*
The above means one letter, and before/after it - anything is fine.
In java:
String regex = ".*[a-zA-Z].*";
System.out.println("1222a3999".matches(regex));
System.out.println("a222aZaa ".matches(regex));
System.out.println("aaaAaaaa ".matches(regex));
System.out.println("1111112())-- ".matches(regex));
Will provide:
true
true
true
false
as expected
^.*[a-zA-Z].*$
Depending on the implementation, match() functions check if the entire string matches (which is probably why your [a-zA-Z] or [a-zA-Z]+ patterns didn't work).
Either use match() with the above pattern or use some sort of search() method instead.
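In Java specifically, the two options look roughly like this: String.matches() needs the .* padding because it tests the whole string, while Matcher.find() searches for the pattern anywhere. The class and method names below are hypothetical.

```java
import java.util.regex.Pattern;

public class ContainsLetter {
    private static final Pattern LETTER = Pattern.compile("[a-zA-Z]");

    // Whole-string match: the pattern must account for every character.
    static boolean viaMatches(String s) {
        return s.matches(".*[a-zA-Z].*");
    }

    // Search: succeeds if the pattern occurs anywhere in the string.
    static boolean viaFind(String s) {
        return LETTER.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(viaMatches("1222a3999"));   // true
        System.out.println(viaFind("1222a3999"));      // true
        System.out.println(viaFind("1111112())--"));   // false
        // Without the padding, matches() fails on mixed input:
        System.out.println("1222a3999".matches("[a-zA-Z]+")); // false
    }
}
```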
This regexp should do it:
[a-zA-Z]
It matches as long as there's a single letter anywhere in the string, it doesn't care about any of the other characters.
[a-zA-Z]+
should have worked as well, I don't know why it didn't for you.
.*[a-zA-Z].*
Should get you the result you want.
The period matches any character except a newline, and the asterisk says it may occur zero or more times. Then the pattern [a-zA-Z] requires exactly one letter at that position. Do not add a question mark after it: [a-zA-Z]? would make the letter optional, and the whole pattern would then match any string. Finally, the ending .* says that the letter can be followed by zero or more characters of any type.

String.replaceAll() with [\d]* appends replacement String inbetween characters, why?

I have been trying for hours now to get a regex statement that will match an unknown quantity of consecutive numbers. I believe [0-9]* or [\d]* should be what I want yet when I use Java's String.replaceAll it adds my replacement string in places that shouldn't be matching the regex.
For example:
I have an input string of "This is my99String problem"
If my replacement string is "~"
When I run this
myString.replaceAll("[\\d]*", "~" )
or
myString.replaceAll("[0-9]*", "~" )
my return string is "~T~h~i~s~ ~i~s~ ~m~y~~S~t~r~i~n~g~ ~p~r~o~b~l~e~m~"
As you can see, the numbers have been replaced, but why is it also inserting my replacement string in between characters?
I want it to look like "This is my~String problem"
What am I doing wrong, and why is Java matching like this?
\\d* matches 0 or more digits, and so it even matches an empty string. And you have an empty string before every character in your string. So, for each of them, it replaces it with ~, hence the result.
Try using \\d+ instead. And you don't need to include \\d in character class.
[\\d]*
matches zero or more (as defined by *). Hence you're getting matches all through your strings. If you use
[\\d]+
that'll match 1 or more numbers.
From the doc:
Greedy quantifiers
X? X, once or not at all
X* X, zero or more times
X+ X, one or more times
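The effect of the two quantifiers can be reproduced with a small sketch using the input string from the question. The class name DigitRuns and its methods are made up for illustration.

```java
public class DigitRuns {
    // \d* also matches the empty string, so a replacement is inserted
    // at every position where zero digits "match".
    static String replaceZeroOrMore(String s) {
        return s.replaceAll("\\d*", "~");
    }

    // \d+ needs at least one digit, so only real digit runs are replaced.
    static String replaceOneOrMore(String s) {
        return s.replaceAll("\\d+", "~");
    }

    public static void main(String[] args) {
        String input = "This is my99String problem";
        System.out.println(replaceZeroOrMore(input));
        System.out.println(replaceOneOrMore(input)); // This is my~String problem
    }
}
```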

Regular Expression - Java not working

I have a line of Java code
System.out.println("...Somtime".matches("^[^a-zA-Z]"));
Which returns false. Why? Can any one help?
String#matches matches against both ends, so your pattern must cover the complete string. You also don't need the caret anchor (^) at the beginning; it is implicit.
Your first three characters match [^a-zA-Z], while the later characters match [a-zA-Z].
So you probably want:
"...Somtime".matches("[^a-zA-Z]{3}[a-zA-Z]+")
String.matches("regex")
This method will match the regex against the WHOLE string. If the string matches regex, it will return true and false otherwise
System.out.println("...Somtime".matches("^[^a-zA-Z]{3}[a-zA-Z]+"));
Here {3} covers the three dots, and this returns true.
System.out.println("Somtime".matches("^[^a-zA-Z]"));
This returns false.
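If the intent was only to test whether the string starts with a non-letter, one possible alternative (an assumption about the asker's goal, not something stated in the question) is Matcher.find(), which does not require the pattern to cover the whole string. The names below are hypothetical.

```java
import java.util.regex.Pattern;

public class WholeVsPrefix {
    private static final Pattern NON_LETTER_AT_START =
            Pattern.compile("^[^a-zA-Z]");

    // find() only needs the pattern to occur; ^ pins it to the start.
    static boolean startsWithNonLetter(String s) {
        return NON_LETTER_AT_START.matcher(s).find();
    }

    // matches() requires the single-character pattern to be the whole string.
    static boolean wholeMatch(String s) {
        return s.matches("^[^a-zA-Z]");
    }

    public static void main(String[] args) {
        System.out.println(startsWithNonLetter("...Somtime")); // true
        System.out.println(wholeMatch("...Somtime"));          // false
        System.out.println(startsWithNonLetter("Somtime"));    // false
    }
}
```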
