java string.matches requires matches of too much of the string - java

The following code:
String s = "casdfsad";
System.out.println(s.matches("[a-z]"));
System.out.println(s.matches("^[a-z]"));
System.out.println(s.matches("^[a-z].*"));
outputs
false
false
true
But why is that? I did not specify any $ at the end of any of the patterns.
Does String.matches add ^ and $ implicitly to force a full string match?
Why? And can I disable full string matching, perhaps by using another method?
Edit:
If String.matches implicitly adds ^ and $, why don't String.replaceAll or String.replaceFirst also do this? Isn't this inconsistent?

Unfortunately, String has no find method; you must use Matcher.find().
Pattern pattern = Pattern.compile("[a-z]");
Matcher matcher = pattern.matcher("casdfsad");
System.out.println(matcher.find());
will output
true
EDIT: If you want to find literal substrings and you don't need regular expressions, you can use String.indexOf(), e.g.
String someString = "Hello World";
boolean isHelloContained = someString.indexOf("Hello") > -1;
System.out.println(isHelloContained);
someString = "Some other string";
isHelloContained = someString.indexOf("Hello") > -1;
System.out.println(isHelloContained);
will output
true
false

Try adding the greedy quantifier +, which lets the pattern match the whole String. Because s has more than one character, you need a quantifier that matches more than one character in the a-z range. With String.matches, you don't need the boundary characters ^ and $.
String s = "casdfsad";
System.out.println(s.matches("[a-z]+")); // prints true

Are you trying to match a whole String with a single-character regex?
You could try:
String s = "casdfsad";
System.out.println(s.matches("[a-z]+"));
System.out.println(s.matches("^[a-z]+"));
System.out.println(s.matches("^[a-z].*"));
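The third one matches because of the .*, which consumes the rest of the string. String.matches does not literally add ^ and $ to the pattern, but it returns true only when the entire string matches, so the effect is the same.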
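The difference between the three Matcher methods discussed in these answers can be sketched as follows. This is a minimal illustration, not part of the original answers; the class name MatchModes and its helper methods are hypothetical names chosen for the example.

```java
import java.util.regex.Pattern;

class MatchModes {
    // True only if the WHOLE input matches the regex (what String.matches does).
    static boolean whole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if the regex matches a prefix of the input.
    static boolean prefix(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    // True if the regex matches anywhere in the input.
    static boolean anywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String s = "casdfsad";
        System.out.println(whole("[a-z]", s));    // false: one letter is not the whole string
        System.out.println(prefix("[a-z]", s));   // true: the first letter matches
        System.out.println(anywhere("[a-z]", s)); // true
        System.out.println(whole("[a-z]+", s));   // true: + lets the class cover every letter
    }
}
```

If you only need a yes/no answer for a substring match, find() is usually the method to reach for; lookingAt() sits in between, anchoring at the start but not at the end.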

Related

String.matches() with \n

Why does the String::matches method return false when I put \n into the String?
public class AppMain2 {
public static void main(String[] args) {
String data1 = "\n London";
System.out.println(data1.matches(".*London.*"));
}
}
It doesn't match because "." in a regex does not match line terminators by default, as described in the documentation here:
https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#sum
By default Java's . does not match newlines. To have . include newlines, set the Pattern.DOTALL flag with (?s):
System.out.println(data1.matches("(?s).*London.*"));
Note for those coming from other regex flavors: the Java documentation uses the term "match" differently from other languages. Java's String::matches() returns true only if the entire string is matched, i.e. it behaves as if a ^ and $ were added to the head and tail of the passed regex; it is NOT enough that the string contains a match.
If you want true, you need to use Pattern.DOTALL or (?s).
That way, . matches any character, including \n.
String data1 = "\n London";
Pattern pattern = Pattern.compile(".*London.*", Pattern.DOTALL);
System.out.println(pattern.matcher(data1).matches()); // String.matches only accepts a String regex, so go through the Matcher
or :
System.out.println(data1.matches("(?s).*London.*"));
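"\n" is a line terminator, and . does not match it by default, so String.matches cannot match the whole string and returns false. Note that Pattern.MULTILINE only changes how ^ and $ behave; it does not make . match "\n". If finding the pattern on a single line is enough, try find() instead, something like this:
Pattern.compile(".*London.*", Pattern.MULTILINE).matcher(data1).find();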
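To see how the flags mentioned in these answers affect a full-string match, here is a small sketch. The class name DotAllDemo and the helper matchesWith are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

class DotAllDemo {
    // Compiles the regex with the given flags and requires a full-string match.
    static boolean matchesWith(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String data1 = "\n London";
        // Default: . does not match the leading line terminator.
        System.out.println(matchesWith(".*London.*", 0, data1));                 // false
        // DOTALL: . matches any character, including \n.
        System.out.println(matchesWith(".*London.*", Pattern.DOTALL, data1));    // true
        // MULTILINE only changes ^ and $, so the full match still fails.
        System.out.println(matchesWith(".*London.*", Pattern.MULTILINE, data1)); // false
        // For a plain substring test no regex is needed at all.
        System.out.println(data1.contains("London"));                            // true
    }
}
```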

What is the Regex for decimal numbers in Java?

I am not quite sure of what is the correct regex for the period in Java. Here are some of my attempts. Sadly, they all meant any character.
String regex = "[0-9]*[.]?[0-9]*";
String regex = "[0-9]*['.']?[0-9]*";
String regex = "[0-9]*["."]?[0-9]*";
String regex = "[0-9]*[\.]?[0-9]*";
String regex = "[0-9]*[\\.]?[0-9]*";
String regex = "[0-9]*.?[0-9]*";
String regex = "[0-9]*\.?[0-9]*";
String regex = "[0-9]*\\.?[0-9]*";
But what I want is the actual "." character itself. Anyone have an idea?
What I'm trying to do actually is to write out the regex for a non-negative real number (decimals allowed). So the possibilities are: 12.2, 3.7, 2., 0.3, .89, 19
String regex = "[0-9]*['.']?[0-9]*";
Pattern pattern = Pattern.compile(regex);
String x = "5p4";
Matcher matcher = pattern.matcher(x);
System.out.println(matcher.find());
The last line is supposed to print false but prints true anyway. I think my regex is wrong though.
Update
To match a non-negative decimal number that contains a decimal point, you need this regex:
^(\d*\.\d+|\d+\.\d*)$
or in Java syntax: "^(\\d*\\.\\d+|\\d+\\.\\d*)$"
The group makes ^ and $ apply to both alternatives.
String regex = "^(\\d*\\.\\d+|\\d+\\.\\d*)$";
String string = "123.43253";
if (string.matches(regex))
    System.out.println("true");
else
    System.out.println("false");
Explanation for your original regex attempts:
[0-9]*\.?[0-9]*
with java escape it becomes :
"[0-9]*\\.?[0-9]*";
If you need to make the dot mandatory, remove the ? mark:
[0-9]*\.[0-9]*
But this will accept just a dot without any digits as well. So, if you want the validation to treat the digits as mandatory, use + (which means one or more) instead of * (which means zero or more). In that case it becomes:
[0-9]+\.[0-9]+
If you are on Kotlin, use an extension function (ktx):
fun String.findDecimalDigits() =
Pattern.compile("^[0-9]*\\.?[0-9]*").matcher(this).run { if (find()) group() else "" }!!
Your initial understanding was probably right, but you were being thrown off because matcher.find() finds the first valid match within the string, and all of your examples can match a zero-length string.
I would suggest "^([0-9]+\\.?[0-9]*|\\.[0-9]+)$"
There are actually two ways to match a literal dot. One is backslash-escaping, as you do with \\.; the other is to enclose it in a character class (square brackets), like [.]. Most special characters, including ., become literal inside square brackets. If all you want is to match a literal dot, \\. shows your intention more clearly than [.]. Use [] when you need to match one of several alternatives; for example, the regex [\\d.] matches a single digit or a literal dot.
I have tested all the cases.
public static boolean isDecimal(String input) {
return Pattern.matches("^[-+]?\\d*[.]?\\d+|^[-+]?\\d+[.]?\\d*", input);
}
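As a quick check against the inputs listed in the question, the following sketch wraps the suggested pattern ([0-9]+\\.?[0-9]*|\\.[0-9]+) in a helper and runs it with matches() rather than find(). The class name DecimalCheck and the method isNonNegativeDecimal are hypothetical names for this example; the explicit ^ and $ are dropped because matches() already requires the whole input to match.

```java
import java.util.regex.Pattern;

class DecimalCheck {
    // Digits with an optional fraction ("12.2", "2.", "19") or a bare fraction (".89").
    private static final Pattern DECIMAL =
            Pattern.compile("([0-9]+\\.?[0-9]*|\\.[0-9]+)");

    static boolean isNonNegativeDecimal(String input) {
        // matches() requires the whole input to match, so "5p4" is rejected.
        return DECIMAL.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"12.2", "3.7", "2.", "0.3", ".89", "19", "5p4", ".", ""};
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" -> " + isNonNegativeDecimal(sample));
        }
    }
}
```

The six values from the question are accepted, while "5p4", a lone "." and the empty string are rejected. With find() instead of matches(), "5p4" would be accepted again because the substring "5" matches.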

Using java regex how to find particular word any where in the string?

Using Java regex, how do I find a particular word anywhere in the string? I need to check whether the string "Google" contains all the letters of the word "gooe".
For example:
String: Google
Word to find: gooe
The string "Google" contains all the characters g, o, o, e, so it should return true.
If the string is "wikipedia" and my word to find is "gooe", then it should return false.
How do I form a regex for this scenario?
I've just tested a RegEx that makes use of lookahead:
(?=^.*g)(?=^.*o)(?=^.*e)
It should return true for all strings that contain g, o and e, while returning false if any of these characters is missing.
If you want to find word in whole string you can use:
"^(?=.*e)(?=.*o.*o)(?=.*g).*"
You have to build a positive lookahead for each letter. In case of having gooe as search term our RegEx would be:
(?i)(?=.*g)(?=.*o)(?=.*o)(?=.*e)
Obviously, we have two identical lookaheads. Both are satisfied by the same o, so one is redundant. You can remove duplicate letters from the search term before building the final pattern. (?i) turns the case-insensitivity flag on.
String term = "Gooe"; // Search term
String word = "google"; // Against word `Google`
String pattern = "(?i)(?=.*" + String.join(")(?=.*", term.split("(?!^)")) + ")";
Pattern regex = Pattern.compile(pattern);
Matcher match = regex.matcher(word);
if (match.find()) {
// Matched
}
If order is important and, when looking for two os, both of them must actually be present, then our RegEx would be:
(?i).*?g.*?o.*?o.*?e
Java:
String pattern = "(?i).*?" + String.join(".*?", term.split("(?!^)"));
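The two pattern-building approaches above can be compared side by side. This is a sketch under assumptions: the class name LetterSearch and its methods are hypothetical, and Pattern.quote is added here (it is not in the original answers) so that a search term containing regex metacharacters is treated literally.

```java
import java.util.regex.Pattern;

class LetterSearch {
    // Builds (?i)(?=.*g)(?=.*o)...: every letter of term must occur somewhere in word.
    // Repeated letters are NOT counted, since identical lookaheads hit the same character.
    static boolean containsAllLetters(String word, String term) {
        StringBuilder sb = new StringBuilder("(?i)");
        for (char c : term.toCharArray()) {
            sb.append("(?=.*").append(Pattern.quote(String.valueOf(c))).append(")");
        }
        return Pattern.compile(sb.toString()).matcher(word).find();
    }

    // Builds (?i).*?g.*?o.*?o.*?e: the letters must occur in this order,
    // and each repeated letter needs its own occurrence.
    static boolean containsInOrder(String word, String term) {
        StringBuilder sb = new StringBuilder("(?i)");
        for (char c : term.toCharArray()) {
            sb.append(".*?").append(Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(sb.toString()).matcher(word).find();
    }

    public static void main(String[] args) {
        System.out.println(containsAllLetters("Google", "gooe"));    // true
        System.out.println(containsAllLetters("wikipedia", "gooe")); // false
        System.out.println(containsAllLetters("elgoog", "gooe"));    // true: order is ignored
        System.out.println(containsInOrder("elgoog", "gooe"));       // false: no e after the os
        System.out.println(containsInOrder("Gogle", "gooe"));        // false: only one o
    }
}
```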

Why doesn't /0/g match in a string that contains zeroes?

This code always prints "false" at the end, even if the Integer contains a zero:
Integer i = (int) rand(1, 200); // random [1;200)
String regexp = "/0/g";
Pattern p = Pattern.compile(regexp);
Matcher m = p.matcher(i.toString());
print(i);
print(m.matches());
What is the reason? I don't get where the mistake could be.
Needed: m.matches() = "true" if Integer contains one or more zero.
The problem is that you're giving the regular expression incorrectly. The string you give Pattern.compile is just the text of the expression, without / on either side, and without flags; flags are specified separately.
So in your case, you'd just want:
String regexp = "0";
There's no "global" flag; instead, you use the methods on the resulting Matcher as appropriate to what you're doing.
Needed: m.matches() = "true" if Integer contains one or more zero.
Then you don't want to use Matcher#matches, you want Matcher#find. Or if you need to use Matcher#matches, the expression would be:
String regexp = ".*0.*";
...i.e., any number of any character, then a 0, then any number of any character. That way, the entire string can match the expression.
Of course, if you just want to know there's a zero, it's much simpler to just use
boolean flag = String.valueOf(i).indexOf('0') != -1;
In this particular case you don't need a regex at all since you are looking for a literal character, use indexOf:
if (Str.indexOf( '0' ) != -1) {
...
About your original pattern:
A regex doesn't need to be enclosed between delimiters in Java, so the slashes are useless. The global modifier isn't needed either, because the global nature is determined by the method you choose (in other words, the only way to obtain several results is to use the find method in a loop).
print(m.find());
matches() has to match the whole input, starting at the beginning. Your number is never just 0 (the range is [1;200)), so use find() instead.
Using find will enable you to locate 0 anywhere in the string.
matches tries to match the expression against the entire string, as if a ^ were implicitly added at the start and a $ at the end of your pattern, meaning it will not look for a substring. Hence false.
Also change your regex to "0" as suggested by the other answer.
Try,
String regexp = ".*0.*";
Pattern p = Pattern.compile(regexp);
Matcher m = p.matcher(i.toString());
if(m.find()){
System.out.println(i);
System.out.println(m.matches());
}
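Putting the suggestions from these answers together, a zero check might look like the sketch below. The class name ZeroCheck and its methods are hypothetical names for illustration; the last line of main shows that the original "/0/g" string is treated as literal text by Java.

```java
import java.util.regex.Pattern;

class ZeroCheck {
    private static final Pattern ZERO = Pattern.compile("0");

    // find() looks for the pattern anywhere in the decimal representation.
    static boolean containsZero(int n) {
        return ZERO.matcher(Integer.toString(n)).find();
    }

    // Same result with matches(): the pattern itself must cover the whole string.
    static boolean containsZeroFullMatch(int n) {
        return Integer.toString(n).matches(".*0.*");
    }

    public static void main(String[] args) {
        System.out.println(containsZero(105));          // true
        System.out.println(containsZero(123));          // false
        System.out.println(containsZeroFullMatch(120)); // true
        // Slashes and the trailing g are ordinary characters in a Java regex,
        // so "/0/g" only matches the literal text "/0/g".
        System.out.println(Pattern.matches("/0/g", "/0/g")); // true
    }
}
```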

Regex NOT operator doesn't work

I'm trying to filter files in a folder. I need the files that don't end with ".xml-test". The following regex works as expected (ok1,ok2,ok3 = false, ok4 = true)
String regex = ".+\\.xml\\-test$";
boolean ok1 = Pattern.matches(regex, "database123.xml");
boolean ok2 = Pattern.matches(regex, "database123.sql");
boolean ok3 = Pattern.matches(regex, "log_file012.txt");
boolean ok4 = Pattern.matches(regex, "database.xml-test");
Now I just need to negate it, but it doesn't work for some reason:
String regex = "^(.+\\.xml\\-test)$";
I still get ok1,ok2,ok3 = false, ok4 = true
Any ideas? (As people pointed out, this could be done easily without regex. But for argument's sake, assume I have to use a single regex pattern and nothing else, i.e. !Pattern.matches(..); is also not allowed.)
I think you are looking for:
if (! someString.endsWith(".xml-test")) {
...
}
No regular expression required. Throw this into a FilenameFilter as follows:
public boolean accept(File dir, String name) {
return ! name.endsWith(".xml-test");
}
The meaning of ^ changes depending on its position in the regexp. When the symbol is inside a character class [] as the first character, it means negation of the character class; when it is outside a character class, it means the beginning of line.
The easiest way to negate a result of a match is to use a positive pattern in regex, and then to add a ! on the Java side to do the negation, like this:
boolean isGoodFile = !Pattern.matches(regex, "database123.xml");
The following Java regex asserts that a string does NOT end with: .xml-test:
String regex = "^(?:(?!\\.xml-test$).)*$";
This regex walks the string one character at a time and asserts that at each and every position the remainder of the string is not .xml-test.
Simple!
^ is not a negation operator in a regexp (outside a character class); it is a symbol indicating the beginning of a line.
You probably need (?!X), a zero-width negative lookahead.
But I suggest you use the File#listFiles method with a FilenameFilter implementation:
!name.endsWith(".xml-test")
If you really need to test it with regex, then you should use negative lookbehinds from Pattern class:
String regex = "^.*(?<!\\.xml-test)$";
How it works:
first you match the whole string: from the start (^), all characters (.*),
then you check that what has already been matched doesn't have ".xml-test" at the end (lookbehind at the position you have reached),
finally you test that it's the end of the string.
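The lookahead and lookbehind patterns from these answers can be checked against the file names in the question. This is a minimal sketch; the class name NotXmlTest and its fields and methods are hypothetical names for the example.

```java
import java.util.regex.Pattern;

class NotXmlTest {
    // Negative lookahead: at no position may the rest of the string be ".xml-test".
    static final Pattern LOOKAHEAD = Pattern.compile("^(?:(?!\\.xml-test$).)*$");
    // Negative lookbehind: at the end, the preceding text must not be ".xml-test".
    static final Pattern LOOKBEHIND = Pattern.compile("^.*(?<!\\.xml-test)$");

    static boolean viaLookahead(String name) {
        return LOOKAHEAD.matcher(name).matches();
    }

    static boolean viaLookbehind(String name) {
        return LOOKBEHIND.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] names = {"database123.xml", "database123.sql",
                          "log_file012.txt", "database.xml-test"};
        for (String name : names) {
            System.out.println(name
                    + " lookahead=" + viaLookahead(name)
                    + " lookbehind=" + viaLookbehind(name)
                    + " endsWith=" + !name.endsWith(".xml-test"));
        }
    }
}
```

For these four names all three checks agree: the first three are accepted and "database.xml-test" is rejected. Outside of the "single regex only" constraint, the endsWith version is the easiest to read.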
