I have a regular expression that I want to negate, e.g.
/(.{0,4})
for which String.matches returns the following:
"/1234" true
"/12" true
"/" true
"" false
"1234" false
"/12345" false
Is there a way to negate the above (using regex only) so that the results are:
"/1234" false
"/12" false
"/" false
"" true
"1234" true
"/12345" true
I'm looking for a general solution that would work for any regex without rewriting the whole regex.
I have looked at the following question:
How to negate the whole regex? It suggests using (?!pattern), but that doesn't seem to work for me.
The following regex
(?!/(.{0,4}))
returns the following:
"/1234" false
"/12" false
"/" false
"" true
"1234" false
"/12345" false
which is not what I want.
Any help would be appreciated.
You need to add anchors. The original regex (minus the unneeded parentheses):
/.{0,4}
...matches a string that contains a slash followed by zero to four more characters. But, because you're using the matches() method it's automatically anchored, as if it were really:
^/.{0,4}$
To achieve the inverse of that, you can't rely on automatic anchoring; you have to make at least the end anchor explicit within the lookahead. You also have to "pad" the regex with a .* because matches() requires the regex to consume the whole string:
(?!/.{0,4}$).*
But I recommend that you explicitly anchor the whole regex, like so:
^(?!/.{0,4}$).*$
It does no harm, and it makes your intention perfectly clear, especially to people who learned regexes from other flavors like Perl or JavaScript. The automatic anchoring of the matches() method is highly unusual.
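To see the anchored negation in action, here is a minimal sketch that runs the six inputs from the question through the pattern above. The class and method names (NegatedMatchDemo, negatedMatches) are made up for illustration.

```java
public class NegatedMatchDemo {
    // Explicitly anchored negation of /.{0,4}, as recommended above.
    static final String NEGATED = "^(?!/.{0,4}$).*$";

    // Hypothetical helper: true when the input does NOT fully match /.{0,4}
    static boolean negatedMatches(String input) {
        return input.matches(NEGATED);
    }

    public static void main(String[] args) {
        String[] inputs = {"/1234", "/12", "/", "", "1234", "/12345"};
        for (String s : inputs) {
            // Prints each input followed by the result of the negated match
            System.out.println("\"" + s + "\" " + negatedMatches(s));
        }
    }
}
```

The first three inputs should come out false and the last three true, which is the table the question asked for. Note that .* does not cross line terminators by default, so inputs containing newlines would need the DOTALL flag.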
I know this is a really old question, but hopefully my answer can help anyone looking for this in the future.
Alan Moore's answer is almost correct, but you need to group the whole regex too, or else you risk anchoring only part of the original regex.
For example, suppose you want to negate the following regex: abc|def (which matches either "abc" or "def").
Prepending (?! and appending $).* gives you (?!abc|def$).*.
The anchor here applies only to def, so the lookahead rejects anything that starts with abc, meaning that "abcx" will not match when it should.
I would rather prepend (?!(?: and append )$).*.
String negateRegex(String regex) {
return "(?!(?:" + regex + ")$).*";
}
From my testing it looks like negateRegex(negateRegex(regex)) would indeed be functionally the same as regex.
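As a rough check of the grouping issue, the sketch below compares the ungrouped and grouped negations of abc|def. The demo class name and the choice of sample inputs are mine, not part of the original answer.

```java
public class NegateRegexDemo {
    // Same helper as above: wrap the regex in a non-capturing group
    // so the trailing $ applies to every alternative.
    static String negateRegex(String regex) {
        return "(?!(?:" + regex + ")$).*";
    }

    public static void main(String[] args) {
        String ungrouped = "(?!abc|def$).*";      // $ binds only to "def"
        String grouped = negateRegex("abc|def");  // $ binds to the whole alternation

        for (String s : new String[] {"abc", "def", "abcx", "xyz"}) {
            System.out.println(s
                    + " ungrouped=" + s.matches(ungrouped)
                    + " grouped=" + s.matches(grouped));
        }
    }
}
```

Only "abcx" should differ between the two columns: the ungrouped version rejects it because the lookahead sees abc with no anchor attached, while the grouped version accepts it.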
Assuming our regex is MYREG, match other lines with:
^(?:(?!.*MYREG).*)$
Ave Maria.
What would be a regular expression that would evaluate to true if the string has one or more letters anywhere in it.
For example:
1222a3999 would be true
a222aZaa would be true
aaaAaaaa would be true
but:
1111112())-- would be false
I tried ^[a-zA-Z]+$ and [a-zA-Z]+, but neither works when there are numbers or other characters in the string.
.*[a-zA-Z].*
The above means one letter, with anything allowed before or after it.
In Java:
String regex = ".*[a-zA-Z].*";
System.out.println("1222a3999".matches(regex));
System.out.println("a222aZaa ".matches(regex));
System.out.println("aaaAaaaa ".matches(regex));
System.out.println("1111112())-- ".matches(regex));
This prints:
true
true
true
false
as expected
^.*[a-zA-Z].*$
Depending on the implementation, match() functions check if the entire string matches (which is probably why your [a-zA-Z] or [a-zA-Z]+ patterns didn't work).
Either use match() with the above pattern or use some sort of search() method instead.
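In Java specifically, the "search" alternative is Matcher.find(). The following is a small sketch of both options; the class name ContainsLetterDemo and the helper containsLetter are invented for this example.

```java
import java.util.regex.Pattern;

public class ContainsLetterDemo {
    // Unanchored pattern: one ASCII letter anywhere is enough when searching.
    static final Pattern LETTER = Pattern.compile("[a-zA-Z]");

    // Hypothetical helper using find(), which searches for a match
    // anywhere in the input instead of requiring a full-string match.
    static boolean containsLetter(String s) {
        return LETTER.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] inputs = {"1222a3999", "a222aZaa", "aaaAaaaa", "1111112())--"};
        for (String s : inputs) {
            System.out.println(s
                    + " find=" + containsLetter(s)
                    + " matches([a-zA-Z]+)=" + s.matches("[a-zA-Z]+")
                    + " matches(.*[a-zA-Z].*)=" + s.matches(".*[a-zA-Z].*"));
        }
    }
}
```

The find() column and the padded matches() column should agree on every input, while matches("[a-zA-Z]+") is true only for strings made up entirely of letters.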
This regexp should do it:
[a-zA-Z]
It matches as long as there's a single letter anywhere in the string; it doesn't care about any of the other characters.
[a-zA-Z]+
should have worked as well; I don't know why it didn't for you.
.*[a-zA-Z].*
should get you the result you want.
The period matches any character except a newline, and the asterisk says it may occur zero or more times. Then the pattern [a-zA-Z] requires exactly one character from the brackets. Note that it must not be followed by a question mark: [a-zA-Z]? would make the letter optional, so the pattern would match every string. Finally, the ending .* says that the letter can be followed by zero or more characters of any type.
There is a regular expression for finding a blank string, and I want its negation. I also saw this question, but its answer does not work in Java (see the examples). The solution does not work for me either (see the third line in the example).
For example
Pattern.compile("/^$|\\s+/").matcher(" ").matches() - false
Pattern.compile("/^$|\\s+/").matcher(" a").matches()- false
Pattern.compile("^(?=\\s*\\S).*$").matcher("\t\n a").matches() - false
All three return false.
P.S. If something is not clear, ask me.
UPDATED
I want to use this regular expression in a @Pattern annotation without creating a custom annotation and a programmatic validator for it. That's why I want a "plain" regexp solution without using the find function.
It's not clear what you mean by negation.
If you mean "a string that contains at least one non-blank character," then you can use this:
Pattern.compile("\\S").matcher(str).find()
If it's really necessary to use matches, then you can do it with this.
Pattern.compile("\\A\\s*\\S.*\\Z").matcher(str).matches()
This just matches 0 or more whitespace characters followed by a non-whitespace character, followed by any characters (other than line terminators, unless DOTALL is enabled) up to the end of the string.
If you mean "a string that is all non-blank with at least one such character," then you can use this:
Pattern.compile("\\A\\S+\\Z").matcher(str).matches()
You need to study the Java regex syntax. In Java, regular expressions are compiled from strings, so there's no need for special delimiters like /.../ or %r{...} as you'll see in other languages.
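Here is a minimal sketch of the "contains at least one non-blank character" check in both styles. The (?s) flag in the second pattern is my own addition, so that .* also crosses line terminators; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

public class NonBlankDemo {
    // Search style: any single non-whitespace character is enough.
    static final Pattern ANY_NON_BLANK = Pattern.compile("\\S");

    // Full-match style. The (?s) (DOTALL) flag is an assumption added here
    // so that .* can run past newlines after the first non-blank character.
    static final Pattern FULL = Pattern.compile("(?s)\\s*\\S.*");

    static boolean hasNonBlankFind(String s) {
        return ANY_NON_BLANK.matcher(s).find();
    }

    static boolean hasNonBlankMatches(String s) {
        return FULL.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {"", " ", " a", "\t\n a", "a\nb"};
        for (String s : inputs) {
            System.out.println(hasNonBlankFind(s) + " " + hasNonBlankMatches(s));
        }
    }
}
```

Both helpers should agree on every input. If the annotation in question applies whole-string matching the way matches() does, a pattern string along the lines of the second one may be usable there, but that is worth verifying against the validator in use.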
How about this:
if(!string.trim().isEmpty()) {
// do something
}
Use the regex \s, which matches a whitespace character: [ \t\n\x0B\f\r].
Pattern.compile("\\s")
Consider running the following:
System.out.println(Pattern.matches(".?(\\d)$","3"));
It returns true because there is nothing before the 3 and ? checks for one or zero occurrences.
However, 3 is already the first character of the input, which starts at 0 and ends at 1. How can the JVM recognize that there is nothing before the 3?
For example, consider the following:
System.out.println(Pattern.matches(".*","hello"));
It returns true as well, but only the very last character gets matched with "nothing".
There should not be a "nothing" character at the beginning of a string, only at the end of it, right?
This is not really about the JVM. This is about Java regular expressions.
The regular expression ".*" means "match 0 or more characters". It's easy to satisfy this, since an empty string has 0 characters. Whether a quantifier prefers to match as little as possible or as much as possible depends on its kind. If you read this excellent writeup (http://docs.oracle.com/javase/tutorial/essential/regex/quant.html) you can see that plain quantifiers like ".*" are "greedy" and take as much as possible, while ".*?" is the "reluctant" form that prefers to take as little as possible. Either way, matches() succeeds only if the whole input can be matched.
Based on the information in that writeup, you can see that a pattern like ".{0,}" is just another greedy spelling of ".*". Use ".*?" if you want the reluctant behavior.
You are not interpreting your regex correctly. There is no such thing as a "nothing character". Rather, your pattern reads: any character followed by a digit at the end of the string, OR just a digit at the end of the string.
And surely, "3" fits the second description very well.
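One way to see what the optional part actually consumed is to capture it. In the sketch below I have wrapped .? in an extra capturing group, which is not in the original pattern; the class and method names are also invented.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalPrefixDemo {
    // Same pattern as in the question, with .? wrapped in a capturing
    // group (an addition for this demo) so its text can be inspected.
    static final Pattern P = Pattern.compile("(.?)(\\d)$");

    // Returns what the optional part matched, or null if the
    // whole input does not match the pattern.
    static String optionalPart(String input) {
        Matcher m = P.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"3", "a3", "33", "ab3"}) {
            String part = optionalPart(s);
            System.out.println(s + " -> "
                    + (part == null ? "no match" : "optional part = \"" + part + "\""));
        }
    }
}
```

For "3" the captured text is the empty string: the optional part matched zero characters, which is different from matching some special "nothing" character.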
The matches method tries to match the entire input,
so there's no need to use ^ and $.
Basically I want to match filenames with a .json extension, but not files that start with . and not list.json.
This is what I came up with (without Java string escapes):
(?i)^([^\.][^list].+|list.+)\.json$
I used an online regex tester, Regexplanet, to try my regex:
http://fiddle.re/x9g86
Everything works fine in the regex tester; however, when I tried it in Java, every name containing the letters l, i, s, or t was excluded, which is very confusing for me.
Can anyone give me some clues?
Many thanks in advance.
I want to match filename with .json extension but not file that start with . and excluding list.json.
I am not sure you need regular expressions for this. I find the following much easier on the eye:
boolean match = s.endsWith(".json") && !s.startsWith(".") && !s.equals("list.json");
You're using a negated character class, [^list], which ignores character order: instead of excluding the word list, it excludes any single character that is l, i, s, or t.
Instead, you want to use a negative lookahead:
(?i)(?!^list\.json$)[^\.].*\.json
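A quick sketch comparing the question's pattern with the lookahead version. The sample file names and the names JsonNameDemo, BROKEN, FIXED and accepts are assumptions made for this demo.

```java
public class JsonNameDemo {
    // Pattern from the question: [^list] is a character class, not a word.
    static final String BROKEN = "(?i)^([^\\.][^list].+|list.+)\\.json$";

    // Lookahead version from this answer, with Java string escapes.
    static final String FIXED = "(?i)(?!^list\\.json$)[^\\.].*\\.json";

    // Hypothetical helper: true for .json names that are not hidden
    // files and are not exactly list.json (case-insensitive).
    static boolean accepts(String fileName) {
        return fileName.matches(FIXED);
    }

    public static void main(String[] args) {
        String[] names = {"list.json", ".json", ".hidden.json",
                          "a.json", "tiles.json", "listing.json"};
        for (String n : names) {
            System.out.println(n + " broken=" + n.matches(BROKEN)
                    + " fixed=" + accepts(n));
        }
    }
}
```

The name "tiles.json" shows the problem: its second character is i, so the negated class in the question's pattern rejects it, while the lookahead version accepts it.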
A negative look-ahead will do it.
(?i)(?!\.|list\.json$).*\.json
(?!\.|list\.json$) is a negative look-ahead checking that what follows is neither a literal dot nor list.json followed by the end of the string.
Code:
String regex = "(?i)(?!\\.|list\\.json$).*\\.json";
System.out.println("list.json".matches(regex)); // false
System.out.println(".json".matches(regex)); // false
System.out.println("a.Json".matches(regex)); // true
System.out.println("abc.json".matches(regex)); // true
But NPE's more readable solution is probably preferred.
I have a line of Java code
System.out.println("...Somtime".matches("^[^a-zA-Z]"));
Which returns false. Why? Can anyone help?
String#matches matches at both ends, so your pattern must cover the complete string. You also don't need the anchor (the caret, ^) at the beginning; it is implicit.
Your first three characters match [^a-zA-Z], while the later characters match [a-zA-Z].
So you probably want:
"...Somtime".matches("[^a-zA-Z]{3}[a-zA-Z]+")
String.matches("regex")
This method matches the regex against the WHOLE string. It returns true if the entire string matches the regex, and false otherwise.
System.out.println("...Somtime".matches("^[^a-zA-Z]{3}[a-zA-Z]+"));
Here {3} covers the three dots, so this returns true.
System.out.println("Somtime".matches("^[^a-zA-Z]"));
This returns false.
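If the intent was only to check whether the string starts with a non-letter (an assumption, since the question does not say), the original pattern can be kept and used with Matcher.find() instead of matches(). A minimal sketch with made-up class and method names:

```java
import java.util.regex.Pattern;

public class LeadingNonLetterDemo {
    // Original pattern from the question, unchanged.
    static final Pattern P = Pattern.compile("^[^a-zA-Z]");

    // Hypothetical helper: find() only needs the pattern to match somewhere,
    // and ^ pins that match to the start of the input.
    static boolean startsWithNonLetter(String s) {
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println("...Somtime".matches("^[^a-zA-Z]")); // whole-string match
        System.out.println(startsWithNonLetter("...Somtime"));  // prefix check
        System.out.println(startsWithNonLetter("Somtime"));
    }
}
```

Matcher.lookingAt() would be another option for the same prefix check, since it anchors at the start of the input without requiring the whole string to match.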