I'm trying to write a regex for a String that must not be blank and must not contain "<".
What am I doing wrong? Thank you.
"(\\A(?!\\s*\\Z))|([^<]+)"
Edit
Put another way: how do I combine this regex
^[^<]+$
with this regex
\\A(?!\\s*\\Z).+
With a regex, you can use one of the following:
\A(?!\s+\z)[^<]+\z
(?U)\A(?!\s+\z)[^<]+\z
The (?U) flag is only necessary when you expect Unicode whitespace in the input, because it changes what \s matches.
In Java, when used with matches, the anchors on both ends are implicit:
text.matches("(?U)(?!\\s+\\z)[^<]+")
With matches, the regex is applied once and must match the full string. Here, it matches
\A - (implicit in matches) - start of string
(?U) - Pattern.UNICODE_CHARACTER_CLASS option enabled so that \s could match any Unicode whitespaces
(?!\\s+\\z) - a negative lookahead that fails the match if the string consists only of one or more whitespace chars up to its very end
[^<]+ - one or more chars other than <
\z - (implicit in matches) - end of string.
See the Java test:
String texts[] = {"Abc <<", " ", "", "abc 123"};
Pattern p = Pattern.compile("(?U)(?!\\s+\\z)[^<]+");
for(String text : texts)
{
Matcher m = p.matcher(text);
System.out.println("'" + text + "' => " + m.matches());
}
Output:
'Abc <<' => false
' ' => false
'' => false
'abc 123' => true
See an online regex test (modified to fit the single multiline string demo environment so as not to cross over line boundaries.)
You can try to use this regex:
[^<\s]+
It matches any char that is neither "<" nor whitespace, one or more times.
Here is the example to test it: https://regex101.com/r/9ptt15/2
However, you can try to solve it without a regular expression:
boolean isValid = s != null && !s.isEmpty() && s.indexOf(" ") == -1 && s.indexOf("<") == -1;
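If inner spaces should remain legal (as in "abc 123"), the non-regex check can test for blankness instead of rejecting every space. A minimal sketch, assuming a hypothetical helper named isValid and assuming that trim() is an acceptable definition of "blank" (it strips only characters up to U+0020, not every Unicode whitespace char):

```java
public class NotBlankNoAngle {
    // Hypothetical helper: non-null, not whitespace-only, and no '<' anywhere.
    static boolean isValid(String s) {
        return s != null && !s.trim().isEmpty() && s.indexOf('<') == -1;
    }

    public static void main(String[] args) {
        String[] texts = {"Abc <<", " ", "", "abc 123"};
        for (String text : texts) {
            // Compare the plain-Java check with the lookahead regex from the answer above.
            boolean viaRegex = text.matches("(?U)(?!\\s+\\z)[^<]+");
            System.out.println("'" + text + "' => " + isValid(text) + " / " + viaRegex);
        }
    }
}
```

For these four sample inputs both checks agree; only "abc 123" is accepted.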
Related
I am trying to verify whether a string matches a regular expression.
The URL format is: key=value&key=value&....
A key or a value can be empty.
My code is:
Pattern patt = Pattern.compile("\\w*=\\w*&(\\w *=\\w*)* ");
Matcher m = patt.matcher(s);
if(m.matches()) return true;
else return false;
When I enter one=1&two=2, it returns false, whereas it should return true.
Any ideas?
The regex you need is
Pattern.compile("(?:\\w+=\\w*|=\\w+)(?:&(?:\\w+=\\w*|=\\w+))*");
See the regex demo. It will match:
(?:\\w+=\\w*|=\\w+) - either 1+ word chars followed with = and then 0+ word chars (obligatory key, optional value) or = followed with 1+ word chars (optional key)
(?:&(?:\\w+=\\w*|=\\w+))* - zero or more of such sequences as above.
Java demo:
String s = "one=1&two=2&=3&tr=";
Pattern patt = Pattern.compile("(?:\\w+=\\w*|=\\w+)(?:&(?:\\w+=\\w*|=\\w+))*");
Matcher m = patt.matcher(s);
if(m.matches()) {
System.out.println("true");
} else {
System.out.println("false");
}
// => true
To allow whitespace, add \\s* where needed. If you also need to allow non-word chars, use, say, [\\w.-] instead of \\w to match word chars, . and - (keep the - at the end of the character class).
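The character-class variant can be sketched as follows. The class name, the constant names and the choice of allowing only . and - as extra characters are assumptions made for illustration:

```java
import java.util.regex.Pattern;

public class QueryStringCheck {
    // Assumption: keys and values may contain word chars, '.' and '-'.
    private static final String TOKEN = "[\\w.-]";
    // Either key=optionalValue, or =value with an empty key.
    private static final String PAIR =
            "(?:" + TOKEN + "+=" + TOKEN + "*|=" + TOKEN + "+)";
    static final Pattern QUERY = Pattern.compile(PAIR + "(?:&" + PAIR + ")*");

    static boolean isValidQuery(String s) {
        // matches() requires the whole string to fit the pattern.
        return QUERY.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"one=1&two=2", "file.name=a-b&x=", "one=1&", "one"};
        for (String s : samples) {
            System.out.println(s + " => " + isValidQuery(s));
        }
    }
}
```

Building the pattern from a TOKEN constant keeps the allowed character set in one place, so widening it later is a one-line change.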
I want to extract the word text2, but my code returns null. Could you please correct it?
String str = "Text SETVAR((&&text1 '&&text2'))";
Pattern patter1 = Pattern.compile("SETVAR\\w+&&(\\w+)'\\)\\)");
Matcher matcher = patter1.matcher(str);
String result = null;
if (matcher.find()) {
result = matcher.group(1);
}
System.out.println(result);
One way to do it is to spell out everything that appears inside the parentheses:
String str = "Text SETVAR((&&text1 '&&text2'))";
Pattern patter1 = Pattern.compile("SETVAR[(]{2}&&\\w+\\s*'&&(\\w+)'[)]{2}");
Matcher matcher = patter1.matcher(str);
String result = "";
if (matcher.find()) {
result = matcher.group(1);
}
System.out.println(result);
See IDEONE demo
You can also use [^()]* inside the parentheses to just get to the value inside single apostrophes:
Pattern patter1 = Pattern.compile("SETVAR[(]{2}[^()]*'&&(\\w+)'[)]{2}");
                                               ^^^^^^
See another demo
Let me break down the regex for you:
SETVAR - match SETVAR literally, then...
[(]{2} - match 2 ( literally, then...
[^()]* - match 0 or more characters other than ( or ) up to...
'&& - match a single apostrophe and two & symbols, then...
(\\w+) - match and capture into Group 1 one or more word characters
'[)]{2} - match a single apostrophe and then 2 ) symbols literally.
Your regex doesn't match your string because you didn't specify the opening parentheses. Also, \\w+ matches only word characters, so it cannot match the space or the & symbols.
Instead, you can use a negated character class, [^']+, which matches one or more characters other than a single quote:
String str = "Text SETVAR((&&text1 '&&text2'))";
Pattern pattern = Pattern.compile("SETVAR\\(\\([^']+'&&(\\w+)'\\)\\)");
Debuggex Demo
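To try the negated-character-class pattern end to end, here is a small self-contained sketch. The class name and the extract helper are hypothetical; the pattern is the one from this answer:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SetvarDemo {
    // [^']+ consumes everything up to the first single quote.
    private static final Pattern SETVAR =
            Pattern.compile("SETVAR\\(\\([^']+'&&(\\w+)'\\)\\)");

    // Returns the quoted variable name, or null when the input does not match.
    static String extract(String input) {
        Matcher m = SETVAR.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("Text SETVAR((&&text1 '&&text2'))"));
    }
}
```

With the sample string from the question, this prints text2.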
I'm currently trying to add support to our application for Japanese and French language encodings. In doing so, I'm trying to create two Pattern matchers to detect tabs-only and spaces-only in a read file, regardless of language encoding.
These will be used to determine what delimiter is used in a file, so they can be processed accordingly.
When I've tried compiling a space pattern
Pattern.compile(" ", Pattern.UNICODE_CHARACTER_CLASS);
I don't see it generating a regex to handle different unicode space values.
eg something like "[\\u00A0\\u2028\\u2029\\u3000\\u00C2\\u009A\\u0041]"
Compilation seems to work properly with the '\s' character set, but that includes tabs and newlines.
How should I be doing this in Java?
UPDATE
So part of the reason this wasn't working was the fact that Japanese web text HAS NO spaces, even though there appear to be spaces. Take the following line from a web import:
実なので説明は不要だろう。その後1987
There are actually no spaces here う。そ. Just three characters.
Fixing this is really the subject of another question, so I have accepted Casimir's answer, as it handled the French case just fine.
You can use a negated character class. Example:
[^\\S \\t]
This means \s minus the space and the tab.
Or you can use a class intersection:
[\\s&&[^ \\t]]
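Both forms can be checked quickly against single-character inputs. This is only a sketch with hypothetical class and method names, and it uses the default flags, so \s covers only the ASCII whitespace chars here:

```java
import java.util.regex.Pattern;

public class WhitespaceClassDemo {
    // Negated class: anything that is not non-whitespace, not a space, not a tab.
    static final Pattern NEGATED = Pattern.compile("[^\\S \\t]");
    // Intersection: whitespace AND (not space, not tab).
    static final Pattern INTERSECTION = Pattern.compile("[\\s&&[^ \\t]]");

    static boolean negated(String s) {
        return NEGATED.matcher(s).matches();
    }

    static boolean intersection(String s) {
        return INTERSECTION.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"\n", "\r", " ", "\t", "a"};
        for (String s : samples) {
            // Print the code point so invisible characters stay readable.
            System.out.println((int) s.charAt(0) + " => "
                    + negated(s) + " / " + intersection(s));
        }
    }
}
```

Line breaks match both classes, while the space, the tab and ordinary letters match neither.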
If I follow your question, you could use something like this for spaces -
Pattern p = Pattern.compile("^[ ]+$", Pattern.UNICODE_CHARACTER_CLASS);
String[] inputs = {" ", " ", " \t", "Hello"};
for (String input : inputs) {
Matcher m = p.matcher(input);
System.out.printf("For input: '%s' = %s%n", input, m.find());
}
Output is
For input: ' ' = true
For input: ' ' = true
For input: ' ' = false
For input: 'Hello' = false
and for tabs
Pattern p = Pattern.compile("^[\t]+$", Pattern.UNICODE_CHARACTER_CLASS);
String[] inputs = {"\t", "\t\t", " \t", "Hello"};
for (String input : inputs) {
Matcher m = p.matcher(input);
System.out.printf("For input: '%s' = %s%n", input, m.find());
}
Output is
For input: ' ' = true
For input: ' ' = true
For input: ' ' = false
For input: 'Hello' = false
Both patterns use +, so at least one character is required; use * instead of + if the empty string should also be accepted. The ^ and $ anchor the match to the start and the end of the input.
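Note that a literal space in a pattern matches only U+0020, and UNICODE_CHARACTER_CLASS does not change that. One option for accepting other space characters while still excluding tabs and newlines is the Unicode general category \p{Zs} (space separators). This swaps in a different technique from the patterns above, and the class and method names are hypothetical:

```java
import java.util.regex.Pattern;

public class SpaceSeparatorDemo {
    // \p{Zs} covers U+0020, U+00A0 (no-break space), U+3000 (ideographic space)
    // and the other space separators, but not tabs or line breaks.
    static final Pattern SPACES_ONLY = Pattern.compile("\\p{Zs}+");

    static boolean isSpacesOnly(String s) {
        // matches() anchors the pattern at both ends of the input.
        return SPACES_ONLY.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {" ", "\u00A0\u3000", " \t", "Hello"};
        for (String input : inputs) {
            System.out.println(input.length() + " char(s) => " + isSpacesOnly(input));
        }
    }
}
```

Whether \p{Zs} is the right set depends on which characters the files really use as delimiters, so treat this as a starting point.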
So I have a string I would like to parse and I can not get my regular expression to work. I am using https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/RegExp as my regular expression guide.
I would like my regular expression to match on any of the following symbols.
+ - * % /
My code is as follows. For the input string D[1]+D[0] it should print true, but it prints false.
String tmp = "D[1]+D[0]";
if(tmp.matches("[\\+\\-\\*\\/\\%]"))
System.out.println("true");
else
System.out.println("false");
Any ideas?
This is because matches wants the entire string to be matched, not just any part of it.
Inside square brackets, +, *, / and % do not need escaping. A hyphen is literal only when it comes first or last; anywhere else it defines a range.
String str = "D[1]+D[0]";
Pattern p = Pattern.compile("[-+*/%]");
Matcher m = p.matcher(str);
if (m.find()) {
System.out.println("Found: " + m.group());
}
matches() must match the entire input, but all you need do is add .* to each end:
if (tmp.matches(".*[-+*/%].*"))
Note: These characters don't need escaping between [] as long as the hyphen is first or last.
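The difference between matches() and find(), and the reason the hyphen position matters, can be seen in a short sketch. The class, constant and method names are made up for illustration:

```java
import java.util.regex.Pattern;

public class OperatorCheck {
    // Hyphen first, so it is a literal and not a range operator.
    static final Pattern OPERATOR = Pattern.compile("[-+*/%]");
    // Hyphen in the middle: a range from '+' to '/', which also covers ',', '-' and '.'.
    static final Pattern RANGE_PITFALL = Pattern.compile("[+-/]");

    static boolean containsOperator(String s) {
        // find() looks for the pattern anywhere in the input.
        return OPERATOR.matcher(s).find();
    }

    public static void main(String[] args) {
        String tmp = "D[1]+D[0]";
        System.out.println(tmp.matches("[-+*/%]"));                // whole-string match
        System.out.println(containsOperator(tmp));                 // substring search
        System.out.println(RANGE_PITFALL.matcher("1.5").find());   // '.' falls in the range
    }
}
```

This prints false, true, true: the whole string is not a single operator, the string does contain one, and the accidental range accepts the dot in "1.5".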
I need a regular expression to match any string other than none.
I tried using
regular exp ="^[^none]$",
But it does not work.
If you are matching a String against a specific word in Java you should use equals(). In this case you want to invert the match so your logic becomes:
if(!theString.equals("none")) {
// do stuff here
}
Much less resource hungry, and much more intuitive.
If you need to match a String which contains the word "none", you are probably looking for something like:
if(theString.matches(".*\\bnone\\b.*")) {
/* matches() must cover the whole string, hence the surrounding .*
* The substring "none" must be enclosed between “word boundaries”,
* so this will not match, for example, "nonetheless".
*/
}
Or, if you can be fairly certain that “word boundaries” mean a specific delimiter, you can still avoid regular expressions by using the indexOf() method:
int i = theString.indexOf("none");
if(i > -1) {
if(i > 0) {
// check theString.charAt(i - 1) to see if it is a word boundary
// e.g.: whitespace
}
// the 4 is because of the fact that "none" is 4 characters long.
if((theString.length() - i - 4) > 0) {
// check theString.charAt(i + 4) to see if it is a word boundary
// e.g.: whitespace
}
}
else {
// not found.
}
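If a regex is acceptable after all, the boundary checks can be left to \b together with Matcher.find(), which searches inside the string instead of requiring a full match. A minimal sketch with a hypothetical class and helper name:

```java
import java.util.regex.Pattern;

public class WordNoneDemo {
    private static final Pattern NONE_WORD = Pattern.compile("\\bnone\\b");

    // True when "none" occurs as a separate word anywhere in the input.
    static boolean containsWordNone(String s) {
        return NONE_WORD.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {"none", "there are none left", "nonetheless"};
        for (String s : samples) {
            System.out.println(s + " => " + containsWordNone(s));
        }
    }
}
```

The first two samples contain the word, while "nonetheless" does not, because no word boundary follows "none" there.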
You can use the regular expression (?!^none$).*. See this question for details: Regex inverse matching on specific string?
The reason "^[^none]$" doesn't work is that [^none] is a character class: the pattern matches exactly one character that is not "n", "o", or "e", so it only accepts single-character strings.
Of course, it would be easier to just use String.equals like so: !"none".equals(testString).
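The two approaches can be compared side by side. This is a sketch with hypothetical names; note that they differ for strings containing line breaks, because . does not match line terminators by default:

```java
public class NotNoneDemo {
    // Regex approach: the lookahead rejects the exact string "none".
    static boolean viaRegex(String s) {
        return s.matches("(?!none$).*");
    }

    // Plain comparison: anything other than "none" is accepted.
    static boolean viaEquals(String s) {
        return !"none".equals(s);
    }

    public static void main(String[] args) {
        String[] samples = {"none", "nonetheless", "", "None"};
        for (String s : samples) {
            System.out.println("'" + s + "' => " + viaRegex(s) + " / " + viaEquals(s));
        }
    }
}
```

For these samples both methods agree: only "none" itself is rejected.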
Actually this is the regex to match all words except "word":
Pattern regex = Pattern.compile("\\b(?!word\\b)\\w+\\b");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
// matched text: regexMatcher.group()
// match start: regexMatcher.start()
// match end: regexMatcher.end()
}
You must use word boundaries so that "word" is not contained in other words.
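A runnable version of that loop, collecting the matches into a list, might look like this. The class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AllWordsExcept {
    private static final Pattern NOT_WORD = Pattern.compile("\\b(?!word\\b)\\w+\\b");

    // Collects every word in the input except the exact word "word".
    static List<String> wordsExceptWord(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = NOT_WORD.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(wordsExceptWord("word wordy sword word"));
    }
}
```

For the sample input this prints [wordy, sword]: words that merely contain "word" are kept, and only the standalone "word" is skipped.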
Explanation:
"
\b # Assert position at a word boundary
(?! # Assert that it is impossible to match the regex below starting at this position (negative lookahead)
word # Match the characters “word” literally
\b # Assert position at a word boundary
)
\w # Match a single character that is a “word character” (letters, digits, etc.)
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\b # Assert position at a word boundary
"
This is the regex you are looking for:
String s = "your string";
Pattern p = Pattern.compile("^(?!none$).*$");
Matcher m = p.matcher(s);
System.out.println(s + ": " + (m.matches() ? "Match" : "NO Match"));
That said, if you are not forced to use a regex that matches everything but "none", the simplest, fastest and clearest option is to match "none" itself:
Pattern p = Pattern.compile("^none$");
Then, you just exclude the matches.
String s = "your string";
Matcher m = p.matcher(s);
System.out.println(s + ": " + (m.matches() ? "NO Match" : "Match"));