Why negation is not working here? - java

String text = "$.example(\"This is the tes't\")";
final String quoteRegex = "example.*?(\"[^is].*?\")";
Matcher matcher0 = Pattern.compile(quoteRegex).matcher(text);
while (matcher0.find()) {
    System.out.println(matcher0.group(1));
}
It prints "This is the tes't". I expected no result because of the negation [^is], which I thought meant "do not match is". Why does it return "This is the tes't"?
Similarly, the regex example.*?(\".*?\") returns "This is the tes't", but example(\".*?\") does not. Why?

[^is] does not mean "do not match is"; it is a character class that matches any single character other than i or s. In your example the character after the " is T, so it matches.
If you want to match zero or more characters and exclude the string "is", you can do:
example.*?(\"(?:(?!is).)*?\")
If you only want to not match is immediately after the " (which is not what your example has):
example.*?(\"(?!is).*?\")
You also ask why example(\".*?\") does not match; that regex only matches if there is a " immediately after example, while your example has a ( between. You could match the ( but still capture the quoted string with:
example\((\"...
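To make the difference concrete, here is a minimal sketch that runs the patterns discussed above against the text from the question. The class name NegationDemo and the helper firstGroup are made up for illustration; only the regexes and the input come from the question and answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original question.
final class NegationDemo {
    static final String TEXT = "$.example(\"This is the tes't\")";

    // Returns capture group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // [^is] only rejects a single character i or s right after the quote.
        System.out.println(firstGroup("example.*?(\"[^is].*?\")", TEXT));        // "This is the tes't"
        // Tempered dot: no position inside the quotes may start the string "is".
        System.out.println(firstGroup("example.*?(\"(?:(?!is).)*?\")", TEXT));   // null
        // Lookahead only forbids "is" immediately after the opening quote.
        System.out.println(firstGroup("example.*?(\"(?!is).*?\")", TEXT));       // "This is the tes't"
        // No .*? to skip the "(" between example and the quote.
        System.out.println(firstGroup("example(\".*?\")", TEXT));                // null
    }
}
```

Note that group 1 includes the surrounding double quotes, because the parentheses in these patterns wrap the quote characters as well.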

What is the Regex for decimal numbers in Java?

I am not sure what the correct regex for a literal period is in Java. Here are some of my attempts. Sadly, they all seem to match any character.
String regex = "[0-9]*[.]?[0-9]*";
String regex = "[0-9]*['.']?[0-9]*";
String regex = "[0-9]*["."]?[0-9]*";
String regex = "[0-9]*[\.]?[0-9]*";
String regex = "[0-9]*[\\.]?[0-9]*";
String regex = "[0-9]*.?[0-9]*";
String regex = "[0-9]*\.?[0-9]*";
String regex = "[0-9]*\\.?[0-9]*";
But what I want is the actual "." character itself. Anyone have an idea?
What I'm trying to do actually is to write out the regex for a non-negative real number (decimals allowed). So the possibilities are: 12.2, 3.7, 2., 0.3, .89, 19
String regex = "[0-9]*['.']?[0-9]*";
Pattern pattern = Pattern.compile(regex);
String x = "5p4";
Matcher matcher = pattern.matcher(x);
System.out.println(matcher.find());
The last line is supposed to print false but prints true anyway. I think my regex is wrong though.
Update
To match a non-negative decimal number that contains a decimal point, you can use this regex:
^\d*\.\d+|\d+\.\d*$
or in Java syntax: "^\\d*\\.\\d+|\\d+\\.\\d*$"
String regex = "^\\d*\\.\\d+|\\d+\\.\\d*$";
String string = "123.43253";
if (string.matches(regex))
    System.out.println("true");
else
    System.out.println("false");
Note that both alternatives require a dot, so a plain integer such as 19 does not match this pattern.
Explanation for your original regex attempts:
[0-9]*\.?[0-9]*
With Java string escaping it becomes:
"[0-9]*\\.?[0-9]*";
If you need to make the dot mandatory, remove the ?:
[0-9]*\.[0-9]*
But this also accepts a lone dot with no digits. If you want digits to be mandatory on both sides, use + (one or more) instead of * (zero or more). In that case it becomes:
[0-9]+\.[0-9]+
If you are on Kotlin, you can use an extension function:
fun String.findDecimalDigits() =
    Pattern.compile("^[0-9]*\\.?[0-9]*").matcher(this).run { if (find()) group() else "" }!!
Your initial understanding was probably right, but you were thrown off because matcher.find() looks for the first valid match anywhere within the string, and all of your patterns can match a zero-length string.
I would suggest "^([0-9]+\\.?[0-9]*|\\.[0-9]+)$"
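As a rough check, the sketch below runs the suggested pattern against the values listed in the question and contrasts it with find() on the unanchored pattern. The class name DecimalCheck and the method isNonNegativeDecimal are invented for this example.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class DecimalCheck {
    // The pattern suggested above: digits with an optional fraction, or a leading dot followed by digits.
    private static final Pattern DECIMAL = Pattern.compile("^([0-9]+\\.?[0-9]*|\\.[0-9]+)$");

    static boolean isNonNegativeDecimal(String s) {
        return s != null && DECIMAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"12.2", "3.7", "2.", "0.3", ".89", "19", "5p4", ".", ""};
        for (String s : samples) {
            System.out.println("'" + s + "' -> " + isNonNegativeDecimal(s));
        }
        // find() succeeds on "5p4" with the unanchored pattern because it
        // only needs some substring (here "5") to match.
        System.out.println(Pattern.compile("[0-9]*\\.?[0-9]*").matcher("5p4").find()); // true
    }
}
```

The first six samples are accepted and the last three are rejected, which is the behaviour the question asked for.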
There are actually two ways to match a literal dot. One is backslash-escaping, as in \\., and the other is to put it inside a character class (square brackets), as in [.]. Most special characters become literal inside square brackets, including the dot. Using \\. shows your intention more clearly than [.] if all you want is to match a literal dot. Use [] when you need to match one of several things; for example, the regex [\\d.] matches a single digit or a literal dot.
I have tested all the cases.
public static boolean isDecimal(String input) {
    return Pattern.matches("^[-+]?\\d*[.]?\\d+|^[-+]?\\d+[.]?\\d*", input);
}

Using java regex how to find particular word any where in the string?

Using Java regex, how can I find a particular word anywhere in a string? I need to check whether the string "Google" contains the letters of the word "gooe".
For example:
String: Google
Word to find: gooe
The string "Google" contains all the characters g, o, o, e, so it should return true.
If the string is "wikipedia" and the word to find is "gooe", it should return false.
How do I form a regex for this scenario?
I've just tested a regex that makes use of lookahead:
(?=^.*g)(?=^.*o)(?=^.*e)
It should return true for all strings that contain g, o and e, while returning false if any of these characters is missing.
If you want to match the whole string, you can use:
"^(?=.*e)(?=.*o.*o)(?=.*g).*"
You have to build a positive lookahead for each letter. With gooe as the search term, the regex would be:
(?i)(?=.*g)(?=.*o)(?=.*o)(?=.*e)
We clearly have two identical lookaheads. Both are satisfied by the same o, so one of them is redundant and the pattern does not actually require two o's. You can remove duplicate letters from the search term before building the final pattern. (?i) turns on the case-insensitive flag.
String term = "Gooe"; // Search term
String word = "google"; // Against word `Google`
String pattern = "(?i)(?=.*" + String.join(")(?=.*", term.split("(?!^)")) + ")";
Pattern regex = Pattern.compile(pattern);
Matcher match = regex.matcher(word);
if (match.find()) {
// Matched
}
If order is important and, when looking for two o's, both of them must be present, then the regex would be:
(?i).*?g.*?o.*?o.*?e
Java:
String pattern = "(?i).*?" + String.join(".*?", term.split("(?!^)"));
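The two approaches can be compared side by side. This is only a sketch: the class LetterSearch and its method names are hypothetical, and wrapping each letter in Pattern.quote is an extra precaution (not in the answer above) so that a search term containing regex metacharacters is treated literally.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class LetterSearch {
    // Builds (?i)(?=.*g)(?=.*o)(?=.*o)(?=.*e): every letter must occur somewhere, in any order.
    static boolean containsAllLetters(String word, String term) {
        StringBuilder sb = new StringBuilder("(?i)");
        for (char c : term.toCharArray()) {
            sb.append("(?=.*").append(Pattern.quote(String.valueOf(c))).append(")");
        }
        return Pattern.compile(sb.toString()).matcher(word).find();
    }

    // Builds (?i).*?g.*?o.*?o.*?e: the letters must occur in this order, repeats included.
    static boolean containsLettersInOrder(String word, String term) {
        StringBuilder sb = new StringBuilder("(?i)");
        for (char c : term.toCharArray()) {
            sb.append(".*?").append(Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(sb.toString()).matcher(word).find();
    }

    public static void main(String[] args) {
        System.out.println(containsAllLetters("Google", "gooe"));      // true
        System.out.println(containsAllLetters("wikipedia", "gooe"));   // false
        System.out.println(containsAllLetters("Gone", "gooe"));        // true: one o satisfies both lookaheads
        System.out.println(containsLettersInOrder("Gone", "gooe"));    // false: a second o is required
    }
}
```

The "Gone" case shows why the lookahead version cannot count repeated letters, while the ordered version can.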

Replacing Strings with a number in it without a for loop

I currently have this code:
for (int i = 1; i <= this.max; i++) {
    in = in.replace("{place" + i + "}", this.getUser(i)); // Get the place of a user.
}
It works well, but I would like to simplify it using pattern matching, so I used this code to check whether a placeholder matches:
System.out.println(StringUtil.matches("{place5}", "\\{place\\d\\}"));
StringUtil's matches:
public static boolean matches(String string, String regex) {
    if (string == null || regex == null) return false;
    Pattern compiledPattern = Pattern.compile(regex);
    return compiledPattern.matcher(string).matches();
}
This returns true. The next part is where I need help: replacing {place5} so that I can parse the number. I could strip "{place" and "}", but as far as I'm aware that no longer works if the string contains several placeholders (for example "{place5} {username}"). If there is a simple way to do this, please let me know; otherwise I can stick with the for loop.
then comes the next part I need help with, replacing the {place5} so I can parse the number
In order to obtain the number after {place, you can use
s = s.replaceAll(".*\\{place(\\d+)}.*", "$1");
The regex matches an arbitrary number of characters before the string we are searching for, then {place, then it matches and captures one or more digits with (\d+), and then it matches the rest of the string with .*. Note that if the string contains newline characters, you should prepend (?s) to the pattern. $1 in the replacement pattern "restores" the value we need.
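If the goal is to replace every {placeN} in one pass instead of looping over all indices, one possible approach is Matcher.appendReplacement. The sketch below assumes a lookup function standing in for the question's this.getUser(i); the class PlaceholderReplacer and the method replacePlaces are hypothetical names.

```java
import java.util.function.IntFunction;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class PlaceholderReplacer {
    private static final Pattern PLACE = Pattern.compile("\\{place(\\d+)\\}");

    // userLookup stands in for getUser(i) from the question (an assumption).
    static String replacePlaces(String in, IntFunction<String> userLookup) {
        Matcher m = PLACE.matcher(in);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Group 1 holds the digits; parseInt would throw on values beyond the int range.
            int place = Integer.parseInt(m.group(1));
            // quoteReplacement keeps $ and \ in the user name from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(userLookup.apply(place)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replacePlaces("{place5} {username} {place12}", i -> "user" + i));
        // prints: user5 {username} user12
    }
}
```

Other placeholders such as {username} are left untouched, because only text matched by the {placeN} pattern is rewritten.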

Java regular expression truncates string

I have the following Java replaceAll call with a regular expression that replaces variables of the form ${var} with zero:
String s = "( 200828.22 +400000.00 ) / ( 2.00 + ${16!*!8!1} ) + 200828.22 + ${16!*!8!0}";
s = s.replaceAll("\\$\\{.*\\}", "0");
The problem is that the resulting string s is:
"( 200828.22 +400000.00 ) / ( 2.00 + 0"
What's wrong with this code?
Change your regex to
\\$\\{.*?\\}
(note the ? added after the *).
* is greedy: the engine repeats it as many times as it can. So after matching ${, the .* consumes everything up to the end of the string, and the engine then backtracks until a } can be matched, which is the last } in the string.
For example, if you have the regex
\\{.*\\}
and the string
"{this is} a {test} string"
it will match as follows:
{ matches the first {
.* matches everything up to the end of the string (through the final g)
the engine then fails to match }, because no characters are left
it backtracks one character at a time until .* ends just after the t of test, where the next } can match, so the overall match is "{this is} a {test}"
To make it non-greedy, add a ? after the *. The quantifier then becomes lazy and stops as soon as the first } is encountered.
As mentioned in the comments, an alternative would be [^}]*. It matches anything that's not } (since it's placed in a character class).
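The three variants can be compared directly on the string from the question. This is a small sketch; the class GreedyDemo and its method names are made up for the example.

```java
// Hypothetical demo class for illustration only.
final class GreedyDemo {
    static final String INPUT =
        "( 200828.22 +400000.00 ) / ( 2.00 + ${16!*!8!1} ) + 200828.22 + ${16!*!8!0}";

    // Greedy: .* runs to the last } in the string.
    static String greedy(String s) {
        return s.replaceAll("\\$\\{.*\\}", "0");
    }

    // Lazy: .*? stops at the first } after each ${.
    static String lazy(String s) {
        return s.replaceAll("\\$\\{.*?\\}", "0");
    }

    // Negated class: [^}]* can never run past a }.
    static String negatedClass(String s) {
        return s.replaceAll("\\$\\{[^}]*\\}", "0");
    }

    public static void main(String[] args) {
        System.out.println(greedy(INPUT));        // ( 200828.22 +400000.00 ) / ( 2.00 + 0
        System.out.println(lazy(INPUT));          // ( 200828.22 +400000.00 ) / ( 2.00 + 0 ) + 200828.22 + 0
        System.out.println(negatedClass(INPUT));  // same as the lazy result
    }
}
```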

How to find substring of a string with whitespaces in Java?

I want to check whether a string contains a particular substring, and I am using contains() for it.
The problem is with spaces.
Example: str1 = "c not in(5,6)"
I want to check whether str1 contains NOT IN, so I am using str1.contains("not in").
The problem is that the number of spaces between NOT and IN is not fixed; there could be 5 spaces, for example.
How can I find a substring like not in with any number of spaces in between?
Use a regular expression (Pattern) to get a Matcher to match your string.
The regexp should be "not\\s+in" ("not", followed by a number of space-characters, followed by "in"):
public static void main(String[] args) {
Matcher m = Pattern.compile("not\\s+in").matcher("c not in(5,6)");
if (m.find())
System.out.println("matches");
}
Note that there is a String method called matches(String regexp). You can use regular expression ".*not\\s+in.*" to get the match but it's not really a good way to perform the pattern matching.
You should use a regex: "not\\s+in"
String s = "c not in(5,6)";
Matcher matcher = Pattern.compile("not\\s+in").matcher(s);
System.out.println(matcher.find());
Explanation: \\s+ matches any kind of whitespace (tabs included), repeated at least once.
If you want only spaces, without tabs, change your regex to "not +in"
Use the String.matches() method, which checks whether the entire string matches a regular expression.
In your case:
String str1 = "c not in(5,6)";
if (str1.matches(".*not\\s+in.*")) {
// do something
// the string contains "not in"
}
Case-insensitive matching: (?i)
Letting the dot . also match newlines: (?s)
str1.matches("(?is).*not\\s+in.*")
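Putting these pieces together, a reusable check could look like the sketch below. The class NotInFinder and the method containsNotIn are hypothetical names, and the \b word boundaries are an added assumption (not in the answers above) so that words such as "knot" do not count as "not".

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
final class NotInFinder {
    // Compiled once; CASE_INSENSITIVE plays the role of the inline (?i) flag.
    private static final Pattern NOT_IN =
        Pattern.compile("\\bnot\\s+in\\b", Pattern.CASE_INSENSITIVE);

    static boolean containsNotIn(String s) {
        // find() searches anywhere in the string, so no leading or trailing .* is needed.
        return s != null && NOT_IN.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsNotIn("c not in(5,6)"));        // true
        System.out.println(containsNotIn("c NOT     IN (5,6)"));   // true
        System.out.println(containsNotIn("c notin(5,6)"));         // false
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex on every call, which String.matches() would do.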
Please try the following. Note that indexOf only finds "not in" written with exactly one space:
int result = str1.indexOf("not in");
if (result != -1) {
    // It contains "not in"
} else {
    // It does not contain "not in"
}
In general you can just do this:
if (string.indexOf("substring") > -1)... //It's there
