I am new to regexp. I need to validate a string, but when I use my current attempt it is always returning false.
Rules:
A text matcher like "polygon(( ))"
number matcher like X Y, where x and y can be any double numbers
as many X Y pairs, separated by comma.
eg:
PolyGoN((
-74.0075783459999 40.710775696,
-74.007375926 40.710655064,
-74.0074640719999 40.7108592490001,
-74.0075783459999 40.710775696))
Here is the code that I used:
String inputString = "POLYGON((-74.0075783459999 40.710775696, -74.007375926 40.710655064, -74.0072836009999 40.710720973, -74.0075783459999 40.710775696))";
String regexp = "polygon[\\((][(\\-?\\d+(\\.\\d+)?)\\s*(\\-?\\d+(\\.\\d+)?)]*[\\))]";
Pattern pattern = Pattern.compile(regexp, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(inputString);
boolean result = matcher.matches();
[\\((] is an incorrect way of specifying that you need ( twice. No matter how many times you repeat a character inside a character class [], it counts only once. Since it's the same character repeating, you don't need a character class at all: use the escaped character \\( with a quantifier {2} that says how many times it should repeat. So you need \\({2} at the start and \\){2} at the end.
Another problem with your use of [] is that you used it to denote a group of double pairs that repeats (using *). Use () to group part of a match; [] only ever denotes a character class. Note that you did group the individual doubles correctly.
Next, you've forgotten to match the commas separating the double pairs. I've included that as (,\\s*) in my regex, inside the repeated group, so every pair except the last must be followed by a comma. The hyphen - (the negative sign here) doesn't need to be escaped, since it's outside a character class [] and the regex parser therefore knows you're not using it to specify a character range.
The corrected regex is (indented for clarity; join the lines before compiling, and double each backslash inside a Java string literal):
polygon\({2}\s*(
(-?\d+(\.\d+)?)\s*(-?\d+(\.\d+)?)(,\s*)
)*(-?\d+(\.\d+)?)\s*(-?\d+(\.\d+)?)
\s*\){2}
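As a rough sketch of how the corrected pattern could be wired into Java: the class name PolygonValidator and the helper isValid are made up for illustration, and this version requires at least one whitespace character between X and Y (\s+ rather than the \s* above), which is my own tightening so that a single number such as 1234 is not read as a pair.

```java
import java.util.regex.Pattern;

class PolygonValidator {
    // One signed decimal number, e.g. -74.0075 or 40
    private static final String NUM = "-?\\d+(?:\\.\\d+)?";
    // One "X Y" pair; assumption: X and Y are separated by at least one whitespace
    private static final String PAIR = NUM + "\\s+" + NUM;
    // polygon(( pair , pair , ... pair )) with optional whitespace around commas
    private static final Pattern POLYGON = Pattern.compile(
        "polygon\\({2}\\s*" + PAIR + "(?:\\s*,\\s*" + PAIR + ")*\\s*\\){2}",
        Pattern.CASE_INSENSITIVE);

    // matches() requires the whole input to fit the pattern
    static boolean isValid(String s) {
        return POLYGON.matcher(s).matches();
    }

    public static void main(String[] args) {
        String input = "POLYGON((-74.0075783459999 40.710775696, "
            + "-74.007375926 40.710655064, -74.0075783459999 40.710775696))";
        System.out.println(isValid(input));
    }
}
```

Building the pattern from the NUM and PAIR pieces keeps the repeated number sub-pattern in one place, which makes mistakes like the missing comma easier to spot.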
m|Polygon\(\(((\s*-?\d+\.\d+\s*){2},)*(\s*-?\d+\.\d+\s*){2}\)\)|i
I have this string "u2x4m5x7" and I want replace all the characters but a number followed by an x with "".
The output should be:
"2x5x"
Just the number followed by the x.
But I am getting this:
"2x45x7"
I'm doing this:
String string = "u2x4m5x7";
String s = string.replaceAll("[^0-9+x]","");
Please help!!!
Here is a one-liner using String#replaceAll with two replacements:
System.out.println(string.replaceAll("\\d+(?!x)", "").replaceAll("[^x\\d]", ""));
Here is another working solution. We can iterate the input string using a formal pattern matcher with the pattern \d+x. This is the whitelist approach, of trying to match the variable combinations we want to keep.
String input = "u2x4m5x7";
Pattern pattern = Pattern.compile("\\d+x");
Matcher m = pattern.matcher(input);
StringBuilder b = new StringBuilder();
while(m.find()) {
b.append(m.group(0));
}
System.out.println(b);
This prints:
2x5x
It looks like this would be much simpler by searching to get the match rather than replacing all non matches, but here is a possible solution, though it may be missing a few cases:
\d(?!x)|[^0-9x]|(?<!\d)x
https://regex101.com/r/v6udph/1
Basically it will:
\d(?!x) -- remove any digit not followed by an x
[^0-9x] -- remove all non-x/digit characters
(?<!\d)x -- remove all x's not preceded by a digit
But then again, grabbing from \dx would be much simpler
Capture what you need in group 1, or else match any single character, and replace each match with $1 (which is empty whenever the |. alternative matched).
String s = string.replaceAll("(\\d+x)|.", "$1");
See this demo at regex101 or a Java demo at tio.run
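To compare the capture-and-replace idea with the find loop on an input with a multi-digit number, here is a small sketch. The class name KeepNumberX and both method names are hypothetical; the two patterns are the ones quoted above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeepNumberX {
    // Replace style: group 1 captures what to keep, "." swallows everything else.
    // Note: "." does not match line terminators unless DOTALL is enabled.
    static String byReplace(String s) {
        return s.replaceAll("(\\d+x)|.", "$1");
    }

    // Whitelist style: collect every "digits followed by x" match.
    static String byFind(String s) {
        Matcher m = Pattern.compile("\\d+x").matcher(s);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(byReplace("u2x4m5x7"));
        System.out.println(byFind("a12xb3"));
    }
}
```

Both helpers keep a multi-digit number such as 12x intact, because \d+ consumes the whole run of digits before the x.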
In my Java app, I have the following character sequence: b"2 (any single character, followed by a double quote, followed by a single-digit number)
I need to replace the double quote with a single quote character.
I'm trying this:
Pattern p = Pattern.compile(".\"d");
Matcher m = p.matcher(initialOutput);
String replacement = m.replaceAll(".'d");
This does not seem to do anything.
What is the right way of doing this?
First off, d represents a literal character. You're looking for \d, which represents a numeric digit.
The other issue is that you're replacing variable characters with the string literal ".'d". One solution is to capture the variable portions and reference them in the replacement:
String replacement = initialOutput.replaceAll("(.)\"(\\d)", "$1'$2");
Another approach is to use lookarounds to check the surrounding characters without actually matching them for replacement:
String replacement = initialOutput.replaceAll("(?<=.)\"(?=\\d)", "'");
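The two variants can be put side by side in a short sketch (QuoteFixer and its method names are invented for this example). One difference worth knowing: the capture-group version consumes the characters around the quote, so when two qualifying quotes are only one character apart it replaces just the first, whereas the lookaround version replaces both.

```java
class QuoteFixer {
    // Capture the neighbours and write them back around the new quote.
    static String withGroups(String s) {
        return s.replaceAll("(.)\"(\\d)", "$1'$2");
    }

    // Only the quote itself is matched; the neighbours are merely checked.
    static String withLookarounds(String s) {
        return s.replaceAll("(?<=.)\"(?=\\d)", "'");
    }

    public static void main(String[] args) {
        System.out.println(withGroups("b\"2"));
        System.out.println(withLookarounds("b\"2"));
    }
}
```

For the single occurrence described in the question, either method gives the same result.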
I am trying to match a string that looks like "WIFLYMODULE-xxxx" where the x can be any digit. For example, I want to be able to find the following...
WIFLYMODULE-3253
WIFLYMODULE-1585
WIFLYMODULE-1632
I am currently using
final Pattern q = Pattern.compile("[WIFLYMODULE]-[0-9]{3}");
but I am not picking up the string that I want. So my question is, why is my regular expression not working? Am I going about it in the wrong way?
You should use (...) instead of [...]; [...] denotes a character class.
With a "character class", also called a "character set", you tell the regex engine to match only one out of several characters.
(WIFLYMODULE)-[0-9]{4}
Note: in this case the group is not needed at all. (...) creates a capturing group that you can access with Matcher.group(index).
Important Note: Use \b as word boundary to match the correct word.
\\bWIFLYMODULE-[0-9]{4}\\b
Sample code:
String str = "WIFLYMODULE-3253 WIFLYMODULE-1585 WIFLYMODULE-1632";
Pattern p = Pattern.compile("\\bWIFLYMODULE-[0-9]{4}\\b");
Matcher m = p.matcher(str);
while (m.find()) {
System.out.println(m.group());
}
output:
WIFLYMODULE-3253
WIFLYMODULE-1585
WIFLYMODULE-1632
The regex should be:
"WIFLYMODULE-[0-9]{4}"
The square brackets mean: one of the characters listed inside. Also, you were matching three digits instead of four. So you were matching strings like (where xxx is a number of three digits):
W-xxx, I-xxx, F-xxx, L-xxx, Y-xxx, M-xxx, O-xxx, D-xxx, U-xxx, L-xxx, E-xxx
You had it match on 3 digits instead of 4. And putting WIFLYMODULE inside [] makes it match on only one of those characters.
final Pattern q = Pattern.compile("WIFLYMODULE-[0-9]{4}");
[...] means that one character out of the ones in the bracket must match and not the string within it.
You, however, want to match WIFLYMODULE, thus, you have to use Pattern.compile("WIFLYMODULE-[0-9]{3}"); or Pattern.compile("(WIFLYMODULE)-[0-9]{3}");
{n} means that the character (or group) must match n-times. In your example you need 4 instead of 3: Pattern.compile("WIFLYMODULE-[0-9]{4}");
This way will work:
final Pattern q = Pattern.compile("WIFLYMODULE-[0-9]{4}");
The pattern breaks down to:
WIFLYMODULE- The literal string WIFLYMODULE-
[0-9]{4} Exactly four digits
What you had was:
[WIFLYMODULE] Any one of the characters in WIFLYMODULE
- The literal string -
[0-9]{3} Exactly three digits
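To see the difference in practice, the following sketch runs the original and the corrected pattern against the same input. WiflyDemo and firstMatch are hypothetical names chosen for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WiflyDemo {
    // Returns the first substring that matches regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "WIFLYMODULE-3253";
        // Character class: only the single letter before the hyphen is matched.
        System.out.println(firstMatch("[WIFLYMODULE]-[0-9]{3}", input));
        // Literal text plus exactly four digits.
        System.out.println(firstMatch("WIFLYMODULE-[0-9]{4}", input));
    }
}
```

With the original pattern, find() does report a match, but only the fragment E-325, which explains why the expected string never shows up.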
I tried it like this, but it outputs false. Please help me.
String inputString1 = "dfgh";// but not dFgH
String regex = "[a-z]";
boolean result;
Pattern pattern1 = Pattern.compile(regex);
Matcher matcher1 = pattern1.matcher(inputString1);
result = matcher1.matches();
System.out.println(result);
Your solution is nearly correct. The regex must be "[a-z]+": the quantifier means that you are matching one or more lowercase characters rather than exactly one. Note that the most thorough solution, which matches any lowercase char in Unicode and not only those from the English alphabet, is this:
"\\p{javaLowerCase}+"
Additionally note that you can achieve this with much less code:
System.out.println(input.matches("\\p{javaLowerCase}*"));
(here I am alternatively using the * quantifier, which means zero or more. Choose according to the desired semantics.)
You are almost there, except that you are only checking for one character.
String regex = "[a-z]+";
With matches(), the above regex checks that the input string consists of one or more characters from a to z.
Read about how to use quantifiers in regex.
Use this pattern :
String regex = "[a-z]*";
Your current pattern only works if the tested string is one char only.
Note that it does exactly what it looks like: it doesn't really test whether the string is in lowercase, but whether it contains no chars outside [a-z]. This means it returns false for lowercase strings like "àbcd". A correct solution in a Unicode world would be to use the Character.isLowerCase() function and loop over the string.
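A minimal sketch of that loop, using a stream over code points instead of an explicit for loop. The class and method names are hypothetical, and treating the empty string as "not lowercase" is an assumption you may want to change.

```java
class LowerCaseCheck {
    // True if the string is non-empty and every code point is a lowercase letter.
    // Digits, spaces and punctuation are not lowercase letters, so they fail too.
    static boolean isAllLowerCase(String s) {
        if (s.isEmpty()) {
            return false; // assumption: an empty string does not count
        }
        return s.codePoints().allMatch(Character::isLowerCase);
    }

    public static void main(String[] args) {
        System.out.println(isAllLowerCase("dfgh"));
        System.out.println(isAllLowerCase("dFgH"));
        System.out.println(isAllLowerCase("\u00e0bcd")); // "àbcd"
    }
}
```

Iterating over code points rather than chars also handles characters outside the Basic Multilingual Plane.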
It should be
^[a-z]+$
^ is the start of string
$ is the end of string
[a-z]+ matches 1 to many lowercase characters
You need to use quantifiers such as *, which matches 0 to many, or +, which matches 1 to many repetitions of the preceding character or range.
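The effect of the quantifier is easy to check with String.matches, which always tests the whole string; because of that, the ^ and $ anchors are harmless but not required there. The sketch below uses a hypothetical helper isLowerAscii.

```java
class QuantifierDemo {
    // Whole-string test: one or more characters in the range a to z.
    static boolean isLowerAscii(String s) {
        return s.matches("[a-z]+");
    }

    public static void main(String[] args) {
        System.out.println("dfgh".matches("[a-z]"));    // no quantifier: exactly one char expected
        System.out.println(isLowerAscii("dfgh"));       // + : one or more
        System.out.println("".matches("[a-z]*"));       // * : zero or more, so empty passes
        System.out.println("dfgh".matches("^[a-z]+$")); // anchors are redundant with matches()
    }
}
```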
Why bother with a regular expression ?
String inputString1 = "dfgh";// but not dFgH
boolean result = inputString1.toLowerCase().equals( inputString1 );
I am trying to search this string:
,"tt" : "ABC","r" : "+725.00","a" : "55.30",
For:
"r" : "725.00"
And here is my current code:
Pattern p = Pattern.compile("([r]\".:.\"[+|-][0-9]+.[0-9][0-9]\")");
Matcher m = p.matcher(raw_string);
I've been trying multiple variations of the pattern, and a match is never found. A second set of eyes would be great!
Your regexp actually works, it's almost correct
Pattern p = Pattern.compile("\"[r]\".:.\"[+|-][0-9]+.[0-9][0-9]\"");
Matcher m = p.matcher(raw_string);
if (m.find()){
String res = m.toMatchResult().group(0);
}
The next line should read:
if ( m.find() ) {
Are you doing that?
A few other issues: You're using . to match the spaces surrounding the colon; if that's always supposed to be whitespace, you should use a space followed by + (one or more spaces) or \s+ (one or more whitespace characters). On the other hand, the dot between the digits is supposed to match a literal ., so you should escape it: \. Of course, since this is a Java String literal, you need to escape the backslashes: \\s+ and \\. respectively.
You don't need the square brackets around the r, and if you don't want to match a | in front of the number you should change [+|-] to [+-].
While some of these issues I've mentioned could result in false positives, none of them would prevent it from matching valid input. That's why I suspect you aren't actually applying the regex by calling find(). It's a common mistake.
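Putting those suggestions together, a tightened version might look like the sketch below. AmountFinder and findAmount are made-up names, and making the sign optional as well as allowing zero or more whitespace around the colon are assumptions about the data rather than something stated in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AmountFinder {
    // "r", optional whitespace, colon, optional whitespace, then a quoted
    // amount with an optional sign and exactly two decimals (captured in group 1).
    private static final Pattern R_FIELD =
        Pattern.compile("\"r\"\\s*:\\s*\"([+-]?\\d+\\.\\d{2})\"");

    // Uses find(), because the field is only part of a longer string.
    static String findAmount(String raw) {
        Matcher m = R_FIELD.matcher(raw);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String raw = ",\"tt\" : \"ABC\",\"r\" : \"+725.00\",\"a\" : \"55.30\",";
        System.out.println(findAmount(raw));
    }
}
```

Calling matches() instead of find() here would return false, since the pattern describes only one field and not the entire input.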
First, escape the dot: ...[0-9]+\.[0-9][0-9]...
because an unescaped dot matches any character.
Second, [+|-] is a character class that matches exactly one of +, | or -, so as written a sign is mandatory. If the sign may be absent, make it optional:
[+|-]?
Alban.