validate Regular Expression using Java

I need to validate a String using a regular expression. The String must look like "createRobot(x,y)", where x and y are digits.
I have something like
String ins;
Pattern ptncreate= Pattern.compile("^createRobot(+\\d,\\d)");
Matcher m = ptncreate.matcher(ins);
System.out.println(m.find());
but it doesn't work.
Can you help me?
Thanks.

You forgot the word Robot in your pattern. Also, parentheses are special characters in regex, so literal ones must be escaped, and the + should be placed after the \d, not after a (:
Pattern.compile("^createRobot\\(\\d+,\\d+\\)$")
Note that if you want to validate input that should consist solely of this "createRobot"-string, you might as well do:
boolean success = s.matches("createRobot\\(\\d+,\\d+\\)");
where s is the String you want to validate. But if you want to retrieve the numbers that were matched, you do need to use a Pattern/Matcher:
Pattern p = Pattern.compile("createRobot\\((\\d+),(\\d+)\\)");
Matcher m = p.matcher("createRobot(12,345)");
if(m.matches()) {
System.out.printf("x=%s, y=%s", m.group(1), m.group(2));
}
As you can see, after calling Matcher.matches() (or Matcher.find()), you can retrieve the nth match-group through group(n).
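To make the difference between matches() and find() concrete, here is a minimal sketch. The class name RobotCommandDemo and both helper methods are made up for illustration; only the pattern comes from the answer above.

```java
import java.util.regex.Pattern;

class RobotCommandDemo {
    // Pattern from the answer above, without ^ and $ anchors.
    private static final Pattern CREATE =
            Pattern.compile("createRobot\\((\\d+),(\\d+)\\)");

    // Hypothetical helper: true only if the WHOLE input is createRobot(x,y).
    static boolean isCreateRobot(String input) {
        return CREATE.matcher(input).matches();
    }

    // Hypothetical helper: true if the command occurs ANYWHERE in the input.
    static boolean containsCreateRobot(String input) {
        return CREATE.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(isCreateRobot("createRobot(12,345)"));             // true
        System.out.println(isCreateRobot("xx createRobot(12,345) yy"));       // false
        System.out.println(containsCreateRobot("xx createRobot(12,345) yy")); // true
        System.out.println(isCreateRobot("createRobot(a,b)"));                // false
    }
}
```

For validation, matches() (or find() with ^ and $ in the pattern) is usually what you want, because find() on an unanchored pattern also accepts input with extra text around the command.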

You must add \ before ( because ( in regex is the special group character.
The regexp pattern is:
^createRobot\(\d+,\d+\)

Related

What is the Regex for decimal numbers in Java?

I am not sure what the correct regex for a literal period is in Java. Here are some of my attempts. Sadly, they all seem to match any character.
String regex = "[0-9]*[.]?[0-9]*";
String regex = "[0-9]*['.']?[0-9]*";
String regex = "[0-9]*["."]?[0-9]*";
String regex = "[0-9]*[\.]?[0-9]*";
String regex = "[0-9]*[\\.]?[0-9]*";
String regex = "[0-9]*.?[0-9]*";
String regex = "[0-9]*\.?[0-9]*";
String regex = "[0-9]*\\.?[0-9]*";
But what I want is the actual "." character itself. Anyone have an idea?
What I'm trying to do actually is to write out the regex for a non-negative real number (decimals allowed). So the possibilities are: 12.2, 3.7, 2., 0.3, .89, 19
String regex = "[0-9]*['.']?[0-9]*";
Pattern pattern = Pattern.compile(regex);
String x = "5p4";
Matcher matcher = pattern.matcher(x);
System.out.println(matcher.find());
The last line is supposed to print false but prints true anyway. I think my regex is wrong though.

Update
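To match a non-negative decimal number that contains a decimal point, you can use this regex (a plain integer such as 19 is not matched, because both alternatives require the dot):
^\d*\.\d+|\d+\.\d*$
or in Java syntax: "^\\d*\\.\\d+|\\d+\\.\\d*$"
String regex = "^\\d*\\.\\d+|\\d+\\.\\d*$";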
String string = "123.43253";
if(string.matches(regex))
System.out.println("true");
else
System.out.println("false");
Explanation for your original regex attempts:
[0-9]*\.?[0-9]*
With Java string escaping it becomes:
"[0-9]*\\.?[0-9]*";
If you need to make the dot mandatory, remove the ? mark:
[0-9]*\.[0-9]*
But this will also accept just a dot without any digits. So, if you want the validation to require digits, use + (which means one or more) instead of * (which means zero or more). In that case it becomes:
[0-9]+\.[0-9]+

If you are on Kotlin, you can use an extension function:
fun String.findDecimalDigits() =
Pattern.compile("^[0-9]*\\.?[0-9]*").matcher(this).run { if (find()) group() else "" }!!

Your initial understanding was probably right, but you were thrown off because matcher.find() finds the first valid match anywhere within the string, and every one of your example patterns can match a zero-length string.
I would suggest "^([0-9]+\\.?[0-9]*|\\.[0-9]+)$"
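As a quick check of the suggested pattern, here is a small sketch that runs it against the inputs listed in the question. The class name DecimalRegexDemo and the helper isNonNegativeDecimal are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

class DecimalRegexDemo {
    // Pattern suggested above: digits with an optional fraction, or a bare fraction.
    private static final Pattern DECIMAL =
            Pattern.compile("^([0-9]+\\.?[0-9]*|\\.[0-9]+)$");

    // Hypothetical helper: true if the whole string is a non-negative decimal.
    static boolean isNonNegativeDecimal(String s) {
        return DECIMAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {"12.2", "3.7", "2.", "0.3", ".89", "19", "5p4", ".", ""};
        for (String s : inputs) {
            System.out.println("\"" + s + "\" -> " + isNonNegativeDecimal(s));
        }
        // Every part of the original attempt is optional, so find() succeeds
        // on any input, including "5p4".
        System.out.println(Pattern.compile("[0-9]*\\.?[0-9]*").matcher("5p4").find());
    }
}
```

The six numbers from the question are accepted, while "5p4", a lone "." and the empty string are rejected. The last line prints true, which is the behaviour the question ran into.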

There are actually two ways to match a literal dot. One is backslash-escaping, as in \\. above; the other is to enclose it in a character class (square brackets), as in [.]. Most special characters become literal inside square brackets, including the dot. If all you want is to match a literal dot, \\. shows your intention more clearly than [.]. Use [] when you need to match one of several things: for example, the regex [\\d.] matches a single digit or a literal dot.
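The equivalence of the two spellings can be checked with a short sketch. The class and method names are invented for illustration.

```java
class LiteralDotDemo {
    // Escaped dot: matches only a literal '.' between a and b.
    static boolean matchesEscapedDot(String s) {
        return s.matches("a\\.b");
    }

    // Dot inside a character class: also a literal '.'.
    static boolean matchesClassDot(String s) {
        return s.matches("a[.]b");
    }

    // Bare dot: matches any single character between a and b.
    static boolean matchesBareDot(String s) {
        return s.matches("a.b");
    }

    public static void main(String[] args) {
        System.out.println(matchesEscapedDot("a.b")); // true
        System.out.println(matchesClassDot("a.b"));   // true
        System.out.println(matchesEscapedDot("axb")); // false
        System.out.println(matchesClassDot("axb"));   // false
        System.out.println(matchesBareDot("axb"));    // true
        // A class with several members: one or more digits or literal dots.
        System.out.println("3.14".matches("[\\d.]+")); // true
    }
}
```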

I have tested all the cases.
public static boolean isDecimal(String input) {
return Pattern.matches("^[-+]?\\d*[.]?\\d+|^[-+]?\\d+[.]?\\d*", input);
}

Java Matcher Pattern issue

I am trying to extract everything that comes after the path prefix /share/attachments/docs/. All my strings start with /share/attachments/docs/
For example: /share/attachments/docs/image2.png
The number of characters after ../docs/ is not fixed!
I tried with
Pattern p = Pattern.compile("^(.*)/share/attachments/docs/(\\d+)$");
Matcher m = p.matcher("/share/attachments/docs/image2.png");
m.find();
String link = m.group(2);
System.out.println("Link #: "+link);
But I am getting Exception that: No match found.
Strange because if I use this:
Pattern p = Pattern.compile("^(.*)ABC Results for draw no (\\d+)$");
Matcher m = p.matcher("ABC Results for draw no 2888");
then it works!!!
Also one thing is that in some very rare cases my string does not start with /share/attachments/docs/ and then I should not parse anything but that is not related directly to the issue, but it will be good to handle.

I am getting Exception that: No match found.
This is because image2.png doesn't match \d+. Use a more appropriate pattern such as .+, assuming that you want to extract image2.png.
Your regular expression will then be ^(.*)/share/attachments/docs/(.+)$
In the case of ABC Results for draw no 2888, the regexp ^(.*)ABC Results for draw no (\\d+)$ works because the String ends with several successive digits. In the first case you had image2.png, which is a mix of letters, digits and a dot, which is why no match was found.
Generally speaking, to avoid getting an IllegalStateException: No match found, you first need to check the result of find(). If it returns true, the pattern was found in the input String:
if (m.find()) {
// The String matches with the pattern
String link = m.group(2);
System.out.println("Draw #: "+link);
} else {
System.out.println("Input value doesn't match with the pattern");
}

The regular expression \d+ (expressed as \\d+ inside a string literal) matches a run of one or more digits. Your example input does not have a corresponding digit run, so it is not matched. The regex metacharacter . matches any character (+/- newline, depending on regex options); it seems like that may be what you're really after.
Additionally, when you use Matcher.find() it is unnecessary for the pattern to match the whole string, so it is needless to include .* to match leading context. Furthermore, find() returns a value that tells you whether a match to the pattern was found. You generally want to use this return value, and in your particular case you can use it to reject those rare non-matching strings.
Maybe this is more what you want:
Pattern p = Pattern.compile("/share/attachments/docs/(.+)$");
Matcher m = p.matcher("/share/attachments/docs/image2.png");
String link;
if (m.find()) {
link = m.group(1);
System.out.println("Draw #: " + link);
} else {
link = null;
System.out.println("Draw #: (not found)");
}
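The same idea can be wrapped in a small helper that makes the "rare non-matching string" case explicit. This is only a sketch: the class name DocPathDemo and the method extractDocName are hypothetical, and the ^ anchor is an added assumption based on the question's statement that the strings start with the prefix.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DocPathDemo {
    // Assumption: the prefix must be at the very start of the string, hence ^.
    private static final Pattern DOC_PATH =
            Pattern.compile("^/share/attachments/docs/(.+)$");

    // Hypothetical helper: the part after the prefix, or empty if the prefix is missing.
    static Optional<String> extractDocName(String path) {
        Matcher m = DOC_PATH.matcher(path);
        // Only call group() after find() has returned true.
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractDocName("/share/attachments/docs/image2.png")); // Optional[image2.png]
        System.out.println(extractDocName("/other/path/file.txt"));               // Optional.empty
    }
}
```

Returning an Optional forces the caller to handle the non-matching case instead of running into the IllegalStateException described above.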

Java String matches and replaceAll differ in matching parentheses

I have strings with parentheses and also escaped characters. I need to match against these characters and also delete them. In the following code, I use matches() and replaceAll() with the same regex, but the matches() returns false, while the replaceAll() seems to match just fine, because the replaceAll() executes and removes the characters. Can someone explain?
String input = "(aaaa)\\b";
boolean matchResult = input.matches("\\(|\\)|\\\\[a-z]+");
System.out.printf("matchResult=%s\n", matchResult);
String output = input.replaceAll("\\(|\\)|\\\\[a-z]+", "");
System.out.printf("INPUT: %s --> OUTPUT: %s\n", input, output);
Prints out:
matchResult=false
INPUT: (aaaa)\b --> OUTPUT: aaaa

matches() tests the whole input against the pattern, not just part of it.
The regular expression \(|\)|\\[a-z]+ doesn't describe the whole string, only parts of it, so in your case matches() fails.

What matches() does has already been explained by Binyamin Sharet. I want to extend this a bit.
Java does not have a "findall" method or a "g" modifier, as some other languages do, to get all matches at once.
The two Matcher methods typically used to apply a pattern to a string (without replacing anything) are:
matches(): matches the whole string against the pattern
find(): looks for the next match
If you want to get everything that fits your pattern, you need to call find() in a loop, something like this:
Pattern p = Pattern
.compile("\\(|\\)|\\\\[a-z]+");
Matcher m = p.matcher(text);
while(m.find()){
System.out.println(m.group(0));
}
or, if you are only interested in whether your pattern occurs in the string:
if (m.find()) {
System.out.println(m.group());
} else {
System.out.println("not found");
}
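Putting the pieces together with the input from the question, a self-contained sketch might look like this. The class name FindAllDemo and the helper findAll are hypothetical; Java's Matcher has no built-in method of that name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllDemo {
    // Hypothetical helper: collects every match of regex in text by looping over find().
    static List<String> findAll(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "(aaaa)\\b";              // the characters ( a a a a ) \ b
        String regex = "\\(|\\)|\\\\[a-z]+";
        System.out.println(input.matches(regex)); // false: the pattern does not cover the whole input
        System.out.println(findAll(regex, input)); // [(, ), \b]
    }
}
```

The three pieces found by find() are exactly what replaceAll() removes, which is why the two calls in the question appear to disagree.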

Regular expression to match unescaped special characters only

I'm trying to come up with a regular expression that can match only characters not preceded by a special escape sequence in a string.
For instance, in the string Is ? stranded//? I want to be able to replace the ? which hasn't been escaped with another string, so I can have this result: Is Dave stranded?
But for the life of me I have not been able to figure out a way. I have only come up with regular expressions that eat all the replaceable characters.
How do you construct a regular expression that matches only characters not preceded by an escape sequence?

Use a negative lookbehind, it's what they were designed to do!
(?<!//)[?]
To break it down:
(
?<! #The negative lookbehind. It checks that the slashes below do not appear immediately before the match.
// #The slashes you are trying to avoid.
)
[\?] #Your special character list.
The search only proceeds with the rest of the pattern if the // is not found immediately before it.
I think in Java the backslash will need to be escaped again inside a string literal, something like:
Pattern p = Pattern.compile("(?<!//)[\\?]");
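For a simple substitution, the lookbehind can also be used directly with String.replaceAll. The following is a minimal sketch; the class name LookbehindDemo and the method fill are made up, and it assumes // is only ever used to escape a ?.

```java
import java.util.regex.Matcher;

class LookbehindDemo {
    // Hypothetical helper: replaces each '?' that is not preceded by "//",
    // then turns the escaped "//?" sequences back into plain '?'.
    static String fill(String template, String value) {
        // quoteReplacement keeps '$' and '\' in value from being treated specially.
        String replaced = template.replaceAll("(?<!//)\\?", Matcher.quoteReplacement(value));
        return replaced.replace("//?", "?");
    }

    public static void main(String[] args) {
        System.out.println(fill("Is ? stranded//?", "Dave")); // Is Dave stranded?
        System.out.println(fill("? is here", "Dave"));        // Dave is here
    }
}
```

A ? at the very start of the string is also replaced, because nothing precedes it and the negative lookbehind therefore succeeds.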

Try this Java code:
String str = "Is ? stranded//?";
Pattern p = Pattern.compile("(?<!//)([?])");
Matcher m = p.matcher(str);
StringBuffer sb = new StringBuffer();
while (m.find()) {
m.appendReplacement(sb, m.group(1).replace("?", "Dave"));
}
m.appendTail(sb);
String s = sb.toString().replace("//", "");
System.out.println("Output: " + s);
OUTPUT
Output: Is Dave stranded?

I was thinking about this and have a second, simpler solution that avoids regexes. The other answers are probably better, but I thought I might post it anyway.
String input = "Is ? stranded//?";
String output = input
.replace("//?", "a717efbc-84a9-46bf-b1be-8a9fb714fce8")
.replace("?", "Dave")
.replace("a717efbc-84a9-46bf-b1be-8a9fb714fce8", "?");
Just protect the "//?" by replacing it with something unique (like a guid). Then you know any remaining question marks are fair game.

Use grouping. Here's one example:
import java.util.regex.*;
class Test {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("([^/][^/])(\\?)");
        String s = "Is ? stranded//?";
        Matcher m = p.matcher(s);
        String result = s;
        // find() looks for the pattern anywhere in the string; replaceAll() resets the matcher first
        if (m.find())
            result = m.replaceAll("$1XXX").replace("//", "");
        System.out.println(s + " -> " + result);
    }
}
Output:
$ java Test
Is ? stranded//? -> Is XXX stranded?
In this example, I'm:
first replacing any non-escaped ? with "XXX",
then, removing the "//" escape sequences.
EDIT Use if (m.find()) to ensure that you handle non-matching strings properly.
This is just a quick-and-dirty example. You need to flesh it out, obviously, to make it more robust. But it gets the general idea across.

Match on a set of characters OTHER than an escape sequence, then a regex special character. You could use an inverted character class ([^/]) for the first bit. Special case an unescaped regex character at the front of the string.

String aString = "Is ? stranded//?";
String regex = "(?<!//)[^a-z^A-Z^\\s^/]";
System.out.println(aString.replaceAll(regex, "Dave"));
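The [^a-z^A-Z^\\s^/] part of the regular expression is a negated character class: it matches any single character that is not a letter, whitespace, a forward slash or a caret (the ^ characters after the first one are literal).
The (?<!//) part is a negative lookbehind: the character is only matched if it is not immediately preceded by //.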
This gives the output Is Dave stranded//?

try matching:
(^|(^.)|(.[^/])|([^/].))[special characters list]

I used this one:
((?:^|[^\\])(?:\\\\)*[ESCAPABLE CHARACTERS HERE])
Demo: https://regex101.com/r/zH1zO3/4

Author and time matching regex

I would like to use a regex in my Java program to recognize some features of my strings.
I have this type of string:
-Author- has wrote (-hh-:-mm-)
So, for example, I have a string with:
Cecco has wrote (15:12)
and I have to extract the author, hh and mm fields. Obviously there are some restrictions to consider:
hh and mm must be numbers
author doesn't have any restrictions
I have to take into account the space between "has wrote" and (
How can I use a regex for this?
EDIT: I attach my snippet:
String mRegex = "(\\s)+ has wrote \\((\\d\\d):(\\d\\d)\\)";
Pattern mPattern = Pattern.compile(mRegex);
String[] str = {
"Cecco CQ has wrote (14:55)", //OK (matched)
"yesterday you has wrote that I'm crazy", //NO (different text)
"Simon has wrote (yesterday)", // NO (yesterday isn't numbers)
"John has wrote (22:32)", //OK
"James has wrote(22:11)", //NO (missed space between has wrote and ()
"Tommy has wrote (xx:ss)" //NO (xx and ss aren't numbers)
};
for(String s : str) {
Matcher mMatcher = mPattern.matcher(s);
while (mMatcher.find()) {
System.out.println(mMatcher.group());
}
}

homework?
Something like:
(.+) has wrote \((\d\d):(\d\d)\)
Should do the trick
() - mark groups to capture (there are three in the above)
.+ - any chars (you said no restrictions)
\d - any digit
\(\) escape the parens as literals instead of a capturing group
use:
Pattern p = Pattern.compile("(.+) has wrote \\((\\d\\d):(\\d\\d)\\)");
Matcher m = p.matcher("Gareth has wrote (12:00)");
if( m.matches()){
System.out.println(m.group(1));
System.out.println(m.group(2));
System.out.println(m.group(3));
}
To cope with an optional (HH:mm) at the end you need to start to use some dark regex voodoo:
Pattern p = Pattern.compile("(.+) has wrote\\s?(?:\\((\\d\\d):(\\d\\d)\\))?");
Matcher m = p.matcher("Gareth has wrote (12:00)");
if( m.matches()){
System.out.println(m.group(1));
System.out.println(m.group(2));
System.out.println(m.group(3));
}
m = p.matcher("Gareth has wrote");
if( m.matches()){
System.out.println(m.group(1));
// m.group(2) == null since it didn't match anything
}
The new unescaped pattern:
(.+) has wrote\s?(?:\((\d\d):(\d\d)\))?
\s? optionally matches a space (there might not be a space at the end if there isn't a (HH:mm) group)
(?: ... ) is a non-capturing group; it allows us to put ? after it to make it optional
I think #codinghorror has something to say about regex
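To see how the first pattern behaves on the test strings from the question, here is a small sketch. The class name AuthorTimeDemo and the parse helper are hypothetical names for this example.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AuthorTimeDemo {
    // Pattern from the answer above: author, then " has wrote (hh:mm)".
    private static final Pattern LINE =
            Pattern.compile("(.+) has wrote \\((\\d\\d):(\\d\\d)\\)");

    // Hypothetical helper: returns {author, hh, mm}, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] {m.group(1), m.group(2), m.group(3)};
    }

    public static void main(String[] args) {
        String[] lines = {
            "Cecco CQ has wrote (14:55)",
            "yesterday you has wrote that I'm crazy",
            "Simon has wrote (yesterday)",
            "John has wrote (22:32)",
            "James has wrote(22:11)",
            "Tommy has wrote (xx:ss)"
        };
        for (String line : lines) {
            System.out.println(line + " -> " + Arrays.toString(parse(line)));
        }
    }
}
```

Only the "Cecco CQ" and "John" lines are parsed; the other four print null, which matches the OK/NO comments in the question. Because .+ allows spaces, multi-word authors such as "Cecco CQ" are captured whole.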

The easiest way to figure out regular expressions is to use a testing tool before coding.
I use an eclipse plugin from http://www.brosinski.com/regex/
Using this I came up with the following result:
([a-zA-Z]*) has wrote \((\d\d):(\d\d)\)
Cecco has wrote (15:12)
Found 1 match(es):
start=0, end=23
Group(0) = Cecco has wrote (15:12)
Group(1) = Cecco
Group(2) = 15
Group(3) = 12
An excellent tutorial on regular expression syntax can be found at http://www.regular-expressions.info/tutorial.html

Well, just in case you didn't know, Matcher has a nice method, Matcher.group(int), that can pull out specific groups, i.e. the parts of the pattern enclosed by (). For example, if I wanted to match a number between two colons like:
:22:
I could use the regex ":(\\d+):" to match one or more digits between two colons, and then I can fetch specifically the digits with:
Matcher.group(1)
And then it's just a matter of parsing the String into an int. As a note, capturing-group numbering starts at 1. group(0) is the whole match, so Matcher.group(0) for the previous example would return :22:
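The :22: example can be written out as a short runnable sketch. The class name GroupDemo and the helper numberBetweenColons are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    private static final Pattern BETWEEN_COLONS = Pattern.compile(":(\\d+):");

    // Hypothetical helper: parses the first run of digits enclosed by two colons.
    static int numberBetweenColons(String text) {
        Matcher m = BETWEEN_COLONS.matcher(text);
        if (!m.find()) {
            throw new IllegalArgumentException("no :digits: found in " + text);
        }
        return Integer.parseInt(m.group(1)); // group(1) is just the digits
    }

    public static void main(String[] args) {
        Matcher m = BETWEEN_COLONS.matcher("time :22: left");
        if (m.find()) {
            System.out.println(m.group(0)); // :22:
            System.out.println(m.group(1)); // 22
        }
        System.out.println(numberBetweenColons(":22:") + 1); // 23
    }
}
```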
For your case, I think the regex bits you need to consider are
"[A-Za-z]" for alphabet characters (you could probably also safely use "\\w", which matches alphabet characters, as well as digits and _).
"\\d" for digits (1,2,3...)
"+" for indicating you want one or more of the previous character or group.
