Regex for a pattern XXYYZZ - java

I want to validate a string which should follow the pattern XXYYZZ where X, Y, Z can be any letter a-z, A-Z or 0-9.
Example of valid strings:
RRFFKK
BB7733
WWDDMM
5599AA
Not valid:
555677
AABBCD
For now I am splitting the string using the regex (?<=(.))(?!\\1) and iterating over the resulting array and checking if each substring has a length of 2.
String str = "AABBEE";
boolean isValid = checkPattern(str);
public static boolean checkPattern(String str) {
String splited = str.split("(?<=(.))(?!\\1)");
for (String s : splited) {
if (s.length() != 2) {
return false;
}
}
return true;
}
I would like to replace my way of checking with String#matches and get rid of the loop, but can't come up with a valid regex. Can some one help what to put in someRegex in the below snippet?
public static boolean checkPattern(String str) {
return str.matches(someRegex);
}

You can use
s.matches("(\\p{Alnum})\\1(?!\\1)(\\p{Alnum})\\2(?!\\1|\\2)(\\p{Alnum})\\3")
See the regex demo.
Details
\A - start of string (it is implicit in String#matches) - the start of string
(\p{Alnum})\1 - an alphanumeric char (captured into Group 1) and an identical char right after
(?!\1) - the next char cannot be the same as in Group 1
(\p{Alnum})\2 - an alphanumeric char (captured into Group 2) and an identical char right after
(?!\1|\2) - the next char cannot be the same as in Group 1 and 2
(\p{Alnum})\3 - an alphanumeric char (captured into Group 3) and an identical char right after
\z - (implicit in String#matches) - end of string.
RegexPlanet test results:

Since you know a valid pattern will always be six characters long with three pairs of equal characters which are different from each other, a short series of explicit conditions may be simpler than a regex:
public static boolean checkPattern(String str) {
return str.length() == 6 &&
str.charAt(0) == str.chatAt(1) &&
str.charAt(2) == str.chatAt(3) &&
str.charAt(4) == str.chatAt(5) &&
str.charAt(0) != str.charAt(2) &&
str.charAt(0) != str.charAt(4) &&
str.charAt(2) != str.charAt(4);
}

Would the following work for you?
^(([A-Za-z\d])\2(?!.*\2)){3}$
See the online demo
^ - Start string anchor.
(- Open 1st capture group.
( - Open 2nd capture group.
[A-Za-z\d] - Any alphanumeric character.
) - Close 2nd capture group.
\2 - Match exactly what was just captured.
(?!.*\2) - Negative lookahead to make sure the same character is not used elsewhere.
) - Close 1st capture group.
{3} - Repeat the above three times.
$ - End string anchor.

Well, here's another solution that uses regex and streams in combination.
It breaks up the pattern into groups of two characters.
keeps the distinct groups.
and returns true if the count is 3.
String[] data = { "AABBBB", "AABBCC", "AAAAAA","AABBAA", "ABC", "AAABCC",
"RRABBCCC" };
String pat = "(?:\\G(.)\\1)+";
Pattern pattern = Pattern.compile(pat);
for (String str : data) {
Matcher m = pattern.matcher(str);
boolean isValid = m.results().map(MatchResult::group).distinct().count() == 3;
System.out.printf("%8s -> %s%n",
str, isValid ? "Valid" : "Not Valid");
}
Prints
AABBBB -> Not Valid
AABBCC -> Valid
AAAAAA -> Not Valid
AABBAA -> Not Valid
ABC -> Not Valid
AAABCC -> Not Valid
RRABBCCC -> Not Valid

You can check if a character matches with its following character and also if the count of distinct characters is 3.
Demo:
public class Main {
public static void main(String[] args) {
// Test
System.out.println(isValidPattern("RRFFKK"));
System.out.println(isValidPattern("BBAABB"));
System.out.println(isValidPattern("555677"));
}
static boolean isValidPattern(String str) {
return str.length() == 6 &&
str.charAt(0) == str.charAt(1) &&
str.charAt(2) == str.charAt(3) &&
str.charAt(4) == str.charAt(5) &&
str.chars().distinct().count() == 3;
}
}
Output:
true
false
false
Note: String#chars is available since Java-9.

Related

validate a user input

Hello I'm new to programming and I'm having a trouble understanding my assignment. I know that this might be a really simple problem for you guys and I'm sorry for that. Is it possible that she's just asking me to write a method that will perform the given instructions?
Write a program to find if the user input is valid base on the instructions.**
a string must have at least nine characters
a string consists of letters and numbers only.
a string must contain at least two digits.
You can simply use the regex, ^(?=(?:\D*\d){2})[a-zA-Z\d]{9,}$ which can be explained as follows:
^ : asserts position at start of a line
Positive Lookahead (?=(?:\D*\d){2})
Non-capturing group (?:\D*\d){2}
{2} matches the previous token exactly 2 times
\D matches any character that's not a digit (equivalent to [^0-9])
* matches the previous token between zero or more time (greedy)
\d matches a digit (equivalent to [0-9])
The pattern, [a-zA-Z\d]{9,} :
{9,} matches the previous token between 9+ times (greedy)
a-z matches a single character in a-z
A-Z matches a single character in A-Z
\d matches a digit (equivalent to [0-9])
$ : asserts position at the end of a line
Demo:
import java.util.stream.Stream;
public class Main {
public static void main(String args[]) {
//Test
Stream.of(
"helloworld",
"hello",
"hello12world",
"12helloworld",
"helloworld12",
"123456789",
"hello1234",
"1234hello",
"12345hello",
"hello12345"
).forEach(s -> System.out.println(s + " => " + isValid(s)));
}
static boolean isValid(String s) {
return s.matches("^(?=(?:\\D*\\d){2})[a-zA-Z\\d]{9,}$");
}
}
Output:
helloworld => false
hello => false
hello12world => true
12helloworld => true
helloworld12 => true
123456789 => true
hello1234 => true
1234hello => true
12345hello => true
hello12345 => true
Requirement #1: a string must have at least nine characters
This is solved by checking whether the length of the String is greater than 9, with s.length()>9
Requirement #2: a string consists of letters and numbers (whole numbers) only.
Use the regex [a-zA-Z0-9]+, which matches all Latin alphabet characters and numbers.
Requirement #3: a string must contain at least two digits.
I've written a method that loops through every character and uses Character.isDigit() to check whether it is a digit.
Check it out:
public static boolean verify(String s) {
final String regex = "[a-zA-Z0-9]+";
System.out.println(numOfDigits(s));
return s.length() > 9 && s.matches(regex) && numOfDigits(s) > 2;
}
public static int numOfDigits(String s) {
int a = 0;
int b = s.length();
for (int i = 0; i < b; i++) {
a += (Character.isDigit(s.charAt(i)) ? 1 : 0);
}
return a;
}

Regex to require one alphabet and one non-alphabet and no whitespace

I want to create a regex in Java to match at least 1 alphabet and 1 non-alphabet (could be anything except A-Za-z) and no white space.
Below Regex is working partially correct:
^([A-Za-z]{1,}[^A-Za-z]{1,})+$
It matches aaaa7777
but doesn't match 777aaaaa.
Any Help would be appreciated.
Your regex implicitly assumes the order of the characters you want to match. The regex is saying that a letter must come before a non-latter. However, you want the letter and the non-letter to come in either order, so you need to account for both cases. Also note that it should be [^\sa-zA-Z] instead of [^a-zA-Z] as you don't allow spaces.
(?:[a-zA-Z][^\sa-zA-Z]|[^\sa-zA-Z][a-zA-Z])
At the start and end, any non-space character is allowed, so:
^\S*(?:[a-zA-Z][^\sa-zA-Z]|[^\sa-zA-Z][a-zA-Z])\S*$
You may use
s.matches("(?=\\P{Alpha}*\\p{Alpha})(?=\\p{Alpha}*\\P{Alpha})\\S*")
This is how the pattern works.
Details
The pattern will match a whole string since ^ and \z anchors are implicit in matches
(?=\P{Alpha}*\p{Alpha}) - a lookahead that requires at least one ASCII letter after any 0+ chars other than an ASCII letter
(?=\p{Alpha}*\P{Alpha}) - a lookahead that requires a char other than an ASCII letter after 0 or more ASCII letters
\S* - zero or more non-whitespace chars.
To make the regex Unicode aware replace \p{Alpha} with \p{L} and \P{Alpha} with \P{L}.
Regular expressions aren't the right tool for this type of validation. Just write out the plain logic, your specific example:
public class Main {
public static void main(String[] args) {
System.out.println("'foo' ? " + doesMatch("foo"));
System.out.println("'bar7' ? " + doesMatch("bar7"));
System.out.println("'55baz' ? " + doesMatch("55baz"));
}
public static boolean doesMatch(String input) {
boolean hasAlpha = false,
hasNonAlpha = false;
for(char ch : input.toCharArray()) {
if(ch >= 'a' && ch <= 'z' || ch >= 'A' && ch <= 'Z') {
hasAlpha = true;
} else {
hasNonAlpha = true;
}
if(hasAlpha && hasNonAlpha) {
return true;
}
}
return false;
}
}
Anyone can understand what inputs do match and which inputs don't. If you use regular expressions this wouldn't be so simple.

Java string matching with wildcards

I have a pattern string with a wild card say X (E.g.: abc*).
Also I have a set of strings which I have to match against the given pattern.
E.g.:
abf - false
abc_fgh - true
abcgafa - true
fgabcafa - false
I tried using regex for the same, it didn't work.
Here is my code
String pattern = "abc*";
String str = "abcdef";
Pattern regex = Pattern.compile(pattern);
return regex.matcher(str).matches();
This returns false
Is there any other way to make this work?
Thanks
Just use bash style pattern to Java style pattern converter:
public static void main(String[] args) {
String patternString = createRegexFromGlob("abc*");
List<String> list = Arrays.asList("abf", "abc_fgh", "abcgafa", "fgabcafa");
list.forEach(it -> System.out.println(it.matches(patternString)));
}
private static String createRegexFromGlob(String glob) {
StringBuilder out = new StringBuilder("^");
for(int i = 0; i < glob.length(); ++i) {
final char c = glob.charAt(i);
switch(c) {
case '*': out.append(".*"); break;
case '?': out.append('.'); break;
case '.': out.append("\\."); break;
case '\\': out.append("\\\\"); break;
default: out.append(c);
}
}
out.append('$');
return out.toString();
}
Is there an equivalent of java.util.regex for “glob” type patterns?
Convert wildcard to a regex expression
you can use stringVariable.startsWith("abc")
abc* would be the RegEx that matches ab, abc, abcc, abccc and so on.
What you want is abc.* - if abc is supposed to be the beginning of the matched string and it's optional if anything follows it.
Otherwise you could prepend .* to also match strings with abc in the middle: .*abc.*
Generally i recommend playing around with a site like this to learn RegEx. You are asking for a pretty basic pattern but it's hard to say what you need exactly. Good Luck!
EDIT:
It seems like you want the user to type a part of a file name (or so) and you want to offer something like a search functionality (you could have made that clear in your question IMO). In this case you could bake your own RegEx from the users' input:
private Pattern getSearchRegEx(String userInput){
return Pattern.compile(".*" + userInput + ".*");
}
Of course that's just a very simple example. You could modify this and then use the RegEx to match file names.
So I thin here is your answer:
The regexp that you are looking for is this : [a][b][c].*
Here is my code that works:
String first = "abc"; // true
String second = "abctest"; // true
String third = "sthabcsth"; // false
Pattern pattern = Pattern.compile("[a][b][c].*");
System.out.println(first.matches(pattern.pattern())); // true
System.out.println(second.matches(pattern.pattern())); // true
System.out.println(third.matches(pattern.pattern())); // false
But if you want to check only if starts with or ends with you can use the methods of String: .startsWith() and endsWith()
// The main function that checks if two given strings match. The pattern string may contain
// wildcard characters
default boolean matchPattern(String pattern, String str) {
// If we reach at the end of both strings, we are done
if (pattern.length() == 0 && str.length() == 0) return true;
// Make sure that the characters after '*' are present in str string. This function assumes that
// the pattern string will not contain two consecutive '*'
if (pattern.length() > 1 && pattern.charAt(0) == '*' && str.length() == 0) return false;
// If the pattern string contains '?', or current characters of both strings match
if ((pattern.length() > 1 && pattern.charAt(0) == '?')
|| (pattern.length() != 0 && str.length() != 0 && pattern.charAt(0) == str.charAt(0)))
return matchPattern(pattern.substring(1), str.substring(1));
// If there is *, then there are two possibilities
// a: We consider current character of str string
// b: We ignore current character of str string.
if (pattern.length() > 0 && pattern.charAt(0) == '*')
return matchPattern(pattern.substring(1), str) || matchPattern(pattern, str.substring(1));
return false;
}
public static void main(String[] args) {
test("w*ks", "weeks"); // Yes
test("we?k*", "weekend"); // Yes
test("g*k", "gee"); // No because 'k' is not in second
test("*pqrs", "pqrst"); // No because 't' is not in first
test("abc*bcd", "abcdhghgbcd"); // Yes
test("abc*c?d", "abcd"); // No because second must have 2 instances of 'c'
test("*c*d", "abcd"); // Yes
test("*?c*d", "abcd"); // Yes
}

inserting parentheses and asterisks into string according to some conditions

I have the following method which is used to insert parentheses and asterisks into a boolean expression when dealing with multiplication. For instance, an input of A+B+AB will give A+B+(A*B).
However, I also need to take into account the primes (apostrophes). The following are some examples of input/output:
A'B'+CD should give (A'*B')+(C*D)
A'B'C'D' should give (A'*B'*C'*D')
(A+B)'+(C'D') should give (A+B)'+(C'*D')
I have tried the following code but seems to have errors. Any thoughts?
public static String modify(String expression)
{
String temp = expression;
StringBuilder validated = new StringBuilder();
boolean inBrackets=false;
for(int idx=0; idx<temp.length()-1; idx++)
{
//no prime
if((Character.isLetter(temp.charAt(idx))) && (Character.isLetter(temp.charAt(idx+1))))
{
if(!inBrackets)
{
inBrackets = true;
validated.append("(");
}
validated.append(temp.substring(idx,idx+1));
validated.append("*");
}
//first prime
else if((Character.isLetter(temp.charAt(idx))) && (temp.charAt(idx+1)=='\'') && (Character.isLetter(temp.charAt(idx+2))))
{
if(!inBrackets)
{
inBrackets = true;
validated.append("(");
}
validated.append(temp.substring(idx,idx+2));
validated.append("*");
idx++;
}
//second prime
else if((Character.isLetter(temp.charAt(idx))) && (temp.charAt(idx+2)=='\'') && (Character.isLetter(temp.charAt(idx+1))))
{
if(!inBrackets)
{
inBrackets = true;
validated.append("(");
}
validated.append(temp.substring(idx,idx+1));
validated.append("*");
idx++;
}
else
{
validated.append(temp.substring(idx,idx+1));
if(inBrackets)
{
validated.append(")");
inBrackets=false;
}
}
}
validated.append(temp.substring(temp.length()-1));
if(inBrackets)
{
validated.append(")");
inBrackets=false;
}
return validated.toString();
}
Your help will greatly be appreciated. Thank you in advance! :)
I would suggest you should start with positions of + character in your string. If they differ by 1, you dont do anything. If they differ by two then there are two possiblities: AB or A'. So you check for it. If they differ by more than 2, then just check for ' symbol and put required symbol.
You can do it in 2 passes using regular expressions:
StringBuilder input = new StringBuilder("A'B'+(CDE)+A'B");
Pattern pattern1 = Pattern.compile("[A-Z]'?(?=[A-Z]'?)");
Matcher matcher1 = pattern1.matcher(input);
while (matcher1.find()) {
input.insert(matcher1.end(), '*');
matcher1.region(matcher1.end() + 1, input.length());
}
Pattern pattern2 = Pattern.compile("([A-Z]'?[*])+[A-Z]'?");
Matcher matcher2 = pattern2.matcher(input);
while (matcher2.find()) {
int start = matcher2.start();
int end = matcher2.end();
if (start==0||input.charAt(start-1) != '(') {
input.insert(start, '(');
end++;
}
if (input.length() == end || input.charAt(end) != ')') {
input.insert(end, ')');
end++;
}
matcher2.region(end, input.length());
}
It works as follows: the regex [A-Z]'? will match a letter from A-Z (all the capital letters) and it can be followed by an optional apostrophe, so it conveniently takes care of whether there is an apostrophe or not for us. The regex [A-Z]'?(?=[A-Z]'?) then means "look for a capital letter followed by an option apostrophe and then look for (but don't match against) a capital letter followed by an option apostrophe. This wil be all the places after which you want to put an asterisk. We then create a Matcher and find all the characters that match it. then we insert the asterisk. Since we modified the string, we need to update the Matcher for it to function properly.
In the second pass, we use the regex ([A-Z]'?[*])+[A-Z]'? which will look for "a capital letter followed by an option apostrophe and then an asterisk at least one time and then a capital letter followed by an option apostrophe". this is where all the groups that parentheses need to go in lie. So we create a Matcher and find the matches. we then check to see if there is already a parentese there (making sure not to go out of bounds ). If not we add a one. We again need to update the Matcher since we inserted characters. once this is over we have or final string.
for more on regex:
Pattern documentation
Regex tutorial

Finding whether a string meets a certain pattern

I recently had an interview with Google for a Software Engineering position and the question asked regarded building a pattern matcher.
So you have to build the
boolean isPattern(String givenPattern, String stringToMatch)
Function that does the following:
givenPattern is a string that contains:
a) 'a'-'z' chars
b) '*' chars which can be matched by 0 or more letters
c) '?' which just matches to a character - any letter basically
So the call could be something like
isPattern("abc", "abcd") - returns false as it does not match the pattern ('d' is extra)
isPattern("a*bc", "aksakwjahwhajahbcdbc"), which is true as we have an 'a' at the start, many characters after and then it ends with "bc"
isPattern("a?bc", "adbc") returns true as each character of the pattern matches in the given string.
During the interview, time being short, I figured one could walk through the pattern, see if a character is a letter, a * or a ? and then match the characters in the given string respectively. But that ended up being a complicated set of for-loops and we didn't manage to come to a conclusion within the given 45 minutes.
Could someone please tell me how they would solve this problem quickly and efficiently?
Many thanks!
Assuming you are allowed to use regexes, you could have written something like:
static boolean isPattern(String givenPattern, String stringToMatch) {
String regex = "^" + givenPattern.replace("*", ".*").replace("?", ".") + "$";
return Pattern.compile(regex).matcher(stringToMatch).matches();
}
"^" is the start of the string
"$" is the end of the string
. is for "any character", exactly once
.* is for "any character", 0 or more times
Note: If you want to restrict * and ? to letters only, you can use [a-zA-Z] instead of ..
boolean isPattern(String givenPattern, String stringToMatch) {
if (givenPattern.empty)
return stringToMatch.isEmpty();
char patternCh = givenPatter.charAt(0);
boolean atEnd = stringToMatch.isEmpty();
if (patternCh == '*') {
return isPattenn(givenPattern.substring(1), stringToMatch)
|| (!atEnd && isPattern(givenPattern, stringToMatch.substring(1)));
} else if (patternCh == '?') {
return !atEnd && isPattern(givenPattern.substring(1),
stringToMatch.substring(1));
}
return !atEnd && patternCh == stringToMatch.charAt(0)
&& isPattern(givenPattern.substring(1), stringToNatch.subtring(1);
}
(Recursion being easiest to understand.)

Categories

Resources