Java unicode name regexp


I want to validate a user's first name and last name with a Java regexp.
I use this pattern:
private static final Pattern CHECK_NAME_FIELD_PATTERN = Pattern.compile("\\w+",
        Pattern.UNICODE_CHARACTER_CLASS);
public static boolean checkNameField(String name) {
    return CHECK_NAME_FIELD_PATTERN.matcher(name).matches();
}
But checkNameField("234523") returns true.

It returns true for numbers because \w also matches digits. Use \p{L} to match letters only:
private static final Pattern CHECK_NAME_FIELD_PATTERN = Pattern.compile("\\p{L}+",
Pattern.UNICODE_CHARACTER_CLASS);
\\p{L} matches any kind of letter from any language.
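A minimal sketch of the letters-only check, assuming a name consists purely of letters (no spaces, hyphens, or apostrophes). The class name NameCheck is hypothetical. Because \p{L} is a Unicode general category, it should behave the same with or without Pattern.UNICODE_CHARACTER_CLASS, so the flag is left out here:

```java
import java.util.regex.Pattern;

class NameCheck {
    // \p{L} already covers letters from every script; no flag is required for it.
    private static final Pattern LETTERS_ONLY = Pattern.compile("\\p{L}+");

    static boolean checkNameField(String name) {
        // matches() requires the whole input to consist of letters
        return LETTERS_ONLY.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(checkNameField("234523"));                   // false
        System.out.println(checkNameField("M\u00FCller"));              // true (u with diaeresis)
        System.out.println(checkNameField("\u0418\u0432\u0430\u043D")); // true (Cyrillic letters)
    }
}
```

Real names often contain spaces, hyphens, or apostrophes, so the character class may need extending for production use.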

Related

How to match a string of tuples in Java?

I have strings like "(C,D) (E,F) (G,H) (J,K)" and "(C,D) (E,F) (G,H) (J,K)" or "((C,D) (E,F) (G,H) (J,K)". How can I return true only if the string matches the pattern of the first one, that is, a single tuple or a series of tuples separated by exactly one whitespace? I tried something like "(\([A-Z],[A-Z]\)[ |$])+?", but it does not capture the final tuple. For the 2nd and 3rd strings it should return false.

Here is the problem with your regex:
(\([A-Z],[A-Z]\)[ |$])+?
                ^^^^^
You thought that meant "space or end of string", didn't you? It actually means "space or | or dollar sign". A lot of special characters lose their special meaning when placed inside a character class.
You should replace it with (?: |$) instead. Also, the +? at the end should be a greedy +:
(\([A-Z],[A-Z]\)(?: |$))+
Personally, I don't really like this "space or end of string" thing. I would prefer repeating the tuple pattern (especially when the repeated pattern is not long):
(?:\([A-Z],[A-Z]\) )*(?:\([A-Z],[A-Z]\))
Needless to say, you should match with matches, not find.
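As a rough sketch of the repeated-tuple pattern used with matches(), something like the following could work. The class and method names are hypothetical, and the double-space input is an assumed example of an invalid string:

```java
import java.util.regex.Pattern;

class TupleCheck {
    // Zero or more "tuple followed by one space" groups, then one final tuple.
    private static final Pattern TUPLES =
            Pattern.compile("(?:\\([A-Z],[A-Z]\\) )*\\([A-Z],[A-Z]\\)");

    static boolean isTupleList(String s) {
        // matches() anchors the pattern to the whole input, unlike find()
        return TUPLES.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isTupleList("(C,D) (E,F) (G,H) (J,K)"));  // true
        System.out.println(isTupleList("(C,D)  (E,F)"));             // false: two spaces
        System.out.println(isTupleList("((C,D) (E,F) (G,H) (J,K)")); // false: stray parenthesis
    }
}
```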

If you want to match a string of parenthesised pairs of comma-separated capital letters, with a single space between each pair, you could use a pattern like this:
^\\([A-Z],[A-Z]\\)( \\([A-Z],[A-Z]\\))*$
That is: letter, comma, letter, all in parentheses, followed by zero or more occurrences of similar parenthesised expressions, each preceded by a space.

I guess you might be able to do that with:
\s*|\(([^()\r\n]+)\)
Remove every match from the input; if what remains is not an empty string, the input is invalid.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegularExpression{
    public static void main(String[] args){
        final String regex = "\\([^()\\r\\n]+\\)|\\s*";
        final String string = "(C,D) (E,F) (G,H) (J,K)\n"
            + "(C,D) (E,F) (G,H) (J,K)\n"
            + "((C,D) (E,F) (G,H) (J,K)";
        final String subst = "";
        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(string);
        final String result = matcher.replaceAll(subst);
        System.out.println(result);
    }
}
Output
(
Source
Regular expression to match balanced parentheses

Pattern matching for Japanese string have issues in java

I have a strange issue while pattern matching Japanese characters in Java.
Let me explain with code.
private static final Pattern ADDRESS_STRING_PATTERN =
        Pattern.compile("^[\\p{L}\\d\\s\\p{Punct}]{1,200}$");
private static boolean isValidInput(final String input, Pattern pattern) {
    return pattern.matcher(input).matches();
}
System.out.println(isValidInput("こんにちは、元気ですか", ADDRESS_STRING_PATTERN));
Here I am matching 1 to 200 letters, spaces, digits, or punctuation characters.
This always returns false. After some debugging I found that the issue is with one character, "、". If I add that character to the regular expression, it works fine.
Has anyone come across this issue? Or is this a bug in Java?

The thing is that 、 (U+3001 IDEOGRAPHIC COMMA) belongs to "Punctuation, other" Unicode category and \\p{Punct} only matches ASCII punctuation by default. If you use a Pattern.UNICODE_CHARACTER_CLASS option or (?U) embedded flag option, it will match (i.e. the pattern might look like "(?U)^[\\p{L}\\d\\s\\p{Punct}]{1,200}$"). However, this may impact \d and \s, and I am not sure you want to match all Unicode digits and whitespace.
An alternative is to use \p{P}\p{S} (to match Unicode punctuation and symbols) instead of \p{Punct} (the POSIX character class matches both punctuation and symbols).
See a Java demo printing true:
private static final Pattern ADDRESS_STRING_PATTERN = Pattern.compile("^[\\p{L}\\d\\s\\p{P}\\p{S}]{1,200}$");
private static boolean isValidInput(final String input, Pattern pattern) {
    return pattern.matcher(input).matches();
}
public static void main (String[] args) throws java.lang.Exception
{
    System.out.println(isValidInput("こんにちは、元気ですか",ADDRESS_STRING_PATTERN));
}
// => true
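To see the effect of the (?U) flag side by side, here is a small sketch that compares the original \p{Punct} pattern with and without it. The class and method names are hypothetical. It also illustrates the caveat about \d: with the flag, a non-ASCII digit such as U+0663 (ARABIC-INDIC DIGIT THREE) is accepted as well.

```java
import java.util.regex.Pattern;

class PunctDemo {
    // The greeting from the question, written with Unicode escapes so the
    // source compiles under any file encoding. U+3001 is the ideographic comma.
    static final String GREETING =
            "\u3053\u3093\u306B\u3061\u306F\u3001\u5143\u6C17\u3067\u3059\u304B";

    // Default mode: \p{Punct} is the ASCII-only POSIX class.
    static final Pattern ASCII_PUNCT =
            Pattern.compile("^[\\p{L}\\d\\s\\p{Punct}]{1,200}$");

    // (?U) enables UNICODE_CHARACTER_CLASS: \p{Punct}, \d and \s become Unicode-aware.
    static final Pattern UNICODE_PUNCT =
            Pattern.compile("(?U)^[\\p{L}\\d\\s\\p{Punct}]{1,200}$");

    static boolean isValidInput(String input, Pattern pattern) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidInput(GREETING, ASCII_PUNCT));   // false
        System.out.println(isValidInput(GREETING, UNICODE_PUNCT)); // true
        // Side effect of the flag: \d also matches non-ASCII decimal digits.
        System.out.println(Pattern.matches("\\d", "\u0663"));      // false
        System.out.println(Pattern.matches("(?U)\\d", "\u0663"));  // true
    }
}
```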

Java class validating a Pattern expression compile

I've written a really simple regular expression to validate a phone number, and I can see that it works in the regex engine provided by zytrax.com. When I use it in a class and compile it as a pattern, I get an error about the escaped characters in the Pattern.compile string.
package Test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class FindMainTestExcercisePN {
    private static String phone;
    private static Matcher matcher;
    private boolean getCheckNumber(String pn) {
        boolean valid = matcher.matches();
        return valid;
    }
    private void PhoneNumber(String input) {
        Pattern pattern = Pattern.compile("^(?:(?:\\+?\\s*1\\s*(?:[.-\\s*]?)(?:[.\\s*-]?))?(?:(\\s*([0-9]|[0-9]|[0-9])\\s*)|([0-9]|[0-9]|[0-9]))\\s*(?:[.-\\s*]?)?)?([0-9]|[0-9]|[0-9]{2})\\s*(?:[.\\s*-]?)(?:[.-\\s*]?)?([0-9]|[0-9]|[0-9]|[0-9]{4})\\s*");
        matcher = pattern.matcher(input);
    }
    public static void main(String[] a) {
        FindMainTestExcercisePN ex15 = new FindMainTestExcercisePN();
        phone = "1-098-234-5454";
        ex15.PhoneNumber(phone);
        boolean bool = ex15.getCheckNumber(phone);
        System.out.println("The number is valid= " + bool);
    }
}
If you take out the escapes it works just fine (for example, 1-345-345-3324), so any suggestions please?

This expression is illegal:
[.-\\s*]
In a character class, the dash character is a range operator, e.g. [0-9] means "any character in the range 0 to 9". Here, however, you have coded the range .-\s, which attempts to express "any character in the range dot to 'any whitespace'", which is clearly nonsense.
To code a literal dash in a character class, code it first or last.
If the intention of this expression is "a dot, dash, whitespace or star", then code:
[.\\s*-]
If the star is not intended as a literal, but you want to express "a dot or dash, or any number of whitespace", use this:
([.-]?|\\s*)
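A short sketch that checks both spellings of the character class. The helper names compiles and isSeparator are hypothetical. It assumes the standard java.util.regex behaviour of throwing PatternSyntaxException for a range that ends in a predefined class such as \s:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class DashInClass {
    // Returns true if the regex compiles, false if it is syntactically illegal.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // With the dash placed last it is a literal: dot, whitespace, star or dash.
    static boolean isSeparator(String s) {
        return Pattern.matches("[.\\s*-]", s);
    }

    public static void main(String[] args) {
        System.out.println(compiles("[.-\\s*]")); // false: ".-\s" is read as a range
        System.out.println(compiles("[.\\s*-]")); // true
        System.out.println(isSeparator("-"));     // true
        System.out.println(isSeparator("x"));     // false
    }
}
```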

Your method getCheckNumber always returns true.

password validation with regex in java

I have to create a regex for password validation which matches, e.g.,
abcdABCD1234$%^
The password must contain at least two lowercase letters, two uppercase letters, two digits, and two special characters. Users may exceed these minimums.
Note: the character groups must appear in that order.
String pattern="(?=.*[a-z]{2,})(?=.*[A-Z]{2,})(?=.*[0-9]{2,})(?=.*[##$%&]{2,})";
It works for me, but it does not check the order.
That is, AB (uppercase) or any other character should not come before ab (lowercase).
Is that clear?
String minNum="4";
String max="20";
String REGEX="(^(?!.*(d))(?=.*[a-z]{3,})(?=.*[A-Z]{2,})(?=.*[0-9]{3,})(?=.[##$%&*><?+]{2,})^(?!.*(#r)).{"+minNum+","+max+"})";
//String regex="(?=.*[a-z]{2,})(?=.*[A-Z]{2,})(?=.*[0-9]{2,})(?=.*[##$%&]{2,})";
String INPUT ="acABC1333323##";
Pattern p = Pattern.compile(REGEX);
Matcher m = p.matcher(INPUT);
System.out.println(m.matches());
It works correctly, but when I change the password to
"ABac1333323##" it also matches, which is incorrect according to my requirement, because AB comes first.

Requiring the characters to be in a specific order is the weirdest password requirement I have ever heard of, and I cannot believe that your customer really wants this.
That said, I can explain your regex to you.
The lookahead assertions (the (?=...) stuff) in your regex are normally used when the required characters can be in any order. If you really need a fixed order, then your regex is simple: just drop the lookaheads.
This will match your requirements:
String pattern="[a-z]{2,}[A-Z]{2,}[0-9]{2,}[##$%&]{2,}";
Just in case you want to allow all letters, digits and all other characters in your passwords, use Unicode properties:
String pattern="\\p{Ll}{2,}\\p{Lu}{2,}\\d{2,}[^\\p{L}\\d]{2,}";
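To make the difference between lookaheads and a plain sequence concrete, here is a hedged sketch comparing the two with the Unicode-property classes. The class and method names are hypothetical. Note that [^\p{L}\d] treats anything that is neither a letter nor an ASCII digit as "special", whitespace included:

```java
import java.util.regex.Pattern;

class PasswordOrder {
    // Any order: every lookahead scans from the start of the input independently.
    private static final Pattern ANY_ORDER = Pattern.compile(
            "(?=.*\\p{Ll}{2,})(?=.*\\p{Lu}{2,})(?=.*\\d{2,})(?=.*[^\\p{L}\\d]{2,}).*");

    // Fixed order: lowercase run, uppercase run, digit run, special run, nothing else.
    private static final Pattern IN_ORDER = Pattern.compile(
            "\\p{Ll}{2,}\\p{Lu}{2,}\\d{2,}[^\\p{L}\\d]{2,}");

    static boolean validAnyOrder(String pwd) {
        return ANY_ORDER.matcher(pwd).matches();
    }

    static boolean validInOrder(String pwd) {
        return IN_ORDER.matcher(pwd).matches();
    }

    public static void main(String[] args) {
        System.out.println(validAnyOrder("ABac1333323##"));  // true: order is ignored
        System.out.println(validInOrder("ABac1333323##"));   // false: uppercase comes first
        System.out.println(validInOrder("acABC1333323##"));  // true
        System.out.println(validInOrder("abcdABCD1234$%^")); // true
    }
}
```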

I think this is what you want
(?=[a-z]{2,}).{2,}(?=[A-Z]{2,}).{2,}(?=\d{2,}).{2,}(?=[##$%&]{2,}).{2,}
It matches abcdABCD1234$%^ and abABcdCD1234$%^
It does not match ABababcdCD1234$%^ or ABac1333323##
For two or more lowercase letters followed by two or more uppercase letters, followed by two or more digits, followed by two or more special characters, use:
[a-z]{2,}[A-Z]{2,}\d{2,}[##$%&]{2,}

Maybe it could help you:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class PwdValidator{
    private Matcher match;
    private Pattern pattern;
    private static final String PWD_PATTERN = "((?=.*\\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[##$%]).{6,20})";
    public PwdValidator(){
        pattern = Pattern.compile(PWD_PATTERN);
    }
    public boolean validate(final String pwd){
        match = pattern.matcher(pwd);
        return match.matches();
    }
}

java regular expression returning false

I am a newbie to Java regular expressions. I wrote the following code to validate that the input is a number. If we enter any non-digit input it should return false. For me the code below always returns false. What is wrong here?
package regularexpression;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class NumberValidator {
    private static final String NUMBER_PATTERN = "\\d";
    Pattern pattern;
    public NumberValidator() {
        pattern = Pattern.compile(NUMBER_PATTERN);
    }
    public boolean validate(String line){
        Matcher matcher = pattern.matcher(line);
        return matcher.matches();
    }
    public static void main(String[] args) {
        NumberValidator validator = new NumberValidator();
        boolean validate = validator.validate("123");
        System.out.println("validate:: "+validate);
    }
}

From Java documentation:
The matches method attempts to match the entire input sequence against the pattern.
Your regular expression matches a single digit, not a number. Add + after \\d to match one or more digits:
private static final String NUMBER_PATTERN = "\\d+";
As a side note, you can combine initialization and declaration of pattern, making the constructor unnecessary:
Pattern pattern = Pattern.compile(NUMBER_PATTERN);

matches "returns true if, and only if, the entire region sequence matches this matcher's pattern."
The string is 3 digits, which doesn't match the pattern \d, meaning 'a digit'.
Instead you want the pattern \d+, meaning 'one or more digits.' This is expressed in a string as "\\d+"
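The difference between matches() and find() can be sketched as follows. The class and helper names are hypothetical; only standard java.util.regex calls are used:

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // True only if the entire input fits the pattern.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if any substring of the input fits the pattern.
    static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("\\d", "123"));   // false: the pattern covers one digit only
        System.out.println(partialMatch("\\d", "123")); // true: a digit occurs somewhere
        System.out.println(wholeMatch("\\d+", "123"));  // true
        System.out.println(wholeMatch("\\d+", "12a"));  // false: 'a' is not a digit
    }
}
```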
