simple java regular expression not working - java

I have this simple example of a regular expression. But it is not working. I don't know what I am doing wrong:
String name = "abc";
System.out.println(name.matches("[a-zA-Z]"));
It returns false, but it should be true.

Use:
name.matches("[a-zA-Z]+") // matches one or more letters
or
name.matches("\\w+") // matches one or more word characters (letters, digits, underscore)
By contrast, name.matches("[a-zA-Z]") matches exactly one character.

Add + to your regex to match one or more letters:
String name = "abc";
System.out.println(name.matches("[a-zA-Z]+"));
Your regex [a-zA-Z] matches exactly one letter, not more than one.
[a-zA-Z] matches a single lowercase letter from a-z or a single uppercase letter from A-Z.

This evaluates to false because String.matches() tries to match the entire string (see the documentation of String.matches()) against the pattern [A-Za-z], which matches only a single character. Either use
Pattern.compile("[A-Za-z]").matcher(str).find() to see whether a substring matches (this returns true in this case), or alter the regex to account for multiple characters. The cleanest way of doing so is
Pattern.compile("^[A-Za-z]+$");
The ^ marks "start of string" and $ marks "end of string". + means "previous token at least once".
If you want to allow the empty string as well, use
Pattern.compile("^[A-Za-z]*$");
instead (* means "match the previous token 0 or more times").
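The difference between whole-string matching and substring searching described above can be sketched as follows. The class and method names (MatchesVersusFind, isAllLetters, containsLetter) are made up for illustration and do not come from the original answers.

```java
import java.util.regex.Pattern;

final class MatchesVersusFind {
    // A single letter, with no quantifier and no anchors.
    private static final Pattern ONE_LETTER = Pattern.compile("[A-Za-z]");

    // True only when the whole input consists of one or more ASCII letters.
    static boolean isAllLetters(String input) {
        return input.matches("[A-Za-z]+");
    }

    // True when at least one ASCII letter occurs anywhere in the input.
    static boolean containsLetter(String input) {
        return ONE_LETTER.matcher(input).find();
    }

    public static void main(String[] args) {
        String name = "abc";
        System.out.println(name.matches("[a-zA-Z]")); // false: three characters, pattern allows one
        System.out.println(isAllLetters(name));       // true
        System.out.println(containsLetter(name));     // true
        System.out.println("".matches("[A-Za-z]*"));  // true: * also accepts the empty string
    }
}
```

Because String.matches() already requires the whole input to match, the ^ and $ anchors are optional there; they matter once you switch to find().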

Try with [a-zA-Z]+
[a-zA-Z] indicates:

Related

Java Regexp to match words only (', -, space)

What is the Java regular expression that matches all words containing only:
Letters from a to z and A to Z
The apostrophe ('), hyphen (-), and space characters, which must not appear at the beginning or the end.
Examples
test'test match
test' doesn't match
'test doesn't match
-test doesn't match
test- doesn't match
test-test match
You can use the following pattern: ^(?!-|'|\\s)[a-zA-Z]*(?!-|'|\\s)$
Below are the examples:
String s1 = "abc";
String s2 = " abc";
String s3 = "abc ";
System.out.println(s1.matches("^(?!-|'|\\s)[a-zA-Z]*(?!-|'|\\s)$"));
System.out.println(s2.matches("^(?!-|'|\\s)[a-zA-Z]*(?!-|'|\\s)$"));
System.out.println(s3.matches("^(?!-|'|\\s)[a-zA-Z]*(?!-|'|\\s)$"));
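Note that the pattern above only admits letters between the two lookaheads, so it would also reject test'test and test-test from the question. One possible alternative, offered as a sketch only, is to require a letter at each end and allow letters, apostrophes, spaces, and hyphens in between. The class name and the constant WORD are hypothetical, and treating "space" as the literal space character is an assumption.

```java
final class WordPatternSketch {
    // Hypothetical pattern: a letter, optionally followed by any run of
    // letters, apostrophes, spaces or hyphens that itself ends in a letter.
    static final String WORD = "[a-zA-Z](?:[a-zA-Z' -]*[a-zA-Z])?";

    // Whole-string check, so no ^ or $ anchors are needed with matches().
    static boolean isWord(String s) {
        return s.matches(WORD);
    }

    public static void main(String[] args) {
        String[] samples = {"test'test", "test'", "'test", "-test", "test-", "test-test"};
        for (String s : samples) {
            System.out.println(s + " -> " + isWord(s));
        }
    }
}
```

This version tolerates consecutive separators such as test--test; if that is unwanted, the middle part would need to be tightened.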
If you mean the literal space character, add it to the class: [a-zA-Z ]
The class then accepts lowercase a-z, uppercase A-Z, and the space character. Any other character makes the test fail.
Here's my solution:
/(\w{2,}(-|'|\s)\w{2,})/g
You can take it for a spin on Regexr.
It is first checking for a word with \w, then any of the three qualifiers with "or" logic using |, and then another word. The brackets {} are making sure the words on either end are at least 2 characters long so contractions like don't aren't captured. You could set that to any value to prevent longer words from being captured or omit them entirely.
Caveat: \w also matches digits and the underscore (_). If you don't want that, you could replace it with [a-zA-Z] like so:
/([a-zA-Z]{2,}(-|'|\s)[a-zA-Z]{2,})/g
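The /.../g literal above is JavaScript-style syntax. A rough Java equivalent, sketched under the assumption that you want every occurrence in a longer text, uses Matcher.find() in a loop. The names JoinedWordFinder and findJoined are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JoinedWordFinder {
    // Two runs of at least two letters joined by a hyphen, apostrophe or whitespace.
    private static final Pattern JOINED =
            Pattern.compile("[a-zA-Z]{2,}[-'\\s][a-zA-Z]{2,}");

    // Collects every non-overlapping occurrence, like the g flag does.
    static List<String> findJoined(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = JOINED.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(findJoined("a well-known fix")); // [well-known]
        System.out.println(findJoined("don't"));            // []
    }
}
```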

Java string validation

I'm looking for a regular expression that adheres to the rules below.
Allowed Characters
Alphabet : a-z A-Z
Numbers : 0-9
I am using [^a-zA-Z0-9], but when I call
String regex = "[^a-zA-Z0-9]";
String key = "message";
if (!key.matches(regex))
message = "Invalid key";
the system shows "Invalid key", although the key should be valid. Could you please help me?
If you want to allow these characters [a-zA-Z0-9] you should not use ^ since it negates what is inside the [].
This expression [^a-zA-Z0-9] means anything that is not a-z A-Z or numbers : 0-9.
You may have seen the ^ used outside the [] at the beginning of a regular expression to indicate the beginning of the string, as in ^[a-zA-Z0-9].
The below regex would allow one or more alphanumeric characters,
^[A-Za-z0-9]+$
Your regex [^a-zA-Z0-9] matches a single character that is not alphanumeric. [^..] is called a negated character class; it matches any character other than those listed inside it.
You don't need to give start or end anchors in the regex when it is passed to matches method. So [A-Za-z0-9]+ would be enough.
Explanation:
^ Anchor which denotes the start.
[A-Za-z0-9]+ , + repeats the preceding token [A-Za-z0-9] one or more times.
$ End of the line.
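As a minimal sketch of the whitelist approach explained above, the check can be wrapped in a small helper. KeyValidator and isValidKey are hypothetical names, and rejecting null is an added assumption rather than something the question asked for.

```java
final class KeyValidator {
    // Valid when the whole key is one or more ASCII letters or digits.
    // matches() is implicitly anchored, so ^ and $ are not required.
    static boolean isValidKey(String key) {
        return key != null && key.matches("[A-Za-z0-9]+");
    }

    public static void main(String[] args) {
        System.out.println(isValidKey("message")); // true
        System.out.println(isValidKey("mes$age")); // false
        System.out.println(isValidKey(""));        // false: + needs at least one character
    }
}
```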
I think you just have to remove the not-operator. Here is the same example, only the variable is renamed:
String invalidChars = "[^a-zA-Z0-9]";
String key = "message";
if (key.matches(invalidChars)) {
message = "Invalid key";
}
(However, the negated logic is not very readable.)
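One caveat with the negated variant: because matches() tests the whole string, key.matches("[^a-zA-Z0-9]") is only true for a one-character key, so a key such as mes$age would not be flagged. A sketch that searches for an invalid character with find() instead is shown below; the names InvalidCharCheck and hasInvalidChar are invented for this example.

```java
import java.util.regex.Pattern;

final class InvalidCharCheck {
    // Any single character that is not an ASCII letter or digit.
    private static final Pattern INVALID = Pattern.compile("[^a-zA-Z0-9]");

    // True when at least one disallowed character occurs anywhere in the key.
    static boolean hasInvalidChar(String key) {
        return INVALID.matcher(key).find();
    }

    public static void main(String[] args) {
        System.out.println(hasInvalidChar("message"));            // false
        System.out.println(hasInvalidChar("mes$age"));            // true
        System.out.println("mes$age".matches("[^a-zA-Z0-9]"));    // false: whole string is not one character
    }
}
```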
Try the alphanumeric regex below:
"^[a-zA-Z0-9]+$"
^ - Start of string
[a-zA-Z0-9]+ - one or more of the allowed characters (without the +, only a single character is accepted)
$ - End of string
For validation, use the \A and \z anchors instead of ^ and $:
\\A[a-zA-Z0-9]+\\z
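The reason for preferring \z, as far as I understand Java's behavior, is that $ may also match just before a final line terminator, while \z only matches at the very end of the input. The difference shows up when searching with find(); the class and method names below are hypothetical.

```java
import java.util.regex.Pattern;

final class AnchorDemo {
    private static final Pattern DOLLAR = Pattern.compile("^[a-zA-Z0-9]+$");
    private static final Pattern STRICT = Pattern.compile("\\A[a-zA-Z0-9]+\\z");

    // Uses ^ and $; tolerates one trailing line terminator.
    static boolean looseFind(String s) {
        return DOLLAR.matcher(s).find();
    }

    // Uses \A and \z; the input must end right after the last allowed character.
    static boolean strictFind(String s) {
        return STRICT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(looseFind("abc\n"));  // true
        System.out.println(strictFind("abc\n")); // false
        System.out.println(strictFind("abc"));   // true
    }
}
```

With String.matches() the trailing newline is rejected either way, because the whole input has to be consumed by the pattern.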

Regular Expression for a string that contains one or more letters somewhere in it

What would be a regular expression that evaluates to true if the string has one or more letters anywhere in it?
For example:
1222a3999 would be true
a222aZaa would be true
aaaAaaaa would be true
but:
1111112())-- would be false
I tried ^[a-zA-Z]+$ and [a-zA-Z]+, but neither works when there are any numbers or other characters in the string.
.*[a-zA-Z].*
The above means one letter, and before/after it - anything is fine.
In java:
String regex = ".*[a-zA-Z].*";
System.out.println("1222a3999".matches(regex));
System.out.println("a222aZaa ".matches(regex));
System.out.println("aaaAaaaa ".matches(regex));
System.out.println("1111112())-- ".matches(regex));
Will provide:
true
true
true
false
as expected
^.*[a-zA-Z].*$
Depending on the implementation, match() functions check if the entire string matches (which is probably why your [a-zA-Z] or [a-zA-Z]+ patterns didn't work).
Either use match() with the above pattern or use some sort of search() method instead.
This regexp should do it:
[a-zA-Z]
It matches as long as there's a single letter anywhere in the string, it doesn't care about any of the other characters.
[a-zA-Z]+
should have worked as well, I don't know why it didn't for you.
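A likely reason the short patterns fail in Java is that String.matches() requires the whole string to match, whereas a search with Matcher.find() does not. The sketch below compares both routes; LetterAnywhere, viaMatches, and viaFind are illustrative names. It also adds (?s) so that . can cross line breaks, which is an extra precaution not mentioned in the answers.

```java
import java.util.regex.Pattern;

final class LetterAnywhere {
    private static final Pattern LETTER = Pattern.compile("[a-zA-Z]");

    // Whole-string match; (?s) lets . match line terminators as well.
    static boolean viaMatches(String s) {
        return s.matches("(?s).*[a-zA-Z].*");
    }

    // Substring search; succeeds as soon as one letter is found.
    static boolean viaFind(String s) {
        return LETTER.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {"1222a3999", "a222aZaa", "aaaAaaaa", "1111112())--"};
        for (String s : samples) {
            System.out.println(s + " -> " + viaMatches(s) + " / " + viaFind(s));
        }
        System.out.println("1222a3999".matches("[a-zA-Z]+")); // false: digits are not covered
    }
}
```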
.*[a-zA-Z]+.*
Should get you the result you want.
The period matches any character except a newline, and the asterisk says it may occur zero or more times. The pattern [a-zA-Z]+ then requires at least one character from the brackets, because of the plus sign. (A question mark would not work here: [a-zA-Z]? means zero or one letter, so the whole expression would match any string, including one without letters.) Finally, the ending .* says that the letters can be followed by zero or more characters of any type.

Java regex match all characters except

What is the correct syntax for matching all characters except specific ones.
For example I'd like to match everything but letters [A-Z] [a-z] and numbers [0-9].
I have
string.matches("[^[A-Z][a-z][0-9]]")
Is this incorrect?
Yes, you don't need nested [] like that. Use this instead:
"[^A-Za-z0-9]"
It's all one character class.
If you want to match anything but letters, you should have a look into Unicode properties.
\p{L} is any kind of letter from any language
Using an uppercase "P" instead it is the negation, so \P{L} would match anything that is not a letter.
\d matches the ASCII digits 0-9 (the default in Java), and \p{Nd} matches any Unicode decimal digit
So your expression in modern Unicode style would look like this
Either using a negated character class
[^\p{L}\p{Nd}]
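or an intersection of negated properties (a plain union [\P{L}\P{Nd}] would match every character, since no character is both a letter and a digit)
[\P{L}&&\P{Nd}]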
The next thing is, matches() matches the expression against the complete string, so your expression is only true with exactly one char in the string. So you would need to add a quantifier:
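string.matches("[^\\p{L}\\p{Nd}]+")
returns true when the complete string consists only of non-alphanumeric characters and contains at least one of them. (In Java source code, each backslash in the pattern has to be doubled.)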
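To make the difference between the ASCII class and the Unicode properties concrete, here is a small sketch. The class and method names are hypothetical; the escape \u00e9 stands for the letter e with an acute accent.

```java
final class NonAlphanumericCheck {
    // ASCII view: anything outside A-Z, a-z and 0-9 counts as non-alphanumeric.
    static boolean onlyNonAlnumAscii(String s) {
        return s.matches("[^A-Za-z0-9]+");
    }

    // Unicode view: letters and decimal digits from any script are excluded.
    static boolean onlyNonAlnumUnicode(String s) {
        return s.matches("[^\\p{L}\\p{Nd}]+");
    }

    public static void main(String[] args) {
        System.out.println(onlyNonAlnumAscii("!?-"));        // true
        System.out.println(onlyNonAlnumUnicode("!?-"));      // true
        System.out.println(onlyNonAlnumAscii("\u00e9"));     // true: not an ASCII letter
        System.out.println(onlyNonAlnumUnicode("\u00e9"));   // false: it is a Unicode letter
    }
}
```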
Almost right. What you want is:
string.matches("[^A-Za-z0-9]")
Here's a good tutorial
string.matches("[^A-Za-z0-9]")
Let's say that you want to make sure that no strings have the _ symbol in them; then you would simply use something like this.
Pattern pattern = Pattern.compile("_");
// matcher() is an instance method, so call it on the compiled pattern
Matcher matcher = pattern.matcher(stringName);
if (!matcher.find()) {
    System.out.println("Valid String");
} else {
    System.out.println("Invalid String");
}
You can negate character classes:
"[^abc]" // matches any character except a, b, or c (negation).
"[^a-zA-Z0-9]" // matches non-alphanumeric characters.

What is the responsibility of (.*) in the Java String?

What is the responsibility of (.*) in the third line and how it works?
String Str = new String("Welcome to Tutorialspoint.com");
System.out.print("Return Value :" );
System.out.println(Str.matches("(.*)Tutorials(.*)"));
.matches() tests whether the whole of Str matches the regex provided.
Regex, or regular expressions, are a way of describing patterns in strings. In the example provided, the pattern matches any string which contains the word "Tutorials". (.*) simply means "a group of zero or more of any character".
This page is a good regex reference (for very basic syntax and examples).
Your expression matches any string in which the word Tutorials is preceded and followed by any characters. .* means any character occurring any number of times, including zero times.
The . represents regular expression meta-character which means any character.
The * is a regular expression quantifier, which means 0 or more occurrences of the expression character it was associated with.
matches takes regular expression string as parameter and (.*) means capture any character zero or more times greedily
.* means a group of zero or more of any character
In regex:
. (wildcard) matches any single character except \n. For example, the pattern a.e matches ave in nave and ate in water.
* matches the previous element zero or more times. For example, the pattern \d*\.\d matches .0, 19.9, 219.9.
There is no reason to put parentheses around the .*, nor is there a reason to instantiate a String if you've already got a literal String. But worse is the fact that the matches() method is out of place here.
What it does is greedily match any characters from the start to the end of the String. Then it backtracks until it finds "Tutorials", after which it again matches any characters (except newlines).
It's better and more clear to use the find method. The find method simply finds the first "Tutorials" within the String, and you can remove the "(.*)" parts from the pattern.
As a one liner for convenience:
System.out.printf("Return value : %b%n", Pattern.compile("Tutorials").matcher("Welcome to Tutorialspoint.com").find());
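The parentheses only become useful if you want to read back what each .* matched. A minimal sketch of that, with the hypothetical names GroupDemo and splitAround, could look like this:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupDemo {
    private static final Pattern AROUND = Pattern.compile("(.*)Tutorials(.*)");

    // Returns {text before, text after} "Tutorials", or null when the
    // whole string does not match the pattern.
    static String[] splitAround(String s) {
        Matcher m = AROUND.matcher(s);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = splitAround("Welcome to Tutorialspoint.com");
        System.out.println("[" + parts[0] + "]"); // [Welcome to ]
        System.out.println("[" + parts[1] + "]"); // [point.com]
    }
}
```

If the captured text is not needed, a plain search for "Tutorials" with find(), as in the one-liner above, does the same job with less backtracking.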
