Java REGEX allowing excluded characters [duplicate] - java

I have the REGEX below, which I expect to exclude certain characters. These characters are correctly excluded: £"~#¬|{} but these aren't: @[]/?;:
So, for example, test£test is correctly identified as invalid, but test@test is incorrectly identified as valid.
Testing this on https://regex101.com/ identifies the problem as the brackets and indicates that I need to escape the first ( [bracket] and the - [hyphen] like this - ^[a-zA-z0-9!$%^&*\()\-_=+]+?$. On https://regex101.com the expression then behaves as expected but if I try to use escape characters like this in Java the compiler gives an error.
Any ideas how I can get this regular expression to behave as I want? Sorry if this is obvious.
final String REGEX = "^[a-zA-z0-9!$%^&*()-_=+]+?$";
System.out.println ("Please enter a password");
String password = input.next();
Pattern p = Pattern.compile(REGEX);
Matcher m = p.matcher(password);
if (!m.matches()) {
    System.out.println("Illegal characters");
}

Brief
^[a-zA-z0-9!$%^&*()-_=+]+?$
     ^^^          ^^^
The first underlined range is A-z. This matches:
ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
The second underlined range corresponds to
)*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
Code
Note that the demo regex is only the character set from the first example below. This is to show which characters it is actually matching.
Use any of the following:
^[a-zA-Z0-9!$%^&*()\-_=+]+?$
^[a-zA-Z0-9!$%^&*()_=+-]+?$
^[\w!$%^&*()=+-]+?$
^[\w!$%&^(-+=-]+?$
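To see the effect of the two accidental ranges, the original class can be compared with the first corrected one. This is a minimal sketch: the class name PasswordCharsetDemo and its method names are made up for illustration and are not part of the question's code.

```java
import java.util.regex.Pattern;

final class PasswordCharsetDemo {
    // Pattern from the question: A-z and )-_ are both unintended ranges.
    static final Pattern ORIGINAL = Pattern.compile("^[a-zA-z0-9!$%^&*()-_=+]+?$");
    // Corrected pattern: A-Z instead of A-z, and the hyphen escaped so it is a literal.
    // The regex \- is written as \\- inside a Java string literal.
    static final Pattern FIXED = Pattern.compile("^[a-zA-Z0-9!$%^&*()\\-_=+]+?$");

    static boolean acceptedByOriginal(String s) {
        return ORIGINAL.matcher(s).matches();
    }

    static boolean acceptedByFixed(String s) {
        return FIXED.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"test@test", "test[test", "test-test", "test~test"};
        for (String s : samples) {
            System.out.println(s + " original=" + acceptedByOriginal(s)
                    + " fixed=" + acceptedByFixed(s));
        }
    }
}
```

With the corrected pattern, @ and [ are rejected while a literal hyphen is still accepted.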

The issue with your regex is that it contains special characters which require escaping.
All the metacharacters listed in the Pattern docs will require escaping if they are valid in your password.
Therefore you should use a regex something like the following. I have not thoroughly tested this, however, so please write some thorough unit tests to cover all legitimate possibilities.
"^[a-zA-Z0-9!\\$%\\^&\\*\\(\\)\\-_\\=\\+]+?$"

Sorry - I now realise that I have a number of meta-characters which all need escaping. The following REGEX behaves as expected, with double backslashes to escape each meta character:
final String REGEX = "^[a-zA-Z0-9\\!\\$%\\^&*\\(\\)\\-_\\=\\+]+?$";
If there is a more elegant way I'd love to hear it!

Related

Regex for validating a name of this particular format [duplicate]

I'm looking for a solution in regex. I have a full name for which the allowed characters are a-z, A-Z, space and /,-.#'
Also it should not start with a blank character/space.
So basically the following names are accepted.
Anjith Sasindran
Anjith# Sasindran'
Anjith
Anjith/,Sasi#n'ran
An-. Sasindr#
And the following are not
An%jith Sasindran
Anjit*) Sasindran
Basically anything other than the ones I listed above.
I'm not sure how to do this. I have very little knowledge of regex, btw, so any help would be appreciated.
Use:
^[^\s][ A-Za-z-'#.,/]*$
^[^\s] checks that there is no whitespace at the beginning (note that it accepts any other character in that position, including ones outside the allowed set)
A-Za-z accepts all letters in the given string
-'#.,/ accepts only these special characters
* matches zero or more of the preceding token
With regular expressions, you can define classes of characters like the following:
[\sa-zA-Z#\.,\-'/]
With that, you define one "allowed" character. You can specify a sequence of it by adding plus operator : [\sa-zA-Z#\.,\-'/]+.
Now, to avoid texts starting with blank character, you can create a character class with all wanted characters except spaces [a-zA-Z#\.,\-'/].
So, now, the final regex is "any of the wanted characters except a blankspace, followed by any of wanted characters a certain number of times" : [a-zA-Z#\.,\-'/][\sa-zA-Z#\.,\-'/]*
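With Java, you must use the Pattern class to apply regex checks on a string. Use asMatchPredicate() (Java 11+) so that the whole string has to match; asPredicate() only checks whether a matching substring can be found:
var nameMatcher = Pattern.compile("[a-zA-Z#\\.,\\-'/][\\sa-zA-Z#\\.,\\-'/]*").asMatchPredicate();
if (nameMatcher.test("Anjhit#")) System.out.println("Name match !");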
You can get a lot of information with Oracle documentation.
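The "first character from the allowed set minus the space, then any allowed characters" idea can be checked against the names listed in the question. This is a minimal sketch: NameValidatorDemo and isValidName are hypothetical names, only a plain space (not every whitespace character) is allowed after the first character, and the hyphen is placed last in each class so that it is read as a literal.

```java
import java.util.regex.Pattern;

final class NameValidatorDemo {
    // First character: any allowed character except a space.
    // Remaining characters: any allowed character, including a plain space.
    static final Pattern NAME = Pattern.compile("[A-Za-z#.,'/-][A-Za-z #.,'/-]*");

    static boolean isValidName(String name) {
        // matches() requires the whole input to fit the pattern.
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Anjith Sasindran", "Anjith# Sasindran'", "Anjith",
            "Anjith/,Sasi#n'ran", "An-. Sasindr#",
            "An%jith Sasindran", "Anjit*) Sasindran", " Anjith"
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" valid=" + isValidName(s));
        }
    }
}
```

Because matches() is used, no ^ or $ anchors are needed in the pattern itself.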

How to capture string with java regular expression if I do not want the first 10 characters to be "0"

I have 2 strings "0000000000ABCDEF" and "1234567890ABCDEF" and I'm trying to find out how to capture "1234567890ABCDEF" using regular expression which has a rule that the first 10 characters must not be all zeroes "0".
Edit:
Thanks for all the useful comments so far.
My apologies if there is any confusion, by capture I mean to match a regular expression with "1234567890ABCDEF". And the same regular expression should not match "0000000000ABCDEF", therefore I felt that the design I'm trying to come up with should contain a rule that checks:
1) the first 10 characters cannot be all zeroes
I tried something like this (?!0{10}).* but it still matches "0000000000ABCDEF".
I guess I'll read up more on regular expressions.
You should just be able to use a negative look behind like this:
(?<!0{10})ABCDEF
Here is a regex101 for you to see it working: https://regex101.com/r/l7pX8c/1
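A more general alternative is to keep the negative lookahead from the question but anchor it to the start of the input. An unanchored search with (?!0{10}).* can still succeed by starting at the second character, where only nine zeroes follow, which is probably why the attempt appeared to match. The sketch below is one possible approach, not the only one; LeadingZerosDemo and accepts are hypothetical names.

```java
import java.util.regex.Pattern;

final class LeadingZerosDemo {
    // ^(?!0{10}) fails when the input starts with ten '0' characters;
    // .*$ then accepts the rest of the line.
    static final Pattern NOT_TEN_ZEROS = Pattern.compile("^(?!0{10}).*$");

    static boolean accepts(String s) {
        return NOT_TEN_ZEROS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("0000000000ABCDEF"));
        System.out.println(accepts("1234567890ABCDEF"));
    }
}
```

This does not depend on what follows the first ten characters, so the suffix does not have to be spelled out in the pattern.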

Replace the $ symbol in String [duplicate]

How to replace all "$$$" present in a String?
I tried
story.replaceAll("$$$","\n")
This displays a warning: Anchor $ in unexpected position and the code fails to work. The code takes the "$" symbol as an anchor for a regular expression. I just need to replace that symbol.
Is there any way to do this?
"$" is a special character for regular expressions.
Try the following:
System.out.println(story.replaceAll("\\$\\$\\$", "\n"));
We are escaping the "$" character with a '\' in the above code.
There are several ways you can do this. It depends on what you want to do, and how elegant your solution is:
String replacement = "\n"; // The replacement string
// The first way:
story.replaceAll("[$]{3}", replacement);
// Second way:
story.replaceAll("\\${3}", replacement);
// Third way:
story.replaceAll("\\$\\$\\$", replacement);
You can replace any special characters (Regular Expression-wise) by escaping that character with a backslash. Since Java-literals use the backslash as escaping-character too, you need to escape the backslash itself.
story.replaceAll("\\${3}", something);
By putting {3} after the $, you say that it should be found exactly three times. This looks a bit more elegant than "\\$\\$\\$".
something is your replacement, for example "" or "\n", depending on what you want.
This will work:
story.replaceAll("\\$\\$\\$","\n")
You can do this for any special character.
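If no regex features are needed at all, the escaping can be avoided entirely. The sketch below shows two options using standard library calls: String.replace, which treats its first argument as literal text, and Pattern.quote, which turns a literal into a regex-safe pattern. The class and method names are invented for illustration.

```java
import java.util.regex.Pattern;

final class DollarReplaceDemo {
    // String.replace does a literal (non-regex) replacement of every occurrence.
    static String viaLiteralReplace(String story) {
        return story.replace("$$$", "\n");
    }

    // Pattern.quote wraps the text so replaceAll treats it literally.
    static String viaQuotedRegex(String story) {
        return story.replaceAll(Pattern.quote("$$$"), "\n");
    }

    public static void main(String[] args) {
        String story = "first line$$$second line$$$third line";
        System.out.println(viaLiteralReplace(story));
        System.out.println(viaQuotedRegex(story));
    }
}
```

Strings are immutable in Java, so in both cases the result has to be assigned or used; calling the method on story alone does not change it.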

Regular expression not working despite testing

I'm trying to validate an ID whose first two characters are letters and whose next four are digits. The digits may contain zeroes (e.g. 0333) but can never all be zero, so something like ID0000 is not allowed. The expression I came up with seems to check out when testing it online but doesn't seem to work when I try to enforce it in the program:
\b(?![A-Z]{2}[0]{4})[A-Z]{2}[0-9]{4}\b
and here's the code I'm currently using to implement it:
String pattern = "/\b(?![A-Z]{2}[0]{4})[A-Z]{2}[0-9]{4}\b/";
Pattern regEx = Pattern.compile(pattern);
String ingID = ingredID.getText().toString();
Matcher m = regEx.matcher(ingID);
if (m.matches()) {
ingredID.setError("Please enter a valid Ingrediant ID");
}
For some reason it doesn't seem to validate correctly: it accepts IDs like ID0000 when it shouldn't. Any thoughts, folks?
Change your regex pattern to "\\b(?![A-Z]{2}[0]{4})[A-Z]{2}[0-9]{4}\\b"
Your problem is essentially that Java isn't all that Regex-friendly; you need to deal with the limitations of Java strings in order to create a string that can be used as a Regex pattern. Since \ is the escape character in Regex and the escape character in Java strings (and since there's no such thing as a raw string literal in Java), you must double-escape anything that must be escaped in the Regex in order to create a literal \ character within the Java string, which, when parsed as a Regex pattern, will be correctly treated as the escape character.
So, for instance, the Regex pattern /\b/ (where /, as mentioned in my comment, delimits the pattern itself) would be represented in Java as the string "\\b".
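Putting this together, a validation helper might look like the sketch below. It assumes the ID is the whole input, so matches() is used and the \b word boundaries and / delimiters are dropped. It also assumes the error should be shown when the ID does not match; the snippet in the question appears to set it when the ID does match. IngredientIdDemo and isValidId are hypothetical names.

```java
import java.util.regex.Pattern;

final class IngredientIdDemo {
    // Two uppercase letters followed by four digits, where the negative
    // lookahead rules out the case of four zeroes.
    static final Pattern ID = Pattern.compile("(?![A-Z]{2}0{4})[A-Z]{2}[0-9]{4}");

    static boolean isValidId(String id) {
        return ID.matcher(id).matches();
    }

    public static void main(String[] args) {
        for (String id : new String[] {"ID0333", "ID0000", "ID12345", "id0333"}) {
            // Report an error only when the ID is NOT valid.
            System.out.println(id + (isValidId(id) ? " ok" : " rejected"));
        }
    }
}
```

This pattern contains no backslashes, so nothing needs double-escaping in the Java string.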

How do I write a regular expression to find the following pattern?

I am trying to write a regular expression to do a find and replace operation. Assume Java regex syntax. Below are examples of what I am trying to find:
12341+1
12241+1R1
100001+1R2
So, I am searching for a string beginning with one or more digits, followed by a "1+1" substring, followed by 0 or more characters. I have the following regex:
^(\d+)(1\\+1).*
This regex will successfully find the examples above, however, my goal is to replace the strings with everything before "1+1". So, 12341+1 would become 1234, and 12241+1R1 would become 1224. If I use the first grouped expression $1 to replace the pattern, I get the wrong result as follows:
12341+1 becomes 12341
12241+1R1 becomes 12241
100001+1R2 becomes 100001
Any ideas?
Your existing regex works fine, just that you are missing a \ before \d
String str = "100001+1R2";
str = str.replaceAll("^(\\d+)(1\\+1).*","$1");
IMHO, the regex is correct.
Perhaps you wrote it wrong in the code. If you want to code the regex ^(\d+)(1\+1).* in a string, you have to write something like String regex = "^(\\d+)(1\\+1).*".
Your output is the result of ^(\d+)(1+1).* replacement, as you miss some backslash in the string (e.g. "^(\\d+)(1\+1).*").
Your regex looks fine to me - I don't have access to java but in JavaScript the code..
"12341+1".replace(/(\d+)(1\+1)/g, "$1");
Returns 1234 as you'd expect. This works on a string with many 'codes' in too e.g.
"12341+1 54321+1".replace(/(\d+)(1\+1)/g, "$1");
gives 1234 5432.
Personally, I wouldn't use a Regex at all (it'd be like using a hammer on a thumbtack), I'd just create a substring from (Pseudocode)
stringName.substring(0, stringName.indexOf("1+1"))
But it looks like other posters have already mentioned the non-greedy operator.
In most regex syntaxes you can add a '?' after a '+' or '*' to indicate that you want it to match as little as possible before moving on in the pattern. (Thus ^(\d+?)(1\+1) captures as few digits as possible before the literal "1+1", so the 1 that begins "1+1" is left out of the first group.)
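Both approaches from the answers above, the correctly escaped regex and the plain substring, can be sketched in Java as follows. The class and method names are invented for illustration, and the substring variant assumes that a string without "1+1" should be returned unchanged.

```java
final class OnePlusOneDemo {
    // In a Java string literal, the regex \d is written \\d and \+ is written \\+.
    static String viaRegex(String s) {
        return s.replaceAll("^(\\d+)1\\+1.*", "$1");
    }

    // No regex: keep everything before the first literal "1+1".
    static String viaIndexOf(String s) {
        int i = s.indexOf("1+1");
        return i < 0 ? s : s.substring(0, i);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"12341+1", "12241+1R1", "100001+1R2"}) {
            System.out.println(s + " -> " + viaRegex(s) + " / " + viaIndexOf(s));
        }
    }
}
```

The greedy \d+ initially consumes the trailing 1 as well, but the engine backtracks by one digit so that the literal "1+1" can match, which leaves only the leading digits in group 1.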
