Regular expression to match simple "id" values? - java

I need a regex for a line that starts with two letters followed by either 2-4 digits, or 2-4 digits, a "-", and another 2-4 digits.
Examples:
AB125
AC123-25
BT1-2535
Seems simple, but I got stuck with it...

Regular expressions always seem simple, right up to the point where you try to use them :-)
This particular one can be done with something along the lines of:
^[A-Z]{2}([0-9]{2,4}-)?[0-9]{2,4}$
That's:
2 alpha (uppercase) characters.
an optional 2-to-4-digit and hyphen sequence.
a mandatory 2-to-4-digit sequence.
start and end markers.
That last one, BT1-2535, doesn't match your textual specification by the way since it only has one digit before the hyphen. I'm assuming that was a typo. You will also have to change the character bit to use [A-Za-z] if you want to allow lowercase as well.
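As a rough illustration, the pattern above could be applied in Java along these lines. The class name IdMatcher and the helper isValidId are made up for this sketch and are not part of the question.

```java
import java.util.regex.Pattern;

class IdMatcher {
    // Pattern from the answer above: two uppercase letters, an optional
    // "2-4 digits plus hyphen" group, then a mandatory run of 2-4 digits.
    private static final Pattern ID =
            Pattern.compile("^[A-Z]{2}([0-9]{2,4}-)?[0-9]{2,4}$");

    // Hypothetical helper name, used only for this sketch.
    static boolean isValidId(String line) {
        return ID.matcher(line).matches();
    }

    public static void main(String[] args) {
        // The three examples from the question.
        for (String s : new String[] {"AB125", "AC123-25", "BT1-2535"}) {
            System.out.println(s + " -> " + isValidId(s));
        }
    }
}
```

With this pattern the first two examples are accepted and BT1-2535 is rejected, because only one digit precedes the hyphen.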

How about:
^[A-Z]{2}\d{2,4}(?:-\d{2,4})?$
This matches two uppercase letters followed by 2-4 digits, followed by (optionally) a hyphen and another 2-4 digits.

Related

Need regex for a string having characters followed by even number of digits

Can anyone tell me how to write a regex for a string that has one or more alphanumeric characters followed by an even number of digits?
Valid:
a11a1121
bbbb11a1121
Invalid:
a11a1
I have tried ^[a-zA-Z*20-9]*$ but it is always giving true.
Can you please help in this regard?
The regex you mentioned matches any number of characters from the set [a-z, A-Z, *, 2 or 0-9], so any alphanumeric string passes.
You can break your requirement down into groups and then handle each one in turn.
You require at least one letter, so you start with ^([a-zA-Z]+)$
Then you need digits in multiples of two, so it becomes ^([a-zA-Z]+(\d\d)+)$
Now you need any number of repetitions of this combination, so the expression becomes: ^([a-zA-Z]+(\d\d)+)*$
You can use online tools like regex101 to test patterns like this.
You can achieve it with this regexp: ^[a-z0-9]*[a-z]+([0-9]{2})*$
Explanation :
[a-z0-9]*[a-z]+: a string of at least one character that ends with a non-digit character
([0-9]{2})*: an even-length sequence of digits (0 or 2*n digits). If the even sequence cannot be empty, use ([0-9]{2})+ instead.
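A small sketch for comparing the two patterns suggested above against the sample inputs. The class name EvenDigitsCheck and the constant names GROUPED and TRAILING are invented for illustration.

```java
import java.util.regex.Pattern;

class EvenDigitsCheck {
    // First suggestion: repeated groups of letters followed by pairs of digits.
    static final Pattern GROUPED =
            Pattern.compile("^([a-zA-Z]+(\\d\\d)+)*$");

    // Second suggestion: anything alphanumeric ending in a letter,
    // then zero or more pairs of digits.
    static final Pattern TRAILING =
            Pattern.compile("^[a-z0-9]*[a-z]+([0-9]{2})*$");

    public static void main(String[] args) {
        for (String s : new String[] {"a11a1121", "bbbb11a1121", "a11a1", "abc"}) {
            System.out.println(s
                    + " grouped=" + GROUPED.matcher(s).matches()
                    + " trailing=" + TRAILING.matcher(s).matches());
        }
    }
}
```

Both patterns accept the two valid samples and reject a11a1. They differ on input such as "abc": the first requires at least one pair of digits after each run of letters, while the second allows zero trailing digits.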

Regex to replace repeated characters

Can someone give me a Java regex to replace the following.
If I have a word like this "Cooooool", I need to convert this to "Coool" with 3 o's. So that I can distinguish it with the normal word "cool".
Another ex: "happyyyyyy" should be "happyyy"
replaceAll("(.)\\1+","$1")
I tried this but it removes all the repeating characters leaving only one.
Change your regex like below.
string.replaceAll("((.)\\2{2})\\2+","$1");
( starts the first capturing group.
(.) captures any character. For this case, you may use [a-z] instead.
\\2 refers to the second capturing group; \\2{2} requires that character to repeat exactly two more times.
) ends the first capturing group, so it captures the first three repeated characters.
\\2+ matches one or more further repetitions of that character, which the replacement drops.
I think you might want something like this:
str.replaceAll("([a-zA-Z])\\1\\1+", "$1$1$1");
This will match where a character is repeated 3 or more times and will replace it with the same character, three times.
$1 refers to a single character, because the capturing group wraps only the one character to match.
\\1\\1+ matches only if that character occurs at least two more times, i.e. three or more times in a row.
This call is also a lot more readable than a large regex that uses only one $1.
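Both replacements above can be tried side by side with a sketch like the following. The class and method names are hypothetical.

```java
class RepeatTrimmer {
    // Nested-group variant: group 1 holds the first three repeated characters,
    // and any further repetitions are dropped.
    static String trimWithNestedGroup(String s) {
        return s.replaceAll("((.)\\2{2})\\2+", "$1");
    }

    // Single-group variant: a letter repeated three or more times is
    // replaced by exactly three copies of it.
    static String trimWithSingleGroup(String s) {
        return s.replaceAll("([a-zA-Z])\\1\\1+", "$1$1$1");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"Cooooool", "happyyyyyy", "cool"}) {
            System.out.println(s + " -> " + trimWithNestedGroup(s)
                    + " / " + trimWithSingleGroup(s));
        }
    }
}
```

For the words in the question, both variants give "Coool" and "happyyy", and leave "cool" untouched. The first variant also collapses repeated non-letter characters, since (.) matches any character.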

Regex Query in Java Program

^[0-9]\\d*(\\.\\d+)?$
I can't quite work out what the above regex pattern is looking for. I'm tempted to interpret it as "find anything that is not the numbers 0-9 inclusive, then find zero or more occurrences of a single digit, then find zero or one occurrences of a decimal point followed by at least one digit" but I'm not sure.
Part of my confusion stems from the fact that in the SCJP6 certification book, the not operator is included inside the square brackets, whereas here it's outside. Also, I am just generally inexperienced when it comes to regex.
Can someone please help? [This is from a Java program. Is the above in any way Java specific?] Thanks.
^ start of a string
[0-9] a single digit
\\d* any amount of digits (0-infinity)
(\\.\\d+)? Once, or not at all: a dot followed by at least one digit
$ end of string.
You have a somewhat complicated regex that matches an unsigned integer or decimal number.
Have a look at the java.util.regex.Pattern class and the Oracle Java Regex Tutorial.
It is looking for one or more digits, optionally followed by a . and one or more digits. It is confusing because it is needlessly complicated. It is the same as
^\\d+(\\.\\d+)?$
\d is defined as A digit: [0-9]
When the "^" operator is outside of a character class "[]" it denotes the start of input, "$" defines end of input.
So your description is mostly correct, but the first part should be changed to:
find a single digit from zero to nine...
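To check the claim that the two forms behave the same, a quick sketch like this one can run both patterns over a few inputs. The class name DecimalCheck and the sample inputs are assumptions for illustration.

```java
import java.util.regex.Pattern;

class DecimalCheck {
    // The pattern from the question.
    static final Pattern ORIGINAL = Pattern.compile("^[0-9]\\d*(\\.\\d+)?$");

    // The shorter equivalent suggested in the answer.
    static final Pattern SIMPLIFIED = Pattern.compile("^\\d+(\\.\\d+)?$");

    public static void main(String[] args) {
        String[] samples = {"7", "123", "12.5", "12.", ".5", "-3", "abc"};
        for (String s : samples) {
            System.out.println(s
                    + " original=" + ORIGINAL.matcher(s).matches()
                    + " simplified=" + SIMPLIFIED.matcher(s).matches());
        }
    }
}
```

Both patterns accept "7", "123" and "12.5" and reject the rest. Nothing in the pattern syntax is Java specific apart from the doubled backslashes, which are needed because the regex is written inside a Java string literal.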

Reg Expression Validation on a String

Can I use Reg Expression for the following use case?
I Need to write a boolean method which takes a String parameter that should satisfy following conditions.
20 character length string.
First 9 characters will be a number
Next 2 characters will be alphabets
Next 2 characters will be a number.(1 to 31 or 99)
Next 1 character will be an alphabet
Last 6 characters will be a number.
So far, I have written the code for the first requirement:
[a-zA-Z0-9]{20} - This expression works well for the first case. I don't know how to write a complete regular expression that meets the entire requirement.
Please help.
Yes, it is possible to use regexes for this.
Ignore the "20 characters" part and describe a string created by concatenating 9 digits, 2 letters, 2 digits, 1 letter and another 6 digits.
Start with the string start: ^
Then 9 digits. The \d conveniently describes the character set [0-9], so \d{9} means "nine digits"
Then 2 letters. The \w class is too broad, so stick to [a-zA-Z] for a letter.
Then another two digits. They seem to be from a restricted set, so describe the set with alternation and grouping.
Then another letter and another six digits.
And, finally, you have to end at the end of the string: $
For reference, this regex means "the string is nine letters, then 12-15 or 99, then another letter":
^[a-zA-Z]{9}(1[2-5]|99)[a-zA-Z]$
Read the String JavaDocs, especially the part about String.matches() as well as the documentation about regular expressions in Java.
Your first requirement is already implicit in the remaining ones, so I would just skip it. Then, just write the regex code that matches each part one after the other:
[0-9]{9}[a-zA-Z]{2}...
There is one special consideration for the number that might be 1 to 31. While it is possible to match this in one regex, it would be verbose and difficult to understand. Instead, perform basic matching in the regex and extract this part as a capturing group by putting it into parentheses:
([0-9]{2})
If you use Pattern and Matcher to apply your regex, and your string matches the pattern, you can then easily get at just those two characters, use Integer.parseInt() to convert them to an integer (which is completely safe because you know the two characters are digits), and then check the value normally.
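A minimal sketch of that approach, assuming the two-digit field is zero-padded (for example "01" for 1). The class name FieldValidator and the method isValid are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldValidator {
    // Shape check only: 9 digits, 2 letters, 2 digits (captured),
    // 1 letter, 6 digits. The numeric range is checked separately.
    private static final Pattern SHAPE =
            Pattern.compile("^[0-9]{9}[a-zA-Z]{2}([0-9]{2})[a-zA-Z][0-9]{6}$");

    static boolean isValid(String s) {
        if (s == null) {
            return false;
        }
        Matcher m = SHAPE.matcher(s);
        if (!m.matches()) {
            return false;
        }
        // Safe to parse: group 1 is known to be exactly two digits.
        int n = Integer.parseInt(m.group(1));
        return (n >= 1 && n <= 31) || n == 99;
    }

    public static void main(String[] args) {
        System.out.println(isValid("123456789AB15C123456"));
        System.out.println(isValid("123456789AB32C123456"));
    }
}
```

Keeping the range check in plain Java keeps the regex short, and the rule "1 to 31 or 99" stays readable and easy to change.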
This regular expression
^[0-9]{9}[a-zA-Z]{2}(0[1-9]|[12][0-9]|3[01]|99)[a-zA-Z]([0-9]{6})$
takes
9 digits at the start,
followed by 2 letters,
followed by a two-digit number from 01 to 31, or 99,
followed by a letter,
followed by 6 digits.

regex pattern java symbols

I am looking for a regex pattern in Java that corresponds to all characters except the letters a to z.
In other words, I want a regex pattern that corresponds to symbols such as
!"#¤%&/()=?`´\}}][{€$#
Or some way to trim a string into letters only.
As an example, let's consider the following string:
"one!#"¤%()=) two}]}[()\ three[{€$"
to:
"one two three"
The Unicode version would be
\PL
\PL matches all Unicode code points that do not have the property "Letter".
\pL is the counterpart: all Unicode code points that do have the property "Letter".
You may find properties on regular-expressions.info that match your needs better.
You can also combine them in character classes, the same way you would use predefined classes, e.g.
[^\pL\pN]
This would match any character that is neither a letter nor a numeric character in Unicode.
As an example lets consider the following string:
"one!#"¤%()=) two}]}[()\ three[{€$"
to:
"one two three"
The pattern needed is to match everything that is neither a letter nor a separator. Otherwise you would end up with "onetwothree" instead of the "one two three" you asked for.
[^\pL\pZ]
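As a sketch, this character class could be used with String.replaceAll as follows. The class and method names are made up, the properties are written in the braced form \p{L} and \p{Z}, and the two currency symbols in the sample string are written as Unicode escapes only to avoid source-encoding problems.

```java
class LetterFilter {
    // Removes every character that is neither a Unicode letter (\p{L})
    // nor a Unicode separator (\p{Z}), so spaces between words survive.
    static String keepLettersAndSeparators(String s) {
        return s.replaceAll("[^\\p{L}\\p{Z}]", "");
    }

    public static void main(String[] args) {
        // The sample string from the question; \u00A4 and \u20AC are the
        // generic currency sign and the euro sign.
        String input = "one!#\"\u00A4%()=) two}]}[()\\ three[{\u20AC$";
        System.out.println(keepLettersAndSeparators(input));
    }
}
```

For the sample string this prints "one two three".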
[^a-zA-Z] is a character class that matches every character apart from the letters a to z in lower or upper case.
The simplest form: [^a-z]
Could also be [^a-zA-Z] if you want to keep uppercase letters as well.
