Java Regex of String start with number and fixed length

Java Regex of String start with number and fixed length - java

I made a regular expression for checking the length of String , all characters are numbers and start with number e.g 123
Following is my expression
REGEX =^123\\d+{9}$";
But it was unable to check the length of String. It validates those strings only their length is 9 and start with 123.
But if I pass the String 1234567891 it also validates it. But how should I do it which thing is wrong on my side.

Like already answered here, the simplest way is just removing the +:
^123\\d{9}$
or
^123\\d{6}$
Depending on what you need exactly.
You can also use another, a bit more complicated and generic approach, a negative lookahead:
(?!.{10,})^123\\d+$
Explanation:
This: (?!.{10,}) is a negative look-ahead (?= would be a positive look-ahead), it means that if the expression after the look-ahead matches this pattern, then the overall string doesn't match. Roughly it means: The criteria for this regular expression is only met if the pattern in the negative look-ahead doesn't match.
In this case, the string matches only if .{10} doesn't match, which means 10 or more characters, so it only matches if the pattern in front matches up to 9 characters.
A positive look-ahead does the opposite, only matching if the criteria in the look-ahead also matches.
Just putting this here for curiosity sake, it's more complex than what you need for this.

Try using this one:
^123\\d{6}$
I changed it to 6 because 1, 2, and 3 should probably still count as digits.
Also, I removed the +. With it, it would match 1 or more \ds (therefore an infinite amount of digits).

Based on your comment below Doorknobs's answer you can do this:
int length = 9;
String prefix = "123"; // or whatever
String regex = "^" + prefix + "\\d{ " + (length - prefix.length()) + "}$";
if (input.matches(regex)) {
// good
} else {
// bad
}

Related

Please justify the output in Regex Java program

I have came across one Java program in Regex .
Below is the program code :
import java.util.regex.*;
public class Regex_demo01 {
public static void main(String[] args) {
boolean b=true;
Pattern p=Pattern.compile("\\d*");
Matcher m=p.matcher("ab34ef");
while(b=m.find())
{
System.out.println(b);
System.out.println(">"+m.start()+"\t"+m.group()+"<");
}
}
}
Output :
true
>0 <
true
>1 <
true
>2 34<
true
>4 <
true
>5 <
true
>6 <
Doubt : As we all know that The find() method returns true if it gets a match and remembers the start position of the match. If find() returns true, you can call the start() method to get the starting position of the match, and you can call the group() method to get the string that represents the actual bit of source data that was matched.
My question is how come ">6 <" is present is the output when the string indexing is till index 5 ?

Anser is simple. x* matche any count of x even 0.
Replace * to + which matche to 1 or more element that is left to it.

My question is how come >6 < is present is the output when the string indexing is till index 5 ?
That behavior is due to your regex i.e. \\d* which matches 0 or more digits.
As you can see it is showing start position 0 as well when there is no digit at the start.
Similarly 6 is last index +1 because there is an empty match past the last character as well.
You should use \\d+ as your regex.

The star quantifier (*) is defined as "zero or more times". That said, your pattern matches zero digits most of the time.
What you actually want is probably the plus quantifier (+), which means "one or more times".
Source: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
Why is there a match at index 6?
RegEx doesn't work on a char-basis, but rather inbetween single chars. When matching an empty string, it will look before and after every character. Duplicate findings are omitted, of course, so an empty string after the first char and before the second char will yield one match instead of two. By default the algorithm is greedy, which means it will match as many characters as possible.
Consider this example:
Input string is 1
RegEx is \\d*
In this case the RegEx engine starts before the first character and tries to match zero, one or more digits. Since it's greedy, it doesn't stop after the empty string it finds at the beginning. It finds a '1' with no digits following. This is the first match. Then it continues the search after the match. It finds an empty string and matches it too, since that equals zero digits.
For RegEx the string '1' looks rather like this:
"" + "1" + ""
The first two units (empty string and the "1") match the pattern, the third, empty string does, too.
In-depth article about this: http://www.regular-expressions.info/zerolength.html

Java regular expression for UserName with length range

I am writing a regular expression to validate UserName.
Here is the rule:
Length: 6 - 20 characters
Must start with letter a-zA-Z
Can contains a-zA-Z0-9 and dot(.)
Can't have 2 consecutive dots
Here is what I tried:
public class TestUserName {
private static String USERNAME_PATTERN = "[a-z](\\.?[a-z\\d]+)+";
private static Pattern pattern = Pattern.compile(USERNAME_PATTERN, CASE_INSENSITIVE);
public static void main(String[] args) {
System.out.println(pattern.matcher("user.name").matches()); // true
System.out.println(pattern.matcher("user.name2").matches()); // true
System.out.println(pattern.matcher("user2.name").matches()); // true
System.out.println(pattern.matcher("user..name").matches()); // false
System.out.println(pattern.matcher("1user.name").matches()); // false
}
}
The pattern I used is good but no length constraint.
I tried to append {6,20} constraint to the pattern but It failed.
"[a-z](\\.?[a-z\\d]+)+{6,20}" // failed pattern to validate length
Anyone has any ideas?
Thanks!

You can use a lookahead regex for all the checks:
^[a-zA-Z](?!.*\.\.)[a-zA-Z.\d]{5,19}$
Using [a-zA-Z.\d]{5,19} because we have already matched one char [a-zA-Z] at start this making total length in the range {6,20}
Negative lookahead (?!.*\.\.) will assert failure if there are 2 consecutive dots
Equivalent Java pattern will be:
Pattern p = Pattern.compile("^[a-zA-Z](?!.*\\.\\.)[a-zA-Z.\\d]{5,19}$");

Use a negative look ahead to prevent double dots:
"^(?!.*\\.\\.)(?i)[a-z][a-z\\d.]{5,19}$"
(?i) means case insensitve (so [a-z] means [a-zA-Z])
(?!.*\\.\\.) means there isn't two consecutive dots anywhere in it
The rest is obvious.
See live demo.

I would use the following regex :
^(?=.{6,20}$)(?!.*\.\.)[a-zA-Z][a-zA-Z0-9.]+$
The (?=.{6,20}$) positive lookahead makes sure the text will contain 6 to 20 characters, while the (?!.*\.\.) negative lookahead makes sure the text will not contain .. at any point.

This will also suffice (for only matching)
(?=^.{6,20}$)(?=^[A-Za-z])(?!.*\.\.)
For capturing, the matched pattern, you can use
(?=^.{6,20}$)(?=^[A-Za-z])(?!.*\.\.)(^.*$)

regex to strip leading zeros treated as string

I have numbers like this that need leading zero's removed.
Here is what I need:
00000004334300343 -> 4334300343
0003030435243 -> 3030435243
I can't figure this out as I'm new to regular expressions. This does not work:
(^0)

You're almost there. You just need quantifier:
str = str.replaceAll("^0+", "");
It replaces 1 or more occurrences of 0 (that is what + quantifier is for. Similarly, we have * quantifier, which means 0 or more), at the beginning of the string (that's given by caret - ^), with empty string.

Accepted solution will fail if you need to get "0" from "00". This is the right one:
str = str.replaceAll("^0+(?!$)", "");
^0+(?!$) means match one or more zeros if it is not followed by end of string.
Thank you to the commenter - I have updated the formula to match the description from the author.

If you know input strings are all containing digits then you can do:
String s = "00000004334300343";
System.out.println(Long.valueOf(s));
// 4334300343
Code Demo
By converting to Long it will automatically strip off all leading zeroes.

Another solution (might be more intuitive to read)
str = str.replaceFirst("^0+", "");
^ - match the beginning of a line
0+ - match the zero digit character one or more times
A exhausting list of pattern you can find here Pattern.

\b0+\B will do the work. See demo \b anchors your match to a word boundary, it matches a sequence of one or more zeros 0+, and finishes not in a word boundary (to not eliminate the last 0 in case you have only 00...000)

The correct regex to strip leading zeros is
str = str.replaceAll("^0+", "");
This regex will match 0 character in quantity of one and more at the string beginning.
There is not reason to worry about replaceAll method, as regex has ^ (begin input) special character that assure the replacement will be invoked only once.
Ultimately you can use Java build-in feature to do the same:
String str = "00000004334300343";
long number = Long.parseLong(str);
// outputs 4334300343
The leading zeros will be stripped for you automatically.

I know this is an old question, but I think the best way to do this is actually
str = str.replaceAll("(^0+)?(\d+)", "$2")
The reason I suggest this is because it splits the string into two groups. The second group is at least one digit. The first group matches 1 or more zeros at the start of the line. However, the first group is optional, meaning that if there are no leading zeros, you just get all of the digits. And, if str is only a zero, you get exactly one zero (because the second group must match at least one digit).
So if it's any number of 0s, you get back exactly one zero. If it starts with any number of 0s followed by any other digit, you get no leading zeros. If it starts with any other digit, you get back exactly what you had in the first place.

Here is the simple and proper solution.
str = str.replaceAll(/^0+/g, "");
Global Flag g is required when using replaceAll with regex

Question mark in regular expression matching is not working

I am new to regular expressions. I thought that this would return matched succesfully, but it does not. Why is that so?
String myString = "SUB_HEADER5_LABEL";
if (myString.matches(Pattern.quote("SUB_HEADER?_LABEL")))
{
System.out.println("matched succesfully");
}

Pattern.qoute() will create a Pattern that only matches exactly the given String. you need
if (myString.matches("SUB_HEADER\\d_LABEL"))
If you expect the number to exceed 9, add the + quantifier like
if (myString.matches("SUB_HEADER\\d+_LABEL"))
if you want to match a digit where you have ? (wich means one or zero R in your case, because it's a quantifier). you need to replace it with [0-9] or \\d.

Java: Change an var's value (String) according to the value of an regex

I would like to know if it is possible (and if possible, how can i implement it) to manipulate an String value (Java) using one regex.
For example:
String content = "111122223333";
String regex = "?";
Expected result: "1111 2222 3333 ##";

With one regex only, I don't think it is possible. But you can:
first, replace (?<=(.))(?!\1) with a space;
then, use a string append to append " ##".
ie:
Pattern p = Pattern.compile("(?<=(.))(?!\\1)");
String ret = p.matcher(input).replaceAll(" ") + " ##";
If what you meant was to separate all groups, then drop the second operation.
Explanation: (?<=...) is a positive lookbehind, and (?!...) a negative lookahead. Here, you are telling that you want to find a position where there is one character behind, which is captured, and where the same character should not follow. And if so, replace with a space. Lookaheads and lookbehinds are anchors, and like all anchors (including ^, $, \A, etc), they do not consume characters, this is why it works.

OK, since the OP has redefined the problem (ie, a group of 12 digits which should be separated in 3 groups of 4, then followed by ##, the solution becomes this:
Pattern p = Pattern.compile("(?<=\\d)(?=(?:\\d{4})+$)");
String ret = p.matcher(input).replaceAll(" ") + " ##";
The regex changes quite a bit:
(?<=\d): there should be one digit behind;
(?=(?:\d{4})+$): there should be one or more groups of 4 digits afterwards, until the end of line (the (?:...) is a non capturing grouping -- not sure it really makes a difference for Java).
Validating that the input is 12 digits long can easily be done with methods which are not regex-related at all. And this validation is, in fact, necessary: unfortunately, this regex will also turn 12345 into 1 2345, but there is no way around that, for the reason that lookbehinds cannot match arbitrary length regexes... Except with the .NET languages. With them, you could have written:
(?<=^(?:\d{4})+)(?=(?:\d{4})+$

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Java Regex of String start with number and fixed length - java

Try using this one: ^123\\d{6}$ I changed it to 6 because 1, 2, and 3 should probably still count as digits. Also, I removed the +. With it, it would match 1 or more \ds (therefore an infinite amount of digits).

Based on your comment below Doorknobs's answer you can do this: int length = 9; String prefix = "123"; // or whatever String regex = "^" + prefix + "\\d{ " + (length - prefix.length()) + "}$"; if (input.matches(regex)) { // good } else { // bad }

Related

Please justify the output in Regex Java program

Java regular expression for UserName with length range

regex to strip leading zeros treated as string

Question mark in regular expression matching is not working

Java: Change an var's value (String) according to the value of an regex

Categories

Resources