Matching numbers with regex

I need to match numbers where the first character is either a - (minus) or NOT a 0 (unless it's the only character in the string), and I'm kind of stuck. I currently have ^[-|1-9]?[0-9]+ but it will match any number of leading zeroes.
Examples:
Should match:
-16
25
2005
Should not match:
-05
05
00001
0-017

Try a pattern like this:
^-?[1-9][0-9]*$
This will match an optional - at the start of the string, followed by a single digit from 1 to 9, followed by zero or more digits from 0 to 9. The start (^) and end ($) anchors ensure that no other characters are allowed before or after the number.
Update: It has been pointed out that the above pattern will match any positive or negative decimal integer without leading zeros, but it will not match zero itself. To handle that case, add an alternation to your pattern like this:
^-?[1-9][0-9]*$|^0$
Or like this:
^(-?[1-9][0-9]*|0)$
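
As a rough illustration, the combined pattern can be checked against the examples from the question in Java. This is only a sketch: the class name IntegerFormat and the helper isValid are made up for this example.

```java
import java.util.regex.Pattern;

class IntegerFormat {
    // Optional minus, then a non-zero digit and any further digits, or a lone zero.
    private static final Pattern NO_LEADING_ZEROS =
            Pattern.compile("^(-?[1-9][0-9]*|0)$");

    // Hypothetical helper: true when the whole string is an integer without leading zeros.
    static boolean isValid(String s) {
        return NO_LEADING_ZEROS.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"-16", "25", "2005", "0", "-05", "05", "00001", "0-017"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Because matches() already requires the whole input to match, the ^ and $ anchors are redundant here, but they keep the pattern usable with find() as well.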

Related

Extract exactly n digits in a sentence using REGEX

Example
The no.s 1234 65
Input: n
For n=4, the output should be 1234
For n=2, the output should be : 65 (not 12)
Tried \d{n} which gives 12 and \d{n,} gives 1234 but i want the exact matching one.
Pattern p = Pattern.compile("\\d{" + n + ",}");

You need negative lookaround assertions: (?<!...) is a negative lookbehind and (?!...) is a negative lookahead. For n=4:
(?<!\d)\d{4}(?!\d)
However, not all regex engines support lookarounds. A workaround is to also match the preceding and following characters (unlike lookarounds, which are zero-width) and capture the digits in a group (\D matches any character except a digit):
(?:^|\D)(\d{4})(?:\D|$)
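
Since the question is about Java, where lookarounds are supported, here is a minimal sketch of building the lookaround pattern for an arbitrary n. The class name ExactDigits and the method find are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExactDigits {
    // Returns every run of exactly n digits that has no digit directly before or after it.
    static List<String> find(String text, int n) {
        Pattern p = Pattern.compile("(?<!\\d)\\d{" + n + "}(?!\\d)");
        Matcher m = p.matcher(text);
        List<String> runs = new ArrayList<>();
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        String text = "The no.s 1234 65";
        System.out.println(find(text, 4)); // [1234]
        System.out.println(find(text, 2)); // [65]
    }
}
```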

I think what you need is the \b word-boundary anchor.
The regex you're looking for would then be (for n=2):
\b\d{2}\b

From what I understand, you're looking for a regex that will match a number with exactly n digits in a string, taking into account the spacing between the numbers. If that's the case, you're looking for something like this (for n=4):
\b\d{4}\b
The \b ensures the match starts and ends at a word boundary: a position between a character matched by \w (which includes digits) and a character matched by \W (which includes spaces), or between a \w character and the start or end of the string.
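
One consequence worth seeing in code: because letters and underscores also count as \w, the \b version only finds digits that are separated from the surrounding text, while a lookaround version only cares about neighbouring digits. The following is a small sketch of that difference; the class name and sample strings are made up.

```java
import java.util.regex.Pattern;

class BoundaryVsLookaround {
    // True when the regex matches anywhere in the text.
    static boolean hasMatch(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String wordBoundary = "\\b\\d{2}\\b";
        String lookaround = "(?<!\\d)\\d{2}(?!\\d)";
        System.out.println(hasMatch(wordBoundary, "id 65 ok")); // true
        System.out.println(hasMatch(wordBoundary, "id65ok"));   // false: letters are word characters
        System.out.println(hasMatch(lookaround, "id65ok"));     // true: only neighbouring digits matter
    }
}
```

Which behaviour is right depends on whether digits glued to letters should count as a number in your input.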

I don't code in Java, but I can try to answer this using regex in general.
If your input has the form d1d2d3d4 d5d6 and you want to extract the digits d5d6, create three groups, as in ([0-9]+)(\s)([0-9]+), where each set of parentheses () represents one group. Then extract only the third group as your required output.

Regular expression that accepts only two digit integer or a floating number

I am trying to validate a text field that accepts number like 10.99, 1.99, 1, 10, 21.
\d{0,2}\.\d{1,2}
Above expression is only passing values such as 10.99, 11.99,1.99, but I want something that would satisfy my requirement.

Try this:
^\d{1,2}(\.\d{1,2})?$
^ - Match the start of string
\d{1,2} - Must contain at least 1 and at most 2 digits
(\.\d{1,2}) - When a decimal point occurs, it must be a . followed by at least 1 and at most 2 digits
? - The preceding group may occur zero or one time
$ - Match the end of string

Assuming you don't want to allow edge cases like 00, and want at least 1 and at most 2 decimal places after the point mark:
^(?!00)\d\d?(\.\d\d?)?$
This requires a digit before the decimal point, i.e. ".12" would not match (you would have to enter "0.12", which is best practice).
If you're using String#matches(), you can drop the leading ^ and trailing $, because that method must match the entire string to return true.
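
A minimal Java sketch of this validation, using the pattern above without anchors because matches() checks the whole input. The class name PriceField and the method isValid are hypothetical.

```java
import java.util.regex.Pattern;

class PriceField {
    // One or two digits (not starting with 00), optionally followed by a dot and one or two digits.
    private static final Pattern AMOUNT =
            Pattern.compile("(?!00)\\d\\d?(\\.\\d\\d?)?");

    // Hypothetical helper: true when the whole input is an accepted amount.
    static boolean isValid(String input) {
        return AMOUNT.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"10.99", "1.99", "1", "10", "21", "0.12", "00", ".12", "100", "10.999"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```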

First, \d{0,2} does not fit your requirement, because it also allows no digits at all before the decimal point. Change it to \d{1,2} so that at least one digit is required.
Now, in regex ? makes things optional. You can apply it to individual parts, like below (note that this form also accepts input such as 10. or 1234, because the dot and the trailing digits are optional independently):
\d{1,2}\.?\d{0,2}
or you can apply it to the combined expression, like below:
\d{1,2}(\.\d{1,2})?
You can also refer to the list below for quick reference:
abc… Letters
123… Digits
\d Any Digit
\D Any Non-digit character
. Any Character
\. Period
[abc] Only a, b, or c
[^abc] Not a, b, nor c
[a-z] Characters a to z
[0-9] Numbers 0 to 9
\w Any Alphanumeric character
\W Any Non-alphanumeric character
{m} m Repetitions
{m,n} m to n Repetitions
* Zero or more repetitions
+ One or more repetitions
? Optional character
\s Any Whitespace
\S Any Non-whitespace character
^…$ Starts and ends
(…) Capture Group
(a(bc)) Capture Sub-group
(.*) Capture all
(abc|def) Matches abc or def
Useful link : https://regexone.com/

Can you try using this :
(\d{1,2}\.\d{1,2})|(\d{1,2})
The pattern has two groups: the first checks the float forms #.#, #.##, ##.##, ##.# and the second checks the integer forms #, ##. They are combined with the alternation operator |, i.e. float|integer.

I think patterns of this type are best handled with alternation:
/^\s*([-+]?[0-9]*\.[0-9]+([eE][-+]?[0-9]+)?)$   # float
|                                               # or
^(\d{1,2})$                                     # 2 digit int
/mx

Regex to truncate trailing zeroes

I'm trying to construct a single regex (for Java) to truncate trailing zeros past the decimal point. e.g.
50.000 → 50
50.500 → 50.5
50.0500 → 50.05
-5 → -5
50 → 50
5.5 → 5.5
Idea is to represent the real number (or integer) in the most compact form possible.
Here's what I've constructed:
^(-?[.0-9]+?)\.?0+$
I'm using $1 to capture the truncated number string.
The problem with the pattern above is that 50 gets truncated to 5. I need some way to express that the 0+ must follow a . (decimal point).
I've tried using negative-behind, but couldn't get any matches.

The best solution could be using built-in language-specific methods for that task.
If you cannot use them, you may use
^(-?\d+)(?:\.0+|(\.\d*?)0+|\.+)?$
And replace with $1$2.
Adjust the regex as needed. Here is the explanation:
^ - start of string
(-?\d+) - Group 1, capturing an optional minus sign and then 1 or more digits
(?:\.0+|(\.\d*?)0+|\.+)? - an optional (matches 1 or 0 times due to the trailing ?) non-capturing group matching 3 alternatives:
\.0+ - a decimal point followed by 1 or more zeros
(\.\d*?)0+ - Group 2, capturing a dot and 0 or more digits (as few as possible), followed by 1 or more zeros that are matched but not captured
\.+ - (optional branch, you may remove it if not needed) matches the trailing dot(s)
$ - end of string.
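Java demo:
String s = "50.000\n50\n50.100\n50.040\n50.\n50.000\n50.500\n50\n-5";
// (?m) makes ^ and $ match at each line; an unmatched group 2 is replaced with an empty string
System.out.println(s.replaceAll("(?m)^(-?\\d+)(?:\\.0+|(\\.\\d*?)0+|\\.+)?$", "$1$2"));
// prints, one value per line: 50, 50, 50.1, 50.04, 50, 50, 50.5, 50, -5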
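
For the built-in route mentioned at the start of this answer, one option in Java is java.math.BigDecimal. This is a sketch of an alternative, not part of the regex solution; the class name CompactNumber and the method compact are made up for the example.

```java
import java.math.BigDecimal;

class CompactNumber {
    // Parses the text as a decimal number, drops trailing zeros, and avoids
    // scientific notation in the result (e.g. "50" rather than "5E+1").
    static String compact(String number) {
        return new BigDecimal(number).stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        String[] samples = {"50.000", "50.500", "50.0500", "-5", "50", "5.5"};
        for (String s : samples) {
            System.out.println(s + " -> " + compact(s));
        }
    }
}
```

Unlike the regex, this throws NumberFormatException for input that is not a valid number, so it also acts as a validity check.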

For a general regex which should do the trick:
^\d+?0*\.??\d*?(?=0*?[^\d]*$)
You can replace the caret and dollar sign with whatever boundaries you expect around your number.
Basically:
\d+? - non-greedy match for digits (needs at least 1 digit to start the match)
\.?? - optional match for a decimal point; prefers to match 0 occurrences
\d*? (?=0*?[^\d]*$) - non-greedy match for digits, which stops before any zeros that are followed only by non-digits up to the end
EDIT: I just realized the original expression also trimmed the last zero on integers; this version should work. I added the optional 0* match to catch that.

Regex to allow only 10 or 16 digit comma separated number

I want to validate a textfield in a Java based app where I want to allow only comma separated numbers and they should be either 10 or 16 digits. I have a regex that ^[0-9,;]+$ to allow only numbers, but it doesn't work for 10 or 16 digits only.

You can use {n} to specify an exact repetition count.
So matching one number with either 10 or 16 digits would be
^(\d{10}|\d{16})$
Meaning: match exactly 10 or exactly 16 digits, with ^ anchoring the start of the line and $ anchoring the end.
Now add the separator:
^((\d{10}|\d{16})[,;])*(\d{10}|\d{16})$
That is: zero or more 10-or-16-digit numbers, each followed by either , or ;, and then one final 10-or-16-digit number at the end of the line.
In Java you need to escape each \ as \\ inside a string literal.
public class Main {
    public static void main(String[] args) {
        String regex = "^((\\d{10}|\\d{16})[,;])*(\\d{10}|\\d{16})$";
        // 10, 16, and 10 digits
        String y = "0123456789,0123456789123456,0123456789";
        System.out.println(y.matches(regex)); // true
        // the middle number has 17 digits
        String n = "0123456789,01234567891234567,0123456789";
        System.out.println(n.matches(regex)); // false
    }
}

I would probably use this regex:
(\d{10}(?:\d{6})?,?)+
Explanation:
( - Begin capture group
\d{10} - Match exactly 10 digits
(?: - Begin non capture group
\d{6} - Match 6 more digits
)? - End group, mark as optional using ?
,? - optionally capture a comma
)+ - End outer capture group, require 1 or more repetitions (maybe change to * for 0 or more)
The following inputs match this regex when the whole string must match (e.g. with String#matches()):
1234567890123456,1234567890
1234567890123456
1234567890
These inputs do not match:
123,1234567890
12355
123456789012

You need to have both anchors and word boundaries:
/^(?:\b(?:\d{10}|\d{16})\b,?)*$/
The anchors are necessary so you don't get false positives for partial matches and the word boundaries are necessary so you don't get false positives for 20, 26, 30, 32 digit numbers.
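
To make the false-positive point concrete, here is a small Java comparison between a pattern with an optional comma and no boundaries and the anchored, word-boundary pattern above. It is only a sketch: the class and constant names are invented, and both patterns are tested with matches(), i.e. against the whole input.

```java
import java.util.regex.Pattern;

class DigitListCheck {
    // The comma is optional here, so two 10-digit numbers can run together unnoticed.
    static final Pattern LOOSE = Pattern.compile("(\\d{10}(?:\\d{6})?,?)+");
    // Word boundaries force each number to end exactly where its digits end.
    static final Pattern STRICT = Pattern.compile("^(?:\\b(?:\\d{10}|\\d{16})\\b,?)*$");

    // True when the pattern matches the entire input.
    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String twentyDigits = "12345678901234567890";
        String validList = "1234567890123456,1234567890";
        System.out.println(matches(LOOSE, twentyDigits));  // true (false positive)
        System.out.println(matches(STRICT, twentyDigits)); // false
        System.out.println(matches(LOOSE, validList));     // true
        System.out.println(matches(STRICT, validList));    // true
    }
}
```

Note that the strict pattern, as written, still accepts an empty string and a trailing comma, because of the * quantifier and the optional comma.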

Here is my version
(?:\d+,){9}\d+|(?:\d+,){15}\d+
Let's review it. First, there is the requirement of "10 or 16", so I actually have to write 2 expressions with | between them.
Second, the expression itself. Your version just says that you allow digits and commas. However, this is not what you really want, because a string like ,,, will match your regex.
So the regex should look like (?:\d+,){n}\d+, which means: n groups of one or more digits, each terminated by a comma, and then a final group of digits, e.g. 123,45,678 for n=2 (where 123,45, matches the first part and 678 matches the second part). Note that this counts the comma-separated numbers (10 or 16 of them), not the digits in each number.
Finally, we get the regex that I wrote at the beginning of my answer:
(?:\d+,){9}\d+|(?:\d+,){15}\d+
And do not forget that when you write a regex in your Java code you have to double each backslash, like this:
Pattern.compile("(?:\\d+,){9}\\d+|(?:\\d+,){15}\\d+")
EDIT: I have just added the non-capturing group (?: ... )

Using Regex, is it possible to use an expression such as 'Followed by' or 'Preceded by'

I have the following expression, with which I want to extract an identifier that is 12 digits long:
([12]\d{3})(\d{6})(\d{2})
This works fine if the string is in the following format:
ABCD123456789101
123456789101
When it gets a string like the following, how does it know which 12 digits to match on:
ABCD1234567894837376383439434343232
1234567894837376383439434343232
In the above scenario, I don't want to select the twelve digits. So the answer, I think, is to only select the twelve digits if they are not preceded or followed by other digits. I tried this change:
[^0-9]([12]\d{3})(\d{6})(\d{2})[^0-9]
This basically says: get me the 12 digits only if the characters before and after them are non-numeric. The problem is that I am also getting those non-numeric characters as part of the match, i.e.
ABCD123456789483X7376383439434343232 returns D123456789483X
Is there any way of checking what the preceding and following characters are without including them in the match result? That is, only match if the preceding and following characters are non-numeric, but don't include those non-numeric characters in the match result.

You can use lookarounds:
(?<!\d)([12]\d{3})(\d{6})(\d{2})(?!\d)
Here (in a Java string literal, write each \ as \\):
(?<!\d) is a negative lookbehind, which means your pattern is not preceded by a digit
(?!\d) is a negative lookahead, which means your pattern is not followed by a digit
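
A minimal Java sketch of the lookaround pattern applied to the strings from the question. The class name IdentifierExtractor and the method extract are hypothetical; returning null for "no match" is just a choice made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IdentifierExtractor {
    // 12 digits starting with 1 or 2, with no digit directly before or after them.
    private static final Pattern ID =
            Pattern.compile("(?<!\\d)([12]\\d{3})(\\d{6})(\\d{2})(?!\\d)");

    // Returns the first stand-alone 12-digit identifier, or null if there is none.
    static String extract(String text) {
        Matcher m = ID.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("ABCD123456789101"));                     // 123456789101
        System.out.println(extract("ABCD1234567894837376383439434343232"));  // null
        System.out.println(extract("ABCD123456789483X7376383439434343232")); // 123456789483
    }
}
```

Because lookarounds are zero-width, the surrounding D and X are checked but never become part of the match.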
