Regex with no leading dot and maximum one leading zero - java

How can I write a regular expression that matches a string with the following properties?
Contains only digits and, optionally, a single dot as the decimal separator (the dot is not required, so 123, 123.4 and 123.56 are all valid).
No leading dot (not .12).
A leading zero is allowed only if it is immediately followed by a dot (0.12 is valid, 000.12 is not).
Has at most 2 decimal places.

To the left of the decimal point you want a number (1 or more digits) that doesn't start with a zero:
[1-9][0-9]*
Or it can be just a zero:
0|[1-9][0-9]*
The value may have a decimal point and 1-2 digits after the decimal point:
\.[0-9]{1,2}
Left side is required. Decimal point and fractional digits are optional:
(?:0|[1-9][0-9]*)(?:\.[0-9]{1,2})?
The first non-capturing group is needed to limit the scope of the | alternation. The second non-capturing group is needed to make the combined "decimal point and fractional digits" pattern optional.
Note that this will allow trailing zeroes, e.g. 100.00
Depending on preference, [0-9] can also be written as \d. I'd normally use \d, but since the regex also has [1-9], I liked [0-9] better here as I felt it helped clarify the difference.
Depending on how the regex is used, you may need to add the ^ begin / $ end anchors. They are needed when using find(), and are not needed when using matches(), but they don't hurt:
^(?:0|[1-9][0-9]*)(?:\.[0-9]{1,2})?$
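The difference between matches() and find() can be sketched as follows. This is only an illustration: the class name DecimalFormatCheck and its method names are made up for this example, and the two patterns are the ones from this answer.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class DecimalFormatCheck {
    // The expression from this answer, without anchors.
    static final Pattern UNANCHORED =
            Pattern.compile("(?:0|[1-9][0-9]*)(?:\\.[0-9]{1,2})?");

    // The anchored variant, intended for use with find().
    static final Pattern ANCHORED =
            Pattern.compile("^(?:0|[1-9][0-9]*)(?:\\.[0-9]{1,2})?$");

    // matches() always tests the entire input, so anchors are optional here.
    static boolean isValid(String s) {
        return UNANCHORED.matcher(s).matches();
    }

    // find() accepts any matching substring, so the anchors are what
    // force the whole input to be checked.
    static boolean isValidWithFind(String s) {
        return ANCHORED.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] samples = {"123", "123.4", "123.56", "0.12", ".12", "000.12", "1.234"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Calling find() with the unanchored pattern on 000.12 would report a match, because the single leading 0 is a valid substring on its own. That is the case the anchors guard against.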

You can use a negative look-ahead to ensure the string doesn't start with a zero followed by another digit (it can still be just zero, or zero followed by a dot):
^(?!0\d)\d+(?:\.\d{1,2})?$
Explanation and sample: https://regex101.com/r/7ymqcn/1
P.S. Also more efficient than Andreas' answer (takes fewer steps to match)
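In Java, the look-ahead version could be used roughly like this. The class name LookaheadDecimalCheck is an assumption made for the example; only the pattern comes from this answer.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class LookaheadDecimalCheck {
    // (?!0\d) fails at the start if the input begins with a zero
    // that is immediately followed by another digit.
    static final Pattern NUMBER =
            Pattern.compile("^(?!0\\d)\\d+(?:\\.\\d{1,2})?$");

    static boolean isValid(String s) {
        return NUMBER.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"0", "0.5", "10.25", "01", "00.5", ".5", "1.", "1.234"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Because \d+ requires at least one digit before the optional fraction, a leading dot is rejected without any extra check.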

You could try converting the input string to a float. That takes care of removing leading zeros and of a dot without a leading zero, as well as adding ".00" if the input is only an integer, and you can set the maximum number of decimal places when formatting it back. If an exception is thrown during the conversion, the input doesn't match what you want. You can then convert the float back to a string or keep it as a float value for calculation.
See:
https://docs.oracle.com/javase/7/docs/api/java/lang/Float.html#valueOf(java.lang.String)
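A minimal sketch of this parse-and-reformat idea, assuming a two-decimal output is wanted. The class FloatNormalizer and its normalize method are hypothetical names for this example. Note that this approach normalizes inputs such as .5 or 000.12 instead of rejecting them, and that Float.parseFloat also accepts forms like 1e3, so it is not a drop-in replacement for the regex.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper class, for illustration only.
final class FloatNormalizer {
    // Returns the input reformatted with exactly two decimal places,
    // or an empty Optional if the text cannot be parsed as a float.
    static Optional<String> normalize(String input) {
        try {
            float value = Float.parseFloat(input);
            // Locale.ROOT keeps the dot as the decimal separator.
            return Optional.of(String.format(Locale.ROOT, "%.2f", value));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] samples = {"000.12", ".5", "123", "abc"};
        for (String s : samples) {
            System.out.println(s + " -> " + normalize(s).orElse("(not a number)"));
        }
    }
}
```

Since float has limited precision, BigDecimal may be a better fit if the value is later used for exact arithmetic.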

Related

Regex for latitude with required 6 decimal places

I need a regex in Java that will check if a String representation of a double has required 6 decimal places. Before the decimal point, value can be positive or negative.
1.123456 - correct
-123123123.123456 - correct
123123123.123456 - correct
1.12345 - wrong
-.123456 - wrong
.123456 - wrong
.12345 - wrong
123456 - wrong
I tried:
^\s*(?=.*[1-9])\d*(\.\d{6})?\s*$
but it doesn't cover all edges.
Try this:
^\s*(-|\+)?(0|[1-9]\d*)\.\d{6}\s*$
See live demo.
This allows the first digit to be zero only if it's the only digit before the dot, e.g. 0.123456 is OK, but not 01.123456. \.\d{6} requires exactly 6 decimal places.
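As a rough check against the sample values from the question, the pattern can be wrapped like this. The class name SixDecimalCheck is an assumption for the example; the pattern is the one given in this answer.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class SixDecimalCheck {
    // Optional surrounding whitespace, optional sign, an integer part
    // without redundant leading zeros, then exactly six decimals.
    static final Pattern LATITUDE =
            Pattern.compile("^\\s*(-|\\+)?(0|[1-9]\\d*)\\.\\d{6}\\s*$");

    static boolean isValid(String s) {
        return LATITUDE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "1.123456", "-123123123.123456", "123123123.123456",
            "1.12345", "-.123456", ".123456", ".12345", "123456"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + (isValid(s) ? "correct" : "wrong"));
        }
    }
}
```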
The valid input should:
start with optional whitespace ---> ^\s*
then an optional - or + ---> (-|\+)?
then one or more digits ---> \d+
then one dot ---> \.
then six digits ---> (\d{6})
end with optional whitespace ---> \s*$
Try this:
^\s*(-|\+)?\d+\.(\d{6})\s*$
In your regex the positive lookahead (?=.*[1-9]) asserts that what is on the right side should contain a digit which will succeed for all examples. After that assertion you match zero or more digits \d* followed by a part that optionally matches a dot and 6 digits (\.\d{6})? so this will match .588888 or also 1.
If you want to match an optional minus sign you could use -?
For your example data you might use:
^-?\d+\.\d{6}$
In Java:
String regex = "^-?\\d+\\.\\d{6}$";
Explanation
^ Assert the start of the line
-? Match an optional minus sign
\d+\.\d{6} Match one or more digits, a dot and 6 digits
$ Assert the end of the line
Demo

Regex for matching different float formats

I'm looking for a regex in scala to match several floats:
9,487,346 -> should match
9.487.356,453->should match
38,4 -> match
-38,4 -> should match
-38.5
-9,487,346.76
-38 -> should match
So basically it should match a number that:
possibly has thousands separators (either comma or dot)
is possibly a decimal, again with either comma or dot as the separator
Currently I'm stuck with
val pattern="\\d+((\\.\\d{3}+)?(,\\d{1,2}+)?|(,\\d{3}+)?(\\.\\d{1,2}+)?)"
Edit: I'm mostly concerned with European notation.
Example where the current pattern does not match: 1,052,161
I guess it would be close enough to check that the string contains only digits, a sign, commas and dots.
If, as your edit suggests, you are willing to accept a string that simply "contains numbers, sign, comma and dot" then the task is trivial.
[+-]?\d[\d.,]*
update
After thinking it over, and considering some options, I realize that your original request is possible if you'll allow for 2 different RE patterns, one for US-style numbers (commas before dot) and one for Euro-style numbers (dots before comma).
def isValidNum(num: String): Boolean =
num.matches("[+-]?\\d{1,3}(,\\d{3})*(\\.\\d+)?") ||
num.matches("[+-]?\\d{1,3}(\\.\\d{3})*(,\\d+)?")
Note that the thousand separators are not optional, so a number like "1234" is not evaluated as valid. That can be changed by adding more RE patterns: || num.matches("[+-]?\\d+")
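For readers working in Java rather than Scala, the same two-pattern check could look like the sketch below. The class name NumberFormats and the constant names are assumptions for this example; the two expressions are copied from the Scala code above.

```java
import java.util.regex.Pattern;

// Hypothetical Java port of isValidNum, for illustration only.
final class NumberFormats {
    // US style: commas group thousands, a dot starts the fraction.
    private static final Pattern US =
            Pattern.compile("[+-]?\\d{1,3}(,\\d{3})*(\\.\\d+)?");

    // European style: dots group thousands, a comma starts the fraction.
    private static final Pattern EURO =
            Pattern.compile("[+-]?\\d{1,3}(\\.\\d{3})*(,\\d+)?");

    static boolean isValidNum(String num) {
        return US.matcher(num).matches() || EURO.matcher(num).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"9,487,346", "9.487.356,453", "38,4", "-38.5", "1234"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValidNum(s));
        }
    }
}
```

Precompiling both patterns avoids recompiling them on every call, which String.matches would otherwise do.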
Based on your rules,
It should match a number that:
possibly has thousands separators (either comma or dot)
is possibly a decimal, again with either comma or dot as the separator
Regex:
^[+-]?\d{1,3}(?:[,.]\d{3})*(?:[,.]\d+)?$
[+-]? Allows + or - or nothing at the start
\d{1,3} allows one to 3 digits
(?:[,.]\d{3})* allows . or , as thousands separator followed by 3 digits (* allows any number of such groups)
(?:[,.]\d+)? allows . or , as decimal separator followed by at least one digit.
This matches all of the OP's example cases. Take a look at the demo below for more:
Regex101 Demo
However one limitation is it allows . or , as thousand separator and as decimal separator and doesn't validate that if , is thousands separator then . should be decimal separator. As a result the below cases incorrectly show up as matches:
201,350,780,88
211.950.266.4
To fix this as well, the previous regex can have 2 alternatives - one to check for a notation that has , as thousands separator and . as decimal, and another one to check vice-versa. Regex:
^[+-]?\d{1,3}(?:(?:(?:\.\d{3})*(?:\,\d+)?)|(?:(?:\,\d{3})*(?:\.\d+)?))$
Regex101 Demo
Hope this helps!
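The difference between the two expressions in this answer can be seen by running both against the problem cases. This is a sketch only: the class name SeparatorConsistency and the constant names LOOSE and STRICT are made up for the example, and the redundant backslash before the comma is dropped, which does not change the meaning.

```java
import java.util.regex.Pattern;

// Hypothetical comparison class, for illustration only.
final class SeparatorConsistency {
    // First expression: any mix of . and , is accepted.
    static final Pattern LOOSE = Pattern.compile(
            "^[+-]?\\d{1,3}(?:[,.]\\d{3})*(?:[,.]\\d+)?$");

    // Second expression: thousands and decimal separators must differ.
    static final Pattern STRICT = Pattern.compile(
            "^[+-]?\\d{1,3}(?:(?:(?:\\.\\d{3})*(?:,\\d+)?)|(?:(?:,\\d{3})*(?:\\.\\d+)?))$");

    static boolean loose(String s) {
        return LOOSE.matcher(s).matches();
    }

    static boolean strict(String s) {
        return STRICT.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"9.487.356,453", "-9,487,346.76", "201,350,780,88", "211.950.266.4"};
        for (String s : samples) {
            System.out.println(s + " -> loose=" + loose(s) + ", strict=" + strict(s));
        }
    }
}
```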

Java Regular expression for replacing specific strings

I want to replace numbers in a string if they have more than 3 digits (phone numbers should be replaced), but a number should not be replaced if it is followed by $ or if it has a decimal point. I used the expression below.
"\d{3,}+(?!\$/\.)"
The issues I face: it is not replacing numbers that have more than ten digits, and I do want those replaced, since some of them are IDs with more than 10 digits. Also, if a number has more than 3 digits after the decimal point, those digits are getting replaced. I don't want a number to be replaced if it has a decimal point. Can somebody help?
For example, take the number string "3452678916381914". It has to be replaced, but the above regex is not replacing it. Numbers like $1234,45.567 shouldn't be replaced, but the above regex replaces 45.567.
Use lookbehind and lookahead: first assert that the starting word boundary is not preceded by a $ or ., then assert that the ending word boundary is not followed by a $ or .
It works for both examples you provided; you might need to tweak it a little to handle some corner cases.
(?<![\$\.])\b\d{3,}\b(?![\$\.])
See the demo: it matches the first 2 lines but not the rest.
3452678916381914 # match
1234 56789 # match
$1234,45.567
$1234
12.345
12345.6678
123$
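Applied to the replacement task from the question, the expression could be used as in this sketch. The class name NumberMasker, the method mask and the replacement text ### are assumptions for the example.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class NumberMasker {
    // Runs of 3+ digits that are whole "words" and are neither
    // preceded nor followed by a $ or a dot.
    private static final Pattern LONG_NUMBER =
            Pattern.compile("(?<![\\$\\.])\\b\\d{3,}\\b(?![\\$\\.])");

    static String mask(String text) {
        return LONG_NUMBER.matcher(text).replaceAll("###");
    }

    public static void main(String[] args) {
        String[] samples = {
            "3452678916381914", "1234 56789", "$1234,45.567",
            "$1234", "12.345", "12345.6678", "123$"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + mask(s));
        }
    }
}
```

One corner case to be aware of: a number that ends a sentence, such as "call 12345.", is followed by a dot and is therefore left untouched.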

Regular expression to identify all numerics, across all localization formats

I'm scanning a text with a Scanner object, let's say lineScanner. Here are the declarations:
String myText= "200,00/100,00/28/65.36/21/458,696/25.125/4.23/6.3/4,2/659845/4524/456,65/45/23.495.254,3";
Scanner lineScanner = new Scanner(myText);
With that Scanner, I would like to find the first BigDecimal, and after the second one, and so on. I declared a BIG_DECIMAL_PATTERN to match any case.
Here are the rules I defined:
Thousands separator is always followed by exactly 3 digits
There is always exactly 1 or 2 digits after the decimal point.
If the thousands separator is the comma symbol, then the decimal point is the dot symbol, and vice versa
The thousands separator is optional, as is the decimal part of the number
String nextBigDecimal = lineScanner.findInLine(BIG_DECIMAL_PATTERN);
Now, here is the BIG_DECIMAL_PATTERN I declared:
private final String BIG_DECIMAL_PATTERN =
    "\\d+(\\054\\d{3}+)?(\\056\\d{1,2}+)?|\\d+(\\056\\d{3}+)?(\\054\\d{1,2}+)?";
\\054 is the ASCII octal representation of ","
\\056 is the ASCII octal representation of "."
My problem is that it doesn't work well: when the first alternative matches, the second alternative (after the |) is not tried, so in my example the first match will be 200 and not 200,00. So I can try this:
private final String BIG_DECIMAL_PATTERN = "\\d+([.,]\\d{3}+)?([,.]\\d{1,2}+)?";
But there is a new problem: comma and dot are not mutually exclusive. I mean that if one is the thousands separator, the decimal point should be the other one.
Thanks for helping.
I believe a variant of your 2nd RegEx will work for you. Consider this regex:
^\\d+(?:([.,])\\d{3})*(?:(?!\\1)[.,]\\d{1,2})?$
Live Demo: http://www.rubular.com/r/vHlEdBMhO9
Explanation: It first captures the comma or dot in capture group #1. Later, a negative lookahead makes sure the same character from capture group #1 doesn't appear as the decimal point. In other words, it ensures that if a comma comes first then a dot comes later, and vice versa.
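Because this expression is anchored with ^ and $, it validates one complete token rather than searching inside a longer line. One way to apply it to the sample text from the question is to split on / first, as in the sketch below. The class name BigDecimalTokens and its methods are assumptions for this example, and splitting on / is a substitute for the findInLine approach in the question, not the asker's method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class BigDecimalTokens {
    // Group 1 remembers the thousands separator; (?!\1) then forbids
    // the same character as the decimal point.
    static final Pattern NUMBER =
            Pattern.compile("^\\d+(?:([.,])\\d{3})*(?:(?!\\1)[.,]\\d{1,2})?$");

    static boolean isNumber(String token) {
        return NUMBER.matcher(token).matches();
    }

    // Splits the text on '/' and keeps only the tokens that validate.
    static List<String> validTokens(String text) {
        List<String> result = new ArrayList<>();
        for (String token : text.split("/")) {
            if (isNumber(token)) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String myText = "200,00/100,00/28/65.36/21/458,696/25.125/4.23/6.3/4,2"
                + "/659845/4524/456,65/45/23.495.254,3";
        System.out.println(validTokens(myText));
    }
}
```

When no thousands separator is present, group 1 never participates in the match. In Java a backreference to such a group fails, so the negative lookahead succeeds and either separator is accepted as the decimal point.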
Could you do an either-or regular expression? E.g. something like:
private final String BIG_DECIMAL_PATTERN
    = "\\d+((\\.\\d{3}+)?(,\\d{1,2}+)?|(,\\d{3}+)?(\\.\\d{1,2}+)?)";
Note - I haven't checked whether your regex actually works - and suspect this may not be the best way of achieving what you are trying to do. All I'm doing here to get you up and running is suggesting you could try using (regex1|regex2) where regex1 is dots followed by commas and regex2 is commas followed by dots.

Understanding a regex expression

This expression evaluates a string to see if every character is a digit. I don't understand the -?. I know that ? means once or not at all, but I'm not sure what putting a dash in front of it means.
-?\d+
This is needed because an integer may be negative in which case it will start with a minus (-). So what you do here is to check for sequence of 1 or more digits optionally preceded by a single minus.
It is not a special character. The dash is there to allow negative numbers.
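A quick way to see the effect of the optional dash is to try a few inputs. The class name SignedIntegerCheck is made up for this example.

```java
// Hypothetical helper class, for illustration only.
final class SignedIntegerCheck {
    // -? allows zero or one literal minus sign, \d+ requires
    // at least one digit after it.
    static boolean isInteger(String s) {
        return s.matches("-?\\d+");
    }

    public static void main(String[] args) {
        String[] samples = {"42", "-42", "--42", "4-2", "-"};
        for (String s : samples) {
            System.out.println(s + " -> " + isInteger(s));
        }
    }
}
```

Outside a character class such as [a-z], the dash has no special meaning in a regex, so it matches a literal minus sign.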
