Regular Expression to replace integers with floats - java

We're trying to replace integer values with float values in a String, for example:
#var1 * #var2/ 1+100 - 2 + 1.5 - .5
The regular expression should match 1, 100 and 2, but not numbers which are already floats, eg 1.5 and .5
I've gotten as far as /[^\w](\d+)/, which finds digits by themselves.
Now, how do I exclude numbers from this regular expression, that are followed by \.?\d+?
The RegEx should work in Java or Actionscript 3.
Regular Expression Test

This will work in Java: /(?<![.\w])\d+(?![.\w])/. It uses both lookahead and lookbehind to stop matching digits that are either preceded or followed by a dot or a word character.
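A minimal sketch of how that pattern could be used for the actual replacement in Java. The class and method names are made up for illustration, and appending ".0" is just one assumed way of turning an integer into a float literal:

```java
import java.util.regex.Pattern;

class IntegerToFloat {
    // Digits that are neither preceded nor followed by a dot or a word character.
    private static final Pattern BARE_INTEGER =
            Pattern.compile("(?<![.\\w])\\d+(?![.\\w])");

    // Appends ".0" to every standalone integer; "$0" refers to the whole match.
    static String toFloats(String expression) {
        return BARE_INTEGER.matcher(expression).replaceAll("$0.0");
    }

    public static void main(String[] args) {
        // prints: #var1 * #var2/ 1.0+100.0 - 2.0 + 1.5 - .5
        System.out.println(toFloats("#var1 * #var2/ 1+100 - 2 + 1.5 - .5"));
    }
}
```

Note that the slashes around the pattern are dropped and every backslash is doubled inside the Java string literal.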

Why not try a lookahead? I believe it works in Java; no clue about ActionScript.
/[^\w](\d+)(?![.]\d+)/
This would match only those sequences of digits not immediately followed by a dot and more digits.

You can use a negative lookahead for this.
(?!\.\d+)
would exclude this, but you need to combine it with an anchor, otherwise you will get a partial match.
/(?<!\B|\.)(\d+)(?!\.\d+)\b/
You should also change the non-word character before the digits. Here I used a negative lookbehind assertion, (?<!\B|\.). It ensures that there is no dot before the digits and that they start at a word boundary (\B inside a negative lookbehind is a double negation that amounts to matching on a word boundary).
See it here on Regexr

If your regex syntax supports lookahead, you can use that. Java does; not sure about AS3.
/[^\w.]\d+(?!\.)/
Note that in this case you'd want to use the entire matched string in the replacement.

My suggestion is
\b(?<!\.)\d+(?!\.)\b

Related

Expression to capture only 1 occurrence for a single character but multiple for others

I am trying to use the following regex to capture following values. This is for use in Java.
(\$|£|$|£)([ 0-9.]+)
Example values which I do want to be captured via above regex which works.
$100
$100.5
$100
$100.6
£200
£200.6
But the following also get captured, which is wrong. I only want to capture values when there is only 1 dot in the text, not several.
£200.15.
£200.6.6.6.6
Is there a way to select such that values with multiple periods don't count?
I can't do something like the following because that would affect the numbers too. Please advise.
(\$|£|$|£)([ 0-9.]{1})
You can use
(\$|£|$|£)(\d+(?:\.\d+)?)\b(?!\.)
See the regex demo.
In this regex, (\d+(?:\.\d+)?)\b(?!\.) matches
(\d+(?:\.\d+)?) - Group 1: one or more digits, then an optional occurrence of . and one or more digits
\b - a word boundary
(?!\.) - not immediately followed with a . char.
Another solution for Java (where the regex engine supports possessive quantifiers) will be
(\$|£|$|£)(\d++(?:\.\d+)?+)(?!\.)
See this regex demo. \d++ and (?:\.\d+)?+ contain ++ and ?+ possessive quantifiers that prevent backtracking into the quantified subpatterns.
In Java, do not forget to double the backslashes in the string literals:
String regex = "(\\$|£|$|£)(\\d++(?:\\.\\d+)?+)(?!\\.)";
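Here is a small sketch of the possessive-quantifier version in use. The class and method names are invented for the example, and it assumes only the literal $ and £ symbols need to be recognised (the pound sign is written as \u00A3 so the source file encoding does not matter):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CurrencyAmounts {
    // Assumption: only the literal symbols $ and \u00A3 (pound sign) are accepted.
    // The possessive quantifiers stop the engine from giving back ".15" in "200.15.".
    private static final Pattern AMOUNT =
            Pattern.compile("(\\$|\u00A3)(\\d++(?:\\.\\d+)?+)(?!\\.)");

    // Collects the numeric part (group 2) of every accepted amount.
    static List<String> amounts(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = AMOUNT.matcher(text);
        while (m.find()) {
            found.add(m.group(2));
        }
        return found;
    }
}
```

With this, amounts("$100.5") yields the single value 100.5, while "£200.15." and "£200.6.6.6.6" yield nothing.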
You could try this
(\$|£|$|£)([0-9]+(?:\.[0-9]+)?)$
This matches one or more digits, followed by an optional dot and more digits, and then the end of the string.
EDIT: some typos fixed
And it's not OK to delete the whole sentence above because of one word against myself. :(

Extract exactly n digits in a sentence using REGEX

Example
The no.s 1234 65
Input: n
For n=4, the output should be 1234
For n=2, the output should be: 65 (not 12)
I tried \d{n}, which gives 12, and \d{n,}, which gives 1234, but I want only the exact match.
Pattern p = Pattern.compile("//\d{n,}");
You need negative lookaround assertions: (?<!..) is a negative lookbehind and (?!..) is a negative lookahead (regex101):
(?<!\d)\d{4}(?!\d)
However, not all regex engines support them. A workaround is to also match the preceding and following characters (unlike lookarounds, which are zero-width matches) and capture the digits in a group; \D matches any character except a digit:
(?:^|\D)(\d{4})(?:\D|$)
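Since n is an input, the lookaround pattern can be built at runtime. A rough sketch in Java (class and method names are hypothetical), returning null when no run of exactly n digits exists:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExactDigits {
    // Returns the first run of exactly n digits in text, or null if there is none.
    static String firstRunOf(int n, String text) {
        // (?<!\d) and (?!\d) reject runs that are part of a longer number.
        Pattern p = Pattern.compile("(?<!\\d)\\d{" + n + "}(?!\\d)");
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstRunOf(4, "The no.s 1234 65")); // prints: 1234
        System.out.println(firstRunOf(2, "The no.s 1234 65")); // prints: 65
    }
}
```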
I think what you meant is the \b character.
Hence, the regex you're looking for would be (for n=2):
\b\d{2}\b
From what I understand, you're looking for a regex that will match a number in a string which has n digits, taking into account the spacing between the numbers. If that's the case, you're looking for something like this:
\b\d{4}\b
The \b ensures the match is constrained to the start/end of a 'word', where a word boundary is the position between anything matched by \w (which includes digits) and anything matched by its opposite, \W (which includes spaces).
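Because \w also covers letters and the underscore, \b and the digit lookarounds behave differently when the digits touch a letter. A quick comparison, with a made-up helper and the sample string "id1234" chosen purely for illustration:

```java
import java.util.regex.Pattern;

class BoundaryVsLookaround {
    // True if regex matches anywhere inside text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // No word boundary between 'd' and '1', so \b rejects this run.
        System.out.println(found("\\b\\d{4}\\b", "id1234"));           // prints: false
        // The lookarounds only care about neighbouring digits.
        System.out.println(found("(?<!\\d)\\d{4}(?!\\d)", "id1234")); // prints: true
    }
}
```

Which behaviour is right depends on whether digits glued to letters should count as a number.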
I don't code in Java, but I can try to answer this using regex in general.
If your number is in the format d1d2d3d4 d5d6 and you want to extract digits d5d6, create 3 groups as r'([0-9]+)(\s)([0-9]+)'; each set of parentheses () represents one group. Now extract only the third group into another object, which is your required output.

Check if String ends with two digits after a dot in Regular Expression?

I'm trying to test if a String ends with EXACTLY two digits after a dot in Java using a Regular Expression. How can I achieve this?
Something like "500.23" should return true, while "50.3" or "50" should return false.
I tried things like "500.00".matches("/^[0-9]{2}$/") but it returns false.
Here is a RegEx that might help you:
^\d+\.\d{2,2}$
it may neither be perfect nor the most efficient, but it should lead you in the right direction.
^ says that the expression should start here
\d looks for any digit
+ says, that the leading \d can appear as often as necessary (1–infinity)
\. means you are expecting a dot(.) at one point
\d{2,2} that's the trick: it says you want exactly 2 digits (no fewer, no more)
$ tells you that the expression ends there (after the 2 digits)
In Java the \ needs to be escaped, so it would be:
^\\d+\\.\\d{2,2}$
Edit
If you don't need digits before the dot (.), or if you really don't care what comes before the dot, then you can replace the first \d+ with .* as in Bohemian's answer. The (unescaped) dot means that the expression can contain any character (not only digits). Then even the leading ^ might no longer be necessary.
.*\\.\\d{2,2}$
use this regex
String s="987234.42";
if(Pattern.matches("^\\d+(\\.\\d{2})$", s)){ // string must be one or more digits, then a dot, then exactly two digits.
....
}
Firstly, forward slashes are not part of regular expressions at all. They are, however, used by some languages to delimit regular expressions - but not Java, so don't use them.
Secondly, in java matches() must match the whole string to return true (so ^ and $ are implied in the regex).
Try this:
if (str.matches(".*\\.\\d\\d"))
// it ends with dot then 2 digits
Note that in Java a backslash in a regex requires escaping by a further backslash in a string literal.
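To see how the lenient and the digits-only variants differ, here is a short sketch (class and method names are just for illustration). Both rely on matches() having to cover the whole string:

```java
class TwoDecimals {
    // Anything, then a dot, then exactly two digits at the very end.
    static boolean endsWithTwoDecimals(String s) {
        return s.matches(".*\\.\\d\\d");
    }

    // Stricter variant: only digits are allowed before the dot.
    static boolean isNumberWithTwoDecimals(String s) {
        return s.matches("\\d+\\.\\d{2}");
    }

    public static void main(String[] args) {
        System.out.println(endsWithTwoDecimals("500.23"));     // prints: true
        System.out.println(endsWithTwoDecimals("50.3"));       // prints: false
        System.out.println(endsWithTwoDecimals("abc.12"));     // prints: true
        System.out.println(isNumberWithTwoDecimals("abc.12")); // prints: false
    }
}
```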

REGEX Expression to exclude toll-free numbers

I have written this regex, which should extract phone numbers while skipping toll-free ones, but when there is a number like 1-800-343-2432 (with a 1 before the 800 part) it doesn't work
(?!(\$|#|800|855|866|877|888))\(?[\\s.-]*([0-9]{3})?[\\s.-]*\)?[\\s.-]*[0-9]{3}[\\s.-]*[0-9]{4}
How can I modify this expression so that it does not take numbers like 1-866-343-1232 either?
Without checking your full regex you can use this regex to block 1-888:
(?!(?:1-)?(\\$|#|800|855|866|877|888))\(?[\\s.-]*([0-9]{3})?[\\s.-]*\)?[\\s.-]*[0-9]{3}[\\s.-]*[0-9]{4}
Prepend (1-)? to your regex. This will work for optional 1-.
Modifying your regular expression:
(\+)?(1-)?\(?(\\$|#|800|855|866|877|888)\)?[\\s.-]*([0-9]{3})?[\\s.-]*\)?[\\s.-]*[0-9]{3}[\\s.-]*[0-9]{4}
The key differences here are the following:
(\+)? :: The ? is an optional quantifier (zero or one), so this matches a + character if one appears before the 1. Many numbers are displayed like +1-800-343-2432
(1-)? :: Matches a 1 followed by a - character. The ? makes the group optional, so 1- is matched only if it exists.
I also added \(? and \)?, which allow you to match numbers presented in the format +1-(800)-343-2432
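A different way to approach the exclusion, not taken from the answers above: first isolate each phone number, then classify it with a pattern that matches toll-free numbers and skip those. The sketch below assumes the input is a single, already-isolated number string; the class and method names are hypothetical:

```java
import java.util.regex.Pattern;

class TollFreeCheck {
    // Optional "+1" or "1" prefix, optional parentheses, then a toll-free area code.
    // Assumption: the input is one isolated phone number, checked with matches().
    private static final Pattern TOLL_FREE = Pattern.compile(
            "(?:\\+?1[\\s.-]*)?\\(?(?:800|855|866|877|888)\\)?"
            + "[\\s.-]*[0-9]{3}[\\s.-]*[0-9]{4}");

    static boolean isTollFree(String number) {
        return TOLL_FREE.matcher(number).matches();
    }

    public static void main(String[] args) {
        System.out.println(isTollFree("1-800-343-2432")); // prints: true
        System.out.println(isTollFree("212-343-2432"));   // prints: false
    }
}
```

Numbers for which isTollFree returns true can then simply be dropped from the results.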

^ and $ in Java regular expression

I know that ^ and $ mean "matches the beginning of the line" and "matches the end of line"
However, when I did some coding today, I didn't notice any difference between including them and excluding them in a regular expression used in Java.
For example, I want to match a positive Integer using
^[1-9]\\d*$
, and when I exclude them in the regular expression like
[1-9]\\d*
, it seems that there is no difference. I have tried to test with a String that "contains" an integer like ###123###, and the second regular expression can still recognize it is not valid like the first one.
So are the two regular expressions above completely equal to the other one? Thanks!
Do you need to search a string like 2343, or [SPACE]2345, or abc234?
The anchored regex will only find the number in the first string. The un-anchored will find them in all strings.
It all depends on what your requirements are. Are you analyzing lines in a text file, where each line contains only digits?, or are you analyzing the text in a prose document or source-code, where digits may be interspersed among a whole bunch of other stuff?
In the former case, the anchors are good. In the latter, they are bad.
More info: http://www.regular-expressions.info/anchors.html
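One possible reason no difference showed up, assuming the test used String.matches() or Matcher.matches(): those methods already require the whole input to match, so the anchors add nothing there. The difference only appears with find(). A small sketch (helper names are invented):

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // The entire input must match the pattern.
    static boolean wholeMatch(String regex, String s) {
        return Pattern.compile(regex).matcher(s).matches();
    }

    // The pattern may match any substring of the input.
    static boolean partialMatch(String regex, String s) {
        return Pattern.compile(regex).matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("[1-9]\\d*", "###123###"));      // prints: false
        System.out.println(partialMatch("[1-9]\\d*", "###123###"));    // prints: true
        System.out.println(partialMatch("^[1-9]\\d*$", "###123###"));  // prints: false
    }
}
```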
They are different: the first expression checks the whole line, from the beginning to the end, while the second doesn't care where in the line the match occurs.
For more check: regex-bounds
Well...no, the regular expressions aren't equivalent. They're also not doing what you think they are.
You intend to match a positive integer - what your regular expression does is match one character between 1 and 9, then match any number of digit characters after that (including none).
The difference between the two is the anchoring, as you've noted - the first regex will only match values that literally begin with a 1 through 9, then zero or more digits, then expect there to be nothing else in the string.
A regex to match a positive integer anywhere in the string would look like this:
[1-9]\\d*
...and a regex to match any line that is a positive integer would be this:
^[1-9]\\d*$
