I want a regular expression in Java that matches a string only if no $ comes before the first occurrence of any digit. So far I've got ([^$].*?)(\\d+?), but it also matches strings where $ comes a few characters before the first digit. Am I missing something?
For example,
dfn$jnjkdd84fjbd$bjk should be invalid ($ comes before 8), while
vsdivnsoi5$ier5girneg is valid (5 and then $).
EDIT: Minimum one digit should be present in the string.
^[^$\\d]*[\\d].*$ should do the trick.
We check that all the characters before the first digit are not "$" and not a number.
final String invalid = "dfn$jnjkdd84fjbd$bjk";
final String valid = "vsdivnsoi5$ier5girneg";
final String regexp = "^[^$\\d]*[\\d].*$";
System.out.println(invalid.matches(regexp)); // false
System.out.println(valid.matches(regexp)); // true
You could use a combination of substring() and matches() like this:
public static void main(String[] args) {
    String s = "adsaa12s$21";
    int dollar = s.indexOf('$'); // -1 if the string contains no $
    if (dollar >= 0) {
        s = s.substring(0, dollar); // keep only the part before the first $
    }
    System.out.println(s);
    System.out.println(s.matches(".*?\\d.*?")); // does that part contain a digit?
}
O/P:
adsaa12s
true
s = "sfs$21";
O/P:
sfs
false
I'd use:
^[^\\$]*\\d+[^\\$]*\\$
(?![^0-9]+\$.*)(?=.*?\d.*?\$.*)(^.*$)
Use this. It uses a lookahead to ensure a digit comes before $.
See demo.
http://regex101.com/r/yX3eB5/8
My friend helped me find a really short regex -
^[^$0-9]+[0-9].*
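One difference between this short regex and the ^[^$\\d]*[\\d].*$ version above is the quantifier on the leading class: + demands at least one character before the first digit, while * also accepts a string that starts with a digit. A minimal sketch to compare the two (the class name, method names, and the extra sample strings are made up for illustration):

```java
import java.util.regex.Pattern;

class FirstDigitCheck {
    // Zero or more leading characters that are neither '$' nor a digit, then a digit.
    static final Pattern STAR = Pattern.compile("^[^$\\d]*\\d.*$");
    // Same idea, but '+' requires at least one such leading character.
    static final Pattern PLUS = Pattern.compile("^[^$0-9]+[0-9].*");

    static boolean star(String s) {
        return STAR.matcher(s).matches();
    }

    static boolean plus(String s) {
        return PLUS.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "dfn$jnjkdd84fjbd$bjk",  // '$' before the first digit
            "vsdivnsoi5$ier5girneg", // digit before '$'
            "5$abc",                 // string starts with a digit
            "abc$"                   // no digit at all
        };
        for (String s : samples) {
            System.out.println(s + " -> star=" + star(s) + ", plus=" + plus(s));
        }
    }
}
```

The two patterns agree on both strings from the question and differ only on "5$abc", so which one to pick depends on whether a leading digit should count as valid.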
I am not quite sure of what is the correct regex for the period in Java. Here are some of my attempts. Sadly, they all meant any character.
String regex = "[0-9]*[.]?[0-9]*";
String regex = "[0-9]*['.']?[0-9]*";
String regex = "[0-9]*["."]?[0-9]*";
String regex = "[0-9]*[\.]?[0-9]*";
String regex = "[0-9]*[\\.]?[0-9]*";
String regex = "[0-9]*.?[0-9]*";
String regex = "[0-9]*\.?[0-9]*";
String regex = "[0-9]*\\.?[0-9]*";
But what I want is the actual "." character itself. Anyone have an idea?
What I'm trying to do actually is to write out the regex for a non-negative real number (decimals allowed). So the possibilities are: 12.2, 3.7, 2., 0.3, .89, 19
String regex = "[0-9]*['.']?[0-9]*";
Pattern pattern = Pattern.compile(regex);
String x = "5p4";
Matcher matcher = pattern.matcher(x);
System.out.println(matcher.find());
The last line is supposed to print false but prints true anyway. I think my regex is wrong though.
Update
To match a non-negative decimal number you need this regex (note that both alternatives require a dot, so a plain integer such as 19 is not matched):
^\d*\.\d+|\d+\.\d*$
or in Java syntax: "^\\d*\\.\\d+|\\d+\\.\\d*$"
String regex = "^\\d*\\.\\d+|\\d+\\.\\d*$";
String string = "123.43253";
if(string.matches(regex))
System.out.println("true");
else
System.out.println("false");
Explanation for your original regex attempts:
[0-9]*\.?[0-9]*
With Java string escaping it becomes:
"[0-9]*\\.?[0-9]*";
If you need to make the dot mandatory, remove the ? mark:
[0-9]*\.[0-9]*
But this will also accept just a dot without any digits. So, if you want the digits to be mandatory, use + (which means one or more) instead of * (which means zero or more). In that case it becomes:
[0-9]+\.[0-9]+
If you are on Kotlin, you can use an extension function:
fun String.findDecimalDigits() =
Pattern.compile("^[0-9]*\\.?[0-9]*").matcher(this).run { if (find()) group() else "" }!!
Your initial understanding was probably right, but you were thrown off because matcher.find() looks for the first valid match anywhere within the string, and every one of your patterns can match a zero-length string.
I would suggest "^([0-9]+\\.?[0-9]*|\\.[0-9]+)$"
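To see the find() versus matches() difference concretely, here is a small sketch. The class and method names are invented for the example; the two patterns are the escaped attempt from the question and the anchored suggestion above.

```java
import java.util.regex.Pattern;

class FindVersusMatches {
    // Every part is optional, so this pattern can match the empty string.
    static final Pattern LOOSE = Pattern.compile("[0-9]*\\.?[0-9]*");
    // Anchored, and each alternative needs at least one digit.
    static final Pattern ANCHORED = Pattern.compile("^([0-9]+\\.?[0-9]*|\\.[0-9]+)$");

    static boolean looseFind(String s) {
        return LOOSE.matcher(s).find();      // succeeds on any substring, even an empty one
    }

    static boolean looseMatches(String s) {
        return LOOSE.matcher(s).matches();   // the whole input must match
    }

    static boolean anchoredFind(String s) {
        return ANCHORED.matcher(s).find();   // anchors force a whole-string match
    }

    public static void main(String[] args) {
        System.out.println(looseFind("5p4"));    // true: find() is satisfied by the leading "5"
        System.out.println(looseMatches("5p4")); // false: 'p' cannot be matched
        for (String s : new String[] {"12.2", "3.7", "2.", "0.3", ".89", "19", "5p4", "."}) {
            System.out.println(s + " -> " + anchoredFind(s));
        }
    }
}
```

With the anchored pattern, find() and matches() give the same answer, which is usually what you want for validation.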
There are actually two ways to match a literal dot. One is backslash-escaping, as you do with \\.; the other is to enclose it in a character class (square brackets), like [.]. Most special characters, including the dot, become literal inside square brackets. Still, \\. shows your intention more clearly than [.] if all you want is to match a literal dot. Use [] when you need to match one of several things: for example, the regex [\\d.] matches a single digit or a literal dot.
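A quick way to convince yourself of the difference between a bare dot, an escaped dot, and a dot in a character class is a sketch like the following (the class name, method names, and the a.c test pattern are only illustrative):

```java
class LiteralDot {
    static boolean anyChar(String s) {
        return s.matches("a.c");      // unescaped dot: any single character
    }

    static boolean escaped(String s) {
        return s.matches("a\\.c");    // backslash-escaped: only a literal dot
    }

    static boolean bracketed(String s) {
        return s.matches("a[.]c");    // character class: only a literal dot
    }

    static boolean digitsOrDots(String s) {
        return s.matches("[\\d.]+");  // one or more digits or literal dots
    }

    public static void main(String[] args) {
        System.out.println(anyChar("abc"));         // true
        System.out.println(escaped("abc"));         // false
        System.out.println(escaped("a.c"));         // true
        System.out.println(bracketed("abc"));       // false
        System.out.println(bracketed("a.c"));       // true
        System.out.println(digitsOrDots("12.5"));   // true
        System.out.println(digitsOrDots("12,5"));   // false
    }
}
```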
I have tested all the cases.
public static boolean isDecimal(String input) {
return Pattern.matches("^[-+]?\\d*[.]?\\d+|^[-+]?\\d+[.]?\\d*", input);
}
I have the following regex that matches a string to pattern:
(?i)(?<![^\\s\\p{Punct}]) : Look behind
(?![^\\s\\p{Punct}]) : Look ahead
Below is an example that demonstrates how I am using it:
public static void main(String[] args) {
String patternStart = "(?i)(?<![^\\s\\p{Punct}])", patternEnd = "(?![^\\s\\p{Punct}])";
String text = "this is some paragraph";
System.out.println(Pattern.compile(patternStart + Pattern.quote("some paragraph") + patternEnd).matcher(text).find());
}
It returns true which is expected result. However, as the regex uses double negative (i.e. negative look ahead/behind and ^), I thought removing both of the negatives should return the same result. So, I tried with the below:
String patternStart = "(?i)(?<=[\\s\\p{Punct}])", patternEnd = "(?=[\\s\\p{Punct}])";
However, it doesn't seem to be working as expected. I even tried adding ^ and/or $ in the end (of the square bracket) to match beginning/end of string, still, no luck.
Is it possible to convert these regexes into positive look-ups?
Yes, it is possible, but it is less efficient than what you have because in the positive lookarounds you need to use alternation:
String patternStart = "(?i)(?<=^|[\\s\\p{Punct}])", patternEnd = "(?=[\\s\\p{Punct}]|$)";
The (?<=^|[\\s\\p{Punct}]) lookbehind requires either the start of the string (^) or (|) a whitespace or punctuation character ([\\s\\p{Punct}]) immediately before the match. The positive lookahead (?=[\\s\\p{Punct}]|$) requires either a whitespace or punctuation character, or the end of the string, immediately after it.
If you just add ^ or $ into the character classes like [\\s\\p{Punct}^] and [\\s\\p{Punct}$], they will be parsed as literal caret and dollar symbols.
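The following sketch compares the three variants discussed here on the example from the question. The helper names and the extra sample texts are made up for illustration; the pattern strings are the ones quoted above.

```java
import java.util.regex.Pattern;

class LookaroundBoundaries {
    // Builds start + quoted phrase + end and reports whether it occurs in text.
    static boolean found(String start, String end, String phrase, String text) {
        return Pattern.compile(start + Pattern.quote(phrase) + end).matcher(text).find();
    }

    // Original double-negative form.
    static boolean negative(String phrase, String text) {
        return found("(?i)(?<![^\\s\\p{Punct}])", "(?![^\\s\\p{Punct}])", phrase, text);
    }

    // Positive lookarounds without alternation: a character is required on both sides.
    static boolean positiveNaive(String phrase, String text) {
        return found("(?i)(?<=[\\s\\p{Punct}])", "(?=[\\s\\p{Punct}])", phrase, text);
    }

    // Positive lookarounds that also accept the start or end of the string.
    static boolean positiveWithAnchors(String phrase, String text) {
        return found("(?i)(?<=^|[\\s\\p{Punct}])", "(?=[\\s\\p{Punct}]|$)", phrase, text);
    }

    public static void main(String[] args) {
        String phrase = "some paragraph";
        String atEnd = "this is some paragraph";
        System.out.println(negative(phrase, atEnd));            // true
        System.out.println(positiveNaive(phrase, atEnd));       // false: nothing follows the phrase
        System.out.println(positiveWithAnchors(phrase, atEnd)); // true
    }
}
```

The naive positive version fails only when the phrase touches the start or end of the input, which is exactly the case the ^ and $ alternatives cover.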
I've got a string in my Java project which looks something like this
9201,92710,94500,920,1002
How can I enter a dot 2 places before the comma? So it looks like this:
920.1,9271.0,9450.0,92.0,100.2
I had an attempt at it but I can't get the last number to get a dot.
numbers = numbers.replaceAll("([0-9],)", "\\.$1");
The result I got is
920.1,9271.0,9450.0,92.0,1002
Note: The length of the string is not always the same. It can be longer / shorter.
Check if string ends with ",". If not, append a "," to the string, run the same replaceAll, remove "," from end of String.
Split string by the "," delimiter, process each piece adding the "." where needed.
Just add a "." at numbers.length-1 to solve the issue with the last number
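For the split-based suggestion above, a regex-free sketch could look like this. The class and method names are hypothetical, and it assumes every non-empty piece should get the dot before its last character:

```java
import java.util.StringJoiner;

class DotBeforeLastDigit {
    static String insertDots(String numbers) {
        StringJoiner out = new StringJoiner(",");
        for (String piece : numbers.split(",")) {
            if (piece.isEmpty()) {
                out.add(piece);            // leave empty pieces untouched
                continue;
            }
            int cut = piece.length() - 1;  // index of the last character
            out.add(piece.substring(0, cut) + "." + piece.substring(cut));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(insertDots("9201,92710,94500,920,1002"));
        // 920.1,9271.0,9450.0,92.0,100.2
    }
}
```

Because each piece is handled on its own, the last number needs no special case.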
As your problem is not only inserting the dot before every comma, but also before end of string, you just must add this additional condition to your capturing group:
numbers = numbers.replaceAll("([0-9](,|$))", "\\.$1");
As suggested by Siguza, you could as well use a non-capturing group which is even more what a "human" would expect to be captured in the capturing group:
numbers = numbers.replaceAll("([0-9](?:,|$))", "\\.$1");
But as non-capturing groups (although a really nice feature) are not available in every regex flavor (POSIX regular expressions lack them, for example) and the overhead is not significant here, I would recommend using the first option.
You could use word boundary:
numbers = numbers.replaceAll("(\\d)\\b", ".$1"); // note the double backslash: "\b" in a Java string literal is a backspace character
Your solution is fine, as long as you put a comma at the end like dan said.
So instead of:
numbers = numbers.replaceAll("([0-9],)", "\\.$1");
write:
numbers = (numbers+",").replaceAll("([0-9],)", "\\.$1");
numbers = numbers.substring(0, numbers.length() - 1);
You may use a positive lookahead to check for the , or end of string right after a digit and a zeroth backreference to the whole match:
String s = "9201,92710,94500,920,1002";
System.out.println(s.replaceAll("\\d(?=,|$)", ".$0"));
// => 920.1,9271.0,9450.0,92.0,100.2
See the Java demo and a regex demo.
Details:
\\d - exactly 1 digit...
(?=,|$) - that must be before a , or end of string ($).
A capturing variation (Java demo):
String s = "9201,92710,94500,920,1002";
System.out.println(s.replaceAll("(\\d)(,|$)", ".$1$2"));
You were right to go for the replaceAll method, but your regex was not matching the end of the string, that is, the last number.
Here is my take on your problem:
public static void main(String[] args) {
String numbers = "9201,92710,94500,920,1002";
System.out.println(numbers.replaceAll("(\\d,|\\d$)", ".$1"));
}
The regex (\\d,|\\d$) matches a digit followed by a comma (\d,) OR (|) a digit followed by the end of the string (\d$).
I have tested it and found it to work.
As others have suggested, you could add a comma at the end, run the replaceAll and then remove it again, but that seems like extra effort.
Example:
public static void main(String[] args) {
String numbers = "9201,92710,94500,920,1002";
//add on the comma
numbers += ",";
numbers = numbers.replaceAll("(\\d,)", "\\.$1");
//remove the comma
numbers = numbers.substring(0, numbers.length()-1);
System.out.println(numbers);
}
I have a string that I want to make sure that the format is always a + followed by digits.
The following would work:
String parsed = inputString.replaceAll("[^0-9]+", "");
if(inputString.charAt(0) == '+') {
result = "+" + parsed;
}
else {
result = parsed;
}
But is there a way to have a regex in the replaceAll that would keep the + (if exists) in the beginning of the string and replace all non digits in the first line?
The following statement with the given regex would do the job:
String result = inputString.replaceAll("(^\\+)|[^0-9]", "$1");
(^\\+) find either a plus sign at the beginning of string and put it to a group ($1),
| or
[^0-9] find a character which is not a number
$1 and replace it with nothing or the plus sign at the start of group ($1)
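A small sketch of this in action (the class name, method name, and sample inputs are made up). It relies on the fact that Java appends nothing for $1 when group 1 did not take part in the match:

```java
class KeepLeadingPlus {
    static String normalize(String input) {
        // Group 1 only participates when a '+' sits at the very start of the string;
        // for every other non-digit, $1 expands to nothing and the character is dropped.
        return input.replaceAll("(^\\+)|[^0-9]", "$1");
    }

    public static void main(String[] args) {
        System.out.println(normalize("+1 (555) 123-4567")); // +15551234567
        System.out.println(normalize("1+2"));               // 12
        System.out.println(normalize("abc+12"));            // 12
    }
}
```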
You can use this expression:
String r = s.replaceAll("((?<!^)[^0-9]|^[^0-9+])", "");
The idea is to replace any non-digit when it is not the initial character of the string (that's the (?<!^)[^0-9] part with a lookbehind) or any character that is not a digit or plus that is the initial character of the string (the ^[^0-9+] part).
Demo.
What about just
(?!^)\D+
Java string:
"(?!^)\\D+"
Demo at regex101.com
\D matches a character that is not a digit [^0-9]
(?!^) using a negative lookahead to check, if it is not the initial character
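One thing to be aware of with this shorter pattern: the lookahead protects whatever the first character is, not just a plus sign. A minimal sketch to illustrate (the class name, method name, and sample inputs are assumptions made for this example):

```java
class SkipFirstCharacter {
    static String strip(String input) {
        // Removes runs of non-digits, except that a match may not begin at position 0.
        return input.replaceAll("(?!^)\\D+", "");
    }

    public static void main(String[] args) {
        System.out.println(strip("+12-34")); // +1234
        System.out.println(strip("12-34"));  // 1234
        System.out.println(strip("a12b"));   // a12 (the leading 'a' survives)
    }
}
```

If the input can start with something other than a digit or a plus, one of the stricter patterns in the other answers may be a better fit.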
Yes you can use this kind of replacement:
String parsed = inputString.replaceAll("^[^0-9+]*(\\+)|[^0-9]+", "$1");
If present and before the first digit in the string, the + character is captured in group 1. For example: dfd+sdfd12+sdf12 returns +1212 (the second + is removed since its position is after the first digit).
Try this:
1- This allows negative and positive numbers: it matches all special characters except a - or + in the first position.
(?!^[-+])[^0-9.]
2- If you only want to allow + at first position
(?!^[+])[^0-9.]
I am trying to understand how to match an email address to the following pattern:
myEmail#something.any
The "any" part should be between 2 and 4 characters long.
Please find Java code below. I cannot understand why it returns true. Thanks!
public static void main(String[] args){
String a = "daniel#gmail.com";
String b = "[a-zA-Z0-9._%+-]+#[a-zA-Z0-9.+-]+.[a-zA-Z]{2,4}";
String c = "MyNameis1#abcx.comfff";
Boolean b1 = c.matches(b);
System.out.println(b1);
}
OUTPUT: true
In regex, . matches any character (except newline). If you want to match . literally, you need to escape it:
[a-zA-Z0-9._%+-]+#[a-zA-Z0-9.+-]+\\.[a-zA-Z]{2,4}
This is better, but with find() it still matches the MyNameis1#abcx.comf portion (String.matches() already requires the whole string to match). We can add an end of string anchor ($) to ensure there are no trailing unmatched characters:
[a-zA-Z0-9._%+-]+#[a-zA-Z0-9.+-]+\\.[a-zA-Z]{2,4}$
Escape the . in String b = "[a-zA-Z0-9._%+-]+#[a-zA-Z0-9.+-]+.[a-zA-Z]{2,4}";. It is a special character in regex.
Use : String b = "[a-zA-Z0-9._%+-]+#[a-zA-Z0-9.+-]+\\.[a-zA-Z]{2,4}";
In your expression the dot means any character. Escape it, and make sure no character follows the 2 to 4 letter check, like below:
[a-zA-Z0-9._%+-]+#[a-zA-Z0-9.+-]+\\.[a-zA-Z]{2,4}$
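To tie the answers together, here is a small sketch comparing the unescaped and escaped patterns with both matches() and find(). The class and constant names are invented for the example; the strings are the ones from the question.

```java
import java.util.regex.Pattern;

class EmailLikePattern {
    static final String UNESCAPED = "[a-zA-Z0-9._%+-]+#[a-zA-Z0-9.+-]+.[a-zA-Z]{2,4}";
    static final String ESCAPED = "[a-zA-Z0-9._%+-]+#[a-zA-Z0-9.+-]+\\.[a-zA-Z]{2,4}";

    static boolean finds(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String bad = "MyNameis1#abcx.comfff";
        String good = "daniel#gmail.com";

        System.out.println(bad.matches(UNESCAPED));  // true: the bare dot matches any character
        System.out.println(bad.matches(ESCAPED));    // false: six letters follow the literal dot
        System.out.println(good.matches(ESCAPED));   // true
        System.out.println(finds(ESCAPED, bad));       // true: find() accepts a partial match
        System.out.println(finds(ESCAPED + "$", bad)); // false: the anchor rules out trailing letters
    }
}
```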