How to use a regex to check the contents of a String in Java

I need to check whether a string contains a minimum of two and a maximum of three commas, plus one hyphen. I'm trying to write a regex to validate this String.
Example:
String address = "Av. Rocio, 45, - Center";
String regex = "//,{2,3}|-{1}";
boolean isValid = address.matches(regex);
But it doesn't work; it always returns false. What did I do wrong? Thanks!

To match a string that has ONLY 2 or 3 commas and exactly 1 hyphen, use:
String regex = "(?s)^(?=([^,]*,){2,3}[^,]*$)(?=[^-]*-[^-]*$).*";
The matches method requires a full-string match, so the pattern must end with .* to consume the input once the lookaheads succeed.
Note that the {1} limiting quantifier is redundant, since - already matches exactly 1 hyphen.
See IDEONE demo.
The regex (where . matches a newline due to (?s) inline dotall modifier) matches:
^ - start of string
(?=([^,]*,){2,3}[^,]*$) - Lookahead that checks the presence of 2 or 3 commas
(?=[^-]*-[^-]*$) - lookahead that requires only 1 hyphen to be in the string
.* - match all the string if the 2 conditions above are satisfied.
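The check above can be sketched as a small helper. The class name AddressCheck and the method isValid are hypothetical names chosen for illustration; only the pattern comes from the answer.

```java
import java.util.regex.Pattern;

class AddressCheck {
    // Pattern from the answer above: 2 or 3 commas and exactly 1 hyphen.
    private static final Pattern VALID = Pattern.compile(
            "(?s)^(?=([^,]*,){2,3}[^,]*$)(?=[^-]*-[^-]*$).*");

    static boolean isValid(String s) {
        // matches() requires the whole input to match, hence the trailing .*
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("Av. Rocio, 45, - Center")); // two commas, one hyphen
        System.out.println(isValid("Av. Rocio, 45 - Center"));  // only one comma
        System.out.println(isValid("a, b, c, d, e - f"));       // four commas
        System.out.println(isValid("a, b, - c - d"));           // two hyphens
    }
}
```

Only the first sample satisfies both lookaheads; the other three are rejected.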

Related

Regex: string can contain spaces, but not only spaces. It cannot contain `*` nor `:` characters either

I need help finding a regex that will allow most strings, except:
if the string only contains whitespaces
if the string contains : or *
I want to reject the following strings:
"hello:world"
"hello*world"
" " (just a whitespace)
But the following strings will pass:
"hello world"
"hello"
So far, I can accomplish what I want... in two patterns.
[^:*]* rejects the 2 special characters
.*\S.* rejects any string with only whitespaces
I'm not sure how to combine these two patterns into one...
I'll be using the regex pattern along with Java.

An example of how you could combine your two patterns for use with the matches method:
"[^:*]*[^:*\\s][^:*]*"
[^\s] is equivalent to \S.
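As a quick check, the combined pattern can be run against the strings from the question. This is a minimal sketch; the class name NotBlankNoSpecials is a hypothetical name used only for illustration.

```java
class NotBlankNoSpecials {
    // Combined pattern from the answer above, for use with String.matches.
    static final String REGEX = "[^:*]*[^:*\\s][^:*]*";

    public static void main(String[] args) {
        String[] samples = {"hello world", "hello", " ", "hello:world", "hello*world"};
        for (String s : samples) {
            // Print each sample in quotes followed by the match result.
            System.out.println("\"" + s + "\" -> " + s.matches(REGEX));
        }
    }
}
```

The middle character class is what forces at least one non-whitespace character that is neither : nor *.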

You could use a negative lookahead:
^(?!\s*$)[^:*]+$
^ - start of string anchor
(?!\s*$) negative lookahead rejecting whitespace-only strings
[^:*]+ - one or more of any character except : and *
$ - end of string anchor
Demo
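In a Java string literal the backslash in this pattern has to be doubled. A minimal sketch, assuming a hypothetical helper class LookaheadCheck with a method accept:

```java
import java.util.regex.Pattern;

class LookaheadCheck {
    // Same pattern as above, with the backslash doubled for a Java string literal.
    static final Pattern P = Pattern.compile("^(?!\\s*$)[^:*]+$");

    static boolean accept(String s) {
        // find() is sufficient because the pattern carries its own ^ and $ anchors.
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(accept("hello world")); // accepted
        System.out.println(accept("   "));         // rejected: whitespace only
        System.out.println(accept("hello:world")); // rejected: contains :
    }
}
```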

You can use matches to match the whole string; in a Java string literal the backslashes are doubled:
\\s*[^\\s:*][^:*]*
Explanation
\s* Match optional whitespace chars
[^\s:*] Match a non whitespace char other than : and *
[^:*]* Match optional chars other than : and *
See a regex demo.
Since \s can also match a newline, use this variant if the match must not span multiple lines:
\\h*[^\\s:*][^\\r\\n:*]*
Explanation
\h* Match optional horizontal whitespace chars
[^\s:*] Match a non whitespace char other than : and *
[^\r\n:*]* Match optional chars other than :, * and newlines
See another regex demo.
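The difference between the two variants only shows up on input that contains a line break. A minimal sketch, with a hypothetical class name and constant names; note that \h is available in java.util.regex from Java 8 onwards.

```java
class NewlineVariants {
    // First variant: \s and [^:*] may both match line breaks.
    static final String ANY_WHITESPACE = "\\s*[^\\s:*][^:*]*";
    // Second variant: horizontal whitespace only, and no \r or \n after the first character.
    static final String SINGLE_LINE = "\\h*[^\\s:*][^\\r\\n:*]*";

    public static void main(String[] args) {
        String oneLine = "  hello world";
        String twoLines = "hello\nworld";

        System.out.println(oneLine.matches(ANY_WHITESPACE));  // true
        System.out.println(oneLine.matches(SINGLE_LINE));     // true
        System.out.println(twoLines.matches(ANY_WHITESPACE)); // true
        System.out.println(twoLines.matches(SINGLE_LINE));    // false
    }
}
```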

Find a three-digit number in a string using replaceAll()

I have a String from which I need to extract a keyword.
Something like: "I have 100 friends and 1 evil".
I need to extract "100" from that String using only the replaceAll function and an appropriate regex.
I tried to do it this way:
String input = "I have 100 friends and 1 evil";
String result = input.replaceAll("[^\\d{3}]", "");
But it doesn't work. Any help would be appreciated.

You can consider any of the solutions below:
String result = input.replaceFirst(".*?(\\d{3}).*", "$1");
String result = input.replaceFirst(".*?(?<!\\d)(\\d{3})(?!\\d).*", "$1");
String result = input.replaceFirst(".*?\\b(\\d{3})\\b.*", "$1");
String result = input.replaceFirst(".*?(?<!\\S)(\\d{3})(?!\\S).*", "$1");
See the regex demo. NOTE you may use replaceAll here, too, but it makes little sense as the replacement must occur only once in this case.
Here,
.*? - matches any zero or more chars other than line break chars, as few as possible
(\d{3}) - captures into Group 1 any three digits
.* - matches any zero or more chars other than line break chars, as many as possible.
The (?<!\d) / (?!\d) lookarounds are digit boundaries: there is no match if the sequence is four or more digits long. \b are word boundaries: there is no match if the three digits are glued to a letter, digit or underscore. The (?<!\S) / (?!\S) lookarounds are whitespace boundaries: there must be whitespace or the start of the string before the match, and whitespace or the end of the string after it.
The replacement is $1, the value of Group 1.
See the Java demo:
String input = "I have 100 friends and 1 evil";
System.out.println(input.replaceFirst(".*?(\\d{3}).*", "$1"));
System.out.println(input.replaceFirst(".*?(?<!\\d)(\\d{3})(?!\\d).*", "$1"));
System.out.println(input.replaceFirst(".*?\\b(\\d{3})\\b.*", "$1"));
System.out.println(input.replaceFirst(".*?(?<!\\S)(\\d{3})(?!\\S).*", "$1"));
All output 100.
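On the sample input all four variants agree, so the difference between the boundary types is easier to see on an input built to separate them. A minimal sketch; the class name, constants and the test string are made up for illustration.

```java
class ThreeDigitBoundaries {
    static final String ANY = ".*?(\\d{3}).*";                   // no boundaries
    static final String DIGIT = ".*?(?<!\\d)(\\d{3})(?!\\d).*";  // digit boundaries
    static final String WORD = ".*?\\b(\\d{3})\\b.*";            // word boundaries
    static final String SPACE = ".*?(?<!\\S)(\\d{3})(?!\\S).*";  // whitespace boundaries

    static String extract(String input, String regex) {
        // The regex consumes the whole line; $1 keeps only the captured digits.
        return input.replaceFirst(regex, "$1");
    }

    public static void main(String[] args) {
        String input = "abc1234 x100y -300 200";
        System.out.println(extract(input, ANY));   // 123
        System.out.println(extract(input, DIGIT)); // 100
        System.out.println(extract(input, WORD));  // 300
        System.out.println(extract(input, SPACE)); // 200
    }
}
```

If no three-digit sequence satisfies the chosen boundaries, replaceFirst finds no match and returns the input unchanged.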

Java int to fraction

How can I change 4 -1/4 -5 to 4/1 -1/4 -5/1 using a regex?
String str = "4 -1/4 -5";
String regex = "(-?\\d+/\\d+)";
Matcher matcher = Pattern.compile(regex).matcher(str);
My code finds only the fractions, but I want to find the integers that have no fraction part.

String result = str.replaceAll("(?<!/)\\b\\d+\\b(?!/)", "$0/1");
This looks for whole numbers (\b\d+\b) that are neither preceded ((?<!/)) nor followed ((?!/)) by a slash, and appends /1 to them.
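A minimal sketch of that replacement wrapped in a method. The class name IntToFraction and the method normalize are hypothetical names for illustration.

```java
class IntToFraction {
    static String normalize(String s) {
        // $0 refers to the whole match, so each bare integer becomes "<integer>/1".
        return s.replaceAll("(?<!/)\\b\\d+\\b(?!/)", "$0/1");
    }

    public static void main(String[] args) {
        System.out.println(normalize("4 -1/4 -5")); // 4/1 -1/4 -5/1
        System.out.println(normalize("12/34 7"));   // 12/34 7/1
    }
}
```

The word boundaries matter for multi-digit numbers: without them the engine could match part of a numerator or denominator such as the 1 in 12/34.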

Try (?<=-| |^)(\d+)(?!\d*\/)
Explanation:
(?<=...) - positive lookbehind, asserts that what precedes matches the pattern inside
-| |^ - match either -, a space, or the beginning of the string ^
(\d+) - match one or more digits and store them in the first capturing group
(?!\d*\/) - negative lookahead, asserts that what follows is not zero or more digits followed by \/.
Replace it with \1/1 (written $1/1 in Java), that is, the first capturing group followed by /1
Demo
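In Java the same idea needs doubled backslashes in the pattern and $1 instead of \1 in the replacement. A minimal sketch under those assumptions; the class and method names are hypothetical.

```java
class LookbehindFraction {
    static String normalize(String s) {
        // The lookbehind accepts '-', a space, or the start of the input before the digits.
        // The lookahead rejects numerators: digits that are followed (after more digits) by '/'.
        return s.replaceAll("(?<=-| |^)(\\d+)(?!\\d*/)", "$1/1");
    }

    public static void main(String[] args) {
        System.out.println(normalize("4 -1/4 -5")); // 4/1 -1/4 -5/1
        System.out.println(normalize("12/3 -40"));  // 12/3 -40/1
    }
}
```

The \d* inside the lookahead is what stops the engine from backtracking to a shorter prefix of a numerator such as the 1 in 12/3.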

I'm not sure I understand what you want to do here, but if you want to replace the slashes with spaces you can use:
str.replaceAll("\\/", " ");
This will leave you with a string having only the integers.

Java Regexp to match words only (', -, space)

What is the Java regular expression to match all words containing only:
Letters from a to z and A to Z
The ', - and space characters, but these must not be at the beginning or the end.
Examples
test'test match
test' doesn't match
'test doesn't match
-test doesn't match
test- doesn't match
test-test match

You can use the following pattern: ^(?!-|'|\\s)[a-zA-Z]*(?!-|'|\\s)$
Below are the examples:
String s1 = "abc";
String s2 = " abc";
String s3 = "abc ";
System.out.println(s1.matches("^(?!-|'|\\s)[a-zA-Z]*(?!-|'|\\s)$"));
System.out.println(s2.matches("^(?!-|'|\\s)[a-zA-Z]*(?!-|'|\\s)$"));
System.out.println(s3.matches("^(?!-|'|\\s)[a-zA-Z]*(?!-|'|\\s)$"));
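Note that the pattern above consumes only letters, so an input such as test'test does not match it. One possible sketch that accepts the examples from the question is shown below. It is not the answer's pattern: it assumes that a separator (', - or a space) must be a single character sitting between two runs of letters. The class name WordCheck is hypothetical.

```java
class WordCheck {
    // Letters, then zero or more groups of "one separator followed by letters".
    // Assumption: separators are single characters and never adjacent to each other.
    static final String REGEX = "[a-zA-Z]+(?:[-' ][a-zA-Z]+)*";

    public static void main(String[] args) {
        String[] samples = {"test'test", "test'", "'test", "-test", "test-", "test-test"};
        for (String s : samples) {
            // Print each sample followed by whether the whole string matches.
            System.out.println(s + " -> " + s.matches(REGEX));
        }
    }
}
```

Because String.matches already requires a full match, no ^ or $ anchors are needed here.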

If you mean the space character, add it to the character class: [a-zA-Z ]
This class matches a-z (lowercase), A-Z (uppercase) and the space character; any other character makes the test fail.

Here's my solution:
/(\w{2,}(-|'|\s)\w{2,})/g
You can take it for a spin on Regexr.
It first checks for a word with \w, then for any of the three separators using "or" logic with |, and then for another word. The braces {} make sure the words on either side are at least 2 characters long, so contractions like don't aren't captured. You could set that to any value to prevent longer words from being captured, or omit the braces entirely.
Caveat: \w also matches _ underscores and digits. If you don't want that, you could replace it with [a-zA-Z] like so:
/([a-zA-Z]{2,}(-|'|\s)[a-zA-Z]{2,})/g

Java replaceAll regex error

I want to transform every "*" into ".*", except for "\*"
String regex01 = "\\*toto".replaceAll("[^\\\\]\\*", ".*");
assertTrue("*toto".matches(regex01));// True
String regex02 = "toto*".replaceAll("[^\\\\]\\*", ".*");
assertTrue("tototo".matches(regex02));// True
String regex03 = "*toto".replaceAll("[^\\\\]\\*", ".*");
assertTrue("tototo".matches(regex03));// Error
If the "*" is the first character, an error occurs:
java.util.regex.PatternSyntaxException:
Dangling meta character '*' near index 0
What is the correct regex ?

This is currently the only solution capable of dealing with multiple escaped \ in a row:
String regex = input.replaceAll("\\G((?:[^\\\\*]|\\\\[\\\\*])*)[*]", "$1.*");
How it works
Let's print the string regex to have a look at the actual string being parsed by the regex engine:
\G((?:[^\\*]|\\[\\*])*)[*]
((?:[^\\*]|\\[\\*])*) matches a sequence of characters not \ or *, or escape sequence \\ or \*. We match all the characters that we don't want to touch, and put it in a capturing group so that we can put it back.
The above sequence is followed by an unescaped asterisk, as described by [*].
In order to make sure that we don't "jump" when the regex can't match an unescaped *, \G is used to make sure the next match can only start at the beginning of the string, or from where the last match ends.
Why such a long solution? It is necessary, since the look-behind construct to check whether the number of consecutive \ preceding a * is odd or even is not officially supported by Java regex. Therefore, we need to consume the string from left to right, taking into account escape sequences, until we encounter an unescaped * and replace it with .*.
Test program
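import java.util.regex.Pattern; // needed for Pattern.compile below

String[] inputs = {
    "toto*",
    "\\*toto",
    "\\\\*toto",
    "*toto",
    "\\\\\\\\*toto",
    "\\\\*\\\\\\*\\*\\\\\\\\*"};
for (String input : inputs) {
    // Replace every unescaped * with .* and keep escape sequences untouched
    String regex = input.replaceAll("\\G((?:[^\\\\*]|\\\\[\\\\*])*)[*]", "$1.*");
    System.out.println(input);
    // Compiling checks that the result is a valid regex; printing a Pattern shows its source
    System.out.println(Pattern.compile(regex));
    System.out.println();
}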
Sample output
toto*
toto.*
\*toto
\*toto
\\*toto
\\.*toto
*toto
.*toto
\\\\*toto
\\\\.*toto
\\*\\\*\*\\\\*
\\.*\\\*\*\\\\.*
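The same left-to-right consumption can also be written as a plain loop without a regex. This is a swapped-in technique, not the answer's method, and it makes one assumption that differs from the regex above: a backslash followed by any character is copied as an escape pair, whereas the regex only recognises \\ and \*. The class name WildcardToRegex is hypothetical.

```java
class WildcardToRegex {
    static String convert(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length()) {
                // Escape pair: copy the backslash and the escaped character untouched.
                out.append(c).append(input.charAt(++i));
            } else if (c == '*') {
                // Unescaped asterisk becomes "any sequence of characters".
                out.append(".*");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(convert("toto*"));      // toto.*
        System.out.println(convert("\\*toto"));    // \*toto
        System.out.println(convert("\\\\*toto"));  // \\.*toto
    }
}
```

On the six inputs of the test program this loop produces the same strings as the \G-based replaceAll.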

You need to use negative lookbehind here:
String regex01 = input.replaceAll("(?<!\\\\)\\*", ".*");
(?<!\\\\) is a negative lookbehind that means match * if it is not preceded by a backslash.
Examples:
regex01 = "\\*toto".replaceAll("(?<!\\\\)\\*", ".*");
//=> \*toto
regex01 = "*toto".replaceAll("(?<!\\\\)\\*", ".*");
//=> .*toto
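A minimal sketch of the lookbehind approach applied to a few inputs, including one where an escaped backslash precedes the asterisk. The class and method names are hypothetical.

```java
class LookbehindDemo {
    static String convert(String input) {
        // Replace every * that is not immediately preceded by a backslash.
        return input.replaceAll("(?<!\\\\)\\*", ".*");
    }

    public static void main(String[] args) {
        System.out.println(convert("*toto"));     // .*toto
        System.out.println(convert("toto*"));     // toto.*
        System.out.println(convert("\\*toto"));   // \*toto (escaped, left alone)
        // Two backslashes then *: the asterisk is still preceded by a backslash,
        // so it is left alone even though the backslash itself is escaped.
        System.out.println(convert("\\\\*toto")); // \\*toto
    }
}
```

The last case is the one a single-character lookbehind cannot distinguish, since it only inspects the character directly before the asterisk.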

You have to cater for the case of a string starting with * in your regex:
(^|[^\\\\])\\*
The single caret represents the 'beginning of the string' ( 'start anchor' ).
Edit
Apart from the correction above, the replacement string in the replaceAll call must be $1.* instead of .* lest a matched character before an unescaped * be lost.
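Putting the corrected pattern and the $1.* replacement together gives the following sketch, checked against the three cases from the question. The class and method names are hypothetical.

```java
class CaretGroupDemo {
    static String convert(String input) {
        // Group 1 is either the start of the string (empty) or the single
        // non-backslash character before the asterisk; $1 puts it back.
        return input.replaceAll("(^|[^\\\\])\\*", "$1.*");
    }

    public static void main(String[] args) {
        System.out.println(convert("\\*toto")); // \*toto
        System.out.println(convert("toto*"));   // toto.*
        System.out.println(convert("*toto"));   // .*toto
        System.out.println("tototo".matches(convert("*toto"))); // true
    }
}
```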
