String replace only when the match is not a subpart of a word in Java - java

String.replace changes more than I want.
For example:
String input = "The blue house Theatres";
input = input.replace("The", "AAA");
The output will be:
"AAA blue house AAAatres"
I don't want the replacement to happen when the match is a subpart of a word.

First you should try to use replaceAll(regex, replacement) instead of replace(literal, replacement) since the latter works on literals only, i.e. you can't use expressions, while the former uses regular expressions to find matches.
Next your regular expression should use word boundaries, e.g. \bthe\b where \b marks a word boundary.
Finally, if you want to do a case-insensitive replacement you'll need to either handle the possible cases in the expression (e.g. \b[tT]he\b) or switch the expression to case-insensitive mode by prepending it with (?i), i.e. (?i)\bthe\b. Note that the expression [tT]he would not match THE while the case-insensitive expression would, so depending on your requirements you'd need to choose one or the other.
Using all that you'd get input = input.replaceAll("(?i)\\bthe\\b", "AAA");.
Edit:
According to your comment on the question you don't want to use word boundaries but only look for characters before and after. You can achieve that with negative look-around expressions, e.g. (?i)(?<![a-z])the(?![a-z]). Note that I used the quite simple character class [a-z] here, if you need to exclude more characters you'd need to expand it.
The above expression would match !The, the, THE? etc. but not Theatre or aether etc., since it requires the match to not be preceded by a letter ((?<![a-z])) and not be followed by one ((?![a-z])).
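To make the difference between the two expressions concrete, here is a minimal sketch. The class and method names are made up for illustration, and the second test string is an assumption chosen to show where \b and the [a-z] look-arounds disagree (underscores and digits count as word characters for \b).

```java
class WholeWordReplace {
    // Replaces "the" only where it stands between word boundaries (case-insensitive).
    static String withWordBoundaries(String input) {
        return input.replaceAll("(?i)\\bthe\\b", "AAA");
    }

    // Replaces "the" only where no ASCII letter sits directly before or after it.
    static String withLookarounds(String input) {
        return input.replaceAll("(?i)(?<![a-z])the(?![a-z])", "AAA");
    }

    public static void main(String[] args) {
        String input = "The blue house Theatres";
        System.out.println(withWordBoundaries(input));           // AAA blue house Theatres
        System.out.println(withLookarounds(input));              // AAA blue house Theatres
        // '_' and '9' are word characters, so \b sees no boundary after "the".
        System.out.println(withWordBoundaries("the_end the9"));  // the_end the9
        System.out.println(withLookarounds("the_end the9"));     // AAA_end AAA9
    }
}
```

Which of the two behaviours is wanted depends on whether digits and underscores should count as part of a word.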

Use a regex with word boundaries \b:
String input = "The blue house Theatres";
input = input.replaceAll("\\bThe\\b", "AAA");

Related

How to use two types of regex in single regex?

I have a string field. I need to pass UUID string or digits number to that field.
So I want to validate this passing value using regex.
sample :
stringField = "1af6e22e-1d7e-4dab-a31c-38e0b88de807";
stringField = "123654";
For UUID I can use,
"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
For digits I can use
"\\d+"
Is there any way to use above 2 pattern in single regex
Yes, you can use | (OR) between those two regexes:
[\\da-f]{8}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{12}|\\d+
Try:
"(?:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})|(?:\\d+)"
You can group regular expressions with () and use | to allow alternatives.
So this will work:
(([0-9a-fA-F]){8}-([0-9a-fA-F]){4}-([0-9a-fA-F]){4}-([0-9a-fA-F]){4}-([0-9a-fA-F]){12})|(\\d+)
Note that I've adjusted your UUID regular expression a little to allow for upper case letters.
How are you applying the regex? If you use matches(), all you have to do is OR them together as @Anirudh said:
return myString.matches(
"[\\da-f]{8}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{12}|\\d+");
This works because matches() acts as if the regex were enclosed in a non-capturing group and anchored at both ends, like so:
"^(?:[\\da-f]{8}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{12}|\\d+)$"
If you use Matcher's find() method, you have to add the group and the anchors yourself. That's because find() returns a positive result if any substring of the string matches the regex. For example, "xyz123<>&&" would match because the "123" matches the "\\d+" in your regex.
But I recommend you add the explicit group and anchors anyway, no matter what method you use. In fact, you probably want to add the inline modifier for case-insensitivity:
"(?i)^(?:[\\da-f]{8}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{12}|\\d+)$"
This way, anyone who looks at the regex will be able to tell exactly what it's meant to do. They won't have to notice that you're using the matches() method and remember that matches() automatically anchors the match. (This will be especially helpful for people who learned regexes in a non-Java context. Almost every other regex flavor in the world uses the find() semantics by default, and has no equivalent for Java's matches(); that's what anchors are for.)
In case you're wondering, the group is necessary because alternation (the | operator) has the lowest precedence of all the regex constructs. This regex would match a string that starts with something that looks like a UUID or ends with one or more digits.
"^[\\da-f]{8}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{12}|\\d+$" // WRONG
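The effect of the group and anchors under find() can be checked with a short sketch. The class, field, and method names are hypothetical; the two patterns are the anchored and the bare versions discussed above.

```java
import java.util.regex.Pattern;

class UuidOrDigits {
    // Grouped and anchored, so it behaves the same under find() and matches().
    private static final Pattern ANCHORED = Pattern.compile(
        "(?i)^(?:[\\da-f]{8}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{12}|\\d+)$");

    // Bare alternation: under find() any matching substring is enough.
    private static final Pattern BARE = Pattern.compile(
        "[\\da-f]{8}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{12}|\\d+");

    static boolean isValid(String s) {
        return ANCHORED.matcher(s).find();
    }

    static boolean looseFind(String s) {
        return BARE.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isValid("1af6e22e-1d7e-4dab-a31c-38e0b88de807")); // true
        System.out.println(isValid("123654"));                               // true
        System.out.println(isValid("xyz123<>&&"));                           // false
        System.out.println(looseFind("xyz123<>&&"));                         // true
    }
}
```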

Regular expression to match a character only once before any whitespace

In Java, what regular expression would I use to match a string that has exactly one colon and makes sure that the colon appears before any whitespace?
For example, it should match these strings:
label: print "Enter input"
But: I still had the money.
ghjkdhfjkgjhalergfyujhrageyjdfghbg:
area:54
But not
label: print "Enter input:"
There was one more thing: I still had the money.
ghfdsjhgakjsdhfkjdsagfjkhadsjkhflgadsjklfglsd
area::54
If you use it with matches (which requires the entire string to match), you could use
[^\\s:]*:[^:]*
Which means: arbitrarily many non-whitespace, non-: characters, then a :, then more arbitrarily many non-: characters.
I've really only used two regex concepts: (negated) character classes and repetition.
If you want to require at least one character before or after :, replace the corresponding * with + (as jlordo pointed out in a comment).
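As a quick check, the expression can be run against the sample strings from the question. This is only a sketch; the class and method names are invented for the example.

```java
class SingleColonBeforeWhitespace {
    // Exactly one ':' in the string, with no whitespace anywhere before it.
    static boolean isLabelLine(String s) {
        return s.matches("[^\\s:]*:[^:]*");
    }

    public static void main(String[] args) {
        String[] samples = {
            "label: print \"Enter input\"",
            "But: I still had the money.",
            "area:54",
            "label: print \"Enter input:\"",
            "There was one more thing: I still had the money.",
            "area::54"
        };
        for (String s : samples) {
            System.out.println(isLabelLine(s) + "  " + s);
        }
    }
}
```

The first three samples print true and the last three print false.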
The following should work:
^[^\s:]*:(?!.*:)
If your strings can contain line breaks, use the DOTALL flag or change the regex to the following:
(?s)^[^\s:]*:(?!.*:)
It depends on what we call whitespace; it could be
[^\\p{Space}:]*:[^:]*
The following should get you started:
Matcher MatchedPattern = Pattern.compile("^(\\w+\\:{1}[\"\\w\\s\\.]*)$").matcher("yourstring");

regular expression to match one or more of char a or just one of char b

I am taking user input through a UI, and I have to validate it. The input text should obey the following condition:
It should end either with one or more whitespace characters OR with just a single '='.
I can use
".*[\s=]+"
but it matches multiple '=' also which I don't want to.
Please help.
You can use alternation:
(\s+|=)$
This expression means match one or more whitespace character or one equals, at the end of the string. The $ is an anchor which matches the end of the string (as you mentioned you're looking for characters at the end of the string).
(As tchrist correctly pointed out in the comments, $ matches the end of line instead of end of string when in multiline mode. If this is true in your case, and you are indeed looking for the end of the string instead of the end of the line, you can use \Z instead, which matches the end of the string regardless of multiline mode.)
If you want to ensure that there is only one = at the end, you can use a lookaround (in this case, a negative lookbehind, specifically). A lookaround is a zero-width assertion which tells the regex engine that the assertion must pass for the pattern to match, but it does not consume any characters.
(\s+|(?<!=)=)$
In this case, (?<!=) tells the regex engine, the character before the current position cannot be an =. When put into the expression, (?<!=)= means that the = will only match if the previous character is not also a =.
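A small sketch of how the lookbehind version behaves, using find() because the expression only describes the end of the string. The class and method names are illustrative, and the sample inputs are assumptions.

```java
import java.util.regex.Pattern;

class TrailingCheck {
    // Ends with one or more whitespace characters, or with a single '='
    // that is not itself preceded by another '='.
    private static final Pattern TAIL = Pattern.compile("(\\s+|(?<!=)=)$");

    static boolean hasValidEnding(String s) {
        return TAIL.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasValidEnding("abc "));  // true
        System.out.println(hasValidEnding("abc="));  // true
        System.out.println(hasValidEnding("abc==")); // false
        System.out.println(hasValidEnding("abc"));   // false
    }
}
```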
Begin string
Anything not "=" ( to avoid the double "==")
One or more blank spaces OR one "="
End of string
^([^=]*(\s+|=))$
Should work :-)
Try this expression:
".*(\\s+|=)"

Regex for java's String.matches method?

Basically my question is this, why is:
String word = "unauthenticated";
word.matches("[a-z]");
returning false? (Developed in Java 1.6)
Basically I want to see if a string passed to me has alpha chars in it.
The String.matches() function matches your regular expression against the whole string (as if your regex had ^ at the start and $ at the end). If you want to search for a regular expression somewhere within a string, use Matcher.find().
The correct method depends on what you want to do:
Check to see whether your input string consists entirely of alphabetic characters (String.matches() with [a-z]+)
Check to see whether your input string contains any alphabetic character (and perhaps some others) (Matcher.find() with [a-z])
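The two checks listed above can be sketched as follows. The class and method names are hypothetical; only String.matches() and Matcher.find() come from the answer.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    private static final Pattern ONE_LETTER = Pattern.compile("[a-z]");

    // Whole string must consist of lowercase ASCII letters.
    static boolean isAllLetters(String s) {
        return s.matches("[a-z]+");
    }

    // At least one lowercase ASCII letter somewhere in the string.
    static boolean containsLetter(String s) {
        return ONE_LETTER.matcher(s).find();
    }

    public static void main(String[] args) {
        String word = "unauthenticated";
        System.out.println(word.matches("[a-z]"));     // false: only a one-character string matches
        System.out.println(isAllLetters(word));        // true
        System.out.println(isAllLetters("abc123"));    // false
        System.out.println(containsLetter("123a456")); // true
    }
}
```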
Your code is checking to see if the word matches one character. What you want to check is if the word matches any number of alphabetic characters like the following:
word.matches("[a-z]+");
With [a-z] you match exactly ONE character.
What you’re probably looking for is [a-z]*

Pattern Matching - String Search

I am trying to work out a formula to match the following pattern:
input string example:
'444'/'443'/'434'/'433'/'344'/'334'/'333'
if any of the patterns above exist in a particular input string I want to match it as the same pattern.
Also, is it possible to do variable substitution using regex? That is, could I treat each of the 3 characters of the string as a variable and increment or decrement it, so that I don't have to hardcode the number ranges in the pattern string for different patterns?
Is there any good library one can use for this? I was working with the Pattern class in Java.
If you have any link which would be helpful please pass it through :)
Thank you.
Let's first consider this pattern: [34]{3}
The […] is a character class, it matches exactly one of the characters in the set. The {n} is an exact finite repetition.
So, [34]{3} informally means "exactly 3 of either '3' or '4'". Thus, it matches "333", "334", "343", "344", "433", "434", "443", "444", and nothing else.
As a string literal, the pattern is "[34]{3}". If you don't want to hardcode this pattern, then just generate similar-looking strings that follow this template "[…]{n}". Just put the characters that you want to match in the …, and substitute n with the number you want.
Here's an example:
String alpha = "aeiou";
int n = 5;
String pattern = String.format("[%s]{%s}", alpha, n);
System.out.println(pattern);
// [aeiou]{5}
We've now seen that the pattern is not hardcoded, but rather programmatically generated depending on the values of the variables alpha and n. The pattern [aeiou]{5} will match 5 consecutive lowercase vowels, e.g. "ooiae", "ioauu", "eeeee", etc.
It's again not clear if you just want to match these kinds of strings, or if they have to appear like '…'/'…'/'…'/'…'/'…'. If the latter is desired, then simply compose the pattern as desired, using repetition and grouping as necessary. You can also just programmatically copy and paste the pattern 5 times if that's simpler. Here's an example:
String p5 = String.format("'%s'/'%<s'/'%<s'/'%<s'/'%<s'", pattern);
System.out.println(p5);
// '[aeiou]{5}'/'[aeiou]{5}'/'[aeiou]{5}'/'[aeiou]{5}'/'[aeiou]{5}'
This will now match strings like "'aeooi'/'eeiuu'/'uaooo'/'eeeia'/'eieio'".
Caveat
Do be careful about what goes in alpha. Specifically, -, [, ], &&, ^, etc., are special metacharacters in a Java character class definition. If you restrict alpha to contain only digits/letters, then you will probably not run into any problems, but e.g. [^a] does NOT mean "either '^' or 'a'". It in fact means "anything but 'a'". See java.util.regex.Pattern for the exact character class syntax.
You can use the regex:
('\\d{3}'/){6}'\\d{3}'
Pattern.compile takes a String as its parameter. Though that's probably most often supplied in the form of a string literal, if you have variable upper and lower bounds for your pattern, you can use something like StringBuilder to build your string, then pass that result to Pattern.compile.
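One possible reading of "variable bounds" is a digit range such as 3 to 4, repeated a fixed number of times. Under that assumption, a sketch of building the pattern with StringBuilder might look like this; the class name, method name, and parameters are all invented for the example.

```java
import java.util.regex.Pattern;

class GeneratedPattern {
    // Builds "[low-high]{n}", e.g. "[3-4]{3}".
    // Assumes low and high are plain digits or letters with low <= high.
    static Pattern digitRun(char low, char high, int n) {
        StringBuilder sb = new StringBuilder();
        sb.append('[').append(low).append('-').append(high)
          .append("]{").append(n).append('}');
        return Pattern.compile(sb.toString());
    }

    public static void main(String[] args) {
        Pattern p = digitRun('3', '4', 3);
        System.out.println(p.pattern());                // [3-4]{3}
        System.out.println(p.matcher("434").matches()); // true
        System.out.println(p.matcher("435").matches()); // false
    }
}
```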
