Regex in java to check for double consonant - java

I want to write a regex in Java to check if a string ends in double consonant.
My regex is not working.
\\w+[^aeiou]\\1$
Appreciate your help
Thanks a ton.

It doesn't work because \1 refers to a capturing group that doesn't exist in your pattern. Wrap the consonant class in a capturing group; the backreference \1 then matches the same text that the group matched.
\\w+([^aeiou])\\1$
Based off the comment above about your regular expression not only matching double consonants, I would consider combining an intersection with negation to make sure the grouped character is an actual letter character.
(?i)\\w+([a-z&&[^aeiou]])\\1$
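A minimal sketch comparing the two patterns from this answer. The class and method names are made up for illustration, and "y" is treated as a consonant because the patterns only exclude a, e, i, o, u.

```java
import java.util.regex.Pattern;

final class DoubleConsonantCheck {
    // Any repeated non-vowel: also accepts digits and punctuation such as "!!".
    static final Pattern ANY_NON_VOWEL = Pattern.compile("\\w+([^aeiou])\\1$");

    // Letters only: intersection of [a-z] with "not a vowel", case-insensitive.
    static final Pattern LETTERS_ONLY = Pattern.compile("(?i)\\w+([a-z&&[^aeiou]])\\1$");

    // Hypothetical helper: true if the string ends in a doubled consonant letter.
    static boolean endsInDoubleConsonant(String s) {
        return LETTERS_ONLY.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"press", "tree", "ball", "foo!!"}) {
            System.out.println(s + " -> loose=" + ANY_NON_VOWEL.matcher(s).find()
                    + ", strict=" + endsInDoubleConsonant(s));
        }
    }
}
```

Here find() is used instead of matches() so the trailing $ does the anchoring; the looser pattern accepts "foo!!" while the intersection version rejects it.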

This might work.
# "(?i)\\w+(?:(?![aeiou])[a-z]){2}$"
(?i) # Case independent
\w+
(?:
(?! [aeiou] ) # Not a vowel ahead
[a-z] # Consonant only
){2}
$
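The commented layout above can be compiled almost as written with Pattern.COMMENTS. The sketch below uses illustrative names and assumes the flags are passed to Pattern.compile rather than inlined. Note that this pattern accepts any two trailing consonants (for example "st"), not only a doubled one, because it has no backreference.

```java
import java.util.regex.Pattern;

final class TwoTrailingConsonants {
    // Pattern.COMMENTS ignores whitespace and treats # as a comment to end of line.
    static final Pattern TWO_CONSONANTS = Pattern.compile(
            "\\w+            # one or more word characters\n"
          + "(?:             # group, repeated twice below\n"
          + "  (?![aeiou])   # next character is not a vowel\n"
          + "  [a-z]         # and it is a letter\n"
          + "){2}\n"
          + "$               # end of input",
            Pattern.COMMENTS | Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if the last two characters are consonant letters.
    static boolean endsInTwoConsonants(String s) {
        return TWO_CONSONANTS.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"ball", "test", "tree"}) {
            System.out.println(s + " -> " + endsInTwoConsonants(s));
        }
    }
}
```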

Related

java regex to capture any number of periods within a string

I am trying to match on any of the following:
$tag:parent.child$
$tag:grand.parent.child$
$tag:great.grand.parent.child$
I have tried a bunch of combos but not sure how to do this without an exp for each one: https://regex101.com/r/cMvx9I/1
\$tag:[a-z]*\.[a-z]*\$
I know this is wrong, but haven't been able to find the right method yet. Help is greatly appreciated.
Your regex was: \$tag:[a-z]*\.[a-z]*\$
You need a repeating group of .name, so use: \$tag:[a-z]+(?:\.[a-z]+)+\$
That assumes there has to be at least 2 names. If only one name is allowed, i.e. no period, then change last + to *.
You can use \$tag:(?:[a-z]+\.)*[a-z]+\$
\$ a literal $
tag: literal tag:
(?:...) a non-capturing group of:
[a-z]+ one or more lower-case letters and
\. a literal dot
* any number of the previous group (including zero of them)
[a-z]+ one or more lower-case letters
\$ a literal $
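As a rough illustration of the breakdown above, the pattern can be used from Java like this. TagMatcher and isTag are invented names, and matches() is assumed because each tag is tested as a whole string.

```java
import java.util.regex.Pattern;

final class TagMatcher {
    // Zero or more "name." prefixes followed by a final name, wrapped in $tag: ... $
    static final Pattern TAG = Pattern.compile("\\$tag:(?:[a-z]+\\.)*[a-z]+\\$");

    // Hypothetical helper: true if the whole string is a well-formed tag.
    static boolean isTag(String s) {
        return TAG.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "$tag:parent.child$",
            "$tag:grand.parent.child$",
            "$tag:great.grand.parent.child$",
            "$tag:parent..child$"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isTag(s));
        }
    }
}
```

Because the group is quantified with *, a single name such as $tag:child$ is accepted too; the earlier pattern with (?:\.[a-z]+)+ would require at least one period.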
The following pattern will match any periods within a string:
\.
Not sure if this is what you want, but you can make a non-capturing group out of a pattern and then find that a certain number of times:
\$tag:(?:[a-z]+?\.*){1,4}\$
\$tag: - Literal $tag:
(?:[a-z]+?\.*) - Non-capturing group of one or more lower-case letters (shortest match) followed by zero or more literal periods
{1,4} - The non-capturing group appears between 1 and 4 times (you can change this as needed, or use a simple + if it could be any number of times).
\$ - Literal $
I normally prefer \w to [a-z]; \w is equivalent to [a-zA-Z0-9_], so whether to use it depends on what you are trying to match.
Hope this helps.

Regex with Whitespace

I am trying to write a regex to match the following:
act=MATCHME
act=Match me too
I have the following regex to match either one but not both. Here is my effort:
matches MATCHME: act=(\w+)
matches Match me too: (\w+\s\w+\s\w+)
Is there any way to combine the two with OR, or am I looking at this wrong?
I am using the Java regex engine.
You may use an optional non-capturing group:
act=(\w+(?:\s+\w+\s+\w+)?)
^^^^^^^^^^^^^^^^^
See the regex demo
The ? matches 1 or 0 occurrences of the quantified subpattern. When it is applied to a grouping construct, the quantification is applied to the whole pattern sequence, so (?:\s+\w+\s+\w+)? matches 1 or 0 sequences of 1+ whitespaces, 1+ word chars, 1+ whitespaces and again 1+ word chars.
You may further subsegment the pattern if you need to capture 2-word substrings after act=.
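A short sketch of how the optional group behaves in Java. ActValue and extract are illustrative names, not part of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ActValue {
    // One word, optionally followed by exactly two more whitespace-separated words.
    static final Pattern ACT = Pattern.compile("act=(\\w+(?:\\s+\\w+\\s+\\w+)?)");

    // Hypothetical helper: returns the captured value, or null when "act=" is absent.
    static String extract(String line) {
        Matcher m = ACT.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("act=MATCHME"));
        System.out.println(extract("act=Match me too"));
        System.out.println(extract("act=Match me"));
    }
}
```

Since the optional group needs two extra words, a two-word value such as "act=Match me" falls back to capturing only the first word.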
Surely you know how to compose regular expressions by alternation.
This regular expression may help you
^[a-zA-Z ]*$

regex expression to remove eed from string

I am trying to replace 'eed' and 'eedly' with 'ee' from words where there is a vowel before either term ('eed' or 'eedly') appears.
So for example, the word indeed would become indee because there is a vowel ('i') that happens before the 'eed'. On the other hand the word 'feed' would not change because there is no vowel before the suffix 'eed'.
I have this regex: (?i)([aeiou]([aeiou])*[e{2}][d]|[dly]\\b)
You can see what is happening with this here.
As you can see, this is correctly identifying words that end with 'eed', but it is not correctly identifying 'eedly'.
Also, the replacement affects every word that ends with 'eed', even words like feed, which should be left unchanged.
What should I be considering here in order to make it correctly identify the words based on the rules I specified?
You can use:
str = str.replaceAll("(?i)\\b(\\w*?[aeiou]\\w*)eed(?:ly)?", "$1ee");
Updated RegEx Demo
\\b(\\w*?[aeiou]\\w*) before eed or eedly makes sure there is at least one vowel in the same word before this.
To speed this regex up, you can use a negated character class:
\\b([^\\Waeiou]*[aeiou]\\w*)eed(?:ly)?
RegEx Breakup:
\\b # word boundary
( # start captured group #1
[^\\Waeiou]* # match 0 or more word characters that are not vowels
[aeiou] # match one vowel
\\w* # followed by 0 or more word characters
) # end captured group #1
eed # followed by literal "eed"
(?: # start non-capturing group
ly # match literal "ly"
)? # end non-capturing group, ? makes it optional
Replacement is:
"$1ee" which means back reference to captured group #1 followed by "ee"
Try dly before d in the alternation; otherwise the regex evaluation stops after finding eed.
(?i)([aeiou]([aeiou])*[e{2}](dly|d))
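For reference, the replaceAll call from the first answer in this thread can be exercised as below. EedStemmer and stripEed are made-up names, and the sample words are only illustrations.

```java
final class EedStemmer {
    // Group 1 keeps everything up to "eed"/"eedly" and must contain at least one vowel.
    static String stripEed(String text) {
        return text.replaceAll("(?i)\\b(\\w*?[aeiou]\\w*)eed(?:ly)?", "$1ee");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"indeed", "feed", "agreedly", "indeed feed"}) {
            System.out.println(s + " -> " + stripEed(s));
        }
    }
}
```

"feed" stays unchanged because no vowel precedes its "eed", while "indeed" and "agreedly" lose the suffix.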

Java Regular Expression (greedy/nongreedy)

So I'm trying to separate the following two groups formatted as:
FIRST - GrouP second.group.txt
The first group can contain any character
The second group is a dot(.) delimited string.
I'm using the following regex to separate these two groups:
([A-Z].+).*?([a-z]+\.[a-z]+)
However, it gives a wrong result:
1: FIRST - GrouP second.grou
2: p.txt
I don't understand, because I'm using the "nongreedy" separator (.*?) instead of the greedy one (.*)
What am I doing wrong here?
Thanks
You can use this regex to match both groups:
\b([A-Z].+?)\s*\b([a-z]+(?:\.[a-z]+)+)\b
RegEx Demo
Breakup:
\b # word boundary
([A-Z].+?) # match [A-Z] followed by 1 or more chars (lazy)
\s* # match 0 or more spaces
\b # word boundary
([a-z]+ # match 1 or more of [a-z] chars
(?:\.[a-z]+)+) # match a group of dot followed by 1 or more [a-z] chars
\b # word boundary
PS: (?:..) is used for non-capturing group.
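The sketch below runs both the question's pattern and the pattern from this answer on the sample input, to show where the split lands. GroupSplit and split are illustrative names. In the original pattern the greedy .+ inside the first group consumes as much as it can, so the lazy .*? after it has nothing left to decide.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupSplit {
    // Pattern from the question: greedy .+ in group 1 swallows most of the input.
    static final Pattern GREEDY_ORIGINAL =
            Pattern.compile("([A-Z].+).*?([a-z]+\\.[a-z]+)");

    // Pattern from the answer: lazy .+? in group 1, dotted lower-case name in group 2.
    static final Pattern LAZY_FIXED =
            Pattern.compile("\\b([A-Z].+?)\\s*\\b([a-z]+(?:\\.[a-z]+)+)\\b");

    // Hypothetical helper: returns {group1, group2}, or an empty array if nothing matches.
    static String[] split(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? new String[] {m.group(1), m.group(2)} : new String[0];
    }

    public static void main(String[] args) {
        String input = "FIRST - GrouP second.group.txt";
        for (Pattern p : new Pattern[] {GREEDY_ORIGINAL, LAZY_FIXED}) {
            String[] parts = split(p, input);
            System.out.println("1: " + parts[0] + " | 2: " + parts[1]);
        }
    }
}
```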
This is one possible solution that should be pretty compact:
(.*?-\s*\S+)|(\S+\.?)+
https://regex101.com/r/iW8mE5/1
It looks for anything followed by a dash, zero or more spaces, and then non-whitespace characters. If it doesn't find that, it looks for non-whitespace followed by an optional period.

Can I negate the dot?

The following regular expression matches the character a:
"a"
The following regular expression matches all characters except a:
"[^a]"
The following regular expression matches a ton of characters:
"."
How do I match everything that is not matched by "."? I can't use the same technique as above:
"[^.]"
because inside the brackets, the . changes meaning and only stands for the character . itself :(
The below negative lookahead will work.
(?:(?!.)[\S\s])
Java regex would be,
"(?:(?!.)[\\S\\s])"
DEMO
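The idea behind the above regex is that it matches only the characters which aren't matched by a dot when DOTALL mode is off, that is, the line terminators such as \n and \r (in Java also \u0085, \u2028 and \u2029). Characters like \t and \f are matched by a dot, so this regex does not match them.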
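A small check of the lookahead pattern in Java, assuming default flags (no DOTALL, no UNIX_LINES). NotDot and isUnmatchedByDot are invented names for this sketch.

```java
import java.util.regex.Pattern;

final class NotDot {
    // (?!.) succeeds only where "." cannot match; [\S\s] then consumes that character.
    static final Pattern NOT_DOT = Pattern.compile("(?:(?!.)[\\S\\s])");

    // Hypothetical helper: true if the default "." would NOT match this character.
    static boolean isUnmatchedByDot(char c) {
        return NOT_DOT.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        char lineSeparator = (char) 0x2028; // Unicode LINE SEPARATOR
        System.out.println("\\n    -> " + isUnmatchedByDot('\n'));
        System.out.println("\\r    -> " + isUnmatchedByDot('\r'));
        System.out.println("U+2028 -> " + isUnmatchedByDot(lineSeparator));
        System.out.println("\\t    -> " + isUnmatchedByDot('\t'));
        System.out.println("a      -> " + isUnmatchedByDot('a'));
    }
}
```

With default flags, only line terminators pass this check; a tab or an ordinary letter is matched by the dot and is therefore rejected.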
"[^\\.]"
Use a double backslash in a Java string literal to escape characters that have a special meaning in a regex, for example:
\\.\\]\\[\\-\\)\\(\\?
