How to write this Java regex? - java

I need to split a camel-case string into words joined by hyphens. For example:
"WorkInProgress" is converted to "Work-In-Progress"
"NotComplete" is converted to "Not-Complete"
In most cases, each word starts with a capital letter and ends with a lowercase letter.
But there is one exception, "CIInProgress" should be converted to "CI-In-Progress".
I wrote the code below: any run of lowercase letters, or "CI", that is followed by a capital letter should get a "-" inserted in between. But it still doesn't work for "CIInProgress". Can anyone tell me how to correct it?
String str = "CIInProgress";
String pattern = "([a-z|CI]+)([A-Z])";
str= str.replaceAll(pattern, "$1\\-$2");

You could use a negative lookbehind.
Regex:
(?<!^)([A-Z][a-z])
Replacement string:
-$1
Explanation:
(?<!^) is a negative lookbehind. It asserts that what precedes the match is not the start-of-string anchor, so the first word is never prefixed with a hyphen.
[A-Z][a-z] matches an uppercase letter immediately followed by a lowercase letter, but only where the lookbehind condition holds.
() is a capturing group. The matched characters are stored in the group, and you can refer to them later by the group's index number, as $1 does in the replacement string.
Code:
System.out.println("WorkInProgress".replaceAll("(?<!^)([A-Z][a-z])", "-$1"));
System.out.println("NotComplete".replaceAll("(?<!^)([A-Z][a-z])", "-$1"));
System.out.println("CIInProgress".replaceAll("(?<!^)([A-Z][a-z])", "-$1"));
Output:
Work-In-Progress
Not-Complete
CI-In-Progress

You can't use | for alternation inside a character class; it is interpreted as a literal vertical bar character. So [a-z|CI] matches any single lowercase letter, a |, a C, or an I, not the sequence "CI". Try:
String pattern = "([a-z]+|CI)([A-Z])";
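To see the difference between the two patterns, here is a minimal sketch that runs both on the question's inputs. The class and method names are hypothetical, and the replacement is written as "$1-$2", which behaves the same as the question's "$1\\-$2".

```java
class HyphenateAlternation {
    // Asker's original pattern: inside [...], '|', 'C' and 'I' are single characters.
    static final String CHAR_CLASS = "([a-z|CI]+)([A-Z])";
    // Corrected pattern from the answer: the alternation is moved outside the class.
    static final String ALTERNATION = "([a-z]+|CI)([A-Z])";

    // Inserts a hyphen between group 1 and the capital letter that follows it.
    static String hyphenate(String input, String pattern) {
        return input.replaceAll(pattern, "$1-$2");
    }

    public static void main(String[] args) {
        System.out.println(hyphenate("WorkInProgress", ALTERNATION)); // Work-In-Progress
        System.out.println(hyphenate("NotComplete", ALTERNATION));    // Not-Complete
        System.out.println(hyphenate("CIInProgress", ALTERNATION));   // CI-In-Progress
        // The character-class version swallows "CIIn" as one run:
        System.out.println(hyphenate("CIInProgress", CHAR_CLASS));    // CIIn-Progress
    }
}
```

Note that the alternation only knows about "CI"; other acronyms would need to be added to it explicitly.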

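Try this, which inserts a hyphen at every position that has a lowercase letter on its left and an uppercase letter on its right: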
str= str.replaceAll("(?<=\\p{javaLowerCase})(?=\\p{javaUpperCase})", "-");
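On its own, that lowercase/uppercase boundary does not separate "CI" from "In", because both sides of that boundary are uppercase. The sketch below shows this and adds a second boundary as one possible way to cover the acronym case. The second boundary is an assumption of this example, not part of the original answer, and the class and constant names are hypothetical.

```java
class HyphenateLookaround {
    // Boundary from the answer: lowercase on the left, uppercase on the right.
    static final String LOWER_UPPER = "(?<=\\p{javaLowerCase})(?=\\p{javaUpperCase})";
    // Assumed extra boundary: between two uppercase letters when the second
    // one is followed by a lowercase letter (the end of an acronym).
    static final String ACRONYM_END =
            "(?<=\\p{javaUpperCase})(?=\\p{javaUpperCase}\\p{javaLowerCase})";

    // Both patterns are zero-width, so replaceAll only inserts hyphens.
    static String hyphenate(String input) {
        return input.replaceAll(LOWER_UPPER + "|" + ACRONYM_END, "-");
    }

    public static void main(String[] args) {
        System.out.println("CIInProgress".replaceAll(LOWER_UPPER, "-")); // CIIn-Progress
        System.out.println(hyphenate("CIInProgress"));                   // CI-In-Progress
        System.out.println(hyphenate("WorkInProgress"));                 // Work-In-Progress
        System.out.println(hyphenate("NotComplete"));                    // Not-Complete
    }
}
```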

Related

Replace all non numeric characters by only one word

I want to do the following replacement:
WORD1234 -> W1234
So, I'm using the regex:
([^\d]*)([0-9]+)([^\d]*)
Replacement: W$2
If the word is WORD1234AAAAA, using the previous regex I have the same result: W1234, which is what I want.
But if the word is WO12RD34 the result I have is: W12W34
What I want basically in all the cases is to remove all non-numeric characters and add the letter W at the beginning.
Update:
The input string does not always start with a W. It can be for example ABC12DE34 and the desired result is: FA1234. Meaning, remove all non-numeric characters and add a word at the beginning.
Try this:
String regex = "(?<start>^W)|(\\D)";
String replacement = "${start}";
System.out.println("WO12RD34".replaceAll(regex, replacement)); //prints W1234
System.out.println("WORD1234AAAAA".replaceAll(regex, replacement)); //prints W1234
With this regex, the "start" capturing group is set only when a W at the very beginning of the string is matched. Otherwise, it is empty.
The idea is that when ^W matches, the named group "start" captures that W, and the replacement ${start} puts it back unchanged.
When any other non-digit character is matched, the "start" group does not take part in the match, so ${start} expands to nothing and the character is removed.
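For the updated requirement, where the input need not start with W and a fixed prefix such as FA is wanted, a simpler route may be to strip every non-digit and prepend the prefix. This is a minimal sketch under that reading of the question; the class and method names are hypothetical.

```java
class DigitsWithPrefix {
    // Removes every non-digit character (\D) and prepends the given prefix.
    static String digitsWithPrefix(String input, String prefix) {
        return prefix + input.replaceAll("\\D", "");
    }

    public static void main(String[] args) {
        System.out.println(digitsWithPrefix("ABC12DE34", "FA"));    // FA1234
        System.out.println(digitsWithPrefix("WO12RD34", "W"));      // W1234
        System.out.println(digitsWithPrefix("WORD1234AAAAA", "W")); // W1234
    }
}
```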

Regular expression to match a string with at least one letter and all characters in lower case

I am new to regular expressions. I am looking for an efficient way to write a regular expression that matches a string with at least one letter, where all letters are in lower case.
Ex-
test->true
tesT->false
test123->true
test##$->true
ABC->false
teST123->false
Please help me on this.
Thanks in advance.
I think what you're looking for is this:
^[^A-Z]*[a-z]+[^A-Z]*$
A string that matches this must contain at least one lowercase letter and cannot have uppercase letters.
The ^ and $ force the regular expression to match the whole string (not just a part of it). The [^A-Z]* means an empty string or a string containing no uppercase letters. It appears on both sides of [a-z]+, which is a string of one or more lowercase letters.
Try this Regex:
^(?=[^A-Z]+$)(?=[^a-z]*[a-z]).*$
Explanation:
^ - asserts the start of the string
(?=[^A-Z]+$) - Positive lookahead to validate that no capital letter is present anywhere from the start to the end of the string
(?=[^a-z]*[a-z]) - Positive lookahead to validate that there is at least one lowercase letter a-z
.* - matches 0+ occurrences of any character except the newline character. This works in conjunction with the two conditions above.
$ - asserts the end of the string
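Since the surrounding questions use Java, here is a minimal sketch that checks the two patterns above against the question's examples. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class LowercaseCheck {
    // First answer: no uppercase anywhere, at least one lowercase letter.
    static final Pattern PLAIN = Pattern.compile("^[^A-Z]*[a-z]+[^A-Z]*$");
    // Second answer: the same two conditions expressed as lookaheads.
    static final Pattern LOOKAHEAD = Pattern.compile("^(?=[^A-Z]+$)(?=[^a-z]*[a-z]).*$");

    static boolean matchesPlain(String s) {
        return PLAIN.matcher(s).matches();
    }

    static boolean matchesLookahead(String s) {
        return LOOKAHEAD.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {"test", "tesT", "test123", "test##$", "ABC", "teST123"};
        for (String s : inputs) {
            // Prints: true, false, true, true, false, false (for both patterns)
            System.out.println(s + " -> " + matchesPlain(s) + " / " + matchesLookahead(s));
        }
    }
}
```

Both patterns also reject a string such as "123", because it contains no lowercase letter.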
This should do what you ask for
[a-z]+
Try this (Python):
import re
list1 = ['test','tesT','test123','test##$','ABC','teST123']
for word in list1:
    # Matches only if the whole word contains no uppercase ASCII letter.
    matchtext = re.match("(^[^A-Z]+$)", word)
    print(matchtext)
output:
<_sre.SRE_Match object; span=(0, 4), match='test'>
None
<_sre.SRE_Match object; span=(0, 7), match='test123'>
<_sre.SRE_Match object; span=(0, 7), match='test##$'>
None
None

How to write a regex to match this String in Java?

I want to write a regular expression for the following string:
String s = "fdrt45B45"; // s is a password (just an example) with only
// letters and digits(must have letters and digits)
I tried with this pattern:
String pattern= "^(\\w{1,}|\\d{1,})$"; //but it doesn't work
I want no match if my password lacks letters or lacks digits, and a match if it contains both.
I know that \w is a word character: [a-zA-Z_0-9], and \d is a digit: [0-9], but I cannot combine \w and \d to get what I want. Any help or tips would be much appreciated by a newbie.
A positive lookahead would do the trick:
String s = "fdrt45B45";
System.out.println(s.matches("(?=.*[a-zA-Z])(?=.*\\d)[a-zA-Z0-9]+"));
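You should be able to use the following regex to achieve what you want. It uses two positive lookaheads and matches any string containing at least one digit and at least one letter. The letter check uses [a-zA-Z] rather than \w, because \w also matches digits and underscores. Note that the trailing .* allows any other characters as well.
^(?=.*\\d)(?=.*[a-zA-Z]).*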
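As a quick check of the lookahead approach, the sketch below wraps the first answer's pattern in a helper and runs it on a few inputs. The class name, method name, and the extra sample passwords are assumptions made for illustration.

```java
class PasswordCheck {
    // Requires at least one ASCII letter, at least one digit,
    // and nothing but ASCII letters and digits overall.
    static boolean hasLettersAndDigitsOnly(String s) {
        return s.matches("(?=.*[a-zA-Z])(?=.*\\d)[a-zA-Z0-9]+");
    }

    public static void main(String[] args) {
        System.out.println(hasLettersAndDigitsOnly("fdrt45B45")); // true
        System.out.println(hasLettersAndDigitsOnly("abcdef"));    // false: no digit
        System.out.println(hasLettersAndDigitsOnly("123456"));    // false: no letter
        System.out.println(hasLettersAndDigitsOnly("abc_123"));   // false: underscore
    }
}
```

String.matches already requires the whole input to match, so the ^ and $ anchors are not needed here.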

How to match a substring following after a string satisfying the specific pattern

Imagine that I have the string 12.34some_text.
How can I match the substring that follows the second character (4 in my case) after the . character? In this particular case, the string I want to match is some_text.
For the string 56.78another_text it will be another_text and so on.
All accepted strings have the pattern \d\d\.\d\d\w*
If you wish to match everything from the second character after a specific one (i.e. the dot) you can use a lookbehind, like this:
(?<=[.]\d{2})(\w*)
(?<=[.]\d{2}) is a positive lookbehind. It asserts that the match is immediately preceded by a dot [.] and two digits \d{2}, without making them part of the match.
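In Java, the lookbehind pattern could be applied with Pattern and Matcher roughly as follows. This is a minimal sketch; the class and method names are hypothetical, and returning null when nothing matches is a choice made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AfterDecimals {
    // The lookbehind requires ".dd" right before the match; group 1 is the rest.
    static final Pattern TAIL = Pattern.compile("(?<=[.]\\d{2})(\\w*)");

    // Returns the word characters after the two decimals, or null if none found.
    static String tail(String input) {
        Matcher m = TAIL.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(tail("12.34some_text"));    // some_text
        System.out.println(tail("56.78another_text")); // another_text
    }
}
```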
Since you are using Java and every accepted string has the pattern \d\d\.\d\d\w*, the prefix is always five characters long, so you can get some_text from 12.34some_text without a regex:
String s = "12.34some_text";
String rest = s.substring(5); // "some_text"
You can then compare the substring.

match whole sentence with regex

I'm trying to match sentences without capital letters with regex in Java:
"Hi this is a test" -> Shouldn't match
"hi thiS is a test" -> Shouldn't match
"hi this is a test" -> Should match
I've tried the following regex, but it also matches my second example ("hi thiS is a test").
[a-z]+
It seems like it's only looking at the first word of the sentence.
Any help?
[a-z]+ will match if your string contains any lowercase letter.
If you want to make sure your string doesn't contain uppercase letters, you could use a negative character class: ^[^A-Z]+$
Be aware that this won't handle accented characters (like É), though.
To handle those as well, you can use Unicode properties: ^\P{Lu}+$
\P{...} means "not in this Unicode category", and Lu is the category of uppercase letters that have a lowercase variant.
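The difference between the two patterns shows up on accented input. The following is a minimal sketch with hypothetical class and method names; the accented sample word is written with Unicode escapes so the source file's encoding does not matter.

```java
class NoUppercase {
    // ASCII-only check: rejects A-Z but knows nothing about other scripts.
    static boolean asciiOnly(String s) {
        return s.matches("^[^A-Z]+$");
    }

    // Unicode-aware check: rejects any character in category Lu.
    static boolean unicodeAware(String s) {
        return s.matches("^\\P{Lu}+$");
    }

    public static void main(String[] args) {
        String capitalized = "\u00C9t\u00E9"; // "Été", starts with an uppercase É
        System.out.println(asciiOnly("hi this is a test"));    // true
        System.out.println(asciiOnly("hi thiS is a test"));    // false
        System.out.println(asciiOnly(capitalized));            // true (É slips through)
        System.out.println(unicodeAware(capitalized));         // false
        System.out.println(unicodeAware("hi this is a test")); // true
    }
}
```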
^[a-z ]+$
Try this. It matches only the valid sentences.
It's not matching because you haven't included a space in the pattern, so your regex only matches whole words with no spaces.
Try something like ^[a-z ]+$ instead (notice the space inside the square brackets). You can also use \s, which is shorthand for "whitespace characters", but be aware that it also includes things like line feeds and carriage returns.
This pattern does the following:
^ matches the start of a string
[a-z ]+ matches any a-z character or a space, where 1 or more exists.
$ matches the end of the string.
I would actually advise against using a regex in this case, since you don't seem to use extended characters.
Instead try to test as following:
myString.equals(myString.toLowerCase());
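Wrapped in a helper, that comparison might look like the sketch below. The class and method names are hypothetical, and passing Locale.ROOT is an assumption of this example so that the result does not depend on the machine's default locale.

```java
import java.util.Locale;

class LowercaseEquals {
    // True when lowercasing changes nothing, i.e. no uppercase letters are present.
    static boolean hasNoUppercase(String s) {
        return s.equals(s.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(hasNoUppercase("hi this is a test")); // true
        System.out.println(hasNoUppercase("Hi this is a test")); // false
        System.out.println(hasNoUppercase("hi thiS is a test")); // false
    }
}
```

Unlike ^[a-z ]+$, this check also accepts digits, punctuation, and the empty string, since none of those change when lowercased.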
