Java String Regex replacement - java

Sample Input:
a:b
a.in:b
asds.sdsd:b
a:b___a.sds:bc___ab:bd
Sample Output:
a:replaced
a.in:replaced
asds.sdsd:replaced
a:replaced___a.sds:replaced___ab:replaced
The string that follows each : should be replaced with the result of a custom function.
I have done this without regex, but I feel regex is a better fit, since we are extracting a string that follows a specific pattern.
For the first three cases it's simple enough to extract the string after :, but I couldn't find a way to handle the fourth case unless I split the string on ___, apply the approach for the first type of pattern to each piece, and concatenate the results again.

Replace only the letters that come right after : with the string replaced.
string.replaceAll("(?<=:)[A-Za-z]+", "replaced");
If you also want to handle digits, add \d to the character class:
string.replaceAll("(?<=:)[A-Za-z\\d]+", "replaced");

(:)[a-zA-Z]+
You can do this with string.replaceAll: replace with $1replaced. See the demo.
https://regex101.com/r/fX3oF6/18
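Since the question asks for a custom function rather than a fixed replacement string, here is a minimal sketch of how the look-behind pattern above could be combined with Matcher.appendReplacement. The class name ColonValueReplacer and the custom method (which just upper-cases the value) are hypothetical stand-ins, not something from the thread.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ColonValueReplacer {
    // Matches the run of letters/digits that directly follows a colon.
    private static final Pattern AFTER_COLON = Pattern.compile("(?<=:)[A-Za-z\\d]+");

    // Hypothetical stand-in for the "custom function" mentioned in the question.
    static String custom(String value) {
        return value.toUpperCase(Locale.ROOT);
    }

    // Replaces every value after ':' with custom(value).
    static String replaceValues(String input) {
        Matcher m = AFTER_COLON.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement stops '$' and '\' in the result from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(custom(m.group())));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceValues("a:b___a.sds:bc___ab:bd")); // a:B___a.sds:BC___ab:BD
        // Fixed replacement using the capturing-group variant from the second answer:
        System.out.println("a.in:b".replaceAll("(:)[a-zA-Z]+", "$1replaced")); // a.in:replaced
    }
}
```

On Java 9 or later, Matcher.replaceAll also accepts a function, which would shorten the loop; the appendReplacement form is used here because it works on older versions too.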

Related

Extract specific data from string with regex

I want to capture multiple substrings that match specific patterns.
For example, my string looks like this:
String textData = "#1_Label for UK#2_Label for US#4_Label for FR#";
I want to get the strings between two # characters that contain a given string, such as UK.
The output should look like this:
If the match string is UK, then
the output should be 1_Label for UK
If the match string is label, then
the output should be 1_Label for UK, 2_Label for US and 4_Label for FR
If the match string is 1_, then
the output should be 1_Label for UK
I don't want to extract the data via an array list, and the extraction should be case-insensitive.
Can you please help me with this problem?
Regards,
Ashish Mishra
You can use this regex for search:
#([^#]*?Label[^#]*)(?=#)
Replace Label with your search keyword.
Java Pattern:
Pattern p = Pattern.compile( "#([^#]*?" + Pattern.quote(keyword) + "[^#]*)(?=#)" );
If the data is always between two hashes, try a regex like this: (?i)#.*your_match.*# where your_match would be UK, label, 1_ etc.
Then use this expression in conjunction with the Pattern and Matcher classes.
If you want to match multiple strings, you'd need to exclude the hashes from the match by using look-around methods as well as reluctant modifiers, e.g. (?i)(?<=#).*?label.*?(?=#).
Short breakdown:
(?i) will make the expression case insensitive
(?<=#) is a positive look-behind, i.e. the match must be preceded by a hash (but doesn't include the hash)
.*? matches any sequence of characters but is reluctant, i.e. it tries to match as few characters as possible
(?=#) is a positive look-ahead, which means the match must be followed by a hash (also not included in the match)
Without the look-around methods the hashes would be included in the match and thus using Matcher.find() you'd skip every other label in your test string, i.e. you'd get the matches #1_Label for UK# and #4_Label for FR# but not #2_Label for US#.
Without the reluctant modifiers the expression would match everything between the first and the last hash.
A better alternative is to replace .*? with [^#]*, which means the match cannot contain any hash. That removes the need for reluctant modifiers and also fixes the problem that looking for US would match 1_Label for UK#2_Label for US.
So most probably the final regex you're after looks like this: (?i)(?<=#)[^#]*your_match[^#]*(?=#).
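A minimal sketch of that final regex used with Pattern and Matcher is shown here. The class name HashSegmentFinder and the find method are hypothetical, and the list is only a convenient way to show the matches; the keyword is passed through Pattern.quote so that characters such as . or + in it are taken literally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HashSegmentFinder {
    // Returns every '#'-delimited segment of text that contains keyword, ignoring case.
    static List<String> find(String text, String keyword) {
        Pattern p = Pattern.compile(
                "(?<=#)[^#]*" + Pattern.quote(keyword) + "[^#]*(?=#)",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.group()); // the hashes are excluded by the look-arounds
        }
        return hits;
    }

    public static void main(String[] args) {
        String textData = "#1_Label for UK#2_Label for US#4_Label for FR#";
        System.out.println(find(textData, "UK"));    // [1_Label for UK]
        System.out.println(find(textData, "label")); // [1_Label for UK, 2_Label for US, 4_Label for FR]
        System.out.println(find(textData, "1_"));    // [1_Label for UK]
    }
}
```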
([^#]*UK[^#]*) for UK
([^#]*Label[^#]*) for Label
([^#]*1_[^#]*) for 1_
Try this. Grab the captures. See the demos.
http://regex101.com/r/kQ0zR5/3
http://regex101.com/r/kQ0zR5/4
http://regex101.com/r/kQ0zR5/5
I have solved this problem with the pattern below:
(?i)([^#]*?us[^#]*)(?=#)
Thank you so much Anubhava, VKS and Thomas for your replies.
Regards,
Ashish Mishra

Remove repeating set of characters in a string

I want to remove the sequence "-~-~-" when it repeats in a string, but only when the repeats are adjacent.
I tried to create a regex based on the one that collapses multiple whitespace characters:
test.replaceAll("\\s+", " ");
Unfortunately I was unsuccessful. Can someone please help me write the correct regex? Thanks.
Example:
String test = "hello-~-~--~-~--~-~-";
output:
hello-~-~-
Another example
String test = "-~-~--~-~--~-~-hello-~-~--~-~--~-~-";
output:
-~-~-hello-~-~-
The regex is:
test.replaceAll("(-~-~-){2,}", "-~-~-")
replaceAll replaces all occurrences matched by the regex (the first parameter) with the second parameter.
The () groups the expression -~-~- together; {2,} means two or more occurrences.
EDIT
Like @anubhava said, instead of using -~-~- for the replacement string, you could also use $1, which backreferences the first capturing group (i.e. the expression in the regex surrounded by ()).
test.replaceAll("(-~-~-)+", "-~-~-");
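This is the regex you need:
(-~-~-){2,}
Note the open-ended {2,}: a plain {2} would collapse only pairs, so three repeats in a row would still leave two copies behind.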
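Putting the pieces together, here is a small sketch that runs the two examples from the question through the {2,} quantifier with $1 as the replacement. The class name RepeatCollapser and the collapse method are made up for illustration.

```java
class RepeatCollapser {
    // Collapses two or more adjacent copies of "-~-~-" into a single copy.
    // "$1" refers to the capturing group, so the sequence is written only once.
    static String collapse(String s) {
        return s.replaceAll("(-~-~-){2,}", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("hello-~-~--~-~--~-~-"));                // hello-~-~-
        System.out.println(collapse("-~-~--~-~--~-~-hello-~-~--~-~--~-~-")); // -~-~-hello-~-~-
    }
}
```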

if then condition using regex in java

I have a pattern that looks like this:
String1 :"String2",
I have to validate this pattern. There are two cases: String1 can contain special characters only if it is enclosed in double quotes.
eg: "xxxx-xxx" :"yyyyyyyy",--------> is valid
but xxxx-xxx :"yyyyyyyy",--------> is not valid
"xxxx-xxx :"yyyyyyyy",--------> is not valid
So I need a regex that checks whether the double quotes are closed properly if they are present in String1.
Short answer: Regex doesn't work like that.
What you can do however, is to use two separate patterns to validate:
\"[^\"]+?\" :.*
To check the one that can contain special characters, and:
[a-zA-Z]+? :.*
To check the one that can't
EDIT:
Thinking some more about it, you could combine the two patterns above like so:
^(\"[^\"]+?\"|[a-zA-Z]+?) :.*$
This matches something :"something" and "some-thing" :"something" but not "some-thing : "something" or some-thing : "something", assuming the string contains only the given text.
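As a rough check, the combined pattern can be tried against the three lines from the question. This is only a sketch: the class name KeyValueLineValidator and the isValid method are hypothetical, and the pattern is the one from this answer written as a Java string literal.

```java
import java.util.regex.Pattern;

class KeyValueLineValidator {
    // Either a quoted key (anything except '"' inside the quotes) or a bare
    // letters-only key, followed by " :" and the rest of the line.
    private static final Pattern LINE =
            Pattern.compile("^(\"[^\"]+?\"|[a-zA-Z]+?) :.*$");

    static boolean isValid(String line) {
        return LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("\"xxxx-xxx\" :\"yyyyyyyy\",")); // true
        System.out.println(isValid("xxxx-xxx :\"yyyyyyyy\","));     // false: special character without quotes
        System.out.println(isValid("\"xxxx-xxx :\"yyyyyyyy\","));   // false: opening quote never closed
    }
}
```

Note that the trailing .* accepts anything after the colon; validating the "String2", part as well would need a stricter tail.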
If I understand your question right, this simple regex should work
\"string1\" :\"string2\"
Maybe something like this?
(?<normalString>^[a-zA-Z]+$)|(?<specialString>^".*?"$)
This will capture only a-z characters and put them in the "normalString" group, or, if there's a string within quotation marks, capture that and put it in the "specialString" group.

Java: validating a certain string pattern

I am trying to validate a string in an 'iterative way', and all my attempts have failed.
I find it a bit complicated, and I'm guessing you could teach me how to do it right.
I assume most of you will suggest regex patterns, but I don't really know how to use them, and in general, how can a regex be defined for infinite "sets"?
The string I want to validate is
"ANYTHING|NUMBER_ONLY,ANYTHING|NUMBER_ONLY..."
For example, "hello|5,word|10" and "hello|5,word|10," are both valid.
Note: I don't mind whether the string ends with a comma ','.
The Kleene star (*) lets you define "infinite sets" in regular expressions. The following pattern should do the trick:
[^,|]+\|\d+(,[^,|]+\|\d+)*,?
A----------B--------------C-
Part A matches the first element. Part B matches any following elements (notice the star). Part C is the optional comma at the end.
WARNING: Remember to escape the backslashes when you write this pattern as a Java string literal.
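To illustrate the escaping, here is a minimal sketch of the same pattern as a Java string literal, where every \ from the regex is doubled. The class name PairListValidator and the isValid method are hypothetical.

```java
import java.util.regex.Pattern;

class PairListValidator {
    // A: first "text|number" element, B: any further ",text|number" elements,
    // C: optional trailing comma. Each regex backslash is written as "\\" in Java.
    private static final Pattern PAIRS =
            Pattern.compile("[^,|]+\\|\\d+(,[^,|]+\\|\\d+)*,?");

    static boolean isValid(String s) {
        // matches() requires the whole string to fit the pattern
        return PAIRS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("hello|5,word|10"));  // true
        System.out.println(isValid("hello|5,word|10,")); // true
        System.out.println(isValid("hello|five"));       // false
    }
}
```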
I'd suggest splitting your string into an array on the | delimiter and validating each part separately. Each part (except the first one) should match the following pattern: \d+(,.*)?
UPDATED
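Split on , and validate each part with .*\|\d+ (the | must be escaped, because an unescaped | means alternation and the pattern would then accept any string).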

Need some help getting some stuff off a string

I want to get some info out of my string, but there are two possible "expressions" for the string. I want to get "a" and "b" out of the string. This is how they look:
Format one:
http://default.com/default/a/b
Format two:
http://default.com/#!default|1|a|b|1
How can I do this?
If the strings always look like this, you could do the following:
Search for the # character to decide whether you have type 1 or type 2.
For type 1, split on the delimiter '/' and take the last element and the one before it. For type 2, also split on '/' first, then split the last part again on the delimiter '|' and take results[2] and results[3].
Use a regex to split the string.
Split on "default"
There are many ways you can do this; regular expressions are the most common.
In pseudo code:
if the string contains "/#!default" then:
Use the regular expression ^.*\|([^|]+)\|([^|]+)\|1$
otherwise, if the string contains "/default" then:
Use the regular expression ^.*/([^/]+)/([^/]+)$
Take the 1st and 2nd groups from the matcher
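A hedged Java sketch of that pseudocode follows. The class name UrlPartExtractor and the extract method are invented for illustration, and it assumes the two formats shown in the question are the only ones. It tests for "#!" first, because "http://default.com" itself contains "/default" and would otherwise be mistaken for format one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlPartExtractor {
    // Format one: .../default/a/b  (the last two path segments)
    private static final Pattern SLASH_FORM = Pattern.compile("^.*/([^/]+)/([^/]+)$");
    // Format two: ...#!default|1|a|b|1  (the two fields before the trailing "|1")
    private static final Pattern PIPE_FORM = Pattern.compile("^.*\\|([^|]+)\\|([^|]+)\\|1$");

    // Returns {a, b}, or null when the URL fits neither format.
    static String[] extract(String url) {
        Pattern p = url.contains("#!") ? PIPE_FORM : SLASH_FORM;
        Matcher m = p.matcher(url);
        return m.matches() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", extract("http://default.com/default/a/b")));       // a,b
        System.out.println(String.join(",", extract("http://default.com/#!default|1|a|b|1"))); // a,b
    }
}
```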
