Replace string part with regex pattern

Replace string part with regex pattern - java

I would like to replace the following string.
img/s/430x250/
The problem is there are variations, like:
img/s/265x200/
or:
img/s/110x73/
So I would like to replace this part in whole, but the numbers are changeable, so how could I make a pattern that replaces it from a string?

Is your goal to match all three of those cases?
If so, this should work: img\/s\/\d+x\d+\/
It searches for img/s/[1 or more digits]x[1 or more digits]/

This regular expression will match your examples
img\/s\/\d+?x\d+?\/
the / matches /
the \d matches digits 0-9 and the + means 1 or more. The ? makes it lazy instead of greedy.
the img and s just match that literally
check out https://regex101.com/ to try out regular expressions. It's much easier than testing them by debugging code. Once you find an expression that works, you can move on to make sure your specific code will perform the same.

Related

REGEX: how to make this statement become 2 match in REGEX

I have statement: <%=anything%><%=anything%>
and a regular expression: <%=\\s*(\\S+)\\s*%>.
The regex matches the stament as 1 match instead of 2 matches.
Can someone fix my regex?
Btw I use Java for my application

You are currently matching it all into one match, because regex usually is greedy, thus taking everything it can match into the match - so =anything%><%=anything is all matched by \S+. You could use the lazy modifier for the \S, so it matches as small as it has to, like so: <%=\\s*(\\S+?)\\s*%>. But there is an even better way to work with - as you don't want to match the closing >, just include it into a negative character class: <%=\\s*([^\\s>]+)\\s*%>
Here is a demo of it: https://regex101.com/r/bA4qY9/1
Note that you might have to double the backslashes again after testing in regex101
If you want to read further into it, have a look at http://www.regular-expressions.info/repeat.html

Regular expression non-greedy but still

I have some larger text which in essence looks like this:
abc12..manycharshere...hi - abc23...manyothercharshere...jk
Obviously there are two items, each starting with "abc", the numbers (12 and 23) are interesting as well as the "hi" and "jk" at the end.
I would like to create a regular expression which allows me to parse out the numbers, but only if the two characters at the end match, i.e. I am looking for the number related to "jk", but the following regular expression matches the whole string and thus returns "12", not "23" even when non-greedy matching the area with the following:
abc([0-9]+).*?jk
Is there a way to construct a regular expression which matches text like the one above, i.e. retrieving "23" for items ending in "jk"?
Basically I would need something like "match abc followed by a number, but only if there is "jk" at the end before another instance of "abc followed by a number appears"
Note: the texts/matches are an abstraction here, the actual text is more complicated, espially the things that can appear as "manyothercharactershere", I simplified to show the underlying problem more clearly.

Use a regex like this. .*abc([0-9]+).*?jk
demo here

I think you want something like this,
abc([0-9]+)(?=(?:(?!jk|abc[0-9]).)*jk)
DEMO

You need to use negative lookahead here to make it work:
abc(?!.*?abc)([0-9]+).*?jk
RegEx Demo
Here (?!.*?abc) is negative lookahead that makes sure to match abc where it is NOT followed by another abc thus making sure closes string between abc and jk is matched.

Being non-greedy does not change the rule, that the first match is returned. So abc([0-9]+).*?jk will find the first jk after “abcnumber” rather than the last one, but still match the first “abcnumber”.
One way to solve this is to tell that the dot should not match abc([0-9]+):
abc([0-9]+)((?!abc([0-9]+)).)*jk
If it is not important to have the entire pattern being an exact match you can do it simpler:
.*(abc([0-9]+).*?jk)
In this case, it’s group 1 which contains your intended match. The pattern uses a greedy matchall to ensure that the last possible “abcnumber” is matched within the group.

Assuming that hyphen separates "items", this regex will capture the numbers from the target item:
abc([0-9]+)[^-]*?jk
See demo

How to use two types of regex in single regex?

I have a string field. I need to pass UUID string or digits number to that field.
So I want to validate this passing value using regex.
sample :
stringField = "1af6e22e-1d7e-4dab-a31c-38e0b88de807";
stringField = "123654";
For UUID I can use,
"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
For digits I can use
"\\d+"
Is there any way to use above 2 pattern in single regex

Yes..you can use |(OR) between those two regex..
[\\da-f]{8}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{12}|\\d+
^

try:
"(?:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})|(?:\\d+)"

You can group regular expressions with () and use | to allow alternatives.
So this will work:
(([0-9a-fA-F]){8}-([0-9a-fA-F]){4}-([0-9a-fA-F]){4}-([0-9a-fA-F]){4}-([0-9a-fA-F]){12})|(\\d+)
Note that I've adjusted your UUID regular expression a little to allow for upper case letters.

How are you applying the regex? If you use the matches(), all you have to do is OR them together as #Anirudh said:
return myString.matches(
"[\\da-f]{8}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{12}|\\d+");
This works because matches() acts as if the regex were enclosed in a non-capturing group and anchored at both ends, like so:
"^(?:[\\da-f]{8}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{12}|\\d+)$"
If you use Matcher's find() method, you have to add the group and the anchors yourself. That's because find() returns a positive result if any substring of the string matches the regex. For example, "xyz123<>&&" would match because the "123" matches the "\\d+" in your regex.
But I recommend you add the explicit group and anchors anyway, no matter what method you use. In fact, you probably want to add the inline modifier for case-insensitivity:
"(?i)^(?:[\\da-f]{8}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{12}|\\d+)$"
This way, anyone who looks at the regex will be able to tell exactly what it's meant to do. They won't have to notice that you're using the matches() method and remember that matches() automatically anchors the match. (This will be especially helpful for people who learned regexes in a non-Java context. Almost every other regex flavor in the world uses the find() semantics by default, and has no equivalent for Java's matches(); that's what anchors are for.)
In case you're wondering, the group is necessary because alternation (the | operator) has the lowest precedence of all the regex constructs. This regex would match a string that starts with something that looks like a UUID or ends with one or more digits.
"^[\\da-f]{8}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{4}-[\\da-f]{12}|\\d+$" // WRONG

How do I write a regular expression to find the following pattern?

I am trying to write a regular expression to do a find and replace operation. Assume Java regex syntax. Below are examples of what I am trying to find:
12341+1
12241+1R1
100001+1R2
So, I am searching for a string beginning with one or more digits, followed by a "1+1" substring, followed by 0 or more characters. I have the following regex:
^(\d+)(1\\+1).*
This regex will successfully find the examples above, however, my goal is to replace the strings with everything before "1+1". So, 12341+1 would become 1234, and 12241+1R1 would become 1224. If I use the first grouped expression $1 to replace the pattern, I get the wrong result as follows:
12341+1 becomes 12341
12241+1R1 becomes 12241
100001+1R2 becomes 100001
Any ideas?

Your existing regex works fine, just that you are missing a \ before \d
String str = "100001+1R2";
str = str.replaceAll("^(\\d+)(1\\+1).*","$1");
Working link

IMHO, the regex is correct.
Perhaps you wrote it wrong in the code. If you want to code the regex ^(\d+)(1\+1).* in a string, you have to write something like String regex = "^(\\d+)(1\\+1).*".
Your output is the result of ^(\d+)(1+1).* replacement, as you miss some backslash in the string (e.g. "^(\\d+)(1\+1).*").

Your regex looks fine to me - I don't have access to java but in JavaScript the code..
"12341+1".replace(/(\d+)(1\+1)/g, "$1");
Returns 1234 as you'd expect. This works on a string with many 'codes' in too e.g.
"12341+1 54321+1".replace(/(\d+)(1\+1)/g, "$1");
gives 1234 5432.

Personally, I wouldn't use a Regex at all (it'd be like using a hammer on a thumbtack), I'd just create a substring from (Pseudocode)
stringName.substring(0, stringName.indexOf("1+1"))
But it looks like other posters have already mentioned the non-greedy operator.
In most Regex Syntaxes you can add a '?' after a '+' or '*' to indicate that you want it to match as little as possible before moving on in the pattern. (Thus: ^(\d+?)(1+1) matches any number of digits until it finds "1+1" and then, NOT INCLUDING the "1+1" it continues matching, whereas your original would see the 1 and match it as well).

Refactor Regex Pattern - Java

I have the following aaaa_bb_cc string to match and written a regex pattern like
\\w{4}+\\_\\w{2}\\_\\w{2} and it works. Is there any simple regex which can do this same ?

You don't need to escape the underscores:
\w{4}+_\w{2}_\w{2}
And you can collapse the last two parts, if you don't capture them anyway:
\w{4}+(?:_\w{2}){2}
Doesn't get shorter, though.
(Note: Re-add the needed backslashes for Java's strings, if you like; I prefer to omit them while talking about regular expressions :))

I sometimes do what I call "meta-regexing" as follows:
String pattern = "x{4}_x{2}_x{2}".replace("x", "[a-z]");
System.out.println(pattern); // prints "[a-z]{4}_[a-z]{2}_[a-z]{2}"
Note that this doesn't use \w, which can match an underscore. That is, your original pattern would match "__________".
If x really needs to be replaced with [a-zA-Z0-9], then just do it in the one place (instead of 3 places).
Other examples
Regex for metamap in Java
How do I convert CamelCase into human-readable names in Java?

Yes, you can use just \\w{4}_\\w{2}_\\w{2} or maybe \\w{4}(_\\w{2}){2}.

Looks like your \w does not need to match underscore, so you can use [a-zA-Z0-9] instead
[a-zA-Z0-9]{4}_[a-zA-Z0-9]{2}_[a-zA-Z0-9]{2}

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Replace string part with regex pattern - java

I would like to replace the following string. img/s/430x250/ The problem is there are variations, like: img/s/265x200/ or: img/s/110x73/ So I would like to replace this part in whole, but the numbers are changeable, so how could I make a pattern that replaces it from a string?

Is your goal to match all three of those cases? If so, this should work: img\/s\/\d+x\d+\/ It searches for img/s/[1 or more digits]x[1 or more digits]/

Related

REGEX: how to make this statement become 2 match in REGEX

Regular expression non-greedy but still

How to use two types of regex in single regex?

How do I write a regular expression to find the following pattern?

Refactor Regex Pattern - Java

Categories

Resources