regex to filter out string - java

I'm filtering strings using the regex below:
^(?!.*(P1 | P2)).*groupName.*$
Here groupName is a placeholder that I replace with a specific string at run time. This regex is already working fine.
I have two input strings that are passed through this regex. I can't change the ^(?!.*(P1 | P2)) part, so I would like to change only what comes after it. It is a very generic regex that is used in many places, so the only part I can change is groupName. Is there any way to let only the second string pass through this regex?
1) ADMIN-P3-UI-READ-ONLY
2) ADMIN-P3-READ-ONLY
In the regex, groupName is just a variable that is replaced at run time with the required string. In this case I want only the second string to pass. The groupName part can be replaced with READ-ONLY, but then the first string passes too.
Can anyone suggest how to make this work?

You could use a negative lookbehind:
(?<!UI-)READ-ONLY
This requires that READ-ONLY is not immediately preceded by UI-.
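A minimal sketch of how the lookbehind could be dropped into the groupName slot. The class and method names are made up for illustration, and the fixed prefix is assumed to be ^(?!.*(P1|P2)) without the spaces shown in the question:

```java
import java.util.regex.Pattern;

class GroupNameFilter {
    // Fixed part of the regex (assumed form, without spaces in the alternation).
    static final String PREFIX = "^(?!.*(P1|P2))";

    // Builds the full pattern the same way the question describes:
    // prefix, then .*groupName.*$ with groupName substituted at run time.
    static boolean passes(String groupName, String input) {
        return Pattern.matches(PREFIX + ".*" + groupName + ".*$", input);
    }

    public static void main(String[] args) {
        String groupName = "(?<!UI-)READ-ONLY";
        System.out.println(passes(groupName, "ADMIN-P3-UI-READ-ONLY")); // false
        System.out.println(passes(groupName, "ADMIN-P3-READ-ONLY"));    // true
    }
}
```

Only the value substituted for groupName changes, so the shared prefix stays untouched.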

You can add another lookahead at the very start of your pattern to further restrict what it matches because your pattern is of the "match-everything-but" type.
So, it may look like
String extraCondition = "^(?!.*UI)";
String regex = "^(?!.*(P1|P2)).*READ-ONLY.*$";
String finalRegex = extraCondition + regex;
The pattern will look like
^(?!.*UI)^(?!.*(P1|P2)).*READ-ONLY.*$
matching
^(?!.*UI) - no UI after any zero or more chars other than line break chars as many as possible from the start of string
^(?!.*(P1|P2)) - no P1 nor P2 after any zero or more chars other than line break chars as many as possible from the start of string
.*READ-ONLY - any zero or more chars other than line break chars as many as possible and then READ-ONLY
.*$ - the rest of the string. Note you may safely remove $ here unless you want to make sure there are no extra lines in the input string.
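As a quick check of the combined pattern, here is a small sketch (the class name and the third sample input are invented for illustration). Note that ^(?!.*UI) rejects any input containing UI anywhere, not only UI- directly before READ-ONLY:

```java
import java.util.regex.Pattern;

class ExtraLookaheadDemo {
    static final String EXTRA_CONDITION = "^(?!.*UI)";
    static final String REGEX = "^(?!.*(P1|P2)).*READ-ONLY.*$";
    static final Pattern FINAL_REGEX = Pattern.compile(EXTRA_CONDITION + REGEX);

    static boolean passes(String input) {
        return FINAL_REGEX.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(passes("ADMIN-P3-UI-READ-ONLY")); // false
        System.out.println(passes("ADMIN-P3-READ-ONLY"));    // true
        // Hypothetical input: "GUIDE" contains "UI", so it is rejected too.
        System.out.println(passes("GUIDE-P3-READ-ONLY"));    // false
    }
}
```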

Related

Regex to match user and user#domain

A user can log in as "user" or as "user#domain". I only want to extract "user" in both cases. I am looking for a matcher expression to fit it, but I'm struggling.
final Pattern userIdPattern = Pattern.compile("(.*)[#]{0,1}.*");
final Matcher fieldMatcher = userIdPattern.matcher("user#test");
final String userId = fieldMatcher.group(1);
userId returns "user#test". I tried various expressions but it seems that nothing fits my requirement :-(
Any ideas?
If you use "(.*)[#]{0,1}.*" pattern with .matches(), the (.*) grabs the whole line first, then, when the regex index is still at the end of the line, the [#]{0,1} pattern triggers and matches at the end of the line because it can match 0 # chars, and then .* again matches at that very location as it matches any 0+ chars. Thus, the whole line lands in your Group 1.
You may use
String userId = s.replaceFirst("^([^#]+).*", "$1");
Details
^ - start of string
([^#]+) - Group 1 (referred to with $1 from the replacement pattern): any 1+ chars other than #
.* - the rest of the string.
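A small sketch of the replaceFirst approach wrapped in a helper (the class and method names are hypothetical):

```java
class UserIdExtractor {
    // Keeps everything before the first '#'; input without '#' is returned as is.
    static String extractUser(String login) {
        return login.replaceFirst("^([^#]+).*", "$1");
    }

    public static void main(String[] args) {
        System.out.println(extractUser("user#test")); // user
        System.out.println(extractUser("user"));      // user
    }
}
```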
A little bit of googling came up with this:
(.*?)(?=#|$)
This will match everything before an optional #.
I would suggest keeping it simple and not relying on regex if you are using Java and have a simple case like the one you provided.
You could simply do something like this:
String userId = "user#test";
int hashIndex = userId.indexOf('#');
if (hashIndex != -1) {
    userId = userId.substring(0, hashIndex);
}
// from here on userId will be "user".
This will always either strip out the "#test" or just skip stripping it out when it is not there.
Using regex in most cases makes the code less maintainable by another dev in the future because most devs are not very good with regular expressions, at least in my experience.
You made the # optional, so the first group greedily takes the longest possible user name. Since you didn't add the restriction that a user name cannot contain #, it matched the whole string.
Just use:
[^#]*
as the matching subexpr for usernames (and use $0 to get the matched string)
Or you can use this one that can be used to find several matches (and to get both the user part and the domain part):
\b([^#\s]*)(#[^#\s]*)?\b
The \b anchors tie the match to word boundaries. The first group matches non-space, non-# chars (any number; it is better to use + instead of * there, as user names must have at least one char), optionally followed by a # and another run of non-space, non-# chars. In this case, $0 matches the whole user#domain string, $1 matches the user name part, and $2 the #domain part. You can refine this to capture only the domain part by adding a new pair of parentheses, as in
\b([^#\s]*)(#([^#\s]*))?\b
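To illustrate the three-group variant, here is a sketch using Matcher.find(). The class and method names are made up, and + is used in the first group as suggested above:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LoginParser {
    // Group 1: user name, group 2: "#domain", group 3: domain only.
    private static final Pattern LOGIN =
            Pattern.compile("\\b([^#\\s]+)(#([^#\\s]*))?\\b");

    // Returns {user, domain}; domain is null when there is no #domain part.
    static String[] parse(String login) {
        Matcher m = LOGIN.matcher(login);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parse("user#test");
        System.out.println(parts[0] + " / " + parts[1]); // user / test
        System.out.println(parse("user")[1]);            // null
    }
}
```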

JAVA - Split a String containing delimiter #| . String should also split if there are multiple occurrences of #| like - #|#|

I have a string which needs to be split into parts using #| as a delimiter. In some cases #|#| appears as well. It can repeat multiple times too.
Example String :
kh73j563741043f4611144u3ol#|h73j5637411432vk651p4601#|sadf#|12342134#|ADHVSF#|1#|0#|0#|DFSFS#|SDFSBFSF#|2017-07-03 19:56:37.0#|3#|6#|#|SDJHSJKDSDKSDS ODHDO ODHSUDSD 34234234 PODSOF pfjfs
What I have written :
String input [] = line.split("\\#\\|");
The code above splits the input into 13 different strings, but it does not work for the last string, where "#|#|" is used as a delimiter.
How do I write a regex that treats multiple consecutive instances of #| as a single delimiter?
You may wrap the pattern with a non-capturing group and set a + quantifier after it:
.split("(?:#\\|)+")
Now, (?:#\\|)+ matches 1 or more consecutive occurrences of a two-char sequence, # and |. Note you do not need to escape # unless you use the Pattern.COMMENTS option (and you won't need it here with such a short pattern).
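A short sketch of the split on a shortened, made-up input (the class and method names are hypothetical):

```java
import java.util.Arrays;

class DelimiterSplitDemo {
    // One or more consecutive "#|" sequences count as a single delimiter.
    static String[] splitFields(String line) {
        return line.split("(?:#\\|)+");
    }

    public static void main(String[] args) {
        String line = "3#|6#|#|last field";
        System.out.println(Arrays.toString(splitFields(line))); // [3, 6, last field]
    }
}
```

With the original "\\#\\|" pattern the same input would produce an extra empty string between 6 and the last field.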

Java Regular expression validation

I want to validate a string in Java that allows only alphanumeric characters, only one dot character, and underscore characters.
String fileName = (String) request.getParameter("read");
I need to validate the fileName retrieved from the request, and it should satisfy the above criteria.
I tried "^[a-zA-Z0-9_'.']*$", but this allows more than one dot character.
I need to validate my string against the following rules:
1. The file name contains only alphanumeric characters.
2. It allows only one dot character (.), for example: fileRead.pdf, fileWrite.txt, etc.
3. Underscore is the only other symbol allowed. All other symbols should be rejected.
Can anyone help me with this?
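You should use the String.matches() method. The pattern below requires exactly one dot, with at least one word character (\w, i.e. [a-zA-Z_0-9]) on each side: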
System.out.println("My_File_Name.txt".matches("\\w+\\.\\w+"));
You can also use java.util.regex package.
java.util.regex.Pattern pattern =
java.util.regex.Pattern.compile("\\w+\\.\\w+");
java.util.regex.Matcher matcher = pattern.matcher("My_File_Name.txt");
System.out.println(matcher.matches());
For more information about REGEX and JAVA, look at this page :
https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
You could use two negative lookaheads here:
^((?!.*\..*\.)(?!.*_.*_)[A-Za-z0-9_.])*$
Each lookahead asserts that either a dot or an underscore does not occur two times, implying that it can occur at most once.
It wasn't completely clear whether you require one dot and/or underscore. I assumed not, but my regex could be easily modified to this requirement.
You can first count the special characters that have an occurrence limit.
Here is the code, using Spring's StringUtils:
int occurrences = StringUtils.countOccurrencesOf("123123..32131.3", ".");
or, using Apache Commons Lang's StringUtils:
int count = StringUtils.countMatches("123123..32131.3", ".");
If the count exceeds your limit, you can reject the string before the regex check.
Otherwise, you can go on to check that the string contains only the allowed characters.
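The same count-then-check idea can be sketched without an external library. This assumes the dot is optional and limited to one, that underscores are unlimited, and that the position of the dot does not matter; the class and method names are invented for illustration:

```java
class FileNameValidator {
    static boolean isValid(String fileName) {
        if (fileName == null || fileName.isEmpty()) {
            return false;
        }
        // Step 1: count the dots and reject if there is more than one.
        long dots = fileName.chars().filter(c -> c == '.').count();
        if (dots > 1) {
            return false;
        }
        // Step 2: only letters, digits, underscore and dot are allowed.
        return fileName.matches("[A-Za-z0-9_.]+");
    }

    public static void main(String[] args) {
        System.out.println(isValid("fileRead.pdf"));    // true
        System.out.println(isValid("123123..32131.3")); // false
        System.out.println(isValid("file-read.pdf"));   // false
    }
}
```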

Merge three regex groups

I have three different sentences which contain repetitive parts.
I want to merge three different regex groups into one, and then replace all matches with white space.
How should I merge these groups?
String locked = "LOCKED (center)"; //LOCKED() - always the same part
String idle = "Idle (second)"; // Idle() - always the same part
String OK = "example-OK"; // -OK - always the same part
I've built three regular expressions, but they are separate. How should I merge them?
String forLocked = locked.replaceAll("^LOCKED\\s\\((.*)\\)", "$1");
String forIdle = idle.replaceAll("^Idle\\s\\((.*)\\)", "$1");
String forOK = OK.replaceAll("(.*)\\-OK", "$1");
I think this technically works, but it doesn't "feel great."
private static final String REGEX =
"^((Idle|LOCKED) *)?\\(?([a-z]+)\\)?(-OK)?$";
... your code ...
System.out.println(locked.replaceAll(REGEX, "$3"));
System.out.println(idle.replaceAll(REGEX, "$3"));
System.out.println(OK.replaceAll(REGEX, "$3"));
Output is:
center
second
example
Breaking down the expression:
^((Idle|LOCKED) *)? - Possibly starts with Idle or LOCKED followed by zero or more spaces
\\(?([a-z]+)\\)? - Has a sequence of lowercase characters, possibly embedded inside optional parentheses (this is the sequence we want to capture)
(-OK)?$ - Possibly ends with the literal -OK.
There are still some issues though. The optional parentheses aren't in any way tied together, for example. Also, this would give false positives for compounds like Idle (second)-OK --> second.
I had a more stringent regex at first, but one of the additional challenges is to keep a consistent match index on the group you want to replace with (here, $3.) In other words, there's a whole set of regex where if you could use, say $k and $j in different situations, it would be easier. But, that goes against the whole point of having a single regex to begin with (if you need some pre-existing knowledge of the input you're about to match.) Better would be to assume that we know nothing about what is inside the identifiers locked, idle, and OK.
You can merge them with | like this:
String regex = "^LOCKED\\s\\((.*)\\)|^Idle\\s\\((.*)\\)|(.*)\\-OK$";
String forLocked = locked.replaceAll(regex, "$1");
String forIdle = idle.replaceAll(regex, "$2");
String forOK = OK.replaceAll(regex, "$3");
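Since Java substitutes nothing for a group that did not take part in the match, the three calls can share one replacement string, "$1$2$3". A minimal sketch (the class and method names are hypothetical):

```java
class MergedReplaceDemo {
    static final String REGEX =
            "^LOCKED\\s\\((.*)\\)|^Idle\\s\\((.*)\\)|(.*)\\-OK$";

    // Only one of the three groups participates in any given match,
    // so concatenating all three yields the captured part.
    static String strip(String s) {
        return s.replaceAll(REGEX, "$1$2$3");
    }

    public static void main(String[] args) {
        System.out.println(strip("LOCKED (center)")); // center
        System.out.println(strip("Idle (second)"));   // second
        System.out.println(strip("example-OK"));      // example
    }
}
```

This avoids having to know in advance which alternative will match.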

Java String Regex replacement

Sample Input:
a:b
a.in:b
asds.sdsd:b
a:b___a.sds:bc___ab:bd
Sample Output:
a:replaced
a.in:replaced
asds.sdsd:replaced
a:replaced___a.sds:replaced___ab:replaced
The string that comes after : should be replaced by the result of a custom function.
I have done this without regex. I feel it could be done with regex, as we are trying to extract a string from a specific pattern.
For the first three cases, it's simple enough to extract the string after :, but I couldn't find a way to deal with the last case, unless I split the string on ___, apply the approach for the first type of pattern, and concatenate the parts again.
Just replace the letters that immediately follow : with the string replaced.
string.replaceAll("(?<=:)[A-Za-z]+", "replaced");
or
If you also want to deal with digits, then add \d inside the char class.
string.replaceAll("(?<=:)[A-Za-z\\d]+", "replaced");
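If the replacement has to come from a custom function rather than a fixed string, the same lookbehind can be driven through a Matcher loop. This is a sketch under that assumption; the class name, method name, and the example functions are made up:

```java
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ValueReplacer {
    // Letters or digits that directly follow a ':'.
    private static final Pattern VALUE = Pattern.compile("(?<=:)[A-Za-z\\d]+");

    static String replaceValues(String input, UnaryOperator<String> fn) {
        Matcher m = VALUE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' in the result literal.
            m.appendReplacement(out, Matcher.quoteReplacement(fn.apply(m.group())));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "a:b___a.sds:bc___ab:bd";
        System.out.println(replaceValues(input, v -> "replaced"));
        // a:replaced___a.sds:replaced___ab:replaced
        System.out.println(replaceValues(input, String::toUpperCase));
        // a:B___a.sds:BC___ab:BD
    }
}
```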
(:)[a-zA-Z]+
You can simply do this with string.replaceAll. Replace by $1replaced. See demo.
https://regex101.com/r/fX3oF6/18
