String replacement when regex reverse group is null in java - java

I want to convert a software version number into a github tag name by regular expression.
For example, the version of ognl is usually 3.2.1. What I want is the tag name OGNL_3_2_1
So we can use String::replaceAll(String regex, String replacement) method like this
"3.2.1".replaceAll("(\d+).(\d+).(\d+)", "OGNL_$1_$2_$3")
And we can get the tag name OGNL_3_2_1 easily.
But when it comes to 3.2, I want the regex still working so I change it into (\d+).(\d+)(?:.(\d+))?.
Execute the code again, what I get is OGNL_3_2_ rather than OGNL_3_2. The underline _ at the tail is not what I want. It is resulted by the null group for $3
So how can I write a suitable replacement to solve this case?
When the group for $3 is null, the underline _ should disappear
Thanks for your help !!!

You can make the last . + digits part optional by enclosing it with an optional non-capturing group and use a lambda as a replacement argument with Matcher.replaceAll in the latest Java versions:
String regex = "(\\d+)\\.(\\d+)(?:\\.(\\d+))?";
Pattern p = Pattern.compile(regex);
String s="3.2.1";
Matcher m = p.matcher(s);
String result = m.replaceAll(x ->
x.group(3) != null ? "OGNL_" + x.group(1) + "_" + x.group(2) + "_" + x.group(3) :
"OGNL_" + x.group(1) + "_" + x.group(2) );
System.out.println(result);
See the Java demo.
The (\d+)\.(\d+)(?:\.(\d+))? pattern (note that literal . are escaped) matches and captures into Group 1 any one or more digits, then matches a dot, then captures one or more digits into Group 2 and then optionally matches a dot and digits (captured into Group 3). If Group 3 is not null, add the _ and Group 3 value, else, omit this part when building the final replacement value.

Related

How to remove everything after specific character in string using Java

I have a string that looks like this:
analitics#gmail.com#5
And it represents my userId.
I have to send that userId as parameter to the function and send it in the way that I remove number 5 after second # and append new number.
I started with something like this:
userService.getUser(user.userId.substring(0, userAfterMigration.userId.indexOf("#") + 1) + 3
What is the best way of removing everything that comes after the second # character in string above using Java?
Here is a splitting option:
String input = "analitics#gmail.com#5";
String output = String.join("#", input.split("#")[0], input.split("#")[1]) + "#";
System.out.println(output); // analitics#gmail.com#
Assuming your input would only have two at symbols, you could use a regex replacement here:
String input = "analitics#gmail.com#5";
String output = input.replaceAll("#[^#]*$", "#");
System.out.println(output); // analitics#gmail.com#
You can capture in group 1 what you want to keep, and match what comes after it to be removed.
In the replacement use capture group 1 denoted by $1
^((?:[^#\s]+#){2}).+
^ Start of string
( Capture group 1
(?:[^#\s]+#){2} Repeat 2 times matching 1+ chars other than #, and then match the #
) Close group 1
.+ Match 1 or more characters that you want to remove
Regex demo | Java demo
String s = "analitics#gmail.com#5";
System.out.println(s.replaceAll("^((?:[^#\\s]+#){2}).+", "$1"));
Output
analitics#gmail.com#
If the string can also start with ##1 and you want to keep ## then you might also use:
^((?:[^#]*#){2}).+
Regex demo
The simplest way that would seem to work for you:
str = str.replaceAll("#[^.]*$", "");
See live demo.
This matches (and replaces with blank to delete) # and any non-dot chars to the end.

How to get regEx for \" string in a given string

We need to give like this :I\"THIS IS TEST[1]\" for the data : I\"THIS IS TEST[1]\" - to be inserted into the Redshift.
But the above thing has to be done only if we have \" in the given string , So i am searching how to identify the \" in a given string, tried with regEx , but no luck. Can you please suggest here how to check if the string has \" so that if a match is found then i will replaceAll ("\"" , "\\"");
Thank you!!!
In order to check whether a String contains the string \" you must firstly match the character \ literally by adding \ to your Regular Expression and then match the " character by adding \".
The end Regular Expression is \\" and this is tested to match your provided string on https://regex101.com/
Quick test:
String regex = "\\\"";
Pattern pattern = Pattern.compile(regex);
String toTest = "hi there \". Do I match?";
boolean found = pattern.matcher(toTest).find();
System.out.println("Boolean value for whether pattern matches toTest: " + found);
Output: "Boolean value for whether pattern matches toTest: true"
After found is true, simply run your replaceAll command like so:
String#replaceAll(regex, "new value");
Note: \" is just a ". \\" will match \". \\\" will match \". Every time you want to match an additional \, just add \\ to your Regular Expression - Alternatively, if you want to match an additional " add \" to your Regular Expression

How to trim/cut string in java by symbol?

I'm working on a project where my API returns url with id at the end of it and I want to extract it to be used in another function. Here's example url:
String advertiserUrl = http://../../.../uuid/advertisers/4 <<< this is the ID i want to extract.
At the moment I'm using java's string function called substring() but this not the best approach as IDs could become 3 digit numbers and I would only get part of it. Heres my current approach:
String id = advertiserUrl.substring(advertiserUrl.length()-1,advertiserUrl.length());
System.out.println(id) //4
It works in this case but if id would be e.g "123" I would only get it as "3" after using substring, so my question is: is there a way to cut/trim string using dashes "/"? lets say theres 5 / in my current url so the string would get cut off after it detects fifth dash? Also any other sensible approach would be helpful too. Thanks.
P.s uuid in url may vary in length too
You don't need to use regular expressions for this.
Use String#lastIndexOf along with substring instead:
String advertiserUrl = "http://../../.../uuid/advertisers/4";// <<< this is the ID i want to extract.
// this implies your URLs always end with "/[some value of undefined length]".
// Other formats might throw exception or yield unexpected results
System.out.println(advertiserUrl.substring(advertiserUrl.lastIndexOf("/") + 1));
Output
4
Update
To find the uuid value, you can use regular expressions:
String advertiserUrl = "http://111.111.11.111:1111/api/ppppp/2f5d1a31-878a-438b-a03b-e9f51076074a/adver‌​tisers/9";
// | preceded by "/"
// | | any non-"/" character, reluctantly quantified
// | | | followed by "/advertisers"
Pattern p = Pattern.compile("(?<=/)[^/]+?(?=/adver‌​tisers)");
Matcher m = p.matcher(advertiserUrl);
if (m.find()) {
System.out.println(m.group());
}
Output
2f5d1a31-878a-438b-a03b-e9f51076074a
You can either split the string on slashes and take the last position of the array returned, or use the lastIndexOf("/") to get the index of the last slash and then substring the rest of the string.
Use the lastIndexOf() method, which returns the index of the last occurrence of the specified character.
String id = advertiserUrl.substring(advertiserUrl.lastIndexOf('/') + 1, advertiserUrl.length());

Weird password check matching using regex in Java

I'm trying to check a password with the following constraint:
at least 9 characters
at least 1 upper case
at least 1 lower case
at least 1 special character into the following list:
~ ! # # $ % ^ & * ( ) _ - + = { } [ ] | : ; " ' < > , . ?
no accentuated letter
Here's the code I wrote:
Pattern pattern = Pattern.compile(
"(?!.*[âêôûÄéÆÇàèÊùÌÍÎÏÐîÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ€£])"
+ "(?=.*\\d)"
+ "(?=.*[a-z])"
+ "(?=.*[A-Z])"
+ "(?=.*[`~!##$%^&*()_\\-+={}\\[\\]\\\\|:;\"'<>,.?/])"
+ ".{9,}");
Matcher matcher = pattern.matcher(myNewPassword);
if (matcher.matches()) {
//do what you've got to do when you
}
The issue is that some characters like € or £ doesn't make the password wrong.
I don't understand why this is working that way since I explicitly exclude € and £ from the authorized list.
Rather than trying to disallow those non-ascii characters why not makes your regex accept only ASCII characters like this:
Pattern pattern = Pattern.compile(
"(?=.*\\d)(?=.*[a-z])(?=.*[A-Z])(?=.*\\p{Print})\\p{ASCII}{9,})");
Also see use of \p{Print} instead of the big character class. I believe that would be suffice for you.
Check Javadoc for more details
This just allows printable Ascii. Note that it allows space character, but you could disallow space by setting \x21 instead.
Edit - I didn't see a number in the requirement, saw it in your regex, but wasn't sure.
# "^(?=.*[A-Z])(?=.*[a-z])(?=.*[`~!##$%^&*()_\\-+={}\\[\\]|:;\"'<>,.?])[\\x20-\\x7E]{9,}$"
^
(?= .* [A-Z] )
(?= .* [a-z] )
(?= .* [`~!##$%^&*()_\-+={}\[\]|:;"'<>,.?] )
[\x20-\x7E]{9,}
$

Java String Replace Regex

I am doing some string replace in SQL on the fly.
MySQLString = " a.account=b.account ";
MySQLString = " a.accountnum=b.accountnum ";
Now if I do this
MySQLString.replaceAll("account", "account_enc");
the result will be
a.account_enc=b.account_enc
(This is good)
But look at 2nd result
a.account_enc_num=a.account_enc_num
(This is not good it should be a.accountnum_enc=b.accountnum_enc)
Please advise how can I achieve what I want with Java String Replace.
Many Thanks.
From your comment:
Is there anyway to tell in Regex only replace a.account=b.account or a.accountnum=b.accountnum. I do not want accountname to be replace with _enc
If I understand correctly you want to add _enc part only to account or accountnum. To do this you can use
MySQLString = MySQLString.replaceAll("\\baccount(num)?\\b", "$0_enc");
(num)? mean that num is optional so regex will accept account or accountnum
\\b at start mean that there can be no letters, numbers or "_" before account so it wont accept (affect) something like myaccount, or my_account.
\\b at the end will prevent other letters, numbers or "_" after account or accountnum.
It's hard to extrapolate from so few examples, but maybe what you want is:
MySQLString = MySQLString.replaceAll("account\\w*", "$0_enc");
which will append _enc to any sequence of letters, digits, and underscores that starts with account.
try
String s = " a.accountnum=b.accountnum ".replaceAll("(account[^ =]*)", "$1_enc");
it means replace any sequence characters which are not ' ' or '=' which starts the word "account" with the sequence found + "_enc".
$1 is a reference to group 1 in regex; group 1 is the expression in parenthesis (account[^ =]+), i.e. our sequence
See http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html for details

Categories

Resources