Regular expressions in Java


How can I match only a single, isolated dot character?
For example, given the string "trjb....fsf..ib.bi.", the matcher should report only the dots at index 15 and 18. If I use Pattern p = Pattern.compile("(\\.)+"); I get every run of dots:
4 ....
11 ..
15 .
18 .

This seems to do the trick:
String input = "trjb....fsf..ib.bi.";
Pattern pattern = Pattern.compile("[^\\.]\\.([^\\.]|$)");
Matcher matcher = pattern.matcher(" " + input);
while (matcher.find()) {
System.out.println(matcher.start());
}
The extra space in front of the input does two things:
Allows for a . to be detected as the first character of the input string
Offsets the matcher.start() by one to account for the character in front of the matched .
Result is:
15
18

Add a blank at the beginning and at the end of the string, and then use the pattern
"[^\\.]\\.[^\\.]"

You need to use negative lookarounds.
Something like Pattern.compile("(?<!\\.)\\.(?!\\.)");

Try
Pattern.compile("(?<=[^\\.])\\.(?=[^\\.])")
or even better...
Pattern.compile("(?<![\\.])\\.(?![\\.])")
This uses negative lookarounds.
(?<![\\.]) => not preceded by a .
\\. => a .
(?![\\.]) => not followed by a .
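The negative-lookaround approach can be sketched as a small helper. This is only an illustration: the class name IsolatedDots and the method indexesOfSingleDots are made up here. Because lookarounds do not consume the neighbouring characters, no padding of the input is needed, and dots separated by a single character (as in "a.b.c") should both be reported.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IsolatedDots {
    // A dot that is neither preceded nor followed by another dot.
    private static final Pattern SINGLE_DOT = Pattern.compile("(?<!\\.)\\.(?!\\.)");

    // Hypothetical helper: returns the index of every isolated dot in the input.
    static List<Integer> indexesOfSingleDots(String input) {
        List<Integer> result = new ArrayList<>();
        Matcher m = SINGLE_DOT.matcher(input);
        while (m.find()) {
            result.add(m.start());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(indexesOfSingleDots("trjb....fsf..ib.bi.")); // [15, 18]
        System.out.println(indexesOfSingleDots("a.b.c"));               // [1, 3]
    }
}
```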

Related

How to remove everything after specific character in string using Java

I have a string that looks like this:
analitics#gmail.com#5
And it represents my userId.
I have to send that userId as parameter to the function and send it in the way that I remove number 5 after second # and append new number.
I started with something like this:
userService.getUser(user.userId.substring(0, userAfterMigration.userId.indexOf("#") + 1) + 3
What is the best way of removing everything that comes after the second # character in string above using Java?

Here is a splitting option:
String input = "analitics#gmail.com#5";
String output = String.join("#", input.split("#")[0], input.split("#")[1]) + "#";
System.out.println(output); // analitics#gmail.com#
Assuming your input only ever has two # symbols, you could use a regex replacement here:
String input = "analitics#gmail.com#5";
String output = input.replaceAll("#[^#]*$", "#");
System.out.println(output); // analitics#gmail.com#

You can capture in group 1 what you want to keep, and match what comes after it to be removed.
In the replacement use capture group 1 denoted by $1
^((?:[^#\s]+#){2}).+
^ Start of string
( Capture group 1
(?:[^#\s]+#){2} Repeat 2 times matching 1+ chars other than #, and then match the #
) Close group 1
.+ Match 1 or more characters that you want to remove
String s = "analitics#gmail.com#5";
System.out.println(s.replaceAll("^((?:[^#\\s]+#){2}).+", "$1"));
Output
analitics#gmail.com#
If the string can also start with ##1 and you want to keep ## then you might also use:
^((?:[^#]*#){2}).+

The simplest way that would seem to work for you:
str = str.replaceAll("#[^.]*$", "");
This matches # followed by any non-dot characters up to the end of the string, and replaces the match with an empty string to delete it. Note that the matched # is removed as well.
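If a regular expression is not required, the same result can be reached with two indexOf calls. The following is a minimal sketch: the class name UserIdSuffix and the method replaceSuffix are made up for illustration, and returning the input unchanged when it contains fewer than two # characters is an assumption about the desired behaviour.

```java
class UserIdSuffix {
    // Keeps everything up to and including the second '#', then appends newSuffix.
    // Assumption: input with fewer than two '#' characters is returned unchanged.
    static String replaceSuffix(String userId, String newSuffix) {
        int first = userId.indexOf('#');
        int second = first < 0 ? -1 : userId.indexOf('#', first + 1);
        if (second < 0) {
            return userId;
        }
        return userId.substring(0, second + 1) + newSuffix;
    }

    public static void main(String[] args) {
        System.out.println(replaceSuffix("analitics#gmail.com#5", "3")); // analitics#gmail.com#3
    }
}
```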

String replacement when regex reverse group is null in java

I want to convert a software version number into a github tag name by regular expression.
For example, the version of ognl is usually 3.2.1. What I want is the tag name OGNL_3_2_1
So we can use the String::replaceAll(String regex, String replacement) method like this:
"3.2.1".replaceAll("(\\d+).(\\d+).(\\d+)", "OGNL_$1_$2_$3")
and get the tag name OGNL_3_2_1 easily.
But for 3.2 I want the regex to keep working, so I changed it to (\d+).(\d+)(?:.(\d+))?.
Running the code again, I get OGNL_3_2_ rather than OGNL_3_2. The trailing underscore _ is not what I want; it comes from the null group for $3.
So how can I write a suitable replacement for this case?
When the group for $3 is null, the underscore _ should disappear.
Thanks for your help!

You can make the last . + digits part optional by enclosing it in an optional non-capturing group, and pass a lambda as the replacement argument to Matcher.replaceAll (available since Java 9):
String regex = "(\\d+)\\.(\\d+)(?:\\.(\\d+))?";
Pattern p = Pattern.compile(regex);
String s="3.2.1";
Matcher m = p.matcher(s);
String result = m.replaceAll(x ->
x.group(3) != null ? "OGNL_" + x.group(1) + "_" + x.group(2) + "_" + x.group(3) :
"OGNL_" + x.group(1) + "_" + x.group(2) );
System.out.println(result);
The (\d+)\.(\d+)(?:\.(\d+))? pattern (note that the literal dots are escaped) captures one or more digits into Group 1, matches a dot, captures one or more digits into Group 2, and then optionally matches a dot followed by digits (captured into Group 3). If Group 3 is not null, the _ and the Group 3 value are added; otherwise that part is omitted when building the final replacement value.
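The same null check on the optional group can also be written without a lambda, which should work on Java 8 as well. This is a sketch under a few assumptions: the class name VersionTag and the method toTag are made up, the whole input is expected to be a version number, and anything else is rejected with an exception.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VersionTag {
    private static final Pattern VERSION =
            Pattern.compile("(\\d+)\\.(\\d+)(?:\\.(\\d+))?");

    // Hypothetical helper: "3.2.1" -> "OGNL_3_2_1", "3.2" -> "OGNL_3_2".
    static String toTag(String version) {
        Matcher m = VERSION.matcher(version);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a version number: " + version);
        }
        StringBuilder tag = new StringBuilder("OGNL_")
                .append(m.group(1)).append('_').append(m.group(2));
        // group(3) is null when the optional ".digits" part did not take part in the match
        if (m.group(3) != null) {
            tag.append('_').append(m.group(3));
        }
        return tag.toString();
    }

    public static void main(String[] args) {
        System.out.println(toTag("3.2.1")); // OGNL_3_2_1
        System.out.println(toTag("3.2"));   // OGNL_3_2
    }
}
```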

Java regular expression match two same number

I want to use a regex to match file paths like these:
../90804/90804_0.jpg
../89246/89246_8.jpg
../89247/89247_14.jpg
Currently, I use the code below to match them:
Pattern r = Pattern.compile("^(.*?)[/](\\d+?)[/](\\d+?)[_](\\d+?).jpg$");
Matcher m = r.matcher(file_path);
But I found that it also matches unexpected paths such as:
../90804/89246_0.jpg
Is it possible for a regex to require the same number twice?

You may use a \2 backreference instead of the second \d+ here:
s.matches("(.*?)/(\\d+)/(\\2)_(\\d+)\\.jpg")
Note that if you use the matches method, you don't need the ^ and $ anchors.
Details
(.*?) - Group 1: any 0+ chars other than line break chars as few as possible
/ - a slash
(\\d+) - Group 2: one or more digits
/ - a slash
(\\2) - Group 3: the same value as in Group 2
_ - an underscore
(\\d+) - Group 4: one or more digits
\\.jpg - .jpg.
Java demo:
Pattern r = Pattern.compile("(.*?)/(\\d+)/(\\2)_(\\d+)\\.jpg");
Matcher m = r.matcher(file_path);
if (m.matches()) {
System.out.println("Match found");
System.out.println(m.group(1));
System.out.println(m.group(2));
System.out.println(m.group(3));
System.out.println(m.group(4));
}
Output:
Match found
..
90804
90804
0

You can use this regex with a capture group and back-reference of the same:
(\d+)/\1
Equivalent Java regex string will be:
final String regex = "(\\d+)/\\1";
Details:
(\d+): Match 1+ digits and capture it in group #1
/: Match a literal /
\1: Using back-reference #1, match same number as in group #1
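To see the backreference reject the mismatched path from the question, here is a small sketch. The class name SameNumberPath and the method isConsistent are made up for illustration; the pattern is the one from the answers above, with the backreference left uncaptured.

```java
import java.util.regex.Pattern;

class SameNumberPath {
    // \\2 must repeat exactly the digits captured by the second group.
    private static final Pattern PATH =
            Pattern.compile("(.*?)/(\\d+)/\\2_(\\d+)\\.jpg");

    // Hypothetical helper: true only when the folder number and the file prefix are the same.
    static boolean isConsistent(String path) {
        return PATH.matcher(path).matches();
    }

    public static void main(String[] args) {
        System.out.println(isConsistent("../90804/90804_0.jpg")); // true
        System.out.println(isConsistent("../90804/89246_0.jpg")); // false
    }
}
```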

This regex, ^(.*)\/(\d+?)\/(\d+?)_(\d+?)\.jpg$, splits strings like these:
../90804/90804_0.jpg
../89246/89246_8.jpg
../89247/89247_14.jpg
into 4 parts.

Weird password check matching using regex in Java

I'm trying to check a password against the following constraints:
at least 9 characters
at least 1 upper-case letter
at least 1 lower-case letter
at least 1 special character from the following list:
~ ! @ # $ % ^ & * ( ) _ - + = { } [ ] | : ; " ' < > , . ?
no accented letters
Here's the code I wrote:
Pattern pattern = Pattern.compile(
"(?!.*[âêôûÄéÆÇàèÊùÌÍÎÏÐîÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ€£])"
+ "(?=.*\\d)"
+ "(?=.*[a-z])"
+ "(?=.*[A-Z])"
+ "(?=.*[`~!@#$%^&*()_\\-+={}\\[\\]\\\\|:;\"'<>,.?/])"
+ ".{9,}");
Matcher matcher = pattern.matcher(myNewPassword);
if (matcher.matches()) {
//do what you've got to do when you
}
The issue is that some characters, like € or £, don't make the password invalid.
I don't understand why it works that way, since I explicitly exclude € and £ from the allowed characters.

Rather than trying to disallow those non-ASCII characters, why not make your regex accept only ASCII characters, like this:
Pattern pattern = Pattern.compile(
"(?=.*\\d)(?=.*[a-z])(?=.*[A-Z])(?=.*\\p{Punct})\\p{ASCII}{9,}");
Note the use of \p{Punct} (any ASCII punctuation character, slightly broader than your list) instead of the big character class. I believe that should suffice for you.
Check the Javadoc for java.util.regex.Pattern for more details.

This allows only printable ASCII. Note that it also allows the space character; you can disallow spaces by starting the range at \x21 instead of \x20.
Edit: I didn't see a digit in the requirements, only in your regex, so I wasn't sure and left it out.
# "^(?=.*[A-Z])(?=.*[a-z])(?=.*[`~!@#$%^&*()_\\-+={}\\[\\]|:;\"'<>,.?])[\\x20-\\x7E]{9,}$"
^
(?= .* [A-Z] )
(?= .* [a-z] )
(?= .* [`~!@#$%^&*()_\-+={}\[\]|:;"'<>,.?] )
[\x20-\x7E]{9,}
$
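The whitelist idea (accept only printable ASCII rather than blacklisting accented characters) can be checked with a short sketch. The class name PasswordCheck and the method isValid are made up; using \p{Punct} for the special characters is an assumption that is slightly broader than the list in the question, the range starts at \x21 so spaces are rejected, and no digit is required, following the written constraints rather than the question's regex.

```java
import java.util.regex.Pattern;

class PasswordCheck {
    private static final Pattern POLICY = Pattern.compile(
            "(?=.*[A-Z])"            // at least one upper-case letter
          + "(?=.*[a-z])"            // at least one lower-case letter
          + "(?=.*\\p{Punct})"       // at least one ASCII punctuation character (assumption)
          + "[\\x21-\\x7E]{9,}");    // 9 or more printable ASCII characters, no spaces

    // Hypothetical helper: true when the whole password satisfies the policy.
    static boolean isValid(String password) {
        return POLICY.matcher(password).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("Abcdefgh!"));       // true
        System.out.println(isValid("Abcdefgh!\u20AC")); // false: the euro sign is outside the ASCII range
    }
}
```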

Regular Expression for string in java

I am trying to write a regular expression for these kinds of strings:
05 IMA-POLICY-ID PIC X(15). 00020068
05 (AMENT)-GROUPCD PIC X(10).
I want to parse everything between the 05 and the first tab.
The line might start with tabs or spaces, followed by the digits.
The initial number can be anything: 05, 10, 15.
So in the first line I need to parse IMA-POLICY-ID, and in the second line (AMENT)-GROUPCD.
This is the code I have written, and it's not finding the pattern. Where am I going wrong?
Pattern p1 = Pattern.compile("^[0-9]+\\s\\S+\t$");
Matcher m1 = p1.matcher(line);
System.out.println("m1 =="+m1.group());

Pattern p1 = Pattern.compile("\\b(?:05|1[05])\\b[^\\t]*\\t");
will match anything from 05, 10 or 15 until the nearest \t.
Explanation:
\b # Start of number/word
(?:05|1[05]) # Match 05, 10 or 15
\b # End of number/word
[^\t]* # Match any number of characters except tab
\t # Match a tab

^\d+\s+([^\s]+)
this will match your requirement
demo here : http://regex101.com/r/rQ7fT3

Your regex is almost correct. Just remove the \t$ at the end of your regex and capture the \\S+ as a group:
Pattern p1 = Pattern.compile("^[0-9]+\\s(\\S+)");
Matcher m1 = p1.matcher(line);
Now print it like this:
if (m1.find()) {
System.out.println(m1.group(1));
}

Your pattern expects the line to end after IMA-POLICY-ID etc, because of the $ at the end.
If there is no whitespace in the string you want to match (I assume there isn't, because of your use of \S+), I'd change the pattern to ^\d+\s+(\S+), which should be sufficient to match any number at the start of a line, followed by whitespace and then the group of non-whitespace characters you want to match (note that a tab is whitespace as well).
If you need to match until the first tab or the end of the input and include other whitespace, replace (\S+) with ([^\t]+).
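Putting these points together, here is a minimal sketch of the extraction. The class name CopybookField and the method fieldName are made up, and allowing optional leading whitespace with \s* is an assumption based on the question's remark that lines may start with tabs or spaces. Note also that Matcher.group() throws IllegalStateException unless find() or matches() has been called and succeeded first, which the code in the question skips.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CopybookField {
    // Optional indentation, the level number, whitespace, then the field name as group 1.
    private static final Pattern FIELD = Pattern.compile("^\\s*\\d+\\s+(\\S+)");

    // Hypothetical helper: returns the field name, or null if the line does not match.
    static String fieldName(String line) {
        Matcher m = FIELD.matcher(line);
        // find() must succeed before group(1) may be read
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(fieldName("05 IMA-POLICY-ID\tPIC X(15).\t00020068")); // IMA-POLICY-ID
        System.out.println(fieldName("\t05 (AMENT)-GROUPCD\tPIC X(10)."));       // (AMENT)-GROUPCD
    }
}
```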

I can see two things that might prevent your Pattern from working.
Firstly, your input Strings contain multiple tab-separated values, so the $ "end-of-input" anchor at the end of your Pattern will fail to match the String.
Secondly, you want to find what's in between 05 (etc.) and the first tab. Therefore you need to wrap your desired expression in parentheses (e.g. (\\S+)) and refer to it by its group number (in this case, group 1).
Here's an example:
String input = "05 IMA-POLICY-ID\tPIC X(15).\t00020068" +
"\r\n05 (AMENT)-GROUPCD\tPIC X(10).";
// [015]{2}  0, 1, or 5 twice (refine here if needed)
// \\s       1 whitespace
// (.+?)     your queried expression (here I use a reluctant dot search)
// \t        tab
// .+?       anything after, reluctant
Pattern p = Pattern.compile("[015]{2}\\s(.+?)\t.+?");
Matcher m = p.matcher(input);
while (m.find()) {
System.out.println("Found: " + m.group(1));
}
Output
Found: IMA-POLICY-ID
Found: (AMENT)-GROUPCD

This is what I came up with, and it worked:
String re = "^\\s+\\d+\\s+([^\\s]+)";
Pattern p1 = Pattern.compile(re, Pattern.MULTILINE);
Matcher m1 = p1.matcher(line);
