I want to match exactly 2 times with regex - java

I have something like this string.
XXXX^^^141409i1^^^XXXX.
I want to match those 3 ^ in a group and the group exactly 2 times. I wrote this but it doesn't seem to work.
(?:(\^){3}){2}
EDIT
I have to split the string and extract the number in the middle. The delimiter group should consist of exactly 3 ^ and should be matched exactly 2 times. If the first group has only 1 or 2 ^, matching should stop. The string is user input, and if the user enters more than that, for example XXXX^^^141409i1^^^XXXX^^^^XXXX, the last group shouldn't match, only the first 2. (Sorry if I'm being ambiguous.)
EDIT2
The point of the exercise is to split the string and get the number in the middle. I wrote this line, but the problem is that it matches every ^^^ and I only want it to match exactly 2 times.
String[] split = s.split("(\\^){3}");

If I correctly understood what you want, I hope this will help you:
String input = "XXXX^^^141409i1^^^XXXX^^^^XXXX";
Pattern pattern = Pattern.compile(".*?\\^{3}(\\w+)\\^{3}");
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
System.out.println("The number in the middle: " + matcher.group(1));
}
Output:
The number in the middle: 141409i1
Here you can see how it works: https://regexr.com/51r9e
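Since the stated goal is to split on only the first two ^^^ delimiters, another option is the limit argument of String.split. The original attempt (?:(\^){3}){2} asks for two groups of three carets directly next to each other, that is, six consecutive carets, which is why it never matches this input. The sketch below is only an illustration: the class name SplitTwice and the helper middle are made up, and it assumes the input follows the XXXX^^^number^^^... shape from the question.

```java
public class SplitTwice {
    // Hypothetical helper: returns the text between the first two "^^^" delimiters,
    // or null when the input contains fewer than two of them.
    static String middle(String s) {
        // A limit of 3 means the pattern is applied at most 2 times,
        // so everything after the second delimiter stays in the last element.
        String[] parts = s.split("\\^{3}", 3);
        return parts.length == 3 ? parts[1] : null;
    }

    public static void main(String[] args) {
        System.out.println(middle("XXXX^^^141409i1^^^XXXX^^^^XXXX")); // 141409i1
        System.out.println(middle("XXXX^^141409i1^^^XXXX"));          // null
    }
}
```

Note that this does not reject a run of four or more carets; it simply splits on the first three of them, so stricter validation would still need a separate check.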

Related

Partially mask data of a group of number using regex

I would like to partially mask data using regex. Here is the input :
123-12345-1234567
And here is what I'd like as output :
1**-*****-*****67
I figured out how to replace the last group, but I don't know how to do it for the rest of the data.
String s = "123-12345-1234567";
System.out.println(s.replaceAll("\\d(?=\\d{2})", "*")); // output is *23-***45-*****67
Also, I'd like to use only regex because I have different types of data, and so different types of mask. I don't want to create functions for each type of data.
For example :
AAAAAAAAA // becomes *******AA
12334567 // becomes 123*****
Thanks for your help !
We can use the following regex replacement approach:
String input = "123-12345-1234567";
String output = input.substring(0, 1) +
input.substring(1, input.length()-2).replaceAll("\\d", "*") +
input.substring(input.length()-2);
System.out.println(output); // 1**-*****-*****67
Here we concatenate together the first digit, followed by the middle portion with all digits replaced by *, along with the final two digits.
Edit: A pure regex solution, which, however, is more lines of code than the above and might be less performant.
String input = "123-12345-1234567";
String pattern = "^(\\d)(.*)(\\d{2})$";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(input);
if (m.find()) {
String output = m.group(1) + m.group(2).replaceAll("\\d", "*") + m.group(3);
System.out.println(output); // 1**-*****-*****67
}
Java supports a finite (bounded) quantifier in a lookbehind, so if you must use a regex only, you could use a pattern with an alternation to account for the different scenarios.
Using the lookarounds, you can select a single character at a time to be replaced by *.
Note that this is hard to maintain, and it would be a better option to write separate functions for the different data formats using separate patterns or string functions (perhaps accompanied by unit tests).
(?<=^\d{3,7})\d(?=\d*$)|(?<=^[A-Z]{0,6})[A-Z](?=[A-Z]*$)|\d(?<=^\d{2,3})(?=\d?-\d{5}-\d{7}$)|\d(?<=^\d{3}-\d{1,5}(?:-\d{1,5})?)
The separate parts match:
(?<=^\d{3,7})\d(?=\d*$) Match a digit asserting 3-7 digits to the left and only digits to the right
| Or
(?<=^[A-Z]{0,6})[A-Z](?=[A-Z]*$) Match A-Z asserting 0-6 chars to the left and only chars A-Z to the right
| Or
\d(?<=^\d{2,3})(?=\d?-\d{5}-\d{7}$) Match a digit, asserting that it is the 2nd or 3rd digit from the start, with an optional digit, then - with 5 digits and - with 7 digits to the right
| Or
\d(?<=^\d{3}-\d{1,5}(?:-\d{1,5})?) Match a digit, asserting that the text from the start up to and including that digit is 3 digits followed by - and 1-5 digits, and optionally - with 1-5 digits
String regex = "(?<=^\\d{3,7})\\d(?=\\d*$)|(?<=^[A-Z]{0,6})[A-Z](?=[A-Z]*$)|\\d(?<=^\\d{2,3})(?=\\d?-\\d{5}-\\d{7}$)|\\d(?<=^\\d{3}-\\d{1,5}(?:-\\d{1,5})?)";
String s1 = "123-12345-1234567";
String s2 = "AAAAAAAAA";
String s3 = "12334567";
System.out.println(s1.replaceAll(regex, "*"));
System.out.println(s2.replaceAll(regex, "*"));
System.out.println(s3.replaceAll(regex, "*"));
Output
1**-*****-*****67
*******AA
123*****
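The suggestion above to write separate, testable functions instead of one large alternation could look roughly like this. It is a sketch, not the answerer's code: the helper mask and its keepStart/keepEnd parameters are hypothetical names, and it assumes that every letter or digit outside the kept prefix and suffix is masked while separators such as - are left alone.

```java
public class MaskSketch {
    // Hypothetical helper: keeps the first keepStart and the last keepEnd characters
    // and replaces every letter or digit in between with '*'.
    static String mask(String s, int keepStart, int keepEnd) {
        StringBuilder sb = new StringBuilder(s);
        for (int i = keepStart; i < s.length() - keepEnd; i++) {
            if (Character.isLetterOrDigit(s.charAt(i))) {
                sb.setCharAt(i, '*'); // separators such as '-' are not touched
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(mask("123-12345-1234567", 1, 2)); // 1**-*****-*****67
        System.out.println(mask("AAAAAAAAA", 0, 2));         // *******AA
        System.out.println(mask("12334567", 3, 0));          // 123*****
    }
}
```

Each data format then only needs its own pair of numbers rather than its own regex branch.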
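public static void main(String[] args) {
    // mask every digit that has at least 1 character before it and at least 2 after it
    System.out.println("123-12345-1234567".replaceAll("(?<=.)\\d(?=.{2,})", "*"));
    // mask every character that has at least 2 characters after it
    System.out.println("AAAAAAAAA".replaceAll(".(?=.{2,})", "*"));
    // mask every character that has at least 3 characters before it
    System.out.println("12334567".replaceAll("(?<=.{3}).", "*"));
}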
output:
1**-*****-*****67
*******AA
123*****

Split String at different lengths in Java

I want to split a string after a certain length.
Let's say we have a string of "message"
123456789
Split like this :
"12" "34" "567" "89"
I thought of first splitting it into pieces of 2 using the regexp
"(?<=\\G.{2})"
and then joining the last two pieces and splitting again into 3, but is there any way to do it in a single go using a regexp? Please help me out.
Use ^(.{2})(.{2})(.{3})(.{2}).* (See it in action in regex101) to group the String to the specified length and grab the groups as separate Strings
import java.util.*;
import java.util.regex.*;

String input = "123456789";
List<String> output = new ArrayList<>();
Pattern pattern = Pattern.compile("^(.{2})(.{2})(.{3})(.{2}).*");
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
    // group 0 is the whole match, so the pieces are groups 1..groupCount()
    for (int i = 1; i <= matcher.groupCount(); i++) {
        output.add(matcher.group(i));
    }
}
System.out.println(output); // [12, 34, 567, 89]
NOTE: Group capturing starts from 1 as the group 0 matches the whole String
And some magnificent sorcery from @YCF_L in the comments:
String pattern = "^(.{2})(.{2})(.{3})(.{2}).*";
String[] vals = "123456789".replaceAll(pattern, "$1-$2-$3-$4").split("-");
The magic here is that you can refer to the captured groups in the replacement string of the replaceAll() method. Use $n (where n is a digit) to refer to captured subsequences. See this stackoverflow question for a better explanation.
NOTE: it is assumed here that no input string contains a -. If one does, pick any other character that will not appear in any of your input strings and use that as the delimiter.
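If relying on a delimiter that must not occur in the input feels fragile, the pieces can also be cut directly by length. This sketch swaps the regex for plain substring calls; splitByLengths is a hypothetical helper name, and it assumes the input is at least as long as the sum of the requested lengths (anything beyond that is ignored, like the trailing .* in the pattern above).

```java
import java.util.ArrayList;
import java.util.List;

public class SplitByLengths {
    // Hypothetical helper: cuts s into consecutive pieces of the given lengths.
    static List<String> splitByLengths(String s, int... lengths) {
        List<String> parts = new ArrayList<>();
        int pos = 0;
        for (int len : lengths) {
            if (pos + len > s.length()) {
                throw new IllegalArgumentException("input is shorter than the requested lengths");
            }
            parts.add(s.substring(pos, pos + len));
            pos += len;
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitByLengths("123456789", 2, 2, 3, 2)); // [12, 34, 567, 89]
    }
}
```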
Test this regex in regex101 with the test string 123456789.
^(\d{2})(\d{2})(\d{3})(\d{2})$
output :
Match 1
Full match 0-9 `123456789`
Group 1. 0-2 `12`
Group 2. 2-4 `34`
Group 3. 4-7 `567`
Group 4. 7-9 `89`

Parsing array syntax using regex

I think what I am asking is either very trivial or already asked, but I have had a hard time finding answers.
We need to capture the inner number characters between brackets within a given string.
so given the string
StringWithMultiArrayAccess[0][9][4][45][1]
and the regex
^\w*?(\[(\d+)\])+?
I would expect 6 capture groups and access to the inner data.
However, I end up only capturing the last "1" character in capture group 2.
If it is important, here's my Java JUnit test:
@Test
public void ensureThatJsonHandlerCanHandleNestedArrays(){
String stringWithArr = "StringWithMultiArray[0][0][4][45][1]";
Pattern pattern = Pattern.compile("^\\w*?(\\[(\\d+)\\])+?");
Matcher matcher = pattern.matcher(stringWithArr);
matcher.find();
assertTrue(matcher.matches()); //passes
System.out.println(matcher.group(2)); //prints 1 (matched from last array symbols)
assertEquals("0", matcher.group(2)); //expected but its 1 not zero
assertEquals("45", matcher.group(5)); //only 2 capture groups exist, the whole string and the 1 from the last array brackets
}
In order to capture each number, you need to change your regex so it (a) captures a single number and (b) is not anchored to--and therefore limited by--any other part of the string ("^\w*?" anchors it to the start of the string). Then you can loop through them:
String arrayAsStr = "StringWithMultiArrayAccess[0][9][4][45][1]";
Matcher mtchr = Pattern.compile("\\[(\\d+)\\]").matcher(arrayAsStr);
while (mtchr.find()) {
    System.out.print(mtchr.group(1) + " ");
}
Output:
0 9 4 45 1
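The reason the original pattern yields only the last index is that a repeated capturing group keeps just the text of its final iteration. A self-contained version of the loop above, collecting the indexes into a list, might look like the following; the class name ArrayIndexes and the helper indexes are made up for illustration, and the sketch assumes every index fits into an int.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ArrayIndexes {
    // One bracketed number per match; the digits are captured in group 1.
    private static final Pattern INDEX = Pattern.compile("\\[(\\d+)\\]");

    // Hypothetical helper: returns every bracketed index in order of appearance.
    static List<Integer> indexes(String expr) {
        List<Integer> result = new ArrayList<>();
        Matcher m = INDEX.matcher(expr);
        while (m.find()) {
            result.add(Integer.parseInt(m.group(1)));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(indexes("StringWithMultiArrayAccess[0][9][4][45][1]")); // [0, 9, 4, 45, 1]
    }
}
```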

Regex - Match numbers & special cases

I'm trying to make a regex that would produce the following results :
for 7.0 + 5 - :asc + (8.256 - :b)^2 + :d/3 : 7.0, 5, :asc, 8.256, :b, 2, :d, 3
for -+*-/^^ )ç# : nothing
It should first match numbers, which can be floats, so in my regex I have : [0-9]+(\\.[0-9])? but it should also match special cases like :a or :Abc.
To be more precise, it should (if possible) match anything but mathematical operators /*+^- and parentheses.
So here is my final regex : ([0-9]+(\\.[0-9])?)|(:[a-zA-Z]+) but it's not working because matcher.groupCount() returns 3 for both of the examples I gave.
Groups are what you specifically group in the regex. Anything surrounded in parentheses is a group. (Hello) World has 1 group, Hello. What you need to be doing is finding all the matches.
In your code ([0-9]+(\\.[0-9])?)|(:[a-zA-Z]+), 3 sets of parentheses can be seen. This is why you will always be given 3 groups in every match.
Your regex works almost as it is; here is an example:
String text = "7.0 + 5 - :asc + (8.256 - :b)^2 + :d/3";
Pattern p = Pattern.compile("([0-9]+(\\.[0-9]+)?)|(:[a-zA-Z]+)");
Matcher m = p.matcher(text);
List<String> matches = new ArrayList<String>();
while (m.find()) matches.add(m.group());
for (String match : matches) System.out.println(match);
The ArrayList matches will contain all of the matches that your regex finds.
The only change I made was to add a + after the second [0-9], so that a number with more than one decimal digit, such as 8.256, is matched as a whole.
Here is the output:
7.0
5
:asc
8.256
:b
2
:d
3
Here is some more information about groups in java.
Does that help?
Your regex is correct, run the following code:
String input = "7.0 + 5 - :asc + (8.256 - :b)^2 + :d/3"; // your input
String regex = "(\\d+(\\.\\d+)?)|(:[a-zA-Z]+)"; // essentially yours
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
System.out.println(matcher.group());
}
Your problem is the understanding of the method matcher.groupCount(). JavaDoc clearly says
Returns the number of capturing groups in this matcher's pattern.
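To make the difference concrete, the sketch below contrasts groupCount(), which depends only on the pattern, with the number of matches found in a given input. The class name GroupCountDemo and the helper countMatches are made up for illustration; the pattern is the one from the question with the extra + on the decimal part.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCountDemo {
    static final Pattern TOKEN = Pattern.compile("([0-9]+(\\.[0-9]+)?)|(:[a-zA-Z]+)");

    // Hypothetical helper: counts how many times the pattern matches in the input.
    static int countMatches(String input) {
        Matcher m = TOKEN.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // groupCount() counts the parentheses in the pattern, whatever the input is.
        System.out.println(TOKEN.matcher("").groupCount());                           // 3
        System.out.println(countMatches("7.0 + 5 - :asc + (8.256 - :b)^2 + :d/3"));   // 8
        // \u00e7 is the character 'ç' from the second example in the question.
        System.out.println(countMatches("-+*-/^^ )\u00e7#"));                         // 0
    }
}
```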
([^\()+\-*\s])+ //put any mathematical operator inside square bracket

How to count a position of element, relative to another element using regex?

Given String
// 1 2 3
String a = "letters.1223434.more_letters";
I'd like to recognize that numbers come in a 2nd position after the first dot
I then would like to use this knowledge to replace "2nd position of"
// 1 2 3
String b = "someWords.otherwords.morewords";
with "hello" to effectively make
// 1 2 3
String b = "someWords.hello.morewords";
Substitution would have to be done based on the original position of matched element in String a
How can this be done using regex please?
To find those numbers you can use the group mechanism (round brackets in regular expressions):
import java.util.regex.*;
...
String data = "letters.1223434.more_letters";
String pattern="(.+?)\\.(.+?)\\.(.+)";
Matcher m = Pattern.compile(pattern).matcher(data);
if (m.find()) //or while if needed
for (int i = 1; i <= m.groupCount(); i++)
//group 0 == whole String, so I ignore it and start from i=1
System.out.println(i+") [" + m.group(i) + "] start="+m.start(i));
// OUT:
//1) [letters] start=0
//2) [1223434] start=8
//3) [more_letters] start=16
BUT if your goal is just to replace the text between two dots, maybe try the replaceFirst(String regex, String replacement) method on the String object:
//find ALL characters between 2 dots once and replace them
String a = "letters.1223434abc.more_letters";
a=a.replaceFirst("\\.(.+)\\.", ".hello.");
System.out.println(a);// OUT => letters.hello.more_letters
The regex searches for all characters between two dots (including the dots), so the replacement should be ".hello." (with dots).
If your String has more dots, it will replace ALL characters between the first and last dot. If you want the regex to search for the minimum number of characters necessary to satisfy the pattern, you need to use a reluctant quantifier (+? instead of +), like:
String b = "letters.1223434abc.more_letters.another.dots";
b=b.replaceFirst("\\.(.+?)\\.", ".hello.");//there is "+?" instead of "+"
System.out.println(b);// OUT => letters.hello.more_letters.another.dots
What you want to do is not directly possible in RegExp, because you cannot get access to the number of the capture group and use this in the replacement operation.
Two alternatives:
If you can use any programming language: Split a using regexp into groups. Check each group if it matches your numeric identifier condition. Split the b string into groups. Replace the corresponding match.
If you only want to use a number of regexps, then you can concatenate a and b using a unique separator (let's say |). Then match .*?\.\d+?\..*?\|.*?\.(.*?)\..*? (the separator is escaped as \| because a bare | means alternation) and replace the captured group $1. You need to apply this regexp in three variations: first position, second position, third position.
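The first alternative (split both strings, find the numeric segment in a, replace the segment at the same index in b) can be sketched as follows. The class name ReplaceByPosition and the helper replaceAtNumericPosition are hypothetical, and the sketch assumes that segments are separated by single dots, that neither string ends with a dot, and that "numeric" means a segment made only of digits.

```java
public class ReplaceByPosition {
    // Hypothetical helper: finds the first all-digit segment in a and replaces
    // the segment at the same position in b. Returns b unchanged if none is found.
    static String replaceAtNumericPosition(String a, String b, String replacement) {
        String[] partsA = a.split("\\.");
        String[] partsB = b.split("\\.");
        for (int i = 0; i < partsA.length && i < partsB.length; i++) {
            if (partsA[i].matches("\\d+")) {
                partsB[i] = replacement;
                return String.join(".", partsB);
            }
        }
        return b;
    }

    public static void main(String[] args) {
        String a = "letters.1223434.more_letters";
        String b = "someWords.otherwords.morewords";
        System.out.println(replaceAtNumericPosition(a, b, "hello")); // someWords.hello.morewords
    }
}
```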
The regex for string a would be
\w+\.(\d+)\.\w+
using the match group to grab the number.
The regex for the second string would be
\w+\.(\w+)\.\w+
to grab the match group for the second string.
Then use code like this to do what you please with the matches:
Pattern pattern = Pattern.compile(patternStr);
Matcher matcher = pattern.matcher(inputStr);
boolean matchFound = matcher.find();
where patternStr is the pattern I mentioned above and inputStr is the input string.
You can use variations of this to try each combination you want. So you can move the match group to the first position, try that. If it returns a match, then do the replacement in the second string at the first position. If not, go to position 2 and so on...
