I have some strings which differ in only one word (e.g. foo byte bar and foo word bar). I used multiple regexes to parse them (e.g. (\w+) byte (\w+) -> $1 1 $2 and (\w+) word (\w+) -> $1 2 $2). Is it possible to choose the output depending on the input word (e.g. (\w+) (\w+) (\w+) -> $1 <depending on $2> $3)? Tell me if you need more examples.
This is best solved using Matcher.appendReplacement / Matcher.appendTail as follows:
String input = "hello byte world";
Pattern p = Pattern.compile("(\\w+) (\\w+) (\\w+)");
Matcher m = p.matcher(input);
StringBuffer sb = new StringBuffer();
while (m.find()) {
// Compute replacement for middle word
String w = m.group(2);
String s = w.equals("byte") ? "<A BYTE!>"
: w.equals("word") ? "<A WORD!>"
: "something else";
m.appendReplacement(sb, "$1 " + s + " $3");
}
m.appendTail(sb);
System.out.println(sb);
Output:
hello <A BYTE!> world
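On Java 9 or later, the same idea can be sketched with Matcher.replaceAll(Function<MatchResult, String>), which avoids the explicit StringBuffer loop. The class name MiddleWordReplacer, the method replaceMiddle, and the byte -> 1 / word -> 2 mapping (taken from the question's examples) are illustrative assumptions, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MiddleWordReplacer {
    private static final Pattern THREE_WORDS = Pattern.compile("(\\w+) (\\w+) (\\w+)");

    // Replaces the middle word of every three-word match (mapping is illustrative).
    static String replaceMiddle(String input) {
        return THREE_WORDS.matcher(input).replaceAll(mr -> {
            String w = mr.group(2);
            String s = w.equals("byte") ? "1"
                     : w.equals("word") ? "2"
                     : w; // unknown words are kept unchanged
            // The returned string is still treated as a replacement template,
            // so quote it to keep '$' and '\' from being read as references.
            return Matcher.quoteReplacement(mr.group(1) + " " + s + " " + mr.group(3));
        });
    }

    public static void main(String[] args) {
        System.out.println(replaceMiddle("foo byte bar")); // foo 1 bar
        System.out.println(replaceMiddle("foo word bar")); // foo 2 bar
    }
}
```

If the mapping grows beyond a few words, a Map<String, String> lookup inside the lambda would probably read better than chained ternaries.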
I have one input String like this:
"I am Duc/N Ta/N Van/N"
The suffix "/N" marks a word as part of a person's name.
The expected output is:
Name: Duc Ta Van
How can I do this with a regular expression?
You can use Pattern and Matcher like this:
String input = "I am Duc/N Ta/N Van/N";
Pattern pattern = Pattern.compile("([^\\s]+)/N");
Matcher matcher = pattern.matcher(input);
String result = "";
while (matcher.find()) {
result+= matcher.group(1) + " ";
}
System.out.println("Name: " + result.trim());
Output
Name: Duc Ta Van
Another solution using Java 9+
From Java 9 onward you can use Matcher::results like this (it needs import java.util.stream.Collectors):
String input = "I am Duc/N Ta/N Van/N";
String regex = "([^\\s]+)/N";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
String result = matcher.results().map(s -> s.group(1)).collect(Collectors.joining(" "));
System.out.println("Name: " + result); // Name: Duc Ta Van
Here is the regex to use to capture every "name" followed by /N:
(\w+)\/N
Validate with Regex101
Now you just need to loop over every match in that String and concatenate them to get the result:
String pattern = "(\\w+)\\/N";
String test = "I am Duc/N Ta/N Van/N";
Matcher m = Pattern.compile(pattern).matcher(test);
StringBuilder sbNames = new StringBuilder();
while(m.find()){
sbNames.append(m.group(1)).append(" ");
}
System.out.println(sbNames.toString());
Duc Ta Van
That covers the hardest part. I'll let you adapt it to your needs.
Note:
In Java it is not required to escape a forward slash. To keep the same regex throughout the answer I use "(\\w+)\\/N", but "(\\w+)/N" works as well.
I've used "[/N]+" as the regular expression.
Regex101
[] = Matches any single character inside the set
/ = Matches the character / literally
N = Matches the character N literally (case sensitive)
+ = Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
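The answer above does not show how the pattern is applied, so the following is only a sketch that assumes it is passed to String.replaceAll with an empty replacement. It contrasts the character class "[/N]+" with the literal sequence "/N"; the class CharClassDemo and its two helper methods are hypothetical names.

```java
public class CharClassDemo {
    // Removes every run of '/' or 'N' characters, wherever they occur.
    static String stripClass(String s) {
        return s.replaceAll("[/N]+", "");
    }

    // Removes only the literal two-character marker "/N".
    static String stripLiteral(String s) {
        return s.replaceAll("/N", "");
    }

    public static void main(String[] args) {
        System.out.println(stripClass("I am Duc/N Ta/N Van/N")); // I am Duc Ta Van
        System.out.println(stripClass("Nam/N"));                 // am
        System.out.println(stripLiteral("Nam/N"));               // Nam
    }
}
```

The second call shows the caveat: because [/N] is a set of characters, a capital N inside a name is removed as well, which the literal pattern avoids.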
I have a large database and want to check it for capitalization errors. I use the pattern below to find repeated characters. The pattern works, but I need to add start and end conditions on the string.
Pattern:
(\w)\1+
Target String:
Javaaa
result: aaa
I want to add a condition to the regex: the string must start with Ja and end with a. The result must be only the repeated characters.
(I don't want to handle this programmatically; the regex alone should do it, if possible.)
(I'm doing this with String.replaceAll(regex, string), not with the Pattern or Matcher classes.)
You may use a lookahead anchored at the leading word boundary:
\b(?=Ja\w*a\b)\w*?((\w)\2+)\w*\b
See the regex demo
Details:
\b - leading word boundary
(?=Ja\w*a\b) - a positive lookahead that requires the whole word to start with Ja, then it can have 0+ word characters and end with a
\w*? - 0+ word characters but as few as possible
((\w)\2+) - Group 1 matching identical consecutive characters
\w* - any remaining word characters (0 or more)
\b - trailing word boundary.
The result you are seeking is in Group 1.
String s = "Prooo\nJavaaa";
Pattern pattern = Pattern.compile("\\b(?=Ja\\w*a\\b)\\w*?((\\w)\\2+)\\w*\\b");
Matcher matcher = pattern.matcher(s);
while (matcher.find()){
System.out.println(matcher.group(1));
}
See the Java demo.
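Since the question mentions String.replaceAll rather than Pattern and Matcher, here is a minimal sketch of the same regex used with a "$1" replacement, so that a matching word is reduced to its repeated run. The class RepeatedCharsReplace and the method repeatedRun are hypothetical names.

```java
public class RepeatedCharsReplace {
    // Same pattern as above: word starts with "Ja", ends with "a",
    // and group 1 holds the first run of identical consecutive characters.
    static final String REGEX = "\\b(?=Ja\\w*a\\b)\\w*?((\\w)\\2+)\\w*\\b";

    // Replaces each matching word with its repeated run; other text is untouched.
    static String repeatedRun(String text) {
        return text.replaceAll(REGEX, "$1");
    }

    public static void main(String[] args) {
        System.out.println(repeatedRun("Javaaa")); // aaa
        System.out.println(repeatedRun("Prooo"));  // Prooo (does not start with Ja)
    }
}
```

Note that replaceAll leaves non-matching input as it is, so a caller that wants only the repeated characters still has to tell "no match" apart from "replaced".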
Another code example (inspired by @Wiktor Stribizew's code), following your expected input and output format.
public static void main( String[] args )
{
String[] input =
{ "Javaaa", "Javaaaaaaaaa", "Javaaaaaaaaaaaaaaaaaa", "Paoooo", "Paoooooooo", "Paooooooooxxxxxxxxx" };
for ( String str : input )
{
System.out.println( "Target String :" + str );
Pattern pattern = Pattern.compile( "((.)\\2+)" );
Matcher matcher = pattern.matcher( str );
while ( matcher.find() )
{
System.out.println( "result: " + matcher.group() );
}
System.out.println( "---------------------" );
}
System.out.println( "Finish" );
}
Output:
Target String :Javaaa
result: aaa
---------------------
Target String :Javaaaaaaaaa
result: aaaaaaaaa
---------------------
Target String :Javaaaaaaaaaaaaaaaaaa
result: aaaaaaaaaaaaaaaaaa
---------------------
Target String :Paoooo
result: oooo
---------------------
Target String :Paoooooooo
result: oooooooo
---------------------
Target String :Paooooooooxxxxxxxxx
result: oooooooo
result: xxxxxxxxx
---------------------
Finish
I have to extract values from a string using regex groups.
The inputs look like this:
-> 1
-> 5.2
-> 1(2)
-> 3(*)
-> 2(3).2
-> 1(*).5
I wrote the following code to get the values from these inputs.
String stringToSearch = "2(3).2";
Pattern p = Pattern.compile("(\\d+)(\\.|\\()(\\d+|\\*)\\)(\\.)(\\d+)");
Matcher m = p.matcher(stringToSearch);
if (m.matches()) { // group() throws IllegalStateException unless a match was attempted first
System.out.println("1: "+m.group(1)); // O/P: 2
System.out.println("3: "+m.group(3)); // O/P: 3
System.out.println("5: "+m.group(5)); // O/P: 2
}
But my problem is that only the first group is compulsory; the others are optional.
That's why I need a regex that checks all of these patterns and extracts the values.
Use non-capturing groups and make them optional by adding a ? quantifier after each of them.
^(\d+)(?:\((\d+|\*)\))?(?:\.(\d+))?$
DEMO
The Java regex would be:
"(?m)^(\\d+)(?:\\((\\d+|\\*)\\))?(?:\\.(\\d+))?$"
Example:
String input = "1\n" +
"5.2\n" +
"1(2)\n" +
"3(*)\n" +
"2(3).2\n" +
"1(*).5";
Matcher m = Pattern.compile("(?m)^(\\d+)(?:\\((\\d+|\\*)\\))?(?:\\.(\\d+))?$").matcher(input);
while(m.find())
{
if (m.group(1) != null)
System.out.println(m.group(1));
if (m.group(2) != null)
System.out.println(m.group(2));
if (m.group(3) != null)
System.out.println(m.group(3));
}
Here is an alternate approach that is simpler to understand.
First replace all non-digit, non-* characters by a colon
Split by :
Code:
String repl = input.replaceAll("[^\\d*]+", ":");
String[] tok = repl.split(":");
RegEx Demo
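To see what the replace-and-split approach yields, here is a small sketch run over the question's six inputs. The class SplitTokens and the method tokens are hypothetical names wrapping the two lines above.

```java
import java.util.Arrays;

public class SplitTokens {
    // Collapses every run of characters that are neither digits nor '*'
    // into a single colon, then splits on that colon.
    static String[] tokens(String input) {
        String repl = input.replaceAll("[^\\d*]+", ":");
        return repl.split(":");
    }

    public static void main(String[] args) {
        String[] inputs = { "1", "5.2", "1(2)", "3(*)", "2(3).2", "1(*).5" };
        for (String in : inputs) {
            System.out.println(in + " -> " + Arrays.toString(tokens(in)));
        }
    }
}
```

One caveat with this approach: 5.2 and 1(2) both produce two tokens, so the token position alone does not say whether the second value came from the parentheses or from the part after the dot. If that distinction matters, the optional-group regex keeps it.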
Why does this regex pattern fail to match the groups in Java? When I run the same example in a bash shell with echo and sed, it works.
String s = "Match foo and bar and baz";
//Pattern p = Pattern.compile("Match (.*) or (.*) or (.*)"); //was a typo
Pattern p = Pattern.compile("Match (.*) and (.*) and (.*)");
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group(1));
}
I am expecting to match foo, bar, and baz.
$ echo "Match foo and bar and baz" | sed 's/Match \(.*\) and \(.*\) and \(.*\)/\1, \2, \3/'
foo, bar, baz
It is due to the greedy nature of .*. You can use this regex instead:
Pattern p = Pattern.compile("Match (\\S+) and (\\S+) and (\\S+)");
This regex uses \\S+, which matches one or more non-whitespace characters.
Full code
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group(1) + ", " + m.group(2) + ", " + m.group(3));
}
You're trying to match the whole String, so
while (m.find()) {
will only iterate once.
That single find() will capture all the groups. As such, you can print them out as
System.out.println(m.group(1) + " " + m.group(2) + " " + m.group(3));
Or use a for loop over the Matcher#groupCount().
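The groupCount() loop mentioned above could look like the sketch below. The class GroupLoop and the method groups are hypothetical names; the pattern and input are the ones from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupLoop {
    // Collects every capturing group of the first match, in order.
    static List<String> groups(String s) {
        Pattern p = Pattern.compile("Match (.*) and (.*) and (.*)");
        Matcher m = p.matcher(s);
        List<String> out = new ArrayList<>();
        if (m.find()) {
            // Group 0 is the whole match, so capturing groups start at 1.
            for (int i = 1; i <= m.groupCount(); i++) {
                out.add(m.group(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(groups("Match foo and bar and baz")); // [foo, bar, baz]
    }
}
```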
Your regex is correct, but you need to print all of the groups and not only the first, e.g.:
while (m.find()) {
System.out.println(m.group(1));
System.out.println(m.group(2));
System.out.println(m.group(3));
}
It seems like a simple typo (or -> and):
Pattern p = Pattern.compile("Match (.*) and (.*) and (.*)");
UPDATE
To replace:
String s = "Match foo and bar and baz";
String replaced = s.replaceAll("Match (.*) and (.*) and (.*)", "$1, $2, $3");
System.out.println(replaced);
I need to get a word between two other words, and I'm using Pattern and Matcher.
Pattern p = Pattern.compile("Hello(.*?)GoodBye");
Matcher m = p.matcher(line);
In this example I'm getting the word between Hello and GoodBye, and it works.
What I want to do is replace Hello and GoodBye with variables such as:
String StartDelemiter = "Hello";
String EndDelemiter = "GoodBye";
How should I write it in Pattern p = Pattern.compile(---)? I tried:
Pattern p = Pattern.compile( "{ "+StartDelemiter +" (.*?) "+EndDelemiter+" }" );
But the application crashes!
You need to escape { and } with backslashes, something like:
Pattern p = Pattern.compile( "\\{ "+StartDelemiter +" (.*?) "+EndDelemiter+" \\}" );
The curly braces are regex quantifiers:
<pattern>{n} Match exactly n times
<pattern>{n,} Match at least n times
<pattern>{n,m} Match at least n but not more than m times
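If the delimiters come from variables, they may themselves contain regex metacharacters such as { or }. A minimal sketch, assuming the goal is simply "the text between two arbitrary delimiters", wraps each one in Pattern.quote so it is matched literally. The class BetweenDelimiters and the method between are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BetweenDelimiters {
    // Returns the shortest text between start and end, or null if there is none.
    static String between(String line, String start, String end) {
        // Pattern.quote makes both delimiters literal, whatever they contain.
        Pattern p = Pattern.compile(Pattern.quote(start) + "(.*?)" + Pattern.quote(end));
        Matcher m = p.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println("[" + between("Hello world GoodBye", "Hello", "GoodBye") + "]"); // [ world ]
        System.out.println("[" + between("{start}x{end}", "{start}", "{end}") + "]");       // [x]
    }
}
```

The captured text keeps any surrounding spaces, so call trim() on the result if only the word itself is wanted.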