Why does my regular expression match one character less at the end? - java

The problem is that the last character never gets matched.
When I print the match using group(), it shows everything except the last character.
This happens in all cases.
Below is the code and its output.
package mon;
import java.util.*;
import java.util.regex.*;
class HackerRank {
static void Pattern(String text) {
String p="\\d{1,2}|(0|1)\\d{2}|2[0-4]\\d|25[0-5]";
String pattern="(("+p+")\\.){3}"+p;
Pattern pi=Pattern.compile(pattern);
Matcher m=pi.matcher(text);
// System.out.println(m.group());
if(m.find() && m.group().equals(text))
System.out.println(m.group()+"true");
else
System.out.println(m.group()+" false");
}
public static void main(String[] args) {
Scanner sc=new Scanner(System.in);
while(sc.hasNext()) {
Pattern(sc.next());
}
sc.close();
}
}
Input: 000.12.12.034
Output: 000.12.12.03 false

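You should properly group the alternatives inside the octet pattern. In your pattern the trailing p is not wrapped in a group, so its | operators split the whole expression into top-level alternatives. The first alternative ends in \d{1,2}, and find() is satisfied as soon as that matches "03", leaving the "4" unmatched. Wrap the alternatives in a non-capturing group: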
String p="(?:\\d{1,2}|[01]\\d{2}|2[0-4]\\d|25[0-5])";
//        ^^^                                     ^
Then build the pattern like this:
String pattern = p + "(?:\\." + p + "){3}";
This form is also a bit more efficient. Then use matches() to require a full-string match:
if(m.matches()) {...
See a Java demo:
String p="(?:\\d{1,2}|[01]\\d{2}|2[0-4]\\d|25[0-5])";
String pattern = p + "(?:\\." + p + "){3}";
String text = "192.156.34.56";
// System.out.println(pattern); => (?:\d{1,2}|[01]\d{2}|2[0-4]\d|25[0-5])(?:\.(?:\d{1,2}|[01]\d{2}|2[0-4]\d|25[0-5])){3}
Pattern pi=Pattern.compile(pattern);
Matcher m=pi.matcher(text);
if(m.matches())
System.out.println(m.group()+" => true");
else
System.out.println("False"); // => 192.156.34.56 => true
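To make the effect of the missing group visible, here is a minimal sketch that runs find() on the question's ungrouped pattern and matches() on the grouped one. The class name OctetGroupingDemo and its helper methods are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OctetGroupingDemo {
    // Octet alternatives WITHOUT an enclosing group, as in the question.
    static final String UNGROUPED = "\\d{1,2}|(0|1)\\d{2}|2[0-4]\\d|25[0-5]";
    // The same alternatives wrapped in a non-capturing group, as in the answer.
    static final String GROUPED = "(?:\\d{1,2}|[01]\\d{2}|2[0-4]\\d|25[0-5])";

    // Returns the first substring that find() reports, or null if there is none.
    static String firstFind(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    // True only if the whole text consists of four valid octets separated by dots.
    static boolean isIpv4(String text) {
        return Pattern.matches(GROUPED + "(?:\\." + GROUPED + "){3}", text);
    }

    public static void main(String[] args) {
        String text = "000.12.12.034";
        String original = "((" + UNGROUPED + ")\\.){3}" + UNGROUPED;
        System.out.println(firstFind(original, text)); // 000.12.12.03
        System.out.println(isIpv4(text));              // true
        System.out.println(isIpv4("256.1.1.1"));       // false
    }
}
```

With the ungrouped pattern, find() reports a match that stops one character short; with the grouped pattern and a full-string match, the same input is accepted as a whole.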

Related

regex expression in java using wildcards

Is there a way to use a regex expression with wild cards? Specifically, I have a String phrase and another String target. I would like to use the match method to find the first occurrence of the target in the phrase where the character before and after the target is anything other than a-z.
Updated:
Is there a way to use the String method matches() with the following regex:
"(?<![a-z])" + "hello" + "(?![a-z])";
You can use the regex, "(?<![a-z])" + Pattern.quote(phrase) + "(?![a-z])"
Demo at regex101 with phrase = "hello".
(?<![a-z]): Negative lookbehind for [a-z]
(?![a-z]): Negative lookahead for [a-z]
Java Demo:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;
public class Main {
public static void main(String[] args) {
// Test
String phrase = "hello";
String regex = "(?<![a-z])" + Pattern.quote(phrase) + "(?![a-z])";
Pattern pattern = Pattern.compile(regex);
Stream.of(
"hi hello world",
"hihelloworld"
).forEach(s -> {
Matcher matcher = pattern.matcher(s);
System.out.print(s + " => ");
if(matcher.find()) {
System.out.println("Match found");
}else {
System.out.println("No match found");
}
});
}
}
Output:
hi hello world => Match found
hihelloworld => No match found
In case you want a full match, use the regex ".*(?<![a-z])" + Pattern.quote(phrase) + "(?![a-z]).*", as demonstrated at regex101.com. The pattern .* matches any character (except line terminators, by default) zero or more times. The remaining parts are explained above. Because matches() must cover the whole string, the .* before and after the target absorbs the surrounding text.
Java Demo:
import java.util.regex.Pattern;
import java.util.stream.Stream;
public class Main {
public static void main(String[] args) {
// Test
String phrase = "hello";
String regex = ".*(?<![a-z])" + Pattern.quote(phrase) + "(?![a-z]).*";
Stream.of(
"hi hello world",
"hihelloworld"
).forEach(s -> System.out.println(s + " => " + (s.matches(regex) ? "Match found" : "No match found")));
}
}
Output:
hi hello world => Match found
hihelloworld => No match found
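Since the question asks for the first occurrence, the lookaround pattern can also be combined with Matcher.start() to get its position. This is only a sketch; the class StandaloneWordFinder and the method indexOfStandalone are hypothetical names, not part of the answers above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StandaloneWordFinder {
    // Returns the index of the first occurrence of target in phrase that has no
    // lowercase a-z letter directly before or after it, or -1 if none exists.
    static int indexOfStandalone(String phrase, String target) {
        Pattern p = Pattern.compile("(?<![a-z])" + Pattern.quote(target) + "(?![a-z])");
        Matcher m = p.matcher(phrase);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfStandalone("hi hello world", "hello"));  // 3
        System.out.println(indexOfStandalone("hihelloworld", "hello"));    // -1
        System.out.println(indexOfStandalone("othello, hello!", "hello")); // 9
    }
}
```

The lookarounds are zero-width, so the reported index points at the target itself and not at the neighbouring characters.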

Regex back reference to match a number (or any char sequence) with itself

I am missing something basic here. I have this regex (.*)=\1 and I am using it to match 100=100, and it's failing. When I remove the back reference from the regex and continue to use the capturing group, it shows that the captured group is '100'. Why does it not work when I try to use the back reference?
package test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegexTest {
public static void main(String[] args) {
String eqPattern = "(.*)=\1";
String input[] = {"100=100"};
testAndPrint(eqPattern, input); // this does not work
eqPattern = "(.*)=";
input = new String[]{"100=100"};
testAndPrint(eqPattern, input); // this works when the backreference is removed from the expr
}
static void testAndPrint(String regexPattern, String[] input) {
System.out.println("\n Regex pattern is "+regexPattern);
Pattern p = Pattern.compile(regexPattern, Pattern.CASE_INSENSITIVE);
boolean found = false;
for (String str : input) {
System.out.println("Testing "+str);
Matcher matcher = p.matcher(str);
while (matcher.find()) {
System.out.println("I found the text "+ matcher.group() +" starting at " + "index "+ matcher.start()+" and ending at index "+matcher.end());
found = true;
System.out.println("Group captured "+matcher.group(1));
}
if (!found) {
System.out.println("No match found");
}
}
}
}
When I run this, I get the following output
Regex pattern is (.*)=\1
Testing 100=100
No match found
Regex pattern is (.*)=
Testing 100=100
I found the text 100= starting at index 0 and ending at index 4
Group captured 100 --> If the group contains 100, why doesn't it match when I add \1 above?
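You have to escape the backslash in the pattern string. In a Java string literal, "\1" is an octal escape for the single character U+0001, so the regex engine never sees a backreference.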
String eqPattern = "(.*)=\\1";
I think you need to escape the backslash.
String eqPattern = "(.*)=\\1";
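The difference between the two literals can be checked directly. The sketch below (class and method names are illustrative only) prints the length of each literal and then uses the escaped pattern with matches().

```java
import java.util.regex.Pattern;

class BackreferenceEscapeDemo {
    // "\\1" in Java source is the two characters \ and 1,
    // which the regex engine reads as a backreference to group 1.
    static final Pattern SAME_BOTH_SIDES = Pattern.compile("(.*)=\\1");

    // True if the text before '=' is repeated exactly after it.
    static boolean sameOnBothSides(String input) {
        return SAME_BOTH_SIDES.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println("\1".length());              // 1 (one control character)
        System.out.println("\\1".length());             // 2 (backslash and digit)
        System.out.println(sameOnBothSides("100=100")); // true
        System.out.println(sameOnBothSides("100=101")); // false
    }
}
```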

Regex for Words with Apostrophes (Java)

I am trying to figure out the regex to match strings that contain only letters and apostrophes. If a string contains an apostrophe, I only want to match it if there is a letter on both sides of it.
What I have so far is [a-zA-Z]+('[a-zA-Z])?
I want to match strings like:
a'a
aa'a
a'aaa
But not:
bb'
'bb
You're almost there; you just need to add + after the character class inside the optional group.
^[a-zA-Z]+('[a-zA-Z]+)?$
OR
Use this if you want to deal with more than one apostrophe.
^[a-zA-Z]+(?:'[a-zA-Z]+)*$
DEMO
String s = "a'a'a'a a' a'a-'bb";
String parts[] = s.split("[ -]");
for(String i:parts) {
if(!i.isEmpty())
{
System.out.println(i + " => " + i.matches("[a-zA-Z]+(?:'[a-zA-Z]+)*"));
}
}
Output:
a'a'a'a => true
a' => false
a'a => true
'bb => false
public static void main(String[] args) {
String s = "a'a'a";
Pattern pattern = Pattern.compile("^[a-zA-Z]+(?:'[a-zA-Z]+)*$");
Matcher matcher = pattern.matcher(s);
if (matcher.matches()) {
System.out.println("true");
} else {
System.out.println("false");
}
}
Output:
true
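For comparison, the following sketch runs the question's original pattern and the repeated-group pattern against the sample strings with String.matches(), which requires a full match. The class name ApostropheWordCheck and the extra input a''a are assumptions added for illustration.

```java
class ApostropheWordCheck {
    // Pattern from the question: allows only ONE letter after the apostrophe.
    static final String ORIGINAL = "[a-zA-Z]+('[a-zA-Z])?";
    // Pattern from the answer: one or more letters after each apostrophe.
    static final String FIXED = "[a-zA-Z]+(?:'[a-zA-Z]+)*";

    static boolean isWord(String s) {
        return s.matches(FIXED);
    }

    public static void main(String[] args) {
        String[] samples = {"a'a", "aa'a", "a'aaa", "bb'", "'bb", "a''a"};
        for (String s : samples) {
            System.out.println(s + " => original: " + s.matches(ORIGINAL)
                    + ", fixed: " + isWord(s));
        }
    }
}
```

Only a'aaa differs between the two patterns: the original rejects it because a single letter is allowed after the apostrophe, while the fixed pattern accepts it.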

How to get the character length of a Unicode string, including spaces, in Java

I need to find the length of my string "பாரதீய ஜனதா இளைஞர் அணி தலைவர் அனுராக்சிங் தாகூர் எம்.பி. நேற்று தேர்தல் ஆணையர் வி.சம்பத்". I got a string length of 45, but I expect it to be 59. I need to extend the regular expression so that it also counts spaces and dots (.). My code:
import java.util.*;
import java.lang.*;
import java.util.regex.*;
class UnicodeLength
{
public static void main (String[] args)
{
String s="பாரதீய ஜனதா இளைஞர் அணி தலைவர் அனுராக்சிங் தாகூர் எம்பி நேற்று தேர்தல் ஆணையர் விசம்பத்";
List<String> characters=new ArrayList<String>();
Pattern pat = Pattern.compile("\\p{L}\\p{M}*");
Matcher matcher = pat.matcher(s);
while (matcher.find()) {
characters.add(matcher.group());
}
// Test if we have the right characters and length
System.out.println(characters);
System.out.println("String length: " + characters.size());
}
}
The code below worked for me. There were three issues that I fixed:
I added a check for spaces to your regular expression.
I added a check for punctuation to your regular expression.
I pasted the string from your comment into the string in your code. They weren't the same!
Here's the code:
public static void main(String[] args) {
String s = "பாரதீய ஜனதா இளைஞர் அணி தலைவர் அனுராக்சிங் தாகூர் எம்.பி. நேற்று தேர்தல் ஆணையர் வி.சம்பத்";
List<String> characters = new ArrayList<String>();
Pattern pat = Pattern.compile("\\p{P}|\\p{L}\\p{M}*| ");
Matcher matcher = pat.matcher(s);
while (matcher.find()) {
characters.add(matcher.group());
}
// Test if we have the right characters and length
int i = 1;
for (String character : characters) {
System.out.println(String.format("%d = [%s]", i++, character));
}
System.out.println("Characters Size: " + characters.size());
}
It's probably worth pointing out that your code is remarkably similar to the solution for this Stack Overflow question. One comment on that solution in particular led me to discover the missing check for punctuation in your code and allowed me to notice that the string from your comment didn't match the string in your code.
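To see how the counting differs from String.length(), here is a small sketch using the same regular expression on a short fragment of the string, the part that reads "எம்.பி.". The fragment is written with Unicode escapes so the code units are visible; the class ClusterCount and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ClusterCount {
    // A punctuation character, OR a letter followed by any combining marks, OR a space.
    static final Pattern CLUSTER = Pattern.compile("\\p{P}|\\p{L}\\p{M}*| ");

    // Splits the text into the units matched by CLUSTER.
    static List<String> clusters(String s) {
        List<String> result = new ArrayList<>();
        Matcher m = CLUSTER.matcher(s);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Two letters, a letter with a virama, a dot, a letter with a vowel sign, a dot.
        String s = "\u0B8E\u0BAE\u0BCD.\u0BAA\u0BBF.";
        System.out.println(s.length());          // 7 UTF-16 code units
        System.out.println(clusters(s).size());  // 5 user-visible units
    }
}
```

The two combining marks are folded into the letters they follow, which is why the regex-based count is lower than length().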

How do you replace groups in a regular expression?

How, exactly, do you replace groups while appending them to a string buffer?
For Example:
(a)(b)(c)
How can you replace group 1 with d, group 2 with e and so on?
I'm working with the Java regex engine.
Thanks in advance.
You could use Matcher's appendReplacement.
Here is an example using:
input: "hello bob How is your cat?"
regular expression: "(bob|cat)"
output: "hello alice How is your dog?"
public static void main(String[] args) {
Pattern p = Pattern.compile("(bob|cat)");
Matcher m = p.matcher("hello bob How is your cat?");
StringBuffer s = new StringBuffer();
while (m.find()) {
m.appendReplacement(s, doReplace(m.group(1)));
}
m.appendTail(s);
System.out.println(s.toString());
}
public static String doReplace(String s) {
if(s.equals("bob")) {
return "alice";
}
if(s.equals("cat")) {
return "dog";
}
return "";
}
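A variation on the same loop, sketched under the assumption that the replacements live in a Map: the lookup replaces the chain of if statements, and Matcher.quoteReplacement keeps any $ or \ in a replacement from being interpreted as a group reference. The class MapReplaceDemo and the "$dog" value are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MapReplaceDemo {
    // Replaces every match of p (which must have a group 1) by looking up group 1 in the map.
    static String replaceMatches(Pattern p, String input, Map<String, String> replacements) {
        Matcher m = p.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Fall back to the matched text itself when the map has no entry.
            String replacement = replacements.getOrDefault(m.group(1), m.group(1));
            // quoteReplacement makes '$' and '\' in the replacement literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new HashMap<>();
        replacements.put("bob", "alice");
        replacements.put("cat", "$dog");
        Pattern p = Pattern.compile("(bob|cat)");
        // prints: hello alice How is your $dog?
        System.out.println(replaceMatches(p, "hello bob How is your cat?", replacements));
    }
}
```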
You could use Matcher#start(group) and Matcher#end(group) to build a generic replacement method:
public static String replaceGroup(String regex, String source, int groupToReplace, String replacement) {
return replaceGroup(regex, source, groupToReplace, 1, replacement);
}
public static String replaceGroup(String regex, String source, int groupToReplace, int groupOccurrence, String replacement) {
Matcher m = Pattern.compile(regex).matcher(source);
for (int i = 0; i < groupOccurrence; i++)
if (!m.find()) return source; // pattern not met, may also throw an exception here
return new StringBuilder(source).replace(m.start(groupToReplace), m.end(groupToReplace), replacement).toString();
}
public static void main(String[] args) {
// replace with "%" what was matched by group 1
// input: aaa123ccc
// output: %123ccc
System.out.println(replaceGroup("([a-z]+)([0-9]+)([a-z]+)", "aaa123ccc", 1, "%"));
// replace with "!!!" what was matched the 4th time by the group 2
// input: a1b2c3d4e5
// output: a1b2c3d!!!e5
System.out.println(replaceGroup("([a-z])(\\d)", "a1b2c3d4e5", 2, 4, "!!!"));
}
Are you looking for something like this?
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Program1 {
public static void main(String[] args) {
Pattern p = Pattern.compile("(a)(b)(c)");
String str = "111abc222abc333";
String out = null;
Matcher m = p.matcher(str);
out = m.replaceAll("z$3y$2x$1");
System.out.println(out);
}
}
This gives 111zcybxa222zcybxa333 as output.
I guess you can see what this example does. But I think there's no ready-made built-in method through which you can say, for example:
- replace group 3 with zzz
- replace group 2 with yyy
- replace group 1 with xxx
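A small helper can get close to that, though. The sketch below replaces each capturing group of the first match with its own string by using Matcher.start(group) and Matcher.end(group). It assumes the groups are numbered left to right and do not nest or overlap; the class PerGroupReplace and the method replaceGroups are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PerGroupReplace {
    // Replaces capturing group i of the first match with replacements[i - 1].
    // Assumption: groups appear left to right and are neither nested nor overlapping.
    static String replaceGroups(String regex, String source, String... replacements) {
        Matcher m = Pattern.compile(regex).matcher(source);
        if (!m.find()) {
            return source; // pattern not found, leave the input unchanged
        }
        StringBuilder sb = new StringBuilder(source);
        // Work from the last group to the first so earlier offsets stay valid.
        for (int g = Math.min(m.groupCount(), replacements.length); g >= 1; g--) {
            if (m.start(g) >= 0) { // skip groups that did not take part in the match
                sb.replace(m.start(g), m.end(g), replacements[g - 1]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceGroups("(a)(b)(c)", "111abc222", "d", "e", "f")); // 111def222
        System.out.println(replaceGroups("([a-z]+)([0-9]+)([a-z]+)", "aaa123ccc", "%", "456", "zz")); // %456zz
    }
}
```

Replacing from right to left is the design choice that matters here: a replacement of a different length shifts everything after it, but never the offsets of groups that start earlier.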
