I'm coming back to regex after a long time, and I'm not sure whether the issue is with my regex or with my logic.
String test = "project/components/content;contentLabel|contentDec";
String regex = "(([A-Za-z0-9-/]*);([A-Za-z0-9]*))";
Map<Integer, String> matchingGroups = new HashMap<>();
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(test);
//System.out.println("Input: " + test + "\n");
//System.out.println("Regex: " + regex + "\n");
//System.out.println("Matcher Count: " + matcher.groupCount() + "\n");
if (matcher != null && matcher.find()) {
for (int i = 0; i < matcher.groupCount(); i++) {
System.out.println(i + " -> " + matcher.group(i) + "\n");
}
}
I was expecting the above to give me the output as below:
0 -> project/components/content;contentLabel|contentDec
1 -> project/components/content
2 -> contentLabel|contentDec
But when running the code the group extractions are off.
Any help would be really appreciated.
Thanks!
You have a few issues:
You're missing | in your second character class.
You have an unnecessary capture group around the whole regex.
When outputting the groups, you need to loop with i <= matcher.groupCount(). groupCount() does not count group 0, which is reserved for the whole match, so your capture groups are in group(1) and group(2).
This will work:
String test = "project/components/content;contentLabel|contentDec";
String regex = "([A-Za-z0-9-/]*);([A-Za-z0-9|]*)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(test);
if (matcher != null && matcher.find()) {
for (int i = 0; i <= matcher.groupCount(); i++) {
System.out.println(i + " -> " + matcher.group(i) + "\n");
}
}
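To make the off-by-one visible, here is a minimal sketch that collects group 0 plus every capture group into a list. The class and method names (GroupCountDemo, groups) are made up for illustration. The question's pattern has three capture groups, so its groupCount() is 3 and the `<` loop stopped before the last one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCountDemo {
    // Hypothetical helper: returns group 0 (the whole match) followed by
    // every capture group, or an empty list if nothing matches.
    static List<String> groups(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        if (matcher.find()) {
            // groupCount() excludes group 0, so the upper bound is inclusive.
            for (int i = 0; i <= matcher.groupCount(); i++) {
                result.add(matcher.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String test = "project/components/content;contentLabel|contentDec";
        List<String> found = groups("([A-Za-z0-9-/]*);([A-Za-z0-9|]*)", test);
        for (int i = 0; i < found.size(); i++) {
            System.out.println(i + " -> " + found.get(i));
        }
    }
}
```

Running main should print the three lines the question expected (group 0, then the path, then the labels).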
I have an input that looks like this : 0; expires=2016-12-27T16:52:39
I am trying extract from this only the date, using Pattern and Matcher.
private String extractDateFromOutput(String result) {
Pattern p = Pattern.compile("(expires=)(.+?)(?=(::)|$)");
Matcher m = p.matcher(result);
while (m.find()) {
System.out.println("group 1: " + m.group(1));
System.out.println("group 2: " + m.group(2));
}
return result;
}
Why does this matcher find more than one group? The output is as follows:
group 1: expires=
group 2: 2016-12-27T17:04:39
How can I get only group 2 out of this?
Thank you!
Because you have used more than one capturing group in your regex.
Pattern p = Pattern.compile("expires=(.+?)(?=::|$)");
Just remove the capturing groups around expires= and ::, so that the date is the only captured group:
private String extractDateFromOutput(String result) {
Pattern p = Pattern.compile("expires=(.+?)(?=::|$)");
Matcher m = p.matcher(result);
while (m.find()) {
System.out.println("group 1: " + m.group(1));
// there is no group 2 any more; accessing it throws an IndexOutOfBoundsException
//System.out.println("group 2: " + m.group(2));
}
return result;
}
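Note that the method above still returns the whole input. If the goal is to return only the date, a minimal sketch could look like the following. The class name ExpiryExtractor, the static extractDate method, and the choice to return null when nothing matches are my assumptions, not part of the question. The second input in main, with a "::" suffix, is also made up to exercise the lookahead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExpiryExtractor {
    // Only the date is captured; "expires=" and the "::" lookahead are not.
    private static final Pattern EXPIRES = Pattern.compile("expires=(.+?)(?=::|$)");

    // Returns the text after "expires=", or null if the input has no such field
    // (returning null is an assumption made for this sketch).
    static String extractDate(String result) {
        Matcher m = EXPIRES.matcher(result);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractDate("0; expires=2016-12-27T16:52:39"));
        // Hypothetical input with a "::" delimiter after the date.
        System.out.println(extractDate("0; expires=2016-12-27T16:52:39::more"));
    }
}
```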
Pattern r = Pattern.compile("(\\w+)\\s+(\\w+)\\s+(\\w+)\\s*");
Matcher m = r.matcher(line);
if (m.find()) {
System.out.println("Found: " + m.group(2));
} else {
System.out.println("Not found");
}
When I use this input:
HEY_YO NICE GUYHERE
the output is: Not found.
How can I match a string that contains an underscore (_)?
Input:
HEY_YO NICE GUYHERE
And I want this output:
Found: HEY_YO
I think you are not actually passing "HEY_YO NICE GUYHERE" as input, because for that input your code prints "Found: NICE" (\w already matches the underscore). To get the output you want,
replace
System.out.println("Found: " + m.group(2));
with
System.out.println("Found: " + m.group(1));
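As a quick check that \w covers the underscore, here is a small sketch that runs the question's pattern against the sample line and returns a chosen group. The class name WordCharDemo and the helper group(line, index) are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordCharDemo {
    // Same pattern as in the question: three words separated by whitespace.
    private static final Pattern WORDS =
            Pattern.compile("(\\w+)\\s+(\\w+)\\s+(\\w+)\\s*");

    // Hypothetical helper: returns the requested group, or null if the line does not match.
    static String group(String line, int index) {
        Matcher m = WORDS.matcher(line);
        return m.find() ? m.group(index) : null;
    }

    public static void main(String[] args) {
        String line = "HEY_YO NICE GUYHERE";
        // \w is [a-zA-Z_0-9] by default, so HEY_YO is matched as a single word.
        System.out.println("Found: " + group(line, 1));
        System.out.println("Found: " + group(line, 2));
    }
}
```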
When using matcher.find(), you can write the pattern so that it matches only the part you want to capture:
public static void main(String[] args) {
String line = "HEY_YO NICE GUYHERE";
Pattern r = Pattern.compile("[a-zA-Z]+_[a-zA-Z]+");
Matcher m = r.matcher(line);
if (m.find()) {
System.out.println("Found: " + m.group(0));
} else {
System.out.println("Not found");
}
}
Output:
Found: HEY_YO
My input string is:
subtype=forward,level=notice,vd=root,srcip=10.100.1.121,srcport=55844,srcintf=port1,dstip=173.193.156.43,dstport=80,dstintf=port16,sessionid=1224203695,status=close
subtype=forward,level=notice,vd=root,srcip=10.100.1.121,srcport=55844,srcintf=port1,dstip=173.193.156.43,dstport=80,dstintf=port16,sessionid=1224203695,status=open
This is the code I tried:
Pattern patt = Pattern.compile("(srcip=(?:\\d+\\.)+\\d+)(?:.*)?(dstip=(?:\\d+\\.)+\\d+)(?:.*)?(status=(?=.*close.*)(?:.*)?(dstport=(\\d+))");
BufferedReader r = new BufferedReader(new FileReader("ttext.txt"));
// For each line of input, try matching in it.
String line;
while ((line = r.readLine()) != null) {
    // For each match in the line, extract and print it.
    Matcher m = patt.matcher(line);
    while (m.find()) {
        // Simplest method:
        System.out.print(" " + m.group(1) + " ");
        System.out.print(" " + m.group(2) + " ");
        System.out.print(" " + m.group(3) + " ");
        System.out.println(" " + m.group(4));
    }
}
The expected output was:
srcip=10.100.1.121 dstip=173.193.156.43 srcport=55844 status=close dstport=80
But this is the output I get:
srcip=10.100.1.121 dstip=173.193.156.43 dstip=173.193.156.43 dstport=80
srcip=10.100.1.121 dstip=173.193.156.43 dstip=173.193.156.43 dstport=80
Any suggestions?
The order of your capturing groups doesn't correspond to the order of the fields in the input. Here is a re-ordered version of your regex that catches the 5 groups you need:
String s = "subtype=forward,level=notice,vd=root,srcip=10.100.1.121,srcport=55844,srcintf=port1,dstip=173.193.156.43,dstport=80,dstintf=port16,sessionid=1224203695,status=close";
Pattern p = Pattern.compile("(srcip=(?:\\d+\\.)+\\d+)(?:.*)?(srcport=(?:\\d+))(?:.*)?(dstip=(?:\\d+\\.)+\\d+)(?:.*)?(dstport=(?:\\d+))(?:.*)?(status=(?:.*))");
Matcher m = p.matcher(s);
if (m.find()) {
System.out.println(m.group(1) + " " + m.group(3) + " " + m.group(2) + " " + m.group(5) + " " + m.group(4));
}
Note that I also fixed an unclosed group and made one of the groups non-capturing.
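If the field order in the log can vary, a different technique may be easier to maintain than one long regex: match each key=value pair on its own and look fields up by name. This is a sketch of that alternative, not the approach from the answer above. The names LogFieldParser, parse and summarize are invented here. Skipping lines whose status is not close is my reading of the lookahead in the question, and the sketch assumes values never contain commas.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogFieldParser {
    // One key=value pair; the value runs up to the next comma (assumed separator).
    private static final Pattern FIELD = Pattern.compile("(\\w+)=([^,]*)");

    // Collects every key=value pair of a line, preserving input order.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    // Returns the five fields in the order the question expects,
    // or null when status is not "close" (an assumption of this sketch).
    static String summarize(String line) {
        Map<String, String> f = parse(line);
        if (!"close".equals(f.get("status"))) {
            return null;
        }
        return "srcip=" + f.get("srcip") + " dstip=" + f.get("dstip")
                + " srcport=" + f.get("srcport") + " status=" + f.get("status")
                + " dstport=" + f.get("dstport");
    }

    public static void main(String[] args) {
        String s = "subtype=forward,level=notice,vd=root,srcip=10.100.1.121,srcport=55844,"
                + "srcintf=port1,dstip=173.193.156.43,dstport=80,dstintf=port16,"
                + "sessionid=1224203695,status=close";
        System.out.println(summarize(s));
    }
}
```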
I want to extract the srcport and dstport values from the input string. Here is the code I tried:
String input = "2014<>10.100.2.3<><189>date=2014-01-16,time=11:26:14,devname=B3909601569,devid=B3909601569,logid=000013,type=traffic,srcip=192.168.192.123,srcport=2072,srcintf=port2,dstip=10.180.1.105,dstport=3206,dstintf=port1,sessionid=121543,status=close,policyid=196,service=MYSQL,proto=6,duration=10,sentbyte=3910,rcvdbyte=175085,sentpkt=74,rcvdpkt=132";
Pattern p = Pattern.compile("(srcport=)(\\d+).[\\s]?(dstport=)(\\d+)");
Matcher m = p.matcher(input);
StringBuffer result=new StringBuffer();
while (m.find()) {
System.out.println("Srcport: " + m.group(2) + " & ");
System.out.println("Dstport: " + m.group(4));
}
System.out.println(result);
But it's not showing any output. Is there a mistake in the regex
Pattern p = Pattern.compile("(srcport=)(\\d+).[\\s]?(dstport=)(\\d+)");
or in the println lines
System.out.println("Srcport: " + m.group(2) + " & ");
System.out.println("Dstport: " + m.group(4));
Any suggestions would be highly appreciated.
See the following changes to both the regex and the captured groups:
String input = "2014<>10.100.2.3<><189>date=2014-01-16,time=11:26:14,devname=B3909601569,devid=B3909601569,logid=000013,type=traffic,srcip=192.168.192.123,srcport=2072,srcintf=port2,dstip=10.180.1.105,dstport=3206,dstintf=port1,sessionid=121543,status=close,policyid=196,service=MYSQL,proto=6,duration=10,sentbyte=3910,rcvdbyte=175085,sentpkt=74,rcvdpkt=132";
Pattern p = Pattern.compile("srcport=(\\d+).*?dstport=(\\d+)"); // update regex
Matcher m = p.matcher(input);
StringBuffer result=new StringBuffer();
while (m.find()) {
System.out.println("Srcport: " + m.group(1)); //print groups 1 + 2
System.out.println("Dstport: " + m.group(2));
}
System.out.println(result);
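The reason the original pattern finds nothing is that ".[\\s]?" allows only one arbitrary character plus optional whitespace between the srcport digits and "dstport=", while the log has several whole fields in between. The sketch below compares both patterns on a trimmed copy of the log line. The class name PortGapDemo, its helpers and the shortened SAMPLE string are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PortGapDemo {
    // Trimmed-down version of the log line from the question.
    static final String SAMPLE =
            "srcip=192.168.192.123,srcport=2072,srcintf=port2,dstip=10.180.1.105,dstport=3206,dstintf=port1";

    static final String ORIGINAL = "(srcport=)(\\d+).[\\s]?(dstport=)(\\d+)";
    static final String FIXED = "srcport=(\\d+).*?dstport=(\\d+)";

    // True if the regex matches anywhere in the input.
    static boolean finds(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Returns {srcport, dstport} using the fixed pattern, or null if there is no match.
    static String[] ports(String input) {
        Matcher m = Pattern.compile(FIXED).matcher(input);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        System.out.println("original matches: " + finds(ORIGINAL, SAMPLE));
        System.out.println("fixed matches:    " + finds(FIXED, SAMPLE));
        String[] p = ports(SAMPLE);
        System.out.println("Srcport: " + p[0] + ", Dstport: " + p[1]);
    }
}
```

The lazy ".*?" skips over the fields in between and stops at the first "dstport=" it can reach.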
You forgot to use alternation (|) in your regex:
srcport=(\\d+)|dstport=(\\d+)
Your code would be
while (m.find())
{
    if (m.group().startsWith("srcport"))
        System.out.println("Srcport: " + m.group(1) + " & ");
    else
        // the dstport digits are the second capture group; group(1) is null for this branch
        System.out.println("Dstport: " + m.group(2));
}
Try this:
Pattern p = Pattern.compile("srcport=(\\d+)|dstport=(\\d+)");
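One thing to keep in mind with an alternation like this: capture groups are numbered by the position of their opening parenthesis in the whole pattern, so each match fills exactly one of the two groups and leaves the other null. Here is a small sketch of that behaviour. The class name AlternationGroupsDemo, the describe helper and the shortened input are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationGroupsDemo {
    private static final Pattern PORTS = Pattern.compile("srcport=(\\d+)|dstport=(\\d+)");

    // Hypothetical helper: one description per match, in input order.
    static List<String> describe(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = PORTS.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                // Left branch matched: group 1 holds the digits, group 2 is null.
                out.add("Srcport: " + m.group(1));
            } else {
                // Right branch matched: group 2 holds the digits, group 1 is null.
                out.add("Dstport: " + m.group(2));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Shortened input, assumed for this sketch.
        for (String s : describe("srcport=2072,srcintf=port2,dstport=3206")) {
            System.out.println(s);
        }
    }
}
```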
Try the code below. I have run it on my system and it works fine.
String input = "2014<>10.100.2.3<><189>date=2014-01-16,time=11:26:14,devname=B3909601569,devid=B3909601569,logid=000013,type=traffic,srcip=192.168.192.123,srcport=2072,srcintf=port2,dstip=10.180.1.105,dstport=3206,dstintf=port1,sessionid=121543,status=close,policyid=196,service=MYSQL,proto=6,duration=10,sentbyte=3910,rcvdbyte=175085,sentpkt=74,rcvdpkt=132";
Pattern p = Pattern.compile("(srcport=)(\\d+)((.*)?)(dstport=)(\\d+)(\\.)*");
Matcher m = p.matcher(input);
StringBuffer result=new StringBuffer();
while (m.find()) {
System.out.println(m.group());
System.out.println("Srcport: " + m.group(2) );
System.out.println("Dstport: " + m.group(6));
}