Need help with using regular expression in Java - java

I am trying to match a pattern like '#(a-zA-Z0-9)+' but not one like 'abc#test'.
So this is what I tried:
Pattern MY_PATTERN
= Pattern.compile("\\s#(\\w)+\\s?");
String data = "abc#gere.com #gogasig #jytaz #tibuage";
Matcher m = MY_PATTERN.matcher(data);
StringBuffer sb = new StringBuffer();
boolean result = m.find();
while(result) {
System.out.println (" group " + m.group());
result = m.find();
}
But I only see '#jytaz' and not '#tibuage'.
How can I fix my problem? Thank you.

This pattern should work: \B(#\w+)
The \B asserts a non-word boundary in front of the #, so a # that directly follows a word character (as in abc#gere) is not matched. The \w+ already stops before the trailing space. I've also moved the parentheses so that the # and the + are inside the group. Use m.group(1) to get the tag.
Here's the rewrite:
Pattern pattern = Pattern.compile("\\B(#\\w+)");
String data = "abc#gere.com #gogasig #jytaz #tibuage";
Matcher m = pattern.matcher(data);
while (m.find()) {
System.out.println(" group " + m.group(1));
}
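To see why \B rejects abc#gere but accepts the other tags, here is a minimal self-contained sketch. The class name HashtagBoundaryDemo and the method findTags are made up for illustration; only the pattern and the sample string come from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HashtagBoundaryDemo {
    // \B matches only where there is NO word boundary. Between 'c' and '#'
    // in "abc#gere" there is a boundary (word char next to non-word char),
    // so that '#' is skipped. Between ' ' and '#' there is no boundary.
    private static final Pattern TAG = Pattern.compile("\\B(#\\w+)");

    // Hypothetical helper: collects group 1 of every match.
    static List<String> findTags(String data) {
        List<String> tags = new ArrayList<>();
        Matcher m = TAG.matcher(data);
        while (m.find()) {
            tags.add(m.group(1));
        }
        return tags;
    }

    public static void main(String[] args) {
        System.out.println(findTags("abc#gere.com #gogasig #jytaz #tibuage"));
    }
}
```

Because the pattern no longer consumes the whitespace around a tag, consecutive tags separated by a single space are all found.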

Related

Get text in the URL with dynamic date - Regex Java

I need to extract, in Java, the path segment that follows the date in a URL.
Input 1:
/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/
Output: testcustomer
Only /raw/ stays constant; the date and the customer name (testcustomer here) will change.
Input 2:
/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/
Output: newcustomer
String url = "/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/";
String customer = getCustomer(url);
public String getCustomer (String _url){
String source = "default";
String regex = basePath + "/raw/\\d{4}-\\d{2}-\\d{2}/usr*";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(_url);
if (m.find()) {
source = m.group(1);
} else {
logger.error("Cant get customer with regex " + regex);
}
return source;
}
It's returning 'default' :(
Your regex /raw/\\d{4}-\\d{2}-\\d{2}/usr* has no capturing group for the value you want, and it requires the text right after the date to begin with us, so it never matches. You need a regex that matches the date and captures what comes next.
Either /\w*/raw/[0-9-]+/(\w+)/.* or (?<=raw\/\d{4}-\d{2}-\d{2}\/)(\w+) will work:
Pattern p = Pattern.compile("/\\w*/raw/[0-9-]+/(\\w+)/.*");
Matcher m = p.matcher(str);
if (m.find()) {
String value = m.group(1);
System.out.println(value);
}
Or, if it's always the fourth path segment, use split() (index 0 is the empty string before the leading slash):
String value = str.split("/")[4];
System.out.println(value);
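Putting the fix back into the shape of the question's method, a minimal sketch could look like the following. It assumes the customer is the single path segment right after /raw/ and the date; the class name CustomerFromUrl is made up, and the logger and basePath from the question are left out to keep the example self-contained.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CustomerFromUrl {
    // Assumption: the customer is the path segment directly after
    // "/raw/<yyyy-MM-dd>/". The group ([^/]+) captures that segment.
    private static final Pattern CUSTOMER =
            Pattern.compile("/raw/\\d{4}-\\d{2}-\\d{2}/([^/]+)");

    static String getCustomer(String url) {
        Matcher m = CUSTOMER.matcher(url);
        // Fall back to "default" when nothing matches, as the question's code does.
        return m.find() ? m.group(1) : "default";
    }

    public static void main(String[] args) {
        System.out.println(getCustomer("/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/"));
        System.out.println(getCustomer("/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/"));
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call.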
Here, we can use raw followed by the date as a left boundary and collect the desired output in a capturing group. After the group we match a slash and consume the rest of the string, with an expression similar to:
.+raw\/[0-9]{4}-[0-9]{2}-[0-9]{2}\/(.+?)\/.+
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = ".+raw\\/[0-9]{4}-[0-9]{2}-[0-9]{2}\\/(.+?)\\/.+";
final String string = "/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/\n"
+ "/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
RegEx
If this expression wasn't desired or you wish to modify it, please visit regex101.com.

RegEx to extract text between tags in Java

I need to extract the values after :70: in the following text file using RegEx. Value may contain line breaks as well.
My current solution is to extract the string between :70: and the next colon, but this always returns only one match: the whole text between the first :70: and the last colon.
:32B:xxx,
:59:yyy
something
:70:ACK1
ACK2
:21:something
:71A:something
:23E:something
value
:70:ACK2
ACK3
:71A:something
How can I achieve this using Java? Ideally I want to iterate through all values, i.e.
ACK1\nACK2,
ACK2\nACK3
Thanks :)
Edit: What I'm doing right now,
Pattern pattern = Pattern.compile("(?<=:70:)(.*)(?=\n)", Pattern.DOTALL);
Matcher matcher = pattern.matcher(data);
while (matcher.find()) {
System.out.println(matcher.group());
}
Try this.
String data = ""
+ ":32B:xxx,\n"
+ ":59:yyy\n"
+ "something\n"
+ ":70:ACK1\n"
+ "ACK2\n"
+ ":21:something\n"
+ ":71A:something\n"
+ ":23E:something\n"
+ "value\n"
+ ":70:ACK2\n"
+ "ACK3\n"
+ ":71A:something\n";
Pattern pattern = Pattern.compile(":70:(.*?)\\s*:", Pattern.DOTALL);
Matcher matcher = pattern.matcher(data);
while (matcher.find())
System.out.println("found="+ matcher.group(1));
result:
found=ACK1
ACK2
found=ACK2
ACK3
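The pattern above stops at the first colon, so a value that itself contains a colon would be cut short. A possible variant is sketched below. It assumes every tag starts at the beginning of a line, consists of digits and uppercase letters between two colons, and that lines end with \n; none of this is stated in the question. The class name Tag70Extractor and the method extract are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Tag70Extractor {
    // (?m) lets ^ match at line starts; (?s) lets . match line breaks.
    // The lazy group stops right before the next line that starts with a
    // tag such as ":21:" or ":71A:", or before trailing whitespace at the
    // very end of the input.
    private static final Pattern FIELD_70 =
            Pattern.compile("(?ms)^:70:(.*?)(?=\\n:[0-9A-Z]+:|\\s*\\z)");

    static List<String> extract(String data) {
        List<String> values = new ArrayList<>();
        Matcher m = FIELD_70.matcher(data);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String data = ":59:yyy\nsomething\n:70:ACK1\nACK2\n:21:something\n"
                + ":70:ACK2\nACK3\n:71A:something\n";
        for (String value : extract(data)) {
            System.out.println("found=" + value);
        }
    }
}
```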
You need a loop to do this.
Pattern p = Pattern.compile(regexPattern);
List<String> list = new ArrayList<String>();
Matcher m = p.matcher(input);
while (m.find()) {
list.add(m.group());
}
As seen in "Create array of regex matches".

Find multiple string matches using Java regex

I am trying to use regex to find a match for a string between Si and (P) or Si and (I).
Below is what I wrote. Why isn't it working and how do I fix it?
String Channel = "Si0/4(I) Si0/6( Si0/8K Si0/5(P)";
if (Channel.length() > 0) {
String pattern1 = "Si";
String pattern2 = "(P)";
String pattern3 = "(I)";
String P1 = Pattern.quote(pattern1) + "(.*?)[" + Pattern.quote(pattern2) + "|" + Pattern.quote(pattern3) + "]";
Pattern p = Pattern.compile(P1);
Matcher m = p.matcher(Channel);
while(m.find()){
if (m.group(1)!= null)
{
System.out.println(m.group(1));
}
else if (m.group(2)!= null)
{
System.out.println(m.group(2));
}
}
}
Expected output
0/4
0/5
Actual output
0/4
0/6
0/8K Si0/5
Use a lookbehind and a lookahead in your regex. You also need to add a space inside the negated character class so that the match cannot run across a space, as it otherwise would from 0/8K into Si0/5.
(?<=Si)[^\\( ]*(?=\\((?:P|I)\\))
String str="Si0/4(I) Si0/6( Si0/8K Si0/5(P)";
String regex="(?<=Si)[^\\( ]*(?=\\([PI]\\))";
Pattern pattern = Pattern.compile(regex);
Matcher matcher =pattern.matcher(str);
while(matcher.find()){
System.out.println(matcher.group(0));
}
Output:
0/4
0/5
You need to group your regex. It is currently
Si(.*?)[(P)|(I)]
Whereas it should be
Si(.*?)\(I\)|Si(.*?)\(P\)
See demo.
http://regex101.com/r/oO8zI4/8
[] means "any one of these characters", so every character inside the brackets is treated as a separate alternative, as if they were joined with OR.
If the result you're looking for is always of the form number/number,
you can use:
Si(\d+\/\d+)(?:\(P\)|\(I\))
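The character-class point can be checked directly. The sketch below first shows that [(P)|(I)] matches a single character, then uses a grouped alternative combined with the negated class from the first answer. That combined pattern is one possible blend of the answers above, not a pattern any of them gives verbatim, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ClassVsGroupDemo {
    // Inside [...] the characters ( ) | lose their special meaning, so this
    // class matches exactly ONE character out of: '(', 'P', ')', '|', 'I'.
    static final Pattern CHAR_CLASS = Pattern.compile("[(P)|(I)]");

    // A non-capturing group with alternation matches the whole token "(P)"
    // or "(I)". [^( ]* keeps the capture from running past '(' or a space.
    static final Pattern GROUPED = Pattern.compile("Si([^( ]*)\\((?:P|I)\\)");

    static List<String> channels(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = GROUPED.matcher(input);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(CHAR_CLASS.matcher("|").matches());   // single char from the class
        System.out.println(CHAR_CLASS.matcher("(P)").matches()); // three chars, not one
        System.out.println(channels("Si0/4(I) Si0/6( Si0/8K Si0/5(P)"));
    }
}
```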

Get an array of Strings matching a pattern from a String

I have a long string let's say
I like this #computer and I want to buy it from #XXXMall.
I know the regular expression pattern is
Pattern tagMatcher = Pattern.compile("[#]+[A-Za-z0-9-_]+\\b");
Now I want to get all the hashtags in an array. How can I use this expression to get an array of all hashtags from the string, something like
ArrayList hashtags = getArray(pattern, str)
You can write it like this:
private static List<String> getArray(Pattern tagMatcher, String str) {
Matcher m = tagMatcher.matcher(str);
List<String> l = new ArrayList<String>();
while(m.find()) {
String s = m.group(); //will give you "#computer"
s = s.substring(1); // will give you just "computer"
l.add(s);
}
return l;
}
You can also use \\w- instead of A-Za-z0-9-_ inside the character class, making the regex [#]+[\\w-]+\\b (\\w alone does not cover the hyphen).
This link would surely be helpful for achieving what you want.
It says:
The find() method searches for occurrences of the regular expressions
in the text passed to the Pattern.matcher(text) method, when the
Matcher was created. If multiple matches can be found in the text, the
find() method will find the first, and then for each subsequent call
to find() it will move to the next match.
The methods start() and end() will give the indexes into the text
where the found match starts and ends.
Example:
String text =
"This is the text which is to be searched " +
"for occurrences of the word 'is'.";
String patternString = "is";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);
int count = 0;
while(matcher.find()) {
count++;
System.out.println("found: " + count + " : "
+ matcher.start() + " - " + matcher.end());
}
You got the hint now.
Here is one way, using Matcher
Pattern tagMatcher = Pattern.compile("#+[-\\w]+\\b");
Matcher m = tagMatcher.matcher(stringToMatch);
ArrayList<String> hashtags = new ArrayList<>();
while (m.find()) {
hashtags.add(m.group());
}
I took the liberty of simplifying your regex. # does not need to be in a character class. [A-Za-z0-9_] is the same as \w, so [A-Za-z0-9-_] is the same as [-\w]
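As a quick check that the simplified pattern behaves like the original on the sample sentence, here is a small self-contained sketch. The class name HashtagCollector is made up; getArray mirrors the signature asked for in the question but keeps the leading # in each result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HashtagCollector {
    // Collects every full match of the given pattern, including the '#'.
    static List<String> getArray(Pattern tagMatcher, String str) {
        List<String> tags = new ArrayList<>();
        Matcher m = tagMatcher.matcher(str);
        while (m.find()) {
            tags.add(m.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        String str = "I like this #computer and I want to buy it from #XXXMall.";
        Pattern original = Pattern.compile("[#]+[A-Za-z0-9-_]+\\b");
        Pattern simplified = Pattern.compile("#+[-\\w]+\\b");
        System.out.println(getArray(original, str));
        System.out.println(getArray(simplified, str));
    }
}
```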
You can use :
String val="I like this #computer and I want to buy it from #XXXMall.";
String REGEX = "(?<=#)[A-Za-z0-9-_]+";
List<String> list = new ArrayList<String>();
Pattern pattern = Pattern.compile(REGEX);
Matcher matcher = pattern.matcher(val);
while(matcher.find()){
list.add(matcher.group());
}
(?<=#) is a positive lookbehind: it asserts that a # immediately precedes the match without making the # part of the match.
You can use the following code to get the names:
String saa = "#{akka}nikhil#{kumar}aaaaa";
Pattern regex = Pattern.compile("#\\{(.*?)\\}");
Matcher m = regex.matcher(saa);
while(m.find()) {
String s = m.group(1);
System.out.println(s);
}
It will print
akka
kumar

solve following regex

Please consider the following text :
String tempStr =
"$#<div style=\"text-align:left;\">$#Order-CAS No#$</div>$#abc#$";
Pattern p = Pattern.compile("(?<=\\$#)(\\w*)(?=#\\$)");
Matcher m = p.matcher(tempStr);
List<String> tokens = new ArrayList<String>();
while (m.find()) {
System.out.println("Found a " + m.group() + ".");
}
But it gives me just abc. I want the answers Order-CAS No and abc.
The expression \\w* does not match the hyphen or space. Try [\\w\\s-]* instead.
Pattern p = Pattern.compile("(?<=\\$#)([\\w\\s-]*)(?=#\\$)");
Read more about character classes here:
Character Classes or Character Sets
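A self-contained way to try the widened character class against the string from the question is sketched below. The class name TokenDemo and the method tokens are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenDemo {
    // Lookbehind for "$#", lookahead for "#$". The class [\w\s-] allows
    // word characters, whitespace and the hyphen (a hyphen placed last in
    // a character class is taken literally).
    private static final Pattern TOKEN =
            Pattern.compile("(?<=\\$#)([\\w\\s-]*)(?=#\\$)");

    static List<String> tokens(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String tempStr =
                "$#<div style=\"text-align:left;\">$#Order-CAS No#$</div>$#abc#$";
        System.out.println(tokens(tempStr));
    }
}
```

The opening $# before the div is skipped because < is not in the character class, so no #$ can be reached from there.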
I finally got a solution:
Pattern p = Pattern.compile("(?<=\\$#)([\\w-\\s\\w]*)(?=#\\$)");
