Regular expression to extract values in parentheses in Java - java

I need to extract the values of the string that looks like this:
nameClass (val1)(val2)
to have:
nameClass
val1
val2
The problem is that it must also be applicable to this:
nameClass
and
nameClass (val1)(val2)(val1)...(valn)
I tried to create the regex, but it only fits the
nameClass (val1)(val2)
variant. It looks like this (after being improved by Viorel Moraru):
String pattern = "((?:[a-z]+[A-Z][a-z]+))(([ |(]+)([-|+]?\\d+)([ |(|)]+)([-|+]?\\d+)([ |)]+))*";
How do I make the pattern apply to both
nameClass
and
nameClass (val1)(val2)(val1)...(valn)
?
Java Code:
String txt = "inputTestdata(12)(-13)";
String patern = "((?:[a-z]+[A-Z][a-z]+))([ |(]+)([-|+]?\\d+)([ |(|)]+)([-|+]?\\d+)([ |)]+)";
Pattern p = Pattern.compile(patern);
Matcher m = p.matcher(txt);
if (m.find())
{
    for (int i = 1; i < m.groupCount(); i ++)
    {
        System.out.print(m.group(i) + "\n");
    }
}

You can use this code:
String s = "nameClass(val1)(val2)(val3)";
// First pass: separate the leading name from the rest of the string.
Pattern p = Pattern.compile("^(\\w+) *(.*)$");
Matcher m = p.matcher(s);
String ps = "";
if (m.matches())
{
    ps = m.group(2);
    System.out.printf("Outside parentheses:<%s>%n", m.group(1));
}
// Second pass: pull out each parenthesized value.
Pattern p1 = Pattern.compile("\\(([^)]*)\\)");
Matcher m1 = p1.matcher(ps);
while (m1.find())
{
    System.out.printf("Inside parentheses:<%s>%n", m1.group(1));
}
OUTPUT:
Outside parentheses:<nameClass>
Inside parentheses:<val1>
Inside parentheses:<val2>
Inside parentheses:<val3>

Assuming that:
Your input does not need to be validated insofar as it starts with
nameClass
You want to sanitize the parentheses (as your question currently leads me to understand)
... why don't you just replace each parenthesized group with its content?
For instance:
Pattern p = Pattern.compile("\\((.+?)\\)");
String[] inputs = {"nameClass", "nameClass (var1)", "nameClass (var1) (var2)"};
Matcher m;
for (String input: inputs) {
    m = p.matcher(input);
    System.out.println("Input: " + input + " --> replacement: " + m.replaceAll("$1"));
    // resetting matcher after "replaceAll" and accessing values directly by group 1 reference
    m.reset();
    while (m.find()) {
        System.out.println("\tFound value: " + m.group(1));
    }
}
Output:
Input: nameClass --> replacement: nameClass
Input: nameClass (var1) --> replacement: nameClass var1
	Found value: var1
Input: nameClass (var1) (var2) --> replacement: nameClass var1 var2
	Found value: var1
	Found value: var2

I'm no regex expert but does this work?
\s*\w+\s*(\(\w+\))*
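One caveat with a single pattern like the one above: in Java, a capturing group under a quantifier such as (...)* keeps only its last repetition, so one match cannot return val1 through valn individually. Below is a minimal sketch of a two-step approach that covers all three input shapes from the question. The class and method names (ParenValues, extract) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParenValues {
    // Leading name: one or more word characters, optionally preceded by whitespace.
    private static final Pattern NAME = Pattern.compile("^\\s*(\\w+)");
    // One parenthesized value: everything between "(" and the next ")".
    private static final Pattern VALUE = Pattern.compile("\\(([^)]*)\\)");

    // Hypothetical helper: returns the name first, then every parenthesized value in order.
    static List<String> extract(String input) {
        List<String> parts = new ArrayList<>();
        Matcher name = NAME.matcher(input);
        if (!name.find()) {
            return parts; // no leading name, nothing to extract
        }
        parts.add(name.group(1));
        // Search for values only in the text that follows the name.
        Matcher value = VALUE.matcher(input).region(name.end(), input.length());
        while (value.find()) {
            parts.add(value.group(1));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(extract("nameClass"));
        System.out.println(extract("nameClass (val1)(val2)"));
        System.out.println(extract("inputTestdata(12)(-13)"));
    }
}
```

Because the values are collected with repeated find() calls rather than one repeated group, the number of parenthesized values does not need to be known in advance.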

Related

Get text in the URL with dynamic date - Regex Java

I need to extract, in Java, the path segment that follows the date in a URL.
Input 1:
/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/
Output: testcustomer
Only /raw/ stays constant; the date and testcustomer will change.
Input 2:
/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/
Output: newcustomer
String url = "/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/";
String customer = getCustomer(url);
public String getCustomer (String _url){
String source = "default";
String regex = basePath + "/raw/\\d{4}-\\d{2}-\\d{2}/usr*";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(_url);
if (m.find()) {
source = m.group(1);
} else {
logger.error("Cant get customer with regex " + regex);
}
return source;
}
It's returning 'default' :(
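Your regex /raw/\\d{4}-\\d{2}-\\d{2}/usr* is missing the part for the value you want: it expects /us right after the date, so it never matches, and it has no capturing group for m.group(1) to return. You need a regex that finds the date and captures what comes next:
/\w*/raw/[0-9-]+/(\w+)/.* or (?<=raw\/\d{4}-\d{2}-\d{2}\/)(\w+) will work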
Pattern p = Pattern.compile("/\\w*/raw/[0-9-]+/(\\w+)/.*");
Matcher m = p.matcher(str);
if (m.find()) {
String value = m.group(1);
System.out.println(value);
}
Or, if it's always the 4th part, use split():
String value = str.split("/")[4];
System.out.println(value);
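Putting this together, here is a minimal sketch of the asker's getCustomer with a capturing group added. The class name CustomerExtractor is hypothetical, the basePath prefix and the logger from the question are left out, and [^/]+ is an assumption that the customer segment may contain any character except a slash.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CustomerExtractor {
    // "/raw/", a yyyy-MM-dd date, then the wanted segment (group 1), then a slash.
    private static final Pattern CUSTOMER =
            Pattern.compile("/raw/\\d{4}-\\d{2}-\\d{2}/([^/]+)/");

    // Returns the segment after the date, or "default" when the URL does not match.
    static String getCustomer(String url) {
        Matcher m = CUSTOMER.matcher(url);
        return m.find() ? m.group(1) : "default";
    }

    public static void main(String[] args) {
        System.out.println(getCustomer("/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/"));
        System.out.println(getCustomer("/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/"));
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call.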
Here, we can use raw followed by the date as a left boundary, collect the desired output in a capturing group, then add a slash and consume the rest of the string, with an expression similar to:
.+raw\/[0-9]{4}-[0-9]{2}-[0-9]{2}\/(.+?)\/.+
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = ".+raw\\/[0-9]{4}-[0-9]{2}-[0-9]{2}\\/(.+?)\\/.+";
final String string = "/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/\n"
+ "/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
RegEx
If this expression wasn't desired or you wish to modify it, please visit regex101.com.

Java regex pattern matching not working for second occurrence

I am using java.util.regex to match a regular expression against a string. The string is basically an HTML string.
Within that string I have two lines:
<style>templates/style/color.css</style>
<style>templates/style/style.css</style>
My requirement is to get the content inside the style tag (<style>). I am using this pattern:
String stylePattern = "<style>(.+?)</style>";
When I try to get the result using:
Pattern styleRegex = Pattern.compile(stylePattern);
Matcher matcher = styleRegex.matcher(html);
System.out.println("Matcher count : "+matcher.groupCount()+ " and "+matcher.find()); //output 1
if(matcher.find()) {
System.out.println("Inside find");
for (int i = 0; i < matcher.groupCount(); i++) {
String matchSegment = matcher.group(i);
System.out.println(matchSegment); //output 2
}
}
The result I am getting from output 1 as :
Matcher count : 1 and true
And from output 2 as;
<style>templates/style/style.css</style>
Now I am lost after a lot of trying: how do I get both lines? I tried many other suggestions on Stack Overflow itself, but none worked.
I think I am making some conceptual mistake.
Any help would be appreciated. Thanks in advance.
EDIT
I have changed the code to:
Matcher matcher = styleRegex.matcher(html);
//System.out.println("find : "+matcher.find() + "Groupcount = " +matcher.groupCount());
//matcher.reset();
int i = 0;
while(matcher.find()) {
System.out.println(matcher.group(i));
i++;
}
Now the result is:
<style>templates/style/color.css</style>
templates/style/style.css
Why does one line include the style tag while the other does not?
This will find all occurrences in your string:
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group());
}
You can try this:
String text = "<style>templates/style/color.css</style>\n" +
"<style>templates/style/style.css</style>";
Pattern pattern = Pattern.compile("<style>(.+?)</style>");
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
System.out.println(text.substring(matcher.start(), matcher.end()));
}
Or:
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
System.out.println(matcher.group());
}
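Both symptoms in the question come from the state a Matcher keeps. Each find() call advances to the next match, so the find() inside the first println already consumed the color.css match before the if ran. In the edited loop, group(i) with an increasing i returns group(0) (the whole match, tags included) on the first pass and group(1) (the captured content) on the second. A minimal sketch that returns only the tag content for every match; the class and method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StyleContents {
    private static final Pattern STYLE = Pattern.compile("<style>(.+?)</style>");

    // Collects the text between <style> and </style> for every occurrence.
    static List<String> extract(String html) {
        List<String> contents = new ArrayList<>();
        Matcher matcher = STYLE.matcher(html);
        while (matcher.find()) {          // one find() call per match
            contents.add(matcher.group(1)); // always group 1: the captured content
        }
        return contents;
    }

    public static void main(String[] args) {
        String html = "<style>templates/style/color.css</style>\n"
                + "<style>templates/style/style.css</style>";
        extract(html).forEach(System.out::println);
    }
}
```

Note that groupCount() reports how many capturing groups the pattern has (1 here), not how many matches were found.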

How to compile different Patterns with String?

I need to define multiple Patterns to run against a String, and after running, it should give me everything in the string that matches the format of any of my Patterns. Here is the code:
String line = "This order was places for QT 30.00$ !OK ? ";
Pattern[] patterns = new Pattern[]{
Pattern.compile("\\d+[.,]\\d+.[$] ", Pattern.CASE_INSENSITIVE),
Pattern.compile("\\d:\\d\\d",Pattern.CASE_INSENSITIVE | Pattern.MULTILINE)
}; // Create a Pattern object
// Now create matcher object.
for (Pattern scriptPattern : patterns){
Matcher m = scriptPattern.matcher(line);
System.out.println(m.group());
} }
Is this what you are looking for?
private static Pattern[] patterns = new Pattern[]{
    Pattern.compile("Your pattern ", Pattern.CASE_INSENSITIVE),
    Pattern.compile("your pattern",Pattern.CASE_INSENSITIVE | Pattern.MULTILINE)
};
You can use this to go through the patterns and match them:
for (Pattern scriptPattern : patterns){
    Matcher m = scriptPattern.matcher(line);
    while (m.find()) {
        String d = m.group();
        if(d != null) {
            System.out.print(d);
        }
    }
}
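The code in the question fails because group() is called without a preceding find() or matches(), which throws IllegalStateException. A self-contained sketch of the loop above follows. The class name PatternScan is hypothetical, the price pattern is simplified to \d+[.,]\d+[$] as an assumption, and the sample line has a time appended so that both patterns have something to find.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternScan {
    private static final Pattern[] PATTERNS = {
        Pattern.compile("\\d+[.,]\\d+[$]"), // assumed price format, e.g. 30.00$
        Pattern.compile("\\d:\\d\\d")       // time format from the question, e.g. 2:37
    };

    // Returns every substring of line matched by any of the patterns, grouped by pattern.
    static List<String> findAll(String line) {
        List<String> hits = new ArrayList<>();
        for (Pattern pattern : PATTERNS) {
            Matcher m = pattern.matcher(line);
            while (m.find()) { // find() must succeed before group() is called
                hits.add(m.group());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(findAll("This order was places for QT 30.00$ !OK ? 2:37 "));
    }
}
```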
Using your original question prior to the edit, here are a few tweaks to make it work:
public static void main(String[] args) {
// String to be scanned to find the pattern.
String line = "This order was places for QT 30.00$ !OK ? 2:37 ";
String pattern = "(\\d+[.,]\\d+.[$])(.*)(\\d:\\d\\d)";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(line);
if (m.find()) {
System.out.println("Found value: " + m.group(1) + " with time: " + m.group(3));
}
}
Output:
Found value: 30.00$ with time: 2:37

regex capture groups returning as null after an OR operator

Matcher matcher = Pattern.compile("\\bwidth\\s*:\\s*(\\d+)px|\\bbackground\\s*:\\s*#([0-9A-Fa-f]+)").matcher(myString);
if (matcher.find()) {
System.out.println(matcher.group(2));
}
Example data:
myString = width:17px;background:#555;float:left; will produce null.
What I wanted:
matcher.group(1) = 17
matcher.group(2) = 555
I've just started using regex in Java, any help?
I would suggest splitting things up a bit.
Instead of building one large regex (maybe you want to add more rules to the string later?), you should split the string into multiple sections:
String myString = "width:17px;background:#555;float:left;";
String[] sections = myString.split(";"); // split string in multiple sections
for (String section : sections) {
// check if this section contains a width definition
if (section.matches("width\\s*:\\s*(\\d+)px.*")) {
System.out.println("width: " + section.split(":")[1].trim());
}
// check if this section contains a background definition
if (section.matches("background\\s*:\\s*#[0-9A-Fa-f]+.*")) {
System.out.println("background: " + section.split(":")[1].trim());
}
...
}
Here is a working example. Having | (or) in the regexp is usually confusing so I've added two more matchers to show how I would do it.
public static void main(String[] args) {
String myString = "width:17px;background:#555;float:left";
int matcherOffset = 1;
Matcher matcher = Pattern.compile("\\bwidth\\s*:\\s*(\\d+)px|\\bbackground\\s*:\\s*#([0-9A-Fa-f]+)").matcher(myString);
while (matcher.find()) {
System.out.println("found something: " + matcher.group(matcherOffset++));
}
matcher = Pattern.compile("width:(\\d+)px").matcher(myString);
if (matcher.find()) {
System.out.println("found width: " + matcher.group(1));
}
matcher = Pattern.compile("background:#([0-9A-Fa-f]+)").matcher(myString);
if (matcher.find()) {
System.out.println("found background: " + matcher.group(1));
}
}
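The reason group(2) is null in the question: with an alternation, each successful find() matches exactly one branch. The first match is the width branch, so only group 1 is set, and the group belonging to the other branch did not participate and is returned as null. A minimal sketch that loops over all matches and checks which group is set, instead of relying on the order of the properties; the class and method names are made up for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CssValues {
    // Same alternation as in the question: group 1 = width, group 2 = background.
    private static final Pattern PROP = Pattern.compile(
            "\\bwidth\\s*:\\s*(\\d+)px|\\bbackground\\s*:\\s*#([0-9A-Fa-f]+)");

    static Map<String, String> parse(String css) {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher m = PROP.matcher(css);
        while (m.find()) {
            if (m.group(1) != null) {
                values.put("width", m.group(1));      // width branch matched
            } else {
                values.put("background", m.group(2)); // background branch matched
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse("width:17px;background:#555;float:left;"));
    }
}
```

Unlike the incrementing matcherOffset above, this does not depend on width appearing before background in the input.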

Rules for using regular expressions in Java

I have this pattern:
host=([a-z0-9./:]*)
It finds the host address for me. And I have this content:
host=http//:sdf3452.domain.com/
And my code is:
Matcher m;
Pattern hostP = Pattern.compile("host=([a-z0-9./:]*)");
m=hostP.matcher(content);//string 1
String match = m.group();//string 2
Log.i("host", ""+hostP.matcher(content).find());
If I delete the lines marked "string 1" and "string 2", I see true in logcat. If I leave them as is, I get an exception saying no match was found.
I've tried all kinds of patterns. Looking at the m variable in the debugger, it finds no match. Please teach me how to use regular expressions!
Before you group() a match, you need to invoke find().
Try it like this:
Pattern hostP = Pattern.compile("host=([a-z0-9./:]*)");
Matcher m = hostP.matcher(content);
if(m.find()) {
String match = m.group();
// ...
}
EDIT
and a little demo that shows what each match-group contains:
Pattern p = Pattern.compile("host=([a-z0-9./:]*)");
Matcher m = p.matcher("host=http://sdf3452.domain.com/");
if (m.find()) {
for(int i = 0; i <= m.groupCount(); i++) {
System.out.printf("m.group(%d) = '%s'\n", i, m.group(i));
}
}
which will print:
m.group(0) = 'host=http://sdf3452.domain.com/'
m.group(1) = 'http://sdf3452.domain.com/'
As you can see, group(0), which is the same as group(), contains what the entire pattern matches.
But realize that a URL can contain much more than what you defined in [a-z0-9./:]*!
String content = "host=http://sdf3452.domain.com/";
Matcher mm;
Pattern hostP = Pattern.compile("host=([a-z0-9./:]*)");
mm = hostP.matcher(content);
String match = "";
if (mm.find()) { // call mm.find() first
    match = mm.group(1); // 1 is the index of the first pair of capturing parentheses
}
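To make the failure mode concrete, here is a small sketch that shows both behaviours side by side: group() on a fresh Matcher throws IllegalStateException, while the same call after a successful find() returns the match. The class and method names are made up for illustration, and returning null for "no match" is just one possible convention.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HostLookup {
    private static final Pattern HOST = Pattern.compile("host=([a-z0-9./:]*)");

    // Returns the captured host, or null when the content has no host= entry.
    static String findHost(String content) {
        Matcher m = HOST.matcher(content);
        return m.find() ? m.group(1) : null;
    }

    // Demonstrates the error from the question: no match has been attempted yet.
    static boolean groupBeforeFindThrows(String content) {
        try {
            HOST.matcher(content).group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String content = "host=http://sdf3452.domain.com/";
        System.out.println(groupBeforeFindThrows(content));
        System.out.println(findHost(content));
    }
}
```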
