Use variables in pattern - java

I need to get a word between two other words, and I'm using Pattern and Matcher.
Pattern p = Pattern.compile("Hello(.*?)GoodBye");
Matcher m = p.matcher(line);
In this example I'm getting the word between Hello and GoodBye, and it works.
What I want to do is replace Hello and GoodBye by variables such as:
String StartDelemiter = "Hello";
String EndDelemiter = "GoodBye";
How should I write it in Pattern p = Pattern.compile(---)? I tried:
Pattern p = Pattern.compile( "{ "+StartDelemiter +" (.*?) "+EndDelemiter+" }" );
But the application crashes!

You need to escape { and } with backslashes (doubled inside a Java string literal), something like:
Pattern p = Pattern.compile( "\\{ "+StartDelemiter +" (.*?) "+EndDelemiter+" \\}" );
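The curly braces are regex quantifiers; an unescaped { that does not start a valid quantifier makes Pattern.compile throw a PatternSyntaxException: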
<pattern>{n} Match exactly n times
<pattern>{n,} Match at least n times
<pattern>{n,m} Match at least n but not more than m times
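If the braces and spaces were never meant to be part of the match, a minimal sketch of building the pattern from variables could look like the following. It assumes the delimiters should be matched literally and uses Pattern.quote for that; the class name DelimiterDemo and the helper between are made up for illustration and are not from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelimiterDemo {
    // Hypothetical helper: returns the text between the first occurrence of
    // start and the nearest following end, or null if nothing matches.
    static String between(String start, String end, String line) {
        // Pattern.quote makes both delimiters match literally, even when they
        // contain regex metacharacters such as { } . or *
        Pattern p = Pattern.compile(Pattern.quote(start) + "(.*?)" + Pattern.quote(end));
        Matcher m = p.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String startDelimiter = "Hello";
        String endDelimiter = "GoodBye";
        System.out.println(between(startDelimiter, endDelimiter, "HelloWorldGoodBye"));
        // Braces as delimiters need no manual escaping here
        System.out.println(between("{", "}", "a {b} c"));
    }
}
```

With this approach the delimiters can come from user input without any manual escaping.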

Related

Using Regular Expression in Java to extract information from a String

I have one input String like this:
"I am Duc/N Ta/N Van/N"
The suffix "/N" marks a word as part of a person's name.
The expected output is:
Name: Duc Ta Van
How can I do it by using regular expression?
You can use Pattern and Matcher like this:
String input = "I am Duc/N Ta/N Van/N";
Pattern pattern = Pattern.compile("([^\\s]+)/N");
Matcher matcher = pattern.matcher(input);
String result = "";
while (matcher.find()) {
    result += matcher.group(1) + " ";
}
System.out.println("Name: " + result.trim());
Output
Name: Duc Ta Van
Another solution, using Java 9+
From Java 9 onward you can use Matcher::results like this (it needs import java.util.stream.Collectors):
String input = "I am Duc/N Ta/N Van/N";
String regex = "([^\\s]+)/N";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
String result = matcher.results().map(s -> s.group(1)).collect(Collectors.joining(" "));
System.out.println("Name: " + result); // Name: Duc Ta Van
Here is the regex to use to capture every "name" followed by /N:
(\w+)\/N
Now you just need to loop over every match in that string and concatenate them to get the result:
String pattern = "(\\w+)\\/N";
String test = "I am Duc/N Ta/N Van/N";
Matcher m = Pattern.compile(pattern).matcher(test);
StringBuilder sbNames = new StringBuilder();
while (m.find()) {
    sbNames.append(m.group(1)).append(" ");
}
System.out.println(sbNames.toString());
Duc Ta Van
That gives you the hardest part; I'll let you adapt it to your needs.
Note:
In Java it is not required to escape a forward slash. To keep the same regex throughout this answer I use "(\\w+)\\/N", but "(\\w+)/N" works as well.
I've used "[/N]+" as the regular expression.
[] = Matches characters inside the set
\/ = Matches the character / literally (case sensitive)
+ = Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
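The answer does not say how "[/N]+" is applied; assuming it is passed to String.replaceAll to strip the markers, the sketch below shows one thing to watch for. A character class matches any run of / or N characters, not the literal sequence /N, so a capital N inside a name is removed as well. The class and method names are hypothetical.

```java
class CharClassDemo {
    // Removes every run of '/' or 'N' characters (character class).
    static String stripWithClass(String s) {
        return s.replaceAll("[/N]+", "");
    }

    // Removes only the literal two-character sequence "/N".
    static String stripLiteral(String s) {
        return s.replaceAll("/N", "");
    }

    public static void main(String[] args) {
        System.out.println(stripWithClass("I am Duc/N Ta/N Van/N")); // I am Duc Ta Van
        System.out.println(stripWithClass("I am Nam/N"));            // I am am
        System.out.println(stripLiteral("I am Nam/N"));              // I am Nam
    }
}
```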

extract a set of a characters between some characters

I have a string email = John.Mcgee.r2d2#hitachi.com
How can I write Java code using regex to extract just the r2d2?
I used this but got an error in Eclipse:
String email = John.Mcgee.r2d2#hitachi.com
Pattern pattern = Pattern.compile(".(.*)\#");
Matcher matcher = patter.matcher
for (Strimatcher.find()){
System.out.println(matcher.group(1));
}
To match after the last dot in a potential sequence of multiple dots, require that the sequence you capture does not contain a dot:
(?<=[.])([^.]*)(?=#)
(?<=[.]) means "preceded by a single dot"
(?=#) means "followed by # sign"
Note that since dot . is a metacharacter, it needs to be escaped either with \ (doubled for Java string literal) or with square brackets around it.
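As a rough sketch, the lookaround pattern above could be used from Java as follows. The class and method names are hypothetical; only the regex comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // (?<=[.]) requires a dot just before the match, (?=#) requires a '#'
    // just after it, and [^.]* keeps the captured text free of dots.
    private static final Pattern LAST_PART = Pattern.compile("(?<=[.])([^.]*)(?=#)");

    // Hypothetical helper: returns the dot-free text right before '#', or null.
    static String partBeforeHash(String s) {
        Matcher m = LAST_PART.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(partBeforeHash("John.Mcgee.r2d2#hitachi.com")); // r2d2
    }
}
```

Because the dot and the # sit in lookarounds, they are not part of the match itself, so m.group() and m.group(1) return the same text here.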
Not sure if you're posting the right code. I'll rewrite it based on what it should look like, though:
String email = "John.Mcgee.r2d2#hitachi.com";
Pattern pattern = Pattern.compile("\\.(.*)#");
Matcher matcher = pattern.matcher(email);
while (matcher.find()) {
    // group(1) is the only capturing group; .* is greedy and can span dots,
    // so this prints "Mcgee.r2d2"
    System.out.println(matcher.group(1));
}
but I think you just want something like this:
String email = "John.Mcgee.r2d2#hitachi.com";
Pattern pattern = Pattern.compile("\\.(.*)#");
Matcher matcher = pattern.matcher(email);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
There is no need for Pattern; you can just use replaceAll with the regex .*\.([^\.]+)#.*, which captures the group ([^\.]+) (one or more characters other than a dot) located between a dot \. and #:
email = email.replaceAll(".*\\.([^\\.]+)#.*", "$1");
Output
r2d2
If you want to go with Pattern, then use the regex \\.([^\\.]+)#:
String email = "John.Mcgee.r2d2#hitachi.com";
Pattern pattern = Pattern.compile("\\.([^\\.]+)#");
Matcher matcher = pattern.matcher(email);
if (matcher.find()) {
    System.out.println(matcher.group(1)); // Output: r2d2
}
Another solution is to use split:
String[] split = email.replaceAll("#.*", "").split("\\.");
email = split[split.length - 1];// Output : r2d2
Note:
Strings in Java must be enclosed in double quotes: "John.Mcgee.r2d2#hitachi.com".
You don't need to escape # in Java, but you have to escape the dot with a double backslash: \\.
There is no for-loop syntax like for (Strimatcher.find()){; you probably meant while.

Match Strings which begin with X and end with Y?

I want to match every file name which ends with .js and is stored in a directory called lib.
Therefore I created the following regular expression: (lib/)(.*?).js$.
I tested the expression (lib/)(.*?).js$ in a Regex Tester and matched this filename: src/main/lib/abc/DocumentHandler.js.
To use my expression in Java, I escaped it to: (lib/)(.*?)\\.js$.
Nevertheless, Java tells me that my expression does not match.
Here is my code:
String regEx = "(lib/)(.*?).js$";
String escapedRegEx = "(lib/)(.*?)\\.js$";
Pattern pattern = Pattern.compile(escapedRegEx);
Matcher matcher = pattern.matcher("src/main/lib/abc/DocumentHandler.js");
System.out.println("Matches: " + matcher.matches()); // false :-(
Did I forget to escape something?
Use Matcher.find() instead of Matcher.matches() to look for a match anywhere inside the string.
As per the Javadoc:
Matcher#matches()
Attempts to match the entire region against the pattern.
Matcher#find()
Attempts to find the next subsequence of the input sequence that matches the pattern.
sample code:
String regEx = "(lib/)(.*)\\.js$";
String str = "src/main/lib/abc/DocumentHandler.js";
Pattern pattern = Pattern.compile(regEx);
Matcher matcher = pattern.matcher(str);
if (matcher.find()) { // <== returns true if found
    System.out.println("Matches: " + matcher.group());
    System.out.println("Path: " + matcher.group(2));
}
output:
Matches: lib/abc/DocumentHandler.js
Path: abc/DocumentHandler
Use Matcher#group(index) to get the text captured by a group, that is, by a part of the regex pattern enclosed in parentheses (...).
You can use the String#matches() method to match the whole string:
String regEx = "(.*)(/lib/)(.*?)\\.js$";
String str = "src/main/lib/abc/DocumentHandler.js";
System.out.println("Matched :" + str.matches(regEx)); // Matched : true
Note: Don't forget to escape the dot ., which has a special meaning in a regex pattern: it matches any character other than a newline.
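To see the difference between matches() and find() side by side, here is a small sketch using the pattern from the question. The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // The escaped pattern from the question.
    private static final Pattern LIB_JS = Pattern.compile("(lib/)(.*?)\\.js$");

    // true only if the ENTIRE input matches the pattern
    static boolean wholeMatch(String s) {
        return LIB_JS.matcher(s).matches();
    }

    // true if ANY part of the input matches the pattern
    static boolean partialMatch(String s) {
        return LIB_JS.matcher(s).find();
    }

    public static void main(String[] args) {
        String path = "src/main/lib/abc/DocumentHandler.js";
        System.out.println(wholeMatch(path));   // false: "src/main/" is not covered by the pattern
        System.out.println(partialMatch(path)); // true
        System.out.println(wholeMatch("lib/abc/DocumentHandler.js")); // true
    }
}
```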
Try this RegEx pattern
String regEx = "(.*)(lib\\/)(.*)(\\.js$)";
Pattern pattern = Pattern.compile(regEx);
Matcher matcher = pattern.matcher("src/main/lib/abc/DocumentHandler.js");
It's working for me:
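Firstly, the pattern also matches without escaping the dot (although an unescaped . matches any character, so the escaped form is stricter), and secondly, you are not matching the first part of the string.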
String regEx = "(.*)(lib/)(.*?).js$";
Pattern pattern = Pattern.compile(regEx);
Matcher matcher = pattern.matcher("src/main/lib/abc/DocumentHandler.js");

Regex for matching pattern within quotes

I have some input data such as
some string with 'hello' inside 'and inside'
How can I write a regex so that all occurrences of the quoted text are returned, no matter how many times it is repeated?
I have code that returns a single quoted string, but I want it to return multiple occurrences:
String mydata = "some string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'(.*?)+'");
Matcher matcher = pattern.matcher(mydata);
while (matcher.find())
{
    System.out.println(matcher.group());
}
This finds all occurrences for me:
String mydata = "some '' string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'[^']*'");
Matcher matcher = pattern.matcher(mydata);
while (matcher.find())
{
    System.out.println(matcher.group());
}
Output:
''
'hello'
'and inside'
Pattern description:
' // start quoting text
[^'] // all characters not single quote
* // 0 or infinite count of not quote characters
' // end quote
I believe this should fit your requirements:
\'\w+\'
\'.*?' is the regex you are looking for.
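The three suggested patterns do not behave identically. The sketch below (hypothetical class and helper names) runs each one against the sample string with the empty quotes; note that \w+ needs at least one word character and does not match spaces, so it skips both '' and 'and inside'.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteDemo {
    // Hypothetical helper: collects every match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String mydata = "some '' string with 'hello' inside 'and inside'";
        System.out.println(findAll("'[^']*'", mydata)); // ['', 'hello', 'and inside']
        System.out.println(findAll("'.*?'", mydata));   // ['', 'hello', 'and inside']
        System.out.println(findAll("'\\w+'", mydata));  // ['hello']
    }
}
```

To get the text without the surrounding quotes, one option is to wrap the inner part in a group, for example '([^']*)', and read m.group(1) instead of m.group().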

REGEX : How to escape []?

I'm working on strings like "[ro.multiboot]: [1]". How do I select just the 1 (it can also be 0) out of this string?
I am looking for a regex in Java.
Usually, you would do something like (assuming 0 and 1 were the only options):
^.*\[([01])\].*$
If you only wanted the value for ro.multiboot, you could change it to something like:
^.*\[ro\.multiboot\].*\[([01])\].*$
(depending on how complex any of the non-bracketed stuff is allowed to be).
These would both basically only extract the value between square brackets if it were zero or one, and capture it into a capture variable so you could use it.
Of course, regex is not a world-wide standard, nor are the environments in which you use it. That means it depends a lot on your actual environment how you will actually code this up.
For Java, the following sample program may help:
import java.util.regex.*;
class Test {
    public static void main(String args[]) {
        // The dot in ro.multiboot is escaped so that it matches only a literal dot
        Pattern p = Pattern.compile("^.*\\[ro\\.multiboot\\].*\\[([01])\\].*$");
        String str;
        Matcher m;
        str = "[ro.multiboot]: [0]";
        m = p.matcher(str);
        if (m.find()) {
            System.out.println("str0 has " + m.group(1));
        }
        str = "[ro.multiboot]: [1]";
        m = p.matcher(str);
        if (m.find()) {
            System.out.println("str1 has " + m.group(1));
        }
        // [2] is neither 0 nor 1, so nothing is printed for this one
        str = "[ro.multiboot]: [2]";
        m = p.matcher(str);
        if (m.find()) {
            System.out.println("str2 has " + m.group(1));
        }
    }
}
This results in (as expected):
str0 has 0
str1 has 1
@paxdiablo's regexps are correct, but the complete answer to "How do I just select 1 (it can also be 0) out of this string?" is:
1. very simple solution
String input = "[ro.multiboot]: [1]";
String matched = input.replaceFirst( "^.*\\[ro\\.multiboot\\].*\\[([01])\\].*$", "$1" );
2. same functionality, more complicated but with better performance
String input = "[ro.multiboot]: [1]";
Pattern p = Pattern.compile( "^.*\\[ro\\.multiboot\\].*\\[([01])\\].*$" );
Matcher m = p.matcher( input );
String matched = null;
if ( m.matches() ) matched = m.group( 1 );
Performance is better because the pattern is compiled just once (for example, when you are matching an array of such strings).
Notes:
In both examples, the group is the part of the regexp between an unescaped ( and ).
In Java you have to write \\[ in a string literal, because \[ gives a compile error: it is not a valid escape sequence for a String.
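If the property name varies, one way to avoid escaping it by hand is Pattern.quote. The following is only a sketch: the class BracketDemo and the helper flagValue are hypothetical, and it assumes the line always has the form "[key]: [value]".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketDemo {
    // Hypothetical helper: returns "0" or "1" for a line like "[key]: [1]",
    // or null if the key is absent or the value is not 0 or 1.
    static String flagValue(String key, String line) {
        // \\[ and \\] match literal brackets; Pattern.quote makes the key
        // literal too, so the dot in "ro.multiboot" matches only a real dot.
        Pattern p = Pattern.compile("\\[" + Pattern.quote(key) + "\\]:\\s*\\[([01])\\]");
        Matcher m = p.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(flagValue("ro.multiboot", "[ro.multiboot]: [1]")); // 1
        System.out.println(flagValue("ro.multiboot", "[ro.multiboot]: [2]")); // null
        System.out.println(flagValue("ro.multiboot", "[roXmultiboot]: [1]")); // null
    }
}
```

When the same key is looked up many times, compiling the Pattern once and reusing it keeps the performance benefit mentioned above.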
