I have several strings in the rough form:
String s = "Rendering content from websiteNAme using user agent userAgentNameWithSpaces ; for user username ; at time someTime";
I want to extract the values for websiteName, userAgentNameWithSpaces, username and someTime.
I have tried the following code.
private static final Pattern USER_NAME_PATTERN = Pattern.compile("for user.*;");
final Matcher matcher = USER_NAME_PATTERN.matcher(line);
matcher.find() ? Optional.of(matcher.group(group)) : Optional.empty();
It returns the whole string " for user username", so afterwards I have to replace "for user" with an empty string to get the username.
Is there a regex that returns just the username directly?
You can use regex groups:
Pattern pattern = Pattern.compile("for user (\\w+)");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
The pair of parentheses ( and ) forms a group, which the matcher returns via the group method (as it's the first pair of parentheses, it's group 1).
\w means a "word character" (letters, digits and _) and + means "one or more occurrences". So \w+ basically means "a word" (assuming your username has only these characters). PS: note that I had to escape \ in the Java string, so the resulting expression is \\w+.
The output of this code is:
username
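To tie this back to the Optional-returning code in the question, the capture group can be wrapped in a small helper. This is only a sketch: the class name UserNameExtractor and the method name extractUserName are made up for illustration, and it assumes the username consists of word characters only.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UserNameExtractor {
    // Group 1 captures the word that follows "for user ".
    private static final Pattern USER_NAME_PATTERN = Pattern.compile("for user (\\w+)");

    // Hypothetical helper: the captured username, or empty if the line does not match.
    static Optional<String> extractUserName(String line) {
        Matcher matcher = USER_NAME_PATTERN.matcher(line);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String s = "Rendering content from websiteNAme using user agent userAgentNameWithSpaces ; for user username ; at time someTime";
        System.out.println(extractUserName(s).orElse("<no match>"));
        System.out.println(extractUserName("no user here").isPresent());
    }
}
```

Returning Optional.empty() for a non-matching line keeps the caller from ever calling group(1) on a failed match.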
If you want to match all the values (websiteName, userAgentNameWithSpaces and so on), you could do the following:
Pattern pattern = Pattern.compile("Rendering content from (.*) using user agent (.*) ; for user (.*) ; at time (.*)");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(1));
    System.out.println(matcher.group(2));
    System.out.println(matcher.group(3));
    System.out.println(matcher.group(4));
}
The output will be:
websiteNAme
userAgentNameWithSpaces
username
someTime
Note that if userAgentNameWithSpaces contains spaces, \w+ won't work (because \w doesn't match spaces), whereas .* does work in this case.
But you can also use [\w ]+ - the brackets [] mean "any of the characters inside me", so [\w ] means "a word character, or a space" (note that there's a space between w and ]). So the code would be (testing with a user agent name with spaces):
String s = "Rendering content from websiteNAme using user agent userAgent Name WithSpaces ; for user username ; at time someTime";
Pattern pattern = Pattern.compile("Rendering content from (.*) using user agent ([\\w ]+) ; for user (.*) ; at time (.*)");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(1));
    System.out.println(matcher.group(2));
    System.out.println(matcher.group(3));
    System.out.println(matcher.group(4));
}
And the output will be:
websiteNAme
userAgent Name WithSpaces
username
someTime
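Note: matcher.group(n) is only valid after a successful match, which is why the calls above are guarded by matcher.find(); calling it without a match throws an IllegalStateException. The method matcher.groupCount() returns the number of capturing groups in the pattern (not how many of them matched), and calling matcher.group(n) with n greater than that throws an IndexOutOfBoundsException.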
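If keeping track of group numbers gets awkward, the same four-group pattern can also be written with named groups, which Java has supported since version 7. The following is just a sketch: the group names (site, agent, user, time) and the helper method field are my own choices, not something from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupsDemo {
    // Same structure as the four-group pattern, but every group has a name.
    // The names are illustrative; any letters-and-digits name starting with a letter works.
    static final Pattern LINE_PATTERN = Pattern.compile(
            "Rendering content from (?<site>.*) using user agent (?<agent>.*)"
            + " ; for user (?<user>.*) ; at time (?<time>.*)");

    // Hypothetical helper: the value of the named group, or null if the line does not match.
    static String field(String line, String groupName) {
        Matcher matcher = LINE_PATTERN.matcher(line);
        return matcher.find() ? matcher.group(groupName) : null;
    }

    public static void main(String[] args) {
        String s = "Rendering content from websiteNAme using user agent userAgent Name WithSpaces ; for user username ; at time someTime";
        System.out.println(field(s, "site"));
        System.out.println(field(s, "agent"));
        System.out.println(field(s, "user"));
        System.out.println(field(s, "time"));
    }
}
```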
I think you want to use lookaheads and lookbehinds:
String s = "Rendering content from websiteNAme using user agent userAgentNameWithSpaces ; for user username ; at time someTime";
Pattern USER_NAME_PATTERN = Pattern.compile("(?<=for user).*?(?=;)");
final Matcher matcher = USER_NAME_PATTERN.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(0).trim());
}
Output:
username
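A possible variation on the lookaround idea that avoids the trim() call: put the trailing space inside the lookbehind and allow optional whitespace before the semicolon in the lookahead. This is a sketch only; the class and method names are invented, and it assumes the username itself never contains a semicolon.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // The lookbehind consumes nothing but requires "for user " right before the match;
    // the lookahead requires optional whitespace and a semicolon right after it.
    static final Pattern USER_NAME_PATTERN =
            Pattern.compile("(?<=for user )[^;]*?(?=\\s*;)");

    // Hypothetical helper: the matched username, or null if nothing matches.
    static String userName(String line) {
        Matcher matcher = USER_NAME_PATTERN.matcher(line);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String s = "Rendering content from websiteNAme using user agent userAgentNameWithSpaces ; for user username ; at time someTime";
        System.out.println(userName(s));
    }
}
```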
A password consists of digits and Latin letters in any case;
a password always follows the word "password" (in any case), but they can be separated by any number of spaces and colon (:) characters.
I tried this regular expression:
Pattern pattern = Pattern.compile("password\\s\\w*",Pattern.CASE_INSENSITIVE);
If the text is
String text = "My email javacoder#gmail.com with password SECRET115. Here is my old PASSWORD: PASS111.\n";//scanner.nextLine();
We need to find SECRET115 and PASS111.
Currently the program fails to find the pattern.
You may add an optional : after password, and match 0 or more whitespaces with \s*:
password:?\s*(\w+)
Details
password - a fixed string
:? - 1 or 0 colons
\s* - 0+ whitespaces
(\w+) - Capturing group 1: one or more word chars.
Java demo:
String s = "My email javacoder#gmail.com with password SECRET115. Here is my old PASSWORD: PASS111.\n";
Pattern pattern = Pattern.compile("password:?\\s*(\\w+)", Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
Output:
SECRET115
PASS111
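Since the question says the separator can be any number of spaces and colons, and that passwords consist only of digits and Latin letters, a slightly stricter variant is possible. This is a sketch under those assumptions; the class name PasswordFinder and the method findPasswords are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PasswordFinder {
    // [\s:]* allows any mix of whitespace and colons after the keyword;
    // [a-z0-9]+ with CASE_INSENSITIVE accepts digits and Latin letters in any case.
    static final Pattern PASSWORD_PATTERN =
            Pattern.compile("password[\\s:]*([a-z0-9]+)", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: collects group 1 of every match, in order of appearance.
    static List<String> findPasswords(String text) {
        List<String> passwords = new ArrayList<>();
        Matcher matcher = PASSWORD_PATTERN.matcher(text);
        while (matcher.find()) {
            passwords.add(matcher.group(1));
        }
        return passwords;
    }

    public static void main(String[] args) {
        String text = "My email javacoder#gmail.com with password SECRET115. Here is my old PASSWORD: PASS111.\n";
        System.out.println(findPasswords(text));
    }
}
```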
I have a string email = John.Mcgee.r2d2#hitachi.com
How can I write Java code using a regex to extract just the r2d2?
I tried this but got an error in Eclipse:
String email = John.Mcgee.r2d2#hitachi.com
Pattern pattern = Pattern.compile(".(.*)\#");
Matcher matcher = patter.matcher
for (Strimatcher.find()){
System.out.println(matcher.group(1));
}
To match after the last dot in a potential sequence of multiple dots, require that the sequence you capture does not contain a dot:
(?<=[.])([^.]*)(?=#)
(?<=[.]) means "preceded by a single dot"
(?=#) means "followed by # sign"
Note that since dot . is a metacharacter, it needs to be escaped either with \ (doubled for Java string literal) or with square brackets around it.
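For completeness, here is how that lookaround pattern might be used from Java. This is just a sketch; the class name and the helper lastPartBeforeHash are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastLabelDemo {
    // Preceded by a dot, followed by #, and containing no dot itself.
    static final Pattern LAST_PART = Pattern.compile("(?<=[.])([^.]*)(?=#)");

    // Hypothetical helper: the dot-free text right before '#', or null if there is none.
    static String lastPartBeforeHash(String email) {
        Matcher matcher = LAST_PART.matcher(email);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(lastPartBeforeHash("John.Mcgee.r2d2#hitachi.com"));
    }
}
```

Because the lookarounds consume nothing, group(0) and group(1) hold the same text here, so matcher.group() would work as well.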
Not sure if you're posting the right code. I'll rewrite it based on what it should look like, though:
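String email = "John.Mcgee.r2d2#hitachi.com";
// Escape the dot (\\.) and keep dots out of the group; # needs no escaping
Pattern pattern = Pattern.compile("\\.([^.]*)#");
Matcher matcher = pattern.matcher(email);
while (matcher.find()) {
    // group(1) is the pattern's only capturing group on every iteration
    System.out.println(matcher.group(1));
}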
but I think you just want something like this:
String email = "John.Mcgee.r2d2#hitachi.com";
Pattern pattern = Pattern.compile("\\.([^.]*)#");
Matcher matcher = pattern.matcher(email);
if (matcher.find()) {
    System.out.println(matcher.group(1)); // prints r2d2
}
There's no need for Pattern; you can just use replaceAll with the regex .*\.([^\.]+)#.*, which captures the group ([^\.]+) (one or more characters other than a dot) sitting between a dot \. and #:
email = email.replaceAll(".*\\.([^\\.]+)#.*", "$1");
Output
r2d2
If you want to go with Pattern, then use the regex \\.([^\\.]+)# :
String email = "John.Mcgee.r2d2#hitachi.com";
Pattern pattern = Pattern.compile("\\.([^\\.]+)#");
Matcher matcher = pattern.matcher(email);
if (matcher.find()) {
    System.out.println(matcher.group(1)); // Output : r2d2
}
Another solution is to use split:
String[] split = email.replaceAll("#.*", "").split("\\.");
email = split[split.length - 1];// Output : r2d2
Notes:
Strings in Java must be enclosed in double quotes: "John.Mcgee.r2d2#hitachi.com"
You don't need to escape # in Java, but you have to escape the dot with a double backslash: \\.
There is no for loop syntax like for (Strimatcher.find()){; you probably meant while
Given the following string
Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305
I want to be able to extract the following using a regular expression
i-b9b4ffaa
ami-dbcf88b1
vol-e97db305
This is the regular expression I came up with, which currently doesn't do what I need :
Pattern p = Pattern.compile("Created by CreateImage([a-z]+[0.9]+)([a-z]+[0.9]+)([a-z]+[0.9]+)",Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305");
System.out.println(m.matches()); // --> false
You may match all words starting with letters, followed by a hyphen, and then having alphanumeric chars:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
Pattern pattern = Pattern.compile("(?i)\\b[a-z]+-[a-z0-9]+");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(0));
}
// => i-b9b4ffaa, ami-dbcf88b1, vol-e97db305
Pattern details:
(?i) - a case insensitive modifier (embedded flag option)
\\b - a word boundary
[a-z]+ - 1 or more ASCII letters
- - a hyphen
[a-z0-9]+ - 1 or more alphanumerics.
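As a side note on the (?i) item: as far as I know, the embedded flag is equivalent to passing Pattern.CASE_INSENSITIVE to Pattern.compile. The sketch below uses the flag form and collects the matches into a list; the class name IdExtractor and the method extractIds are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IdExtractor {
    // Same pattern as above, with the inline (?i) replaced by the compile flag.
    static final Pattern ID_PATTERN =
            Pattern.compile("\\b[a-z]+-[a-z0-9]+", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: collects every letters-hyphen-alphanumerics token in the input.
    static List<String> extractIds(String s) {
        List<String> ids = new ArrayList<>();
        Matcher matcher = ID_PATTERN.matcher(s);
        while (matcher.find()) {
            ids.add(matcher.group());
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(extractIds("Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305"));
        // Upper-case input still matches because of the flag.
        System.out.println(extractIds("Copied AMI-DBCF88B1"));
    }
}
```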
To make sure these values appear on the same line after Created by CreateImage, use a \G-based regex:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
Pattern pattern = Pattern.compile("(?i)(?:Created by CreateImage|(?!\\A)\\G)(?:(?!\\b[a-z]+-[a-z0-9]+).)*\\b([a-z]+-[a-z0-9]+)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
Note that the above pattern relies on the \G operator, which matches at the end of the last successful match (so we only match after a previous match or after Created...), and on a tempered greedy token (?:(?!\\b[a-z]+-[a-z0-9]+).)* (matching any symbol other than a newline that does not start a sequence of word boundary + letters + - + letters or digits). The tempered greedy token is very resource consuming.
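To see the effect of the \G anchoring, the pattern can be run against one line that contains the Created by CreateImage prefix and one that does not. This is only a sketch; the second input string and the names ContinuationDemo and extractIds are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContinuationDemo {
    // Each match must start either at the literal prefix or exactly where
    // the previous match ended (\G), but never at the very start of the input.
    static final Pattern PATTERN = Pattern.compile(
            "(?i)(?:Created by CreateImage|(?!\\A)\\G)"
            + "(?:(?!\\b[a-z]+-[a-z0-9]+).)*\\b([a-z]+-[a-z0-9]+)");

    // Hypothetical helper: collects group 1 of every chained match.
    static List<String> extractIds(String s) {
        List<String> ids = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(s);
        while (matcher.find()) {
            ids.add(matcher.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(extractIds("Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305"));
        // No prefix, so the chain of matches never gets started.
        System.out.println(extractIds("Deleted snapshot for ami-dbcf88b1 from vol-e97db305"));
    }
}
```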
You should consider using a two-step approach: first check whether the string starts with Created..., and then process it:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
if (s.startsWith("Created by CreateImage")) {
    Matcher n = Pattern.compile("(?i)\\b[a-z]+-[a-z0-9]+").matcher(s);
    while (n.find()) {
        System.out.println(n.group(0));
    }
}
All, I want to write a regex pattern to extract "/images/colorbox/ie6/borderBottomRight.png" from cssContent=".cboxIE6 #cboxBottomRight{background:url(../images/colorbox/ie6/borderBottomRight.png);}"
Can anyone write such a regex for me? Thanks a lot.
My regex doesn't work:
Pattern pattern = Pattern.compile("[.*]*/:url/(/././/(.+?)/)/;[.*]*");
Matcher matcher = pattern.matcher(cssContent);
if (matcher.find()) {
    System.out.println(matcher.group(0));
}
Pattern pattern = Pattern.compile(":url\\(\\.\\.([^)]+)\\)");
Matcher matcher = pattern.matcher(cssContent);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
The regex used to match is (shown quoted, without the extra \ escaping that a Java string needs)
":url\(\.\.([^)]+)\)"
which looks for :url(.. followed by [^)] anything that's not a closing ) bracket + one or more times; finally followed by the closing ) bracket. The group () captured is available at group(1) whereas group(0) would give you the complete string that matched i.e. from :url to the closing ).
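To make the difference between group(0) and group(1) concrete, here is a small sketch that returns both. The class name UrlGroupDemo and the helper match are hypothetical, and the CSS string is the one from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlGroupDemo {
    static final Pattern URL_PATTERN = Pattern.compile(":url\\(\\.\\.([^)]+)\\)");

    // Hypothetical helper: {whole match, captured path}, or null if there is no match.
    static String[] match(String css) {
        Matcher matcher = URL_PATTERN.matcher(css);
        if (!matcher.find()) {
            return null;
        }
        return new String[] { matcher.group(0), matcher.group(1) };
    }

    public static void main(String[] args) {
        String cssContent = ".cboxIE6 #cboxBottomRight{background:url(../images/colorbox/ie6/borderBottomRight.png);}";
        String[] groups = match(cssContent);
        System.out.println(groups[0]); // everything from :url to the closing )
        System.out.println(groups[1]); // only the part captured after the two dots
    }
}
```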
The biggest error you were making was using "/" to escape your literal characters. You need to use "\", and annoyingly, in a Java string "\" must itself be escaped with "\", so the total escape sequence is "\\". Also, you call matcher.group(0), which returns the entire match. You need matcher.group(1) to get the first (and only) group in your regex, which contains your string of interest. Here's the corrected code:
String cssContent = "cssContent=\".cboxIE6 #cboxBottomRight{background:url(../images/colorbox/ie6/borderBottomRight.png);}\"";
String regex = ".*?:url\\(\\.\\.(.+?)\\);[.*]*";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(cssContent);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
I have some input data such as
some string with 'hello' inside 'and inside'
How can I write a regex so that the quoted text is returned, no matter how many times it is repeated (all of the occurrences)?
I have code that returns a single quoted string, but I want it to return multiple occurrences:
String mydata = "some string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'(.*?)+'");
Matcher matcher = pattern.matcher(mydata);
while (matcher.find())
{
    System.out.println(matcher.group());
}
This finds all occurrences for me:
String mydata = "some '' string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'[^']*'");
Matcher matcher = pattern.matcher(mydata);
while (matcher.find())
{
    System.out.println(matcher.group());
}
Output:
''
'hello'
'and inside'
Pattern description:
' // start quoting text
[^'] // any character that is not a single quote
* // zero or more of those non-quote characters
' // end quote
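If the surrounding quotes are not wanted in the result, one option is to put the [^']* part in a capture group and read group(1). This is a sketch, not the only way to do it; the class name QuotedTextDemo and the method quotedContents are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedTextDemo {
    // Group 1 holds whatever sits between a pair of single quotes (possibly nothing).
    static final Pattern QUOTED = Pattern.compile("'([^']*)'");

    // Hypothetical helper: the contents of every quoted section, without the quotes.
    static List<String> quotedContents(String s) {
        List<String> contents = new ArrayList<>();
        Matcher matcher = QUOTED.matcher(s);
        while (matcher.find()) {
            contents.add(matcher.group(1));
        }
        return contents;
    }

    public static void main(String[] args) {
        String mydata = "some '' string with 'hello' inside 'and inside'";
        for (String content : quotedContents(mydata)) {
            System.out.println("[" + content + "]");
        }
    }
}
```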
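I believe this should fit your requirements, as long as the quoted text contains only word characters (\w does not match the space in 'and inside'):
\'\w+\'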
\'.*?' is the regex you are looking for.