Regular expression for extracting instance ID, AMI ID, Volume ID - java

Given the following string
Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305
I want to be able to extract the following using a regular expression
i-b9b4ffaa
ami-dbcf88b1
vol-e97db305
This is the regular expression I came up with, which currently doesn't do what I need :
Pattern p = Pattern.compile("Created by CreateImage([a-z]+[0.9]+)([a-z]+[0.9]+)([a-z]+[0.9]+)",Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305");
System.out.println(m.matches()); --> false

You may match all words starting with letters, followed with a hyphen, and then having alphanumeric chars:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
Pattern pattern = Pattern.compile("(?i)\\b[a-z]+-[a-z0-9]+");
Matcher matcher = pattern.matcher(s);
while (matcher.find()){
System.out.println(matcher.group(0));
}
// => i-b9b4ffaa, ami-dbcf88b1, vol-e97db305
See the Java demo
Pattern details:
(?i) - a case insensitive modifier (embedded flag option)
\\b - a word boundary
[a-z]+ - 1 or more ASCII letters
- - a hyphen
[a-z0-9]+ - 1 or more alphanumerics.
To make sure these values appear on the same line after Created by CreateImage, use a \G-based regex:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
Pattern pattern = Pattern.compile("(?i)(?:Created by CreateImage|(?!\\A)\\G)(?:(?!\\b[a-z]+-[a-z0-9]+).)*\\b([a-z]+-[a-z0-9]+)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()){
System.out.println(matcher.group(1));
}
See this demo.
Note that the above pattern is based on the \G operator that matches the end of the last successful match (so we only match after a match or after Created...) and a tempered greedy token (?:(?!\\b[a-z]+-[a-z0-9]+).)* (matching any symbol other than a newline that does not start a sequence: word boundary+letters+-+letters|digits) that is very resource consuming.
You should consider using a two-step approach to first check if a string starts with Created... string, and then process it:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
if (s.startsWith("Created by CreateImage")) {
Matcher n = Pattern.compile("(?i)\\b[a-z]+-[a-z0-9]+").matcher(s);
while(n.find()) {
System.out.println(n.group(0));
}
}
See another demo

Related

extract a set of a characters between some characters

I have a string email = John.Mcgee.r2d2#hitachi.com
How can I write a java code using regex to bring just the r2d2?
I used this but got an error on eclipse
String email = John.Mcgee.r2d2#hitachi.com
Pattern pattern = Pattern.compile(".(.*)\#");
Matcher matcher = patter.matcher
for (Strimatcher.find()){
System.out.println(matcher.group(1));
}
To match after the last dot in a potential sequence of multiple dots request that the sequence that you capture does not contain a dot:
(?<=[.])([^.]*)(?=#)
(?<=[.]) means "preceded by a single dot"
(?=#) means "followed by # sign"
Note that since dot . is a metacharacter, it needs to be escaped either with \ (doubled for Java string literal) or with square brackets around it.
Demo.
Not sure if your posting the right code. I'll rewrite it based on what it should look like though:
String email = John.Mcgee.r2d2#hitachi.com
Pattern pattern = Pattern.compile(".(.*)\#");
Matcher matcher = pattern.matcher(email);
int count = 0;
while(matcher.find()) {
count++;
System.out.println(matcher.group(count));
}
but I think you just want something like this:
String email = John.Mcgee.r2d2#hitachi.com
Pattern pattern = Pattern.compile(".(.*)\#");
Matcher matcher = pattern.matcher(email);
if(matcher.find()){
System.out.println(matcher.group(1));
}
No need to Pattern you just need replaceAll with this regex .*\.([^\.]+)#.* which mean get the group ([^\.]+) (match one or more character except a dot) which is between dot \. and #
email = email.replaceAll(".*\\.([^\\.]+)#.*", "$1");
Output
r2d2
regex demo
If you want to go with Pattern then you have to use this regex \\.([^\\.]+)# :
String email = "John.Mcgee.r2d2#hitachi.com";
Pattern pattern = Pattern.compile("\\.([^\\.]+)#");
Matcher matcher = pattern.matcher(email);
if (matcher.find()) {
System.out.println(matcher.group(1));// Output : r2d2
}
Another solution you can use split :
String[] split = email.replaceAll("#.*", "").split("\\.");
email = split[split.length - 1];// Output : r2d2
Note :
Strings in java should be between double quotes "John.Mcgee.r2d2#hitachi.com"
You don't need to escape # in Java, but you have to escape the dot with double slash \\.
There are no syntax for a for loop like you do for (Strimatcher.find()){, maybe you mean while

How to truncate a string after 5 delimiter in java?

String s = aaa-bbb-ccc-ddd-ee-23-xyz;
I need to convert the above string into aaa-bbb-ccc-ddd-ee, which means my output should only print words before fifth delimiter. could any help to solve this?
You could use a Regex:
String s = "aaa-bbb-ccc-ddd-ee-23-xyz";
Pattern p = Pattern.compile("^\\w+\\-\\w+\\-\\w+\\-\\w+\\-\\w+");
Matcher matcher = p.matcher(s);
matcher.find();
System.out.println(matcher.group(0));
Output is aaa-bbb-ccc-ddd-ee
If you have more than just letters you can replace the \\w with [^\\-] which grabs all characters but the delemiter.
Use Pattern and Matcher like this:
String s = "aaa-bbb-ccc-ddd-ee-23-xyz";
Pattern pattern = Pattern.compile("^((.+?-){4}[^-]+).*$");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
s = matcher.group(1);
}
.* - search all symbols. ? - for lazy work
(.*?-) - search character sequence which end with symbol '-'
{4} - in your result string '-' 4 times
[^-]+ - after you search characters without '-'
.* - another characters after you serch
matcher.group(1) - return first group. This is ((.+?-){4}[^-]+)

Match Strings which begin with X and end with Y?

I want to match every file name which ends with .js and is stored in a directory called lib.
Therefore I created the following regular expression: (lib/)(.*?).js$.
I tested the expression (lib/)(.*?).js$ in a Regex Tester and matched this filename: src/main/lib/abc/DocumentHandler.js.
To use my expression in Java, I escaped it to: (lib/)(.*?)\\.js$.
Nevertheless, Java tells me that my expression does not match.
Here is my code:
String regEx = "(lib/)(.*?).js$";
String escapedRegEx = "(lib/)(.*?)\\.js$";
Pattern pattern = Pattern.compile(escapedRegEx);
Matcher matcher = pattern.matcher("src/main/lib/abc/DocumentHandler.js");
System.out.println("Matches: " + matcher.matches()); // false :-(
Did I forgot to escape something?
Use Matcher.find() instead of Matcher.matches() to check for subset of any string.
As per Java Doc:
Matcher#matches()
Attempts to match the entire region against the pattern.
Matcher#find()
Attempts to find the next subsequence of the input sequence that matches the pattern.
sample code:
String regEx = "(lib/)(.*)\\.js$";
String str = "src/main/lib/abc/DocumentHandler.js";
Pattern pattern = Pattern.compile(regEx);
Matcher matcher = pattern.matcher(str);
if (matcher.find()) { // <== returns true if found
System.out.println("Matches: " + matcher.group());
System.out.println("Path: " + matcher.group(2));
}
output:
Matches: lib/abc/DocumentHandler.js
Path: abc/DocumentHandler
Use Matcher#group(index) to get the matched group that is grouped by enclosing inside parenthesis (...) in the regex pattern.
You can use String#matches() method to match the whole string.
String regEx = "(.*)(/lib/)(.*?)\\.js$";
String str = "src/main/lib/abc/DocumentHandler.js";
System.out.println("Matched :" + str.matches(regEx)); // Matched : true
Note: Don't forget to escape dot . that has special meaning in regex pattern to match any thing other than new line.
Try this RegEx pattern
String regEx = "(.*)(lib\\/)(.*)(\\.js$)";
Pattern pattern = Pattern.compile(regEx);
Matcher matcher = pattern.matcher("src/main/lib/abc/DocumentHandler.js");
It's working for me:
Firstly you don't need to escape it, and secondly you are not matching the first part of the string.
String regEx = "(.*)(lib/)(.*?).js$";
Pattern pattern = Pattern.compile(regEx);
Matcher matcher = pattern.matcher("src/main/lib/abc/DocumentHandler.js");

I want to extracting css image path by Java Pattern expression

All , I want to write a pattern regex to extract the: "/images/colorbox/ie6/borderBottomRight.png" from cssContent=".cboxIE6 #cboxBottomRight{background:url(../images/colorbox/ie6/borderBottomRight.png);}"
Who can write a pattern regex for me? Thanks a lot.
My regex can't work as:
Pattern pattern = Pattern.compile("[.*]*/:url/(/././/(.+?)/)/;[.*]*");
Matcher matcher = pattern.matcher(cssContent);
if(matcher.find()){
System.out.println(matcher.group(0));
}
Pattern pattern = Pattern.compile(":url\\(\\.\\.([^)]+)\\)");
Matcher matcher = pattern.matcher(cssContent);
if(matcher.find()){
System.out.println(matcher.group(1));
}
The regex used to match is (quoted and without \ escaped)
":url\(\.\.([^)]+)\)"
which looks for :url(.. followed by [^)] anything that's not a closing ) bracket + one or more times; finally followed by the closing ) bracket. The group () captured is available at group(1) whereas group(0) would give you the complete string that matched i.e. from :url to the closing ).
The biggest error you were making was using "/" to escape your literal characters. You need to use "\", and annoyingly, in a java string "\" must be escaped with "\", so the total escape sequence is "\\". Then, you have matcher.group(0), which matches the entire pattern. You needmatcher.group(1)` to match the first (and only) group in your regex, which contains your string of interest. Here's the corrected code:
String cssContent = "cssContent=\".cboxIE6 #cboxBottomRight{background:url(../images/colorbox/ie6/borderBottomRight.png);}\"";
String regex = ".*?:url\\(\\.\\.(.+?)\\);[.*]*";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(cssContent);
if(matcher.find()){
System.out.println(matcher.group(1));
}

Regex for matching pattern within quotes

I have some input data such as
some string with 'hello' inside 'and inside'
How can I write a regex so that the quoted text (no matter how many times it is repeated) is returned (all of the occurrences).
I have a code that returns a single quotes, but I want to make it so that it returns multiple occurances:
String mydata = "some string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'(.*?)+'");
Matcher matcher = pattern.matcher(mydata);
while (matcher.find())
{
System.out.println(matcher.group());
}
Find all occurences for me:
String mydata = "some '' string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'[^']*'");
Matcher matcher = pattern.matcher(mydata);
while(matcher.find())
{
System.out.println(matcher.group());
}
Output:
''
'hello'
'and inside'
Pattern desciption:
' // start quoting text
[^'] // all characters not single quote
* // 0 or infinite count of not quote characters
' // end quote
I believe this should fit your requirements:
\'\w+\'
\'.*?' is the regex you are looking for.

Categories

Resources