I have HTML that I need to extract a part number from. The HTML looks like:
javascript:selectItem('ABC123 1', '.....
I need to get the ABC123 from the above.
My code snippet:
Pattern p = Pattern.compile("?????");
Matcher m = p.matcher(html);
if(m.find())
partNumber = m.group(1).trim();
BTW, in the pattern, how do I escape the ( character?
I know that for quotes I use \"
Thanks a lot!
You escape ( by putting a \ before it. Because it's in a String, you need to escape the \ so the sequence is \\(. This should parse that snippet:
Pattern p = Pattern.compile("javascript:selectItem\\('(\\w+)");
Matcher m = p.matcher(html);
if (m.find()) {
String partNumber = m.group(1);
}
I've assumed the part number is one or more word characters (meaning digits, letters or underscore).
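If hand-escaping the literal prefix feels error-prone, one alternative is to let Pattern.quote do it. The sketch below assumes the same prefix and the same \w+ part-number format as above; the class name, method name, and sample input are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PartNumberExtractor {
    // Pattern.quote wraps the literal prefix in \Q...\E, so '(' needs no manual escaping.
    private static final Pattern PART = Pattern.compile(
            Pattern.quote("javascript:selectItem('") + "(\\w+)");

    // Returns the part number, or null if the input does not contain the prefix.
    static String extract(String html) {
        Matcher m = PART.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical input modelled on the snippet in the question.
        System.out.println(extract("<a href=\"javascript:selectItem('ABC123 1', 'x')\">"));
        // prints ABC123
    }
}
```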
You could use this:
Pattern regex = Pattern.compile("(?<=selectItem\\(')\\S*",Pattern.CASE_INSENSITIVE);
Matcher regexMatcher = regex.matcher(subjectString);
if (regexMatcher.find()) {
// The pattern has no capturing group, so read the whole match with group(), not group(1)
String resultString = regexMatcher.group();
}
I have a string email = John.Mcgee.r2d2#hitachi.com
How can I write Java code using a regex to extract just the r2d2?
I used this but got an error in Eclipse:
String email = John.Mcgee.r2d2#hitachi.com
Pattern pattern = Pattern.compile(".(.*)\#");
Matcher matcher = patter.matcher
for (Strimatcher.find()){
System.out.println(matcher.group(1));
}
To match the text after the last dot when the input may contain several dots, require that the captured sequence itself contain no dot:
(?<=[.])([^.]*)(?=#)
(?<=[.]) means "preceded by a single dot"
(?=#) means "followed by # sign"
Note that since dot . is a metacharacter, it needs to be escaped either with \ (doubled for Java string literal) or with square brackets around it.
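A minimal sketch of how the lookaround pattern above could be used from Java. The class and method names are illustrative only; the input is the string from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastSegmentBeforeHash {
    // (?<=[.]) requires a dot just before the match, (?=#) requires a # just after it.
    private static final Pattern SEGMENT = Pattern.compile("(?<=[.])([^.]*)(?=#)");

    // Returns the dot-free segment directly before '#', or null if there is none.
    static String extract(String email) {
        Matcher m = SEGMENT.matcher(email);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("John.Mcgee.r2d2#hitachi.com")); // prints r2d2
    }
}
```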
Not sure if you're posting the right code. I'll rewrite it based on what it should look like, though:
String email = "John.Mcgee.r2d2#hitachi.com";
// # needs no escaping, and "\#" is not a valid escape in a Java string literal
Pattern pattern = Pattern.compile(".(.*)#");
Matcher matcher = pattern.matcher(email);
while (matcher.find()) {
    // The pattern has a single capturing group, so always read group(1)
    System.out.println(matcher.group(1));
}
But I think you just want something like this:
String email = "John.Mcgee.r2d2#hitachi.com";
Pattern pattern = Pattern.compile(".(.*)#");
Matcher matcher = pattern.matcher(email);
if (matcher.find()) {
    // Prints ohn.Mcgee.r2d2: the unescaped . matches any character and (.*) is greedy
    System.out.println(matcher.group(1));
}
There's no need for Pattern here; replaceAll with the regex .*\.([^\.]+)#.* is enough. It captures the group ([^\.]+) (one or more characters other than a dot) that sits between a dot \. and #, and replaces the whole string with that group:
email = email.replaceAll(".*\\.([^\\.]+)#.*", "$1");
Output
r2d2
If you want to use Pattern, use the regex \\.([^\\.]+)# :
String email = "John.Mcgee.r2d2#hitachi.com";
Pattern pattern = Pattern.compile("\\.([^\\.]+)#");
Matcher matcher = pattern.matcher(email);
if (matcher.find()) {
System.out.println(matcher.group(1));// Output : r2d2
}
Another solution is to use split:
String[] split = email.replaceAll("#.*", "").split("\\.");
email = split[split.length - 1];// Output : r2d2
Notes:
Strings in Java must be enclosed in double quotes: "John.Mcgee.r2d2#hitachi.com"
You don't need to escape # in Java, but you have to escape the dot with a double backslash: \\.
There is no for-loop syntax like for (Strimatcher.find()){; you probably meant while
I have the following string:
String xmlnode = "<firstname id=\"{$person.id}\"> {$person.firstname} </firstname>";
How can I write a regex to extract the data inside each {$STRING_I_WANT}?
The part I need is the text without the surrounding {$ and }. How can I achieve that?
You can use the regex \{\$(.*?)\} with Pattern like this:
String xmlnode = "<firstname id=\"{$person.id}\"> {$person.firstname} </firstname>";
Pattern pattern = Pattern.compile("\\{\\$(.*?)\\}");
Matcher matcher = pattern.matcher(xmlnode);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Note: you have to escape { and $ with \ because they are special characters in regex; escaping } is optional but harmless. In a Java string literal each \ is itself doubled, hence \\{\\$(.*?)\\}.
Outputs
person.id
person.firstname
String s = "aaa-bbb-ccc-ddd-ee-23-xyz";
I need to convert the above string into aaa-bbb-ccc-ddd-ee, which means my output should contain only the text before the fifth delimiter. Could anyone help me solve this?
You could use a regex:
String s = "aaa-bbb-ccc-ddd-ee-23-xyz";
Pattern p = Pattern.compile("^\\w+\\-\\w+\\-\\w+\\-\\w+\\-\\w+");
Matcher matcher = p.matcher(s);
// group(0) is only valid after a successful find(), so check the result
if (matcher.find()) {
    System.out.println(matcher.group(0));
}
Output is aaa-bbb-ccc-ddd-ee
If the fields can contain more than word characters (letters, digits, underscore), you can replace each \\w with [^\\-], which matches any character except the delimiter.
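As a rough illustration of that substitution, the sketch below uses [^-] for every field (inside a character class the hyphen needs no backslash when it comes last). The class name, method name, and second sample input are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstFiveFields {
    // Five runs of non-delimiter characters separated by four '-' delimiters.
    private static final Pattern FIRST_FIVE =
            Pattern.compile("^[^-]+-[^-]+-[^-]+-[^-]+-[^-]+");

    // Returns the first five fields, or null if the input has fewer than five.
    static String firstFive(String s) {
        Matcher m = FIRST_FIVE.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstFive("aaa-bbb-ccc-ddd-ee-23-xyz")); // prints aaa-bbb-ccc-ddd-ee
        // Fields with dots, spaces, or punctuation would not match the \w+ version.
        System.out.println(firstFive("a.1-b b-c_c-d!-ee-23-xyz"));  // prints a.1-b b-c_c-d!-ee
    }
}
```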
Use Pattern and Matcher like this:
String s = "aaa-bbb-ccc-ddd-ee-23-xyz";
Pattern pattern = Pattern.compile("^((.+?-){4}[^-]+).*$");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
s = matcher.group(1);
}
.+? - matches one or more of any character; the ? makes the match lazy
(.+?-) - matches a character sequence that ends with '-'
{4} - repeats that group 4 times, so the result contains 4 '-' delimiters
[^-]+ - then matches the following characters up to the next '-'
.* - matches whatever remains after the part you want
matcher.group(1) - returns the first group, which is ((.+?-){4}[^-]+)
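The same idea can be parameterised on the number of fields to keep. This is only a sketch: the helper name firstFields is hypothetical, it swaps the lazy .+? for [^-]* (so empty fields are allowed), and it returns the input unchanged when there are fewer than n fields, which is an assumption about the desired behaviour.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldPrefix {
    // Keeps the first n '-'-separated fields of s.
    static String firstFields(String s, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        // (n - 1) delimited fields followed by one final non-empty field.
        Pattern p = Pattern.compile("^((?:[^-]*-){" + (n - 1) + "}[^-]+)");
        Matcher m = p.matcher(s);
        // Assumption: fall back to the whole input when it has fewer than n fields.
        return m.find() ? m.group(1) : s;
    }

    public static void main(String[] args) {
        System.out.println(firstFields("aaa-bbb-ccc-ddd-ee-23-xyz", 5)); // prints aaa-bbb-ccc-ddd-ee
        System.out.println(firstFields("aaa-bbb-ccc-ddd-ee-23-xyz", 2)); // prints aaa-bbb
    }
}
```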
The following returns no matches:
String patternStr = "((19\\d{2}|20\\d{2})-([0-2]\\d{2}|3[0-5]\\d)-(([0-1]\\d|2[0-3])[0-5]\\d[0-5]\\d))";
String fullPath = aFile.getAbsolutePath();
// fullPath should expand to this: "/home/user1/2013-023-135159_abcd_001/File.txt"
Pattern p = Pattern.compile(patternStr);
Matcher m = p.matcher(fullPath);
if (m.matches())
{
System.out.println("Matches found");
}
It should match the date portion, 2013-023-135159. I tested it online and the regex looks OK.
You will need to use:
m.find()
instead of:
m.matches()
because your regex matches only part of the input string, whereas m.matches() requires the entire string to match.
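To see the difference side by side, here is a small sketch using the pattern and the sample path from the question. The class and method names are illustrative; each check uses a fresh Matcher so the calls do not affect one another.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFind {
    private static final Pattern DATE = Pattern.compile(
            "((19\\d{2}|20\\d{2})-([0-2]\\d{2}|3[0-5]\\d)-(([0-1]\\d|2[0-3])[0-5]\\d[0-5]\\d))");

    // matches() succeeds only if the pattern covers the entire input.
    static boolean wholeStringMatches(String input) {
        return DATE.matcher(input).matches();
    }

    // find() scans for the first substring that fits the pattern.
    static String findDate(String input) {
        Matcher m = DATE.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String fullPath = "/home/user1/2013-023-135159_abcd_001/File.txt";
        System.out.println(wholeStringMatches(fullPath)); // prints false
        System.out.println(findDate(fullPath));           // prints 2013-023-135159
    }
}
```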
I have some input data such as
some string with 'hello' inside 'and inside'
How can I write a regex so that every occurrence of quoted text is returned, no matter how many times it appears?
I have code that returns a single quoted string, but I want it to return multiple occurrences:
String mydata = "some string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'(.*?)+'");
Matcher matcher = pattern.matcher(mydata);
while (matcher.find())
{
System.out.println(matcher.group());
}
This finds all occurrences for me:
String mydata = "some '' string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'[^']*'");
Matcher matcher = pattern.matcher(mydata);
while(matcher.find())
{
System.out.println(matcher.group());
}
Output:
''
'hello'
'and inside'
Pattern description:
' // start quoting text
[^'] // all characters not single quote
* // 0 or infinite count of not quote characters
' // end quote
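If the surrounding quotes are not wanted in the result, one option is to wrap [^']* in a capturing group and read group(1). A small sketch, with an illustrative class and method name:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedText {
    // Same pattern as above, with a capturing group around the quoted content.
    private static final Pattern QUOTED = Pattern.compile("'([^']*)'");

    // Collects the content of every single-quoted section, without the quotes.
    static List<String> extractAll(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = QUOTED.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extractAll("some '' string with 'hello' inside 'and inside'"));
        // prints [, hello, and inside]
    }
}
```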
I believe this should fit your requirements:
\'\w+\'
Note that \w does not match spaces, so this finds 'hello' but not 'and inside'.
\'.*?' is the regex you are looking for.
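To check how the two short patterns behave on the sample input from the question, a quick comparison can help. The class and helper names are made up, and the backslash before ' is dropped because a single quote needs no escaping in a Java regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotePatternComparison {
    // Returns every full match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String data = "some string with 'hello' inside 'and inside'";
        // \w+ stops at the space, so the second quoted phrase is skipped.
        System.out.println(findAll("'\\w+'", data)); // prints ['hello']
        // The lazy .*? runs up to the nearest closing quote.
        System.out.println(findAll("'.*?'", data));  // prints ['hello', 'and inside']
    }
}
```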