So I have myString which contains the string
"border-bottom: solid 1px #ccc;width:8px;background:#bc0b43;float:left;height:12px"
I want to use regex to check that it contains "width:8px" (\bwidth\s*:\s*(\d+)px)
If true, add the width value (i.e. 8 for above example) to my myList.
Attempt:
if (myString.contains("\\bwidth\\s*:\\s*(\\d+)px")) {
myList.add(valueofwidth) //unsure how to select the width value using regex
}
Any help?
EDIT: So I've looked into contains method and found that it doesn't allow regex. matches will allow regex but it looks for a complete match instead.
You need to use Matcher#find() method for that.
From the documentation: -
Attempts to find the next subsequence of the input sequence that
matches the pattern.
And then you can get the captured group out of it: -
Matcher matcher = Pattern.compile("\\bwidth\\s*:\\s*(\\d+)px").matcher(myString);
if (matcher.find()) {
myList.add(matcher.group(1));
}
You have to use a Matcher and matcher.find():
Pattern pattern = Pattern.compile("(\\bwidth\\s*:\\s*(?<width>\\d+)px)");
Matcher matcher = pattern.matcher(myString);
while (matcher.find()) {
myList.add(matcher.group("width"));
}
Your main problem is that contains() doesn't accept a regex, it accepts a literal String.
matches() on the other hand does accept a regex parameter, but must match the entire string to return true.
Next, once you have your match, you can use replaceAll() to extract your target content:
if (myString.matches(".*\\bwidth\\s*:\\s*\\d+px.*")) {
myList.add(myString.replaceAll(".*\\bwidth\\s*:\\s*(\\d+)px.*", "$1"));
}
This replaces the entire input String with the contents of group #1, which your original regex captures.
Note that I removed the redundant brackets from your original matching regex, but left them in for the replace to capture the target content.
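To put the approaches above side by side, here is a minimal sketch. The class name WidthExtractor and the helper extractWidths are made up for illustration; only String.contains, String.matches, Pattern and Matcher come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
final class WidthExtractor {
    // Same pattern as in the question; group 1 captures the digits.
    private static final Pattern WIDTH = Pattern.compile("\\bwidth\\s*:\\s*(\\d+)px");

    // Collects every width value found in the style string.
    static List<String> extractWidths(String style) {
        List<String> widths = new ArrayList<>();
        Matcher matcher = WIDTH.matcher(style);
        while (matcher.find()) {
            widths.add(matcher.group(1));
        }
        return widths;
    }

    public static void main(String[] args) {
        String myString = "border-bottom: solid 1px #ccc;width:8px;background:#bc0b43;float:left;height:12px";
        // contains() compares literal text, so the regex source is never found.
        System.out.println(myString.contains("\\bwidth\\s*:\\s*(\\d+)px")); // false
        // matches() needs the pattern to cover the whole string.
        System.out.println(myString.matches("\\bwidth\\s*:\\s*(\\d+)px")); // false
        // find() looks for the pattern anywhere in the string.
        System.out.println(extractWidths(myString)); // [8]
    }
}
```

If you need numbers rather than strings, you could convert each captured value with Integer.parseInt before adding it to the list.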
Related
I have this code that needs to get words after / or in between this character.
Pattern pattern = Pattern.compile("\\/([a-zA-Z0-9]{0,})"); // Regex: \/([a-zA-Z0-9]{0,})
Matcher matcher = pattern.matcher(path);
if(matcher.matches()){
return matcher.group(0);
}
The regex \/([a-zA-Z0-9]{0,}) works but not in Java, what could be the reason?
You need to get the value of Group 1 and use find to get a partial match:
Pattern pattern = Pattern.compile("/([a-zA-Z0-9]*)");
Matcher matcher = pattern.matcher(path);
if(matcher.find()){
return matcher.group(1); // Here, use Group 1 value
}
Matcher.matches requires a full string match, so only use it if your whole string is supposed to match the pattern. Otherwise, use Matcher.find.
Since the value you need is captured into Group 1 (([a-zA-Z0-9]*), the subpattern enclosed with parentheses), you need to return that part.
You needn't escape the / in Java regex. Also, {0,} functions the same way as * quantifier (matches zero or more occurrences of the quantified subpattern).
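A small sketch of the difference, assuming a sample path of "/users/42" (the question does not show the real input) and a made-up class name PathSegment:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the sample path below is an assumption.
final class PathSegment {
    private static final Pattern SEGMENT = Pattern.compile("/([a-zA-Z0-9]*)");

    // Returns the text after the first '/', or null when the path has no '/'.
    static String firstSegment(String path) {
        Matcher matcher = SEGMENT.matcher(path);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String path = "/users/42";
        // matches() fails because the pattern does not cover the second segment.
        System.out.println(SEGMENT.matcher(path).matches()); // false
        // find() accepts a partial match; group(1) drops the leading slash.
        System.out.println(firstSegment(path)); // users
        // {0,} and * are interchangeable quantifiers.
        System.out.println(Pattern.compile("/([a-zA-Z0-9]{0,})").matcher(path).find()); // true
    }
}
```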
Also, [a-zA-Z0-9] can be replaced with \p{Alnum} to match the same range of characters (see the Java regex syntax reference). The pattern declaration will look like
"/(\\p{Alnum}*)"
I have such text:
120.65UAH Produkti Kvartal
5*14 14:24
Bal. 16603.52UAH
What I want to do:
If this text contains "5*14", I need to get 16603.52 via one java reg exp.
I tried to create conditional regexp like this:
(5*14 ([\d\.*]+)UAH)
(5*14 d{2}:d{2} Bal. ([\d\.*]+))
etc
But no luck, can you please share your th
You can use a regex like this:
(?=5\*14)[\s\S]*?(\d{5}\.\d{2})
Update: you don't even need the lookahead; you can just use:
5\*14[\s\S]*?(\d{5}\.\d{2})
(\d*\.\d\d)(?>\w*)$
will match a group on the last set of DDDDD.DD in the line. You will need to take the contents of the first matching group.
If you have 5*14 before the float number you need to get, you can just use
(?s)\\b5\\*14\\b.*?\\b(\\d+\\.\\d+)
The value will be in Group 1. The pattern above is written with Java string escaping.
Note that 5\*14 can also match inside 145*143, which is why I am using the word boundaries \b. With (?s), .*? matches any characters, including line breaks, but as few as possible. \d+\.\d+ matches a simple decimal number, regardless of how many digits it has.
IDEONE demo:
String str = "120.65UAH Produkti Kvartal\n5*14 14:24\nBal. 16603.52UAH";
Pattern ptrn = Pattern.compile("(?s)\\b5\\*14\\b.*?\\b(\\d+\\.\\d+)");
Matcher matcher = ptrn.matcher(str);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Result: 16603.52
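To see what the (?s) flag and the word boundaries contribute, here is a rough sketch that runs variants of the pattern. The class BalanceDemo and the helper firstGroup are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for comparing pattern variants.
final class BalanceDemo {
    // Returns group 1 of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "120.65UAH Produkti Kvartal\n5*14 14:24\nBal. 16603.52UAH";

        // With (?s) the dot also matches line breaks, so the search reaches the last line.
        System.out.println(firstGroup("(?s)\\b5\\*14\\b.*?\\b(\\d+\\.\\d+)", text)); // 16603.52

        // Without (?s) the dot stops at the end of the "5*14 14:24" line.
        System.out.println(firstGroup("\\b5\\*14\\b.*?\\b(\\d+\\.\\d+)", text)); // null

        // Without \b the literal 5*14 is found inside a longer token.
        System.out.println(firstGroup("(5\\*14)", "145*143")); // 5*14
        System.out.println(firstGroup("\\b(5\\*14)\\b", "145*143")); // null
    }
}
```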
I'm using Matcher to capture groups using a regular expression in Java and it keeps throwing an IllegalStateException even though I know that the expression matches.
This is my code:
String safeName = Pattern.compile("(\\.\\w+)$").matcher("google.ca").group();
I'm expecting safeName to be .ca as captured with the capturing group in the regular expression but instead I get:
IllegalStateException: No match found
I also tried with .group(0) and .group(1) but the same error occurs.
According to the documentation for group() and group(int group):
Capturing groups are indexed from left to right, starting at one. Group zero denotes the entire pattern, so the expression m.group(0) is equivalent to m.group().
What am I doing wrong?
Matcher is a helper class that handles iterating over the data to search for substrings matching the regex. A string may contain many substrings that match, so before any search has run, group() has no way of knowing which match you are interested in. To solve this, Matcher lets you iterate over all matching substrings and then use the parts you need.
So before you can use group(), you need to let the Matcher search your string with find(). To check whether the regex matches the entire string, use matches() instead of find().
Generally to find all matching substrings we are using
Pattern p = Pattern.compile("yourPattern");
Matcher m = p.matcher("yourData");
while(m.find()){
String match = m.group();
//here we can do something with match...
}
Since you are assuming that the text you want occurs only once in your string (at its end), you don't need a loop; a simple if (or the conditional operator) should solve your problem.
Matcher m = Pattern.compile("(\\.\\w+)$").matcher("google.ca");
String safeName = m.find() ? m.group() : null;
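The sketch below shows both behaviours: group() throwing when no match operation has run yet, and the same call succeeding after find(). SuffixDemo and its two helper methods are names invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not taken from the question.
final class SuffixDemo {
    private static final Pattern SUFFIX = Pattern.compile("(\\.\\w+)$");

    // Runs find() first, then reads the captured group; null when there is no suffix.
    static String suffix(String host) {
        Matcher m = SUFFIX.matcher(host);
        return m.find() ? m.group(1) : null;
    }

    // Reproduces the problem: group() on a fresh Matcher has no match to report.
    static boolean groupBeforeFindThrows(String host) {
        try {
            SUFFIX.matcher(host).group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(groupBeforeFindThrows("google.ca")); // true
        System.out.println(suffix("google.ca")); // .ca
        System.out.println(suffix("localhost")); // null
    }
}
```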
Hello I have a question about RegEx. I am currently trying to find a way to grab a substring of any letter followed by any two numbers such as: d09.
I came up with the RegEx ^[a-z]{1}[0-9]{2}$ and ran it on the string
sedfdhajkldsfakdsakvsdfasdfr30.reed.op.1xp0
However, it never finds r30, the code below shows my approach in Java.
Pattern pattern = Pattern.compile("^[a-z]{1}[0-9]{2}$");
Matcher matcher = pattern.matcher("sedfdhajkldsfakdsakvsdfasdfr30.reed.op.1xp0");
if(matcher.matches())
System.out.println(matcher.group(1));
it never prints out anything because matcher never finds the substring (when I run it through the debugger), what am I doing wrong?
There are three errors:
Your expression contains anchors. ^ matches only at the start of the string, and $ only matches at the end. So your regular expression will match "r30" but not "foo_r30_bar". You are searching for a substring so you should remove the anchors.
The matches should be find.
You don't have a group 1 because you have no parentheses in your regular expression. Use group() instead of group(1).
Try this:
Pattern pattern = Pattern.compile("[a-z][0-9]{2}");
Matcher matcher = pattern.matcher("sedfdhajkldsfakdsakvsdfasdfr30.reed.op.1xp0");
if(matcher.find()) {
System.out.println(matcher.group());
}
Matcher Documentation
A matcher is created from a pattern by invoking the pattern's matcher method. Once created, a matcher can be used to perform three different kinds of match operations:
The matches method attempts to match the entire input sequence against the pattern.
The lookingAt method attempts to match the input sequence, starting at the beginning, against the pattern.
The find method scans the input sequence looking for the next subsequence that matches the pattern.
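As a quick illustration of those three operations, here is a sketch using the unanchored pattern [a-z][0-9]{2}. The class MatchKinds and its method names are made up, and each call creates a fresh Matcher so that one operation does not affect the next.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing matches(), lookingAt() and find().
final class MatchKinds {
    private static final Pattern P = Pattern.compile("[a-z][0-9]{2}");

    // True only when the whole input is one letter followed by two digits.
    static boolean whole(String s) { return P.matcher(s).matches(); }

    // True when the input starts with a match; the rest may be anything.
    static boolean prefix(String s) { return P.matcher(s).lookingAt(); }

    // True when a match occurs anywhere in the input.
    static boolean anywhere(String s) { return P.matcher(s).find(); }

    public static void main(String[] args) {
        for (String s : new String[] {"r30", "r30.reed", "xr30.reed"}) {
            System.out.println(s + ": " + whole(s) + " " + prefix(s) + " " + anywhere(s));
        }
        // r30: true true true
        // r30.reed: false true true
        // xr30.reed: false false true

        Matcher m = P.matcher("sedfdhajkldsfakdsakvsdfasdfr30.reed.op.1xp0");
        if (m.find()) {
            System.out.println(m.group()); // r30
        }
    }
}
```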
It doesn't match because ^ and $ mark the start and the end of the string. If you want the match to be anywhere, remove them and you will succeed.
Your regex is anchored, as such it will never match unless the whole input matches your regex. Use [a-z][0-9]{2}.
Don't use .matches() but .find(): .matches() is shamefully misnamed and tries to match the whole input.
How about "[a-z][0-9][0-9]"? That should find all of the substrings that you are looking for.
^[a-z]{1}[0-9]{2}$
sedfdhajkldsfakdsakvsdfasdfr30.reed.op.1xp0
As far as I can read this pattern, it requires the string to start with one lowercase letter, followed by exactly two digits, and then end. That means it only matches strings that are exactly 3 characters long.
If you give more details about your strings, I may be able to help further.
EDIT
If you are sure about the number of dots, then
change this line
Matcher matcher = pattern.matcher("sedfdhajkldsfakdsakvsdfasdfr30.reed.op.1xp0");
to
Matcher matcher = pattern.matcher("sedfdhajkldsfakdsakvsdfasdfr30.reed.op.1xp0".split("\\.")[0]);
Note:
With my solution you should omit the leading ^ from the pattern.
Read this page on splitting strings.
I'm trying to make a regex all or nothing in the sense that the given word must EXACTLY match the regular expression - if not, a match is not found.
For instance, if my regex is:
^[a-zA-Z][a-zA-Z|0-9|_]*
Then I would want to match:
cat9
cat9_
bob_____
But I would NOT want to match:
cat7-
cat******
rango78&&
I want my regex to be as strict as possible, going for an all or nothing approach. How can I go about doing that?
EDIT: To make my regex absolutely clear, a pattern must start with a letter, followed by any number of numbers, letters, or underscores. Other characters are not permitted. Below is the program in question I am using to test out my regex.
Pattern p = Pattern.compile("^[a-zA-Z][a-zA-Z|0-9|_]*");
Scanner in = new Scanner(System.in);
String result = "";
while(!result.equals("-1")){
result = in.nextLine();
Matcher m = p.matcher(result);
if(m.find())
{
System.out.println(result);
}
}
I think that if you use String.matches(regex), then you will get the effect you are looking for. The documentation says that matches() will return true only if the entire string matches the pattern.
The regex won't match the second example. It's already strict, since * and & are not in the allowed set of characters.
It may match a prefix, but you can avoid this by adding '$' to the end of the regex, which explicitly matches end of input. So try,
^[a-zA-Z][a-zA-Z|0-9|_]*$
This will ensure the match is against the entire input string, and not just a prefix.
Note that \w is the same as [A-Za-z0-9_]. And you need to anchor to the end of the string like so:
Pattern p = Pattern.compile("^[a-zA-Z]\\w*$");
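Here is a small sketch that runs the sample words from the question through both variants. IdentifierCheck and its methods are hypothetical names. One thing it also shows: inside a character class, | is a literal pipe rather than alternation, so the original class [a-zA-Z|0-9|_] accepts | as a valid character.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing a prefix search with a full match.
final class IdentifierCheck {
    // Full-match variant; matches() already requires the whole input to fit.
    private static final Pattern IDENT = Pattern.compile("[a-zA-Z]\\w*");
    // Pattern from the question, used with find() as in the original program.
    private static final Pattern ORIGINAL = Pattern.compile("^[a-zA-Z][a-zA-Z|0-9|_]*");

    static boolean isIdentifier(String word) {
        return IDENT.matcher(word).matches();
    }

    static boolean originalFinds(String word) {
        return ORIGINAL.matcher(word).find();
    }

    public static void main(String[] args) {
        String[] words = {"cat9", "cat9_", "bob_____", "cat7-", "cat******", "rango78&&"};
        for (String word : words) {
            // Prints the word, the prefix-search result, then the full-match result.
            System.out.println(word + " " + originalFinds(word) + " " + isIdentifier(word));
        }
        // The pipe is an ordinary character inside [...].
        System.out.println(Pattern.matches("[a-zA-Z|0-9|_]+", "a|b")); // true
        System.out.println(isIdentifier("a|b")); // false
    }
}
```

With find(), all six words report true because each starts with a valid prefix; with matches(), only the first three do.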