Java String RegularExpressions

Team,
I have a task: I want to check whether a block of data contains a percentage such as 98%.
I am trying to write a regex for this, but it keeps failing.
String str="OAM-2 OMFUL abmasc01 and prdrot01 98% users NB in host nus918pe locked.";
if(str.matches("[0-9][0-9]%"))
but it returns false.
Any help is truly appreciated.

Use Pattern and Matcher with find(). String.matches() applies the regex to the whole string, whereas find() looks for a match anywhere in it.
Pattern pattern = Pattern.compile("[0-9]{2}%");
String test = "OAM-2 OMFUL abmasc01 and prdrot01 98% users NB in host nus918pe locked.";
Matcher matcher = pattern.matcher(test);
if (matcher.find()) {
    System.out.println("Matched!");
}

Try:
str.matches(".*[0-9][0-9]%.*")
or (\d = digit):
str.matches(".*\\d\\d%.*")
matches() has to match the whole string, so the pattern must also consume the characters that come before and after the 98%, which is why you should add the .* on both sides.
Comment:
You can use Pattern and Matcher as the others suggested; that is especially effective if you want to extract 98% from the string. If you only need to know whether there is a match, I find .matches() simpler to use.

str.matches("[0-9][0-9]%") actually applies this regex ^[0-9][0-9]%$, which is anchored at start and end. Others have described solutions to this already.
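To make the anchoring concrete, here is a minimal sketch that runs the same pattern through both Matcher.matches() and Matcher.find(). The class and method names are made up for illustration; only the pattern and the sample string come from the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: contrasts whole-string matching with substring search.
class MatchesVsFind {
    private static final Pattern PERCENT = Pattern.compile("[0-9][0-9]%");

    // True only if the entire input is two digits followed by '%'.
    static boolean wholeStringMatches(String input) {
        return PERCENT.matcher(input).matches();
    }

    // True if two digits followed by '%' occur anywhere in the input.
    static boolean containsMatch(String input) {
        return PERCENT.matcher(input).find();
    }

    public static void main(String[] args) {
        String str = "OAM-2 OMFUL abmasc01 and prdrot01 98% users NB in host nus918pe locked.";
        System.out.println(wholeStringMatches(str));   // false
        System.out.println(containsMatch(str));        // true
        System.out.println(wholeStringMatches("98%")); // true
    }
}
```

String.matches(regex) behaves like the first method, which is why the original call returned false.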

You can try the regex \d{1,2}(\.\d{0,2})?% which matches 98% as well as percentages with decimal places, such as 98.56%.
Pattern pattern = Pattern.compile("\\d{1,2}(\\.\\d{0,2})?%");
String yourString= "OAM-2 OMFUL abmasc01 and prdrot01 98% users NB in host nus918pe locked.";
Matcher matcher = pattern.matcher(yourString);
while (matcher.find()) {
    System.out.println(matcher.group());
}

Related

Pattern matching using Regex in Java

I have a stream from which I read a string that looks like the following:
event.tag.report tag_id=0xABCD0029605, type=ISOB_80K, antenna=1, frequency=918250, rssi=-471, tx_power=330, time=2017-12-18T19:44:07.198
                        ^^^^^^^^^^^^^
I am trying to use a regex to get just the highlighted part (the tag_id value, underlined by ^^^^) from every string that I read. My pattern is as follows:
.*\\s(tag_id=)(.{38})(\\,\\s)(.*)$
However, this does not work for tag_ids that are longer or shorter than 38 characters.
Can someone help me with a pattern that extracts just the highlighted part of the string, regardless of its length?

Looks to me as though you want all hexadecimal characters:
"tag_id=(0x[A-F0-9]+)"
So
Pattern pattern = Pattern.compile("tag_id=(0x[A-F0-9]+)");
Matcher matcher = pattern.matcher("event.tag.report tag_id=0x313532384D3135374333343435393031, type=ISOC");
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
returns:
0x313532384D3135374333343435393031

Regular expression for hgsv notation in java

HGSV nomenclature has a pattern:
xxxxx.yyyy:charactersnumbercharacters
I would like to write a regex in Java that fetches all the tokens from the pattern above. For example,
it should produce 5 tokens:
{ 'xxxxx', 'yyyy', 'characters', 'number' , 'characters'}
I have used a simple split approach to fetch the tokens, but I don't think it is an optimal solution.
My current code is:
String hgsv = "BRAF.p:V600E";
String[] tokens = hgsv.split("\\."); // split() takes a regex, so the dot must be escaped
this.symbol = tokens[0];
String type = tokens[1].split(":")[0];
I would like to use Pattern and Matcher in Java, but I have no idea how to write a regex for the tokens above.
Any clue how to do that?
(I will need a regex anyway to separate the characters, number, and characters, so why not use a regex for the entire string?)
I found a link, but it is in Python; I need something similar in Java.

I think what you're probably looking for is to use capture groups, like this:
String s = "BRAF.p:V600E";
Pattern p = Pattern.compile("(\\w+)\\.(\\w+):([a-zA-Z]+)(\\d+)([a-zA-Z]+)");
Matcher m = p.matcher(s);
if (m.matches()) {
    String[] parts = {m.group(1),
                      m.group(2),
                      m.group(3),
                      m.group(4),
                      m.group(5)};
    // Prints "[BRAF, p, V, 600, E]"
    System.out.println(Arrays.toString(parts));
} else {
    // The input String is invalid.
}
That's really just a lot like a split, but it's more stable because you're using the pattern to validate the String beforehand.
Note that I have no idea if that is the exact right pattern that you should be using. I don't know the exact details of the HGSV notation you're talking about and your description is actually pretty vague. (What are e.g. xxxxx and yyyy? What are "characters"?) If you link me to some sort of specification or detailed description of this notation I can try to write a regex that's more definitely correct.
Anyhow, my example shows the basic idea. You might also see http://www.regular-expressions.info/brackets.html for more information.
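If the numbered groups get hard to read, the same idea can be sketched with named capturing groups (available since Java 7). The group names below (symbol, type, from, position, to) and the class name are my own guesses at what the five tokens mean, not part of any HGSV specification, so treat this as an illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser: same pattern as above, but with named groups.
class HgsvTokens {
    // Group names are assumptions chosen for readability.
    private static final Pattern HGSV = Pattern.compile(
        "(?<symbol>\\w+)\\.(?<type>\\w+):(?<from>[a-zA-Z]+)(?<position>\\d+)(?<to>[a-zA-Z]+)");

    // Returns the five tokens in order, or throws if the input does not match.
    static Map<String, String> parse(String input) {
        Matcher m = HGSV.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not in the expected notation: " + input);
        }
        Map<String, String> tokens = new LinkedHashMap<>();
        for (String name : new String[] {"symbol", "type", "from", "position", "to"}) {
            tokens.put(name, m.group(name));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints "{symbol=BRAF, type=p, from=V, position=600, to=E}"
        System.out.println(parse("BRAF.p:V600E"));
    }
}
```

Named groups cost nothing extra at match time and keep the calling code readable if the pattern is later reordered.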

Regex matching in online tester but not in JAVA

I'm trying to extract the text BetClic from this string popup_siteinfo(this, '/click/betclic', '373', 'BetClic', '60€');
I wrote a simple regex that works on Regex Tester but that doesn't work on Java.
Here's the regex
'\d+', '(.*?)'
here's Java output
Exception in thread "main" java.lang.IllegalStateException: No match found
at java.util.regex.Matcher.group(Matcher.java:485)
at javaapplication1.JavaApplication1.main(JavaApplication1.java:74)
Java Result: 1
and here's my code
Pattern pattern = Pattern.compile("'\\d+', '(.*?)'");
Matcher matcher = pattern.matcher(onMouseOver);
System.out.print(matcher.group(1));
where the onMouseOver string is popup_siteinfo(this, '/click/betclic', '373', 'BetClic', '60€');
I'm not an expert with regex, but I'm quite sure that mine isn't wrong at all!
Suggestions?

You need to call find() before group(...):
Pattern pattern = Pattern.compile("'\\d+', '(.*?)'");
Matcher matcher = pattern.matcher(onMouseOver);
if (matcher.find()) {
    System.out.print(matcher.group(1));
} else {
    System.out.print("no match");
}

You're calling group(1) without first calling a matching operation (such as find()), which is the cause of the IllegalStateException.
If you only need the captured group inside a replacement, the explicit find() isn't needed: you can refer to the group as $1, since replaceAll() performs the matching operation itself.
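As a rough sketch of the $1 route: the pattern from the question is wrapped in .* on both sides so that replaceAll() swaps the whole input for the captured group. The class and method names are hypothetical, and the wrapping .* is my addition, not part of the original regex.

```java
// Hypothetical helper: uses a $1 group reference instead of find()/group(1).
class GroupReference {
    // Replaces the entire input with the text captured by group 1.
    // If the pattern does not match, replaceAll() returns the input unchanged.
    static String extractName(String onMouseOver) {
        return onMouseOver.replaceAll(".*'\\d+', '(.*?)'.*", "$1");
    }

    public static void main(String[] args) {
        String onMouseOver =
            "popup_siteinfo(this, '/click/betclic', '373', 'BetClic', '60\u20AC');";
        System.out.println(extractName(onMouseOver)); // BetClic
    }
}
```

Note the trade-off: with find() you can tell "no match" apart from a result, while the replaceAll() version silently hands back the original string.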

Negating a Regular Expression for string replacement

I have the following code that can replace the email address in a String in Java:
addressStr.replaceFirst("([a-zA-Z0-9_\\-\\.]+)@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.)|(([a-zA-Z0-9\\-]+\\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})", "")
So a string containing John Smith <john@smith.com> would become John Smith <>. How do I negate the regex so that it instead removes everything that doesn't match the email address, leaving just john@smith.com as the final result?
I tried to put in the ^ and ?<= at the front but it doesn't work.

Well, it's not the regex you need to change but the calling code. Your regex matches the e-mail address (in a weird way), and replaceFirst() removes it from the string.
So just use
Pattern regex = Pattern.compile("([a-zA-Z0-9_\\-\\.]+)@((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.)|(([a-zA-Z0-9\\-]+\\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})");
Matcher regexMatcher = regex.matcher(addressStr);
if (regexMatcher.find()) {
    address = regexMatcher.group();
}

The complete Java regex for catching e-mails would be as follows:
"(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])"
Take a look at https://www.rfc-editor.org/rfc/rfc2822#section-3.4.1 for more info on this.
It is a bit complicated, but it accepts all known valid e-mail formats (yours does not allow addresses like bob+bib@gmail.com, which are valid).
For your problem, as stated multiple times, just use find() (borrowing Tim Pietzcker's piece of code):
Pattern regex = Pattern.compile("(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])");
Matcher regexMatcher = regex.matcher(addressStr);
foundMatch = regexMatcher.find();

You can try:
Matcher matcher = Pattern.compile(regexp).matcher(addressStr);
String mailId = matcher.find() ? matcher.group() : null; // group() is only valid after a successful find()
The idea here is to get the matched string rather than trying to replace everything else with blank. You can extract the pattern into a field if this operation is repetitive.
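Here is one way the "pattern in a field" suggestion might look. The e-mail pattern is deliberately simplified and is my own assumption for the sketch (it is not the regex from the question and is nowhere near RFC-complete); the class and method names are hypothetical too.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: compiles the pattern once and extracts the first match.
class EmailExtractor {
    // Simplified pattern, assumed for illustration only.
    private static final Pattern EMAIL =
        Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Returns the first e-mail-like substring, or an empty Optional if none is found.
    static Optional<String> firstEmail(String text) {
        Matcher m = EMAIL.matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstEmail("John Smith <john@smith.com>").orElse("none")); // john@smith.com
        System.out.println(firstEmail("John Smith").orElse("none"));                  // none
    }
}
```

Pattern instances are immutable and thread-safe, so a static final field is a reasonable home for one that is used repeatedly.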

Just don't replace; match and extract instead.

Getting value of $1 from matcher.replaceAll()

In my application I need to get the link and break it if it is longer than, say, 10 characters.
The problem is that if I send the whole text, for example "this is my website www.stackoverflow.com", directly to this matcher
Pattern patt = Pattern.compile("(?i)\\b((?:https?://|www\\d{0,3}[.]|[a-z0-9.\\-]+[.][a-z]{2,4}/)(?:[^\\s()<>]+|\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\))+(?:\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\)|[^\\s`!()\\[\\]{};:\'\".,<>?«»“”‘’]))");
Matcher matcher = patt.matcher(text);
matcher.replaceAll("$1");
it shows the whole website address, without breaking it.
What I am trying to do is get the value of $1, so that I can break the second one while keeping the first one intact.
I already have another method to break the string up.
UPDATE
What I want to get is only the website address, so that I can break it afterwards. It would help me a lot.

You can't use replaceAll; you should iterate through the matches and process each one individually. Java's Matcher already has an API for this:
// expanding on the example in the 'appendReplacement' JavaDoc:
Pattern p = Pattern.compile("..."); // your URL regexp
Matcher m = p.matcher(text);
StringBuffer sb = new StringBuffer();
while (m.find()) {
    // keep the first 10 characters of the URL and append an ellipsis
    String truncatedURL = m.group(1).replaceFirst("^(.{10}).*", "$1...");
    m.appendReplacement(sb,
        "<a href=\"http://$1\" target=\"_blank\">"); // simple replacement for $1
    sb.append(truncatedURL);
    sb.append("</a>");
}
m.appendTail(sb);
System.out.println(sb.toString());
(For performance, you should factor out compiled Patterns for the replace* calls inside the loop.)
Edit: use sb.append() so as not to worry about escaping $ and \ in 'truncatedURL'.
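Putting both remarks together, a self-contained sketch could look like the following. The URL pattern here is a much simpler stand-in that I made up (it only recognises www.-prefixed links), and the class name, method names, and the 10-character limit are assumptions for illustration. Matcher.quoteReplacement() is used as an alternative way to avoid the $ and \ escaping problem.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: wraps each link in an anchor tag and shortens the visible text.
class LinkShortener {
    // Simplified URL pattern, assumed for this sketch; compiled once, outside the loop.
    private static final Pattern URL = Pattern.compile("\\bwww\\.[^\\s<>]+");
    private static final int MAX_LENGTH = 10;

    // Keeps at most MAX_LENGTH characters and marks the cut with an ellipsis.
    static String shorten(String url) {
        return url.length() <= MAX_LENGTH ? url : url.substring(0, MAX_LENGTH) + "...";
    }

    static String linkify(String text) {
        Matcher m = URL.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String url = m.group();
            String anchor = "<a href=\"http://" + url + "\" target=\"_blank\">"
                + shorten(url) + "</a>";
            // quoteReplacement() makes any '$' or '\' in the URL be inserted literally.
            m.appendReplacement(sb, Matcher.quoteReplacement(anchor));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: this is my website <a href="http://www.stackoverflow.com" target="_blank">www.stacko...</a>
        System.out.println(linkify("this is my website www.stackoverflow.com"));
    }
}
```

The full link stays in the href while only the displayed text is shortened, which seems to be what the question is after.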

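I think you have a problem similar to the one in this question:
Java : replacing text URL with clickable HTML link
They suggested something like this (adapted here, because Java does not support POSIX bracket classes such as [:space:] inside a character class; use \p{Space} and \p{Alnum} instead):
String basicUrlRegex = "(\\w+://[^<>\\p{Space}]+[\\p{Alnum}/])";
String linked = myString.replaceAll(basicUrlRegex, "<a href=\"$1\">$1</a>");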
