Java pattern match for cisco configuration line - java

I have a Cisco configuration as plain text.
The hostname line I need to match is "125-hostname billdevice".
I am using the pattern below, but it does not match.
Pattern ciscohostname = Pattern.compile("^[0-9999999]-hostname");
Matcher matcherx = ciscohostname.matcher(BlockIndexList.get(k).toString());
How can I match this line?

What you want is
"^[0-9]+-hostname"
This means:
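Match if the string starts with at least one digit (a character in the range [0-9]) followed by the string "-hostname". Because the line continues after "-hostname", test it with find() rather than matches(), which would require the whole line to match.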
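A quick way to check this in Java is sketched below. The class and method names are made up for illustration; the input line is the one from the question. It also shows why the original character class fails: [0-9999999] matches exactly one digit.

```java
import java.util.regex.Pattern;

class HostnameLineCheck {
    // Pattern from the question: a character class, so it matches exactly ONE digit.
    static final Pattern BROKEN = Pattern.compile("^[0-9999999]-hostname");
    // Suggested fix: one or more digits before "-hostname".
    static final Pattern FIXED = Pattern.compile("^[0-9]+-hostname");

    // find() only needs the pattern to occur (here: at the start); the rest of the line is ignored.
    static boolean startsWithHostname(Pattern pattern, String line) {
        return pattern.matcher(line).find();
    }

    public static void main(String[] args) {
        String line = "125-hostname billdevice";
        System.out.println(startsWithHostname(BROKEN, line)); // false
        System.out.println(startsWithHostname(FIXED, line));  // true
        // matches() must consume the whole line, so the trailing " billdevice" makes it fail.
        System.out.println(FIXED.matcher(line).matches());    // false
    }
}
```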

Since your code suggests a range of 0 to 9999999, you can use this regex:
^[0-9]{1,7}-hostname
This ensures that only numbers of 1 to 7 digits match; anything longer is rejected.
0-hostname billdevice //match
9999999-hostname billdevice //match
10000000-hostname billdevice //no match
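The three cases above can be reproduced with a small sketch like this one. The class name and helper are hypothetical, and find() is assumed because the line continues after "-hostname".

```java
import java.util.regex.Pattern;

class BoundedHostnameCheck {
    // 1 to 7 digits, anchored at the start of the line, then "-hostname".
    private static final Pattern HOSTNAME_LINE = Pattern.compile("^[0-9]{1,7}-hostname");

    static boolean isHostnameLine(String line) {
        return HOSTNAME_LINE.matcher(line).find();
    }

    public static void main(String[] args) {
        String[] lines = {
            "0-hostname billdevice",
            "9999999-hostname billdevice",
            "10000000-hostname billdevice"
        };
        for (String line : lines) {
            System.out.println(line + " -> " + isHostnameLine(line));
        }
    }
}
```

The eight-digit line is rejected because the ^ anchor leaves the engine no other starting position to try.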

Related

Java regex to detect semver strings is failing without qualifiers

I am trying to get a Java method to validate whether or not a String argument is a properly-formatted "semver" (semantic versioning) version string.
In my app, semver strings must be of the form:
<major>.<minor>.<patch>-<qualifier>
Where:
<major> is a positive integer (1+)
<minor> and <patch> are both non-negative integers (0+)
<qualifier> is an alphanumeric string ([0-9a-zA-Z]+)
Valid examples:
1.2.40
1.0.0-SNAPSHOT
2.0.45-RC
3.10.0
My best attempt thus far:
public boolean isSemVer(String version) {
Pattern versionPattern = Pattern.compile("^[a-zA-Z-]+\\d+\\.\\d+\\.\\d+");
Matcher matcher = versionPattern.matcher(version);
return matcher.matches();
}
Produces false for the first valid example of 1.2.40. Can anyone tell me where I'm going awry and what I need to tweak in my regex to get it to accept my use cases? Thanks in advance!
Your valid strings start with digits and not with letters, so [a-zA-Z-]+ in your pattern already makes the pattern wrong.
Use
^[1-9]\d*\.\d+\.\d+(?:-[a-zA-Z0-9]+)?$
Details
^ - start of string
[1-9]\d* - a digit from 1 to 9 and then 0 or more digits
\.\d+\.\d+ - two occurrences of . and 1+ digits (can be written as (?:\.\d+){2})
(?:-[a-zA-Z0-9]+)? - an optional occurrence of - and 1+ alphanumeric chars ([a-zA-Z0-9] can be written as \p{Alnum})
$ - end of string.
In Java, use with .matches():
public boolean isSemVer(String version) {
Pattern versionPattern = Pattern.compile("[1-9]\\d*\\.\\d+\\.\\d+(?:-[a-zA-Z0-9]+)?");
Matcher matcher = versionPattern.matcher(version);
return matcher.matches();
}
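To see how this pattern behaves on the examples from the question, a small harness such as the following can be used. The class name and the invalid inputs are illustrative additions, not part of the original answer.

```java
import java.util.regex.Pattern;

class SemVerCheck {
    // Same pattern as above; matches() already anchors at both ends, so ^ and $ are omitted.
    private static final Pattern VERSION =
            Pattern.compile("[1-9]\\d*\\.\\d+\\.\\d+(?:-[a-zA-Z0-9]+)?");

    static boolean isSemVer(String version) {
        return VERSION.matcher(version).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "1.2.40", "1.0.0-SNAPSHOT", "2.0.45-RC", "3.10.0", // valid per the question
            "0.1.0",   // major must be 1 or greater
            "1.2",     // patch is missing
            "1.2.3-"   // qualifier must not be empty
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isSemVer(s));
        }
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call.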
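You can also try the regex suggested in the SemVer FAQ. It is broader than the rules in the question: it accepts 0 as the major version, dot-separated pre-release identifiers, and +build metadata. In a Java string literal, double every backslash.
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$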

Regex match only if text contains something before

Given the following text
KEYWORD This is a test
We want to match the following groups 1:YES 2:YES 3:YES
I want to match with "1:YES", "2:YES" and "3:YES" using
((\d):YES)
If and only if the first word in the complete text is "KEYWORD"
Given this test:
This is a test
We want to match the following groups 1:YES 2:YES 3:YES
No matches should be found
Java (as with most regex engines) doesn't support unbounded length look behinds, however there is a work-around!
String str = "KEYWORD This is a test\n" +
"We want to match the following groups 1:YES 2:YES 3:YES";
Matcher matcher = Pattern.compile("(?s)(?<=\\AKEYWORD\\b.{1,99999})(\\d+:YES)").matcher(str);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Which outputs:
1:YES
2:YES
3:YES
The trick here is the look-behind (?<=\\AKEYWORD\\b.{1,99999}), which has a large (but not unbounded) length. (?s) sets the DOTALL flag (dot matches newlines too), and \A means start of input. \A is safer than ^ here because ^ would also match at the start of every line if the MULTILINE flag were enabled; DOTALL alone does not change what ^ matches.
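How the two anchors react to the flags can be checked with a short sketch. The class name, helper, and sample text are made up for the demonstration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorFlagsDemo {
    // Counts how many times the regex is found in the text.
    static int countMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "KEYWORD a\nKEYWORD b";
        // DOTALL only changes what "." matches; ^ still means start of input.
        System.out.println(countMatches("(?s)^KEYWORD", text));    // 1
        // MULTILINE makes ^ match after every line terminator as well.
        System.out.println(countMatches("(?m)^KEYWORD", text));    // 2
        // \A ignores MULTILINE and matches only at the very start.
        System.out.println(countMatches("(?m)\\AKEYWORD", text));  // 1
    }
}
```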
Without resorting to look-behind tricks in Java, you can capture the \d+:YES\b strings using \G. \G makes a match start where the previous match ended; on the first attempt it matches the beginning of the string, the same as \A.
We need only its first capability:
(?:\AKEYWORD|\G(?!\A))[\s\S]*?(\d+:YES\b)
Breakdown:
(?: Start of non-capturing group
\A Match beginning of subject string
KEYWORD Match keyword
| Or
\G(?!\A) Continue from where previous match ends
) End of NCG
[\s\S]*? Match anything else un-greedily
(\d+:YES\b) Match and capture our desired part
Java code:
Pattern p = Pattern.compile("(?:\\AKEYWORD|\\G(?!\\A))[\\s\\S]*?(\\d+:YES\\b)");
Matcher m = p.matcher(string);
while (m.find()) {
System.out.println(m.group(1));
}
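Wrapped in a method, the \G approach can be tried on both texts from the question. This is a sketch: the class and method names are hypothetical, and the third input (KEYWORD not in first position) is an extra case added for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeywordScanner {
    // First match must start at \AKEYWORD; later matches continue from \G (previous match end).
    private static final Pattern FLAGS =
            Pattern.compile("(?:\\AKEYWORD|\\G(?!\\A))[\\s\\S]*?(\\d+:YES\\b)");

    static List<String> findFlags(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = FLAGS.matcher(text);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String tail = "This is a test\nWe want to match the following groups 1:YES 2:YES 3:YES";
        System.out.println(findFlags("KEYWORD " + tail));       // [1:YES, 2:YES, 3:YES]
        System.out.println(findFlags(tail));                    // []
        System.out.println(findFlags("Intro KEYWORD " + tail)); // []
    }
}
```

Without KEYWORD at the very start, the first alternative never matches, and \G(?!\A) cannot start a chain on its own, so nothing is found.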

Regex starts with "ATG", ends with "TAG, TAA or TGA" but does not contain "ATG" and "TAG, TAA or TGA" in between

I'm searching a String for patterns that start with ATG, end with TAG, TAA or TGA, and whose length is a multiple of 3. ATG may appear only at the beginning, and TAG, TAA or TGA only at the end. Which means:
From ATGTTGTGATGT extract ATGTTGTGA
From ATGATGTTGTGATGT extract ATGTTGTGA
Currently I'm using regex (ATG)([ATG]{3})+?(TAG|TAA|TGA).
For ATGATGTTGTGATGT this gets me the wrong result ATGATGTTGTGA.
I've tried:
(^ATG)(!?=.*ATG)([ATG]{3})+?(TAG|TAA|TGA)
(^ATG)(!?=(ATG)+)([ATG]{3})+?(TAG|TAA|TGA)
How to tell it to contain ATG only once in the beginning and no more after that?
You may use
ATG(?:(?!ATG)[ATG]{3})*?(?:TAG|TAA|TGA)
Details
ATG - an ATG substring
(?:(?!ATG)[ATG]{3})*? - a tempered greedy token matching, zero or more times but as few as possible, any sequence of 3 chars from the [ATG] character set that is not equal to ATG (enforced by the negative lookahead (?!ATG))
(?:TAG|TAA|TGA) - either of the three alternatives defined in the non-capturing group: TAG, TAA or TGA.
Java demo:
String rx = "ATG(?:(?!ATG)[ATG]{3})*?(?:TAG|TAA|TGA)";
String s = "ATGTTGTGATGT, ATGATGTTGTGATGT, ATGATGTTGTGATGT";
Pattern pattern = Pattern.compile(rx);
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
System.out.println(matcher.group(0));
}
Result:
ATGTTGTGA
ATGTTGTGA
ATGTTGTGA

Change group using regex java

I need help with a regular expression in Java.
I need to replace a group in a string:
Example:
Input:
=sum($var1;2) or =if($result<10;"little";"big") ...
Need Output:
=sum(teste;2) or =if(teste<10;"little";"big") ...
Code I have:
Pattern p = Pattern.compile("(\\.*)(\\$\\w)(\\.*)");
Matcher m = p.matcher(total);
if (m.find()) {
System.out.println(m.replaceAll("$2teste"));
}
Output I have:
=sum($vtestear1;2)
=if($r testeesultado<5;"maior";"menor")
Why match everything when all you need is to match variable tokens?
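Pattern p = Pattern.compile("\\$[a-z0-9]+\\b");
String result = p.matcher(total).replaceAll("teste");
Change the [a-z0-9] part if you can have more than lowercase ASCII letters and digits. There is no \\b in front of \\$ because $ is not a word character: between ( and $ there is no word boundary, so a leading \\b would prevent the match.
Also, you don't need to test for .find() or anything before calling .replaceAll(): no match means nothing will be replaced.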
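A runnable sketch of this replacement on the two inputs from the question follows. The class and method names are hypothetical. The pattern has no \b in front of the dollar sign, because $ and ( are both non-word characters and no word boundary exists between them. Matcher.quoteReplacement is added as a precaution so that a replacement containing $ or \ is taken literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VariableReplacer {
    // A literal "$" followed by lowercase letters/digits, ending at a word boundary.
    private static final Pattern VARIABLE = Pattern.compile("\\$[a-z0-9]+\\b");

    static String replaceVariables(String formula, String replacement) {
        // quoteReplacement keeps "$" and "\" in the replacement from being interpreted.
        return VARIABLE.matcher(formula).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceVariables("=sum($var1;2)", "teste"));
        // prints: =sum(teste;2)
        System.out.println(replaceVariables("=if($result<10;\"little\";\"big\")", "teste"));
        // prints: =if(teste<10;"little";"big")
    }
}
```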

Regex for removing part of a line if it is preceded by some word in Java

There's a properties language bundle file:
label.username=Username:
label.tooltip_html=Please enter your username.</center></html>
label.password=Password:
label.tooltip_html=Please enter your password.</center></html>
How can I match all lines that contain both "_html" and "</center></html>", in that order, and replace them with the same line minus the trailing "</center></html>"? For example, the line:
label.tooltip_html=Please enter your username.</center></html>
should become:
label.tooltip_html=Please enter your username.
Note: I would like to do this replacement using an IDE (IntelliJ IDEA, Eclipse, NetBeans...)
Since you clarified that this regex is to be used in the IDE, I tested this in Eclipse and it works:
FIND:
(_html.*)</center></html>
REPLACE WITH:
$1
Make sure you turn on the Regular expressions switch in the Find/Replace dialog. This will match any string that contains _html.* (where the .* greedily matches any string not containing newlines), followed by </center></html>. It uses (…) brackets to capture what was matched into group 1, and $1 in the replacement substitutes in what group 1 captured.
This effectively removes </center></html> if that string is preceded by _html in that line.
If there can be multiple </center></html> in a line, and they are all to be removed if there's a _html to their left, then the regex will be more complicated, but it can be done in one regex with the \G continuing anchor if absolutely necessary.
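One way such a \G-based pattern could look is sketched below. This is an assumption about what is meant, not a regex given in the answer, and the class name is hypothetical. Group 1 is either "_html" (first hit on a line) or empty (when continuing from the previous match via \G); group 2 is the text kept in front of each closing-tag pair.

```java
import java.util.regex.Pattern;

class ClosingTagStripper {
    // (?!\A) stops \G from matching at the very start, where no "_html" has been seen yet.
    private static final Pattern TAGS_AFTER_HTML_KEY =
            Pattern.compile("(_html|\\G(?!\\A))(.*?)</center></html>");

    static String strip(String text) {
        // Re-emit everything except the closing tags themselves.
        return TAGS_AFTER_HTML_KEY.matcher(text).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        System.out.println(strip("label.tooltip_html=a</center></html>b</center></html>"));
        // prints: label.tooltip_html=ab
        System.out.println(strip("label.username=Username:</center></html>"));
        // prints the line unchanged, because it has no "_html"
    }
}
```

Because the dot does not match line terminators, a chain started on one line cannot continue onto the next.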
Variations
Speaking more generally, you can also match things like this:
(delete)this part only(please)
This now creates 2 capturing groups. You can match strings with this pattern and replace with $1$2, and it will effectively delete this part only, but only if it's preceded by delete and followed by please. These subpatterns can be more complicated, of course.
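In Java the same two-group idea might look like the sketch below; the class name is made up, and the literal words are the ones from the example pattern.

```java
class MiddleRemover {
    // Keeps group 1 ("delete") and group 2 ("please"), dropping the text between them.
    static String removeMiddle(String input) {
        return input.replaceAll("(delete)this part only(please)", "$1$2");
    }

    public static void main(String[] args) {
        System.out.println(removeMiddle("deletethis part onlyplease")); // deleteplease
        System.out.println(removeMiddle("keep this part only"));        // unchanged
    }
}
```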
if (line.contains("_html=")) {
line = line.replace("</center></html>", "");
}
No regex needed here ;) as long as all lines of the properties file are well formed.
String s = "label.tooltip_html=Please enter your password.</center></html>";
Pattern p = Pattern.compile("(_html.*)</center></html>");
Matcher m = p.matcher(s);
System.out.println(m.replaceAll("$1"));
Try something like this:
Pattern p = Pattern.compile(".*(_html).*</center></html>");
Matcher m = p.matcher(input_line); // get a matcher object
String output = input_line;
if (m.matches()) {
output = input_line.replace("</center></html>", ""); // assign to the existing variable; redeclaring it does not compile
}
/^(.*)<\/center><\/html>/
captures the
label.tooltip_html=Please enter your username.
part in group 1. You can then rebuild the line from that group.
