I have a regex [a-zA-Z0-9]\\.(.*) to match:
[any character, any digit] followed by a dot and then followed by anything. For example e1.abc, r11.xyz, etc.
This works fine. However, there is one exception: the string e.abc should not match, i.e. only the prefix e. should be excluded.
How do I modify my regex to handle this specific exclusion?
You can modify your regex by adding a negative lookahead assertion before the character that precedes the dot. The lookahead ensures that this character is not e. Here is the pattern:
.*(?!e)[a-zA-Z0-9]\.(.*)
Sample code:
String match = "a.abc";
if (match.matches(".*(?!e)[a-zA-Z0-9]\\.(.*)")) {
    System.out.println("match");
}
String noMatch = "e.abc";
// e.abc is rejected by the lookahead, so matches() returns false here
if (!noMatch.matches(".*(?!e)[a-zA-Z0-9]\\.(.*)")) {
    System.out.println("no match");
}
Note that I assume that there is only one dot in your string. If not, then this answer would need to change.
Demo here:
Rextester
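As a rough alternative sketch, the lookahead can be tied to the exact prefix e. rather than to a single character. This assumes the intent is "one or more alphanumerics, a dot, then anything, unless the part before the dot is exactly e"; the + quantifier is an assumption based on the examples e1.abc and r11.xyz, and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

public class DotPrefixDemo {
    // (?!e\.) rejects only the exact prefix "e."; "e1." or "ae." still pass.
    private static final Pattern ALLOWED =
            Pattern.compile("(?!e\\.)[a-zA-Z0-9]+\\.(.*)");

    static boolean isAllowed(String s) {
        // matches() requires the whole string to match the pattern
        return ALLOWED.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"e1.abc", "r11.xyz", "a.abc", "e.abc"}) {
            System.out.println(s + " -> " + isAllowed(s));
        }
    }
}
```

With this variant a string such as ae.abc is still accepted, because the lookahead only fires when the string starts with e followed directly by the dot.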
Just get all the matches using your current regex, then add an if statement as follows:
String test = "e.this is test";
if (!test.startsWith("e.")) {
    // Do something
}
I'm looking for a way to match a list of parameters that include some predefined characters and some variable characters using Java's String#matches method. For instance:
Possible Parameter 1: abc;[variable lowercase letters with maybe an underscore]
Possible Parameter 2: cde;[variable lowercase letters with maybe an underscore]
Possible Parameter 3: g;4
Example 1: abc;erga_sd,cde;dfgef,g;4
Example 2: g;4,abc;dsfaweg
Example 3: cde;df_ger
Each of the parameters would be comma-separated but they can come in any order and include 1, 2, and/or 3 (no duplicates)
This is the regex I have so far that partially works:
(abc;[a-z_,]+){0,1}|(cde;[a-z,]+){0,1}|(g;4,){0,1}
The problem is that it also finds something like this valid: abc;dsfg,dfvser where the beginning of the string after the comma does not start with a valid abc; or cde; or g;4
As you said:
The problem is that it also finds something like this valid:
abc;dsfg,dfvser where the beginning of the string after the comma does
not start with a valid abc; or cde; or g;4
Therefore, in a valid entry every element after a comma must itself follow one of the patterns. You can split each input on the delimiter ",", apply the regex to each element, and then combine the per-element results to get the result for the whole input line.
Your regex should be:
(abc;[a-z_]+)|(cde;[a-z_]+)|(g;4)
Each valid element produced by splitting the input line will match exactly one of these three alternatives, as described in your post.
Here's the code:
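// Assumes inputList is the comma-separated input String; needs java.util.regex.Pattern.
String regex = "(abc;[a-z_]+)|(cde;[a-z_]+)|(g;4)";
boolean finalResult = true;
for (String input : inputList.split(",")) {
    // every element must match one of the three alternatives in full
    finalResult = finalResult && Pattern.matches(regex, input);
}
System.out.println(finalResult);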
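The split approach can also be extended to reject duplicates, which the question asks for. The following is only a sketch: the class name, the method name and the idea of tracking the text before the ; in a set are illustrative choices, not part of the answer above.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

public class ParamListValidator {
    private static final Pattern ELEMENT =
            Pattern.compile("(abc;[a-z_]+)|(cde;[a-z_]+)|(g;4)");

    static boolean isValid(String line) {
        Set<String> seenKeys = new HashSet<>();
        // limit -1 keeps trailing empty strings, so "g;4," is rejected
        for (String element : line.split(",", -1)) {
            if (!ELEMENT.matcher(element).matches()) {
                return false;
            }
            // the key is the text before the ';' (abc, cde or g)
            String key = element.substring(0, element.indexOf(';'));
            if (!seenKeys.add(key)) {
                return false; // same parameter given twice
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid("abc;erga_sd,cde;dfgef,g;4"));
        System.out.println(isValid("abc;dsfg,dfvser"));
    }
}
```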
If you want to use matches, then the whole string has to match.
^(?:(?:abc|cde);[a-z_]+|g;4)(?:,(?:(?:abc|cde);[a-z_]+|g;4))*$
Explanation
^ Start of string
(?: Non capture group
(?:abc|cde);[a-z_]+ match either abc; or cde; and 1+ chars a-z or _
| Or
g;4 Match literally
) Close non capture group
(?: Non capture group
,(?:(?:abc|cde);[a-z_]+|g;4) Match a comma, and repeat the first pattern
)* Close non capture group and optionally repeat
$ End of string
See a regex demo and a Java demo
Example code
String[] strings = {
    "abc;erga_sd,cde;dfgef,g;4",
    "g;4,abc;dsfaweg",
    "cde;df_ger",
    "g;4",
    "abc;dsfg,dfvser"
};
String regex = "^(?:(?:abc|cde);[a-z_]+|g;4)(?:,(?:(?:abc|cde);[a-z_]+|g;4))*$";
Pattern pattern = Pattern.compile(regex);
for (String s : strings) {
    Matcher matcher = pattern.matcher(s);
    if (matcher.matches()) {
        System.out.printf("Match for %s%n", s);
    } else {
        System.out.printf("No match for %s%n", s);
    }
}
Output
Match for abc;erga_sd,cde;dfgef,g;4
Match for g;4,abc;dsfaweg
Match for cde;df_ger
Match for g;4
No match for abc;dsfg,dfvser
If abc;, cde; or g;4 must not occur more than once, you can rule that out with a negative lookahead at the start of the pattern that uses a backreference to detect the same prefix appearing twice.
^(?!.*(abc;|cde;|g;4).*\1)(?:(?:abc|cde);[a-z_]+|g;4)(?:,(?:(?:abc|cde);[a-z_]+|g;4))*$
Regex demo
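A small sketch to exercise the no-duplicates pattern from Java. The class name and the ITEM constant are only there for readability and are not part of the answer; ITEM contains no capturing groups, so \1 still refers to the group inside the lookahead.

```java
import java.util.regex.Pattern;

public class NoDuplicateParams {
    // one list element: abc;... or cde;... or the literal g;4
    private static final String ITEM = "(?:(?:abc|cde);[a-z_]+|g;4)";

    // the negative lookahead fails as soon as one prefix occurs twice
    private static final Pattern LIST = Pattern.compile(
            "^(?!.*(abc;|cde;|g;4).*\\1)" + ITEM + "(?:," + ITEM + ")*$");

    static boolean isValid(String s) {
        return LIST.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("g;4,abc;dsfaweg"));
        System.out.println(isValid("abc;a,abc;b"));
    }
}
```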
I am not quite sure of what is the correct regex for the period in Java. Here are some of my attempts. Sadly, they all meant any character.
String regex = "[0-9]*[.]?[0-9]*";
String regex = "[0-9]*['.']?[0-9]*";
String regex = "[0-9]*["."]?[0-9]*";
String regex = "[0-9]*[\.]?[0-9]*";
String regex = "[0-9]*[\\.]?[0-9]*";
String regex = "[0-9]*.?[0-9]*";
String regex = "[0-9]*\.?[0-9]*";
String regex = "[0-9]*\\.?[0-9]*";
But what I want is the actual "." character itself. Anyone have an idea?
What I'm trying to do actually is to write out the regex for a non-negative real number (decimals allowed). So the possibilities are: 12.2, 3.7, 2., 0.3, .89, 19
String regex = "[0-9]*['.']?[0-9]*";
Pattern pattern = Pattern.compile(regex);
String x = "5p4";
Matcher matcher = pattern.matcher(x);
System.out.println(matcher.find());
The last line is supposed to print false but prints true anyway. I think my regex is wrong though.
Update
To match a non-negative decimal number that contains a dot, you can use this regex:
^\d*\.\d+|\d+\.\d*$
or in java syntax : "^\\d*\\.\\d+|\\d+\\.\\d*$"
String regex = "^\\d*\\.\\d+|\\d+\\.\\d*$";
String string = "123.43253";
if (string.matches(regex))
    System.out.println("true");
else
    System.out.println("false");
Explanation for your original regex attempts:
[0-9]*\.?[0-9]*
With Java escaping it becomes:
"[0-9]*\\.?[0-9]*";
If you need to make the dot mandatory, remove the ? mark:
[0-9]*\.[0-9]*
But this will also accept just a dot without any digits. So, if you want digits to be mandatory, use + (which means one or more) instead of * (which means zero or more). In that case it becomes:
[0-9]+\.[0-9]+
If you are on Kotlin, use an extension function:
fun String.findDecimalDigits() =
Pattern.compile("^[0-9]*\\.?[0-9]*").matcher(this).run { if (find()) group() else "" }!!
Your initial understanding was probably right, but you were being thrown because when using matcher.find(), your regex will find the first valid match within the string, and all of your examples would match a zero-length string.
I would suggest "^([0-9]+\\.?[0-9]*|\\.[0-9]+)$"
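To make the difference between find() and a full match concrete, here is a minimal sketch. The class and method names are invented for the example; LOOSE is the asker's pattern written with [.], and STRICT is the pattern suggested above.

```java
import java.util.regex.Pattern;

public class DecimalCheck {
    // every part is optional, so this can match the empty string
    private static final Pattern LOOSE = Pattern.compile("[0-9]*[.]?[0-9]*");
    // requires at least one digit, either before or after the dot
    private static final Pattern STRICT =
            Pattern.compile("^([0-9]+\\.?[0-9]*|\\.[0-9]+)$");

    static boolean looseFind(String s) {
        // find() succeeds on any substring, including a zero-length one
        return LOOSE.matcher(s).find();
    }

    static boolean isDecimal(String s) {
        // matches() must consume the whole input
        return STRICT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(looseFind("5p4"));
        for (String s : new String[] {"12.2", "3.7", "2.", "0.3", ".89", "19", "5p4"}) {
            System.out.println(s + " -> " + isDecimal(s));
        }
    }
}
```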
There are actually two ways to match a literal dot. One is backslash-escaping, as you do with \\., and the other is to enclose it in a character class (square brackets), as in [.]. Most special characters, including the dot, become literal inside square brackets. If all you want is a literal dot, \\. shows your intention more clearly than [.]. Use [] when you need to match one of several things; for example, the regex [\\d.] matches a single digit or a literal dot.
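A quick way to check the two spellings side by side; the class and constant names below are placeholders for the example only.

```java
public class LiteralDot {
    static final String UNESCAPED = "a.b";    // dot means "any character"
    static final String ESCAPED = "a\\.b";    // backslash-escaped literal dot
    static final String BRACKETED = "a[.]b";  // literal dot inside a character class

    public static void main(String[] args) {
        for (String regex : new String[] {UNESCAPED, ESCAPED, BRACKETED}) {
            // compare a real dot with some other character in the same position
            System.out.println(regex + ": a.b=" + "a.b".matches(regex)
                    + ", axb=" + "axb".matches(regex));
        }
    }
}
```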
I have tested all the cases.
public static boolean isDecimal(String input) {
return Pattern.matches("^[-+]?\\d*[.]?\\d+|^[-+]?\\d+[.]?\\d*", input);
}
Using Java regex, how can I check whether all the characters of a word appear anywhere in a string? I need to check whether the string "Google" contains the word "gooe" or not.
For example:
String: Google
Word to find: gooe
The string "Google" contains all the characters g, o, o, e, so it should return true.
If the string is "wikipedia" and the word to find is "gooe", it should return false.
How do I form the regex in this scenario?
I've just tested such RegEx that makes a use of "look-ahead":
(?=^.*g)(?=^.*o)(?=^.*e)
It should return true for all strings that contain g, o and e, while returning false if any of these characters is missing.
If you want to find word in whole string you can use:
"^(?=.*e)(?=.*o.*o)(?=.*g).*"
You have to build a positive lookahead for each letter. In case of having gooe as search term our RegEx would be:
(?i)(?=.*g)(?=.*o)(?=.*o)(?=.*e)
These two lookaheads are identical and are satisfied by the same o, so one of them is redundant. You can remove duplicate letters from the search term before building the final pattern. (?i) turns on the case-insensitivity flag.
String term = "Gooe"; // Search term
String word = "google"; // Against word `Google`
String pattern = "(?i)(?=.*" + String.join(")(?=.*", term.split("(?!^)")) + ")";
Pattern regex = Pattern.compile(pattern);
Matcher match = regex.matcher(word);
if (match.find()) {
// Matched
}
See demo here
If order is important and both os must actually be present when the term contains two of them, the regex would be:
(?i).*?g.*?o.*?o.*?e
Java:
String pattern = "(?i).*?" + String.join(".*?", term.split("(?!^)"));
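If order does not matter but repeated letters should still be counted (two os for gooe), one possible sketch is to build one lookahead per distinct letter with a repetition count. This is an illustration only: the class name, the method name and the counting approach are assumptions, not something taken from the answers above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class LetterLookaheads {
    static boolean containsLetters(String word, String term) {
        // count how often each letter occurs in the search term
        Map<Character, Integer> counts = new LinkedHashMap<>();
        for (char c : term.toLowerCase().toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        // (?i) ignores case, (?s) lets the dot match line breaks as well
        StringBuilder regex = new StringBuilder("(?is)^");
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            // (?=(?:.*x){n}) requires at least n occurrences of x
            regex.append("(?=(?:.*")
                 .append(Pattern.quote(String.valueOf(e.getKey())))
                 .append("){").append(e.getValue()).append("})");
        }
        return Pattern.compile(regex.toString()).matcher(word).find();
    }

    public static void main(String[] args) {
        System.out.println(containsLetters("Google", "gooe"));
        System.out.println(containsLetters("wikipedia", "gooe"));
    }
}
```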
I'd like to know how to detect word that is between any characters except a letter from alphabet. I need this, because I'm working on a custom import organizer for Java. This is what I have already tried:
The regex expression:
[^(a-zA-Z)]InitializationEvent[^(a-zA-Z)]
I'm searching for the word "InitializationEvent".
The code snippet I've been testing on:
public void load(InitializationEvent event) {
It looks like adding space before the word helps... is the parenthesis inside of alphabet range?
I tested this in my program and it didn't work. Also I checked it on regexr.com, showing same results - class name not recognized.
Am I doing something wrong? I'm new to regex, so it might be a really basic mistake, or not. Let me know!
Lose the parentheses:
[^a-zA-Z]InitializationEvent[^a-zA-Z]
Inside [], parentheses are taken literally, so the negated class [^(a-zA-Z)] also excludes ( and ). That prevents a match here, because a ( precedes InitializationEvent in your string.
Note, however, that the above regex will only match if InitializationEvent is neither at the beginning nor at the end of the tested string. To allow that, you can use:
(^|[^a-zA-Z])InitializationEvent([^a-zA-Z]|$)
Or, without creating any capturing groups (which is supposed to be cleaner, and perform better):
(?:^|[^a-zA-Z])InitializationEvent(?:[^a-zA-Z]|$)
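The difference between the two variants shows up at the edges of the input. A minimal sketch, with made-up class, constant and helper names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // needs a non-letter character on both sides of the word
    static final Pattern CONSUMING =
            Pattern.compile("[^a-zA-Z]InitializationEvent[^a-zA-Z]");
    // also accepts the start or the end of the input in place of that character
    static final Pattern ANCHORED =
            Pattern.compile("(?:^|[^a-zA-Z])InitializationEvent(?:[^a-zA-Z]|$)");

    // returns the first matched text, or null when nothing matches
    static String firstMatch(Pattern p, String s) {
        Matcher m = p.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(CONSUMING, "InitializationEvent"));
        System.out.println(firstMatch(ANCHORED, "InitializationEvent"));
        System.out.println(firstMatch(CONSUMING, "load(InitializationEvent event)"));
    }
}
```

Note that both patterns include the neighbouring characters in the matched text when they are present, which matters if the match is used for replacement.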
how to detect word that is between any characters except a letter from alphabet
This is the case where lookarounds come handy. You can use:
(?<![a-zA-Z])InitializationEvent(?![a-zA-Z])
(?<![a-zA-Z]) is a negative lookbehind that asserts there is no letter at the previous position
(?![a-zA-Z]) is a negative lookahead that asserts there is no letter at the next position
RegEx Demo
The parentheses are causing the problem, just skip them:
"[^a-zA-Z]InitializationEvent[^a-zA-Z]"
or use the predefined non-word character class which is slightly different because it also excludes numbers and the underscore:
"\\WInitializationEvent\\W"
But since you seem to want to match a class name, this might be OK, because the additionally excluded characters (digits and the underscore) are also allowed in a class name.
I'm not sure about your application but from a regexp perspective you can use negative lookaheads and negative lookbehinds to define what cannot surround the String to specify a match.
I have added the negative lookahead (?![a-zA-Z]) and the negative lookbehind (?<![a-zA-Z]) in place of your [^(a-zA-Z)] originally supplied to create: (?<![a-zA-Z])InitializationEvent(?![a-zA-Z])
Quick Fiddle I created:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HelloWorld {
    public static void main(String[] args) {
        String pattern = "(?<![a-zA-Z])InitializationEvent(?![a-zA-Z])";
        String sourceString = "public void load(InitializationEvent event) {";
        String sourceString2 = "public void load(BInitializationEventA event) {";
        Pattern r = Pattern.compile(pattern);
        Matcher m = r.matcher(sourceString);
        if (m.find()) {
            System.out.println("Found value of pattern in sourceString: " + m.group(0));
        } else {
            System.out.println("NO MATCH in sourceString");
        }
        Matcher m2 = r.matcher(sourceString2);
        if (m2.find()) {
            System.out.println("Found value of pattern in sourceString2: " + m2.group(0));
        } else {
            System.out.println("NO MATCH in sourceString2");
        }
    }
}
output:
sh-4.3$ java -Xmx128M -Xms16M HelloWorld
Found value of pattern in sourceString: InitializationEvent
NO MATCH in sourceString2
You seem really close:
[^(a-zA-Z)]*(InitializationEvent)[^(a-zA-Z)]*
I think this is what you are looking for. The asterisk provides a match for zero or many of the character or group before it.
EDIT/UPDATE
My apologies on the initial response.
[^a-zA-Z]+(InitializationEvent)[^a-zA-Z]+
My regex is a little rusty, but this will match on any non-alphabet character one or many times prior to the InitializationEvent and after.
I am working on a project where I need to search for a particular string token and check whether it contains a number range in the [3:0] format. How can I check this? I searched Stack Overflow for a reference and found how to search for "{my string: }" in a string, like the following:
String myStr = "this is {my string: } ok";
if (myStr.trim().contains("{my string: }")) {
//Do something.
}
But, could not find how to search if a string contains a number in the regular expression, i tried using the following, but it did not work:
String myStr = "this is [3 string: ] ok";
if (myStr.trim().contains("[\\d string: ]")) {
//Do something.
}
Please help!
For "[int:int]", use \\[\\d*:\\d*\\]; it works.
You cannot use a regex inside String#contains. Instead, use .matches() with a regex.
To match [3 string: ]-like patterns inside larger strings (where string is a literal word string), use a regex like (?s).*\\[\\d+\\s+string:\\s*\\].*:
String myStr = "this is [3 string: ] ok";
if (myStr.matches("(?s).*\\[\\d+\\s+string:\\s*\\].*")) {
System.out.println("FOUND");
}
See IDEONE demo
The regex matches any number of any characters from the start of the string with .* (as many as possible), then a [, one or more digits, one or more whitespace characters, string:, zero or more whitespace characters, and a ], followed by zero or more of any characters up to the end of the string.
The (?s) internal modifier makes the dot match newline characters, too.
Note we need .* on both sides because .matches() requires a full string match.
To match [3:3]-like pattern inside larger strings use:
"(?s).*\\[\\d+\\s*:\\s*\\d+\\].*"
See another IDEONE demo
Remove \\s* if whitespace around : is not allowed.
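If the two numbers are needed as values and not just a yes/no answer, a sketch along these lines may help. It uses Matcher.find() with capturing groups, so no .* is needed on either side; the class and method names are hypothetical, and it assumes the digits fit into an int.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BracketRange {
    // group 1 and group 2 capture the numbers on each side of the colon
    private static final Pattern RANGE =
            Pattern.compile("\\[(\\d+)\\s*:\\s*(\\d+)\\]");

    // returns {left, right} for the first [n:m] found, or null if there is none
    static int[] findRange(String s) {
        Matcher m = RANGE.matcher(s);
        if (!m.find()) {
            return null;
        }
        return new int[] {Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))};
    }

    public static void main(String[] args) {
        int[] range = findRange("signal data[3:0] ok");
        System.out.println(range == null ? "none" : range[0] + ".." + range[1]);
    }
}
```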