I am trying to come up with the regex to find strings matching the following pattern:
(someNumber - someNumber), with the parentheses included.
I tried:
"\\([1-9]*-[1-9]*\\)"
but that doesn't seem to work.
I also need to match:
The letter W or L followed by (someNumber - someNumber), with the parentheses included.
I tried to use the same pattern above, slightly modified, but again, no luck:
"W|L \\([1-9]*-[1-9]*\\)"
Any help would be appreciated
Include W|L in parentheses:
(W|L)
If you want to allow space characters before and after the minus, add \s or a space before and after the -:
"((W|L)\\s)?\\([1-9]*\\s-\\s[1-9]*\\)"
If you already know that there will be at least one digit, use + instead of *, as * matches zero or more, whereas + matches 1 or more.
The pattern given above matches with and without a W or L in front.
Here's a pattern that matches with and without space around the - and with or without W or L in front. Additionally, it also captures numbers containing 0, which you excluded in your original regular expression.
"((W|L)\\s)?\\(\\d+\\s?-\\s?\\d+\\)"
Further to blueygh2's answer, your regex will fail if the numbers contain zeroes. My guess is you want to avoid leading zeroes, in which case use [1-9]\d* (or [1-9][0-9]*). If you want to allow the numbers to equal 0 but otherwise avoid leading zeroes, do ([1-9]\d*|0).
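To compare the two suggestions side by side, here is a minimal Java sketch. It assumes the whole input must match (hence matches() rather than find()); the class, field and method names are made up for illustration and are not from the thread.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative, not from the thread.
class ScorePatternDemo {
    // Any digits allowed, optional "W " / "L " prefix, optional space around '-'.
    static final Pattern ANY_DIGITS =
            Pattern.compile("((W|L)\\s)?\\(\\d+\\s?-\\s?\\d+\\)");

    // A number is either a lone 0 or a non-zero digit followed by more digits.
    static final String NUMBER = "([1-9]\\d*|0)";
    static final Pattern NO_LEADING_ZERO =
            Pattern.compile("((W|L)\\s)?\\(" + NUMBER + "\\s?-\\s?" + NUMBER + "\\)");

    // matches() succeeds only if the entire input fits the pattern.
    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"(3-2)", "W (10 - 0)", "L (07-3)", "X (1-2)"};
        for (String s : samples) {
            System.out.println(s + "  any digits: " + matches(ANY_DIGITS, s)
                    + ", no leading zero: " + matches(NO_LEADING_ZERO, s));
        }
    }
}
```

With these inputs, only "L (07-3)" should differ between the two patterns: the \d+ version accepts it, while the no-leading-zero version rejects it.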
You can try this:
"(W|L)\\s*\\(\\d+-\\d+\\)"
I don't write many regular expressions so I'm going to need some help on this one.
I need a regular expression that can validate that a string is an alphanumeric comma delimited string.
Examples:
123, 4A67, GGG, 767 would be valid.
12333, 78787&*, GH778 would be invalid
fghkjhfdg8797< would be invalid
This is what I have so far, but isn't quite right: ^(?=.*[a-zA-Z0-9][,]).*$
Any suggestions?
Sounds like you need an expression like this:
^[0-9a-zA-Z]+(,[0-9a-zA-Z]+)*$
Posix allows for the more self-descriptive version:
^[[:alnum:]]+(,[[:alnum:]]+)*$
^[[:alnum:]]+([[:space:]]*,[[:space:]]*[[:alnum:]]+)*$ // allow whitespace
If you're willing to admit underscores, too, search for entire words (\w+):
^\w+(,\w+)*$
^\w+(\s*,\s*\w+)*$ // allow whitespaces around the comma
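For readers working in Java: java.util.regex does not accept the [[:alnum:]] bracket syntax as a POSIX class, but it offers \p{Alnum} and \s instead. The sketch below is one possible translation of the whitespace-tolerant pattern; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class showing a Java spelling of the POSIX-style pattern.
class AlnumListDemo {
    // One alphanumeric item, then zero or more ", item" groups with
    // optional whitespace around each comma.
    private static final Pattern LIST =
            Pattern.compile("^\\p{Alnum}+(\\s*,\\s*\\p{Alnum}+)*$");

    static boolean isValid(String s) {
        return LIST.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "123, 4A67, GGG, 767",
            "12333, 78787&*, GH778",
            "fghkjhfdg8797<",
            "123,"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Only the first sample should be accepted; the trailing comma in the last one is rejected because every comma must be followed by another item.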
Try this pattern: ^([a-zA-Z0-9]+,?\s*)+$
I tested it with your cases, as well as just a single number "123". I don't know if you will always have a comma or not.
The [a-zA-Z0-9]+ means match 1 or more of these symbols
The ,? means match 0 or 1 commas (basically, the comma is optional)
The \s* handles 0 or more whitespace characters after the comma
and finally the outer + says match 1 or more of the pattern.
This will also match
123 123 abc (no commas) which might be a problem
This will also match 123, (ends with a comma) which might be a problem.
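The two caveats above are easy to reproduce. The following Java sketch runs the lenient pattern next to a stricter variant that requires a comma between items; the stricter pattern and all names are assumptions added for comparison, not part of the answer.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing the lenient pattern with a stricter one.
class LenientListDemo {
    // Pattern from the answer: the comma and the whitespace are both optional.
    static final Pattern LENIENT = Pattern.compile("^([a-zA-Z0-9]+,?\\s*)+$");

    // Assumed stricter variant: every further item must be preceded by a comma.
    static final Pattern STRICT =
            Pattern.compile("^[a-zA-Z0-9]+(\\s*,\\s*[a-zA-Z0-9]+)*$");

    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"123, 4A67, GGG, 767", "123", "123 123 abc", "123,"};
        for (String s : samples) {
            System.out.println(s + "  lenient: " + matches(LENIENT, s)
                    + ", strict: " + matches(STRICT, s));
        }
    }
}
```

The last two samples are the ones where the patterns should disagree: the lenient pattern accepts both, the strict one rejects both.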
Try the following expression:
/^([a-z0-9\s]+,)*([a-z0-9\s]+){1}$/i
This will work for:
test
test, test
test123,Test 123,test
I would strongly suggest trimming the whitespaces at the beginning and end of each item in the comma-separated list.
You seem to be lacking repetition. How about:
^(?:[a-zA-Z0-9 ]+,)*[a-zA-Z0-9 ]+$
I'm not sure how you'd express that in VB.Net, but in Python:
>>> import re
>>> x = [ "123, 4A67, GGG, 767", "12333, 78787&*, GH778" ]
>>> r = '^(?:[a-zA-Z0-9 ]+,)*[a-zA-Z0-9 ]+$'
>>> for s in x:
...     print(re.match(r, s))
...
<_sre.SRE_Match object at 0xb75c8218>
None
>>>
You can use shortcuts instead of listing the [a-zA-Z0-9 ] part, but this is probably easier to understand.
Analyzing the highlights:
[a-zA-Z0-9 ]+ : match one or more (but not zero) characters from the listed ranges, or a space.
(?:[...]+,)* : in non-capturing parentheses, match one or more of the characters followed by a comma. Match such sequences zero or more times; allowing zero repetitions is what lets a single item with no comma pass.
[...]+ : match at least one of these characters. This does not include a comma, which ensures that a trailing comma is rejected. If a trailing comma is acceptable, the expression is simpler: ^[a-zA-Z0-9 ,]+$
Yes, when you want to catch comma separated things where a comma at the end is not legal, and the things match to $LONGSTUFF, you have to repeat $LONGSTUFF:
$LONGSTUFF(,$LONGSTUFF)*
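In Java, the repetition can be generated instead of typed twice. The sketch below uses a hypothetical helper, commaList, and a deliberately simplified key=value item grammar that is only an assumption for the demo.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: builds ITEM(,ITEM)* by string concatenation.
class CommaListBuilderDemo {
    // Wraps the item in non-capturing groups so alternations inside it stay contained.
    static String commaList(String item) {
        return "(?:" + item + ")(?:,(?:" + item + "))*";
    }

    // Simplified, assumed grammar: lowercase key=value pairs such as a=b.
    static final String PAIR = "[a-z]+=[a-z]+";
    // A quoted entry such as '1:a=b,c=d'.
    static final String ENTRY = "'\\d+:" + commaList(PAIR) + "'";
    // A bracketed list of quoted entries.
    static final Pattern SYNTAX = Pattern.compile("\\[" + commaList(ENTRY) + "\\]");

    static boolean isValid(String s) {
        return SYNTAX.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("['1:a=b,c=d','2:e=f,g=h']"));
        System.out.println(isValid("['1:a=b,']"));
    }
}
```

Because the helper is applied at two levels (pairs inside an entry, entries inside the list), the item pattern is written only once at each level.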
If $LONGSTUFF is really long and contains comma repeated items itself etc., it might be a good idea to not build the regexp by hand and instead rely on a computer for doing that for you, even if it's just through string concatenation. For example, I just wanted to build a regular expression to validate the CPUID parameter of a XEN configuration file, of the ['1:a=b,c=d','2:e=f,g=h'] type. I... believe this mostly fits the bill: (whitespace notwithstanding!)
import re

xend_fudge_item_re = r"""
    e[a-d]x=            #register of the call return value to fudge
    (
        0x[0-9A-F]+ |   #either hardcode the reply
        [10xks]{32}     #or edit the bitfield directly
    )
"""
xend_string_item_re = r"""
    (0x)?[0-9A-F]+:     #leafnum (the contents of EAX before the call)
    %s                  #one fudge
    (,%s)*              #repeated multiple times
""" % (xend_fudge_item_re, xend_fudge_item_re)
xend_syntax = re.compile(r"""
    \[                  #a list of
    '%s'                #string elements
    (,'%s')*            #repeated multiple times
    \]
    $                   #and nothing else
""" % (xend_string_item_re, xend_string_item_re), re.VERBOSE | re.MULTILINE)
Try ^(?!,)((, *)?([a-zA-Z0-9]+)\b)*$
Step by step description:
Don't match a beginning comma (good for the upcoming "loop").
Match optional comma and spaces.
Match characters you like.
Matching a word boundary makes sure that a comma is required whenever more items follow in the string.
Please use - ^((([a-zA-Z0-9\s]){1,45},)+([a-zA-Z0-9\s]){1,45})$
Here I have set the maximum word size to 45, as the longest word in English is 45 characters; this can be changed as required.
I have an input string like this:
one `two three` four five `six` seven
where some parts can be wrapped by grave accent character (`).
I want to match only the parts which are not wrapped by it, that is one, four five and seven in the example (skipping two three and six).
I tried to do it using lookarounds ((?<=) and (?=)), but it recognised the four five group just like two three and six. Is it possible to solve this problem using regex only, or do I have to do it programmatically? (I'm using Java 1.8.)
If you are sure that there are no unclosed backticks, you could do this:
((?:\w| )+)(?=(?:[^`]*`[^`]*`)*[^`]*$)
This will match:
"one "
" four five "
" seven"
But it's a little bit expensive: for every candidate match, the lookahead has to scan the rest of the line to check that the number of remaining backticks is even, so the total work can grow as O(n^2) in the length of the string.
Note that this works regardless of where the whitespace is: it really counts the backticks and does not care about their relative position. If you don't need this kind of robustness, @anubhava's answer is certainly more performant.
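Since the question mentions Java 1.8, here is a minimal sketch of how the backtick-counting pattern could be used with Matcher.find(). The class and method names are hypothetical, and trimming the surrounding spaces from each match is an added assumption about the desired output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for the backtick-counting lookahead.
class BacktickParityDemo {
    // A run of word characters and spaces that is followed by an even
    // number of backticks up to the end of the input.
    private static final Pattern OUTSIDE =
            Pattern.compile("((?:\\w| )+)(?=(?:[^`]*`[^`]*`)*[^`]*$)");

    static List<String> outside(String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = OUTSIDE.matcher(input);
        while (m.find()) {
            // Assumption: callers want the text without the surrounding spaces.
            parts.add(m.group(1).trim());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(outside("one `two three` four five `six` seven"));
    }
}
```

For the sample input this should yield the three unwrapped parts: one, four five and seven.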
You may use this regex using a lookahead and lookbehind:
(?<!`)\b\w+(?:\s+\w+)*\b(?!`)
Explanation:
- (?<!`): Negative Lookbehind to assert that we don't have ` at previous position
- \b\w+(?:\s+\w+)*\b: Match our text surrounded by word boundaries
- (?!`): Negative Lookahead to assert that we don't have ` at next position
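A short Java sketch of this lookaround pattern follows; the class and method names are hypothetical. It also includes one input where the approach behaves differently from backtick counting: a word in the middle of a wrapped group is not adjacent to any backtick, so the lookarounds do not exclude it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for the negative lookbehind/lookahead pattern.
class BacktickLookaroundDemo {
    // Words separated by whitespace, not directly preceded or followed by a backtick.
    private static final Pattern UNWRAPPED =
            Pattern.compile("(?<!`)\\b\\w+(?:\\s+\\w+)*\\b(?!`)");

    static List<String> unwrapped(String input) {
        List<String> parts = new ArrayList<>();
        Matcher m = UNWRAPPED.matcher(input);
        while (m.find()) {
            parts.add(m.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(unwrapped("one `two three` four five `six` seven"));
        // Caveat: the middle word of a three-word wrapped group is still matched.
        System.out.println(unwrapped("`a b c` d"));
    }
}
```

The first call should give one, four five and seven. The second should give b and d, which shows that the pattern relies on every wrapped word touching a backtick.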
I solve issues like this by specifying to exclude closing characters (in your case whitespace) like so:
`[^\s]+`
Hi, I want to find strings like "+19" in Java,
that is, a + sign followed by any number of digits.
How do I do this?
Tried "+[0123456789]"
and "\+[0123456789]"
thank you :)
This is the regex you want to use:
\\+\\d+
Two kinds of plus are being used here. The first is escaped with two backslashes because it is treated as a literal. The second one means match 1 or more times (i.e. match any digit one or more times).
Code:
String input = "+19";
if (input.matches("\\+\\d+")) {
System.out.println("input string matches");
}
Yes, to match a literal plus you need to escape it, and because Java uses C-style string literals, the escaping backslash itself must be doubled. A literal plus needs to be either escaped or put into a character class, [+]. An unescaped plus is a quantifier that matches the preceding symbol or group one or more times.
Also, note that the \d shorthand digit class can match more than just ASCII digits if Pattern.UNICODE_CHARACTER_CLASS flag is passed to Pattern.compile (or embedded (?U) flag is added at the start of the pattern). It is advised to use unambiguous patterns in case the code might be maintained or enhanced/adjusted by different developers later.
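The effect of the flag can be seen with a non-ASCII digit string. The sketch below uses the Arabic-Indic digits U+0661 and U+0669 as an example input; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: how \d behaves with and without UNICODE_CHARACTER_CLASS.
class DigitClassDemo {
    // Default behaviour: \d is limited to ASCII 0-9.
    static boolean defaultDigits(String s) {
        return Pattern.compile("\\+\\d+").matcher(s).matches();
    }

    // With the flag, \d also accepts other Unicode decimal digits.
    static boolean unicodeDigits(String s) {
        return Pattern.compile("\\+\\d+", Pattern.UNICODE_CHARACTER_CLASS)
                .matcher(s).matches();
    }

    // The explicit class [0-9] stays ASCII-only regardless of flags.
    static boolean explicitDigits(String s) {
        return Pattern.compile("[+][0-9]+", Pattern.UNICODE_CHARACTER_CLASS)
                .matcher(s).matches();
    }

    public static void main(String[] args) {
        String arabicIndic = "+\u0661\u0669"; // a plus sign followed by two Arabic-Indic digits
        System.out.println(defaultDigits(arabicIndic));
        System.out.println(unicodeDigits(arabicIndic));
        System.out.println(explicitDigits(arabicIndic));
    }
}
```

Only the second call should accept the Arabic-Indic input, which is the ambiguity that an explicit [0-9] avoids.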
Most people prefer patterns without escaping backslashes where possible, since that avoids issues like the one you faced.
Here is a version of the regex that does not require any escaping:
"[+][0-9]+"
Also, strictly speaking, the plus quantifier does not match an infinite number of digits, only up to a very large implementation-defined maximum (on the order of MAX_UINT repetitions).
I have numbers like this that need leading zeros removed.
Here is what I need:
00000004334300343 -> 4334300343
0003030435243 -> 3030435243
I can't figure this out as I'm new to regular expressions. This does not work:
(^0)
You're almost there. You just need a quantifier:
str = str.replaceAll("^0+", "");
It replaces one or more occurrences of 0 at the beginning of the string with an empty string. The + quantifier means "one or more" (similarly, the * quantifier means "zero or more"), and the caret ^ anchors the match to the beginning of the string.
The accepted solution will fail if you need to get "0" from "00". This is the right one:
str = str.replaceAll("^0+(?!$)", "");
^0+(?!$) means match one or more zeros if it is not followed by end of string.
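The difference between the two replacements shows up on all-zero input. A minimal Java sketch, with hypothetical class and method names:

```java
// Hypothetical demo class comparing the two leading-zero replacements.
class LeadingZeroDemo {
    // Removes every leading zero, even if nothing is left afterwards.
    static String stripAll(String s) {
        return s.replaceAll("^0+", "");
    }

    // Removes leading zeros but never the final character of the string.
    static String stripKeepOne(String s) {
        return s.replaceAll("^0+(?!$)", "");
    }

    public static void main(String[] args) {
        String[] samples = {"00000004334300343", "0003030435243", "00", "0"};
        for (String s : samples) {
            System.out.println(s + " -> [" + stripAll(s) + "] vs [" + stripKeepOne(s) + "]");
        }
    }
}
```

For "00" the first method returns an empty string while the second returns "0"; on inputs with a non-zero digit the two agree.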
Thank you to the commenter - I have updated the formula to match the description from the author.
If you know input strings are all containing digits then you can do:
String s = "00000004334300343";
System.out.println(Long.valueOf(s));
// 4334300343
By converting to Long it will automatically strip off all leading zeroes.
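One thing to keep in mind with this approach is that a long only holds values up to 9223372036854775807, so longer digit strings throw a NumberFormatException. The sketch below shows that case and uses BigInteger as one possible alternative; the class and method names are hypothetical.

```java
import java.math.BigInteger;

// Hypothetical demo class: numeric parsing as a way to drop leading zeros.
class NumericStripDemo {
    // Works for values that fit in a long; throws NumberFormatException otherwise.
    static String viaLong(String digits) {
        return String.valueOf(Long.parseLong(digits));
    }

    // BigInteger has no fixed upper bound, so long digit strings are fine.
    static String viaBigInteger(String digits) {
        return new BigInteger(digits).toString();
    }

    public static void main(String[] args) {
        System.out.println(viaLong("00000004334300343"));
        String big = "000123456789012345678901234567890";
        System.out.println(viaBigInteger(big));
        try {
            viaLong(big);
        } catch (NumberFormatException e) {
            System.out.println("too large for long");
        }
    }
}
```

Both methods also turn an all-zero string such as "000" into "0", and both reject input that contains non-digit characters.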
Another solution (might be more intuitive to read)
str = str.replaceFirst("^0+", "");
^ - match the beginning of a line
0+ - match the zero digit character one or more times
An exhaustive list of pattern constructs can be found in the documentation for Pattern.
\b0+\B will do the work. \b anchors the match to a word boundary, 0+ matches a sequence of one or more zeros, and \B requires the match to end somewhere that is not a word boundary (so the last 0 is kept when the input is only 00...000).
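Because this pattern is not anchored to the start of the string, it can also be applied to numbers embedded in longer text. A small Java sketch, with hypothetical names and made-up sample input:

```java
// Hypothetical demo class for the word-boundary variant.
class WordBoundaryZeroDemo {
    // Removes zeros that start a word, as long as another word character follows.
    static String strip(String s) {
        return s.replaceAll("\\b0+\\B", "");
    }

    public static void main(String[] args) {
        String[] samples = {"0003030435243", "000", "id 007 and 000", "100"};
        for (String s : samples) {
            System.out.println(s + " -> " + strip(s));
        }
    }
}
```

The all-zero inputs keep a single 0, and zeros that are not at the start of a number (as in 100) are left alone.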
The correct regex to strip leading zeros is
str = str.replaceAll("^0+", "");
This regex matches one or more 0 characters at the beginning of the string.
There is no reason to worry about using the replaceAll method: the ^ (beginning of input) anchor ensures that the replacement happens at most once.
Alternatively, you can use a built-in Java feature to do the same:
String str = "00000004334300343";
long number = Long.parseLong(str);
// number is now 4334300343
The leading zeros will be stripped for you automatically.
I know this is an old question, but I think the best way to do this is actually
str = str.replaceAll("(^0+)?(\\d+)", "$2");
The reason I suggest this is because it splits the string into two groups. The second group is at least one digit. The first group matches 1 or more zeros at the start of the line. However, the first group is optional, meaning that if there are no leading zeros, you just get all of the digits. And, if str is only a zero, you get exactly one zero (because the second group must match at least one digit).
So if it's any number of 0s, you get back exactly one zero. If it starts with any number of 0s followed by any other digit, you get no leading zeros. If it starts with any other digit, you get back exactly what you had in the first place.
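The three cases described above can be checked with a few lines of Java. This is a sketch with hypothetical names; the pattern is written with the doubled backslashes that a Java string literal requires.

```java
// Hypothetical demo class for the two-group replacement.
class TwoGroupZeroDemo {
    // Group 1: optional leading zeros. Group 2: at least one digit, which is kept.
    static String strip(String s) {
        return s.replaceAll("(^0+)?(\\d+)", "$2");
    }

    public static void main(String[] args) {
        String[] samples = {"000", "00000004334300343", "4003"};
        for (String s : samples) {
            System.out.println(s + " -> " + strip(s));
        }
    }
}
```

For "000" the engine has to give one zero back to the second group, which is why exactly one 0 survives.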
Here is a simple solution in JavaScript syntax (Java has no /.../g regex literals):
str = str.replaceAll(/^0+/g, "");
In JavaScript, the global flag g is required when replaceAll is called with a regex.
I need a regular expression for the pattern below:
It can start with / or number
It can only contain numbers, no text
Numbers can have space in between them.
It can contain /*, at least 1 number and space or numbers and /*
Valid Strings:
3232////33 43/323//
3232////3343/323//
/3232////343/323//
Invalid Strings:
/sas/3232/////dsds/
/ /34343///// /////
///////////
My Problem is, it can have space between numbers like /3232 323/ but not / /.
How can I validate it?
I have tried so far:
(\\d[\\d ]*/+) , (/*\\d[\\d ]*/+) , (/*)(\\d*)(/*)
This regex should work for you:
^/*(?:\\d(?: \\d)*/*)+$
Live Demo: http://www.rubular.com/r/pUOYFwV8SQ
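As a quick check against the examples from the question, the pattern can be exercised in Java as follows. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class running the pattern over the question's examples.
class SlashNumberDemo {
    // Optional leading slashes, then one or more groups of
    // space-separated digits followed by optional slashes.
    private static final Pattern VALID =
            Pattern.compile("^/*(?:\\d(?: \\d)*/*)+$");

    static boolean isValid(String s) {
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "3232////33 43/323//", "3232////3343/323//", "/3232////343/323//",
            "/sas/3232/////dsds/", "/ /34343///// /////", "///////////"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

The first three samples (the valid strings) should be accepted and the last three rejected.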
My solution is not so simple but it works
^(((\d[\d ]*\d)|\d)|/)*((\d[\d ]*\d)|\d)(((\d[\d ]*\d)|\d)|/)*$
Just use lookarounds for the last criteria.
^(?=.*?\\d)([\\d/]*(?:/ ?(?!/)|\\d ?))+$
The best would have been to use conditional regex, but I think Java doesn't support them.
Explanation:
Basically, numbers or slashes, followed by one number and a space, or one slash and a space which is not followed by another slash. Repeat that. The space is made optional because I assume there's none at the end of your string.
Try this java regex
/*(\\d[\\d ]*(?<=\\d)/+)+
It meets all your criteria.
Although you didn't specifically state it, I have assumed that a space may not appear as the first or last character of a number (i.e. spaces must be between digits).
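A small Java sketch of the lookbehind pattern, run over the question's examples, is shown below; the names are hypothetical. Note that, as written, this pattern requires every number to be followed by at least one slash, so a bare number without a trailing slash is rejected.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for the lookbehind-based pattern.
class LookbehindSlashDemo {
    // Digits and inner spaces, ending on a digit (checked by the lookbehind),
    // followed by one or more slashes; the whole group repeats.
    private static final Pattern VALID =
            Pattern.compile("/*(\\d[\\d ]*(?<=\\d)/+)+");

    static boolean isValid(String s) {
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "3232////33 43/323//", "/3232////343/323//",
            "/ /34343///// /////", "3232 /", "3232"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Only the first two samples should be accepted; "3232 /" fails because the number would end in a space, and "3232" fails because no slash follows it.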
"(?![A-z])(?=.*[0-9].*)(?!.*/ /.*)[0-9/ ]{2,}(?![A-z])"
This will match what you want, but keep in mind that it will also match
/3232///// from /sas/3232/////dsds/
because that part of the invalid string is itself valid. Also note that [A-z] covers the punctuation characters that sit between Z and a in ASCII, so [A-Za-z] is probably what is intended.
If you are reading line by line, anchor the pattern with ^ and $; if you are reading an entire block of text, search for \r\n around the regex above to match each line.