Regex for a binary sequence that does not contain 00 - java

I am trying to write a regular expression that:
- Matches any sequence of 0s and 1s (binary only)
- And does not contain 00
I know how to write each part separately, but how can I combine them?
(?:[0-1]+)+
The above matches any sequence of 0s and 1s, such as 0101.
Any clue or reference would be appreciated.

I came to this form:
0?(1+0?)*
Explained:
0? - can start with 0
1+ - non-empty sequence of 1s
0? - followed by at most one 0
(1+0?)* - 2-3 repeated any number of times

Regular expressions such as 0?(1+0?)* will match against any part of the string, so for an input such as 000011000000 a search can match a piece in the middle, such as the 0110. To check that the whole string matches, add the start and end of string anchors, giving ^0?(1+0?)*$. This will also match an empty string. To match only non-empty strings we could use ^0?(1+0?)+$, but this will not match a string consisting of a single 0. So we need to add an alternative (using |) to match the 0, leading to the total expression ^((0?(1+0?)+)|0)$.
These brackets are capturing brackets; they could be changed to non-capturing forms, but that would make the expression bigger and, visually, more complex.
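As a rough illustration of the anchored expression above, here is a minimal Java sketch. The class name BinaryStrings and the method hasNoDoubleZero are made up for this example. It uses Matcher.matches(), which already requires the whole input to match, so the ^ and $ anchors are redundant here but harmless.

```java
import java.util.regex.Pattern;

class BinaryStrings {
    // Pattern from the answer above: non-empty, only 0 and 1, never two 0s in a row.
    private static final Pattern NO_DOUBLE_ZERO = Pattern.compile("^((0?(1+0?)+)|0)$");

    // Hypothetical helper: true if s is a non-empty binary string without "00".
    static boolean hasNoDoubleZero(String s) {
        return NO_DOUBLE_ZERO.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"0", "1", "0110", "101011", "100", "000011000000", "", "012"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + hasNoDoubleZero(s));
        }
    }
}
```

With find() instead of matches(), and without the anchors, the same pattern would report partial matches inside strings such as 000011000000.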

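You could try something like this, although it only covers strictly alternating strings and runs of 1s (matched against the whole string, it rejects valid inputs such as 110):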
(10)+|(01)+|1+
Demo: https://regex101.com/r/2CFroT/4

Related

Regular expression replace characters by a given match between strings

I am trying to replace a given character by a regular expression match.
For example, given the following string:
If you look at what you have in life, you'll always have more. If you look at what you don't have in life, you'll never have enough
I would like to replace all 't' with a '!' only where the match is between the characters 'ok' and 'fe'.
I get the match between 'ok' and 'fe' with this regular expression:
(?<=ok).*?(?=fe)
And I can only match one character with the following regex:
(?<=ok).*?(t).*?(?=fe)
I tried to transform that regex in the following way but it does not work:
(?<=ok).*?((t).*?)*?(?=fe)
How can I match all 't' between 'ok' and 'fe'?
https://regex101.com/r/ORgseA/1
You can use
String result = text.replaceAll("(?s)(\\G(?!\\A)|ok)((?:(?!ok|fe|t).)*)t(?=(?:(?!ok|fe).)*fe)", "$1$2!");
See the regex demo and the Java demo:
String text = "If you look at what you have in life, you'll always have more. If you look at what you don't have in life, you'll never have enough";
String result = text.replaceAll("(?s)(\\G(?!\\A)|ok)((?:(?!ok|fe|t).)*)t(?=(?:(?!ok|fe).)*fe)", "$1$2!");
System.out.println(result);
// => If you look a! wha! you have in life, you'll always have more. If you look a! wha! you don'! have in life, you'll never have enough
Details:
(?s) - Pattern.DOTALL embedded flag option (to make . match line break chars)
(\G(?!\A)|ok) - Group 1 ($1): ok or the end of the previous successful match
((?:(?!ok|fe|t).)*) - Group 2 ($2): any one char, zero or more occurrences, as many as possible, that does not start a ok, fe or t char sequence
t - a t char
(?=(?:(?!ok|fe).)*fe) - immediately to the right, there must be any single char, zero or more occurrences, as many as possible, that does not start ok or fe char sequences and then a fe substring.

Complicated regex and possible simple way to do it [duplicate]

I don't write many regular expressions, so I'm going to need some help on this one.
I need a regular expression that can validate that a string is an alphanumeric comma delimited string.
Examples:
123, 4A67, GGG, 767 would be valid.
12333, 78787&*, GH778 would be invalid
fghkjhfdg8797< would be invalid
This is what I have so far, but isn't quite right: ^(?=.*[a-zA-Z0-9][,]).*$
Any suggestions?
Sounds like you need an expression like this:
^[0-9a-zA-Z]+(,[0-9a-zA-Z]+)*$
Posix allows for the more self-descriptive version:
^[[:alnum:]]+(,[[:alnum:]]+)*$
^[[:alnum:]]+([[:space:]]*,[[:space:]]*[[:alnum:]]+)*$ // allow whitespace
If you're willing to admit underscores, too, search for entire words (\w+):
^\w+(,\w+)*$
^\w+(\s*,\s*\w+)*$ // allow whitespaces around the comma
Try this pattern: ^([a-zA-Z0-9]+,?\s*)+$
I tested it with your cases, as well as just a single number "123". I don't know if you will always have a comma or not.
The [a-zA-Z0-9]+ means match 1 or more of these symbols
The ,? means match 0 or 1 commas (basically, the comma is optional)
The \s* handles zero or more whitespace characters after the comma
and finally the outer + says match 1 or more of the pattern.
This will also match 123 123 abc (no commas), which might be a problem.
This will also match 123, (ends with a comma), which might be a problem.
Try the following expression:
/^([a-z0-9\s]+,)*([a-z0-9\s]+){1}$/i
This will work for:
test
test, test
test123,Test 123,test
I would strongly suggest trimming the whitespaces at the beginning and end of each item in the comma-separated list.
You seem to be lacking repetition. How about:
^(?:[a-zA-Z0-9 ]+,)*[a-zA-Z0-9 ]+$
I'm not sure how you'd express that in VB.Net, but in Python:
>>> import re
>>> x = [ "123, 4A67, GGG, 767", "12333, 78787&*, GH778" ]
>>> r = '^(?:[a-zA-Z0-9 ]+,)*[a-zA-Z0-9 ]+$'
>>> for s in x:
...     print(re.match( r, s ))
...
<_sre.SRE_Match object at 0xb75c8218>
None
>>>
You can use shortcuts instead of listing the [a-zA-Z0-9 ] part, but this is probably easier to understand.
Analyzing the highlights:
[a-zA-Z0-9 ]+ : match one or more (but not zero) characters from the listed ranges, or a space.
(?:[...]+,)* : In non-capturing parentheses, match one or more of the characters, plus a comma at the end. Match such sequences zero or more times. Matching zero times allows for no comma.
[...]+ : match at least one of these. This does not include a comma. This is to ensure that it does not accept a trailing comma. If a trailing comma is acceptable, then the expression is easier: ^[a-zA-Z0-9 ,]+$
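The same pattern can be tried in Java. This is only a sketch: the class name CommaList and the method isValid are invented for the example, and the sample inputs are the ones from the question plus a trailing-comma case.

```java
import java.util.regex.Pattern;

class CommaList {
    // Items consist of letters, digits and spaces; items are separated by commas.
    // The final [a-zA-Z0-9 ]+ has no comma, so a trailing comma is rejected.
    private static final Pattern LIST =
            Pattern.compile("^(?:[a-zA-Z0-9 ]+,)*[a-zA-Z0-9 ]+$");

    // Hypothetical helper: true if s is a comma-delimited alphanumeric list.
    static boolean isValid(String s) {
        return LIST.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "123, 4A67, GGG, 767",
            "12333, 78787&*, GH778",
            "fghkjhfdg8797<",
            "123",
            "123,"
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isValid(s));
        }
    }
}
```

Because spaces are part of the item character class, an item made only of spaces is also accepted; trimming each item first, as suggested above, avoids that.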
Yes, when you want to catch comma separated things where a comma at the end is not legal, and the things match to $LONGSTUFF, you have to repeat $LONGSTUFF:
$LONGSTUFF(,$LONGSTUFF)*
If $LONGSTUFF is really long and contains comma repeated items itself etc., it might be a good idea to not build the regexp by hand and instead rely on a computer for doing that for you, even if it's just through string concatenation. For example, I just wanted to build a regular expression to validate the CPUID parameter of a XEN configuration file, of the ['1:a=b,c=d','2:e=f,g=h'] type. I... believe this mostly fits the bill: (whitespace notwithstanding!)
import re

xend_fudge_item_re = r"""
e[a-d]x= #register of the call return value to fudge
(
0x[0-9A-F]+ | #either hardcode the reply
[10xks]{32} #or edit the bitfield directly
)
"""
xend_string_item_re = r"""
(0x)?[0-9A-F]+: #leafnum (the contents of EAX before the call)
%s #one fudge
(,%s)* #repeated multiple times
""" % (xend_fudge_item_re, xend_fudge_item_re)
xend_syntax = re.compile(r"""
\[ #a list of
'%s' #string elements
(,'%s')* #repeated multiple times
\]
$ #and nothing else
""" % (xend_string_item_re, xend_string_item_re), re.VERBOSE | re.MULTILINE)
Try ^(?!,)((, *)?([a-zA-Z0-9]+)\b)*$
Step by step description:
Don't match a beginning comma (good for the upcoming "loop").
Match optional comma and spaces.
Match characters you like.
The word boundary match makes sure that a comma is required when more items follow in the string.
Please use: ^((([a-zA-Z0-9\s]){1,45},)+([a-zA-Z0-9\s]){1,45})$
Here, I have set the maximum word size to 45, as the longest word in English is 45 characters; this can be changed as required.

Extract exactly n digits in a sentence using REGEX

Example
The no.s 1234 65
Input: n
For n=4, the output should be 1234
For n=2, the output should be : 65 (not 12)
I tried \d{n}, which gives 12, and \d{n,}, which gives 1234, but I want only the run with exactly n digits.
Pattern p = Pattern.compile("\\d{" + n + ",}");
You need negative lookaround assertions: (?<!...) is a negative lookbehind and (?!...) is a negative lookahead:
(?<!\d)\d{4}(?!\d)
However, not all regex engines support them. A workaround is to also match the preceding and following characters (unlike lookarounds, which are zero-width matches) and capture the digits in a group; \D matches any character except a digit:
(?:^|\D)(\d{4})(?:\D|$)
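Since the question builds the pattern in Java with a variable n, the lookaround version could be sketched as follows. The class name ExactDigits and the method findRuns are hypothetical; the pattern is simply the expression above with n spliced in by string concatenation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExactDigits {
    // Hypothetical helper: returns every run of exactly n digits in text.
    static List<String> findRuns(String text, int n) {
        // (?<!\d) : no digit directly before; (?!\d) : no digit directly after.
        Pattern p = Pattern.compile("(?<!\\d)\\d{" + n + "}(?!\\d)");
        Matcher m = p.matcher(text);
        List<String> runs = new ArrayList<>();
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        String text = "The no.s 1234 65";
        System.out.println(findRuns(text, 4)); // prints [1234]
        System.out.println(findRuns(text, 2)); // prints [65]
    }
}
```

For n=2 the 12 inside 1234 is rejected because a digit follows it, which is the behaviour the question asks for.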
I think what you meant is the \b character.
Hence, the regex you're looking for would be (for n=2):
\b\d{2}\b
From what I understand, you're looking for a regex that will match a number in a string which has n digits, taking into account the spacing between the numbers. If that's the case, you're looking for something like this:
\b\d{4}\b
The \b will ensure the match is constrained to the start/end of a 'word', where a word boundary is the position between anything matched by \w (which includes digits) and anything matched by the opposite, \W (which includes spaces), or the start or end of the string.
I don't code in java but I can try to answer this using regex in general.
If your number is in the format d1d2d3d4 d5d6 and you want to extract digits d5d6, create 3 groups as ([0-9]+)(\s)([0-9]+), where each set of parentheses () represents one group. Now extract only the third group into another object, which is your required output.

regular expression in Java (Spring configuration) with 2 specific characters in beginning

I need regular expression which will start with 2 specific letters and will be 28 characters long.
The regular expression is needed, as this is in conjunction with Spring configuration, which will only take a regular expression.
I've been trying this, but it's not working: (^[AK][28]*)
If you mean that the string should be like "AKxxxxxxxx" (28 characters in total), then you can use:
^AK.{26}$ //using 26 since AK already counts for 2 characters
Regex is nothing specific to Java, nor is it that difficult if you have a look at any tutorial (and there's plenty!).
To answer your question:
AK[a-zA-Z]{26}
The above regex should solve your issue regarding a 28 character String with the first two letters being AK.
Elaboration:
AK > Characters written as such, without any special characters, will be matched as is (that means they must be where they were written, in exactly that fashion)
[a-zA-Z] > By using square brackets you can define a set of characters, signs, etc. to be matched against a part of the input (1 character by default) - you can write down all the possible characters/signs or make use of ranges and classes (e.g. a-z, \d for digits, and so forth)
{26} > For each set of characters/signs you can define a repetition count, which defines how often the set can/must be applied. E.g. {26} means it must match 26 times. Other possibilities are {2,26}, meaning it must match at least 2 times but at most 26 times, or an operator like *, + or ?, which denote that the set can be matched 0 or more times, at least once, or 0 or 1 time respectively
In case you need it matching a whole line you would likely want to add ^ and $ at the beginning and end respectively, to tell the regex parser that it has to match a whole line/String and not just a part:
^AK[a-zA-Z]{26}$
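A quick way to check the anchored expression in plain Java, outside of any Spring configuration, might look like this. The class name AkCode and the method isValid are invented for the sketch, and it assumes the 26 characters after AK are letters only, as in this answer.

```java
import java.util.regex.Pattern;

class AkCode {
    // "AK" followed by exactly 26 letters: 28 characters in total.
    private static final Pattern AK_28 = Pattern.compile("^AK[a-zA-Z]{26}$");

    // Hypothetical helper: true if s is AK plus 26 letters.
    static boolean isValid(String s) {
        return AK_28.matcher(s).matches();
    }

    public static void main(String[] args) {
        String twentySixLetters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
        System.out.println(isValid("AK" + twentySixLetters));       // 28 characters
        System.out.println(isValid("AK" + twentySixLetters + "A")); // 29 characters
        System.out.println(isValid("KA" + twentySixLetters));       // wrong prefix
    }
}
```

If the remaining characters may also be digits or other symbols, the character class would need to be widened, for example to the .{26} form shown in the other answer.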
If you need to count the number of repetitions, use the {min,max} syntax. Omitting both the comma and max tells the regex parser to look for exactly min repetitions.
For example :
.{1,3} will match for any character (shown by the dot) sequence between 1 and 3 characters long.
[AK]{2} will match for exactly 2 characters that are either A or K :
AK, AA, KA or KK.
Additionally, your regex uses [AK]. This means that it will match against one of the characters given, i.e. A or K.
If you need to match the specific "AK" sequence, then you need to get rid of the '[' and ']' tokens.
Therefore your regex could be AK.{26}, meaning it will match AK followed by exactly 26 characters, for 28 characters in total.

Reg Expression Validation on a String

Can I use a regular expression for the following use case?
I need to write a boolean method which takes a String parameter that should satisfy the following conditions.
20 character length string.
First 9 characters will be a number
Next 2 characters will be alphabets
Next 2 characters will be a number.(1 to 31 or 99)
Next 1 character will be an alphabet
Last 6 characters will be a number.
So far, I have written the code for the first requirement:
[a-zA-Z0-9]{20} - This expression works well for the first case. I don't know how to write a complete regular expression to meet the entire requirement.
Please help.
Yes, it is possible to use regexes for this.
Ignore the "20 characters" part and describe a string created by concatenating 9 digits, 2 letters, 2 digits, 1 letter and another 6 digits.
Start with the string start: ^
Then 9 digits. The \d conveniently describes the character set [0-9], so \d{9} means "nine digits"
Then 2 letters. The \w class is too broad, so stick to [a-zA-Z] for a letter.
Then another two digits. They seem to be from a restricted set, so describe the set with alternation and grouping.
Then another letter and another six digits.
And, finally, you have to end at the end of the string: $
For reference, this regex means "the string is nine letters, then 12-15 or 99, then another letter":
^[a-zA-Z]{9}(1[2-5]|99)[a-zA-Z]$
Read the String JavaDocs, especially the part about String.matches() as well as the documentation about regular expressions in Java.
Your first requirement is already implicit in the remaining ones, so I would just skip it. Then, just write the regex code that matches each part one after the other:
[0-9]{9}[a-zA-Z]{2}...
There is one special consideration for the number that might be 1 to 31. While it is possible to match this in one regex, it would be verbose and difficult to understand. Instead, perform basic matching in the regex and extract this part as a capturing group by putting it into parentheses:
([0-9]{2})
If you use Pattern and Matcher to apply your regex, and your string matches the pattern, you can then easily get at just those two characters, use Integer.parseInt() to convert them to an integer (which is completely safe because you know the two characters are digits), and then check the value normally.
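A minimal sketch of that approach is shown below. The class name CodeValidator and the method isValid are made up for the example, and it assumes a non-null input and that values below 10 are written with a leading zero (for example 07), since the field is two characters wide.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CodeValidator {
    // 9 digits, 2 letters, 2 digits (captured as group 1), 1 letter, 6 digits = 20 characters.
    private static final Pattern FORMAT =
            Pattern.compile("[0-9]{9}[a-zA-Z]{2}([0-9]{2})[a-zA-Z][0-9]{6}");

    // Hypothetical helper: checks the layout with the regex, then the number range in code.
    static boolean isValid(String s) {
        Matcher m = FORMAT.matcher(s);
        if (!m.matches()) {
            return false;
        }
        // Safe to parse: group 1 is known to be exactly two digits.
        int value = Integer.parseInt(m.group(1));
        return (value >= 1 && value <= 31) || value == 99;
    }

    public static void main(String[] args) {
        System.out.println(isValid("123456789AB15C123456"));
        System.out.println(isValid("123456789AB99C123456"));
        System.out.println(isValid("123456789AB32C123456"));
    }
}
```

Keeping the range check in ordinary code leaves the regex short and makes the 1 to 31 or 99 rule easy to change later.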
This regular expression
^[0-9]{9}[a-zA-Z]{2}(0[1-9]|[1-2][0-9]|3[0-1]|99)[a-zA-Z]([0-9]{6})$
takes
9 digits at the start,
followed by 2 letters,
followed by a two-digit number between 01 and 31, or 99,
followed by a letter,
followed by 6 digits.
