Regex match numbers with spaces but not without spaces

Regex match numbers with spaces but not without spaces - java

Trying to match a string of numbers with spaces in between while ignoring other strings of numbers without spaces in between them. I'd like to match 16 characters.
eg. Would like to match 12345 67890 1234 but NOT 1234567890123456
I have tried this:
[0-9 ]{16}
But this matches both sets of strings.

I used and corrected #Wiktor Stribiżew regex, because original regex will match a space at the beginning and the end of the number.
Regex: \b(?![0-9]{16})\d[0-9 ]{14}\d\b
Details:
\b assert position at a word boundary (^\w|\w$|\W\w|\w\W)
(?!) Negative Lookahead
[] Match a single character present in the list 0-9
{n} Matches exactly n times
\d matches a digit (equal to [0-9])
RegEx demo

You can use this regex to enforcees at least one space in between numbers:
\d+(?:\h+\d+)+
RegEx Demo
\d+: Match 1+ digits
(?:\h+\d+)+: Match 1+ group of 1+ whitespace and 1+ digits

Related

how to create a regular expression to validate the numbers are not same even separated by hyphen (-)

Using the following regex
^(\d)(?!\1+$)\d{3}-\d{1}$
It works for the pattern but I need to validate that all numbers are not the same even after /separated by the hyphen (-).
Example:
0000-0 not allowed (because of all are same digits)
0000-1 allowed
1111-1 not allowed (because of all are same digits)
1234-2 allowed

TheFourthBird's answer surely works that uses a negative lookahead. Here is another variant of this regex that might be slightly faster:
^(\d)(?!\1{3}-\1$)\d{3}-\d$
RegEx Demo
Explanation:
^(\d) matches and captures first digit after start in group #1
(?!\1{3}-\1$) is a negative lookahead that will fail the match if we have 3 repetitions and a hyphen and another repeat of 1st digit.

You could exclude only - or the same digit only to the right till the end of the string:
^(\d)(?!(?:\1|-)*$)\d{3}-\d$
^ Start of string
(\d) Capture group 1, match a digit
(?! Negative lookahead, assert what is to the right is not
(?:\1|-)*$ Optionally repeat either the backrefernce to what is already captured or - till the end of the string
) Close the non capture group
\d{3}-\d Match 3 digits - and a digit
$ End of string
Regex demo
If you don't want to match double -- or an - at the end of the string and match optional repetitions:
^(\d)(?!(?:\1|-)*$)\d*(?:-\d+)*$
Explanation
^ Start of string
(\d) Capture a single digits in group 1
(?!(?:\1|-)*$) Negative lookahead, assert not only - and the same digit till the end of the string
\d* Match optional digits
(?:-\d+)* Optionally repeat matching - and 1+ digits
$ End of string
Regex demo

You'll need a back reference, for example:
^(\d){4}-\1$

Regex to identify consecutive and non-consecutive duplicate words in multiline text

I'm writing a syntax checker (in Java) for a file that has the keywords and comma (separation)/semicolon (EOL) separated values. The amount of spaces between two complete constructions is unspecified.
What is required:
Find any duplicate words (consecutive and non-consecutive) in the multiline file.
// Example_1 (duplicate 'test'):
item1 , test, item3 ;
item4,item5;
test , item6;
// Example_2 (duplicate 'test'):
item1 , test, test ;
item2,item3;
I've tried to apply the (\w+)(s*\W\s*\w*)*\1 pattern, which doesn't catch duplicate properly.

You may use this regex with mode DOTALL (single line):
(?s)(\b\w+\b)(?=.*\b\1\b)
RegEx Demo
RegEx Details:
(?s): Enable DOTALL mode
(\b\w+\b): Match a complete word and capture it in group #1
(?=.*\b\1\b): Lookahead to assert that we have back-reference \1 present somewhere ahead. \b is used to make sure we match exact same word again.
Additionally:
Based on earlier comments below if intent was to not match consecutive word repeats like item1 item1, then following regex may be used:
(?s)(\b\w+\b)(?!\W+\1\b)(?=.*\b\1\b)
RegEx Demo 2
There is one extra negative lookahead assertion here to make sure we don't match consecutive repeats.
(?!\W+\1\b): Negative lookahead to fail the match for consecutive repeats.

You may use
\b(\w+)\b(?:\s*[^\w\s]\s*\w+)+\s*[^\w\s]\s*\b\1\b
See the regex demo
Details
\b(\w+)\b - Group 1: one or more word chars as a whole word
(?:\s*[^\w\s]\s*\w+)+ - 1 or more occurrences of:
\s* - 0+ whitespaces
[^\w\s] - 1 char other than a word and whitespace char
\s* - 0+ whitespaces
\w+ - 1+ word chars
\s* - 0+ whitespaces
[^\w\s] - 1 char other than a word and whitespace char
\s* - 0+ whitespaces
\b\1\b - the same value as in Group 1 as whole word.
To only match the word, put the second part of the regex into a positive lookahead:
\b(\w+)\b(?=(?:\s*[^\w\s]\s*\w+)+\s*[^\w\s]\s*\b\1\b)
^^^ ^
See this regex demo.
Java regex variable declaration:
String regex = "\\b(\\w+)\\b(?:\\s*[^\\w\\s]\\s*\\w+)+\\s*[^\\w\\s]\\s*\\b\\1\\b";
To make it fully Unicode aware add (?U):
String regex = "(?U)\\b(\\w+)\\b(?:\\s*[^\\w\\s]\\s*\\w+)+\\s*[^\\w\\s]\\s*\\b\\1\\b";

Regular expression to determine if the String consists of more than 4 numbers

I want to extract URL strings from a log which looks like below:
<13>Mar 27 11:22:38 144.0.116.31 AgentDevice=WindowsDNS AgentLogFile=DNS.log PluginVersion=X.X.X.X Date=3/27/2019 Time=11:22:34 AM Thread ID=11BC Context=PACKET Message= Internal packet identifier=0000007A4843E100 UDP/TCP indicator=UDP Send/Receive indicator=Snd Remote IP=X.X.X.X Xid (hex)=9b01 Query/Response=R Opcode=Q Flags (hex)=8081 Flags (char codes)=DR ResponseCode=NOERROR Question Type=A Question Name=outlook.office365.com
I am looking to extract Name text which contains more that 5 digits.
A possible way suggested is (\d.*?){5,} but does not seem to work, kindly suggest another way get the field.
Example of string match:
outlook12.office345.com
outlook.office12345.com

You can look for the following expression:
Name=([^ ]*\d{5,}[^ ]*)
Explanation:
Name= look for anything that starts with "Name=", than capture if:
[^ ]* any number of characters which is not a space
\d{5,} then 5 digits in a row
[^ ]* then again, all digits up to a white space

This regular expression:
(?<=Name=).*\d{5,}.*?(?=\s|$)
would extract strings like outlook.office365666.com (with 5 or more consecutive digits) from your example input.
Demo: https://regex101.com/r/YQ5l2w/1

Try this pattern: (?=\b.*(?:\d[^\d\s]*){5,})\S*
Explanation:
(?=...) - positive lookahead, assures that pattern inside it is matched somewhere ahead :)
\b - word boundary
(?:...) - non-capturing group
\d[^\d\s]* - match digit \d, then match zero or more of any characters other than whitespace \s or digit \d
{5,} - match preceeding pattern 5 or more times
\S* - match zero or more of any characters other than space to match the string if assertion is true, but I think you just need assertion :)
Demo
If you want only consecutive numbers use simplified pattern (?=\b.*\d{5,})\S*.
Another demo
Of course, you have to add positive lookbehind: (?<=Name=) to assert that you have Name= string preceeding

Try this regex
([a-z0-9]{5,}.[a-z0-9]{5,})+.com
https://regex101.com/r/OzsChv/3
It Groups,
outlook.office365.com
outlook12.office345.com
also all url strings

Regex to allow a space instead of following numbers after first two letters

I need a RegEx that allow a single space after two letters i.e. AB123 should not be allowed but AB 123 should be allowed ?

Here is the regex [a-zA-Z]{2}\s\S*
[a-zA-Z] means character from a to Z
{2} means character twice
\s means white space
\S means non white space.
* duplicate with 0 or more
https://regex101.com/r/uWYci4/1

This pattern will do the work: ^[a-zA-Z]{2} \d+$
Explanation:
^ - match beginning of a string
[a-zA-Z]{2} - match two letters (upper- or lowercase),
- match space
\d+ - match one or more digits
$ - match end of a string
Demo

value which should match only numbers, dots and commas

I have a regex:
"(\\d+\\.\\,?)+"
And the value:
3.053,500
But my regex pattern does not match it.
I want to have a pattern which validates numbers, dots and commas.
For exmaple values which are valid:
1
12
1,2
1.2
1,23,456
1,23.456
1.234,567
etc.

Your (\d+\.\,?)+ regex matches 1 or more repetitions of 1+ digits, a dot, and an opional ,. It means the strings must end with a dot. 3.053,500 does not end with a dot.
You may use
s.matches("\\d+(?:[.,]\\d+)*")
See the regex demo
Note that the ^ and $ anchors are not necessary in Java's .matches() method as the match is anchored to the start/end of the string automatically. At regex101.com, the anchors are meant to match start/end of the line (since the demo is run against a multiline string).
Pattern details
\d+ - 1+ digits
(?: - start of a non-capturing group:
[.,] - a dot or ,
\d+ - 1+ digits
)* - 0 or more repetitions.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Regex match numbers with spaces but not without spaces - java

Trying to match a string of numbers with spaces in between while ignoring other strings of numbers without spaces in between them. I'd like to match 16 characters. eg. Would like to match 12345 67890 1234 but NOT 1234567890123456 I have tried this: [0-9 ]{16} But this matches both sets of strings.

You can use this regex to enforcees at least one space in between numbers: \d+(?:\h+\d+)+ RegEx Demo \d+: Match 1+ digits (?:\h+\d+)+: Match 1+ group of 1+ whitespace and 1+ digits

Related

how to create a regular expression to validate the numbers are not same even separated by hyphen (-)

Regex to identify consecutive and non-consecutive duplicate words in multiline text

Regular expression to determine if the String consists of more than 4 numbers

Regex to allow a space instead of following numbers after first two letters

value which should match only numbers, dots and commas

Categories

Resources