Java regex split with any number of asterisks - java

I am learning regex (with this site) and trying to figure out how to parse the following string 1***2 to give me [1,2] (without using a specific case for 3 asterisks). There can be any number of asterisks that I need to split as one delimiter, so I am looking for the * char followed by the * wildcard. The delimiters could be letters as well.
The output should only be numbers, so I use ^-^0-9 to split by everything else.
So far I have tried:
input.split("[^-^0-9]"); // Gives me [1, , ,2]
input.split("[^-^0-9\\**]"); // Gives me [1***2]
input.split("[^-^0-9+\\**]"); // Gives me [1***2]
\* does not work in a Java string literal, as it is not recognized as a valid escape sequence.
Thanks!

You are looking for
input.split("[*]+");
This splits the string on one or more consecutive asterisks.
To allow other characters (e.g. letters) within delimiters, add them to the [*] character class.
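As a minimal sketch of the split above (the class and method names are made up for illustration, and the letter range a-zA-Z is just one assumed choice of extra delimiter characters):

```java
import java.util.Arrays;

class AsteriskSplitDemo {
    // Splits on runs of one or more consecutive asterisks.
    static String[] splitOnAsterisks(String input) {
        return input.split("[*]+");
    }

    // Hypothetical variant: ASCII letters also count as delimiter characters.
    static String[] splitOnAsterisksAndLetters(String input) {
        return input.split("[*a-zA-Z]+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnAsterisks("1***2")));            // [1, 2]
        System.out.println(Arrays.toString(splitOnAsterisksAndLetters("1*ab*2"))); // [1, 2]
    }
}
```

Keep in mind that if the input starts with a delimiter, String.split returns an empty string as the first element.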

If the delimiters could be letters as well,
you can use
\D+
OR
[^\d]+
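A short sketch of how this could look in Java, where the backslash has to be doubled inside a string literal. The second method is an assumption on my part: it keeps a minus sign attached to the number, which the question's own character class seems to hint at. All names are illustrative.

```java
import java.util.Arrays;

class NonDigitSplitDemo {
    // Splits on runs of anything that is not a digit.
    static String[] splitOnNonDigits(String input) {
        return input.split("\\D+");
    }

    // Assumed variant: '-' is not treated as a delimiter, so signs survive.
    static String[] splitKeepingMinus(String input) {
        return input.split("[^\\-0-9]+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnNonDigits("1ab***2")));  // [1, 2]
        System.out.println(Arrays.toString(splitKeepingMinus("-5**x3")));  // [-5, 3]
    }
}
```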

Related

Regex-How to prevent repeated special characters?

I don't have any experience with regular expressions. I need a regular expression that doesn't allow special characters (+-*/& etc.) to be repeated.
The string can contain digits, alphanumerics, and special characters.
This should be valid : abc,df
This should be invalid : abc-,df
I would really appreciate it if you could help me! Thanks in advance.
Two solutions presented so far match a string that is not allowed.
But the title is How to prevent..., so I assume that the regex
should match the allowed string. It means that the regex should:
match the whole string if it does not contain 2
consecutive special characters,
not match otherwise.
You can achieve this putting together the following parts:
^ - start of string anchor,
(?!.*[...]{2}) - a negative lookahead for 2 consecutive special
characters (marked here as ...), in any place,
a regex matching the whole (non-empty) string,
$ - end of string anchor.
So the whole regex should be:
^(?!.*[!@#$%^&*()\-_+={}[\]|\\;:'",<.>\/?]{2}).+$
Note that within a char class (between [ and ]) a backslash
escaping the following char should be placed before - (if in
the middle of the sequence), closing square bracket,
a backslash itself and / (regex terminator).
Or if you want to apply the regex to individual words (not the whole
string), then the regex should be:
\b(?!\S*[!@#$%^&*()\-_+={}[\]|\\;:'",<.>\/?]{2})\S+
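For what it's worth, here is a rough Java sketch of the whole-string variant. It uses a deliberately shortened set of special characters (+ - * / & ,) so the pattern stays readable; the set and all names are assumptions for illustration. If you port the full character class to Java, note that Java treats an unescaped [ inside a character class as the start of a nested class, so it would need to be written as \\[ there.

```java
import java.util.regex.Pattern;

class NoRepeatedSpecialsDemo {
    // Shortened, illustrative set of special characters: + - * / & ,
    private static final String SPECIAL = "[+\\-*/&,]";

    // Negative lookahead: reject if two special characters are adjacent anywhere.
    private static final Pattern ALLOWED =
            Pattern.compile("^(?!.*" + SPECIAL + "{2}).+$");

    static boolean isAllowed(String s) {
        return ALLOWED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("abc,df"));  // true
        System.out.println(isAllowed("abc-,df")); // false
    }
}
```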
[\,\+\-\*\/\&]{2,} Add more characters in the square bracket if you want.
Demo https://regex101.com/r/CBrldL/2
Use the following regex to match the invalid string.
[^A-Za-z0-9]{2,}
[^\w!\s]{2,} This would be the shortest version to match any two consecutive special characters (ignoring space)
If you want to consider space, please use [^\w]{2,}
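The opposite approach (detecting the invalid string) could be sketched in Java like this, using the [^A-Za-z0-9]{2,} pattern from above with find() rather than matches(). The names are hypothetical.

```java
import java.util.regex.Pattern;

class RepeatedSpecialsFinder {
    // Two or more consecutive characters that are neither letters nor digits.
    private static final Pattern TWO_SPECIALS = Pattern.compile("[^A-Za-z0-9]{2,}");

    // find() looks for the pattern anywhere in the string.
    static boolean hasRepeatedSpecials(String s) {
        return TWO_SPECIALS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasRepeatedSpecials("abc,df"));  // false
        System.out.println(hasRepeatedSpecials("abc-,df")); // true
    }
}
```

With this particular pattern two consecutive spaces also count as a repeat, since a space is not alphanumeric.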

Regular expression in Java for matching series of integers

I'm trying to write a regular expression in Java to match strings that look like
(1, 2, 3, 4, 5, 6)
That is, a left parenthesis, followed by a nonzero amount of nonnegative integers (separated by a comma and then any amount of whitespace), and ending with a right parenthesis.
I've tried
([0-9]+,\s+)
Does anyone know how to write such a regular expression?
You can try this pattern:
Pattern pattern = Pattern.compile("\\(\\d+(,\\s*\\d+)*\\)");
\\d: a digit (0 to 9)
\\s: a whitespace character
+: one or more occurrences
*: zero or more occurrences
See http://regex101.com/r/wT5wX7/1
Your regex ([0-9]+,\s+) is close somehow to matching the input string but the comma has only one occurrence (you'd expect zero or more commas), and it should be followed by digits, not just whitespace.
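A quick way to try the pattern above against a few inputs (a sketch only; the class and method names are invented for the example):

```java
import java.util.regex.Pattern;

class IntegerListDemo {
    // "(" digits, then zero or more ", digits" groups, then ")".
    private static final Pattern INT_LIST =
            Pattern.compile("\\(\\d+(,\\s*\\d+)*\\)");

    // matches() requires the whole input to fit the pattern.
    static boolean isIntegerList(String s) {
        return INT_LIST.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIntegerList("(1, 2, 3, 4, 5, 6)")); // true
        System.out.println(isIntegerList("(1)"));                // true
        System.out.println(isIntegerList("(1, )"));              // false
        System.out.println(isIntegerList("()"));                 // false
    }
}
```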
Use this: \(([0-9]+[\,]{1}[\s]*)+[0-9]+\)
Edit: \(([0-9]+[\,]{1}[\s]*)*[0-9]+\) - also matches (1)
Something like this could possibly help you.
(\(){1}(\d+,[ ]+)+(\)){1}
or with the leading and trailing /
/(\(){1}(\d+,[ ]+)+(\)){1}/
The pattern you tried, ([0-9]+,\s+),
says that you can have as many digits as you like, followed by a comma, followed by whitespace. In your attempt, did you account for multiple groups of digits followed by commas, or for the leading and trailing parentheses?

How to spot * in regular expressions?

I want to spot and delete all lines that have *** in them. How can I do this?
I tried to use regex but got
Exception in thread "main" java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 6
Here is my regular expression: (?m)^**.*.
.........text...........
***..........text....... //want to delete this line
........................
The * character has a special meaning in a regular expression. To tell the Pattern that you don't want this special meaning, you have to "escape" it. The easiest way to do that is to put your expression through Pattern.quote().
For example:
String searchFor = Pattern.quote("***");
Then use that string to search
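One possible way to use the quoted string, assuming the whole text is held in a single String and that matching lines should be dropped together with their line terminator (the method name is made up; \R needs Java 8 or later):

```java
import java.util.regex.Pattern;

class QuoteDemo {
    static String removeStarLines(String text) {
        // Pattern.quote wraps the text so that every '*' is taken literally.
        String literal = Pattern.quote("***");
        // (?m) lets ^ match at the start of each line;
        // (?:\R|$) consumes the line break, or accepts the end of the input.
        return text.replaceAll("(?m)^.*" + literal + ".*(?:\\R|$)", "");
    }

    public static void main(String[] args) {
        String input = "abc***def\nghijkl\n***mnop\n**blah";
        System.out.println(removeStarLines(input));
    }
}
```

For the sample input this leaves only the ghijkl and **blah lines.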
Note that * is a special character in regex, so in a Java string literal you have to write it as \\*
Your expression will be: (?m)^\\*\\*\\*.*
This is not perfect, but it'll get you started:
// 4 lines, 2 of them containing "***" at different locations
String input = "abc***def\nghijkl\n***mnop\n**blah";
// replacing multiline pattern starting with any character 0 or more times,
// followed by 3 escaped "*"s,
// followed by any character 0 or more times
System.out.println(input.replaceAll("(?m).*\\*{3}.*", ""));
Output:
ghijkl
**blah
If the three asterisks are not always at the beginning of the line, you can use this pattern that removes newlines too:
(\r?\n)?[^\r\n*]*\Q***\E.*((1)?|\r?\n?)
If all you're doing is looking for three specific characters together in a string, you don't need a regex at all:
if (line.contains("***")) {
...
}
(But if things get more complicated and you do need a regex, then use a backslash or Pattern.quote as the other answers say.)
(This is assuming you're reading lines one at a time, instead of having one big long buffer containing all the lines with newline characters. Some of the other answers handle the latter case.)
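A small sketch of that line-at-a-time idea, assuming the lines are already available as a List<String> (how they are read is left out, and the names are illustrative):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ContainsFilterDemo {
    // Keeps only the lines that do not contain three asterisks in a row.
    static List<String> dropStarLines(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (!line.contains("***")) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("abc***def", "ghijkl", "***mnop", "**blah");
        System.out.println(dropStarLines(lines)); // [ghijkl, **blah]
    }
}
```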

Need regular expression for pattern this

I need a regular expression for below pattern
It can start with / or number
It can only contain numbers, no text
Numbers can have space in between them.
It can contain /*, at least 1 number and space or numbers and /*
Valid Strings:
3232////33 43/323//
3232////3343/323//
/3232////343/323//
Invalid Strings:
/sas/3232/////dsds/
/ /34343///// /////
///////////
My problem is that it can have a space between numbers, like /3232 323/, but not / /.
How can I validate it?
I have tried so far:
(\\d[\\d ]*/+) , (/*\\d[\\d ]*/+) , (/*)(\\d*)(/*)
This regex should work for you:
^/*(?:\\d(?: \\d)*/*)+$
Live Demo: http://www.rubular.com/r/pUOYFwV8SQ
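To check it against the strings from the question, something along these lines should do (a sketch, with hypothetical class and method names):

```java
import java.util.regex.Pattern;

class SlashNumberValidator {
    // Optional leading slashes, then one or more groups of
    // "digit, optionally followed by space-separated digits, then any slashes".
    private static final Pattern VALID =
            Pattern.compile("^/*(?:\\d(?: \\d)*/*)+$");

    static boolean isValid(String s) {
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("3232////33 43/323//")); // true
        System.out.println(isValid("/3232////343/323//"));  // true
        System.out.println(isValid("/sas/3232/////dsds/")); // false
        System.out.println(isValid("/ /34343///// /////")); // false
        System.out.println(isValid("///////////"));         // false
    }
}
```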
My solution is not so simple but it works
^(((\d[\d ]*\d)|\d)|/)*((\d[\d ]*\d)|\d)(((\d[\d ]*\d)|\d)|/)*$
Just use lookarounds for the last criteria.
^(?=.*?\\d)([\\d/]*(?:/ ?(?!/)|\\d ?))+$
The best would have been to use conditional regex, but I think Java doesn't support them.
Explanation:
Basically, numbers or slashes, followed by one number and a space, or one slash and a space which is not followed by another slash. Repeat that. The space is made optional because I assume there's none at the end of your string.
Try this java regex
/*(\\d[\\d ]*(?<=\\d)/+)+
It meets all your criteria.
Although you didn't specifically state it, I have assumed that a space may not appear as the first or last character for a number (ie spaces must be between numbers)
"(?![A-z])(?=.*[0-9].*)(?!.*/ /.*)[0-9/ ]{2,}(?![A-z])"
this will match what you want but keep in mind it will also match this
/3232///// from /sas/3232/////dsds/
this is because part of the invalid string is correct
If you are reading line by line, then anchor the pattern with ^ and $; if you are reading an entire block of text, then search for \r\n around the regex above to match each new line.

Regex for matching alternating sequences

I'm working in Java and having trouble matching a repeated sequence. I'd like to match something like:
a.b.c.d.e.f.g.
and be able to extract the text between the delimiters (e.g. return abcdefg) where the delimiter can be multiple non-word characters and the text can be multiple word characters. Here is my regex so far:
([\\w]+([\\W]+)(?:[\\w]+\2)*)
(Doesn't work)
I had intended to get the delimiter in group 2 with this regex and then use a replaceAll on group 1 to exchange the delimiter for the empty string giving me the text only. I get the delimiter, but cannot get all the text.
Thanks for any help!
Replace (\w+)\W+ by $1
Replace (\w+)(\W+|$) with $1. Make sure that global flag is turned on.
It replaces a sequence of word chars followed by a sequence of non-word-chars or end-of-line with the sequence of words.
String line = "Am.$#%^ar.$#%^gho.$#%^sh";
line = line.replaceAll("(\\w+)(\\W+|$)", "$1");
System.out.println(line);//prints my name
Why not use String.split?
Why not ..
find all occurrences of (\w+) and then concatenate them; or
find all non-word characters (\W+) and then use Matcher.replaceAll with an empty string?
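Both ideas could look roughly like this in Java (a sketch; the class and method names are not from the thread, and String.replaceAll is used as a shortcut for the Matcher call):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordExtractDemo {
    // Option 1: find every run of word characters and concatenate them.
    static String joinWords(String input) {
        Matcher m = Pattern.compile("\\w+").matcher(input);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    // Option 2: delete every run of non-word characters.
    static String stripNonWords(String input) {
        return input.replaceAll("\\W+", "");
    }

    public static void main(String[] args) {
        System.out.println(joinWords("a.b.c.d.e.f.g."));     // abcdefg
        System.out.println(stripNonWords("a.b.c.d.e.f.g.")); // abcdefg
    }
}
```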
