I'm wondering whether it is possible to match literal brackets with a regular expression.
Say I need to find "[Detail]" in a text file. Since "[]" is reserved for character classes, the pattern matches any single character from the set D, e, t, a, i, l, which is not what I want.
Scanner s = new Scanner(data);
s.findInLine("[Detail]");
Many thanks.
Escape the reserved characters:
\\[Detail\\]
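As a minimal sketch of the difference, the following compares the unescaped pattern, the hand-escaped pattern, and Pattern.quote, which escapes an arbitrary literal. The class name BracketSearch, the helper findFirst, and the sample text are assumptions made for illustration; they are not from the question.

```java
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample text are illustrative only.
final class BracketSearch {

    // Returns the first match of regex in the first line of data, or null if there is none.
    static String findFirst(String data, String regex) {
        try (Scanner s = new Scanner(data)) {
            return s.findInLine(regex);
        }
    }

    public static void main(String[] args) {
        String data = "Name [Detail] value";
        // Unescaped: a character class, so the first hit is a single letter from the set.
        System.out.println(findFirst(data, "[Detail]"));
        // Escaped by hand: matches the literal text [Detail].
        System.out.println(findFirst(data, "\\[Detail\\]"));
        // Pattern.quote builds an escaped pattern from any literal string.
        System.out.println(findFirst(data, Pattern.quote("[Detail]")));
    }
}
```

Pattern.quote is handy when the literal comes from user input and may contain other reserved characters besides brackets.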
Related
I need to split my text into pieces and keep the delimiters as well. I know I can use the code below to do that:
Arrays.toString("a;b;c;d".split("((?<=;)|(?=;))"))
What I'm stuck on is that my delimiter contains a value: it has the form #[x]#,
where x can be any number, e.g. #[1]#, #[44]#. What I want to achieve is an array like the one below:
text : "Hello my Name is blabla.#[1]#How are you today?#[2]#ByeBye"
and what I need to get:
[ "Hello my Name is blabla.", "#[1]#", "How are you today?", "#[2]#", "ByeBye" ]
How can I achieve that? Thanks in advance.
Try the following regex as a delimiter:
((?<=(#\\[\\d\\]#))|(?=(#\\[\\d\\]#)))
This basically replaces the semicolon with the expression (#\\[\\d\\]#), where \d matches any single digit.
If more than one digit can occur, specify a bounded repetition, for example \d{1,1000} instead of \d to allow up to 1000 digits. An unbounded quantifier such as \d+ generally cannot be used inside a Java lookbehind, because the engine needs to know the maximum length the lookbehind can match.
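A minimal sketch of this approach with a bounded digit count is shown here. The class name KeepDelimiterSplit, the helper name, and the upper bound of 9 digits are assumptions chosen for illustration; pick whatever bound fits your data.

```java
import java.util.Arrays;

// Hypothetical demo class; the 9-digit bound is an assumption, not a requirement.
final class KeepDelimiterSplit {

    // The delimiter #[x]# with 1 to 9 digits, so the lookbehind has a known maximum length.
    private static final String DELIM = "#\\[\\d{1,9}\\]#";

    // Split at the zero-width positions just after or just before a delimiter.
    private static final String SPLIT_REGEX = "(?<=" + DELIM + ")|(?=" + DELIM + ")";

    static String[] splitKeepingDelimiters(String text) {
        return text.split(SPLIT_REGEX);
    }

    public static void main(String[] args) {
        String text = "Hello my Name is blabla.#[1]#How are you today?#[2]#ByeBye";
        System.out.println(Arrays.toString(splitKeepingDelimiters(text)));
    }
}
```

Because both alternatives are zero-width, nothing is consumed by the split, which is why the delimiters survive as their own array elements.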
I have a list of files in a folder:
maze1.in.txt
maze2.in.txt
maze3.in.txt
I've used substring to remove the .txt extensions.
How do I use regex to match the front and the back of the file name?
I need it to match "maze" at the front and ".in" at the back, and the middle must be a digit (can be single or double digit).
I've tried the following:
if (name.matches("name\\din")) {
// do something
}
It doesn't match anything. What is the correct regex to use?
I'm a little confused about what you are asking for in particular, but:
^(maze[0-9]*\.in)$
This matches maze, followed by any number of digits, followed by .in
^(maze[0-9]*\.in)\.txt$
This matches maze(any number).in.txt, and the capture group excludes the .txt, so there is no need to use substring.
The thing I would be wary of right now is the capture groups. I'm not sure what you are doing with this regex, but I believe an explanation of capture groups could benefit you.
A capture group is denoted by (). Whatever the parenthesized part matches is stored as a numbered group of the match, which gives you a way to pull pieces out of the input.
Example: maze1.in.txt
So if you want to capture the entire name minus .txt, I would use this: ^(maze[0-9]*\.in)\.txt$
However, if I wanted to capture the parts separately, I would do this: ^(maze)([0-9]*)(\.in)\.txt$ This excludes .txt but captures maze, the number, and .in in separate groups.
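To make the separate groups concrete, here is a small sketch that reads them back with Matcher.group. The class name MazeGroups and the helper parts are hypothetical; the pattern is the three-group one from the answer above.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex comes from the answer above.
final class MazeGroups {

    private static final Pattern FILE_NAME =
            Pattern.compile("^(maze)([0-9]*)(\\.in)\\.txt$");

    // Returns { prefix, number, suffix } or null when the name does not match.
    static String[] parts(String fileName) {
        Matcher m = FILE_NAME.matcher(fileName);
        if (!m.matches()) {
            return null;
        }
        // group(0) is the whole match; groups 1..3 follow the order of the opening parentheses.
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parts("maze1.in.txt")));
        System.out.println(Arrays.toString(parts("name1.in.txt")));
    }
}
```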
Your original solution doesn't work because string "name" is not in your text. It is "maze".
You can try this:
name.matches("maze\\d{1,2}\\.in")
\\d{1,2} matches one or two digits.
Anchors tell the regex to
start at the beginning of the string: ^
and stop at the end of the string: $
^maze\d{1,2}\.in$
or in Java:
name.matches("^maze\\d{1,2}\\.in$");
String.matches already requires the whole string to match, so the anchors are redundant there, but they matter when you search with Matcher.find().
Also, your regex had no dot (.) in it, so it could not match the examples you gave. You need \. in the regex to match a literal dot, because an unescaped . is a special character that matches any character.
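The difference between whole-string matching and searching can be sketched as follows. The class MazeNameCheck and its method names are assumptions for illustration; the pattern requires one or two digits, as the question asks.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class MazeNameCheck {

    private static final Pattern UNANCHORED = Pattern.compile("maze\\d{1,2}\\.in");
    private static final Pattern ANCHORED = Pattern.compile("^maze\\d{1,2}\\.in$");

    // String.matches succeeds only if the entire string matches, anchors or not.
    static boolean isMazeName(String name) {
        return name.matches("maze\\d{1,2}\\.in");
    }

    // find() looks for the pattern anywhere in the input.
    static boolean containsMazeName(String text) {
        return UNANCHORED.matcher(text).find();
    }

    // With ^ and $, find() behaves like a whole-string check again.
    static boolean findAnchored(String text) {
        return ANCHORED.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(isMazeName("maze1.in"));
        System.out.println(isMazeName("maze123.in"));
        System.out.println(containsMazeName("xmaze1.in.txt"));
        System.out.println(findAnchored("xmaze1.in.txt"));
    }
}
```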
It is always good to think through what you are trying to do in English before you write a regular expression.
You want to match a word maze followed by a digit, followed by a literal period . followed by another word.
word `\w` matches a word character
digit `\d` matches a single digit
period `\.` matches a literal period
word `\w` matches a word character
Putting it all together into a single string, you get (keep in mind the double backslash for the Java escape, and the pluses to repeat the previous match one or more times):
"\\w+\\d\\.\\w+"
The above is the generic case for any file name in the format xxx1.yyy. If you want to match maze and in specifically, you can just add those as literal strings:
"maze\\d+\\.in"
example: http://ideone.com/rS7tw1
name.matches("^maze[0-9]+\\.in\\.txt$")
I need a regular expression for the pattern below:
It can start with / or a number.
It can only contain numbers, no text.
Numbers can have a space between them.
It can contain any number of slashes, but it must contain at least one number; numbers may be separated by a space or by slashes.
Valid Strings:
3232////33 43/323//
3232////3343/323//
/3232////343/323//
Invalid Strings:
/sas/3232/////dsds/
/ /34343///// /////
///////////
My problem is that it can have a space between numbers, like /3232 323/, but not a space between slashes, like / /.
How can I validate it?
I have tried so far:
(\\d[\\d ]*/+) , (/*\\d[\\d ]*/+) , (/*)(\\d*)(/*)
This regex should work for you:
^/*(?:\\d(?: \\d)*/*)+$
Live Demo: http://www.rubular.com/r/pUOYFwV8SQ
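As a quick check, the regex above can be run against the valid and invalid strings from the question. This is only a sketch: the class name SlashNumberValidator and the method isValid are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex comes from the answer above.
final class SlashNumberValidator {

    // Optional leading slashes, then one or more runs of digits (single spaces
    // allowed between digits only), each run optionally followed by slashes.
    private static final Pattern VALID = Pattern.compile("^/*(?:\\d(?: \\d)*/*)+$");

    static boolean isValid(String s) {
        return VALID.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "3232////33 43/323//",
            "3232////3343/323//",
            "/3232////343/323//",
            "/sas/3232/////dsds/",
            "/ /34343///// /////",
            "///////////"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

Because a space is only accepted between two digits, a string such as / / is rejected, while /3232 323/ passes.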
My solution is not so simple, but it works:
^(((\d[\d ]*\d)|\d)|/)*((\d[\d ]*\d)|\d)(((\d[\d ]*\d)|\d)|/)*$
Just use lookarounds for the last criterion.
^(?=.*?\\d)([\\d/]*(?:/ ?(?!/)|\\d ?))+$
The best would have been to use conditional regex, but I think Java doesn't support them.
Explanation:
Basically: numbers or slashes, followed by either one number and a space, or one slash and a space that is not followed by another slash. Repeat that. The space is optional because I assume there is none at the end of your string.
Try this Java regex:
/*(\\d[\\d ]*(?<=\\d)/+)+
It meets all your criteria.
Although you didn't specifically state it, I have assumed that a space may not appear as the first or last character for a number (ie spaces must be between numbers)
"(?![A-z])(?=.*[0-9].*)(?!.*/ /.*)[0-9/ ]{2,}(?![A-z])"
This will match what you want, but keep in mind it will also match this:
/3232///// from /sas/3232/////dsds/
This is because part of the invalid string is valid on its own.
If you are reading line by line, add the ^ and $ anchors; if you are reading an entire block of text, search for \r\n around the regex above to match each line.
I want to validate a String that may contain only alphanumeric characters and only one special character. I tried (\\W).{1,1}(\\w+).
But it matches only when the String starts with a special character, and the one special character can be anywhere in the String.
Use the ^ and $ anchors to instruct the regex engine to start matching from the beginning of the string and stop matching at the end of the string, so taking your regex:
^(\\W).{1,1}(\\w+)$
Please take a look at this Oracle (Java) tutorial on regular expressions.
Try this regexp: \w*\W?\w* (Java string: "\\w*\\W?\\w*")
This expression has a drawback of matching zero-length strings. If your input must have exactly one special character, remove the question mark ? from the expression.
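Both variants can be sketched side by side. The class OneSpecialChar and its method names are assumptions for illustration. Note that \w also accepts the underscore and \W accepts whitespace, so adjust the classes if your definition of "special" differs.

```java
// Hypothetical demo class; only the two regexes come from the answer above.
final class OneSpecialChar {

    // Zero or one non-word character anywhere; also accepts the empty string.
    static boolean atMostOneSpecial(String s) {
        return s.matches("\\w*\\W?\\w*");
    }

    // Exactly one non-word character anywhere.
    static boolean exactlyOneSpecial(String s) {
        return s.matches("\\w*\\W\\w*");
    }

    public static void main(String[] args) {
        System.out.println(atMostOneSpecial("abc$def"));
        System.out.println(atMostOneSpecial("a$b$c"));
        System.out.println(atMostOneSpecial(""));
        System.out.println(exactlyOneSpecial("abc"));
        System.out.println(exactlyOneSpecial("abc$"));
    }
}
```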
Use matcher.find() and not matcher.matches(), search for \\w, and remove the plus (+), because \\w+ matches a whole run of alphanumeric characters in your string. If your string contains only such characters, your regex will match the whole string.
If I understand your regex correctly, this could solve your problem:
([\w]+)([^\w])([\w]+)
I am writing a program that scans text files and then writes each word into a HashMap.
The Scanner class uses whitespace as its default delimiter, so I ended up with words stored with punctuation attached to them. I want the scanner to recognize periods, commas and other common punctuation as a sign to end the token. Here's what I have attempted:
Scanner line_scanner = new Scanner(line).useDelimiter("[.,:;()?!\" \t]+~\\s");
The scanner basically ignored all the spaces even though I have '\\s' as part of the expression. Sorry, but I have hardly any understanding of regex.
Scanner line_scanner = new Scanner(line).useDelimiter("[.,:;()?!\"\\s]+");
You could instead treat everything that is not a Unicode letter as a delimiter:
useDelimiter("[^\\p{L}\\p{M}]+");
([^...] negates the class, \p{...} selects a Unicode category, L is the letters, and M is the combining diacritical marks (accents).)
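Either delimiter can be dropped into a word-counting loop like the sketch below. The class WordCounter, the method countWords, and the sample sentence are hypothetical; the two delimiter patterns are the ones from the answers above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical demo class; only the delimiter regexes come from the answers above.
final class WordCounter {

    // Tokenizes line with the given delimiter regex and counts each token.
    static Map<String, Integer> countWords(String line, String delimiterRegex) {
        Map<String, Integer> counts = new HashMap<>();
        try (Scanner sc = new Scanner(line).useDelimiter(delimiterRegex)) {
            while (sc.hasNext()) {
                // merge inserts 1 for a new word or adds 1 to the existing count.
                counts.merge(sc.next(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String line = "Hello, world! (Hello) again.";
        // Punctuation or whitespace ends a token.
        System.out.println(countWords(line, "[.,:;()?!\"\\s]+"));
        // Anything that is not a letter or combining mark ends a token.
        System.out.println(countWords(line, "[^\\p{L}\\p{M}]+"));
    }
}
```

The second delimiter also splits on digits and on punctuation that the explicit list does not mention, which may or may not be what you want.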