Why does this regex not work? - Java

I am trying to match a string with a Java regex and I cannot get it to work. I'm pretty new to Java, and most of my regex experience is on Linux. Can someone help me?
Below is the code that I'm using.
The regex is:
//vod//final\_\d{0,99}.\d{0,99}\\-Frag\d{0,99}
The line that I'm trying to match is
/vod/final_1.3Seg1-Frag1
where I want 1.3, 1 and 1 to be wildcarded.
Someone please help me out... :(

You are missing the Seg1 part. You are also escaping characters that do not need to be escaped. Try this regexp: /vod/final_\\d+\\.\\d+Seg\\d+-Frag\\d+

This should work:
Pattern p = Pattern.compile( "/vod/final_\\d+\\.\\d+Seg\\d+-Frag\\d+" );
Notes: To protect special characters, you can use Pattern.quote().
When running into problems like this, start with a simple text and pattern and build from there. For example, first try to match /, then /vod/, then /vod/final_1, etc.
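The build-it-up approach described above can be sketched in Java roughly as follows. This is only an illustration using the sample line from the question; the class name VodPatternDemo and the helper matchesLine are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

public class VodPatternDemo {
    // Literal prefix, then digits '.' digits, "Seg" digits, "-Frag" digits.
    // In a Java string literal every regex backslash has to be doubled.
    static final Pattern FULL =
            Pattern.compile("/vod/final_\\d+\\.\\d+Seg\\d+-Frag\\d+");

    // Hypothetical helper: true when the whole line matches the pattern.
    static boolean matchesLine(String line) {
        return FULL.matcher(line).matches();
    }

    public static void main(String[] args) {
        String line = "/vod/final_1.3Seg1-Frag1";
        // Grow the pattern one piece at a time and check that each
        // partial pattern still matches at the start of the line.
        String[] steps = {"/", "/vod/", "/vod/final_\\d+", "/vod/final_\\d+\\.\\d+"};
        for (String step : steps) {
            boolean ok = Pattern.compile(step).matcher(line).lookingAt();
            System.out.println(step + " -> " + ok);
        }
        System.out.println("full -> " + matchesLine(line));
    }
}
```

If one of the steps prints false, the piece added in that step is the one to look at.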

You're escaping too much. Don't escape /, _, or -. The dot does need escaping, and the Seg part has to be in the pattern.
Something like:
/vod/final_\d{0,99}\.\d{0,99}Seg\d{0,99}-Frag\d{0,99}

Does this work?
/\/vod\/final\_\d{0,99}.\d{0,99}Seg\d-Frag\d{0,99}
Also, here's what I used to edit the regex you provided above: http://rubular.com/
It says it's for Ruby, but it also mentions that it works for Java too.

Related

Regex for all Symbols except for |

Is there a way to use the pattern \p{Punct} in a Java regex while excluding the symbol |?
I tried \\p{Punct}&&[^|], but it didn't work.
What you have is nearly correct. The intersection operator && only works inside a character class, so the correct syntax is:
[\\p{Punct}&&[^|]]
Did you see this question: Punctuation Regex in Java?
I think you can modify that answer slightly to use (?![|])\\p{Punct}
Did you try this class: [^\\P{Punct}|] (note the uppercase P)?
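The three suggestions above can be compared side by side with a small Java sketch. The class name PunctExceptPipe and its helper methods are hypothetical and exist only for this illustration.

```java
import java.util.regex.Pattern;

public class PunctExceptPipe {
    // Character-class intersection: punctuation AND not '|'.
    static final Pattern INTERSECTION = Pattern.compile("[\\p{Punct}&&[^|]]");
    // Negative lookahead: the next character is not '|', then punctuation.
    static final Pattern LOOKAHEAD = Pattern.compile("(?![|])\\p{Punct}");
    // Double negation: neither non-punctuation nor '|'.
    static final Pattern DOUBLE_NEGATION = Pattern.compile("[^\\P{Punct}|]");

    // True when all three variants accept the string.
    static boolean matchesAll(String s) {
        return INTERSECTION.matcher(s).matches()
                && LOOKAHEAD.matcher(s).matches()
                && DOUBLE_NEGATION.matcher(s).matches();
    }

    // True when at least one variant accepts the string.
    static boolean matchesAny(String s) {
        return INTERSECTION.matcher(s).matches()
                || LOOKAHEAD.matcher(s).matches()
                || DOUBLE_NEGATION.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"!", "-", "|", "a"}) {
            System.out.println(s + " -> " + matchesAll(s));
        }
    }
}
```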

Regex with -, ::, ( and )

I need to split the string
(age-is-25::OR::last_name-is-qa6)::AND::(age-is-20::OR::first_name-contains-test)
into
string[0] = (age-is-25::OR::last_name-is-qa6)
string[1] = AND
string[2] = (age-is-20::OR::first_name-contains-test)
I tried writing so many regex expressions, but nothing works as expected.
With the following regex, Matcher.groupCount() returns 2, but when I add the groups to an ArrayList, the elements are null.
Pattern pattern = Pattern.compile("(\\)::)?|(::\\()?");
I tried to split it using ):: or ::(.
I know the regex looks too stupid, but being a beginner this is the best I could write.
You can use a positive lookbehind and a positive lookahead to split on :: only where it follows a closing parenthesis or precedes an opening one, without consuming the parentheses.
String str = "(age-is-25::OR::last_name-is-qa6)::AND::(age-is-20::OR::first_name-contains-test)";
for (String s : str.split("(?<=\\))::|::(?=\\()")) {
    System.out.println(s);
}
Outputs:
(age-is-25::OR::last_name-is-qa6)
AND
(age-is-20::OR::first_name-contains-test)
Just a note, however: it seems like you are parsing some kind of recursive language. Regular expressions are not good at this. If you are doing advanced parsing, I would recommend looking at other parsing methods.
To me it looks like a big part of your stress comes from the need to escape special characters in your search term. I highly recommend not escaping special characters manually, and instead using Pattern.quote(...) for the escaping.
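As a rough sketch of that advice, the literal pieces of the separator can be passed through Pattern.quote and combined with the lookaround from the earlier answer. The class name QuotedSplitDemo and the method splitGroups are hypothetical names for this example.

```java
import java.util.regex.Pattern;

public class QuotedSplitDemo {
    // Splits on "::" only when it follows ")" or precedes "(".
    // Pattern.quote turns each literal into a regex that matches it verbatim,
    // so no manual backslashes are needed for the parentheses.
    static String[] splitGroups(String input) {
        String close = Pattern.quote(")");
        String open = Pattern.quote("(");
        String sep = Pattern.quote("::");
        String regex = "(?<=" + close + ")" + sep + "|" + sep + "(?=" + open + ")";
        return input.split(regex);
    }

    public static void main(String[] args) {
        String str = "(age-is-25::OR::last_name-is-qa6)::AND::"
                + "(age-is-20::OR::first_name-contains-test)";
        for (String s : splitGroups(str)) {
            System.out.println(s);
        }
    }
}
```

Only the literal text is quoted; the lookaround syntax itself still has to be written by hand.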
This should work:
"(?<=\\))::|::(?=\\()"
This should work for you.
\)::|::\(
textString.split("\\)::|::\\(")
should work, although it consumes the parenthesis next to each separator, so the resulting pieces lose it.

How do I write this regex in Java?

Basically, I have this regex:
{(\(\(("\w{1,}",{0,1}){2}\),\(("[^:=;#"\)\(\{\}\[\]]{1,}",{0,1}){2}"[LR]{1}"\)\),{0,1}){1,}}
which I've tested on Regexpal against this input:
{(("st0","sy0"),("st1","sy3","L")),(("st0","sy0"),("st1","^","L"))}
I now need it in Java. I can't figure out how to convert it. Can somebody show me how?
You need to escape the special characters in the string literal, specifically the backslashes and the quote marks.
The regular expression can work in Java; the main thing you have to do is escape the backslashes.
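A possible Java version of that conversion is sketched below. The class name TupleListRegexDemo and the method isValid are hypothetical. One assumption goes beyond the answers above: the outer { and } are escaped as well, since Java's regex engine can reject a bare { that does not start a valid quantifier.

```java
import java.util.regex.Pattern;

public class TupleListRegexDemo {
    // The regex from the question as a Java string literal:
    // each backslash is doubled and each double quote is written as \".
    // Assumption: the outer braces are escaped too (see the note above).
    static final Pattern TUPLES = Pattern.compile(
            "\\{(\\(\\((\"\\w{1,}\",{0,1}){2}\\),"
            + "\\((\"[^:=;#\"\\)\\(\\{\\}\\[\\]]{1,}\",{0,1}){2}\"[LR]{1}\"\\)\\),{0,1}){1,}\\}");

    // True when the whole input matches the pattern.
    static boolean isValid(String input) {
        return TUPLES.matcher(input).matches();
    }

    public static void main(String[] args) {
        // The sample input from the question, with its quotes escaped.
        String input = "{((\"st0\",\"sy0\"),(\"st1\",\"sy3\",\"L\")),"
                + "((\"st0\",\"sy0\"),(\"st1\",\"^\",\"L\"))}";
        System.out.println(isValid(input));
    }
}
```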

How do I write a regular expression to find the following pattern?

I am trying to write a regular expression to do a find and replace operation. Assume Java regex syntax. Below are examples of what I am trying to find:
12341+1
12241+1R1
100001+1R2
So, I am searching for a string beginning with one or more digits, followed by a "1+1" substring, followed by 0 or more characters. I have the following regex:
^(\d+)(1\\+1).*
This regex will successfully find the examples above, however, my goal is to replace the strings with everything before "1+1". So, 12341+1 would become 1234, and 12241+1R1 would become 1224. If I use the first grouped expression $1 to replace the pattern, I get the wrong result as follows:
12341+1 becomes 12341
12241+1R1 becomes 12241
100001+1R2 becomes 100001
Any ideas?
Your existing regex works fine; you are just missing a \ before \d.
String str = "100001+1R2";
str = str.replaceAll("^(\\d+)(1\\+1).*","$1");
IMHO, the regex is correct.
Perhaps you wrote it wrong in the code. If you want to code the regex ^(\d+)(1\+1).* in a string, you have to write something like String regex = "^(\\d+)(1\\+1).*".
Your output is the result of a ^(\d+)(1+1).* replacement, because a backslash is missing in the string (e.g. "^(\\d+)(1\+1).*").
Your regex looks fine to me. I don't have access to Java, but in JavaScript the code
"12341+1".replace(/(\d+)(1\+1)/g, "$1");
returns 1234 as you'd expect. This also works on a string containing many 'codes', e.g.
"12341+1 54321+1".replace(/(\d+)(1\+1)/g, "$1");
gives 1234 5432.
Personally, I wouldn't use a regex at all (it'd be like using a hammer on a thumbtack); I'd just take a substring (pseudocode):
stringName.substring(0, stringName.indexOf("1+1"))
But it looks like other posters have already mentioned the non-greedy operator.
In most regex syntaxes you can add a '?' after a '+' or '*' to indicate that you want it to match as little as possible before moving on in the pattern. Thus ^(\d+?)(1\+1) matches as few digits as possible until it finds "1+1", and group 1 does not include the "1+1". Your original greedy \d+ first consumes the extra 1 as well and then backtracks to reach the same result.
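Both approaches from the answers above, the regex replacement and the plain substring, can be sketched in Java as follows. The class name PrefixBeforeMarker and its method names are hypothetical; the guard for a missing "1+1" is an addition that the pseudocode above does not have.

```java
public class PrefixBeforeMarker {
    // Regex approach: keep only group 1, the digits before "1+1".
    // The backslashes are doubled because this is a Java string literal.
    static String viaRegex(String s) {
        return s.replaceAll("^(\\d+)(1\\+1).*", "$1");
    }

    // Substring approach: cut at the first occurrence of "1+1".
    // Returns the input unchanged when "1+1" is not present.
    static String viaSubstring(String s) {
        int idx = s.indexOf("1+1");
        return idx < 0 ? s : s.substring(0, idx);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"12341+1", "12241+1R1", "100001+1R2"}) {
            System.out.println(s + " -> " + viaRegex(s) + " / " + viaSubstring(s));
        }
    }
}
```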

Negating a set of words via java regex

I would like to negate a set of words using a Java regex.
Say I want to negate cvs, svn, nvs and mvc. I wrote the regex ^[(svn|cvs|nvs|mvc)].
Somehow that does not seem to work.
Try this:
^(?!.*(svn|cvs|nvs|mvc)).*$
This will match the text if it doesn't contain any of svn, cvs, nvs or mvc.
This is a similar question: C# Regex to match a string that doesn't contain a certain string?
It's not that simple. Character classes negate single characters, not whole words.
For example, to negate
/svn/
you might try
/[^s][^v][^n]/
but that also rejects strings such as "sab" that only share one letter position with svn. Doing this properly turns into a really ugly regex, so I think it's a better idea to use this regex
/svn|cvs|nvs|mvc/
and, when you test your string against it, just negate the result.
In JS this would look more or less like this:
!/svn|cvs|nvs|mvc/.test("this is your test string");
Your regex is wrong. Square brackets define a character class, so [(svn|cvs|nvs|mvc)] matches a single character from that set rather than one of the words. If your string doesn't match ^(svn|cvs|nvs|mvc)$, you're fine.
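For Java, the two workable approaches from the answers above (a negative lookahead, or matching the words and negating the result) might look like this sketch. The class name WordExclusionDemo and its methods are hypothetical, and single-line input is assumed because . does not match line breaks by default.

```java
import java.util.regex.Pattern;

public class WordExclusionDemo {
    // Matches the whole string only if none of the words occurs anywhere in it.
    static final Pattern NONE_OF = Pattern.compile("^(?!.*(svn|cvs|nvs|mvc)).*$");
    // Finds any one of the words; the caller negates the result.
    static final Pattern ANY_OF = Pattern.compile("svn|cvs|nvs|mvc");

    // Lookahead approach (assumes the input has no line breaks).
    static boolean allowedByLookahead(String s) {
        return NONE_OF.matcher(s).matches();
    }

    // Negation approach: search for a forbidden word and invert the answer.
    static boolean allowedByNegation(String s) {
        return !ANY_OF.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"this is your test string", "checkout from svn"}) {
            System.out.println(s + " -> " + allowedByLookahead(s) + " / " + allowedByNegation(s));
        }
    }
}
```

The negation variant is usually easier to read and to extend with more words.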
