Regular expression not working despite testing

Regular expression not working despite testing - java

I'm trying to enforce validation of an ID that includes the first two letters being letters and the next four being numbers, there can be one 0 i.e. 0333 but can never be full zeroes with 0000 therefore something like ID0000 is not allowed. The expression I came up with seems to check out when testing it online but doesn't seem to work when trying to enforce it in the program:
\b(?![A-Z]{2}[0]{4})[A-Z]{2}[0-9]{4}\b
and heres the code I'm currently using to implement it:
String pattern = "/\b(?![A-Z]{2}[0]{4})[A-Z]{2}[0-9]{4}\b/";
Pattern regEx = Pattern.compile(pattern);
String ingID = ingredID.getText().toString();
Matcher m = regEx.matcher(ingID);
if (m.matches()) {
ingredID.setError("Please enter a valid Ingrediant ID");
}
For some reason it doesn't seem to validate correctly with accepting ids like ID0000 when it shouldn't be. Any thoughts folks ?

Change your regex pattern to "\\b(?![A-Z]{2}[0]{4})[A-Z]{2}[0-9]{4}\\b"

Your problem is essentially that Java isn't all that Regex-friendly; you need to deal with the limitations of Java strings in order to create a string that can be used as a Regex pattern. Since \ is the escape character in Regex and the escape character in Java strings (and since there's no such thing as a raw string literal in Java), you must double-escape anything that must be escaped in the Regex in order to create a literal \ character within the Java string, which, when parsed as a Regex pattern, will be correctly treated as the escape character.
So, for instance, the Regex pattern /\b/ (where /, as mentioned in my comment, delimits the pattern itself) would be represented in Java as the string "\\b".

Related

Matching sequence of unicode value in Java with regular expression

I have a text file that contains some sequence of unicode characters value like
"{"\u0985\u0982\u09b6\u0998\u099f\u09bf\u09a4","\u0985\u0982\u09b6\u09be\u0982\u09b6\u09bf","\u0985\u0982\u09b6\u09be\u0999\u09cd\u0995\u09bf\u09a4","\u0985\u0982\u09b6\u09be\u09a6\u09bf","\u0985\u0982\u09b6\u09be\u09a8\u09cb"}"
I am trying to match and group values inside the quotes using Pattern class in java like below but can not find any match.
Pattern p = Pattern.compile("\"(\\[u]{1}\\w+)+\"");
Example
I am actually willing to find out where is the technical error in my given regexp.

Try something more like this:
Pattern p = Pattern.compile("\"(\\\\u[0-9a-f]{4})+\"");
In order to match the string \u you need the regex \\u, and to express that regex as a Java string literal means \\\\u. Following the u there must be exactly four hex digits.

First, this bit [u]{1} means that you want to match values from the list only once, so you can replace it with simply u
Once that is done, your regex wants to match a quote, a slash, then a u, then another slash, then one or more w's, then a slash. It is matching w's instead of word characters because you have too many slashes before it.
Happy coding!
Edit
Try replacing the \\ before the u with a \\\\. \u is not valid in some regex's and so when you put in a Java string, it's probably becoming \u, breaking the regex

Regular Expression Java Error

I can't run this regular expression on Java:
String regex = "/^{m:\"(.*)\",s:([0-9]{1,15}),r:([0-9]{1,15}),t:([0-9]{1,2})}$/";
String data = "{m:\"texttexttext\",s:1231,r:23123,t:1}";
Pattern p = Pattern.compile(regex_Write_clientToServer);
Matcher a = p.matcher(data);
This the same regex and the same data on regex site's tester ( as http://gskinner.com/RegExr/ ) works fine!

Two problems:
In java, (unlike perl etc) regexes are not wrapped in / characters
You must escape your { literals:
Try this:
String regex = "^\\{m:\"(.*)\",s:([0-9]{1,15}),r:([0-9]{1,15}),t:([0-9]{1,2})\\}$";

There are two problems:
The forward slashes aren't part of the pattern itself, and shouldn't be included.
You need to escape the braces at the start and end, as otherwise they'll be treated as repetition quantifiers. This may not be the case in other regular expression implementations, but it's certainly the case in Java - when I tried just removing the slashes, I got an exception in Pattern.compile.
Try this:
String regex="^\\{m:\"(.*)\",s:([0-9]{1,15}),r:([0-9]{1,15}),t:([0-9]{1,2})\\}$";
(That works with your sample data.)
As an aside, if this is meant to be parsing JSON, I would personally not try to do it with regular expressions - use a real JSON parser instead. It'll be a lot more flexible in the long run.

Two things:
Java does not require you to have any kind of begin/end character. so you can drop the / chars
Also, Java requires you to escape any regex metacharacters if you want to match them. In your case, the brace characters '{' and '}' need to be preceded by a double backslash (one for java escape, one for regex escape):
"^\\{m:\"(.*)\",s:([0-9]{1,15}),r:([0-9]{1,15}),t:([0-9]{1,2})\\}$"

Possible Regular Expression Question

I have a simple program that looks up details of an IP you give it, and I will show you an example of some of my code
int regIndex = src.indexOf("Region:") + 16;
int endIndex = src.indexOf("<", regIndex);
String region = src.substring(regIndex, endIndex);
if(regIndex == 15) region = "None";
int counIndex = src.indexOf("Country:") + 17;
int couneIndex = src.indexOf(" <", counIndex);
String country = src.substring(counIndex, couneIndex);
As you can see, it is definitely not the most efficient way to do this. The website I am using gives the information like this: http://whatismyipaddress.com/ip/1.1.1.1
I have never really used Regular Expressions before, but it seems to me like there might be one that could really make this more efficient and easier to program, but I've been looking around and I'm pretty lost.
So basically my question is, how could I use a Regular Expression for this (Or if there is another more efficient way).
Any help would be great,
Thanks :)

You can do something like this:
String s = "bla Country: Australia <bla";
Pattern pattern = Pattern.compile("Country: (.*) [<]");
Matcher matcher = pattern.matcher(s);
if(matcher.find()) {
System.out.println("Country = " + matcher.group(1));
}

The source would look like this
<tr><th>Country:</th><td>Australia <img src="http://whatismyipaddress.com/images/flags/au.png" alt="au flag"> </td></tr>
To use regular expression means to match a pattern.
The pattern that indicates your wanted data is pretty straight forward Country:. You need also to match the following tags like <\/th><td>. The only thing is you need to escape the forward slash. Then there is the data you are looking for, I would suggest to match everything that is not a <, so [^<], this is a capturing group with a negation at the beginning, meaning any character that is not a <, to repeat this add a + at the end, meaning at least one of the preceding character.
So, the complete thing should look like this:
Country:<\/th><td>\s*([^<]+)\s*<
I added here also the brackets, they mean put the found pattern into a variable, so your result can be found in capturing group 1. I added also \s*, this is a whitespace character repeated 0 or more times, this is to match whitespace before or after your data, I assume that you don't need that.

Firstly there are some online sites that can help you to develop a regular expression. They let you enter some text, and a regular expression and then show you the result of applying the expression to the text. This saves you having to write code as you develop the expression and expand your understanding. A good site I use alot is FileFormat regex because it allows me to test one expression against multiple test strings. A quick search also brought up regex Planet, RegExr and RegexPal. There are lots of others.
In terms of resources, the Java Pattern class reference is useful for Java development and I quite like regular-expression.info as well.
For your problem I used fileFormat.info and came up with this regex to match "http://whatismyipaddress.com/ip/1.1.1.1":
.*//([.\w]+)/.*/(\d+(?:.\d+){3})
or as a java string:
".*//([.\\w]+)/.*/(\\d+(?:.\\d+){3})"
A quick break down says anything (.*), followed by two slashes (//), followed by at least one or more decimal points or characters (([.\w]+)), followed by a slash, any number of characters and another slash (/.*/), followed by at least 1 digit ((\d+), followed by 3 sets of a decimal point and at least one digit ((?:.\d+){3})). The sets of brackets around the server name part and the IP part are called capturing groups and you can use methods on the Java Matcher class to return the contents of these sections. The ?: on the second part of the ip address tells it that we are using the brackets to group the characters but it's not to be treated as a capturing group.
This regex is not as strict or as flexible as it should be, but it's a starting point.
All of this can be researched on the above links.

How do I write a regular expression to find the following pattern?

I am trying to write a regular expression to do a find and replace operation. Assume Java regex syntax. Below are examples of what I am trying to find:
12341+1
12241+1R1
100001+1R2
So, I am searching for a string beginning with one or more digits, followed by a "1+1" substring, followed by 0 or more characters. I have the following regex:
^(\d+)(1\\+1).*
This regex will successfully find the examples above, however, my goal is to replace the strings with everything before "1+1". So, 12341+1 would become 1234, and 12241+1R1 would become 1224. If I use the first grouped expression $1 to replace the pattern, I get the wrong result as follows:
12341+1 becomes 12341
12241+1R1 becomes 12241
100001+1R2 becomes 100001
Any ideas?

Your existing regex works fine, just that you are missing a \ before \d
String str = "100001+1R2";
str = str.replaceAll("^(\\d+)(1\\+1).*","$1");
Working link

IMHO, the regex is correct.
Perhaps you wrote it wrong in the code. If you want to code the regex ^(\d+)(1\+1).* in a string, you have to write something like String regex = "^(\\d+)(1\\+1).*".
Your output is the result of ^(\d+)(1+1).* replacement, as you miss some backslash in the string (e.g. "^(\\d+)(1\+1).*").

Your regex looks fine to me - I don't have access to java but in JavaScript the code..
"12341+1".replace(/(\d+)(1\+1)/g, "$1");
Returns 1234 as you'd expect. This works on a string with many 'codes' in too e.g.
"12341+1 54321+1".replace(/(\d+)(1\+1)/g, "$1");
gives 1234 5432.

Personally, I wouldn't use a Regex at all (it'd be like using a hammer on a thumbtack), I'd just create a substring from (Pseudocode)
stringName.substring(0, stringName.indexOf("1+1"))
But it looks like other posters have already mentioned the non-greedy operator.
In most Regex Syntaxes you can add a '?' after a '+' or '*' to indicate that you want it to match as little as possible before moving on in the pattern. (Thus: ^(\d+?)(1+1) matches any number of digits until it finds "1+1" and then, NOT INCLUDING the "1+1" it continues matching, whereas your original would see the 1 and match it as well).

Pattern Matching - String Search

I am trying to work out a formula to match a following pattern:
input string example:
'444'/'443'/'434'/'433'/'344'/'334'/'333'
if any of the patterns above exist in a particular input string I want to match it as the same pattern.
also is it possible to do a variable substitution using regex? meaning check for the 3 chars of the string by using each character as a variable and just doing an increment/decrement for each character? so that you dont have to specify the particular number ranges (hardcoding the pattern string ) for different patterns?
Is there any good library one can use for this?? I was working with Pattern class in java.
If you have any link which would be helpful please pass it through :)
Thank you.

Let's first consider this pattern: [34]{3}
The […] is a character class, it matches exactly one of the characters in the set. The {n} is an exact finite repetition.
So, [34]{3} informally means "exactly 3 of either '3' or '4'". Thus, it matches "333", "334", "343", "344", "433", "434", "443", "444", and nothing else.
As a string literal, the pattern is "[34]{3}". If you don't want to hardcode this pattern, then just generate similar-looking strings that follows this template "[…]{n}". Just put the characters that you want to match in the …, and substitute n with the number you want.
Here's an example:
String alpha = "aeiou";
int n = 5;
String pattern = String.format("[%s]{%s}", alpha, n);
System.out.println(pattern);
// [aeiou]{5}
We've now seen that the pattern is not hardcoded, but rather programmatically generated depending on the values of the variables alpha and n. The pattern [aeiou]{5} will 5 consecutive lowercase vowels, e.g. "ooiae", "ioauu", "eeeee", etc.
It's again not clear if you just want to match these kinds of strings, or if they have to appear like '…'/'…'/'…'/'…'/'…'. If the latter is desired, then simply compose the pattern as desired, using repetition and grouping as necessary. You can also just programmatically copy and paste the pattern 5 times if that's simpler. Here's an example:
String p5 = String.format("'%s'/'%<s'/'%<s'/'%<s'/'%<s'", pattern);
System.out.println(p5);
// '[aeiou]{5}'/'[aeiou]{5}'/'[aeiou]{5}'/'[aeiou]{5}'/'[aeiou]{5}'
This will now match strings like "'aeooi'/'eeiuu'/'uaooo'/'eeeia'/'eieio'".
Caveat
Do be careful about what goes in alpha. Specifically, -, [. ], &&, ^, etc, are special metacharacters in Java character class definition. If you restrict alpha to contain only digits/letters, then you will probably not run into any problems, but e.g. [^a] does NOT mean "either '^' or 'a'". It in fact means "anything but 'a'. See java.util.regex.Pattern for exact character class syntax.

You can use the regex:
('\\d{3}'/){6}'\\d{3}'

Pattern.Compile takes a String as its parameter. Though that's probably most often supplied in the form of a string literal, if you have variable upper and lower bounds for your pattern, you can use something like StringBuilder to build your string, then pass that result to Pattern.Compile.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Regular expression not working despite testing - java

Change your regex pattern to "\\b(?![A-Z]{2}[0]{4})[A-Z]{2}[0-9]{4}\\b"

Related

Matching sequence of unicode value in Java with regular expression

Regular Expression Java Error

Possible Regular Expression Question

How do I write a regular expression to find the following pattern?

Pattern Matching - String Search

Categories

Resources