How to create a regex that accepts specific characters? - java

I have this regex:
^[a-zA-Z0-9_#.#$%&'*+-/=?^`{|}~!(),:;<>[-\]]{8,}$
I need a regex that accepts words of at least 8 characters made up of letters (uppercase and lowercase), digits, and these characters:
!#$%&'*+-/=?^_`{|}~"(),:;<>#[]
It works when I tested it here.
This is how I used it in Java on Android.
public static final String regex = "^[a-zA-Z0-9_#.#$%&'*+-/=?^`{|}~!(),:;<>[-\\]]{8,}$";
This is the error that I received.
java.util.regex.PatternSyntaxException: Missing closing bracket in character class near index 49
^[a-zA-Z0-9_#.#$%&'*+-/=?^`{|}~!(),:;<>[-\]]{8,}$

If you just want to test if a given input string matches your pattern, you may use String#matches directly, e.g.
String regex = "[a-zA-Z0-9_#.#$%&'*+-/=?^`{|}~!(),:;<>\\[\\]-]{8,}";
String input = "Jon#Skeet#123";
if (input.matches(regex)) {
System.out.println("Found a match");
}
else {
System.out.println("No match");
}
If you wanted to parse a larger input text and identify such matching words, then you would want to use a formal Pattern and Matcher. But, I don't see the need for this just based on your question.
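As for the exception itself: in Java, an unescaped [ inside a character class opens a nested character class, so the pattern from the question leaves the outer class unclosed. Below is a minimal sketch of that behaviour, assuming a plain JVM; the class name CharClassCheck and its method names are made up for illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class CharClassCheck {
    // Pattern from the question: the unescaped '[' starts a nested character class.
    static final String BROKEN =
            "^[a-zA-Z0-9_#.#$%&'*+-/=?^`{|}~!(),:;<>[-\\]]{8,}$";

    // Same set with '[' and ']' escaped and '-' moved to the end so it is a literal.
    static final Pattern FIXED =
            Pattern.compile("^[a-zA-Z0-9_#.#$%&'*+-/=?^`{|}~!(),:;<>\\[\\]-]{8,}$");

    // Returns true if the regex compiles, false if it is rejected as invalid.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Checks the whole input against the corrected pattern.
    static boolean isValid(String input) {
        return FIXED.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(compiles(BROKEN));
        System.out.println(isValid("Jon#Skeet#123"));
    }
}
```

Compiling the pattern once into a static Pattern also avoids recompiling it on every call, which String#matches would do.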

You can use the Pattern and Matcher classes; this may help you.
Follow this tutorial: https://www.mkyong.com/regular-expressions/how-to-validate-password-with-regular-expression/
Here is one example.
try {
    Pattern pattern;
    Matcher matcher;
    final String PASSWORD_PATTERN = "((?=.*\\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[##$%]).{6,20})";
    pattern = Pattern.compile(PASSWORD_PATTERN);
    matcher = pattern.matcher(password_string);
    if (matcher.matches()) {
        Log.e("TAG", "TRUE");
    } else {
        Log.e("TAG", "FALSE");
    }
} catch (RuntimeException e) {
    return false;
}

Related

Regex Redirect URL excludes token

I'm trying to create a redirect URL for my client. We have a service in which you specify a "fromUrl" -> "toUrl" mapping, and it uses a Java regex Matcher. But I can't get it to keep the token when it converts the URL. For example:
/fromurl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf
Should be:
/tourl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf
but it excludes the token so the result I get is:
/fromurl/login/
/tourl/login/
I tried various regex patterns such as ?.* and [%5E//?]+)/([^/?]+)/(?.*)?$ and (/*), but none of them seem to work.
I'm not that familiar with regex. How can I solve this?
This can easily be done with a simple string replace, but if you insist on using regular expressions:
Pattern p = Pattern.compile("fromurl");
String originalUrlAsString = "/fromurl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf ";
String newRedirectedUrlAsString = p.matcher(originalUrlAsString).replaceAll("tourl");
System.out.println(newRedirectedUrlAsString);
If I understand you correctly you need something like this?
String from = "/my/old/url/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf";
String to = from.replaceAll("\\/(.*)\\/", "/my/new/url/");
System.out.println(to); // /my/new/url/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf
This will replace everything between the first and the last forward slash.
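Because .* is greedy, "the last forward slash" means the last one anywhere in the string, including one inside the query string. Here is a small sketch of that edge case, next to a tighter alternative that only replaces the first path segment. The class name, the method names and the sample URLs are assumptions for illustration.

```java
import java.util.regex.Matcher;

final class FirstSegmentReplace {
    // Greedy version: replaces from the first '/' up to the last '/' in the string.
    static String greedy(String url, String newPrefix) {
        return url.replaceAll("/(.*)/", Matcher.quoteReplacement(newPrefix));
    }

    // Tighter version: replaces only the first path segment, e.g. "/fromurl/".
    static String firstSegmentOnly(String url, String newPrefix) {
        return url.replaceFirst("^/[^/?]+/", Matcher.quoteReplacement(newPrefix));
    }

    public static void main(String[] args) {
        String url = "/fromurl/login?next=/home";
        System.out.println(greedy(url, "/tourl/"));           // /tourl/home
        System.out.println(firstSegmentOnly(url, "/tourl/")); // /tourl/login?next=/home
    }
}
```

With the URL from the question both versions give the same result, since its token only contains encoded slashes (%2F).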
Can you describe more precisely what the original URLs look like? The regular expression depends on it.
If only the first occurrence of fromurl needs to be replaced, the following code is enough:
String from = "/fromurl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf";
String to = from.replaceFirst("fromurl", "tourl");
But if it is necessary to use more complex rules to determine the substring to replace, you can use:
String from = "/fromurl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf";
String to = "";
String regularExpresion = "(<<pre>>)(fromurl)(<<pos>>)";
Pattern pattern = Pattern.compile(regularExpresion);
Matcher matcher = pattern.matcher(from);
if (matcher.matches()) {
to = from.replaceAll(regularExpresion, "$1tourl$3");
}
NOTE: the pre and pos groups are placeholders, because I don't know the real form of the URL.
NOTE 2: $1 and $3 refer to the first and third groups.
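To make the group references concrete, here is a sketch in which the two placeholder groups are filled with hypothetical stand-ins: (^/) for the prefix and (/.*) for the suffix. These are assumptions for illustration only, not the real URL rules, and the class name GroupReplace is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupReplace {
    // Group 1 and group 3 are hypothetical stand-ins for the placeholder groups.
    private static final Pattern URL = Pattern.compile("(^/)(fromurl)(/.*)");

    // Keeps groups 1 and 3 and swaps group 2 for "tourl"; other input is returned unchanged.
    static String redirect(String from) {
        Matcher m = URL.matcher(from);
        return m.matches() ? m.replaceFirst("$1tourl$3") : from;
    }

    public static void main(String[] args) {
        System.out.println(redirect("/fromurl/login?token=abc")); // /tourl/login?token=abc
    }
}
```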
Although the existing answers should solve the issue and some are similar, the solution below may also help. It uses a fairly simple regex and assumes the input has the same format as your example:
private static String replaceUrl(String inputUrl){
String regex = "/.*(/login\\?token=.*)";
String toUrl = "/tourl";
Pattern p = Pattern.compile(regex);
Matcher matcher = p.matcher(inputUrl);
if (matcher.find()) {
return toUrl + matcher.group(1);
} else
return null;
}
You can write a test to check whether it works for other expected inputs and outputs if you want to change the format and adjust the regex:
String inputUrl = "/fromurl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf";
String expectedUrl = "/tourl/login?token=7c8Q8grW5f2Kz7RP1%2FWsqpVB%2FEluVOGfXQdW4I0v82siR2Ism1D8VCvEmKJr%2BKhHhicwPey0uIiTxN049Be8TNsypf";
if (expectedUrl.equals(replaceUrl(inputUrl))){
System.out.println("Success");
}

Regular expression in java that encloses some url

I have this problem:
I have to write a regular expression that handles these URLs:
http://www.amazon.it/TP-LINK-TL-WR841N-Wireless-300Mbps-Ethernet/dp/B001FWYGJS?ie=UTF8&redirect=true&ref_=s9_simh_gw_p147_d0_i2
http://www.amazon.it/gp/product/B014KMQWU0/
http://www.amazon.it/gp/product/glance/B014KMQWU0/
I need a regular expression that matches the full URL up to and including the ASIN of the product (an ASIN is a 10-character code of capital letters and digits).
I have written this regex, but it does not do what I want:
String regex="http:\\/\\/(?:www\\.|)amazon\\.com\\/(?:gp\\ product|| gp\\ product\\ glance || [^\\/]+\\/dp|dp)\\/([^\\/]{10})";
Pattern pattern=Pattern.compile(regex);
Matcher urlAmazonMatcher = pattern.matcher(url);
while (urlAmazonMatcher.find()) {
System.out.println("PROVA "+urlAmazonMatcher.group(0));
}
This is my solution. Finally it works :D
String regex="(http|www\\.)amazon\\.(com|it|uk|fr|de)\\/(?:gp\\/product|gp\\/product\\/glance|[^\\/]+\\/dp|dp)\\/([^\\/]{10})";
Pattern pattern=Pattern.compile(regex);
Matcher urlAmazonMatcher = pattern.matcher(url);
String toReturn = null;
while (urlAmazonMatcher.find()) {
toReturn=urlAmazonMatcher.group(0);
}
How about
/[^/?]{10}(/$|\?)
This matches 10 characters that are neither / nor ? following a slash if those characters are followed by a final slash or a question mark.
You can get the part that precedes or follows the ASIN using one of the various Matcher functions.
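For instance, the part of the URL up to and including the ASIN could be taken with Matcher.end. The sketch below is a slight variation of the pattern above, with a capturing group added around the 10 characters; that group, the class name and the method name are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AsinPrefix {
    // Slash, 10 characters that are neither '/' nor '?', then a final slash or a '?'.
    private static final Pattern ASIN = Pattern.compile("/([^/?]{10})(?:/$|\\?)");

    // Returns the URL up to and including the ASIN, or null if none is found.
    static String upToAsin(String url) {
        Matcher m = ASIN.matcher(url);
        return m.find() ? url.substring(0, m.end(1)) : null;
    }

    public static void main(String[] args) {
        System.out.println(upToAsin("http://www.amazon.it/gp/product/glance/B014KMQWU0/"));
    }
}
```

Note that this relies only on the length and position of the segment, so any other 10-character path segment in the same position would match too.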
Here is my work from a previous project that was to extract URLs from text:
private Pattern getUriPattern() {
if(uriPattern == null) {
// taken from http://labs.apache.org/webarch/uri/rfc/rfc3986.html
//TODO implement the full URI syntax
String genDelims = "\\:\\/\\?\\#\\[\\]\\@";
String subDelims = "\\!\\$\\&\\'\\*\\+\\,\\;\\=";
String reserved = genDelims + subDelims;
String unreserved = "\\w\\-\\.\\~"; // i.e. ALPHA / DIGIT / "-" / "." / "_" / "~"
String allowed = reserved + unreserved;
// ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
uriPattern = Pattern.compile("((?:[^\\:/\\?\\#]+:)?//[" + allowed + "&&[^\\?\\#]]*(?:\\?([" + allowed + "&&[^\\#]]*))?(?:\\#[" + allowed + "]*)?).*");
}
return uriPattern;
}
You can use the above method as follows:
Matcher uriMatcher =
getUriPattern().matcher(text);
if(uriMatcher.matches()) {
String candidateUriString = uriMatcher.group(1);
try {
new URI(candidateUriString); // check once again if you matched a URL
// your code here
} catch (Exception e) {
// error handling
}
}
This will catch the whole URL, including parameters. You can then cut it at the first occurrence of '?' (if any) and keep the first part. Of course, you can rework the regex too.
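Cutting at the first '?' does not need a regex. A minimal sketch, with a made-up class and method name:

```java
final class QueryStrip {
    // Returns the URL without its query string; URLs without '?' are returned unchanged.
    static String withoutQuery(String url) {
        int q = url.indexOf('?');
        return q < 0 ? url : url.substring(0, q);
    }

    public static void main(String[] args) {
        System.out.println(withoutQuery("http://www.amazon.it/dp/B001FWYGJS?ie=UTF8"));
    }
}
```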

How to get full sentence using regex in java

At the moment I'm parsing PDFs using PDFBox; later I will be parsing other documents (.docx/.doc). Using PDFBox, I get the whole file content as one string. Now I want to get the complete sentence wherever a user-defined word matches.
For example:
... some text here..
Raman took more than 12 year to complete his schooling and now he
is pursuing higher study.
Relational Database.
... some text here ..
If the user gives the input year, then it should return the whole sentence.
Expected Output:
Raman took more than 12 year to complete his schooling and now he
is pursuing higher study.
I'm trying the code below, but it shows nothing. Can anyone correct it?
Pattern pattern = Pattern.compile("[\\w|\\W]*+[YEAR]+[\\w]*+.");
Also, if I have to match multiple words as an OR condition, what should I change in my regex?
Please note that all words are in uppercase.
Do not try to put everything into a single regexp. The standard Java class java.text.BreakIterator can be used to find the sentence boundaries.
public static String getSentence(String input, String word) {
Matcher matcher = Pattern.compile(word, Pattern.LITERAL | Pattern.CASE_INSENSITIVE)
.matcher(input);
if(matcher.find()) {
BreakIterator br = BreakIterator.getSentenceInstance(Locale.ENGLISH);
br.setText(input);
int start = br.preceding(matcher.start());
int end = br.following(matcher.end());
return input.substring(start, end);
}
return null;
}
Usage:
public static void main(String[] args) {
String input = "... some text...\n Raman took more than 12 year to complete his schooling and now he\nis pursuing higher study. Relational Database. \n... some text...";
System.out.println(getSentence(input, "YEAR"));
}
Pattern re = Pattern.compile("[^.!?\\s][^.!?]*(?:[.!?](?!['\"]?\\s|$) [^.!?]*)*[.!?]?['\"]?(?=\\s|$)", Pattern.MULTILINE | Pattern.COMMENTS);
Matcher reMatcher = re.matcher(result);
while (reMatcher.find()) {
System.out.println(reMatcher.group());
}
A small fix to @Tagir Valeev's answer to prevent index-out-of-bounds exceptions.
private String getSentence(String input, String word) {
Matcher matcher = Pattern.compile(word , Pattern.LITERAL | Pattern.CASE_INSENSITIVE)
.matcher(input);
if(matcher.find()) {
BreakIterator br = BreakIterator.getSentenceInstance(Locale.ENGLISH);
br.setText(input);
int start = br.preceding(matcher.start());
int end = br.following(matcher.end());
if(start == BreakIterator.DONE) {
start = 0;
}
if(end == BreakIterator.DONE) {
end = input.length();
}
return input.substring(start, end);
}
return null;
}
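The question also asks about matching any of several words. One possible approach, sketched here under the assumption that every sentence containing one of the words should be returned, is to join the quoted words with | and test each sentence found by BreakIterator. The class name SentenceFinder and its method are hypothetical. The \b word boundaries are an added design choice, so that YEAR does not match inside "years".

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class SentenceFinder {
    // Returns every sentence that contains at least one of the words (case-insensitive).
    static List<String> sentencesContaining(String input, List<String> words) {
        // Quote each word so it is matched literally, then join them as alternatives.
        String alternation = words.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Pattern anyWord = Pattern.compile("\\b(?:" + alternation + ")\\b",
                Pattern.CASE_INSENSITIVE);

        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(input);

        List<String> result = new ArrayList<>();
        int start = it.first();
        // Walk over the sentence boundaries and keep the sentences that match.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = input.substring(start, end).trim();
            if (anyWord.matcher(sentence).find()) {
                result.add(sentence);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Intro text. Raman took more than 12 year to complete his schooling. "
                + "Relational Database. The end.";
        List<String> words = new ArrayList<>();
        words.add("YEAR");
        words.add("DATABASE");
        sentencesContaining(text, words).forEach(System.out::println);
    }
}
```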

Extracting Number from URL in Java via Regex

Take the URL http://www.abc.com/alpha/beta/33445566778899/gamma/delta
I need to return the number 33445566778899 (with the forward slashes removed; the number is of variable length, between 10 and 20 digits).
It seemed simple enough, but nothing I've tried works. Why?
Pattern pattern = Pattern.compile("\\/([0-9])\\d{10,20}\\/");
Matcher matcher = pattern.matcher(fullUrl);
if (matcher.find()) {
return matcher.group(1);
}
Try this one-liner:
String number = url.replaceAll(".*/(\\d{10,20})/.*", "$1");
This regex works -
"\\/(\\d{10,20})\\/"
Testing it-
String fullUrl = "http://www.abc.com/alpha/beta/33445566778899/gamma/delta";
Pattern pattern = Pattern.compile("\\/(\\d{10,20})\\/");
Matcher matcher = pattern.matcher(fullUrl);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
OUTPUT - 33445566778899
Try,
String url = "http://www.abc.com/alpha/beta/33445566778899/gamma/delta";
String digitStr = null;
for(String str : url.split("/")){
System.out.println(str);
if(str.matches("[0-9]{10,20}")){
digitStr = str;
break;
}
}
System.out.println(digitStr);
Output:
33445566778899
Instead of saying it "doesn't seem to work", you should have told us what it was returning. Testing it confirmed what I thought: your code returns 3 for this input.
This is simply because your regexp as written captures a single digit that follows a / and is itself followed by 10 to 20 more digits and then a /.
The regex you want is "/(\\d{10,20})/" (you don't need to escape the /). Below you'll find the code I tested this with.
public static void main(String[] args) {
String src = "http://www.abc.com/alpha/beta/33445566778899/gamma/delta";
Pattern pattern = Pattern.compile("/(\\d{10,20})/");
Matcher matcher = pattern.matcher(src);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
}
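The two patterns can also be compared side by side. This is only a sketch; the helper class CaptureGroupDemo and its method name are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CaptureGroupDemo {
    // Returns the text captured by group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "http://www.abc.com/alpha/beta/33445566778899/gamma/delta";
        // Original pattern: the group holds only the first digit.
        System.out.println(firstGroup("\\/([0-9])\\d{10,20}\\/", url)); // 3
        // Corrected pattern: the group holds the whole run of digits.
        System.out.println(firstGroup("/(\\d{10,20})/", url)); // 33445566778899
    }
}
```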

Regular Expressions in Java: Matching a date value surrounded by other data

I have a lot of files I am retrieving data from, and I have hit a wall with date values surrounded by other data. I am using Java, and the regular expression I am using works for the variable string_i_currently_match; however, I need it to match example_string_i_need_to_match.
String example_string_i_need_to_match = "data 10/12/2010, data, data";
String string_i_currently_match = "10/12/2010,";
Pattern pattern = Pattern.compile(
"^(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)\\d\\d(?:,)$"
);
Matcher matcher = pattern.matcher(fileString);
boolean found = false;
while (matcher.find()) {
System.out.printf("I found the text \"%s\" starting at " +
"index %d and ending at index %d.\n",
matcher.group(), matcher.start(), matcher.end());
found = true;
}
if(!found){
System.out.println("No match found.");
}
Perhaps it's because I'm exhausted, but I can't get it to match. Any help, even pointers would be greatly appreciated.
Edit: To clarify, I do not want to match data, data; I just want the index of the date itself.
The ^ sign matches the start of the string and $ matches the end. Removing those allows the pattern to match dates within the string.
Like this:
"(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)\\d\\d(?:,)"
This will match your date:
[\d]{2}/[\d]{2}/[\d]{4}
The pattern you posted has at least one error: the ^ anchor means it only matches a date at the start of the string.
String ResultString = null;
try {
Pattern regex = Pattern.compile("\\b[0-9]{2}/[0-9]{2}/[0-9]{4}\\b");
Matcher regexMatcher = regex.matcher(subjectString);
if (regexMatcher.find()) {
ResultString = regexMatcher.group();
}
} catch (PatternSyntaxException ex) {
// Syntax error in the regular expression
}
Unless I am overlooking something this should match your date.
See it working here : http://ideone.com/HETGU
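The effect of the ^ and $ anchors discussed in this thread can be checked with a small sketch. It reuses the date pattern from the question, but swaps the trailing (?:,) for a lookahead (?=,) so that the comma is required without being part of the match. That substitution, the class name and the method names are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DateFinder {
    // Date part of the pattern from the question, without anchors or trailing comma.
    private static final String DATE =
            "(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)\\d\\d";

    // Index of the first date that is followed by a comma, or -1 if there is none.
    static int indexOfDate(String text) {
        Matcher m = Pattern.compile(DATE + "(?=,)").matcher(text);
        return m.find() ? m.start() : -1;
    }

    // Anchored pattern as posted in the question: true only if the whole string is date plus comma.
    static boolean anchoredFinds(String text) {
        return Pattern.compile("^" + DATE + "(?:,)$").matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "data 10/12/2010, data, data";
        System.out.println(anchoredFinds(text)); // false
        System.out.println(indexOfDate(text));   // 5
    }
}
```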
