How to use REGEX to validate EditText? - java

I want to validate a username against these requirements:
Accept only letters or digits
At least one character
I tried this:
public boolean validateFormat(String input){
return Pattern.compile("^[A-Za-z0-9]+$").matcher(input).matches();
}
How can I do this?

Try with this regex:
^(\w|\d)+$
^ indicates the start of the string
$ indicates the end of the string
\w means any word character (a letter, a digit, or an underscore)
\d means any digit
| is the logical OR operator
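One caveat is worth checking before adopting this pattern. In Java's default mode, \w is equivalent to [A-Za-z0-9_], so it already includes digits (which makes the |\d alternative redundant) and it also accepts an underscore. The sketch below compares the pattern from this answer with the one from the question; the class and method names are illustrative only.

```java
import java.util.regex.Pattern;

public class UsernamePatterns {
    // Pattern from the answer: \w already covers digits and also allows '_'.
    static final Pattern WORD_CHARS = Pattern.compile("^(\\w|\\d)+$");
    // Pattern from the question: ASCII letters and digits only, at least one.
    static final Pattern ALPHANUMERIC = Pattern.compile("^[A-Za-z0-9]+$");

    static boolean isWordChars(String input) {
        return WORD_CHARS.matcher(input).matches();
    }

    static boolean isAlphanumeric(String input) {
        return ALPHANUMERIC.matcher(input).matches();
    }

    public static void main(String[] args) {
        // "my_name" is accepted by the \w pattern but rejected by the stricter one.
        for (String s : new String[] {"myCoolUsername12", "my_name", ""}) {
            System.out.println("\"" + s + "\" word=" + isWordChars(s)
                    + " alnum=" + isAlphanumeric(s));
        }
    }
}
```

If underscores must be rejected, the character class from the question appears to be the safer choice.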
Also, I suggest using an online regex tester such as regex101.com. It is very helpful for quickly testing regular expressions.
Hope it can help!
== UPDATE ==
In Java code:
final String regex = "^(\\w|\\d)+$";
final String string = "myCoolUsername12";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
if(matcher.matches()) {
// if you are interested only in matching the full regex
}
// Otherwise, you can iterate over the matched groups (including the full match)
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
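For validation, the difference between matches() and find() matters: matches() requires the whole input to match the pattern, while find() succeeds as soon as any substring matches. A small sketch with an unanchored pattern (the class and method names are hypothetical):

```java
import java.util.regex.Pattern;

public class MatchesVsFind {
    // Deliberately unanchored, to show what each method checks.
    static final Pattern ALNUM = Pattern.compile("[A-Za-z0-9]+");

    // True only if the entire input consists of letters and digits.
    static boolean wholeInput(String s) {
        return ALNUM.matcher(s).matches();
    }

    // True if any run of letters or digits occurs somewhere in the input.
    static boolean anySubstring(String s) {
        return ALNUM.matcher(s).find();
    }

    public static void main(String[] args) {
        String input = "user!name";
        System.out.println("matches: " + wholeInput(input));   // false
        System.out.println("find:    " + anySubstring(input)); // true
    }
}
```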

/^[A-Za-z0-9]+(?:[ _-][A-Za-z0-9]+)*$/

Related

use regex to map TY_111.22-L007-C010 from a text

I want to get TY_111.22-L007-C010, Tzo11-L010-C100 and Tff-L010-C110 from this string with a regex:
"12.5*MAX(\"TY_111.22-L007-C010\";\"Tzo11-L010-C100\";\"Tff-L010-C110\")
I tested T.*-L\d*-C\d* but it doesn't give the result I want.
My Java test code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "T.*-L\\d*-C\\d*";
final String string = "\"12.5*MAX(\\\"TY_111.22-L007-C010\\\";\\\"Tzo11-L010-C100\\\";\\\"Tff-L010-C110\\\"";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
You need to use this regex: T.*?\-L\d*?\-C\d*
final String regex = "T.*?\\-L\\d*?\\-C\\d*";
Note: the key change is the non-greedy quantifier .*? instead of .*. Escaping the hyphens as \- is harmless but not required, because a hyphen is only special inside a character class. You can also call matcher.group() instead of matcher.group(0); the two are equivalent, and since your regex has no capturing groups, the loop over groups prints nothing.
Outputs
Full match: TC_24.00-L010-C090
Full match: TC_24.00-L010-C100
Full match: TC_24.00-L010-C110
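To see why the non-greedy quantifier matters, the following sketch runs both variants against the input string from the question and collects every match. The helper findAll and the class name are made up for this illustration, and the hyphens are left unescaped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyQuantifierDemo {
    // Collects every full match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String input =
            "12.5*MAX(\"TY_111.22-L007-C010\";\"Tzo11-L010-C100\";\"Tff-L010-C110\")";

        // Greedy: .* runs to the end and backtracks to the LAST "-L...-C...",
        // so everything from the first T onward becomes one single match.
        System.out.println(findAll("T.*-L\\d*-C\\d*", input));

        // Lazy: .*? stops at the FIRST "-L...-C...", giving three matches:
        // [TY_111.22-L007-C010, Tzo11-L010-C100, Tff-L010-C110]
        System.out.println(findAll("T.*?-L\\d*-C\\d*", input));
    }
}
```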
Why use a verbose regex pattern matcher when you can handle the problem with one line of code:
String input = "12.5*MAX(\"Txxxx-L007-C010\";\"Txxxx-L010-C100\";\"Txxxx-L010-C110\")";
String[] matches = input.replaceAll("^.*?\"|\"[^\"]*$", "")
.split("\";\"");
System.out.println(Arrays.toString(matches));
This prints:
[Txxxx-L007-C010, Txxxx-L010-C100, Txxxx-L010-C110]
OK...I used three lines of code, but the first and third are just for setting up the data and printing it.

Java Regex Alphabet Followed By Number

I am getting file names as string as follows:
file_g001
file_g222
g_file_z999
I would like to return the files whose names contain "g_x", where x is any number (as a string). Note that the last file should not appear, because its g_ is followed by a letter and not by a number as in the first two.
I tried file.contains("_g[0-9]*$"), but this didn't work.
Expected results:
file_g001
file_g222
Are you using the contains method of String?
If so, it does not work with regular expressions; it only looks for the literal character sequence.
https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#contains-java.lang.CharSequence-
public boolean contains(CharSequence s)
Returns true if and only if this string contains the specified sequence of char values.
Consider using the matches method instead. Note that it must match the entire string, not just a part of it.
https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#matches(java.lang.String)
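As a rough illustration of the difference, the sketch below filters the file names from the question with String.matches. Because matches has to cover the whole string, the pattern starts with .* and needs no $ anchor. The class name, the filter method, and the exact pattern are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FileNameFilter {
    // Keeps names that end in "_g" followed by one or more digits.
    static List<String> filter(List<String> names) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (name.matches(".*_g[0-9]+")) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("file_g001", "file_g222", "g_file_z999");
        System.out.println(filter(names)); // [file_g001, file_g222]

        // contains() treats its argument as literal text, so this is false.
        System.out.println("file_g001".contains("_g[0-9]*$"));
    }
}
```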
Your regular expression is nearly fine; we would just slightly improve it to:
^.*_g[0-9]+$
or
^.*_g\d+$
and it would likely work.
The expression is explained on the top right panel of this demo if you wish to explore/simplify/modify it.
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "^.*_g[0-9]+$";
final String string = "file_g001\n"
+ "file_g222\n"
+ "file_some_other_words_g222\n"
+ "file_g\n"
+ "g_file_z999";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}

Get text in the URL with dynamic date - Regex Java

I need to extract, in Java, the path segment that follows the date in a URL.
Input 1:
/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/
Output: testcustomer
Only /raw/ stays constant; the date and testcustomer will change.
Input 2:
/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/
Output: newcustomer
String url = "/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/";
String customer = getCustomer(url);
public String getCustomer (String _url){
String source = "default";
String regex = basePath + "/raw/\\d{4}-\\d{2}-\\d{2}/usr*";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(_url);
if (m.find()) {
source = m.group(1);
} else {
logger.error("Cant get customer with regex " + regex);
}
return source;
}
It's returning 'default' :(
Your regex /raw/\\d{4}-\\d{2}-\\d{2}/usr* has no part that matches the value you want. You need a regex that finds the date and captures what comes next:
/\w*/raw/[0-9-]+/(\w+)/.* or (?<=raw\/\d{4}-\d{2}-\d{2}\/)(\w+) would work
Pattern p = Pattern.compile("/\\w*/raw/[0-9-]+/(\\w+)/.*");
Matcher m = p.matcher(str);
if (m.find()) {
String value = m.group(1);
System.out.println(value);
}
Or if it's always the 4th part, use split()
String value = str.split("/")[4];
System.out.println(value);
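Applied to the method from the question, the capturing-group idea might look like the sketch below. It keeps the "default" fallback from the original code. The class name and the exact pattern (any non-slash characters after the date) are assumptions for this example, not the asker's real code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CustomerExtractor {
    // Group 1 captures the path segment that directly follows the date.
    private static final Pattern CUSTOMER =
            Pattern.compile("/raw/\\d{4}-\\d{2}-\\d{2}/([^/]+)/");

    // Returns the customer segment, or "default" when the URL has no
    // "/raw/<date>/<segment>/" part.
    static String getCustomer(String url) {
        Matcher m = CUSTOMER.matcher(url);
        return m.find() ? m.group(1) : "default";
    }

    public static void main(String[] args) {
        System.out.println(getCustomer(
                "/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/"));
        System.out.println(getCustomer(
                "/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/"));
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call.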
Here, we can likely use raw followed by the date as a left boundary. We would then collect the desired output in a capturing group, add a slash, and consume the rest of the string, with an expression similar to:
.+raw\/[0-9]{4}-[0-9]{2}-[0-9]{2}\/(.+?)\/.+
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = ".+raw\\/[0-9]{4}-[0-9]{2}-[0-9]{2}\\/(.+?)\\/.+";
final String string = "/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/\n"
+ "/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
RegEx
If this expression wasn't desired or you wish to modify it, please visit regex101.com.

RegEx for matching mp3 URLs

How can I get an mp3 url with REGEX?
This mp3 url, for example:
https://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3
This is a what I've tried so far but I want it to only accept a url with '.mp3' on the end.
(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]
This expression would likely pass your desired inputs:
^(https?|ftp|file):\/\/(www\.)?(.*?)\.(mp3)$
If you wish to add more boundaries to it, you can do that. For instance, you can add a list of chars instead of .*.
I have added several capturing groups so that the individual parts are easy to retrieve if necessary.
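As a rough sketch of how those groups could be read in Java: group 1 is the scheme, group 2 the optional www. prefix, group 3 everything up to the extension, and group 4 the extension. The class and method names are hypothetical, and this version escapes the dot after www so that it matches only a literal dot.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Mp3UrlGroups {
    static final Pattern MP3_URL =
            Pattern.compile("^(https?|ftp|file)://(www\\.)?(.*?)\\.(mp3)$");

    // Returns group 3 (the part between the optional "www." and ".mp3"),
    // or null when the URL does not end in ".mp3".
    static String pathWithoutExtension(String url) {
        Matcher m = MP3_URL.matcher(url);
        return m.matches() ? m.group(3) : null;
    }

    public static void main(String[] args) {
        Matcher m = MP3_URL.matcher(
                "https://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3");
        if (m.matches()) {
            System.out.println("scheme:    " + m.group(1));
            System.out.println("www part:  " + m.group(2)); // null if absent
            System.out.println("rest:      " + m.group(3));
            System.out.println("extension: " + m.group(4));
        }
    }
}
```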
RegEx
If this wasn't your desired expression, you can modify/change your expressions in regex101.com.
You can also visualize your expressions in jex.im.
JavaScript Test
const regex = /^(https?|ftp|file):\/\/(www\.)?(.*?)\.(mp3)$/gm;
const str = `https://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3
http://soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3
http://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3
ftp://soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3
file://localhost/examples/mp3/SoundHelix-Song-1.mp3
file://localhost/examples/mp3/SoundHelix-Song-1.wav
file://localhost/examples/mp3/SoundHelix-Song-1.avi
file://localhost/examples/mp3/SoundHelix-Song-1.m4a`;
let m;
while ((m = regex.exec(str)) !== null) {
// This is necessary to avoid infinite loops with zero-width matches
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
// The result can be accessed through the `m`-variable.
m.forEach((match, groupIndex) => {
console.log(`Found match, group ${groupIndex}: ${match}`);
});
}
Java Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "^(https?|ftp|file):\\/\\/(www\\.)?(.*?)\\.(mp3)$";
final String string = "https://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3\n"
+ "http://soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3\n"
+ "http://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3\n"
+ "ftp://soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3\n"
+ "file://localhost/examples/mp3/SoundHelix-Song-1.mp3\n"
+ "file://localhost/examples/mp3/SoundHelix-Song-1.wav\n"
+ "file://localhost/examples/mp3/SoundHelix-Song-1.avi\n"
+ "file://localhost/examples/mp3/SoundHelix-Song-1.m4a";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
If you want it to match only inputs ending with '.mp3', add \.mp3$ at the end of your regex.
$ asserts the end of the input
(https?|ftp|file):\/\/[-a-zA-Z0-9+&@#\/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#\/%=~_|]\.mp3$
Matching:
https://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3 **=> Match**
https://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp4 **=> No Match**
You could use anchors to assert the start ^ and the end $ of the string and end the pattern with .mp3:
^https?://\S+\.mp3$
Explanation
^ Assert start of string
https?:// Match http with optional s and ://
\S+ Match 1+ times a non whitespace char
\.mp3 Match .mp3
$ Assert end of string
For example:
String regex = "^https?://\\S+\\.mp3$";
String[] strings = {
"https://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3",
"https://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp4"
};
Pattern pattern = Pattern.compile(regex);
for (String s : strings) {
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
System.out.println(matcher.group(0));
}
}
Result
https://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3

How do I escape '+' in pattern matching to highlight keyword?

I'm implementing a keyword highlighter in Java. I'm using java.util.regex.Pattern to highlight (making bold) keyword within String content. The following piece of code is working fine for alphanumeric keywords, but it is not working for some special characters. For example, in String content, I would like to highlight the keyword c++ which has the special character + (plus), but it's not getting highlighted properly. How do I escape + character so that c++ is highlighted?
public static void main(String[] args)
{
String content = "java,c++,ejb,struts,j2ee,hibernate";
System.out.println("CONTENT: " + content);
String highlight = "C++";
System.out.println("HIGHLIGHT KEYWORD: " + highlight);
//highlight = highlight.replaceAll(Pattern.quote("+"), "\\\\+");
java.util.regex.Pattern pattern = java.util.regex.Pattern.compile("\\b" + highlight + "\\b", java.util.regex.Pattern.CASE_INSENSITIVE);
System.out.println("PATTERN: " + pattern.pattern());
java.util.regex.Matcher matcher = pattern.matcher(content);
while (matcher.find()) {
System.out.println("Match found!!!");
for (int i = 0; i <= matcher.groupCount(); i++) {
System.out.println(matcher.group(i));
content = matcher.replaceAll("<B>" + matcher.group(i) + "</B>");
}
}
System.out.println("RESULT: " + content);
}
Output:
CONTENT: java,c++,ejb,struts,j2ee,hibernate
HIGHLIGHT KEYWORD: C++
PATTERN: \bC++\b
Match found!!!
c
RESULT: java,c++,ejb,struts,j2ee,hibernate
I even tried to escape '+' before calling Pattern.compile like this,
highlight = highlight.replaceAll(Pattern.quote("+"), "\\\\+");
but still I'm not able to get the syntax right. Can somebody help me solve this?
This should do what you need:
Pattern pattern = Pattern.compile(
"\\b"
+ Pattern.quote(highlight)
+ "\\b",
Pattern.CASE_INSENSITIVE);
Update: you are right, the above doesn't work for C++ (\b matches word boundaries and doesn't recognize ++ as a word). We need a more complicated solution:
Pattern pattern = Pattern.compile(
"\\b"
+ Pattern.quote(highlight)
+ "(?![^\\p{Punct}\\s])", // matches if the match is not followed by
// anything other than whitespace or punctuation
Pattern.CASE_INSENSITIVE);
Update in response to comments: it seems that you need more logic in your pattern creation. Here's a helper method to create the pattern for you:
private static final String WORD_BOUNDARY = "\\b";
// edit this to suit your needs:
private static final String ALLOWED = "[^,.!\\-\\s]";
private static final String LOOKAHEAD = "(?!" + ALLOWED + ")";
private static final String LOOKBEHIND = "(?<!" + ALLOWED + ")";
public static Pattern createHighlightPattern(final String highlight) {
final Pattern pattern = Pattern.compile(
(Character.isLetterOrDigit(highlight.charAt(0))
? WORD_BOUNDARY : LOOKBEHIND)
+ Pattern.quote(highlight)
+ (Character.isLetterOrDigit(highlight.charAt(highlight.length() - 1))
? WORD_BOUNDARY : LOOKAHEAD),
Pattern.CASE_INSENSITIVE);
return pattern;
}
And here is some test code to check that it works:
private static void testMatch(final String haystack, final String needle) {
final Matcher matcher = createHighlightPattern(needle).matcher(haystack);
if (!matcher.find())
System.out.println("Failed to find pattern " + needle);
while (matcher.find())
System.out.println("Found additional match: " + matcher.group() +
" for pattern " + needle);
}
public static void main(final String[] args) {
final String testString = "java,c++,hibernate,.net,asp.net,c#,spring";
testMatch(testString, "java");
testMatch(testString, "c++");
testMatch(testString, ".net");
testMatch(testString, "c#");
}
When I run this method, I don't see any output (which is good :-))
The problem is that the \b word boundary anchor does not match, because + is a non-word character and I assume the character that follows it is also a non-word character.
A word boundary \b matches at a change from a word character (a member of \w) to a non-word character (not a member of \w), or the other way around.
Also, if you want to match a + literally, you have to escape it. As written, C++ means "match at least one C": the second + turns the quantifier into a possessive one, which matches one or more C and does not backtrack.
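Both effects can be reproduced with the content string from the question. In this sketch, firstMatch is a hypothetical helper that returns the first match of a case-insensitive pattern, or null if there is none.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlusEscapeDemo {
    // Returns the first case-insensitive match of regex in content, or null.
    static String firstMatch(String regex, String content) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(content);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String content = "java,c++,ejb,struts,j2ee,hibernate";

        // Unescaped: "C++" is a possessive "one or more C", so only "c" matches.
        System.out.println(firstMatch("\\bC++\\b", content));

        // Quoted, but with a trailing \b: '+' followed by ',' is not a word
        // boundary, so nothing matches and null is returned.
        System.out.println(firstMatch("\\b" + Pattern.quote("C++") + "\\b", content));

        // Quoted, leading \b only: the literal "c++" is found.
        System.out.println(firstMatch("\\b" + Pattern.quote("C++"), content));
    }
}
```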
Try changing your pattern to something like this:
java.util.regex.Pattern.compile("\\b" + highlight + "(?=\\s)", java.util.regex.Pattern.CASE_INSENSITIVE);
(?=\s) is a positive lookahead that checks whether whitespace follows your highlight.
Additionally, you will need to escape the + you are searching for.
All you need is this:
Pattern.compile("\\Q"+highlight+"\\E", java.util.regex.Pattern.CASE_INSENSITIVE);
Assuming your keyword does not begin or end with punctuation, here is a commented regex which uses lookahead and lookbehind to achieve your desired matching behavior:
// Compile regex to match a keyword or keyphrase.
java.util.regex.Pattern pattern = java.util.regex.Pattern.compile(
"(?<=[\\s'\".?!,;:]|^) # Word preceded by ws, quote, punct or BOS.\n" +
// Escape any regex metacharacters in the keyword phrase.
java.util.regex.Pattern.quote(highlight) + " # Keyword to be matched.\n" +
"(?=[\\s'\".?!,;:]|$) # Word followed by ws, quote, punct or EOS.",
java.util.regex.Pattern.CASE_INSENSITIVE | java.util.regex.Pattern.UNICODE_CASE | java.util.regex.Pattern.COMMENTS);
Note that this solution works even if your keyword is a phrase containing spaces.
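Putting the pieces together, a complete highlighter could look roughly like the sketch below. It quotes the keyword with Pattern.quote, replaces \b with lookarounds that only require that no word character touches the keyword, and lets replaceAll insert the matched text through $0 so the original case is preserved. The class name, the method name, and the (?<!\w) / (?!\w) boundaries are assumptions for this example, not a definitive implementation.

```java
import java.util.regex.Pattern;

public class KeywordHighlighter {
    // Wraps every case-insensitive occurrence of keyword in <B>...</B>.
    static String highlight(String content, String keyword) {
        Pattern pattern = Pattern.compile(
                "(?<!\\w)" + Pattern.quote(keyword) + "(?!\\w)",
                Pattern.CASE_INSENSITIVE);
        // $0 is the text that actually matched, so "c++" keeps its lower case.
        return pattern.matcher(content).replaceAll("<B>$0</B>");
    }

    public static void main(String[] args) {
        String content = "java,c++,ejb,struts,j2ee,hibernate";
        // Prints: java,<B>c++</B>,ejb,struts,j2ee,hibernate
        System.out.println(highlight(content, "C++"));
    }
}
```

One limitation of this choice: any non-word character counts as a boundary, so the keyword c would also match the c in c++. A stricter set of boundary characters, as in the helper method above, avoids that.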
