Regular Expression strings in Java - java

I want to use a regular expression that extracts a substring with the following properties in Java:
Beginning of the substring begins with 'WWW'
The end of the substring is a colon ':'
I have some experience in SQL with using the Like clause such as:
Select field1 from A where field2 like '%[A-Z]'
So if I were using SQL I would code:
like '%WWW%:'
How can I start this in Java?

Pattern p = Pattern.compile("WWW.*:");
Matcher m = p.matcher("zxdfefefefWWW837eghdehgfh:djf");
while (m.find()){
System.out.println(m.group());
}

Here's a different example using substring.
public static void main(String[] args) {
String example = "http://www.google.com:80";
String substring = example.substring(example.indexOf("www"), example.lastIndexOf(":"));
System.out.println(substring);
}

If you want to match only word character and ., then you may want to use the regular expression as "WWW[\\w.]+:"
Pattern p = Pattern.compile("WWW[\\w.]+:");
Matcher m = p.matcher("WWW.google.com:hello");
System.out.println(m.find()); //prints true
System.out.println(m.group()); // prints WWW.google.com:
If you want to match any character, then you may want to use the regular expression as "WWW[\\w\\W]+:"
Pattern p = Pattern.compile("WWW[\\w\\W]+:");
Matcher m = p.matcher("WWW.googgle_$#.com:hello");
System.out.println(m.find());
System.out.println(m.group());
Explanation: WWW and : are literals. \\w matches any word character, i.e. a-z, A-Z, 0-9 and _. \\W matches any non-word character.

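If I understood it right:
String input = "aWWW:bbbWWWa:WWW:aWWWaaa:WWWa:WWWabc:WWW:";
// Note: inside [...] the characters (, ), | and any non-leading ^ are literals,
// so this class means "any character except (, W, ), |, ^ and :".
Pattern p = Pattern.compile("WWW[^(WWW)|^:]*:");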
Matcher m = p.matcher(input);
while(m.find()) {
System.out.println(m.group());
}
Output:
WWW:
WWWa:
WWW:
WWWaaa:
WWWa:
WWWabc:
WWW:
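
As a rough sketch of the same idea with a simpler character class: `[^:]*` (anything except a colon) is enough to stop each match at the first colon after WWW, and on the sample input above it yields the same seven matches. The class name `ColonMatches` and the helper `findAll` are made up for this illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ColonMatches {
    // Hypothetical helper: collects every match of regex found in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "aWWW:bbbWWWa:WWW:aWWWaaa:WWWa:WWWabc:WWW:";
        // [^:]* cannot cross a colon, so each match ends at the first ':' after WWW.
        System.out.println(findAll("WWW[^:]*:", input));
        // Greedy .* runs on to the last ':' in the input, giving one long match.
        System.out.println(findAll("WWW.*:", input));
    }
}
```

If the text between WWW and the colon may itself contain colons, neither pattern is right as written, and the stopping rule has to be defined first.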

Related

Simple java pattern matching issue?

I would like to test if a string contains insert and name, with any interceding characters. And if it does, I would like to print the match.
For the below code, only the third Pattern matches, and the entire line is printed. How can I match only insert...name?
String x = "aaa insert into name sdfdf";
Matcher matcher = Pattern.compile("insert.*name").matcher(x);
if (matcher.matches())
System.out.print(matcher.group(0));
matcher = Pattern.compile(".*insert.*name").matcher(x);
if (matcher.matches())
System.out.print(matcher.group(0));
matcher = Pattern.compile(".*insert.*name.*").matcher(x);
if (matcher.matches())
System.out.print(matcher.group(0));
Try using a capturing group, like this: .*(insert.*name).*
Matcher matcher = Pattern.compile(".*(insert.*name).*").matcher(x);
if (matcher.matches()) {
System.out.print(matcher.group(1)); // group 1 is the parenthesized part
}
Or, in your case, you can just use:
x = x.replaceAll(".*(insert.*name).*", "$1");
Both of them print:
insert into name
You just need to use find() instead of matches() in your code:
String x = "aaa insert into name sdfdf";
Matcher matcher = Pattern.compile("insert.*?name").matcher(x);
if (matcher.find())
System.out.print(matcher.group(0));
matches() expects you to match entire input string whereas find() lets you match your regex anywhere in the input.
I also suggest using .*? instead of .* in case your input may contain multiple insert ... name pairs.
This code sample will output:
insert into name
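
To make the difference concrete, here is a small sketch comparing `matches()` with `find()`, and greedy `.*` with reluctant `.*?`, on the question's input plus a second, made-up input containing two insert ... name pairs. The class `FindVsMatches` and the helper `firstMatch` are illustrative names only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVsMatches {
    // Returns the first substring that matches regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // True only if the whole input matches regex.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String x = "aaa insert into name sdfdf";
        System.out.println(wholeMatch("insert.*?name", x)); // false: "aaa " and " sdfdf" are left over
        System.out.println(firstMatch("insert.*?name", x)); // insert into name

        // Assumed second input with two pairs, to show greedy vs reluctant.
        String two = "insert a name, insert b name";
        System.out.println(firstMatch("insert.*name", two));  // spans both pairs
        System.out.println(firstMatch("insert.*?name", two)); // stops at the first "name"
    }
}
```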
Just use multiple positive lookaheads:
(?=.*insert)(?=.*name).+
See a demo on regex101.com.

How to match a String in a line having immediate special character?

I have written the condition below, but it does not match lines like from table1; or insert into table1(col1,col2 ..)
if (Arrays.asList(line.split("\"")).contains("table1") ||
Arrays.asList(line.split(" ")).contains("table1"))
System.out.println(line);
Which logic do I need to follow?
Use a regular expression and place all the special characters that you need to split on inside that expression.
if (Arrays.asList(line.split("[\",\\s.]")).contains("table1"))
Use a regex match as below
if (Arrays.asList(line.split("[\", .]")).contains("table1"))
Note that you can put whatever characters you want to split the line against in the square brackets.
You can use regex:
Pattern pat = Pattern.compile("(?<!\\p{L})table1(?!\\p{L})");
if (pat.matcher(line).find())
{
System.out.println(line);
}
If I understand your question properly, you can achieve it without using Splits:
String stringPattern = ".*table1.*";
Pattern pattern = Pattern.compile(stringPattern);
Matcher matcher = pattern.matcher(line);
if (matcher.matches())
System.out.println(line);
You can use a regexp with negative lookahead and negative lookbehind:
String input = "from table1;";
Pattern p = Pattern.compile("(?<![a-zA-Z0-9_])table1(?![a-zA-Z0-9_])");
Matcher matcher = p.matcher(input);
if (matcher.find())
System.out.println(input);
This will match any occurrence of "table1" that is not immediately preceded or followed by a letter, a digit or an underscore.
Try this:
if (Arrays.asList(line.split("[^a-zA-Z0-9_]")).contains("table1")) {
System.out.println(line);
}
Or as RealSkeptic suggests use regular expression matching:
if (line.matches(".*\\btable1\\b.*")) {
System.out.println(line);
}
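
The word-boundary approach can be sketched as a small reusable check. The pattern is compiled once and `find()` is used so no leading or trailing `.*` is needed. The class name `TableNameCheck`, the method `mentionsTable1`, and the extra sample lines are assumptions made for illustration.

```java
import java.util.regex.Pattern;

class TableNameCheck {
    // \b is a zero-width boundary; for ASCII text it sits between a word
    // character (letter, digit or underscore) and anything else.
    private static final Pattern TABLE1 = Pattern.compile("\\btable1\\b");

    // True if the line contains table1 as a whole word.
    static boolean mentionsTable1(String line) {
        return TABLE1.matcher(line).find();
    }

    public static void main(String[] args) {
        String[] lines = {
            "from table1;",
            "insert into table1(col1,col2)",
            "select * from \"table1\"",
            "from table10",   // different table name, should not match
            "from mytable1"   // different table name, should not match
        };
        for (String line : lines) {
            System.out.println(mentionsTable1(line) + "  " + line);
        }
    }
}
```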

Make regex for url in java

Given a string of this form:
https://www.abcd.efg/try-till-you-succedd.html
I want a regex that gives me everything after the second-to-last '-', that is, you-succedd.html in this case.
public static void main(String[] args)
{
Pattern p = Pattern.compile(".*-\\s*(.*)");
Matcher m = p.matcher("https://www.abcd.efg/try-till-you-succedd.html");
if (m.find())
System.out.println(m.group(1));
}
But it gives succedd.html only. Please help.
Here is a regex you can use
Pattern p = Pattern.compile("-([^-]*-[^-]*$)");
Matcher m = p.matcher("https://www.abcd.efg/try-till-you-succedd.html");
if (m.find())
System.out.println(m.group(1));
See IDEONE demo
Output: you-succedd.html
Regex means...:
- - a literal hyphen
([^-]*-[^-]*$) - a capturing group that will hold the value we need that matches...
[^-]* - 0 or more characters other than a hyphen
- - a hyphen
[^-]*$ - 0 or more characters other than a hyphen until the end of string ($).
Note that you can add \.html before $ if you want to restrict the matches to strings that end with .html.
UPDATE
To obtain only you-succedd, you can use
String pattern = "-([^-]*-[^-]*)\\.[^.\\s-]+$";
Or
String pattern = "-([^-]*-[^-]*)\\.\\w+$";
See a regex demo 1 and demo 2
You can simply use:
.*-(.*-.*\.html)$
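
For comparison, here is a minimal sketch that extracts the text after the second-to-last hyphen in two ways: with the `-([^-]*-[^-]*)$` regex from the answer above, and with plain `lastIndexOf` calls. The class and method names (`SecondLastHyphen`, `regexTail`, `indexTail`) are hypothetical, and returning null when there are fewer than two hyphens is an assumption about the desired behavior.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SecondLastHyphen {
    private static final Pattern TAIL = Pattern.compile("-([^-]*-[^-]*)$");

    // Regex version: group 1 holds everything after the second-to-last hyphen.
    static String regexTail(String s) {
        Matcher m = TAIL.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    // Index version: find the last hyphen, then the one before it.
    static String indexTail(String s) {
        int last = s.lastIndexOf('-');
        if (last < 0) {
            return null;
        }
        int prev = s.lastIndexOf('-', last - 1);
        return prev < 0 ? null : s.substring(prev + 1);
    }

    public static void main(String[] args) {
        String url = "https://www.abcd.efg/try-till-you-succedd.html";
        System.out.println(regexTail(url)); // you-succedd.html
        System.out.println(indexTail(url)); // you-succedd.html
    }
}
```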

Splitting and Parsing formula String

I have below formula
(Trig01:BAO)/(((Trig01:COUNT*86400)-Trig01:UPI-Trig01:SOS)*2000)
I want to split it and get the string values which are before colon only.
The final output needed is:
{ "BAO","COUNT","UPI","SOS" }
Thanks in advance,
You can try with Positive Lookbehind in below regex pattern to get all the alphanumeric character after colon
(?<=:)[^\W]+
Pattern explanation:
(?<= look behind to see if there is:
: ':'
) end of look-behind
[^\W]+ any character except: non-word characters
(all but a-z, A-Z, 0-9, _) (1 or more times)
Sample code:
String str="(Trig01:BAO)/(((Trig01:COUNT*86400)-Trig01:UPI-Trig01:SOS)*2000)";
Pattern p=Pattern.compile("(?<=:)[^\\W]+");
Matcher m=p.matcher(str);
while(m.find()){
System.out.println(m.group());
}
Use Regex, try this:
public static List<String> extractSubstringsFromAllMatches(String sourceString, String pattern) {
Pattern regexPattern = Pattern.compile(pattern);
Matcher matcher = regexPattern.matcher(sourceString);
List<String> matches = new ArrayList<String>();
while (matcher.find()) {
matches.add(matcher.group(1));
}
return matches;
}
Get the results you require by calling:
extractSubstringsFromAllMatches(YourString,":(\\w*)\\W")
Try this one-line solution:
String[] arr = str.replaceAll("^.*?(?=\\w+:)|:[^:]*$", "").split(":.*?(?=\\w+(:|$))");
This works by first stripping off the leading and trailing non-target chars, then splitting on the intervening chars. Matching is done using lookaheads, which assert, but don't capture, that a word followed by a colon follows.
Here's some test code:
String str = "(Trig01:BAO)/(((Trig02:COUNT*86400)-Trig03:UPI-Trig04:SOS)*2000)";
String[] arr = str.replaceAll("^.*?(?=\\w+:)|:[^:]*$", "").split(":.*?(?=\\w+(:|$))");
System.out.println(Arrays.toString(arr));
Output:
[Trig01, Trig02, Trig03, Trig04]
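
Since the answers differ on whether the names before or after the colon are wanted, a single pattern with two capturing groups can return either side. This is only a sketch: the class `ColonPairs` and the method `names` are invented for illustration, and it assumes every name consists of word characters only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ColonPairs {
    // Group 1 is the name before the colon, group 2 the name after it.
    private static final Pattern PAIR = Pattern.compile("(\\w+):(\\w+)");

    // Collects the requested group (1 or 2) from every name:name pair in the formula.
    static List<String> names(String formula, int group) {
        List<String> out = new ArrayList<>();
        Matcher m = PAIR.matcher(formula);
        while (m.find()) {
            out.add(m.group(group));
        }
        return out;
    }

    public static void main(String[] args) {
        String str = "(Trig01:BAO)/(((Trig02:COUNT*86400)-Trig03:UPI-Trig04:SOS)*2000)";
        System.out.println(names(str, 1)); // [Trig01, Trig02, Trig03, Trig04]
        System.out.println(names(str, 2)); // [BAO, COUNT, UPI, SOS]
    }
}
```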

java Pattern Matching issue

I have an issue to write proper regex to match URL.
String input = "AAAhttp://www.gmail.comBBBBabc#gmail.com";
String regex = "www.*.com"; // To match www.gmail.com URL
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(input);
while(m.find()){
}
Here I want to remove the URL www.gmail.com. However, the match runs to the end of the string, so it also covers the email address, which ends with gmail.com.
Can someone help me to get proper regex to match only the URL?
.* does a greedy match. You have to add ? after * to make it a reluctant match.
"www\\..*?\\.com"
Your code would be,
String s = "AAAhttp://www.gmail.comBBBBabc#gmail.com";
Pattern p = Pattern.compile("www\\..*?\\.com");
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group(0));
}
String regex = "www\\..*?\\.com";
Use non-greedy repetition of the wildcard '.', and escape the dot where it is meant literally.
A negated character class is faster than .*?
Use this regex:
www\.[^.]+\.com
[^.]+ means one or more characters that are not a dot.
In Java we need to escape some characters:
// for instance
Pattern regex = Pattern.compile("www\\.[^.]+\\.com");
// etc
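
Putting the three patterns side by side on the question's input shows the difference, and the last step removes the URL as the question asked. This is a sketch only: `UrlStrip`, `firstMatch` and `stripUrl` are made-up names, and the negated-class pattern assumes a host of the form www.NAME.com with no further dots in NAME.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlStrip {
    // Returns the first substring that matches regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Removes every www.NAME.com occurrence, where NAME contains no dot.
    static String stripUrl(String input) {
        return input.replaceAll("www\\.[^.]+\\.com", "");
    }

    public static void main(String[] args) {
        String input = "AAAhttp://www.gmail.comBBBBabc#gmail.com";
        System.out.println(firstMatch("www.*.com", input));          // greedy: runs to the last "com"
        System.out.println(firstMatch("www\\..*?\\.com", input));    // www.gmail.com
        System.out.println(firstMatch("www\\.[^.]+\\.com", input));  // www.gmail.com
        System.out.println(stripUrl(input));                         // AAAhttp://BBBBabc#gmail.com
    }
}
```

Note that `[^.]+` would not match a host with extra labels such as www.mail.example.com, while the reluctant version would.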
