Make regex for url in java

Given a string like:
https://www.abcd.efg/try-till-you-succedd.html
I want a regex that returns everything after the second-to-last '-', which is you-succedd.html in this case.
public static void main(String[] args)
{
Pattern p = Pattern.compile(".*-\\s*(.*)");
Matcher m = p.matcher("https://www.abcd.efg/try-till-you-succedd.html");
if (m.find())
System.out.println(m.group(1));
}
But it prints only succedd.html. Please help.

Here is a regex you can use
Pattern p = Pattern.compile("-([^-]*-[^-]*$)");
Matcher m = p.matcher("https://www.abcd.efg/try-till-you-succedd.html");
if (m.find())
System.out.println(m.group(1));
Output: you-succedd.html
Regex means...:
- - a literal hyphen
([^-]*-[^-]*$) - a capturing group that will hold the value we need that matches...
[^-]* - 0 or more characters other than a hyphen
- - a hyphen
[^-]*$ - 0 or more characters other than a hyphen, up to the end of the string ($).
Note that you can add \.html before $ if you want to restrict the matches to strings that end with .html.
UPDATE
To obtain only you-succedd, you can use
String pattern = "-([^-]*-[^-]*)\\.[^.\\s-]+$";
Or
String pattern = "-([^-]*-[^-]*)\\.\\w+$";
See a regex demo 1 and demo 2
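To tie the two variants of this answer together, here is a minimal self-contained sketch. The class name LastTwoSegments and the helper names withExtension and withoutExtension are made up for illustration; the patterns themselves are the ones given above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class LastTwoSegments {
    // A hyphen, then "segment-segment" up to the end of the input (group 1).
    static final Pattern WITH_EXT = Pattern.compile("-([^-]*-[^-]*$)");
    // Same, but the trailing ".ext" is matched outside the capturing group.
    static final Pattern WITHOUT_EXT = Pattern.compile("-([^-]*-[^-]*)\\.\\w+$");

    static String withExtension(String url) {
        Matcher m = WITH_EXT.matcher(url);
        return m.find() ? m.group(1) : null; // null when the input has fewer than two hyphens
    }

    static String withoutExtension(String url) {
        Matcher m = WITHOUT_EXT.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "https://www.abcd.efg/try-till-you-succedd.html";
        System.out.println(withExtension(url));    // you-succedd.html
        System.out.println(withoutExtension(url)); // you-succedd
    }
}
```

Both helpers return null instead of throwing when the input does not contain two hyphens, which is one possible choice; returning the whole input would be another.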

You can simply use this (note the escaped dot before html):
.*-(.*-.*\.html)$

Related

Parse string using Java Regex Pattern?

I have the below java string in the below format.
String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] [name:KTY][distance:3540] Price:";
Using the Matcher and Pattern classes from the java.util.regex package, I have to get the output string in the following format:
Output: [NYK:1100][CLT:2300][KTY:3540]
Can you suggest a RegEx pattern which can help me get the above output format?

You can use this regex \[name:([A-Z]+)\]\[distance:(\d+)\] with Pattern like this :
String regex = "\\[name:([A-Z]+)\\]\\[distance:(\\d+)\\]";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(s);
StringBuilder result = new StringBuilder();
while (matcher.find()) {
result.append("[");
result.append(matcher.group(1));
result.append(":");
result.append(matcher.group(2));
result.append("]");
}
System.out.println(result.toString());
Output
[NYK:1100][CLT:2300][KTY:3540]
The pattern \[name:([A-Z]+)\]\[distance:(\d+)\] captures two groups: group 1 holds the uppercase letters after [name: and group 2 holds the digits after [distance:.
Another solution, suggested by @tradeJmark, uses named groups:
String regex = "\\[name:(?<name>[A-Z]+)\\]\\[distance:(?<distance>\\d+)\\]";
That way you can read each group by its name instead of its index:
while (matcher.find()) {
result.append("[");
result.append(matcher.group("name"));
//----------------------------^^
result.append(":");
result.append(matcher.group("distance"));
//------------------------------^^
result.append("]");
}
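For completeness, the named-group variant can be wrapped into a runnable class along these lines. This is only a sketch: the class name CityDistances and the method compact are invented here, while the regex and the input string come from the question and answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the named-group pattern from the answer.
class CityDistances {
    static final Pattern ENTRY =
        Pattern.compile("\\[name:(?<name>[A-Z]+)\\]\\[distance:(?<distance>\\d+)\\]");

    // Rewrites every [name:X][distance:N] pair as [X:N] and concatenates the results.
    static String compact(String s) {
        Matcher matcher = ENTRY.matcher(s);
        StringBuilder result = new StringBuilder();
        while (matcher.find()) {
            result.append('[')
                  .append(matcher.group("name"))
                  .append(':')
                  .append(matcher.group("distance"))
                  .append(']');
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] [name:KTY][distance:3540] Price:";
        System.out.println(compact(s)); // [NYK:1100][CLT:2300][KTY:3540]
    }
}
```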

If the format of the string is fixed, and you always have just 3 [...] groups inside to deal with, you may define a block that matches [name:...] and captures the 2 parts into separate groups and use a quite simple code with .replaceAll:
String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] [name:KTY][distance:3540] Price:";
String matchingBlock = "\\s*\\[name:([A-Z]+)]\\[distance:(\\d+)]";
String res = s.replaceAll(String.format(".*%1$s%1$s%1$s.*", matchingBlock),
"[$1:$2][$3:$4][$5:$6]");
System.out.println(res); // [NYK:1100][CLT:2300][KTY:3540]
See the Java demo and a regex demo.
The block pattern matches:
\\s* - 0+ whitespaces
\\[name: - a literal [name: substring
([A-Z]+) - Group n capturing 1 or more uppercase ASCII chars (\\w+ can also be used)
]\\[distance: - a literal ][distance: substring
(\\d+) - Group m capturing 1 or more digits
] - a ] symbol.
In the .*%1$s%1$s%1$s.* pattern, the groups will have 1 to 6 IDs (referred to with $1 - $6 backreferences from the replacement pattern) and the leading and final .* will remove start and end of the string (add (?s) at the start of the pattern if the string can contain line breaks).

Regex including date string, email, number

I have this regex expression:
String patt = "(\\w+?)(:|<|>)(\\w+?),";
Pattern pattern = Pattern.compile(patt);
Matcher matcher = pattern.matcher(search + ",");
I am able to match a string like
search = "firstName:Giorgio"
But I'm not able to match string like
search = "email:giorgio.rossi#libero.it"
or
search = "dataregistrazione:27/10/2016"
How should I modify the regex in order to match these strings?

You may use
String pat = "(\\w+)[:<>]([^,]+)"; // Add a , at the end if it is necessary
Details:
(\w+) - Group 1 capturing 1 or more word chars
[:<>] - one of the chars inside the character class, :, <, or >
([^,]+) - Group 2 capturing 1 or more chars other than , (in the demo, I added \n as the demo input text contains newlines).
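A possible way to apply this pattern is sketched below. Two things are assumptions rather than part of the answer: the operator is wrapped in its own capturing group so it can be read back, and the class name SearchCriteria and method parse are invented for the example. Because [^,]+ stops at a comma by itself, no trailing comma has to be appended to the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for "key<op>value" criteria separated by commas.
class SearchCriteria {
    // Group 1: key, group 2: operator (extra group added for this sketch), group 3: value.
    static final Pattern CRITERION = Pattern.compile("(\\w+)([:<>])([^,]+)");

    static List<String> parse(String search) {
        List<String> criteria = new ArrayList<>();
        Matcher m = CRITERION.matcher(search);
        while (m.find()) {
            criteria.add(m.group(1) + " " + m.group(2) + " " + m.group(3));
        }
        return criteria;
    }

    public static void main(String[] args) {
        System.out.println(parse("email:giorgio.rossi#libero.it"));
        // [email : giorgio.rossi#libero.it]
        System.out.println(parse("firstName:Giorgio,dataregistrazione:27/10/2016"));
        // [firstName : Giorgio, dataregistrazione : 27/10/2016]
    }
}
```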

You can use regex like this:
public static void main(String[] args) {
String[] arr = new String[]{"firstName:Giorgio", "email:giorgio.rossi#libero.it", "dataregistrazione:27/10/2016"};
String pattern = "(\\w+[:<>]\\w+)|(\\w+:\\w+\\.\\w+#\\w+\\.\\w+)|(\\w+:\\d{1,2}/\\d{1,2}/\\d{4})"; // inside [...] a | is a literal character, so [:<>] is enough
for(String str : arr){
if(str.matches(pattern))
System.out.println(str);
}
}
output is:
firstName:Giorgio
email:giorgio.rossi#libero.it
dataregistrazione:27/10/2016
But keep in mind that this regex only works for your specific data format. To build a general-purpose regex, consult the RFC documents and articles on the email address format.
Hope it helps.

The character class \w matches only [A-Za-z0-9_], so it cannot match characters such as ., # or /. Change the regex to (\\w+?)(:|<|>)(.*), to match any characters between the operator and the trailing ,.
Or list all the characters you expect in the value, i.e. (\\w+?)(:|<|>)([#.\\w/]*),

Java Regex to extract substring with optional trailing slash

Regex:
\/test\/(.*|\/?)
Input
/something/test/{abc}/listed
/something/test/{abc}
Expected
{abc} for both the inputs

You need to capture all characters other than / after /test/:
String s = "/something/test/{abc}/listed";
Pattern pattern = Pattern.compile("/test/([^/]+)"); // or "/test/\\{([^/}]+)"
Matcher matcher = pattern.matcher(s);
if (matcher.find()){
System.out.println(matcher.group(1));
}
Details:
/test/ - matches /test/
([^/]+) - matches and captures into Group 1 one or more (+) (but as many as possible, since + is greedy) characters other than / (due to the negated character class [^/]).
Note that in Java regex patterns you do not need to escape / since it is not a special character and one needs no regex delimiters.
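As a quick check that the same pattern handles both inputs from the question, here is a small sketch. The class name TestSegment and the method segmentAfterTest are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that extracts the path segment following /test/.
class TestSegment {
    static final Pattern AFTER_TEST = Pattern.compile("/test/([^/]+)");

    static String segmentAfterTest(String path) {
        Matcher m = AFTER_TEST.matcher(path);
        return m.find() ? m.group(1) : null; // null when /test/ is absent or nothing follows it
    }

    public static void main(String[] args) {
        System.out.println(segmentAfterTest("/something/test/{abc}/listed")); // {abc}
        System.out.println(segmentAfterTest("/something/test/{abc}"));        // {abc}
    }
}
```

Because [^/]+ stops at the next slash or at the end of the input, the trailing part is simply not consumed and no optional group is needed.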

This should work for you :
public static void main(String[] args) {
String s1 = "/something/test/{abc}/listed";
String s2 = "/something/test/{abc}";
System.out.println(s1.replaceAll("[^{]+(\\{\\w+\\}).*", "$1"));
System.out.println(s2.replaceAll("[^{]+(\\{\\w+\\}).*", "$1"));
}
O/P :
{abc}
{abc}

Regex (as Java string, that is with doubled backslashes):
".*\\/test\\/([^/]*).*"

What is wrong in regexp in Java

I want to get the word text2, but it returns null. Could you please correct it ?
String str = "Text SETVAR((&&text1 '&&text2'))";
Pattern patter1 = Pattern.compile("SETVAR\\w+&&(\\w+)'\\)\\)");
Matcher matcher = patter1.matcher(str);
String result = null;
if (matcher.find()) {
result = matcher.group(1);
}
System.out.println(result);

One way to do it is to spell out everything that appears inside the parentheses:
String str = "Text SETVAR((&&text1 '&&text2'))";
Pattern patter1 = Pattern.compile("SETVAR[(]{2}&&\\w+\\s*'&&(\\w+)'[)]{2}");
Matcher matcher = patter1.matcher(str);
String result = "";
if (matcher.find()) {
result = matcher.group(1);
}
System.out.println(result);
You can also use [^()]* inside the parentheses to just get to the value inside single apostrophes:
Pattern patter1 = Pattern.compile("SETVAR[(]{2}[^()]*'&&(\\w+)'[)]{2}"); // the changed part is [^()]*
Let me break down the regex for you:
SETVAR - match SETVAR literally, then...
[(]{2} - match 2 ( literally, then...
[^()]* - match 0 or more characters other than ( or ) up to...
'&& - match a single apostrophe and two & symbols, then...
(\\w+) - match and capture into Group 1 one or more word characters
'[)]{2} - match a single apostrophe and then 2 ) symbols literally.

Your regex doesn't match your string because it never matches the opening parentheses. Also, \\w+ only matches word characters, so it cannot match the space or the & characters.
Instead, you can use a negated character class, [^']+, which matches one or more characters other than a single quote:
String str = "Text SETVAR((&&text1 '&&text2'))";
Pattern pattern = Pattern.compile("SETVAR\\(\\([^']+'&&(\\w+)'\\)\\)");

Regular Expression strings in Java

I want to use a regular expression that extracts a substring with the following properties in Java:
Beginning of the substring begins with 'WWW'
The end of the substring is a colon ':'
I have some experience in SQL with using the Like clause such as:
Select field1 from A where field2 like '%[A-Z]'
So if I were using SQL I would code:
like '%WWW%:'
How can I start this in Java?

Pattern p = Pattern.compile("WWW.*:");
Matcher m = p.matcher("zxdfefefefWWW837eghdehgfh:djf");
while (m.find()){
System.out.println(m.group());
}
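One caveat with the snippet above: .* is greedy, so if the input contains more than one colon, the match runs to the last one. Making the quantifier reluctant (.*?) stops at the first colon instead. The sketch below illustrates the difference; the class name, the helper firstMatch and the input string are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo comparing greedy and reluctant quantifiers.
class GreedyVsReluctant {
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "xxWWWabc:def:ghi"; // invented input with two colons
        System.out.println(firstMatch("WWW.*:", input));  // WWWabc:def:
        System.out.println(firstMatch("WWW.*?:", input)); // WWWabc:
    }
}
```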

Here's a different example using substring.
public static void main(String[] args) {
String example = "http://www.google.com:80";
String substring = example.substring(example.indexOf("www"), example.lastIndexOf(":"));
System.out.println(substring);
}

If you want to match only word character and ., then you may want to use the regular expression as "WWW[\\w.]+:"
Pattern p = Pattern.compile("WWW[\\w.]+:");
Matcher m = p.matcher("WWW.google.com:hello");
System.out.println(m.find()); //prints true
System.out.println(m.group()); // prints WWW.google.com:
If you want to match any character, then you may want to use the regular expression as "WWW[\\w\\W]+:"
Pattern p = Pattern.compile("WWW[\\w\\W]+:");
Matcher m = p.matcher("WWW.googgle_$#.com:hello");
System.out.println(m.find());
System.out.println(m.group());
Explanation: WWW and : are literals. \\w - any word character, i.e. a-z, A-Z, 0-9 and _. \\W - any non-word character.

If I understood it right
String input = "aWWW:bbbWWWa:WWW:aWWWaaa:WWWa:WWWabc:WWW:";
Pattern p = Pattern.compile("WWW[^(WWW)|^:]*:");
Matcher m = p.matcher(input);
while(m.find()) {
System.out.println(m.group());
}
Output:
WWW:
WWWa:
WWW:
WWWaaa:
WWWa:
WWWabc:
WWW:
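A note on the pattern above: [^(WWW)|^:] is a single character class, so it excludes the individual characters (, W, ), |, ^ and : rather than the sequence WWW. It produces the output shown because none of the text between WWW and : contains a W. If the intent is "no WWW and no colon in between", one possible alternative is a negative lookahead, sketched below. The class name ShortestWwwRuns, the method matches and the second test input are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of a lookahead-based alternative.
class ShortestWwwRuns {
    // WWW, then any non-colon characters as long as no new WWW starts there, then a colon.
    static final Pattern RUN = Pattern.compile("WWW(?:(?!WWW)[^:])*:");

    static List<String> matches(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = RUN.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(matches("aWWW:bbbWWWa:WWW:aWWWaaa:WWWa:WWWabc:WWW:"));
        // [WWW:, WWWa:, WWW:, WWWaaa:, WWWa:, WWWabc:, WWW:]
        System.out.println(matches("WWWaWb:")); // a single W in the payload is still allowed
        // [WWWaWb:]
    }
}
```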

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.
