Regex to split the first from a "/token1/token2/token3"

I'm pretty rusty with regex, but I have the requirement to extract the first token of the following string:
Input: /token1/token2/token3
Required output: /token1
I have tried:
List<String> connectorPath = Splitter.on("^[/\\w+]+")
.trimResults()
.splitToList(actionPath);
Doesn't work for me, any ideas?

Instead of split, you can match
^/\\w+
Or if the string has 3 parts, use a capture group for the first part.
^(/\\w+)/\\w+/\\w+$
Java example
Pattern pattern = Pattern.compile("^/\\w+");
Matcher matcher = pattern.matcher("/token1/token2/token3");
if (matcher.find()) {
System.out.println(matcher.group(0));
}
Output
/token1
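As a rough sketch of the matching approach above, the pattern can be wrapped in a small helper so that the "no match" case is explicit. The class and method names (FirstToken, firstToken) are made up for illustration and do not come from the question.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
final class FirstToken {
    // A leading slash followed by one or more word characters.
    private static final Pattern FIRST = Pattern.compile("^/\\w+");

    // Returns the first path token including its slash,
    // or an empty Optional if the input does not start with one.
    static Optional<String> firstToken(String path) {
        Matcher m = FIRST.matcher(path);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(firstToken("/token1/token2/token3").orElse("NO MATCH"));
    }
}
```

Note that \w only covers letters, digits and the underscore, so a token such as /my-token would be cut at the hyphen; if tokens can contain other characters, [^/]+ may be a better fit than \w+.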

You can split on the / that is not at the string start using the (?!^)/ regex:
String[] res = "/token1/token2/token3".split("(?!^)/");
System.out.println(res[0]); // => /token1
(?!^) - a negative lookahead that matches a location not at the start of string
/ - a / char.
Using Guava:
Splitter splitter = Splitter.onPattern("(?!^)/").trimResults();
Iterable<String> iterable = splitter.split(actionPath);
String first = Iterables.getFirst(iterable, "");
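To make the effect of the (?!^)/ split easier to see, here is a small sketch that prints every piece it produces, next to a regex-free alternative based on indexOf. The class and method names (SplitDemo, firstSegment) are assumptions for illustration only.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
final class SplitDemo {
    // Regex-free variant: cut at the second slash, if there is one.
    static String firstSegment(String path) {
        int next = path.indexOf('/', 1);
        return next < 0 ? path : path.substring(0, next);
    }

    public static void main(String[] args) {
        // The leading slash is not a split point, so it stays on the first piece.
        String[] parts = "/token1/token2/token3".split("(?!^)/");
        System.out.println(Arrays.toString(parts)); // [/token1, token2, token3]

        System.out.println(firstSegment("/token1/token2/token3")); // /token1
    }
}
```

Only the first piece keeps its slash, because every other / is consumed as a delimiter.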

You are over-complicating it.
Try the following regular expression: ^(\/\w+)(.+)$
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class PathSplitter {
public static void main(String args[]) {
String input = "/token1/token2/token3";
Pattern pattern = Pattern.compile("^(\\/\\w+)(.+)$");
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
System.out.println(matcher.group(1)); // /token1
System.out.println(matcher.group(2)); // /token2/token3
} else {
System.out.println("NO MATCH");
}
}
}

Related

How to extract a substring from a sentence until a delimiter

Hi, I have a string like this:
location/city/home-a-berlin?/someNewAdress
I want to extract the word berlin, which is placed between "-a-" and "?". How can I do that with regex in Java?
I can do it using the String API, but I am kind of stuck with regex.
String cityName = url.substring(url.lastIndexOf("-a-") + 3, url.indexOf('?')); // berlin

You can use a capture group with a negated character class.
-a-([^\?]+)\?
In Java:
String regex = "-a-([^\\?]+)\\?";
String string = "location/city/home-a-berlin?/someNewAdress\n";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
Output
berlin
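The same capture can be given a name, which some readers find easier to follow than a group number. This is only a sketch: the group name city and the class CityExtractor are assumptions added for illustration, and named groups need Java 7 or later.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class, method and group names are illustrative only.
final class CityExtractor {
    // Everything between "-a-" and the next '?', captured as the group "city".
    private static final Pattern CITY = Pattern.compile("-a-(?<city>[^?]+)\\?");

    static Optional<String> city(String url) {
        Matcher m = CITY.matcher(url);
        return m.find() ? Optional.of(m.group("city")) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(city("location/city/home-a-berlin?/someNewAdress").orElse("NO MATCH"));
    }
}
```

Inside a character class the ? needs no escaping, so [^?] and [^\?] behave the same.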

Or
s = s.replaceAll(".*-(.*?)\\?.*", "$1");

Alternative regex:
"-a-(.+?)\\?"
The regex in a small test harness:
public static void main(String[] args) {
String input1 = "location/city/home-a-berlin?/someNewAdress";
List<String> inputs = Arrays.asList(input1);
Pattern pattern = Pattern.compile("-a-(.+?)\\?");
List<String> results = inputs.stream().map(s -> pattern.matcher(s))
.filter(Matcher::find).map(m -> m.group(1)).collect(Collectors.toList());
//Output:
results.forEach(System.out::println);
}
Output:
berlin
Summary of regular-expression constructs:
https://docs.oracle.com/javase/10/docs/api/java/util/regex/Pattern.html

Regular expression in java (java String)

From this input -> contractor:"Hi, this is \"Paul\", how are you?" client:"Hi ...." <-
I want to get just -> Hi, this is \"Paul\", how are you? <-
I need a regular expression in Java to do that. I have tried, but I am struggling with the inner quotation (\"), which is driving me mad.
Thanks for any hint.

Java supports lookbehinds, so vanilla regex:
"(.*?(?<!\\))"
Inside a Java string (see https://stackoverflow.com/a/37329801/1225328):
\"(.*?(?<!\\\\))\"
The actual text will be contained inside the first group of each match.
Demo: https://regex101.com/r/8OXujX/2
For example, in Java:
String regex = "\"(.*?(?<!\\\\))\"";
String input = "contractor:\"Hi, this is \\\"Paul\\\", how are you?\" client:\"Hi ....\"";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
if (matcher.find()) { // or while (matcher.find()) to iterate through all the matches
System.out.println(matcher.group(1));
} else {
System.out.println("No matches");
}
Prints:
Hi, this is \"Paul\", how are you?

The regexp should be like this: "(?:\\.|[^"\\])*"
It uses a non-capturing group (?:...) that matches either a backslash followed by any character (\\.), i.e. an escape sequence such as \", or a single character that is neither a double quote nor a backslash ([^"\\]).
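A possible way to use this pattern from Java is sketched below. The extra capture group around the repeated part, and the names QuotedStrings and quoted, are additions for illustration and are not part of the answer itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
final class QuotedStrings {
    // Regex: "((?:\\.|[^"\\])*)" written as a Java string literal.
    // Group 1 is the text between the quotes, escape sequences included.
    private static final Pattern QUOTED =
            Pattern.compile("\"((?:\\\\.|[^\"\\\\])*)\"");

    // Returns the contents of every double-quoted section, in order.
    static List<String> quoted(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "contractor:\"Hi, this is \\\"Paul\\\", how are you?\" client:\"Hi ....\"";
        quoted(input).forEach(System.out::println);
    }
}
```

Because an escaped quote is consumed as a unit by \\., it can never be mistaken for the closing quote, and no lookbehind is needed.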

var text1 = "contractor:\"Hi, this is \\\"Paul\\\", how are you?\" client:\"Hi ....\" <-";
var regExWithQuotation = "contractor:(.+\".+\".+) client:";
Pattern p = Pattern.compile(regExWithQuotation);
var m = p.matcher(text1);
if (m.find()) {
var res = m.group(1);
System.out.println(res);
}
var regExWithoutQuotation = "contractor:\"(.+\".+\".+)?\" client:";
p = Pattern.compile(regExWithoutQuotation);
m = p.matcher(text1);
if (m.find()) {
var res = m.group(1);
System.out.println(res);
}
Output is:
"Hi, this is \"Paul\", how are you?"
Hi, this is \"Paul\", how are you?

You can use the regex, (?<=contractor:\").*(?=\" client:)
Description of the regex:
(?<=contractor:\") specifies positive lookbehind for contractor:\"
.* specifies any sequence of characters (as many as possible)
(?=\" client:) specifies positive lookahead for \" client:
In short, anything preceded by contractor:\" and followed by \" client:
Demo:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static void main(String[] args) {
String str = "contractor:\"Hi, this is \\\"Paul\\\", how are you?\" client:\"Hi ....\"";
String regex = "(?<=contractor:\").*(?=\" client:)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
}
Output:
Hi, this is \"Paul\", how are you?

How to preserve delimiters while using String.split() in Java?

String TextValue = "hello{MyVar} Discover {MyVar2} {MyVar3}";
String[] splitString = TextValue.split("\\{*\\}");
The output I get in splitString is [{MyVar, {MyVar2, {MyVar3].
But I need to preserve the {} delimiters, i.e. get [{MyVar}, {MyVar2}, {MyVar3}].
I need a way to produce the output above.

Use something like so:
Pattern p = Pattern.compile("(\\{\\w+\\})");
String str = ...
Matcher m = p.matcher(str);
while(m.find())
System.out.println(m.group(1));
Note, the code above is untested but that will look for words within curly brackets and place them in a group. It will then go over the string and output any string which matches the expression above.

Thanks kelvin & npinti.
import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class CreateMatcherExample {
public static void main(String[] args) {
String TextValue = "hello{MyVar} Discover {My_Var2} {My_Var3}";
String patternString = "\\{\\w+\\}";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(TextValue);
while(matcher.find()) {
System.out.println(matcher.group());
}
}
}
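If the matches are needed as a collection rather than printed, they can be gathered into a list. The sketch below also shows a zero-width split that keeps the braces, which is closer to what the question title asks for. All class and method names here are assumptions for illustration, and the split behaviour described assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; class and method names are illustrative only.
final class PlaceholderFinder {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\w+\\}");

    // Collects every {name} occurrence, braces included.
    static List<String> placeholders(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = PLACEHOLDER.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // Zero-width split: cut before each '{' and after each '}', consuming nothing.
    static String[] splitKeepingBraces(String text) {
        return text.split("(?=\\{)|(?<=\\})");
    }

    public static void main(String[] args) {
        String text = "hello{MyVar} Discover {MyVar2} {MyVar3}";
        System.out.println(placeholders(text)); // [{MyVar}, {MyVar2}, {MyVar3}]
        System.out.println(Arrays.toString(splitKeepingBraces(text)));
    }
}
```

The split variant also returns the text between the placeholders (for example "hello" and " Discover "), so the matcher loop is the simpler choice when only the placeholders are wanted.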

Regex after a special character in Java

I am using regex in java to get a specific output from a list of rooms at my University.
An excerpt from the list looks like this:
(A55:G260) Laboratorium 260
(A55:G292) Grupperom 292
(A55:G316) Grupperom 316
(A55:G366) Grupperom 366
(HDS:FLØYEN) Fløyen (appendix)
(ODO:PC-STUE) Pulpakammeret (PC-stue)
(SALEM:KONF) Konferanserom
I want to get the value that comes between the colon and the parenthesis.
The regex I am using at the moment is:
pattern = Pattern.compile("[:]([A-Za-z0-9ÆØÅæøå-]+)");
matcher = pattern.matcher(room.text());
I've included ÆØÅ, because some of the rooms have Norwegian letters in them.
Unfortunately the regex includes the building code also (e.g. "A55") in the output... Comes out like this:
A55
A55
A55
:G260
:G292
:G316
Any ideas on how to solve this?

The problem is not your regular expression. You need to reference group(1) for the match result.
while (matcher.find()) {
System.out.println(matcher.group(1));
}
However, you may consider using a negated character class instead.
pattern = Pattern.compile(":([^)]+)");

You can try a regex like this :
public static void main(String[] args) {
String s = "(HDS:FLØYEN) Fløyen (appendix)";
// select everything after ":" up to the first ")" and replace the entire match with the captured data
System.out.println(s.replaceAll(".*?:(.*?)\\).*", "$1"));
String s1 = "ODO:PC-STUE) Pulpakammeret (PC-stue)";
System.out.println(s1.replaceAll(".*?:(.*?)\\).*", "$1"));
}
Output:
FLØYEN
PC-STUE

You can try String operations as follows:
String val = "(HDS:FLØYEN) Fløyen (appendix)";
if(val.contains(":")){
String valSub = val.split("\\s")[0];
System.out.println(valSub);
valSub = valSub.substring(1, valSub.length()-1);
String valA = valSub.split(":")[0];
String valB = valSub.split(":")[1];
System.out.println(valA);
System.out.println(valB);
}
Output :
(HDS:FLØYEN)
HDS
FLØYEN

import java.util.regex.Matcher;
import java.util.regex.Pattern;
class test
{
public static void main( String args[] ){
// String to be scanned to find the pattern.
String line = "(HDS:FLØYEN) Fløyen (appendix)";
String pattern = ":([^)]+)";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(line);
while (m.find()) {
System.out.println(m.group(1));
}
}
}

Extracting Number from URL in Java via Regex

Take URL http://www.abc.com/alpha/beta/33445566778899/gamma/delta
I need to return the number 33445566778899 (with the forward slashes removed; the number is of variable length, between 10 and 20 digits).
Simple enough (or so I thought), except that everything I've tried doesn't seem to work. Why?
Pattern pattern = Pattern.compile("\\/([0-9])\\d{10,20}\\/");
Matcher matcher = pattern.matcher(fullUrl);
if (matcher.find()) {
return matcher.group(1);
}

Try this one-liner:
String number = url.replaceAll(".*/(\\d{10,20})/.*", "$1");

This regex works -
"\\/(\\d{10,20})\\/"
Testing it-
String fullUrl = "http://www.abc.com/alpha/beta/33445566778899/gamma/delta";
Pattern pattern = Pattern.compile("\\/(\\d{10,20})\\/");
Matcher matcher = pattern.matcher(fullUrl);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
OUTPUT - 33445566778899

Try,
String url = "http://www.abc.com/alpha/beta/33445566778899/gamma/delta";
String digitStr = null;
for(String str : url.split("/")){
System.out.println(str);
if(str.matches("[0-9]{10,20}")){
digitStr = str;
break;
}
}
System.out.println(digitStr);
Output:
33445566778899

Instead of saying it "doesn't seem to work", you should have told us what it was returning. Testing it confirmed what I thought: your code would return 3 for this input.
This is simply because your regexp as written will capture a digit following a / and followed by 10 to 20 digits themselves followed by a /.
The regex you want is "/(\\d{10,20})/" (you don't need to escape the /). Below you'll find the code I tested this with.
public static void main(String[] args) {
String src = "http://www.abc.com/alpha/beta/33445566778899/gamma/delta";
Pattern pattern = Pattern.compile("/(\\d{10,20})/");
Matcher matcher = pattern.matcher(src);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
}
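The two patterns can be compared side by side to see where the capture group ends up. This is only a sketch; the class and method names (UrlNumber, firstGroup) are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
final class UrlNumber {
    // Returns capture group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "http://www.abc.com/alpha/beta/33445566778899/gamma/delta";

        // Original pattern: the group wraps only the first digit.
        System.out.println(firstGroup("/([0-9])\\d{10,20}/", url)); // 3

        // Fixed pattern: the group wraps the whole run of digits.
        System.out.println(firstGroup("/(\\d{10,20})/", url)); // 33445566778899
    }
}
```

Because the digits must sit directly between two slashes, a path segment with fewer than 10 or more than 20 digits does not match at all.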
