How to extract a substring from a sentence until a delimiter - java

Hi, I have a string like this:
location/city/home-a-berlin?/someNewAdress
I want to extract the word berlin, which is placed between "-a-" and "?". How can I do that with a regex in Java?
I can do it with the String API, but I'm kind of stuck with regex.
String cityName = url.substring(url.lastIndexOf("-a-") + 3, url.indexOf('?')); // berlin

You can use a capture group with a negated character class.
-a-([^\?]+)\?
In Java:
String regex = "-a-([^\\?]+)\\?";
String string = "location/city/home-a-berlin?/someNewAdress\n";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(string);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
Output
berlin

Or
s = s.replaceAll(".*-(.*?)\\?.*", "$1");
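One caveat with the replaceAll approach: when nothing matches, replaceAll returns the input unchanged, so a failed extraction looks like a result. A minimal sketch of the difference, assuming a hypothetical helper class CityExtractor and method extractCity (neither is from the answers above):

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class CityExtractor {
    // Same idea as the capture-group answer: text between "-a-" and "?".
    private static final Pattern CITY = Pattern.compile("-a-([^?]+)\\?");

    // Returns the captured text, or Optional.empty() when the pattern is absent.
    static Optional<String> extractCity(String url) {
        Matcher m = CITY.matcher(url);
        if (m.find()) {
            return Optional.of(m.group(1));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String ok = "location/city/home-a-berlin?/someNewAdress";
        String bad = "location/city/home";

        System.out.println(extractCity(ok));  // Optional[berlin]
        System.out.println(extractCity(bad)); // Optional.empty

        // replaceAll leaves the input untouched when nothing matches.
        System.out.println(bad.replaceAll(".*-(.*?)\\?.*", "$1")); // location/city/home
    }
}
```

With Matcher.find() the "no match" case is explicit, which is probably what you want if the input is not guaranteed to contain "-a-...?".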

Alternative regex:
"-a-(.+?)\\?"
The regex in a small test harness:
public static void main(String[] args) {
String input1 = "location/city/home-a-berlin?/someNewAdress";
List<String> inputs = Arrays.asList(input1);
Pattern pattern = Pattern.compile("-a-(.+?)\\?");
List<String> results = inputs.stream().map(s -> pattern.matcher(s))
.filter(Matcher::find).map(m -> m.group(1)).collect(Collectors.toList());
//Output:
results.forEach(System.out::println);
}
Output:
berlin
Summary of regular-expression constructs:
https://docs.oracle.com/javase/10/docs/api/java/util/regex/Pattern.html

Related

Regex to split the first from a "/token1/token2/token3"

I'm pretty rusty with regex, but I have the requirement to extract the first token of the following string:
Input: /token1/token2/token3
Required output: /token1
I have tried:
List<String> connectorPath = Splitter.on("^[/\\w+]+")
.trimResults()
.splitToList(actionPath);
Doesn't work for me, any ideas?
Instead of split, you can match
^/\\w+
Or if the string has 3 parts, use a capture group for the first part.
^(/\\w+)/\\w+/\\w+$
Java example
Pattern pattern = Pattern.compile("^/\\w+");
Matcher matcher = pattern.matcher("/token1/token2/token3");
if (matcher.find()) {
System.out.println(matcher.group(0));
}
Output
/token1
You can split on the / that is not at the string start using the (?!^)/ regex:
String[] res = "/token1/token2/token3".split("(?!^)/");
System.out.println(res[0]); // => /token1
(?!^) - a negative lookahead that matches a location not at the start of string
/ - a / char.
Using Guava:
Splitter splitter = Splitter.onPattern("(?!^)/").trimResults();
Iterable<String> iterable = splitter.split(actionPath);
String first = Iterables.getFirst(iterable, "");
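To see what the (?!^)/ split actually produces compared with a plain split on /, here is a small sketch using only String.split. The class name SplitNotAtStart and the method splitPath are made up for this example:

```java
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
class SplitNotAtStart {
    static String[] splitPath(String path) {
        // "(?!^)/" matches every "/" except one at index 0.
        return path.split("(?!^)/");
    }

    public static void main(String[] args) {
        String path = "/token1/token2/token3";

        // The leading "/" is kept as part of the first element.
        System.out.println(Arrays.toString(splitPath(path)));
        // [/token1, token2, token3]

        // A plain split on "/" yields an empty first element instead.
        System.out.println(Arrays.toString(path.split("/")));
        // [, token1, token2, token3]
    }
}
```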
You are over-complicating it.
Try the following regular expression: ^(\/\w+)(.+)$
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class PathSplitter {
public static void main(String args[]) {
String input = "/token1/token2/token3";
Pattern pattern = Pattern.compile("^(\\/\\w+)(.+)$");
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
System.out.println(matcher.group(1)); // /token1
System.out.println(matcher.group(2)); // /token2/token3
} else {
System.out.println("NO MATCH");
}
}
}

Regular expression in java (java String)

from this -> contractor:"Hi, this is \"Paul\", how are you?" client:"Hi ...." <-
I want to get just -> Hi, this is \"Paul\", how are you? <-
I need a regular expression in Java to do that. I tried, but I'm struggling: the inner quotation (\") is driving me mad.
Thanks for any hint.
Java supports lookbehinds, so vanilla regex:
"(.*?(?<!\\))"
Inside a Java string (see https://stackoverflow.com/a/37329801/1225328):
\"(.*?(?<!\\\\))\"
The actual text will be contained inside the first group of each match.
Demo: https://regex101.com/r/8OXujX/2
For example, in Java:
String regex = "\"(.*?(?<!\\\\))\"";
String input = "contractor:\"Hi, this is \\\"Paul\\\", how are you?\" client:\"Hi ....\"";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
if (matcher.find()) { // or while (matcher.find()) to iterate through all the matches
System.out.println(matcher.group(1));
} else {
System.out.println("No matches");
}
Prints:
Hi, this is \"Paul\", how are you?
The regexp should look like this: "(?:\\.|[^"\\])*"
It uses a non-capturing group (?:...) that matches either an escaped character (a backslash followed by any character, \\.) or a single character that is neither a double quote nor a backslash.
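This answer gives no Java code, so here is a minimal sketch of how that regex might be used. I added a capture group around the body so the surrounding quotes are dropped; the class QuotedStrings and method quotedContents are names invented for this example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class QuotedStrings {
    // Plain regex: "((?:\\.|[^"\\])*)"
    // Group 1 is the text between the quotes, escaped quotes included.
    private static final Pattern QUOTED =
            Pattern.compile("\"((?:\\\\.|[^\"\\\\])*)\"");

    static List<String> quotedContents(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "contractor:\"Hi, this is \\\"Paul\\\", how are you?\" client:\"Hi ....\"";
        // Prints each quoted section on its own line:
        // Hi, this is \"Paul\", how are you?
        // Hi ....
        quotedContents(input).forEach(System.out::println);
    }
}
```

Unlike the lookbehind version, this one consumes escape sequences as pairs, so it should also cope with an escaped backslash right before the closing quote.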
var text1 = "contractor:\"Hi, this is \\\"Paul\\\", how are you?\" client:\"Hi ....\" <-";
var regExWithQuotation = "contractor:(.+\".+\".+) client:";
Pattern p = Pattern.compile(regExWithQuotation);
var m = p.matcher(text1);
if (m.find()) {
var res = m.group(1);
System.out.println(res);
}
var regExWithoutQuotation = "contractor:\"(.+\".+\".+)?\" client:";
p = Pattern.compile(regExWithoutQuotation);
m = p.matcher(text1);
if (m.find()) {
var res = m.group(1);
System.out.println(res);
}
Output is:
"Hi, this is \"Paul\", how are you?"
Hi, this is \"Paul\", how are you?
You can use the regex, (?<=contractor:\").*(?=\" client:)
Description of the regex:
(?<=contractor:\") specifies positive lookbehind for contractor:\"
.* specifies any sequence of characters (greedy, so as long as possible)
(?=\" client:) specifies positive lookahead for \" client:
In short, anything preceded by contractor:\" and followed by \" client:
Demo:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Main {
public static void main(String[] args) {
String str = "contractor:\"Hi, this is \\\"Paul\\\", how are you?\" client:\"Hi ....\"";
String regex = "(?<=contractor:\").*(?=\" client:)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
System.out.println(matcher.group());
}
}
}
Output:
Hi, this is \"Paul\", how are you?
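Note that .* is greedy, so if the input holds more than one contractor/client pair, the match runs up to the last " client:". A reluctant .*? stops at the first one. A small sketch of the difference, with a made-up input and a hypothetical helper findAll:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class LookaroundGreedy {
    // Collects every match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up input with two contractor/client pairs.
        String input = "contractor:\"A\" client:\"B\" contractor:\"C\" client:\"D\"";

        // Greedy: one long match spanning both pairs.
        System.out.println(findAll("(?<=contractor:\").*(?=\" client:)", input));
        // [A" client:"B" contractor:"C]

        // Reluctant: one short match per pair.
        System.out.println(findAll("(?<=contractor:\").*?(?=\" client:)", input));
        // [A, C]
    }
}
```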

Regex after a special character in Java

I am using regex in java to get a specific output from a list of rooms at my University.
An excerpt from the list looks like this:
(A55:G260) Laboratorium 260
(A55:G292) Grupperom 292
(A55:G316) Grupperom 316
(A55:G366) Grupperom 366
(HDS:FLØYEN) Fløyen (appendix)
(ODO:PC-STUE) Pulpakammeret (PC-stue)
(SALEM:KONF) Konferanserom
I want to get the value that comes between the colon and the parenthesis.
The regex I am using at the moment is:
pattern = Pattern.compile("[:]([A-Za-z0-9ÆØÅæøå-]+)");
matcher = pattern.matcher(room.text());
I've included ÆØÅ, because some of the rooms have Norwegian letters in them.
Unfortunately the regex includes the building code also (e.g. "A55") in the output... Comes out like this:
A55
A55
A55
:G260
:G292
:G316
Any ideas on how to solve this?
The problem is not your regular expression. You need to reference group(1) for the match result.
while (matcher.find()) {
System.out.println(matcher.group(1));
}
However, you may consider using a negated character class instead.
pattern = Pattern.compile(":([^)]+)");
You can try a regex like this:
public static void main(String[] args) {
String s = "(HDS:FLØYEN) Fløyen (appendix)";
// select everything after ":" up to the first ")" and replace the whole input with the selected data
System.out.println(s.replaceAll(".*?:(.*?)\\).*", "$1"));
String s1 = "(ODO:PC-STUE) Pulpakammeret (PC-stue)";
System.out.println(s1.replaceAll(".*?:(.*?)\\).*", "$1"));
}
Output:
FLØYEN
PC-STUE
You can also try plain String operations, as follows:
String val = "(HDS:FLØYEN) Fløyen (appendix)";
if(val.contains(":")){
String valSub = val.split("\\s")[0];
System.out.println(valSub);
valSub = valSub.substring(1, valSub.length()-1);
String valA = valSub.split(":")[0];
String valB = valSub.split(":")[1];
System.out.println(valA);
System.out.println(valB);
}
Output :
(HDS:FLØYEN)
HDS
FLØYEN
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class test
{
public static void main( String args[] ){
// String to be scanned to find the pattern.
String line = "(HDS:FLØYEN) Fløyen (appendix)";
String pattern = ":([^)]+)";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(line);
while (m.find()) {
System.out.println(m.group(1));
}
}
}
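Applied to several lines from the question at once, the negated character class might be used like this. This is only a sketch: the class RoomCodes and method roomCodes are hypothetical, and it takes the first match per line:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class RoomCodes {
    // Everything after ":" up to (not including) the next ")".
    // No need to list the Norwegian letters explicitly.
    private static final Pattern ROOM = Pattern.compile(":([^)]+)");

    static List<String> roomCodes(List<String> lines) {
        List<String> codes = new ArrayList<>();
        for (String line : lines) {
            Matcher m = ROOM.matcher(line);
            if (m.find()) {
                codes.add(m.group(1)); // group(1), not group(), to skip the colon
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "(A55:G260) Laboratorium 260",
                "(HDS:FLØYEN) Fløyen (appendix)",
                "(ODO:PC-STUE) Pulpakammeret (PC-stue)",
                "(SALEM:KONF) Konferanserom");
        // Prints G260, FLØYEN, PC-STUE and KONF, one per line.
        roomCodes(lines).forEach(System.out::println);
    }
}
```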

First and second token regex

How can I get the first and second quoted texts from a string?
I could do it with indexOf, but that is really tedious.
For example, I have a string to parse like: "aaa":"bbbbb"perhapsSomeOtherText
I'd like to get aaa and bbbbb with a regex pattern. This would let me use them in a switch statement and greatly simplify my app.
If all you have is a colon-delimited string, just split it:
String str = ...; // colon delimited
String[] parts = str.split(":");
Note that split() takes a regex and, in general, compiles it on each call. To improve performance, you can precompile a Pattern as follows:
private static Pattern pColonSplitter = Pattern.compile(":");
// now somewhere in your code:
String[] parts = pColonSplitter.split(str);
If, however, you want to use a pattern to match and extract string fragments in more complicated cases, do it like this:
Pattern p = Pattern.compile("(\\w+):(\\w+):");
Matcher m = p.matcher(str);
if (m.find()) {
String a = m.group(1);
String b = m.group(2);
}
Note the parentheses: they define the capture groups.
Something like this?
Pattern pattern = Pattern.compile("\"([^\"]*)\"");
Matcher matcher = pattern.matcher("\"aaa\":\"bbbbb\"perhapsSomeOtherText");
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Output
aaa
bbbbb
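If you want both tokens from a single match rather than a loop, two capture groups can do it. A minimal sketch, assuming the input always starts with "first":"second"; the class QuotedPair and method parse are invented for this example, and Map.entry needs Java 9 or later:

```java
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class QuotedPair {
    // Plain regex: ^"([^"]*)":"([^"]*)"
    private static final Pattern PAIR =
            Pattern.compile("^\"([^\"]*)\":\"([^\"]*)\"");

    // Returns (first, second), or Optional.empty() if the input does not start with a pair.
    static Optional<Map.Entry<String, String>> parse(String input) {
        Matcher m = PAIR.matcher(input);
        if (m.find()) {
            return Optional.of(Map.entry(m.group(1), m.group(2)));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        parse("\"aaa\":\"bbbbb\"perhapsSomeOtherText").ifPresent(pair -> {
            // The first token can drive a switch, as the question asks.
            switch (pair.getKey()) {
                case "aaa":
                    System.out.println("first=" + pair.getKey() + ", second=" + pair.getValue());
                    break;
                default:
                    System.out.println("unknown key: " + pair.getKey());
            }
        });
        // prints: first=aaa, second=bbbbb
    }
}
```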
String str = "\"aaa\":\"bbbbb\"perhapsSomeOtherText";
Pattern p = Pattern.compile("\"\\w+\""); // word between ""
Matcher m = p.matcher(str);
while(m.find()){
System.out.println(m.group().replace("\"", ""));
}
output:
aaa
bbbbb
There are several ways to do this. For example, use StringTokenizer, or Scanner with its useDelimiter method.
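A rough sketch of the StringTokenizer route, treating both the double quote and the colon as delimiters. The class TokenizerSketch and method tokens are hypothetical names; note that everything after the second token comes back as a third token:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, for illustration only.
class TokenizerSketch {
    static List<String> tokens(String input) {
        // Every character in the second argument acts as a delimiter;
        // StringTokenizer never returns empty tokens.
        StringTokenizer st = new StringTokenizer(input, "\":");
        List<String> result = new ArrayList<>();
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("\"aaa\":\"bbbbb\"perhapsSomeOtherText"));
        // [aaa, bbbbb, perhapsSomeOtherText]
    }
}
```

This only works if the tokens themselves never contain a quote or a colon; otherwise the regex answers above are the safer choice.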

How split a string using regex pattern

How can I split a string on tokens like [0] using a regex pattern? The 0 can be any integer.
I used this regex pattern:
private static final String REGEX = "[\\d]";
But it returns strings that still contain [.
Splitting code:
Pattern p=Pattern.compile(REGEX);
String items[] = p.split(lure_value_save[0]);
You have to escape the brackets:
String REGEX = "\\[\\d+\\]";
Java doesn't offer an elegant solution to extract the numbers. This is the way to go:
Pattern p = Pattern.compile(REGEX);
String test = "[0],[1],[2]";
Matcher m = p.matcher(test);
List<String> matches = new ArrayList<String>();
while (m.find()) {
matches.add(m.group());
}
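If you only need the numbers without the brackets, a capture group around \\d+ should do. And if you really want to split, the same escaped pattern works with Pattern.split. A minimal sketch; the class BracketIndexes, the method indexes, and the sample input strings are assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class BracketIndexes {
    // Escaped brackets; group 1 is the integer between them.
    private static final Pattern INDEX = Pattern.compile("\\[(\\d+)\\]");

    // Extracts every bracketed integer, without the brackets.
    static List<Integer> indexes(String input) {
        List<Integer> result = new ArrayList<>();
        Matcher m = INDEX.matcher(input);
        while (m.find()) {
            result.add(Integer.parseInt(m.group(1)));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(indexes("[0],[1],[2]"));
        // [0, 1, 2]

        // Splitting on the same pattern returns the text between the markers.
        System.out.println(Arrays.toString(INDEX.split("lure[0]value[12]save")));
        // [lure, value, save]
    }
}
```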
