Regexp matching in group with special characters - java

I have a String that is something like this:
A20130122.0000+0000-0015+0000_name
Then I would like to extract this information:
The 20130122.0000+0000-0015+0000 that will be parsed to a date later on.
And the final part which is name.
So I am using in Java something like this:
String regexpOfdate = "[0-9]{8}\\.[0-9]{4}\\+[0-9]{4}-[0-9]{4}\\+[0-9]{4}";
String regexpOfName = "\\w+";
Pattern p = Pattern.compile(String.format("A(%s)_(%s)", regexpOfdate, regexpOfName));
Matcher m = p.matcher(theString);
String date = m.group(0);
String name = m.group(1);
But I am getting a java.lang.IllegalStateException: No match found
Do you know what I am doing wrong?

You aren't calling Matcher#find or Matcher#matches methods after this line:
Matcher m = p.matcher(theString);
Try this code (note that group(0) is the whole match, so the captured date and name are groups 1 and 2):
Matcher m = p.matcher(theString);
if (m.find()) {
String date = m.group(1);
String name = m.group(2);
System.out.println("Date: " + date + ", name: " + name);
}

Matcher#group throws IllegalStateException if the matcher's regex hasn't yet been applied to its target text, or if the previous application was not successful.
Matcher#find applies the matcher's regex to the current region of the matcher's target text, returning a boolean indicating whether a match was found.
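As a minimal sketch of the behaviour described above, the following shows group() throwing before any match attempt and returning the captured values afterwards. The class and method names (GroupBeforeFindDemo, extract, groupThrowsBeforeMatch) are made up for illustration; the pattern is the one from the question, and matches() is used here on the assumption that the whole input should fit the pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupBeforeFindDemo {
    // Date and name patterns from the question, each wrapped in one capturing group.
    static final Pattern P = Pattern.compile(
        "A([0-9]{8}\\.[0-9]{4}\\+[0-9]{4}-[0-9]{4}\\+[0-9]{4})_(\\w+)");

    // Returns {date, name}, or null when the whole input does not match.
    static String[] extract(String input) {
        Matcher m = P.matcher(input);
        if (!m.matches()) {   // matches() requires the entire input to match
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    // Returns true if group() throws because no match has been attempted yet.
    static boolean groupThrowsBeforeMatch(String input) {
        Matcher m = P.matcher(input);
        try {
            m.group(1);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = "A20130122.0000+0000-0015+0000_name";
        System.out.println("throws before match: " + groupThrowsBeforeMatch(s));
        String[] parts = extract(s);
        System.out.println("Date: " + parts[0] + ", name: " + parts[1]);
    }
}
```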
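You can try this (the sub-patterns add their own parentheses, so the groups nest: group 1 is the full date string, group 2 the eight-digit date, and group 3 the name):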
String theString="A20130122.0000+0000-0015+0000_name";
String regexpOfdate = "([0-9]{8})\\.[0-9]{4}\\+[0-9]{4}-[0-9]{4}\\+[0-9]{4}";
String regexpOfName = "(\\w+)";
Pattern p = Pattern.compile(String.format("A(%s)_(%s)", regexpOfdate, regexpOfName));
Matcher m = p.matcher(theString);
if(m.find()){
String date = m.group(2);
String name = m.group(3);
System.out.println("date: "+date);
System.out.println("name: "+name);
}
OUTPUT
date: 20130122
name: name
Refer Grouping in REGEX

Related

Extract multiple dates (dd-MMM-yyyy format) from a string in java

I have searched everywhere for this but couldn't get a specific solution, and the documentation also didn't cover this. So I want to extract the start date and end date from this string "1-Mar-2019 to 31-Mar-2019". The problem is I'm not able to extract both the date strings.
I found the closest solution here but couldn't post a comment asking how to extract values individually due to low reputation: https://stackoverflow.com/a/8116229/10735227
I'm using a regex pattern to look for the occurrences and to extract both occurrences to 2 strings first.
Here's what I tried:
Pattern p = Pattern.compile("(\\d{1,2}-[a-zA-Z]{3}-\\d{4})");
Matcher m = p.matcher(str);
while(m.find())
{
startdt = m.group(1);
enddt = m.group(1); //I think this is wrong, don't know how to fix it
}
System.out.println("startdt: "+startdt+" enddt: "+enddt);
Output is:
startdt: 31-Mar-2019 enddt: 31-Mar-2019
Additionally I need to use DateFormatter to convert the string to a date (adding a leading 0 before a single-digit day if required).
You can capture both dates by calling find twice, since each call advances to the next match. If the string contains only one date, only startdt is set:
String str = "1-Mar-2019 to 31-Mar-2019";
String startdt = null, enddt = null;
Pattern p = Pattern.compile("(\\d{1,2}-[a-zA-Z]{3}-\\d{4})");
Matcher m = p.matcher(str);
if(m.find()) {
startdt = m.group(1);
if(m.find()) {
enddt = m.group(1);
}
}
System.out.println("startdt: "+startdt+" enddt: "+enddt);
Note that this could also be done with a while(m.find()) loop and a List<String> to extract every date you can find.
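A minimal sketch of that loop-and-list variant might look like this. The class name AllDatesDemo and the helper findDates are hypothetical; the pattern is the one from the question without the capturing group, since group() already returns the whole match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AllDatesDemo {
    // Same d-MMM-yyyy pattern as in the question.
    static final Pattern DATE = Pattern.compile("\\d{1,2}-[a-zA-Z]{3}-\\d{4}");

    // Collects every date-like substring, in order of appearance.
    static List<String> findDates(String text) {
        List<String> dates = new ArrayList<>();
        Matcher m = DATE.matcher(text);
        while (m.find()) {          // each find() moves on to the next match
            dates.add(m.group());   // group() with no argument is the whole match
        }
        return dates;
    }

    public static void main(String[] args) {
        List<String> dates = findDates("1-Mar-2019 to 31-Mar-2019");
        System.out.println(dates); // prints [1-Mar-2019, 31-Mar-2019]
    }
}
```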
If your text may be messy, and you really need to use a regex to extract the date range, you may use
String str = "Text here 1-Mar-2019 to 31-Mar-2019 and tex there";
String startdt = "";
String enddt = "";
String date_rx = "\\d{1,2}-[a-zA-Z]{3}-\\d{4}";
Pattern p = Pattern.compile("(" + date_rx + ")\\s*to\\s*(" + date_rx + ")");
Matcher m = p.matcher(str);
if(m.find())
{
startdt = m.group(1);
enddt = m.group(2);
}
System.out.println("startdt: "+startdt+" enddt: "+enddt);
// => startdt: 1-Mar-2019 enddt: 31-Mar-2019
Also, consider this enhancement: match the date as whole word to avoid partial matches in longer strings:
Pattern.compile("\\b(" + date_rx + ")\\s*to\\s*(" + date_rx + ")\\b")
If the range can be expressed with either - or to, you may replace to with (?:to|-), or even (?:to|\\p{Pd}), where \p{Pd} matches any hyphen or dash.
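To illustrate the separator alternation, here is a small sketch that combines the word-boundary variant with (?:to|\\p{Pd}). The names DateRangeDemo and parseRange are invented for this example, and the en-dash input is an assumed test case rather than something from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateRangeDemo {
    static final String DATE_RX = "\\d{1,2}-[a-zA-Z]{3}-\\d{4}";

    // Two dates separated by "to" or by any Unicode dash punctuation character.
    static final Pattern RANGE = Pattern.compile(
        "\\b(" + DATE_RX + ")\\s*(?:to|\\p{Pd})\\s*(" + DATE_RX + ")\\b");

    // Returns {start, end}, or null if no range is found in the text.
    static String[] parseRange(String text) {
        Matcher m = RANGE.matcher(text);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "Text here 1-Mar-2019 to 31-Mar-2019 and text there",
            "1-Mar-2019 - 31-Mar-2019",
            "1-Mar-2019 \u2013 31-Mar-2019"   // separated by an en dash
        };
        for (String input : inputs) {
            String[] range = parseRange(input);
            System.out.println("startdt: " + range[0] + " enddt: " + range[1]);
        }
    }
}
```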
You can simply use String::split, provided the string always has the form "start to end" with single spaces:
String range = "1-Mar-2019 to 31-Mar-2019";
String[] dts = range.split(" ");
System.out.println(dts[0]);
System.out.println(dts[2]);
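The question also asks about turning the extracted strings into dates and padding single-digit days. One possible sketch, assuming Java 8+ java.time is acceptable and that the month abbreviations are English, is shown below; the class name DateParseDemo and the helpers parse and pad are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class DateParseDemo {
    // A single "d" accepts one- or two-digit days when parsing.
    // Locale.ENGLISH is an assumption so that "Mar" is recognised on any machine.
    static final DateTimeFormatter IN =
        DateTimeFormatter.ofPattern("d-MMM-yyyy", Locale.ENGLISH);

    // "dd" pads the day to two digits when formatting.
    static final DateTimeFormatter OUT =
        DateTimeFormatter.ofPattern("dd-MMM-yyyy", Locale.ENGLISH);

    static LocalDate parse(String s) {
        return LocalDate.parse(s, IN);
    }

    static String pad(String s) {
        return parse(s).format(OUT);
    }

    public static void main(String[] args) {
        System.out.println(parse("1-Mar-2019"));  // prints 2019-03-01
        System.out.println(pad("1-Mar-2019"));    // prints 01-Mar-2019
        System.out.println(pad("31-Mar-2019"));   // prints 31-Mar-2019
    }
}
```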

Get text in the URL with dynamic date - Regex Java

I need to extract, in Java, the path segment of a URL that follows a date
Input 1:
/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/
Output: testcustomer
Only /raw/ stays constant; the date and testcustomer will change
Input 2:
/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/
Output: newcustomer
String url = "/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/";
String customer = getCustomer(url);
public String getCustomer (String _url){
String source = "default";
String regex = basePath + "/raw/\\d{4}-\\d{2}-\\d{2}/usr*";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(_url);
if (m.find()) {
source = m.group(1);
} else {
logger.error("Cant get customer with regex " + regex);
}
return source;
}
It's returning 'default' :(
Your regex /raw/\\d{4}-\\d{2}-\\d{2}/usr* has no part (and no capturing group) for the value you want. You need a regex that matches the date and captures what follows it:
/\w*/raw/[0-9-]+/(\w+)/.* or (?<=raw\/\d{4}-\d{2}-\d{2}\/)(\w+) will work
Pattern p = Pattern.compile("/\\w*/raw/[0-9-]+/(\\w+)/.*");
Matcher m = p.matcher(str);
if (m.find()) {
String value = m.group(1);
System.out.println(value);
}
Or, if it's always the fourth path segment, use split(). The index is 4 because the leading slash produces an empty first element:
String value = str.split("/")[4];
System.out.println(value);
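For the lookbehind alternative mentioned above, a sketch along these lines could replace the getCustomer method from the question. The class name CustomerFromPathDemo is made up, and keeping "default" as the fallback value is an assumption carried over from the question's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CustomerFromPathDemo {
    // Matches the word characters that come right after "/raw/<yyyy-MM-dd>/".
    // The lookbehind has a fixed length, which Java's regex engine supports.
    static final Pattern AFTER_DATE =
        Pattern.compile("(?<=/raw/\\d{4}-\\d{2}-\\d{2}/)\\w+");

    // Returns the segment after the date, or "default" when the path has no such segment.
    static String getCustomer(String url) {
        Matcher m = AFTER_DATE.matcher(url);
        return m.find() ? m.group() : "default";
    }

    public static void main(String[] args) {
        System.out.println(getCustomer("/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/"));
        System.out.println(getCustomer("/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/"));
    }
}
```

Because the lookbehind is not part of the match itself, group() returns only the customer segment and no capturing group is needed.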
Here, we can use raw followed by the date as a left boundary, collect the desired output in a lazy capturing group, then match a slash and consume the rest of the string, with an expression similar to:
.+raw\/[0-9]{4}-[0-9]{2}-[0-9]{2}\/(.+?)\/.+
Test
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = ".+raw\\/[0-9]{4}-[0-9]{2}-[0-9]{2}\\/(.+?)\\/.+";
final String string = "/test1/raw/2019-06-11/testcustomer/usr/pqr/DATA/mn/export/\n"
+ "/test3/raw/2018-09-01/newcustomer/usr/pqr/DATA/mn/export/";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
System.out.println("Full match: " + matcher.group(0));
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println("Group " + i + ": " + matcher.group(i));
}
}
RegEx
If this expression wasn't desired or you wish to modify it, please visit regex101.com.

Java Finding String regex

I am trying to extract these dates: 22-APR-16 11.00.00.000000 and 22-APR-16 10.30.00.000000.
My code is below, but it doesn't find them. How can I fix it?
String pattern = "(Başlangıç Tarihi:\\s+)([0-9/:]+\\s+[0-9:]+)(.*)\\s+(Bitiş Tarihi:\\s+)([0-9/:]+\\s+[0-9:]+)(.*)";
Pattern r = Pattern.compile(pattern);
String text = "Başlangıç Tarihi: 22-APR-16 11.00.00.000000 AM Bitiş Tarihi: 22-APR-16 10.30.00.000000 PM";
Matcher m = r.matcher(text);
if(m.find())
{
String startDate = m.group(2);
String endDate = m.group(5);
System.out.println("Start Date : " + startDate);
System.out.println("End Date : " + endDate);
}
KISS
String pattern = "(Başlangıç Tarihi:\\s+)(\\d+-[A-Za-z]+-\\d+\\s[\\d.]+)(.*)\\s+(Bitiş Tarihi:\\s+)(\\d+-[A-Za-z]+-\\d+\\s[\\d.]+)";
Moreover, you can just use
(\\d+-[A-Za-z]+-\\d+\\s[\\d.]+)
and find all the matches in a loop, storing them in an array or ArrayList. Elements at even indices (0, 2, ...) will be start dates and elements at odd indices will be end dates.
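A rough sketch of that loop, using the shorter pattern, could look like the following. The names StartEndDatesDemo and findAll are hypothetical, and pairing consecutive matches as start and end assumes the dates always alternate in that order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StartEndDatesDemo {
    // Date, one whitespace character, then a dotted time such as 11.00.00.000000.
    static final Pattern DATE = Pattern.compile("\\d+-[A-Za-z]+-\\d+\\s[\\d.]+");

    // Returns every date-time substring in order of appearance.
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = DATE.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample text from the question; the non-ASCII letters are written as
        // Unicode escapes so the file compiles regardless of source encoding.
        String text = "Ba\u015Flang\u0131\u00E7 Tarihi: 22-APR-16 11.00.00.000000 AM "
                    + "Biti\u015F Tarihi: 22-APR-16 10.30.00.000000 PM";
        List<String> dates = findAll(text);
        // Even indices hold start dates, odd indices hold end dates.
        for (int i = 0; i + 1 < dates.size(); i += 2) {
            System.out.println("Start Date : " + dates.get(i));
            System.out.println("End Date : " + dates.get(i + 1));
        }
    }
}
```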

How to split a long string in Java?

How to edit this string and split it into two?
String asd = {RepositoryName: CodeCommitTest,RepositoryId: 425f5fc5-18d8-4ae5-b1a8-55eb9cf72bef};
I want to make two strings.
String reponame;
String RepoID;
reponame should be CodeCommitTest
repoID should be 425f5fc5-18d8-4ae5-b1a8-55eb9cf72bef
Can someone help me get it? Thanks
Here is Java code using a regular expression in case you can't use a JSON parsing library (which is what you probably should be using):
String pattern = "^\\{RepositoryName:\\s(.*?),RepositoryId:\\s(.*?)\\}$";
String asd = "{RepositoryName: CodeCommitTest,RepositoryId: 425f5fc5-18d8-4ae5-b1a8-55eb9cf72bef}";
String reponame = "";
String repoID = "";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(asd);
if (m.find()) {
reponame = m.group(1);
repoID = m.group(2);
System.out.println("Found reponame: " + reponame + " with repoID: " + repoID);
} else {
System.out.println("NO MATCH");
}
This code has been tested in IntelliJ and runs without error.
Output:
Found reponame: CodeCommitTest with repoID: 425f5fc5-18d8-4ae5-b1a8-55eb9cf72bef
Assuming there aren't quote marks in the input, and that the repository name and ID consist of letters, numbers, and dashes, then this should work to get the repository name:
Pattern repoNamePattern = Pattern.compile("RepositoryName: *([A-Za-z0-9\\-]+)");
Matcher matcher = repoNamePattern.matcher(asd);
if (matcher.find()) {
reponame = matcher.group(1);
}
and you can do something similar to get the ID. The above code just looks for RepositoryName:, possibly followed by spaces, followed by one or more letters, digits, or hyphen characters; then the group(1) method extracts the name, since it's the first (and only) group enclosed in () in the pattern.
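As a sketch of the "something similar" for the ID, the same pattern can be parameterised by the key name. The class RepoFieldsDemo and the helper field are invented for illustration, and the same assumption applies: values contain only letters, digits, and hyphens.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepoFieldsDemo {
    // Looks for "<key>:", optional spaces, then a run of letters, digits, or hyphens.
    // Returns the value, or null when the key is not present.
    static String field(String input, String key) {
        Pattern p = Pattern.compile(Pattern.quote(key) + ": *([A-Za-z0-9\\-]+)");
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String asd = "{RepositoryName: CodeCommitTest,RepositoryId: 425f5fc5-18d8-4ae5-b1a8-55eb9cf72bef}";
        String reponame = field(asd, "RepositoryName");
        String repoID = field(asd, "RepositoryId");
        System.out.println("reponame: " + reponame);
        System.out.println("repoID: " + repoID);
    }
}
```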

Regex for floor in address

I have this regex:
String regexPattern = "[0-9A-Za-z]+(st|nd|rd|th)" + " " + "floor";
I want to test it against:
String lineString = "8th floor, Prince's Building, 12 Chater Road";
so I do:
boolean isMatching = lineString.matches(regexPattern);
and it returns false. Why?
I thought it had something to do with whitespaces in Java, so I removed the whitespace in the regexPattern variable so it reads
regexPattern = "[0-9A-Za-z]+(st|nd|rd|th)floor";
and matched it with a string without white space:
String lineString = "8thfloor,Prince'sBuilding,12ChaterRoad";
it still returns false. Why? Any help very much appreciated.
String.matches() only returns true if the entire string matches the pattern.
Try adding .* to the beginning and end of your regex.
Example:
String regex = ".*[0-9A-Za-z]+(st|nd|rd|th)" + " " + "floor.*";
This is not the best approach, however...
Here's a better alternative:
String input = "8th floor, Prince's Building, 12 Chater Road";
String regex = "[0-9A-Za-z]+(st|nd|rd|th)" + " " + "floor";
Pattern p = Pattern.compile(regex);
boolean isMatch = p.matcher(input).find();
If you want to extract the floor number, do this:
String input = "8th floor, Prince's Building, 12 Chater Road";
// The + sits inside the group so that multi-character numbers such as 12 are captured whole
String regex = "([0-9A-Za-z]+)(st|nd|rd|th)" + " " + "floor";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(input);
if (m.find()) {
String num = m.group(1);
String suffix = m.group(2);
System.out.println("Welcome to the " + num + suffix + " floor!");
// prints 'Welcome to the 8th floor!'
}
Check out the Pattern API for a boatload of info about Java regular expressions.
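Edited, per comments ...
Because [0-9A-Za-z]+ also matches letters, it first greedily consumes the th suffix and only matches after backtracking. It also accepts non-numeric words (for example, depth floor).
Try [0-9]+ instead.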
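Putting the points from this answer together, a small sketch might contrast String.matches() with Matcher.find() and use the digits-only class. The names FloorDemo and floorOf are hypothetical, and the second sample address is an assumed input used to show a two-digit floor.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FloorDemo {
    // Digits only, followed by an ordinal suffix and the word "floor".
    static final String FLOOR_RX = "([0-9]+)(st|nd|rd|th) floor";
    static final Pattern FLOOR = Pattern.compile(FLOOR_RX);

    // Returns the floor with its suffix (e.g. "8th"), or null if none is found.
    static String floorOf(String line) {
        Matcher m = FLOOR.matcher(line);
        return m.find() ? m.group(1) + m.group(2) : null;
    }

    public static void main(String[] args) {
        String line = "8th floor, Prince's Building, 12 Chater Road";
        // matches() must cover the entire string, so this is false.
        System.out.println(line.matches(FLOOR_RX));        // prints false
        // find() looks for the pattern anywhere in the string.
        System.out.println(floorOf(line));                  // prints 8th
        System.out.println(floorOf("Suite 5, 12th floor")); // prints 12th
    }
}
```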
