I'd like to extract 2 arguments from a given string using regex. For example:
C:\Users "C:\Program files"
C:\mytext.txt mytext2.txt
Output would be C:\Users and C:\Program files
C:\mytext.txt and mytext2.txt
If an argument is enclosed in " ", it can contain white space; otherwise it cannot. So far I have managed to extract arguments between " ", but I can't figure out how to extract them when one argument is quoted and the other one isn't (as in the example above).
Pattern p = Pattern.compile("\"(.*?)\"");
Matcher m = p.matcher(string);
You can use this regex for matching:
Pattern p = Pattern.compile("\"[^\"]*\"|\\S+");
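A minimal sketch of how this pattern might be applied, assuming the surrounding quotes should be dropped from the result as in the expected output. The class name ArgExtractor and the method extractArgs are made up for illustration; only the pattern comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ArgExtractor {
    // Same pattern as above: a quoted run, or a run of non-whitespace characters.
    private static final Pattern ARG = Pattern.compile("\"[^\"]*\"|\\S+");

    // Hypothetical helper: returns every argument found in the input.
    static List<String> extractArgs(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = ARG.matcher(input);
        while (m.find()) {
            String arg = m.group();
            // Assumption: the quotes themselves are not wanted in the result.
            if (arg.length() >= 2 && arg.startsWith("\"") && arg.endsWith("\"")) {
                arg = arg.substring(1, arg.length() - 1);
            }
            result.add(arg);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extractArgs("C:\\Users \"C:\\Program files\""));
        System.out.println(extractArgs("C:\\mytext.txt mytext2.txt"));
    }
}
```

For the two example inputs this should give C:\Users and C:\Program files, then C:\mytext.txt and mytext2.txt.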
Can you suggest an approach for splitting a String like this:
I tried to parse that string with
this kind of regular expression, but it is not working. Please suggest a regular expression that splits the string with 20, 31C, 31D etc. as keys and 150318, 150425 IN BANGLADESH etc. as values.
If I use string.split(":") then it would not serve my purpose.
If a string is like:
then it will be split into 3 strings, and key 20 will be associated with "MY VALUES", while "ARE HERE" will not be associated with key 20.
You can use matching instead of splitting, since you need to match a specific colon in the string.
The regex that captures the text between the first and second colon, and also captures everything after the second colon, is ^:([^:]*):(.*)$
The ^ asserts the beginning of the string, the first : matches the leading colon, ([^:]*) matches and captures into Group 1 zero or more characters other than :, the second : matches the delimiter, and (.*) matches and captures into Group 2 the rest of the string. $ asserts the position at the end of a single-line string (as . matches any character except a line terminator unless Pattern.DOTALL is set).
String s = ":20:AND:HERE";
Pattern pattern = Pattern.compile("^:([^:]*):(.*)$");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println("Key: " + matcher.group(1) + ", Value: " + matcher.group(2) + "\n");
}
Result for this demo: Key: 20, Value: AND:HERE
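As a rough sketch of how the same pattern could build the key/value pairs the question asks for, assuming each field sits on its own line. The sample lines, the class name FieldParser and the method parse are illustrative assumptions; the full input from the question is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldParser {
    // Group 1: text between the first and second colon. Group 2: everything after.
    private static final Pattern FIELD = Pattern.compile("^:([^:]*):(.*)$");

    // Hypothetical helper: one ":key:value" field per line, in input order.
    static Map<String, String> parse(String... lines) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = FIELD.matcher(line);
            if (m.find()) {
                fields.put(m.group(1), m.group(2));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Sample lines are assumptions pieced together from the question's wording.
        Map<String, String> fields = parse(
                ":20:MY VALUES:ARE HERE",
                ":31C:150318",
                ":31D:150425 IN BANGLADESH");
        fields.forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

Because only the first two colons are treated as delimiters, the value for key 20 keeps its inner colon (MY VALUES:ARE HERE).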
You can use the following to split:
Try the split method of the String class:
String[] splited = string.split(":");
For your requirements:
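String c = ":31D:150425 IN BANGLADESH:todasdsa";
c = c.substring(1); // drop the leading colon, otherwise indexOf(":") is 0 and the key is empty
System.out.println("C=" + c);
String key = c.substring(0, c.indexOf(":"));
String value = c.substring(c.indexOf(":") + 1);
System.out.println("key=" + key + " value=" + value);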
C=31D:150425 IN BANGLADESH:todasdsa
key=31D value=150425 IN BANGLADESH:todasdsa
I want to extract only the file name from the complete file name plus timestamp. Below is the input.
String filePath = "fileName1_20150108.csv";
expected output should be: "fileName1"
String filePath2 = "fileName1_filedesc1_20150108_002_20150109013841.csv";
And expected output should be: "fileName1_filedesc1"
I wrote the code below in Java to get the file name. It works for the first case (filePath) but not for filePath2.
Pattern pattern = Pattern.compile(".*.(?=_)");
String filePath = "fileName1_20150108.csv";
String filePath2 = "fileName1_filedesc1_20150108_002_20150109013841.csv";
Matcher matcher = pattern.matcher(filePath);
while (matcher.find()) {
    System.out.print("Start index: " + matcher.start());
    System.out.print(" End index: " + matcher.end() + " ");
}
Can somebody please help me correct the regex so I can parse both file paths with the same regex?
Anchor the start, and make the .* non-greedy:
Update: change the second group (for fileDesc) to optional, and enforce that it starts with a non-digit character. This will work as long as your fileDesc strings never start with numbers.
You can get the characters before the first underscore, the first underscore, and then the characters until the next underscore:
This should work: "^(.*?)_([0-9_]*)\\.([^.]*)$"
It will give you 3 groups:
the base name (assuming no part of the base name consists only of digits)
the timestamp info
the extension.
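A small sketch of this regex applied to both inputs from the question. The class name FileNameParts and the method baseName are hypothetical names chosen for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FileNameParts {
    // Group 1: base name, group 2: digits/underscores (timestamp info), group 3: extension.
    private static final Pattern NAME =
            Pattern.compile("^(.*?)_([0-9_]*)\\.([^.]*)$");

    // Hypothetical helper: returns the base name, or null if the name does not fit.
    static String baseName(String fileName) {
        Matcher m = NAME.matcher(fileName);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(baseName("fileName1_20150108.csv"));
        System.out.println(baseName("fileName1_filedesc1_20150108_002_20150109013841.csv"));
    }
}
```

The lazy (.*?) grows one character at a time until the rest of the name, up to the dot, consists only of digits and underscores, which is why the second input yields fileName1_filedesc1.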
You can test here: http://fiddle.re/v0hne6 (RegexPlanet)
This is related to: RegEx: Grabbing values between quotation marks.
If there is a String like this:
HYPERLINK "hyperlink_funda.docx" \l "Sales"
The regex given on the link
is giving me
[" HYPERLINK ", " \l ", " "]
What regex will return the values enclosed in quotation marks (specifically between the \" marks)?
["hyperlink_funda.docx", "Sales"]
I'm using Java, with the String.split(String regex) approach.
That regex is not meant to be used with the .split() method. Instead, use a Pattern with capturing groups:
Pattern pattern = Pattern.compile("([\"'])((?:(?=(\\\\?))\\3.)*?)\\1");
Matcher matcher = pattern.matcher(" HYPERLINK \"hyperlink_funda.docx\" \\l \"Sales\" ");
while (matcher.find())
    System.out.println(matcher.group(2)); // group 2 holds the text between the quotes
I think you are misunderstanding the nature of the String.split method. Its job is to find a way of splitting a string by matching the features of the separator, not by matching features of the strings you want returned.
Instead you should use a Pattern and a Matcher:
String txt = " HYPERLINK \"hyperlink_funda.docx\" \\l \"Sales\" ";
String re = "\"([^\"]*)\"";
Pattern p = Pattern.compile(re);
Matcher m = p.matcher(txt);
ArrayList<String> matches = new ArrayList<String>();
while (m.find()) {
    String match = m.group(1); // text between the quotes
    matches.add(match);
}
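To make the difference between splitting and matching concrete, here is a small side-by-side sketch. The class name SplitVersusMatch and both method names are made up for the example, and the split pattern is an assumption, since the regex from the linked question is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitVersusMatch {
    // split() treats the quoted parts as separators, so they are thrown away.
    static String[] splitOnQuoted(String text) {
        return text.split("\"[^\"]*\"");
    }

    // A Matcher returns the quoted parts themselves (group 1 excludes the quotes).
    static List<String> quotedValues(String text) {
        List<String> values = new ArrayList<>();
        Matcher m = Pattern.compile("\"([^\"]*)\"").matcher(text);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String txt = " HYPERLINK \"hyperlink_funda.docx\" \\l \"Sales\" ";
        System.out.println(Arrays.toString(splitOnQuoted(txt)));
        System.out.println(quotedValues(txt));
    }
}
```

With this input, the split call leaves only the text outside the quotes, while the Matcher version collects hyperlink_funda.docx and Sales.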
I'm trying to extract part of a URL from text files.
For example:
/p/gnomecatalog/bugs/search/?q=status%3Aclosed-accepted+or+status%3Awont-fix+or+status%3Aclosed" class="search_bin"><span>Closed Tickets</span></a>
I would like to extract only
How could I do that using a regular expression? I tried with the regex
but it didn't work.
Try this:
it means: "/p"
then: all chars,
then: "/bugs",
then: all chars except "
You can use:
Java code:
String REGEX = "(\\/p\\/.*\\/bugs\\/.*?(?=\"))";
Pattern p = Pattern.compile(REGEX);
Matcher m = p.matcher(line);
while (m.find()) {
    String matched = m.group();
    System.out.println("Matched : " + matched);
}
Matched : /p/gnomecatalog/bugs/search/?q=status%3Aclosed-accepted+or+status%3Awont-fix+or+status%3Aclosed
Here's another way:
(?i)/p/[a-z/]+bugs/[^ "]+
The (?i) in the beginning makes the regex case insensitive so you don't have to worry about that. Then after bugs/ it will continue until it reaches either a space or a ".
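A short sketch of this pattern in Java, run against a shortened version of the example line from the question (the trailing HTML is cut off). The class name UrlPathExtractor and the method extract are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlPathExtractor {
    // Case-insensitive: "/p/", a project path, "bugs/", then anything up to a space or a quote.
    private static final Pattern PATH = Pattern.compile("(?i)/p/[a-z/]+bugs/[^ \"]+");

    // Hypothetical helper: returns the first match in the line, or null if there is none.
    static String extract(String line) {
        Matcher m = PATH.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "/p/gnomecatalog/bugs/search/?q=status%3Aclosed-accepted"
                + "+or+status%3Awont-fix+or+status%3Aclosed\" class=\"search_bin\"";
        System.out.println(extract(line));
    }
}
```

Note that [a-z/]+ only allows letters and slashes before bugs/, so a project name containing digits or hyphens would need a wider character class.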
Is it possible, in Java, to write a regex that matches the end of the string but not the newlines, using the Pattern.DOTALL option and searching for a line with \n?
I want to match
but in the third example I don't want to match the text after DDD.
Yes, there is. For example, (?-m)}$ will match a close-brace at the very end of a Java source file. The point is to disable multiline mode, so that $ no longer matches at the end of every line. You can disable it inline as shown, or by not passing Pattern.MULTILINE when you compile the Pattern.
UPDATE: Multiline mode is off by default when you compile a Pattern, but I believe it is on in Eclipse's find by regex.
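A minimal sketch of the effect of multiline mode on $, using a made-up input string. The class name DollarAnchorDemo and the method countMatches are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarAnchorDemo {
    // Hypothetical helper: counts how often the regex matches under the given flags.
    static int countMatches(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "aaa\nbbb\naaa";
        // Default flags: $ matches only at the end of the input.
        System.out.println(countMatches("aaa$", 0, input));
        // MULTILINE: $ also matches before each line terminator.
        System.out.println(countMatches("aaa$", Pattern.MULTILINE, input));
        // An inline (?-m) switches multiline mode off again.
        System.out.println(countMatches("(?-m)aaa$", Pattern.MULTILINE, input));
    }
}
```

The three calls should report 1, 2 and 1 matches respectively.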
The regex you need is:
Here is the full code:
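String[] sarr = {"aaa\n==test==\naaa\nbbb\naaa", "bbb\naaa==toast==cccdd\nb\nc",
        "aaa\n==trick==\naaaDDDaaa\nbbb"}; // third input, as shown in the output below
// Note: [^(?:DDD)] is a character class, so it stops at any single (, ?, :, D or ), not only at the sequence DDD.
Pattern pt = Pattern.compile("(?s)==(?!.*?==)([^(?:DDD)]*)");
for (String s : sarr) {
    Matcher m = pt.matcher(s);
    System.out.print("For input: [" + s + "] => ");
    if (m.find())
        System.out.println("Matched: [" + m.group(1) + ']');
    else
        System.out.println("Didn't Match");
}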
For input: [aaa\n==test==\naaa\nbbb\naaa] => Matched: [\naaa\nbbb\naaa]
For input: [bbb\naaa==toast==cccdd\nb\nc] => Matched: [cccdd\nb\nc]
For input: [aaa\n==trick==\naaaDDDaaa\nbbb] => Matched: [\naaa]
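The pattern above uses the character class [^(?:DDD)], which excludes the single characters (, ?, :, D and ) and does not test for the sequence DDD. Here is a hedged alternative sketch, assuming the goal is to capture the text after the last == up to, but not including, the literal sequence DDD, as the outputs above suggest. It swaps in a negative lookahead before each character; the class name AfterLastMarker and the method extract are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AfterLastMarker {
    // (?s): dot matches line terminators; (?!.*?==): this is the last "==";
    // (?:(?!DDD).)*: consume characters as long as "DDD" does not start here.
    private static final Pattern LAST =
            Pattern.compile("(?s)==(?!.*?==)((?:(?!DDD).)*)");

    // Hypothetical helper: returns the captured text, or null if there is no "==".
    static String extract(String input) {
        Matcher m = LAST.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "aaa\n==test==\naaa\nbbb\naaa",
            "bbb\naaa==toast==cccdd\nb\nc",
            "aaa\n==trick==\naaaDDDaaa\nbbb",
            "x==a==b(D)cDDDz" // made-up input with a single D and parentheses
        };
        for (String s : inputs) {
            System.out.println("[" + extract(s) + "]");
        }
    }
}
```

For the three inputs from the answer this captures the same text as before. The difference only shows on inputs like the fourth one, where a lone D or a parenthesis appears before DDD.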