I'm trying to write a regex pattern that will match a "digit~digit~string~sentence", e.g. 14~742091~065M998~P E ROUX 214. I've come up with the following so far:
String regex = "\\d+~?\\d+~?\\w+~?";
How do I extract the sentence after the last ~?
Use Capturing Groups:
\d+~?\d+~?\w+~(.*)
group(1) contains the part you want.
Another solution is using String#split:
String[] splitted = myString.split("~");
String res = splitted[splitted.length - 1]; // length is a field on arrays, not a method
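A minimal sketch of the split approach, in case it helps. The class and method names (LastFieldDemo, lastField) are made up for illustration, and it assumes the sentence itself never contains a ~. Passing -1 as the limit is my own choice so that an empty last field is kept rather than silently dropped:

```java
class LastFieldDemo {
    // Returns the text after the last '~', or the whole input if there is no '~'.
    static String lastField(String input) {
        // A negative limit keeps trailing empty strings, so "a~b~" yields "" and not "b".
        String[] parts = input.split("~", -1);
        return parts[parts.length - 1];
    }

    public static void main(String[] args) {
        // Sample input taken from the question
        System.out.println(lastField("14~742091~065M998~P E ROUX 214"));
    }
}
```

For the sample input this prints P E ROUX 214. Unlike the regex, it does not check that the first fields are digits.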
Use capturing groups (), as demonstrated in this pattern: "\\d+~\\d+~\\w+~(.*)". Note that you don't need the optional quantifier ? after each ~.
String input = "14~742091~065M998~P E ROUX 214";
Pattern pattern = Pattern.compile("\\d+~\\d+~\\w+~(.*)");
//Pattern pattern = Pattern.compile("(?:\\d+~){2}\\w+~(.*)"); (would also work)
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
System.out.println(matcher.group(1));
}
Prints:
P E ROUX 214
You should use a capturing group ( ) to extract the part you want:
.*~(.*)$
This simple regex should work for you.
Try the regexp below, assuming the sentence contains only alphanumeric characters and spaces:
^\d+~\d+~\w+~[\w\s]+
I would like to test if a string contains insert and name, with any interceding characters. And if it does, I would like to print the match.
For the below code, only the third Pattern matches, and the entire line is printed. How can I match only insert...name?
String x = "aaa insert into name sdfdf";
Matcher matcher = Pattern.compile("insert.*name").matcher(x);
if (matcher.matches())
System.out.print(matcher.group(0));
matcher = Pattern.compile(".*insert.*name").matcher(x);
if (matcher.matches())
System.out.print(matcher.group(0));
matcher = Pattern.compile(".*insert.*name.*").matcher(x);
if (matcher.matches())
System.out.print(matcher.group(0));
Try using a capturing group, like this: .*(insert.*name).*
Matcher matcher = Pattern.compile(".*(insert.*name).*").matcher(x);
if (matcher.matches()) {
System.out.print(matcher.group(1)); // group 1 is the parenthesized part
}
Or, in your case, you can just use:
x = x.replaceAll(".*(insert.*name).*", "$1");
Both of them print:
insert into name
You just need to use find() instead of matches() in your code:
String x = "aaa insert into name sdfdf";
Matcher matcher = Pattern.compile("insert.*?name").matcher(x);
if (matcher.find())
System.out.print(matcher.group(0));
matches() requires the regex to match the entire input string, whereas find() lets you match your regex anywhere in the input.
I also suggest using .*? instead of .*, in case your input may contain multiple instances of insert ... name pairs.
This code sample will output:
insert into name
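To make the difference concrete, here is a small sketch. FindVsMatchesDemo and firstInsertName are names I made up, and the second test string is hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVsMatchesDemo {
    // Returns the first "insert ... name" span found anywhere in the input, or null.
    static String firstInsertName(String input) {
        Matcher m = Pattern.compile("insert.*?name").matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String x = "aaa insert into name sdfdf";
        // matches() needs the pattern to cover the whole input, so this prints false
        System.out.println(Pattern.compile("insert.*?name").matcher(x).matches());
        // find() scans for the pattern anywhere: prints "insert into name"
        System.out.println(firstInsertName(x));
        // with two pairs, the reluctant .*? stops at the first "name": prints "insert a name"
        System.out.println(firstInsertName("insert a name, insert b name"));
    }
}
```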
Just use multiple positive lookaheads:
(?=.*insert)(?=.*name).+
See a demo on regex101.com.
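One thing worth knowing about the lookahead version: as far as I can tell it only tests that both words occur somewhere, in either order, and the .+ then matches the whole line rather than just insert...name. A quick sketch to check that (LookaheadDemo and containsBoth are hypothetical names, and the last two inputs are made up):

```java
import java.util.regex.Pattern;

class LookaheadDemo {
    private static final Pattern BOTH = Pattern.compile("(?=.*insert)(?=.*name).+");

    // True if the input contains both "insert" and "name", in any order.
    static boolean containsBoth(String input) {
        return BOTH.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsBoth("aaa insert into name sdfdf")); // true
        System.out.println(containsBoth("name before insert"));         // true, order is not checked
        System.out.println(containsBoth("insert only"));                // false
    }
}
```

So it answers the "does the string contain both" part, but for printing only the insert...name span a capturing group or find() is still needed.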
I have a condition like the one below, but it fails to match lines such as from table1; or insert into table1(col1,col2 ..)
if (Arrays.asList(line.split("\"")).contains("table1") ||
Arrays.asList(line.split(" ")).contains("table1"))
System.out.println(line);
What logic do I need to use?
Use a regular expression and place all the special characters which you need to split inside that expression.
if (Arrays.asList(line.split("[\",\\s.]")).contains("table1"))
Use a regex match as below
if (Arrays.asList(line.split("[\", .]")).contains("table1"))
Note that you can put whatever characters you want to split the line against in the square brackets.
You can use regex:
Pattern pat = Pattern.compile("(?<!\\p{L})table1(?!\\p{L})");
if (pat.matcher(line).find())
{
System.out.println(line);
}
If I understand your question properly, you can achieve it without using Splits:
String stringPattern = ".*table1.*";
Pattern pattern = Pattern.compile(stringPattern);
Matcher matcher = pattern.matcher(line);
if (matcher.matches())
System.out.println(line);
You can use a regexp with a negative lookahead and a negative lookbehind:
String input = "from table1;";
Pattern p = Pattern.compile("(?<![a-zA-Z0-9_])table1(?![a-zA-Z0-9_])");
Matcher matcher = p.matcher(input);
if (matcher.find())
System.out.println(input);
This will match any occurrence of "table1" that is not immediately preceded or followed by a letter, a digit or an underscore.
Try this:
if (Arrays.asList(line.split("[^a-zA-Z0-9_]")).contains("table1")) {
System.out.println(line);
}
Or as RealSkeptic suggests use regular expression matching:
if (line.matches(".*\\btable1\\b.*")) {
System.out.println(line);
}
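Here is a short sketch of the word-boundary idea run against the two lines from the question, plus two made-up lines that should not match. TableNameDemo and mentionsTable1 are hypothetical names:

```java
import java.util.regex.Pattern;

class TableNameDemo {
    // \b is a word boundary: table1 must not be glued to letters, digits or _
    private static final Pattern TABLE1 = Pattern.compile("\\btable1\\b");

    static boolean mentionsTable1(String line) {
        // find() looks for the word anywhere in the line
        return TABLE1.matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println(mentionsTable1("from table1;"));                     // true
        System.out.println(mentionsTable1("insert into table1(col1,col2 ..)")); // true
        System.out.println(mentionsTable1("from table10"));                     // false
        System.out.println(mentionsTable1("from my_table1"));                   // false
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex for every line, which String.matches would do.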
I ping a host and get its standard output as the result. Below is my regex, but it does not work correctly. Where did I make a mistake?
String REGEXP ="time=(\\\\d+)ms";
Pattern pattern = Pattern.compile(REGEXP);
Matcher matcher = pattern.matcher(result);
if (matcher.find()) {
result = matcher.group(1);
}
You only need \\d+ in your regex. A Matcher looks for the pattern it was created from and tries to find each occurrence of that pattern in the string being matched.
Use while (matcher.find()) and read matcher.group(1) inside the loop in case of multiple occurrences.
Each () represents a capturing group.
You have too many backslashes. Assuming you want to get the number from a string like "time=32ms", then you need:
String REGEXP ="time=(\\d+)ms";
Pattern pattern = Pattern.compile(REGEXP);
Matcher matcher = pattern.matcher(result);
if (matcher.find()) {
result = matcher.group(1);
}
Explanation: The search pattern you are looking for is "\d", meaning a decimal digit; the "+" means one or more occurrences.
To get the "\" to the matcher, it needs to be escaped in the Java string literal, and the escape character is also "\".
The parentheses define the capturing group that you want to pick out.
With "\\\\d+", the matcher sees this as "\\d+", which would match a backslash followed by one or more "d"s. The first backslash protects the second backslash, and the third protects the fourth.
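A small sketch that shows both points, the escaping and looping over several occurrences. The ping output below is invented for illustration (real output differs by platform), and PingTimeDemo and pingTimes are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PingTimeDemo {
    // Collects every integer that appears as time=<digits>ms in the output.
    static List<Integer> pingTimes(String output) {
        List<Integer> times = new ArrayList<>();
        // The Java literal "\\d" reaches the regex engine as \d (a digit)
        Matcher m = Pattern.compile("time=(\\d+)ms").matcher(output);
        while (m.find()) { // one iteration per occurrence
            times.add(Integer.parseInt(m.group(1)));
        }
        return times;
    }

    public static void main(String[] args) {
        // Made-up ping output, not taken from the question
        String output = "Reply from 10.0.0.1: bytes=32 time=32ms TTL=64\n"
                      + "Reply from 10.0.0.1: bytes=32 time=41ms TTL=64\n";
        System.out.println(pingTimes(output)); // [32, 41]
        // The literal "\\\\d" reaches the engine as \\d: a backslash, then the letter d
        System.out.println(Pattern.compile("time=(\\\\d+)ms").matcher(output).find()); // false
    }
}
```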
UPDATE: Thanks for all the great responses! I tried many different regex patterns but didn't understand why m.matches() was not doing what I think it should be doing. When I switched to m.find() instead, as well as adjusting the regex pattern, I was able to get somewhere.
I'd like to match a pattern in a Java string and then extract the portion matched using a regex (like Perl's $& operator).
This is my source string "s": DTSTART;TZID=America/Mexico_City:20121125T153000
I want to extract the portion "America/Mexico_City".
I thought I could use Pattern and Matcher and then extract using m.group() but it's not working as I expected. I've tried monkeying with different regex strings and the only thing that seems to hit on m.matches() is ".*TZID.*" which is pointless as it just returns the whole string. Could someone enlighten me?
Pattern p = Pattern.compile ("TZID*:"); // <- change to "TZID=([^:]*):"
Matcher m = p.matcher (s);
if (m.matches ()) // <- change to m.find()
Log.d (TAG, "looking at " + m.group ()); // <- change to m.group(1)
You use m.matches(), which tries to match the whole string. If you use m.find() instead, it will search for a match inside the string. I also tweaked your regexp to exclude the TZID= prefix using a zero-width lookbehind:
Pattern p = Pattern.compile("(?<=TZID=)[^:]+");
Matcher m = p.matcher ("DTSTART;TZID=America/Mexico_City:20121125T153000");
if (m.find()) {
System.out.println(m.group());
}
This should work nicely:
Pattern p = Pattern.compile("TZID=(.*?):");
Matcher m = p.matcher(s);
if (m.find()) {
String zone = m.group(1); // group 0 is the whole match; capturing groups start at 1
. . .
}
An alternative regex is "TZID=([^:]*)". I'm not sure which is faster.
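I haven't measured the speed either, but here is a quick sketch comparing what the two patterns capture. TzidDemo and firstGroup are hypothetical names; the first input is the one from the question and "TZID=UTC" is a made-up input with no colon:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TzidDemo {
    // Returns group 1 of the first match of regex in input, or null if there is no match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "DTSTART;TZID=America/Mexico_City:20121125T153000";
        // reluctant .*? stops at the first ':'
        System.out.println(firstGroup("TZID=(.*?):", s));
        // the negated class [^:] cannot cross a ':'
        System.out.println(firstGroup("TZID=([^:]*)", s));
        // without a trailing ':' only the second pattern matches
        System.out.println(firstGroup("TZID=(.*?):", "TZID=UTC"));
        System.out.println(firstGroup("TZID=([^:]*)", "TZID=UTC"));
    }
}
```

Both print America/Mexico_City for the question's string. They differ when no colon follows: the first pattern requires one, the second does not.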
You are using the wrong pattern, try this:
Pattern p = Pattern.compile(".*?TZID=([^:]+):.*");
Matcher m = p.matcher (s);
if (m.matches ())
Log.d (TAG, "looking at " + m.group(1));
Here .*? matches anything at the beginning up to TZID=, then TZID= matches literally. The group ([^:]+) captures everything up to the next :, then : matches and .* consumes the rest of the string, so matches() succeeds on the whole input. The part you need is in group(1).
You are missing a dot before the asterisk. As written, your expression matches TZI followed by any number of uppercase Ds and then a colon.
Pattern p = Pattern.compile ("TZID[^:]*:");
You should also add a capturing group unless you want to capture everything, including the "TZID" and the ":"
Pattern p = Pattern.compile ("TZID=([^:]*):");
Finally, you should use the right API to search the string, rather than attempting to match the string in its entirety.
Pattern p = Pattern.compile("TZID=([^:]*):");
Matcher m = p.matcher("DTSTART;TZID=America/Mexico_City:20121125T153000");
if (m.find()) {
System.out.println(m.group(1));
}
This prints
America/Mexico_City
Why not simply use split as:
String origStr = "DTSTART;TZID=America/Mexico_City:20121125T153000";
String str = origStr.split(":")[0].split("=")[1];
I have a regex in Java:
Pattern pattern = Pattern.compile("<string>text</string><string>.+</string>");
Matcher matcher = pattern.matcher(ganzeDatei);
while (matcher.find()) {
String string = matcher.group();
...
This works fine, but the output is something like
<string>text</string><string>Name</string>
But I just want this: Name
How can I do this?
Capture the text you want to return by wrapping it in parentheses, so in this example your regex should become
<string>text</string><string>(.+)</string>
Then you can access the text that matched between the parentheses with
matcher.group(1)
The no-arg group method you are calling returns the entire portion of the input text that matches your pattern, whereas you want just the subsequence of it that matches a capturing group (the parentheses).
Then do this:
Pattern pattern = Pattern.compile("<string>text</string><string>(.+)</string>");
Matcher matcher = pattern.matcher(ganzeDatei);
while (matcher.find()) {
String string = matcher.group(1);
...
Reference:
Java Tutorial: Regex
Pattern JavaDoc: Capturing Groups
Matcher JavaDoc: Matcher.group(n)
Matcher JavaDoc: Matcher.group()
You must put the text you want to obtain into parentheses and read it with group(1). So use:
<string>(.+)</string>
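One caveat, sketched below: if several entries sit on the same line, the greedy (.+) can run on to the last </string>. Switching to the reluctant (.+?) is my suggestion, not something from the answers above. The input string is invented, and StringTagDemo and names are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StringTagDemo {
    // Collects group 1 of every match of regex in input.
    static List<String> names(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up input with two entries on one line
        String data = "<string>text</string><string>Anna</string>"
                    + "<string>text</string><string>Ben</string>";
        // greedy: a single match whose group runs from "Anna" through "Ben"
        System.out.println(names("<string>text</string><string>(.+)</string>", data));
        // reluctant: two matches, prints [Anna, Ben]
        System.out.println(names("<string>text</string><string>(.+?)</string>", data));
    }
}
```

If each entry is on its own line the greedy version behaves the same, since . does not match line breaks by default.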