How to match a string in a line when it is immediately followed by a special character? - java

I have the condition below, but it fails to match lines such as from table1; or insert into table1(col1,col2 ..)
if (Arrays.asList(line.split("\"")).contains("table1") ||
Arrays.asList(line.split(" ")).contains("table1"))
System.out.println(line);
Which logic do I need to follow?

Use a regular expression and put all the special characters you need to split on inside a character class.
if (Arrays.asList(line.split("[\",\\s.]")).contains("table1"))

Use a regex split as below:
if (Arrays.asList(line.split("[\", .]")).contains("table1"))
Note that you can put whatever characters you want to split the line on inside the square brackets.

You can use regex:
Pattern pat = Pattern.compile("(?<!\\p{L})table1(?!\\p{L})");
if (pat.matcher(line).find())
{
System.out.println(line);
}

If I understand your question properly, you can achieve it without using Splits:
String stringPattern = ".*table1.*";
Pattern pattern = Pattern.compile(stringPattern);
Matcher matcher = pattern.matcher(line);
if (matcher.matches())
System.out.println(line);

You can use a regexp with negative lookahead and negative lookbehind:
String input = "from table1;";
Pattern p = Pattern.compile("(?<![a-zA-Z0-9_])table1(?![a-zA-Z0-9_])");
Matcher matcher = p.matcher(input);
if (matcher.find())
System.out.println(input);
This will match any occurrence of "table1" that is not immediately preceded or followed by a letter, a digit or an underscore.

Try this:
if (Arrays.asList(line.split("[^a-zA-Z0-9_]")).contains("table1")) {
System.out.println(line);
}
Or, as RealSkeptic suggests, use regular expression matching:
if (line.matches(".*\\btable1\\b.*")) {
System.out.println(line);
}
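To compare the answers above on the two lines from the question, here is a minimal sketch of the lookaround approach. The class name TableNameMatcher and the method mentionsTable1 are made up for illustration, and the extra sample lines (table10, my_table1) are assumptions added to show what the pattern rejects.

```java
import java.util.regex.Pattern;

class TableNameMatcher {
    // "table1" with no letter, digit or underscore directly before or after it.
    private static final Pattern TABLE1 =
            Pattern.compile("(?<![a-zA-Z0-9_])table1(?![a-zA-Z0-9_])");

    // Hypothetical helper: true if the line mentions table1 as a whole identifier.
    static boolean mentionsTable1(String line) {
        return TABLE1.matcher(line).find();
    }

    public static void main(String[] args) {
        String[] lines = {
            "select * from table1;",          // followed by ';'
            "insert into table1(col1,col2)",  // followed by '('
            "select * from table10;",         // a different, longer name
            "select * from my_table1"         // table1 is only a suffix here
        };
        for (String line : lines) {
            System.out.println(mentionsTable1(line) + "  " + line);
        }
    }
}
```

The first two lines are reported as matches and the last two are not, which is the behaviour the plain .*table1.* pattern cannot give, since it also accepts table10.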

Related

Extracting some pattern using regex

I'm trying to write a regex pattern that will match a "digit~digit~string~sentence", e.g. 14~742091~065M998~P E ROUX 214. I've come up with the following so far:
String regex = "\\d+~?\\d+~?\\w+~?";
How do I extract the sentence after the last ~?
Use Capturing Groups:
\d+~?\d+~?\w+~(.*)
group(1) contains the part you want.
Another solution is using String#split:
String[] splitted = myString.split("~");
String res = splitted[splitted.length - 1];
Use capturing groups (), as demonstrated in this pattern: "\\d+~\\d+~\\w+~(.*)". Note that you don't need the optional quantifier ? after each ~, since the separators are always present.
String input = "14~742091~065M998~P E ROUX 214";
Pattern pattern = Pattern.compile("\\d+~\\d+~\\w+~(.*)");
//Pattern pattern = Pattern.compile("(?:\\d+~){2}\\w+~(.*)"); (would also work)
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
System.out.println(matcher.group(1));
}
Prints:
P E ROUX 214
You should use ( ) to capture the part you want to extract:
.*~(.*)$
This simple regex should work for you.
Try the regexp below if the sentence only contains alphanumeric characters and spaces:
^\d+~\d+~\w+~[\w\s]+
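As a rough sketch of the capturing-group answers above, the pattern can also be wrapped in a small helper that returns null when the line does not have the expected shape. The class name TildeRecord and the method sentenceOf are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TildeRecord {
    // digits ~ digits ~ word characters ~ free text (captured as group 4)
    private static final Pattern RECORD =
            Pattern.compile("(\\d+)~(\\d+)~(\\w+)~(.*)");

    // Hypothetical helper: the text after the third '~', or null if the
    // line does not look like digit~digit~string~sentence.
    static String sentenceOf(String line) {
        Matcher m = RECORD.matcher(line);
        return m.matches() ? m.group(4) : null;
    }

    public static void main(String[] args) {
        System.out.println(sentenceOf("14~742091~065M998~P E ROUX 214"));
    }
}
```

Because matches() must consume the whole input, a line with fewer fields is rejected rather than partially matched.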

Find string after last underscore before dot extension

I need to find 20140809T0000Z in this string:
PREVIMER_F2-MARS3D-MENOR1200_20140809T0000Z.nc
I tried the following to keep the string before the .nc:
(?<=_)(.*)(?=.nc)
I have the following to start from the last underscore:
/_[^_]*$/
How can I find string after last underscore before dot extension, using a regex?
RegEx is not always the best solution... :)
String pattern="PREVIMER_F2-MARS3D-MENOR1200_20140809T0000Z.nc";
int start=pattern.lastIndexOf("_") + 1;
int end=pattern.lastIndexOf(".");
if(start != 0 && end != -1 && end > start) {
System.out.println(pattern.substring(start, end));
}
You just need lookahead for this requirement.
You can use:
[^._]+(?=[^_]*$)
// matches and returns 20140809T0000Z
You could use the below regex,
(?<=_)[^_]*(?=\.nc)
In your pattern just replace .* with [^_]* so that it would match the inner string.
String s = "PREVIMER_F2-MARS3D-MENOR1200_20140809T0000Z.nc";
Pattern regex = Pattern.compile("(?<=_)[^_]*(?=\\.nc)");
Matcher regexMatcher = regex.matcher(s);
if (regexMatcher.find()) {
String ResultString = regexMatcher.group();
System.out.println(ResultString);
} //=> 20140809T0000Z
You could use a simpler pattern with a capturing group
.*_(.*)\.nc
By default the first .* will be "greedy" and consume as many characters as possible before the _, leaving just the desired string inside the (.*).
Demo: http://regex101.com/r/aI2xQ9/1
Java code:
String input = "PREVIMER_F2-MARS3D-MENOR1200_20140809T0000Z.nc";
Pattern pattern = Pattern.compile(".*_(.*)\\.nc");
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
String group = matcher.group(1);
// ...
}
So, you need a sequence of non-underscore characters that immediately precede the period character.
Try [^_.]+(?=\.)
Demo: https://regex101.com/r/sLAnVs/2
Thanks to Cary Swoveland for pointing out that "no need to escape a period in a character class".
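The last pattern is given without Java code, so here is a minimal sketch that puts it next to the lastIndexOf approach from the first answer. The class name LastSegment and both method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastSegment {
    // A run of characters that are neither '_' nor '.', directly followed by a dot.
    private static final Pattern SEGMENT = Pattern.compile("[^_.]+(?=\\.)");

    // Regex version: returns the first such run, or null if there is none.
    static String regexVersion(String fileName) {
        Matcher m = SEGMENT.matcher(fileName);
        return m.find() ? m.group() : null;
    }

    // Index version: text between the last '_' and the last '.', or null.
    static String indexVersion(String fileName) {
        int start = fileName.lastIndexOf('_') + 1;
        int end = fileName.lastIndexOf('.');
        return (start > 0 && end > start) ? fileName.substring(start, end) : null;
    }

    public static void main(String[] args) {
        String name = "PREVIMER_F2-MARS3D-MENOR1200_20140809T0000Z.nc";
        System.out.println(regexVersion(name));
        System.out.println(indexVersion(name));
    }
}
```

Both versions agree on the file name from the question. The regex returns the first segment that precedes a dot, so for names with more than one dot the two versions can differ.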

Java Regex Find All Between But Last Character Not Preceded Or Followed By

I have a string: "stuffhere{# name="productViewer" vars="productId={{id}}"}morestuff"
How can I find everything between the beginning { and the last }?
Pattern.compile("\\{#(.*?)\\}" + from, Pattern.DOTALL); //Finds {# name="productViewer" vars="productId={{id}
How can I verify that the ending } is not preceded or followed by another }? The string may also be surrounded by other characters.
I'd like the regex to return only: name="productViewer" vars="productId={{id}}"
You can use this pattern:
\\{#(.*)(?<!\\})\\}
(?<!..) is a negative lookbehind that checks your condition (not preceded by })
Note that closing curly brackets don't need to be escaped, you can write:
\\{#(.*)(?<!})}
Try this:
^[^{]*\\{#(.*?)\\}[^}]*$
How about
(?<=[{])[^{}]+
I have never used Java, but regex is international, isn't it? :)
EDIT:
Wait, that regex has errors.
Try this:
String s = "stuffhere{# name=\"productViewer\" vars=\"productId={{id}}\"}morestuff";
Pattern p = Pattern.compile("\\{#\\s+(.*)\\}");
Matcher m = p.matcher(s);
if(m.find()){
System.out.println(m.group(1));
}
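The question asks for a closing } that is neither preceded nor followed by another }. A minimal sketch of that, combining the lookbehind from the first answer with a matching lookahead, could look like this. The class name TagBody and the method bodyOf are hypothetical, and the \\s* that skips the space after {# is an assumption about the desired output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagBody {
    // "{#", optional whitespace, anything (greedy), then a "}" that has
    // no "}" directly before it (lookbehind) or after it (lookahead).
    private static final Pattern TAG =
            Pattern.compile("\\{#\\s*(.*)(?<!\\})\\}(?!\\})", Pattern.DOTALL);

    // Hypothetical helper: the text inside {# ... }, or null if there is no tag.
    static String bodyOf(String text) {
        Matcher m = TAG.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "stuffhere{# name=\"productViewer\" vars=\"productId={{id}}\"}morestuff";
        System.out.println(bodyOf(s));
    }
}
```

Because (.*) is greedy, an input containing several {# ... } tags would be captured from the first {# to the last qualifying }, so this sketch assumes one tag per string.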

Regex doesn't work in String.matches()

I have this small piece of code
String[] words = {"{apf","hum_","dkoe","12f"};
for(String s:words)
{
if(s.matches("[a-z]"))
{
System.out.println(s);
}
}
Supposed to print
dkoe
but it prints nothing!!
Welcome to Java's misnamed .matches() method... It tries and matches ALL the input. Unfortunately, other languages have followed suit :(
If you want to see if the regex matches an input text, use a Pattern, a Matcher and the .find() method of the matcher:
Pattern p = Pattern.compile("[a-z]");
Matcher m = p.matcher(inputstring);
if (m.find())
// match
If what you want is indeed to see if an input only has lowercase letters, you can use .matches(), but you need to match one or more characters: append a + to your character class, as in [a-z]+. Or use ^[a-z]+$ and .find().
[a-z] matches a single char between a and z. So, if your string was just "d", for example, then it would have matched and been printed out.
You need to change your regex to [a-z]+ to match one or more chars.
String.matches returns whether the whole string matches the regex, not just any substring.
String.matches() (like Matcher.matches()) tries to match the whole string.
That's different from Perl regexes, which try to find a matching part; in Java, Matcher.find() does that.
If you want to find a string with nothing but lowercase characters, use the pattern [a-z]+
If you want to find a string containing at least one lowercase character, use the pattern .*[a-z].*
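The difference between the two behaviours described above can be seen side by side in a minimal sketch using the word list from the question. The class name MatchesVsFind and the two method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class MatchesVsFind {
    private static final Pattern LOWER = Pattern.compile("[a-z]+");

    // Words that consist only of lowercase ASCII letters (whole-string match).
    static List<String> allLowercase(String[] words) {
        List<String> result = new ArrayList<>();
        for (String w : words) {
            if (LOWER.matcher(w).matches()) {
                result.add(w);
            }
        }
        return result;
    }

    // Words that contain at least one lowercase ASCII letter (substring match).
    static List<String> anyLowercase(String[] words) {
        List<String> result = new ArrayList<>();
        for (String w : words) {
            if (LOWER.matcher(w).find()) {
                result.add(w);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] words = {"{apf", "hum_", "dkoe", "12f"};
        System.out.println(allLowercase(words)); // [dkoe]
        System.out.println(anyLowercase(words)); // [{apf, hum_, dkoe, 12f]
    }
}
```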
This is what I used:
String[] words = {"{apf","hum_","dkoe","12f"};
for(String s:words)
{
if(s.matches("[a-z]+"))
{
System.out.println(s);
}
}
I have faced the same problem once:
Pattern ptr = Pattern.compile("^[a-zA-Z][\\']?[a-zA-Z\\s]+$");
The above failed!
Pattern ptr = Pattern.compile("(^[a-zA-Z][\\']?[a-zA-Z\\s]+$)");
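The above worked with the pattern wrapped in ( and ), although a capturing group around the whole pattern does not change what it matches, so the difference probably came from somewhere else.
Your regular expression [a-z] doesn't match dkoe since it only matches Strings of length 1. Use something like [a-z]+.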
Require one or more characters, like this (the capturing group and the anchors are optional here, because matches() already tests the whole string):
String[] words = {"{apf","hum_","dkoe","12f"};
for(String s:words)
{
if(s.matches("(^[a-z]+$)"))
{
System.out.println(s);
}
}
You can make your pattern case insensitive by doing:
Pattern p = Pattern.compile("[a-z]+", Pattern.CASE_INSENSITIVE);
