How to make regex pattern for some scenarios - java

I am writing a Word XML parser in Java.
Now I want to check whether a pattern like (F(1) = 44) occurs in a paragraph.
Note: the inner parentheses must contain an integer value.
These are the patterns I need to check:
(text text (text) text)
(F(1) = 44)
(text text [text] text)
[text text (text) text]
However, I don't know how to write a regex pattern for the scenarios above.
Please suggest an approach.

You can use this regex \([a-zA-Z]+\(\d+\)\s*=\s*\d+\), which means:
one or more letters [a-zA-Z]+
followed by one or more digits between parentheses \(\d+\)
followed by zero or more whitespace characters \s*
followed by an equals sign =
followed by zero or more whitespace characters \s*
followed by one or more digits \d+
all of this between parentheses \([a-zA-Z]+\(\d+\)\s*=\s*\d+\)
You can use it with Pattern like this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
String[] texts = {
    "(text text (text) text)",
    "(F(1) = 44)",
    "(text text [text] text)",
    "[text text (text) text]"
};
// Same pattern as above, with backslashes doubled for a Java string literal.
String regex = "\\([a-zA-Z]+\\(\\d+\\)\\s*=\\s*\\d+\\)";
// Compile once, outside the loop.
Pattern pattern = Pattern.compile(regex);
for (String s : texts) {
    Matcher matcher = pattern.matcher(s);
    if (matcher.find()) {
        System.out.println("Match found: " + matcher.group());
    } else {
        System.out.println("No match occurred");
    }
}
Output:
No match occurred
Match found: (F(1) = 44)
No match occurred
No match occurred
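Since the question asks whether the pattern occurs inside a paragraph, a small helper that collects every occurrence may be handy. This is only a sketch: the class name FormulaFinder and the method findFormulas are hypothetical and not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FormulaFinder {
    // Compiled once and reused; letters, an integer in parentheses, '=', an integer.
    private static final Pattern FORMULA =
            Pattern.compile("\\([a-zA-Z]+\\(\\d+\\)\\s*=\\s*\\d+\\)");

    // Hypothetical helper: returns every "(F(1) = 44)"-style occurrence, in order.
    static List<String> findFormulas(String paragraph) {
        List<String> found = new ArrayList<>();
        Matcher matcher = FORMULA.matcher(paragraph);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String paragraph = "See (F(1) = 44) and (text [text] text), then (G(20)=7).";
        System.out.println(findFormulas(paragraph)); // [(F(1) = 44), (G(20)=7)]
    }
}
```

An empty list means the paragraph contains no such pattern.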

Related

Regex - get second word after first match

I'm trying to parse a simple DDL statement. First I'm trying to pull the table name out.
The syntax will be something like 'CREATE TABLE DB_NAME.TABLE_NAME'
So far I've got this:
String line = "CREATE TABLE DB_NAME.T_NAME";
String pattern = ".*?\\bTABLE\\s+(\\w+)\\b.*";
System.out.println(line.replaceFirst(pattern, "$1"));
That gives me back "DB_NAME". How can I get it to give me back "T_NAME"?
I tried following the update in this answer, but I couldn't get it to work, probably due to my very limited regex skills.
What about something like this:
.*?\\bTABLE\\s+\\w+\\.(\\w+)\\b.*
It first matches the TABLE keyword with .*?\\bTABLE\\s+. Then it matches DB_NAME. with \\w+\\.. Finally it matches and captures T_NAME with (\\w+)
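Plugged into the replaceFirst call from the question, that pattern could be used as sketched here. The class name TableNameExtractor and the method tableName are hypothetical, and the sketch assumes the table name is always qualified as DB_NAME.TABLE_NAME.

```java
class TableNameExtractor {
    // Skips the "DB_NAME." qualifier and captures only the table name in group 1.
    private static final String PATTERN = ".*?\\bTABLE\\s+\\w+\\.(\\w+)\\b.*";

    // Hypothetical helper: replaces the whole line with the captured table name.
    static String tableName(String ddl) {
        return ddl.replaceFirst(PATTERN, "$1");
    }

    public static void main(String[] args) {
        System.out.println(tableName("CREATE TABLE DB_NAME.T_NAME")); // T_NAME
    }
}
```

Note that replaceFirst returns the input unchanged when the pattern does not match, for example when the table name has no database qualifier.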
Here's a small piece of code that will do it (using named capturing groups):
String line = "CREATE TABLE DB_NAME.T_NAME";
Pattern pattern = Pattern.compile("CREATE TABLE (?<database>\\w+)\\.(?<table>\\w+)");
Matcher matcher = pattern.matcher(line);
if (matcher.matches()) {
    String database = matcher.group("database"); // DB_NAME
    String table = matcher.group("table"); // T_NAME
}
You may capture the whole name after TABLE in a group and then split it on the dot to get the individual values:
String line = "CREATE TABLE DB_NAME.T_NAME";
String pattern = "\\bTABLE\\s+(\\w+(?:\\.\\w+)*)";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(line);
if (m.find()) {
    System.out.println(Arrays.toString(m.group(1).split("\\.")));
    // => [DB_NAME, T_NAME]
}
See the Java demo.
If you are sure of the incoming format of the string, you might even use
"\\bTABLE\\s+(\\S+)"
See another Java demo.
While \w+(?:\.\w+)* matches 1+ word chars followed with 0+ repetitions of . and 1+ word chars, \S+ plainly matches 1+ non-whitespace chars.
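The difference between the two patterns shows up when something follows the table name without a space. A minimal sketch for comparing them, where the class TableTokenComparison and its methods are hypothetical names and the sample input is an assumption:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TableTokenComparison {
    private static final Pattern STRICT = Pattern.compile("\\bTABLE\\s+(\\w+(?:\\.\\w+)*)");
    private static final Pattern LOOSE = Pattern.compile("\\bTABLE\\s+(\\S+)");

    // Returns group 1 of the first match, or null when the pattern does not match.
    static String firstGroup(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    static String strict(String line) {
        return firstGroup(STRICT, line);
    }

    static String loose(String line) {
        return firstGroup(LOOSE, line);
    }

    public static void main(String[] args) {
        String line = "CREATE TABLE DB_NAME.T_NAME(id INT)";
        System.out.println(strict(line)); // DB_NAME.T_NAME
        System.out.println(loose(line));  // DB_NAME.T_NAME(id
    }
}
```

The \S+ variant is only safe when whitespace always follows the table name.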

Extracting the next line data of a file if regex pattern matches in Java

I want to extract the next line of a text file when a regex pattern matches in Java. I am able to detect and match the pattern in the text file, but I am unable to print the line that follows the match.
Test data:
*** Explorer
GenV Deno Znet
Regular expression for matching the Explorer line:
[*\\+]+[\\s+]+[Explorer]+[:]
Please help me get the next line when the *** Explorer pattern is found.
You can use this regex with a capturing group for the next line after your search pattern:
[*]+\s+Explorer\R(.*)
The next line is captured in group #1.
Regex breakdown:
[*]+\s+Explorer - Match your search pattern
\R - Match any line break sequence (for example \n or \r\n)
(.*) - Match and capture the full next line in group #1
In Java use:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
final String regex = "[*]+\\s+Explorer\\R(.*)";
final String input = "*** Explorer\nGenV Deno Znet";
final Pattern pattern = Pattern.compile(regex);
final Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("match: " + matcher.group(1));
}
You can use:
[*\s]+Explorer\n(.*?)$
Well, for starters, the regex in the question is not going to match "*** Explorer".
It will match "*** Explorer:" because it requires a trailing colon.
If this is Java, can't you just read the next line?
while ((lineText = lineReader.readLine()) != null) {
    hasMatch = lineText.matches(regex);
    if (hasMatch) {
        lineText = lineReader.readLine();
        System.out.println(lineText);
    }
}
Works for me.
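The snippet above leaves out its declarations. A self-contained sketch of the same read-the-next-line idea might look like this, assuming the header line has no trailing colon (as in the test data) and using hypothetical names NextLineReader and linesAfterHeader:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class NextLineReader {
    // Header line: one or more '*', whitespace, then the word Explorer.
    private static final Pattern HEADER = Pattern.compile("[*]+\\s+Explorer");

    // Collects the line that follows each header line.
    static List<String> linesAfterHeader(BufferedReader reader) throws IOException {
        List<String> result = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (HEADER.matcher(line).matches()) {
                String next = reader.readLine();
                if (next != null) { // the header may be the last line of the input
                    result.add(next);
                }
            }
        }
        return result;
    }

    // Convenience overload for in-memory text; wraps the checked exception.
    static List<String> linesAfterHeader(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return linesAfterHeader(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(linesAfterHeader("*** Explorer\nGenV Deno Znet")); // [GenV Deno Znet]
    }
}
```

For a real file, the BufferedReader overload can be given a reader opened on that file instead of a StringReader.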

Regular expression for extracting instance ID, AMI ID, Volume ID

Given the following string
Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305
I want to be able to extract the following using a regular expression
i-b9b4ffaa
ami-dbcf88b1
vol-e97db305
This is the regular expression I came up with, which currently doesn't do what I need :
Pattern p = Pattern.compile("Created by CreateImage([a-z]+[0.9]+)([a-z]+[0.9]+)([a-z]+[0.9]+)",Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305");
System.out.println(m.matches()); // prints false
You may match all words that start with letters, followed by a hyphen and then alphanumeric chars:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
Pattern pattern = Pattern.compile("(?i)\\b[a-z]+-[a-z0-9]+");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(0));
}
// => i-b9b4ffaa, ami-dbcf88b1, vol-e97db305 (one per line)
See the Java demo
Pattern details:
(?i) - a case insensitive modifier (embedded flag option)
\\b - a word boundary
[a-z]+ - 1 or more ASCII letters
- - a hyphen
[a-z0-9]+ - 1 or more alphanumerics.
To make sure these values appear on the same line after Created by CreateImage, use a \G-based regex:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
Pattern pattern = Pattern.compile("(?i)(?:Created by CreateImage|(?!\\A)\\G)(?:(?!\\b[a-z]+-[a-z0-9]+).)*\\b([a-z]+-[a-z0-9]+)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
See this demo.
Note that the above pattern relies on the \G operator, which matches at the end of the last successful match (so we only match after a previous match or after Created...), and on a tempered greedy token, (?:(?!\\b[a-z]+-[a-z0-9]+).)*, which matches any character other than a line break that does not start a sequence of word boundary + letters + hyphen + letters or digits. This construct is very resource consuming.
Consider a two-step approach instead: first check whether the string starts with Created by CreateImage, and then process it:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
if (s.startsWith("Created by CreateImage")) {
    Matcher n = Pattern.compile("(?i)\\b[a-z]+-[a-z0-9]+").matcher(s);
    while (n.find()) {
        System.out.println(n.group(0));
    }
}
See another demo
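For comparison, the attempt in the question fails for three reasons: the unescaped parentheses are treated as groups rather than literal characters, [0.9] matches only the characters 0, . and 9, and the literal words for and from are missing. A sketch of a single pattern with three capturing groups, assuming each ID is its prefix, a hyphen and lowercase hex digits (the class name CreateImageParser and the method parse are hypothetical):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CreateImageParser {
    // Assumption: IDs look like i-<hex>, ami-<hex> and vol-<hex>.
    private static final Pattern DESCRIPTION = Pattern.compile(
            "Created by CreateImage\\((i-[0-9a-f]+)\\) for (ami-[0-9a-f]+) from (vol-[0-9a-f]+)");

    // Returns {instanceId, amiId, volumeId}, or null when the text does not match.
    static String[] parse(String description) {
        Matcher m = DESCRIPTION.matcher(description);
        if (!m.matches()) {
            return null;
        }
        return new String[] {m.group(1), m.group(2), m.group(3)};
    }

    public static void main(String[] args) {
        String[] ids = parse("Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305");
        System.out.println(String.join(", ", ids)); // i-b9b4ffaa, ami-dbcf88b1, vol-e97db305
    }
}
```

Unlike the find-all approach, this version also tells you which value is the instance, the AMI and the volume, at the cost of being tied to this exact sentence layout.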

Java pattern matching using regex

I am new to Java and to pattern matching. I am reading this string from a file; written directly as a Java literal with unescaped quotes it would give a compilation error, so it is shown escaped here:
String str = "find(\"128.210.16.48\",\"Hello Everyone\")" ; // no compile error
I want to extract the values 128.210.16.48 and Hello Everyone from the string above. These values are not constant.
Can you please give me some suggestions?
Thanks
I suggest using the String#split() method, but if you still want a regex pattern, try this one and take the matched group at index 1:
("[^"][\d\.]+"|"[^)]*+)
Sample code:
String str = "find(\"128.210.16.48\",\"Hello Everyone\")";
String regex = "(\"[^\"][\\d\\.]+\"|\"[^)]*+)";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
output:
"128.210.16.48"
"Hello Everyone"
Pattern explanation:
(            group and capture to \1:
  "          '"'
  [^"]       any character except: '"'
  [\d\.]+    any character of: digits (0-9), '\.' (1 or more times (matching the most amount possible))
  "          '"'
 |           OR
  "          '"'
  [^)]*+     any character except: ')' (0 or more times, possessively (matching the most amount possible without giving characters back))
)            end of \1
Try with String.split()
String str = "find(\"128.210.16.48\",\"Hello Everyone\")" ;
System.out.println(str.split(",")[0].split("\"")[1]);
System.out.println(str.split(",")[1].split("\"")[1]);
Output:
128.210.16.48
Hello Everyone
Edit:
Explanation:
To get the first value, split the string on the comma (,) and take the first element, str.split(",")[0]; then split that element on the double quote (") with split("\"") and take the second element, [1]. The second value is obtained the same way from str.split(",")[1].
The accepted answer is fine, but if you (or whoever finds this question) still want to use a regex instead of String.split, here's something:
String str = "find(\"128.210.16.48\",\"Hello Everyone\")" ; // no compile error
String regex1 = "\".+?\"";
Pattern pattern1 = Pattern.compile(regex1);
Matcher matcher1 = pattern1.matcher(str);
while (matcher1.find()) {
    System.out.println("Matcher 1 found (trimmed): " + matcher1.group().replace("\"", ""));
}
Output:
Matcher 1 found (trimmed): 128.210.16.48
Matcher 1 found (trimmed): Hello Everyone
Note: this will only work if " is only used as a separator character. See Braj's demo as an example from the comments here.
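An alternative to stripping the quotes afterwards is to capture only the text between them. A minimal sketch, with the hypothetical names QuotedArguments and extract, that carries the same limitation: it assumes " is used only as a delimiter and never appears inside a value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedArguments {
    // A quote, then any run of non-quote characters (captured), then a quote.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Hypothetical helper: returns the contents of every quoted value, without the quotes.
    static List<String> extract(String call) {
        List<String> values = new ArrayList<>();
        Matcher matcher = QUOTED.matcher(call);
        while (matcher.find()) {
            values.add(matcher.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String str = "find(\"128.210.16.48\",\"Hello Everyone\")";
        System.out.println(extract(str)); // [128.210.16.48, Hello Everyone]
    }
}
```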

Regex for matching pattern within quotes

I have some input data such as
some string with 'hello' inside 'and inside'
How can I write a regex so that every occurrence of quoted text is returned, no matter how many times it appears?
I have code that returns a single quoted string, but I want it to return multiple occurrences:
String mydata = "some string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'(.*?)+'");
Matcher matcher = pattern.matcher(mydata);
while (matcher.find()) {
    System.out.println(matcher.group());
}
This finds all occurrences for me:
String mydata = "some '' string with 'hello' inside 'and inside'";
Pattern pattern = Pattern.compile("'[^']*'");
Matcher matcher = pattern.matcher(mydata);
while (matcher.find()) {
    System.out.println(matcher.group());
}
Output:
''
'hello'
'and inside'
Pattern description:
' // opening quote
[^'] // any character that is not a single quote
* // zero or more of the preceding non-quote characters
' // closing quote
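I believe this should fit your requirements, as long as the quoted text contains only word characters (\w does not match spaces, so 'and inside' would be skipped):
\'\w+\'
\'.*?' is the regex you are looking for.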
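The three suggested patterns do not behave identically on the sample input. A minimal sketch for comparing them side by side, using a hypothetical helper findAll that is not part of any of the answers:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotePatternComparison {
    // Hypothetical helper: returns every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String mydata = "some '' string with 'hello' inside 'and inside'";
        System.out.println(findAll("'[^']*'", mydata)); // ['', 'hello', 'and inside']
        System.out.println(findAll("'\\w+'", mydata));  // ['hello']
        System.out.println(findAll("'.*?'", mydata));   // ['', 'hello', 'and inside']
    }
}
```

On a single line '[^']*' and '.*?' give the same result; they differ when the quoted text contains a line break, because . does not match line terminators by default while [^'] does.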
