forming correct regular expression in dynamic string - java

I have a FileInputStream that reads a file which somewhere contains a fragment looking like this:
...
OperatorSpecific(XXX)
{
Customer(someContent)
SaveImage()
{
...
I would like to identify the Customer(someContent) part of the string and switch the someContent inside the parenthesis for something else.
someContent will be a dynamic parameter and will contain a string of maybe 5-10 chars.
I have used regEx before, like once or twice, but I feel that in a context such as this where I don't know what value will be inside the parenthesis I'm at a loss of how I should express it...
In summary I want to have a string returned to me which has my someContent value inside the Customer-parenthesis.
Does anyone have any bright ideas of how to get this done?

Try this one (double the escaping backslashes for use in Java):
(?<=Customer\()[^\)]*
And replace with your content.
See it here at Regexr
(?<=Customer\() is a lookbehind assertion. At every position it checks whether "Customer(" is directly to the left. If so, [^\)]* matches all following characters that are not a ")"; this is the part that will be replaced.
Some working Java code:
Pattern p = Pattern.compile("(?<=Customer\\()[^\\)]*");
String original = "Customer(someContent)";
String replacement = "NewContent";
Matcher m = p.matcher(original);
String result = m.replaceAll(replacement);
System.out.println(result);
This will print
Customer(NewContent)
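Since the new value is dynamic, it may contain a `$` or a backslash, which `replaceAll` treats as group references or escapes. A minimal sketch of guarding against that with `Matcher.quoteReplacement`; the class name `CustomerReplace` and the helper `replaceCustomer` are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CustomerReplace {
    // The lookbehind keeps "Customer(" out of the match; [^)]* is the old value.
    private static final Pattern CUSTOMER = Pattern.compile("(?<=Customer\\()[^)]*");

    // Hypothetical helper: swaps the text inside Customer(...) for newContent.
    static String replaceCustomer(String input, String newContent) {
        // quoteReplacement escapes '$' and '\' so they are inserted literally.
        return CUSTOMER.matcher(input).replaceAll(Matcher.quoteReplacement(newContent));
    }

    public static void main(String[] args) {
        String input = "OperatorSpecific(XXX)\n{\nCustomer(someContent)\nSaveImage()\n{";
        // The Customer line becomes Customer(ACME$1); the other lines are untouched.
        System.out.println(replaceCustomer(input, "ACME$1"));
    }
}
```

Without the quoting step, a replacement such as "ACME$1" would be read as a reference to group 1, which this pattern does not have.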

Using a capturing group with a non-greedy quantifier also works:
String s =
"OperatorSpecific(XXX)\n {\n" +
" Customer(someContent)\n" +
" SaveImage() {";
Pattern p = Pattern.compile("Customer\\((.*?)\\)");
Matcher matcher = p.matcher(s);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
will print
someContent

Untested, but something like the following should work:
Pattern pattern = Pattern.compile("\\s+Customer\\(\\s*(\\w+)\\s*\\)\\s*");
Matcher matcher = pattern.matcher(input);
// find() searches inside the input; matches() would require the whole input to match
if (matcher.find()) {
System.out.println(matcher.group(1));
}
EDIT
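This of course won't work in all possible cases. \w matches only letters, digits and the underscore, so a value containing any other character is not matched:
// the underscore is covered by \w, so this still matches
Customer(_someContent)
// $ is not covered by \w, so this does not match
Customer($some_Content)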

Related

Get a substring from string multiple times

I have a String whose length and characters I don't know in advance.
I want to search the string and get every substring found inside "".
I tried to use Pattern.compile, but it always returns an empty string:
Pattern p = Pattern.compile("\".\"");
Matcher m = p.matcher(mystring);
while(m.find()){
System.out.println(m.group().toString());
}
How can I do it?
Use .+? to match all the characters between the quotation marks:
Pattern p = Pattern.compile("\".+?\"");
The .+ requires at least one character between the quotation marks. The ? makes the quantifier reluctant, so it stops at the first closing quotation mark and each quoted string becomes a separate match.
Unit test example:
@Test
public void test() {
String test = "speak \"friend\" and \"enter\"";
Pattern p = Pattern.compile("\".+?\"");
Matcher m = p.matcher(test);
while(m.find()){
System.out.println(m.group().toString().replace("\"", ""));
}
}
Output:
friend
enter
That is because your regex searches for exactly one character between " and ". If you want to match more characters, rewrite your regex as "\".*?\""
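A capturing group avoids having to strip the quotation marks afterwards. The following is only a sketch; the class name `QuotedStrings` and the method `extract` are invented for the example. It uses [^"]* instead of a reluctant quantifier, so empty quotes ("") are also returned, as empty strings:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedStrings {
    // Group 1 holds the text between a pair of double quotes.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Hypothetical helper: returns every quoted substring, without the quotes.
    static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = QUOTED.matcher(text);
        while (m.find()) {          // each find() call moves on to the next match
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(extract("speak \"friend\" and \"enter\"")); // [friend, enter]
    }
}
```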

Extracting a string using Regex

I have the following code to extract the string within double quotes using Regex.
String str ="\"Java\",\"programming\"";
final Pattern pattern = Pattern.compile("\"([^\"]*)\"");
final Matcher matcher = pattern.matcher(str);
while(matcher.find()){
System.out.println(matcher.group(1));
}
The output I get now is Java programming. But from the String str I want the content in the second double quotes, which is programming. Can anyone tell me how to do that using Regex?
If you take your example, and change it slightly to:
String str ="\"Java\",\"programming\"";
final Pattern pattern = Pattern.compile("\"([^\"]*)\"");
final Matcher matcher = pattern.matcher(str);
int i = 0;
while(matcher.find()){
System.out.println("match " + ++i + ": " + matcher.group(1) + "\n");
}
You should find that it prints:
match 1: Java
match 2: programming
This shows that you are able to loop over all of the matches. If you only want the last match, then you have a number of options:
Store the match in the loop, and when the loop is finished, you have the last match.
Change the regex to ignore everything until your pattern, with something like: Pattern.compile(".*\"([^\"]*)\"")
If you really want explicitly the second match, then the simplest solution is something like Pattern.compile("\"([^\"]*)\"[^\"]*\"([^\"]*)\""). This gives two matching groups.
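The first option (remember the match inside the loop) could look roughly like this. `LastQuoted` and `last` are names made up for the sketch, and returning null when nothing matches is just one possible choice:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastQuoted {
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Hypothetical helper: returns the content of the last quoted token, or null if there is none.
    static String last(String text) {
        Matcher m = QUOTED.matcher(text);
        String last = null;
        while (m.find()) {
            last = m.group(1);   // overwritten on every match, so the final value wins
        }
        return last;
    }

    public static void main(String[] args) {
        System.out.println(last("\"Java\",\"programming\"")); // programming
    }
}
```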
If you want the last token inside double quotes, add an end-of-line anchor ($):
final Pattern pattern = Pattern.compile("\"([^\"]*)\"$");
In this case, you can replace while with if, provided your input is a single line.
Great answer from Paul. You can also try this pattern:
final Pattern pattern = Pattern.compile(",\"(\\w+)\"");
Java program
String str ="\"Java\",\"programming\"";
final Pattern pattern = Pattern.compile(",\"(\\w+)\"");
final Matcher matcher = pattern.matcher(str);
while(matcher.find()){
System.out.println(matcher.group(1));
}
Explanation
,\": matches a comma, followed by a quotation mark "
(\\w+): matches one or more word characters (letters, digits or underscore)
\": matches the last quotation mark "
The group (\\w+) is then captured as group 1.
Output
programming

Regex matcher - No match found

I am trying to use Regex to extract the values from a string and use them for the further processing.
The string I have is :
String tring =Format_FRMT: <<<$gen>>>(((valu e))) <<<$gen>>>(((value 13231)))
<<<$gen>>>(((value 13231)))
Regex pattern I have made is :
Pattern p = Pattern.compile("\\<{3}\\$([\\w ]+)\\>{3}\\s?\\({3}([\\w ]+)\\){3}");
When I am running the whole program
Matcher m = p.matcher(tring);
String[] try1 = new String[m.groupCount()];
for(int i = 1 ; i<= m.groupCount();i++)
{
try1[i] = m.group(i);
//System.out.println("group - i" +try1[i]+"\n");
}
I am getting
No match found
Can anybody help me with this? Where exactly is this going wrong?
My first aim is just to see whether I can get the values in the corresponding groups. If that works, I would like to use them for further processing.
Thanks
Here is an example of how to get all the values you need with find():
String tring = "CHARDATA_FRMT: <<<$gen>>>(((valu e))) <<<$gen>>>(((value 13231)))\n<<<$gen>>>(((value 13231)))";
Pattern p = Pattern.compile("<{3}\\$([\\w ]+)>{3}\\s?\\({3}([\\w ]+)\\){3}");
Matcher m = p.matcher(tring);
while (m.find()){
System.out.println("Gen: " + m.group(1) + ", and value: " + m.group(2));
}
See IDEONE demo
Note that you do not have to escape < and > in Java regex.
After you create the Matcher and before you reference its groups, you must call one of the methods that attempts the actual match, like find, matches, or lookingAt. For example:
Matcher m = p.matcher(tring);
if (!m.find()) return; // <---- Add something like this
String[] try1 = new String[m.groupCount()];
You should read the javadocs on the Matcher class to decide which of the above methods makes sense for your data and application. http://docs.oracle.com/javase/7/docs/api/java/util/regex/Matcher.html
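Putting the pieces together, here is a sketch of the asker's loop with find() called first. The names `GenValues` and `parse` are hypothetical. Note also that group numbers start at 1 while array indices start at 0, so the group value is stored at index i - 1:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GenValues {
    private static final Pattern P =
            Pattern.compile("<{3}\\$([\\w ]+)>{3}\\s?\\({3}([\\w ]+)\\){3}");

    // Hypothetical helper: one String[] per match, holding group 1 and group 2.
    static List<String[]> parse(String text) {
        List<String[]> rows = new ArrayList<>();
        Matcher m = P.matcher(text);
        while (m.find()) {                      // a match must be attempted before group() is legal
            String[] groups = new String[m.groupCount()];
            for (int i = 1; i <= m.groupCount(); i++) {
                groups[i - 1] = m.group(i);     // groups are 1-based, arrays are 0-based
            }
            rows.add(groups);
        }
        return rows;
    }

    public static void main(String[] args) {
        String tring = "CHARDATA_FRMT: <<<$gen>>>(((valu e))) <<<$gen>>>(((value 13231)))\n"
                + "<<<$gen>>>(((value 13231)))";
        for (String[] row : parse(tring)) {
            System.out.println("Gen: " + row[0] + ", and value: " + row[1]);
        }
    }
}
```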

Pattern (string) allows characters only one time

I want to check if my string contains only allowed characters. Everything works properly for inputs such as 7B, 77B or 7BBBB, but when I input something like 7B7 or 7BB2 it does not match.
Everything works fine, except when a digit is the last character.
Could you tell me what is wrong with this code?
pattern = Pattern.compile("[0-9]*[a-f]*[A-F]*");
matcher = pattern.matcher(stNumber);
if (matcher.matches()) {...}
If you want to mix digits and letters in any order, you need something like:
Pattern pattern = Pattern.compile("[\\da-fA-F]*");
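Why not try it this way? Adding a trailing [0-9]* allows digits after the letters, although it still does not allow arbitrary mixing such as 7B7B:
// Compile this pattern.
Pattern pattern = Pattern.compile("[0-9]*[a-f]*[A-F]*[0-9]*");
// See if this String matches.
Matcher m = pattern.matcher("7B7");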
if (m.matches()) {
System.out.println(true);
}
Are you trying to verify that the string only has digits and letters and nothing else?
If so try using the following:
pattern = Pattern.compile("^[a-zA-Z\\d]*$");
matcher = pattern.matcher(stNumber);
if (matcher.matches()) {...}
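To see why the original pattern rejects 7B7: three consecutive classes describe a fixed order (digits, then a-f, then A-F), so nothing may follow the last run. A single character class has no such order. A small sketch, assuming the allowed characters are hexadecimal digits; `HexCheck` and `isHex` are names invented here:

```java
import java.util.regex.Pattern;

class HexCheck {
    // One class containing every allowed character, repeated in any order.
    private static final Pattern HEX = Pattern.compile("[0-9a-fA-F]*");

    // Hypothetical helper; matches() already requires the whole string to match.
    static boolean isHex(String s) {
        return HEX.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("[0-9]*[a-f]*[A-F]*", "7B7")); // false
        System.out.println(isHex("7B7"));    // true
        System.out.println(isHex("7BB2"));   // true
        System.out.println(isHex("num123")); // false
    }
}
```

With *, the empty string also counts as valid; use + instead if at least one character is required.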

Pattern/Matcher group() to obtain substring in Java?

UPDATE: Thanks for all the great responses! I tried many different regex patterns but didn't understand why m.matches() was not doing what I think it should be doing. When I switched to m.find() instead, as well as adjusting the regex pattern, I was able to get somewhere.
I'd like to match a pattern in a Java string and then extract the portion matched using a regex (like Perl's $& operator).
This is my source string "s": DTSTART;TZID=America/Mexico_City:20121125T153000
I want to extract the portion "America/Mexico_City".
I thought I could use Pattern and Matcher and then extract using m.group() but it's not working as I expected. I've tried monkeying with different regex strings and the only thing that seems to hit on m.matches() is ".*TZID.*" which is pointless as it just returns the whole string. Could someone enlighten me?
Pattern p = Pattern.compile ("TZID*:"); // <- change to "TZID=([^:]*):"
Matcher m = p.matcher (s);
if (m.matches ()) // <- change to m.find()
Log.d (TAG, "looking at " + m.group ()); // <- change to m.group(1)
You use m.matches(), which tries to match the whole string. If you use m.find() instead, it searches for a match anywhere inside the string. I also changed your regex to exclude the TZID= prefix using a zero-width lookbehind:
Pattern p = Pattern.compile("(?<=TZID=)[^:]+");
Matcher m = p.matcher ("DTSTART;TZID=America/Mexico_City:20121125T153000");
if (m.find()) {
System.out.println(m.group());
}
This should work nicely:
Pattern p = Pattern.compile("TZID=(.*?):");
Matcher m = p.matcher(s);
if (m.find()) {
String zone = m.group(1); // group indices are 1-based; group 0 is the whole match
. . .
}
An alternative regex is "TZID=([^:]*)". I'm not sure which is faster.
You are using the wrong pattern, try this:
Pattern p = Pattern.compile(".*?TZID=([^:]+):.*");
Matcher m = p.matcher (s);
if (m.matches ())
Log.d (TAG, "looking at " + m.group(1));
Here .*? matches everything at the beginning up to TZID=, then TZID= matches literally. The group ([^:]+) captures everything up to the next :, then : matches and .* consumes the rest of the string. What you need is now in group(1).
You are missing a dot before the asterisk. Your expression will match any number of uppercase Ds.
Pattern p = Pattern.compile ("TZID[^:]*:");
You should also add a capturing group unless you want to capture everything, including the "TZID" and the ":"
Pattern p = Pattern.compile ("TZID=([^:]*):");
Finally, you should use the right API to search the string, rather than attempting to match the string in its entirety.
Pattern p = Pattern.compile("TZID=([^:]*):");
Matcher m = p.matcher("DTSTART;TZID=America/Mexico_City:20121125T153000");
if (m.find()) {
System.out.println(m.group(1));
}
This prints
America/Mexico_City
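The difference between the three matching methods can be checked directly on the sample string. This is only an illustrative sketch; `MatchModes` and `zone` are made-up names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchModes {
    static final Pattern TZID = Pattern.compile("TZID=([^:]*):");

    // Hypothetical helper: returns the time zone ID, or null if no TZID= part is found.
    static String zone(String s) {
        Matcher m = TZID.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String s = "DTSTART;TZID=America/Mexico_City:20121125T153000";
        // A fresh Matcher per call keeps the three checks independent.
        System.out.println(TZID.matcher(s).matches());   // false: the pattern must cover the whole string
        System.out.println(TZID.matcher(s).lookingAt()); // false: the string does not start with TZID=
        System.out.println(TZID.matcher(s).find());      // true: a match exists somewhere inside
        System.out.println(zone(s));                     // America/Mexico_City
    }
}
```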
Why not simply use split as:
String origStr = "DTSTART;TZID=America/Mexico_City:20121125T153000";
String str = origStr.split(":")[0].split("=")[1];
