Not able to match code between two comments, multiple times - java

I want to get the content between two comments in a file.
For example, a file x:
#user code
alert("");
alert("");
#user code
{
===
====
}
#user code
alert("as");
alert("as");
#user code
I am using this regex pattern to match:
final Pattern pat = Pattern.compile("//#User code\r?\n(.*)\r?\n//#User code" , Pattern.DOTALL);
but it matches from the first #user code to the end of the file.
Please help.

A quick fix is to use .*? instead of just .*. The ? changes the * into a non-greedy repetition, which will match up until the nearest #user code, instead of the furthest.
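A minimal sketch of the non-greedy fix, assuming the markers are literally "#user code" as in the sample file above (the class and method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UserCodeBlocks {
    // (.*?) is reluctant, so it stops at the nearest closing marker.
    // DOTALL lets '.' cross line breaks inside a block.
    private static final Pattern BLOCK =
            Pattern.compile("#user code\\r?\\n(.*?)\\r?\\n#user code", Pattern.DOTALL);

    // Hypothetical helper: returns the text of every marked block, in order.
    static List<String> extract(String text) {
        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(text);
        while (m.find()) {
            blocks.add(m.group(1));
        }
        return blocks;
    }

    public static void main(String[] args) {
        String file = "#user code\nalert(\"\");\nalert(\"\");\n#user code\n"
                + "{\n===\n====\n}\n"
                + "#user code\nalert(\"as\");\nalert(\"as\");\n#user code\n";
        for (String block : extract(file)) {
            System.out.println("--- block ---");
            System.out.println(block);
        }
    }
}
```

Because each match consumes both its opening and its closing marker, the text between the second and third marker (the braces section) is skipped rather than reported as a block.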

Related

RegEx for matching between any two HTML tags

I have the following content :
<div class="TEST-TEXT">hi</span>
first young CEO's TEST-TEXT
<span class="test">hello</span>
I am trying to match the TEST-TEXT string to replace its value, but only when it appears as text and not within an attribute value.
I have looked at look-ahead and look-behind in regex, but the issue is that look-behind needs a fixed width for the match. The question regex-match-all-characters-between-two-html-tags shows a very similar case, except that there a span with a class is used to create the match.
I also checked regex-match-attribute-in-a-html-code.
Here are the two regular expressions I have tried:
\"([^"]*)\"
(?s)(?<=<([^{]*)>)(.+?)(?=</.>)
Neither works for me; see https://regex101.com/r/ApbUEW/2.
I expect it to match the string only when it is text.
Currently it matches both cases.
Edit: I want the text to be dynamic and not specific to TEST-TEXT.

Something like this should help:
\>([^"<]*)\<
EDIT:
Without open and close tags included:
(?<=\>)([^"<]*)(?=\<)

Try TEST-TEXT(?=<\/a>)
TEST-TEXT matches the literal text TEST-TEXT.
(?=...) is a lookahead that checks for the closing tag </a> without consuming it.

Here, we might add a soft boundary on the right of the desired output, which you have already been doing, then a character class for the desired output, and capture both. After that we can make the replacement using the capturing groups (). Maybe something similar to this:
([A-Z-]+)(<\/)
This snippet is just to show that the expression might be valid:
const regex = /([A-Z-]+)(<\/)/gm;
const str = `<div class="TEST-TEXT">hi</span><a href=\\"https://en.wikipedia.org/wiki/TEST-TEXT\\">first young CEO's
TEST-TEXT</a><span class="test">hello</span><div class="TEST-TEXT">hi</span><a href=\\"https://en.wikipedia.org/wiki/TEST-TEXT\\">first young CEO's
TEST-TEXT</a><span class="test">hello</span>`;
const subst = `NEW-TEXT$2`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
RegEx
If this expression wasn't desired, it can be modified or changed in regex101.com.
RegEx Circuit
jex.im also helps to visualize the expressions.

Maybe this will help?
String html = "<div class=\"TEST-TEXT\">hi</span>\n" +
"first young CEO's TEST-TEXT\n" +
"<span class=\"test\">hello</span>";
// DOTALL is needed so that '.' can cross the \n characters in html
Pattern pattern = Pattern.compile("(<)(.*)(>)(.*)(TEST-TEXT)(.*)</.*>", Pattern.DOTALL);
Matcher matcher = pattern.matcher(html);
while (matcher.find()) {
    System.out.println(matcher.group(5));
}

A regex that matches the string only when it sits between HTML tags, not inside one:
(?![^<>]*>)(TEST\-TEXT)
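A hedged sketch of how that negative lookahead could be used for a dynamic search string, as the question's edit asks. The helper name replaceOutsideTags is hypothetical, and the approach assumes attribute values never contain a literal < or >:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TextOnlyReplace {
    // (?![^<>]*>) rejects a position if the next angle bracket ahead is '>',
    // which would mean we are still inside a tag such as <div class="...">.
    // Pattern.quote treats the target as literal text, so it can be dynamic.
    static String replaceOutsideTags(String html, String target, String replacement) {
        Pattern p = Pattern.compile("(?![^<>]*>)" + Pattern.quote(target));
        return p.matcher(html).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String html = "<div class=\"TEST-TEXT\">hi</span>\n"
                + "first young CEO's TEST-TEXT\n"
                + "<span class=\"test\">hello</span>";
        // Only the TEST-TEXT on the second line is replaced; the class attribute is untouched.
        System.out.println(replaceOutsideTags(html, "TEST-TEXT", "NEW-TEXT"));
    }
}
```

This is a regex-only approximation. For markup with comments, scripts, or angle brackets inside attribute values, an HTML parser would likely be more reliable.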

JAVA REGEX: Match until the specific character

I have this Java code
String cookies = TextUtils.join(";", LoginActivity.msCookieManager.getCookieStore().getCookies());
Log.d("TheCookies", cookies);
Pattern csrf_pattern = Pattern.compile("csrf_cookie=(.+)(?=;)");
Matcher csrf_matcher = csrf_pattern.matcher(cookies);
while (csrf_matcher.find()) {
json.put("csrf_key", csrf_matcher.group(1));
Log.d("CSRF KEY", csrf_matcher.group(1));
}
The String contains something like this:
SessionID=sessiontest;csrf_cookie=e18d027da2fb95e888ebede711f1bc39;ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e
I'm trying to get the csrf_cookie data by using this regular expression:
csrf_cookie=(.+)(?=;)
I expect a result like this in the code:
csrf_matcher.group(1);
e18d027da2fb95e888ebede711f1bc39
Instead I get:
3492f8670f4b09a6b3c3cbdfcc59e512;ci_session=8d823b309a361587fac5d67ad4706359b40d7bd0
What is a possible workaround for this problem?

Here is a one-liner using String#replaceAll:
String input = "SessionID=sessiontest;csrf_cookie=e18d027da2fb95e888ebede711f1bc39;ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e";
String cookie = input.replaceAll(".*csrf_cookie=([^;]*).*", "$1");
System.out.println(cookie);
e18d027da2fb95e888ebede711f1bc39
Note: We could have used a formal regex Pattern and Matcher, and in fact you may want to do this if you need to perform this search often in your code.
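The Pattern/Matcher variant mentioned in the note might look roughly like this. It is a sketch only: the class name, the Optional return type, and the choice to drop the trailing ';' from the pattern are assumptions made for illustration:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CsrfCookie {
    // Compiled once and reused; [^;]* stops at the first ';' or at the end of the input.
    private static final Pattern CSRF = Pattern.compile("csrf_cookie=([^;]*)");

    // Hypothetical helper: empty when the cookie is absent.
    static Optional<String> find(String cookies) {
        Matcher m = CSRF.matcher(cookies);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String input = "SessionID=sessiontest;"
                + "csrf_cookie=e18d027da2fb95e888ebede711f1bc39;"
                + "ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e";
        System.out.println(find(input).orElse("(not found)"));
    }
}
```

One difference from the replaceAll one-liner: when the cookie is missing, replaceAll returns the whole input unchanged, whereas this version reports the absence explicitly.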

You are getting more data than expected because you are using a greedy '+' (it will match as much as it can).
For example, the pattern a+ could match the following on aaa: a, aa, and aaa. The latter is 'preferred' if the pattern is greedy.
So you are matching
csrf_cookie=e18d027da2fb95e888ebede711f1bc39;ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e;
as long as it ends with a ';'. The first ';' is consumed by .+ and the last ';' is found by the positive lookahead.
To make a pattern ungreedy/lazy, use +? instead of + (so a+? would match a three times on the string aaa).
So try:
csrf_cookie=(.+?);
or just match anything that is not a ';':
csrf_cookie=([^;]*);
That way you don't need to make it lazy.
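The three patterns can be compared side by side. This is a small sketch, assuming (as in the example string above) that the cookie string has another ';' after ci_session; firstGroup is a hypothetical helper:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Hypothetical helper: group(1) of the first match, or null if there is none.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumption: a trailing ';' follows the last cookie.
        String cookies = "SessionID=sessiontest;"
                + "csrf_cookie=e18d027da2fb95e888ebede711f1bc39;"
                + "ci_session=3f4675b5b56bfd0ba4dae46249de0df7994ee21e;";

        // Greedy: runs to the LAST ';', so ci_session is swallowed too.
        System.out.println(firstGroup("csrf_cookie=(.+)(?=;)", cookies));
        // Lazy: stops at the FIRST ';'.
        System.out.println(firstGroup("csrf_cookie=(.+?);", cookies));
        // Negated class: cannot cross a ';' at all.
        System.out.println(firstGroup("csrf_cookie=([^;]*);", cookies));
    }
}
```

The second and third calls both yield only the csrf_cookie value. The negated class is usually the simpler choice because it involves no backtracking over later cookies.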

Java(Apex) RegEx not working?

I am having trouble with a regex in Salesforce Apex. As Apex seems to use the same regex syntax and logic as Java, I am aiming this at Java developers as well.
I debugged the String and it is correct: street equals 'str 3 B'.
On http://www.regexr.com/, the regex ('\d \w$') works.
The code:
Matcher hasString = Pattern.compile('\\d \\w$').matcher(street);
if(hasString.matches())
My problem is that hasString.matches() resolves to false. Can anyone tell me if I did something wrong? I tried it without the $, with different casing, etc., and I just can't get it to work.
Thanks in advance!

You need to use find instead of matches for a partial match, because matches attempts to match the complete input text.
Matcher hasString = Pattern.compile("\\d \\w$").matcher(street);
if(hasString.find()) {
// matched
System.out.println("Start position: " + hasString.start());
}
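To make the difference concrete, here is a small Java sketch using the street value from the question. The class and method names are made up, and it is plain Java rather than Apex, on the question's assumption that both behave alike here:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFind {
    private static final Pattern TAIL = Pattern.compile("\\d \\w$");

    // matches() succeeds only if the ENTIRE input fits the pattern.
    static boolean wholeInputMatches(String s) {
        return TAIL.matcher(s).matches();
    }

    // find() looks for the pattern anywhere; returns the start index or -1.
    static int findStart(String s) {
        Matcher m = TAIL.matcher(s);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String street = "str 3 B";
        System.out.println(wholeInputMatches(street)); // false: "str " is not covered
        System.out.println(findStart(street));         // 4: the match is "3 B"
        // Alternative: keep matches() but let the pattern cover the prefix.
        System.out.println(Pattern.matches(".*\\d \\w$", street)); // true
    }
}
```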

constants and inputs with Regex in Java

I'm trying to create a regex for a string I wrote down.
My string looks like this:
'AUR HALAA /PART="PROJECT" /ROLE="VR_ANALYST" /TYPE="C" /CAPABILITY="S" /ADD' (SUC)
The constant part in regex is :
'AUR
/ROLE=""
The inputs are:
HALAA
VR_ANALYST
I tried the regex like this:
\'(AUR) HALAA .* /ROLE="(.)" .
but it doesn't work.
Could you please show me how to do this?

Try this:
^AUR (\\w+).*?/ROLE="(\\w+)".*$

This regex might work for you
^AUR (\\w+) .*? /ROLE="(\\w+)" .*$
You can then use the groups of the Matcher class to get the captured values, which gives you HALAA as group(1) and VR_ANALYST as group(2).
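A possible way to wire this up, sketched under one assumption that differs from the answers above: the sample string starts with a single quote, so the pattern here allows an optional ' before AUR. The class and method names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AurCommand {
    // '? tolerates the leading quote shown in the sample string (an assumption).
    // group(1) is the word after AUR, group(2) is the value of /ROLE="...".
    private static final Pattern AUR =
            Pattern.compile("^'?AUR (\\w+).*?/ROLE=\"(\\w+)\"");

    // Returns {name, role}, or null when the line does not fit the pattern.
    static String[] parse(String line) {
        Matcher m = AUR.matcher(line);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String line = "'AUR HALAA /PART=\"PROJECT\" /ROLE=\"VR_ANALYST\" "
                + "/TYPE=\"C\" /CAPABILITY=\"S\" /ADD' (SUC)";
        String[] parts = parse(line);
        System.out.println(parts[0]); // HALAA
        System.out.println(parts[1]); // VR_ANALYST
    }
}
```

find() is used instead of matches() so the pattern does not have to describe the rest of the line.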

Regex for removing part of a line if it is preceded by some word in Java

There's a properties language bundle file:
label.username=Username:
label.tooltip_html=Please enter your username.</center></html>
label.password=Password:
label.tooltip_html=Please enter your password.</center></html>
How can I match all lines that contain both "_html" and "</center></html>", in that order, and replace them with the same line minus the trailing "</center></html>"? For example, the line:
label.tooltip_html=Please enter your username.</center></html>
should become:
label.tooltip_html=Please enter your username.
Note: I would like to do this replacement using an IDE (IntelliJ IDEA, Eclipse, NetBeans...)

Since you clarified that this regex is to be used in the IDE, I tested this in Eclipse and it works:
FIND:
(_html.*)</center></html>
REPLACE WITH:
$1
Make sure you turn on the Regular expressions switch in the Find/Replace dialog. This will match any string that contains _html.* (where the .* greedily matches any string not containing newlines), followed by </center></html>. It uses (…) brackets to capture what was matched into group 1, and $1 in the replacement substitutes in what group 1 captured.
This effectively removes </center></html> if that string is preceded by _html in that line.
If there can be multiple </center></html> in a line, and they are all to be removed if there's a _html to their left, then the regex will be more complicated, but it can be done in one regex with the \G continuing anchor if absolutely need be.
Variations
Speaking more generally, you can also match things like this:
(delete)this part only(please)
This now creates 2 capturing groups. You can match strings with this pattern and replace with $1$2, and it will effectively delete this part only, but only if it's preceded by delete and followed by please. These subpatterns can be more complicated, of course.
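In Java, the two-group variation could be sketched as follows. The class and method names are hypothetical; the pattern and the $1$2 replacement are the ones described above:

```java
class DeleteMiddle {
    // Keeps what groups 1 and 2 captured and drops the text between them.
    static String dropMiddle(String s) {
        return s.replaceAll("(delete)this part only(please)", "$1$2");
    }

    public static void main(String[] args) {
        // The middle is removed because both surrounding words are present.
        System.out.println(dropMiddle("deletethis part onlyplease")); // deleteplease
        // Without the leading "delete" nothing matches, so the input is unchanged.
        System.out.println(dropMiddle("keepthis part onlyplease"));
    }
}
```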

if (line.contains("_html=")) {
line = line.replace("</center></html>", "");
}
No regex needed here ;) (edit) As long as all lines of the properties file are well formed.

String s = "label.tooltip_html=Please enter your password.</center></html>";
Pattern p = Pattern.compile("(_html.*)</center></html>");
Matcher m = p.matcher(s);
System.out.println(m.replaceAll("$1"));

Try something like this:
Pattern p = Pattern.compile(".*(_html).*</center></html>");
Matcher m = p.matcher(input_line); // get a matcher object
String output = input_line;
if (m.matches()) {
    // assign to the existing variable; redeclaring it here would not compile
    output = input_line.replace("</center></html>", "");
}

/^(.*)<\/center><\/html>/
captures the
label.tooltip_html=Please enter your username.
part in group 1. Then you can just put the string together correctly.
