Regex: Read value between multiple brackets - java

I am currently working on translating a website (Smarty) with Poedit. To get all the text from the .tpl files I'm using a regex to get the data between the {t} and {/t} tags. An example:
{t}Password incorrect, please try again{/t}
The regex will read Password incorrect, please try again and place it in a .po file. This is all working fine. It goes wrong when it gets a little more advanced.
Sometimes the text between the {t} tags uses parameters, which looks like this:
{t 1=$email|escape 2=$mailbox}No $1 given, please check your $2{/t}
This is also working great.
The real problem starts when I use brackets inside a parameter, like this:
{t 1={site info='name'} 2=$mailbox}visit %1 or go to your %2{/t}
My regex stops at the first closing bracket, so the result is 2=$mailbox}visit %1 or go to your %2.
My regex looks like this:
\{t.*?\}?[}]([^\{]+)\{\/t\}|\{t\}([^\{]+)\{\/t\}
The regex is used inside a java program.
Does anybody have a way to fix this problem?

The easiest solution I see is to normalize the .tpl files: use a regex that matches all tags, something like this one:
{[^}]*[^{]*}
I had the same issue to solve and normalizing worked pretty well.
The normalizing method would look like this:
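// Matches a tag, including one whose parameters contain a nested {...}
final String regex = "\\{[^\\}]*[^\\{]*\\}";

// Removes everything the regex matches from the template content
private String normalizeContent(String content) {
    return content.replaceAll(regex, "");
}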
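If the goal is to keep extracting the text rather than to strip the tags, another option is to let the opening tag consume one level of nested braces. This is only a minimal sketch and not the method from the answer above: the class and method names are made up for illustration, and it assumes parameters never nest more than one level deep.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class SmartyTextExtractor {

    // \{t\b                   start of the opening tag
    // (?:[^{}]|\{[^{}]*\})*   parameters: plain characters or one nested {...}
    // \}                      end of the opening tag
    // ([^{]+)                 the translatable text (group 1)
    // \{/t\}                  closing tag
    private static final Pattern T_BLOCK = Pattern.compile(
            "\\{t\\b(?:[^{}]|\\{[^{}]*\\})*\\}([^{]+)\\{/t\\}");

    // Returns the text of every {t}...{/t} block found in the template.
    static List<String> extract(String template) {
        List<String> texts = new ArrayList<>();
        Matcher m = T_BLOCK.matcher(template);
        while (m.find()) {
            texts.add(m.group(1));
        }
        return texts;
    }

    public static void main(String[] args) {
        String tpl = "{t}Password incorrect, please try again{/t}\n"
                + "{t 1={site info='name'} 2=$mailbox}visit %1 or go to your %2{/t}";
        for (String text : extract(tpl)) {
            System.out.println(text);
        }
    }
}
```

Because the two alternatives inside the group can never start with the same character, the pattern should not backtrack badly on long templates.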

Related

Splitting string with similar starting pattern

So, I've been trying to split something I'm reading from a file. But everything that I've tried does not give me only the part that I want.
What I have as string is this:
Scenario:
Bunch of stuf here
Just typing stuff for the example...
Scenario:
More stuff here
A lot more stuff here
XX123
I want to get everything from 'Scenario:' to 'XX123'
Like this:
Scenario:
More stuff here
A lot more stuff here
XX123
The file that I'm reading from has a lot of those 'Scenario:' blocks, and using Pattern from Java doesn't give me only the part that I want. Instead it gives me everything from the first 'Scenario:' it finds until 'XX123'.
I also tried to use StringUtils.substringBetween, same result.
Thanks in advance
The old-fashioned way to do it would look something like this:
// inputText holds the text read from the file
String END_MARKER = "XX123";
int indexOfEnd = inputText.indexOf(END_MARKER);
// search in reverse from the end marker for the closest "Scenario:"
int indexOfScenario = inputText.lastIndexOf("Scenario:", indexOfEnd);
String result = inputText.substring(indexOfScenario,
        indexOfEnd + END_MARKER.length());
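The same result can probably be had with a single regex by forbidding another 'Scenario:' between the start and the end marker. A sketch, assuming the whole file content is already in one String; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class LastScenarioFinder {

    // "Scenario:" followed by any characters (newlines included, via DOTALL)
    // that do not begin another "Scenario:", up to the first "XX123".
    private static final Pattern BLOCK = Pattern.compile(
            "Scenario:(?:(?!Scenario:).)*?XX123", Pattern.DOTALL);

    // Returns the matched block, or null when there is none.
    static String find(String text) {
        Matcher m = BLOCK.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "Scenario:\nBunch of stuff here\n"
                + "Scenario:\nMore stuff here\nA lot more stuff here\nXX123";
        System.out.println(find(input));
    }
}
```

The negative lookahead is what stops the match from starting at the first 'Scenario:' and running across the second one.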

Selenium via java - sendKeys doesn't send specific chars to input

I'm seeing strange behavior: when I type into an input using sendKeys, certain characters never appear in the input.
What i'm trying to do:
webDriver.findElement(By.id("additionalInfo(token_autocompleteSelectInputId)")).sendKeys("(test)");
The result is that the input field now contains test) and the '(' character is missing.
If I try
webDriver.findElement(By.id("additionalInfo(token_autocompleteSelectInputId)")).sendKeys("((((((((((");
the result is that the input is empty.
Has anyone faced this issue before? It happens on one very specific input in the app, and I couldn't find anything related to it in the HTML code.
Thanks in advance.
Edit: I can manually type ( in the input field.
Maybe it's a special character for selenium, have you tried using escape characters? Something like backslash before it if it allows it.
Edit: I found some issue report on github from last year, not sure if they agreed to not fix it. Executing a script to type "(" seems to be an alternative.
Source: https://github.com/seleniumhq/selenium/issues/674
Try declaring the key as a string first:
String keyToSend = "(test)";
webDriver.findElement(By.id("additionalInfo(token_autocompleteSelectInputId)")).sendKeys(keyToSend);
In this case you could try using JavascriptExecutor, as below:
WebElement el = webDriver.findElement(By.id("additionalInfo(token_autocompleteSelectInputId)"));
((JavascriptExecutor)webDriver).executeScript("arguments[0].value = arguments[1]", el, "(test)");
Hope it helps..:)

Why doesn't this Java regex compile?

I am trying to extract the pass number from strings of any of the following formats:
PassID_132
PassID_64
Pass_298
Pass_16
For this, I constructed the following regex:
Pass[I]?[D]?_([\d]{2,3})
and tested it in Eclipse's search dialog. It worked fine.
However, when I use it in code, it doesn't match anything. Here's my code snippet:
String idString = filename.replaceAll("Pass[I]?[D]?_([\\d]{2,3})", "$1");
int result = Integer.parseInt(idString);
I also tried
java.util.regex.Pattern.compile("Pass[I]?[D]?_([\\d]{2,3})")
in the Expressions window while debugging, but that says "", whereas
java.util.regex.Pattern.compile("Pass[I]?[D]?_([0-9]{2,3})")
compiled, but didn't match anything. What could be the problem?
Instead of Pass[I]?[D]?_([\d]{2,3}) try this:
Pass(?:I)?(?:D)?_([\d]{2,3})
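One possible reason the original code seems not to match: replaceAll only replaces the matched part, so any other characters in filename (an extension, a directory) stay in the result and Integer.parseInt then fails. Below is a sketch that pulls the number out with a Matcher instead. The file names are assumed examples, since the question does not show the real ones, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class PassNumberParser {

    // "Pass", an optional "ID", an underscore, then two or three digits (group 1).
    private static final Pattern PASS = Pattern.compile("Pass(?:ID)?_(\\d{2,3})");

    // Returns the pass number, or -1 when the name contains no pass id.
    static int parse(String filename) {
        Matcher m = PASS.matcher(filename);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        System.out.println(parse("PassID_132"));
        System.out.println(parse("report_Pass_16.txt"));
    }
}
```

Using find() rather than replaceAll means the surrounding text never reaches parseInt.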
There's nothing invalid with your regex, but it sucks. You don't need character classes around single character terms. Try this:
"Pass(?:ID)?_(\\d{2,3})"

How to use java regex to filter xml file

I have this java string with xml info and I am trying to use java regex to filter out all the junk that is between the words to form a word enclosed in brackets, e.g. [DEFENDANT].
I want to go from this:
<w:p><w:r><w:t>[</w:t></w:r><st1:PlaceName w:st="on"><w:r><w:t>DEFENDANT</w:t></w:r>
</st1:PlaceName><w:r><w:t> </w:t></w:r><st1:PlaceType w:st="on"><w:r><w:t>CITY</w:t></w:r>
</st1:PlaceType><w:r><w:t>], [</w:t></w:r><st1:place w:st="on"><st1:PlaceName w:st="on"><w:r>
<w:t>DEFENDANT</w:t></w:r></st1:PlaceName><w:r><w:t> </w:t></w:r><st1:PlaceType w:st="on"><w:r>
<w:t>STATE</w:t></w:r></st1:PlaceType></st1:place><w:r><w:t>] [DEFENDANT ZIP]</w:r><w:r>
to this:
<w:p><w:r><w:t>[DEFENDANT CITY], [DEFENDANT STATE] [DEFENDANT ZIP]</w:r><w:r>
I have been testing extensively with regex expressions like (\[)<.+>+([A-Z ]+\]) on regexPlanet, to no avail.
Do not use regex to parse XML. Just use the built-in Java XML library.
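As a rough illustration of the parser route, the sketch below reads the text of every w:t element with the JDK's DOM API and joins it. It only extracts the joined text; writing it back into a single run is left out. The sample is a trimmed, well-formed version of the fragment in the question, the urn:example namespaces are placeholders, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; the names are illustrative only.
class RunTextJoiner {

    // Concatenates the text of every <w:t> element in document order.
    static String joinText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList texts = doc.getElementsByTagName("w:t");
            StringBuilder joined = new StringBuilder();
            for (int i = 0; i < texts.getLength(); i++) {
                joined.append(texts.item(i).getTextContent());
            }
            return joined.toString();
        } catch (Exception e) {
            // Parser configuration, I/O and well-formedness errors all end up here.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder namespaces; the real document declares its own.
        String xml = "<w:p xmlns:w=\"urn:example:w\" xmlns:st1=\"urn:example:st1\">"
                + "<w:r><w:t>[</w:t></w:r>"
                + "<st1:PlaceName w:st=\"on\"><w:r><w:t>DEFENDANT</w:t></w:r></st1:PlaceName>"
                + "<w:r><w:t> </w:t></w:r>"
                + "<st1:PlaceType w:st=\"on\"><w:r><w:t>CITY</w:t></w:r></st1:PlaceType>"
                + "<w:r><w:t>]</w:t></w:r></w:p>";
        System.out.println(joinText(xml));
    }
}
```

Note that the parser needs well-formed input, which the fragment quoted in the question is not on its own.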
If it's all on a single line, like this:
<w:p><w:r><w:t>[</w:t></w:r><st1:PlaceName w:st="on"><w:r><w:t>DEFENDANT</w:t></w:r></st1:PlaceName><w:r><w:t> </w:t></w:r><st1:PlaceType w:st="on"><w:r><w:t>CITY</w:t></w:r></st1:PlaceType><w:r><w:t>], [</w:t></w:r><st1:place w:st="on"><st1:PlaceName w:st="on"><w:r><w:t>DEFENDANT</w:t></w:r></st1:PlaceName><w:r><w:t> </w:t></w:r><st1:PlaceType w:st="on"><w:r><w:t>STATE</w:t></w:r></st1:PlaceType></st1:place><w:r><w:t>] [DEFENDANT ZIP]</w:r><w:r>
Then this regex should work:
([<\w:\w>]+)(\[[</\w:\w>]+\s\w:\w+="\w+"><\w:\w><\w:\w>)(\w+)(</\w:\w></\w:\w></\w+:\w+><\w:\w><\w:\w>\s</\w:\w></\w:\w><\w+:\w+\s\w:\w+="\w+"><\w:\w><\w:\w>)(\w+)(</\w:\w></\w:\w></\w+:\w+><\w:\w><\w:\w>\],\s\[</\w:\w></\w:\w><\w+:\w+\s\w:\w+="\w+"><\w+:\w+\s\w:\w+="\w+"><\w:\w><\w:\w>)(\w+)(</\w:\w></\w:\w></\w+:\w+><\w:\w><\w:\w>\s</w:\w></\w:\w><\w+:\w+\s\w:\w+="\w+"><\w:\w><\w:\w>)(\w+)(</\w:\w></\w:\w></\w+:\w+></\w+:\w+><\w:\w><\w:\w>\]\s\[)(\w+\s\w+)(\])(</\w:\w><\w:\w>)
I have a working example here: RegExr
I could have grouped things a little better, but overall, it gets the job done, so you should be able to see it working.
Also, if it's not on a single line (if it's like it is in your example), then this would work:
([<\w:\w>]+)(\[[</\w:\w>]+\s\w:\w+="\w+"><\w:\w><\w:\w>)(\w+)(</\w:\w></\w:\w>\s+</\w+:\w+><\w:\w><\w:\w>\s</\w:\w></\w:\w><\w+:\w+\s\w:\w+="\w+"><\w:\w><\w:\w>)(\w+)(</\w:\w></\w:\w>\s+</\w+:\w+><\w:\w><\w:\w>\],\s\[</\w:\w></\w:\w><\w+:\w+\s\w:\w+="\w+"><\w+:\w+\s\w:\w+="\w+"><\w:\w>\s+<\w:\w>)(\w+)(</\w:\w></\w:\w></\w+:\w+><\w:\w><\w:\w>\s</w:\w></\w:\w><\w+:\w+\s\w:\w+="\w+"><\w:\w>\s+<\w:\w>)(\w+)(</\w:\w></\w:\w></\w+:\w+></\w+:\w+><\w:\w><\w:\w>\]\s\[)(\w+\s\w+)(\])(</\w:\w><\w:\w>)
You can see that on RegExr here.

Regex to Extract First Part of URL

I need a java regex to extract parts of a URL.
For example, take the following URLs:
http://localhost:81/example
https://test.com/test
http://test.com/
I would want my regex expression to return:
http://localhost:81
https://test.com
http://test.com
I will be using this in a Java patcher.
This is what I have so far; the problem is that it matches the whole URL:
^https?:\/\/(?!.*:\/\/)\S+
import java.net.URL;
//snip
URL url = new URL(urlString); // throws MalformedURLException on invalid input
return url.getProtocol() + "://" + url.getAuthority();
The right tool for the right job.
Building off your attempt, try this:
^https?://[^/]+
I'm assuming that you want to capture everything until the first / after http://? (That's what I was getting from your examples - if not, please post some more).
Are these URLs given as one input, or is each a different string?
Edit: It was pointed out that there were unnecessary escapes, so I changed it to a more condensed version.
Language independent answer:
For the whitespace: replace /^\s+/ with the empty string.
For removing the path information from the URL, if you can assume there aren't any slashes in the path (i.e. you're not dealing with http://localhost:81/foo/bar/baz), replace /\/[^\/]+$/ with the empty string. If there might be more slashes, you might try something like replacing /(^\s*.*:\/\/[^\/]+)\/.*/ with $1.
A simple one: ^(https?://[^/]+)
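For completeness, here is how that last pattern might be applied from Java. This is a sketch that assumes each URL arrives as its own string; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
class UrlPrefix {

    // Scheme, "://", then everything up to (but not including) the next slash.
    private static final Pattern PREFIX = Pattern.compile("^(https?://[^/]+)");

    // Returns scheme plus host (and port), or null when the input does not match.
    static String of(String url) {
        Matcher m = PREFIX.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(of("http://localhost:81/example"));
        System.out.println(of("https://test.com/test"));
        System.out.println(of("http://test.com/"));
    }
}
```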
