Replace symbols in string [closed]

Replace symbols in string [closed] - java

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 7 years ago.
Improve this question
How to replace ^ into calling Math.pow() ?
For example:
str = "10 + 5.2^12"; // -> "10 + Math.pow(5.2, 12)"
str = "2^(12) + 6"; // -> "Math.pow(2, 12) + 6"

You can do it like this:
str = str.replaceAll("\\(?(\\d+\\.?\\d*)\\)?\\^\\(?(\\d+\\.?\\d*)\\)?", "Math.pow($1,$2)");
In this case, you are looking for 2 groups of digits (\\d+\\.?\\d*), which could be a float value and can be inside of () \\(? and \\)?. Between this 2 groups, you need to have the ^ sign \\^. If it matches, then replaceAll method replaces all this pattern with Math.pow($1,$2), where $1 and $2 will be replaced with first and second groups of digits.
But one thing, it could lead to wrong results if you have a complicated expression, with 2 multiplications in a row, like a 10.22^31.22^5. In this case, this regular expression should be much more complicated. And may be you should use some other algorithm to parse such expressions.

You can do this with regular expression in java.
Use the below code snippet to devide and replace the string.
private static String replaceMatchPow(String pValue) {
String regEx = "(\\d*\\.*\\d*)\\^\\(*(\\d*\\.*\\d*)\\)*";
Pattern pattern = Pattern.compile(regEx);
Matcher m = pattern.matcher(pValue);
if (m.find()) {
String value = "Math.pow(" + m.group(1) + "," + m.group(2) + ")";
pValue = pValue.replaceAll(regEx, value);
}
return pValue;
}
How the regex will work?
Here is the example for the input "2^(12) + 6"
1st group (\d*\.\d)
\d* -> match a digit [0-9],
* -> zero to unlimited
\^ -> divide the groups by ^ symbol
(* -> left bracket, * means occurrence zero to unlimited
2nd group (\d*\.\d)
)* -> right bracket
Result: 1st group values is 2 and 2nd group values is 12
/(\d*.*\d*)\^(*(\d*.*\d*))*/
1st Capturing group (\d*.*\d*)
\d* match a digit [0-9]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
.* matches the character . literally
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
\d* match a digit [0-9]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
\^ matches the character ^ literally
(* matches the character ( literally
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
2nd Capturing group (\d*.*\d*)
\d* match a digit [0-9]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
.* matches the character . literally
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
\d* match a digit [0-9]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
)* matches the character ) literally
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]

For your special problem you should lookup how the caret character ^ is escaped in Java regular expressions, it should be \^.
Note that:
Backslashes within string literals in Java source code are interpreted
as required by The Java™ Language Specification as either Unicode
escapes (section 3.3) or other character escapes (section 3.10.6) It
is therefore necessary to double backslashes in string literals that
represent regular expressions to protect them from interpretation by
the Java bytecode compiler.
So you end up with "\\^" as a Java String.
However in general you will write a parser and interpreter for arithmetical expressions.
This subject is known as compiler construction and relies on knowledge of finite automata and formal languages.
A nice example can be found in chapter 10.2 "A Desk Calculator" of "The C++ Programming Language, Fourth Edition" by Bjarne Stroustrup. (Yes, I know you want to code in Java, please continue to read).
It features this formal grammar:
program:
end //end is end-of-input
expr_list end
expr_list:
expression print //pr intis newline or semicolon
expression print expr_list
expression:
expression + term
expression − term
term
term:
term / primary
term ∗ primary
primary
primary:
number //number is a ﬂoating-point literal
name //name is an identiﬁer
name = expression
−primary
( expression )
Stroustrup shows you how to code this in a similar structure:
double expr(bool get) //add and subtract
{
double left = term(get);
for(;;) { //‘‘forever’’
switch(ts.current().kind) {
case Kind::plus:
left += term(true);
break;
case Kind::minus:
left −= term(true);
break;
default:
return left;
}
}
}
this code embodies this part of the grammar:
expression:
expression + term
expression − term
term
This technique is called recursive descent, because recursion in the grammar will be implemented as a recursive function call in the corresponding code. The featured relevant C++ code should be understandable to a Java developer as well.
Once you got a bit familiar with grammars and how to implement them,
see e.g. Unambiguous grammar for exponentiation operation
how to deal with exponentiation in such a way, that the usual operator precedence rules are fulfilled.

Related

Java Regex: Allow only one blank space between two words [duplicate]

I am using the following regular expression without restricting any character length:
var test = /^(a-z|A-Z|0-9)*[^$%^&*;:,<>?()\""\']*$/ // Works fine
In the above when I am trying to restrict the characters length to 15 as below, it throws an error.
var test = /^(a-z|A-Z|0-9)*[^$%^&*;:,<>?()\""\']*${1,15}/ //**Uncaught SyntaxError: Invalid regular expression**
How can I make the above regular expression work with the characters limit to 15?

You cannot apply quantifiers to anchors. Instead, to restrict the length of the input string, use a lookahead anchored at the beginning:
// ECMAScript (JavaScript, C++)
^(?=.{1,15}$)[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*$
^^^^^^^^^^^
// Or, in flavors other than ECMAScript and Python
\A(?=.{1,15}\z)[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*\z
^^^^^^^^^^^^^^^
// Or, in Python
\A(?=.{1,15}\Z)[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*\Z
^^^^^^^^^^^^^^^
Also, I assume you wanted to match 0 or more letters or digits with (a-z|A-Z|0-9)*. It should look like [a-zA-Z0-9]* (i.e. use a character class here).
Why not use a limiting quantifier, like {1,15}, at the end?
Quantifiers are only applied to the subpattern to the left, be it a group or a character class, or a literal symbol. Thus, ^[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']{1,15}$ will effectively restrict the length of the second character class [^$%^&*;:,<>?()\"'] to 1 to 15 characters. The ^(?:[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*){1,15}$ will "restrict" the sequence of 2 subpatterns of unlimited length (as the * (and +, too) can match unlimited number of characters) to 1 to 15 times, and we still do not restrict the length of the whole input string.
How does the lookahead restriction work?
The (?=.{1,15}$) / (?=.{1,15}\z) / (?=.{1,15}\Z) positive lookahead appears right after ^/\A (note in Ruby, \A is the only anchor that matches only start of the whole string) start-of-string anchor. It is a zero-width assertion that only returns true or false after checking if its subpattern matches the subsequent characters. So, this lookahead tries to match any 1 to 15 (due to the limiting quantifier {1,15}) characters but a newline right at the end of the string (due to the $/\z/\Z anchor). If we remove the $ / \z / \Z anchor from the lookahead, the lookahead will only require the string to contain 1 to 15 characters, but the total string length can be any.
If the input string can contain a newline sequence, you should use [\s\S] portable any-character regex construct (it will work in JS and other common regex flavors):
// ECMAScript (JavaScript, C++)
^(?=[\s\S]{1,15}$)[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*$
^^^^^^^^^^^^^^^^^
// Or, in flavors other than ECMAScript and Python
\A(?=[\s\S]{1,15}\z)[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*\z
^^^^^^^^^^^^^^^^^^
// Or, in Python
\A(?=[\s\S]{1,15}\Z)[a-zA-Z0-9]*[^$%^&*;:,<>?()\"']*\Z
^^^^^^^^^^^^^^^^^^

Make regular expression fail for invalid input [duplicate]

This question already has answers here:
Java RegEx meta character (.) and ordinary dot?
(9 answers)
Closed 2 years ago.
Upon validation using regular expression in Java, I need to return true for height having values :
80cm
80.2cm
80.25cm
My regular expression is as follows :
(\d)(\d?)(.?)(\d?)(\d?)(c)(m)
However if I pass in height as 71-80cm , the regular expression returns true too.
What change should I make to the regular expression to return false when height is 71-80cm ?

. matches any character, so you need to have \\. or just \. depending on the source. Check out: Java RegEx meta character (.) and ordinary dot?
Furthermore, additional changes need to be made such that e.g. 8025cm is not accepted if that is what you want.

I assume that the OP wishes to match substrings of the form
abcm
where:
"cm" is a literal;
"cm" is not followed by a letter;
"b" is the string representation of a non-negative float or integer (e.g., "80" or "80.25", but not "08" or ".25"); and
"a" is a character other than "-", "+" and ".", unless "b" is at the beginning of the string, in which case "a" is an empty string.
If my assumptions are correct you could use the following regex to match b in abcm:
(?<![-+.\d])[1-9]\d*(?:\.\d+)?cm(?![a-zA-Z])
Demo
The regex engine performs the following operations:
(?<! # begin negative lookbehind
[-+.\d] # match '-', '+', '.' or a digit
) # end negative lookbehind
[1-9] # match digit other than zero
\d* # match 0+ digits
(?:\.\d+) # match '.' followed by 1+ digits in a non-cap grp
? # optionally match non-cap grp
cm # match 'cm'
(?![a-zA-Z]) # match a letter in a negative lookahead
If my assumptions about what is required are not correct it may be evident how my answer could be adjusted appropriately.

Ok, let's take your expression and clean it up a little. You don't need all the capturing groups (..), since all you're interested in is validating the complete string. For that reason you should also enclose the expression in line beginning ^ and line end $ anchors, so your expression can't match inside a larger string. Lastly, you can group the period and trailing digits together (?:), since you won't get one without the other as per your example data. Which gets us:
^\d\d?(?:\.\d\d?)?cm$
See regex demo.
Then in Java, that check could look like this:
boolean foundMatch = subjectString.matches("^\\d\\d?(?:\\.\\d\\d?)?cm$");

Regular expression that accepts only two digit integer or a floating number

I am trying to validate a text field that accepts number like 10.99, 1.99, 1, 10, 21.
\d{0,2}\.\d{1,2}
Above expression is only passing values such as 10.99, 11.99,1.99, but I want something that would satisfy my requirement.

Try this:
^\d{1,2}(\.\d{1,2})?$
^ - Match the start of string
\d{1,2} - Must contains at least 1 digit at most 2 digits
(\.\d{1,2}) - When decimal points occur must have a . with at least 1 and at most 2 digits
? - can have zero to 1 times
$ - Match the end of string

Assuming you don't want to allow edge cases like 00, and want at least 1 and at most 2 decimal places after the point mark:
^(?!00)\d\d?(\.\d\d?)?$
This precludes a required digit before the decimal point, ie ".12" would not match (you would have to enter "0.12", which is best practice).
If you're using String#matches(), you can drop the leading/trailing ^ and $, because that method must to match the entire string to return true.

First \d{0,2} does not seem to fit your requirement as in that case it will be valid for no number as well. It will give you the correct output but logically it does not mean to check no number in your string so you can change it to \d{1,2}
Now, in regex ? is for making things optional, you can use it with individual expression like below:
\d{1,2}\.?\d{0,2}
or you can use it on the combined expression like below
\d{1,2}(\.\d{1,2})?
You can also refer below list for further queries:
abc… Letters
123… Digits
\d Any Digit
\D Any Non-digit character
. Any Character
\. Period
[abc] Only a, b, or c
[^abc] Not a, b, nor c
[a-z] Characters a to z
[0-9] Numbers 0 to 9
\w Any Alphanumeric character
\W Any Non-alphanumeric character
{m} m Repetitions
{m,n} m to n Repetitions
* Zero or more repetitions
+ One or more repetitions
? Optional character
\s Any Whitespace
\S Any Non-whitespace character
^…$ Starts and ends
(…) Capture Group
(a(bc)) Capture Sub-group
(.*) Capture all
(abc|def) Matches abc or def
Useful link : https://regexone.com/

Can you try using this :
(\d{1,2}\.\d{1,2})|(\d{1,2})
Here is a Demo, you can check also simple program
You have two parts or two groups one to check the float numbers #.#, #.##, ##.##, ##.# and the second group to check the integer #, ##, so we can use the or |, float|integer

I think patterns of this type are best handled with alteration:
/^\s*([-+]?[0-9]*\.[0-9]+([eE][-+]?[0-9]+)?)$ #float
| # or
^(\d{1,2})$ # 2 digit int/mx
Demo

String.replaceAll() with regex gets messed up

I'm trying to implement the Swedish "Robbers language" in Java. It's basically just replacing each consonant with itself, followed by a "o", followed by itself again. I thought I had it working with this code
str.replaceAll("[bcdfghjklmnpqrstvwxz]+", "$0o$0");
but it fails when there are two or more subsequent consonants, for example
String str = "horse";
It should produce hohororsose, but instead I get hohorsorse. I'm guessing the replacement somehow messes up the matching indexes in the original string. How can I make it work?

str.replaceAll("[bcdfghjklmnpqrstvwxz]", "$0o$0");
Remove the + quantifier as it will group consonants.
// when using a greedy quantifier
horse
h | o | rs | e
hoh | o | rsors | e
A plus sign matches one or more of the preceding character, class, or subpattern. For example a+ matches ab and aaab. But unlike a* and
a?, the pattern a+ does not match at the beginning of strings that
lack an "a" character.
https://autohotkey.com/docs/misc/RegEx-QuickRef.htm

+ means: Between one and unlimited times, as many times as possible, giving back as needed (greedy)
+? means: Between one and unlimited times, as few times as possible, expanding as needed (lazy)
{1} means: Exactly 1 time (meaningless quantifier)
In your case you don't need a quantifier.
You can experiment with regular expressions online at https://regex101.com/

Semi colon separated alphanumeric

I need to validate the below string using regular expression in Java:
String alphanumericList ="[\"State\"; \"districtOne\";\"districtTwo\"]";
I have tried the following:
String pattern="^\\[ (\"[\\w]\")\\s+(?:\\s+;\\s+ (\"[\\w]\")+) \\]$";
String alphanumericList ="[\"State1\"; \"district1\";\"district2\"]";
But the validation fails.
Any help is appreciated.

I'll try and mark the possible issues with your expression (issue numbers above the chars):
1 4 2 3 1 4 5 1
"^\\[ (\"[\\w]\")\\s+(?:\\s+;\\s+ (\"[\\w]\")+) \\]$"
As you can see, there are at least 5 issues:
The spaces in your expression are interpreted literally, i.e. if the input doesn't contain them, it would not match. Most probably you want to remove those spaces.
You expect at least one whitespace character after the first group (\\s+), which the input doesn't seem to contain. You probably want to remove that or change the quantifier from + to *.
You expect at least one whitespace character before each semicolon. Together with no. 2 this would make at least two after the first group. The solution would be the same as for no. 2.
Your expression the strings between double quotes seems wrong. (\"[\\w]\")+ means "a double quote, a single word character, a double quote" and all at least once. Besides that, \w is already a character class, you the brackets around that are not needed here (unless you want to add more classes or characters inside). You probably want (\"\\w+\") instead.
Additionally to 4 your non-capturing group that contains the semicolon ((?:\\s+;\\s+ (\"[\\w]\")+)) doesn't have a quantifier, i.e. it would be expected exactly once. You probably want to put the quantifier + or * after that group.
Another point that's not a direct issue is the capturing group around \"[\\w]\". Since you seem to want to match multiple strings after semicolons you'd only be able to capture one of the matching groups. Hence you'd most probably not be able to do what you intended anyways and thus the group is not necessary.
That said the fixed original expression would look like this:
pattern = "^\\[(\"\\w+\")(?:\\s*;\\s+\"\\w+\")+\\]$"

You are looking for this pattern:
String pattern = "\\[\\s*\"[^\"]*\"\\s*(?:;\\s*\"[^\"]*\"\\s*)*+\\]";
No need to add anchors since there are implicit if you use the matches() method since this method is the more appropriate for validation tasks.
pattern details:
\\[ # a literal opening square bracket
\\s* # optional whitespaces
\" # literal quote
[^\"]* # content between quotes: chars that are not a quote (zero or more)
\"
\\s*
(?: # non-capturing group:
; # a literal semi-colon
\\s*
\" # quoted content
[^\"]*
\"
\\s*
)*+ # repeat this group zero or more time (with a possessive quantifier)
\\] # a literal closing square bracket
The possessive quantifier prevent the regex engine to backtrack into repeated non-capturing groups if the closing square bracket is not present. It is a security to prevent uneeded backtracking and to make the pattern fail faster. Not that you can make possessive other quantifiers too before the non-capturing group for the same reason. More about possessive quantifiers.
I decided to describe the content between quotes in this way: \"[^\"]*\", but you can be more restrictive, allowing for example only words characters: \"\\w*\" or more general, allowing escaped quotes: \"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*+\"

Try this
static final String HEAD = "^\\[\\s*";
static final String TAIL = "\\s*\\]$";
static final String SEP = "\\s*;\\s*";
static final String ITEM = "\"[^\"]*\"";
static final String PAT = HEAD + ITEM + "(" + SEP + ITEM + ")*" + TAIL;

Try:
pattern = "^\\[(\"\\w+\";\\s*)*(\"\\w+\")\\]$";

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Replace symbols in string [closed] - java

Related

Java Regex: Allow only one blank space between two words [duplicate]

Make regular expression fail for invalid input [duplicate]

Regular expression that accepts only two digit integer or a floating number

String.replaceAll() with regex gets messed up

Semi colon separated alphanumeric

Categories

Resources