I'm trying to split the input given by the user for my calculator.
For example,
if the user inputs "23+45*(1+1)" I want this to be split into [23,+,45,*,(,1,+,1,)].
What you're looking for is called a lexer. A lexer splits up input into chunks (called tokens) that you can read.
Fortunately, your lexer is pretty simple and can be written by hand. For more complicated lexers, you can use flex (as in "The Fast Lexical Analyzer"--not Adobe Flex), or (since you're using Java) ANTLR (note, ANTLR is much more than just a lexer).
Simply come up with a list of regular expressions, one for each token to match. (Since your input is so simple, you could probably merge them all into a single regex; for more advanced lexers, it helps to keep one regex per token.) For example:
\d+
\+
-
\*
/
\(
\)
Then start a loop: while there are more characters to be parsed, go through each of your regular expressions and attempt to match them against the beginning of the string. If they match, add the first matched group to your list of input. Otherwise, continue matching (if none of them match, tell the user they have a syntax error).
Pseudocode:
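List<String> input = new LinkedList<String>();
while (userInputString.length() > 0) {
    boolean matched = false;
    for (final Pattern p : myRegexes) {
        final Matcher m = p.matcher(userInputString);
        // Every regex starts with ^, so find() can only match at the start of the string.
        if (m.find()) {
            input.add(m.group());
            // Remove the token we found from the user's input string so that we
            // can match the rest of the string against our regular expressions.
            userInputString = userInputString.substring(m.group().length());
            matched = true;
            break;
        }
    }
    // No regex matched: report a syntax error instead of looping forever.
    if (!matched) throw new IllegalArgumentException("Syntax error at: " + userInputString);
}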
Implementation notes:
You may want to prepend the ^ character to all of your regular expressions. This makes sure you anchor your matches against the beginning of the string. My pseudocode assumes you have done this.
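As a rough sketch of the "merge them all into one single regex" variant mentioned above: the class name CalculatorLexer and the method name tokenize are made up for illustration, and the token set is assumed to be just unsigned integers, the four operators and parentheses. Instead of prepending ^ and cutting the string, it uses Matcher.region together with lookingAt to anchor each match at the current position.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CalculatorLexer {
    // One alternative per token type: a run of digits, or a single operator/parenthesis.
    private static final Pattern TOKEN = Pattern.compile("\\d+|[-+*/()]");

    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // Restrict matching to the unread part; lookingAt() must match at its start.
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Syntax error at index " + pos);
            }
            tokens.add(m.group());
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("23+45*(1+1)")); // [23, +, 45, *, (, 1, +, 1, )]
    }
}
```

Whitespace is not handled here, so an input such as "1 + 2" is reported as a syntax error; adding a \s+ alternative and skipping those matches would be one way to allow it.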
I think using stacks to split the operand and operator and evaluate the expression would be more appropriate. In the calculator we generally use Infix notation to define the arithmetic expression.
Operand1 op Operand2
Check the Shunting-yard algorithm used in many such cases to parse the mathematical expression. This is also a good read.
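For readers who want to see the idea in code, here is a minimal sketch of the shunting-yard algorithm that turns an already-split infix token list into postfix (reverse Polish) order. The class and method names are invented for this example, and it assumes only the left-associative operators + - * / plus parentheses, with every other token treated as an operand.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ShuntingYard {
    private static boolean isOperator(String t) {
        return t.length() == 1 && "+-*/".contains(t);
    }

    private static int precedence(String op) {
        return (op.equals("*") || op.equals("/")) ? 2 : 1;
    }

    public static List<String> toPostfix(List<String> tokens) {
        List<String> output = new ArrayList<>();
        Deque<String> stack = new ArrayDeque<>();
        for (String t : tokens) {
            if (isOperator(t)) {
                // Left-associative: first pop operators of greater or equal precedence.
                while (!stack.isEmpty() && isOperator(stack.peek())
                        && precedence(stack.peek()) >= precedence(t)) {
                    output.add(stack.pop());
                }
                stack.push(t);
            } else if (t.equals("(")) {
                stack.push(t);
            } else if (t.equals(")")) {
                // Pop back to the matching "(" and discard it.
                while (!stack.isEmpty() && !stack.peek().equals("(")) {
                    output.add(stack.pop());
                }
                if (stack.isEmpty()) {
                    throw new IllegalArgumentException("Unbalanced parentheses");
                }
                stack.pop();
            } else {
                output.add(t); // operand
            }
        }
        while (!stack.isEmpty()) {
            String op = stack.pop();
            if (op.equals("(")) {
                throw new IllegalArgumentException("Unbalanced parentheses");
            }
            output.add(op);
        }
        return output;
    }
}
```

The postfix list can then be evaluated with a single stack of numbers: push operands, and for each operator pop two values and push the result.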
This might be a little sloppy, because I am learning still, but it does split them into strings.
import java.util.ArrayList;
import java.util.Scanner;

public class TestClass {
    public static void main(String[] args)
    {
        Scanner sc = new Scanner(System.in);
        ArrayList<String> separatedInput = new ArrayList<String>();
        System.out.print("Values: ");
        String input = sc.next();
        // Digits collected so far for the number currently being read.
        String numbers = "";
        for (int i = 0; i < input.length(); i++)
        {
            char ch = input.charAt(i);
            if (Character.isDigit(ch))
            { numbers = numbers + ch; }
            else
            {
                // An operator or parenthesis ends the current number, if there is one.
                if (!numbers.isEmpty())
                { separatedInput.add(numbers); numbers = ""; }
                separatedInput.add(String.valueOf(ch));
            }
        }
        // Add a number that ends the input.
        if (!numbers.isEmpty())
        { separatedInput.add(numbers); }
        System.out.println(separatedInput);
    }
}
I am trying to split a given string using Java's split method. The string should be divided at two different characters (+ and -), and I want to keep those characters in the array as well, in the same element as the substring they precede.
for example :
input : String s = "4x^2+3x-2"
output :
arr[0] = 4x^2
arr[1] = +3x
arr[2] = -2
I know how to get the + or - characters in a different index between the numbers but it is not helping me,
any suggestions please?
You can approach this problem in many ways. I'm sure there are clever and fancy ways to split this expression. I will show you the simplest problem-solving process that can help you.
State the problem you need to solve, the input and output
Problem: Split a math expression into subexpressions at + and - signs
Input: 4x^2+3x-2
Output: 4x^2,+3x,-2
Create a pseudo code with some logic you might think works
Given an expression string
Create an empty list of expressions
Create a subExpression string
For each character in the expression
Check if the character is + or - then
add the subExpression to the list and create a new empty subExpression
otherwise, append the character to the subExpression
In the end, add the remaining subExpression to the list
Implement the pseudo-code in the programming language of your choice
String expression = "4x^2+3x-2";
List<String> expressions = new ArrayList<>();
StringBuilder subExpression = new StringBuilder();
for (int i = 0; i < expression.length(); i++) {
char character = expression.charAt(i);
if (character == '-' || character == '+') {
expressions.add(subExpression.toString());
subExpression = new StringBuilder(String.valueOf(character));
} else {
subExpression.append(String.valueOf(character));
}
}
expressions.add(subExpression.toString());
System.out.println(expressions);
Output
[4x^2, +3x, -2]
You will end up with an algorithm that works for your problem. You can then start to improve it.
Try this code:
String s = "4x^2+3x-2";
s = s.replace("+", "#+");
s = s.replace("-", "#-");
String[] ss = s.split("#");
for (int i = 0; i < ss.length; i++) {
Log.e("XOP",ss[i]);
}
This code replaces + and - with #+ and #- respectively and then splits the string with #. That way the + and - operators are not lost in the result.
If you require # as input character then you can use any other Unicode character instead of #.
Try this one:
String s = "4x^2+3x-2";
String[] arr = s.split("[\\+-]");
for(int i=0;i<arr.length;i++){
System.out.println(arr[i]);
}
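Note that splitting on [\\+-] consumes the operators, so the snippet above prints 4x^2, 3x and 2. If the signs should stay attached to the terms, as in the question, one possible variation is to split on a zero-width lookahead instead. This is only a sketch; the class and method names are invented for the example.

```java
import java.util.Arrays;

public class SplitKeepingSigns {
    public static String[] splitTerms(String s) {
        // (?=[+-]) matches the empty position just before a + or -,
        // so nothing is removed from the string when splitting.
        return s.split("(?=[+-])");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitTerms("4x^2+3x-2"))); // [4x^2, +3x, -2]
    }
}
```

On Java 8 and later a zero-width match at the very start of the input does not produce an empty leading element, so an expression that begins with a sign, such as -2x+1, yields [-2x, +1].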
Personally I like it better to have positive matches of patterns, especially if the split pattern itself is empty.
So for instance you could use a Pattern and Matcher like this:
Pattern p = Pattern.compile("(^|[+-])([^+-]*)");
Matcher m = p.matcher("4x^2+3x-2");
while (m.find()) {
System.out.printf("%s or %s %s%n", m.group(), m.group(1), m.group(2));
}
This matches the start of the string or a plus or minus: ^|[+-], followed by any amount of characters that are not a plus or minus: [^+-]*.
Do note that the ^ first matches the start of the string, and is then used to negate a character class when used between brackets. Regular expressions are tricky like that.
Bonus: you can also use the two groups (within the parentheses in the pattern) to match the operators - if any.
All this is presuming that you want to use/test regular expressions; generally things like this require a parser rather than a regular expression.
A one-liner (Java 10 or later, for var and Matcher.results()) for persons thinking that this is too complex:
var expressions = Pattern.compile("(^|[+-])[^+-]*")
.matcher("4x^2+3x-2")
.results()
.map(r -> r.group())
.collect(Collectors.toList());
If I have an expression in a string variable like 20+567-321, how can I extract the last number, 321, from it, where the operator can be +, -, * or /?
If the string expression is just 321, I have to get 321; in that case there is no operator in the expression.
You can do this by splitting your string based on your operators as following:
String[] result = myString.split("[-+*/]");
[-+*/] is a regex character class that specifies the characters at which your string should be split. Here, result[result.length-1] is your required string.
EDIT
As suggested by @ElliotFrisch, - has a special meaning (it forms a range) inside a character class unless it comes first or last, so it is safer to escape it. The following pattern should also work:
String[] result = myString.split("[+\\-*/]");
Here is the list of characters that need to be escaped.
Link.
This seems to be an assignment for learning programming and algo, and also I doubt splitting using Regex would be efficient in a case where only last substring is required.
Start from the end of the string and iterate backwards, at most once per character.
Declare an empty string, say Result.
While looping, if any of those operators is found, return Result; otherwise prepend the traversed character to Result.
If the loop ends without finding an operator, return Result.
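The steps above could be sketched in Java roughly as follows. The class and method names are made up for illustration, and the input is assumed to contain only digits and the operators + - * /, with no spaces.

```java
public class LastOperand {
    public static String lastOperand(String expression) {
        StringBuilder result = new StringBuilder();
        // Walk backwards from the last character.
        for (int i = expression.length() - 1; i >= 0; i--) {
            char c = expression.charAt(i);
            if ("+-*/".indexOf(c) >= 0) {
                break; // reached the operator in front of the last number
            }
            result.insert(0, c); // prepend so the digits keep their order
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(lastOperand("20+567-321")); // 321
        System.out.println(lastOperand("321"));        // 321
    }
}
```

Only the tail of the string is examined, so no array of all operands is built along the way.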
String[] output = s.split("[-+/*]");
String ans = output[output.length-1];
The assumption here is that there will be no spaces and the string contains only numbers and arithmetic operators.
[-+/*] is a regular expression that matches only the characters we provide inside the square brackets. The - comes first so that it is taken literally; written as [+-/*], the +-/ part would be a range that also matches the , and . characters. We are splitting based on those characters.
If you wanna do it with StringTokenizer:
public static void main(String args[])
{
String expression = "20+567-321";
StringTokenizer tokenizer = new StringTokenizer(expression, "+-*/");
int count = tokenizer.countTokens();
if( count > 0){
for(int i=0; i< count; i++){
if(i == count - 1 ){
System.out.println(tokenizer.nextToken());
}else{
tokenizer.nextToken();
}
}
}
}
Recall you can specify multiple delimiters in StringTokenizer.
String str="b5l*a+i";//trying to replace the characters by user input (integer)
StringBuffer sb=new StringBuffer(str);
for(int i=0;i<sb.length();i++)
{
for(int j='a';j<='z';j++)
{
if(sb.charAt(i)==j)
{
System.out.println("Enter value for "+j);
int ip=sc.nextInt();
char temp=(char)ip;
//here how to replace the characters by int????
}
}
}
/* finally it will look like enter value b 4 enter value a 5 enter value i 6 the output is 451*5+6 */
With a Regular Expression
You should use regular expressions: they are more elegant and much more powerful. For example, without changing a line of code, you can use variable names that are more than one letter long.
Example
Scanner sc = new Scanner(System.in);
String str = "b5l*a+i";
// This pattern could be declared as a constant
// It matches any sequence of alpha characters
Pattern pattern = Pattern.compile("[a-zA-Z]+");
Matcher matcher = pattern.matcher(str);
StringBuffer result = new StringBuffer();
// For each match ...
while(matcher.find()) {
// matcher.group() returns the match found
System.out.println("Enter value for "+ matcher.group());
Integer input = sc.nextInt();
// ... append the parsed string with replacement of the match ...
matcher.appendReplacement(result, input.toString());
}
// ... and don't forget to append tail to add characters that follow the last match
matcher.appendTail(result);
System.out.println(result);
Taking your code and adapting it by following up on Kevin Anderson's comment, this seems to do what you're looking for:
Scanner sc = new Scanner(System.in);
String str="b5l*a+i";
StringBuffer sb=new StringBuffer(str);
for(int i=0;i<sb.length();i++)
{
for(int j='a';j<='z';j++)
{
if(sb.charAt(i)==j)
{
System.out.println("Enter value for "+(char)j);
int ip=sc.nextInt();
sb.deleteCharAt(i);
sb.insert(i, ip);
}
}
}
Might I also suggest this code which would behave similarly?
Scanner sc = new Scanner(System.in);
String str="b5l*a+i";
StringBuffer sb=new StringBuffer(str);
for(int i=0;i<sb.length();i++)
{
char original = sb.charAt(i);
if(original >= 'a' && original <= 'z')
{
System.out.println("Enter value for "+original);
int ip=sc.nextInt();
sb.deleteCharAt(i);
sb.insert(i, ip);
}
}
It should be more efficient, as it does not have to loop through all 26 letters for every character of the string.
EDIT
After seeing @Sebastien's excellent answer, and applying some more of my own changes, I believe the following is an even better solution than the above ones, if it fits your project's constraints.
Scanner sc = new Scanner(System.in);
String str = "b5l*a+i";
Matcher matcher = Pattern.compile("[a-z]").matcher(str);
StringBuilder sb = new StringBuilder(str);
while (matcher.find())
{
System.out.println("Enter value for " + matcher.group());
int ip = sc.nextInt();
sb.setCharAt(matcher.start(), Character.forDigit(ip, 10));
}
Here is what is better:
Pattern matching with regular expressions. This way you don't need to manually search through each character of the String and check if it is a letter, then decide what to do with it. You can let the Matcher do that for you. The regular expression [a-z] means "exactly one character in the range of a to z". The matcher.find() method returns true each time it finds a new match for that expression as it moves through the String, and false when there are no more. Then, matcher.group() gets the character from the last find() operation (as a String, but that doesn't matter to us). matcher.start() gets the index of the match (the methods are named start() and end() because a typical match will be more than one character and have a start and end index, but only start() matters to us).
Switched to StringBuilder. StringBuilder is the newer, unsynchronized counterpart of StringBuffer. It is generally preferred unless you need the buffer to be thread-safe (which you don't, unless you know for sure that you do). Its setCharAt method (which StringBuffer also has) does exactly what we need: we pass the index of the char we intend to change (which the Matcher conveniently provides) and the new character we got from the input. We must first make a character out of the int, using the static method Character.forDigit. The first argument is the digit we read from input, and the second is the radix, which determines which digits are valid (for example, in base 10 the input 10 is invalid, but in base 16, hexadecimal, it returns 'a'). We pass 10 because decimal is the number system the user will be typing in. If the input is invalid (i.e. 10 or more, or less than 0), forDigit returns the null character '\u0000', so you may want to pull that call out into a variable, check for the null character first, and handle the input accordingly, something like the following:
char ipChar = Character.forDigit(ip, 10);
if (ipChar == '\u0000') throw new MyCustomNotADigitException("error message");
sb.setCharAt(matcher.start(), ipChar);
String str = "muthu", str1 = "";
int n = 5;
for (int i = 0; i < str.length(); i++) {
if (str.charAt(i) == 'u') {
str1 = str1 + n;
} else
str1 = str1 + str.charAt(i);
}
My program takes a long string from the user, such as aaaabaaaaaba, and should
replace each aaa with 0 and each aba with 1. Sequences must not overlap: every
group of three characters is matched on its own. For example, aaaabaaabaaaaba
splits into the separate groups aaa-aba-aab-aaa-aba. Please help me write this
program.
Example: for the input aaaabaaaaaba the output is 0101.
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Pattern1 {
Scanner sc =new Scanner(System.in);
public void m1()
{ String s;
System.out.println("enter a string");
s=sc.nextLine();
assertTrue(s!=null);
Pattern p = Pattern.compile(s);
Matcher m =p.matcher(".(aaa");
Matcher m1 =p.matcher("aba");
while(m.find())
{
s.replaceAll(s, "1");
}
while(m1.find())
{
s.replaceAll(s, "0");
}
System.out.println(s);
}
private boolean assertTrue(boolean b) {
return b;
// TODO Auto-generated method stub
}
public static void main(String[] args) {
Pattern1 p = new Pattern1();
p.m1();
}
}
With regex and find you can search for each successive match and then add a 0 or 1 depending on the characters to the output.
String test = "aaaabaaaaabaaaa";
Pattern compile = Pattern.compile("(?<triplet>(aaa)|(aba))");
Matcher matcher = compile.matcher(test);
StringBuilder out = new StringBuilder();
int start = 0;
while (matcher.find(start)) {
String triplet = matcher.group("triplet");
switch (triplet) {
case "aaa":
out.append("0");
break;
case "aba":
out.append("1");
break;
}
start = matcher.end();
}
System.out.println(out.toString());
If you have "aaaaaba" (one a too many after the first triplet) as input, it will skip the extra "a" and output "01". So any invalid characters between valid triplets will be ignored.
If you want to go through the string in blocks of 3, you can use a for-loop and the substring() function like this:
String test = "aaaabaaaaabaaaa";
StringBuilder out = new StringBuilder();
for (int i = 0; i < test.length() - 2; i += 3) {
String triplet = test.substring(i, i + 3);
switch (triplet) {
case "aaa":
out.append("0");
break;
case "aba":
out.append("1");
break;
}
}
System.out.println(out.toString());
In this case, if a triplet is invalid, it will just be ignored and neither a "0" nor a "1" will be added to the output. If you want to do something in this case, just add a default clause to the switch statement.
Here's what I understand from your question:
The user string will be some sequence of the tokens "aaa" and "aba"
There will be no other combinations of 'a' and 'b'. For example, you will not get "aaabaa" as an input string, as "baa" is invalid.
For each consecutive 3 character string, replace "aaa" with 0 and "aba" with 1.
I'm guessing that this is a homework assignment designed to teach you about the dangers of catastrophic backtracking and how to carefully use quantifiers.
My suggestion would be to do this in two parts:
Identify and replace each 3-letter segment with a single character.
Replace those characters with the appropriate value. ('1' or '0')
For example, first construct a pattern like a([ab])a to capture the character ('a' or 'b') between two 'a's. Then, use the Matcher class' replaceAll method to replace each match with the captured character. So, for input aaaabaaaaaba you get abab as a result. Finally, replace all 'a' with '0' and all 'b' with '1'.
In Java:
// Create the matcher to identify triplets in the form "aaa" or "aba"
Matcher tripletMatcher = Pattern.compile("a([ab])a").matcher(inputString);
// Replace each triplet with the middle letter, then replace 'a' and 'b' properly.
String result = tripletMatcher.replaceAll("$1").replace('a', '0').replace('b', '1');
There are better ways of doing this, of course, but this should work. I've left the code intentionally dense and hard to read quickly. So, if this is a homework assignment, make sure you understand it fully and then rewrite it yourself.
Also, keep in mind that this will not work if the input string isn't a sequence of "aaa" and "aba". Any other combination, such as "baa" or "abb", is not reported as an error; it silently produces wrong output. For example, ababaa, aababa, and aaabab will all give unexpected and potentially incorrect results.
I am trying to write regular expression to restrict some characters. The character to restrict is based on the requirement from various users.
I am trying to use this regex - [(char1|char2|char3|...)$]
Note: Each char will be from requirement.
If the string entered by the user matches any of the characters, I'll return true. Now,
what I want to know is whether this expression will work for all the conditions?
For example - requirement1 = .:, requirement2 = .:&%
I will concatenate | between each char and then generate the regular expression in Java. This works for my requirement1 but not for requirement2.
my sample java code
String requirement = ":>&%";
String regExp1 = null;
for (int i = 0; i < requirement.length(); i++) {
regExp1 = "[(" + requirement.charAt(i);
if (i - 1 != requirement.length()) {
regExp1.concat("|");
}
}
if (regExp1 != null) {
regExp1.concat(")]$");
}
Pattern p = Pattern.compile(regExp);
Matcher m = p.matcher(arg);
if (m.find())
return true;
else
return false;
How can I generate standard regular expression?
If you want "one of these characters", the brackets are good enough. No need for parentheses and pipes.
Something like this: [.:,] and [.:&%] may work. If you want them one or more times, you have to add + at the end of your regex (e.g. [.:&%]+).
As said in the comments, beware of special chars (like the dot, which means any char in a regex outside brackets).
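To make the bracket approach concrete, here is one possible way to build the character class from a requirement string. The class and method names are invented for this sketch, and it assumes the requirement string is non-empty. Every character that is not a letter or digit is prefixed with a backslash, which Java's Pattern allows for any non-alphabetic character, so symbols such as ^, - or ] are taken literally inside the brackets.

```java
import java.util.regex.Pattern;

public class RestrictedChars {
    // Builds a character class such as [\.\:\&\%] from ".:&%".
    public static Pattern buildPattern(String restricted) {
        StringBuilder cls = new StringBuilder("[");
        for (char c : restricted.toCharArray()) {
            if (!Character.isLetterOrDigit(c)) {
                cls.append('\\'); // escape symbols so none of them act as regex syntax
            }
            cls.append(c);
        }
        cls.append(']');
        return Pattern.compile(cls.toString());
    }

    // True if text contains at least one of the restricted characters.
    public static boolean containsRestricted(String restricted, String text) {
        return buildPattern(restricted).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsRestricted(".:&%", "a&b")); // true
        System.out.println(containsRestricted(".:&%", "abc")); // false
    }
}
```

If the same requirement is checked many times, compiling the Pattern once and reusing it avoids rebuilding it on every call.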