How do you parse non-standard form function?

A standard form function like A*B+A*B' is easy to parse (splitting by + and then splitting by *). How do you parse a function if it doesn't take a standard form?
Example: a function can take the following forms:
A*B+A(A+B')
A*B+(A+B')A
A*B+A*B(A+B)
Any ideas?
P.S: I would like to parse the function in Java.

A standard form function like A*B+A*B' is easy to parse (splitting by + and then splitting by *).
Good. Now, all that's left is to deal with those pesky parentheses. First, we will remove them with String.split, and then we will add the necessary logic to carry out the multiplications:
Once you have split the string A(A+B')C on the parentheses, you will end up with an array of three strings: A, A+B', and C. Notice that with this method the strings at odd indices are ALWAYS the ones inside the parentheses. So all we have to do is check whether the last character of the string before a parenthesized piece and the first character of the string after it are letters (A, B, C) or operators (*, +).
String firstString = "A*B+A*B(A+B)+A*B+A*B(A+B)";
// Splitting on either parenthesis leaves the parenthesized pieces at odd indices.
String[] masterArray = firstString.split("[()]");
for (int i = 0; i + 1 < masterArray.length; i += 2) {
    // Assumes the piece to the left of the parenthesis is not empty.
    StringBuilder leftOfParenthesis = new StringBuilder(masterArray[i]);
    String insideParenthesis = masterArray[i + 1];
    // The piece to the right may be missing at the end of the input.
    String rightOfParenthesis = (i + 2 < masterArray.length) ? masterArray[i + 2] : "";
    String last = leftOfParenthesis.substring(leftOfParenthesis.length() - 1);
    String first = rightOfParenthesis.isEmpty() ? "" : rightOfParenthesis.substring(0, 1);
    boolean lastIsLetter = Character.isLetter(last.charAt(0));
    boolean firstIsLetter = !first.isEmpty() && Character.isLetter(first.charAt(0));
    if (lastIsLetter && firstIsLetter) {
        leftOfParenthesis.append("*" + insideParenthesis + "*" +
            last + "+" + last + "*" + insideParenthesis + "*" + first);
    } else if (lastIsLetter) {
        leftOfParenthesis.append("*" + insideParenthesis + "*" + last);
    } else if (firstIsLetter) {
        leftOfParenthesis.append("+" + first + "*" + insideParenthesis + "*");
    }
    // Keep the expanded piece in place of the original one.
    masterArray[i] = leftOfParenthesis.toString();
}
That's the basic logic. Note the bounds checks around masterArray[i + 2]: without them you run past the end of the array when there aren't that many terms left. This also isn't fully general: if you have parentheses inside parentheses, or more than two terms inside a pair of parentheses, you will have to add special logic to deal with that.

Rather than trying to parse with ad hoc methods (which always ends badly), you are better off:
writing a BNF grammar for your expression forms, in all variants
coding a recursive descent parser (see https://stackoverflow.com/a/2336769/120163)
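A minimal sketch of what such a parser could look like for the forms in the question. It assumes single-letter variables, ' as a postfix complement, and implicit multiplication when two factors are written side by side. The grammar in the comments and the class name BoolExprParser are assumptions made for illustration and are not taken from the linked answer. The parser only re-emits the expression with every * made explicit; building a tree or evaluating would hang off the same three methods.

```java
// Grammar (assumed for this sketch):
//   expr   := term ('+' term)*
//   term   := factor ('*'? factor)*        // '*' is optional: A(B) means A*(B)
//   factor := (LETTER | '(' expr ')') '\''*
class BoolExprParser {
    private final String src;
    private int pos = 0;

    BoolExprParser(String src) {
        this.src = src.replaceAll("\\s+", ""); // ignore whitespace
    }

    /** Parses the whole input and returns it with explicit '*' operators. */
    String parse() {
        String result = expr();
        if (pos < src.length()) {
            throw new IllegalArgumentException("Unexpected '" + src.charAt(pos) + "' at " + pos);
        }
        return result;
    }

    private String expr() {
        StringBuilder sb = new StringBuilder(term());
        while (peek() == '+') {
            pos++;
            sb.append('+').append(term());
        }
        return sb.toString();
    }

    private String term() {
        StringBuilder sb = new StringBuilder(factor());
        while (true) {
            char c = peek();
            if (c == '*') {
                pos++;                       // explicit multiplication
            } else if (!Character.isLetter(c) && c != '(') {
                break;                       // nothing left to multiply with
            }
            sb.append('*').append(factor());
        }
        return sb.toString();
    }

    private String factor() {
        char c = peek();
        StringBuilder sb = new StringBuilder();
        if (Character.isLetter(c)) {
            pos++;
            sb.append(c);
        } else if (c == '(') {
            pos++;
            String inner = expr();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at " + pos);
            }
            pos++;
            sb.append('(').append(inner).append(')');
        } else {
            throw new IllegalArgumentException("Unexpected input at " + pos);
        }
        while (peek() == '\'') {             // any number of complement marks
            pos++;
            sb.append('\'');
        }
        return sb.toString();
    }

    // Returns the current character, or '\0' at the end of the input.
    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    public static void main(String[] args) {
        System.out.println(new BoolExprParser("A*B+A(A+B')").parse());   // A*B+A*(A+B')
        System.out.println(new BoolExprParser("A*B+(A+B')A").parse());   // A*B+(A+B')*A
        System.out.println(new BoolExprParser("A*B+A*B(A+B)").parse());  // A*B+A*B*(A+B)
    }
}
```

Nested parentheses need no special handling here, because factor() calls back into expr().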

Related

What code would I use to put parentheses around all the occurrences of a term/substring?

I know how to do it... but the way I'm thinking is complicated and has a lot of room for errors. I'm still learning Java, but I have learned that Java has ways of doing just about anything. Is there some way I can put parentheses around each occurrence of a substring? (see an example below)
Original String: "abcabcabcd"
Search For: "abc"
Final Output: "(abc)(abc)(abc)d"

The easiest way here is to use String::replaceAll
String str = "abcabcabcd";
String sub = "abc";
System.out.println(str.replaceAll(sub, "(" + sub + ")"));
As pointed out by @Jacob G., String::replace may be preferred here because there is no regex element needed.
Output:
(abc)(abc)(abc)d
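The difference between the two methods shows up as soon as the search term contains a regex metacharacter. A small sketch, with wrapAll and ParenthesizeDemo as hypothetical names: String.replace treats both arguments literally, so no escaping is needed.

```java
class ParenthesizeDemo {
    // Wraps every literal occurrence of sub in parentheses.
    // String.replace does not interpret sub as a regular expression.
    static String wrapAll(String input, String sub) {
        return input.replace(sub, "(" + sub + ")");
    }

    public static void main(String[] args) {
        System.out.println(wrapAll("abcabcabcd", "abc")); // (abc)(abc)(abc)d
        System.out.println(wrapAll("a.b axb", "a.b"));    // (a.b) axb
    }
}
```

With replaceAll, the second call would also wrap "axb", because the dot in "a.b" matches any character.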

If you are interested in another solution, there is one with recursion, just for educational purposes:
private static String parentheses(String input, String template) {
int start = input.indexOf(template);
if (start == -1) {
return input;
}
return input.substring(0, start) +
"(" + input.substring(start, start + template.length()) + ")" +
parentheses(input.substring(start + template.length()), template);
}

How can I check if ArrayMap.keySet() contains a certain variable + Regex?

I have an ArrayMap, of which the keys are something like tag - randomWord. I want to check if the tag part of the key matches a certain variable.
I have tried messing around with Patterns, but to no success. The only way I can get this working at this moment, is iterating through all the keys in a for loop, then splitting the key on ' - ', and getting the first value from that, to compare to my variable.
for (String s : testArray) {
if ((s.split("(\\s)(-)(\\s)(.*)")[0]).equals(variableA)) {
// Do stuff
}
}
This seems very devious to me, especially since I only need to know if the keySet contains the variable, that's all I'm interested in. I was thinking about using the contains() method, and put in (variableA + "(\\s)(-)(\\s)(.*)"), but that doesn't seem to work.
Is there a way to use the .contains() method for this case, or do I have to loop the keys manually?

You should split these tasks into two steps - first extract the tag, then compare it. Your code should look something like this:
for (String s : testArray) {
if (arrayMap.keySet().contains(extractTag(s))) {
// Do stuff
}
}
Notice that we've separated our concerns into two steps, making it easier to verify each step behaves correctly individually. So now the question is "How do we implement extractTag()?"
The ( ) symbols in a regular expression create a group match, which you can retrieve via Matcher.group() - if you only care about tag you could use a Pattern like so:
"(\\S+)\\s-\\s.*"
In which case your extractTag() method would look like:
private static final Pattern TAG_PATTERN = Pattern.compile("(\\S+)\\s-\\s.*");
private static String extractTag(String s) {
Matcher m = TAG_PATTERN.matcher(s);
if (m.matches()) {
return m.group(1);
}
throw new IllegalArgumentException(
"'" + s + "' didn't match " + TAG_PATTERN.pattern());
}
If you'd rather use String.split(), you just need to define a regular expression that matches the delimiter, in this case a dash surrounded by whitespace; you could use the following regular expression in a split() call:
"\\s-\\s"
It's often a good idea to use + after \\s to support one or more spaces, but it depends on what inputs you need to process. If you know it's always exactly one-space-followed-by-one-dash-followed-by-one-space, you could just split on:
" - "
In which case your extractTag() method would look like:
private static String extractTag(String s) {
String[] parts = s.split(" - ");
if (parts.length > 1) {
return parts[0];
}
throw new IllegalArgumentException("Could not extract tag from '" + s + "'");
}
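If all you need to know is whether any key carries the tag, the loop can also be written as a single stream expression. This is only a sketch under assumptions: TagLookup and containsTag are hypothetical names, the keys are assumed to look exactly like "tag - randomWord" with one space on each side of the dash, and a plain Set<String> stands in for the result of keySet(). It needs Java 8 streams; where those are unavailable, a plain loop with startsWith does the same thing.

```java
import java.util.Set;

class TagLookup {
    // Returns true if at least one key starts with "<tag> - ".
    static boolean containsTag(Set<String> keys, String tag) {
        String prefix = tag + " - ";
        return keys.stream().anyMatch(key -> key.startsWith(prefix));
    }
}
```

A call such as TagLookup.containsTag(arrayMap.keySet(), variableA) would then replace the manual loop. Note that contains() on the key set cannot do this, since it compares whole keys with equals() and never interprets its argument as a pattern.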

Java Regular expression with multi variable and arrayList of string

I created this Java method:
public String isInTheList(List<String> listOfStrings)
{
/*
* Iterates through the list, and if the list contains the input of the user,
* it will be returned.
*/
for(String string : listOfStrings)
{
if(this.answer.matches("(?i).*" + string + ".*"))
{
return string;
}
}
return null;
}
I use this method in a while block in order to validate user input. I want to check if that input matches the concatenation of two different predefined ArrayLists of Strings.
The format of the input must be like this:
(elementOfThefirstList + " " + elementOfTheSecondList)
where the Strings elementOfThefirstList and elementOfTheSecondList are both elements from their respective list.
for(int i = 0; i < firstListOfString.size(); i++)
{
if(userInput.contains(firstListOfString.get(i) + " " + userInput.isInTheList(secondListOfString)))
{
isValid = true;//condition for exit from the while block
}
}
It works if the user input is like this:
elementOfThefirstList + " " + elementOfTheSecondList
However, it will also work if the user input is like this:
elementOfThefirstList + " " + elementOfTheSecondList + " " + anotherElementOfTheFirstList
How can I modify my regular expression, as well as my method, in order to have exactly one repetition of elements in both lists concatenated with a space between them?
I tried with another regular expression and I think that I will use this: "{1}". However, I am not able to do that with a variable.

With the information you provide as to how you are getting this issue, there is little that can be said about how to fix it. I strongly encourage you to look at this quantifiers tutorial before moving forward.
Let's look at some solutions.
For example, let's look at the line:
if(this.answer.matches("(?i).*" + string + ".*"))
What you are trying to do is to see whether this.answer contains string, ignoring case (I doubt you need the last .*). But you are using a greedy quantifier to compare them. If the issue arises from an input error in this comparison, I would consider looking at the section on reluctant quantifiers in the linked tutorial.
Okay, so it wasn't a quantifier issue. The other possible fix may be this block of code:
for(int i = 0; i < firstListOfString.size(); i++)
{
if(userInput.contains(firstListOfString.get(i) + " " + userInput.isInTheList(secondListOfString)))
{
isValid = true;//condition for exit from the while block
}
}
I don't know how you got userInput to have the contains method, but I assume that you used containment to call the String method. If this is the case, there could be a solution to the issue. You would only have to state that the input is valid if and only if it is equal to an element from the first list, followed by a space and a matching element from the second list.
The final solution I have for you is simple. If there are no other spaces present within the list elements, you could split the concatenated String on a space and check how many elements the resulting array contains. If it is greater than two, then you have an invalid concatenation.
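The split-and-count idea can be sketched as follows. PairValidator and isValidPair are hypothetical names, the comparison ignores case because the original method used (?i), and the sketch assumes that no list element contains a space.

```java
import java.util.List;

class PairValidator {
    // Valid only if the input is exactly "<element of first> <element of second>".
    static boolean isValidPair(String input, List<String> first, List<String> second) {
        String[] parts = input.trim().split(" ");
        return parts.length == 2
                && containsIgnoreCase(first, parts[0])
                && containsIgnoreCase(second, parts[1]);
    }

    private static boolean containsIgnoreCase(List<String> list, String s) {
        return list.stream().anyMatch(element -> element.equalsIgnoreCase(s));
    }
}
```

An input with a third word produces three parts and is rejected before any list lookup happens.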
Hopefully this helps!

Regular expression pattern to find a number within a semicolon delimited list of numbers

String temp = "77"; // It can be 0 or 100 or any value
// So the pattern will be like this only but number can be change anytime
String inclusion = "100;0;77;200;....;90";
I need to write a regular expression to check whether temp exists in inclusion. For that I wrote a regexPattern like this:
// This is the regular Expression I wrote.
String regexPattern = "(^|.*;)" + temp + "(;.*|$)";
So do you think this regular expression will work every time, or is there some problem with that regexPattern?
if(inclusion.matches(regexPattern)) {
}

You could run into issues if temp can contain special characters for regular expressions, but if it is always integers then your method should be fine.
However, a more straightforward way to do this would be to split your string on semi-colons and then see if temp is in the resulting array.
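A minimal sketch of the split approach, with InclusionCheck and contains as hypothetical names:

```java
import java.util.Arrays;

class InclusionCheck {
    // Splits the list on semicolons and looks for an exact element match.
    static boolean contains(String inclusion, String temp) {
        return Arrays.asList(inclusion.split(";")).contains(temp);
    }
}
```

Because each element is compared as a whole string, "7" is not found in a list that only holds "77".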
If you do stick with regex, you can simplify it a bit by dropping the .*; the following will work the same way as your current regex:
"(^|;)" + temp + "(;|$)"
edit: Oops, the above will actually not work with String.matches(). I am a bit unfamiliar with regex in Java and didn't realize that the entire string needs to match, thanks Affe!
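The shorter pattern does work with Matcher.find(), which looks for a match anywhere in the input instead of requiring the whole string to match. A sketch, with DelimitedListMatch and containsNumber as hypothetical names; Pattern.quote is added so that special characters in temp are treated literally.

```java
import java.util.regex.Pattern;

class DelimitedListMatch {
    // True if temp appears as a complete element of the semicolon-delimited list.
    static boolean containsNumber(String inclusion, String temp) {
        Pattern p = Pattern.compile("(^|;)" + Pattern.quote(temp) + "(;|$)");
        return p.matcher(inclusion).find();
    }
}
```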

You don't need regex:
String temp = "77";
String searchPattern = ";" + temp + ";";
String inclusion = ";" + "100;0;77;200;....;90" + ";";
boolean found = inclusion.indexOf(searchPattern) != -1;

Another alternative without regex
String inclusion2 = ";" + inclusion + ";"; // To ensure that every number is between semicolons
if (inclusion2.indexOf(";" + temp + ";") != -1) {
// found
}
Of course, no pattern recognition here (wildcards and the like)

I need a Java regular expression

I am currently using the following regular expression:
^[a-zA-Z]{0,}(\\*?)?[a-zA-Z0-9]{0,}
to check that a string starts with an alpha character, ends with alphanumeric characters, and has an asterisk (*) anywhere in the string, but at most once. The problem is that a given string still passes if it starts with a number but doesn't have an *, which should fail. How can I rework the regex to fail this case?
ex.
TE - pass
*TE - pass
TE* - pass
T*E - pass
*9TE - pass
*TE* - fail (multiple asterisk)
9E - fail (starts with number)
EDIT:
Sorry to introduce a late edit but I also need to ensure that the string is 8 characters or less, can I include that in the regex as well? Or should I just check the string length after the regex validation?

This passes your example:
"^([a-zA-Z]+\\*?|\\*)[a-zA-Z0-9]*$"
It says:
start with: [a-zA-Z]+\\*? (one or more letters and maybe a star)
| (or)
\\* a single star
and end with [a-zA-Z0-9]* (zero or more alphanumeric characters)
Code to test it:
import java.util.regex.Pattern;
public static void main(final String[] args) {
    // Same pattern as above, so the test exercises what is described.
    final Pattern p = Pattern.compile("^([a-zA-Z]+\\*?|\\*)[a-zA-Z0-9]*$");
    System.out.println(p.matcher("TE").matches());   // true
    System.out.println(p.matcher("*TE").matches());  // true
    System.out.println(p.matcher("TE*").matches());  // true
    System.out.println(p.matcher("T*E").matches());  // true
    System.out.println(p.matcher("*9TE").matches()); // true
    System.out.println(p.matcher("*TE*").matches()); // false
    System.out.println(p.matcher("9E").matches());   // false
}
Per Stargazer, if you allow alphanumeric before the star, then use this:
^([a-zA-Z][a-zA-Z0-9]*\\*?|\\*)\\w*$
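The question's edit also asks for a limit of 8 characters. One way to fold that into the regex is a lookahead at the start, sketched below. LengthLimitedCheck is a hypothetical name, and the sketch uses [a-zA-Z0-9] rather than \\w so that underscores are rejected. Checking input.length() separately after the regex validation is just as valid and arguably easier to read.

```java
import java.util.regex.Pattern;

class LengthLimitedCheck {
    // (?=.{1,8}$) asserts that the whole string is 1 to 8 characters long
    // before the rest of the pattern checks its shape.
    private static final Pattern P = Pattern.compile(
            "^(?=.{1,8}$)([a-zA-Z][a-zA-Z0-9]*\\*?|\\*)[a-zA-Z0-9]*$");

    static boolean isValid(String s) {
        return P.matcher(s).matches();
    }
}
```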

One possible way is to separate into 2 conditions:
^(?=[^*]*\*?[^*]*$)[a-zA-Z*][a-zA-Z0-9*]*$
The (?=[^*]*\*?[^*]*$) part ensures there is at most one * in the string.
The [a-zA-Z*][a-zA-Z0-9*]* part ensures it starts with a letter or a *, followed only by alphanumeric characters or *.
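As a Java string literal the backslash has to be doubled. A short sketch that wraps the pattern in a helper, with LookaheadCheck as a hypothetical name:

```java
import java.util.regex.Pattern;

class LookaheadCheck {
    // The lookahead allows at most one '*'; the rest checks the allowed characters.
    private static final Pattern P = Pattern.compile(
            "^(?=[^*]*\\*?[^*]*$)[a-zA-Z*][a-zA-Z0-9*]*$");

    static boolean isValid(String s) {
        return P.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] tests = {"TE", "*TE", "TE*", "T*E", "*9TE", "*TE*", "9E"};
        for (String test : tests) {
            System.out.println(test + " " + (isValid(test) ? "pass" : "fail"));
        }
    }
}
```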

It might be easier to develop and maintain later if you just break your regular expressions into a few pieces, e.g., one for the start and end, and one for the asterisk. I am not sure what the overall performance effect would be, you would have simpler expressions but have to run a few of them.
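A sketch of what the split-up version might look like, with one small check per rule. SplitChecks is a hypothetical name, and the rules are my reading of the question: at most one asterisk, first character a letter or an asterisk, only alphanumerics and asterisks overall, and at most 8 characters (from the question's edit).

```java
class SplitChecks {
    static boolean isValid(String s) {
        // Rule 1: at most one asterisk.
        long stars = s.chars().filter(c -> c == '*').count();
        return stars <= 1
                // Rule 2: starts with a letter or an asterisk.
                && s.matches("[a-zA-Z*].*")
                // Rule 3: only alphanumerics and asterisks overall.
                && s.matches("[a-zA-Z0-9*]+")
                // Rule 4: at most 8 characters.
                && s.length() <= 8;
    }
}
```

Each rule can be changed or unit-tested on its own, at the cost of scanning the string a few times.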

This is Python, it'll need some massaging for Java:
>>> import re
>>> p = re.compile('^([a-z][^*]*[*]?[^*]*[a-z0-9]|[*][^*]*[a-z0-9]|[a-z][^*]*[*])$', re.I)
>>> for test in ['TE', '*TE', 'TE*', 'T*E', '*9TE', '*TE*', '9E']:
...     if p.match(test):
...         print(test, 'pass')
...     else:
...         print(test, 'fail')
...
TE pass
*TE pass
TE* pass
T*E pass
*9TE pass
*TE* fail
9E fail
Hope I didn't miss anything.

How about this, it's easier to read:
boolean pass = input.replaceFirst("\\*", "").matches("^[a-zA-Z].*\\w$");
Assuming I read right, you want to:
Start with an alpha character
End with an alphanumeric character
Allow up to one * anywhere

At most one asterisk, alphabetic characters anywhere and numbers anywhere but at start.
String alpha = "[a-zA-Z]";
String alnum = "[a-zA-Z0-9]";
String asteriskNone = "^" + alpha + "+" + alnum + "*";
String asteriskStart = "^\\*" + alnum + "*";
String asteriskInside = "^" + alpha + "+" + alnum + "*\\*" + alnum + "*";
String yourRegex = asteriskNone + "|" + asteriskStart + "|"
+ asteriskInside;
String[] tests = {"TE","*TE","TE*","T*E","*9TE","*TE*", "9E"};
for (String test : tests)
System.out.println(test + " " + (test.matches(yourRegex)?"PASS":"FAIL"));

Look for two possible patterns, one starting with *, and one with an alpha char:
^[a-zA-Z][a-zA-Z0-9]*\*?[a-zA-Z0-9]*|\*[a-zA-Z0-9]*

^([a-zA-Z][a-zA-Z0-9]*\*|\*|[a-zA-Z])([a-zA-Z0-9])*$
The parentheses around the second half are for clarity and can be safely excluded.

This was a tough one (liked the challenge), but here it is:
^(\*[a-zA-Z0-9]+|[a-zA-Z]+[\*]{1}[a-zA-Z]*)$
In order to comply with T9*Z, as pointed out on another post with StarGazer712, I had to change it to:
^(\*[a-zA-Z0-9]+|[a-zA-Z]{1}[a-zA-Z0-9]*[\*]{1}[a-zA-Z0-9]*)$

