I am trying to implement a parser in Java to extract the arguments of some functions.
When I have a function like:
max(1, 2, 3)
I can simply use a regular expression to extract the args.
But not all of my functions are like that. If I have a nested function, e.g.:
max(sum(1, max(1,2,sum(2,5))), 3, 5, mult(3,3))
I would like to obtain:
sum(1, max(1,2,sum(2,5)))
3
5
mult(3,3)
I tried using a regular expression, but I assume the language is not regular. Another approach was splitting on ',', but that did not work either.
Is there any method for extracting the arguments of a function? I do not really know how this type of problem can be solved, since there is no fixed pattern to use for extracting the arguments.
Any help or insight would be really appreciated. Thanks!!
Parsing source code into an abstract model is quite a complex topic, depending on the complexity of the language.
But the first step is usually tokenization, where you read one character at a time and detect full tokens (like variable names, function names, operators, literals etc.).
Since you presented only a very limited scope for the problem, you have a very small set of tokens:
name of a function
( and ) to indicate method call
, to separate arguments
numbers
Reading one symbol at a time, you should be able to detect very easily when one token ends and the next one begins. Also, because your tokens are very distinct (i.e. you don't have to differentiate a function name from a variable name), you can categorize them very easily.
Once you have the tokens, and since you know the grammar (you have only function calls), you can easily build a syntax tree (where the root is the top-level function call and its arguments are the child nodes).
From that structure you can easily fetch whichever parts you wish.
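The two steps described above (tokenize, then build a tree) could be sketched roughly as follows. This is only a minimal illustration, not a complete parser: the names CallParser, Node, tokenize and parse are made up for this example, it assumes every call has at least one argument, and its error handling for malformed input is minimal.

```java
import java.util.ArrayList;
import java.util.List;

class CallParser {
    // A node is either a leaf (number or name) or a function call with arguments.
    static class Node {
        final String name;
        final List<Node> args = new ArrayList<>();

        Node(String name) { this.name = name; }

        @Override
        public String toString() {
            if (args.isEmpty()) return name;
            List<String> parts = new ArrayList<>();
            for (Node a : args) parts.add(a.toString());
            return name + "(" + String.join(",", parts) + ")";
        }
    }

    // Step 1: split the input into tokens: names, numbers, "(", ")" and ",".
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '(' || c == ')' || c == ',') {
                tokens.add(String.valueOf(c));
                i++;
            } else if (Character.isLetterOrDigit(c)) {
                int start = i;
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                tokens.add(src.substring(start, i));
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        return tokens;
    }

    private final List<String> tokens;
    private int pos = 0;

    CallParser(List<String> tokens) { this.tokens = tokens; }

    // Step 2: build the tree. expr := NAME "(" expr ("," expr)* ")" | NAME_OR_NUMBER
    Node parseExpr() {
        Node node = new Node(tokens.get(pos++));
        if (pos < tokens.size() && tokens.get(pos).equals("(")) {
            pos++; // consume "("
            node.args.add(parseExpr());
            while (tokens.get(pos).equals(",")) {
                pos++; // consume ","
                node.args.add(parseExpr());
            }
            if (!tokens.get(pos).equals(")")) {
                throw new IllegalArgumentException("Expected ) but found " + tokens.get(pos));
            }
            pos++; // consume ")"
        }
        return node;
    }

    static Node parse(String src) {
        return new CallParser(tokenize(src)).parseExpr();
    }
}
```

With this sketch, the top-level arguments of a call are simply the args list of the root node returned by CallParser.parse.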
If you are more interested in how it works in the javac compiler, you can always check out its source code (it's open source after all):
https://github.com/openjdk/jdk/blob/master/src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavacParser.java
https://github.com/openjdk/jdk/blob/master/src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavaTokenizer.java
However, that may be quite a long read.
Finally found a method that works:
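// Requires: import java.util.ArrayList; import java.util.List;
public List<String> parseArgs(String l) {
    int startIdx = l.indexOf("(") + 1;      // first character after the outer '('
    int endIdx = l.lastIndexOf(")") - 1;    // last character before the outer ')'
    int depth = 0;                          // nesting depth inside the outer call
    int argIdx = startIdx;                  // where the current argument starts
    List<String> args = new ArrayList<>();
    for (int i = startIdx; i < endIdx; i++) {
        char c = l.charAt(i);
        if (c == '(') {
            depth++;
        } else if (c == ')') {
            depth--;
        } else if (c == ',' && depth == 0) {
            // a comma at depth 0 separates two top-level arguments
            args.add(l.substring(argIdx, i).trim());
            argIdx = i + 1;
        }
    }
    args.add(l.substring(argIdx, endIdx + 1).trim()); // last argument
    return args;
}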
String l = "max(sum(1, max(1,2,sum(2,5))), 3, 5, mult(3,3))";
parseArgs(l).forEach(System.out::println);
//Prints
sum(1, max(1,2,sum(2,5)))
3
5
mult(3,3)
In my software, I need to decide the version of a feature based on 2 parameters. Eg.
Render version 1 -> if (param1 && param2) == true;
Render version 2 -> if (!param1 && !param2) == true;
Render version 3 -> if only param1 == true;
Render version 4 -> if only param2 == true;
So, to meet this requirement, I wrote code that looks like this:
if (param1 && param2) { // both are true
    version = 1;
}
else if (!param1 && !param2) { // both are false
    version = 2;
}
else if (!param2) { // means only param1 is true
    version = 3;
}
else { // means only param2 is true
    version = 4;
}
There are definitely multiple ways to code this but I finalised this approach after trying out different approaches because this is the most readable code I could come up with.
But this piece of code is definitely not scalable: if tomorrow we want to introduce a new parameter called param3, the number of checks will increase because of the multiple possible combinations. For this software, I am pretty sure that we will have to accommodate new parameters in the future.
Can there be any scalable & readable way to code these requirements?
EDIT:
For a scalable solution define the versions for each parameter combination through a Map:
Map<List<Boolean>, Integer> paramsToVersion = Map.of(
List.of(true, true), 1,
List.of(false, false), 2,
List.of(true, false), 3,
List.of(false, true), 4);
Now finding the right version is a simple map lookup:
version = paramsToVersion.get(List.of(param1, param2));
The way I initialized the map works since Java 9. In older Java versions it's a little more wordy, but probably still worth doing. Even in Java 9 you need to use Map.ofEntries if you have 4 or more parameters (for 16 combinations), because Map.of accepts at most 10 key-value pairs; that is a little more wordy too.
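For reference, the Map.ofEntries form could look roughly like this. It is only a sketch: it reuses the four two-parameter combinations for brevity, and the names VersionTable and versionFor are made up for the example. It needs Java 9 or later.

```java
import java.util.List;
import java.util.Map;

class VersionTable {
    // Same table as above, written with Map.ofEntries, which takes any number of entries.
    static final Map<List<Boolean>, Integer> PARAMS_TO_VERSION = Map.ofEntries(
            Map.entry(List.of(true, true), 1),
            Map.entry(List.of(false, false), 2),
            Map.entry(List.of(true, false), 3),
            Map.entry(List.of(false, true), 4));

    static int versionFor(boolean param1, boolean param2) {
        Integer version = PARAMS_TO_VERSION.get(List.of(param1, param2));
        if (version == null) {
            // Only reachable if a combination was left out of the table.
            throw new IllegalStateException("No version defined for this combination");
        }
        return version;
    }
}
```

Adding a third parameter then means adding a third element to every key and listing the eight combinations, while the lookup itself stays a single line.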
Original answer:
My taste would be for nested if/else statements and only testing each parameter once:
if (param1) {
if (param2) {
version = 1;
} else {
version = 3;
}
} else {
if (param2) {
version = 4;
} else {
version = 2;
}
}
But it scales poorly to many parameters.
If you have to enumerate all the possible combinations of Booleans, it's often simplest to convert them into a number:
// param1: F T F T
// param2: F F T T
static final int[] VERSIONS = new int[]{2, 3, 4, 1};
...
version = VERSIONS[(param1 ? 1:0) + (param2 ? 2:0)];
I doubt that there is a way that would be more compact, readable and scalable at the same time.
You express the conditions as minimized expressions, which are compact and may carry meaning (in particular, the irrelevant variables don't clutter them). But there is no systematic structure that you could exploit.
A quite systematic alternative could be truth tables, i.e. the explicit expansion of all combinations and the associated truth value (or version number), which can be very efficient in terms of running-time. But these have a size exponential in the number of variables and are not especially readable.
I am afraid there is no free lunch. Your current solution is excellent.
If you are after efficiency (i.e. avoiding the need to evaluate all expressions sequentially), then you can think of the truth table approach, but in the following way:
declare an array of version numbers, with 2^n entries;
use the code just like you wrote to initialize all table entries; to achieve that, enumerate all integers in [0, 2^n) and use their binary representation;
now for a query, form an integer index from the n input booleans and lookup the array.
Using the answer by Olevv, the table would be [2, 4, 3, 1]. A lookup would be like (false, true) => T[01b] = 4.
What matters is that the original set of expressions is still there in the code, for human reading. You can use it in an initialization function that will fill the array at run-time, and you can also use it to hard-code the table (and leave the code in comments; even better, leave the code that generates the hard-coded table).
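A minimal sketch of this idea for two parameters might look like the following. The class and method names (TruthTable, versionByRules, buildTable, lookup) are invented for the illustration, and param1 is assumed to be the high bit of the index, which matches the table [2, 4, 3, 1] above.

```java
class TruthTable {
    // The original, human-readable conditions stay in the code.
    static int versionByRules(boolean param1, boolean param2) {
        if (param1 && param2) return 1;
        if (!param1 && !param2) return 2;
        if (!param2) return 3; // only param1 is true
        return 4;              // only param2 is true
    }

    // Enumerates all integers in [0, 2^n) and decodes their bits into the booleans.
    static int[] buildTable() {
        int n = 2;
        int[] table = new int[1 << n];
        for (int index = 0; index < table.length; index++) {
            boolean param1 = (index & 0b10) != 0; // high bit
            boolean param2 = (index & 0b01) != 0; // low bit
            table[index] = versionByRules(param1, param2);
        }
        return table;
    }

    // Filled once at class initialization.
    static final int[] TABLE = buildTable();

    // A query forms the index from the booleans and reads the array.
    static int lookup(boolean param1, boolean param2) {
        int index = (param1 ? 0b10 : 0) | (param2 ? 0b01 : 0);
        return TABLE[index];
    }
}
```

The rules are evaluated only while the table is being filled; every later query costs one array access.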
Your combination of parameters is nothing more than a binary number (like 01100), where a 0 indicates false and a 1 indicates true.
So your version can be easily calculated by using all the combinations of ones and zeroes. Possible combinations with 2 input parameters are:
11 -> both are true
10 -> first is true, second is false
01 -> first is false, second is true
00 -> both are false
So with this knowledge I've come up with a quite scalable solution using a "bit mask" (nothing more than a number) and "bit operations":
public static int getVersion(boolean... params) {
int length = params.length;
int mask = (1 << length) - 1;
for(int i = 0; i < length; i++) {
if(!params[i]) {
mask &= ~(1 << length - i - 1);
}
}
return mask + 1;
}
The most interesting line is probably this:
mask &= ~(1 << length - i - 1);
It does many things at once, so let me split it up. The part length - i - 1 calculates the position of the "bit" inside the bit mask, counted from the right (0-based, like in arrays).
The next part, 1 << (length - i - 1), shifts the number 1 that many positions to the left. So let's say we want the third position from the right: the result of the operation 1 << 2 (index 2 is the third position) is the binary number 100.
The ~ sign is a bitwise inverse: all the bits are inverted, so every 0 is turned into 1 and every 1 into 0. With the previous example, the inverse of 100 is 011 (looking only at the lowest three bits).
The last part, mask &= n, is the same as mask = mask & n, where n is the previously computed value 011. This is nothing more than a bitwise AND: a bit stays set only if it is set in both mask and n, and all other bits are cleared.
All in all, this single line does nothing more than clear the "bit" at a given position of the mask if the input parameter is false.
If the version numbers are not sequential from 1 to 4, then a version lookup table, like this one, may help you.
The whole code would need just a single adjustment in the last line:
return VERSIONS[mask];
Where your VERSIONS array consists of all the versions in order, but reversed. (index 0 of VERSIONS is where both parameters are false)
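Put together for the version numbers from the question, the adjusted method could look like this sketch. The class name MaskLookup and the length check are additions for the example; the table contents assume the first parameter ends up in the highest bit, as in the method above.

```java
class MaskLookup {
    // Index = bit mask built from the parameters; index 0 means "all false".
    // Binary index: 00 -> 2, 01 -> 4, 10 -> 3, 11 -> 1
    static final int[] VERSIONS = {2, 4, 3, 1};

    static int getVersion(boolean... params) {
        int length = params.length;
        if ((1 << length) != VERSIONS.length) {
            throw new IllegalArgumentException("Table does not match parameter count");
        }
        int mask = (1 << length) - 1; // start with all bits set
        for (int i = 0; i < length; i++) {
            if (!params[i]) {
                mask &= ~(1 << (length - i - 1)); // clear the bit of a false parameter
            }
        }
        return VERSIONS[mask];
    }
}
```

For example, getVersion(true, false) clears only the lowest bit, giving the index 10 in binary and therefore version 3.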
I would have just gone with:
if (param1) {
if (param2) {
} else {
}
} else {
if (param2) {
} else {
}
}
Kind of repetitive, but each condition is evaluated only once, and you can easily find the code that executes for any particular combination. Adding a 3rd parameter will, of course, double the code. But if there are any invalid combinations, you can leave those out which shortens the code. Or, if you want to throw an exception for them, it becomes fairly easy to see which combination you have missed. When the IF's become too long, you can bring the actual code out in methods:
if (param1) {
if (param2) {
method_12();
} else {
method_1();
}
} else {
if (param2) {
method_2();
} else {
method_none();
}
}
Thus your whole switching logic takes up a function of itself and the actual code for any combination is in another method. When you need to work with the code for a particular combination, you just look up the appropriate method. The big IF maze is then rarely looked at, and when it is, it contains only the IFs themselves and nothing else potentially distracting.
I need to parse an expression such as: neg(and(X,Y))
I need it to produce abstract stack machine code, such as this for the example above:
LOAD X;
LOAD Y;
EXEC and;
EXEC neg;
But for now the machine code is not an issue. How can I parse / break up my input string of an expression into all its sub-expressions?
I have tried to find the first bracket and then take everything from there to the last bracket, but that gives issues if you have an inner expression.
Code that I have tried (please note it is still very much in the development phase):
private boolean evaluateExpression(String expression) {
int brackets = 0;
int beginIndex = -1;
int endIndex = -1;
for (int i = 0; i < expression.length(); i++) {
if (expression.charAt(i) == '(') {
brackets++;
if (brackets == 0) {
endIndex = i;
System.out.println("the first expression ends at " + i);
}
}
if (expression.charAt(i) == ')') {
brackets--;
if (brackets == 0) {
endIndex = i;
System.out.println("the first expression ends at " + i);
}
}
}
// Check for 1st bracket
for (int i = 0; i < expression.length(); i++) {
if (expression.charAt(i) == '(') {
beginIndex = i;
break;
}
}
String subExpression = expression.substring(beginIndex, endIndex);
System.out.println("Sub expression: " + subExpression);
evaluateExpression(subExpression);
return false;
}
I am just looking for a basic solution. It only has to handle: and, or, neg
The expressions you are trying to parse form a context-free language, which can be described by a context-free grammar (CFG).
You can create a context-free grammar that represents this language of expressions, and use a CFG parser to parse it.
One existing Java tool that does this (and more) is JavaCC, though it could be overkill here.
Another algorithm for parsing sentences using a CFG is CYK, which is fairly easy to program and use.
Here, the CFG representing the available expressions is:
S -> or(S,S)
S -> and(S,S)
S -> neg(S)
S -> x | for each variable x
Note that although this is a relatively simple CFG, the language it describes is not regular, so if you were hoping to use a regex, it's probably not the way to go.
Actually, if you want your parser to be robust enough to deal with most cases, you should use a tokenizer (Java has a built-in tokenizer class) to tokenize the string first, then recognize each expression, storing operands and operators in a tree structure, and finally evaluate them recursively.
If you only want to deal with some simple situations, remember to use recursion; that is the core part.
Parsing things like this is typically done using syntax trees, using some type of preference for order of operations. An example for what you have posted would be as follows:
Processing items left to right the tree would be populated like this
1arg_fcall(neg)
    2arg_fcall(and)
        Load Y
        Load X
Now we can recursively visit this tree bottom to top to get
Load X
Load Y
EXEC and //on X and Y
EXEC neg //on result of and
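A small recursive-descent sketch can combine the parsing and the bottom-up visit in one pass: each call emits its operands first and its own EXEC last. The class name StackCodeGenerator and the method compile are invented for this illustration, and the sketch does not check operator names or argument counts.

```java
import java.util.ArrayList;
import java.util.List;

class StackCodeGenerator {
    private final String src;
    private int pos = 0;
    private final List<String> code = new ArrayList<>();

    private StackCodeGenerator(String src) {
        this.src = src.replaceAll("\\s+", ""); // ignore whitespace
    }

    static List<String> compile(String expression) {
        StackCodeGenerator g = new StackCodeGenerator(expression);
        g.expr();
        if (g.pos != g.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + g.pos);
        }
        return g.code;
    }

    // expr := NAME "(" expr ("," expr)* ")" | NAME
    private void expr() {
        int start = pos;
        while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) pos++;
        if (pos == start) {
            throw new IllegalArgumentException("Name expected at position " + pos);
        }
        String name = src.substring(start, pos);
        if (pos < src.length() && src.charAt(pos) == '(') {
            pos++; // consume "("
            expr();
            while (pos < src.length() && src.charAt(pos) == ',') {
                pos++; // consume ","
                expr();
            }
            if (pos >= src.length() || src.charAt(pos) != ')') {
                throw new IllegalArgumentException("Expected ) at position " + pos);
            }
            pos++; // consume ")"
            code.add("EXEC " + name + ";"); // the operator comes after its operands
        } else {
            code.add("LOAD " + name + ";"); // a plain variable
        }
    }
}
```

Calling StackCodeGenerator.compile("neg(and(X,Y))") yields the four instructions LOAD X; LOAD Y; EXEC and; EXEC neg; in that order.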
I am working on a program that displays zip codes and house numbers. I need to sort the zip codes in ascending order in the first column then sort the house numbers from left to right, keeping them with the same zip code.
For instance:
Looks like this:
90153 | 9810 6037 8761 1126 9792 4070
90361 | 2274 6800 2196 3158 9614 9086
I want it to look like this:
90153 | 1126 4070 6037 8761 9792 9810
90361 | 2196 2274 3158 6800 9086 9614
I used the following code to sort the zip codes but how do I sort the house numbers? Do I need to add a loop to sort the numbers to this code? If so, where? So sorry I couldn't make the code indent correctly.
void DoubleArraySort()
{
int k,m,Hide;
boolean DidISwap;
DidISwap = true;
while (DidISwap)
{
DidISwap = false;
for ( k = 0; k < Row - 1; k++)
{
if ( Numbers[k][0] > Numbers[k+1][0] )
{
for ( m = 0; m < Col; m++)
{
Hide = Numbers[k ][m];
Numbers[k ][m] = Numbers[k+1][m];
Numbers[k+1][m] = Hide ;
DidISwap = true;
}
}
}
}
}
Use an object ZipCode like this:
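import java.util.ArrayList;
import java.util.Collections;

public class ZipCode implements Comparable<ZipCode> {
    private String zipcode;
    private ArrayList<String> adds;

    public ZipCode(String zip) {
        zipcode = zip;
        adds = new ArrayList<String>();
    }

    public void addAddress(String address) {
        adds.add(address);
        Collections.sort(adds); // keep the house numbers ordered after every insert
    }

    // Needed so that Arrays.sort(zips) can order ZipCode objects by zip code.
    @Override
    public int compareTo(ZipCode other) {
        return zipcode.compareTo(other.zipcode);
    }
}
Keep an array of ZipCodes and sort it whenever necessary: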
ZipCode[] zips = . . .
.
.
.
Arrays.sort(zips);
First of all, are you aware that Java provides a more efficient sorting mechanism out of the box? Check the Arrays class.
Secondly you have to be very careful with your approach. What you are doing here is swapping all the elements of one row with the other. But you are not doing the same thing within each row. So you need a separate nested loop outside the current while (before or after, doesn't make a difference), which checks the houses themselves and sorts them:
for ( k = 0; k < Row; k++)
{
do
{
DidISwap = false;
for ( m = 1; m < Col-1; m++) // start at 1: column 0 holds the zip code
{
if (Numbers[k][m] > Numbers[k][m+1])
{
Hide = Numbers[k][m];
Numbers[k][m] = Numbers[k][m+1];
Numbers[k][m+1] = Hide;
DidISwap = true;
}
}
}
while (DidISwap);
}
However, your approach is very inefficient. Why don't you put the list of houses in a SortedSet, and then create a SortedMap which maps from your postcodes to your Sorted Sets of houses? Everything will be sorted automatically and much more efficiently.
You can use the TreeMap for your SortedMap implementation and the TreeSet for your SortedSet implementation.
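As a rough sketch of that structure (the class name ZipCodeIndex and its methods are made up for the example), a TreeMap of TreeSets keeps both levels ordered as values are added. One caveat: a set silently drops duplicate house numbers, so this assumes each house number appears at most once per zip code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

class ZipCodeIndex {
    // Zip codes are the keys (kept in ascending order), house numbers the sorted values.
    private final SortedMap<Integer, SortedSet<Integer>> housesByZip = new TreeMap<>();

    void add(int zip, int house) {
        housesByZip.computeIfAbsent(zip, z -> new TreeSet<>()).add(house);
    }

    // One line per zip code, e.g. "90153 | 1126 4070".
    List<String> lines() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<Integer, SortedSet<Integer>> e : housesByZip.entrySet()) {
            StringBuilder line = new StringBuilder(e.getKey() + " |");
            for (int house : e.getValue()) {
                line.append(' ').append(house);
            }
            result.add(line.toString());
        }
        return result;
    }
}
```

No explicit sort call is needed: iterating over the map and its sets already visits everything in ascending order.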
I / we could try to tell you how to fix (sort of) your code to do what you want, but it would be counter-productive. Instead, I'm going to explain "the Java way" of doing these things, which (if you follow it) will make you more productive, and make your code more maintainable.
Follow the Java style conventions. In particular, the identifier conventions. Method names and variable names should always start with a lower case character. (And try to use class, method and variable names that hint as to the meaning of the class/method/variable.)
Learn the Java APIs and use existing standard library classes and methods in preference to reinventing the wheel. For instance:
The Arrays and Collections classes have standard methods for sorting arrays and collections.
There are collection types that implement sets and mappings and the like that can take care of "boring" things like keeping elements in order.
If you have a complicated data structure, build it out of existing collection types and custom classes. Don't try and represent it as arrays of numbers. Successful Java programmers use high-level design and implementation abstractions. Your approach is like trying to build a multi-storey car-park from hand-made bricks.
My advice would be to get a text book on object-oriented programming (in Java) and get your head around the right way to design and write Java programs. Investing the effort now will make you more productive.
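To illustrate the point about the standard library, here is a sketch that does the whole job with java.util.Arrays while keeping the two-dimensional int array from the question. The class name RowSorter is hypothetical, and the code assumes every row has at least one element, with the zip code in column 0.

```java
import java.util.Arrays;
import java.util.Comparator;

class RowSorter {
    // Column 0 holds the zip code; the remaining columns hold house numbers.
    static void sortTable(int[][] numbers) {
        for (int[] row : numbers) {
            // Sort only the house numbers (indices 1 to the end) within each row.
            Arrays.sort(row, 1, row.length);
        }
        // Then order whole rows by their zip code.
        Arrays.sort(numbers, Comparator.comparingInt(row -> row[0]));
    }
}
```

Swapping rows as whole array references also avoids the element-by-element copy loop of the hand-written bubble sort.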
I have this code, which should find a previously known method name in the chosen file:
String[] sorok = new String[listaZ.size()];
String[] sorokPlusz1 = new String[listaIdeig.size()];
boolean keresesiFeltetel1;
boolean keresesiFeltetel3;
boolean keresesiFeltetel4;
int ind=0;
for (int i = 0; i < listaZ.size(); i++) {
for (int id = 0; id < listaIdeig.size(); id++) {
sorok = listaZ.get(i);
sorokPlusz1 = listaIdeig.get(id);
for (int j = 0; j < sorok.length; j++) {
for (int jj = 1; jj < sorok.length; jj++) {
keresesiFeltetel3 = (sorok[j].equals(oldName)) && (sorokPlusz1[id].startsWith("("));
keresesiFeltetel4 = sorok[j].startsWith(oldNameV3);
keresesiFeltetel1 = sorok[j].equals(oldName) && sorok[jj].startsWith("(");
if (keresesiFeltetel1 || keresesiFeltetel3 || keresesiFeltetel4) {
Array.set(sorok, j, newName);
listaZarojeles.set(i, sorok);
}
}
System.out.println(ind +". index, element: " +sorok[j]);
}
ind++;
}
}
listaZ is an ArrayList whose elements are the lines split on '(' and ' '; listaIdeig is the same list without the first line (because of keresesiFeltetel3).
oldNameV3 is: oldName + "()"
I'd like to find a method's name even if it looks like this:
methodname
() {...
To do this I need the next line in keresesiFeltetel3, but I can't get it working properly. It either finds nothing or throws errors.
Right now it prints the input file's elements about 15 times more often than it should, reports an error at keresesiFeltetel3, and throws:
Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: 0
I think your problem is here: sorokPlusz1[id]. id does not seem to span sorokPlusz1's range. I suspect you want to use jj and that jj should span sorokPlusz1's range instead of sorok's and that sorok[jj].startsWith("(") should be sorokPlusz1[jj].startsWith("(").
But note that I'm largely speculating as I'm not 100% sure what you're trying to do or what listaZ and listaIdeig look like.
You're creating sorok with size = listaZ's size, and then you do this: sorok = listaZ.get(i);. This is clearly not right. Not knowing the exact type of listaZ makes it difficult to tell you what's wrong with it. If it's ArrayList<String[]>, then change String[] sorok = new String[listaZ.size()]; to String[] sorok = null; or String[] sorok;. If it's ArrayList<String>, then you probably want to do something more like sorok[i] = listaZ.get(i);
Now for some general notes about asking questions here: (with some repetition of what was said in the comments) (in the spirit of helping you be successful in getting answers to questions on this site).
Your question is generally unclear. After reading through your question and the code, I still have little idea what you're trying to do and what the input variables (listaZ and listaIdeig) look like.
Using non-English variable names makes it more difficult for any English speaker to help. Even changing sorok to array and keresesiFeltetelX to bX would be better (though still not great). Having long variable names that aren't understandable makes it much more difficult to read.
Comment your code. Enough comments (on almost every line) make it much easier to understand your code.
Examples. If you have difficulty properly explaining what you want to do (in English), you can always provide a few examples which would assist your explanation a great deal (and doing this is a good idea in general). Note that a good example is both providing the input and the desired output (and the actual output, if applicable).
public boolean catDog(String str)
{
int count = 0;
for (int i = 0; i < str.length(); i++)
{
String sub = str.substring(i, i+1);
if (sub.equals("cat") && sub.equals("dog"))
count++;
}
return count == 0;
}
There's my code for catDog. I have been working on it for a while and just cannot find out what's wrong. Help would be much appreciated!
EDIT: I want to return true if the strings "cat" and "dog" appear the same number of times in the given string.
One problem is that this will never be true:
if (sub.equals("cat") && sub.equals("dog"))
&& means and. || means or.
However, another problem is that your code looks like you are flailing around randomly trying to get it to work. Everyone does this to some extent in their first programming class, but it's a bad habit. Try to come up with a clear mental picture of how to solve the problem before you write any code, then write the code, then verify that the code actually does what you think it should do and that your initial solution was correct.
EDIT: What I said goes double now that you've clarified what your function is supposed to do. Your approach to solving the problem is not correct, so you need to rethink how to solve the problem, not futz with the implementation.
Here's a critique since I don't believe in giving code for homework. But you have at least tried which is better than most of the clowns posting homework here.
you need two variables, one for storing cat occurrences, one for dog, or a way of telling the difference.
your substring isn't getting enough characters.
a string can never be both cat and dog, you need to check them independently and update the right count.
your return statement should return true if catcount is equal to dogcount, although your version would work if you stored the differences between cats and dogs.
Other than those, I'd be using string searches rather than checking every position but that may be your next assignment. The method you've chosen is perfectly adequate for CS101-type homework.
It should be reasonably easy to get yours working if you address the points I gave above. One thing you may want to try is inserting debugging statements at important places in your code such as:
System.out.println(
"i = " + Integer.toString (i) +
", sub = ["+sub+"]" +
", count = " + Integer.toString(count));
immediately before the closing brace of the for loop. This is invaluable in figuring out what your code is doing wrong.
Here's my ROT13 version if you run into too much trouble and want something to compare it to, but please don't use it without getting yours working first. That doesn't help you in the long run. And, it's almost certain that your educators are tracking StackOverflow to detect plagiarism anyway, so it wouldn't even help you in the short term.
Not that I really care, the more dumb coders in the employment pool, the better it is for me :-)
choyvp obbyrna pngQbt(Fgevat fge) {
vag qvssrerapr = 0;
sbe (vag v = 0; v < fge.yratgu() - 2; v++) {
Fgevat fho = fge.fhofgevat(v, v+3);
vs (fho.rdhnyf("png")) {
qvssrerapr++;
} ryfr {
vs (fho.rdhnyf("qbt")) {
qvssrerapr--;
}
}
}
erghea qvssrerapr == 0;
}
Another thing to note here is that substring in Java's built-in String class is exclusive on the upper bound.
That is, for String str = "abcdefg", str.substring( 0, 2 ) retrieves "ab" rather than "abc." To match 3 characters, you need to get the substring from i to i+3.
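A tiny sketch makes the bounds concrete. The helper name windowsOfThree is made up for the example; it collects every 3-character window of a string.

```java
import java.util.ArrayList;
import java.util.List;

class SubstringBounds {
    static List<String> windowsOfThree(String str) {
        List<String> windows = new ArrayList<>();
        // The condition i + 3 <= length keeps the exclusive end index inside the string.
        for (int i = 0; i + 3 <= str.length(); i++) {
            windows.add(str.substring(i, i + 3)); // characters i, i+1 and i+2
        }
        return windows;
    }
}
```

For the input "catdog" the windows are "cat", "atd", "tdo" and "dog", so both words can be matched with equals.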
My code for doing this:
public boolean catDog(String str) {
if ((new StringTokenizer(str, "cat")).countTokens() ==
(new StringTokenizer(str, "dog")).countTokens()) {
return true;
}
return false;
}
Hope this will help you
EDIT: Sorry, this code will not work, since you can have 2 tokens side by side in your string. It is better to use countMatches from StringUtils in the Apache Commons Lang library.
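If adding a library is not an option, counting non-overlapping occurrences can be sketched with String.indexOf from the standard library. This is a stand-in for the idea, not the Commons Lang implementation, and the names OccurrenceCounter and countOccurrences are made up here.

```java
class OccurrenceCounter {
    // Counts non-overlapping occurrences of word in text.
    static int countOccurrences(String text, String word) {
        if (word.isEmpty()) {
            return 0; // avoid an endless loop on the empty string
        }
        int count = 0;
        int from = text.indexOf(word);
        while (from != -1) {
            count++;
            from = text.indexOf(word, from + word.length()); // continue after this match
        }
        return count;
    }

    static boolean catDog(String str) {
        return countOccurrences(str, "cat") == countOccurrences(str, "dog");
    }
}
```

Because "cat" and "dog" cannot overlap with themselves, counting non-overlapping matches gives the same result as checking every position.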
String sub = str.substring(i, i+1);
The above line only gets a 1-character substring (the end index is exclusive), so instead of "cat" you'll get "c" and it will never match. Fix this by changing 'i+1' to 'i+3', and stop the loop at str.length() - 2 so that i+3 stays within the string.
Edit: Now that you've clarified your question in the comments: You should have two counter variables, one to count the 'dog's and one to count the 'cat's. Then at the end return true if count_cats == count_dogs.