unknown number of children in ANTLR tree - java

I am working on a parser for a calculator, which also needs to build a tree.
For example:
exp returns[Tree tree]e1=exp e2=operator e3=exp{
Tree tempTree = ($e2.tree);
tempTree.insertChild ($e1.tree);
tempTree.insertChild ($e3.tree);
$tree = tempTree;
}
I would like to know how I can build a tree for a function with multiple arguments without assuming the number of children.
For example: max(a,b,c,d,..)
I thought of using something like FUNCTION LEFTBRACKET exp (COMMA exp)* RIGHTBRACKET,
but I am not sure how to build the tree for the * part of the expression.

Something like:
FUNCTION: FUNCTION_NAME LEFTBRACKET PARAMETERS RIGHTBRACKET;
PARAMETERS: exp | exp COMMA PARAMETERS;
may help.

What you did works fine: the matched exp children are collected in a list that you can access via exp().
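The idea of adding one child per iteration of the (COMMA exp)* loop can be sketched in plain Java. This is only an illustration under stated assumptions: CallNode and CallParser are hypothetical names, CallNode stands in for the question's Tree class, and the hand-written parser only mimics what an action inside the loop would do in the grammar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the question's Tree class: a label plus any number of children.
class CallNode {
    final String label;
    final List<CallNode> children = new ArrayList<>();

    CallNode(String label) { this.label = label; }

    void insertChild(CallNode child) { children.add(child); }

    @Override
    public String toString() {
        if (children.isEmpty()) return label;
        StringBuilder sb = new StringBuilder(label).append('(');
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(children.get(i));
        }
        return sb.append(')').toString();
    }
}

// Tiny recursive descent parser for: exp : NAME | NAME '(' exp (',' exp)* ')'
// In a grammar, the same effect would come from an action inside the loop, roughly
//   (COMMA e=exp { $tree.insertChild($e.tree); })*
// (a sketch, not taken from the original post).
class CallParser {
    private final String src;
    private int pos = 0;

    private CallParser(String src) { this.src = src.replace(" ", ""); }

    static CallNode parse(String text) {
        CallParser p = new CallParser(text);
        CallNode root = p.exp();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return root;
    }

    private CallNode exp() {
        int start = pos;
        while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalArgumentException("name expected at " + pos);
        CallNode node = new CallNode(src.substring(start, pos));
        if (pos < src.length() && src.charAt(pos) == '(') {
            pos++;                               // consume '('
            node.insertChild(exp());             // first argument
            while (pos < src.length() && src.charAt(pos) == ',') {
                pos++;                           // consume ','
                node.insertChild(exp());         // one child per loop iteration
            }
            if (pos >= src.length() || src.charAt(pos) != ')') {
                throw new IllegalArgumentException("')' expected at " + pos);
            }
            pos++;                               // consume ')'
        }
        return node;
    }

    public static void main(String[] args) {
        System.out.println(CallParser.parse("max(a, b, min(c, d), e)"));
    }
}
```

Because the children live in a list, nothing in the node type fixes the number of arguments; the loop decides it at parse time.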

weka wrapper attribute selection random forest java

protected static void attSelection_w(Instances data) throws Exception {
AttributeSelection fs = new AttributeSelection();
WrapperSubsetEval wrapper = new WrapperSubsetEval();
wrapper.buildEvaluator(data);
wrapper.setClassifier(new RandomForest());
wrapper.setFolds(10);
wrapper.setThreshold(0.001);
fs.SelectAttributes(data);
fs.setEvaluator(wrapper);
fs.setSearch(new BestFirst());
System.out.println(fs.toResultsString());
}
Above is my code for wrapper-based attribute selection using random forest + best-first search. However, it somehow produces a result using CFS, as shown below.
Search Method:
Greedy Stepwise (forwards).
Start set: no attributes
Merit of best subset found: 0.287
Attribute Subset Evaluator (supervised, Class (nominal): 9 class):
CFS Subset Evaluator
Including locally predictive attributes
There is no other code using CFS in the whole class, and I'm pretty much stuck. I would appreciate any help. Thanks!
You inverted the order, so the selection ran with the default evaluator and search (the CFS and greedy stepwise shown in your output). Set the evaluator and the search first, then call the selection:
//first
fs.setEvaluator(wrapper);
fs.setSearch(new BestFirst());
//then
fs.SelectAttributes(data);
Also set the class index by adding this line right after creating the Instances data:
data.setClassIndex(data.numAttributes() - 1);
I checked and it worked fine.

TSurgeon - relabel node using old value

I am trying to use Tsurgeon on a Stanford parse tree (from the CoreNLP API). The intended operation adds a prefix to the node that I find (e.g. the node found is NN and I would like to rename it to Skip-NN).
What I am trying is this:
TsurgeonPattern surgery = Tsurgeon.parseOperation("relabel target Skip-target");
for (TregexPattern pat : patterns) {
Tsurgeon.processPattern(pat, surgery, tree).pennPrint();
}
An example of one of the TregexPatterns used would be NP << NP=target
As you might have guessed, though, the result is similar to:
NP -> "Skip-target" instead of NP -> "Skip-NP"
I am quite new to using TSurgeon and am unsure as to where to look for information regarding an issue like this.
EDIT: Essentially, what I'm asking is whether there is a way to use the current label of a node when relabeling it.
You should be able to use regexes for this. Something like
relabel target /^(.*)$/Skip-$1/
Be careful with your Tregex pattern, though: it has to ignore nodes whose label already begins with Skip-, since otherwise an already relabeled node would still match.
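The regex side of that advice can be tried out with plain java.util.regex, without Tsurgeon. The sketch below is an illustration only: LabelPrefixer and relabel are hypothetical names, and whether a negative lookahead behaves the same way inside a Tregex or Tsurgeon pattern is an assumption to check against your version.

```java
import java.util.regex.Pattern;

class LabelPrefixer {
    // Matches a whole label only if it does not already start with "Skip-".
    private static final Pattern UNPREFIXED = Pattern.compile("^(?!Skip-)(.*)$");

    // Hypothetical helper: prepends "Skip-" via the $1 back-reference,
    // and returns the label unchanged when the pattern does not match.
    static String relabel(String label) {
        return UNPREFIXED.matcher(label).replaceFirst("Skip-$1");
    }

    public static void main(String[] args) {
        System.out.println(relabel("NP"));       // Skip-NP
        System.out.println(relabel("Skip-NP"));  // Skip-NP (left alone)
    }
}
```

The lookahead (?!Skip-) is what keeps a label from being prefixed twice.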

Java: How to execute an XPath query on a node

I'm reading an XML file with many layers of nesting in Java using XPath.
At the moment I have a method that takes the path to the XML file and an XPath query as parameters, and returns a NodeIterator.
Then I iterate through those nodes, and for some of them (if their name matches) I need to execute another query on them and get a NodeIterator of their children, and so on.
Is it possible to have a function with two parameters, one an already existing Node and the other an XPath query to execute on that Node?
So replacing: NodeIterator ni = XPathAPI.selectNodeIterator(document, xpathQuery);
with something like: NodeIterator ni2 = XPathAPI.selectNodeIterator(parentNode, query);
I've searched the internet and can't find any examples, and I'm not sure what the syntax would be, or whether it's even possible.
Many thanks in advance :)
Presumably your XPathAPI class is the Apache/Xalan org.apache.xpath.XPathAPI?
In that case, what's wrong with
static NodeIterator selectNodeIterator(Node contextNode, java.lang.String str)
It seems to do exactly what you want.
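For comparison, the same "query relative to a node" pattern also works with the JDK's built-in javax.xml.xpath API instead of Xalan's XPathAPI. This is a minimal sketch, not the asker's code: NodeScopedXPath, select and parse are hypothetical helper names, the sample XML is made up, and the result is a NodeList rather than a NodeIterator.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeScopedXPath {
    // Evaluates the query with the given node as context, so relative paths start there.
    static NodeList select(Node context, String query) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (NodeList) xpath.evaluate(query, context, XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + query, e);
        }
    }

    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<shop><shelf name='a'><item>1</item><item>2</item></shelf>"
                + "<shelf name='b'><item>3</item></shelf></shop>");
        NodeList shelves = select(doc, "/shop/shelf");       // query on the document
        for (int i = 0; i < shelves.getLength(); i++) {
            Node shelf = shelves.item(i);
            NodeList items = select(shelf, "item");          // query relative to this shelf
            String name = shelf.getAttributes().getNamedItem("name").getNodeValue();
            System.out.println(name + ": " + items.getLength() + " item(s)");
        }
    }
}
```

Note that a query starting with / is still resolved from the document root even when a child node is passed as context, so the nested queries should use relative paths.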

How to merge two ASTs?

I'm trying to implement a tool for merging different versions of some source code. Given two versions of the same source code, the idea would be to parse them, generate the respective Abstract Syntax Trees (ASTs), and finally merge them into a single output source while keeping grammatical consistency. The lexer and parser are those of the question "ANTLR: How to skip multiline comments".
I know there is a class ParserRuleReturnScope that helps... but getStop() and getStart() always return null :-(
Here is a snippet that illustrates how I modified my parser to get rules printed:
parser grammar CodeTableParser;
options {
tokenVocab = CodeTableLexer;
backtrack = true;
output = AST;
}
@header {
package ch.bsource.ice.parsers;
}
@members {
private void log(ParserRuleReturnScope rule) {
System.out.println("Rule: " + rule.getClass().getName());
System.out.println(" getStart(): " + rule.getStart());
System.out.println(" getStop(): " + rule.getStop());
System.out.println(" getTree(): " + rule.getTree());
}
}
parse
: codeTabHeader codeTable endCodeTable eof { log(retval); }
;
codeTabHeader
: comment CodeTabHeader^ { log(retval); }
;
...
Assuming you have the ASTs (often difficult to get in the first place, parsing real languages is often harder than it looks), you first have to determine what they have in common, and build a mapping collecting that information. That's not as easy as it looks; do you count a block of code that has moved, but is the same exact subtree, as "common"? What about two subtrees that are the same except for consistent renaming of an identifier? What about changed comments? (most ASTs lose the comments; most programmers will think this is a really bad idea).
You can build a variation of the "Longest Common Substring" algorithm to compare trees. I've used that in tools that I have built.
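As a rough illustration of comparing trees with a sequence-alignment idea, the sketch below matches two trees top-down: two nodes can only match if their labels are equal, and their child lists are aligned in order with a longest-common-subsequence style table. This is a simplified stand-in chosen for brevity, not the algorithm used in the answerer's tools; Ast and TreeMatcher are hypothetical names, and moved or renamed subtrees are deliberately not handled.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical minimal AST node: a label and an ordered list of children.
class Ast {
    final String label;
    final List<Ast> kids;

    Ast(String label, Ast... kids) {
        this.label = label;
        this.kids = Arrays.asList(kids);
    }
}

class TreeMatcher {
    // Counts the nodes two trees have in common under a top-down, order-preserving match.
    static int common(Ast a, Ast b) {
        if (!a.label.equals(b.label)) return 0;   // differing roots share nothing
        int n = a.kids.size();
        int m = b.kids.size();
        // best[i][j] = most nodes shared between the first i kids of a and the first j kids of b
        int[][] best = new int[n + 1][m + 1];
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int paired = best[i - 1][j - 1] + common(a.kids.get(i - 1), b.kids.get(j - 1));
                best[i][j] = Math.max(paired, Math.max(best[i - 1][j], best[i][j - 1]));
            }
        }
        return 1 + best[n][m];                    // the matched roots plus the aligned children
    }

    public static void main(String[] args) {
        Ast v1 = new Ast("block",
                new Ast("assign", new Ast("x"), new Ast("1")),
                new Ast("call", new Ast("print"), new Ast("x")));
        Ast v2 = new Ast("block",
                new Ast("assign", new Ast("x"), new Ast("2")),
                new Ast("call", new Ast("log")),
                new Ast("call", new Ast("print"), new Ast("x")));
        System.out.println("shared nodes: " + common(v1, v2));
    }
}
```

A real merge tool would also record which nodes were paired, not just how many, so that the unmatched remainder can be treated as insertions and deletions.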
Finally, after you've merged the trees, you need to regenerate the text, ideally preserving most of the layout of the original code. (Programmers hate it when you change the layout they so lovingly produced.) So your ASTs need to capture position information, and your regeneration has to honor that where it can.
The call to log(retval) in your parser code looks like it's going to happen at the end of the rule, but it actually runs before the rule's return scope is fully populated. You'll want to move the call into an @after block.
I changed log to spit out a message as well as the scope information and added calls to it to my own grammar like so:
script
@init {log("@init", retval);}
@after {log("@after", retval);}
: statement* EOF {log("after last rule reference", retval);}
-> ^(STMTS statement*)
;
Parsing test input produced the following output:
Logging from @init
getStart(): [@0,0:4='Print',<10>,1:0]
getStop(): null
getTree(): null
Logging from after last rule reference
getStart(): [@0,0:4='Print',<10>,1:0]
getStop(): null
getTree(): null
Logging from @after
getStart(): [@0,0:4='Print',<10>,1:0]
getStop(): [@4,15:15='<EOF>',<-1>,1:15]
getTree(): STMTS
The call in the after block has both the stop and tree fields populated.
I can't say whether this will help you with your merging tool, but I think this will at least get you past the problem with the half-populated scope object.

groovy xml parsing function

I want a Groovy function that takes two or more parameters, something like input and find_tag.
I wrote the test below (not yet a function), but it does not give me D_1164898448. Please help me with it.
def temp="""<Portals objVersion=\"1.1.19\">
<vector xsi:type=\"domainservice:Portals\" objVersion=\"1.1.19\">
<domainName>D_1164898448</domainName>
<address xsi:type=\"metadata:NodeRef\" objVersion=\"1.1.19\">
<host>Komodo</host>
<port>18442</port>
</address>
</vector>
</Portals>"""
def fInput="domainName"
def records = new XmlParser().parseText(temp)
def t=records.findAll{ it.fInput}.text()
println t
Update
For attributes I am doing something like this:
println "id = ${records.attribute("id")}"
But how do I do the same for nodes?
println "host = ${records.vector.address.host.text()}"
If you don't know the exact path to the XML tag you're searching for, you can do something like this to get the content of all tags with the given name:
def t = records."**"."$fInput".text()
To access attributes from a given XML node you can also use the @ notation, e.g.
records.vector.@objVersion
What you need to do is:
Turn off namespace awareness, so that XmlParser won't throw an error on encountering the unbound xsi: prefix. You can do this by passing the right arguments to the XmlParser constructor.
Properly traverse the tree returned by the parser: it returns a Node, not a list, so using findAll the way you did will not work.
(Optionally) remove the backslashes before the double quotes in your XML, since escaping double quotes inside a heredoc is not necessary.
Your code after corrections:
def temp="""
<Portals objVersion="1.1.19">
<vector xsi:type="domainservice:Portals" objVersion="1.1.19">
<domainName>D_1164898448</domainName>
<address xsi:type="metadata:NodeRef" objVersion="1.1.19">
<host>Komodo</host>
<port>18442</port>
</address>
</vector>
</Portals>
"""
def fInput="domainName"
def records= new XmlParser(false, false).parseText(temp)
def t = records.vector."$fInput".text()
println t
Running it displays 'D_1164898448', as expected.
I think you should use an XPath expression here or, if your input XML is exactly as shown in the question, I'd recommend a regexp like
def temp = ".." //your temp
def m = temp =~ /<domainName>(.*)<\/domainName>/
print m[0][1] // should be your domain
More about Groovy regexps: http://groovy.codehaus.org/Regular+Expressions
