I am trying to parse a very simple C file using the antlr v4 grammar found at https://github.com/antlr/grammars-v4.
The file looks like this:
#include <stdio.h>
int main()
{
printf("hello world!");
return 0;
}
I am trying to parse the file like this:
public void parse(FileInputStream myFile) throws IOException {
ANTLRInputStream source = new ANTLRInputStream(myFile);
CLexer lexer = new CLexer(source);
CommonTokenStream stream = new CommonTokenStream(lexer);
CParser parser = new CParser(stream);
ParseTree tree = parser.primaryExpression();
ParseTreeWalker.DEFAULT.walk(new MyParseListener(), tree);
}
When I try to parse it, I get this error:
line 1:0 token recognition error at: '#i'
Is there another step I need to do to handle preprocessing? Is the C grammar incomplete?
As far as I can see, the grammar in the currently committed version does not support #include directives.
In fact, the #include directive is not part of the C grammar as such, and therefore it is typically handled by a preprocessor rather than by the compiler itself.
Definition of the C preprocessor
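If running a real preprocessor (for example gcc -E) is not an option, one crude workaround is to blank out directive lines before handing the text to the lexer. The sketch below is only an illustration under that assumption: DirectiveStripper and strip are hypothetical names, and it does not handle backslash line continuations, macros, or a '#' that starts a line inside a multi-line comment. Directive lines are replaced with empty lines so that the line numbers in later error messages still match the original file.

```java
import java.util.StringJoiner;

public class DirectiveStripper {
    /** Replaces every line whose first non-blank character is '#' with an empty line. */
    public static String strip(String source) {
        StringJoiner out = new StringJoiner("\n");
        // The -1 limit keeps trailing empty strings, so a final newline survives.
        for (String line : source.split("\n", -1)) {
            out.add(line.trim().startsWith("#") ? "" : line);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String code = "#include <stdio.h>\nint main()\n{\nreturn 0;\n}\n";
        // The first line comes out empty; the rest is unchanged.
        System.out.print(strip(code));
    }
}
```

The stripped text could then be passed to the lexer as a string instead of the raw file contents. Note that with the includes removed, the parser sees the source without any of the declarations from the headers.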
I am compiling this from the command line with javac Test.java. However, compilation fails, saying it can't find the symbol parser.prog(). Any ideas?
import org.antlr.runtime.*;
public class TestT {
public static void main(String[] args) throws Exception {
// Create a TLexer that feeds from that stream
//TLexer lexer = new TLexer(new ANTLRInputStream(System.in));
TLexer lexer = new TLexer(new ANTLRFileStream("input.txt"));
// Create a stream of tokens fed by the lexer
CommonTokenStream tokens = new CommonTokenStream(lexer);
// Create a parser that feeds off the token stream
TParser parser = new TParser(tokens);
// Begin parsing at rule prog
parser.prog();
}
}
In your T.g4 grammar (or T.g), you must also have a parser rule named prog:
grammar T;
prog
: ...
;
...
Looking at your generated parser, I see you have a parser rule like this:
filter
: expression EOF
;
Use that instead:
// Begin parsing at rule filter
parser.filter();
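If you are not sure which rule methods your generated parser has, you can read the generated source, or list them with reflection. The following is a rough sketch, assuming the ANTLR 4 convention that each rule method takes no arguments and returns a type whose name ends in "Context" (ANTLR 3 parsers use different return types). FakeParser is a hypothetical stand-in so the example runs without ANTLR; you would pass your own TParser.class instead.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class RuleMethodLister {
    // Hypothetical stand-in for a generated parser class.
    static class FakeParser {
        static class FilterContext {}
        static class ExpressionContext {}
        public FilterContext filter() { return new FilterContext(); }
        public ExpressionContext expression() { return new ExpressionContext(); }
        public String getGrammarFileName() { return "T.g4"; }
    }

    /** Returns the sorted names of public no-arg methods declared in the class that return a *Context type. */
    public static List<String> ruleMethods(Class<?> parserClass) {
        List<String> names = new ArrayList<>();
        // getDeclaredMethods skips inherited methods such as Parser.getContext().
        for (Method m : parserClass.getDeclaredMethods()) {
            boolean returnsContext = m.getReturnType().getSimpleName().endsWith("Context");
            if (Modifier.isPublic(m.getModifiers()) && m.getParameterCount() == 0 && returnsContext) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(ruleMethods(FakeParser.class)); // prints [expression, filter]
    }
}
```

Rules that declare arguments in the grammar would be skipped by this filter, so treat the result as a hint rather than a complete list.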
I have a lexer and a parser, y86Lexer and y86Parser, which work as far as I know. I also have a file with y86 commands, and I want to parse them using Java. So far I have the following code.
y86Lexer y86 = null;
CommonTokenStream tokenStream = null;
y86Parser y86p = null;
try
{
y86 = new y86Lexer(CharStreams.fromFileName("C:\\Users\\saigbomian\\Documents"
+ "\\LearnANTLR\\src\\sum.ys"));
tokenStream = new CommonTokenStream(y86);
y86p = new y86Parser(tokenStream);
}
catch (IOException e)
{
log.error("Error occurred while reading from file");
e.printStackTrace();
}
I'm not sure how to do the parsing. I have seen people use something like y86Parser.CompilationUnitContext, but I can't seem to find that class. I have tried printing from the listeners ANTLR creates, but I don't know how to trigger these listeners.
For each rule ruleName in your grammar, the y86Parser class will contain a class named RuleNameContext and a method named ruleName(), which will parse the input according to that rule and return an instance of the RuleNameContext class containing the parse tree. You can then use listeners or visitors to walk that parse tree.
So if you don't have a compilationUnit method or a CompilationUnitContext class, your grammar probably just doesn't have a rule named compilationUnit. Instead you should pick a rule that you do have and call the method corresponding to that rule.
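To make the naming rule concrete, here is a tiny sketch of how a rule name maps to the context class name: the first letter is capitalized and "Context" is appended. RuleNames and contextClassName are hypothetical helper names used only for illustration; ANTLR does this for you when it generates the parser.

```java
public class RuleNames {
    /** Maps a parser rule name to the name of its generated context class. */
    public static String contextClassName(String ruleName) {
        return Character.toUpperCase(ruleName.charAt(0)) + ruleName.substring(1) + "Context";
    }

    public static void main(String[] args) {
        // The rule compilationUnit is parsed by compilationUnit() and yields a CompilationUnitContext.
        System.out.println(contextClassName("compilationUnit")); // prints CompilationUnitContext
    }
}
```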
ANTLR 4 has a new class, ParseTreeWalker. But how do I use it? I am looking for a minimal working example. My grammar file is 'gram.g4' and I want to parse a file 'program.txt'.
Here is my code so far. (This assumes ANTLR has run my grammar file and created all of the gramBaseListener, gramLexer, etc etc):
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.*;
import static org.antlr.v4.runtime.CharStreams.fromFileName;
public class launch{
public static void main(String[] args) {
CharStream cs = fromFileName("gram.g4"); //load the file
gramLexer lexer = new gramLexer(cs); //instantiate a lexer
CommonTokenStream tokens = new CommonTokenStream(lexer); //scan stream for tokens
gramParser parser = new gramParser(tokens); //parse the tokens
// Now what?? How do I connect the above with the below?
ParseTreeWalker walker = new ParseTreeWalker(); // how do I use this to parse program.txt??
}}
I am using java but I assume it is similar in other languages.
The ANTLR documentation (http://www.antlr.org/api/Java/index.html) is short on examples. There are many tutorials on the internet, but they are mostly for ANTLR version 3. The few using version 4 don't work or are outdated (for example, there is no parser.init() function, and classes like ANTLRInputStream are deprecated).
Thanks in advance for anyone who can help.
For each of your parser rules in your grammar the generated parser will have a corresponding method with that name. Calling that method will start parsing at that rule.
Therefore, if your "root rule" is named start, you'd start parsing by calling start() on your gramParser instance, which returns a ParseTree. This tree can then be passed to the ParseTreeWalker along with the listener you want to use.
All in all it could look something like this (EDITED BY OP):
import java.io.IOException;
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.*;
import static org.antlr.v4.runtime.CharStreams.fromFileName;
public class launch{
public static void main(String[] args) throws IOException { // fromFileName can throw IOException
CharStream cs = fromFileName("program.txt"); //load the file
gramLexer lexer = new gramLexer(cs); //instantiate a lexer
CommonTokenStream tokens = new CommonTokenStream(lexer); //scan stream for tokens
gramParser parser = new gramParser(tokens); //parse the tokens
ParseTree tree = parser.start(); // parse the content and get the tree
Mylistener listener = new Mylistener();
ParseTreeWalker walker = new ParseTreeWalker();
walker.walk(listener,tree);
}}
************ NEW FILE Mylistener.java ************
import org.antlr.v4.runtime.ParserRuleContext;
public class Mylistener extends gramBaseListener {
@Override public void enterEveryRule(ParserRuleContext ctx) { //see gramBaseListener for allowed functions
System.out.println("rule entered: " + ctx.getText()); //code that executes per rule
}
}
Of course, you have to replace Mylistener with your own implementation of gramBaseListener.
And just one small side note: in Java it is convention to start class names with a capital letter, and I'd advise you to stick to that to make the code more readable for other people.
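If it is unclear what the walker actually does with the listener, the following self-contained sketch may help. It is not ANTLR's implementation: Node, Listener and walk are hypothetical stand-ins for ParseTree, the generated listener interface and ParseTreeWalker.walk. The idea is the same, though: a depth-first traversal that calls an "enter" callback before visiting a node's children and an "exit" callback afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MiniWalker {
    /** Hypothetical stand-in for a parse tree node. */
    static class Node {
        final String rule;
        final List<Node> children;
        Node(String rule, Node... children) {
            this.rule = rule;
            this.children = Arrays.asList(children);
        }
    }

    /** Hypothetical stand-in for a generated listener interface. */
    interface Listener {
        void enter(Node node);
        void exit(Node node);
    }

    /** Depth-first walk: enter the node, walk its children in order, then exit. */
    static void walk(Listener listener, Node node) {
        listener.enter(node);
        for (Node child : node.children) {
            walk(listener, child);
        }
        listener.exit(node);
    }

    /** Walks the tree with a listener that records every event as a string. */
    static List<String> trace(Node root) {
        List<String> events = new ArrayList<>();
        walk(new Listener() {
            public void enter(Node n) { events.add("enter " + n.rule); }
            public void exit(Node n) { events.add("exit " + n.rule); }
        }, root);
        return events;
    }

    public static void main(String[] args) {
        Node tree = new Node("start", new Node("statement"), new Node("statement"));
        trace(tree).forEach(System.out::println);
    }
}
```

This is why the listener only sees anything once a rule method has been called and has produced a tree: the walker needs a finished tree to traverse.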
This example should work with ANTLR 4.8.
Below the example you can find references for setting up your Java environment, the API, and listeners.
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.*;

public class Launch {
    public static void main(String[] args) throws IOException {
        MyprogramLexer programLexer;
        // try-with-resources closes the stream; fromStream has read all of it by then
        try (InputStream inputStream = new FileInputStream("/program.txt")) {
            programLexer = new MyprogramLexer(CharStreams.fromStream(inputStream)); // read your program input and create lexer instance
        }
        /* assuming a basic grammar:
        myProgramStart: TOKEN1 otherRule TOKEN2 ';' | TOKENX finalRule ';'
        ...
        */
        CommonTokenStream tokens = new CommonTokenStream(programLexer); // get tokens
        MyprogramParser parser = new MyprogramParser(tokens);
        ParseTree tree = parser.myProgramStart(); // myProgramStart is your grammar rule to parse
        MyProgramListener listener = new MyProgramListener(); // your custom extension of MyprogramBaseListener
        ParseTreeWalker.DEFAULT.walk(listener, tree); // fire the listener callbacks over the finished tree
        // what have we built?
        MyProgram myProgramInstance = listener.getMyProgram(); // in your listener implementation populate a MyProgram instance
        System.out.println(myProgramInstance.toString());
    }
}
References:
https://www.antlr.org/api/Java/
https://tomassetti.me/antlr-mega-tutorial/#java-setup
https://riptutorial.com/antlr/example/16571/listener-events-using-labels
I am using ANTLR v4 to extract the parse tree of Java programs for other purposes. I started from this sample: ANTLR v4 visitor sample
I followed the steps in the given link to check whether it works, and everything went fine:
java Run
a = 1+2
b = a^2
c = a+b*(a-1)
a+b+c
^Z
Result: 33.0
Then I wrote my own version to parse Java programs, with the structure below:
|_Java.g4
|_Java.tokens
|_JavaBaseVisitor.java
|_JavaLexer.java
|_JavaLexer.tokens
|_JavaParser.java
|_JavaTreeExtractorVisitor.java
|_JavaVisitor.java
|_Run.java
And the Run.java is as below:
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.*;
public class Run {
public static void main(String[] args) throws Exception {
CharStream input = CharStreams.fromFileName("F:\\Projects\\Java\\Netbeans\\ASTProj\\JavaTreeExtractor\\prog.java");
JavaLexer lexer = new JavaLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
JavaParser parser = new JavaParser(tokens);
ParseTree tree = parser.getContext();
JavaTreeExtractorVisitor calcVisitor = new JavaTreeExtractorVisitor();
String result = calcVisitor.visit(tree);
System.out.println("Result: " + result);
}
}
But after the statement ParseTree tree = parser.getContext(); the tree object is null.
As I am new to ANTLR, are there any suggestions for what to check, or any solution?
(If more info is required, just let me know.)
TG.
Assuming you're using the grammar here, you want the starting point for parsing a Java file to be
ParseTree tree = parser.compilationUnit();
(For anyone not using that grammar, you want whatever you named your top-level parser rule.)
Shouldn't you be doing:
ParseTree tree = parser.input();
as in the calculator example?
I use the grammar Java.g from the ANTLR wiki to produce a lexer and parser for Java source files. Then I use the following code to generate an abstract syntax tree (AST).
ANTLRInputStream input = new ANTLRInputStream(new FileInputStream(fileName));
JavaLexer lexer = new JavaLexer(input); // create lexer
// create a buffer of tokens pulled from the lexer
CommonTokenStream tokens = new CommonTokenStream(lexer);
JavaParser parser = new JavaParser(tokens); // create parser
JavaParser.javaSource_return r = parser.javaSource(); // parse rule 'javaSource'
/*RuleReturnScope result = parser.compilationUnit();
CommonTree t = (CommonTree) result.getTree();*/
// WALK TREE
// get the tree from the return structure for rule javaSource
CommonTree t = (CommonTree)r.getTree();
Then I modify the AST. For example, I replace "File file = new File(filepath, fileType);" with
"S3Object _file = new S3Object(_fileName);" by modifying the AST node. After this, I want to translate the AST back to Java source code. I modified JavaTreeParser.g, wrote a StringTemplate group file, and use the following code to get the Java source code:
FileReader groupFileR = new FileReader("src/com/googlecode/zcg/templates/JavaTemplate.stg");
StringTemplateGroup templates = new StringTemplateGroup(groupFileR);
groupFileR.close();
// create a stream of tree nodes from AST built by parser
CommonTreeNodeStream nodes = new CommonTreeNodeStream(t);
// tell it where it can find the token objects
nodes.setTokenStream(tokens);
JavaTreeParser walker = new JavaTreeParser(nodes); // create the tree Walker
walker.setTemplateLib(templates); // where to find templates
// invoke rule javaSource, passing in information from parser
JavaTreeParser.javaSource_return r2 = walker.javaSource();
// EMIT BYTE CODES
// get template from return values struct
StringTemplate output = (StringTemplate)r2.getTemplate();
System.out.println(output.toString()); // render full template
If I don't modify the AST, I get the correct Java source code, but after I modify the AST, the generated source is wrong (the AST itself was modified correctly). For example, if I take the following source code, translate it to an AST, and then change "File file = new File(filepath, fileType);" to "S3Object _file = new S3Object(_fileName);":
public void methodname(String address){
String filepath = "file";
int fileType = 3;
File file = new File(filepath, fileType);
}
the result will be the following:
public void methodname( String address)
{
String filepath="file";
int fileType=3;
methodname (Stringaddress){Stringfilepath;//it's not what I wanted
}
Am I doing it wrong? Is there a more proper way for me to solve this problem?
Unfortunately, I cannot recommend doing source-to-source translation by rewriting the abstract syntax trees; try using the parse trees. If I remember correctly, ANTLR 3 can also generate those easily.
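One reason working close to the original text tends to be more robust, sketched here as an illustration rather than as the approach recommended above: tokens carry their character offsets in the input, so you can splice in replacement text for one statement and leave everything else byte-for-byte as it was, instead of regenerating the whole file from a tree. SpanRewriter and replace are hypothetical names, and the offsets are found with indexOf only to keep the example self-contained; in a real setup they would come from the first and last token of the statement you want to replace. ANTLR 3's TokenRewriteStream is built around a similar idea.

```java
public class SpanRewriter {
    /** Replaces the characters in [start, stop] (inclusive) and leaves everything else untouched. */
    public static String replace(String source, int start, int stop, String replacement) {
        return source.substring(0, start) + replacement + source.substring(stop + 1);
    }

    public static void main(String[] args) {
        String source = "int fileType = 3;\nFile file = new File(filepath, fileType);\n";
        String old = "File file = new File(filepath, fileType);";
        // Stand-in for the start/stop offsets of the statement's first and last token.
        int start = source.indexOf(old);
        int stop = start + old.length() - 1;
        System.out.print(replace(source, start, stop, "S3Object _file = new S3Object(_fileName);"));
    }
}
```

Because only the chosen span is touched, the surrounding whitespace, comments and formatting survive without needing a template for every construct in the language.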
Ter