I'm working on a compiler design project in Java. Lexical analysis is done (using JFlex), and I'm wondering which yacc-like tool would be best (most efficient, easiest to use, etc.) for syntactic analysis, and why.
If you specifically want YACC-like behavior (table-driven), the only one I know is CUP.
In the Java world, it seems that more people lean toward tools that generate recursive descent parsers, like ANTLR or JavaCC.
And efficiency is seldom a reason to pick a parser generator.
In the past, I've used ANTLR for both lexer and parser, and the JFlex homepage says it can interoperate with ANTLR. I wouldn't say that ANTLR's online documentation is that great. I ended up investing in "The Definitive ANTLR Reference", which helped considerably.
GNU Bison has a Java interface:
http://www.gnu.org/software/bison/manual/html_node/Java-Bison-Interface.html
You can use it to generate Java code.
There is also jacc.
Jacc is about as close to yacc as you can get, but it is implemented in pure Java and generates a Java parser.
It interfaces well with JFlex:
http://web.cecs.pdx.edu/~mpj/jacc/
Another option would be the GOLD Parser.
Unlike many of the alternatives, the GOLD parser generates the parsing tables from the grammar and places them in a binary, non-executable file. Each supported language then has an engine which reads the binary tables and parses your source file.
I've not used the Java implementation specifically, but have used the Delphi engine with fairly good results.
I'm really interested in parser combinators, especially those that can deal with left-recursive and ambiguous grammars. I know the fabulous Superpower by Nicholas Blumhardt, but it's unable to deal with these kinds of grammars.
I've found some GLL parser combinator libraries, such as https://github.com/djspiewak/gll-combinators, but it uses Scala, which is a big inconvenience for me (I don't know that language).
I would like to know whether anything like this exists for C# (or Java).
Thank you very much.
I did a compiler project in Java, using the IntelliJ IDE with the ANTLR 4 extension. There are good resources on the internet. I found the official book, "The Definitive ANTLR 4 Reference", quite good, and the project also offers nice documentation.
ANTLR 4 can deal with directly left-recursive rules and with ambiguous grammars, and you can implement the compiler in C#, Java, or, I think, any of its other target languages.
You can use their starter grammars for many different languages.
Edit:
ANTLR 4 is a tool for Language Recognition, a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It's widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build and walk parse trees.
It's NOT a library.
Scala has an amazingly simple way to create parsers. Is there a fairly equivalent way to doing the same thing in the Java-only world that doesn't take a week of learning curve?
I'm not sure about the learning curve, but in the Java world, the ANTLR parser generator is very well regarded and considered among the best.
How robust and how configurable does the parser need to be? If the grammar is fairly simple and stable you could just use a recursive descent parser, which uses methods that represent each grammar production rule. I think the output would be roughly what JavaCC would produce, as they are both top-down.
http://en.wikipedia.org/wiki/Recursive_descent_parser
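To make the "one method per production rule" idea concrete, here is a minimal sketch of a hand-written recursive descent parser for integer arithmetic. The grammar, the class name `ExprParser`, and the choice to evaluate while parsing (rather than build a tree) are all my own assumptions for illustration. This is not what JavaCC actually emits.

```java
// Illustrative grammar (an assumption, not taken from any tool):
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')'
final class ExprParser {
    private final String src;
    private int pos = 0;

    private ExprParser(String src) {
        this.src = src;
    }

    // Parses and evaluates the whole input; rejects trailing garbage.
    static int evaluate(String src) {
        ExprParser p = new ExprParser(src);
        int value = p.expr();
        p.skipSpaces();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return value;
    }

    // expr := term (('+' | '-') term)*
    private int expr() {
        int value = term();
        while (true) {
            if (accept('+')) {
                value += term();
            } else if (accept('-')) {
                value -= term();
            } else {
                return value;
            }
        }
    }

    // term := factor (('*' | '/') factor)*
    private int term() {
        int value = factor();
        while (true) {
            if (accept('*')) {
                value *= factor();
            } else if (accept('/')) {
                value /= factor();
            } else {
                return value;
            }
        }
    }

    // factor := NUMBER | '(' expr ')'
    private int factor() {
        if (accept('(')) {
            int value = expr();
            if (!accept(')')) {
                throw new IllegalArgumentException("Expected ')' at position " + pos);
            }
            return value;
        }
        skipSpaces();
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at position " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    // Consumes the given character if it is next (after whitespace).
    private boolean accept(char c) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }
}
```

Calling `ExprParser.evaluate("1 + 2 * 3")` returns 7, because `term` binds tighter than `expr`. Note that the repetition in each rule is written as a loop; a rule that called itself first (left recursion) would never terminate in this style.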
Hope this might be helpful.
Manning publications has a book, "DSLs in Action", that covers Java in the beginning.
But you may want to look at using Groovy to write your DSL, as a dynamic language offers a great many opportunities, and it would have a shorter learning curve than Scala does.
For an introduction you can start with http://docs.codehaus.org/display/GROOVY/Writing+Domain-Specific+Languages.
The book I mentioned also covers using ANTLR, and when it does and doesn't make sense to use it, so if you want to get a better understanding of how to write and maintain a DSL, it is an excellent book.
I wish to create an app that translates input Java code into HTML-formatted Java code.
For example:
public class ReadWithScanner
Would become
<span class="public">public</span> <span class="class">class</span> ReadWithScanner
However it gets quite complicated when it comes to parameters and regular expressions. Now I have a bit of time on my hands, and I wish to write my own code parser.
How would I start this? And are there any tutorials or online content that would help me not only write this, but understand it?
Thanks
For help with the complexity of parsing, you'll need to rely on the Java Language Specification.
As I seem to recall, Java is an LL(k) language (see here, for instance). However, the Java language, despite all attempts to keep it "compact", is still quite large and complex. The grammar is spread out over the entire document. This is not a project for the faint of heart. You might consider using a Java parsing tool (like Java-front).
What you need to do is use ANTLR. It already has grammars for parsing Java, so you just need to supply your own templates to output whatever you want from the Abstract Syntax Tree you generate with ANTLR.
If you need a resource for learning about parsers, I can recommend Basics of Compiler Design, which is available as a free download.
It covers more than just parsers, but if you read the first few chapters, you should have a good basic understanding of both lexers and parsers.
I think you need a lexical analyzer.
I have used the Flex lexical analyzer before. It is not too complicated to use.
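For the keyword example in the question, a purely lexical pass could look roughly like the sketch below. It is written by hand in plain Java rather than generated by Flex, and the class name `KeywordHighlighter` and the short keyword list are assumptions made for illustration. It does not recognize string literals or comments, so a keyword inside either would be wrapped by mistake, which is exactly where a real lexer starts to pay off.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class KeywordHighlighter {
    // Deliberately incomplete keyword list, for illustration only.
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "public", "private", "class", "static", "void", "int", "return", "new"));

    // A word: a letter, underscore, or dollar sign, followed by any of those or digits.
    private static final Pattern WORD = Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*");

    // Wraps each keyword in <span class="keyword">...</span>, using the
    // keyword itself as the CSS class, as in the question's example.
    static String highlight(String line) {
        StringBuilder out = new StringBuilder();
        Matcher m = WORD.matcher(line);
        int last = 0;
        while (m.find()) {
            // Copy (and HTML-escape) whatever sits between two words.
            out.append(escape(line.substring(last, m.start())));
            String word = m.group();
            if (KEYWORDS.contains(word)) {
                out.append("<span class=\"").append(word).append("\">")
                   .append(word).append("</span>");
            } else {
                out.append(word);
            }
            last = m.end();
        }
        out.append(escape(line.substring(last)));
        return out.toString();
    }

    // Escapes the characters that would otherwise be read as HTML markup.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

With this sketch, `KeywordHighlighter.highlight("public class ReadWithScanner")` produces the HTML line shown in the question.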
If you need to parse the analyzed text, you can use Bisonc++:
bisoncpp.sourceforge.net/
(C++ knowledge and a Linux environment are needed.)
I want to parse some data, and I have a BNF grammar to parse it with. Can anyone recommend any grammar compilers capable of generating code that can be used on a mobile device?
Since this is for JavaME, the generated code must be:
Hopefully pretty small
Low dependencies on exotic Java libraries
Not dependent on any runtime jar files.
I have used JFlex before, and I know it satisfies your second and third requirements. But I don't know how big the generated code might be. According to the manual, it generates a packed DFA table by default, so it might not be too bad.
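To give a feel for why a table-driven scanner stays small, here is a hand-written sketch of a DFA whose behaviour lives entirely in a transition table. The class name `TinyDfa`, the states, and the two token kinds are assumptions chosen for illustration. JFlex's real output uses a compressed (packed) table and looks quite different, but the scanning loop follows the same idea.

```java
final class TinyDfa {
    // Character classes: the columns of the transition table.
    private static final int LETTER = 0, DIGIT = 1, OTHER = 2;

    // States: the rows of the transition table. ERROR is a dead state.
    private static final int START = 0, IDENT = 1, NUMBER = 2, ERROR = 3;

    // NEXT[state][characterClass] gives the following state.
    private static final int[][] NEXT = {
        /* START  */ { IDENT, NUMBER, ERROR },
        /* IDENT  */ { IDENT, IDENT,  ERROR },
        /* NUMBER */ { ERROR, NUMBER, ERROR },
        /* ERROR  */ { ERROR, ERROR,  ERROR },
    };

    // Maps a character to its column (ASCII letters, underscore, digits).
    private static int charClass(char c) {
        if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_') {
            return LETTER;
        }
        if (c >= '0' && c <= '9') {
            return DIGIT;
        }
        return OTHER;
    }

    // Runs the DFA over the whole string and names the accepting state, if any.
    static String classify(String text) {
        int state = START;
        for (int i = 0; i < text.length(); i++) {
            state = NEXT[state][charClass(text.charAt(i))];
        }
        switch (state) {
            case IDENT:
                return "IDENT";
            case NUMBER:
                return "NUMBER";
            default:
                return "INVALID";
        }
    }
}
```

Adding a token kind means adding rows and columns to the table, not more control flow, which is why the generated code size grows mostly with the table.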
The first question is: do you have an existing grammar definition? When I've ported a LALR grammar to Java, I've used JFlex/CUP.
If you're starting from scratch, I'd suggest you use JavaCC/FreeCC, which is an LL(k) parser generator. It's quite well documented and there are no runtime dependencies.
I'm developing a small programming language based mostly on the C99 standard. I've already written a fairly decent lexer in Java, and now I'm looking to generate a Java parser from the grammar. I know there's Bison, but that seems to only generate C code. I'm looking for an application that will allow me to input my grammar and create a full parser class in Java code. Reading other SO posts on related topics, I've found ANTLR, but I'm wondering if anyone on SO knows about a better tool?
thanks!
Another couple to look at are JavaCC and SableCC (it has been a long time since I looked at SableCC).
I've been quite impressed by BNFC, which is able to generate parsers in Java as well as in C, C++, C#, F#, Haskell, and OCaml.
The JFlex home page at http://jflex.de indicates where to find Bison-like tools that can target Java:
http://byaccj.sourceforge.net/
http://www2.cs.tum.edu/projects/cup/
http://www.antlr.org/