How to modularize a JavaCC grammar file (.jj)? - java

I am learning compiler construction and want to implement the JavaScript grammar using JavaCC.
(I have already written my own JavaScript CodeModel which allows programmatic construction of the JavaScript code, now I want to write a JavaCC-based parser counterpart for that.)
My question is, is there a way to modularize the JavaCC grammar (.jj-file) into several files?
I have very good experience with the JavaParser so I am learning from their java_1_5.jj grammar. However, this is a 3000+ LoC file which is a bit hard to comprehend.
I would like to divide the grammar file into several parts so that it's easier to handle and understand. My Google searches for "javacc modular", "javacc include", and "javacc import" returned only cryptic results that did not help much.
To be specific, how would I move the definition of the IDENTIFIER (lines 380-1081) to another file?

There is no facility built into JavaCC for modularizing .jj files. Often the best thing to do is to use JJTree (JJT), as this allows you to move all actions out of the grammar file. If you don't want to use JJTree, the next best thing may be to use the builder pattern.
If you just want an include facility, there are many preprocessors that can be used.
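As an illustration of the preprocessor route, here is a minimal sketch of a home-grown include step run before JavaCC. Everything in it is an assumption made for the example: the `#include "file"` directive is invented here (it is not JavaCC syntax), and the class name `GrammarIncluder` and the file names are hypothetical. An existing general-purpose preprocessor would do the same job.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical include preprocessor; "#include" is NOT part of JavaCC itself. */
final class GrammarIncluder {
    // Matches a whole line of the form:  #include "relative/path.jj"
    private static final Pattern INCLUDE =
            Pattern.compile("^\\s*#include\\s+\"([^\"]+)\"\\s*$");

    /** Returns the text of file with each #include line replaced by the expanded included file. */
    static String expand(Path file) {
        return expand(file.toAbsolutePath().normalize(), new HashSet<>());
    }

    private static String expand(Path file, Set<Path> active) {
        // 'active' holds the files currently being expanded, to detect cycles.
        if (!active.add(file)) {
            throw new IllegalStateException("Circular include: " + file);
        }
        StringBuilder out = new StringBuilder();
        try {
            for (String line : Files.readAllLines(file)) {
                Matcher m = INCLUDE.matcher(line);
                if (m.matches()) {
                    // Included paths are resolved relative to the including file.
                    Path target = file.resolveSibling(m.group(1)).normalize();
                    out.append(expand(target, active));
                } else {
                    out.append(line).append('\n');
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        active.remove(file);
        return out.toString();
    }

    // Usage (hypothetical file names): java GrammarIncluder main.jj.in main.jj
    public static void main(String[] args) throws IOException {
        Files.writeString(Path.of(args[1]), expand(Path.of(args[0])));
    }
}
```

With something like this, the IDENTIFIER token definition could live in its own file and be pulled into the main grammar by a single `#include` line; JavaCC is then run on the generated .jj file. The drawback is that error line numbers reported by JavaCC refer to the generated file, not to the fragments.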

Yes, you can create separate classes, instantiate them, and call into those objects (passing parameters) from the try/catch blocks inside the JavaCC file, which makes the grammar look more modular.

I'm aware that the question was posed nearly six years ago, but I'll answer even so, since people will be looking for an answer to this now and again.
The most advanced version of JavaCC is JavaCC 21, and JavaCC 21 does have (among other things) an INCLUDE directive that, as best I can guess, is exactly what you are looking for.
There are actually quite a few other features in JavaCC 21 that are not present in the legacy JavaCC project. Here is a biggie: the longstanding bug in which nested syntactic lookahead does not work correctly has been fixed. See here.

Related

Is it possible to use the auto-generated antlr parser (or its grammar) from a Xtext project?

I was wondering, whether it is possible to take the antlr grammar (*.g) or the generated parsers (from this grammar) and use it in a separate project?
For this I was looking into the SysMLv2 (eclipse-based) project on github, where xtext was used in order to define the grammar of this new modelling language. The grammar and the generated parsers can be found here.
My first idea was just to take the grammar file (InternalAlf.g) and use ANTLR (I tried 3.5.0 and 3.5.2) to generate the parser and lexer. Doing this, I end up with a bunch of error messages saying that symbols were not found (the symbol in question: EObject).
Then, since it is obviously an Eclipse project, I figured another naive solution would be to package the whole project as a jar and include it as a library in mine. I tried to use Eclipse for that (Export -> Executable JAR). That option requires a main class; I am not sure which one to pick, which also makes me doubt this approach. The other JAR export option does not allow me to add the necessary dependencies to my jar.
Any other proposals? Since the ANTLR grammar file is available, it should (in principle) be quite easy to generate the parser, but I am not sure how to do this, since the grammar file has a bunch of dependencies. Or, to rephrase the question: how do I deal with this type of ANTLR grammar file (one that depends on Java libraries)? In typical ANTLR tutorials, I (as a newbie to ANTLR and Xtext) could not find the answer.
best regards
I looked at the grammar in that project. It is HIGHLY specific to Xtext (to the point that it's a bit difficult to find the ANTLR grammar amongst all of the actions).
You might be able to use the ANTLR3 grammar to parse it and discard all of the actions, etc. that make it so tightly coupled to Xtext (being careful about any semantic predicates and dependencies they might have on those actions). Emphasis on the MIGHT here.
In short, it’s not going to be at all simple to generate a parser divorced from Xtext using this grammar.
If you were to elaborate on what you need to accomplish, and on why you feel a need to create a separate parser rather than just using the Xtext-based SysMLv2 implementation, someone might be able to point you in an appropriate direction.

What's the best way to write an OCaml parser in Scala/Java?

So I started to write a parser for OCaml in Scala with the Scala CombinatorParser,
but I get the feeling that this is not the right tool for the job.
Especially getting the precedences and associativity of operators and non-closed constructions right can be challenging.
So my question is: what's the best way to write a real-world parser such as one for OCaml?
I looked into parser generators like ANTLR, but there are numerous and I have no idea which one would actually make the job easier.
You can have a look at JavaCC generator. I find it quite useful to make DSL parsers. I guess it's a good candidate for parsing "real" languages too.
The OCaml parser is implemented in pretty straightforward lex+yacc. Therefore, the easiest way is to port the rules using the equivalent lex+yacc toolset in your language.
I do not mean that converting the OCaml parsing rules to LL(k) (e.g. Parsec) is completely impossible. Actually, it is not very difficult if you write an automatic conversion tool: see my blog entry about it http://camlspotter.blogspot.sg/2011/05/planck-small-parser-combinator-library.html But by hand, it is almost impossible to do correctly in a short time.
-edit-
On second thought, the easiest way, if you are not a Scala/Java purist, is to use the original OCaml parser and write some OCaml code to output its AST in a form that is easy to parse from any other language, for example, S-expressions.
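To show why S-expressions are easy to consume on the Java/Scala side, here is a minimal sketch of a reader. It is an illustration only: the class name `SExpReader` is hypothetical, the input shape used below is invented and is not what the OCaml compiler actually prints, and quoted strings and comments are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal S-expression reader: atoms become String, lists become List<Object>. */
final class SExpReader {
    private final String src;
    private int pos = 0;

    private SExpReader(String src) { this.src = src; }

    /** Parses exactly one expression; anything but whitespace after it is an error. */
    static Object parse(String text) {
        SExpReader r = new SExpReader(text);
        Object value = r.readExpr();
        r.skipWhitespace();
        if (r.pos != r.src.length()) {
            throw new IllegalArgumentException("Trailing input at offset " + r.pos);
        }
        return value;
    }

    private Object readExpr() {
        skipWhitespace();
        if (pos >= src.length()) {
            throw new IllegalArgumentException("Unexpected end of input");
        }
        char c = src.charAt(pos);
        if (c == '(') {
            pos++; // consume '('
            List<Object> items = new ArrayList<>();
            while (true) {
                skipWhitespace();
                if (pos >= src.length()) {
                    throw new IllegalArgumentException("Missing ')'");
                }
                if (src.charAt(pos) == ')') {
                    pos++; // consume ')'
                    return items;
                }
                items.add(readExpr());
            }
        }
        if (c == ')') {
            throw new IllegalArgumentException("Unexpected ')' at offset " + pos);
        }
        // An atom runs until whitespace or a parenthesis.
        int start = pos;
        while (pos < src.length()
                && !Character.isWhitespace(src.charAt(pos))
                && src.charAt(pos) != '(' && src.charAt(pos) != ')') {
            pos++;
        }
        return src.substring(start, pos);
    }

    private void skipWhitespace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }
}
```

For example, `SExpReader.parse("(let (x 1) (+ x 2))")` yields a nested list whose first element is the atom `let`. Mapping such lists onto real AST classes is then ordinary tree-walking code, with no precedence or associativity issues left to solve.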
You may want to check out ANTLR. For small DSLs I found it very usable. I assume it can handle complex languages as well.

Syntax Preprocessors for Java

I'm looking for a Java macro language that provides for convenient ways of doing closures (that compile to anonymous inner classes) and list comprehension (that compiles down to basic java loops).
An example of the kind of thing I'm looking for would be Xtend2 http://www.eclipse.org/Xtext/#xtend2
But I want something for general purpose programming (Xtend2 is very specific DSL for Xtext and has a ton of dependencies). Maybe even something that would let me define multiple classes in a single file (which would then get split up into two separate files by the pre-processor).
Does anything like this exist?
Edited to add:
I'm doing Android development so any alternatives have to generate either valid Java source or the byte code has to be compatible with the dalvik recompiler.
Mmm, there used to be the JSE, which was tremendous fun, back in the day.
Mirah is cool, but not ready for primetime, IMO.
You can do a lot with smart templating, although your source view is the Java.
There's a post on SO about using XTend on Android from a few days ago, too.
Frege produces java source code.
I do not know whether dalvik would like it. (But I would be interested to hear ...)
And, of course, you have some runtime library code.
That being said, there are a number of other projects that do closures etc. in java, for example: lambdaj

JavaME-suitable grammar compiler recommendations?

I want to parse some data, and I have a BNF grammar to parse it with. Can anyone recommend any grammar compilers capable of generating code that can be used on a mobile device?
Since this is for JavaME, the generated code must be:
Hopefully pretty small
Low dependencies on exotic Java libraries
Not dependent on any runtime jar files.
I have used JFlex before, and I know it satisfies your second and third requirements. But I don't know how big the generated code might be. According to the manual, it generates a packed DFA table by default, so it might not be too bad.
The first question is do you have an existing grammar definition? When I've ported a LALR grammar to Java, I've used JFlex/CUP.
If you're starting from scratch, I'd suggest you use JavaCC/FreeCC, which is an LL(k) parser generator. It's quite well documented and there are no runtime dependencies.

Which Java oriented lexer parser for simple project (ANTLR, DIY, etc)

I am working on a small text editor project and want to add basic syntax highlighting for a couple of languages (Java, XML, just to name a few). As a learning experience, I want to use one of the popular (or not so popular) Java lexer/parser tools.
Which project do you recommend? ANTLR is probably the best known, but it seems pretty complex and heavy.
Here are the options that I know of.
Antlr
Ragel (yes, it can generate Java source for processing input)
Do it yourself (I guess I could write a simple token parser and highlight the source code).
ANTLR or JavaCC would be the two I know. I'd recommend ANTLR first.
ANTLR may seem complex and heavy but you don't need to use all of the functionality that it includes; it's nicely layered. I'm a big fan of using it to develop parsers. For starters, you can use the excellent ANTLRWorks to visualize and test the grammars that you are creating. It's really nice to be able to watch it capture tokens, build parse trees and step through the process.
For your text editor project, I would check out filter grammars, which might suit your needs nicely. For filter grammars you don't need to specify the entire lexical structure of your language, only the parts that you care about (i.e. need to highlight, color or index) and you can always add in more until you can handle a whole language.
Google Code has a new project, acacia-lex. I wrote it myself; it is a simple (so far) Java lexer that uses javax annotations.
SableCC
Another interesting option (which I didn't try yet) would be Xtext, which uses Antlr but also includes tools for creating Eclipse editors for your language.
ANTLR is the way to go. I would not build it by hand. You'll also find if you look around on the ANTLR web site that grammars are available for Java, XML, etc.
Another option would be Xtext. It will not only generate a parser for your grammar, but also a complete editor with syntax coloring, error markers, content assist and outline view.
I've done it with JFlex before and was quite satisfied with it. But the language I was highlighting was simple enough that I didn't need a parser generator, so your mileage may vary.
JLex and CUP are decent lexer and parser generators, respectively. I'm currently using both to develop a simple scripting language for a project I'm working on.
I don't think that you need a lexer. All you need to do is read the file extension to detect the language, then look up that language's keywords in an XML file that lists them, and highlight them wherever they occur.
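A minimal sketch of this keyword-lookup approach might look like the following. The class name `KeywordHighlighter` is hypothetical, and the keyword tables are hard-coded and deliberately incomplete; they stand in for the XML file mentioned above, which is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class KeywordHighlighter {
    // Hypothetical, incomplete keyword tables keyed by file extension
    // (stand-ins for the XML keyword file).
    private static final Map<String, Set<String>> KEYWORDS = Map.of(
            "java", Set.of("class", "public", "static", "void", "return", "if", "else"),
            "py", Set.of("def", "return", "if", "else", "import"));

    // An identifier-like word: a letter or underscore, then letters, digits, underscores.
    private static final Pattern WORD = Pattern.compile("[A-Za-z_][A-Za-z_0-9]*");

    /** Returns {start, end} offsets (end exclusive) of every keyword occurrence in text. */
    static List<int[]> keywordSpans(String fileName, String text) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        Set<String> keywords = KEYWORDS.getOrDefault(ext, Set.of());
        List<int[]> spans = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            if (keywords.contains(m.group())) {
                spans.add(new int[] {m.start(), m.end()});
            }
        }
        return spans;
    }
}
```

Note the limitation: this also marks keywords that appear inside string literals and comments, because it has no notion of token boundaries beyond words. That is the gap a real lexer (JFlex, ANTLR, and so on) closes.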
