Scala has an amazingly simple way to create parsers. Is there a fairly equivalent way to do the same thing in the Java-only world that doesn't come with a week-long learning curve?
I'm not sure about the learning curve, but in the Java world, the ANTLR parser generator is very well regarded and considered among the best.
How robust and how configurable does the parser need to be? If the grammar is fairly simple and stable you could just use a recursive descent parser, which uses methods that represent each grammar production rule. I think the output would be roughly what JavaCC would produce, as they are both top-down.
http://en.wikipedia.org/wiki/Recursive_descent_parser
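To make the "one method per production rule" idea concrete, here is a minimal sketch of a hand-written recursive descent parser in plain Java. The arithmetic grammar, the class name ArithmeticParser and the evaluate method are assumptions made up for illustration; they are not taken from JavaCC or any other tool, and a real DSL would need its own rules and better error reporting.

```java
// Hypothetical grammar, chosen only for illustration:
//   expr   := term (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')' | '-' factor
final class ArithmeticParser {
    private final String input;
    private int pos = 0;

    private ArithmeticParser(String input) {
        this.input = input;
    }

    /** Parses and evaluates the whole input, rejecting trailing characters. */
    static double evaluate(String text) {
        ArithmeticParser parser = new ArithmeticParser(text);
        double value = parser.expr();
        parser.skipWhitespace();
        if (parser.pos < parser.input.length()) {
            throw new IllegalArgumentException("Unexpected input at " + parser.pos);
        }
        return value;
    }

    // expr := term (('+' | '-') term)*   (the loop makes it left-associative)
    private double expr() {
        double value = term();
        while (true) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*   (binds tighter than expr)
    private double term() {
        double value = factor();
        while (true) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    // factor := NUMBER | '(' expr ')' | '-' factor
    private double factor() {
        if (accept('-')) return -factor();
        if (accept('(')) {
            double value = expr();
            if (!accept(')')) throw new IllegalArgumentException("Expected ')' at " + pos);
            return value;
        }
        skipWhitespace();
        int start = pos;
        while (pos < input.length()
                && (Character.isDigit(input.charAt(pos)) || input.charAt(pos) == '.')) {
            pos++;
        }
        if (start == pos) throw new IllegalArgumentException("Expected a number at " + pos);
        return Double.parseDouble(input.substring(start, pos));
    }

    /** Consumes the expected character (after whitespace) if it is next. */
    private boolean accept(char expected) {
        skipWhitespace();
        if (pos < input.length() && input.charAt(pos) == expected) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipWhitespace() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("1 + 2 * (3 - 1)")); // prints 5.0
    }
}
```

Operator precedence falls out of which method calls which (expr calls term, term calls factor), which is why this style stays manageable only while the grammar is small and stable.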
Hope this might be helpful.
Manning publications has a book, "DSLs in Action", that covers Java in the beginning.
But you may want to look at using Groovy to write your DSL, as a dynamic language offers a great many opportunities here, and it would have a shorter learning curve than Scala does.
For an introduction you can start with http://docs.codehaus.org/display/GROOVY/Writing+Domain-Specific+Languages.
The book I mentioned also covers ANTLR, including when it makes sense to use it and when it doesn't, so if you want a better understanding of how to write and maintain a DSL, it is an excellent book.
Related
So I started to write a parser for OCaml in Scala with the Scala CombinatorParser,
but I get the feeling that this is not the right tool for the job.
Especially getting the precedences and associativity of operators and non-closed constructions right can be challenging.
So my question is: What's the best way to write a real-world parser such as one for OCaml?
I looked into parser generators like ANTLR, but there are numerous and I have no idea which one would actually make the job easier.
You can have a look at the JavaCC generator. I find it quite useful for making DSL parsers. I guess it's a good candidate for parsing "real" languages too.
The OCaml parser is implemented in pretty straightforward lex+yacc. Therefore, the easiest way is to port the rules using the equivalent lex+yacc toolset in your language.
I do not mean that converting the OCaml parsing rules to LL(k) (e.g. for Parsec) is completely impossible. Actually, it is not very difficult if you write an automatic conversion tool: see my blog entry about it, http://camlspotter.blogspot.sg/2011/05/planck-small-parser-combinator-library.html. But by hand, it is an almost impossible task to do correctly in a short time.
-edit-
On second thought, the easiest way, if you are not a Scala/Java purist, is to use the original OCaml parser and write some OCaml code to output its AST in a form that is easy to parse from any other language, for example S-expressions.
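To give a feel for how little code the reading side would need, here is a minimal sketch of an S-expression reader in Java. It assumes a hypothetical dump format made only of whitespace-separated atoms and parenthesized lists (no quoted strings, no comments); the class name SExprReader and the sample input are invented for illustration and say nothing about what an actual OCaml AST dump would look like.

```java
import java.util.ArrayList;
import java.util.List;

final class SExprReader {
    private final String text;
    private int pos = 0;

    private SExprReader(String text) {
        this.text = text;
    }

    /** Returns a String for an atom, or a List<Object> for a parenthesized list. */
    static Object read(String text) {
        SExprReader reader = new SExprReader(text);
        Object result = reader.readExpr();
        reader.skipWhitespace();
        if (reader.pos != text.length()) {
            throw new IllegalArgumentException("Trailing input at " + reader.pos);
        }
        return result;
    }

    private Object readExpr() {
        skipWhitespace();
        if (pos >= text.length()) throw new IllegalArgumentException("Unexpected end of input");
        char c = text.charAt(pos);
        if (c == '(') {
            pos++; // consume '('
            List<Object> items = new ArrayList<>();
            while (true) {
                skipWhitespace();
                if (pos >= text.length()) throw new IllegalArgumentException("Missing ')'");
                if (text.charAt(pos) == ')') {
                    pos++; // consume ')'
                    return items;
                }
                items.add(readExpr()); // nested lists recurse here
            }
        }
        if (c == ')') throw new IllegalArgumentException("Unexpected ')' at " + pos);
        // An atom runs until whitespace or a parenthesis.
        int start = pos;
        while (pos < text.length()
                && !Character.isWhitespace(text.charAt(pos))
                && text.charAt(pos) != '('
                && text.charAt(pos) != ')') {
            pos++;
        }
        return text.substring(start, pos);
    }

    private void skipWhitespace() {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) pos++;
    }

    public static void main(String[] args) {
        // Hypothetical dump of "let x = 1 + 2"
        System.out.println(read("(let x (add 1 2))")); // prints [let, x, [add, 1, 2]]
    }
}
```

All the hard grammar work stays in the original OCaml parser; the Java or Scala side only walks nested lists.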
You may want to check out ANTLR. For small DSLs I found it very usable. I assume it can handle complex languages as well.
I want to write a parser and converter for haml-like languages, to parse them and convert them into HTML content.
I found that people usually use regular expressions to do this, but then we have to write a lot of difficult regular expressions, which is not easy. Are there any tools or libraries for this? I'd like something in Java that is easy to use.
Also, are there any articles about how to write such a parser? Thanks in advance!
Regular expressions are usually a poor man's parser. A regex is not a real parser.
Parsers are usually generated by a parser generator. You describe the language in a specification file, and the parser generator converts this into source code for your parser.
After some research and testing, I have to say, parboiled is the best tool for this job.
I spent one day on PEG and the good examples parboiled provides, and another day writing a simple Sass parser. It was so easy and natural, and much easier and clearer than regex. And the best thing is that I can write the program using only Java, with no external DSL to learn.
I want to say thank you very much to the author of parboiled; it's exactly the tool I was looking for.
You can use JavaCC. It is a yacc-like parser generator. The output is the Java source code for the parser.
I need to translate programs written in a domain-specific language into an XML representation. These programs are simple text files. What approach would you suggest? What API should I use to:
Parse the text files written in this language.
Write XML based on the tokens and token streams I obtain.
My main criterion is rapid and easy development rather than memory or computing-time efficiency.
Many Thanks
Ketan
The less trivial part of the job is with step #1, parsing the Domain Specific Language (DSL) text, rather than #2, pushing this to some XML language.
Hopefully you readily have a parser for the DSL (obviously this language must have been put to use somewhere...), and you may be able to "hook" your export/conversion logic into this parser. If such is not possible, you'll need to write a new parser.
Depending on the complexity of the DSL, you may be able to write, longhand, a simple parser based on a few loops and switch cases.
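As a rough illustration of the "few loops and switch cases" approach, here is a minimal sketch in Java. The DSL shown (comment lines starting with #, [section] headers, and key = value entries) is purely hypothetical, as are the class name DslToXml and the XML element names; the real language and target schema would dictate both.

```java
final class DslToXml {
    /** Converts the hypothetical line-based DSL into a flat XML string. */
    static String convert(String source) {
        StringBuilder xml = new StringBuilder("<program>");
        boolean inSection = false;
        for (String rawLine : source.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty()) continue;
            switch (line.charAt(0)) {
                case '#':
                    break; // comment line, nothing to emit
                case '[':
                    if (!line.endsWith("]")) {
                        throw new IllegalArgumentException("Unclosed section header: " + line);
                    }
                    if (inSection) xml.append("</section>");
                    String name = line.substring(1, line.length() - 1).trim();
                    xml.append("<section name=\"").append(escape(name)).append("\">");
                    inSection = true;
                    break;
                default:
                    int eq = line.indexOf('=');
                    if (eq < 0) {
                        throw new IllegalArgumentException("Expected key = value: " + line);
                    }
                    xml.append("<entry key=\"").append(escape(line.substring(0, eq).trim()))
                       .append("\">").append(escape(line.substring(eq + 1).trim()))
                       .append("</entry>");
            }
        }
        if (inSection) xml.append("</section>");
        return xml.append("</program>").toString();
    }

    /** Escapes the characters that are special in XML text and attribute values. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        System.out.println(convert("# demo\n[db]\nhost = localhost\n"));
    }
}
```

This only holds up while the language stays line-oriented and non-nested; once constructs can nest or span lines, a grammar-driven tool becomes the better investment.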
For more complicated languages, ANTLR is often a good choice. In a nutshell, one formalizes the grammar of the DSL in Backus-Naur Form (BNF, or actually EBNF here, i.e. the Extended family), and ANTLR produces a parser written in a target language of choice (including Java). The learning curve with ANTLR is a factor to consider, but in the context of a moderately to extremely sophisticated language it is a well-worthwhile investment. ANTLR is similar to, but in my opinion a better tool than, GNU Bison; the latter would also do the trick, and can target Java too if so desired.
If you are familiar with other languages, in particular Python, there are many other tools that can be put to use for more or less ad-hoc parsers; I've also used PyParsing and gladly recommend it.
XStream is the best XML serializer/deserializer for Java EVAR. If you can turn your DSL into Java classes, this is a great library to use.
I'm working on a compiler design project in Java. Lexical analysis is done (using JFlex) and I'm wondering which yacc-like tool would be best (most efficient, easiest to use, etc.) for doing syntactic analysis, and why.
If you specifically want YACC-like behavior (table-driven), the only one I know is CUP.
In the Java world, it seems that more people lean toward recursive descent parser generators like ANTLR or JavaCC.
And efficiency is seldom a reason to pick a parser generator.
In the past, I've used ANTLR for both lexer and parser, and the JFlex homepage says it can interoperate with ANTLR. I wouldn't say that ANTLR's online documentation is that great. I ended up investing in "The Definitive ANTLR Reference", which helped considerably.
GNU Bison has a Java interface,
http://www.gnu.org/software/bison/manual/html_node/Java-Bison-Interface.html
You can use it to generate Java code.
There is also jacc.
Jacc is about as close to yacc as you can get, but it is implemented in pure Java and generates a Java parser.
It interfaces well with JFlex:
http://web.cecs.pdx.edu/~mpj/jacc/
Another option would be the GOLD Parser.
Unlike many of the alternatives, the GOLD parser generates the parsing tables from the grammar and places them in a binary, non-executable file. Each supported language then has an engine which reads the binary tables and parses your source file.
I've not used the Java implementation specifically, but have used the Delphi engine with fairly good results.
I'm writing a Java application using Struts 2, but now I'd like to make it a hybrid Java & Scala project instead. I don't have much experience with Scala, but I learned Haskell years ago at college -- I really liked the functional programming paradigm, but of course in class we were only given problems that were supremely suited to a functional solution! In the real world, I think some code is better suited to an imperative style, and I want to continue using Java for that (I know Scala supports imperative syntax, but I'm not ready to go in the direction of a pure Scala project just yet).
In a hybrid project, how does one decide what to code in Java and what to code in Scala?
Two things:
99% of Java code can be expressed in Scala
You can write projects that support mixed Java+Scala compilation. Your Scala code can call your Java code and your Java code can call your Scala code. (If you want to do the latter, I suggest defining the interface in Java and then just implementing it in Scala. Otherwise, calling Scala code from Java can get a little ugly.)
So the answer is: whatever parts you want. Your Scala code does not need to be purely functional. Your Scala code can call Java libraries. So pretty much any parts you could write in Java you could also write in Scala.
Now, some more practical considerations. When first trying Scala, some people pick relatively isolated, non-mission-critical parts of their program to write in Scala. Unit tests are a good candidate if you like that approach.
If you're familiar with Java and have learned Haskell in the past, I suggest treating Scala as a "better Java". Fundamentally, Scala compiles to JVM bytecode that is very, very similar to what Java outputs. The only difference is that Scala is more "productive": it produces more bytecode per line of code than Java does. Scala has a lot of things in common with Haskell (first-class functions, for-comprehensions are like Haskell's do-notation, limited type inference), but it's also very different (it's not lazy by default, it's not pure). So you can use some of your insights from Haskell to inspire your Scala style, but "under the hood" it's all Java bytecode.
In the spirit of your question, I recommend you write in Scala any code that involves heavy manipulation of collections, or that handles XML.
Scala's collection library is the foremost functional feature of Scala, and you'll experience great LoC reduction through its usage. Yes, there are Java alternatives, such as Google's collection library, but you asked what you should write in Scala. :-)
Scala also has native handling of XML. You might well find the transition difficult if you try to take DOM code and make it work in Scala. But if you instead approach the problem from the Scala perspective and write it from scratch for Scala, you'll see gains.
I'd advise using Actors as well, but I'm not sure how well you can integrate that with Struts 2 code on Java. But if you have concurrent code, give Actors in Scala a thought.
It might sound silly, but why not write your entire project in Scala? It's a wonderful language that is far more expressive than Java while maintaining binary-compatible access to existing Java libraries.
Ask these questions of your project:
"What operations need side-effects?" and "What functionality is already covered well by Java libraries?" Then implement the rest in Scala.
However, I would warn that hybrid projects are by their very nature more difficult than standalone projects, as you need to use multiple languages/environments. So, given that you say you don't have much experience with Scala, I'd recommend playing with some toy projects first, perhaps a subset of your full goal. That will also give you a feel for where the divide should occur.