how to implement an idl-to-java compiler

I need to implement an IDL-to-Java compiler. In fact, it's not strictly IDL-to-Java: the interface definition language is extended, so I need to implement a compiler that generates Java source files. I know nothing about CORBA and I find it hard to get started. Do you think it's possible for me to finish this work in half a year? If so, what should I do? PS: please forgive my English.

If you don't know anything about parsers and parser generators it's going to be a tough job, but I think that half a year should be plenty if you don't start from scratch.
I suggest that you use Antlr, which happens to have an IDL parser implementation among its contributed examples. This is probably for an older version of Antlr, but it's definitely a good starting point. Be sure to get hold of the Antlr book, you're going to need it!
For the code generation part you could use StringTemplate, a template engine written by Antlr's author, Terence Parr, exactly for this purpose.
If you really have to implement a whole ORB you might as well check out how others did it, e.g. here.
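To make the parse-then-generate split above a bit more concrete, here is a minimal sketch of the back end only. It uses plain string building where a real project would probably use StringTemplate, and the `IdlInterface` and `IdlOperation` classes are hypothetical stand-ins for whatever model your parser builds; they are not part of Antlr or of any CORBA library.

```java
import java.util.Arrays;
import java.util.List;

final class IdlToJavaSketch {
    // Hypothetical model class: one operation of a parsed interface.
    static final class IdlOperation {
        final String returnType;
        final String name;

        IdlOperation(String returnType, String name) {
            this.returnType = returnType;
            this.name = name;
        }
    }

    // Hypothetical model class: a parsed interface with its operations.
    static final class IdlInterface {
        final String name;
        final List<IdlOperation> operations;

        IdlInterface(String name, List<IdlOperation> operations) {
            this.name = name;
            this.operations = operations;
        }
    }

    // Renders one Java interface declaration from the model.
    static String generate(IdlInterface idl) {
        StringBuilder out = new StringBuilder();
        out.append("public interface ").append(idl.name).append(" {\n");
        for (IdlOperation op : idl.operations) {
            out.append("    ").append(op.returnType).append(' ')
               .append(op.name).append("();\n");
        }
        out.append("}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        IdlInterface account = new IdlInterface("Account", Arrays.asList(
                new IdlOperation("int", "balance"),
                new IdlOperation("void", "close")));
        System.out.print(generate(account));
    }
}
```

Keeping the model separate from the rendering is what lets you swap the string building for templates later without touching the parser.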

A true IDL-to-Java compiler does not just spew Java code that maps back to the IDL definitions (strictly adhering to the OMG standards). It also generates Java code that allows your definitions to work with an underlying CORBA stack (not unlike a true compiler generating instructions for a target hardware architecture).
That is, an IDL compiler
1) takes your IDL definitions and converts them into CORBA-stack-independent, language-specific definitions (in your case, in Java).
2) In addition to that, it generates CORBA-stack/vendor-specific code as well.
If all you need is something that does #1, then it's not an IDL-to-Java compiler (not in the true sense of the word). But we can call it that for the sake of simplicity.
So you have two possible routes here:
1) Look at the source code of IDL compilers from existing CORBA stacks that are Java based (OpenOrb or JacOrb), or
2) Look at the OMG's specs that tell you how to map from IDL to your language of choice: http://www.omg.org/technology/documents/idl2x_spec_catalog.htm
This is all assuming you know about compiler theory and implementation. Otherwise, if this is an experiment for learning, great! But if this is part of work with a deadline, this could be an unrealistic task.
Either way, good luck.
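As a taste of what route 2 involves, here is a small sketch of the basic type mapping as I recall it from the OMG IDL to Java language mapping (for example, IDL `long` is a 32-bit type and becomes Java `int`). The class and method names are made up for illustration, and the table is only an excerpt; check it against the spec before relying on it.

```java
import java.util.HashMap;
import java.util.Map;

final class IdlTypeMapping {
    private static final Map<String, String> BASIC_TYPES = new HashMap<String, String>();

    static {
        BASIC_TYPES.put("boolean", "boolean");
        BASIC_TYPES.put("char", "char");
        BASIC_TYPES.put("octet", "byte");
        BASIC_TYPES.put("short", "short");
        BASIC_TYPES.put("long", "int");          // IDL long is 32 bits
        BASIC_TYPES.put("long long", "long");    // IDL long long is 64 bits
        BASIC_TYPES.put("float", "float");
        BASIC_TYPES.put("double", "double");
        BASIC_TYPES.put("string", "java.lang.String");
    }

    // Maps an IDL type name to a Java type name.
    static String toJava(String idlType) {
        String javaType = BASIC_TYPES.get(idlType);
        // Assumption for this sketch: anything else is a user-defined type
        // that keeps its name (real mappings also handle modules, sequences, etc.).
        return javaType != null ? javaType : idlType;
    }

    public static void main(String[] args) {
        System.out.println("long      -> " + toJava("long"));
        System.out.println("long long -> " + toJava("long long"));
        System.out.println("Account   -> " + toJava("Account"));
    }
}
```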

You can use idl4emf:
http://code.google.com/p/idl4emf/
This project consists of an IDL grammar implementation in Xtext and an IDL metamodel implementation in Ecore.
It also includes a code generator project that works from IDL files. You can implement your own generator just by writing Xpand templates in Eclipse EMF.
I've used this project as part of several generator projects successfully.

Related

Syntax Preprocessors for Java

I'm looking for a Java macro language that provides for convenient ways of doing closures (that compile to anonymous inner classes) and list comprehension (that compiles down to basic java loops).
An example of the kind of thing I'm looking for would be Xtend2 http://www.eclipse.org/Xtext/#xtend2
But I want something for general purpose programming (Xtend2 is very specific DSL for Xtext and has a ton of dependencies). Maybe even something that would let me define multiple classes in a single file (which would then get split up into two separate files by the pre-processor).
Does anything like this exist?
Edited to add:
I'm doing Android development so any alternatives have to generate either valid Java source or the byte code has to be compatible with the dalvik recompiler.

Mmm, there used to be the JSE, which was tremendous fun, back in the day.
Mirah is cool, but not ready for primetime, IMO.
You can do a lot with smart templating, although your source view is the Java.
There's a post on SO about using XTend on Android from a few days ago, too.

Frege produces java source code.
I do not know whether dalvik would like it. (But I would be interested to hear ...)
And, of course, you have some runtime library code.
That being said, there are a number of other projects that do closures etc. in java, for example: lambdaj
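For reference, this is roughly the plain Java that such a preprocessor would have to emit for a closure plus a filtered comprehension. It is a hand-written sketch, not the output of any of the tools mentioned; `Function1` is a hypothetical interface standing in for a closure type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DesugaredExample {
    // Hypothetical single-method interface standing in for a closure type.
    interface Function1<A, R> {
        R apply(A arg);
    }

    // Roughly what "[square(x) for x in xs if x > 0]" could expand to.
    static List<Integer> squaresOfPositives(List<Integer> xs) {
        // The closure becomes an anonymous inner class.
        Function1<Integer, Integer> square = new Function1<Integer, Integer>() {
            @Override
            public Integer apply(Integer x) {
                return x * x;
            }
        };
        // The comprehension becomes a basic loop with a guard.
        List<Integer> result = new ArrayList<Integer>();
        for (Integer x : xs) {
            if (x > 0) {
                result.add(square.apply(x));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(squaresOfPositives(Arrays.asList(-1, 2, 3))); // prints [4, 9]
    }
}
```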

What is the difference between Acceleo and Xpand?

I have a DSL based on a custom metamodel, which in turn is based on EMF/Ecore. I am trying to figure out which solution to choose, and I can't find any decent comparisons anywhere.
Does anyone have any reasons why I should choose one over the other?
What I know so far is that Acceleo uses an OMG-standardized language, but it seems harder to use than Xpand.

First of all, I wonder why you consider Acceleo more difficult to learn than Xpand. While the two languages have differences (blocks and delimiters, for example), they have quite a similar structure. I won't detail all the elements of both languages, but, for example, I don't see much of a difference between something like:
«FOREACH myAttributes AS a»«a.name»«ENDFOREACH»
and
[for (a: Attribute|myAttributes)][a.name/][/for]
Both are template-based languages and as such they have much the same structure. The main differences between Acceleo and Xpand are that Acceleo is based on the OMG standards MOFM2T and OCL, and the tooling.
I am not very familiar with the Xpand tooling, but you can find more about it on their wiki. Acceleo, on the other hand, contains an editor with syntax highlighting, code completion, error detection, refactoring and more. It also contains a debugger, a profiler, and Ant and Maven support. You can also easily deploy your generators as Eclipse plugins for other users or use them outside Eclipse in a regular Java application. You can find more information on Acceleo here. You can see most of the features of Acceleo in videos on the Obeo Network (registration required).
Finally, the latest activity on Xpand occurred a year ago, while Acceleo is actively developed. You can even follow the Acceleo development on GitHub if you want.
Stephane Begaudeau
Disclaimer: I am a member of the Acceleo dev team.

I am a dabbler, not an expert.
My impression is that if you need little more than a templating language, then Xpand is the way to go. Otherwise, pick Acceleo - but as you say, the learning curve is very steep.
When do you need more than a templating language? For me, they seem to run out of gas when the structure (not content) of the output is dependent on multiple independent pieces of the input. If you don't want to get into Acceleo, but have one of these cases, consider inventing an auto-generated "shim" language that gets you partway from input language to output language, perhaps with a lot of redundancy in it to avoid lookups at template-generation time.

I've been using the old Acceleo 2.x on a full-scale project and have done some tests with the new one.
The language is pretty easy to use, but with the new version it's a little more difficult to bind Java code to your template when the script language is not enough.
I was a very big fan of 2.x, but with 3.x I had lots of trouble making it work. You have to write Java code to handle Eclipse resources, for instance. I totally gave up when updating to Juno: my Acceleo projects didn't work anymore and I didn't manage to fix them in two days. I hope they will make it easier to use out of the box.

Basically, the main difference is that Acceleo is an implementation of the MOF Model to Text Transformation Language, which is the OMG (Object Management Group) standard for defining model-to-text transformations. It is therefore a standard language designed by the same group who designed MOF, UML, SysML and MDA in general. Xpand is a language which I guess existed before the standard, but it now differs from it.
If you start from scratch then start with Acceleo.

In my case, I use a custom meta-model (derived from UML2) with custom stereotypes and stereotype properties. I tried both the Acceleo and Xpand template languages. Indeed, they are pretty similar in terms of structure and capabilities.
However, I can see one big difference (which makes Xpand much better in this use case): you can use your custom stereotypes in your Xpand templates.
Xpand engine brilliantly chooses the "best matching template/rule" for every stereotype (taking into account inheritance between stereotypes as well).
Furthermore, it is very easy to obtain stereotype properties.
These two "features" make the templates very elegant, compact and readable.
For example:
«DEFINE myTemplate FOR MyUmlProfile::MyStereoType»
MyValue: «this.myStereotypeProperty» or simply: «myStereotypeProperty»
«ENDDEFINE»
In Acceleo, I found it clumsy to achieve the same (longer statements, more code) and my templates ended up lengthy and complex. The positive thing about Acceleo, however, was that it worked conveniently from IBM RSA (applied directly to RSA (emx) models). It has code highlighting and auto-complete working nicely.
Xpand only worked if I exported my RSA models to ".uml" (~XML) format. It doesn't offer code highlighting or auto-complete (or at least I didn't figure out how).
Considering all pros and cons, I still vote for Xpand (in my use case).

Intelligent search and generation of Java code, preferrably using Python?

Basically, I do lots of one-off code generation, large-scale refactorings, etc. etc. in Java.
My tool language of choice is Python, but I'll take whatever solutions you can offer.
Here is a simplified illustration of what I would like, in pseudocode:
Generating an implementation for an interface
search within my project:
    for each Interface as iName:
        write class(name=iName+"Impl", implements=iName)
        search within the body of iName:
            for each Method as mName:
                write method(name=mName, body="// TODO implement this...")
Basically, the tool I'm searching for would allow me to:
parse files according to their Java structure ("search for interfaces")
search for words contextualized by language elements and types ("variables of type SomeClass", "doStuff() method calls on SomeClass instances")
to run searches with structural context ("within the body of the current result")
easily replace or generate code (with helpers to generate, as above, or functions for replacing, "rename the interface to Foo", "insert the line Blah.Blah()", etc.)
The point is, I don't want to spend a lot of time writing these things, as they are usually throwaway. But sometimes I need something just a little smarter than what grep offers. It wouldn't be too hard to write up a simplistic version of this, but if I'm going to use something like this at all, I'd expect it to be robust.
Any suggestions of a tool/library that will help me accomplish this?
Edit to add some clarification
Python is definitely not necessary; I'll take whatever is out there. I merely suggest it in case there are choices.
This is to be used in combination with IDE refactoring; sometimes it just doesn't do everything I want.
In instances where I'm using it for code generation (as above), it's for augmenting the output of other code generators, e.g. a library we use outputs a tonne of interfaces, and we need to make standard implementations of each one to mesh it with our codebase.
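For the "generate an Impl for each interface" case specifically, a throwaway version can be sketched with plain reflection, assuming the interfaces are already compiled and on the classpath (so this works on classes, not on source files). The `ImplSkeletonGenerator` name is made up, and the sketch deliberately ignores generics, `throws` clauses and real parameter names.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;

final class ImplSkeletonGenerator {
    // Returns Java source for a class named <Interface>Impl with stub methods.
    static String generate(Class<?> iface) {
        if (!iface.isInterface()) {
            throw new IllegalArgumentException(iface + " is not an interface");
        }
        StringBuilder out = new StringBuilder();
        out.append("public class ").append(iface.getSimpleName()).append("Impl implements ")
           .append(iface.getCanonicalName()).append(" {\n");
        Method[] methods = iface.getMethods();
        // getMethods() guarantees no order; sort by name for more predictable output.
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        for (Method m : methods) {
            if (!Modifier.isAbstract(m.getModifiers())) {
                continue; // skip default and static methods
            }
            out.append("    @Override\n    public ")
               .append(m.getReturnType().getCanonicalName()).append(' ')
               .append(m.getName()).append('(');
            Class<?>[] params = m.getParameterTypes();
            for (int i = 0; i < params.length; i++) {
                if (i > 0) {
                    out.append(", ");
                }
                out.append(params[i].getCanonicalName()).append(" arg").append(i);
            }
            out.append(") {\n        // TODO implement this...\n")
               .append("        throw new UnsupportedOperationException();\n    }\n");
        }
        out.append("}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(generate(Runnable.class));
    }
}
```

Anything that needs to search or rewrite source (rather than compiled classes) still needs a real Java parser, which is where the IDE-based suggestions come in.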

First, I am not aware of any tools or libraries implemented in Python that are specifically designed for refactoring Java code, and a Google search did not give me any leads.
Second, I would posit that writing a decent tool or library for refactoring Java in Python would be a large task. You would have to implement a Java compiler front-end (lexer/parser, AST builder and type analyser) in Python, then figure out how to integrate this with a program editor. I'm not surprised that nobody has done this ... given that mature alternatives already exist.
Third, refactoring without a full analysis of the source code (using pattern matching, for example) is incapable of complex refactorings, and is likely to make mistakes in edge cases that the implementor did not think of. I expect that is the level at which the OP is currently operating ...
Given that bleak outlook, what are the alternatives?
One alternative is to use one of the existing Java IDEs (e.g. NetBeans, Eclipse, IDEA, etc.) as a refactoring tool. The OP won't be able to extend the capabilities of such a tool in Python code, but the chances are that he won't really need to. I expect that at least one of these IDEs does 95% of what he needs, and (if he is realistic) that should be good enough. Especially when you consider that IDEs have lots of incidental features that help make refactoring easier; e.g. structured editing, undo/redo, incremental compilation, intelligent code completion, intelligent searching, type and call hierarchy views, and so on.
(Aside ... if existing IDEs are not good enough (@WizardOfOdds - only the OP can make that call!!), it would make more sense to try to extend the refactoring capability of an existing IDE than start again in a different implementation language.)
Depending on what he is actually doing, model-driven code generation may be another alternative. For instance, if the refactoring is happening because he is frequently creating and recreating his object model(s), then an alternative is to code the models in some modeling language and generate his code from those models. My tool of choice when doing this kind of thing is Eclipse EMF and related technologies. The EMF technologies include generation of editors, XML serialization, persistence, queries, model to model transformation and so on. I have used EMF to implement and roll out projects with object models consisting of 50 to 100 distinct classes with complex relationships and validation requirements. EMF's support for merging source code edits when you regenerate from an updated model is a key feature.

If you are coding in Java, I strongly recommend that you use the NetBeans IDE. It has this kind of refactoring support built in. Eclipse also supports this kind of thing (although I prefer NetBeans). Both projects are open source, so if you want to see how they perform this refactoring, you can look at their source code.

Java has its fair share of criticism these days, but in the area of tooling it isn't justified.
We are spoiled for choice: Eclipse, NetBeans and IntelliJ are the big three IDEs. All of them offer excellent searching and refactoring. Eclipse has the edge on NetBeans, I think, and IntelliJ is often ahead of Eclipse.
You can also use static analysis tools such as FindBugs, Checkstyle, etc. to find issues, e.g. excessively long methods and classes or overly complex code.
If you really want to leverage your Python skills, take a look at Jython. It's a Python interpreter written in Java.

JavaME-suitable grammar compiler recommendations?

I want to parse some data, and I have a BNF grammar to parse it with. Can anyone recommend any grammar compilers capable of generating code that can be used on a mobile device?
Since this is for JavaME, the generated code must be:
Hopefully pretty small
Low dependencies on exotic Java libraries
Not dependent on any runtime jar files.

I have used JFlex before, and I know it satisfies your second and third requirements. But I don't know how big the generated code might be. According to the manual, it generates a packed DFA table by default, so it might not be too bad.

The first question is: do you have an existing grammar definition? When I've ported a LALR grammar to Java, I've used JFlex/CUP.
If you're starting from scratch, I'd suggest you use JavaCC/FreeCC, which is an LL(k) parser generator. It's quite well documented and there are no runtime dependencies.
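To give a feel for what an LL-style parser boils down to, here is a hand-written recursive-descent parser for a toy arithmetic grammar. This is not JavaCC output, and the grammar is an assumption chosen for illustration rather than the asker's BNF; the point is only that one method per grammar rule, with no generics, collections or runtime jars, is enough for a small grammar.

```java
final class TinyExprParser {
    private final String input;
    private int pos = 0;

    private TinyExprParser(String input) {
        this.input = input;
    }

    // Parses and evaluates the whole string, rejecting trailing garbage.
    static int evaluate(String text) {
        TinyExprParser p = new TinyExprParser(text);
        int value = p.expr();
        if (p.peek() != '\0') {
            throw new IllegalArgumentException("Unexpected input at " + p.pos);
        }
        return value;
    }

    // expr := term (('+' | '-') term)*
    private int expr() {
        int value = term();
        while (true) {
            char c = peek();
            if (c == '+') { pos++; value += term(); }
            else if (c == '-') { pos++; value -= term(); }
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    private int term() {
        int value = factor();
        while (true) {
            char c = peek();
            if (c == '*') { pos++; value *= factor(); }
            else if (c == '/') { pos++; value /= factor(); }
            else return value;
        }
    }

    // factor := NUMBER | '(' expr ')'
    private int factor() {
        char c = peek();
        if (c == '(') {
            pos++;
            int value = expr();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at " + pos);
            }
            pos++;
            return value;
        }
        if (!Character.isDigit(c)) {
            throw new IllegalArgumentException("Expected number at " + pos);
        }
        int value = 0;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
            value = value * 10 + (input.charAt(pos) - '0');
            pos++;
        }
        return value;
    }

    // Skips spaces and returns the next character, or '\0' at end of input.
    private char peek() {
        while (pos < input.length() && input.charAt(pos) == ' ') {
            pos++;
        }
        return pos < input.length() ? input.charAt(pos) : '\0';
    }

    public static void main(String[] args) {
        System.out.println(evaluate("2 + 3 * (4 - 1)")); // prints 11
    }
}
```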

Is static metaprogramming possible in Java?

I am a fan of static metaprogramming in C++. I know Java now has generics. Does this mean that static metaprogramming (i.e., compile-time program execution) is possible in Java? If so, can anyone recommend any good resources where one can learn more about it?

No, this is not possible. Generics are not as powerful as templates. For instance, a template argument can be a user-defined type, a primitive type, or a value; but a generic template argument can only be Object or a subtype thereof.
Edit: This is an old answer. Annotations (introduced in Java 5) together with the pluggable annotation processing API (Java 6) can be used for such trickery.

The short answer
This question is more than 10 years old, but I am still missing one answer to it. And this is: yes, but not because of generics, and not quite the same as in C++.
As of Java 6, we have the pluggable annotation processing API. Static metaprogramming is (as you already stated in your question)
compile-time program execution
If you know about metaprogramming, then you also know that this is not really true, but for the sake of simplicity, we will use this. Please look here if you want to learn more about metaprogramming in general.
The pluggable annotation processing API is called by the compiler right after the .java files are read but before the compiler writes the bytecode to the .class files. (I had a source for this, but I cannot find it anymore. Maybe someone can help me out here?)
It allows you to do logic at compile time with pure Java code. However, the world you are coding in is quite different. Not specifically bad or anything, just different. The classes you are analyzing do not yet exist, and you are working on metadata of the classes. But the compiler runs in a JVM, which means you can also create classes and program normally. Furthermore, you can analyze generics, because the annotation processor is called before type erasure.
The main gist of static metaprogramming in Java is that you provide metadata (in the form of annotations) and the processor is able to find all annotated classes and process them. An easy example can be found on Baeldung. In my opinion, this is quite a good source for getting started. Once you understand it, try searching for yourself; there are multiple good sources out there, too many to list here. Also take a look at Google AutoService, which uses an annotation processor to take away the hassle of creating and maintaining the service files. If you want to create classes, I recommend looking at JavaPoet.
Sadly though, this API does not allow us to manipulate source code. But if you really want to, you should take a look at Project Lombok. They do it, but it is not supported.
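To show the shape of such a processor, here is a minimal sketch built only on the standard `javax.annotation.processing` API. It simply reports every element annotated with `@Deprecated` as a compiler warning; the class name `DeprecatedReporter` and the choice of annotation are mine, purely for illustration.

```java
import java.util.Collections;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

// Declare this class public (in its own file) when registering it with javac,
// e.g. via -processor or META-INF/services/javax.annotation.processing.Processor.
final class DeprecatedReporter extends AbstractProcessor {
    @Override
    public Set<String> getSupportedAnnotationTypes() {
        // Only elements carrying this annotation are handed to process().
        return Collections.singleton("java.lang.Deprecated");
    }

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        for (TypeElement annotation : annotations) {
            for (Element element : roundEnv.getElementsAnnotatedWith(annotation)) {
                // Messages go through the compiler, so they show up as normal
                // warnings (use Kind.ERROR to fail the build instead).
                processingEnv.getMessager().printMessage(
                        Diagnostic.Kind.WARNING,
                        "Deprecated element: " + element.getSimpleName(),
                        element);
            }
        }
        return false; // do not claim the annotation; other processors may still see it
    }
}
```

Generating new source files would go through `processingEnv.getFiler()` instead of the messager; libraries like JavaPoet mainly make writing that source less painful.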
Why is this important (Further reading for the interested ones among you)
TL;DR: It is quite baffling to me why we don't use static metaprogramming as much as dynamic, because it has many, many advantages.
Most developers see "Dynamic and Static" and immediately jump to the conclusion that dynamic is better. Nothing wrong with that; static has a lot of negative connotations for developers. But in this case (and specifically for Java) it is exactly the other way around.
Dynamic metaprogramming requires reflection, which has some major drawbacks. There are quite a lot of them. In short: performance, security, and design.
Static metaprogramming (i.e. annotation processing) allows us to hook into the compiler, which already does most of the things we try to accomplish with reflection. We can also create classes in this process, which are again passed to the annotation processors. You can then (for example) generate classes that do what normally had to be done using reflection. Furthermore, we can implement a "fail fast" system, because we can report errors, warnings and such through the compiler.
To conclude and compare as much as possible: let us imagine Spring. Spring tries to find all Component-annotated classes at runtime (which we could simplify by using service files at compile time), then generates certain proxy classes (which we could already have done at compile time) and resolves bean dependencies (which, again, we could already have done at compile time). See Jake Wharton's talk about Dagger 2, in which he explains why they switched to static metaprogramming. I still don't understand why the big players like Spring don't use it.
This post is too short to fully explain those differences and why static would be more powerful. If you want, I am currently working on a presentation about this. If you are interested and speak German (sorry about that), you can have a look at my website. There you will find a presentation that tries to explain the differences in 45 minutes. Only the slides, though.

Take a look at Clojure. It's a LISP with Macros (meta-programming) that runs on the JVM and is very interoperable with Java.

What exactly do you mean by "static metaprogramming"? Yes, C++ template metaprogramming is impossible in Java, but Java offers other methods, much more powerful than those from C++:
reflection
aspect-oriented programming (AspectJ)
bytecode manipulation (Javassist, ObjectWeb ASM, Java agents)
code generation (Annotation Processing Tool, template engines like Velocity)
Abstract Syntax Tree manipulations (APIs provided by popular IDEs)
possibility to run Java compiler and use compiled code even at runtime
There's no best method: each of those methods has its strengths and weaknesses.
Due to flexibility of JVM, all of those methods in Java can be used both at compilation time and runtime.
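As an illustration of the last item in the list, here is a sketch of compiling a class at runtime with the standard `javax.tools` API and calling it through reflection. It assumes the program runs on a full JDK (on a plain JRE `ToolProvider.getSystemJavaCompiler()` returns null), and the generated class name `GeneratedGreeter` is made up.

```java
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class RuntimeCompileExample {
    // Writes a tiny source file, compiles it, loads the class and calls it.
    static String compileAndRun() throws Exception {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler: run on a JDK, not a JRE");
        }
        Path dir = Files.createTempDirectory("generated");
        Path source = dir.resolve("GeneratedGreeter.java");
        String code = "public class GeneratedGreeter {"
                + " public static String greet() { return \"hi from generated code\"; } }";
        Files.write(source, code.getBytes(StandardCharsets.UTF_8));

        // Equivalent to running: javac -d <dir> <dir>/GeneratedGreeter.java
        int status = compiler.run(null, null, null, "-d", dir.toString(), source.toString());
        if (status != 0) {
            throw new IllegalStateException("Compilation failed with status " + status);
        }

        // Load the freshly compiled class from the temp directory.
        try (URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() })) {
            Class<?> greeter = loader.loadClass("GeneratedGreeter");
            Method greet = greeter.getMethod("greet");
            return (String) greet.invoke(null);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(compileAndRun());
    }
}
```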

No. Even more, generic types are erased to their upper bound by the compiler, so you cannot create a new instance of a generic type T at runtime.
The best way to do metaprogramming in Java is to work around type erasure by passing in the Class<T> object of your type T. Still, this is only a hack.
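A minimal sketch of that workaround, assuming the type has an accessible no-argument constructor (the `create` helper is a made-up name):

```java
final class TypeTokenExample {
    // "new T()" does not compile because T is erased; the Class<T> token
    // carries the type to runtime so reflection can instantiate it.
    static <T> T create(Class<T> type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = create(StringBuilder.class);
        sb.append("created via Class<T>");
        System.out.println(sb);
    }
}
```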

If you need powerful compile-time logic for Java, one way to do that is with some kind of code generation. Since, as other posters have pointed out, the Java language doesn't provide any features suitable for doing compile-time logic, this may be your best option (iff you really do have a need for compile-time logic). Once you have exhausted the other possibilities and you are sure you want to do code-generation, you might be interested in my open source project Rjava, available at:
http://www.github.com/blak3mill3r
It is a Java code generation library written in Ruby, which I wrote in order to generate Google Web Toolkit interfaces for Ruby on Rails applications automatically. It has proved quite handy for that.
As a warning, it can be very difficult to debug Rjava code; Rjava doesn't do much checking, it just assumes you know what you're doing. That's pretty much the state of static metaprogramming anyway. I'd say it's significantly easier to debug than anything non-trivial done with C++ TMP, and it is possible to use it for the same kinds of things.
Anyway, if you were considering writing a program which outputs Java source code, stop right now and check out Rjava. It might not do what you want yet, but it's MIT licensed, so feel free to improve it, deep fry it, or sell it to your grandma. I'd be glad to have other devs who are experienced with generic programming to comment on the design.

Lombok offers a weak form of compile time metaprogramming. However, the technique they use is completely general.
See Java code transform at compile time for a related discussion

You can use a metaprogramming library for Java such as Spoon: https://github.com/INRIA/spoon/

No, generics in Java are purely a way to avoid casting from Object.

In a very reduced sense, maybe?
http://michid.wordpress.com/2008/08/13/type-safe-builder-pattern-in-java/
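The idea behind a type-safe builder can be sketched with "phantom" type parameters: the compiler tracks whether a required property has been set and rejects `build` otherwise. This is my own minimal version of the technique, not the code from the linked post, and all names in it are made up.

```java
final class TypeSafeBuilderExample {
    // Phantom marker types: never instantiated, used only as type arguments.
    static final class Yes {}
    static final class No {}

    static final class Pizza {
        final String size;

        Pizza(String size) {
            this.size = size;
        }
    }

    // HAS_SIZE records at compile time whether size(...) has been called.
    static final class Builder<HAS_SIZE> {
        private final String size;

        private Builder(String size) {
            this.size = size;
        }

        static Builder<No> start() {
            return new Builder<No>(null);
        }

        Builder<Yes> size(String size) {
            return new Builder<Yes>(size);
        }
    }

    // Accepts only builders whose size has been set; passing a Builder<No>
    // is a compile-time type error rather than a runtime check.
    static Pizza build(Builder<Yes> builder) {
        return new Pizza(builder.size);
    }

    public static void main(String[] args) {
        Pizza pizza = build(Builder.start().size("large"));
        System.out.println(pizza.size);
        // build(Builder.start()); // does not compile: Builder<No> is not Builder<Yes>
    }
}
```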
