Intelligent search and generation of Java code, preferably using Python? - java

Basically, I do lots of one-off code generation, large-scale refactorings, etc. etc. in Java.
My tool language of choice is Python, but I'll take whatever solutions you can offer.
Here is a simplified illustration of what I would like, in pseudocode:
Generating an implementation for an interface
search within my project:
    for each Interface as iName:
        write class(name=iName+"Impl", implements=iName)
        search within the body of iName:
            for each Method as mName:
                write method(name=mName, body="// TODO implement this...")
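For illustration only, the pseudocode above could be sketched in Python roughly as follows. This is the "simplistic version" kind of approach, not a robust tool: it uses regular expressions rather than a Java parser, and it assumes interfaces that contain only plain abstract method declarations (no nested braces, default methods, annotations or comments). The names generate_impl, INTERFACE_RE and METHOD_RE are made up for this sketch.

```python
import re
from typing import List

# Naive patterns (assumption: the interface body has no nested braces).
INTERFACE_RE = re.compile(r"\binterface\s+(\w+)[^{]*\{([^{}]*)\}")
METHOD_RE = re.compile(
    r"^\s*(?:public\s+|abstract\s+)*([\w<>\[\]., ]+?)\s+(\w+)\s*\(([^)]*)\)\s*;",
    re.MULTILINE,
)


def generate_impl(java_source: str) -> List[str]:
    """Return one '<Name>Impl' class skeleton per interface found in the text."""
    classes = []
    for iface in INTERFACE_RE.finditer(java_source):
        name, body = iface.group(1), iface.group(2)
        lines = [f"public class {name}Impl implements {name} {{"]
        for m in METHOD_RE.finditer(body):
            return_type = m.group(1).strip()
            method, params = m.group(2), m.group(3).strip()
            lines += [
                "    @Override",
                f"    public {return_type} {method}({params}) {{",
                "        // TODO implement this...",
                # Throwing keeps the stub compilable for non-void return types.
                f'        throw new UnsupportedOperationException("{method}");',
                "    }",
            ]
        lines.append("}")
        classes.append("\n".join(lines))
    return classes


if __name__ == "__main__":
    sample = (
        "public interface Shape {\n"
        "    double area();\n"
        "    String describe(String prefix, int indent);\n"
        "}\n"
    )
    for skeleton in generate_impl(sample):
        print(skeleton)
```

Anything beyond this simple shape (generics with bounds, inner types, comments containing braces) will confuse the patterns, which is why a parser-backed tool is preferable for real work.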
Basically, the tool I'm searching for would allow me to:
parse files according to their Java structure ("search for interfaces")
search for words contextualized by language elements and types ("variables of type SomeClass", "doStuff() method calls on SomeClass instances")
run searches with structural context ("within the body of the current result")
easily replace or generate code (with helpers to generate, as above, or functions for replacing, "rename the interface to Foo", "insert the line Blah.Blah()", etc.)
The point is, I don't want to spend a lot of time writing these things, as they are usually throwaway. But sometimes I need something just a little smarter than what grep offers. It wouldn't be too hard to write up a simplistic version of this, but if I'm going to use something like this at all, I'd expect it to be robust.
Any suggestions of a tool/library that will help me accomplish this?
Edit to add some clarification
Python is definitely not necessary; I'll take whatever there is. I merely suggest it in case there are choices.
This is to be used in combination with IDE refactoring; sometimes it just doesn't do everything I want.
In instances where I'm using it for code generation (as above), it's for augmenting the output of other code generators, e.g. a library we use outputs a tonne of interfaces, and we need to make standard implementations of each one to mesh it with our codebase.

First, I am not aware of any tools or libraries implemented in Python that are specifically designed for refactoring Java code, and a Google search did not give me any leads.
Second, I would posit that writing a decent tool or library for refactoring Java in Python would be a large task. You would have to implement a Java compiler front-end (lexer/parser, AST builder and type analyser) in Python, then figure out how to integrate this with a program editor. I'm not surprised that nobody has done this ... given that mature alternatives already exist.
Thirdly, a tool that refactors without a full analysis of the source code (using pattern matching, for example) will be incapable of complex refactorings, and is likely to make mistakes in edge cases that the implementor did not think of. I expect that is the level at which the OP is currently operating ...
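To make the edge-case problem concrete, here is a small hedged sketch in Python. The Java snippet and the pattern are invented for illustration; the point is only that a textual search for "doStuff() calls" cannot tell a real call on a SomeClass instance from a comment, a string literal, or a call on some other type.

```python
import re

# Invented Java fragment: only the first doStuff() line is a real call
# on a SomeClass instance.
JAVA_SNIPPET = '''
class Demo {
    void run(SomeClass s, Other other) {
        s.doStuff();
        // s.doStuff(); was removed here
        String msg = "call s.doStuff() later";
        other.doStuff();
    }
}
'''

# Purely textual pattern: no knowledge of comments, strings or types.
CALL_RE = re.compile(r"\bdoStuff\s*\(")

hits = [line.strip() for line in JAVA_SNIPPET.splitlines() if CALL_RE.search(line)]

for hit in hits:
    print(hit)
```

All four lines containing the text are reported, although only one of them is the call being searched for. Separating them reliably needs lexing (to skip comments and strings) and type analysis (to know what s and other are).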
Given that bleak outlook, what are the alternatives?
One alternative is to use one of the existing Java IDEs (e.g. NetBeans, Eclipse, IDEA, etc.) as a refactoring tool. The OP won't be able to extend the capabilities of such a tool in Python code, but the chances are that he won't really need to. I expect that at least one of these IDEs does 95% of what he needs, and (if he is realistic) that should be good enough, especially when you consider that IDEs have lots of incidental features that help make refactoring easier, e.g. structured editing, undo/redo, incremental compilation, intelligent code completion, intelligent searching, type and call hierarchy views, and so on.
(Aside ... if existing IDEs are not good enough (#WizardOfOdds - only the OP can make that call!!), it would make more sense to try to extend the refactoring capability of an existing IDE than start again in a different implementation language.)
Depending on what he is actually doing, model-driven code generation may be another alternative. For instance, if the refactoring is happening because he is frequently creating and recreating his object model(s), then an alternative is to code the models in some modeling language and generate his code from those models. My tool of choice when doing this kind of thing is Eclipse EMF and related technologies. The EMF technologies include generation of editors, XML serialization, persistence, queries, model to model transformation and so on. I have used EMF to implement and roll out projects with object models consisting of 50 to 100 distinct classes with complex relationships and validation requirements. EMF's support for merging source code edits when you regenerate from an updated model is a key feature.

If you are coding in Java, I strongly recommend that you use the NetBeans IDE. It has this kind of refactoring support built in. Eclipse also supports this kind of thing (although I prefer NetBeans). Both projects are open source, so if you want to see how they perform this refactoring, you can look at their source code.

Java has its fair share of criticism these days but in the area of tooling - it isn't justified.
We are spoiled for choice; Eclipse, NetBeans and IntelliJ are the big three IDEs. All of them offer excellent levels of searching and refactoring. Eclipse has the edge on NetBeans, I think, and IntelliJ is often ahead of Eclipse.
You can also use static analysis tools such as FindBugs, Checkstyle, etc. to find issues, e.g. excessively long methods and classes, or overly complex code.
If you really want to leverage your Python skills, take a look at Jython. It's a Python interpreter written in Java.

Related

Can I automatically generate code samples from IDE code templates?

I'm trying to find information (documentation, advice, etc) on how certain IDE templates (e.g. in Eclipse, IntelliJ, and NetBeans) are instantiated internally by IDEs, and I'm having some trouble.
I'm hoping, perhaps optimistically, that I can automatically generate multiple (at least two) distinct samples of each pattern from templates written in the associated grammars.
Every pattern-parameter (including cursors) must be filled, and samples for the same pattern should only have non-pattern-parameter content in common.
At this stage, they need to be syntactically valid so that they can be parsed, but do not need to be fully semantically valid/compilable snippets.
If anyone knows how any of these IDEs work internally, and can tell me if/how I might be able to do this (or can point me towards sufficient documentation), I would greatly appreciate it.
Background/Context
I'm trying to create a research dataset for a pattern mining task - specifically, for mining code templates. I've been looking into it for some time and, as far as I'm aware, there isn't a suitable precedent dataset, so I have to make one.
Rather than painstakingly defining every feature of every pattern myself, I'm writing tools to partially automate the process. Specifically, automating the tasks of deriving candidate patterns from samples, and of filtering out any candidates not observed in the actual corpus. The tools are input-language-agnostic, but I am initially targeting Java ASTs via the Eclipse JDT.
My thinking is that well-established patterns such as idioms and IDE code templates, from sufficiently reputable sources, are rational and intuitive pattern candidates with which I can, at least, evaluate recall. I can, and will, define some target-sample sets manually. However, I would prefer to generate them automatically, so that I can collect more complicated templates en masse (e.g. those published by IDE community members).
Thanks in advance,
Marcos C-S

Refactoring java code using scripts

Is there an Eclipse-based solution to refactor Java code using scripts?
I've read about the Eclipse Language Toolkit, but it seems that it implies the creation of a plugin, which sounds like overkill for a one-off operation.
Are there some kind of bindings to a scripting language, or at least a way to call refactoring code from Java but without a plugin?
Sample use case: I have a project which uses Castor-generated classes, and I want to migrate to JAXB 2. This implies a lot of refactoring in the existing code, which cannot be done by search and replace, nor with regular expressions, because of the context sensitivity.
When the refactoring is complex, I usually write a transformation pipeline with Recoder. The only drawback of this tool is that it sometimes breaks the code format (e.g. moving comments around, or adding/deleting whitespace), but so far it has been enough for my requirements.
Eclipse provides some refactoring help. For example, if you select the portion of code you want to refactor and right-click, you get a Refactor option, from which you can extract a method (the one I most commonly use while refactoring), extract an interface, extract a superclass, etc.
You can also check these:
http://www.eclipse.org/articles/article.php?file=Article-Unleashing-the-Power-of-Refactoring/index.html
Eclipse: Most useful refactorings

What is the difference between Acceleo and Xpand?

I have a DSL which is based on a custom metamodel, which in turn is based on EMF/Ecore. I am trying to figure out which solution to choose, and I can't find any decent comparisons anywhere.
Does anyone have any reasons why I should choose one over the other?
What I know so far is that Acceleo uses an OMG-standardized language, but it seems harder to use than Xpand.
First of all, I wonder why you consider Acceleo more difficult to learn than Xpand. While the two languages have differences (blocks and delimiters, for example), they have quite a similar structure. I won't detail all the elements of both languages but, for example, I don't see much of a difference between something like:
«FOREACH myAttributes AS a»«a.name»«ENDFOREACH»
and
[for (a: Attribute|myAttributes)][a.name/][/for]
Both are template-based languages and as such they have much the same structure. The main difference between Acceleo and Xpand comes from the fact that Acceleo is based on the OMG standards MOFM2T and OCL, and from the tooling.
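For readers unfamiliar with either syntax, both loops shown above iterate over a collection of attributes and emit each attribute's name, one after another with no separator. A rough Python equivalent, as a sketch only (the Attribute class and render_names function are hypothetical stand-ins, not part of Xpand or Acceleo):

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Attribute:
    """Hypothetical stand-in for a model attribute with a name."""
    name: str


def render_names(my_attributes: Iterable[Attribute]) -> str:
    # Emit every attribute name in order, with nothing in between,
    # which is what the FOREACH / [for] loops above express.
    return "".join(a.name for a in my_attributes)


if __name__ == "__main__":
    print(render_names([Attribute("id"), Attribute("title")]))
```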
I am not very familiar with the Xpand tooling but you can find more about it on their wiki. Acceleo, on the other hand, contains an editor with syntax highlighting, code completion, error detection, refactoring and more. It also contains a debugger, a profiler, and Ant and Maven support. You can also easily deploy your generators as Eclipse plugins for other users or use them outside Eclipse in a regular Java application. You can find more information on Acceleo here. You can see most of the features of Acceleo in videos on the Obeo Network (registration required).
Finally, the latest activity on Xpand occurred a year ago, while Acceleo is actively developed. You can even follow the Acceleo development on GitHub if you want.
Stephane Begaudeau
Disclaimer: I am one of the members of the Acceleo dev team.
I am a dabbler, not an expert.
My impression is that if you need little more than a templating language, then Xpand is the way to go. Otherwise, pick Acceleo - but as you say, the learning curve is very steep.
When do you need more than a templating language? For me, they seem to run out of gas when the structure (not content) of the output is dependent on multiple independent pieces of the input. If you don't want to get into Acceleo, but have one of these cases, consider inventing an auto-generated "shim" language that gets you partway from input language to output language, perhaps with a lot of redundancy in it to avoid lookups at template-generation time.
I've been using the old 2.x Acceleo on a full-scale project and have done some tests with the new one.
The language is pretty easy to use, but with the new version it's a little bit more difficult to bind some Java code to your template when the script language is not enough.
I was a very big fan of 2.x, but with 3.x I had lots of trouble making it work. You have to write Java code to handle Eclipse resources, for instance. I totally gave up when updating to Juno: my Acceleo projects didn't work anymore and I didn't manage to fix them in two days. I hope they will make it easier to use out of the box.
Basically the main difference is that Acceleo is an implementation of the MOF Model to Text Transformation Language, which is the OMG (Object Management Group) standard for the definition of model-to-text transformations. It is therefore a standard language designed by the same group who designed MOF, UML, SysML and MDA in general. Xpand is a language which I guess existed before the standard, but it is now different from it.
If you start from scratch then start with Acceleo.
In my case, I use a custom meta-model (derived from UML2, with custom stereotypes and stereotype properties). I tried both the Acceleo and Xpand template languages. Indeed they are pretty similar in terms of structure and capabilities.
However, I can see one big difference (which makes Xpand much better in this use case): you can use your custom stereotypes in your Xpand templates.
The Xpand engine brilliantly chooses the "best matching template/rule" for every stereotype (taking into account inheritance between stereotypes as well).
Furthermore, it is very easy to obtain stereotype properties.
These two "features" make the templates very elegant, compact and readable.
For example:
«DEFINE myTemplate FOR MyUmlProfile::MyStereoType»
MyValue: «this.myStereotypeProperty» or simply: «myStereotypeProperty»
«ENDDEFINE»
In Acceleo, I found it clumsy to achieve the same (longer statements, more code) and my templates ended up lengthy and complex. The positive thing about Acceleo, however, was that it worked conveniently from IBM RSA (applied directly to RSA (emx) models). It has code highlighting and auto-complete working nicely.
Xpand only worked if I exported my RSA models to ".uml" (~XML) format. It doesn't offer code highlighting or auto-complete (or at least I didn't figure out how).
Considering all pros and cons, I still vote for Xpand (in my use case).
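The "best matching template" behaviour described above, where the most specific template is chosen with inheritance taken into account, resembles type-based dispatch. As a loose analogy only (this is Python's functools.singledispatch, not Xpand, and all class names below are hypothetical), the idea can be sketched as:

```python
from functools import singledispatch


class Element:
    """Hypothetical base class for model elements."""


class MyStereoType(Element):
    def __init__(self, my_stereotype_property):
        self.my_stereotype_property = my_stereotype_property


class SpecialStereoType(MyStereoType):
    """Inherits from MyStereoType and has no template of its own."""


class Unrelated(Element):
    """Has no matching template, so the fallback applies."""


@singledispatch
def my_template(element):
    # Fallback used when no more specific template is registered.
    return ""


@my_template.register(MyStereoType)
def _(element):
    # Chosen for MyStereoType and, via inheritance, for its subclasses.
    return f"MyValue: {element.my_stereotype_property}"


if __name__ == "__main__":
    print(my_template(SpecialStereoType("42")))
```

A SpecialStereoType instance is rendered by the MyStereoType template because that is the closest registered match in its class hierarchy, while an Unrelated instance falls through to the default.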

Complete metaprogramming framework for Java?

I'm interested in metaprogramming (i.e. programs that help programmers do tedious programming tasks). I'm looking for a tool which has the following properties:
usable both at compile time and runtime;
inspects program structure;
can add new classes, methods or fields and make them visible to the Java compiler;
can change behavior of methods;
Java-based (well, Java is the most popular programming language according to some rankings);
good integration with IDEs and build tools like Ant, Gradle or Maven;
actively maintained project;
easy to use and extend;
There are some solutions for this, like:
reflection
AspectJ
Annotation Processing Tool
bytecode manipulation (CGLIB, Javassist, java.lang.instrument)
Eclipse JDT
Project Lombok
Groovy, JRuby, Scala
But unfortunately none of them meets all the criteria above. Is there any complete metaprogramming solution for Java?
There's JackPot, which is Java based but I don't think gets a lot of current attention. Has ASTs and symbol tables AFAIK. You can probably extend it; I doubt anybody will stop (or help) you.
There are the Java-based compiler APIs for the Sun, er, Oracle Java compiler. They're likely actively maintained, but I don't think you can modify source code and regenerate it. Certainly has symbol tables; dunno about trees. Probably pretty hard to extend; you have to keep up with the compiler guys, not the other way round.
There is ANTLR, which has a Java implementation and a Java parser that will build ASTs. I don't think it has full symbol tables, so doing serious code analysis/revision is likely to be hard. ANTLR is certainly actively maintained, and nobody will object to you enhancing the Java grammar with symbol tables. Just know that will take you about 6 months for Java 1.6 if that's all you do. (That's how long it took our internal [smart] guy to do it for DMS, starting with symbol table support for 1.4).
Not in Java, and not easily integrated into IDEs, but capable of carrying massive analysis and transformation on Java code is our DMS Software Reengineering Toolkit with its Java Front End.
DMS is generic compiler machinery: parsing, AST building, symbol table machinery, flow analysis machinery, with the additional bonuses of source-to-source transformations and generic prettyprinting of ASTs back to legal text including retention of comments. It offers a set of APIs supporting these services, and additional tools for defining grammars and language-dependent flow analyzers.
The Java Front End gives crucial detail (using those APIs) to DMS to allow it to process Java: a grammar/parser, full symbol table construction for Java 1.4-1.6 (with 1.7 due momentarily), as well as some control and data flow analysis (to be extended over time because this stuff is so useful).
By using the services provided by DMS and the Java Front End, one can reasonably contemplate building arbitrary Java analysis and transformation tools. (This makes the tool a "complete" metaprogramming tool, in that it can inspect any language structure, or change any language structure, as opposed to say template metaprogramming or reflection). We believe this to be much more effective than ad hoc tools because you don't have to build the infrastructure, the infrastructure provided is robust and handles cases you don't have the energy to implement, and it is designed to support such tasks. YMMV.
DMS/Java Front End has been used to construct a variety of Java tools: test coverage, profilers, dead code elimination, clone detection at scale, JavaDoc with hyperlinked source code, fast XML parser/generators, etc.
Yes, it's actively maintained; undergoing continuous enhancement since the first version in 1998.
There's a Java metaprogramming framework that is part of Tapestry IOC, called Plastic. It munges class bytecode using custom classloaders. I haven't tried it yet, but it looks like it gives a simple interface that still enables the programmer to make powerful metaprogramming changes.
Check out the Meta Programming System:
http://www.jetbrains.com/mps/
It has great IDE support and is used quite frequently by the smart folks at JetBrains.
Check out Spring Roo.

Advantages of Java over Ruby/JRuby

I am learning Java.
I have learned and used Ruby. The Ruby books always tell the advantages of Ruby over Java. But Java must have some advantages; that's why lots of people (especially companies) use Java and not Ruby.
Please tell the absolute (not philosophical!) advantages of Java over Ruby.
Many more developers experienced with Java than with Ruby.
Many existing libraries in Java (That helps JRuby too).
Static typechecking (can be seen as advantage and as disadvantage).
Existing codebase that has to be maintained.
Good tool-support.
More and deeper documentations and tutorials.
More experiences with good practices and pitfalls.
More commercial support. That's interesting for companies.
Many of these advantages result from the Java ecosystem being more mature than the one around Ruby. Many of these points are subjective, like static vs. dynamic typing.
I don't know Ruby very well, but I can guess the following points:
Java has more documentation (books, blogs, tutorial, etc.); overall documentation quality is very good
Java has more tools (IDEs, build tools, compilers, etc.)
Java has better refactoring capabilities (due to the static type system, I guess)
Java has more widespread adoption than Ruby
Java has a well-specified memory model
As far as I know, Java has better support for threading and Unicode (JRuby may help here)
Java's overall performance is quite good as of late (due to HotSpot, the new G1 garbage collector, etc.)
Nowadays, Java has very attractive and cheap server hosting: App Engine
Please tell the absolute … advantages of Java over Ruby
Programmers should rarely deal in absolutes.
I'll dare it, and say that as a rule, static typing (Java) is an advantage over dynamic typing (Ruby) because it helps recognize errors much quicker, and without the need for potentially difficult unit tests 1).
Harnessed intelligently, a strong type system with static type checking can be a real time-saver.
1) I do not oppose unit testing! But good unit testing is hard and the compiler can be a great help at reducing the sheer number of necessary test cases.
Reason #1. There's a lot of legacy Java code out there. Ruby is new; there are not so many programmers who know it and even fewer who are good at it. Similarly, there is a lot more library code available for Java than for Ruby.
So there may be Technical reasons Ruby is better than Java, but if you're asking for Business reasons, Java still beats it.
The Java Virtual Machine, which has had over a decade of improvements including:
just in time compilation in the HotSpot compiler (JIT - compiling byte code to native code)
a plethora of garbage collection algorithms and tuning parameters
runtime console support for profiling, management etc. of your application (JConsole, JVisualVM etc)
I like this comparison (found via the link given by Markus, thanks!). Thanks to all. I am also expecting some more discrete advantages.
And it's great!!
The language.
My opinion is that the particular properties of the Java language itself lead to the powerful capabilities of the IDEs and tools. These capabilities are especially valuable when you have to deal with a very large code base.
If I try to enumerate these properties it would be:
of course strong static typing
the grammar of the language is an LALR(1) grammar, so it is easy to build a parser
fully qualified names (packages)
What we've got in the IDE so far, for example Eclipse:
great capabilities for exploring very large code bases. You can unambiguously find all references, the call hierarchy, and usages of classes or public and protected members. This is very valuable when you are studying the code of a project or are about to change something.
a very helpful code editor. I noticed that when I am writing code in Eclipse's Java editor, I actually type by hand only the names of classes or methods, and then I press Ctrl+1 and the editor generates a lot of things for me. It is especially good that Eclipse encourages you to write the usage of a piece of code first, even before the code is actually written. So you write the method call before you create the method, and then the editor generates the method stub for you. Or you add extra arguments to a method or constructor at the place where you invoke it, and the editor changes the signature for you. And even more complicated things: you pass some object to a method that accepts some interface, and if the object's class does not implement this interface, the editor can do it for you... and so on. There are a lot of interesting things.
There are a LOT of tools for Java. As an example of one great tool I want to mention Maven. Actually, my opinion is that code reuse is really possible only when we have a tool like Maven. The infrastructure built around it and the integration with the IDE make very interesting things feasible. Example: I have the m2eclipse plugin installed. I have a new empty project in Eclipse. I know that there is a class that I need to use (reuse, actually) somewhere in the repositories, let's say StringUtils for example. I write 'StringUtils' in my code, and Eclipse's editor tells me that there is no such class in the project and underlines it in red. I press Ctrl+1 and see that there is an option to search for this class in the public repository (actually in the index, not the repository itself). Some libraries are found, I choose one of them at a particular version, and the tool downloads the jar, configures my project's classpath, and I already have all that I need.
So it's all about programmer's productivity.
The JVM.
My opinion is that the JVM (Sun's HotSpot in particular) is one of the most interesting pieces of software nowadays. Of course the key point here is performance. But the current implementation of the HotSpot JVM explores very cutting-edge ways to achieve such great performance. It exploits all possible advantages of just-in-time compiling over static compiling: it collects statistics on the usage of code before JIT-compiling it, optimises virtual calls when possible, can inline a lot more things than a static compiler can, and so on. And the great thing here is that all this stuff is in the JVM, not in the language itself (in contrast with C#, for example). Actually, if you're just learning the Java language, I strongly encourage you to learn the details of modern JVM implementations, so that you know what really hurts performance and what doesn't, do not put unnecessary optimizations in your Java code, and are not afraid to use all the possibilities of the language.
So...
it's all about IDEs and tools actually, but for some reason we have them for Java and not for any other language or platform (.NET of course is a great competitor in the Windows world).
This has probably been beaten to death, but my personal opinion is that Ruby excels at quickly created web apps (and frameworks) that are easy to learn, beautiful to read, and are more than fast enough for web apps.
Java, by contrast, is better suited for raw muscle and speed.
For example, I wrote a Ruby program to convert a 192 MB text file to a MongoDB collection. Ruby took hours to run. And the Ruby code was as simple/optimized as you could get (1.9.2).
I re-wrote it in Java and it runs in 4 minutes. Yes. Hours to 4 minutes. So take that for what it's worth.
Network effect. Java has the advantage of more people using Java. Who themselves use Java because more people use Java.
If you have to build a big piece of software, you'll need to collaborate. With a lot of programmers out there, you can be sure that there will be someone who can be asked to maintain your software even if the original developers have left the company.
Static type checking and a good Java IDE leave no room for magic, and for many maintainers that is an advantage over Ruby.
It is not sufficient to indicate that Java is statically typed and Ruby is dynamically typed.
Correct me if I'm wrong, but does this cover the fact that in Ruby you can add to and even change the program (class definitions, method definitions etc.) at runtime? AFAIK you can have dynamically typed languages that are not "dynamic" (i.e. that cannot be changed at runtime).
Because in Ruby you can change the program at runtime, you don't know until you've actually run the program how it is going to behave, and even then you don't know if it will behave the same next time, because your code may have been changed by some other code that called the code you're writing and testing.
This predictability is, depending on the context, the advantage of Java - one of the contexts where this is an advantage is when you have a lot of developers of varying skill levels working on a fairly large enterprise application.
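The runtime-modification point can be illustrated with a short sketch. It is written in Python rather than Ruby (Python also allows methods to be replaced at runtime), and the Account class and careless_withdraw function are invented for the example:

```python
class Account:
    """Invented example class with a guarded withdraw method."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


def careless_withdraw(self, amount):
    # Replacement that silently drops the balance check.
    self.balance -= amount


account = Account(100)

# With the original definition, an overdraft is rejected.
try:
    account.withdraw(500)
    rejected_before = False
except ValueError:
    rejected_before = True

# Some other code, possibly far away, swaps the method at runtime.
Account.withdraw = careless_withdraw

# The same call on the same object now behaves differently.
account.withdraw(500)

print(rejected_before, account.balance)
```

Reading the Account class alone no longer tells you what withdraw does; that depends on what else has run beforehand, which is the loss of predictability described above.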
IMHO, what one person considers an advantage might be a disadvantage for someone else. Some people prefer static typing while others like dynamic. It is quite subjective and depends largely upon the job and the person doing it.
I would say just learn Java and decide for yourself what its strong points are. Knowing both languages yourself beats any comparisons/advice some other person can give. And it's usually a good thing to know another language, so you're not wasting your time.
Negatives for Java:
There is a lot of duplication in libraries and frameworks available for Java.
Java developers/communities tend to create overcomplicated solutions to simple problems.
There is a lot more legacy in Java to maintain.
Too much pandering to business users has introduced cruft that makes middle managers feel better. In other words, some philosophies in Java are more concerned with BS instead of getting the job done. This is why companies like to use Java.
You'll generally need to write more code in Java than Ruby.
It takes a lot more configuring/installing/setup to get a fully working Java development environment over Ruby.
Positives for Java:
Speed.
Documentation.
Lower level language than Ruby, which could be a good thing or a bad thing, depending on your needs.
None of my points are very scientific, but I think the differences in philosophy and personalities behind Java and Ruby are what make them very different from each other.
Better performance
There are more choices:
Developers - lots to hire
Libraries - lots of wheels already invented.
IDEs - lots of development environments to choose from. Not only just vi/emacs + a shell.
Runtimes - if you for some reason do not like the JVM you use on the system, you can either download or buy another implementation and it will most likely Just Work. How many Ruby implementations are there?
Please note that this has nothing to do with the LANGUAGES as such :)
Reading up on this: "Is Ruby as cross-platform as Java?" made me realize at least one factual advantage of Java over Ruby:
The J2ME-compatible subset of Java is more portable than Ruby
as long as JRuby won't run on J2ME, which may be forever
