Alternatives to Java bytecode instrumentation

I'm starting a project that will have to instrument Java applications for coverage purposes (definition-usage of variables, etc.). It has to add trace statements and some logic to the application, and maybe remove statements.
I have searched for ways of instrumenting Java code, and what I always find is bytecode instrumentation.
My question is: is it the only way to instrument Java applications? Is there any other way to do that? What are the advantages of bytecode instrumentation over the others?
I'll probably use the bytecode solution, but I want to know what the problems with the other approaches are (if any) so that I can make an informed decision.
Thanks!

The other method, close to changing bytecode, is AOP (aspect-oriented programming).
The main library is AspectJ, which also largely defines the field.
The third option that might be interesting (since you are just starting out with the program) is Spring.
It means you will have to learn a bit about IoC (inversion of control), but it basically means that instead of creating your objects yourself you let Spring do it for you. That has its advantages: when Spring is in charge of creation, it can add all sorts of things during the creation process without you having to declare it all yourself in AspectJ.
In terms of complexity I would probably rate it:
spring (easiest)
aspectj
bytecode instrumentation (hardest)
But it's exactly the other way around when it comes to capabilities (power). For example, removing code is only possible with the last one (I think).
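To make the "Spring adds things during creation" idea concrete: for beans that are used through an interface, one mechanism Spring relies on is a JDK dynamic proxy, which needs no extra library. Below is a minimal sketch of trace statements added that way, assuming the code to be traced is reached through an interface. Greeter, TracingProxy and TRACE are made-up names for illustration, and a real coverage tool would record far more than method entry and exit.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface standing in for the application code to be traced.
interface Greeter {
    String greet(String name);
}

class TracingProxy {
    // Collected trace lines; a real tool would write to a log or coverage store.
    static final List<String> TRACE = new ArrayList<>();

    // Wraps target so that every call made through iface is recorded.
    @SuppressWarnings("unchecked")
    static <T> T trace(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            TRACE.add("enter " + method.getName());
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the exception thrown by the target itself
            } finally {
                TRACE.add("exit " + method.getName());
            }
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler);
    }

    public static void main(String[] args) {
        Greeter greeter = trace(name -> "hello " + name, Greeter.class);
        System.out.println(greeter.greet("world"));
        System.out.println(TRACE);
    }
}
```

This only sees calls that cross the interface boundary. It cannot observe individual statements or variable definitions and uses, which is why bytecode instrumentation remains the more powerful option for coverage work.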

You should definitely check out AspectJ.
From what you describe, you will be able to do what you want with it.
Doing bytecode instrumentation yourself is absolutely possible, but it is much more complicated.
I think you should check out AspectJ first and go back to doing bytecode instrumentation yourself as a last resort.

See my paper on building coverage tools using program transformation engines. This approach has the advantage that it can be used on arbitrary programming languages. In addition, it sees the source code the way the programmer sees it, not as compiled byte codes (as generics get more complex, and get ground into finer byte codes, it gets harder to understand that source code by inspecting the byte code).
It is perhaps worth noting that program transformation generalizes aspect-oriented programming.

Related

How to eliminate the need of JIT when target platform is pre-determined?

It is great that using Intermediate Language (.Net: MSIL, Java: Bytecode) we can achieve platform independence. But when an application is supposed to run on a single platform only (e.g. Windows), in that case is there any simple way to specify that "I don't need JIT every time just give me the native code."?

Single platform (Windows) doesn't really mean single target. I'm currently running on Windows - and some binaries are x86, and some are x64. Even within the same processor family, different specific chips have different abilities that the JIT could take care of.
On .NET you can use NGEN - but personally I would see how much benefit there is before you actually use it in production. I believe the main benefit is in terms of start-up time rather than performance when actually executing. In fact, I believe there are some optimizations that the "normal" JIT can make which NGEN won't.
One point to note is that although the Hotspot JIT for Java is adaptive as Dolda2000 mentions, the .NET JIT is currently "once only" - it won't re-JIT code putting in more effort if it turns out to be very heavily used, or make assumptions around subclassing and then "undo" them later.

I cannot speak for .Net, but there certainly are native Java compilers, such as GNU GCJ.
More importantly, however, are you really sure that you want to avoid JITing? A JIT compiler, operating as it is on knowledge of the global state of the code, can often make optimizations that static compilers cannot. For example, a JIT compiler can inline virtual methods when it knows that no subclass currently exists that overrides it (whereas a static compiler couldn't know if such a class would be (statically or dynamically) linked in later on). There are many other examples as well, but I don't think the scope of this answer is to list them. :)

Another point:
Many current frameworks manipulate the bytecode during class loading. This means that the code on disk is not the code that is executed. Many Java frameworks doing annotation-based dependency injection use this. JPA/Hibernate use this. AOP (aspect-oriented programming) frameworks usually use this too, although they usually also provide a way to manipulate the class files during the build.
Compiling the code upfront into native code would render these frameworks useless.
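As a rough illustration of the hook such frameworks build on, here is a minimal sketch using the standard java.lang.instrument API. It only records class names and leaves the bytes untouched; the class name LoadLoggingAgent and the field seen are made up for this example, and a real framework would rewrite classfileBuffer with a bytecode library instead of returning null.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical agent: logs every class the JVM hands to it at load time.
// A real agent would normally be a public class in its own file, packaged in a
// jar whose manifest names it in Premain-Class, and started with -javaagent.
class LoadLoggingAgent implements ClassFileTransformer {
    final List<String> seen = new CopyOnWriteArrayList<>();

    // Entry point the JVM calls before the application's main method.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoadLoggingAgent());
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        seen.add(className); // internal form, e.g. "java/util/ArrayList"
        return null;         // null means "no change": the JVM keeps the original bytes
    }
}
```

Because the transformer receives the class bytes only when a class is loaded, an ahead-of-time native image has no equivalent point at which this code could run.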

Intelligent search and generation of Java code, preferably using Python?

Basically, I do lots of one-off code generation, large-scale refactorings, etc. etc. in Java.
My tool language of choice is Python, but I'll take whatever solutions you can offer.
Here is a simplified illustration of what I would like, in pseudocode:
Generating an implementation for an interface
search within my project:
    for each Interface as iName:
        write class(name=iName+"Impl", implements=iName)
        search within the body of iName:
            for each Method as mName:
                write method(name=mName, body="// TODO implement this...")
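For comparison, the pseudocode above can be approximated in plain Java with reflection. This is only a sketch under some loud assumptions: it works on compiled interfaces that are already on the classpath rather than on source files, it ignores generics and declared exceptions (reflection on erased types does not give them back in source form here), and StubGenerator and generate are made-up names.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;
import java.util.StringJoiner;

// Hypothetical helper: emits the source of a "<Name>Impl" stub for an interface.
class StubGenerator {
    static String generate(Class<?> iface) {
        if (!iface.isInterface()) {
            throw new IllegalArgumentException(iface + " is not an interface");
        }
        StringBuilder out = new StringBuilder();
        out.append("public class ").append(iface.getSimpleName())
           .append("Impl implements ").append(iface.getCanonicalName()).append(" {\n");

        Method[] methods = iface.getMethods();
        // getMethods() returns methods in no particular order, so sort for stable output.
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        for (Method m : methods) {
            if (!Modifier.isAbstract(m.getModifiers())) {
                continue; // default and static methods need no stub
            }
            StringJoiner params = new StringJoiner(", ");
            Class<?>[] types = m.getParameterTypes();
            for (int i = 0; i < types.length; i++) {
                params.add(types[i].getCanonicalName() + " arg" + i);
            }
            out.append("    @Override\n")
               .append("    public ").append(m.getReturnType().getCanonicalName())
               .append(' ').append(m.getName()).append('(').append(params).append(") {\n")
               .append("        // TODO implement this...\n")
               .append("        throw new UnsupportedOperationException();\n")
               .append("    }\n");
        }
        return out.append("}\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(generate(Runnable.class));
    }
}
```

Throwing UnsupportedOperationException keeps every generated stub compilable regardless of its return type. Anything that needs source-level structure (comments, generics, "calls to doStuff() on SomeClass instances") needs a real parser instead of reflection.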
Basically, the tool I'm searching for would allow me to:
parse files according to their Java structure ("search for interfaces")
search for words contextualized by language elements and types ("variables of type SomeClass", "doStuff() method calls on SomeClass instances")
to run searches with structural context ("within the body of the current result")
easily replace or generate code (with helpers to generate, as above, or functions for replacing, "rename the interface to Foo", "insert the line Blah.Blah()", etc.)
The point is, I don't want to spend a lot of time writing these things, as they are usually throwaway. But sometimes I need something just a little smarter than what grep offers. It wouldn't be too hard to write up a simplistic version of this, but if I'm going to use something like this at all, I'd expect it to be robust.
Any suggestions of a tool/library that will help me accomplish this?
Edit to add some clarification
Python is definitely not necessary; I'll take whatever is out there. I merely suggest it in case there are choices.
This is to be used in combination with IDE refactoring; sometimes it just doesn't do everything I want.
In instances where I'm using it for code generation (as above), it's for augmenting the output of other code generators, e.g. a library we use outputs a tonne of interfaces, and we need to make standard implementations of each one to mesh it with our codebase.

First, I am not aware of any tools or libraries implemented in Python that are specifically designed for refactoring Java code, and a Google search did not give me any leads.
Second, I would posit that writing a decent tool or library for refactoring Java in Python would be a large task. You would have to implement a Java compiler front-end (lexer/parser, AST builder and type analyser) in Python, then figure out how to integrate this with a program editor. I'm not surprised that nobody has done this ... given that mature alternatives already exist.
Third, refactoring without a full analysis of the source code (using pattern matching, for example) will be incapable of complex refactorings, and is likely to make mistakes in edge cases that the implementor did not think of. I expect that is the level at which the OP is currently operating ...
Given that bleak outlook, what are the alternatives:
One alternative is to use one of the existing Java IDEs (e.g. NetBeans, Eclipse, IDEA, etc.) as a refactoring tool. The OP won't be able to extend the capabilities of such a tool in Python code, but the chances are that he won't really need to. I expect that at least one of these IDEs does 95% of what he needs, and (if he is realistic) that should be good enough. Especially when you consider that IDEs have lots of incidental features that help make refactoring easier, e.g. structured editing, undo/redo, incremental compilation, intelligent code completion, intelligent searching, type and call hierarchy views, and so on.
(Aside ... if existing IDEs are not good enough (@WizardOfOdds - only the OP can make that call!!), it would make more sense to try to extend the refactoring capability of an existing IDE than to start again in a different implementation language.)
Depending on what he is actually doing, model-driven code generation may be another alternative. For instance, if the refactoring is happening because he is frequently creating and recreating his object model(s), then an alternative is to code the models in some modeling language and generate his code from those models. My tool of choice when doing this kind of thing is Eclipse EMF and related technologies. The EMF technologies include generation of editors, XML serialization, persistence, queries, model to model transformation and so on. I have used EMF to implement and roll out projects with object models consisting of 50 to 100 distinct classes with complex relationships and validation requirements. EMF's support for merging source code edits when you regenerate from an updated model is a key feature.

If you are coding in Java, I strongly recommend that you use the NetBeans IDE. It has this kind of refactoring support built in. Eclipse also supports this kind of thing (although I prefer NetBeans). Both projects are open source, so if you want to see how they perform this refactoring, you can look at their source code.

Java gets its fair share of criticism these days, but in the area of tooling it isn't justified.
We are spoiled for choice: Eclipse, NetBeans and IntelliJ are the big three IDEs. All of them offer excellent levels of searching and refactoring. Eclipse has the edge on NetBeans, I think, and IntelliJ is often ahead of Eclipse.
You can also use static analysis tools such as FindBugs, Checkstyle, etc. to find issues, i.e. excessively long methods and classes, overly complex code.
If you really want to leverage your Python skills, take a look at Jython. It's a Python interpreter written in Java.

Groovy, Scala maybe making my life easier?

Is it possible to replace any java coding which I use daily with groovy or scala? E.g. writing small webapps which include servlets/portlets etc.

I've completely replaced my server side processing/data crunching that would previously be written in Java with Scala. It's made life a lot easier, and a lot more fun.
Small servlets for REST web services written on top of Step (a web pico-"framework"; it's a single code file, a comically small servlet wrapper): http://github.com/alandipert/step. Scala's XML handling combined with a simple JSON outputter (I use Twitter's) makes this completely painless.
Hibernate + Annotations as my persistence layer (very painless once you've cleaned up Hibernate's collection handling/types)
Various data crunching background tasks.
It's certainly possible, and a really simple transition to make. Just start writing Scala as if you were writing Java; at its worst it's just Java but much less verbose. From there you can gradually pick up the Scala concepts over time: Options, functional concepts, closures, etc.

I have been using Groovy for a few months now and find that it addresses a lot of the things that have been bothering me about Java for a number of years (handling collections, null pointers, verbosity). The principle is that you should be able to take your Java source file, rename it to .groovy and start converting gradually ... that isn't quite true because Groovy doesn't support inner classes, for loops with multiple loop variables, do..while, and character literals, but these are easy to fix.
Scala is the statically typed alternative ... Bill Venners reckons it allows you to achieve the same as Java (with compile-time checking) in about half the number of lines of code. And Scala has the Lift framework, which is less mature than Grails but still promising.
Both Groovy and Scala are worth exploring, and will (eventually) make you more productive.

I use Groovy all the time for utilities, both on the command line and on the web. Often, the utilities use jars/class files from my project, since it is all on the JVM.
For web utils, take a look at Groovlets. You can come up to speed with Groovlets in a couple of hours. A groovlet is simply a servlet distilled down to its essence.
If you need to persist state, Grails is a leading web framework (with a higher learning curve).
I don't know about portlets per se, as that is its own beast.

Yes. Both are compiled for the same VM, and you can use Java classes in them. The programming language syntax is just sugar coating on the JVM bytecode, which is the same no matter what.

Advantages of Java over Ruby/JRuby

I am learning Java.
I have learned and used Ruby. The Ruby books always tell the advantages of Ruby over Java. But there must be some advantages, that's why lots of people (especially companies) use Java and not Ruby.
Please tell the absolute(not philosophical!) advantages of Java over Ruby.

Many more developers experienced with Java than with Ruby.
Many existing libraries in Java (that helps JRuby too).
Static type checking (can be seen as an advantage and as a disadvantage).
Existing codebases that have to be maintained.
Good tool support.
More and deeper documentation and tutorials.
More experience with good practices and pitfalls.
More commercial support. That's interesting for companies.
Many of these advantages result from the Java ecosystem being more mature than the one around Ruby. Many of these points are subjective, like static vs. dynamic typing.

I don't know Ruby very well, but I can guess the following points:
Java has more documentation (books, blogs, tutorial, etc.); overall documentation quality is very good
Java has more tools (IDEs, build tools, compilers, etc.)
Java has better refactoring capabilities (due to the static type system, I guess)
Java has more widespread adoption than Ruby
Java has a well-specified memory model
As far as I know, Java has better support for threading and Unicode (JRuby may help here)
Java's overall performance is quite good as of late (due to HotSpot, the new G1 garbage collector, etc.)
Nowadays, Java has very attractive and cheap server hosting: App Engine

Please tell the absolute … advantages of Java over Ruby
Programmers should rarely deal in absolutes.
I'll dare it, and say that as a rule, static typing (Java) is an advantage over dynamic typing (Ruby) because it helps recognize errors much quicker, and without the need for potentially difficult unit tests (1).
Harnessed intelligently, a strong type system with static type checking can be a real time-saver.
(1) I do not oppose unit testing! But good unit testing is hard, and the compiler can be a great help in reducing the sheer number of necessary test cases.

Reason #1. There's a lot of legacy Java code out there. Ruby is new; there are not so many programmers who know it and even fewer who are good at it. Similarly, there is a lot more library code available for Java than for Ruby.
So there may be Technical reasons Ruby is better than Java, but if you're asking for Business reasons, Java still beats it.

The Java Virtual Machine, which has had over a decade of improvements including:
just in time compilation in the HotSpot compiler (JIT - compiling byte code to native code)
a plethora of garbage collection algorithms and tuning parameters
runtime console support for profiling, management etc. of your application (JConsole, JVisualVM etc)

I like this comparison (found via the link given by Markus, thanks!). Thanks to all. I am also expecting some more discrete advantages.
And it's great!

The language.
My opinion is that the particular properties of the Java language itself lead to the powerful capabilities of the IDEs and tools. These capabilities are especially valuable when you have to deal with a very large code base.
If I try to enumerate these properties it would be:
of course strong static typing
the grammar of language is a LALR(1) grammar - so it is easy to build a parser
fully qualified names (packages)
What we've got in the IDE so far, for example Eclipse:
Great capabilities for exploring very large code bases. You can unambiguously find all references, call hierarchies, and usages of classes or of public and protected members. That is very valuable when you are studying the code of a project or are about to change something.
A very helpful code editor. I noticed that when I write code in Eclipse's Java editor I actually type by hand only the names of classes or methods, then press Ctrl+1 and the editor generates a lot of things for me. It is especially good that Eclipse encourages you to write the usage of a piece of code first, even before the code is actually written. So you write the method call before you create the method, and then the editor generates the method stub for you. Or you add extra arguments to a method or constructor at the place where you invoke it, and the editor changes the signature for you. And even more complicated things: you pass some object to a method that accepts some interface, and if the object's class does not implement this interface, the editor can do it for you... and so on. There are a lot of interesting things.
There are a LOT of tools for Java. As an example of one great tool I want to mention Maven. Actually, my opinion is that code reuse is really possible only when we have a tool like Maven. The infrastructure built around it and its integration with the IDE make very interesting things feasible. Example: I have the m2eclipse plugin installed. I have a new empty project in Eclipse. I know that there is a class that I need to use (reuse, actually) somewhere in the repositories, let's say StringUtils for example. I write 'StringUtils' in my code; Eclipse's editor tells me that there is no such class in the project and underlines it in red. I press Ctrl+1 and see that there is an option to search for this class in the public repository (actually in the index, not the repository itself). Some libraries are found, I choose one of them at a particular version, and the tool downloads the jar and configures my project's classpath, and I already have everything I need.
So it's all about programmer's productivity.
The JVM.
My opinion is that the JVM (Sun's HotSpot particularly) is one of the most interesting pieces of software nowadays. Of course the key point here is performance. The current implementation of the HotSpot JVM explores very cutting-edge ways to achieve this really great performance. It exploits all possible advantages of just-in-time compilation over static compilation: it collects statistics on the usage of code before JIT-compiling it, optimises virtual calls when possible, can inline a lot more things than a static compiler can, and so on. And the great thing here is that all this stuff is in the JVM, not in the language itself (in contrast with C#, for example). Actually, if you're just learning the Java language, I strongly encourage you to learn the details of modern JVM implementations, so that you know what really hurts performance and what doesn't, do not put unnecessary optimizations in your Java code, and are not afraid to use all the possibilities of the language.
So...
It's all about IDEs and tools actually, but for some reason we have them for Java and not for any other language or platform (.NET is of course a great competitor in the Windows world).

This has probably been beaten to death, but my personal opinion is that Ruby excels at quickly creating web apps (and frameworks) that are easy to learn, beautiful to read, and more than fast enough for web apps.
Java, on the other hand, is better suited for raw muscle and speed.
For example, I wrote a Ruby program to convert a 192 MB text file to a MongoDB collection. Ruby took hours to run. And the Ruby code was as simple/optimized as you could get (1.9.2).
I re-wrote it in Java and it runs in 4 minutes. Yes. Hours to 4 minutes. So take that for what it's worth.

Network effect. Java has the advantage of more people using Java. Who themselves use Java because more people use Java.

If you have to build a big piece of software, you'll need to collaborate. With a lot of programmers out there, you can be sure that there will be someone who can be asked to maintain your software even if the original developers have left the company.
Static type checking and a good Java IDE leave no room for magic, and for a large pool of maintainers that is a good thing compared with Ruby.

It is not sufficient to indicate that Java is statically typed and Ruby is dynamically typed.
Correct me if I'm wrong, but does this cover the fact that in Ruby you can add to and even change the program (class definitions, method definitions, etc.) at runtime? AFAIK you can have dynamically typed languages that are not "dynamic" (i.e. changeable at runtime).
Because in Ruby you can change the program at runtime, you don't know until you've actually run the program how it is going to behave, and even then you don't know if it will behave the same next time, because your code may have been changed by some other code that called the code you're writing and testing.
This predictability is, depending on the context, the advantage of Java - one of the contexts where this is an advantage is when you have a lot of developers of varying skill levels working on a fairly large enterprise application.

IMHO, what one person considers an advantage might be a disadvantage for someone else. Some people prefer static typing while others like dynamic. It is quite subjective and depends largely upon the job and the person doing it.
I would say just learn Java and decide for yourself what its strong points are. Knowing both languages yourself beats any comparisons/advice some other person can give. And it's usually a good thing to know another language, so you're not wasting your time.

Negatives for Java:
There is a lot of duplication in libraries and frameworks available for Java.
Java developers/communities tend to create overcomplicated solutions to simple problems.
There is a lot more legacy in Java to maintain.
Too much pandering to business users has introduced cruft that makes middle managers feel better. In other words, some philosophies in Java are more concerned with BS instead of getting the job done. This is why companies like to use Java.
You'll generally need to write more code in Java than Ruby.
It takes a lot more configuring/installing/setup to get a fully working Java development environment over Ruby.
Positives for Java:
Speed.
Documentation.
Lower level language than Ruby, which could be a good thing or a bad thing, depending on your needs.
None of my points are very scientific, but I think the differences in philosophy and personalities behind Java and Ruby are what make them so different from each other.

Better performance.

There are more choices:
Developers - lots to hire
Libraries - lots of wheels already invented.
IDEs - lots of development environments to choose from. Not only just vi/emacs + a shell.
Runtimes - if you for some reason do not like the JVM you use on the system, you can either download or buy another implementation and it will most likely Just Work. How many Ruby implementations are there?
Please note that this has nothing to do with the LANGUAGES as such :)

Reading up on this: "Is Ruby as cross-platform as Java?" made me realize at least one factual advantage of Java over Ruby:
The J2ME-compatible subset of Java is more portable than Ruby,
as long as JRuby won't run on J2ME, which may be forever.

Is static metaprogramming possible in Java?

I am a fan of static metaprogramming in C++. I know Java now has generics. Does this mean that static metaprogramming (i.e., compile-time program execution) is possible in Java? If so, can anyone recommend any good resources where one can learn more about it?

No, this is not possible. Generics are not as powerful as templates. For instance, a template argument can be a user-defined type, a primitive type, or a value; but a generic type argument can only be Object or a subtype thereof.
Edit: This is an old answer; Java has since gained annotation processing (the pluggable API arrived with Java 6), which can be used for such trickery.

The short answer
This question is more than 10 years old, but I am still missing one answer to it. And that is: yes, but not because of generics, and not quite the same as in C++.
As of Java 6, we have the pluggable annotation processing API. Static metaprogramming is (as you already stated in your question)
compile-time program execution
If you know about metaprogramming, then you also know that this is not really true, but for the sake of simplicity we will use this definition. Please look here if you want to learn more about metaprogramming in general.
The pluggable annotation processing API is called by the compiler right after the .java files are read but before the compiler writes the bytecode to the .class files. (I had a source for this, but I cannot find it anymore; maybe someone can help me out here?)
It allows you to run logic at compile time in pure Java code. However, the world you are coding in is quite different. Not specifically bad or anything, just different. The classes you are analyzing do not yet exist, and you are working on metadata of the classes. But the compiler runs in a JVM, which means you can also create classes and program normally. Furthermore, you can analyze generics, because the annotation processor is called before type erasure.
The main gist of static metaprogramming in Java is that you provide metadata (in the form of annotations) and the processor is able to find all annotated classes and process them. A simpler example can be found on Baeldung. In my opinion, this is quite a good source for getting started. Once you understand it, try searching for yourself. There are multiple good sources out there, too many to list here. Also take a look at Google AutoService, which uses an annotation processor to take away the hassle of creating and maintaining the service files. If you want to create classes, I recommend looking at JavaPoet.
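To give a feel for the API, here is a minimal sketch of a processor that only lists what is annotated. It is not taken from the Baeldung article or from any of the libraries mentioned; the names AnnotatedLister and scan, and the choice of @Deprecated as the annotation to look for, are assumptions made for illustration. The scan helper drives the compiler in-process over a source string so the example can be run on its own; in a normal build the compiler discovers the processor itself.

```java
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical processor: records the simple name of every annotated element.
class AnnotatedLister extends AbstractProcessor {
    final List<String> found = new ArrayList<>();

    @Override
    public Set<String> getSupportedAnnotationTypes() {
        return Set.of("java.lang.Deprecated"); // illustrative choice of annotation
    }

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        for (TypeElement annotation : annotations) {
            for (Element e : roundEnv.getElementsAnnotatedWith(annotation)) {
                found.add(e.getSimpleName().toString());
            }
        }
        return false; // do not claim the annotation; other processors may still see it
    }

    // Runs the processor over one in-memory source file; -proc:only skips class generation.
    static List<String> scan(String fileName, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null if no javac is available
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + fileName), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        AnnotatedLister processor = new AnnotatedLister();
        JavaCompiler.CompilationTask task = compiler.getTask(
                null, null, new DiagnosticCollector<JavaFileObject>(),
                List.of("-proc:only"), null, List.of(file));
        task.setProcessors(List.of(processor));
        task.call();
        return processor.found;
    }

    public static void main(String[] args) {
        String source = "class Legacy { @Deprecated void oldApi() {} void newApi() {} }";
        System.out.println(scan("Legacy.java", source));
    }
}
```

Note that the processor works on Element objects (the compiler's model of the source), not on loaded classes, which is the "different world" described above.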
Sadly though, this API does not allow us to manipulate source code. But if you really want to, you should take a look at Project Lombok. They do it, but it is not supported.
Why is this important (Further reading for the interested ones among you)
TL;DR: It is quite baffling to me why we don't use static metaprogramming as much as dynamic, because it has many, many advantages.
Most developers see "dynamic and static" and immediately jump to the conclusion that dynamic is better. Nothing wrong with that; static has a lot of negative connotations for developers. But in this case (and specifically for Java) it is exactly the other way around.
Dynamic metaprogramming requires reflection, which has some major drawbacks. There are quite a lot of them. In short: performance, security, and design.
Static metaprogramming (i.e. annotation processing) allows us to hook into the compiler, which already does most of the things we try to accomplish with reflection. We can also create classes in this process, which are again passed to the annotation processors. You can then (for example) generate classes that do what would normally have to be done using reflection. Furthermore, we can implement a "fail fast" system, because we can inform the compiler about errors, warnings and such.
To conclude and compare as much as possible, let us imagine Spring. Spring tries to find all Component-annotated classes at runtime (which we could simplify by using service files at compile time), then generates certain proxy classes (which we could already have done at compile time) and resolves bean dependencies (which, again, we could already have done at compile time). See Jake Wharton's talk about Dagger 2, in which he explains why they switched to static metaprogramming. I still don't understand why the big players like Spring don't use it.
This post is too short to fully explain those differences and why static would be more powerful. If you want, I am currently working on a presentation for this. If you are interested and speak German (sorry about that), you can have a look at my website. There you will find a presentation which tries to explain the differences in 45 minutes. Only the slides, though.

Take a look at Clojure. It's a Lisp with macros (metaprogramming) that runs on the JVM and is very interoperable with Java.

What exactly do you mean by "static metaprogramming"? Yes, C++ template metaprogramming is impossible in Java, but Java offers other methods, much more powerful than those from C++:
reflection
aspect-oriented programming (AspectJ)
bytecode manipulation (Javassist, ObjectWeb ASM, Java agents)
code generation (Annotation Processing Tool, template engines like Velocity)
Abstract Syntax Tree manipulations (APIs provided by popular IDEs)
possibility to run Java compiler and use compiled code even at runtime
There's no best method: each of those methods has its strengths and weaknesses.
Due to flexibility of JVM, all of those methods in Java can be used both at compilation time and runtime.

No. What's more, generic types are erased to their upper bound by the compiler, so you cannot create a new instance of a generic type T at runtime.
The best way to do metaprogramming in Java is to circumvent the type erasure and hand in the Class<T> object of your type T. Still, this is only a hack.
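A minimal sketch of that hack, assuming T has an accessible no-argument constructor (Factory and create are made-up names for illustration):

```java
// Hypothetical factory: the Class<T> token carries the type information that erasure removes.
class Factory<T> {
    private final Class<T> type;

    Factory(Class<T> type) {
        this.type = type;
    }

    T create() {
        // "new T()" would not compile, because T is unknown at runtime;
        // the Class<T> token can still be asked for its no-argument constructor.
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new Factory<>(StringBuilder.class).create();
        sb.append("created via a type token");
        System.out.println(sb);
    }
}
```

This runs at runtime through reflection, so it is a workaround for erasure and not compile-time execution in the C++ sense.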

If you need powerful compile-time logic for Java, one way to do that is with some kind of code generation. Since, as other posters have pointed out, the Java language doesn't provide any features suitable for doing compile-time logic, this may be your best option (iff you really do have a need for compile-time logic). Once you have exhausted the other possibilities and you are sure you want to do code-generation, you might be interested in my open source project Rjava, available at:
http://www.github.com/blak3mill3r
It is a Java code generation library written in Ruby, which I wrote in order to generate Google Web Toolkit interfaces for Ruby on Rails applications automatically. It has proved quite handy for that.
As a warning, it can be very difficult to debug Rjava code. Rjava doesn't do much checking; it just assumes you know what you're doing. That's pretty much the state of static metaprogramming anyway. I'd say it's significantly easier to debug than anything non-trivial done with C++ TMP, and it is possible to use it for the same kinds of things.
Anyway, if you were considering writing a program which outputs Java source code, stop right now and check out Rjava. It might not do what you want yet, but it's MIT licensed, so feel free to improve it, deep fry it, or sell it to your grandma. I'd be glad to have other devs who are experienced with generic programming to comment on the design.

Lombok offers a weak form of compile time metaprogramming. However, the technique they use is completely general.
See Java code transform at compile time for a related discussion

You can use a metaprogramming library for Java such as Spoon: https://github.com/INRIA/spoon/

No, generics in Java are purely a way to avoid casting from Object.

In a very reduced sense, maybe?
http://michid.wordpress.com/2008/08/13/type-safe-builder-pattern-in-java/
