Stand-alone Bytecode Verifier

In my bytecode instrumentation project, I frequently run into VerifyErrors. However, the default Java verifier gives little information about which instruction caused the error (it only gives the method and a short message). Is there a stand-alone bytecode verifier that provides more help in locating the error, at least the precise instruction location? Thank you.

ASM's CheckClassAdapter.verify() gives great feedback:
http://asm.ow2.org/

I was also looking for something that would report potential verify errors, but especially IncompatibleClassChangeErrors. I wrote a little test project with one API class and another client class calling API methods, plus a main class to run a verifier; then I changed the API, recompiled it but not the client, and checked to see what could be caught. I compiled with -target 7, though without using any JDK 7 specific features for now.
First and most obviously, Class.forName can find certain errors in the client class's signature, but it does not seem to check method bodies for calls to nonexistent API methods and the like, even if you call getDeclaredMethods; the errors are reported by the VM only when the problematic line of code is actually run.
JustIce in BCEL 5.2 seems to be easiest;
org.apache.bcel.verifier.Verifier.main(new String[] {clazz});
does the job:
Pass 3a, method number 1 ['public void m()']:
VERIFIED_REJECTED
Instruction invokestatic[184](3) 4 constraint violated:
Referenced method 'x' with expected signature '()V' not found in class 'API'.
....
I tried ASM 4.0, but
org.objectweb.asm.util.CheckClassAdapter.main(new String[] {clazz});
does not work; perhaps it checks the format of methods, but not linkage. Inlining main and passing checkDataFlow=true does not help.
Searching, I also found https://kenai.com/hg/maxine~maxine/file/8429d3ebc036/com.oracle.max.vm/test/test/com/sun/max/vm/verifier/CommandLineVerifier.java but I could not find any way to make this work; the accompanying unit test throws a ClassNotFoundException when run.

As with any project involving JVM bytecode, I would first check to see whether the BCEL has anything that might be useful for you. Also, perhaps FindBugs may help - though I'm not sure whether it assumes verifiable bytecode to start with or not.

Related

What else can we use instead of import com.sun.management.OperatingSystemMXBean, as this import is giving me a Sonar issue?

Due to project security issues, we are not allowed to use com.sun.management.OperatingSystemMXBean. Instead I am trying to use java.lang.management.OperatingSystemMXBean. But in my method I need to know the CPU load (getSystemCpuLoad). How can I get the same using java.lang.management? Is there any method in java.lang.* to get the system CPU load?

I don't think there is an alternative. At least not in the standard Java SE class libraries [1].
Not all com.sun.* packages are considered to be closed APIs. In this case the javadocs include this interface. I take that as an implicit statement that this is an open API.
If this is just the generic warning from SonarQube that you shouldn't depend on com.sun.* and sun.* APIs (see RSPEC-1191), my advice is to suppress the warning for this particular case.
I don't see how this is a project "security" issue. Please explain why you think that.
Okay. Let me put my question this way: how do I call the getSystemCpuLoad method via java.lang.management.OperatingSystemMXBean?
One way is just like your current code (presumably) does. Cast the MXBean instance to a com.sun.management.OperatingSystemMXBean and call the method. (And suppress the SonarQube warning.)
The one thing to note is that the getSystemCpuLoad method is marked as deprecated in Java 17. You should now use getCpuLoad instead.
1 - If you found and used a 3rd-party library [2] that provides this functionality, or if you implemented your own (in native code, for example), I think you would be making the problem worse. Now you have an extra dependency to track or extra code to maintain. Bear in mind that the implementation of this functionality is OS specific, so you would need to find or write an implementation that works on all of your platforms, both now and in the future.
2 - Beware of posts that suggest using the SIGAR library. It hasn't been updated in a long time, and there are reports that it's problematic on some platforms.
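
The cast suggested in the answer above can be sketched as follows. This is a minimal sketch, assuming a JVM whose platform bean implements com.sun.management.OperatingSystemMXBean; the class name CpuLoadSketch and the -1.0 fallback are illustrative choices, not part of any API.

```java
import java.lang.management.ManagementFactory;

class CpuLoadSketch {
    /**
     * Returns the system CPU load as reported by the platform bean,
     * or -1.0 (an arbitrary fallback chosen for this sketch) when the
     * bean does not implement the com.sun.management interface.
     */
    @SuppressWarnings("deprecation") // newer JDKs offer getCpuLoad() instead
    static double systemCpuLoad() {
        java.lang.management.OperatingSystemMXBean base =
                ManagementFactory.getOperatingSystemMXBean();
        if (base instanceof com.sun.management.OperatingSystemMXBean) {
            return ((com.sun.management.OperatingSystemMXBean) base).getSystemCpuLoad();
        }
        return -1.0;
    }

    public static void main(String[] args) {
        System.out.println("system CPU load: " + systemCpuLoad());
    }
}
```

The instanceof check keeps the code from failing with a ClassCastException on a JVM that does not provide the extended interface. The SonarQube warning on the com.sun.* reference would still need to be suppressed, as the answer says.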

Is it possible to redefine core JDK classes using instrumentation?

I want to redefine the bytecode of the StackOverflowError constructor so I have a "hook" for when a stack overflow occurs. All I want to do is insert a single method call to a static method of my choosing at the start of the constructor. Is it possible to do this?

You should be able to do it in one of two ways (unless something has changed in the last 1-2 years, in which case I'd love some links to changelogs/docs):
1) As mentioned in a comment, and probably not very feasible: modify the classes you are interested in, put them in a jar, and use the -Xbootclasspath/p: option to load them ahead of the default ones. As was mentioned before, this can raise legal issues (and is a pain to do in general).
2) You should be able to (or at least you used to be able to) instrument almost all core classes (IIRC Class was the only exception I've seen). One of many problems you might have is that many core classes are initialized before your agents (or, to be exact, their premain methods) are consulted. To overcome this, add the Can-Retransform-Classes attribute to your agent jar's manifest and then retransform the classes you are interested in. Be aware that retransformation is a bit less powerful than instrumenting a class at load time and doesn't give you all the same options; you can read more about it in the docs.
I am assuming you know how to do instrumentation?

There are several things to consider.
It is possible to redefine java.lang.StackOverflowError. I tried it successfully on 1.7.0_40: isModifiableClass(java.lang.StackOverflowError.class) returns true, and I successfully redefined the class, inserting a method invocation into all of its constructors.
You should be aware that when you insert a method call into a class via Instrumentation, you still have to obey the visibility imposed by the ClassLoader relationships. Since StackOverflowError is loaded by the bootstrap loader, it can only invoke methods of classes loaded by the bootstrap loader. You would have to add the target method's class(es) to the bootstrap loader.
This works if the application's code throws a StackOverflowError manually. However, when a real stack overflow occurs, the last thing the JVM will do is invoke additional methods (keep in mind what the error says: the stack is full). Consequently, it creates an instance of StackOverflowError without calling its constructor (a JVM can do that). So your instrumentation is pointless in this situation.
As already pointed out by others, a "Pure Java Application" must not rely on modified JRE classes. It is only valid to use Instrumentation as an add-on, i.e. as a development or JVM management tool. You should keep in mind that the fact that Oracle's JVM 1.7.0_40 supports the redefinition of StackOverflowError does not imply that other versions or other JVMs do as well.
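
The checks mentioned in the points above can be probed with a small agent before attempting any redefinition. This is only a sketch: the class name HookAgentSketch and the helper loadedByBootstrap are made up for illustration, and the agent jar's manifest is assumed to contain Premain-Class: HookAgentSketch and Can-Retransform-Classes: true.

```java
import java.lang.instrument.Instrumentation;

class HookAgentSketch {
    /** A null class loader means the class was loaded by the bootstrap loader. */
    static boolean loadedByBootstrap(Class<?> c) {
        return c.getClassLoader() == null;
    }

    /** Called by the JVM when started with -javaagent:<path to agent jar>. */
    public static void premain(String agentArgs, Instrumentation inst) {
        Class<?> target = StackOverflowError.class;
        System.out.println("retransformation supported: " + inst.isRetransformClassesSupported());
        System.out.println("target modifiable: " + inst.isModifiableClass(target));
        System.out.println("target bootstrap-loaded: " + loadedByBootstrap(target));
    }

    public static void main(String[] args) {
        // Runs without an agent: only the class loader check is available here.
        System.out.println(loadedByBootstrap(StackOverflowError.class));
    }
}
```

The results are specific to the JVM the agent runs in. As noted above, a positive answer on one JVM version says nothing about others.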

How to check if binaries are built from particular sources

The legacy project I am working on includes an external library in the form of a set of binary JAR files. We decided that, for analysis and potential patching, we want to obtain the sources of this library, use them to build new binaries, and, after detailed and sufficiently long regression testing, switch to these binaries.
Assume that we have already retrieved and built the sources (I am actually still in the planning phase). Before real testing, I would like to perform some "compatibility checks" to rule out the possibility that the sources represent something dramatically different from what is in the "old" binaries.
Using the javap tool, I was able to extract the version of the JDK used for compilation (at least I believe it is the JDK version). It says the binaries were built with major version 46 and minor version 0. According to this article, that maps to JDK 1.2.
Assume that the same JDK would be used for sources compilation.
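
The version check described above can also be done without javap by reading the class file header directly. The sketch below uses a made-up class name (ClassFileVersion) and made-up helper names. Note that the major version records the class file format the compiler targeted, which is not necessarily the version of the JDK that did the compiling.

```java
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

class ClassFileVersion {
    /** Returns {major, minor} read from the first 8 bytes of a class file. */
    static int[] version(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (classBytes.length < 8 || buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        int minor = buf.getShort() & 0xFFFF; // minor_version comes first
        int major = buf.getShort() & 0xFFFF;
        return new int[] { major, minor };
    }

    /** Maps a major version to the Java release that introduced it (45 is 1.1, 46 is 1.2, 52 is 8). */
    static String release(int major) {
        if (major < 45) {
            throw new IllegalArgumentException("unknown major version " + major);
        }
        return major <= 48 ? "1." + (major - 44) : String.valueOf(major - 44);
    }

    public static void main(String[] args) throws Exception {
        int[] v = version(Files.readAllBytes(Paths.get(args[0])));
        System.out.println("major " + v[0] + ", minor " + v[1] + " -> Java " + release(v[0]));
    }
}
```

Running it over every class in the old jars would show whether they all share one target version, which is worth knowing before choosing a compiler for the rebuild.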
The question is:
Is there a reliable and reasonably efficient way to verify whether both sets of binaries were built from the same sources? I would like to know whether all method signatures and class definitions are identical, and whether most or maybe all of the method implementations are identical or similar.
The library is pretty big, so I think a detailed analysis of decompiled binaries may not be an option.

I suggest a multi-stage process:
Apply the previously suggested Jardiff or similar to see if there are any API differences. If possible, pick a tool that has an option for reporting private methods etc. In practice, any substantial implementation change in Java is likely to change some methods and classes, even if the public API is unchanged.
If you have an API match, compile a few randomly selected files with the indicated compiler, decompile the result and the original class files, and compare the results. If they match, apply the same process to larger and larger bodies of code until you either find a mismatch, or have checked everything.
Diffs of decompiled code are more likely to give you clues about the nature of the differences, and are easier to filter for non-significant differences, than the actual class files.
If you get a mismatch, analyze it. It may be due to something you do not care about. If so, try to construct a script that will delete that form of difference and resume the compile-and-compare process. If you get widespread mismatches, experiment with compiler parameters such as optimization. If adjustments to the compiler parameters eliminate the differences, continue with the bulk comparison. The objective in this phase is to find a combination of compiler parameters and decompiled code filters that produce a match on the sample files, and apply them to bulk comparison of the library.
If you cannot get a reasonably close match in the decompiled code, you probably do not have the right source code. Even so, if you have an API match it may be worth building your system and running your tests using the result of the compilation. If your tests run at least as well with the version you built from source, continue work using it.
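
One cheap way to narrow down which files need the decompile-and-compare treatment described above is to digest every class entry in both jars and list the entries whose bytes differ. This is a rough sketch, not part of the answer: the class name JarClassDiff and its methods are invented for illustration. Byte-identical output should only be expected when the compiler and its parameters match the original build.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class JarClassDiff {
    /** Maps each .class entry name in the jar to a hex SHA-256 digest of its bytes. */
    static Map<String, String> digests(String jarPath) throws IOException, NoSuchAlgorithmException {
        Map<String, String> result = new TreeMap<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (entry.isDirectory() || !entry.getName().endsWith(".class")) {
                    continue;
                }
                MessageDigest md = MessageDigest.getInstance("SHA-256");
                try (InputStream in = jar.getInputStream(entry)) {
                    byte[] buf = new byte[8192];
                    for (int n; (n = in.read(buf)) != -1; ) {
                        md.update(buf, 0, n);
                    }
                }
                StringBuilder hex = new StringBuilder();
                for (byte b : md.digest()) {
                    hex.append(String.format("%02x", b));
                }
                result.put(entry.getName(), hex.toString());
            }
        }
        return result;
    }

    /** Names present in only one map, or present in both with different digests. */
    static SortedSet<String> differing(Map<String, String> a, Map<String, String> b) {
        Set<String> names = new HashSet<>(a.keySet());
        names.addAll(b.keySet());
        SortedSet<String> out = new TreeSet<>();
        for (String name : names) {
            if (!Objects.equals(a.get(name), b.get(name))) {
                out.add(name);
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // args[0]: original jar, args[1]: jar rebuilt from the sources
        System.out.println(differing(digests(args[0]), digests(args[1])));
    }
}
```

Entries that come out identical need no further inspection. The remaining ones are the candidates for decompiling and diffing.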

There are a variety of JAR comparison tools out there. One that used to be pretty good is Jardiff. I haven't used it in a while, but I'm sure it's still available. There are also some commercial offerings in the same space that could fit your needs.

Jardiff, which Perception mentioned, is a good start; however, there is theoretically no way to be 100% sure. This is because the same source can be compiled with different compilers, compiler configurations, and optimization levels. So there is no reliable way to compare binary code (bytecode) beyond class and method signatures.
What do you mean by "similar implementation" of a method? Let's suppose that a clever compiler drops an else case because it figures out that the condition may not be true ever. Are the two similar? Yes and no.. :-)
The best way to go, IMHO, is to set up very good regression test cases that check every key feature of your libraries. This might be a horror, but in the long term it might be cheaper than hunting for bugs. It all depends on your future plans for this project. It is not a trivial or easy decision.

For method signatures, use a tool like jardiff.
For similarity of implementation, you have to fall back to a wild guess. Comparing the bytecode on opcode-level may be compiler-dependent and lead to a large number of false negatives. If this is the case, you could fall back to compare the methods of a class using the LineNumberTable.
It gives you a list of line numbers for each method (as long as the class file has been compiled with the debug flag, which is often missing in very old or commercial libraries).
If two class files are compiled from the same source code, then at least the line numbers of each method should match exactly.
You can use a library such as Apache BCEL to retrieve the LineNumberTable:
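import org.apache.bcel.classfile.ClassParser;
import org.apache.bcel.classfile.JavaClass;
import org.apache.bcel.classfile.LineNumber;
import org.apache.bcel.classfile.LineNumberTable;
import org.apache.bcel.classfile.Method;

// Print the source line numbers recorded for each method of Foo.class.
JavaClass fooClazz = new ClassParser("Foo.class").parse();
for (Method m : fooClazz.getMethods()) {
    LineNumberTable lnt = m.getLineNumberTable();
    if (lnt == null) {
        continue; // no code (abstract/native) or compiled without debug info
    }
    for (LineNumber ln : lnt.getLineNumberTable()) {
        System.out.println(ln.getLineNumber());
    }
}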

Is there any class to diagnose invoked method in a java class?

I need to find all methods invoked in a class (whether declared in the class or not) using its source code. That is, I want to pass the class source code to a method as input and get the methods invoked by the class as output. In effect, I need a class/method that operates the same way as the Java lexical analyzer.
Is there any way to find all invoked methods?
Of course, I tried to use Runtime.traceMethodCalls() to solve the problem, but there was no output. I've read that I need to run Java in debug mode with java -g, but unfortunately when I try to run java -g it gives an error. What should I do now? Is there any other approach?

1) In the general case, no. Reflection will always allow the code to make method calls that you won't be able to analyze without actually running the code.
2) Tracing the method calls won't give you the full picture either, since a method is not in any way guaranteed (or even likely) to make all the calls it can every time you call it.
Your best bet is some kind of "best effort" code analysis. You may want to try enlisting the compiler's help with that. For example, compile the code and analyze the generated class file for all emitted external symbols. It won't guarantee catching every call (see #1), but it will get you close in most cases.
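
The "analyze the generated class file" idea above can be sketched with nothing but the standard library by walking the constant pool and collecting its method references. The class name InvokedMethods and its helpers are invented for this sketch. It lists every method the class file refers to but does not say which method makes each call, and, as point 1 says, reflective calls are invisible to it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;

class InvokedMethods {
    /** Returns "owner.name+descriptor" for each Methodref/InterfaceMethodref constant. */
    static Set<String> of(byte[] classBytes) {
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        if (in.getInt() != 0xCAFEBABE) throw new IllegalArgumentException("not a class file");
        in.getInt(); // skip minor and major version
        int count = in.getShort() & 0xFFFF;
        byte[] tag = new byte[count];
        String[] utf8 = new String[count];
        int[] a = new int[count], b = new int[count]; // first and second u2 operand
        for (int i = 1; i < count; i++) {
            tag[i] = in.get();
            switch (tag[i]) {
                case 1: { // Utf8 (decoded as plain UTF-8, fine for typical names)
                    byte[] s = new byte[in.getShort() & 0xFFFF];
                    in.get(s);
                    utf8[i] = new String(s, StandardCharsets.UTF_8);
                    break;
                }
                case 3: case 4: in.getInt(); break;       // Integer, Float
                case 5: case 6: in.getLong(); i++; break; // Long, Double use two slots
                case 7: case 8: case 16: case 19: case 20: // one u2 operand
                    a[i] = in.getShort() & 0xFFFF; break;
                case 9: case 10: case 11: case 12: case 17: case 18: // two u2 operands
                    a[i] = in.getShort() & 0xFFFF; b[i] = in.getShort() & 0xFFFF; break;
                case 15: in.get(); in.getShort(); break;  // MethodHandle
                default: throw new IllegalArgumentException("unknown constant tag " + tag[i]);
            }
        }
        Set<String> calls = new TreeSet<>();
        for (int i = 1; i < count; i++) {
            if (tag[i] == 10 || tag[i] == 11) {
                String owner = utf8[a[a[i]]]; // Methodref -> Class -> Utf8 name
                int nameAndType = b[i];       // Methodref -> NameAndType
                calls.add(owner + "." + utf8[a[nameAndType]] + utf8[b[nameAndType]]);
            }
        }
        return calls;
    }

    /** Loads the class file bytes of an already loaded class via its resource. */
    static byte[] bytesOf(Class<?> c) {
        String name = c.getName().substring(c.getName().lastIndexOf('.') + 1) + ".class";
        try (InputStream in = c.getResourceAsStream(name)) {
            if (in == null) throw new IllegalArgumentException("no class file for " + c);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?> c = args.length > 0 ? Class.forName(args[0]) : java.util.ArrayList.class;
        of(bytesOf(c)).forEach(System.out::println);
    }
}
```

Owners are printed in the internal form used by class files (java/util/AbstractList rather than java.util.AbstractList), followed by the method name and its descriptor.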

You can utilize one of the open source static analyzers for Java as a starting point. Checkstyle allows you to build your own modules. Soot has a pretty flexible API and a good example of call analysis. FindBugs might also allow you to write a custom module. AFAIK all three are embeddable in the form of a JAR, so you can incorporate whatever you come up with into your own custom program.

From your question it is hard to tell exactly what problem you're trying to solve.
But in case:
If you want to analyze source code to see which parts of it are redundant and may be removed, you could use an IDE (Eclipse, IntelliJ IDEA Community Edition, etc.). IDEs can search for usages of a method and can analyze code and highlight unused methods as warnings/errors.
If you want to see where some method is called at runtime, you could use a profiling tool to collect information on those method invocations. Depending on the tool, you may also see where those methods were called from. But bear in mind that when you execute a program, it is not guaranteed that the method you are interested in is called from every possible place.
If you are developing an automated tool for displaying call graphs of methods, you need to parse the source and work with code entities. One way would be to implement your own compiler and go on from there, but an easier way would be to reuse an open-source parser/compiler/analyzer and build your tool around it.
I've used IntelliJ IDEA CE, which has such functionality and can be downloaded with source: http://www.jetbrains.org/display/IJOS/Home
There is also the well-known Eclipse, whose sources are available too.
Both of these products have enormous code bases, so isolating the interesting part would be difficult. But it would still be easier than writing your own Java compiler and verifying that it works for every corner case.

For analyzing the bytecode as mentioned above you could take a look at JBoss Bytecode. It is more for testing but may also be helpful for analyzing code.
sven.malvik.de

You may plug into the compiler.
Have a look at the source of Project Lombok, for instance.
There is no general mechanism, so they have one mechanism for javac and one for eclipse's compiler.
http://projectlombok.org/

Java: Locate reflection code usage

We have a huge codebase, and some classes are often used via reflection all over the code. We can safely remove classes and the compiler is happy, but some of them are used dynamically via reflection, so I can't locate them other than by searching strings ...
Is there some reflection explorer for Java code?

There is no simple tool to do this. However, you can use code coverage instead, which gives you a report of all the lines of code executed. This can be even more useful for either improving test code or removing dead code.
Reflection is by definition very dynamic, and you have to run the right code to see what it would do, i.e. you have to have reasonable tests. You can add logging to everything reflection does if you can access this code, or perhaps you can instrument these libraries (or change them directly).

I suggest, using appropriately licensed source for your JRE, modifying the reflection classes to log when classes are used by reflection (use a map/WeakHashMap to ignore duplicates). Your modified system classes can replace those in rt.jar with -Xbootclasspath/p: on the command line (on Oracle "Sun" JRE, others will presumably have something similar). Run your program and tests and see what comes up.
(Possibly you might have to hack around issues with class loading order in the system classes.)

I doubt any such utility is readily available, but I could be wrong.
This is quite complex, considering that dynamically loaded classes (via reflection) can themselves load other classes dynamically and that the names of loaded classes may come from variables or some runtime input.
Your codebase probably does neither of these. If this is a one-time effort, searching strings might be a good option. Or you can look for calls to reflection methods.
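
The string search suggested above can be automated roughly as follows: extract the string literals from a source file and keep those that exactly match a known class name. This is a sketch under stated assumptions: the class name ReflectionStringScan, the method referencedClasses, and the com.example names are all hypothetical, and class names that are assembled at runtime or read from configuration will not be found.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReflectionStringScan {
    // Matches a double-quoted Java string literal, allowing backslash escapes.
    private static final Pattern STRING_LITERAL =
            Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Returns the known class names that occur as whole string literals in the source text. */
    static Set<String> referencedClasses(String source, Set<String> knownClassNames) {
        Set<String> hits = new TreeSet<>();
        Matcher m = STRING_LITERAL.matcher(source);
        while (m.find()) {
            if (knownClassNames.contains(m.group(1))) {
                hits.add(m.group(1));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String source = "Object o = Class.forName(\"com.example.Foo\").newInstance();\n"
                + "String greeting = \"hello\";";
        Set<String> known = new TreeSet<>();
        known.add("com.example.Foo");
        known.add("com.example.Bar");
        System.out.println(referencedClasses(source, known)); // prints [com.example.Foo]
    }
}
```

A class whose name never shows up in such a scan is only a candidate for removal. The coverage-based approaches from the other answers are still needed to confirm it.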

As the other posters have mentioned, this cannot be done with static analysis due to the dynamic nature of reflection. If you are using Eclipse, you might find this coverage tool useful, and it's very easy to work with. It's called EclEmma.
