I'm currently working on a big legacy project with many huge classes and a lot of generated code. I want to find all methods whose bytecode is longer than 8000 bytes, because out of the box the JVM will not optimize them.
I found a manual way to do this (see: How many bytes of bytecode has a particular method in Java?), but my goal is to scan many files automatically.
I tried jboss-javassist, but AFAIK the bytecode length is only available at the class level.
Huge methods might indeed never get inlined, but I have my doubts regarding the threshold of 8000. This comment suggests a much smaller limit for inlining, though it is platform and configuration dependent anyway.
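As for the compilation (rather than inlining) cut-off, HotSpot's behaviour can be inspected and changed via flags. A hedged example, since flag names and availability vary by JVM version and build ("MyApp" is a placeholder):

java -XX:+PrintFlagsFinal -version | grep -i hugemethod   # shows DontCompileHugeMethods
java -XX:-DontCompileHugeMethods -cp . MyApp              # let the JIT compile huge methods anyway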
You are right that getting the bytecode length requires processing classes at that low level; however, you didn't specify what actual obstacle you encountered when trying to do that with Javassist. A simple program doing it with Javassist would be:
try(InputStream is = javax.swing.JComponent.class.getResourceAsStream("JComponent.class")) {
    ClassFile cf = new ClassFile(new DataInputStream(is));
    for(MethodInfo mi: cf.getMethods()) {
        CodeAttribute ca = mi.getCodeAttribute();
        if(ca == null) continue; // abstract or native
        int bLen = ca.getCode().length;
        if(bLen > 300)
            System.out.println(mi.getName()+" "+mi.getDescriptor()+", "+bLen+" bytes");
    }
}
This has been written and tested with a recent version of Javassist that uses Generics in the API. If you have a different/older version, you have to use
try(InputStream is = javax.swing.JComponent.class.getResourceAsStream("JComponent.class")) {
    ClassFile cf = new ClassFile(new DataInputStream(is));
    for(Object miO: cf.getMethods()) {
        MethodInfo mi = (MethodInfo)miO;
        CodeAttribute ca = mi.getCodeAttribute();
        if(ca == null) continue; // abstract or native
        int bLen = ca.getCode().length;
        if(bLen > 300)
            System.out.println(mi.getName()+" "+mi.getDescriptor()+", "+bLen+" bytes");
    }
}
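Since your goal is to scan many files automatically, you can combine this with a directory walk. A minimal sketch, assuming a Javassist version with the generified API and that the classes live as .class files under some root directory (the class name HugeMethodScanner is illustrative):

import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;
import javassist.bytecode.ClassFile;
import javassist.bytecode.CodeAttribute;
import javassist.bytecode.MethodInfo;

public class HugeMethodScanner {
    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args[0]); // e.g. your build output directory
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(p -> p.toString().endsWith(".class"))
                 .forEach(HugeMethodScanner::scan);
        }
    }

    static void scan(Path classFile) {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(classFile)))) {
            ClassFile cf = new ClassFile(in);
            for (MethodInfo mi : cf.getMethods()) {
                CodeAttribute ca = mi.getCodeAttribute();
                if (ca == null) continue; // abstract or native
                int bLen = ca.getCode().length;
                if (bLen > 8000) // the threshold from the question
                    System.out.println(cf.getName() + "." + mi.getName()
                            + mi.getDescriptor() + ", " + bLen + " bytes");
            }
        } catch (IOException e) {
            System.err.println("Could not read " + classFile + ": " + e);
        }
    }
}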
I just started using Wala Java Slicer to do some source code analysis tasks. I have a question about the proper use of the library. Assuming I have the following example code:
public static void main(String[] args) {
    ...
    UserType ut = userType;
    int i = ut.getInt();
    ...
    System.out.println(i);
}
Calculating a slice for the println statement with Wala gives the following statements:
NORMAL_RET_CALLER:Node: < Application, LRTExecutionClass, main([Ljava/lang/String;)V > Context: Everywhere[15]13 = invokevirtual < Application, LUserType, getInt()I > 11 #27 exception:12
NORMAL main:23 = getstatic < Application, Ljava/lang/System, out, <Application,Ljava/io/PrintStream> > Node: < Application, LRTExecutionClass, main([Ljava/lang/String;)V > Context: Everywhere
NORMAL main:invokevirtual < Application, Ljava/io/PrintStream, println(I)V > 23,13 #63 exception:24 Node: < Application, LRTExecutionClass, main([Ljava/lang/String;)V > Context: Everywhere
The code I am using to create the slice with Wala is shown below:
AnalysisScope scope = AnalysisScopeReader.readJavaScope("...",
        null, WalaJavaSlicer.class.getClassLoader());
ClassHierarchy cha = ClassHierarchy.make(scope);

Iterable<Entrypoint> entrypoints = Util.makeMainEntrypoints(scope, cha);
AnalysisOptions options = new AnalysisOptions(scope, entrypoints);

// Build the call graph
CallGraphBuilder cgb = Util.makeZeroCFABuilder(options, new AnalysisCache(), cha, scope, null, null);
CallGraph cg = cgb.makeCallGraph(options, null);
PointerAnalysis pa = cgb.getPointerAnalysis();

// Find seed statement
Statement statement = findCallTo(findMainMethod(cg), "println");

// Context-sensitive thin slice
Collection<Statement> slice = Slicer.computeBackwardSlice(statement, cg, pa,
        DataDependenceOptions.NO_BASE_NO_HEAP, ControlDependenceOptions.NONE);
dumpSlice(slice);
There are a number of statements that I expect to find in the slice but that are not present:
The assignment ut = userType is not included, even though the dependent method call ut.getInt() IS included in the slice.
No statements from the implementation of getInt() are included. Is there an option to activate "inter-procedural" slicing? I should mention that the .class file is included in the path used to create the AnalysisScope.
As you can see, I am using DataDependenceOptions.NO_BASE_NO_HEAP and ControlDependenceOptions.NONE for the dependence options, but even when I use FULL for both, the problem persists.
What am I doing wrong?
The assign statement ut = userType is not included even though the dependent method call ut.getInt() IS included in the slice
I suspect that the assignment never makes it into the bytecode, since it is an unneeded local variable and hence will not be visible to WALA:
Because the SSA IR has already been somewhat optimized, some statements such as simple assignments (x=y, y=z) do not appear in the IR, due to copy propagation optimizations done automatically during SSA construction by the SSABuilder class. In fact, there is no SSA assignment instruction; additionally, a javac compiler is free to do these optimizations, so the statements may not even appear in the bytecode. Thus, these Java statements will never appear in the slice.
http://wala.sourceforge.net/wiki/index.php/UserGuide:Slicer#Warning:_exclusion_of_copy_statements_from_slice
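To illustrate with a conceptual sketch (not actual WALA output):

// Source as written:
UserType ut = userType;   // a plain copy (x = y)
int i = ut.getInt();

// After SSA construction with copy propagation, roughly:
//   v1 = ...                               (the value that was 'userType')
//   v2 = invokevirtual UserType.getInt()I, v1
// The copy 'ut = userType' gets no instruction of its own,
// so no statement for it can ever appear in a slice.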
I am trying BCEL to modify a method by inserting an invoke before specific instructions.
It seems that my instrumentation results in a different StackMapTable, which cannot be auto-generated by the BCEL package itself.
So my instrumented class file contains the old StackMapTable, which causes an error in the JVM.
I have tried removeCodeAttributes, the MethodGen method that removes all code attributes. It works in simple cases, for example a wrapped function, but it does not work in my case now:
public class Insert {
    public static void main(String[] args) throws ClassFormatException, IOException {
        Insert isrt = new Insert();
        String className = "StringBuilder.class";
        JavaClass jclzz = new ClassParser(className).parse();
        ClassGen cgen = new ClassGen(jclzz);
        ConstantPoolGen cpgen = cgen.getConstantPool();
        MethodGen mgen = new MethodGen(jclzz.getMethods()[1], className, cpgen);
        InstructionFactory ifac = new InstructionFactory(cgen);
        InstructionList ilist = mgen.getInstructionList();
        for (InstructionHandle ihandle : ilist.getInstructionHandles()) {
            System.out.println(ihandle.toString());
        }
        InstructionFinder f = new InstructionFinder(ilist);
        InstructionHandle[] insert_pos = (InstructionHandle[]) (f.search("invokevirtual").next());
        Instruction inserted_inst = ifac.createInvoke("java.lang.System", "currentTimeMillis",
                Type.LONG, Type.NO_ARGS, Constants.INVOKESTATIC);
        System.out.println(inserted_inst.toString());
        ilist.insert(insert_pos[0], inserted_inst);
        mgen.setMaxStack();
        mgen.setMaxLocals();
        mgen.removeCodeAttributes();
        cgen.replaceMethod(jclzz.getMethods()[1], mgen.getMethod());
        ilist.dispose();
        // output the file
        FileOutputStream fos = new FileOutputStream(className);
        cgen.getJavaClass().dump(fos);
        fos.close();
    }
}
Removing a StackMapTable is not a proper solution for fixing a wrong StackMapTable. The important citation from the specification is:
4.7.4. The StackMapTable Attribute
In a class file whose version number is 50.0 or above, if a method's Code attribute does not have a StackMapTable attribute, it has an implicit stack map attribute (§4.10.1). This implicit stack map attribute is equivalent to a StackMapTable attribute with number_of_entries equal to zero.
Since a StackMapTable must have explicit entries for every branch target, such an implicit StackMapTable will work with branch-free methods only. But in these cases, the method usually doesn’t have an explicit StackMapTable anyway, so you wouldn’t have that problem then (unless the method had branches which your instrumentation removed).
Another conclusion is that you can get away with removing the StackMapTable, if you patch the class file version number to a value below 50. Of course, this is only a solution if you don’t need any class file feature introduced in version 50 or newer…
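With BCEL, that could look like the following; a hedged sketch, assuming the ClassGen instance from the code above and that BCEL's version setters are available:

cgen.setMajor(49); // 49 = Java 5; below version 50, no StackMapTable is required
cgen.setMinor(0);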
There was a grace period in which JVMs supported a fall-back mode for class files with broken StackMapTables, just for scenarios like yours where the tool support is not up-to-date (see -XX:+FailoverToOldVerifier or -XX:-UseSplitVerifier). But the grace period is over and that support has been removed, i.e. Java 8 JVMs do not support the fall-back mode anymore.
If you want to keep up with Java development and instrument newer class files, which might use features of these new versions, you have only two choices:
Calculate the correct StackMapTable manually
Use a tool that supports calculating the correct StackMapTable attributes; ASM (see java-bytecode-asm) does support it, as in the sketch below
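A minimal sketch of the second option, assuming a current ASM version: feed the instrumented class bytes through a ClassReader/ClassWriter pair and let COMPUTE_FRAMES regenerate the frames:

import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassWriter;

static byte[] regenerateFrames(byte[] instrumentedClass) {
    ClassReader cr = new ClassReader(instrumentedClass);
    // COMPUTE_FRAMES recomputes the StackMapTable as well as max stack/locals.
    // Note: it resolves common superclasses, so all classes referenced by the
    // bytecode must be reachable on the class path.
    ClassWriter cw = new ClassWriter(cr, ClassWriter.COMPUTE_FRAMES);
    cr.accept(cw, 0);
    return cw.toByteArray();
}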
Some days ago we switched to Java 7 within my company - finally! Yay \o/ So I found out about the Objects class and was astonished at how concisely the hashCode() and equals() methods could be implemented, reducing a lot of boilerplate code compared to the ones Eclipse generates by default (ALT+SHIFT+S --> H).
I was wondering: can I change the default behaviour of the Eclipse-generated hashCode() and equals()?
I'd love to see this:
@Override
public int hashCode()
{
    return Objects.hash(one, two, three, four/*, ...*/);
}
instead of this:
@Override
public int hashCode()
{
    final int prime = 31;
    int result = 1;
    result = prime * result + ((one == null) ? 0 : one.hashCode());
    result = prime * result + ((two == null) ? 0 : two.hashCode());
    result = prime * result + ((three == null) ? 0 : three.hashCode());
    result = prime * result + ((four == null) ? 0 : four.hashCode());
    // ...
    return result;
}
The same goes for equals(). This is the article I got this from.
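For equals() I would hope for something like this (a sketch; MyType stands for the enclosing class):

@Override
public boolean equals(Object obj)
{
    if (this == obj) return true;
    if (obj == null || getClass() != obj.getClass()) return false;
    MyType other = (MyType) obj;
    return Objects.equals(one, other.one)
        && Objects.equals(two, other.two)
        && Objects.equals(three, other.three)
        && Objects.equals(four, other.four);
}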
Any ideas how to realize this best?
hashCode and equals generation using the Java 7 Objects class has now been implemented in Eclipse. I was working on the feature request 424214 back in August 2018 and my contributions were merged in the JDT UI codebase shortly afterwards (see commit f543cd6).
The new option appears in the Source > Generate hashCode() and equals()... dialog.
This has been officially released in Eclipse 4.9 in September 2018. Simply download the latest version of Eclipse (downloads can be found here), or install the latest available software with the following update site:
http://download.eclipse.org/releases/latest
In addition to this new feature, arrays are now handled more cleverly. The generation will use the Arrays.deepHashCode and Arrays.deepEquals methods in a number of cases where it would previously incorrectly prefer the standard Arrays.hashCode and Arrays.equals alternatives.
In the Eclipse preferences go to Java > Editor > Templates.
In there you can create a new template. The pattern could look like:
@Override
public int hashCode()
{
    return Objects.hash(one, two, three, four/*, ...*/);
}
I'm not sure if there is a template variable that will properly enumerate your fields, however.
You might want to look at some further explanations of these templates.
There is a new plugin available that can generate toString(), hashCode(), and equals() methods using Java 7 features, the Apache Commons Lang library, or the Guava library. It has good customization features. The link to install the plugin is below.
After installation, just right-click -> Jenerate -> different options.
The link: https://marketplace.eclipse.org/content/jenerate
The javadoc for Runtime.availableProcessors() in Java 1.6 is delightfully unspecific. Is it looking just at the hardware configuration, or also at the load? Is it smart enough to avoid being fooled by hyperthreading? Does it respect a limited set of processors via the linux taskset command?
I can add one datapoint of my own: on a computer here with 12 cores and hyperthreading, Runtime.availableProcessors() indeed returns 24, which is not a good number to use in deciding how many threads to try to run. The machine was clearly not dead-idle, so it also can't have been looking at load in any effective way.
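For reference, the number a given machine reports can be checked with a one-liner:

System.out.println(Runtime.getRuntime().availableProcessors());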
On Windows, GetSystemInfo is used, and dwNumberOfProcessors is taken from the returned SYSTEM_INFO structure.
This can be seen from void os::win32::initialize_system_info() and int os::active_processor_count() in os_windows.cpp of the OpenJDK source code.
The MSDN documentation for dwNumberOfProcessors says that it reports 'the number of logical processors in the current group', which means that hyperthreading will increase the number of CPUs reported.
On Linux, os::active_processor_count() uses sysconf:
int os::active_processor_count() {
  // Linux doesn't yet have a (official) notion of processor sets,
  // so just return the number of online processors.
  int online_cpus = ::sysconf(_SC_NPROCESSORS_ONLN);
  assert(online_cpus > 0 && online_cpus <= processor_count(), "sanity check");
  return online_cpus;
}
The documentation for _SC_NPROCESSORS_ONLN says it is 'the number of processors currently online (available)'. This is not affected by the affinity of the process, and it also counts hyperthreaded logical processors.
According to Sun Bug 6673124:
The code for active_processor_count, used by Runtime.availableProcessors() is as follows:
int os::active_processor_count() {
  int online_cpus = sysconf(_SC_NPROCESSORS_ONLN);
  pid_t pid = getpid();
  psetid_t pset = PS_NONE;
  // Are we running in a processor set?
  if (pset_bind(PS_QUERY, P_PID, pid, &pset) == 0) {
    if (pset != PS_NONE) {
      uint_t pset_cpus;
      // Query number of cpus in processor set
      if (pset_info(pset, NULL, &pset_cpus, NULL) == 0) {
        assert(pset_cpus > 0 && pset_cpus <= online_cpus, "sanity check");
        _processors_online = pset_cpus;
        return pset_cpus;
      }
    }
  }
  // Otherwise return number of online cpus
  return online_cpus;
}
This particular code may be Solaris-specific. But I would imagine that the behavior would be at least somewhat similar on other platforms.
AFAIK, it always gives you the total number of available CPUs, even those not available for scheduling. I have a library which uses this fact to find reserved CPUs: I read /proc/cpuinfo and the default thread affinity of the process to work out what is available.
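A rough sketch of the /proc/cpuinfo part of that approach (Linux only; this counts all logical processors regardless of the process's affinity):

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

// inside a method that throws IOException:
try (Stream<String> lines = Files.lines(Paths.get("/proc/cpuinfo"))) {
    long cpus = lines.filter(l -> l.startsWith("processor")).count();
    System.out.println(cpus + " logical processors listed");
}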
I am getting an expected VerifyError when attempting to load a class I have generated using ASM. On further inspection I can see that the JVM is correct and that the method it complains about has an invalid max_stack value. The strange thing is that I am using the options to automatically calculate max stack and max locals, so this should not be a problem.
The method with the invalid value is very simple, and yet the result is bad bytecode.
I have written a class with the intended method and compared my ASM-generated class against what javac produces; the bytecodes match up, the only difference being that my max_stack is 0, which is wrong, while javac sets a value of 2.
I'd like to avoid having to calculate the max stack/locals myself.
Max stack and variable calculation can produce wrong results if the bytecode is not valid. You can verify the generated code by running it through the CheckClassAdapter.
For example,
ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_MAXS);
// generate code into cw instance...

PrintWriter pw = new PrintWriter(System.out);
CheckClassAdapter.verify(new ClassReader(cw.toByteArray()), true, pw);
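A self-contained sketch tying this together (the names Demo and test are illustrative). Note that even with COMPUTE_MAXS you must still call visitMaxs() before visitEnd(); forgetting that call is a common cause of max_stack ending up as 0:

import java.io.PrintWriter;
import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;
import org.objectweb.asm.util.CheckClassAdapter;

public class VerifyDemo {
    public static void main(String[] args) {
        ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_MAXS);
        cw.visit(Opcodes.V1_7, Opcodes.ACC_PUBLIC, "Demo", null, "java/lang/Object", null);
        MethodVisitor mv = cw.visitMethod(Opcodes.ACC_PUBLIC | Opcodes.ACC_STATIC,
                "test", "()I", null, null);
        mv.visitCode();
        mv.visitInsn(Opcodes.ICONST_1);
        mv.visitInsn(Opcodes.IRETURN);
        mv.visitMaxs(0, 0); // arguments are ignored with COMPUTE_MAXS, but the call is mandatory
        mv.visitEnd();
        cw.visitEnd();
        CheckClassAdapter.verify(new ClassReader(cw.toByteArray()), true,
                new PrintWriter(System.out));
    }
}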