In ASM, I'm trying to determine the labels for a try-catch block.
Currently I have:
public void printTryCatchLabels(MethodNode method) {
if (method.tryCatchBlocks != null) {
for (int i = 0; i < method.tryCatchBlocks.size(); ++i) {
Label start = method.tryCatchBlocks.get(i).start.getLabel();
Label end = method.tryCatchBlocks.get(i).end.getLabel();
Label catch_start = method.tryCatchBlocks.get(i).handler.getLabel();
System.out.println("try{ " + start.toString());
System.out.println("} " + end.toString());
System.out.println("catch { " + catch_start.toString());
System.out.println("} " /*where does the catch block end?*/);
}
}
}
I'm trying to determine where the label is for the end of the catch block but I don't know how. Why do I need it? Because I want to "remove" try-catch blocks from the byte-code.
For example, I am trying to change:
public void test() {
try {
System.out.println("1");
} catch(Exception e) {
//optionally rethrow e.
}
System.out.println("2");
}
to:
public void test() {
System.out.println("1");
System.out.println("2");
}
So to remove it, I thought that I could just get the labels and remove all instructions between the catch-start and the catch-end and then remove all the labels.
Any ideas?
I recommend reading the JVM Spec §3.12. Throwing and Handling Exceptions. It contains an example that is very simple but still exhibits the problems with your idea:
Compilation of try-catch constructs is straightforward. For example:
void catchOne() {
try {
tryItOut();
} catch (TestExc e) {
handleExc(e);
}
}
is compiled as:
Method void catchOne()
0 aload_0 // Beginning of try block
1 invokevirtual #6 // Method Example.tryItOut()V
4 return // End of try block; normal return
5 astore_1 // Store thrown value in local var 1
6 aload_0 // Push this
7 aload_1 // Push thrown value
8 invokevirtual #5 // Invoke handler method:
// Example.handleExc(LTestExc;)V
11 return // Return after handling TestExc
Exception table:
From To Target Type
0 4 5 Class TestExc
Here, the catch block ends with a return instruction, thus does not join with the original code flow. This, however, is not a required behavior. Instead, the compiled code could have a branch to the last return instruction in place of the 4 return instruction, i.e.
Method void catchOne()
0: aload_0
1: invokevirtual #6 // Method tryItOut:()V
4: goto 13
7: astore_1
8: aload_0
9: aload_1
10: invokevirtual #5 // Method handleExc:(LTestExc;)V
13: return
Exception table:
From To Target Type
0 4 7 Class TestExc
(e.g. at least one Eclipse version compiled the example exactly this way)
But it could also be vice versa, having a branch to instruction 4 in place of the last return instruction.
Method void catchOne()
0 aload_0
1 invokevirtual #6 // Method Example.tryItOut()V
4 return
5 astore_1
6 aload_0
7 aload_1
8 invokevirtual #5 // Method Example.handleExc(LTestExc;)V
11 goto 4
Exception table:
From To Target Type
0 4 5 Class TestExc
So there are already three ways to compile this simple example, which doesn’t contain any conditionals. The conditional branches associated with loops or if statements do not necessarily point to the instruction right after a conditional block of code. If that block is followed by another flow control instruction, the conditional branch (the same applies to switch targets) could short-circuit it and jump straight to that instruction's target.
So it’s very hard to determine which code belongs to a catch block. On the byte code level, it doesn’t even have to be a contiguous block but may be interleaved with other code.
And so far we haven’t even talked about compiling finally and synchronized, or the newer try-with-resources statement. They all end up creating exception handlers that look like catch blocks on the byte code level.
Since branch instructions within the exception handler might target code outside of the handler when recovering from the exception, traversing the code graph of an exception handler doesn’t help here as processing the branch instruction correctly requires the very information about the branch target you actually want to gather.
So the only way to handle this task is to do the opposite. You have to traverse the code graph from the beginning of the method for the non-exceptional execution and consider every encountered instruction as not belonging to an exception handler. For the simple task of stripping exception handlers this is already sufficient, as you simply have to retain all encountered instructions and drop all others.
In short, you will have to do an execution flow analysis. In your example:
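The forward traversal described above can be sketched on a toy model. This is a minimal illustration and not ASM code: the Insn class, the Kind enum, and the two hard-coded layouts are hypothetical stand-ins for the listings earlier in this answer, and conditional branches are modelled only loosely (both successors are followed).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

class ReachabilitySketch {
    enum Kind { PLAIN, GOTO, IF, RETURN }

    /** Hypothetical instruction model: a bytecode offset, a kind, and a branch target. */
    static final class Insn {
        final int offset;
        final Kind kind;
        final int target; // only meaningful for GOTO and IF

        Insn(int offset, Kind kind, int target) {
            this.offset = offset;
            this.kind = kind;
            this.target = target;
        }
    }

    static Insn plain(int offset) { return new Insn(offset, Kind.PLAIN, -1); }
    static Insn ret(int offset) { return new Insn(offset, Kind.RETURN, -1); }
    static Insn jump(int offset, int target) { return new Insn(offset, Kind.GOTO, target); }

    /** Offsets reachable from the first instruction when the exception table is ignored. */
    static SortedSet<Integer> reachable(List<Insn> code) {
        Map<Integer, Integer> indexOf = new HashMap<>();
        for (int i = 0; i < code.size(); i++) {
            indexOf.put(code.get(i).offset, i);
        }
        SortedSet<Integer> seen = new TreeSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        work.push(0);
        while (!work.isEmpty()) {
            int i = work.pop();
            if (i >= code.size() || !seen.add(code.get(i).offset)) {
                continue; // fell off the end, or already visited
            }
            Insn insn = code.get(i);
            switch (insn.kind) {
                case RETURN:
                    break; // no successor
                case GOTO:
                    work.push(indexOf.get(insn.target));
                    break;
                case IF:
                    work.push(indexOf.get(insn.target)); // branch taken
                    work.push(i + 1);                    // fall through
                    break;
                default:
                    work.push(i + 1);
            }
        }
        return seen;
    }

    /** The layout with two return instructions (handler at offset 5). */
    static List<Insn> twoReturns() {
        return Arrays.asList(plain(0), plain(1), ret(4),
                plain(5), plain(6), plain(7), plain(8), ret(11));
    }

    /** The layout with "goto 13" after the try block (handler at offset 7). */
    static List<Insn> gotoJoin() {
        return Arrays.asList(plain(0), plain(1), jump(4, 13),
                plain(7), plain(8), plain(9), plain(10), ret(13));
    }

    public static void main(String[] args) {
        System.out.println(reachable(twoReturns())); // [0, 1, 4]
        System.out.println(reachable(gotoJoin()));   // [0, 1, 4, 13]
    }
}
```

Everything not in the returned set is handler-only code in this model. With ASM you would apply the same idea to the instruction list of a MethodNode after dropping its try-catch entries.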
public void test() {
try { // (1) try start
System.out.println("1");
} // (2) try end
catch(Exception e) {
//optionally rethrow e. // (3) catch start
} // (4) catch end
System.out.println("2"); // (5) continue execution
}
Graphically it will look like this:
---(1)-+--(2)---------------------+
| +--(5 execution path merged)
+--(3 branched here)--(4)--+
So, you need to build a graph of the code blocks and then remove nodes related to (3) and (4). Currently ASM doesn't provide execution flow analysis tools, though some users reported that they build such tools on top of ASM's tree package.
There are relatively common situations in which the end of the catch block is easy to detect. I'm assuming here that we are using a Java compiler.
When the try block (between the start and end labels) ends with a GOTO(joinLabel). Unless the block always throws an exception, returns, or breaks/continues out of a surrounding loop, it will end with a GOTO that points to the end of the last handler.
The same holds for catch blocks that are not the last block: they jump over the handlers that follow using a GOTO, which can help you identify the end of the last handler. So if the try block does not have such a GOTO, you might find one in the other handlers.
These non-last catch blocks can be detected by comparing the handler labels of TRYCATCH instructions with the same start and end labels. The start label of the next handler acts as the (exclusive) end of the previous handler.
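The last heuristic can be sketched as follows. This is only an illustration under assumptions: the Entry class is a hypothetical stand-in for an exception-table row, plain integers stand in for label positions, and the end of the last handler is left as UNKNOWN because it has to come from a GOTO or from flow analysis.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class HandlerEndSketch {
    /** Hypothetical exception-table row: try range [start, end) and handler position. */
    static final class Entry {
        final int start;
        final int end;
        final int handler;

        Entry(int start, int end, int handler) {
            this.start = start;
            this.end = end;
            this.handler = handler;
        }
    }

    static final int UNKNOWN = -1;

    /** Maps each handler of the given try range to its exclusive end position. */
    static Map<Integer, Integer> handlerEnds(List<Entry> table, int start, int end) {
        // Collect the handlers that share the same try range, ordered by position.
        TreeSet<Integer> handlers = new TreeSet<>();
        for (Entry e : table) {
            if (e.start == start && e.end == end) {
                handlers.add(e.handler);
            }
        }
        Map<Integer, Integer> ends = new LinkedHashMap<>();
        Integer previous = null;
        for (Integer handler : handlers) {
            if (previous != null) {
                ends.put(previous, handler); // next handler starts where this one ends
            }
            previous = handler;
        }
        if (previous != null) {
            ends.put(previous, UNKNOWN); // the last handler's end is not known from the table
        }
        return ends;
    }

    public static void main(String[] args) {
        List<Entry> table = Arrays.asList(
                new Entry(0, 4, 7), new Entry(0, 4, 15), new Entry(20, 30, 33));
        System.out.println(handlerEnds(table, 0, 4)); // {7=15, 15=-1}
    }
}
```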
At the bytecode level, exception handling is essentially a goto. The code doesn't have to be structured, or even have a well defined catch block at all. And even if you are dealing only with normally compiled Java code, it is still quite complicated once you consider the possibilities of try with resources or complicated control flow structures inside the catch block.
If you just want to remove code associated with the "catch block", I would recommend simply removing the associated exception handler entry and then doing a dead code elimination pass. You can probably find an existing DCE pass somewhere (for example, Soot), or you could write your own.
I'm dumping info about all the processes running on my pc into a .txt file. To do this I execute handle.exe from my java application. The file contains all the running processes in this format:
RuntimeBroker.exe pid: 4756
4: Key HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image
8: Event
10: WaitCompletionPacket
1C: IRTimer
20: WaitCompletionPacket
24: IRTimer
28: File \Device\CNG
--
SearchIndexer.exe pid: 5616
4: Event
8: WaitCompletionPacket
C: IoCompletion
1C: IRTimer
20: File \Device\0000007s
22: Directory
I need to get the name of the process that is using a given device i.e. if I'm looping through the file searching for the string "\Device\0000007s", I need to get the name of the process and the process id which is a few lines above. Does anybody know how could I do this? The processes are delimited by a line of dashes -- in the file. Bear in mind that the file is massive, this is just an example.
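I would read each line of a process (using a Scanner) into a List<String>. Then search through the List<String> for your desired String and, if it is there, do your processing. Here is a sketch:
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class FindDeviceOwner {
    public static void main(String[] args) throws IOException {
        String target = "\\Device\\0000007s"; // backslashes must be escaped
        List<String> block = new ArrayList<>(); // lines of the current process
        // Pass a File: new Scanner(String) would scan the path text itself
        try (Scanner scanner = new Scanner(new File("path/to/file.txt"))) {
            boolean more = true;
            while (more) {
                more = scanner.hasNextLine();
                String nextLine = more ? scanner.nextLine() : "--"; // end of file closes the last block
                if (nextLine.startsWith("--")) { // delimiter: a line of dashes
                    for (String line : block) {
                        if (line.contains(target)) {
                            System.out.println(block.get(0)); // first line: name and pid
                            break;
                        }
                    }
                    block.clear();
                } else {
                    block.add(nextLine);
                }
            }
        }
    }
}
This is just a sketch. It assumes that the first line of each block is the "name pid: n" header, as in your sample, and it treats the end of the file as the delimiter of the last process. I will leave the rest of the processing up to you.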
There is probably a more optimal way of doing this, with less looping. But for simple things like this I much prefer a clear approach to an optimized one.
I just started using Wala Java Slicer to do some source code analysis tasks. I have a question about the proper use of the library. Assuming I have the following example code:
public void main(String[] args) {
...
UserType ut = userType;
int i = ut.getInt();
...
System.out.println(i);
}
Calculating a slice for the println statement with Wala gives the following statements:
NORMAL_RET_CALLER:Node: < Application, LRTExecutionClass, main([Ljava/lang/String;)V > Context: Everywhere[15]13 = invokevirtual < Application, LUserType, getInt()I > 11 #27 exception:12
NORMAL main:23 = getstatic < Application, Ljava/lang/System, out, <Application,Ljava/io/PrintStream> > Node: < Application, LRTExecutionClass, main([Ljava/lang/String;)V > Context: Everywhere
NORMAL main:invokevirtual < Application, Ljava/io/PrintStream, println(I)V > 23,13 #63 exception:24 Node: < Application, LRTExecutionClass, main([Ljava/lang/String;)V > Context: Everywhere
The code I am using to create the slice with Wala is shown below:
AnalysisScope scope = AnalysisScopeReader.readJavaScope("...",
null, WalaJavaSlicer.class.getClassLoader());
ClassHierarchy cha = ClassHierarchy.make(scope);
Iterable<Entrypoint> entrypoints = Util.makeMainEntrypoints(scope, cha);
AnalysisOptions options = new AnalysisOptions(scope, entrypoints);
// Build the call graph
CallGraphBuilder cgb = Util.makeZeroCFABuilder(options, new AnalysisCache(),cha, scope, null, null);
CallGraph cg = cgb.makeCallGraph(options, null);
PointerAnalysis pa = cgb.getPointerAnalysis();
// Find seed statement
Statement statement = findCallTo(findMainMethod(cg), "println");
// Context-sensitive thin slice
Collection<Statement> slice = Slicer.computeBackwardSlice(statement, cg, pa, DataDependenceOptions.NO_BASE_NO_HEAP, ControlDependenceOptions.NONE);
dumpSlice(slice);
There are a number of statements that I expect to find in the slice but are not present:
The assign statement ut = userType is not included even though the dependent method call ut.getInt(), IS included in the slice
No statements from the implementation of getInt() are included. Is there an option to activate "inter-procedural" slicing? I should mention here that the .class file is included in the path used to create the AnalysisScope.
As you can see, I am using DataDependenceOptions.NO_BASE_NO_HEAP and ControlDependenceOptions.NONE for the dependence options. But even when I use FULL for both, the problem persists.
What am I doing wrong?
The assign statement ut = userType is not included even though the
dependent method call ut.getInt(), IS included in the slice
I suspect that assignment never makes it into the byte code since it's an un-required local variable and hence will not be visible to WALA:
Because the SSA IR has already been somewhat optimized, some
statements such as simple assignments (x=y, y=z) do not appear in the
IR, due to copy propagation optimizations done automatically during
SSA construction by the SSABuilder class. In fact, there is no SSA
assignment instruction; additionally, a javac compiler is free to do
these optimizations, so the statements may not even appear in the
bytecode. Thus, these Java statements will never appear in the slice.
http://wala.sourceforge.net/wiki/index.php/UserGuide:Slicer#Warning:_exclusion_of_copy_statements_from_slice
I have a very strange problem with NullPointerException. Code example is as follows:
...
... public String[] getParams(...) {
... ...
... ...
143 return new String[] {
144 getUserFullName(),
145 StringUtil.formatDate(sent),
. tiltu,
. StringUtil.initCap(user.getName()),
. vuosi.toString(),
. asiatyyppi[0] + " " + lisatiedot[0],
. asiatyyppi[1] + " " + lisatiedot[1],
. alaviitteet[0],
152 alaviitteet[1]};
153 }
Now, I have got an issue from production having a stack trace:
java.lang.NullPointerException
at package.EmailService.getParams(EmailService.java:143)
...
I am unable to produce that kind of stack trace myself. It may be an environment issue where, for some reason, the line numbers don't match. If any variable is a null reference, the stack trace points to that variable's specific line, but never to line 143.
But what I want to ask is: is it possible to produce NullPointerException at line 143 specifically?
The line number in the stack trace comes from the LineNumberTable attribute in the class file. (See JVM specification)
It would be no problem to output the right line number for a subexpression - all the compiler has to do is to say that from byte-code index x to y, there is a correspondence with source code line z.
But up to and including Java 1.7 there was a bug in the compiler, that was fixed in 1.8:
https://bugs.openjdk.java.net/browse/JDK-7024096
A DESCRIPTION OF THE PROBLEM : linenumbertable in compiled code only
has line numbers for starts of statements. When a statement is using
method chaining or other "fluent Interface" style API's a single
statement can span tens of lines and contain hundreds on method
invocations.
Currently an exception throw from within any of these methods will
have the linenumber of the first line of the enclosing statement which
makes it very hard to debug which method call is having a problem.
linnumbertable should have correct line numbers for every method
invocation.
--
BT2:EVALUATION
It seems simple enough to fix this, but its kinda risky at the end of
the jdk7 development cycle, targetting to jdk8.
So in 1.7, you would get the wrong line number reported for these kinds of subexpressions (as long as the exception occurred within the same method; if a subexpression invoked another method and that other method caused the NullPointerException, you would see the line reported there, which is probably why the bug isn't always a big problem).
One way you could work around this is by taking the Java 8 compiler to compile your source code, and use the flags javac -source 1.7 -target 1.7. But it would be better and safer to upgrade your prod environment to 1.8.
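To check what your own compiler does, a small experiment along these lines may help. It is only a sketch; the class and method names are made up for illustration. Going by the bug report above, a Java 8 or later javac should report the line of the failing subexpression, while an older javac would be expected to report the line on which the statement starts.

```java
class LineNumberSketch {
    /** Returns the line number the stack trace reports for the failing subexpression. */
    static int failingLine() {
        String missing = null;
        try {
            String[] parts = new String[] {
                "first",
                "second",
                missing.trim(), // the NullPointerException is raised here
                "fourth"};
            return -1; // not reached at run time
        } catch (NullPointerException e) {
            // Top frame is this method; its line comes from the LineNumberTable.
            return e.getStackTrace()[0].getLineNumber();
        }
    }

    public static void main(String[] args) {
        System.out.println("NullPointerException reported at line " + failingLine());
    }
}
```

Compare the printed number with the line of missing.trim() and with the line of the "String[] parts" declaration in your copy of the file.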
Consider your original code which defines a new String array:
return new String[] {
getUserFullName(),
StringUtil.formatDate(sent),
tiltu,
StringUtil.initCap(user.getName()),
vuosi.toString(),
asiatyyppi[0] + " " + lisatiedot[0],
asiatyyppi[1] + " " + lisatiedot[1],
alaviitteet[0],
alaviitteet[1]};
}
If any of the elements of the inline array should trigger a NullPointerException the JVM will interpret the Exception as having occurred on the line where the definition began. In other words, the JVM will view the above code as being the same as:
return new String[] { getUserFullName(), StringUtil.formatDate(sent), tiltu, StringUtil.initCap(user.getName()), vuosi.toString(), asiatyyppi[0] + " " + lisatiedot[0], asiatyyppi[1] + " " + lisatiedot[1], alaviitteet[0], alaviitteet[1]}; }
where everything is on one line.
If you really want to handle NullPointerExceptions here, you should define the variables outside the instantiation.
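A minimal sketch of that suggestion, assuming a simplified parameter list (the class name and the three parameters are stand-ins for the original fields). Each element is evaluated in its own statement, so every possible failure sits on its own line. The Objects.requireNonNull calls are an addition of mine, not part of the original code; they put the name of the offending value into the exception message.

```java
import java.util.Objects;

class ParamsSketch {
    static String[] getParams(String userFullName, String tiltu, Integer vuosi) {
        // One statement per element: a failure points at exactly one line and one name.
        String name = Objects.requireNonNull(userFullName, "userFullName");
        String label = Objects.requireNonNull(tiltu, "tiltu");
        String year = Objects.requireNonNull(vuosi, "vuosi").toString();
        return new String[] { name, label, year };
    }

    public static void main(String[] args) {
        try {
            getParams("Full Name", "label", null);
        } catch (NullPointerException e) {
            System.out.println("null value: " + e.getMessage()); // null value: vuosi
        }
    }
}
```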
Assertions in java compile down to a private synthetic static boolean added to a test - the proposal is nicely documented here:
JSR Assertion Proposal
In it, we create
final private static boolean $assertionsEnabled =
ClassLoader.desiredAssertionStatus(className);
and then assert(X) becomes if ($assertionsEnabled && !X) { throw }
Which makes perfect sense ;)
However, I've noticed that what I actually get is
public void test1(String s) {
assert (!s.equals("Fred"));
System.out.println(s);
}
becomes
static final /* synthetic */ boolean $assertionsDisabled;
public void test1(String s) {
if ((!(AssertTest.$assertionsDisabled)) && (s.equals("Fred"))) {
throw new AssertionError();
}
System.out.println(s);
}
static {
AssertTest.$assertionsDisabled = !(AssertTest.class.desiredAssertionStatus());
}
I can't find any documentation as to why they went with a NEGATIVE test, rather than a positive test - i.e. the original proposal captured assertionsENABLED, now we use assertionsDISABLED.
The only thing I can think of is that this would possibly (POSSIBLY!) generate better branch prediction, but that seems like a pretty lame guess to me - the Java philosophy is (almost) always to make the bytecode simple, and let the JIT sort out optimisations.
( note that this isn't a question about how assertions work - I know that! :) )
( As an aside, it's quite interesting to see that this leads to incorrect tutorials! 6.2.1 of this tutorial, which someone quoted in response to a previous SO question on assertions gets the sense of the test wrong! :)
Any ideas?
This is actually done for a reason, not merely for something like a more compact bytecode or faster condition execution. If you look at the Java Language Specification, §14.10, you will see the following note:
An assert statement that is executed before its class or interface has completed initialization is enabled.
There's also an example that contains an initialization loop:
class Bar {
static {
Baz.testAsserts();
// Will execute before Baz is initialized!
}
}
class Baz extends Bar {
static void testAsserts() {
boolean enabled = false;
assert enabled = true;
System.out.println("Asserts " +
(enabled ? "enabled" : "disabled"));
}
}
As Bar is a superclass of Baz, it must be initialized before Baz is initialized. However, its initializer executes an assertion statement in the context of the Baz class, which is not initialized yet, so there was no chance to set the $assertionsDisabled field. In this case, the field has its default value (false), and everything works according to the spec: the assertion is executed. Had we had an $assertionsEnabled field instead, the assertions for an uninitialized class would not be executed, which would violate the specification.
The boolean is actually implemented with an integer. There is a common belief that comparison with zero is quicker, but I don't see any reason to use disabled instead of enabled.
IMHO, as false is the default for boolean, I try to choose a flag which has a default value of false. In this case $assertionsEnabled would make more sense.
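The default-value effect can be reproduced without assertions at all. The following is a sketch under assumptions: the ASSERTIONS_DISABLED field is a hand-written stand-in for the synthetic flag, set to true as if assertions had been switched off, and the outer class and the seenEarly list exist only to record what Bar's initializer observes.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderSketch {
    /** Values of the flag as seen from Bar's static initializer. */
    static final List<Boolean> seenEarly = new ArrayList<>();

    static class Bar {
        static {
            // Runs while Baz is still waiting for its own initializer.
            seenEarly.add(Baz.assertionsDisabled());
        }
    }

    static class Baz extends Bar {
        // Stand-in for the synthetic flag. The method call keeps it from being
        // a compile-time constant, so it really is assigned during initialization.
        static final boolean ASSERTIONS_DISABLED = Boolean.parseBoolean("true");

        static boolean assertionsDisabled() {
            return ASSERTIONS_DISABLED;
        }
    }

    /** Returns { value seen from Bar's initializer, value seen after initialization }. */
    static boolean[] observe() {
        boolean late = Baz.assertionsDisabled(); // triggers Bar's, then Baz's initializer
        return new boolean[] { seenEarly.get(0), late };
    }

    public static void main(String[] args) {
        boolean[] result = observe();
        System.out.println("early=" + result[0] + " late=" + result[1]); // early=false late=true
    }
}
```

The early read sees false, that is "not disabled", which is exactly the behaviour the specification asks for. A flag named "enabled" would read false at the same point and silently skip the assertion.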
Although the decompiled Java source makes it look as if redundant work is being done, you can't rely on that; you need to look at the bytecode level.
Have a look at the bytecode that both the Eclipse compiler and Oracle javac produce:
#0: getstatic Test.$assertionsDisabled
#3: ifne #23
(assertion code) #6: aload_1
(assertion code) #7: ldc "Fred"
(assertion code) #9: invokevirtual String.equals(Object)
(assertion code) #12: ifeq #23
(assertion code) #15: new AssertionError
(assertion code) #18: dup
(assertion code) #19: invokespecial AssertionError.<init>()
(assertion code) #22: athrow
#23: getstatic System.out (PrintStream)
#26: aload_1
#27: invokevirtual PrintStream.println(String)
#30: return
Please note byte code #3 - it doesn't need to invert the Test.$assertionsDisabled value, it only needs to perform a single negative test (i.e. it doesn't make any difference if it's a negative test or a positive test at the byte code level)
In summary, it's being implemented efficiently and without any redundant work being performed.
I was experimenting with enum, and I found that the following compiles and runs fine on Eclipse (Build id: 20090920-1017, not sure exact compiler version):
public class SwitchingOnAnull {
enum X { ,; }
public static void main(String[] args) {
X x = null;
switch(x) {
default: System.out.println("Hello world!");
}
}
}
When compiled and run with Eclipse, this prints "Hello world!" and exits normally.
With the javac compiler, this throws a NullPointerException as expected.
So is there a bug in Eclipse Java compiler?
This is a bug. Here's the specified behavior for a switch statement according to the Java Language Specification, 3rd Edition:
JLS 14.11 The switch Statement
SwitchStatement:
switch ( Expression ) SwitchBlock
When the switch statement is executed, first the Expression is evaluated. If the Expression evaluates to null, a NullPointerException is thrown and the entire switch statement completes abruptly for that reason.
Apparently the bug in Eclipse has nothing to do with default case or enum at all.
public class SwitchingOnAnull {
public static void main(String[] args) {
java.math.RoundingMode x = null;
switch(x) {};
switch((Integer) null) {};
switch((Character) null) {
default: System.out.println("I've got sunshine!");
}
}
}
The above code compiles and runs "fine" on (at least some version of) Eclipse. Each individual switch throws a NullPointerException when compiled with javac, which is exactly as the specification mandates.
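To check which behavior a given compiler produces, a small probe like the following can be used. It is a sketch with made-up names; when compiled with javac, the first call should report the NullPointerException, as the specification requires.

```java
class SwitchOnNullSketch {
    /** Reports whether switching on x ran the default branch or threw. */
    static String run(Integer x) {
        String result;
        try {
            switch (x) { // evaluating the switch on a null Integer must throw
                default:
                    result = "default branch ran";
                    break;
            }
        } catch (NullPointerException e) {
            result = "NullPointerException";
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(null)); // spec-compliant compilers: NullPointerException
        System.out.println(run(1));    // default branch ran
    }
}
```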
The cause
Here's javap -c SwitchingOnAnull when compiled under Eclipse:
Compiled from "SwitchingOnAnull.java"
public class SwitchingOnAnull extends java.lang.Object{
public SwitchingOnAnull();
Code:
0: aload_0
1: invokespecial #8; //Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: aconst_null
1: astore_1
2: getstatic #16; //Field java/lang/System.out:Ljava/io/PrintStream;
5: ldc #22; //String I've got sunshine!
7: invokevirtual #24; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
10: return
}
It seems that the Eclipse compiler removes the switch constructs entirely. Unfortunately this optimization breaks the language specification.
The official words
The bug has been filed and assigned for fix.
Olivier Thomann 2010-05-28 08:37:21 EDT
We are too aggressive on the optimization.
For:
switch((Integer) null) {};
we optimize out the whole switch statement when we should at least evaluate the
expression.
I'll take a look.
Candidate for 3.6.1.
See also
Bug 314830 - Switching on a null expression doesn't always throw NullPointerException
Definitely. If we look at chapter 14.11 of the Java Language Specification, it clearly states (under 'discussion'):
The prohibition against using null as
a switch label prevents one from
writing code that can never be
executed. If the switch expression is
of a reference type, such as a boxed
primitive type or an enum, a run-time
error will occur if the expression
evaluates to null at run-time.
Yep. According to the JLS it's a bug:
If the switch expression is of a reference type, such as a boxed primitive type or an enum, a run-time error will occur if the expression evaluates to null at run-time.