Java assertions - $assertionsDisabled vs $assertionsEnabled - java

Assertions in Java compile down to a check of a private synthetic static boolean - the proposal is nicely documented here:
JSR Assertion Proposal
In it, we create
final private static boolean $assertionsEnabled =
ClassLoader.desiredAssertionStatus(className);
and then assert(X) becomes if ($assertionsEnabled && !X) { throw }
Which makes perfect sense ;)
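For reference, the status that the synthetic field is derived from can be observed directly. Here is a small self-contained sketch (class and variable names are my own) using the classic assert-with-side-effect idiom; run it with and without -ea and the two values always agree:

```java
// Demonstrates that desiredAssertionStatus() matches what assert actually
// does at runtime. Class and variable names are illustrative only.
public class AssertStatusDemo {
    public static void main(String[] args) {
        // The value the compiler-generated field is initialized from:
        boolean desired = AssertStatusDemo.class.desiredAssertionStatus();

        // Classic idiom: the assignment only executes if assertions are live.
        boolean enabled = false;
        assert enabled = true;

        System.out.println("desiredAssertionStatus: " + desired);
        System.out.println("assert executed:        " + enabled);
    }
}
```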
However, I've noticed that what I actually get is
public void test1(String s) {
    assert (!s.equals("Fred"));
    System.out.println(s);
}
becomes
static final /* synthetic */ boolean $assertionsDisabled;

public void test1(String s) {
    if ((!(AssertTest.$assertionsDisabled)) && (s.equals("Fred"))) {
        throw new AssertionError();
    }
    System.out.println(s);
}

static {
    AssertTest.$assertionsDisabled = !(AssertTest.class.desiredAssertionStatus());
}
I can't find any documentation as to why they went with a NEGATIVE test, rather than a positive test - i.e. the original proposal captured assertionsENABLED, now we use assertionsDISABLED.
The only thing I can think of is that this would possibly (POSSIBLY!) generate better branch prediction, but that seems like a pretty lame guess to me - the Java philosophy is (almost) always to make the bytecode simple, and let the JIT sort out optimisations.
( note that this isn't a question about how assertions work - I know that! :) )
(As an aside, it's quite interesting to see that this leads to incorrect tutorials! Section 6.2.1 of this tutorial, which someone quoted in response to a previous SO question on assertions, gets the sense of the test wrong! :)
Any ideas?

This is actually done for a reason, not merely for something like a more compact bytecode or faster condition execution. If you look at the Java Language Specification, §14.10, you will see the following note:
An assert statement that is executed before its class or interface has completed initialization is enabled.
There's also an example that contains an initialization loop:
class Bar {
    static {
        Baz.testAsserts();
        // Will execute before Baz is initialized!
    }
}

class Baz extends Bar {
    static void testAsserts() {
        boolean enabled = false;
        assert enabled = true;
        System.out.println("Asserts " +
                (enabled ? "enabled" : "disabled"));
    }
}
As Bar is a superclass of Baz, it must be initialized before Baz. However, its initializer executes an assert statement in the context of the Baz class, which is not initialized yet, so there was no chance to set the $assertionsDisabled field. In this case, the field has its default value, false, and everything works according to the spec: the assertion is executed. Had there been an $assertionsEnabled field instead, its default value of false would mean that assertions in an uninitialized class were not executed, which would violate the specification.
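The effect is easy to reproduce without decompiling anything. The sketch below (my own class names; the field `disabled` stands in for the synthetic $assertionsDisabled) recreates the initialization loop and shows that while the subclass is still being initialized, the field's default value of false reads as "assertions enabled":

```java
// Recreates the JLS initialization-loop example with an explicit stand-in
// field. All names here are illustrative, not the real synthetic ones.
public class InitLoopDemo {
    static class Bar {
        static boolean sawEnabledDuringInit;
        static {
            // Runs while Baz is still marked "in progress": the field below
            // holds its default, false, which reads as "assertions enabled".
            sawEnabledDuringInit = Baz.assertionsLookEnabled();
        }
    }

    static class Baz extends Bar {
        // Stand-in for the compiler-generated $assertionsDisabled field.
        static final boolean disabled;
        static {
            disabled = !Baz.class.desiredAssertionStatus();
        }
        static boolean assertionsLookEnabled() {
            return !disabled;
        }
    }

    public static void main(String[] args) {
        new Baz(); // triggers Baz init -> Bar init -> the call above
        System.out.println("Looked enabled during init: "
                + Bar.sawEnabledDuringInit); // true, with or without -ea
    }
}
```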

The boolean is actually implemented with an integer. There is a common belief that comparison with zero is quicker, but I don't see any reason to use disabled instead of enabled.
IMHO, since false is the default value for a boolean, I try to choose a flag whose default is false; in this case $assertionsEnabled would make more sense.

Although it looks like there's redundant work being done when you look at the decompiled Java source, you can't rely on that - you need to look at the bytecode level.
Have a look at the bytecode that both the Eclipse compiler and Oracle javac produce:
#0: getstatic Test.$assertionsDisabled
#3: ifne #23
(assertion code) #6: aload_1
(assertion code) #7: ldc "Fred"
(assertion code) #9: invokevirtual String.equals(Object)
(assertion code) #12: ifeq #23
(assertion code) #15: new AssertionError
(assertion code) #18: dup
(assertion code) #19: invokespecial AssertionError.<init>()
(assertion code) #22: athrow
#23: getstatic System.out (PrintStream)
#26: aload_1
#27: invokevirtual PrintStream.println(String)
#30: return
Please note bytecode offset #3 - it doesn't need to invert the Test.$assertionsDisabled value; it only needs to perform a single negative test (i.e. it makes no difference whether it's a negative test or a positive test at the bytecode level).
In summary, it's being implemented efficiently and without any redundant work being performed.

Related

why getSimpleName() is twice in com.sun.tools.javac.tree.JCTree$JCClassDecl

I had a weird bug in application code (an annotation processor), and I found that the root cause was that the class com.sun.tools.javac.tree.JCTree$JCClassDecl contains the method getSimpleName() twice when I query the class using the reflective method getMethods(). The two versions differ only in their return type. This is legal in JVM bytecode, but not in Java source. It is not method overloading, because only the return type differs, and the return type is not part of the method signature.
The issue can be demonstrated with the simple code:
Method[] methods = com.sun.tools.javac.tree.JCTree.JCClassDecl.class.getMethods();
for (int i = 0; i < methods.length; i++) {
    System.out.println(methods[i]);
}
It will print
...
public javax.lang.model.element.Name com.sun.tools.javac.tree.JCTree$JCClassDecl.getSimpleName()
public com.sun.tools.javac.util.Name com.sun.tools.javac.tree.JCTree$JCClassDecl.getSimpleName()
...
(The ellipsis stands for more output lines showing the various other methods that are not interesting for us now.)
The Java version I used to test this is
$ java -version
java version "11" 2018-09-25
Java(TM) SE Runtime Environment 18.9 (build 11+28)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11+28, mixed mode)
on a Windows 10 machine.
QUESTION: How was this class code created? My understanding is that this part of the code is written in Java, but in Java this is not possible. Also: what is the aim to have two same-signature versions of a method? Any hint?
If you look at the source code [1] you'll see there's only one method with the name getSimpleName(). This method returns com.sun.tools.javac.util.Name. There are two critical things to note about this:
That method is actually overriding com.sun.source.tree.ClassTree#getSimpleName() which is declared to return javax.lang.model.element.Name.
The com.sun.tools.javac.util.Name abstract class implements the javax.lang.model.element.Name interface, and since the overridden method returns the former it is taking advantage of covariant return types.
According to this Oracle blog, a method which overrides another but declares a covariant return type is implemented using bridge methods.
How is this implemented?
Although return-type-based overloading is not allowed by the Java language, the JVM has always allowed it. The JVM uses the full signature of a method for lookup/resolution. The full signature includes the return type in addition to the argument types; i.e., a class can have two or more methods differing only by return type. javac uses this fact to implement covariant return types. In the above CircleFactory example, javac generates code which is equivalent to the following:
class CircleFactory extends ShapeFactory {
    public Circle newShape() {
        // your code from the source file
        return new Circle();
    }

    // javac generated method in the .class file
    public Shape newShape() {
        // call the other newShape method here -- invokevirtual newShape:()LCircle;
    }
}
We can use javap with -c option on the class to verify this. Note that we still can't use return type based overloading in source language. But, this is used by javac to support covariant return types. This way, there is no change needed in the JVM to support covariant return types.
And in fact, if you run the following command:
javap -v com.sun.tools.javac.tree.JCTree$JCClassDecl
The following will be output (only including the relevant methods):
public com.sun.tools.javac.util.Name getSimpleName();
descriptor: ()Lcom/sun/tools/javac/util/Name;
flags: (0x0001) ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: getfield #13 // Field name:Lcom/sun/tools/javac/util/Name;
4: areturn
LineNumberTable:
line 801: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Lcom/sun/tools/javac/tree/JCTree$JCClassDecl;
And:
public javax.lang.model.element.Name getSimpleName();
descriptor: ()Ljavax/lang/model/element/Name;
flags: (0x1041) ACC_PUBLIC, ACC_BRIDGE, ACC_SYNTHETIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokevirtual #96 // Method getSimpleName:()Lcom/sun/tools/javac/util/Name;
4: areturn
LineNumberTable:
line 752: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Lcom/sun/tools/javac/tree/JCTree$JCClassDecl;
As you can see, the second method, the one which returns javax.lang.model.element.Name, is both synthetic and a bridge. In other words, the method is generated by the compiler as part of the implementation of covariant return types. It also simply delegates to the "real" method, the one actually present in the source code which returns com.sun.tools.javac.util.Name.
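The same mechanism can be reproduced with a couple of toy classes and verified via reflection's Method.isBridge(). The class names below echo the quoted blog example but are my own:

```java
import java.lang.reflect.Method;

// Reproduces the bridge-method mechanism: a covariant override makes javac
// emit a second, synthetic newShape() with the supertype's return type.
public class BridgeDemo {
    static class Shape {}
    static class Circle extends Shape {}

    static class ShapeFactory {
        Shape newShape() { return new Shape(); }
    }

    static class CircleFactory extends ShapeFactory {
        @Override
        Circle newShape() { return new Circle(); } // covariant return type
    }

    public static void main(String[] args) {
        // Prints two newShape() entries; exactly one is a synthetic bridge.
        for (Method m : CircleFactory.class.getDeclaredMethods()) {
            if (!m.getName().equals("newShape")) continue;
            System.out.println(m.getReturnType().getSimpleName()
                    + " newShape()  bridge=" + m.isBridge()
                    + "  synthetic=" + m.isSynthetic());
        }
    }
}
```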
[1] The source code link is for JDK 13.

Byte code difference in #-values after change Ant to Maven

I changed the compilation of a project from Ant to Maven. We still use Java 1.7.0_80. The ByteCode of all classes stayed the same except for a Comparator class. I disassembled the class with javap -c. The only differences are in #-values, e.g.:
before: invokevirtual #2
after: invokevirtual #29
What do these #-values mean? Is this a functional difference or just a difference in naming?
EDIT: The first method in the bytecode, before and after:
Before:
public de.continentale.kvrl.model.comp.KvRLRechBetragEuroComparator();
  Code:
    0: aload_0
    1: invokespecial #1    // Method java/lang/Object."<init>":()V
    4: return
After:
public de.continentale.kvrl.model.comp.KvRLRechBetragEuroComparator();
  Code:
    0: aload_0
    1: invokespecial #21   // Method java/lang/Object."<init>":()V
    4: return
The #29 refers to an index into the constant pool that describes the method being invoked. It would have been better if you had pasted the full output of javap -c, because the comment at the end of each line shows the actual method descriptor.
Example of javap -c on an invocation of an instance method test in class Z:
7: invokevirtual #19 // Method snippet/Z.test:()V
You can ignore the # number, as long as the method descriptor in the comment is the same.
Why
As to the "why" question, it is very likely caused by different debug settings in the Ant and Maven compiler task/plugin.
By default, the maven-compiler-plugin compiles with the debug configuration option set to true, which attaches, among other things, the name of the source file, line number tables, and local variable names.
https://maven.apache.org/plugins/maven-compiler-plugin/compile-mojo.html
By default, the ant java compiler has this set to false.
https://ant.apache.org/manual/Tasks/javac.html
To compare and check, set the debug parameter of the Ant Javac task to true and see if that makes the generated .class files the same.
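For that experiment, the Ant side might look like the following build-file fragment (srcdir/destdir are placeholders; debug and debuglevel are the attributes that matter):

```xml
<!-- Hypothetical fragment of build.xml. debug="true" makes Ant emit the
     same debug information that the maven-compiler-plugin emits by default. -->
<javac srcdir="src" destdir="build/classes"
       debug="true" debuglevel="lines,vars,source"
       includeantruntime="false"/>
```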

JVM Verify Error 'Illegal type at constant pool'

I am currently writing my own compiler and I am trying to compile the following code:
List[String] list = List("a", "b", "c", "d")
list stream map((String s) => s.toUpperCase())
System out println list
The compiler has no problem parsing, linking or compiling the code, but when it comes to executing it, the JVM throws the following error:
java.lang.VerifyError: Illegal type at constant pool entry 40 in class dyvil.test.Main
Exception Details:
Location:
dyvil/test/Main.main([Ljava/lang/String;)V #29: invokevirtual
Reason:
Constant pool index 40 is invalid
Bytecode:
...
I tried to use javap to find the problem, and this is instruction #29:
29: invokevirtual #40 // InterfaceMethod java/util/Collection.stream:()Ljava/util/stream/Stream;
And the entry in the Constant Pool (also using javap):
#37 = Utf8 stream
#38 = Utf8 ()Ljava/util/stream/Stream;
#39 = NameAndType #37:#38 // stream:()Ljava/util/stream/Stream;
#40 = InterfaceMethodref #36.#39 // java/util/Collection.stream:()Ljava/util/stream/Stream;
When opening the class with the Eclipse Class File Viewer, the line where #29 should be reads:
Class Format Exception
and all following instructions are not shown anymore (except for Locals, ...). However, the ASM Bytecode Plugin writes
INVOKEVIRTUAL java/util/Collection.stream ()Ljava/util/stream/Stream;
at that line, which appears to be valid. What am I doing wrong / missing here?
I figured out my mistake. The error lies here:
invokevirtual #40 // InterfaceMethod
^^^^^^^ ^^^^^^^^^
I am using invokevirtual on an interface method, which is generally not a good idea. However, I think the error thrown by the verifier should be a bit clearer about what is actually wrong.
With ASM you will see this error because of an incorrect isInterface flag. The isInterface argument of visitMethodInsn refers to the target/owner/destination, not to the current context.
In other words, when using INVOKEVIRTUAL, isInterface must be false.
see issue here for more details.

Determine where a catch block ends ASM

In ASM, I'm trying to determine the labels for a try-catch block.
Currently I have:
public void printTryCatchLabels(MethodNode method) {
    if (method.tryCatchBlocks != null) {
        for (int i = 0; i < method.tryCatchBlocks.size(); ++i) {
            Label start = method.tryCatchBlocks.get(i).start.getLabel();
            Label end = method.tryCatchBlocks.get(i).end.getLabel();
            Label catch_start = method.tryCatchBlocks.get(i).handler.getLabel();
            System.out.println("try { " + start.toString());
            System.out.println("} " + end.toString());
            System.out.println("catch { " + catch_start.toString());
            System.out.println("} " /*where does the catch block end?*/);
        }
    }
}
I'm trying to determine where the label is for the end of the catch block but I don't know how. Why do I need it? Because I want to "remove" try-catch blocks from the byte-code.
For example, I am trying to change:
public void test() {
    try {
        System.out.println("1");
    } catch (Exception e) {
        //optionally rethrow e.
    }
    System.out.println("2");
}
to:
public void test() {
    System.out.println("1");
    System.out.println("2");
}
So to remove it, I thought that I could just get the labels and remove all instructions between the catch-start and the catch-end and then remove all the labels.
Any ideas?
I recommend reading the JVM Spec §3.12, Throwing and Handling Exceptions. It contains an example that is very simple but still exhibits the problems with your idea:
Compilation of try-catch constructs is straightforward. For example:
void catchOne() {
    try {
        tryItOut();
    } catch (TestExc e) {
        handleExc(e);
    }
}
is compiled as:
Method void catchOne()
0 aload_0 // Beginning of try block
1 invokevirtual #6 // Method Example.tryItOut()V
4 return // End of try block; normal return
5 astore_1 // Store thrown value in local var 1
6 aload_0 // Push this
7 aload_1 // Push thrown value
8 invokevirtual #5 // Invoke handler method:
// Example.handleExc(LTestExc;)V
11 return // Return after handling TestExc
Exception table:
From To Target Type
0 4 5 Class TestExc
Here, the catch block ends with a return instruction, thus does not join with the original code flow. This, however, is not a required behavior. Instead, the compiled code could have a branch to the last return instruction in place of the 4 return instruction, i.e.
Method void catchOne()
0: aload_0
1: invokevirtual #6 // Method tryItOut:()V
4: goto 13
7: astore_1
8: aload_0
9: aload_1
10: invokevirtual #5 // Method handleExc:(LTestExc;)V
13: return
Exception table:
From To Target Type
0 4 7 Class TestExc
(e.g. at least one Eclipse version compiled the example exactly this way)
But it could also be vice versa, having a branch to instruction 4 in place of the last return instruction.
Method void catchOne()
0 aload_0
1 invokevirtual #6 // Method Example.tryItOut()V
4 return
5 astore_1
6 aload_0
7 aload_1
8 invokevirtual #5 // Method Example.handleExc(LTestExc;)V
11 goto 4
Exception table:
From To Target Type
0 4 5 Class TestExc
So you already have three possibilities to compile this simple example which doesn’t contain any conditionals. The conditional branches associated with loops or if instructions do not necessarily point to the instruction right after a conditional block of code. If that block of code would be followed by another flow control instruction, the conditional branch (the same applies to switch targets) could short-circuit the branch.
So it’s very hard to determine which code belongs to a catch block. On the byte code level, it doesn’t even have to be a contiguous block but may be interleaved with other code.
And at this time we didn’t even speak about compiling finally and synchronized or the newer try(…) with resources statement. They all end up creating exception handlers that look like catch blocks on the byte code level.
Since branch instructions within the exception handler might target code outside of the handler when recovering from the exception, traversing the code graph of an exception handler doesn’t help here as processing the branch instruction correctly requires the very information about the branch target you actually want to gather.
So the only way to handle this task is to do the opposite. You have to traverse the code graph from the beginning of the method for the non-exceptionally execution and consider every encountered instruction as not belonging to an exception handler. For the simple task of stripping exception handlers this is already sufficient as you simply have to retain all encountered instructions and drop all others.
In short, you will have to do an execution-flow analysis. In your example:
public void test() {
    try {                        // (1) try start
        System.out.println("1");
    }                            // (2) try end
    catch (Exception e) {
        //optionally rethrow e.  // (3) catch start
    }                            // (4) catch end
    System.out.println("2");     // (5) continue execution
}
Graphically it will look like this:
---(1)-+--(2)---------------------+
| +--(5 execution path merged)
+--(3 branched here)--(4)--+
So, you need to build a graph of the code blocks and then remove nodes related to (3) and (4). Currently ASM doesn't provide execution flow analysis tools, though some users reported that they build such tools on top of ASM's tree package.
There are relatively common situations in which the end of the catch block is easy to detect. I'm assuming here that we are using a Java compiler.
when the try block (between the start and end labels) ends with a GOTO(joinLabel). Unless the block always throws, returns, or breaks/continues out of a surrounding loop, it will end with a GOTO that points to the end of the last handler.
the same goes for catch blocks that are not the last handler: they jump over the handlers that follow using a GOTO, which can help you identify the end of the last handler. So if the try block does not have such a GOTO, you might find one in the other handlers.
these non-last catch blocks can be detected by comparing the handler labels of TRYCATCH entries that share the same start and end labels. The start label of the next handler acts as the (exclusive) end of the previous one.
At the bytecode level, exception handling is essentially a goto. The code doesn't have to be structured, or even have a well-defined catch block at all. And even if you are dealing only with normally compiled Java code, it is still quite complicated once you consider the possibilities of try-with-resources or complicated control-flow structures inside the catch block.
If you just want to remove code associated with the "catch block", I would recommend simply removing the associated exception handler entry and then doing a dead code elimination pass. You can probably find an existing DCE pass somewhere (for example, Soot), or you could write your own.
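The strip-then-DCE idea can be sketched with a toy control-flow model. The instruction encoding below is invented for illustration (a real pass would walk ASM's instruction list): once the exception-table entry is dropped, the handler's instructions are unreachable from the method entry and can be deleted.

```java
import java.util.*;

// Toy sketch: model each instruction offset as a graph node, then keep only
// what is reachable from the method entry after removing the handler edge.
public class StripHandlersSketch {

    // offset -> successor offsets (fall-through and branch targets)
    static Set<Integer> reachable(Map<Integer, List<Integer>> successors, int entry) {
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        work.push(entry);
        while (!work.isEmpty()) {
            int pc = work.pop();
            if (!seen.add(pc)) continue;
            for (int next : successors.getOrDefault(pc, List.of())) {
                work.push(next);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Mirrors the catchOne() example: 0 -> 1 -> 4 (return). The handler
        // at 5..11 was only entered via the (now removed) exception table.
        Map<Integer, List<Integer>> cfg = new HashMap<>();
        cfg.put(0, List.of(1));
        cfg.put(1, List.of(4));
        cfg.put(4, List.of());    // return
        cfg.put(5, List.of(6));   // handler start, no incoming normal edge
        cfg.put(6, List.of(7));
        cfg.put(7, List.of(8));
        cfg.put(8, List.of(11));
        cfg.put(11, List.of());   // handler's return

        System.out.println(new TreeSet<>(reachable(cfg, 0))); // prints [0, 1, 4]
        // Offsets 5..11 are absent from the result: safe to delete.
    }
}
```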

Eclipse bug? Switching on a null with only default case

I was experimenting with enum, and I found that the following compiles and runs fine on Eclipse (Build id: 20090920-1017; I'm not sure of the exact compiler version):
public class SwitchingOnAnull {
    enum X { ,; }

    public static void main(String[] args) {
        X x = null;
        switch (x) {
            default: System.out.println("Hello world!");
        }
    }
}
When compiled and run with Eclipse, this prints "Hello world!" and exits normally.
With the javac compiler, this throws a NullPointerException as expected.
So is there a bug in Eclipse Java compiler?
This is a bug. Here's the specified behavior for a switch statement according to the Java Language Specification, 3rd Edition:
JLS 14.11 The switch Statement
SwitchStatement:
switch ( Expression ) SwitchBlock
When the switch statement is executed, first the Expression is evaluated. If the Expression evaluates to null, a NullPointerException is thrown and the entire switch statement completes abruptly for that reason.
Apparently the bug in Eclipse has nothing to do with default case or enum at all.
public class SwitchingOnAnull {
    public static void main(String[] args) {
        java.math.RoundingMode x = null;
        switch (x) {};
        switch ((Integer) null) {};
        switch ((Character) null) {
            default: System.out.println("I've got sunshine!");
        }
    }
}
The above code compiles and runs "fine" on (at least some version of) Eclipse. Each individual switch throws a NullPointerException when compiled with javac, which is exactly as the specification mandates.
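The javac behavior is easy to confirm in isolation (a minimal sketch; the class name is mine). Unboxing the null switch expression throws before any case is considered:

```java
// Confirms the specified behavior: switching on a null reference throws
// NullPointerException, even when only a default case is present.
public class NullSwitchDemo {
    public static void main(String[] args) {
        Integer x = null;
        try {
            switch (x) {
                default: System.out.println("never reached");
            }
        } catch (NullPointerException e) {
            System.out.println("NPE, as the JLS requires");
        }
    }
}
```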
The cause
Here's javap -c SwitchingOnAnull when compiled under Eclipse:
Compiled from "SwitchingOnAnull.java"
public class SwitchingOnAnull extends java.lang.Object{
public SwitchingOnAnull();
Code:
0: aload_0
1: invokespecial #8; //Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: aconst_null
1: astore_1
2: getstatic #16; //Field java/lang/System.out:Ljava/io/PrintStream;
5: ldc #22; //String I've got sunshine!
7: invokevirtual #24; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
10: return
}
It seems that the Eclipse compiler gets rid of the switch constructs entirely. Unfortunately this optimization breaks the language specification.
The official words
The bug has been filed and assigned for fix.
Olivier Thomann 2010-05-28 08:37:21 EDT
We are too aggressive on the optimization.
For:
switch((Integer) null) {};
we optimize out the whole switch statement when we should at least evaluate the
expression.
I'll take a look.
Candidate for 3.6.1.
See also
Bug 314830 - Switching on a null expression doesn't always throw NullPointerException
Definitely. If we look at chapter 14.11 of the Java Language Specification, it clearly states (under 'discussion'):
The prohibition against using null as a switch label prevents one from writing code that can never be executed. If the switch expression is of a reference type, such as a boxed primitive type or an enum, a run-time error will occur if the expression evaluates to null at run-time.
Yep. According to the JLS it's a bug:
If the switch expression is of a reference type, such as a boxed primitive type or an enum, a run-time error will occur if the expression evaluates to null at run-time.
