Currently I am tinkering with JVM bytecode instructions. I wrote a simple compiler that, given source code in a C-like style, generates a valid JVM bytecode representation. For example, the following code:
float x = 3;
float y = 4.5;
float z = x + y;
print z;
Compiles to:
ldc 3
i2f
fstore 1
ldc 4.5
fstore 2
fload 1
fload 2
fadd
fstore 3
getstatic java/lang/System/out Ljava/io/PrintStream;
fload 3
invokevirtual java/io/PrintStream/println(F)V
return
(I know the generated bytecode is not the most efficient as of now, but that is not the point.)
Using a Java bytecode editor, I loaded a compiled main class and replaced the main method's code with my own. After that, I was able to run the class file with my code perfectly fine. My question is: is there a tool or script without a UI that can take Java bytecode and generate the appropriate headers for the class file (in other words, take the bytecode and make a valid class file out of it)? I guess I could write a script myself, but that would take time that I might not have right now.
The Krakatau assembler allows you to write bytecode in a textual format and assembles it into a classfile, handling all the binary encoding details for you.
It's similar to the older Jasmin assembler, but with minor syntax changes in order to remove ambiguity and to support classfile features that Jasmin can't handle. Unlike Jasmin, it fully supports the entire Java 8 classfile format and optionally allows full control over the binary representation of the classfile.
For example, here's a class using lambdas in Krakatau assembly format.
.version 52 0
.class public super LambdaTest1
.super java/lang/Object
.method public <init> : ()V
.code stack 1 locals 1
aload_0
invokespecial Method java/lang/Object <init> ()V
return
.end code
.end method
.method public static varargs main : ([Ljava/lang/String;)V
.code stack 4 locals 2
invokedynamic InvokeDynamic invokeStatic Method java/lang/invoke/LambdaMetafactory metafactory (Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite; MethodType (J)J MethodHandle invokeStatic Method LambdaTest1 lambda$main$0 (J)J MethodType (J)J : applyAsLong ()Ljava/util/function/LongUnaryOperator;
astore_1
getstatic Field java/lang/System out Ljava/io/PrintStream;
aload_1
ldc2_w 42L
invokeinterface InterfaceMethod java/util/function/LongUnaryOperator applyAsLong (J)J 3
invokevirtual Method java/io/PrintStream println (J)V
return
.end code
.end method
.method private static synthetic lambda$main$0 : (J)J
.code stack 4 locals 2
lload_0
lload_0
l2i
lshl
lreturn
.end code
.end method
.innerclasses
java/lang/invoke/MethodHandles$Lookup java/lang/invoke/MethodHandles Lookup public static final
.end innerclasses
.end class
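To give a rough idea of the "binary encoding details" an assembler takes care of, here is a minimal sketch in plain Java that writes the class-file header, constant pool and a single method by hand. The class name Hello, the helper MinimalClassWriter and the trivial method body (a lone return) are assumptions made for illustration; they are not part of Krakatau or Jasmin, and a real tool handles far more (constant pool deduplication, stack maps, attributes).

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only; not part of Krakatau or Jasmin.
final class MinimalClassWriter {

    // Builds the bytes of a class named "Hello" whose static main only returns.
    static byte[] build() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);

            out.writeInt(0xCAFEBABE);            // magic
            out.writeShort(0);                   // minor_version
            out.writeShort(52);                  // major_version (Java 8)

            out.writeShort(8);                   // constant_pool_count = entries + 1
            utf8(out, "Hello");                  // #1
            classRef(out, 1);                    // #2 this class
            utf8(out, "java/lang/Object");       // #3
            classRef(out, 3);                    // #4 superclass
            utf8(out, "main");                   // #5
            utf8(out, "([Ljava/lang/String;)V"); // #6
            utf8(out, "Code");                   // #7

            out.writeShort(0x0021);              // access_flags: ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                   // this_class
            out.writeShort(4);                   // super_class
            out.writeShort(0);                   // interfaces_count
            out.writeShort(0);                   // fields_count

            out.writeShort(1);                   // methods_count
            out.writeShort(0x0009);              // ACC_PUBLIC | ACC_STATIC
            out.writeShort(5);                   // name_index -> "main"
            out.writeShort(6);                   // descriptor_index
            out.writeShort(1);                   // attributes_count of the method

            byte[] code = { (byte) 0xB1 };       // opcode of "return"
            out.writeShort(7);                   // attribute_name_index -> "Code"
            out.writeInt(12 + code.length);      // attribute_length
            out.writeShort(0);                   // max_stack
            out.writeShort(1);                   // max_locals (the args array)
            out.writeInt(code.length);           // code_length
            out.write(code);
            out.writeShort(0);                   // exception_table_length
            out.writeShort(0);                   // attributes_count of Code

            out.writeShort(0);                   // attributes_count of the class
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void utf8(DataOutputStream out, String s) throws IOException {
        out.writeByte(1);                        // tag CONSTANT_Utf8
        out.writeUTF(s);                         // u2 length + modified UTF-8 bytes
    }

    private static void classRef(DataOutputStream out, int nameIndex) throws IOException {
        out.writeByte(7);                        // tag CONSTANT_Class
        out.writeShort(nameIndex);
    }
}
```

Writing the returned bytes to a file called Hello.class should give something the JVM can load. Anything beyond a bare return needs extra constant pool entries (field and method references), which is exactly the bookkeeping an assembler automates.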
In PuTTY I am attempting to create a Jasmin program that, when assembled and run as a Java program, will output the integer "431". When I attempt to assemble the program, the console says there is a syntax error on line 11. I am having trouble figuring out what it is. Here is my code:
.class public Lab3_JasminExample
.super java/lang/Object
.method public <init>()V
aload_0
invokespecial java/lang/Object/<init>()V
return
.end method
.method public static main ([Ljava.lang.String;)V
.limit stack 10
.limit locals 10
getstatic java/lang/System/out Ljava/io/PrintStream;
sipush 431
invokevirtual java/io/PrintStream/println(I)V
return
.end method
Line 11 would be ".limit stack 10" and I can't see what is wrong with how I wrote that. What am I doing incorrectly?
Errors may be reported on a line but be triggered by previous (or following!) lines, so always look around the offending line.
My Jasmin (version 2.4) correctly reports the error on line 10:
a.j:10: Warning - Syntax error.
.method public static main ([Ljava.lang.String;)V
^
This is a silly mistake really: there is a space between the method name (main) and its descriptor (([Ljava.lang.String;)V).
Line 10 should be .method public static main([Ljava.lang.String;)V
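As a cross-check on descriptors, the JDK itself can print the descriptor string for a given signature. The sketch below (the class name DescriptorDemo is made up for illustration) asks java.lang.invoke.MethodType for the descriptor of a method that takes a String[] and returns void. Note that the JVM's internal form separates package names with slashes, so it may also be worth comparing that against the dots used in the line above.

```java
import java.lang.invoke.MethodType;

// Illustrative helper; the class name is an assumption, the API is standard JDK.
final class DescriptorDemo {

    // Descriptor of a method taking String[] and returning void, i.e. main.
    static String mainDescriptor() {
        return MethodType.methodType(void.class, String[].class)
                         .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(mainDescriptor()); // prints ([Ljava/lang/String;)V
    }
}
```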
In the coverage result, it shows that I've covered 9 instructions while there are only 5 lines highlighted green. Which are the other 4 instructions?
Click the dropdown arrow at the top right of the Coverage box. It'll give you a couple different ways to measure your coverage. The default seems to be instructions (bytecode instructions), but you can manually select lines.
The reason you are seeing 9 instructions is because there are 9 bytecode instructions in Foo:
$ javap -c Foo.class
Compiled from "Foo.java"
public class Foo {
public Foo();
Code:
0: aload_0
1: invokespecial #8 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: getstatic #16 // Field java/lang/System.out:Ljava/io/PrintStream;
3: ldc #22 // String Test
5: invokevirtual #24 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
8: new #1 // class Foo
11: invokespecial #30 // Method "<init>":()V
14: return
}
As @schmosel says, it is counting bytecode instructions.
You can verify this by reading the EMMA reference documentation (EclEmma is an Eclipse GUI wrapped around EMMA), in which the phrase "bytecode instructions" is used throughout.
I'm working on code that calculates entries in the StackFrameMap (SFM). The goal is to be able to generate SFM entries that make the Java 7 bytecode verifier happy. Following a TDD methodology, I started by creating bogus SFM entries for the verifier to complain about; I would then replace these with my properly calculated entries to see that I was doing it correctly.
The problem is: I can't get the bytecode verifier to complain. Here is an example, starting with the original Java code (this code is not supposed to do anything useful):
public int stackFrameTest(int x) {
if (x > 0) {
System.out.println("positive x");
}
return -x;
}
This generates the following bytecode (with SFM):
public int stackFrameTest(int);
flags: ACC_PUBLIC
Code:
stack=2, locals=2, args_size=2
0: iload_1
1: ifle 12
4: getstatic #47 // Field java/lang/System.out:Ljava/io/PrintStream;
7: ldc #85 // String positive x
9: invokevirtual #55 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
12: iload_1
13: ineg
14: ireturn
StackMapTable: number_of_entries = 1
frame_type = 12 /* same */
Now, I change the SFM to contain this:
StackMapTable: number_of_entries = 1
frame_type = 255 /* full_frame */
offset_delta = 12
locals = [ double, float ]
stack = [ double ]
As you can see, that is completely bogus, but it loads without error. I read the JVM spec, and I couldn't see any reason why this would work. I'm not using the SplitBytecodeVerifier option.
EDIT: Per the accepted answer below, Eclipse had been set to emit Java 6 class files (version 50.0). Class files of this version will quietly ignore issues with the StackFrameMap. After changing the setting to use the default Java 7 class file format (51.0), it worked as expected.
I am unable to reproduce your results. I tried modifying the stack frame and it failed to load as expected. If you want, I can post my modified classfile.
It's not clear what happened, but you've almost certainly made a mistake somewhere. The most likely explanation is that your classfile has version 50.0, in which case the JVM will fall back to normal verification when the stackmap is invalid. You need to set the version to 51.0 to force stackmap validation. Another possibility is that you simply messed up editing the file and didn't actually save the changes or didn't make the changes you thought you did.
Here's the assembly for my modified classfile.
.version 51 0
.class super public StackFrameTest4
.super java/lang/Object
.method public <init> : ()V
.limit stack 1
.limit locals 1
aload_0
invokespecial java/lang/Object <init> ()V
return
.end method
.method static public main : ([Ljava/lang/String;)V
.limit stack 2
.limit locals 1
new StackFrameTest
dup
invokespecial StackFrameTest <init> ()V
bipush 42
invokevirtual StackFrameTest stackFrameTest (I)I
pop
return
.end method
.method public stackFrameTest : (I)I
.limit stack 2
.limit locals 2
iload_1
ifle L12
getstatic java/lang/System out Ljava/io/PrintStream;
ldc 'positive x'
invokevirtual java/io/PrintStream println (Ljava/lang/String;)V
L12:
.stack full
locals Double Float
stack Double
.end stack
iload_1
ineg
ireturn
.end method
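Since the class-file version decides whether an invalid stack map is fatal, it can help to check which version a file actually declares. A minimal sketch, assuming the class bytes have already been read into memory (the helper name ClassVersion is hypothetical):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; reads only the fixed class-file header.
final class ClassVersion {

    // Returns the major version (50 = Java 6, 51 = Java 7, 52 = Java 8).
    static int majorVersion(byte[] classBytes) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort();        // minor_version, ignored here
            return in.readUnsignedShort(); // major_version
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a file on disk, something like ClassVersion.majorVersion(Files.readAllBytes(Paths.get("StackFrameTest.class"))) would do; javap -v reports the same number as "major version".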
I'm playing with jasmin and I try to launch my .class file, which is supposed to perform simple string concatenation. My jasmin source looks like this:
.class public default_class
.super java/lang/Object
.method public static main([Ljava/lang/String;)V
.limit locals 1
.limit stack 1
invokestatic main_65428301()I
return
.end method
.method public static main_65428301()I
.limit locals 1
.limit stack 100
new java/lang/String
dup
ldc "foo"
invokestatic java/lang/String.valueOf(Ljava/lang/Object;)Ljava/lang/String;
invokespecial java/lang/StringBuilder(Ljava/lang/String;)V
ldc "bar"
invokevirtual java/lang/StringBuilder.append(Ljava/lang/String;)Ljava/lang/StringBuilder;
invokevirtual java/lang/String.toString()V
astore_0
iconst_0
ireturn
.end method
Now I'm able to run java -jar jasmin.jar and I get default_class.class. However, when I try to launch it like java default_class I get an error:
Exception in thread "main" java.lang.VerifyError: (class: default_class, method: main_65428301 signature: ()I) Illegal use of nonvirtual function call
What should I change in my assembly to get this to work?
In the JVM, to create an object you first use the new instruction and then call the <init> method (the constructor) on it. Your code never creates a new StringBuilder (it creates a String instead), and it calls the constructor by the wrong name (it should be java/lang/StringBuilder/<init>(Ljava/lang/String;)V).
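For comparison, this is roughly the Java that the corrected instruction sequence corresponds to. The comments give the Jasmin instructions I would expect for each statement; treat them as a sketch rather than exact compiler output, and the class name ConcatDemo as made up for illustration.

```java
// Illustrative only: Java-level view of the intended string concatenation.
final class ConcatDemo {

    static String concat() {
        // new java/lang/StringBuilder
        // dup
        // ldc "foo"
        // invokespecial java/lang/StringBuilder/<init>(Ljava/lang/String;)V
        StringBuilder sb = new StringBuilder("foo");

        // ldc "bar"
        // invokevirtual java/lang/StringBuilder/append(Ljava/lang/String;)Ljava/lang/StringBuilder;
        sb.append("bar");

        // invokevirtual java/lang/StringBuilder/toString()Ljava/lang/String;
        return sb.toString();
    }
}
```

Note that toString is invoked on StringBuilder and returns a String, so its descriptor ends in Ljava/lang/String; rather than V.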
I also see no reason to do:
new java/lang/String
dup
or
invokestatic java/lang/String.valueOf(Ljava/lang/Object;)Ljava/lang/String;
"The new instruction does not completely create a new instance; instance initialization is not completed until an instance initialization method has been invoked on the uninitialized instance."
I am using Javassist to extend certain classes at runtime.
In a couple of places (in the generation code) I need to create instances of the Javassist ConstPool class.
For example, to mark a generated class as synthetic, I wrote something like this:
CtClass ctClassToExtend = ... //class to extend
CtClass newCtClass = extend(ctClassToExtend, ...); //method to create a new ctClass extending ctClassToExtend
SyntheticAttribute syntheticAttribute = new SyntheticAttribute(ctClassToExtend.getClassFile().getConstPool()); //creating a synthetic attribute using an instance of ConstPool
newCtClass.setAttribute(syntheticAttribute.getName(), syntheticAttribute.get()); //marking the generated class as synthetic
This is working as expected, but I have certain doubts about this being entirely correct. Concretely, my main question is:
Is the call to CtClass.getClassFile().getConstPool() the correct way to get a constant pool in this example? If not, what is the proper general way to get the right instance of a constant pool when creating a new class at runtime using Javassist?
Also, I am a bit lost regarding what is happening behind the curtains here: why do we need a constant pool to create an instance of a synthetic attribute, or in general, of any other kind of class attribute?
Thanks for any clarification.
I don't know if you're still interested in the answer, but at least it might help others who find this question.
First of all, a small suggestion to everyone who starts creating or modifying bytecode and needs more in-depth information on how the JVM internals work: the JVM specification might look bulky and scary at first, but it's an invaluable help!
Is the call to CtClass.getClassFile().getConstPool() the correct way to get a constant pool in this example?
Yes, it is. Each Java class has a single constant pool, so basically every time you need to access the constant pool for a given class you can call ctClass.getClassFile().getConstPool(), although you must keep the following in mind:
In Javassist the constant pool field of CtClass is an instance field. That means that if you have two CtClass objects representing the same class, you'll have two different constant pool instances (even though they represent the constant pool of the same class file). When modifying one of the CtClass instances you must use the associated constant pool instance in order to get the expected behaviour.
There are times where you might not have the CtClass but rather a CtMethod or a CtField which don't let you backtrace to the CtClass instance, in such cases you can use ctMethod.getMethodInfo().getConstPool() and ctField.getFieldInfo().getConstPool() to retrieve the correct constant pool.
Since I've mentioned CtMethod and CtField, keep in mind that if you want to add attributes to any of these, it can't be through the ClassFile Object, but through MethodInfo and FieldInfo respectively.
Why do we need a constant pool to create an instance of a synthetic attribute, or in general, of any other kind of class attribute?
To answer this question, I'll start quoting section 4.4 regarding JVM 7 specs (like I said, this documentation is quite helpful):
Java virtual machine instructions do not rely on the runtime layout of classes, interfaces, class instances, or arrays. Instead, instructions refer to symbolic information in the constant_pool table.
With this in mind, I think the best way to shed some light on this subject is to look at a class file dump. We can achieve this by running the following command:
javap -s -c -p -v SomeClassFile.class
javap comes with the Java SDK and is a good tool for analysing classes at this level. Here is an explanation of each switch:
-s : Prints internal type signature
-c : Prints byte code
-p : Prints all class members (methods and fields, including the private ones)
-v : Be verbose, will print stack information and the class constant pool
Here's the output for the test.Test1 class, which I modified via Javassist to have the synthetic attribute both on the class and on the injected method:
Classfile /C:/development/testProject/test/Test1.class
Last modified 29/Nov/2012; size 612 bytes
MD5 checksum 858c009090bfb57d704b2eaf91c2cb75
Compiled from "Test1.java"
public class test.Test1
SourceFile: "Test1.java"
Synthetic: true
minor version: 0
major version: 50
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Class #2 // test/Test1
#2 = Utf8 test/Test1
#3 = Class #4 // java/lang/Object
#4 = Utf8 java/lang/Object
#5 = Utf8 <init>
#6 = Utf8 ()V
#7 = Utf8 Code
#8 = Methodref #3.#9 // java/lang/Object."<init>":()V
#9 = NameAndType #5:#6 // "<init>":()V
#10 = Utf8 LineNumberTable
#11 = Utf8 LocalVariableTable
#12 = Utf8 this
#13 = Utf8 Ltest/Test1;
#14 = Utf8 SourceFile
#15 = Utf8 Test1.java
#16 = Utf8 someInjectedMethod
#17 = Utf8 java/lang/System
#18 = Class #17 // java/lang/System
#19 = Utf8 out
#20 = Utf8 Ljava/io/PrintStream;
#21 = NameAndType #19:#20 // out:Ljava/io/PrintStream;
#22 = Fieldref #18.#21 // java/lang/System.out:Ljava/io/PrintStream;
#23 = Utf8 injection example
#24 = String #23 // injection example
#25 = Utf8 java/io/PrintStream
#26 = Class #25 // java/io/PrintStream
#27 = Utf8 println
#28 = Utf8 (Ljava/lang/String;)V
#29 = NameAndType #27:#28 // println:(Ljava/lang/String;)V
#30 = Methodref #26.#29 // java/io/PrintStream.println:(Ljava/lang/String;)V
#31 = Utf8 RuntimeVisibleAnnotations
#32 = Utf8 Ltest/TestAnnotationToShowItInConstantTable;
#33 = Utf8 Synthetic
{
public com.qubit.augmentation.test.Test1();
Signature: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #8 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 3: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this Ltest/Test1;
protected void someInjectedMethod();
Signature: ()V
flags: ACC_PROTECTED
Code:
stack=2, locals=1, args_size=1
0: getstatic #22 // Field java/lang/System.out:Ljava/io/PrintStream;
3: ldc #24 // String injection example
5: invokevirtual #30 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
8: return
RuntimeVisibleAnnotations:
0: #32()
Synthetic: true
}
Notice that both the class and the method have the attribute Synthetic: true, which means they are synthetic, but the Synthetic symbol must also be present in the constant pool (check #33).
Another example of the relationship between the constant pool and class/method attributes is the annotation with runtime retention policy added to someInjectedMethod. The method's attribute only holds a reference to constant pool symbol #32, and only there do you learn that the annotation is of the type test/TestAnnotationToShowItInConstantTable.
I hope things are a bit clearer for you now.
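To see this dependency without javap, the constant pool can also be walked directly. The following sketch lists every Utf8 entry of a class file, which is where attribute names such as Synthetic or Code live. The helper name Utf8Lister is hypothetical, and the tag sizes are taken from the class-file format as I understand it, so treat it as an illustration rather than a complete parser.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; collects the Utf8 constants of a class file.
final class Utf8Lister {

    static List<String> utf8Entries(byte[] classBytes) {
        List<String> result = new ArrayList<>();
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort();                 // minor_version
            in.readUnsignedShort();                 // major_version
            int count = in.readUnsignedShort();     // constant_pool_count
            for (int i = 1; i < count; i++) {       // entries are numbered from 1
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1:                         // Utf8: u2 length + bytes
                        result.add(in.readUTF());
                        break;
                    case 3: case 4:                 // Integer, Float
                        in.skipBytes(4);
                        break;
                    case 5: case 6:                 // Long, Double use two slots
                        in.skipBytes(8);
                        i++;
                        break;
                    case 7: case 8: case 16: case 19: case 20:
                        in.skipBytes(2);            // Class, String, MethodType, Module, Package
                        break;
                    case 9: case 10: case 11: case 12: case 17: case 18:
                        in.skipBytes(4);            // member refs, NameAndType, (Invoke)Dynamic
                        break;
                    case 15:                        // MethodHandle
                        in.skipBytes(3);
                        break;
                    default:
                        throw new IllegalArgumentException("unknown tag " + tag);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

Run against the modified Test1.class, the returned list should contain the entries shown as Utf8 in the dump above, including Synthetic.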