How to access and read JVMs class file members?

I'm not very familiar with JVM and I have an assignment involving the Class file.
Write a java program that when run as
java DissectClassFile file1.class file2.class ...
it will print a summary of each class file as follows:
the name of the class defined by the class file,
its super class and interfaces it implements,
the number of items in the constant pool,
the number of interfaces implemented by the class, and their names,
the number of fields of the class whose names contain the underscore character,
the number of methods of the class whose names contain at least one capital letter
Right off the bat I don't know where to begin. If someone could help me out and point me in the correct direction, I should get the hang of it.

You need to read the Java Virtual Machine Specification. It contains an explanation of the class file format.

There is a class, java.lang.Class, that gives access to that information. For every class you can write a class literal such as MyClass.class (for example, String.class) to get the Class object holding the information for that class.
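As a rough sketch of what java.lang.Class exposes for the assignment in the question, the following collects the class name, superclass, interfaces, and the two counts through reflection. The class name ReflectionSummary and its helper methods are made up for illustration. It assumes the class is loadable by name, it counts only what getDeclaredFields() and getDeclaredMethods() report (constructors are not included), and it cannot report the constant pool size.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Hypothetical helper class; the name is not from the assignment.
class ReflectionSummary {

    // Number of fields declared by c whose name contains an underscore.
    static int fieldsWithUnderscore(Class<?> c) {
        int count = 0;
        for (Field f : c.getDeclaredFields()) {
            if (f.getName().indexOf('_') >= 0) count++;
        }
        return count;
    }

    // Number of methods declared by c whose name has at least one capital letter.
    static int methodsWithCapital(Class<?> c) {
        int count = 0;
        for (Method m : c.getDeclaredMethods()) {
            if (m.getName().chars().anyMatch(Character::isUpperCase)) count++;
        }
        return count;
    }

    static void print(Class<?> c) {
        System.out.println("class:       " + c.getName());
        // getSuperclass() is null for Object and for interfaces
        System.out.println("superclass:  " + c.getSuperclass());
        Class<?>[] interfaces = c.getInterfaces();
        System.out.println("interfaces:  " + interfaces.length);
        for (Class<?> i : interfaces) System.out.println("  " + i.getName());
        System.out.println("fields with '_':      " + fieldsWithUnderscore(c));
        System.out.println("methods with capital: " + methodsWithCapital(c));
    }

    // Takes class names (e.g. java.util.ArrayList), not .class file paths.
    public static void main(String[] args) throws ClassNotFoundException {
        for (String name : args) print(Class.forName(name));
    }
}
```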

Most of this information can easily be gleaned by loading each class using Class.forName(...) and using the reflection APIs to fish out the information. However, the constant pool size is the killer. AFAIK, this can only be determined from the class file itself.
So, your options would seem to be:
Write a bunch of code to read and decode class files. The JVM spec has the details of the class file format.
Use an existing library such as BCEL to take care of the low-level class file parsing.
Use a hybrid of class file parsing (using either of the above) to extract the constant pool size, and the reflection APIs for the rest.
I imagine that your assignment hints at which way they expect you to go. But if not, I'd look at the BCEL approach first.
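For the first option, the constant pool count sits right at the start of the file, so reading it needs very little code. The sketch below (class and method names are hypothetical) checks the magic number, skips the two version fields, and returns the constant_pool_count item. Per the JVM specification, that stored value is one more than the number of entries in the constant pool table. Getting the class name, fields, and methods would mean walking the constant pool itself, which is not shown here.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper; reads only the fixed-size start of a class file.
class ClassFileHeader {

    // Returns the constant_pool_count item from the start of a class file.
    static int constantPoolCount(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default, like class files
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        buf.getShort();                 // minor_version (u2), skipped
        buf.getShort();                 // major_version (u2), skipped
        return buf.getShort() & 0xFFFF; // constant_pool_count (u2), read as unsigned
    }

    public static void main(String[] args) throws IOException {
        for (String file : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(file));
            System.out.println(file + ": constant_pool_count = " + constantPoolCount(bytes));
        }
    }
}
```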

Related

Saving the bytecode of a class modified with reflection

I have a java template class, of which I would like to modify a single String field.
I can instantiate an object of that class, get to its corresponding Class object, and modify the field using reflection, so far so good.
But how do I actually save the bytecode to the filesystem?
I think that if I get the ClassLoader of the original template class, open an InputStream on the class file, and save that to a file, I will get the original (i.e. unmodified) class implementation. Is that so?
Ideally I would also need to change the class name to something more meaningful.
Can both things be done using pure java in the first place?
Or do I have to resort to external libraries?

When you modify a field using reflection, you're not changing anything about the class itself. It's just a fancy way of setting a variable. So there's no changed bytecode to worry about in the first place.
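A small sketch of that point, using a made-up Template class with a single String field: setting the field through reflection changes one object, and a freshly created instance still gets the value from the class's own initializer.

```java
import java.lang.reflect.Field;

// Made-up template class for illustration.
class Template {
    String label = "original";
}

class ReflectionSetDemo {

    // Sets the 'label' field of one instance via reflection.
    static void relabel(Template target, String value) {
        try {
            Field f = Template.class.getDeclaredField("label");
            f.setAccessible(true);
            f.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Template a = new Template();
        relabel(a, "changed");
        Template b = new Template();
        System.out.println(a.label); // the instance that was modified
        System.out.println(b.label); // a new instance, untouched by the reflective write
    }
}
```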
Anyway, AFAIK you can't easily get access to bytecode at runtime. The JVM creates classes from classfiles (either from files or from in-memory data), but once the class is loaded, there's no particular reason to keep the data around. Most likely, it will only keep an optimized representation that doesn't necessarily correspond to the original classfile.
I think there are some APIs, such as Java agents (the java.lang.instrument package), that deal with bytecode at runtime, but it's not clear how well they work, partly because the JVM does optimize things.

Find an assembler/disassembler pair. Disassemble the class file, replace the string value, and assemble it back into a class file. Note that the string constant can be referenced from several places, so you probably have to add a new constant and change only one reference. If the new string value has the same length as the old one (in the class file's UTF-8 encoding), you can simply replace the constant with a binary file editor. If the lengths differ, replacing it in place would destroy the whole classfile structure.

Why does Java allow us to compile a class with a name different than the file name?

I have a file Test.java and the following code inside it.
public class Abcd
{
//some code here
}
Now the class does not compile, but when I remove the public modifier, it compiles fine.
What is the reasoning behind Java allowing us to compile a class whose name is different from the file name when it is not public?
I know it is a newbie question, but I'm not able to find a good explanation.

The rationale is to allow more than one top-level class per .java file.
Many classes—such as event listeners—are of local use only and the earliest versions of Java did not support nested classes. Without this relaxation of the "filename = class name" rule, each and every such class would have required its own file, with the unavoidable result of endless proliferation of small .java files and the scattering of tightly coupled code.
As soon as Java introduced nested classes, the importance of this rule waned significantly. Today you can go through many hundreds of Java files, never chancing upon one which takes advantage of it.

The reason is the same as for the door plates. If some person officially resides in the office (declared public) his/her name must be on the door tag. Like "Alex Jones" or "Detective Colombo". If somebody just visits the room, talks to an official or cleans the floor, their name does not have to be officially put on the door. Instead, the door can read "Utilities" or "Meeting room".

The Java specification states that you can have at most one public top-level class per file, and that class's name should match the file name. Non-public classes are allowed to have any name, regardless of the file name.
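To make that concrete, here is a sketch of what a file named Test.java (the file name from the question) could contain. Neither class is public, so neither has to be called Test. The Helper class and the greet() method are invented for the example.

```java
// Contents of Test.java: no class here is public, so none has to be named Test.
class Abcd {
    static String greet() {
        return "hello from " + Helper.name();
    }

    public static void main(String[] args) {
        System.out.println(greet());
    }
}

// A second top-level class in the same file; javac emits a separate Helper.class.
class Helper {
    static String name() {
        return "Abcd";
    }
}
```

Compiling with javac Test.java produces Abcd.class and Helper.class, and the program is started with java Abcd.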

I think allowing them is a prerequisite for nested classes. Anonymous Classes in particular dramatically reduce the number of .java files required. Without support for this, you would need lots of single method interface implementations in their own separate files from the main class they are used in. (I'm thinking of action listeners in particular)
There is a good explanation of all nested classes in the Nested Classes Java tutorial on Oracle's website, which has examples of each. It also has a reason they are useful, which I'll quote:
Why Use Nested Classes?
Compelling reasons for using nested classes include the following:
It is a way of logically grouping classes that are only used in one place: If a class is useful to only one other class, then it is logical to embed it in that class and keep the two together. Nesting such "helper classes" makes their package more streamlined.
It increases encapsulation: Consider two top-level classes, A and B, where B needs access to members of A that would otherwise be declared private. By hiding class B within class A, A's members can be declared private and B can access them. In addition, B itself can be hidden from the outside world.
It can lead to more readable and maintainable code: Nesting small classes within top-level classes places the code closer to where it is used.
(emphasis mine)
I am not familiar with Java spec back in the early days, but a quick search shows inner classes were added in Java 1.1.
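The anonymous-class case mentioned above can be sketched as follows. ClickListener and Button are stand-ins invented for this example (a real GUI program would use something like an action listener). The point is only that the listener implementation lives inside the class that uses it and needs no .java file of its own.

```java
// Stand-in for a single-method listener interface such as an action listener.
interface ClickListener {
    void onClick(String source);
}

// Minimal stand-in for a widget that notifies one listener.
class Button {
    private ClickListener listener;

    void setListener(ClickListener l) { listener = l; }

    void click() {
        if (listener != null) listener.onClick("button");
    }
}

class AnonymousDemo {
    static String lastEvent = "none";

    static Button build() {
        Button b = new Button();
        // Anonymous class: declared and instantiated in place, no separate .java file.
        b.setListener(new ClickListener() {
            @Override
            public void onClick(String source) {
                lastEvent = "clicked " + source;
            }
        });
        return b;
    }

    public static void main(String[] args) {
        build().click();
        System.out.println(lastEvent);
    }
}
```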

I look at it the other way round. The natural state of affairs would be for the programmer to pick both the class name and the file name independently. Probably in order to simplify finding public classes from outside a package during compilation, there is a special restriction that a public class be in a file with the corresponding name.

Note that Java is case-sensitive, but the filesystem need not be. If the file's base name is "abcd", but the class is "Abcd", would that conform to the rule on a case-insensitive filesystem? Certainly not when ported to a case-sensitive one.
Or suppose you happened to have a class called ABCD, and a class Abcd (let's not get into that being a bad idea: it could happen) and the program is ported to a case insensitive filesystem. Now you not only have to rename files, but also classes, oops!
Or what if there is no file? Suppose you have a Java compiler which can take input on standard input. So then the class has to be named "StandardInput"?
If you rationally explore the implications of requiring file names to follow class names, you will find that it's a bad idea in more than one way.

One other point that many answers missed: the public modifier does not decide which main method runs. Every class declared in a .java file can have its own main method, and the JVM invokes the main of whichever class is named on the java command line, public or not. HTH

Because a Java file can contain more than one class, it may have two classes in one file. But if it contains a public class, that class must have the same name as the file.

How to create a "java.lang.Class" object from a .java file in java

How can I instantiate an object of the class java.lang.Class from a given .java file?
I want to create an application to automatically generate JUnit tests. For that I would need "Method" objects and for "Method" objects I need to have a "Class" object.

java 6 onwards has an api for the compiler: http://www.javabeat.net/2007/04/the-java-6-0-compiler-api/
the link above includes an example.
here's another example - http://www.java2s.com/Code/Java/JDK-6/JavaCompilertoolshowyoucancompileaJavasourcefrominsideaJavaprogram.htm
to load the file once compiled you use a classloader. there's an example at http://tutorials.jenkov.com/java-reflection/dynamic-class-loading-reloading.html and another at http://www.javaworld.com/jw-10-1996/jw-10-indepth.html
you'd think there would be a library to simplify all this. i can't find one, but am still looking.
meanwhile, here's a really nice article from ibm that compiles a function and plots it - http://www.ibm.com/developerworks/java/library/j-jcomp/index.html
found one http://docs.codehaus.org/display/JANINO/Home - this is a library that simplifies the process. i would recommend configuring it to use the javax.tools API (see last sentence in the "what is Janino" paragraph).
sorry for the google snark earlier.
it just struck me that maybe you just want a class object.
if you have a class called MyClass then the associated Class object is MyClass.class. that's probably obvious, but perhaps that's all you need.
and if you have the class name in a string you can use this method - http://docs.oracle.com/javase/1.4.2/docs/api/java/lang/Class.html#forName(java.lang.String)
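Putting the compiler API and a class loader together might look roughly like this. It is a sketch under some assumptions: it runs on a JDK (ToolProvider.getSystemJavaCompiler() returns null otherwise), the source file declares a class in the default package, and the class file is written next to the source. The names CompileAndLoad and Greeter are made up.

```java
import java.io.IOException;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class CompileAndLoad {

    // Compiles sourceFile (declaring className in the default package) into
    // the same directory, then loads and returns the resulting Class object.
    static Class<?> compileAndLoad(Path sourceFile, String className)
            throws IOException, ClassNotFoundException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system compiler available; run on a JDK");
        }
        // null streams mean System.in / System.out / System.err
        int status = compiler.run(null, null, null, sourceFile.toString());
        if (status != 0) {
            throw new IllegalStateException("compilation failed for " + sourceFile);
        }
        URL dir = sourceFile.toAbsolutePath().getParent().toUri().toURL();
        // The loader is deliberately left open so the class can keep resolving dependencies.
        URLClassLoader loader = new URLClassLoader(new URL[] { dir });
        return loader.loadClass(className);
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("gen");
        Path src = dir.resolve("Greeter.java");
        String code = "public class Greeter { public String greet() { return \"hi\"; } }";
        Files.write(src, code.getBytes(StandardCharsets.UTF_8));
        Class<?> c = compileAndLoad(src, "Greeter");
        for (Method m : c.getDeclaredMethods()) System.out.println(m);
    }
}
```

The Method objects printed at the end are the ones a test generator would work from.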

If you do know the full class String name, why don't you use the Class.forName(className) method to obtain the class Object?
For example, if the class name is yakam.yale.MyClass, just call the
Class.forName("yakam.yale.MyClass");
and the game is played!
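A slightly fuller sketch of the same call, with the checked ClassNotFoundException handled. The lookup helper is a made-up name, and yakam.yale.MyClass is just the example name from above, so it will only resolve if such a class is on the classpath.

```java
import java.lang.reflect.Method;

class ForNameDemo {

    // Returns the Class for a fully qualified name, or null if it cannot be found.
    static Class<?> lookup(String className) {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Class<?> c = lookup("java.lang.String");
        // The Method objects a test generator would start from.
        for (Method m : c.getDeclaredMethods()) {
            System.out.println(m.getName());
        }
        // null unless that class happens to be on the classpath
        System.out.println(lookup("yakam.yale.MyClass"));
    }
}
```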

Using a function in two unrelated Java classes

I have two classes in my Java project that are not 'related' to each other (one inherits from Thread, and one is a custom object). However, they both need to use the same function, which takes two String arguments and does some file writing stuff. Where do I best put this function? Code duplication is ugly, but I also wouldn't want to create a whole new class just for this one function.
I have the feeling I am missing a very obvious way to do this here, but I can't think of an easy way.

[a function], which takes two String arguments and does some file writing stuff
As others have suggested, you can place that function in a separate class, which both your existing classes could then access. Others have suggested calling the class Utility or something similar. I recommend not naming the class in that manner. My objections are twofold.
One would expect that all the code in your program was useful. That is, it had utility, so such a name conveys no information about the class.
It might be argued that Utility is a suitable name because the class is utilized by others. But in that case the name describes how the class is used, not what it does. Classes should be named by what they do, rather than how they are used, because how they are used can change without what they do changing. Consider that Java has a String class, which can be used to hold a name, a description or a text fragment. The class does things with a "string of characters"; it might or might not be used for a name, so String was a good name for it, but Name would not have been.
So I'd suggest a different name for that class. Something that describes the kind of manipulation it does to the file, or describes the format of the file.

Create a Utility class and put all common utility methods in it.

Sounds like an ideal candidate for a FileUtils class that only has static functions. Take a look at SwingUtilities to see what I'm talking about.

You could make the function static in just one of the classes and then reference the static method in the other, assuming there aren't variables being used that require the object to have been instantiated already.
Alternatively, create another class to store all your static methods like that.
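A minimal sketch of the static-helper approach suggested in these answers. The question does not say what the two String arguments mean, so treating them as a file path and a line of text is an assumption, as are the names appendLine, Worker, Report, and out.log.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

final class FileUtils {
    private FileUtils() {} // no instances; static helpers only

    // Appends 'text' plus a line separator to the file at 'path', creating it if needed.
    static void appendLine(String path, String text) {
        try {
            Files.write(Paths.get(path),
                    (text + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// The two otherwise unrelated classes both call the same static method.
class Worker extends Thread {
    @Override
    public void run() {
        FileUtils.appendLine("out.log", "from worker thread");
    }
}

class Report {
    void save() {
        FileUtils.appendLine("out.log", "from report object");
    }
}
```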

To answer the first part of your question - To the best of my knowledge it is impossible to have a function standalone in java; ergo - the function must go into a class.
The second part is more fun - A utility class is a good idea. A better idea may be to expand on what KitsuneYMG wrote; Let your class take responsibility for its own reading/writing. Then delegate the read/write operation to the utility class. This allows your read/write to be manipulated independently of the rest of the file operations.
Just my 2c (+:

Obtaining Java source code from class name

Is there a way to obtain the Java source code from a class name?
For example, if I have access to the library with the class java.io.File, I want its source code.
I am working on a kind of parser and I need the source at execution time. I have also to search it recursively.
Say the aforementioned class has this method:
int method (User user) {...}
I would need to obtain User's source code, and so on and so forth with its inner classes.

Is there any way to obtain the java source from a class name? For example:...
You may want one of several possible solutions. Without knowing what you really want to do with the information, we can't be very precise with our recommendations, but I'd start by steering you away from source code if possible. JSE source code is available online, as are many open source libraries, but that may not always be the case. Additionally, you'll need to keep it all organized when you want to find it, much like a classpath, whereas the Class objects are much easier to get hold of, and manipulate, without having to parse text again.
Reflection
If you just need information about a class at runtime, just use the Java Reflection API. With it, given a Class object you can, for example, get the types of a specific field, list all fields and iterate over them, etc...:
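import java.lang.reflect.Field;
Class<?> clazz = User.class;
// getDeclaredField throws NoSuchFieldException if User has no field named "var"
Field field = clazz.getDeclaredField("var");
System.out.println(field.getType().getName());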
Reflection is useful for discovering information about the classes in the program, and of course you can walk the entire tree without having to find source code, or parse anything.
Remember you can lookup a class object (as long as it's on the classpath at runtime) with Class.forName("MyClass") and reflect on the resulting Class.
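Walking the tree without any source code could look something like the sketch below. It follows only method return and parameter types, not fields or nested classes. User echoes the question, while Address, Service, and TypeWalker are invented names. Stopping at classes whose class loader is null (core JDK classes) is an assumption made here to keep the walk small.

```java
import java.lang.reflect.Method;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

// Example types; User echoes the question, the rest are invented.
class Address { String[] lines() { return new String[0]; } }
class User { Address home() { return new Address(); } }
class Service { int method(User user) { return 0; } }

class TypeWalker {

    // Returns start plus every class reachable through declared method
    // return and parameter types. Arrays are unwrapped and primitives skipped.
    static Set<Class<?>> reachable(Class<?> start) {
        Set<Class<?>> seen = new LinkedHashSet<>();
        Deque<Class<?>> todo = new ArrayDeque<>();
        todo.add(start);
        while (!todo.isEmpty()) {
            Class<?> c = todo.poll();
            while (c.isArray()) c = c.getComponentType();
            if (c.isPrimitive() || !seen.add(c)) continue;
            // Assumption: core JDK classes (bootstrap loader, reported as null)
            // are recorded but not expanded.
            if (c.getClassLoader() == null) continue;
            for (Method m : c.getDeclaredMethods()) {
                todo.add(m.getReturnType());
                for (Class<?> p : m.getParameterTypes()) todo.add(p);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        for (Class<?> c : reachable(Service.class)) System.out.println(c.getName());
    }
}
```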
Bytecode Manipulation
If you need more than information, and actually want to manipulate the classes, you want bytecode manipulation. Some have tried to generate source code, compile to bytecode and load into their program, but trust me - using a solid bytecode manipulation API is far, far easier. I recommend ASM.
With it, you can not only get information about a class, but add new fields, new methods, create new classes... even load multiple variations of a class if you're feeling self-abusive. An example of using ASM can be found here.
Decompilation
If you really, really do need the source, and don't have it available, you can decompile it from the class file using one of the various decompilers out there. They use the same information and techniques as the above two, but go further and attempt to generate source code. Note that it doesn't always work. I recommend Jode, but a decent list and comparison of others is available online.
File Lookup
If you have the source and really just want to look it up, maybe all you need is to put the .java files somewhere in a big tree, and retrieve based on package name as needed.
Class<?> clazz = User.class;
String path = clazz.getPackage().getName().replace('.', '/');
// getSimpleName(), not getName(): the package is already part of the path
File sourceFile = new File(path, clazz.getSimpleName() + ".java");
You will want more logic there to check the class type, since obviously primitives don't have class definitions, and you will want to handle array types differently.
You can lookup a class by name (if the .class files are on your classpath) with Class.forName("MyClass").
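The extra checks could be sketched along these lines. SourceLocator and its methods are hypothetical names, and the sketch assumes each top-level class lives in a file named after it, which holds for public classes but not necessarily for non-public ones.

```java
import java.io.File;

class SourceLocator {

    // Maps a class to the relative path of the .java file expected to declare it,
    // or returns null for primitives, which have no source file.
    static String relativeSourcePath(Class<?> c) {
        while (c.isArray()) c = c.getComponentType(); // String[][] -> String
        if (c.isPrimitive()) return null;
        // Nested, local, and anonymous classes live in their outermost class's file.
        while (c.getEnclosingClass() != null) c = c.getEnclosingClass();
        return c.getName().replace('.', '/') + ".java";
    }

    // Resolves the relative path against the root of a source tree.
    static File locate(File sourceRoot, Class<?> c) {
        String relative = relativeSourcePath(c);
        return relative == null ? null : new File(sourceRoot, relative);
    }

    public static void main(String[] args) {
        System.out.println(relativeSourcePath(String[].class));
        System.out.println(relativeSourcePath(java.util.Map.Entry.class));
        System.out.println(relativeSourcePath(int.class));
    }
}
```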

You can get a good approximation of the source from a class file using the Java decompiler of your choice. However, if you're really after the source of java.io.File then you can download that.

If member signatures are enough, the simplest bet can be javap.
hello.java
public class hello
{
    public static void main(String[] args)
    {
        System.out.println("hello world!");
        world();
    }
    static public void world()
    {
        System.out.println("I am second method");
    }
}
Compile it with javac hello.java, then do a javap hello and you will get this:
Compiled from "hello.java"
public class hello extends java.lang.Object{
    public hello();
    public static void main(java.lang.String[]);
    public static void world();
}

Yes, if you download the source code. It's available for public download on the official download page.
If you're using Eclipse whenever you use the class you could right click > View Source (or simply click the class > F3) and it'll open a new tab with the source.

You can print the location from which the class was loaded with
URL sourceURL = obj.getClass().getProtectionDomain().getCodeSource().getLocation();
It will typically be a directory of .class files, a .jar, a .zip, or something else.
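One caveat when using that call is that getCodeSource() can return null, for example for core classes such as java.lang.String, so a null check is worth adding. A small sketch, with WhereFrom and locationOf as made-up names:

```java
import java.net.URL;
import java.security.CodeSource;

class WhereFrom {

    // Returns the location a class was loaded from, or null when the JVM
    // does not record one (for example, core classes such as java.lang.String).
    static URL locationOf(Class<?> c) {
        CodeSource source = c.getProtectionDomain().getCodeSource();
        return source == null ? null : source.getLocation();
    }

    public static void main(String[] args) {
        System.out.println(locationOf(WhereFrom.class));
        System.out.println(locationOf(String.class));
    }
}
```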

So what you're trying to do is inspect the Java class at execution time. For this, you need Java reflection.

If your goal is to get information about what's in a class, you may find the Java reflection API to be an easier approach. You can use reflection to look up the fields, methods, constructors, inheritance hierarchy, etc. of a class at runtime, without needing to have the source code for the class available.

Is there any way to obtain the java source from a class name?
The answer is complicated, not least because of the vagueness of your question. (Example notwithstanding).
In general it is not possible to get the real, actual Java source code for a class.
If you have (for example) a ZIP or JAR file containing the source code for the classes, then it is simple to extract the relevant source file based on the class's fully qualified name. But you have to have gotten those ZIP / JAR files from somewhere in the first place.
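The ZIP / JAR case might be sketched as follows, assuming the archive lays entries out by package (for example java/io/File.java) and that the name passed in is a top-level class. Some archives add a prefix directory, such as the module directories in the src.zip of recent JDKs, which this sketch does not handle. SourceArchive and readSource are hypothetical names, and InputStream.readAllBytes() needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

class SourceArchive {

    // Returns the text of className's source file from a source ZIP/JAR,
    // or null if the archive has no matching entry.
    static String readSource(String zipPath, String className) throws IOException {
        String entryName = className.replace('.', '/') + ".java";
        try (ZipFile zip = new ZipFile(zipPath)) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) return null;
            try (InputStream in = zip.getInputStream(entry)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
    }

    // Usage: java SourceArchive path/to/sources.zip some.pkg.ClassName
    public static void main(String[] args) throws IOException {
        System.out.println(readSource(args[0], args[1]));
    }
}
```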
If you are only interested in method signatures, attribute names and types and so on, then much of this information is available at runtime using the Java reflection APIs. However, how much will be available depends on whether the classes were compiled with debug information (see the -g option to the javac compiler). And this is nowhere near the information that you can get from the real source code.
A decompiler may be able to generate compilable source code for a class from the bytecode files. But the decompiled code will look nothing like the original source code.
I guess, if you have a URL for a website populated with the javadocs for the classes, you could go from a class name, method name, or public attribute name to the corresponding javadoc URL at runtime. You could possibly even "screen scrape" the descriptions out of the javadocs. But once again, this is not the real source code.
