I have a .jar file that represents a plugin that I am trying to mess with. This is an older version of the plugin, and a newer version was written by somebody else. I have this newer version as a project.
The newer project is full of .java files, and the old plugin is full of .class files. I can import the jar as a project, but it's still all class files. The differences between the class files and the java files are not particularly large, and I would like to see the differences between them. When I do this now, however, the text comparison changes the .class file from its normal representation in the editor to a binary representation. I know that if they were the same type of file, I could select the two and hit "Compare With". How can I do this between a .class and a .java file, or how can I turn one into the other in a way that still allows me to compare the two?
What would really be best is if there were some way for me to edit the jar, by turning the .class files into .java files.
It seems like what you need is a decompiler to convert the Java .class files (bytecode) back into .java source files (text). Then you could compare the two text files. This seems like it might be useful: http://java.decompiler.free.fr
You can use SOOT (http://www.sable.mcgill.ca/soot/) to do this. Two approaches are possible:
Decompile the .class files into .java files using Dava in SOOT, and then compare the .java files.
Convert both .class and .java files into an intermediate representation called Jimple in SOOT, and compare the Jimple files.
I think the second approach is more reasonable, because:
In the first approach, some Java files are manually developed, while the others are machine generated. Doing a diff on them creates results that are difficult to read.
The Jimple representation is very close to Java source code and relatively easy to read. Reading a diff result on this unified, machine generated format is much easier. Also, if you want, you can convert all Jimple files back to Java source code (well, this is sort of the third approach...).
Because it was a plugin, I was able to import it as a plug-in project, and there was a box to include the source folder. When I checked that I got access to the .java code and was able to diff successfully.
I'm facing an issue while comparing two jar files. I get a conversion error when I try to compare two class files inside the jar files.
I've tried changing the encoding type in File Formats, and it didn't resolve the issue. I also downloaded the required plugins to decompile the class files.
Please help me out in resolving this issue.
Beyond Compare's Java class to source file format uses JAD to decompile .class files to .java source. JAD is a pretty old decompiler, so it doesn't support .class files that use newer Java features.
As a workaround, use a newer decompiler to extract .class files to .java source outside of Beyond Compare, then compare the .java files in Beyond Compare.
A lot of times in Java we want to use some functionality that is given to us in the form of JARs (e.g. some external library). More often than not, I've noticed that JARs contain .class files.
Since .class files represent compiled bytecode ready for use by the JVM, my question is the following:
How is it that .class files are all that's needed for us to make use of an external library? Maybe a certain JAR contains the class file called Person.class. How am I able to reference this class in my code when all that the JAR file exposes is a .class file? Isn't the source code (the .java file) what's important and what's needed? For example, when I have two classes in the same package, I'm able to reference one from the other because the two .java files (not .class files) are in the same scope.
Excuse me if it's a dumb question, but I really want to understand this.
Even if you write your source code in .java files, they are eventually compiled to form .class files which store bytecode that can be interpreted easily. When you use the jar files in your project, all the class files inside those jar files are included in your classpath, hence enabling you to use them.
So in a JAR package, .class files are sufficient to be used as a dependency.
The Java compiler turns your Java code, which is something that humans can understand, into .class files, which is something that the Java Virtual Machine (JVM) can understand. The JVM then takes the .class files and runs them on your machine.
A .jar file is effectively a collection of .class files packaged up (under the hood, it's really little more than a .zip in disguise). When you add a .jar onto your classpath, you are telling the JVM that it is one more place it should look when it needs a particular class.
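To see the "zip in disguise" point concretely, here is a minimal sketch that builds a throwaway jar and then reads it back with the plain ZIP API. The class name JarAsZipDemo, its helper methods, and the entry name com/example/Person.class are made up for illustration, and the entry holds placeholder bytes rather than a real compiled class.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class JarAsZipDemo {

    // Builds a tiny throwaway jar so the example does not depend on a real library.
    // The entry name and bytes are placeholders, not a real compiled class.
    static Path writeDemoJar() {
        try {
            Path jar = Files.createTempFile("demo", ".jar");
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jos = new JarOutputStream(out)) {
                jos.putNextEntry(new JarEntry("com/example/Person.class"));
                jos.write(new byte[] {1, 2, 3});
                jos.closeEntry();
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens the jar with the generic ZIP reader and returns the entry names.
    static List<String> listEntries(Path jar) {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(listEntries(writeDemoJar()));
    }
}
```

Note that the entry path mirrors the package structure, which is how the JVM finds a class such as com.example.Person once the jar is on the classpath.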
I am not sure I totally got your question, but JARs are simply compiled Java code, which means that the semantics and logic of the code have not changed. You need to be able to access the functions and classes of the Java code you want to use, because otherwise you would gain no advantage from using a JAR.
One advantage of JARs is that the source code of these libraries is already compiled. Since these .class files are compiled .java files, they are all you need to access the functions that were written in the .java file.
In my situation I have many .jar files being created from a build process. Before I do any debugging I want a way to quickly verify that my .java source matches the .class found in a .jar.
I figure that if I unzip the .jar and find the .class which matches my .java file then I should be able to determine if they're functionally the same.
How can I do this?
The first thing to realize is that compilation doesn't just use the specific .java file for the class being compiled. The compiler also uses information from the other .java and .class files available at compile time. For example, it may inline static final constants. Also, stuff like method overloading depends on which methods have been defined.
That being said, if you compile the same source file with the same compiler as before, you'll probably get the same, or a very similar, class file. However, even with identical source files, different compilers (javac vs. the Eclipse compiler) and different versions of the same compiler will produce different results.
Therefore, what I'd recommend is to first try compiling everything and see if the class files match. If they don't match, try disassembling them with the Krakatau disassembler and do a diff on the disassemblies to see what the differences are. That will help you see whether the difference is unimportant (such as a reordering of the constant pool) or whether there are substantive changes to the bytecode.
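The "see if the class files match" step could be sketched roughly like this. It is only a byte-for-byte check under the assumption that you have already recompiled the source to a .class file on disk. The class and method names are invented for the example, and `readAllBytes` needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

final class ClassFileMatcher {

    // Returns true when the named jar entry is byte-for-byte identical to classFile.
    // entryName uses '/' separators, e.g. "com/example/Person.class" (hypothetical).
    static boolean sameBytes(Path jar, String entryName, Path classFile) {
        try (JarFile jf = new JarFile(jar.toFile())) {
            JarEntry entry = jf.getJarEntry(entryName);
            if (entry == null) {
                return false; // the jar has no such class
            }
            try (InputStream in = jf.getInputStream(entry)) {
                byte[] packaged = in.readAllBytes();            // bytes stored in the jar
                byte[] rebuilt = Files.readAllBytes(classFile); // freshly compiled bytes
                return Arrays.equals(packaged, rebuilt);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A `false` result here does not prove the source differs. As noted above, a different compiler or compiler version can produce different bytes from the same source, which is when diffing the disassemblies becomes useful.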
You can use a Java decompiler like http://jd.benow.ca/ to view the source corresponding to your class file. Then you will be able to compare it with your .java file.
Maybe it would be enough for you to use a decompiler, like the one in the IntelliJ IDE, to see what the source for your compiled class looks like. You can even debug over the decompiled source.
I am familiar with the jar structure: it has .class files in the classes directory, as well as a META-INF directory containing the information about main().
But where does the actual Java code reside in a jar?
Does it reside as compiled bytecode? But don't different machines have different compilers?
I know that I can extract the Java code using a decompiler, which might be illegal, but I am not interested in doing so. I am more interested in understanding how the code is stored.
Is it encrypted? If so, what is the encryption algorithm? What is its location inside the jar?
Unless you specify otherwise, the source code is not included in the JAR file. The JAR file normally contains only class files (compiled JVM instructions), not source code.
A JAR file is just a ZIP file, renamed to mean Java ARchive.
You can check what's inside by unzipping it. If you're on an OS that won't decompress the archive because it doesn't appear to be a compressed file, just change the extension to ZIP.
JAR files are not encrypted.
Java sources are compiled into platform-neutral Java bytecode, which is a kind of intermediate binary.
Once the JVM loads the classes, it either interprets the bytecode or just-in-time compiles it for the underlying machine. JARs usually contain only that bytecode.
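To make "platform-neutral" a bit more tangible: every .class file starts with the same fixed header regardless of the machine it was compiled on. Here is a small sketch that reads that header from a byte array. The class name ClassFileHeader and the method majorVersion are made up for the example, and it only looks at the first eight bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ClassFileHeader {

    // Reads the class-file major version from the first 8 bytes:
    // u4 magic (0xCAFEBABE), u2 minor_version, u2 major_version.
    static int majorVersion(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort();        // minor version, ignored here
            return in.readUnsignedShort(); // e.g. 52 for Java 8, 61 for Java 17
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The major version tells you which Java release the class was compiled for. An old decompiler may fail on a class whose version is newer than it understands.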
Usually sources are not included in the JARs, especially for distribution. Some projects deliver sources as well in a separate archive. You should check with the provider of the JAR you're dealing with to get sources.
Whether decompiling is legal depends on the terms of the license applied to the JAR. You should check those.
Decompiling a class file is not an easy task, but JAD used to do a very good job of it.
Unfortunately it's no longer maintained, but there are some websites where you can still download it.
Decompiled classes will not look exactly like the real sources, and you may have to make some changes, but you'll definitely get an idea of what the source looked like.
In computer science I have learned that .jar files are basically a compressed set of .java files that have been compiled. So, when you have a project, instead of those 20 .java files you can have a pile of compressed classes (a .jar). Last year in CSI we worked with a .jar file called DanceStudio, which we had to use to make feet walk across the floor. This year, we are working with a different program to better understand Java, so I unzipped the .jar file, which contained 26 classes, and then decompiled them. I wanted to try to create a program by compiling all of the .java files with the others necessary to make the program run (Walker, Foot, etc.). When I try to compile all of these files, it says that I have duplicate files (Walker, Foot, etc.). What I don't understand is why this would compile if the .jar file was basically the same thing, just in a compressed form. What also confuses me is that the Foot, etc. files in the .jar are actually more complicated and have more code.
Could someone please explain how the .jar file actually works and separates these files apart, and how it could run with a duplicate class that isn't in the .jar file?
First of all, you're missing one step in your explanation of a .jar file.
A .jar file is a collection of .class files. And .class files are what is produced by compiling a .java file.
Usually a single .java file will produce a single .class file, because it will contain a single type definition. But there are several ways for a .java file to produce more than one .class file (inner/nested classes, anonymous classes, top-level non-public classes, ...), so it's not necessarily a 1-to-1 association between .java files and .class files.
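As a small illustration of one source file yielding several class files, here is a sketch with a nested class, an anonymous class, and a second top-level class. The names are borrowed loosely from the question and are not the real DanceStudio code.

```java
final class Walker {                      // compiles to Walker.class

    static class Foot { }                 // nested class: Walker$Foot.class

    static Runnable step() {
        return new Runnable() {           // anonymous class: Walker$1.class
            @Override
            public void run() { }
        };
    }

    public static void main(String[] args) {
        // The binary names printed here match the generated .class file names.
        System.out.println(Walker.class.getName());
        System.out.println(Foot.class.getName());
        System.out.println(step().getClass().getName());
        System.out.println(Helper.class.getName());
    }
}

class Helper { }                          // second top-level class: Helper.class
```

Compiling this one file produces four .class files, which is part of why the class count in an unzipped jar may not match the number of source files you expect.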
Then there's the confusion about why the decompiled Java source code looks more complicated than the original Java source. This one is easy to answer: the compilation step was not designed to be reversible.
When the Java compiler turns .java files into .class files, it produces a format that is best suited for being executed. That format does not represent the exact same concepts that the Java source file does. For example, there's no classical "if" in Java bytecode. It is implemented by appropriate jump instructions.
All of this means that the process of converting .class files back to .java files is complicated and usually imperfect.
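Here is a tiny example of the "if becomes a jump" point. The class BranchDemo is made up, and the disassembly in the comment is what javac typically emits for a method this simple. Exact offsets and instruction choices can differ between compilers and versions.

```java
final class BranchDemo {

    // Source-level "if": return the absolute value of x.
    static int abs(int x) {
        if (x < 0) {
            return -x;
        }
        return x;
    }

    // Disassembling with `javap -c BranchDemo` typically shows something like:
    //
    //   0: iload_0
    //   1: ifge          7    // jump to offset 7 when x >= 0
    //   4: iload_0
    //   5: ineg
    //   6: ireturn
    //   7: iload_0
    //   8: ireturn
    //
    // The "if (x < 0)" has become a conditional jump on the inverted condition.
}
```

A decompiler has to guess the original control structure back from jumps like these, which is one reason decompiled code can look different from what was written.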
You generally compile your (clear text) .java source files into (binary) .class files.
If you use packages, then the class files will be in different subdirectories (representing the package).
A .jar file is a compressed binary file that puts all the .class files in the right directories in one compact, easy-to-manage file.
A .jar file can also contain other files, such as manifests, bitmaps and resources.
.jar files can also be "signed" to ensure the integrity/authenticity of their contents.
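As an example of the non-class content, here is a rough sketch that writes a jar containing only a manifest and then reads the Main-Class attribute back. ManifestDemo and its method names are invented for this example, and the main class name passed in is whatever you choose, not something taken from a real jar.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class ManifestDemo {

    // Writes a throwaway jar whose manifest names a (hypothetical) main class.
    static Path writeJarWithMainClass(String mainClass) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0"); // without this, no attributes are written
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        try {
            Path jar = Files.createTempFile("manifest-demo", ".jar");
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jos = new JarOutputStream(out, manifest)) {
                // The constructor already wrote META-INF/MANIFEST.MF; no class entries are needed here.
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the Main-Class attribute from META-INF/MANIFEST.MF, or null if absent.
    static String mainClassOf(Path jar) {
        try (JarFile jf = new JarFile(jar.toFile())) {
            Manifest manifest = jf.getManifest();
            return manifest == null
                    ? null
                    : manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The Main-Class entry is what lets `java -jar` find the class whose main() should run.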
Here are some good links:
http://en.wikipedia.org/wiki/JAR_%28file_format%29
http://download.oracle.com/javase/tutorial/deployment/jar/
'Hope that helps
About your duplicates: maybe your .jar is still in your build path, so when you try to compile your project with the decompiled classes, each class is defined twice. Check for this, and remove the .jar if it's still in your build path.