How do external java libraries(jars) actually work?

A lot of times in Java we want to use some functionality that is given to us in the form of JARs (e.g. some external library). More often than not, I've noticed that JARs contain .class files.
Since .class files represent compiled bytecode ready for use by the JVM, my question is the following:
How is it that .class files are all that's needed for us to make use of an external library? Say a certain JAR contains a class file called Person.class. How am I able to reference this class in my code when all that the JAR file exposes is a .class file? Isn't the source code (the .java file) what's important and what's needed? For example, when I have two classes in the same package, I'm able to reference one from the other because the two .java files (not .class files) are in the same scope.
Excuse me if it's a dumb question, but I really want to understand this.

Even if you write your source code in .java files, they are eventually compiled into .class files, which store bytecode that the JVM can load and execute. When you add JAR files to your project, all the class files inside them are put on your classpath, so both the compiler and the JVM can find and use them.
So in a JAR package, .class files are sufficient to be used as a dependency.

The Java compiler turns your Java code, which is something that humans can understand, into .class files, which is something that the Java Virtual Machine (JVM) can understand. The JVM then takes the .class files and runs them on your machine.
A .jar file is effectively a collection of .class files packaged up (under the hood, it's really little more than a .zip in disguise). When you add a .jar onto your classpath, you are telling the JVM that it is one more place it should look when it needs a particular class.
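To make the ".zip in disguise" point concrete, here is a minimal sketch. It builds a tiny JAR in memory with java.util.jar and then reads it back with the plain ZIP classes from java.util.zip. The class name JarAsZip, the entry name com/example/Person.class and its three placeholder bytes are made up for illustration; the entry is not a real compiled class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class JarAsZip {

    // Build a tiny JAR in memory. The entry name and its bytes are
    // hypothetical placeholders, not a real compiled class.
    public static byte[] buildJar() throws IOException {
        Manifest manifest = new Manifest();
        // The manifest needs a version attribute to be written out properly.
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            jar.putNextEntry(new JarEntry("com/example/Person.class"));
            jar.write(new byte[] {1, 2, 3});
            jar.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Read the archive back with the generic ZIP API, which has no
    // JAR-specific knowledge at all.
    public static List<String> listEntries(byte[] archive) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Prints the entry names found in the in-memory archive.
        listEntries(buildJar()).forEach(System.out::println);
    }
}
```

The ZIP reader sees both the manifest and the placeholder class entry, which is all "a JAR is a ZIP with a manifest" really means at the file level.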

I am not sure I fully understood your question, but JARs simply contain compiled Java code, which means that the semantics and logic of the code have not changed. You need to be able to access the functions, classes, etc. of the Java code you want to use; otherwise you would gain nothing from using a JAR.
One advantage of JARs is that the source code of these libraries is already compiled. Since these .class files are compiled .java files, they are all you need to access the functions that were written in the .java files.

Related

Usage of jar with .java files and odd behavior of the compiler

I was curious about the differences between a .jar with .class files and a .jar with .java files. I partially got the answer here, but then what is the usefulness of .java files in the jar?
My guess is that the .java files in the jar act like an interface that prevents compilation errors, because I solved an IllegalAccessError thrown at runtime by replacing the jar containing .class files with a jar containing .java files, specifically when using the Xposed Framework. (Got the hint from this thread.)
Also
Thank you for your explanations; they were helpful. But I want to learn more about the differences from the compiler's point of view, because I am wondering why my app works fine even if I only include the jar with .java files, not .class files (zxing). There are also some cases that throw IllegalAccessException when I include the jar with .class files, but not when I include the jar with .java files (xposed), even though I have to include at least one of them so that the compiler (AIDE) does not complain about references, such as an unknown package. Why does the compiler not complain when I include only the jar with .java files, even though it would not be able to resolve the actual implementation of the referenced classes?

A .jar file is basically just a .zip file with another extension.
A .jar file with .class files has a special purpose and may have special meta-data (e.g. in the META-INF folder).
A .jar file with .java files is just a .zip file.
It is however common for open-source libraries to provide 3 .jar files:
One with .class files, to be used by your code, both to compile and to run your code.
One with .java files, to be used by your IDE, so you can drill into the library code and see it. Especially useful when stepping through the code with a debugger.
One with javadoc files (.html files), to be used by your IDE, so you can read the documentation about the classes and methods in the library. You do read the documentation, right?
None of those 3 files has to be named .jar. They could be renamed .zip so you could easily open them in your favorite Zip utility, or they could be renamed .foo just because...
They should be named .jar, though, to make clear that they are Java ARchives.

It's simple: *.java files are sources, *.class files are compiled classes.
What does the JVM use at runtime? *.class files. Why would you put source files inside a library? I don't know; usually sources are distributed as a separate jar, but all in all it is done to allow you to check the library code without decompiling it.

Compare .java file to .class file

In my situation I have many .jar files being created from a build process. Before I do any debugging I want a way to quickly verify that my .java source matches the .class found in a .jar.
I figure that if I unzip the .jar and find the .class which matches my .java file then I should be able to determine if they're functionally the same.
How can I do this?

The first thing to realize is that compilation doesn't just use the specific .java file for the class being compiled. The compiler also uses information from the other .java and .class files available at compile time. For example, it may inline static final constants. Also, stuff like method overloading depends on which methods have been defined.
That being said, if you compile the same source file with the same compiler as before, you'll probably get the same, or a very similar class file. However, even with identical source files, different compilers (javac vs eclipse) and different versions of the compiler will produce different results.
Therefore, what I'd recommend is to first try compiling everything and see if the class files match. If the class files don't match, try disassembling them with the Krakatau disassembler and do a diff on the disassemblies to see what the differences are. That will help you see if the difference is unimportant (such as a reordering of the constant pool) or if there are substantive changes to the bytecode.
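The first step above (recompile, then check whether the class files match) could be sketched roughly as follows. This is only an illustration, not a tool anyone in this thread uses: the class name ClassFileCompare, its helper methods and the three command-line arguments are all hypothetical. It compares raw bytes via SHA-256, so a mismatch only tells you that the bytes differ, not that the behavior does.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class ClassFileCompare {

    // Hex-encoded SHA-256 digest of the given bytes.
    public static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // Read one entry (e.g. "com/example/Person.class") out of a JAR on disk.
    public static byte[] readJarEntry(Path jarPath, String entryName) throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            JarEntry entry = jar.getJarEntry(entryName);
            if (entry == null) {
                throw new IOException("No entry " + entryName + " in " + jarPath);
            }
            try (InputStream in = jar.getInputStream(entry)) {
                return in.readAllBytes(); // requires Java 9 or newer
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: ClassFileCompare <jar> <entry-name> <fresh-class-file>");
            return;
        }
        byte[] fromJar = readJarEntry(Paths.get(args[0]), args[1]);
        byte[] fresh = Files.readAllBytes(Paths.get(args[2]));
        boolean same = sha256Hex(fromJar).equals(sha256Hex(fresh));
        System.out.println(same ? "identical bytes" : "bytes differ");
    }
}
```

If the bytes differ, that is the point at which disassembling both files and diffing the output becomes useful.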

You can use a Java decompiler such as http://jd.benow.ca/ to view the source corresponding to your class file, and then compare it with your .java file.

Maybe using a decompiler would be enough for you? For example, the one built into the IntelliJ IDE lets you see the source for your compiled class. You can even debug through the decompiled source.

Where does the actual java code reside if I export my jar without the source file?

I am familiar with the jar structure: it has .class files in the classes directory as well as a META-INF directory containing the information that points to main().
But where does the actual Java code reside in a jar?
Does it reside there as compiled bytecode? But don't different machines have different compilers?
I know that I can extract the Java code using a decompiler, which might be illegal. But I am not interested in doing so. I am more interested in understanding how the code is stored.
Is it encrypted? If so, what is the encryption algorithm? What is its location inside the jar?

Unless you specify otherwise, the source code is not included in the JAR file. The JAR file normally contains only class files (compiled JVM instructions), not source code.

A JAR file is just a ZIP file, renamed to mean Java ARchive.
You can check what's inside by unzipping it. If you're on an OS that doesn't let you decompress the archive because it doesn't appear to be a compressed file, just change the extension to ZIP.
JAR files are not encrypted.
Java sources are compiled into platform-neutral Java bytecode, which is a kind of intermediate binary.
Once the JVM loads the classes, it either interprets the bytecode or just-in-time compiles it to native code for the underlying machine. JARs usually contain only that bytecode.
Usually sources are not included in the JARs, especially for distribution. Some projects deliver sources as well in a separate archive. You should check with the provider of the JAR you're dealing with to get sources.
Whether decompiling is legal depends on the terms of the license applied to the JAR. You should check those.
Decompiling a class file is not an easy task, but one decompiler, JAD, used to do a very good job of it.
Unfortunately it's no longer maintained, but there are some websites where you can still download it.
Decompiled classes will not look exactly like the real sources, and you may have to make some changes, but you'll definitely get an idea of the source.
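As a small illustration that class files are stored in a plain, documented binary format rather than an encrypted one: every valid class file starts with the four bytes 0xCAFEBABE. The sketch below reads those bytes from java/lang/Object.class, which ships with the JDK. The class name ClassFileMagic and the method readMagic are made up for this example.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassFileMagic {

    // Read the first four bytes of a class file found via the system class loader.
    // resourceName uses slashes, e.g. "java/lang/Object.class".
    public static int readMagic(String resourceName) throws IOException {
        try (InputStream raw = ClassLoader.getSystemResourceAsStream(resourceName)) {
            if (raw == null) {
                throw new IOException("Resource not found: " + resourceName);
            }
            // Class files are big-endian, which is what DataInputStream reads.
            return new DataInputStream(raw).readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        int magic = readMagic("java/lang/Object.class");
        System.out.printf("magic = 0x%08X%n", magic);
    }
}
```

The same check works on any .class file you extract from a JAR; everything after the magic number (version, constant pool, fields, methods) is laid out as described in the class file format specification.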

How does Java know the methods of an external jar?

What I don't get is how Java knows the methods of a jar that is referenced. If it is compiled just for running and you can't read it, I don't see how you can still see the methods. For example, suppose you made a jar that makes a box show up on the screen using a method called "ShowABox", and you add it to another Java project. How does the IDE know that a method called "ShowABox" exists, since the jar was already compiled? You can't read class files in an IDE, so why can it read the methods?

All the information you are referring to is actually stored in the class files precisely for this reason.
As to seeing the code in class files, you can certainly do so, and it will also prove that the information was kept. Have a look at Java Decompiler. Note that you can even plug it into Eclipse if you want to see the code directly there.

Compiled classes contain bytecode. Methods still have their real names, but their code is compiled to JVM instructions.
You can read the Java class file format specification on the wiki; see the paragraph "The constant pool". Method names (like other class information) are stored in the constant pool.
Just try opening a .class file in a text editor and you will find the method names there. (.class files are often in the project/bin folder, or you can open the .jar as an archive and get a .class file from there.)
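Another way to see that the names survive compilation is reflection, which works purely from the loaded class file and needs no source. Here is a minimal sketch; the class name MethodLister and the method publicMethodNames are invented for this example, and java.util.ArrayList stands in for "some class that came from a library".

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

public class MethodLister {

    // Collect the names of the public methods declared directly by the given class.
    // This information comes from the compiled class, not from any .java file.
    public static Set<String> publicMethodNames(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Method method : type.getDeclaredMethods()) {
            if (Modifier.isPublic(method.getModifiers())) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // ArrayList is only available to this program in compiled form.
        Class<?> type = Class.forName("java.util.ArrayList");
        publicMethodNames(type).forEach(System.out::println);
    }
}
```

An IDE or compiler does something comparable when it offers completion for a library: it reads the class files on the classpath and extracts names, parameter types and return types from them.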

A JAR is nothing more than all the class files zipped in a single file with a manifest attached. Each class file completely describes its public interface.

JAR-files have a very specific format — see http://en.wikipedia.org/wiki/JAR_(file_format) — and they contain class-files, which also have a very specific format — see http://en.wikipedia.org/wiki/Java_class_file. This format, in addition to providing the Java Virtual Machine with the information it needs to execute code, also provides IDEs and compilers with the information they need to find classes, interfaces, fields, methods, and so on.

A jar is nothing but an archive containing Java compiled .class binaries compressed for compactness into a single file. Its contents are compiled binaries organized in a directory structure. So you can think of it as a directory with files but compressed into a single archive (just like a zip file). A jar itself is not a binary ("exists since the jar was already compiled") -- it doesn't get compiled itself but it rather contains compiled elements.

A little confusion about .jar files

In computer science I have learned that .jar files are basically a compressed set of .java files that have been compiled. So, when you have a project, instead of those 20 .java files you can have a pile of compressed classes (a .jar). Last year in CSI we worked with a .jar file called DanceStudio, which we had to use to make feet walk across the floor. This year, we are working with a different program to better understand Java, so I unzipped the .jar file, which contained 26 classes, and then decompiled them. I wanted to try to create a program by compiling all of the .java files together with the others necessary to make the program run (Walker, Foot, etc.). When I try to compile all of these files, it says that I have duplicate files (Walker, Foot, etc.). What I don't understand is why this would compile before if the .jar file was basically the same thing, just in a compressed form. What also confuses me is that the Foot, etc. files in the .jar are actually more complicated and have more code.
Could someone please explain how the .jar file actually works and separates these files apart, and how it could run with a duplicate class that isn't in the .jar file?

First of all, you're missing one step in your explanation of a .jar file.
A .jar file is a collection of .class files. And .class files are what is produced by compiling a .java file.
Usually a single .java file will produce a single .class file, because it will contain a single type definition. But there are several ways for a .java file to produce more than one .class files (inner/nested classes, anonymous classes, top-level non-public classes, ...), so it's not necessarily a 1-to-1 association between .java files and .class files.
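A small sketch of this: compiling the single source file below with javac should produce four .class files, one for each type in it. All names here (ClassFileNames, Inner, Helper) are invented for the example. The program prints the binary names of the types, which correspond to the class file names.

```java
public class ClassFileNames {

    // A nested class: compiled into its own class file (ClassFileNames$Inner.class).
    static class Inner {
    }

    // An anonymous class: also gets its own class file, named with a number.
    static final Runnable ANONYMOUS = new Runnable() {
        @Override
        public void run() {
            // intentionally empty
        }
    };

    public static String nestedName() {
        return Inner.class.getName();
    }

    public static String anonymousName() {
        return ANONYMOUS.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(ClassFileNames.class.getName());
        System.out.println(nestedName());
        System.out.println(anonymousName());
        System.out.println(Helper.class.getName());
    }
}

// A second, non-public top-level class in the same source file (Helper.class).
class Helper {
}
```

This is also why a decompiled JAR can contain more files than the original project had source files.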
Then there's the confusion about why the decompiled Java source code looks more complicated than the original Java source. This one is easy to answer: the compilation step was not designed to be reversible.
When the Java compiler turns .java files into .class files, it produces a format that is best suited for being executed. That format will not represent the exact same concepts that the Java source file does. For example, there's no classical "if" in Java bytecode. It is implemented by appropriate jump instructions.
All of this means that the process of converting .class files back to .java files is complicated and usually non-perfect.

You generally compile your (clear text) .java source files into (binary) .class files.
If you use packages, then the class files will be in different subdirectories (representing the package).
A .jar file is a compressed binary file that puts all the .class files in the right directories in one compact, easy-to-manage file.
.jar files can also contain other files, such as manifests, bitmaps and resources.
.jar files can also be "signed" to ensure the integrity/authenticity of their contents.
Here are some good links:
http://en.wikipedia.org/wiki/JAR_%28file_format%29
http://download.oracle.com/javase/tutorial/deployment/jar/
Hope that helps

About your duplicates: maybe your .jar is still on your build path, so when you try to compile your project with the decompiled classes, each class exists twice. Check the build path and remove the .jar if it's still there.
