Typical build process of Java application - java

Being new to Java, I am not able to understand the complete build process from source code to hardware-specific binaries. I am using Eclipse for Java and would like to know which conversions take place from source code (.java) to binary, and which files are linked using a linker, preprocessor, etc.
I would appreciate it if you could point me to a link giving details of the complete build process for Java. I have already searched this forum but did not find detailed information.
Thanks
Edited:
So to be more precise, I am looking for the Java equivalent of the following build process in C:
I googled a lot but with no gain! A figure like the following is not a must (though preferred), but if you can write the 'n' sequential/parallel steps involved in the complete Java build process, that will be really appreciated. That said, much of the information provided by @Tom Anderson is very useful to me.

The first thing to appreciate is that your question contains a mistaken assumption. You ask about "the complete build process from source code to hardware specific binaries" - but the normal Java build process never produces architecture-specific binaries. It gets as far as architecture-independent bytecode, then stops. It's certainly true that in most cases, that bytecode will be translated into native code to be executed, but that step happens at runtime, inside the JVM, entirely in memory, and does not involve the production of binary files - it is not part of the build process.
There are exceptions to this - compilers such as GCJ can produce native binaries, but that is rarely done.
So, the only substantial step that occurs as part of a build process is compilation. The compiler reads in source code, does the usual parsing and resolution steps, and emits bytecode. That process is not in any way specified; as is usual, the language specification defines what the elements of the language are, and what they mean, but not how to compile them. What is specified is the format of the output: the bytecode is packaged in the form of class files, one per class, which in turn may be grouped together in jar files for ease of distribution.
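To make the compilation step concrete, here is a minimal sketch that drives the compiler through the standard javax.tools API and then checks that a class file came out. The names CompileStepDemo, compileHello, magic and Hello are made up for illustration, and it assumes you run it on a JDK (on a plain JRE there is no system compiler).

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch: all class and method names here are hypothetical.
class CompileStepDemo {

    /** Writes Hello.java into a fresh temp directory, compiles it, and returns that directory. */
    static Path compileHello() {
        try {
            Path dir = Files.createTempDirectory("build-demo");
            Path source = dir.resolve("Hello.java");
            Files.write(source,
                    "public class Hello { public static int answer() { return 42; } }"
                            .getBytes(StandardCharsets.UTF_8));

            // Returns null when no compiler is available (for example on a JRE).
            JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
            if (javac == null) {
                throw new IllegalStateException("No system Java compiler; run this on a JDK");
            }
            // Same effect as the command line: javac -d <dir> <dir>/Hello.java
            int exit = javac.run(null, null, null, "-d", dir.toString(), source.toString());
            if (exit != 0) {
                throw new IllegalStateException("Compilation failed with exit code " + exit);
            }
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the first four bytes of a class file as upper-case hex. */
    static String magic(Path classFile) {
        try {
            byte[] b = Files.readAllBytes(classFile);
            return String.format("%02X%02X%02X%02X",
                    b[0] & 0xFF, b[1] & 0xFF, b[2] & 0xFF, b[3] & 0xFF);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path classFile = compileHello().resolve("Hello.class");
        System.out.println("class file exists: " + Files.exists(classFile));
        // Every class file starts with the magic number CAFEBABE.
        System.out.println("magic: " + magic(classFile));
    }
}
```

The only output of the step is Hello.class, which is the same on every platform; nothing architecture-specific is written to disk.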
When the class files come to be executed, there are then further steps needed before execution is possible. These are quite well-specified in the chapter on loading, linking, and initializing in the JVM specification. But, as I said, these are not really part of the build process.
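If you want to see that loading and initialization really are separate runtime steps, a small sketch like the following can help. LoadingDemo, Lazy and lazyInitialized are names invented for this example; the only library call that matters is the three-argument Class.forName, which lets you load a class without initializing it.

```java
// Illustrative sketch: LoadingDemo, Lazy and lazyInitialized are hypothetical names.
class LoadingDemo {

    /** Set by Lazy's static initializer, so we can observe when initialization happens. */
    static boolean lazyInitialized = false;

    static class Lazy {
        static {
            lazyInitialized = true;
        }

        static int value() {
            return 7;
        }
    }

    /** Returns {initialized after loading only, initialized after first use}. */
    static boolean[] observe() {
        try {
            String name = LoadingDemo.class.getName() + "$Lazy";
            // initialize=false: the class is loaded, but its static initializer does not run.
            Class.forName(name, false, LoadingDemo.class.getClassLoader());
            boolean afterLoad = lazyInitialized;

            // Invoking a static method is a first active use and triggers initialization.
            Lazy.value();
            boolean afterUse = lazyInitialized;
            return new boolean[] {afterLoad, afterUse};
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = observe();
        System.out.println("initialized after loading:   " + result[0]);
        System.out.println("initialized after first use: " + result[1]);
    }
}
```

On a fresh JVM this should report false after loading and true after the first use, which is the behaviour the specification describes for initialization.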
There are a few other steps that may occur in a build process, usually prior to compilation: dependencies might be resolved and downloaded, resources might be copied and converted between character sets, and code might be generated. But none of this is standard, it's all stuff that's added on to the core process of compilation by various build tools.

There are some cool articles you can check out if you want to know what's going on "behind the scenes".
http://www.codeproject.com/Articles/30422/How-the-Java-Virtual-Machine-JVM-Works
This is one of them, it has a good explanation on how all the parts interact to run your code.
The main idea is that the bytecode is created from your Java files to run in a Virtual Machine, making your Java code (more or less...) independent of the OS and platform you're running it on.
The JVM, specific to that environment, is then responsible for translating that bytecode into actual instructions for the specific architecture you're running your code on.
This is the basis of the "Write once, run everywhere" mantra that Java has. Although the mantra doesn't always hold... it's still true in general.

Related

What are the relationships between the notions: binary, interpretation, execution, compilation?

I know very well that these are basic notions, but I would like to note that nowadays we can be 'web developers' for years without understanding such mandatory notions (due to ready-made tools like Xampp, Wordpress...). I am giving three situations (among tens) where I encounter these concepts without fully understanding them.
1. According to Wikipedia:
a binary data is a data in the binary form (bits and bytes) that
cannot be interpreted.
But what is interpretation??
2. I also once heard that PHP is:
a scripting language, not compiled but interpreted. It does not
require any platform to be run.
Unlike Java or C#, you just get PHP binary, and run your script.
3. What about 'binary distribution' and 'compilation' as invoked in Apache HTTP server official documentation:
This documentation assumes that you are installing a binary distribution of
Apache. If you want to compile Apache yourself (possibly to help with
development or tracking down bugs), see Compiling Apache for Microsoft
Windows.
Could someone please give the confused people of the community "once-and-for-all" definitions with examples? That would be invaluable.
My understanding is as follows:
When used as a noun, binary refers to a compiled executable file - this is a file containing machine instructions in non-human readable form, which has previously been compiled, and can be run as an application.
compilation is the process of converting human-readable source code into a binary file, so that it can be executed.
execution is the process of running a program.
interpretation is the process of executing non-compiled code. In some programming languages the human-readable source code is executed directly, without first compiling it into binary machine code.

What is sjavac, who is it for and how do I use it?

There has been some buzz about a tool called sjavac on the OpenJDK mailing lists. Also, there are two related JEPs: JEP 139: Enhance javac to Improve Build Speed and JEP 199: Smart Java Compilation, Phase Two.
My questions are:
What exactly is the sjavac tool?
Who is it intended for?
How do I use it?
Disclaimer: Self answered question. Just wanted to bring the knowledge of this tool to the StackOverflow community and to create a reference to future sjavac FAQ.
What exactly is the sjavac tool?
The sjavac tool is an (allegedly smart) wrapper around javac, developed at Oracle and intended to provide the following features:
incremental compiles - recompile only what's necessary
parallel compilation - utilize more than one core during compilation
keep compiler in a hot VM - reuse a JIT'ed javac instance for consecutive invocations
When recompiling a set of source files, javac looks at the timestamps of the .java and .class files to determine what to keep and what to recompile. This is incredibly crude and can be devastating for large code bases. In addition to the timestamps sjavac inspects the public API of the dependencies to judge which files need to be recompiled.
Sjavac also attempts to split up the compilation into multiple invocations of javac. In other words, it brings a high level of parallelism to the build process.
Finally, the sjavac tool is split in a client part and a server part which allows you to leave sjavac running in the background, JIT'ed and ready for use in consecutive calls.
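To make the timestamp comparison concrete, here is a minimal sketch of a timestamp-only staleness check. StaleSourceFinder and its methods are names made up for this example; this is not how javac or sjavac is actually implemented, and it deliberately leaves out the public API comparison that sjavac adds on top.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative sketch only: not taken from javac or sjavac.
class StaleSourceFinder {

    /** True when the class file is missing or older than its source file. */
    static boolean isStale(Path source, Path classFile) {
        try {
            if (!Files.exists(classFile)) {
                return true;
            }
            FileTime sourceTime = Files.getLastModifiedTime(source);
            FileTime classTime = Files.getLastModifiedTime(classFile);
            return sourceTime.compareTo(classTime) > 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Names of the .java files directly inside srcDir whose .class file in outDir is stale. */
    static List<String> staleSources(Path srcDir, Path outDir) {
        try (Stream<Path> files = Files.list(srcDir)) {
            return files
                    .map(p -> p.getFileName().toString())
                    .filter(name -> name.endsWith(".java"))
                    .filter(name -> isStale(srcDir.resolve(name),
                            outDir.resolve(name.substring(0, name.length() - 5) + ".class")))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates an empty file with a fixed modification time (milliseconds since the epoch). */
    static void touch(Path file, long millis) throws IOException {
        Files.write(file, new byte[0]);
        Files.setLastModifiedTime(file, FileTime.fromMillis(millis));
    }

    /** Builds a throwaway directory with one fresh, one outdated and one missing class file. */
    static List<String> demo() {
        try {
            long base = 1_600_000_000_000L;
            Path dir = Files.createTempDirectory("stale-demo");
            touch(dir.resolve("A.java"), base);
            touch(dir.resolve("A.class"), base + 10_000);  // newer than its source: up to date
            touch(dir.resolve("B.java"), base + 20_000);
            touch(dir.resolve("B.class"), base + 10_000);  // older than its source: stale
            touch(dir.resolve("C.java"), base);            // no class file at all: stale
            return staleSources(dir, dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [B.java, C.java]
    }
}
```

Note what this check cannot see: if B.java changes its public API, classes that depend on B are still reported as up to date. That is the gap the API inspection is meant to close.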
Who is it intended for?
People who are working on large projects and frequently recompile the code base during development are encouraged to try out sjavac. (Be aware however that the tool is currently under development and there are still open issues.)
How do I use it?
The tool is not yet shipped with the OpenJDK, so you'll have to get it from the OpenJDK jdk9/dev repository. Also, there is no launcher in place yet, so you invoke it with java com.sun.tools.sjavac.Main.

How to compile Java files that put .class files directly into JAR

First some reference:
1st Link
2nd link
The first article (1st Link) mentions compiling the Java files directly into JAR files, avoiding one step in the build process. Does anyone know how to do this?
-Vadiraj
As you linked to my blog post I thought it was only fair to give you an update.
Compiling directly to a Jar is actually fairly simple to do. Basically you extend
javax.tools.ForwardingJavaFileObject
Then override openOutputStream method and direct it to your Jar. As the Java Compiler is highly concurrent but writing to a jar file is highly sequential I'd recommend that you buffer to byte arrays and then have a background thread that writes the byte arrays as they arrive.
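A minimal sketch of that idea, using only the standard javax.tools and java.util.jar APIs, could look like the following. All names here (CompileToJar, StringSource, BufferingManager, Answer) are made up for illustration and are not taken from JCompilo. For brevity it uses SimpleJavaFileObject instead of ForwardingJavaFileObject, and it writes the jar sequentially once compilation has finished rather than from a background thread; the openOutputStream override that buffers to a byte array is the same idea. It assumes a JDK.

```java
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Illustrative sketch: every class and method name below is hypothetical.
class CompileToJar {

    /** In-memory source file handed to the compiler. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** File manager that buffers each generated class in memory instead of writing a .class file. */
    static final class BufferingManager extends ForwardingJavaFileManager<StandardJavaFileManager> {
        final Map<String, byte[]> classes = new ConcurrentHashMap<>();

        BufferingManager(StandardJavaFileManager delegate) {
            super(delegate);
        }

        @Override
        public JavaFileObject getJavaFileForOutput(Location location, String className,
                                                   JavaFileObject.Kind kind, FileObject sibling) {
            URI uri = URI.create("mem:///" + className.replace('.', '/') + kind.extension);
            return new SimpleJavaFileObject(uri, kind) {
                @Override
                public OutputStream openOutputStream() {
                    return new ByteArrayOutputStream() {
                        @Override
                        public void close() throws IOException {
                            super.close();
                            classes.put(className, toByteArray()); // keep the bytes for the jar
                        }
                    };
                }
            };
        }
    }

    /** Compiles one class from source text and writes the result straight into jarPath. */
    static void compileToJar(String className, String code, Path jarPath) throws IOException {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("No system Java compiler; run this on a JDK");
        }
        try (BufferingManager manager =
                     new BufferingManager(javac.getStandardFileManager(null, null, null))) {
            boolean ok = javac.getTask(null, manager, null, null, null,
                    Collections.singletonList(new StringSource(className, code))).call();
            if (!ok) {
                throw new IllegalStateException("Compilation failed");
            }
            try (JarOutputStream jar = new JarOutputStream(Files.newOutputStream(jarPath))) {
                for (Map.Entry<String, byte[]> entry : manager.classes.entrySet()) {
                    jar.putNextEntry(new JarEntry(entry.getKey().replace('.', '/') + ".class"));
                    jar.write(entry.getValue());
                    jar.closeEntry();
                }
            }
        }
    }

    /** Compiles a tiny class into a temp jar, loads it back, and returns what it computes. */
    static int demo() {
        try {
            Path jar = Files.createTempFile("direct", ".jar");
            compileToJar("Answer",
                    "public class Answer { public static int get() { return 42; } }", jar);
            try (URLClassLoader loader = new URLClassLoader(new URL[] {jar.toUri().toURL()})) {
                return (Integer) loader.loadClass("Answer").getMethod("get").invoke(null);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 42
    }
}
```

No .class file ever touches the disk here; the only artifact is the jar. Replacing the final loop with a queue drained by a writer thread would get you the background-writing variant described above.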
I do exactly this in my experimental build tool JCompilo https://code.google.com/p/jcompilo/
This tool is completely undocumented but crazy fast. Currently it's about 20-80% faster than any other Java build tool and about 10x faster than the Scala compiler for the same code.
As the author is talking about extending the compiler itself, it is possible that he has knowledge of the built-in capabilities of the compiler (that is what the compiler is capable of, maybe with a little encouragement by tweaking the code).
Right now I’m investigating extending the Java 6 compiler to remove the unneeded file exists checks and possible jaring the class files directly in the compiler. [emphasis mine]
That capability, however, is certainly not supported officially (no documentation exists about it on the javac webpage).
At best, the feature is compiler dependent; possibly requiring modification of the compiler's source code.

Creating a JVM from C

How does one start a Java VM from C? Writing the C code seems to be straightforward -- I've been following the code that appears on p. 84 of Liang's "The Java Native Interface". It's the linking process that has me stymied. Liang's book is 10+ years out of date in that regard and I can't find anything on the net which addresses this goal (and which works).
To be clear, what I want to do is launch a standard Windows program (written in C), which then launches the JVM and calls a main() in a Java class (which I have written). This program should not rely on the presence of jvm.dll or jvm.lib -- the user shouldn't have to install Java to run the program. Maybe this isn't possible without an unreasonable amount of effort.
The development environment is MinGW under windows. I'm able to link in such a way that the program works when the .dll is in a separate file, but not in a way so that there's only a single executable without any .dlls or .libs.
In hindsight, I can see now that this was a dumb question, or at least one that hadn't been thought through. The moral of the story is that the "JVM" is not a single executable, or even an executable plus some JAR files; the JVM relies on a slew of independently stored files with various mutual dependencies. Unraveling all of these relationships so that they could be brought into a single file (or even two files) would be a massive undertaking. Thanks for the knock in the head.
So, to be clear - you want to launch a JVM without the requirement of a JVM being present? How do you propose to accomplish that? Unless you're contemplating writing your own JVM implementation (which I'd say falls under the category "unreasonable amount of effort"), having a JVM installed is a reasonable requirement. Assuming that, you can just spawn a java process and include the appropriate command-line parameters (classpath, class to run etc).
Disclaimer: I don't think that having a Java runtime installed is unreasonable for users. That said, I do understand your motivations for a low-friction install for users.
Using the Sun JRE is probably not going to be fruitful here. In theory, you could grab the Sun JRE, modify it to build as a static library instead of a DLL and figure out a way to cram all the resources that get bundled with it (fonts, images, cursors, SSL certificates, localized message files, etc.) into a single resource and then modify the runtime to load from there. But this is almost certainly an "unreasonable amount of effort."
You might want to look at GCJ instead: its architecture is different than the Sun JRE which lends itself more to being embedded in another application, or it can compile your Java to native machine code.
(Also, do check the licensing to ensure that you can properly redistribute this no matter which route you take.)

Difference between C++ and Java compilation process [duplicate]

Possible Duplicate:
Why does C++ compilation take so long?
Hi,
I searched Google for the differences between the C++ and Java compilation processes, but only results about C++ and Java language features and their differences are returned.
I am proficient in Java, but not in C++, although I have fixed a few bugs in C++. From my experience, I noticed that C++ always takes more time to build than Java for minor changes.
Regards
Bala
There are a few high-level differences that come to my mind. Some of those are generalizations and should be prefixed with "Often ..." or "Some compilers ...", but for the sake of readability I'll leave that out.
C/C++ compilation doesn't read any information from binary files, but reads method/type definitions only from header files that need to be parsed in full (exception: precompiled headers)
C/C++ compilation includes a pre-processor step that can do a wide array of text-replacement (which makes header pre-compilation harder to do)
The C++ syntax is a lot more complex than the Java syntax
The C++ type system is a lot more complex than the Java type system
C++ compilation usually produces native assembler code, which is a lot more complex to produce than the relatively simple byte code
C++ compilers need to do optimizations because there isn't any other thing that will do them. The Java compiler pretty much does a simple 1:1 translation of Java source code to Java byte code, no optimizations are done at that step (that's left for the JVM to do).
C++ has a template language that's Turing complete! (so strictly speaking C++ code needs to be run to produce executable code and a C++ compiler would need to solve the halting problem to tell you if arbitrary C++ code is compilable).
Java compiles code into bytecode, which is interpreted by the Java VM. C++ must compile into object code, which is then linked into a machine-code executable. Because of this, it's possible for Java to recompile only a single class for minor changes, while C++ object files must be re-linked with other object files into a machine-code executable (or DLL). This may make the process take a bit longer.
I am not sure why you expect the compilation speed of Java and C++ to be comparable since they are different languages with completely different design goals and implementations.
That said a few specific differences to keep in mind are:
Java is compiled to byte code and not right down to machine code. Compiling to this abstract virtual machine is simpler.
C++ compilation involves not only compilation but also linking. So it is typically a multi step process.
Java performs some late binding; that is, the association between a call to a function and the actual code to run is made at runtime. So a small change in one area need not trigger a compile of the whole program. In C++ this association needs to be made at compile time; this is called early binding.
A C++ program using all the language's features is inherently more difficult to compile. A few template invocations with a number of types can easily double or triple the amount of code to generate.
Glossing over a lot of details, in Java you compile .java files into one or more .class files. In C++ you compile .cc (or whatever) source files into .o files, and then link the .o files together into an executable or library. The linking process is usually what kills you, especially for minor changes as the amount of work for linking is roughly proportional to the size of your entire project. (this is ignoring incremental linkers, which are specifically designed to not behave as badly for small changes)
Another factor is that the #include mechanism means that whenever you change a .h file, all of the .o files that depend on it need to be rebuilt. In Java, a .class file can depend on more than one .java file (eg: because of constant inlining), but there tend to be far fewer of these "hot spots" where changing one source file requires many other source files to be rebuilt.
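The constant inlining case is easy to reproduce with a small experiment. The sketch below compiles two classes, changes the constant, recompiles only the class that declares it, and then compares what each class sees. Limits, LimitReader and ConstantInliningDemo are names invented for this example, and it assumes a JDK so that the javax.tools compiler is available.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: Limits and LimitReader are hypothetical classes generated on the fly.
class ConstantInliningDemo {

    static void write(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    /** Compiles the named source files in dir, writing class files back into dir. */
    static void compile(Path dir, String... fileNames) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("No system Java compiler; run this on a JDK");
        }
        List<String> args = new ArrayList<>(
                Arrays.asList("-d", dir.toString(), "-cp", dir.toString()));
        for (String name : fileNames) {
            args.add(dir.resolve(name).toString());
        }
        if (javac.run(null, null, null, args.toArray(new String[0])) != 0) {
            throw new IllegalStateException("Compilation failed");
        }
    }

    /** Returns {value LimitReader.read() reports, actual value of Limits.MAX}. */
    static int[] demo() {
        try {
            Path dir = Files.createTempDirectory("inline-demo");
            write(dir.resolve("Limits.java"),
                    "public class Limits { public static final int MAX = 1; }");
            write(dir.resolve("LimitReader.java"),
                    "public class LimitReader { public static int read() { return Limits.MAX; } }");
            compile(dir, "Limits.java", "LimitReader.java");

            // Change the constant and rebuild only the class that declares it.
            write(dir.resolve("Limits.java"),
                    "public class Limits { public static final int MAX = 2; }");
            compile(dir, "Limits.java");

            try (URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()})) {
                int seenByReader =
                        (Integer) loader.loadClass("LimitReader").getMethod("read").invoke(null);
                int actual = loader.loadClass("Limits").getField("MAX").getInt(null);
                return new int[] {seenByReader, actual};
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("LimitReader sees " + result[0] + ", Limits.MAX is " + result[1]);
    }
}
```

LimitReader keeps reporting the old value because the compiler copied the constant into LimitReader.class, so that class file depends on Limits.java as well as on its own source. A build tool has to know about this kind of dependency to recompile correctly.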
Also, if you're using an IDE like Eclipse it's building your Java code in the background all the time, so by the time you tell it to build it's already mostly (if not completely) done.
Java compiles any source code into bytecode, which is interpreted by the JVM. Because of this, it can be used on multiple platforms.
