I'm trying to understand how a Java program interacts with the compiler.
Let's assume we write a simple Java program in a plain text file. At the core level it's stored as a bit pattern on disk.
The Java compiler is a separate entity, which is also just some bit pattern.
That pattern can consume something it understands: it consumes the Java bit pattern (the so-called Java program) and produces instructions to be processed by the processor.
Where does this process happen, in memory or in the processor? I mean the process where the Java compiler consumes the Java source and produces instructions the processor can understand.
My understanding says memory is just for loading what we are able to see on screen, coming from disk or the processor. The Java program and the compiler code both exist and should be loaded into memory to go further.
Then how, and in what sequence, are the processor's instructions created? Where do they interact, and how?
Can anyone please help me understand this? I'm very curious to know. Any book or reference will also work.
It's pretty much the same as any compiler system. The compiler reads source text from a file (.java, in this case), and writes equivalent instructions to another file (.class, in this case).
The execution of the compiler proceeds like the execution of any computer program: the processor executes instructions that come from memory, and those instructions read and write memory. It does not seem appropriate to call this "in memory" or "in the processor" - it's "in the computer" as a whole.
More detail:
The compiler is, when running, instructions loaded in memory being executed by the processor. The compiler will execute instructions to read the .java files from disk into memory so that it, the compiler, can translate the Java code. The compiler will execute instructions to write the compiled code out to disk (as .class files).
The compiler and Java program are not running at the same time. The compiler translates the Java program entirely before that program (the 'application program') is run. First compile .java to .class files; then execute the .class files.
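To make that two-step sequence concrete, here is a minimal sketch (the file and class name Hello are made up for the example): save the source, compile it once, then run the resulting class file.

```java
// Save as Hello.java, then:
//   javac Hello.java   ->  produces Hello.class (bytecode written to disk)
//   java Hello         ->  the JVM loads Hello.class into memory and executes it
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello, world");
    }
}
```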
The application program is, when running, instructions loaded in memory being executed by the processor. (This is a simplification for this explanation, avoiding the existence of a thing called the JVM. It does not materially change the overall idea).
This isn't so much about Java as it is about how computers work in general. Data have to be accessible to the processor, and that means that the data need to be in memory. The processor's instructions, i.e., an executing program, can't manipulate data any other way (to an approximation; there are exceptions that are not applicable at the level of explanation we're going for here).
Related
I have been trying to figure out the exact working of an interpreter. I have googled around and come up with some conclusions, and I just want them corrected by someone who can give me a better understanding of how an interpreter works.
So what I have understood is:
An interpreter is a software program that converts code from high level language to machine format.
Speaking specifically about the Java interpreter, it gets code in binary format (which was earlier translated by the Java compiler from source code to bytecode).
Now the platform for a Java interpreter is the JVM, in which it runs, so basically it is going to produce code which can be run by the JVM.
So it takes the bytecode, produces intermediate code and the target machine code, and gives it to the JVM.
The JVM in turn executes that code on the OS platform in which the JVM is implemented or being run.
Now I am still not clear on the sub-process that happens in between, i.e.:
the interpreter produces intermediate code.
the interpreted code is then optimized.
then target code is generated.
and finally executed.
Some more questions:
So is the interpreter alone responsible for generating the target code, and for executing it?
And does executing mean it gets executed in the JVM or in the underlying OS?
An interpreter is a software program that converts code from high level language to machine format.
No. That's a compiler. An interpreter is a computer program that executes the instructions written in a language directly. This is different from a compiler, which converts a higher level language into a lower level one. A C compiler goes from C to assembly code, while the assembler (another type of compiler) translates from assembly to machine code -- modern C compilers do both steps to go from C to machine code.
In Java, the Java compiler does code verification and converts from Java source to byte-code class files. It also does a number of small processing tasks such as pre-calculation of constants (if possible), caching of strings, etc.
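As a small illustration of that constant pre-calculation (the class and field names here are made up for the example): javac folds constant expressions at compile time, so the .class file stores the already-computed values rather than the arithmetic or the concatenation.

```java
public class Constants {
    // javac stores 86400 in the class file, not the multiplication
    static final int SECONDS_PER_DAY = 60 * 60 * 24;
    // the two literals are joined at compile time into one interned string
    static final String GREETING = "Hello, " + "world";

    public static void main(String[] args) {
        System.out.println(SECONDS_PER_DAY + " " + GREETING);
    }
}
```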
Now the platform for a Java interpreter is the JVM, in which it runs, so basically it is going to produce code which can be run by the JVM.
The JVM operates on the bytecode directly. The Java interpreter is integrated so closely with the JVM that they shouldn't really be thought of as separate entities. What is also happening is a crap-ton of optimization, where the bytecode is basically optimized on the fly. This makes calling it just an interpreter inadequate. See below.
So it takes the bytecode, produces intermediate code and the target machine code, and gives it to the JVM.
The JVM is doing these translations.
The JVM in turn executes that code on the OS platform in which the JVM is implemented or being run.
I'd rather say that the JVM uses the bytecode, the optimized user code, and the Java libraries (which include Java and native code), in conjunction with OS calls, to execute Java applications.
Now I am still not clear on the sub-process that happens in between, i.e. 1. the interpreter produces intermediate code. 2. the interpreted code is then optimized. 3. then target code is generated. 4. and finally executed.
The Java compiler generates bytecode. When the JVM executes the code, steps 2-4 happen at runtime inside of the JVM. This is very different from C (for example), where these separate steps are run by different utilities. Don't think about them as "sub-processes"; think about them as modules inside of the JVM.
So is the interpreter alone responsible for generating the target code, and for executing it?
Sort of. The JVM's interpreter by definition reads the bytecode and executes it directly. However, in modern JVMs, the interpreter works in tandem with the Just-In-Time compiler (JIT) to generate native code on the fly so that the JVM can have your code execute more efficiently.
In addition, there are post-processing "compilation" stages which analyze the generated code at runtime so that native code can be optimized by inlining often-used code blocks and through other mechanisms. This is the reason why the JVM load spikes so high on startup. Not only is it loading in the jars and class files, but it is in effect doing a cc -O3 on the fly.
And does executing mean it gets executed in the JVM or in the underlying OS?
Although we talk about the JVM executing the code, this is not technically correct. As soon as the byte-code is translated into native code, the execution of the JVM and your java application is done by the CPU and the rest of the hardware architecture.
The Operating System is the substrate that does all of the process and resource management so that programs can share the hardware and execute efficiently. The OS also provides the APIs for applications to easily access the disk, network, memory, and other hardware and resources.
1) An interpreter is a software program that converts code from high level language to machine format.
Incorrect. An interpreter is a program that runs a program expressed in some language that is NOT the computer's native machine code.
There may be a step in this process in which the source language is parsed and translated to an intermediate language, but this is not a fundamental requirement for an interpreter. In the Java case, the bytecode language has been designed so that neither parsing nor a distinct intermediate language is necessary.
2) Speaking specifically about the Java interpreter, it gets code in binary format (which was earlier translated by the Java compiler from source code to bytecode).
Correct. The "binary format" is Java bytecodes.
3) Now the platform for a Java interpreter is the JVM, in which it runs, so basically it is going to produce code which can be run by the JVM.
Incorrect. The bytecode interpreter is part of the JVM. The interpreter doesn't run on the JVM. And the bytecode interpreter doesn't produce anything. It just runs the bytecodes.
4) So it takes the bytecode, produces intermediate code and the target machine code, and gives it to the JVM.
Incorrect.
5) The JVM in turn executes that code on the OS platform in which the JVM is implemented or being run.
Incorrect.
The real story is this:
The JVM has a number of components to it.
One component is the bytecode interpreter. It executes bytecodes pretty much directly [1]. You can think of the interpreter as an emulator for an abstract computer whose instruction set is bytecodes.
A second component is the JIT compiler. This translates bytecodes into the target machine's native machine code so that it can be executed by the target hardware.
[1] A typical bytecode interpreter does some work to map abstract stack frames and object layouts to concrete ones involving target-specific sizes and offsets. But to call this an "intermediate code" is a stretch. The interpreter is really just enhancing the bytecodes.
Giving a 1000-foot view which will hopefully clear things up:
There are 2 main steps to a Java application: compilation and runtime. Each has very different functions and purposes. The main processes for both are outlined below:
Compilation
This is (normally) executed by com.sun.tools.javac, usually found in the tools.jar file, traditionally in your $JAVA_HOME - the same place as java.jar, etc.
The goal here is to translate .java source files into .class files which contain the "recipe" for the java runtime environment.
Compilation steps:
Parsing: the files are read and stripped of their 'boundary' syntax characters, such as curly braces, semicolons, and parentheses. These exist to tell the parser which Java object to translate each source component into (more about this in the next point).
AST creation: the Abstract Syntax Tree is how a source file is represented. This is a literal "tree" data structure, and the root class for this is com.sun.tools.JCTree. The overall idea is that there is a Java object for each Expression and each Statement. At this point in time relatively little is known about the actual "types" each of them represents; the only thing checked at the creation of the AST is literal syntax.
Desugaring: this is where for loops and other syntactic sugar are translated into a simpler form. The language is still in 'tree' form and not bytecode, so this can happen easily.
Type checking/inference: this is where the compiler gets complex. Java is a static language, so the compiler has to go over the AST using the Visitor pattern, figure out the types of everything ahead of time, and make sure that at runtime everything (well, almost everything) will be legal as far as types, method signatures, etc. go. If something is too vague or invalid, compilation fails.
Bytecode generation: control flow is checked to make sure that the program's execution logic is valid (no unreachable statements, etc.). If everything passes the checks without errors, then the AST is translated into the bytecode that the program represents.
.class file writing: at this point, the class files are written. Essentially, the bytecode is a small layer of abstraction on top of specialized machine code. This makes it possible to port to other machines/CPU architectures/platforms without having to worry about the relatively small differences between them.
Runtime
There is a different Runtime Environment/Virtual Machine implementation for each computer platform. The Java APIs are universal, but the runtime environment is an entirely separate piece of software.
The JRE only knows how to translate bytecode from the class files into machine code compatible with the target platform, machine code that is also highly optimized for the respective platform.
There are many different runtime/VM implementations, but the most popular one is the HotSpot VM.
The VM is incredibly complex and optimizes your code at runtime. Startup times are slow but it essentially "learns" as it goes.
This is the 'JIT' (Just-in-time) concept in action - the compiler did all of the heavy lifting by checking for correct types and syntax, and the VM simply translates and optimizes the bytecode to machine code as it goes.
Also...
The Java compiler API was standardized under JSR 199. While not exactly falling under the same thing (I can't find the exact JLS reference), many other languages and tools leverage the standardized compilation process/API in order to use the advanced JVM (runtime) technology that Oracle provides, while allowing for different syntax.
See Scala, Groovy, Kotlin, Jython, JRuby, etc. All of these leverage the Java Runtime Environment because they translate their different syntax to be compatible with the Java compiler API! It's pretty neat - anyone can write a high-performance language with whatever syntax they want because of the decoupling of the two. There are adaptations of almost every single language for the JVM.
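As a small sketch of that standardized API (assuming a JDK rather than a bare JRE, and a Hello.java file in the working directory), javac can be invoked programmatically through javax.tools:

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class CompileDemo {
    public static void main(String[] args) {
        // Returns null when running on a JRE that ships without a compiler
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        // Equivalent to running: javac Hello.java
        int result = compiler.run(null, null, null, "Hello.java");
        System.out.println(result == 0 ? "compiled" : "failed");
    }
}
```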
I'll answer based on my experience creating a DSL.
C is compiled because you pass the source code to gcc and then run the stored program as machine code.
Python is interpreted because you run programs by passing the program source to the interpreter. The interpreter reads the source file and executes it.
Java is a mix of both because you "compile" the Java file to bytecode, then invoke the JVM to run it. Bytecode isn't machine code; it needs to be interpreted by the JVM. Java is in a tier between C and Python because you cannot do fancy things like an "eval" (evaluating chunks of code or expressions at runtime, as in Python). However, Java has reflection abilities that are impossible for a C program to have. In short, the design of the Java runtime, sitting in an intermediate tier between a purely compiled and an interpreted language, gives the best (and the worst) of the two worlds in terms of performance and flexibility.
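For instance, a minimal sketch of the kind of reflection meant here (looking up and calling a method by name at runtime, which a plain C program cannot do without extra tooling):

```java
import java.lang.reflect.Method;

public class ReflectionDemo {
    public static void main(String[] args) throws Exception {
        // Find the method by its name at runtime...
        Method toUpper = String.class.getMethod("toUpperCase");
        // ...and invoke it on an instance
        Object result = toUpper.invoke("hello");
        System.out.println(result); // HELLO
    }
}
```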
However, Python also has a virtual machine and its own bytecode format. The same applies to Perl, Lua, etc. Those interpreters first convert a source file to bytecode, then they interpret the bytecode.
I always wondered why doing this was necessary, until I made my own interpreter for a simulation DSL. My interpreter does lexical analysis (breaking the source into tokens), converts it to an abstract syntax tree, then evaluates the tree by traversing it. For software engineering's sake I'm using some design patterns, and my code heavily uses polymorphism. This is very slow in comparison to processing an efficient bytecode format that mimics a real computer architecture. My simulations would be way faster if I created my own virtual machine or used an existing one. For evaluating a long numeric expression, for instance, it is faster to translate it to something similar to assembly code than to walk a branch of an abstract tree, since the latter requires calling a lot of polymorphic methods.
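A minimal sketch of that tree-walking style (the names Expr, Num and Add are illustrative, not taken from any particular DSL): every node evaluates itself, so each operation costs a polymorphic call.

```java
interface Expr {
    double eval();
}

final class Num implements Expr {
    private final double value;
    Num(double value) { this.value = value; }
    public double eval() { return value; }
}

final class Add implements Expr {
    private final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    public double eval() { return left.eval() + right.eval(); } // one virtual call per '+'
}

public class TreeWalkDemo {
    public static void main(String[] args) {
        Expr expr = new Add(new Num(1), new Add(new Num(2), new Num(3))); // 1 + (2 + 3)
        System.out.println(expr.eval()); // 6.0
    }
}
```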
There are two ways of executing a program.
By way of a compiler: this parses a text in the programming language (say .c) into machine code, on Windows an .exe. This can then be executed independently of the compiler.
This compilation can be done by compiling several .c files to several object files (intermediate products), and then linking them into a single application or library.
By way of an interpreter: this parses a text in the programming language (say .java) and "immediately" executes the program.
With Java the approach is a bit hybrid/stacked: the Java compiler javac compiles .java to .class files, and possibly zips those into a .jar (or .war, .ear ...). The .class files consist of a more abstract byte code, for an abstract stack machine.
Then the Java runtime java (called the JVM, Java virtual machine, or byte code interpreter) can execute a .class/.jar. It is in fact an interpreter of Java byte code.
Nowadays it also translates (parts of) the byte code at run time to machine code. This is also called a just-in-time compiler for byte code to machine code.
In short:
- a compiler just creates code;
- an interpreter immediately executes.
An interpreter will loop over the parsed commands / a high-level intermediate code, and interpret every command with a piece of code. That is indirect and in principle slow.
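To illustrate that loop-and-dispatch idea, here is a toy stack-machine interpreter (the opcodes are invented for the example; this is not how the JVM is implemented, just the general shape):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TinyInterpreter {
    // Invented opcodes, for illustration only
    static final int PUSH = 0, ADD = 1, PRINT = 2, HALT = 3;

    public static void main(String[] args) {
        // "Program": push 2, push 3, add, print, halt
        int[] code = {PUSH, 2, PUSH, 3, ADD, PRINT, HALT};
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter
        loop:
        while (true) {
            switch (code[pc++]) {              // fetch and dispatch one instruction
                case PUSH:  stack.push(code[pc++]); break;
                case ADD:   stack.push(stack.pop() + stack.pop()); break;
                case PRINT: System.out.println(stack.peek()); break; // prints 5
                case HALT:  break loop;
            }
        }
    }
}
```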
I am new to Java and a bit confused about the role of the compiler and the JVM. I have read a couple of resources, and to name a few:
What-are-the-functions-of-Java-Compiler ?
Is the JVM a compiler or an interpreter?
Compiler
As I save the .java file on my system, the computer internally stores it as bytes, 0's and 1's. I understand the compiler validates whether the written Java program conforms to the Java standard or not. If not, it throws errors; otherwise it generates the class file.
My question is: what is the need of generating a .class file? Can't the JVM interpret it directly (from the bytes generated corresponding to the .java file) without needing a .class file? Does the compiler (javac) do any kind of optimization here?
JVM: this question is the other way around. Can't the compiler generate byte/machine code which can be interpreted by the CPU directly? So why is the JVM needed here?
Is the JVM required to interpret the byte code specifically for a platform, for example Windows or Linux?
The compiler generates byte code, which in Java terms means a .class file containing the byte code; this byte code is not the native machine code of any particular OS you are running on.
And the JVM interprets the byte code to run it on a specific OS.
The main point of having these two stages and an intermediate representation ("byte code" in Java) is platform independence.
The Java program, as source code, is platform-independent (to a degree). When you compile it into byte code, the Java compiler basically does (almost) all of the things it can do while still maintaining platform independence:
validates syntax
performs static type checks
translates human-readable source code into machine-readable byte code
does static optimizations
etc.
These are all things that:
maintain platform-independence
only need to be performed once, since they don't rely on any run-time data
take a (possibly long) time, so it would be a waste of time to do them again each time the code is executed
So now you have .class files with byte code, which are still platform-independent and can be distributed to different OS or even hardware platform.
And then the second step is the JVM. The JVM itself is platform-specific, and its task is to translate/compile/interpret the byte code on the target architecture and OS. That means:
translate byte code to the instruction set of the given platform, using the target OS's system calls
run-time optimizations
etc.
What-are-the-functions-of-Java-Compiler ?
'Javac' is the Java compiler; it produces byte-code (.class files), and that code is platform-independent.
Is the JVM a compiler or an interpreter? Ans: an interpreter.
The JVM, the Java Virtual Machine, runs/interprets/translates bytecode into native machine code, and it internally uses the JIT.
The JIT compiles a given bytecode instruction sequence to machine code at runtime before executing it natively, and does all the heavy optimizations.
All of the above complexity exists to make Java a compile-once-run-anywhere, or platform-independent, language. Because of that, the bytecode (javac's output) is platform-independent, but the JVM executing that bytecode is platform-dependent, i.e. we have different JVMs for Windows and Unix.
A JVM, is an implementation of the Java Virtual Machine Specification. It interprets compiled Java binary code (called bytecode) for a computer's processor (or "hardware platform").
JVM is the target machine for byte-code instead of the underlying architecture.
The Java compiler, 'javac', produces byte-code which is platform-independent. This byte-code is, we can say, generic, i.e. it does not include machine-level details which are specific to each platform.
The instructions in this byte-code cannot be directly run by the CPU.
Therefore some other 'program' is needed which can interpret the code, and give the CPU machine level instructions which it can execute. This program is the 'JVM' (Java Virtual Machine) which is platform specific.
That's why you have different JVM for Windows, Linux or Solaris.
I think first we should discuss the difference between an interpreted runtime, a machine-code runtime, and a bytecode runtime.
In an interpreted runtime the (generally human-readable) source code is converted to machine code by the interpreter only at the point when the code is run. Generally this confers advantages such as platform independence (so long as an interpreter exists for your platform) and ease of debugging (the code is right there in front of you), but at the cost of relatively slow execution speed, as you have the overhead of converting the source code into machine code while you run the program.
In a compiled runtime the source code has been compiled into native machine code ahead of time by a dedicated compiler. This gives fast execution speed (because the code is already in the format expected by the processor), but means that the thing you distribute (the compiled binary) is generally tied to a given platform.
A bytecode runtime is sort of a halfway house that aims to give the advantages of both interpretation and compilation. In this case the source code is compiled down into an intermediate format (byte code) ahead of time and then converted into machine code at runtime. The byte code is designed to be machine-friendly rather than human-friendly, which means it's much faster to convert to machine code than in a traditionally interpreted language. Additionally, because the actual conversion to machine code is done at run time, you still get all that nice platform independence.
Note that the choice of whether to interpret or compile is independent of the language used: for example, there is no reason in theory why you could not have a C interpreter or compile Python directly into machine code. Of course, in practice most languages are generally only either compiled or interpreted.
So this brings us back to the question of what the Java compiler does - essentially its main job is to turn all your nice human-readable Java files into Java bytecode (class files) so that the JVM can efficiently execute them.
The JVM's main job, on the other hand, is to take those class files and turn them into machine code at execution time. Of course it also does other stuff (for example it manages your memory and it provides various standard libraries), but from the point of view of your question it's the conversion to machine code that's important!
Java bytecode is an intermediate, compact way of representing a series of operations. The processor can't execute these directly. A processor executes machine instructions; they are the only thing that a processor understands.
The JVM processes the stream of bytecode operations and interprets them into a series of machine instructions for the processor to execute.
My question is: what is the need of generating a .class file? Can't the JVM interpret it directly (from the bytes generated corresponding to the .java file) without needing a .class file? Does the compiler (javac) do any kind of optimization here?
javac generates the .class file, which is an intermediate artifact used to achieve platform independence.
To see what the compiler optimized, simply decompile the bytecode, for instance with javap or JD.
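For example, with a JDK installed you can compile a tiny class and look at its bytecode yourself (the class name Calc is made up; the exact disassembly may vary by compiler version):

```java
// javac Calc.java
// javap -c Calc      -> add() disassembles to roughly: iload_0, iload_1, iadd, ireturn
public class Calc {
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
    }
}
```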
This question is the other way around. Can't the compiler generate byte/machine code which can be interpreted by the CPU directly? So why is the JVM needed here? Is the JVM required to interpret the byte code specifically for a platform, for example Windows or Linux?
The designers of the Java language decided that the language and the compiled code were going to be platform-independent, but since the code eventually has to run on a physical platform, they opted to put all the platform-dependent code in the JVM. Hence, for this reason, we need the JVM to execute code.
To see what the just in time compiler optimized, activate debug logging and read the disassembled machine code.
In Java, only a few things are considered to be atomic (What operations in Java are considered atomic?). For example, i++ consists of 3 different atomic operations: load i into a register, add 1 to that register, write the new value of the register back to i. Something like that.
My question: is it possible to "parse" Java code into a sequence of its atomic representation? So when the input is "i++" I don't want to have i++ as output, I want to have "LOAD I TO REGISTER", "ADD 1 TO THAT REGISTER", "WRITE I BACK FROM REGISTER". Is that possible?
Googling didn't help much on that topic.
In Java you don't have register access or assembly code access. The only way you can realize the above is to use the atomic classes provided in the java.util.concurrent.atomic package. However, that would be more costly than a simple i++ unless you are doing this in a multi-threaded context.
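For example, a minimal sketch using that package: AtomicInteger turns the read-modify-write into a single atomic operation.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounterDemo {
    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(0);
        // One atomic read-modify-write, unlike the three steps hidden in i++
        int updated = counter.incrementAndGet();
        System.out.println(updated); // 1
    }
}
```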
Yes and no. Short answer: it'll be very very hard.
The code will go through several steps before it arrives at the hardware where you could 'parse' the atomic representations.
It will first be compiled into bytecode. This is already closer to machine code, but not yet what you want. You can read it by opening the class file in an editor that can parse bytecode (Eclipse IDE can do this).
When you run the program, the Java Virtual Machine will execute your bytecode. It may, depending on the situation, execute the bytecode like a scripting engine, compile it to machine code, or change the bytecode and then execute it to get better performance. What the Java Virtual Machine does will depend on the system, the current load on your system, and the bytecode. So it's actually impossible to predict beforehand the actual machine code it would generate.
For a specific run of code, you can use an assembly debugger (disassembler) such as IDA. It will allow you to view what is being executed on the system as assembly code. You will most likely see a lot of code that is part of the Java Virtual Machine and not of your program. This will make things even more difficult.
I've read many definitions and statements about "Interpretation" and "compilation". But I am still very much confused.
Technically speaking, what is REALLY the difference between interpretation and compilation under the hood? Let me elaborate (please correct any wrong concept I might have) :
In Java, the source code is "compiled" into bytecode, which is then "interpreted" and/or "just-in-time compiled" into machine code. But what is the difference between just-in-time compilation and interpretation? I mean, in the end, as far as my guess goes, the host's CPU will run machine code only. Thus, in interpretation as well, instructions ARE being converted into machine code which can be understood by the CPU. So, where do we draw the line between just-in-time compilation and interpretation?
P.S. This is my conception. It might be totally wrong. In that case, Kindly excuse my stupidity and correct me.
Thanks.
1. Frankly speaking, the idea that Java has both a compiler and an interpreter is a myth; it is its behavior that is described as compilation and interpretation.
2. The Java compiler compiles human-readable code to byte code, which is then converted by the JIT (Just-In-Time compiler) during runtime into machine-level executable code.
3. During runtime, the JIT identifies the runtime-intensive parts of the code and converts them into machine-level executable code; this part of the code is known as a hot spot, and that's why the JIT is called the HotSpot compiler.
4. The JIT uses the virtual method table (v-table), which points to the methods in the class. The hot-spot code is converted to its machine-level executable code, its address is stored here, and when this part is called again, it is fetched directly via this stored address. This behavior of the JIT, compiling small amounts of code during runtime, is what is regarded as interpreted behavior, and the JIT's behavior of storing the result for later use is regarded as compilation.
5. The virtual method table also stores the address of the byte code, which can be used if needed.
When the code is compiled, the generated artifact is understandable directly by the hardware. Basically it's machine code sent directly to the CPU. This also means that an artifact compiled against a given CPU architecture won't run on another. The advantage is immediate startup and great performance.
In interpreted environments there is either no compilation at all, or the result of such a step is intermediary code. This code is too abstract to be sent directly to the processing unit. Instead a separate layer is needed (a virtual machine, an interpreter) that reads this artifact and executes it in some sandbox environment. The advantage of this approach is portability - the intermediate code can run on any platform where a native interpreter is available. Unfortunately, the performance is almost always worse.
JIT in Java is a hybrid technology. First the bytecode is interpreted: each bytecode instruction is executed by the interpreter. However, at some point in time (and under some conditions) the bytecode is translated into machine code and sent directly to the CPU to improve performance. This approach brings the best of both worlds - the portability of intermediate code and the speed of native code. Moreover, the JIT knows much more about the runtime behaviour of your code (how many times is a given loop called on average? is this method really virtual?), so the machine code can be even faster than the code generated by an ordinary compiler (!)
You're right that eventually everything has to be converted to machine code. The basic difference is that in the case of an interpreter, this translation occurs every time the code runs, whereas a compiler does this translation ahead of time, after which the compiler is not required for running the program.
Just-in-time compilation is a combination of both, where the JIT compiler is still required for running the program, and the code is compiled at run-time.
Compilation takes time but it is advantageous when the same piece of code is run several times, for eg. in a loop. The Java HotSpot VM takes this approach further by initially interpreting bytecode directly and then JIT-compiling a piece of code once it has run a certain number of times.
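A hedged way to watch this happen (the class and method names below are made up, and the exact compilation threshold is an implementation detail of the JVM): run the program with java -XX:+PrintCompilation and HotSpot will typically log when sum() gets JIT-compiled after enough interpreted runs.

```java
public class HotLoop {
    static long sum(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        long result = 0;
        // Call the method many times so the JVM considers it "hot"
        for (int i = 0; i < 20_000; i++) {
            result += sum(1_000);
        }
        System.out.println(result);
    }
}
```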
Interpreters interpret code line by line, and decide the machine code at run time;
Compilers consume code by the chunk, and decide the machine code at compile time;
A JIT compiler is a hybrid approach, in which the code is generated at run time (but may already be cached to improve performance), yet is consumed by the chunk.
An interpreted environment involves instructions being executed immediately after parsing, where both the parsing and execution are done by the interpreter. This means that the machine you run the code on must have the interpreter in order to run the program.
A compiler will parse the instructions into machine code and store them for later execution. Java, however, is byte-compiled, which means that this process turns the instructions into bytecode, which will then be used by the interpreter.
I have done some reading on the internet, and some people say that a Java application is executed by the Java virtual machine (JVM). The word "execute" confuses me a little bit. As I know, a non-Java application (e.g. one written in C, C++...) can be executed by the operating system. At the lower level, it means the OS will load the binary program into memory, then direct the CPU to execute the instructions in memory.
So now with a JVM, what happens? As I know, the JVM (which contains a run-time environment) is called first by the OS. From that point on, the JVM spawns one (or many) threads for the application. I wonder if the role of the OS comes into play any more. It seems to me that the JVM has "by-passed" the OS and directly instructs the CPU to execute the application. If so, why do we need the OS?
Taking it a little bit further: the JVM will use its JIT to compile the application's byte codes into machine code, then execute that machine code. Since it is already machine code, do we need the JVM any more? Because instead of the JVM, the OS should be able to instruct the CPU to execute that machine code. Am I making any mistake here?
I would like to learn more from people here. Please correct me if I am wrong. Thank you so much!
We need the OS for all the things a C or C++ program would. The JVM does a few more things by default, but it doesn't replace anything the OS does. The only difference might be that sometimes you have Your Code [calls the] JVM [calls the] OS, or with compiled code you can have Your Code [calls the] OS
Similarly in C++ you might have Your Code [calls the] Boost [calls the] OS.
When your program is running in native code, it doesn't need the JVM as such. This is good because the JVM knows when to "stand back" and let the application run. However, not all the program will be compiled to native code for the rest of the life of the application, so you still need it.
It is possible to use kernel-bypass devices/drivers with JNI, but Java doesn't directly support this sort of feature.
It seems to me that the JVM has "by-passed" the OS and directly instructs the CPU to execute the application. If so, why do we need the OS?
All C/C++ binaries (not just the JVM) run directly on the CPU. Once running, these programs can call into more machine code provided by the operating system to do useful things like reading files, starting threads, or using the network.
The JVM translates a Java program into instructions that run on the CPU. Behind the scenes, though, Java's threads, file i/o, and network sockets (to name a few) all contain instructions that call into the code provided by the operating system for threads/files/etc. This is one of the reasons you still need the OS.
Since it is already machine code, do we need the JVM any more?
The JVM provides features that you don't get from the JIT compiler. At the end of the day the JVM is just running a lot of machine code, but not all of that machine code comes from the JIT (or from the interpreter). Some of that machine code does garbage collection, for example. That's why you need the JVM.
The underlying base O/S still has to do almost everything for the JVM, not least:
Input / Output
Memory management
Creation of threads (if using native threads)
Time sharing - i.e. allowing more than one process to run
and lots more besides!
Well, I want to keep this simple.
Think of how you coded on a ZX Spectrum, back in the old days, when you really didn't use an OS (even before the DOS era, in the pre-PC era). You wrote your code and you had to manage everything yourself. In many cases there was no compiler, so your program was interpreted.
Next, it was realized that an OS is a great thing, and programs became simpler. Also, compilers came into broader use. I am talking about C++, for example. In those programs, if you needed to call some OS function you added the needed library and made your call. One of the drawbacks was that your program was now OS-dependent; another problem was that your program included OS DLLs in some fixed version. If another program on the same station required that DLL in a different version, you were in trouble.
In the early days of JVM history no JIT compiler was in use, so your program ran in interpreted mode. Your application no longer needed to call the OS directly; instead it used the JVM for everything it needed, and the JVM redirected some of the application's calls to the OS. Think about the JVM as a mediator. One of the best features of the JVM was its universality. You did not need to stick to a specific OS (though in practice you do need to make some minor adjustments when you don't stick to the Java requirements and your program "occasionally" works only on some specific OS, for example if you use C:\ for files, or make assumptions about the thread scheduler that happen to be true on the current OS but are not guaranteed by the JVM in general). The programmers of the JVM developed an API that is easy to use for a Java developer on the one hand, and can be mapped to any OS's system calls on the other.
The JVM provides you more than a simple wrapper around the OS. It has its own memory model (thread synchronization), for example, which comes with some weak guarantees of its own (it was totally revised in JDK 1.5 because it was broken). It also has garbage collection, and it initializes fields to default values (an int field will be initialized to 0). That is, the JVM, besides being a mediator to the OS, acts as helper code for your own application.
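A small example of that default initialization (the class and field names are illustrative; note this applies to fields, while local variables must be assigned before use or javac rejects the code):

```java
public class Defaults {
    static int counter;   // defaults to 0
    static String name;   // defaults to null

    public static void main(String[] args) {
        System.out.println(counter + " " + name); // prints: 0 null
    }
}
```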
At some point the JIT was added. It was added to reduce the overhead that the JVM creates. When some assumptions hold, usually after a few executions of the code, the interpreted commands can be compiled to machine code (I skip the byte code phase). It is an optimization, and I don't know of any case where you would be affected by it.
In JDK 1.6 another optimization was added: now some objects, in some circumstances, can be allocated on the stack and not on the heap. I don't know, maybe it has some side effects, but it is an example of what the JVM can do for you.
And my last remark. When you compile your code, what really happens is that your program is checked to be syntactically correct and then byte code is generated (a .class file). The Java language uses a subset of the existing byte codes (this is how AOP was implemented, using existing byte codes that were not part of the Java language). When a Java program is executed, these byte codes are interpreted; they are translated on the fly into machine instructions. If the JIT is on, then some of the execution paths can be compiled to machine language and reused instead of being interpreted on the fly.
Since it is already machine code, do we need the JVM any more?
Compiled Java programs are not machine code. javac compiles a .java file into a bytecode .class file. Then these bytecodes are given to the JRE (Java Run-time Environment).
Now the Java interpreter comes into action: it interprets the bytecode into native machine code that runs on the CPU.
As we know, the OS does not execute any program by itself; it provides the environment for the processor to execute it. Providing the environment means allocating memory, loading the file, giving instructions to the processor, and managing the addresses of the loaded data and methods; the processor's only job is executing the program. This is what happens in C or any procedural programming language, and we can see that the OS plays a very vital role here, which puts overhead on the OS.
If we write a small, simple program in C like Hello World, which contains only one main function, compiling it generates an .exe file with more than one function, the extra ones taken from library functions, so managing all of this is a tedious job for the OS. The JVM gives relief to the OS: here the work of the OS is only to load the JVM from the hard disk into RAM, let the JVM execute, and allocate space for the JVM to run the Java program. Memory allocation and de-allocation, loading of the byte code file from the hard disk, and address management are done by the JVM itself, so the OS is free and can do other work. The JVM allocates or de-allocates memory out of whatever the OS has given it to execute the Java program.
If we talk about execution, the JVM contains an interpreter as well as a JIT compiler, which converts the byte code of the required function into machine code; after execution of the method, the executable code of that method is discarded. That's why we can say Java does not have an .EXE file.