What are the relationships between the notions: binary, interpretation, execution, compilation? - java

I know these are basic notions, but nowadays one can be a 'web developer' for years without understanding such essential notions (thanks to ready-made tools like Xampp, Wordpress...). Here are three situations (among dozens) where I encounter these concepts without fully understanding them.
1. According to Wikipedia:
a binary data is a data in the binary form (bits and bytes) that
cannot be interpreted.
But what is interpretation??
2. I also once heard that PHP is:
a scripting language, not compiled but interpreted. It does not
require any platform to be run.
Unlike Java or C#, you just get the PHP binary, and run your script.
3. What about 'binary distribution' and 'compilation' as mentioned in the official Apache HTTP Server documentation:
This documentation assumes that you are installing a binary distribution of
Apache. If you want to compile Apache yourself (possibly to help with
development or tracking down bugs), see Compiling Apache for Microsoft
Windows.
Could someone please give the confused members of the community "once-and-for-all" definitions with examples? That would be invaluable.

My understanding is as follows:
When used as a noun, binary refers to a compiled executable file - this is a file containing machine instructions in non-human readable form, which has previously been compiled, and can be run as an application.
compilation is the process of converting human-readable source code into a binary file, so that it can be executed.
execution is the process of running a program.
interpretation is the process of executing non-compiled code. In some programming languages the human-readable source code is executed directly, without first compiling it into binary machine code.
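To make the definition of interpretation concrete, here is a minimal sketch of an interpreter written in Java. The three-instruction language (push, add, mul) and the class name TinyInterpreter are invented purely for illustration; the point is only that the program text is read and executed directly, with no separate step that produces a binary file.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy interpreter: executes a made-up stack language line by line.
class TinyInterpreter {

    /** Reads the program text and executes each instruction as it is encountered. */
    static int run(String program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String line : program.split("\n")) {
            String[] parts = line.trim().split("\\s+");
            if (parts[0].isEmpty()) {
                continue; // skip blank lines
            }
            switch (parts[0]) {
                case "push": // push a literal number onto the stack
                    stack.push(Integer.parseInt(parts[1]));
                    break;
                case "add": // replace the top two values with their sum
                    stack.push(stack.pop() + stack.pop());
                    break;
                case "mul": // replace the top two values with their product
                    stack.push(stack.pop() * stack.pop());
                    break;
                default:
                    throw new IllegalArgumentException("Unknown instruction: " + parts[0]);
            }
        }
        return stack.pop(); // the result is whatever is left on top
    }

    public static void main(String[] args) {
        // Computes (2 + 3) * 4 straight from the source text.
        System.out.println(run("push 2\npush 3\nadd\npush 4\nmul"));
    }
}
```

A compiler for the same toy language would instead translate the text into another form (for example machine instructions) and write it out, leaving execution for later.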

Related

Difference between jdk/bin/java and jdk/jre/bin/java

While running some tests this week I found this situation:
When I run Tomcat using the java executable in jdk/jre/bin/java, the performance is a lot better than when I run it with jdk/bin/java.
The question is: does anyone know why the JDK package delivers two java executables, and what difference between them explains the performance difference?
I'm late to the party, but... I came here looking for the difference between the several java variants within OpenJDK. I only ended up with a few clarifications and additional questions to the "what's the difference between them" part of the question; hope it's helpful.
Looking inside the OpenJDK (I'm using OpenJDK 1.7.0) base directory I see three javas, all with different hash-sums:
bin/java, binary
jre-abrt/bin/java, binary; assuming ABRT is Automatic Bug Reporting Tool
jre/bin/java, a shell script that execs the jre-abrt/bin/java variant, in one of two different ways (more below).
The binary variants above have the same file-size and creation-time (in my version and system, anyway) but 4 bytes differ between the two files (I haven't looked much further -- this is the other part of your question -- but they are different, and it doesn't look like an ASCII string, for instance).
The script variant is the one you're saying is faster, which seems counterintuitive because it seems to be doing more. (Or perhaps you're only seeing the time to execute the script and not the exec'd java command?). The script checks to see if an ABRT shared-object file exists, and if so it passes (as -agentpath...) the .so and abrt=on. Again, this seems like it should be nothing but more work... assuming use of ABRT.
If you're still interested in this topic, perhaps it would be interesting to see the following:
what path in that script you're taking (check for existence of /usr/lib64/libabrt-java-connector.so or whatever is in your jre/bin/java script)
if directly executing the third variant (jre-abrt/bin/java) is faster
what else is being touched in both of these cases -- like inotify or strace or something, but this is probably enormous for a service like this.
The java.exe files are actually the same. The JDK is the Java Development Kit, which includes all of the java executables you need to develop applications.
The JRE is the Java Runtime Environment, which includes what you need to run Java applications.
So for running the application in a deployed mode, you would need only the JRE, as end users are likely to have only a JRE and not a JDK.

Typical build process of Java application

Being new to Java, I am not able to understand the complete build process from source code to hardware-specific binaries. I am using Eclipse for Java and would like to know what conversions take place from source code (.java) to binary, which files are linked by a linker, what a preprocessor does, and so on.
I would appreciate it if you could point me to a link detailing the complete build process for Java. I have already searched this forum but did not find detailed information.
Thanks
Edited:
So to be more precise, I am looking for the Java equivalent of the following build process in C:
I googled a lot but to no avail! A figure like the following is not a must (though preferred), but if you can write the 'n' sequential/parallel steps involved in the complete Java build process, that would be really appreciated. Much of the information provided by @Tom Anderson is already very useful to me.
The first thing to appreciate is that your question contains a mistaken assumption. You ask about "the complete build process from source code to hardware specific binaries" - but the normal Java build process never produces architecture-specific binaries. It gets as far as architecture-independent bytecode, then stops. It's certainly true that in most cases, that bytecode will be translated into native code to be executed, but that step happens at runtime, inside the JVM, entirely in memory, and does not involve the production of binary files - it is not part of the build process.
There are exceptions to this - compilers such as GCJ can produce native binaries, but that is rarely done.
So, the only substantial step that occurs as part of a build process is compilation. The compiler reads in source code, does the usual parsing and resolution steps, and emits bytecode. That process is not in any way specified; as is usual, the language specification defines what the elements of the language are, and what they mean, but not how to compile them. What is specified is the format of the output: the bytecode is packaged in the form of class files, one per class, which in turn may be grouped together in jar files for ease of distribution.
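Since the class file format is the specified part, it can be inspected directly. The sketch below reads the first eight bytes of a class file, which the JVM specification defines as a magic number (0xCAFEBABE) followed by the minor and major format versions. The class name ClassFileHeader is made up for this example, and it assumes the runtime lets you read String.class as a resource, which standard JDKs do.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: peeks at the header of a compiled class file.
class ClassFileHeader {

    /** Returns {magic, minorVersion, majorVersion} for the class file of the given class. */
    static int[] read(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IOException("Class file not found: " + resource);
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();            // u4 magic
            int minor = in.readUnsignedShort();  // u2 minor_version
            int major = in.readUnsignedShort();  // u2 major_version
            return new int[] { magic, minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = read(String.class);
        System.out.printf("magic=0x%08X major=%d minor=%d%n", header[0], header[2], header[1]);
    }
}
```

The version numbers printed depend on the JDK in use; only the magic number is the same everywhere.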
When the class files come to be executed, there are then further steps needed before execution is possible. These are quite well-specified in the chapter on loading, linking, and initializing in the JVM specification. But, as I said, these are not really part of the build process.
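The split between loading and initializing can be observed from ordinary code. In this minimal sketch (the class names LazyInitDemo and Helper are invented for illustration), the nested class is first loaded without being initialized, and its static initializer only runs when one of its static methods is first called.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: records when a class is loaded versus when it is initialized.
class LazyInitDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Helper {
        static {
            EVENTS.add("initialized"); // runs once, at first active use
        }

        static int answer() {
            return 42;
        }
    }

    /** Loads Helper without initializing it, then uses it, and returns the event log. */
    static List<String> run() {
        try {
            String name = LazyInitDemo.class.getName() + "$Helper";
            // The 'false' argument asks for loading only, without initialization.
            Class.forName(name, false, LazyInitDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        EVENTS.add("loaded");
        Helper.answer(); // first active use triggers the static initializer
        EVENTS.add("used");
        return EVENTS;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

All of this happens inside the running JVM, which is why none of it shows up as a step in the build.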
There are a few other steps that may occur in a build process, usually prior to compilation: dependencies might be resolved and downloaded, resources might be copied and converted between character sets, and code might be generated. But none of this is standard, it's all stuff that's added on to the core process of compilation by various build tools.
There are some cool articles you can check out if you want to know what's going on "behind the scenes".
http://www.codeproject.com/Articles/30422/How-the-Java-Virtual-Machine-JVM-Works
This is one of them, it has a good explanation on how all the parts interact to run your code.
The main idea is that the bytecode is created from your Java files to run in a Virtual Machine, making your Java code (more or less...) independent of the OS and platform you're running it on.
The JVM, specific to that environment, is then responsible for translating that bytecode into actual instructions for the specific architecture you're running your code on.
This is the basis of the "Write once, run everywhere" mantra that Java has. Although the mantra doesn't always hold... it's still true in general.

Java to C/C++, or at least get the Java converted code

Are there any Java -> C/C++ Converters? Well I expect a no.
But I know Java works by converting the Java Byte Code, into code that the OS can understand using JIT. So is there any way to get this "converted code"?
Thanks.
Thanks to Baltasarq, who set me on the right course, I started looking for ahead-of-time compilers. Amazingly, I found GCJ, which is included in GCC (I think the latest?). It does exactly what I want: take a Java file and turn it into an EXE. But it needs 44 DLLs for a simple "Hello World" print app. Oh well :D
But I know Java works by converting the Java Byte Code, into code that the OS can understand using JIT. So is there any way to get this "converted code"?
You're talking about compiling code "ahead of time", or at least that's the name it receives in the Mono project (free implementation of .NET/C#). If you are interested in this, you could convert your code from Java to C# (which is at least easier than C++), and then take advantage of this feature. There is even a tool dedicated to this purpose: mkbundle.
I do not know whether there is a way to get the code the JIT emits. It is a runtime conversion that happens only after roughly 1,000 or 10,000 passes through some part of the code path. I am sure one can get this "converted code", but what you are talking about is source code; in any case, the JVM runs the bytecode 1,000 or 10,000 times before it even tries to bring in the JIT, and it can also be made to run on bytecode only.
Without knowing what your needs are, I would think it faster to write the code by hand. If I need a short loop or something to explain an idea to somebody, I can write it by hand faster than I can find a tool to do it.
There actually is a dumper that comes as part of a standard install, and I know it works because I have some code saved in source file comments where it did it.
Are there any Java -> C/C++ Converters? Well I expect a no.
Well, there are some projects, but none I know of that are production-quality. Mostly they translate to C, as the additional features that C++ offers do not align very well with what Java does. A Google search will give you quite a few projects. Also see this question:
Does a Java to C++ converter/tool exist?
But I know Java works by converting the Java Byte Code, into code that
the OS can understand using JIT. So is there any way to get this
"converted code"?
No, not that I am aware of. You could theoretically take the HotSpot source code (it's available as part of OpenJDK), and insert logging statements to dump the generated machine code. I don't know if anyone has done that yet.
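As a side note to the answers above: HotSpot-based JVMs accept the -XX:+PrintCompilation switch, which logs the methods the JIT compiles, and the diagnostic option -XX:+PrintAssembly (enabled together with -XX:+UnlockDiagnosticVMOptions, and requiring a separately installed hsdis disassembler library) can print the generated machine code. The sketch below is only a hypothetical hot loop to try those switches on; the class name HotLoop and the call counts are arbitrary, and whether a given method gets compiled depends on the JVM version and its thresholds.

```java
// Hypothetical workload: a small method called often enough to become "hot".
class HotLoop {

    /** Sums the squares 1*1 + 2*2 + ... + n*n. */
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    /** Calls sumOfSquares repeatedly so the JVM has a reason to JIT-compile it. */
    static long callRepeatedly(int calls, int n) {
        long grandTotal = 0;
        for (int c = 0; c < calls; c++) {
            grandTotal += sumOfSquares(n);
        }
        return grandTotal;
    }

    public static void main(String[] args) {
        System.out.println(callRepeatedly(20_000, 100));
    }
}
```

Running it as java -XX:+PrintCompilation HotLoop should list compilation events as they happen; the exact log format varies between JVM versions.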

Creating a JVM from C

How does one start a Java VM from C? Writing the C code seems to be straightforward -- I've been following the code that appears on p. 84 of Liang's "The Java Native Interface". It's the linking process that has me stymied. Liang's book is 10+ years out of date in that regard and I can't find anything on the net which addresses this goal (and which works).
To be clear, what I want to do is launch a standard Windows program (written in C), which then launches the JVM and calls a main() in a Java class (which I have written). This program should not rely on the presence of jvm.dll or jvm.lib -- the user shouldn't have to install Java to run the program. Maybe this isn't possible without an unreasonable amount of effort.
The development environment is MinGW under Windows. I'm able to link in such a way that the program works when the .dll is in a separate file, but not in a way so that there's only a single executable without any .dlls or .libs.
In hindsight, I can see now that this was a dumb question, or at least one that hadn't been thought through. The moral of the story is that the "JVM" is not a single executable, or even an executable plus some JAR files; the JVM relies on a slew of independently stored files with various mutual dependencies. Unraveling all of these relationships so that they could be brought into a single file (or even two files) would be a massive undertaking. Thanks for the knock in the head.
So, to be clear - you want to launch a JVM without the requirement of a JVM being present? How do you propose to accomplish that? Unless you're contemplating writing your own JVM implementation (which I'd say falls under the category "unreasonable amount of effort"), having a JVM installed is a reasonable requirement. Assuming that, you can just spawn a java process and include the appropriate command-line parameters (classpath, class to run etc).
Disclaimer: I don't think that having a Java runtime installed is unreasonable for users. That said, I do understand your motivations for a low-friction install for users.
Using the Sun JRE is probably not going to be fruitful here. In theory, you could grab the Sun JRE, modify it to build as a static library instead of a DLL and figure out a way to cram all the resources that get bundled with it (fonts, images, cursors, SSL certificates, localized message files, etc.) into a single resource and then modify the runtime to load from there. But this is almost certainly an "unreasonable amount of effort."
You might want to look at GCJ instead: its architecture is different than the Sun JRE which lends itself more to being embedded in another application, or it can compile your Java to native machine code.
(Also, do check the licensing to ensure that you can properly redistribute this no matter which route you take.)

Difference between C++ and Java compilation process [duplicate]

Possible Duplicate:
Why does C++ compilation take so long?
Hi,
I searched Google for the differences between the C++ and Java compilation processes, but the results only cover the language features of C++ and Java and their differences.
I am proficient in Java but not in C++, though I have fixed a few bugs in C++. From my experience, C++ always took more time to build than Java for minor changes.
Regards
Bala
There are a few high-level differences that come to my mind. Some of those are generalizations and should be prefixed with "Often ..." or "Some compilers ...", but for the sake of readability I'll leave that out.
C/C++ compilation doesn't read any information from binary files, but reads method/type definitions only from header files that need to be parsed in full (exception: precompiled headers)
C/C++ compilation includes a pre-processor step that can do a wide array of text-replacement (which makes header pre-compilation harder to do)
The C++ syntax is a lot more complex than the Java syntax
The C++ type system is a lot more complex than the Java type system
C++ compilation usually produces native assembler code, which is a lot more complex to produce than the relatively simple byte code
C++ compilers need to do optimizations because there isn't any other thing that will do them. The Java compiler pretty much does a simple 1:1 translation of Java source code to Java byte code, no optimizations are done at that step (that's left for the JVM to do).
C++ has a template language that's Turing complete! (so strictly speaking C++ code needs to be run to produce executable code and a C++ compiler would need to solve the halting problem to tell you if arbitrary C++ code is compilable).
Java compiles code into bytecode, which is interpreted by the Java VM. C++ must compile into object code, which is then linked into a machine-code executable. Because of this, it's possible for Java to recompile only a single class for minor changes, while C++ object files must be re-linked with the other object files into an executable (or DLL). This may make the process take a bit longer.
I am not sure why you expect the compilation speed of Java and C++ to be comparable since they are different languages with completely different design goals and implementations.
That said a few specific differences to keep in mind are:
Java is compiled to byte code and not right down to machine code. Compiling to this abstract virtual machine is simpler.
C++ compilation involves not only compilation but also linking. So it is typically a multi step process.
Java performs some late binding: the association between a call to a function and the actual code to run is made at runtime. So a small change in one area need not trigger a recompile of the whole program. In C++ this association needs to be made at compile time; this is called early binding.
A C++ program using all the language's features is inherently more difficult to compile. A few template invocations with a number of types can easily double or triple the amount of code to generate.
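The late binding mentioned in the list above can be sketched in a few lines of Java. Here the implementing class is picked by name at runtime, so the calling code never refers to it at compile time. The Greeter interface and the English and French classes are hypothetical names made up for this example.

```java
// Hypothetical demo: the class behind a call is chosen while the program runs.
class LateBindingDemo {

    interface Greeter {
        String greet();
    }

    static class English implements Greeter {
        public String greet() {
            return "Hello";
        }
    }

    static class French implements Greeter {
        public String greet() {
            return "Bonjour";
        }
    }

    /** Looks up a nested Greeter implementation by simple name and calls it. */
    static String greetWith(String simpleName) {
        try {
            String className = LateBindingDemo.class.getName() + "$" + simpleName;
            // The class is located and linked only now, at runtime.
            Greeter greeter = (Greeter) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
            return greeter.greet();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load greeter: " + simpleName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(greetWith("English"));
        System.out.println(greetWith("French"));
    }
}
```

Because greetWith only depends on the Greeter interface, changing or adding an implementation means recompiling that one class, not the caller.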
Glossing over a lot of details, in Java you compile .java files into one or more .class files. In C++ you compile .cc (or whatever) source files into .o files, and then link the .o files together into an executable or library. The linking process is usually what kills you, especially for minor changes as the amount of work for linking is roughly proportional to the size of your entire project. (this is ignoring incremental linkers, which are specifically designed to not behave as badly for small changes)
Another factor is that the #include mechanism means that whenever you change a .h file, all of the .o files that depend on it need to be rebuilt. In Java, a .class file can depend on more than one .java file (eg: because of constant inlining), but there tend to be far fewer of these "hot spots" where changing one source file requires many other source files to be rebuilt.
Also, if you're using an IDE like Eclipse it's building your Java code in the background all the time, so by the time you tell it to build it's already mostly (if not completely) done.
Java compiles source code into bytecode, which is interpreted by the JVM. Because of this, the same compiled code can be used on multiple platforms.
