Application of Project Sumatra to other JVM languages - java

I have just recently discovered Project Sumatra, which aims to bring the JVM to the graphics card. From their webpage this includes a custom compiler (called Rootbeer) for Java.
This is all good news, however, I would like to hear from someone with more knowledge about the project internals if this means that project Sumatra applies to other JVM languages as well? Will it be possible to make Aparapi calls from Scala or Clojure directly? Or will you have to develop some core functionality in Java and then access that via other JVM languages?

I only just came across this question. Apologies for taking so long. Full disclosure I am the Aparapi inventor/lead and co sponsor of Sumatra.
Unlike Aparapi Sumatra has the advantage of working from the IR (Intermediate Representation) of Java methods from inside the JVM. This means that ultimately it will detect opportunities for GPU offload based on patterns found at this abstract level. Aparapi had to reverse engineer opportunities from bytecode.
It is likely that Sumatra will initially key off user hints, rather than trying to auto-parallelize code. The main focus at present is the new 'lambda' feature of Java 8 and it's companion 'stream API'. So where Aparapi required the user to inherit from a Kernel base class. Sumatr a will likely use the 'explicit' hint of parallelism suggested by:-
IntRange.range(1024).parallel().forEach(gid->{out[gid]=a[gid]+b[gid];});
Although for obvious cases, such as
for (int id=0; i< 1024; i++){
out[gid]=a[gid]+b[gid];
}
It should be entirely possible to offload this loop. So support for other JVM based languages will depend on how ambitious we are looking for opportunities to auto-parallelize. I suspect that many patterns from other languages (JavaScript (Nashorn), JRuby, Scala, JPython etc) will be detectable.

AFAIK Rootbeer (a university project) and Aparapi (an AMD based project) are unrelated, so you may haved missed something here.
Regarding Aparapi itself it states in its Wiki that it won't work with Scale/Closure etc. or in fact with anything except pure Java, since it depends on patterns used by JDK's javac to properly analyse bytecode. It also requires you to extend its Kernel class to be able to convert the bytecode into OpenCL and execute it in GPU. So it looks like you would use one or another.
Back to your question: based on all this you would have to develop in Java and call it from other JVM languages.

Related

Languages that compile to Java Bytecode and can run on the JVM

I am an embedded programmer and working with an embedded JVM.
This enables running Java files on constrained devices.
These Java files are first compiled to bytecode into .class files which are then further optimized and uploaded to the device which has a micro JVM to run the optimized bytecode.
The micro JVM does not support all features, e.g., no reflection.
The main benefit is obvious: this allows programming in Java for constrained devices.
However, I was thinking that plenty of languages compile to bytecode, some are listed https://en.wikipedia.org/wiki/Java_bytecode.
So in theory these languages could also be used to program.
I'd like to obtain a list of common languages that compile down to bytecode and was wondering if you could help.
For example, Python has special implementations that reduce to Java Bytecode, if I'm not mistaken, and stuff like C to Java virtual machine compilers also exist.
So what languages would you think are logical to try and run on the devices? Any pointers on how to or similar experiences?
Also, I'm not clear what the difference is from reading Wikipedia between (Python) bytecode and Java bytecode, could anybody help explain that?
I'm agree with you about the overall idea and it would be nice to develop an embedded application using any language that can run on a JVM. But there are some practical issues that you should consider and I think that's why none of major vendors or open source initiatives have any active/serious project on this (as far as I know).
As you mentioned, a JVM implementations that can run on embedded devices, each have their own constraints and limitations. The most obvious one is that some packages may not be available at runtime. In order to apply such a constraint, you should either control it in the compile process or have a toolchain (sort of an SDK) which accepts the bytecode and checks such constraints.
This situation would be worth when a developer tries to use a third party library that is available for that specific language. It's not easy to guess if a library is safe for use against such a framework or not.
One great facility for developers would be to have their IDE check such issues on the fly (something like inspection in IntelliJ Idea). This makes it much more smoother to move toward using such a solution. But again the problem is that for each such languages there need to be a specific plugin compatible with their own syntax.
Also some of JVM languages that are actually implementation of an existing language (e.g. Jython or JRuby) are most of the time out of sync with the original language in case of supporting libraries/syntax changes of that language.
Anyway, I think in order to have a list of JVM languages you could easily find them on Wikipedia. Maybe you mean those who may worth considering in this regard by having a large community and tools support. In my opinion, you should focus on the following JVM languages as those who may worth to include in your final list:
Groovy
Kotlin
Scala
These are all pure JVM languages and are only using different syntax than Java.
Regarding the topic in general, I should say that when you search for embedded JVM implementations, you'll notice that it's also a fairly academic concepts and they're so many publications in this topics regarding the overall architecture, threading support, toolchain, error handling, memory management, etc. This means that you should have a very great experiences/background on both embedded systems and programming language concepts and implementation to be able to devise a proper architecture for such a platform.
About your last question regarding the difference between Python bytecode and Java bytecode (if I understand your question correctly), these are both conceptually the same but each has its own syntax and constraints. The bytecode concept refers to the piece of software that is the output of the compiler and is the platform independent representation of the original code and can be run/interpreted at runtime by another software component which is the virtual machine. In Java world, this software is called the Java Virtual Machine (JVM). I'm from the Java world so I don't know what it's called in Python vocabulary but it should be something similar (e.g. Python virtual machine).
I think due to the complexity of developing such a toolchain and also considering the unprecedented development of new IoT and SoC devices, many of them capable of running a more higher level operating systems, maybe in a long run most developers prefer to develop for a more high end devices using more high level APIs and SDKs. Who knows! In that case, we would have a same situation that we're in today for PCs. Languages like C and Assembly are still in use, but they have their own domain of applications. I mean throughout the time, layers of abstraction are being added on top of the previous one. The same thing can happen for embedded devices.

Java without JVM

Just wondering if there are any Java implementations that work without a JVM. The reason I'm interested is, well, simply because I'm curious, and I was wondering if there were any "lightweight" Java implementations (without all the Sun libs attached).
I'm also interested in embedding Java in C++, but embedding the JVM in C++ seems rather ridiculous to me. I just want to exploit some of the Java language features in my C++ apps, but not exploit all the frivolous Java APIs.
EDIT:
I see from a lot of the answers I've gotten that I need to clarify...
I recently got in to developing node.js applications, which uses JavaScript. JavaScript in istelf is a language spec, it doesn't automatically come with the DOM, window.open, etc., although it did for a while. I'm wondering if there's something similar to Google's v8, except not for JavaScript, but for Java. In the end, I don't care if I can't write Hello World apps with it, I just want to be able to embed Java in a C++ application the way I can embed JavaScript in a C++ application with v8 or SpiderMonkey. If I could do that, then I could implement console output in C/C++ and then make that implementation callable from Java.
Do you want the Java VM alone without the API(STandard Library) ?
The JRE is composed by the JVM (Virtual MAchine) and the Standard Library, I have doubt that you can find a java implementation without the JVM ... You could find a compiler that compile java source code into native code(take a look at GCJ), but not a Java implementation without the VM.
Take a look at this wikipedia page to see some alternative Java implementations .
There's GCJ (GNU Compiler for Java), but the project has been deprecated since OpenJDK was open sourced.
there are light weight java processors designed for use in small devices for example JOP
As others have hinted, the "JVM" is the mechanism that knows how to load classes, interpret "bytecodes", and manage storage. It does not inherently include any of the java.lang... stuff, except that a few classes (String, Class, et al) are needed to represent classes and other basic data structures in the JVM.
As a result, Java without a JVM is just a bunch of meaningless bits.
There are (or were) compiled versions of Java that do/did not need the interpreter (though a reasonably compact interpreter is quite easy to build). A primitive class loader and some sort of storage management are still necessary, but class loading can be kept simple and for short-lived apps (or those that live with special restrictions) the storage manager need not do garbage collection.
As pstanton suggests, there are "lightweight" Java (or "Java-like") implementations that are suited for small devices.
IMHO, You need to re-exampine what it is you really want.
Java runtime consists of two main components
The JVM to run the code
The standard libraries which come with it.
You suggest you want to use Java, but you don't really have anything left without these.
For example, you cannot even write a "hello world" program without the libraries as String is a class in the JDK.

Use the right tool for the job: embedded programming

I'm interested in programming languages well suited for embedded programming.
In particular:
Is it possible to program embedded systems in C++?
Or is it better to use pure C?
Or is C++ OK only if some features of the language (e.g. RTTI, exceptions and templates) are excluded?
What about Java in this domain?
Thanks.
Is it possible to program embedded
systems in C++?
Yes, of course, even on 8bit systems. C++ only has a slightly different run-time initialisation requirements than C, that being that before main() is invoked constructors for any static objects must be called. The overhead (not including the constructors themselves which is within your control) for that is tiny, though you do have to be careful since the order of construction is not defined.
With C++ you only pay for what you use (and much that is useful may be free). That is to say for example, a piece of C code that is also C++ compilable will generally require no more memory and execute no slower when compiled as C++ than when compiled as C. There are some elements of C++ that you may need to be careful with, but much of the most useful features come at little or no cost, and great benefit.
Or is it better to use
pure C?
Possibly, in some cases. Some smaller 8 and even 16 bit targets have no C++ compiler (or at least not one of any repute), using C code will give greater portability should that be an issue. Moreover on severely resource constrained targets with small applications, the benefits that C++ can bring over C are minimal. The extra features in C++ (primarily those that enable OOP) make it suited to relatively large and complex software construction.
Or is C++ OK only if some
features of the language (e.g. RTTI,
exceptions and templates) are
excluded?
The language features that may be acceptable depend entirely on the target and the application. If you are memory constrained, you might avoid expensive features or libraries, and even then it may depend on whether it is code or data space you are short of (on targets where these are separate). If the application is hard real-time, you would avoid those features and library classes that are non-deterministic.
In general, I suggest that if your target will be 32bit, always use C++ in preference to C, then cut your C++ to suit the memory and performance constraints. For smaller parts be a little more circumspect when choosing C++, though do not discount it altogether; it can make life easier.
If you do choose to use C++, make sure you have decent debugger hardware/software that is C++ aware. The relative ease with which complex software can be constructed in C++, make a decent debugger even more valuable. Not all tools in the embedded arena are C++ aware or capable.
I always recommend digging in the archives at Embedded.com on any embedded subject, it has a wealth of articles, including a number of just this question, including:
Poor reasons for rejecting C++
Real men program in C
Dive in to C++ and survive
Guidelines for using C++ as an alternative to C in embedded designs
Why C++ is a viable alternative to C in embedded systems design
Better even at the lowest levels
Regarding Java, I am no expert, but it has significant run-time requirements that make it unsuited to resource constrained systems. You will probably constrain yourself to relatively expensive hardware using Java. Its primary benefit is platform independence, but that portability does not extend to platforms that cannot support Java (of which there are many), so it is arguably less portable than a well designed C or C++ implementation with an abstracted hardware interface.
[edit] Concidentally I just received this in the TechOnline newsletter: Using C++ Efficiently in Embedded Applications
More often than not in embedded systems, the language you're programming in is determined by which compiler is actually available.
If you're hardware only has a C compiler, that's what you're going to use. IF it has a decent C++ compiler than there is really no reason to prefer C over C++.
I would dare say that Java isn't a very popular choice in embedded systems.
Embedded programming these days spans a large range of applications.
Roughly, it goes from sensors/switches up to complete security systems.
You should base your language on the complexity and the hardware resources.
It is 1 of the choices next to HW (CPU,...), OS, protocols,...
possible choices:
switches: assembler
router-like devices: C and/or C++
handhelds: Java or QT/C++
complete systems: combinations C and/or C++ with python
Or is C++ OK only if some features of the language (e.g. RTTI, exceptions and templates) are excluded?
It's good to be thinking along these lines. Compile-time complexity is not a big deal, but runtime complexity has a resource cost.
C++ facilitates class/namespace modularity (e.g. method foo() in more than one context) and instance modularity (member field bar belonging to more than one object), both of which are a big advantage in software design. There are also features like const, references, static casts, and templates, which can help enforce constraints and have little or no runtime cost.
I would not exclude templates. They're complex to think about and you need a compiler that handles them well, but the resource cost is almost all compile time -- what's going to "cost" you is the fact that each time you use a template with different class parameters, you produce a new set of code to instantiate member functions. But you would almost certainly have to do the same thing without templates. Furthermore, templates allow you to design and test libraries for general circumstances, in separate files that are instantiated at compile time rather than link time. Just to clarify that: templates allow you to have a file A.h that you test. Then you use it with file B.h or B.c to instantiate it at compile time. (A library would be linked in rather than compiled, and this makes it less flexible -- template methods can be optimized out so they do not incur a function call.) I've used templates in embedded systems to implement CRC code and fixed-point math: I can test the general code, put it in version control, and then reuse it multiple times by writing a simple class that derives from a template or has a template member field. The classic example of course is STL.
RTTI and exceptions: these add run-time complexity. I don't have a good idea of the resource cost but I expect RTTI would be fairly simple (just adds a type tag, costing extra space) whereas exceptions are probably beastly, involving stack unwinding.
virtual functions: I used to rule these out because of the memory + executiontime costs (minimal but still there), as well as the complexity of debugging, but they allow you to decouple objects from each other. If you don't use virtual functions, when an instance of one class (e.g. Foo) needs to execute code associated with an instance of another class (e.g. Bar), then the first class needs to know everything about the second (to compile Foo you need to have static linkage to all the methods in Bar) -- this adds a lot of tight coupling.
dynamic memory allocation: this is another big thing (that is in C as well), which we avoid like the plague at my company -- not only are there all sorts of errors that can arise, but the big runtime cost is the allocator/deallocator, and you've got to be willing and able to know what that cost is and accept it.
edit: I would love to use Java instead of C++ in the embedded world. Unfortunately my choices are limited and the runtime resource costs (code size, memory size, garbage-collecting time constraints) are too high in the space that I work in. My reason for using Java is less because of its runtime goodies and more for the fact that its software design is much cleaner, and the tools are much better (OMG! refactoring! woohoo!)... the key to me seems to lie with two things, which make C/C++ feel very clunky in comparison:
that everything is an object and all the methods are virtual, so you can lean heavily on the abstractions of interfaces.
the interface/implementation separation in Java is not this clunky ugly .c/.h file splitting thing that makes compilers so slow. In constrast, I use Java in Eclipse and it automatically compiles the code right as I edit it. This is huge! I find most of my errors right away. In C/C++ I have to wait for a whole compile cycle.
Someday I hope there will be a language between C/C++ and Java that provides the advantages of Java for software development, without requiring the bells and whistles that make Java so attractive for desktop/server applications but unattractive in most of the embedded world.
Both C and C++ can be used on embedded systems. If you do limit the features of C++ that you use, then it is going to use roughly the same speed and space as C. As for using these additional features, it really depends on your constraints. For example, if you are making a real-time system, then exceptions might not be a good idea, simply because considering the propagation time for exceptions and all the paths through which exceptions can possibly propagate can make hard real-time guarantees quite tough (although, then again, the ACE / Tao real-time CORBA implementation uses exceptions). While templates and RTTI can lead to larger programs, there is a lot of variability in embedded systems, and so depending on your resource constraints, this could be perfectly fine or unacceptable. The same goes for Java. Java (well, technically, the Java programming language with a subset of the Java API) is running on Android, for example, using the Dalvik VM. J2ME runs on a variety of embedded devices. Whether it will or will not work for your application, depends on your particular constraints.
c is the most common language used in embedded systems.
However, I think C++'s future lies in the embedded system domain. I think the C++0x standards will help in that resepect. So I wouldn't be surprised to see C++ used a lot more in this field.
I think you already have a great answer by Clifford, but I'll add my experience to it. As was pointed out, the type of embedded system is the main driver. In Defense/Aerospace, C and Ada are the most popular embedded languages I encounter. Although as time goes on, I am seeing more C++ as the model based development becomes popular with the tools such as Rhapsody. In the jobs listing, seeing requirements for Object Oriented Design experience also leads me to believe that the market is slowly shifting to follow the trends of mainstream development.
If you're really interested in embedded development, I would start with C. It is pretty universal. Almost every OS, including real time OS's like Integrity, Nucleus, VxWorks, and embedded Linux, has a compiler and tools for it. The things you'll learn about pointers, memory management, etc... will translate into C++ development well in the embedded world at least.
As for Java, if you're interested in development for mobile platforms like smart phones, this is a solid choice (Android comes to mind). But in the world of Real Time Operating Systems, I haven't seen it.
On that note, that brings me to my last point and advice. If I wanted to jump into embedded programming (which I did just 4 years ago), I would focus on learning C from a low level point of view as mentioned above. Then, I would also learn what makes real time programming so difficult/different. I found the book below quite good at teaching you to think like an embedded programmer vs. an application developer. Good luck!
An Embedded Software Primer
It really boils down to what hardware platform you're operating on and hence what software platforms are open to you. For a lot of recent embedded kit - designed around a system-on-chip, a megabyte or two of RAM, a few devices - you really want a small operating system to manage the low level hardware while you concentrate on your application. Your choice of OS then constrains your available language set.
It's certainly possible to use C++ in the embedded space, but the full feature set of the language takes a lot of work to port correctly. For example, eCos is implemented in a mixture of C and what you might call the structural aspects of C++; support for RTTI, exceptions and the STL are lacking in the free version, though there are people working on this (and a commercial port available).
Similarly, it's possible to use Java - I know, I've ported a JVM to an embedded environment; it wasn't fun - though you usually get a cut-down Java language configuration, often something based on J2ME.

building a system in Java and assembly language that runs on "bare metal"

Alright everyone,
Let's say i wanted to make a system that used only assembly and java, my question would be, as long as i included all of the jvm folders, classes, jars, etc.. java should still function effectively?
i understand there are those things that are compiled platform specifically but this is why i am asking, is it possible, using assembly to replicate all of the .exe, or other executable files that java has included into a pure assembly/java system?
If you are asking whether it is possible to build a system in Java and assembly language that runs on "bare metal", the answer is yes. There are a couple of current examples:
JavaOS is targeted primarily at the embedded systems domain. (Sun consider SunOS to be a "legacy" product line these days.)
JNode aims are broader, and encompass embedded systems, desktop systems, servers and cloud computing.
Be aware that building a system of this kind is a multi-year, multi-person project requiring deep understanding of virtual machine internals, compilers, garbage collectors, hardware architectures, device driver writing and so on.
If you are asking about something else, please be more explicit.
EDIT: responding to the OP's followup question:
It is not practical to use the Java and other "exe" files per se. They require a fully fledged operating system underneath them; e.g. Windows, Linux, whatever. If you had access to the source code, you could conceivably rewrite as required to make them run on "bare metal", but that would entail significant architectural changes, especially if you want to write device drivers, etc in Java. (Besides, the core of Sun's JRE is implemented in C++ ... ).
You cannot directly use the existing Java class library JAR files, because they include a significant amount of platform specific code. However, you can build your own Java class library JARs from an existing open-source version of the Java class libraries (e.g. the OpenJDK 6.0 J2SE libraries). You deal with the platform specific code by providing your own versions as native libraries or (as JNode does) as Java classes.
If I understand your question correctly, you mean something like JavaOS. Sure, its possible to implement the JVM raw on the hardware, not sure why you would, though. And if you did, why you wouldn't use C instead of Assembly for most of the work.
Its theoretically possible to implement the jvm in a whole other language. The best example I can think of is Python/Jpython where there is the original C implementation and a pure Java implementation of the language.
The main argumant against this is -- its a ton of work for not much benefit.
The official Sun jvm and supporting jni libraries are written mostly in C, you would need to provide native assembler implementations for most of the C POSIX APIs at the very least.
Also the original design goal of C was 'a portable assembly language' and to a large extent it still meets these goals. C produces efficient machine code and most C compilers will let code machine instructions inline with the C code.
Another benefit of C is the number of cross compilers available, you dont need to run the development environment on tHe target architecture, you can deveop and unit test on your favourite paltform/IDE, when you are ready you can then export your executables to the target platform.
Jikes RVM and Sun's Maxine provide a JVM implementation with little (of the order of 1 kloc) native code. However, both VMs require an OS and are only research implementations. The process of creating a stream of octets that form machine code, is obviously achievable in Java.
Have a look at JNode. They have been working on this for years.
http://www.jnode.org/

Java runtime vs OS calls

The Java runtime provides a set of standard system libraries for use by programs. To what extent are these libraries similar to the system calls of an operating system, and to what extent are they different???
Half the point of java was to make it platform independent, so what it tries to do is provide an api that remains the same regardless of the OS underneath it.
If the OS is underpowered, Java will add library code to compensate for it.
If the OS has an implementation that doesn't map, Java will do it's best to map it.
If a new function becomes popular and Java users need to provide access to it, a new library can be created through which you can access the new functionality. If this library is popular, it will be restructured and added into the Java SDK at some point
For instance, an implementation of some concurrency libraries became popular, and soon they were voted upon and added to the standard libraries. This happens all the time.
That obviously depends on the OS you're running on, since the system calls are generally different for every OS :-).
That said, I believe Java was mostly inspired by Unix conventions (not surprinsingly, as Sun is a Unix vendor), so some Java system libraries are similar to Unix sytem calls.
E.g. java.nio.MappedByteBuffer was probably inspired by Unix's mmap() call. But ultimately most concepts are present on most OSes, so you cannot really say what inspired what.
Some of Java's "low-level" functions are basically "wrappers" around some OS
system calls.
I don't see an objective way (and reason) to "compare" both.
If you are interested in this topic, you can search the Java source code
for the native keyword, which indicates some "hidden"
(mostly OS-dependend) functionality.
Java's standard library often has a similar feature set compared to the native library but there are several important differences.
Java is Object Oriented, whether you like it or not. The advantage of this is that certain concepts are easier to manage. For example, most file related operations are found directly in the File object. Compare this to Posix, where a FILE is a handle which is really just a number; an index into your process's open file list. The Posix approach is very close to how the OS actually implements stuff. But in Java you don't see that or know it or care.
Java has certain lowest-common-denominator behaviours in certain cases. There are many AWT APIs that are the way they are because AWT needed to be identical on a number of separate platforms. That turned out to be madness, and Sun quasi-deprecated most of AWT, because supporting platform equally meant supporting every platform crappily. The newer library, Swing, implements almost everything in pure Java, and thus is far better at cross-platform stuff, and thus has a richer API. And that API is very different from the native windowing library. Also, Swing doesn't integrate too well because it uses so little of the native OS.
Java has certain limitations that the native libraries don't have. For example, you don't have function pointers. Thus you have Listeners and Runnable and other Java patterns for doing things that in C++ would involve function pointers. So any API that needs one of these features will be significantly different in Java than in the native OS.
So in conclusion, Java often has libraries that offer similar behaviour to the native OS, and sometimes offer completely different behaviour, but it's best to think of Java as a platform in its own right. Sometimes you need advanced performance, such as OpenGL or super-fast data transfer, in which case you'll want a specific Java API (Jogl, nio), but most of the time you should evaluate Java as its own thing.

Categories

Resources