I am wondering what the interoperability is like between JRE 6 and the JVM from RTSJ. It seems that I have to use only their implementation (since the code will be interpreted by their JVM), so I cannot use many of the features that Java 6 has to offer.
Can it support a GUI (say, for example, to modify the parameters of an industrial process)?
I might be wrong; I'm hoping to get some feedback from you.
Also, it seems that there are several real-time implementations for Java. Which ones have you used, and which did you like most?
In order to provide real-time behavior, the JVM needs to be very specifically engineered. This includes integration at the operating system level to get access to real-time scheduling features of the host OS.
The Sun real-time JVM is compatible with J2SE 5, for instance: http://java.sun.com/javase/technologies/realtime/faq.jsp#4
Generally, any specialized instance of a system (OS, JVM, etc) that offers niche functionality, like security or real-time behavior, tends to be a release behind the general purpose version.
As to using a GUI for real-time, you should investigate two-tier client-server control of the real-time process using something like JMX, RMI or web services (whichever is the lightest-weight). Using a GUI directly in real-time code seems like it could introduce lots of potential problems for the application as it tries to execute within real-time constraints.
See my answer to another question for some more examples of RTSJ commercial-grade implementations. The latest version (2.1) is compliant with JDK 1.5, so you should have Swing/AWT available.
While it is feasible to write a GUI to execute within the same JVM as real-time processes, it's not clear that this is a good architectural decision. It is more likely that you'd prefer to isolate the real-time behaviors in one JVM and provide a separable interface that implements the GUI in a separate memory space.
In principle, you are supposed to be able to write RTSJ code such that it runs in the same JVM with non-real-time threads (and I have done a lot of this) but it can be tough to get synchronization right.
As this book describes, there can be interoperability between Sun's standard JRE and the RTSJ implementation.
I intend to swap out some hotspots in my Java code for something native (perhaps C++, but still to be decided). What are the modern choices for how to do this? (I don't intend or want to port the whole application).
I might for example use:
JNI
JNA
Spawn the native piece as a separate process, then communicate using:
TCP
UDP
shared memory
stdin/stdout
ZeroMQ, etc.
I have complete freedom at this point of how I build this. Is there a recommended best practice approach?
This is a very broad question. The best practice is going to be different depending on each "hot spot" your system has, and what alternatives you have available to you for native processing.
For example, in a Java application that I work with, we have several sections of graphics processing, including transformation from one format to another. While third-party Java libraries do exist, and I've tested several, I haven't found one that is as efficient and accurate at transforming PDFs to TIFFs as a C++ library I access via JNI. Why JNI over JNA? Performance, based on the versions we tested: the JNI access had higher throughput. This might not hold for all applications, but it did for my implementation.
Your hot spots could simply be an issue with your coding architecture, in which case some investigation and refactoring within Java might fix them without requiring any native replacement.
One application I refactored conducted several tasks in a procedural fashion. Even multi-threading the entire procedure wasn't working. I broke the procedure down into 8 different stages, and identified the bottleneck, which was then handled by a multi-threaded worker queue. Now the system runs efficiently, and is still 100% Java.
Communication to non-Java applications is also dependent on what facilities exist for the non-Java application to accept that communication. I've had scenarios where I've used Runtime.exec() to spawn a native process, dropped a data file in a folder for external processing, and made calls to Web Services. The "native" solution really depends on (1) what your available native resources are, (2) which native resource is the best for handling this "hot spot" use case, and (3) what accessibility to said native resources is available. And even then, refactoring your existing code might still be the best bet.
To reiterate: any given "hot spot" can be due to a number of factors, each requiring individual attention that a broad answer here cannot provide. Your best bet is to identify each "hot spot", and evaluate them individually as to how best to optimize them; if necessary, then ask a "hot spot"-specific question for handling it.
If the function is reasonably short (~milliseconds or less), in-process might be the only reasonable option. (When running a separate process, context switch latencies on each call may outweigh the execution time).
JNI is a portable API layer that is a de-facto standard; raw JNI is, however, not easy to use. JNA uses JNI under the hood, but it is higher level and easier to use, and quite popular. It might be your safest bet. You could also consider https://github.com/dejwk/janet (which I wrote); it also uses JNI under the hood, but I think it's even easier to use than JNA.
Since Clojure is designed to run in a Java virtual machine (JVM), I don't understand this statement:
While Clojure started its life mainly as a server-side language, the advent of ClojureScript demonstrates that the core developers don't see that as its only purpose.
I am not really familiar with Java, though I am interested in Lisp languages and hence Clojure, so this makes me wonder. Most web servers I've worked on are traditional Apache variants with standard server-side languages like Ruby, PHP, and Perl, but I've never seen Java as a default installed server language in my hosting environments, so what is the meaning of this statement?
Second, JVMs are typically run on client operating systems like Mac or Windows, are they not? Sun says there are many billions of JVMs in the world; obviously that is not referring to servers.
One main point is that Clojure has several important philosophies and practices that, when applied to a particular runtime environment such as the JVM, JavaScript/ECMAScript, etc., yield a powerful language. These philosophies include:
simplicity: The ability to separate distinct parts. All the Clojure variants provide for the separation of code and data, including the ability to deal with data independently of the code that produced it. Concretely, the ability to read and write both compound and simple (non-compound) data structures is built into the language.
Immutable Data Structures: All Clojure variants have data structures where producing a new version of even a very large data structure is efficient and leaves the old data intact. If you pass a large data structure to several threads, there is no need for locking because they work on different "forks" of the data. This is all done without copying (via structural sharing) and is efficient (see the sketch after this list).
Explicit handling of Identity, State and Time: All the Clojure variants provide explicit handling of sequences of events, built into the language. This differs between variants depending on the platform. For instance, ClojureScript, which produces JavaScript as its output, has no place for coordinated synchronous updates because JavaScript has only one thread, though it has all the rest of the types.
There is a lot more, and it can be found on the Clojure Philosophy Page. It is also worth mentioning that a very large part, if not the majority, of the world's web applications are written almost entirely in Java. Many people find that Clojure provides them with a way to interact with this world, even if Java isn't their preferred language.
Java is exactly as much of a server-side language as Ruby or Perl (though not really PHP): it's a general-purpose language that is frequently used to write server applications, including Web applications and SOA services. Whether or not Java is "installed as a default", it's typically trivial to install on the Unix machines that are the usual hosts for Java servers.
A JVM can theoretically run on any platform; there are JVMs that run on bare x86 hardware, and Blu-Ray players have embedded JVMs. Sun originally thought that Java was the future for rich-client applications, but instead it's found a much wider use in powering Web sites and other services that clients access through various APIs.
By server-side language, the author does not mean just the web server. It may include a whole stack of services running on the server, from simple file upload to big data processing, served to the clients via the web.
Second, JVMs are typically run on client operating systems like Mac or Windows, are they not?
The JVM is a development as well as a deployment platform. There are numerous web-scale applications deployed on the JVM. It is very common to have a JVM installed on your servers if your stack includes Java-based services.
The author of that statement means that Clojure was envisioned as a server-side language, but it has proven powerful enough that there is demand to use Clojure on the client as well.
An important distinction though is that it isn't like Clojure is actually running in a browser. ClojureScript is a tool that compiles client-side (i.e. browser) Clojure code into JavaScript. It is similar to CoffeeScript, which compiles Ruby-style code into JavaScript.
So ClojureScript is nothing more than syntactic sugar that lets people who love the power and succinctness of Clojure on the server side still write Clojure on the client side in the browser. But in the end, that client-side Clojure isn't really Clojure at all but JavaScript.
So when it comes to ClojureScript, the JVM is irrelevant.
I don't know much about Clojure's history, but it seems clear that it has been intended as a general-purpose language for some time--whatever initially pushed Hickey et al. to want to develop it. Because Clojure supports easy access to existing Java libraries and is able to create standard Java-style jar files--both crucial benefits on a server as well as elsewhere--it would have been obvious early on that Clojure could be useful outside of servers. So my answer to why "the advent of ClojureScript demonstrates that the core developers don't see [server side applications] as its only purpose" is that no such demonstration was needed.
This is a general "noob" question about software design, so I apologise if it seems vague, but I would really appreciate the advice. Note that the system described below is purely an example, not a specific product I have in mind.
I often have a need to combine the functionality of several libraries or utilities, written in different languages. For example, if I want to code a high-performance audio processing application for the desktop, I will write it in C / C++. Then, I want to add a nice GUI. But I don't want to learn Qt. I like the look and feel of Adobe Air, and would like to use that. Later, I have a need to access a USB device. But the USB library I have only has an API in Java. How can I combine all these elements together, to take advantage of their relative strengths?
Clearly, I cannot compile these various elements into one single executable. So I need to create and run them separately, and give them a means to communicate. The most common way to do this seems to be IPC (inter-process communication), e.g. shared memory or sockets. I prefer the idea of sockets, as the programs could potentially run on separate machines on a network.
So I decide to create a local client/server system, with a custom API, to allow these elements to communicate. For example, the Air application will receive a message from the C application, telling it to update its UI. The USB application running in Java will use the sockets to stream audio from the USB hardware into the C application.
My question : is using local sockets in this way a typical way to design such a system?
Will the performance be much worse than a truly native application (e.g. everything in Java or C, in a single executable)? It also seems likely that such an approach would be prone to bugs and difficult to maintain.
I frequently find myself coming up against the limits of existing software libraries (e.g. a graphics library with a pretty, flexible UI but no way to access low-level hardware, or a media library that can mix many audio streams, but has no support for video playback), and find it very frustrating. If anyone could advise the best way to combine arbitrary software libraries like this, I would really appreciate it.
Thanks in advance!
As you have correctly identified, combining libraries from different languages or platforms is hard. There are several ways to do it, but none are ideal. Examples:
Native call interfaces (e.g. JNI / JNA) - very fast but tricky to make work correctly, and you have the problem that the data types used typically don't map cleanly across different platforms. Adds native dependencies.
Socket-based IPC with a text protocol (XML, JSON, etc.) - works OK, and common formats are likely to be supported at both ends, but adds a lot of overhead. Can be a pain to maintain custom schema mappings etc. (see the sketch after this list).
Socket-based IPC with a binary protocol (e.g. Google protocol buffers) - quite efficient, but needs a lot of work to get a custom protocol working correctly on both ends.
Communication via a 3rd system (e.g. database, message queue, filesystem) - lots of overhead, can get fragile, introduces a major dependency on a 3rd system.
In my experience, it usually isn't worth integrating a new language / platform just to get one specific library or feature. Take your user interface example - no matter how nice Adobe Air looks, I doubt it is worth trying to integrate it with an existing C/C++ application.
Even if you get it to work, it will significantly complicate the future maintenance and development of your application. Builds become more complex. You need to maintain additional communication / "glue" code. You need to manage more dependencies. Your users will get hit by many more configuration issues. Testing becomes more difficult. It becomes harder to teach someone new how the whole system works. You need to maintain your skills in more languages / frameworks, etc.
I'd recommend the following strategy:
Pick a primary platform
Whenever you need a new library or feature, look for something on your primary platform first. Hopefully (usually?) there is something good available - but even if not, it might be worth coding something yourself if the requirement is quite small.
Only if there is no reasonable option on the primary platform should you start to think about integrating a new language/platform.
In terms of primary platform, I'd normally suggest a JVM language like Java, Scala or Clojure since the JVM is very well engineered, offers great performance, is highly portable and has the largest / most cohesive library ecosystem (most of which is open source). The JVM is therefore probably the best "general purpose" choice unless you have some very specific requirement which is unlikely to be possible on the JVM, e.g.:
If you are doing lots of embedded / real-time / systems programming that requires hardware access, you probably need to go for C/C++
If you are coding purely for web-based clients, you probably want to use JavaScript (if you are also writing code on the server side you can consider JavaScript code generation frameworks/libraries that can work on the JVM, e.g. Vaadin or ClojureScript)
The answer pretty much depends on the technologies you're using, and there is no silver-bullet solution for this.
In general, these solutions will fall into one of the following categories:
Some interprocess communication techniques
Integrations provided by the language/platform itself
Database/some common storage (even files :) )
Examples of the first:
Sockets/pipes/whatever your operating system allows.
CORBA - allows you to write distributed code in different languages.
Google protobuf - allows serialization/deserialization of data objects, and it's language-agnostic.
For the second, it really depends on the language/ecosystem you're using.
Examples for Java:
JNI - Java Native Interface - allows you to call native code (DLLs/.so files) from the JVM (see the sketch below).
JCA - if you're in an enterprise environment, you can use it to write integrations with legacy systems.
For languages that are compiled into native code it's less tricky - you can write and compile some code, say in Pascal, and then use the resulting DLL from C.
When we're talking about Java, there is also a plethora of languages that have their own syntax and compilers but compile to Java bytecode that can run inside the JVM. If your solution is based on these languages, the integration will be easier. Languages like Scala, Groovy, Clojure, Jython and so on fall into this category.
The last but not least technology to be mentioned is Web Services. This is a very popular tool for integrating different systems, although it's mostly used in enterprise environments.
Basically it's an abstraction over the socket layer that allows you to send data objects in XML/JSON format between processes/servers. Both XML and JSON are language-agnostic, so it's not an issue to create an XML document in a program written in C++ and then consume it in Java.
Hope this helps
I am about to get involved in an NLP-related project and I need to use various libraries. Some are in Java, others in C/C++ (for tasks that require more speed), and finally some are in Python. I was thinking of using Python as the "glue" and creating wrapper classes for every task I want to do that relies on a different language. To do that, the wrapper class would, for example, execute the Java program and communicate with it using pipes.
My questions are:
Do you think that would work for cpu-demanding and highly repetitive tasks? Or would the overhead added by the pipe-communication be too heavy?
Is there any other (preferably simple) architecture that you would suggest?
I would simply advise not doing this.
Don't implement stuff in C/C++ "for speed". The performance benefit is not likely to be as great as you expect; e.g. compared with implementing in Java using "best practice" design and performance techniques.
Don't try and glue lots of languages together. You are setting yourself up for lots of portability issues, difficulties in debugging, and reliability issues; e.g. due to C / C++ bugs crashing the JVM. In addition, there are performance overheads in bridging between languages, and there can be unexpected bottlenecks. (For instance, you may find that your C/C++ has to be run single-threaded due to threading issues, and that you therefore can't get the benefit of Java multi-threading on a typically multi-core system.)
Instead, I advise you to look for libraries that allow you to implement the entire application in one language. If that is not possible, design it so that the different language components are different executables / processes, communicating via some kind of RPC, messaging, or whatever.
Whether or not you'd have problems communicating over pipes / sockets has nothing to do with how CPU intensive the tasks are, but how frequently you'd need to send information between the processes and how much data they need to send. Setting up threads to do your communication will have little processing overhead.
You can probably automatically wrap the C/C++ code with Python (SWIG, ctypesgen, Boost.Python), so the only glue you'll have to write yourself would then be talking to Java.
You could also do it the other way -- run the Python code in the JVM with Jython so the Python and Java code are together, then talk to the C/C++ from there.
You should take a look at Apache UIMA. It is designed exactly for this. From the project website:
The Frameworks run the components, and are available for both Java and C++. The Java Framework supports running both Java and non-Java components (using the C++ framework). The C++ framework, besides supporting annotators written in C/C++, also supports Perl, Python, and TCL annotators.
UIMA can manage pipes and annotators and is built to scale.
I would look at Jepp or JPype instead of using IPC for this. I would avoid Jython since loading the C/C++ libraries into Java would probably be harder than into CPython.
1) Do you think that would work for cpu-demanding and highly repetitive tasks? Or would the overhead added by the pipe-communication be too heavy?
Depends on your task. If this is a typical NLP app where you have a large model loaded in memory and you only communicate relatively small pieces of data (strings in, label sequences/parse trees out), it may work. Pipe communication is hard to get right, though, since there are a lot of buffering and synchronization issues you have to tackle. Python is a very good glue language, but it doesn't solve everything.
2) Is there any other (preferably simple) architecture that you would suggest?
Make your NLP components services and connect to them via REST interfaces. There are off-the-shelf tools that do this, e.g. CLAM. Pyro and SPIRO make communication between Java and Python even more direct and might be easier to use than HTTP/REST (but YMMV).
The parts that are written in C/C++ can also be integrated with CPython using Cython. Don't start implementing things in C or C++ because you think they'll be faster, though; you can also implement them in Python first, then see if you can get the desired performance with NumPy and/or Cython.
The Java runtime provides a set of standard system libraries for use by programs. To what extent are these libraries similar to the system calls of an operating system, and to what extent are they different?
Half the point of Java was to make it platform-independent, so what it tries to do is provide an API that remains the same regardless of the OS underneath it.
If the OS is underpowered, Java will add library code to compensate for it.
If the OS has an implementation that doesn't map, Java will do its best to map it.
If new functionality becomes popular and Java users need access to it, a new library can be created through which you can access it. If this library is popular, it will be restructured and added into the Java SDK at some point.
For instance, an implementation of some concurrency libraries became popular, and soon they were voted upon and added to the standard libraries. This happens all the time.
That obviously depends on the OS you're running on, since the system calls are generally different for every OS :-).
That said, I believe Java was mostly inspired by Unix conventions (not surprisingly, as Sun is a Unix vendor), so some Java system libraries are similar to Unix system calls.
E.g. java.nio.MappedByteBuffer was probably inspired by Unix's mmap() call. But ultimately most concepts are present on most OSes, so you cannot really say what inspired what.
Some of Java's "low-level" functions are basically "wrappers" around OS system calls. I don't see an objective way (or reason) to "compare" the two.
If you are interested in this topic, you can search the Java source code for the native keyword, which indicates some "hidden" (mostly OS-dependent) functionality.
Java's standard library often has a feature set similar to that of the native libraries, but there are several important differences.
Java is object-oriented, whether you like it or not. The advantage of this is that certain concepts are easier to manage. For example, most file-related operations are found directly on the File object. Compare this to POSIX, where a file is a handle that is really just a number: an index into your process's open file list. The POSIX approach is very close to how the OS actually implements things, but in Java you don't see that, know it, or care.
Java has certain lowest-common-denominator behaviours in certain cases. Many AWT APIs are the way they are because AWT needed to be identical on a number of separate platforms. That turned out to be madness, and Sun quasi-deprecated most of AWT, because supporting every platform equally meant supporting every platform crappily. The newer library, Swing, implements almost everything in pure Java, and thus is far better at cross-platform work and has a richer API. And that API is very different from the native windowing library. On the other hand, Swing doesn't integrate very well with the native OS, precisely because it uses so little of it.
Java has certain limitations that the native libraries don't have. For example, you don't have function pointers. Thus you have Listeners and Runnable and other Java patterns for doing things that in C++ would involve function pointers. So any API that needs one of these features will be significantly different in Java than in the native OS.
So in conclusion, Java often has libraries that offer similar behaviour to the native OS, and sometimes offers completely different behaviour, but it's best to think of Java as a platform in its own right. Sometimes you need advanced performance, such as OpenGL or super-fast data transfer, in which case you'll want a specific Java API (JOGL, NIO), but most of the time you should evaluate Java as its own thing.