Using IPC to combine multiple languages - java

This is a general "noob" question about software design, so I apologise if it seems vague,
but I would really appreciate the advice. Note the system described below is purely an example, not a specific product I have in mind.
I often have a need to combine the functionality of several libraries or utilities, written in different languages. For example, if I want to code a high-performance audio processing application for the desktop, I will write it in C / C++. Then, I want to add a nice GUI. But I don't want to learn Qt. I like the look and feel of Adobe Air, and would like to use that. Later, I have a need to access a USB device. But the USB library I have only has an API in Java. How can I combine all these elements together, to take advantage of their relative strengths?
Clearly, I cannot compile these various elements into one single executable. So I need to create and run them separately, and give them a means to communicate. The most common way to do this seems to be using IPC (Inter-Process Communication), e.g. shared memory or sockets. I prefer the idea of sockets, as the programs could potentially run on separate machines on a network.
So I decide to create a local client / server system, with a custom API, to allow these elements to communicate. For example, the Air application will receive a message from the C application, telling it to update its UI. The USB application running in Java will use the sockets to stream audio from the USB hardware into the C application.
My question : is using local sockets in this way a typical way to design such a system?
Will the performance be much worse than a truly native application (e.g. everything in Java or C, in a single executable)? It also seems likely that such an approach would be prone to bugs, and difficult to maintain?
I frequently find myself coming up against the limits of existing software libraries (e.g. a graphics library with a pretty, flexible UI but no way to access low-level hardware, or a media library that can mix many audio streams, but has no support for video playback), and find it very frustrating. If anyone could advise the best way to combine arbitrary software libraries like this, I would really appreciate it.
Thanks in advance!

As you have correctly identified, combining libraries from different languages or platforms is hard. There are several ways to do it, but none are ideal. Examples:
Native call interfaces (e.g. JNI / JNA) - very fast but tricky to make work correctly, and you have the problem that the data types used typically don't map cleanly across different platforms. Adds native dependencies.
Socket based IPC with text protocol (XML, JSON, etc.) - works OK and common formats are likely to be supported at both ends, but adds a lot of overhead. Can be a pain to maintain custom schema mappings etc. (a minimal sketch follows this list)
Socket based IPC with binary protocol (e.g. Google protocol buffers) - quite efficient, but needs a lot of work to get a custom protocol working correctly on both ends.
Communication via a 3rd system (e.g. database, message queue, filesystem) - lots of overhead, can get fragile, introduces a major dependency on a 3rd system.
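To make the socket-based options above concrete, here is a minimal sketch of the text-protocol variant in Java. It is not a recommendation of a specific protocol: the port number and the one-line "UPDATE_UI" command are invented purely for illustration, and a real system would define and version its own message format (or layer JSON/protocol buffers over the same socket).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

// Minimal sketch: one process listens on a local socket and exchanges
// newline-delimited text commands with another process (written in any
// language that can open a TCP socket). Port and commands are invented.
public class ControlChannelServer {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(5555)) {
            while (true) {
                try (Socket client = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(client.getInputStream(), "UTF-8"));
                     PrintWriter out = new PrintWriter(
                             new OutputStreamWriter(client.getOutputStream(), "UTF-8"), true)) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        // One request per line, one reply per line.
                        if (line.startsWith("UPDATE_UI")) {
                            out.println("OK");
                        } else {
                            out.println("ERROR unknown command");
                        }
                    }
                }
            }
        }
    }
}
```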
In my experience, it usually isn't worth integrating a new language / platform just to get one specific library or feature. Take your user interface example - no matter how nice Adobe Air looks, I doubt it is worth trying to integrate it with an existing C/C++ application.
Even if you get it to work, it will significantly complicate the future maintenance and development of your application. Builds become more complex. You need to maintain additional communication / "glue" code. You need to manage more dependencies. Your users will get hit by many more configuration issues. Testing becomes more difficult. It becomes harder to teach someone new about how the whole system works. You need to maintain your skills in more languages / frameworks etc.
I'd recommend the following strategy:
Pick a primary platform
Whenever you need a new library or feature, look for something on your primary platform first. Hopefully (usually?) there is something good available - but even if not then it might be worth coding something yourself if the requirement is quite small.
Only if there is no reasonable option on the primary platform should you start to think about integrating a new language/platform
In terms of primary platform, I'd normally suggest a JVM language like Java, Scala or Clojure since the JVM is very well engineered, offers great performance, is highly portable and has the largest / most cohesive library ecosystem (most of which is open source). The JVM is therefore probably the best "general purpose" choice unless you have some very specific requirement which is unlikely to be possible on the JVM, e.g.:
If you are doing lots of embedded / realtime / systems programming that requires hardware access, you probably need to go for C/C++
If you are coding purely for web-based clients, you probably want to use JavaScript (if you are also writing code on the server side you can consider JavaScript code generation frameworks/libraries that can work on the JVM, e.g. Vaadin or ClojureScript)

The answer pretty much depends on the technologies you're using, and there is no silver-bullet solution for this.
In general, these solutions fall into one of the following categories:
Some interprocess communication techniques
Integrations provided by the language/platform itself
Database/some common storage (even files :) )
Examples of the first:
Sockets/pipes/whatever your operating system allows.
CORBA - allows you to write distributed code in different languages.
Google protobuf - allows serialization/deserialization of data objects and is language-agnostic
For the second, it really depends on the language/ecosystem you're using.
Examples for java:
JNI - Java Native Interface - allows calling native code (DLLs / .so libraries) from the JVM (see the sketch below).
JCA - if you're in an enterprise environment, you can use this to write integrations with legacy systems.
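As a rough illustration of the JNI route mentioned above, here is what the Java side might look like. The library name ("usbbridge") and the readSamples method are invented for this example; the matching C/C++ implementation would have to be compiled into a platform-specific .dll or .so that exports the corresponding JNI function.

```java
// Java side of a hypothetical JNI binding. The native method is implemented
// in C/C++ and loaded from usbbridge.dll / libusbbridge.so at runtime.
public class UsbBridge {
    static {
        System.loadLibrary("usbbridge"); // library name, minus platform prefix/extension
    }

    // Declared here, implemented in native code; the JVM links it on first use.
    public native int readSamples(short[] buffer, int maxSamples);

    public static void main(String[] args) {
        short[] buffer = new short[4096];
        int n = new UsbBridge().readSamples(buffer, buffer.length);
        System.out.println("Read " + n + " samples via native code");
    }
}
```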
For languages that are compiled to native code it's less tricky - you can write and compile some code, say in Pascal, and then use the resulting DLL from C.
When we're talking about Java, there is also a plethora of languages that have their own syntax and compiler but compile to Java bytecode that can be run inside the JVM. So if your solution is based on these languages, the integration will be easier. Languages like Scala, Groovy, Clojure, Jython and so on fall into this category.
Last but not least, Web Services deserve a mention. They are a very popular tool for integrating different systems, although they are more commonly used in enterprise environments.
Basically it's an abstraction over the socket layer that allows you to send data objects in XML/JSON format between processes/servers. Both XML and JSON are language-agnostic, so it's not an issue to create an XML document in a program written in C++ and then consume it in Java.
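For a feel of how little code the consuming side needs, here is a hedged sketch of a Java client fetching a JSON document from another process over HTTP. The endpoint URL is made up, and turning the returned string into objects is left to whatever JSON library you prefer (Jackson, Gson, etc.).

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch: fetch JSON produced by another process (possibly written in C++)
// over plain HTTP. The /status endpoint is hypothetical.
public class JsonClient {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://localhost:8080/status");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Accept", "application/json");
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line);
            }
        } finally {
            conn.disconnect();
        }
        System.out.println("Received: " + body);
    }
}
```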
Hope this helps

Related

Platform-Independent Java <-> C# Interoperability

We want to use existing C# sources within our Java project. So far, this would not be a great problem since using e.g. the Java Native Interface (JNI) is quite straightforward.
The problem is that the software shall also run on non-windows OS. So, we can compile the C# sources with Mono in order to make them executable on e.g. Linux. But how about the integration within Java? JNI or any COM-based solutions for C# <-> Java interoperability are OS-dependent and only work e.g. on Windows.
One possible solution would be the implementation of web services. Does anybody have another idea of how to solve this problem? I would be very thankful for alternative suggestions!
Thanks very much!
Regards
This is maybe not an "answer" as such, more a bit of discussion of how I viewed a similar (I think) situation.
I had a major investment in a C#/.Net-based client-server style system. So when I decided that I also wanted to support an Android "client" app I looked into various options. To me the most important factor was to maintain my C# classes as the defining classes for the object interchange between the existing system and the to-be-written Java Android app.
What I eventually settled on, and tweaked to my liking, was a system where Google Protocol Buffers is the interchange medium. (If you're not familiar with them, they are a compact, language-neutral interchange format - conceptually similar to JSON, but binary.)
https://developers.google.com/protocol-buffers/
At the .Net end I use ProtoBuf-Net, written by Marc Gravell (he works here at SO, I believe). It includes the ability to take .Net objects and generate .proto files, the defining file for Protocol Buffers.
https://code.google.com/p/protobuf-net/
At the Android end I use ProtoStuff, written by David Yu. There is a part of his code that takes a .proto file and generates the corresponding Java classes.
https://code.google.com/p/protostuff/
One problem I encountered was that this didn't work well for my .Net classes that are derived classes, which was most of them. I created a workaround that is described in my comment to the answer here:
How to get protobuf-net to flatten and unflatten inherited classes in .Net?
This is now working to my satisfaction.
Note that I haven't talked at all about how the Android app connects to the Windows-based system and how the communications is performed. That was secondary for me - my primary consideration was making the C# class definitions the definitive definitions and having Java classes created from them automatically, and then the object-to-object interchange. (In the event I'm using a home-made TCP/IP communications link, but the actual communications could be anything, probably also web services.)
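To give a rough idea of what the Java/Android end of such an interchange looks like, here is a hedged sketch using the standard protobuf-java builder API (ProtoStuff's generated code is similar in spirit but not identical). ClientRecord and its fields are hypothetical; in the setup described above they would be generated from a .proto file that was itself produced from the C# classes.

```java
import com.example.interchange.ClientRecord; // hypothetical class generated from a .proto file

// Sketch of the object interchange step: build a message, serialize it to
// bytes for the wire (or a file), and parse it back on the other side.
public class InterchangeExample {
    public static void main(String[] args) throws Exception {
        ClientRecord record = ClientRecord.newBuilder()
                .setId(42)
                .setName("example")
                .build();
        byte[] wire = record.toByteArray();   // send these bytes over your link of choice

        ClientRecord parsed = ClientRecord.parseFrom(wire);
        System.out.println(parsed.getName());
    }
}
```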
Hope this helps.
So I did a lot of research on this topic and want to share my findings with you:
One (from a technical point of view very attractive) option is to use commercial bridges between Java and .Net. For sure, the most popular products are JNBridge and Javonet. Both products seem to be quite easy to use, have good support and seem to be very sophisticated. In particular, JNBridge already supports bridging between Java and Mono too, which allows porting to non-Windows OSes as well - one of our main requirements as stated above. Javonet also wants to integrate Mono and is going to release this feature soon. However, both solutions are commercial and one needs to weigh their features against the respective costs. Nevertheless, from a purely technical point of view, they look great and also claim to enable very fast communication between Java and .Net (faster than with web services).
Another option is to connect Java and .NET via COM. Since COM is generally defined platform-independently, this could work on multiple OSes. There are lots of open source projects that could be used for such an implementation, such as EZJCOM, J-Interop, JACOB or JCOM. The main restriction (especially for our project) is that Mono only supports COM interoperability under Windows (yet). So, this is not really an option for us. But if you want to create Java-.NET interoperability on Windows only, this is a good way.
The straightforward way of integrating Java and C# is to use the Java Native Interface (JNI). You can also find manifold implementations that make JNI easier to use; the most popular one is probably jni4net, which seems to be a very active and frequently used project. But there are also others with specific pros and cons, such as Caffeine, Espresso or csjni. Finally, JNI is not 100% platform-independent. It is applicable on different platforms, but you have to generate platform-specific code, which makes it clearly less usable for our purposes. If you limit your application to Windows, jni4net seems to be a very good choice.
The third option could be to run both the Java and the .Net part within a Common Language Runtime. IKVM.NET is one possible and very popular solution for this (as mentioned above by Samuel Audet). The drawback of this option is the loss of features and efficiency of the JDK.
The last and surely most generic alternative is to set up web services between the Java and the .Net world. For this solution, one needs to find appropriate ways of serializing/deserializing objects from/to Java and .Net. There are manifold possible solutions for that available. RenniePet mentioned a sophisticated solution based on Protocol Buffers. Others exist as well, such as http://java-cs-bridge.sourceforge.net/. This option might have a drawback in terms of communication overhead at runtime, but may be the way to go for us.
Hope this may help anyone in the future that is confronted with the same problem.

Clojure targeted initially to servers?

Since Clojure is designed to run in a Java virtual machine (JVM), I don't understand this statement:
While Clojure started its life mainly as a server-side language, the advent of ClojureScript demonstrates that the core developers don't see that as its only purpose.
I am not real familiar with Java though I am interested in Lisp languages and hence Clojure, so this makes me wonder. Most web servers I've worked on are traditional Apache variants with standard server-side languages like Ruby, PHP, Perl, but I've never seen Java as a default installed server language in my hosting environments, so what is the meaning of this statement?
Second, JVMs are typically run on client operating systems like Mac or Windows, are they not? Sun says there are many billions of JVMs in the world, obviously this is not referring to servers.
One main point is that Clojure has several important philosophies and practices that, when applied to a particular runtime environment such as the JVM, JavaScript/ECMAScript, etc. yield a powerful language. These philosophies include:
simplicity: The ability to separate distinct parts. All the Clojure variants provide for the separation of code and data. This includes the ability to deal with data independently of the code that produced it; the ability to read and write compound and simple (non-compound) data structures is built directly into the language.
Immutable Data Structures: All Clojure variants have data structures where producing a new version of even a very large data structure is efficient and leaves the old data intact. If you pass a large data structure to several threads there is no need for locking, because they work on different "forks" of the data. This is all done without copying (via structural sharing) and is efficient.
Explicit handling of Identity, State and Time: All the Clojure variants provide explicit handling of sequences of events built into the language. This differs between variants depending on the platform. For instance ClojureScript, which produces JavaScript as its output, has no place for coordinated synchronous updates because JavaScript has only one thread, though it has all the rest of the types.
There is a lot more, and it can be found on the Clojure Philosophy page. It is also worth mentioning that, if not the majority, then a very large part of the world's web applications are written almost entirely in Java. Many people find that Clojure provides them a way to interact with this world, even if Java isn't their preferred language.
Java is exactly as much of a server-side language as Ruby or Perl (though not really PHP): It's a general-purpose language that is frequently used to write server applications, including Web applications and SOA services. Whether Java is "installed as a default", it's typically trivial to install on the Unix machines that are the usual hosts for Java servers.
A JVM can theoretically run on any platform; there are JVMs that run on bare x86 hardware, and Blu-Ray players have embedded JVMs. Sun originally thought that Java was the future for rich-client applications, but instead it's found a much wider use in powering Web sites and other services that clients access through various APIs.
By server-side language the author does not mean just the web server. It may include a whole stack of services running on the server from simple file upload to big data processing, served to the clients via the web.
Second, JVMs are typically run on client operating systems like Mac or Windows, are they not?
The JVM is a development as well as a deployment platform. There are numerous web-scale applications deployed on the JVM. It is very common to have a JVM installed on your servers if your stack includes Java-based services.
The author of that statement means that Clojure was envisioned as a server-side language, but it has proven appealing enough that there is demand to be able to use Clojure on the client as well.
An important distinction though is that it isn't like Clojure is actually running in a browser. ClojureScript is a tool that compiles client-side (i.e. browser) Clojure code into JavaScript. It is similar to CoffeeScript, which compiles Ruby-style code into JavaScript.
So ClojureScript is nothing more than syntactic sugar that lets people who love the power and succinctness of Clojure on the server side still write Clojure on the client side in the browser. But in the end, that client-side Clojure isn't really Clojure at all but JavaScript.
So when it comes to ClojureScript, the JVM is irrelevant.
I don't know much about Clojure's history, but it seems clear that it has been intended as a general-purpose language for some time--whatever initially pushed Hickey et al. to want to develop it. Because Clojure supports easy access to existing Java libraries and is able to create standard Java-style jar files--both crucial benefits on a server as well as elsewhere--it would have been obvious early on that Clojure could be useful outside of servers. So my answer to why "the advent of ClojureScript demonstrates that the core developers don't see [server side applications] as its only purpose" is that no such demonstration was needed.

Find an efficient way to integrate different language libraries into one project using Python as the "glue"

I am about to get involved in an NLP-related project and I need to use various libraries. Some are in Java, others in C/C++ (for tasks that require more speed) and finally some are in Python. I was thinking of using Python as the "glue" and creating wrapper classes for every task I want to do that relies on a different language. In order to do that, the wrapper class, for example, would execute the Java program and communicate with it using pipes.
My questions are:
Do you think that would work for cpu-demanding and highly repetitive tasks? Or would the overhead added by the pipe-communication be too heavy?
Is there any other (preferably simple) architecture that you would suggest?
I would simply advise not doing this.
Don't implement stuff in C/C++ "for speed". The performance benefit is not likely to be as great as you expect; e.g. compared with implementing in Java using "best practice" design and performance techniques.
Don't try and glue lots of languages together. You are setting yourself up for lots of portability issues, difficulties in debugging, and reliability issues; e.g. due to C / C++ bugs crashing the JVM. In addition, there are performance overheads in bridging between languages, and there can be unexpected bottlenecks. (For instance, you may find that your C/C++ has to be run single-threaded due to threading issues, and that you therefore can't get the benefit of Java multi-threading on a typically multi-core system.)
Instead, I advise you to look for libraries that allow you to implement the entire application in one language. If that is not possible, design it so that the different language components are different executables / processes, communicating via some kind of RPC, messaging, or whatever.
Whether or not you'd have problems communicating over pipes / sockets has nothing to do with how CPU intensive the tasks are, but how frequently you'd need to send information between the processes and how much data they need to send. Setting up threads to do your communication will have little processing overhead.
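To make the pipe option concrete, here is a minimal sketch of the Java side of such a setup: the Python glue spawns this process, writes one request per line to its stdin, and reads one reply per line from its stdout. The "TAG"/"RESULT" line format is invented for illustration.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

// Sketch of a Java worker process driven over pipes: requests arrive on
// stdin, replies go to stdout, one line each. Flushing after every reply
// matters, otherwise the parent process can block waiting on a full buffer.
public class PipeWorker {
    public static void main(String[] args) throws Exception {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in, "UTF-8"));
        String line;
        while ((line = in.readLine()) != null) {
            if (line.startsWith("TAG ")) {
                String text = line.substring(4);
                // Real NLP work would happen here; upper-casing is a placeholder.
                System.out.println("RESULT " + text.toUpperCase());
            } else {
                System.out.println("ERROR unknown request");
            }
            System.out.flush();
        }
    }
}
```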
You can probably automatically wrap the C/C++ code with Python (SWIG, ctypesgen, Boost.Python), so the only glue you'll have to write yourself would then be talking to Java.
You could also do it the other way -- run the Python code in the JVM with Jython so the Python and Java code are together, then talk to the C/C++ from there.
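If you go the Jython route, the embedding can also be driven from the Java side. Here is a hedged sketch using Jython's PythonInterpreter (assuming the Jython standalone jar is on the classpath); the embedded Python snippet is just a placeholder for your real Python components.

```java
import org.python.util.PythonInterpreter;

// Sketch: run Python code inside the JVM so the Python and Java parts share
// one process. The tokenize() function is a stand-in for real NLP code.
public class JythonGlue {
    public static void main(String[] args) {
        PythonInterpreter python = new PythonInterpreter();
        python.exec("def tokenize(s):\n    return s.split()");
        python.set("text", "glue languages carefully");
        python.exec("tokens = tokenize(text)");
        System.out.println(python.get("tokens"));
        python.cleanup();
    }
}
```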
You should take a look at Apache UIMA. It is designed exactly for this. From the project website:
The Frameworks run the components, and are available for both Java and C++. The Java Framework supports running both Java and non-Java components (using the C++ framework). The C++ framework, besides supporting annotators written in C/C++, also supports Perl, Python, and TCL annotators.
UIMA can manage pipes and annotators and is built to scale.
I would look at Jepp or JPype instead of using IPC for this. I would avoid Jython since loading the C/C++ libraries into Java would probably be harder than into CPython.
1) Do you think that would work for cpu-demanding and highly repetitive tasks? Or would the overhead added by the pipe-communication be too heavy?
Depends on your task. If this is a typical NLP app where you have a large model loaded in memory and you only communicate relatively small pieces of data (strings in, label sequences/parse trees out), it may work. Pipe communication is hard to get right, though, since there's a lot of buffering and synchronization issues you have to tackle. Python is a very good glue language, but it doesn't solve everything.
2) Is there any other (preferably simple) architecture that you would suggest?
Make your NLP components services and connect to them via REST interfaces. There are off-the-shelf tools that do this, e.g. CLAM. Pyro and SPIRO make communication between Java and Python even more direct and might be easier to use than HTTP/REST (but YMMV).
The parts that are written in C/C++ can also be integrated with CPython using Cython. Don't start implementing things in C or C++ because you think they'll be faster, though; you can also implement them in Python first, then see if you can get the desired performance with NumPy and/or Cython.

Java app & C++ app integration / communication

We have two code bases, one written in C++ (MS VS 6) and another in Java (JDK 6).
Looking for creative ways to make the two talk to each other.
More Details:
Both applications are GUI applications.
Major rewrites or translations are not an option.
Communications needs to be two-way.
Try to avoid anything involving writing files to disk.
So far the options considered are:
ZeroMQ
RPC
CORBA
JNI
Compiling Java to native code, and then linking
Essentially, apart from the last item, this boils down to a choice between various ways to achieve interprocess communication between a Java application and a C++ application. Still open to other creative suggestions!
If you have attempted this, or something similar before please chime in with your suggestions, lessons learnt, pitfalls to avoid, etc.
Someone will no doubt point out shortly, that there is no one correct answer to this question. I thought I would tap on the collective expertise of the SO community anyway, and hope to get many excellent answers.
Well, it depends on how tightly integrated you want these applications to be and how you see them evolving in the future. If you just want to communicate data between the two of them (e.g. you want one to be able to open a file written by the other, or read a stream directly from the other), then I would say that protocol buffers are your best bet. If you want the window rendered by one of these GUI apps to actually be embedded in a panel of the other GUI app, then you probably want to use the JNI approach. With the JNI approach, you can use SWIG to automate a great deal of it, though it is dangerously magical and comes with a number of caveats (e.g. it doesn't do so well with function overloading).
I strongly recommend against CORBA, RMI, and similar remote-procedure-call implementations, mostly because, in my experience, they tend to be very heavyweight and consume a lot of resources. If you do want something similar to RMI, I would recommend something lighter weight where you pass messages, but not actual objects (as is the case with RMI). For example, you could use protocol buffers as your message format, and then simply serialize these back and forth across normal sockets.
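As a rough sketch of that last suggestion, here is what the Java end might look like when passing protocol-buffer messages over a plain socket. EventMessage and its fields are hypothetical classes generated from a .proto file shared with the C++ build; writeDelimitedTo/parseDelimitedFrom add a length prefix so consecutive messages can be separated on the stream (the C++ side needs matching length-prefix handling).

```java
import java.net.Socket;

import com.example.ipc.EventMessage; // hypothetical class generated from a shared .proto

// Sketch: message-passing (not object-passing) between the Java and C++ apps,
// using protobuf for the payload and a plain TCP socket as the transport.
public class MessageChannel {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("localhost", 9000)) { // port is made up
            EventMessage request = EventMessage.newBuilder()
                    .setType("OPEN_FILE")
                    .setPayload("C:\\data\\example.wav")
                    .build();
            request.writeDelimitedTo(socket.getOutputStream());

            EventMessage reply = EventMessage.parseDelimitedFrom(socket.getInputStream());
            System.out.println("Reply type: " + reply.getType());
        }
    }
}
```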
Kit Ho mentioned XML or JSON, but protocol buffers are significantly more efficient than either of those formats and also have notions of backwards-compatibility built directly into the definition language.
Use Jacob ( http://sourceforge.net/projects/jacob-project ), JCom ( http://sourceforge.net/projects/jcom ), or j-Interop ( http://j-interop.org ) and use COM for communication.
Since you're using Windows, I'd suggest using DDE (Dynamic Data Exchange). There's a Java library available from Java Parts.
I don't know how much data, or what type of data, you want to transfer and communicate.
But to keep things simple, I suggest using XML or JSON over HTTP.
There are lots of libraries for both applications, so you won't spend too much effort implementing and understanding it.
Moreover, if you have additional applications to talk to, it is not hard to add them, since both technologies are cross-language (a minimal sketch of the Java side follows below).
Correct me if I am wrong.
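To show roughly how small the Java side of a JSON-over-HTTP link can be, here is a hedged sketch using the JDK's built-in com.sun.net.httpserver.HttpServer (available since Java 6). The /status path and the hand-written JSON payload are made up; the C++ application would call it with any HTTP client library such as libcurl.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;

// Sketch: expose a tiny JSON endpoint from the Java app so the C++ app
// (or anything else with an HTTP client) can exchange data with it.
public class StatusEndpoint {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/status", new HttpHandler() {
            public void handle(HttpExchange exchange) throws IOException {
                byte[] body = "{\"state\":\"ready\",\"jobs\":3}".getBytes("UTF-8");
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                OutputStream out = exchange.getResponseBody();
                out.write(body);
                out.close();
            }
        });
        server.start();
        System.out.println("Listening on http://localhost:8080/status");
    }
}
```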

Java Backend and Rails Frontend

I have a startup considering building a Java backend and a Rails frontend. The Java backend will take care of creating a caching layer for the database and offer other additional services. The Rails frontend will mostly be for creating the webapp and monitoring tools.
What startups/companies out there are using this kind of setup? What are some gotchas in terms of development speed, deployment, scalability, and integration?
(What would be helpful for me is personal experience or informal case studies. I'd like to de-prioritize answers addressing alternatives like Grails or JRuby unless it turns out to be a big part of the equation.)
Thanks!
I've never done any Rails development, but here are my thoughts. Why not just use Grails? I've done a fair amount of Grails development and it works well for rapid prototyping. It also offers all the power of Java, Spring, and Hibernate. Instead of dealing with communication between two different technologies, you could take advantage of the fact that Grails uses Spring and Hibernate under the covers to deal with caching as well as any other requirements these technologies support. If you have Java developers, Grails should not be too hard to pick up.
The Grails plugin story is decent. All of them are stored in a central place and are easy to obtain, but quality differs depending on the plugin author. You also have to remember that since Grails uses Groovy, which is syntactically similar to Java and runs on the JVM, it is very easy to use existing Java code, including the multitude of available Java libraries. I don't know this for certain, but I assume there are a lot more libraries out there for use with Java, and therefore with Grails, than are available for the Ruby language and Rails.
I can't make a resource usage call for you, but my question would be: how many people do you have with experience designing Rails systems that use Java on the back end, with all the gotchas that will go along with that? You may find that you have no one who is proficient at debugging when something goes wrong in communication between the two technologies.
I have used similar pair for one of my projects, but instead of RoR I used Python. I don't think there's large difference.
In general, there's nothing specific in this kind of programming. Two most important things you must care about:
good modularity;
well thought-out protocol between RoR and Java.
First is about exact functionality decomposition between parts of the system. We had some trouble because of not understanding what part of a job must be done by Java and what part by Python. In general, you must tie together all functions that are close to each other, and things that are far apart must be connected in very few places. I guess you know the rules of good modularity, but when composing different languages this must be thought out much more carefully. You may also be interested in creating several distinct Java services (e.g. one for database caching and another one for all the rest) to be able to freely combine them or even use them in other projects later.
Second is about communication between your parts. I can see two ways to communicate: through a database and through a pure network protocol. The former needs some network communication anyway, so we used a pure network protocol, without any other way to connect the parts.
We experimented a lot with SOAP, but it gave hundreds of errors: this protocol is quite OK for connecting services written in one language (i.e. Java to Java), but terrible for connecting services in different languages - automatic tools for generating WSDLs gave different results for Java and Python, and manually creating the schema was hard and labor-intensive.
So we switched to REST. It uses all the features of the HTTP protocol, such as the four main HTTP methods (POST, GET, PUT, DELETE), error codes and many other things, so it covers almost everything you may want. The only restriction with REST is that it doesn't hold state, so you may need to implement your own session mechanism.
If you're not very comfortable with REST and seek some real example, see Facebook Graph API and for implementing REST services in Java you can use Restlets.
Perhaps I'm stating the obvious here, but the biggest gotcha here is the actual fact you're combining two technologies. Not only will development be slower - Rails and all Java frameworks are expecting to have a full Ruby/Java application - but for the other issues you mention (deployment, scalability, integration), the existing tools and solutions will not work, or at least not very well.
If you have a compelling reason to combine the two, go ahead but expect to spend more time on these issues than with a single technology solution. If you don't, pick either Rails or Java.
I would strongly urge you to consider having the frontend and backend in the same JVM for performance reasons.
For Rails, JRuby can run Rails, and you can have it in a Java EE container providing fast communication between components.
Having multiple architectures which do not run in the same process requires you to do network-based communication, which takes much, much longer than just passing a reference to an object (I do not expect you to want to use shared memory).
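One way to realise that single-JVM setup is JRuby's embedding API (JRuby Embed, a.k.a. Red Bridge). The sketch below assumes the jruby-complete jar is on the classpath, and the Ruby snippet is only a placeholder for whatever Rails-side code you would actually call; the point is that the exchange is an in-process method call rather than a network round trip.

```java
import org.jruby.embed.ScriptingContainer;

// Sketch: call Ruby code from the Java backend inside the same JVM/process,
// so no socket or serialization is involved in the exchange.
public class InProcessRuby {
    public static void main(String[] args) {
        ScriptingContainer ruby = new ScriptingContainer();
        Object result = ruby.runScriptlet("['cache', 'layer', 'status'].join(' ')");
        System.out.println("Ruby said: " + result);
    }
}
```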
Twitter.com is supposedly using a Java backend, and Rails for their web interface. Here are some resources:
http://www.radicalbehavior.com/5-question-interview-with-twitter-developer-alex-payne/
http://blog.adsdevshop.com/2008/05/02/twitter-is-not-abandoning-rails/
http://twitter.com/#!/ev/status/801530348
More specifically, they use Scala on the backend (which compiles to Java byte code):
http://www.artima.com/scalazine/articles/twitter_on_scala.html
I don't think there is anything wrong with this approach. However, you will be supporting two different platforms/VMs. I have seen this kind of situation result in a sort of fracturing of expertise within an organization. And you end up paying a lot to have two sets of expertise in house.
If the problem of segmenting expertise concerns you, I would suggest running Rails on JRuby. It is very mature now - and runs better than Rails on Ruby 1.8.x.
We have used Ruby on Rails for front-end and Core-Java SOAP Api for backend. It worked very well.
The main issue is not performance but people. It is very difficult to hire people who are experts in Ruby on Rails for everything; instead you hire a few, or existing developers can fairly easily learn just the Rails UI side.
From a startup point of view, it is a good strategy. Many believe RoR is best for startups, but very few developers know it, because most large companies use Java; so there is pressure either to learn Rails and its coding style, or to hire from the very small pool of RoR developers. Instead, you can use RoR for the frontend (only a few people need to do that, or it is easy to learn) and use experienced Java developers for DB updates and business logic.
