I'm developing a Java application that consists of a server and a client (possibly multiple clients in future) which may run on different hosts.
For communication between the two I currently use a custom protocol consisting of JSON messages that are sent over network sockets and converted back into Java Bean objects on both sides. However, as the application grows more complex, I notice that this approach no longer meets my standards and has become unwieldy.
I'm looking for a well established, possibly standardized alternative.
I've looked at Remote Method Invocation (RMI) but read that the protocol is slow (big network overhead).
The technology I'm looking for should be lightweight (protocol- and library-wise), robust, maybe support compression (big plus if it does!), maybe support encryption, well documented and well established (e.g. an Apache project). It should be as easy as calling a method on a remote object with RMI, but without its disadvantages.
What can you recommend?
Avro is an Apache project that is designed for cross-language RPC (see Thrift for its spiritual predecessor). It is fairly new (less than two years old), so it isn't as well-established as RMI, for example. You should still give it a chance, though; large projects like Cassandra are moving to Avro. Avro is also a sub-project under Hadoop and has been receiving healthy support from that community.
It is designed to be fast and to support multiple languages, so you will probably want to introduce another step during compilation in which you translate an Avro IDL file into Java, although this isn't strictly necessary. The rest is typical RPC.
One nice thing about Avro is that its transport layers are independent of how data is represented. For example, it comes with various "transceivers" (their base communication class) for raw sockets, HTTP, and even local intra-process calls. HTTPS and SASL transceivers can provide security.
For representing data, there are encoders and decoders of various types, although the default BinaryEncoder generally suffices, since Hadoop, Cassandra, etc. focus on efficiency. There is also a JsonEncoder in case you find that useful.
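To give a feel for the shape of an Avro RPC server and client in Java, here is a rough sketch. Greeter and GreeterImpl stand in for the interface generated from a hypothetical Avro IDL file and your implementation of it; the NettyServer/NettyTransceiver/SpecificRequestor names come from Avro's Java IPC package, so check them against the Avro version you actually pick up.

import java.net.InetSocketAddress;
import org.apache.avro.ipc.NettyServer;
import org.apache.avro.ipc.NettyTransceiver;
import org.apache.avro.ipc.Server;
import org.apache.avro.ipc.specific.SpecificRequestor;
import org.apache.avro.ipc.specific.SpecificResponder;

public class AvroRpcSketch {
    public static void main(String[] args) throws Exception {
        // Server side: expose an implementation of the generated Greeter protocol.
        // Greeter/GreeterImpl are placeholders for IDL-generated code (not shown).
        Server server = new NettyServer(
                new SpecificResponder(Greeter.class, new GreeterImpl()),
                new InetSocketAddress(65111));

        // Client side: open a transceiver and obtain a dynamic proxy for the protocol.
        NettyTransceiver transceiver =
                new NettyTransceiver(new InetSocketAddress("localhost", 65111));
        Greeter proxy = SpecificRequestor.getClient(Greeter.class, transceiver);
        System.out.println(proxy.hello("world"));   // hello() is a hypothetical IDL method

        transceiver.close();
        server.close();
    }
}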
This really all depends on what kind of compatibility you require between client and server. CORBA is a well established and standardized way of communicating between different languages, but it requires a bit more effort to use than Java RMI. If the clients are running from some external, untrusted source, then an HTTP based protocol makes more sense. If you follow a REST approach, then it becomes easier to scale out later as you need to add more servers.
If both client and server are Java, and they are running within a trusted network, RMI meets your requirements for being "well established". Performance overhead of RMI is exaggerated, but very early versions did not pool connections.
If you're willing to toss away both "well established" and "standardized", you can use Dirmi as a substitute for RMI. It's faster, easier, has more features, and it doesn't have the firewall problems RMI has. Like RMI, it supports TLS (encryption), but neither supports built-in compression.
Whatever you choose, beware of lock-in. Try to design your server such that the remote access layer is a thin layer over the core code. This allows you to easily support multiple protocols, perhaps at the same time.
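To make that last point concrete, here is a minimal sketch (all names are made up) of keeping the core service free of remoting types, with an RMI-facing class acting only as a thin adapter. A REST or Avro adapter could sit next to it over the same core.

import java.rmi.Remote;
import java.rmi.RemoteException;

// Core business interface -- plain Java, no remoting types leak in here.
interface OrderService {
    String placeOrder(String order);
}

class OrderServiceImpl implements OrderService {
    public String placeOrder(String order) {
        return "accepted:" + order;
    }
}

// Thin RMI-facing adapter: its only job is to bridge the remote protocol to the core.
interface RemoteOrderService extends Remote {
    String placeOrder(String order) throws RemoteException;
}

class RmiOrderAdapter implements RemoteOrderService {
    private final OrderService core;
    RmiOrderAdapter(OrderService core) { this.core = core; }
    public String placeOrder(String order) { return core.placeOrder(order); }
}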
Maybe CORBA?
Would you consider HTTP/REST?
If so, you can leverage something like Tomcat/Spring and still meet all the requirements you listed (robust, lightweight, well documented, well established).
The RPC-based protocols are simply antiquated.
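As a sketch of how little code that takes with Spring MVC (the resource names are invented and persistence is omitted; Spring serializes the POJO to and from JSON when Jackson is on the classpath):

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

// A minimal REST endpoint exposing the server's core service over HTTP/JSON.
@RestController
public class WidgetController {

    @GetMapping("/widgets/{id}")
    public Widget get(@PathVariable long id) {
        return new Widget(id, "example");   // would delegate to the core service
    }

    @PostMapping("/widgets")
    public Widget create(@RequestBody Widget widget) {
        return widget;                      // echo back; storage omitted
    }
}

class Widget {
    public long id;
    public String name;
    Widget() { }
    Widget(long id, String name) { this.id = id; this.name = name; }
}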
Seriously, unless you're doing a web app that already requires the web baggage, you really do want RMI or, even better, CORBA. I recommend JacORB (www.jacorb.org).
Ignore general claims of slow/fast and perform your own performance tests.
Keep in mind that a software project is successful because it performs the useful function for which it was designed and intended, not because it uses the latest cool buzzword tech.
Good luck.
The Apache MINA library for client-server communication, combined with EJB3, will suit best.
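A minimal MINA sketch for the server side (the port number and echo logic are placeholders). MINA's filter chain is also where you would plug in its SSL or compression filters, which speaks to the encryption/compression wishes above.

import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import org.apache.mina.core.service.IoAcceptor;
import org.apache.mina.core.service.IoHandlerAdapter;
import org.apache.mina.core.session.IoSession;
import org.apache.mina.filter.codec.ProtocolCodecFilter;
import org.apache.mina.filter.codec.textline.TextLineCodecFactory;
import org.apache.mina.transport.socket.nio.NioSocketAcceptor;

public class MinaEchoServer {
    public static void main(String[] args) throws Exception {
        IoAcceptor acceptor = new NioSocketAcceptor();
        // Encode/decode newline-terminated text frames.
        acceptor.getFilterChain().addLast("codec",
                new ProtocolCodecFilter(new TextLineCodecFactory(StandardCharsets.UTF_8)));
        acceptor.setHandler(new IoHandlerAdapter() {
            @Override
            public void messageReceived(IoSession session, Object message) {
                session.write("echo: " + message);   // business logic would go here
            }
        });
        acceptor.bind(new InetSocketAddress(9123));
    }
}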
Related
There is this java web application with a lot of users. These users place some orders according to the data shown in their panels. The data is being updated second by second by calling an outsider service (via a webservice or so). The moment we get this data, users' panels must be updated immediately to make sure that users are placing valid orders.
So we need to push data to the client WebApp. Performance and reliability are of great concern.
What approach or technology do you suggest here? Should I use something like Comet? Or is using primitive WebSockets suitable?
WebSocket is the fastest transport, but it is not well supported everywhere (old versions of Internet Explorer in particular lack it).
AJAX requests are not as fast (the browser may have to establish a new HTTP connection, and the HTTP headers add overhead too) but are much better supported. With correct keep-alive settings the HTTP connection should be reused.
You can use a generic implementation like SockJS (well supported by the Spring framework). It will choose the best available transport automatically, but it does introduce an additional layer of complexity.
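With Spring's WebSocket support, the SockJS fallback is a one-liner on the endpoint registration. A rough sketch (the handler and path names are invented):

import org.springframework.context.annotation.Configuration;
import org.springframework.web.socket.TextMessage;
import org.springframework.web.socket.WebSocketSession;
import org.springframework.web.socket.config.annotation.EnableWebSocket;
import org.springframework.web.socket.config.annotation.WebSocketConfigurer;
import org.springframework.web.socket.config.annotation.WebSocketHandlerRegistry;
import org.springframework.web.socket.handler.TextWebSocketHandler;

@Configuration
@EnableWebSocket
public class PushConfig implements WebSocketConfigurer {

    @Override
    public void registerWebSocketHandlers(WebSocketHandlerRegistry registry) {
        // SockJS falls back to streaming/polling transports when WebSocket
        // is unavailable (e.g. old IE versions).
        registry.addHandler(new PriceHandler(), "/updates").withSockJS();
    }

    static class PriceHandler extends TextWebSocketHandler {
        @Override
        protected void handleTextMessage(WebSocketSession session, TextMessage message)
                throws Exception {
            session.sendMessage(new TextMessage("ack: " + message.getPayload()));
        }
    }
}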
There are a bunch of things to consider.
There are various technologies. You've mentioned Comet. There is also Web Sockets. Without support at the protocol level, you are stuck with pretty much polling for data. This is the approach Comet takes.
Web Sockets are specifically designed for this; they carry far less overhead than repeatedly polling over HTTP.
Are you targeting modern browsers, or do you also need to support older browsers?
Support for these protocols varies across versions, implementations may have caveats, and so on.
Or is using primitive WebSockets suitable?
It is perfectly acceptable, though you will have to deal with browser differences, and you may find that porting your WebSocket code across web servers requires some work.
For instance, if you are deploying on Jetty (and using its API natively), you need to implement WebSocketCreator. If you are using Grizzly natively, you need to implement WebSocketListener and so on.
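For example, a Jetty-native setup looks roughly like this (class and package names are from the Jetty 9.x WebSocket API, so verify them against the version you deploy on):

import org.eclipse.jetty.websocket.api.Session;
import org.eclipse.jetty.websocket.api.annotations.OnWebSocketMessage;
import org.eclipse.jetty.websocket.api.annotations.WebSocket;
import org.eclipse.jetty.websocket.servlet.ServletUpgradeRequest;
import org.eclipse.jetty.websocket.servlet.ServletUpgradeResponse;
import org.eclipse.jetty.websocket.servlet.WebSocketCreator;
import org.eclipse.jetty.websocket.servlet.WebSocketServlet;
import org.eclipse.jetty.websocket.servlet.WebSocketServletFactory;

// Jetty-specific wiring: the servlet hands each upgrade request to a creator.
public class UpdatesServlet extends WebSocketServlet {
    @Override
    public void configure(WebSocketServletFactory factory) {
        factory.setCreator(new WebSocketCreator() {
            @Override
            public Object createWebSocket(ServletUpgradeRequest req, ServletUpgradeResponse resp) {
                return new UpdatesSocket();   // could inspect sub-protocols/headers here
            }
        });
    }
}

@WebSocket
class UpdatesSocket {
    @OnWebSocketMessage
    public void onMessage(Session session, String message) throws Exception {
        session.getRemote().sendString("ack: " + message);
    }
}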
Atmosphere tries to fix this by providing a uniform interface which works across various servers. Again, once you pick such a library, you will need to make changes if you want a different library in the future.
Or you could use a service like Pusher or any of its competitors.
If you Google around, you should be able to find plenty of examples.
Hopefully it helps.
There is RMI, which I understand to be relatively fragile; direct socket connections, which are rather low level; and plain strings, which, while about as solid as it gets, seem to be the metaphorical PHP.
What less basic options do I have for internet-based client/server communication? What are the advantages/disadvantages? What are some concerns I should take into consideration? Third-party library suggestions are fine, as long as they stay platform independent (i.e. no restrictive native code).
Looking for options, not a definite answer, so I'm leaving the details of my own requirements blank.
As you specified "internet based", there's a lot to be said for an HTTP-based, RESTful approach (I've highlighted some concerns you should consider):
Pros:
You can use/abuse one of the myriad web-tier frameworks for the server side (e.g. Spring MVC, Play!)
The low-level work has already been done on the client side (Apache HttpClient; a sketch follows after this list)
Plain-text protocol is easy to debug on the wire
Tons of tools available to help you debug the interactions (e.g. SoapUI) - you can pretend to be client OR server and so develop in isolation until the other end is ready
Using a well-known port (80/443) makes punching through corporate firewalls a whole lot easier
Cons:
There's a fairly major assumption that the server will be doing the lion's share of the work - if your model is "inverted" then it might not make much sense to be RESTful
Raw performance will be lower than a bits-on-the-wire socket-based approach
Plain-text protocol is easy to sniff on the wire (SSL can remedy this)
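Here is the client-side sketch promised above, using Apache HttpClient 4.x (the URL is a placeholder):

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class RestClientExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint; the real URL depends on your server.
        try (CloseableHttpClient client = HttpClients.createDefault();
             CloseableHttpResponse response =
                     client.execute(new HttpGet("http://example.com/api/widgets/42"))) {
            System.out.println(response.getStatusLine().getStatusCode());
            System.out.println(EntityUtils.toString(response.getEntity()));
        }
    }
}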
What does 'relatively fragile' mean? The issues with RMI have to do with it being a large superstructure on a narrow foundation, and specifically that it has a large reliance on DNS and Object Serialization. The closer you are to the silicon, the less 'fragile' any program becomes, but the more code you have to write. It's a tradeoff. I'm not a one-eyed RMI supporter, despite having written a book about it, but 'fragility' is too strong a word. It does what it does, and it does that reasonably well. RMI/IIOP does it even better in many respects if you need large-scale scalability. If your view of the world is an at-most-once remote method call without too much security, RMI/JRMP will give it to you. The further you get from that model the harder it becomes to apply.
Why did web services win out over CORBA?
I suspect it all started with firewall issues: CORBA requests are binary and require multiple dynamically assigned ports for normal operation, so CORBA traffic tended to be blocked by firewalls when these first appeared. HTTP and FTP also use dynamic ports, but those protocols were in much wider use, so it was immediately obvious that firewalls had to be configured to allow them. As a result, developers could not rely on being able to establish a CORBA connection between the server and an end user's PC and needed a more firewall-friendly approach.
Firewalls are much less of an issue for communication between dedicated servers, which can use separate networks, IP/MAC filtering, specialized firewalls and the like. I think CORBA, like JDBC, is still used to move data between servers.
It may also be a factor that CORBA messages use aligned fields (to match the boundary alignment of C/C++ data structures). Later protocols (like Google Protocol Buffers) do not send unnecessary bytes just for alignment, so their messages are more compact, and such protocols may be preferred when binary messages and fast, pre-generated message parsers are desired. Protocol Buffers, which look to me rather similar to CORBA by design (an IDL-like compiler, stubs and servants, binary messages, language interoperability), are far from decline, being used internally in many Google services.
While the CORBA framework is complex, a "properly done" web service stack is not exactly trivial either, so I do not think the complexity of the standard was the issue. Similarly, while the original OMG specification documents may look horrible, the corresponding SOAP/WSDL specifications are equally complex; it is probably just difficult to document a standard in an easy-to-read way.
CORBA protocols are not proprietary; they have been implemented in Free software many times, including JacORB and the GNU Classpath implementations (well, now OpenJDK is also Free).
While CORBA may initially have been intended to provide what web services provide today, I agree that for that application CORBA has "lost".
However, as an RPC technology that is supported on a wide array of platforms (including embedded), is well tested from native code (C++), and can be used to implement pretty performant data transfer scenarios, I do think that CORBA still has an important niche. It just has nothing to do with "the web".
To refer directly to the question: I like to think that CORBA "lost" because, unlike web services, it is not targeted at the web as it is used today -- as in tunneling everything through port 80, running apps 60% in the browser and 40% in a web server, etc.
I don't think it's a stretch to say that web services have won out in the marketplace. CORBA is a niche at best, and a small one at that.
Web services:
Simpler, although WS-* can add weight and complexity
Use HTTP as the wire protocol instead of a proprietary one
Can tunnel through port 80 in firewall
The surrounding services aren't as complete as CORBA's
CORBA:
Requires an ORB to operate
Available from proprietary vendors or as open source
May use HTTP, but also uses proprietary protocols
Provide services like naming, directory, transaction, security, etc.
CORBA lost primarily because of two things:
1) lack of good development/testing tools or IDE-plugins
2) See (1) again
What is the "official" Java API for client/server or P2P communication? Java RMI? Some other networking API?
Is this official networking API the standard for both SE and EE?
I'm sure the answer is very context-specific, so let's take a look at a few instances:
You have 2 Swing clients installed on 2 machines and connected to the same network (or the Internet), and you want either of them to send the other a primitive, such as the integer 4, or some POJO, like a "Widget" object
Same as #1 above, but between a Swing client and a fully-compliant Java EE back-end (implementing managed beans, app servers, the whole nine yards)
I don't have a specific application in mind, I'm just wondering what are the "norms" for client-client and client-server communication in the world of Java.
If being bound to Java isn't a problem, RMI is a pretty abstracted solution for the client and server "exchanging" data (especially when the data is Java classes that might be difficult or too much effort to represent as text). Just make sure your object implements Serializable and almost anything can be transmitted over the wire.
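For illustration, a minimal RMI setup looks like this (the interface and names are invented for the example):

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Remote interface shared by client and server.
interface WidgetService extends Remote {
    Widget findWidget(int id) throws RemoteException;
}

// Any argument/return type just needs to be Serializable.
class Widget implements java.io.Serializable {
    int id;
    Widget(int id) { this.id = id; }
}

class WidgetServer implements WidgetService {
    public Widget findWidget(int id) { return new Widget(id); }

    public static void main(String[] args) throws Exception {
        Registry registry = LocateRegistry.createRegistry(1099);
        WidgetService stub = (WidgetService) UnicastRemoteObject.exportObject(new WidgetServer(), 0);
        registry.rebind("widgets", stub);
    }
}

// Client side:
//   Registry registry = LocateRegistry.getRegistry("server-host", 1099);
//   WidgetService service = (WidgetService) registry.lookup("widgets");
//   Widget w = service.findWidget(4);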
If this doesn't fit the bill and you want to drop down to raw networking, the client-server socket framework Netty is a pretty good choice.
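A rough sketch of a Netty 4 server that echoes newline-delimited text (the port and handler logic are placeholders):

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.SimpleChannelInboundHandler;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.codec.LineBasedFrameDecoder;
import io.netty.handler.codec.string.StringDecoder;
import io.netty.handler.codec.string.StringEncoder;

public class NettyEchoServer {
    public static void main(String[] args) throws Exception {
        EventLoopGroup boss = new NioEventLoopGroup(1);
        EventLoopGroup worker = new NioEventLoopGroup();
        try {
            ServerBootstrap bootstrap = new ServerBootstrap()
                .group(boss, worker)
                .channel(NioServerSocketChannel.class)
                .childHandler(new ChannelInitializer<SocketChannel>() {
                    @Override
                    protected void initChannel(SocketChannel ch) {
                        ch.pipeline().addLast(
                            new LineBasedFrameDecoder(8192),   // split frames on newlines
                            new StringDecoder(),
                            new StringEncoder(),
                            new SimpleChannelInboundHandler<String>() {
                                @Override
                                protected void channelRead0(ChannelHandlerContext ctx, String msg) {
                                    ctx.writeAndFlush("echo: " + msg + "\n");
                                }
                            });
                    }
                });
            bootstrap.bind(8080).sync().channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            worker.shutdownGracefully();
        }
    }
}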
There's no such thing as the most official networking API in J2SE; all J2SE APIs are official in the sense that they are supported by Sun (now Oracle).
That said, you should choose your API based on following criteria:
Do you (or your team) know how to use a particular API;
How simple/complex is this API to use;
What throughput are you aiming for? For performance-sensitive applications you may be forced to use a binary protocol; in other cases a text-based protocol will do.
Between two clients, a simple text-based protocol will suffice for passing POJOs, using Apache MINA or Google protocol buffers, for example.
This will work between client and server as well.
Response to Zac's questions in comment:
The performance gain of binary protocols comes from the fact that you don't need to convert everything to text form and back -- you can pass the binary representation of your application's memory with minimal changes, such as converting from host byte order to network byte order (as with the BSD sockets API); a tiny illustration follows below. Unfortunately I don't know the details of how RMI/Java serialization processes objects, but I'm sure it is still much faster than passing all data in readable form;
Yes, MINA and protocol buffers have Java APIs. They're just not part of the Java SE bundle, so you have to download them separately. By the way, MINA can use both binary and readable serialization, depending on how you use it.
You should define the notion of 'good' somehow, for example by answering the questions I mentioned above. If you want to use objects over the network, use RMI. If you don't, Netty or MINA will suffice -- whichever you find easier to master.
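To make the binary-vs-text point above concrete, here is a tiny comparison (the value is just an example):

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class BinaryVsText {
    public static void main(String[] args) {
        int price = 1000000;

        // Binary: 4 bytes, already in network byte order (ByteBuffer is big-endian by default).
        byte[] binary = ByteBuffer.allocate(4).putInt(price).array();

        // Text: the same value as a readable string takes 7 bytes and must be parsed back.
        byte[] text = Integer.toString(price).getBytes(StandardCharsets.US_ASCII);

        System.out.println(binary.length + " bytes vs " + text.length + " bytes");   // 4 vs 7
    }
}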
For P2P, Sun at one point pushed JXTA pretty hard.
I wouldn't dare to use RMI for P2P communication.
RMI is pretty much the standard Java-to-Java protocol. It's built in and very simple to use. Most J2EE backends also communicate using RMI, although that's not the only possibility.
In J2SE the most common choice is probably RMI or raw sockets.
J2EE uses a messaging bus that everyone (servers and clients) subscribes to, which is quite different from RMI-style solutions (although at the lowest level an implementation may still rely on RMI). It helps automate redundancy and failover. If you need this functionality, I believe it can be used in SE as well.
I haven't used J2EE for quite a while now, so this may have changed, but I doubt it. The messaging system was a core component of J2EE.
We have a C++ application on Windows that starts a Java process. These two apps need to communicate with each other (via snippets of XML).
What interprocess communication method would you choose, and why?
Methods on the table for us are: shared file(s), pipes, and sockets (although I think sockets have some security concerns). I'm open to other methods.
I'm not sure why you think socket-based communication would have security concerns (use SSL). It is often a very good approach as it is language agnostic, assuming that you have a well-defined communication protocol. Have a look at Google's protocol buffers, for example - they generate the required Java classes and streams.
In my experience, file systems (especially network file systems) are not well suited to such communication as they are not necessarily tuned for messaging (I've seen caching issues result in files being not picked up by the target process for example).
Another option is a messaging layer (AMQ or Tibco for example) although this will likely involve a greater administrative overhead (plus expertise) to set up.
Personally I would opt for a pure-socket approach because of its flexibility and simplicity. You will be in complete control.
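A minimal sketch of the Java side using newline-delimited XML snippets over a plain socket (the port and framing are illustrative; swap in SSLServerSocketFactory if the link needs encryption):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Java side of a newline-delimited XML exchange; the C++ process connects as a
// plain TCP client and sends one XML snippet per line.
public class XmlSocketServer {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(5555);
             Socket socket = server.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String xml;
            while ((xml = in.readLine()) != null) {
                out.println("<ack>" + xml.length() + "</ack>");   // real handling would parse the XML
            }
        }
    }
}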
I've used named pipes for communication between C# and a cross-platform C++ app and had nothing but good results. Barring that, sockets are definitely the way to go.
Sockets are nice. They give you the ability to very easily create a blackbox testing layer around each component, as well as run each component on its own machine.
Security is definitely a concern, but there are a good range of options depending on how important it is. You can use SSL, custom handshaking, password protected logins and firewalls to help secure it.
Edit:
Not something I'd recommend, but there's also shared memory using JNI. Just thought I'd mention it because it's not on your list.
Ice is pretty cool :)