I know that MPI does it, also heard that Erlang has nice support for this. But is there any similar frameworks/languages on JVM? I need to run one program distributed on multiple machines transparently.
Thanks,
The "classic" solution for this is Terracotta Cluster, which provides JVM-level objects distributed across a cluster, where "cluster" means distribution across a network, not just across processes.
It's open-source (or bits of it are, anyway), but I have no personal experience of it. It's impressive technology, though.
Also been hearing good things about Hazelcast, also open source, though I'm not sure it is transparent like Terracotta. On the flip side, if it isn't transparent it means it is not enhancing byte code which some people prefer to avoid due to stack traces no longer lining up with the source.
Related
I've been digging into the depths of IBM's research on JavaSplit and cJVM because I want to run a JVM program across a cluster of 4 Raspberry Pi 3 Model B's like This.
I know nearly nothing about clusters and distributed computing, so I'm starting my dive into the depths by trying to get a Minecraft Server running across them.
My question is, is there a relatively simply way to get a Java program running on a JVM to split across a cluster without source code access?
Notes:
The main problem is that most java programs (toy program included) were not built to run across a cluster, but I'm hoping that I can find a method to hack the JVM to have it work.
I've seen some possible solutions but due to the nature of Minecraft and Java, updates come so frequently and the landscape changes that I don't even know what is possible.
As far as I know, FastCraft implements multithreading support, or it used to and it's now built in.
Purpose:
This is a both a toy program and a practical problem for me. I'm doing it to learn how clusters work, to learn more about Linux administration and distributed computing, and because it's fun. I'm not doing it to setup a minecraft server. The server is a cherry on top, but if it doesn't work out I'll shove it on a Dell tower.
MineCraft can be scaled using what is effectively a partitioning service. The tool which is usually used is BungeeCord This allows a client to connect to a service which passes the session to multiple backend servers which run largely without change. This limits the number of users which can be in one server, but between them you can have any number of servers.
I can only reiterate that such a generic solution, if one exists, is not commonly applied. There are inherent challenges to try and distribute a JVM, such as translating a shared memory execution model, where all memory access is cheap, to a distributed model, where non-local memory access is orders of magnitude more expensive, without degrading performance. This requires smart partitioning of data, and finding such partitions in an automated way is a very complex optimization problem.
In the particular example of minecraft, one would additionally have to transform a single threaded program into a multi threaded one, which is a rather complex program transformation by itself.
In a nutshell, solving the clustering problem in such generality is a research level topic, for which, to the best of my knowledge, no algorithms competitive with manual code changes currently exist. In addition, if such an algorithm were to exist, if would be very unlikely to be offered free of charge, because it would represent both a significant achievement, and could be licensed for a lot of money.
What is Terracotta?
What services does it offer?
What problems does it solve?
What other products solve problems similar to those that Terracotta solves?
Find a great article about Terracotta and how it works at InfoQ written directly by Orion Letizi, co-founder and software engineer at Terracotta:
http://www.infoq.com/articles/open-terracotta-intro
It helped me to prepare for a webcast about terracotta and how it can be used for clustering and scaling grails applications and gave me a good overview about Terracotta.
I like to think about Terracottas DSO in terms of advanced parallel architectures: Terracotta turns your message-passing multicomputer into a usual unified memory multiprocessor. Multicomputers are different from multiprocessors in that processors share memory and, therefore, are easier to program because you just write into memory in usual multithreading way. Though, you it means that you need to explicitly synchronize access to the shared data using a lock, system saves you from the need to explicitly message-passing data marshaling and resolves the biggest parallel programming issue -- the cache coherence -- for you. Multiprocessor marshals the data for you when you take/release the lock. It is, therefore, desirable. But, initially you have a bunch of computers -- a multicomputer.
The magic is achieved by injecting some code into your classes at object field/lock access points. To correspond DB world, Terracotta considers all updates done under a lock atomic (transaction). Likewise multiprocessors can have a global storage, Terracotta allows to back up the locally updated data to disk.
What other products solve problems similar to those that Terracotta solves?
Try Hazelcast, It is super simple to use. Peer to peer, highly scalable, fully open source clustering technology for Java. It is simply distributed Map, Queue, MultiMap, ExecutorService. You can use its Map as a distributed cache.
I found an article in JavaWorld about Terracotta at http://www.javaworld.com/javaworld/jw-01-2009/jw-01-osjp-terracotta.html.
I am evaluating various Java object distribution libraries (Terracotta, JCS, JBoss, Hazelcast ...) for an application server and I'm having trouble understanding their behavior on various axes.
My requirements for distributed objects are not many -- they boil down to one-to-one and one-to-many messaging. There's more, but for the rest we just use JDBC and I assume I can plop a cache in front of this using any of the available libraries.
I would like a system that distributes objects and exhibits locality properties -- in other words, a server that grabs an object tends to hold onto it without excess communication to other nodes. Hazelcast looks simple (and peer-to-peer is nice) but seems to require objects are distributed evenly across all nodes.
I'd like a way to persist objects, preferably transparently. I plan on using EC2, so I have the option of temporary, free, limited local storage (the disk) and permanent, non-free, unlimited storage (S3). It'd be great not to worry about OutOfMemoryErrors.
I like the simplicity and "magic" of Terracotta but it scares the beejeezus out of me. Also in order to truly scale you have to spend $$$$, otherwise you're communicating with a single hub.
I'm cheap and I want something not only free but mature and with a large userbase.
Thanks for any input.
Terracotta seems like a perfect fit for your situation.
It's simple to setup
it can be configured to be persistent (use an EBS volume for EC2)
it's closely integrated with Ehcache (actually Terracotta bought Ehcache) for great distributed caching performance
the free offering scales pretty well with several clients.
Just start playing around with it. I bet you'll love it. To ease your performance fears, simply run a through put test for message passing. This shouldn't take much more than an afternoon of your time.
I have to admit that I haven't used Terracotta for a year and that I don't know the others you suggested.
Terracotta does fit the bill. I understand your objections, but here's my comments:
1) Terracotta does exhibit locality - and is probably the best system at it compared to those you mentioned. Objects are only brought in to a local JVM where requested. Locking for reads or writes is performed using a leasing mechanism. This means if you exhibit perfect locality in your system then you will incur very little network overhead.
2) Terracotta provides disk persistence out of the box - in the OSS version (you don't have to pay $$$$)
3) Why does it scare you so much? Just use EHCache as a cache, or the Hibernate 2nd Level Plugin. It's incredibly easy to setup and use.
4) Yes, Terracotta FX requires you to pay (for scale-out servers). However I would suggest that if you have a system that is mostly read and exhibits true locality then I don't think you'll have a problem getting the scale you are looking for. With Terracotta 3.2 the performance of the Hibernate 2nd Level Cache is 100,000 ops/s using 8 application servers and one Terracotta server at 100/0 read/write ratio and 12,000 ops/s using the same config at 95/5 read/write ratio.
(I just did a talk for the Bay Area SDForum on these numbers so I happen to have them handy)
Yes Hazelcast will distribute your objects across the cluster. However you can enable near cache if you want to reduce the communication cost.
http://www.hazelcast.com/documentation.jsp#MapNearCache
Btw, it's not clear what you are looking for (messaging is not the same as clustering/distributed objects).
If you are looking for messaging in Java I recommend you have a look at RabbitMQ (it's Erlang based but that doesn't matter).
I've work in embedded systems and systems programming for hardware interfaces
to date. For fun and personal knowledge, recently I've been trying to learn more about server programming after getting my hands wet with Erlang. I've been going back and thinking about servers from a C++/Java prospective, and now I wonder how scalable systems can be built with technology like C++ or Java.
I've read that due to context-switching and limited memory, a per-client thread handler isn't realistic. Usually a thread-pool is created and a mix of worker-threads and asynchronous I/O is used to handle requests. I wonder, first of all, how does one determine the thread pool size? Does one simply have to measure and find the optimal balance? Eventually as the system scales then perhaps more than one server is needed to handle requests. How are requests managed across mulitple servers handling a large client base?
I am just looking for some direction into where I might be able to read more and find answers to my questions. What area of computer science would I look into for more information in this area? Are there any design patterns for this area of computing?
Your question is too general to have a nice answer. The answer depends greatly on the context, on how much processing any one Thread does, on how rapidly requests arrive, on the CPU family being used, on the web container being used, and on many other factors.
for C++ I've used boost::asio, it's very modern C++, and quite plesant to work with. Also the C++0x network libraries will be based on ASIO's implementation, so it's valuable knowledge.
As for designs 1thread per client, doesn't work, as you've already learned. And for high performance multithreading the best number of threads seems to be CoresX2, but for servers, there is lots of IO per request, which means lots of idle waiting. And from experience, looking at Apache, MySQL, and Oracle the amount of threads is about CoresX10 for database servers, and CoresX40 for web servers, not saying these are the ideals, but they seem to be patterns of succesful systems, so if your system can be balanced to work optimally with similar numbers atleast you'll know your design isn't completely lousy.
C++ Network Programming: Mastering Complexity Using ACE and Patterns and
C++ Network Programming: Systematic Reuse with ACE and Frameworks are very good books that describe many design patterns and their use with the highly portable ACE library.
Like Lothar, we use the ACE library which contains reactor and proactor patterns for handling asynchronous events and asynchronous I/O with C++ code. We use sizable worker thread pools that grow as needed (to a configurable maximum) and shrink over time.
One of the tricks with C++ is how you are going to propagate exceptions and error situations across network boundaries (which isn't handled by the language). I know that there are ways with .NET to throw exceptions across these network boundaries.
One thing you may consider is looking into SOA (Service Oriented Architecture) for dealing with higher level distributed system issues. ACE if really for running at the bare metal of the machine.
Is it possible to dump an image of a running JVM and later restore the previous state by loading the image into the JVM? I'm fairly certain the answer is negative, but would love to be wrong.
With all the dynamic languages available for the JVM comes an increase in interactivity, being able to save a coding session would help save time manually restoring the VM to a previous session.
There was a JSR 323 proposed for this a while back but it was rejected. You can find some links in those articles about the research behind this and what it would take. It was mostly rejected as an idea that was too immature.
I have heard of at least one startup (unfortunately don't recall the name) that was working on a virtualization technology over a hypervisor (probably Xen) that was getting pretty close to being able to move JVMs, including even things like file system refs and socket endpoints. Because they were at the hypervisor level, they had access to all of that stuff. By hooking that and the JVM, they had most of the pieces. I think they might have gone under though.
The closest thing you can get today is Terracotta, which allows you to cluster a portion of your JVM heap, storing it in a server array, which can be made persistent. On JVM startup, you connect to the cluster and can continue using whatever portions of your heap are specified as clustered. The actual objects are faulted in on an as-needed basis.
Not possible at present. In general, pausing and restarting a memory image of a process in a different context is incredibly hard to achieve: what are you going to do with open OS resources? Transfers to machines with different instruction sets? database connections?
Also images of the running JVM are probably quite large - maybe much larger than the subset of the state you are actually interested in. So it's not a good idea from a performance perspective.
A much better strategy is to have code that persists and recreates the application state: this is relatively feasible with most JVM dynamic languages. I do so similar stuff in Clojure, where you have an interactive environment (REPL) and it is quite possible to create and run a sequence of operations that rebuild the application state that you want in another JVM.
This is currently not possible in any of the JVMs I know. It would not be very difficult to implement something like this in the JVM if programs run disconnected from their environments. However, many programs have hooks into their environment (think file handles, database connections) which would make implementing something like this very hairy.
As of early 2023, there's some progress in this space and it seems a lot of things can at least be tried, even if without claims for their production readiness.
One such feature is called CRaC. You can check their docs or even get an OpenJDK build that includes the feature. The project has its own repo under OpenJDK and looks quite promising.
Another vendors/products to check:
Azul ReadyNow!
OpenJ9 InstantOn
What's also really exciting, is AWS Lambda SnapStart. It doesn't give you full snapshoting capabilities, and is intrinsically vendor-specific, but it's what a ton of Java engineering who use AWS Lambda were waiting for so long.