We are considering developing a mission-critical application in Java EE, and one thing that really struck me is the lack of session isolation in the platform. Let me explain the scenario.
We have a native Windows application (a complete ERP solution) that receives about 2k LoC and 50 bug-fixes per month from sparse contributors. It also supports scripting, so the customer can add their own logic and we have no clue about what such logic does. Instead of using a thread pool, each server node has a broker and a process pool. The broker receives a client request, enqueues it until a pooled instance is free, sends the request to that instance, delivers the response to the client, and releases the instance back to the process pool.
This architecture is robust because with so many sparse contributions and custom scripting, it's not uncommon for a deployed version to have some serious bug such as an infinite loop, a long-waiting pessimistic lock, memory corruption or a memory leak. We implemented a memory limit, a timeout for requests, and a simple watchdog. Whenever a process fails to answer correctly and on time, the broker simply kills it, and the watchdog detects this and starts another instance. If a process crashes before it has started to answer a request, the broker sends the same request to another pooled instance, and the user doesn't know about any failure on the server side (except in admin logs). This is nice because some instances are slowly trashed by buggy code as they work on requests. Because most session data is held at the client or (in rare cases) in shared storage, it seems to work perfectly.
Now considering a move to Java EE, I couldn't find anything similar in the spec or in popular application servers such as Glassfish and JBoss. Yes, I know that most cluster implementations do transparent fail-over with session replication, but we have small companies that use our system on a simple 2-node cluster (and we also have adventurers that use the system on a 1-node server). With a thread pool, I understand that a buggy thread can bring an entire node down, because the server cannot detect and safely kill it. Bringing an entire node down is much worse than killing a single process - we have deployments where each node has about 100 pooled process instances.
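To make the broker behaviour described above concrete, here is a minimal Java sketch of the "borrow an instance, retry elsewhere on failure" step. Everything in it is an assumption for illustration: `PooledWorker` and `RetryingBroker` are hypothetical names, a worker is modelled as a plain object rather than an OS process, and any exception is treated as a crash (the real system only retries when the process failed before it started to answer).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

/** Hypothetical worker: stands in for one pooled process instance. */
interface PooledWorker {
    String handle(String request) throws Exception;
}

/** Minimal broker sketch: borrows a worker and retries on another one if it fails. */
class RetryingBroker {
    private final BlockingQueue<PooledWorker> pool;
    private final Supplier<PooledWorker> watchdog; // plays the watchdog: starts a fresh instance

    RetryingBroker(int size, Supplier<PooledWorker> watchdog) {
        this.pool = new ArrayBlockingQueue<>(size);
        this.watchdog = watchdog;
        for (int i = 0; i < size; i++) {
            pool.add(watchdog.get());
        }
    }

    String dispatch(String request, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            PooledWorker worker;
            try {
                worker = pool.take(); // the request waits here until an instance is free
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a worker", e);
            }
            try {
                String response = worker.handle(request);
                pool.add(worker); // healthy instance: release it back to the pool
                return response;
            } catch (Exception failure) {
                // Assumption: any failure means the instance is unusable.
                // Drop it and let the "watchdog" put a replacement in the pool.
                pool.add(watchdog.get());
            }
        }
        throw new IllegalStateException("request failed on " + maxAttempts + " workers");
    }
}
```

With a supplier whose first worker always throws, `dispatch("ping", 3)` would be answered by the replacement instance, and the caller never sees the first failure.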
I know that IBM and SAP are aware of this problem, based on http://www.trl.ibm.com/people/kawatiya/pub/Kawachiya07vee.pdf and http://java.sys-con.com/node/47362, respectively. But based on recent JSRs, forums and open-source tools, there isn't much activity in the community.
Now come the questions!
If you have a similar scenario and use Java EE, how did you solve it?
Do you know about an upcoming open-source product or change in the Java EE spec that can address this issue?
Does .NET have the same problem? Can you explain or cite references?
Do you know about some modern and open platform that can address this issue and is up to the task of running ERP business logic?
Please, I have to ask you not to suggest more testing or any other kind of QA investment, because we cannot force our customers to do this for their own scripts. We also have cases where urgent bug-fixes must bypass QA, and while we can make the customer accept this, we cannot make them accept that one buggy part of the software can affect a range of unrelated features. This issue is about robust architectures, not development process.
Thanks for your attention!
What you have stumbled upon is a fundamental issue regarding the use of Java and "hostile" applications.
It's a fundamental issue not just at the Java EE level, but at the core JVM level. The typical JVMs available have all sorts of issues with loading "unsafe code". From memory leaks and class loader leaks to resource exhaustion and unclean thread kills, the typical JVM is simply not robust enough to handle badly behaving code well in a shared environment.
A simple example is memory exhaustion of the Java heap. As a basic rule, NOBODY (and by nobody, I specifically mean the core Java library and just about every other third-party library out there) catches OutOfMemoryError. There are the rare few who do, but even they can do little about it. Typical code handles the exceptions it "expects" to handle and lets the others fall through. Unchecked throwables (of which OutOfMemoryError is one; strictly speaking it is an Error, not a RuntimeException) will happily bubble up through the call stack all the way to the top, leaving behind a wreckage of critical-path code that never ran to completion and all sorts of things in an unknown state.
Think of constructors or static initializers which "can't fail", leaving behind uninitialized class members which are "never null". These damaged classes simply don't know they're damaged. Nobody knows they're damaged, and there's no way to clean them up. A heap that hits OOM is an unsafe image and pretty much needs to be restarted (unless, of course, you wrote or audited ALL of the code yourself, which, naturally, you won't -- who would?).
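A small sketch of the static-initializer case, under stated assumptions: `FragileConfig` and `InitFailureDemo` are hypothetical names, and the out-of-memory condition is simulated by throwing an `OutOfMemoryError` by hand. This is the one variant the JVM itself tracks (a half-built object or a half-updated data structure would not be flagged at all), and even here the class stays unusable for the lifetime of its class loader.

```java
/** Hypothetical class whose static initializer fails, as it might under memory pressure. */
class FragileConfig {
    static final int[] TABLE;

    static {
        // Assumption: the failure is simulated; set -Dfragile.fail=false to see the healthy path.
        if (Boolean.parseBoolean(System.getProperty("fragile.fail", "true"))) {
            throw new OutOfMemoryError("simulated: no heap left while initializing");
        }
        TABLE = new int[16];
    }

    static int size() {
        return TABLE.length;
    }
}

class InitFailureDemo {
    /** Touches FragileConfig and reports what happened as a short string. */
    static String touch() {
        try {
            return "size=" + FragileConfig.size();
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(touch()); // first use: the error thrown by the initializer itself
        System.out.println(touch()); // later uses: NoClassDefFoundError, the class never recovers
    }
}
```

Nothing in the running JVM can re-run that initializer; the only ways out are a new class loader or a new process.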
Now, there may well be vendor specific JVMs which are better behaved and give you better control. The ones based on the Sun/Oracle JVM (i.e. most of them) do not.
So, it's not necessarily a Java EE issue, it's a JVM issue.
Hosting hostile code in the JVM is a bad idea. The only way it's practical is if you host a scripting language, and that scripting language implements some kind of resource control. That could be done, and you can tweak the existing ones as a start (JavaScript, Groovy, Jython, JRuby). The fact that these languages give users direct access to Java libraries makes them potentially dangerous, so you may have to restrict that as well to only aspects wrapped by script handlers. At this point, though, the "why use Java at all" question floats up.
You'll note Google App Engine does none of these. It spools up a separate JVM for each application that's being run, but even then it greatly restricts what can be done within those JVMs, notably through the existing Java security model. The distinction here is that these instances tend to be "long lived" so as not to endure the processing costs of startup and shutdown. I should say, they SHOULD be long lived, and those that are not do incur those costs.
You can make several instances of the JVM yourself, give them a bit of infrastructure to handle requests for logic, give them custom class loader logic to try and protect from class loader leaks, and minimally let you kill the instances off (they're simply a process) if you want. That can work, and probably work "ok" depending on the granularity of the calls, and the "start up" time for your logic. The start up time will minimally be the loading of the classes for the logic from run to run, that alone may make this a bad idea. And it certainly WON'T be "Java EE". Java EE is not set up to do this kind of thing. But you're not clear what Java EE features you're looking at either.
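The "they're simply a process" part can be sketched with the standard `ProcessBuilder` and `Process` APIs. This is only an illustration under assumptions: `WorkerProcessRunner` and its `Outcome` enum are hypothetical names, the child's output is discarded, and a real setup would also need a way to pass requests to the child and read its answers.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Sketch: run one unit of work in a child process and kill it if it overruns. */
class WorkerProcessRunner {
    enum Outcome { COMPLETED, KILLED, FAILED_TO_START }

    static Outcome runWithTimeout(List<String> command, long timeoutMillis) {
        Process process;
        try {
            process = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD) // nobody reads the output here
                    .start();
        } catch (IOException e) {
            return Outcome.FAILED_TO_START;
        }
        try {
            if (process.waitFor(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return Outcome.COMPLETED; // exited on its own, whatever the exit code
            }
            process.destroyForcibly();    // the OS reclaims heap, threads and handles
            process.waitFor();            // wait until it is really gone
            return Outcome.KILLED;
        } catch (InterruptedException e) {
            process.destroyForcibly();
            Thread.currentThread().interrupt();
            return Outcome.KILLED;
        }
    }
}
```

A call such as `WorkerProcessRunner.runWithTimeout(List.of("java", "-cp", "logic.jar", "LogicMain"), 5_000)` would start a separate JVM for the logic (the jar and class names are made up here) and pay the class-loading cost mentioned above on every run.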
Effectively, this is what Apache and "mod_php" do. Several instances, as processes, individually handle requests, with badly behaving ones being killed off as necessary. This is why PHP is common in the shared hosting business. In this structure, it's basically "safe".
I believe your scenario is highly atypical, thus it is improbable that there is a ready-made framework/platform addressing this need. Java EE sort of assumes that the request processing code is written by the same team as the rest of the app, thus it need not be isolated, watched and reset that often, and bug fixes would be handled the same way in all parts of the system. This assumption greatly simplifies development, deployment, testing etc. for most projects, not forcing them to pay for something they don't need. And yes, it isn't suitable for everyone. If you want something fundamentally different, you probably need to implement a fair amount of failover logic yourself. Java EE does provide the fundamental building blocks for this though.
I believe (although have no concrete experience to prove it) that .NET or other platforms are basically built on similar assumptions.
We had a similar - though not so severe - port of a really enormous Perl site to Java. On receiving an HTTP request we instantiate a class and call its processRequest method, surrounded by try-catch and time measurement. Adding a timer and a worker thread would be enough to abandon a request that runs too long and interrupt its thread (Java cannot safely force-kill a thread that ignores the interrupt). This is probably sufficient in real life.
A Java EE server like GlassFish is built on an OSGi runtime, so you might have more means of isolation there.
Also you could run an array of (web or local) applications to which you dispatch your requests via a central web application. Those applications are then isolated from each other.
Even more isolated are serialized sessions and operating system processes each starting a new JVM.
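A minimal sketch of the timer-plus-thread idea using `java.util.concurrent`. The class name `TimedRequestRunner` and the fallback-string convention are assumptions made for the example; the important caveat is in the comment on `cancel(true)`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Sketch: run each request on a worker thread, with a time limit and time measurement. */
class TimedRequestRunner {
    private final ExecutorService executor = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "request-worker");
        t.setDaemon(true); // a stuck worker must not keep the JVM alive
        return t;
    });

    /** Returns the handler's result, or the fallback on timeout or failure. */
    String process(Callable<String> handler, long timeoutMillis, String fallback) {
        long start = System.nanoTime();
        Future<String> future = executor.submit(handler);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // cancel(true) only interrupts the thread. Code that never checks the
            // interrupt flag (a tight infinite loop, say) simply keeps running.
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the handler threw; the cause is in e.getCause()
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt();
            return fallback;
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("request took " + elapsedMs + " ms");
        }
    }
}
```

For example, `runner.process(() -> { Thread.sleep(10_000); return "late"; }, 100, "timeout")` gives up after roughly 100 ms and returns `"timeout"`, because `Thread.sleep` responds to the interrupt. That cooperation is exactly what untrusted script code may not offer.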
I read that each application runs in its own JVM. Why is that so? Why not make one JVM run two or more apps?
I read a SO post, but could not get the answers there.
Is there one JVM per Java application?
(I am talking about applications launched via a public static void main(String[]) method.)
(I assume you are talking about applications launched via a public static void main(String[]) method ...)
In theory you can run multiple applications in a JVM. In practice, they can interfere with each other in various ways. For example:
The JVM has one set of System.in/out/err, one default encoding, one default locale, one set of system properties, and so on. If one application changes these, it affects all applications.
Any application that calls System.exit() will effectively kill all applications.
If one application goes wild, and consumes too much CPU or memory it will affect the other applications too.
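The first point in the list above is easy to reproduce. In this sketch, `appA` and `appB` are hypothetical stand-ins for the `main` methods of two unrelated applications loaded into the same JVM, and the property name is made up.

```java
import java.util.Locale;

/** Sketch: two "applications" in one JVM see each other's global settings. */
class SharedJvmStateDemo {
    /** Stand-in for the first application: it tweaks JVM-wide state for its own needs. */
    static void appA() {
        System.setProperty("demo.shared.setting", "changed-by-A");
        Locale.setDefault(Locale.GERMANY);
    }

    /** Stand-in for a second application that never asked for any of this. */
    static String appB() {
        // Both the property lookup and the number formatting pick up A's changes.
        return System.getProperty("demo.shared.setting") + "/" + String.format("%.1f", 1.5);
    }

    public static void main(String[] args) {
        appA();
        System.out.println(appB()); // prints "changed-by-A/1,5"
    }
}
```

The decimal comma in B's output comes from A's `Locale.setDefault` call, which is why libraries like Echidna and the JNode hack had to virtualize this kind of state per application.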
In short, there are lots of problems. People have tried hard to make this work, but they have never really succeeded. One example is the Echidna library, though that project has been quiet for ~10 years. JNode is another example, though they (actually we) "cheated" by hacking core Java classes (like java.lang.System) so that each application got what appeared to be independent versions of System.in/out/err, the System properties and so on [1].
[1] - This ("proclets") was supposed to be an interim hack, pending a proper solution using true "isolates". But isolates support stalled, primarily because the JNode architecture used a single address space with no obvious way to separate "system" and "user" stuff. So while we could create APIs that matched the isolate APIs, key isolate functionality (like cleanly killing an isolate) was virtually impossible to implement. Or at least, that was/is my view.
The reason to have one JVM per application is basically the same as the reason to have one OS process per application.
Here are a few reasons to have a process per application.
Application bug will not bring down / corrupt data in other applications sharing same process.
System resources are accounted per process hence per application.
Terminating process will automatically release all associated resources (application may not clean up for itself, so sharing processes may produce resource leaks).
Some applications, such as Chrome, go even further and create multiple processes to isolate different tabs and plugins.
Speaking of Java, there are a few more reasons not to share a JVM.
The heap maintenance penalty is higher with a large heap. Multiple smaller independent heaps are easier to manage.
It is fairly hard to unload an "application" in a JVM (there are too many subtle reasons for it to stay in memory even if it is not running).
A JVM has a lot of tuning options which you may want to tailor to an application.
There are, though, several cases where a JVM is actually shared between applications:
Application servers and servlet containers (e.g. Tomcat). Server side Java specs are designed with shared server JVM and dynamic loading/unloading applications in mind.
There have been a few attempts to create a shared JVM utility for CLI applications (e.g. nailgun).
But in practice, even in server-side Java, it is usually better to use a JVM (or several) per application, for the reasons mentioned above.
For isolating execution contexts.
If one of the processes hangs, or fails, or its security is compromised, the others don't get affected.
I think having separate runtimes also helps GC, because each collector has fewer references to handle than if everything lived in one heap.
Besides, why would you run them all in one JVM?
Java application servers, like JBoss, are designed to run many applications in one JVM.
I would like to ask what would be more appropriate to choose when developing a server similar to SmartFoxServer. I intend to develop a similar yet different server. In the benchmarks made by the ones that developed the above server they had something like 10000 concurrent clients.
I did a bit of research on the cost of using too many threads (>500) but cannot decide which way to go. I once wrote a server in Java, but that was for a small application and had nothing to do with heavy loads.
Thanks
Take a look at Apache MINA. They've done a lot of the heavy lifting required to use NIO effectively in a networking application. Whether or not NIO increases your ability to process concurrent connections really depends on your implementation, but the performance boosts in Tomcat, JBoss and Jetty are already plenty of evidence in its favor.
i'm not familiar with smartfoxserver, so i can only speak generically (which is not always good :P but here i go)
i think those are 2 different questions. on one hand, the io performance when using native java sockets vs. native sockets written in c (like tomcat).
the other question is how to scale up to that kind of concurrency level. other than that, i'd always choose native sockets (i.e: c).
now, how to scale: it's not a good idea to have a lot of threads running at the same time (os constraints, etc), so i'd choose to scale horizontally, meaning to add a load balancer that can send the requests to different servers, which can be linked by messaging (using jms with a broker like activemq, or a protocol like stomp or amqp with a broker like rabbitmq).
another solution: a cloud environment that allows you to grow your installation as you need
In most benchmarks which test 10K or 100K connections, the server is doing no work, and unless your server does next to nothing, these tests are unrealistic.
You need to get a clear idea of how many concurrent connections you want to support.
If you have fewer than 1K connections, using a thread per connection will work ok. This is the simplest approach to take. Using a dispatcher model with NIO will work better if your requests are very simple. Otherwise it won't matter much.
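For reference, the thread-per-connection approach looks roughly like this. It is a sketch under assumptions, not a production server: the class name `ThreadPerConnectionServer`, the line-based echo protocol and the small `roundTrip` client are all invented for the example, and there is no limit on the number of threads.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Sketch: blocking I/O with one dedicated thread per client connection. */
class ThreadPerConnectionServer implements AutoCloseable {
    private final ServerSocket serverSocket;

    ThreadPerConnectionServer() {
        try {
            serverSocket = new ServerSocket(0); // port 0: let the OS pick a free port
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Thread acceptor = new Thread(this::acceptLoop, "acceptor");
        acceptor.setDaemon(true);
        acceptor.start();
    }

    int port() {
        return serverSocket.getLocalPort();
    }

    private void acceptLoop() {
        while (!serverSocket.isClosed()) {
            try {
                Socket client = serverSocket.accept();
                Thread handler = new Thread(() -> echo(client), "connection");
                handler.setDaemon(true);
                handler.start(); // this thread now belongs to that one connection
            } catch (IOException e) {
                return; // server socket closed or broken: stop accepting
            }
        }
    }

    private static void echo(Socket client) {
        try (client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) { // blocks this thread while the client is idle
                out.println("echo: " + line);
            }
        } catch (IOException ignored) {
            // connection dropped; the thread simply ends
        }
    }

    /** Tiny blocking client, only here to try the server out. */
    static String roundTrip(int port, String message) {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8))) {
            socket.setSoTimeout(5_000);
            out.println(message);
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            serverSocket.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each idle client parks one thread in `readLine()`, which is why this model is comfortable at hundreds of connections and starts to hurt at many thousands; an NIO dispatcher (what MINA wraps) trades that simplicity for a single selector loop.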
If you have more than 1K connections it is likely you want to use more than one server as each connection is getting less than 1% of a core and the cost of a basic server is relatively cheap these days.
I want to know how Java (JSP) on Tomcat compares to PHP on Apache in terms of performance.
Two servers with the same hardware configurations, one running Tomcat/Java (JSP) the other Apache/PHP, both servers maxed out with how many connections they can handle at once. Would they be somewhat close or would one pull away from the other one by a large margin? I basically just want to know if Tomcat/Java (JSP) is going to be a big performance hit if I switch to it vs PHP. If anyone can give a detailed answer on why one is faster than the other that would be amazing. Links are great too, I was unable to find anything online surprisingly.
Please no Java vs PHP wars, this is about performance only, nothing to do with the languages themselves.
Note: If there are any other concerns I should have about switching to Java from PHP, please let me know. I REALLY hate asking this question because I'm usually the first person to say "program in what you like", but in my situation I need what's also good for the projects I work on. I know that there are large sites written in JSP, but it doesn't mean that they're better.
Thanks
What's good for the projects you're working on is to spend as little time as possible to write them as developer time is way more expensive than any perceived differences in performance. So stick with what you're familiar with.
The answer to your question is: they are both fast enough.
Any such comparison is hard because you end up doing things differently in different languages. Java bytecode is probably faster to interpret, but then again any decent PHP install uses an opcode cache, largely negating any such advantage in real terms.
Java also has a more complicated development model because Web processes are persistent. This can have a performance advantage but also can create problems like memory and other resource leakage, which PHP doesn't tend to have because everything is created and destroyed on each request (barring session information, memcache and so on).
Also PHP extensions can be created for any parts that you want to speed up.
$10,000 can buy an awful lot of hardware. It can buy the hardware to run SO. It doesn't buy much developer time.
I've got experience doing both Java and PHP development. I will generally choose PHP for Web development because of:
quicker to test changes in development (ie no build/deploy steps and Java hot-deploy has serious limitations). Words cannot express how freeing it is to test changes by saving the file you're working on and clicking reload on a browser vs running an Ant/Maven build process;
far fewer issues of memory/resource leakage;
extensive library of functions to do pretty much anything you want;
cheaper to host (at the low end).
I will use Java for some things, like anything that involves a lot of background processing and threading, which aren't PHP's strong points.
You'll note that performance (or the lack thereof) doesn't even rate as a reason for or against.
Sorry if that doesn't answer your question, but such concerns over performance are a pointless distraction.
The best way to answer performance questions is with a benchmark. Implement some simple page in both PHP and Java and then benchmark them using ab (Apache Benchmark).
Having said that, I suspect Java will outperform PHP because of the nature of the two platforms. Java is compiled to bytecode (once) and then executed by a virtual machine, which also JIT-compiles frequently used code to native code. When Tomcat runs, the JVM loads the classes required for any given page and keeps them in memory so they're ready to go when an HTTP request hits the web server. Contrast that with PHP, which reloads and re-interprets the code from scratch with each invocation by Apache. This is helped to a large degree by op-code caching, but still not to the level of what happens in the JVM.
I am currently investigating what Java compatible solutions exist to address my requirements as follows:
Timer based / Schedulable tasks to batch process
Distributed, and by that providing the ability to scale horizontally
Resilience, no SPOFs (single points of failure) please
The nature of these tasks (heavy XML generation, and the delivery to web based receiving nodes) means running them on a single server using something like Quartz isn't viable.
I have heard of technologies like Hadoop and JavaSpaces which have addressed the scaling and resilience end of the problem effectively. Not knowing whether these are quite suited to my requirements, it's hard to know what other technologies might fit well.
I was wondering really what people in this space felt were the options available, and how each plays to its strengths, or suits certain problems better than others.
NB: It's worth noting that schedule-ability is perhaps a hangover from how we do things presently. Yes, there are tasks which ought to go at certain times. It has also been used to throttle throughput at times when no mandate for set times exists.
Asynchronous always brings JMS to mind for me. Send the request message to a queue; a MessageListener is plucked out of the pool to handle it.
This can scale, because the queue and listener can be on a remote server. The size of the listener thread pool can be configured. You can have different listeners for different tasks.
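The queue-and-listener shape can be shown without a broker. To be clear about the assumptions: this is an in-process analogy built on `java.util.concurrent`, not the JMS API; `TaskListener` and `InProcessQueue` are hypothetical names, and a real JMS setup would put the queue on a separate (possibly clustered) server.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical callback, playing the role a JMS MessageListener would play. */
interface TaskListener {
    void onMessage(String message);
}

/** In-process stand-in for "a queue plus a configurable pool of listeners". */
class InProcessQueue implements AutoCloseable {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final ExecutorService listeners;

    InProcessQueue(int listenerCount, TaskListener listener) {
        listeners = Executors.newFixedThreadPool(listenerCount); // the pool size is the throttle
        for (int i = 0; i < listenerCount; i++) {
            listeners.submit(() -> {
                try {
                    while (true) {
                        String message = queue.take(); // wait for work
                        try {
                            listener.onMessage(message);
                        } catch (RuntimeException e) {
                            // One bad message must not take the listener thread down.
                            System.err.println("listener failed on: " + message);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // close() ends the loop this way
                }
            });
        }
    }

    /** Producers return immediately; the work happens asynchronously. */
    void send(String message) {
        queue.add(message);
    }

    @Override
    public void close() {
        listeners.shutdownNow();
    }
}
```

Different task types would get their own queue and listener pool, which mirrors the "different listeners for different tasks" point above.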
UPDATE: You can avoid having a single point of failure by clustering and load balancing.
You can get JMS without cost using ActiveMQ (open source), JBoss (open source version available), or any Java EE app server, so budget isn't a consideration.
And no lock-in, because you're using JMS, besides the fact that you're using Java.
I'd recommend doing it with Spring message driven POJOs. The community edition is open source, of course.
If that doesn't do it for you, have a look at Spring Batch and Spring Integration. Both of those might be useful, and the community editions are open source.
Have you looked into GridGain? I am pretty sure it won't solve the scheduling problem, but you can scale it and it happens like "magic", the code to be executed is sent to a node and it is executed in there. It works fine when you don't have a database connection to be sent (or anything that is not serializable).
Is it possible to dump an image of a running JVM and later restore the previous state by loading the image into the JVM? I'm fairly certain the answer is negative, but would love to be wrong.
With all the dynamic languages available for the JVM comes an increase in interactivity; being able to save a coding session would save the time spent manually restoring the VM to a previous state.
There was a JSR 323 proposed for this a while back but it was rejected. You can find some links in those articles about the research behind this and what it would take. It was mostly rejected as an idea that was too immature.
I have heard of at least one startup (unfortunately don't recall the name) that was working on a virtualization technology over a hypervisor (probably Xen) that was getting pretty close to being able to move JVMs, including even things like file system refs and socket endpoints. Because they were at the hypervisor level, they had access to all of that stuff. By hooking that and the JVM, they had most of the pieces. I think they might have gone under though.
The closest thing you can get today is Terracotta, which allows you to cluster a portion of your JVM heap, storing it in a server array, which can be made persistent. On JVM startup, you connect to the cluster and can continue using whatever portions of your heap are specified as clustered. The actual objects are faulted in on an as-needed basis.
Not possible at present. In general, pausing and restarting a memory image of a process in a different context is incredibly hard to achieve: what are you going to do with open OS resources? Transfers to machines with different instruction sets? database connections?
Also images of the running JVM are probably quite large - maybe much larger than the subset of the state you are actually interested in. So it's not a good idea from a performance perspective.
A much better strategy is to have code that persists and recreates the application state: this is relatively feasible with most JVM dynamic languages. I do similar stuff in Clojure, where you have an interactive environment (REPL) and it is quite possible to create and run a sequence of operations that rebuild the application state that you want in another JVM.
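The same idea in plain Java, as a rough sketch: instead of snapshotting the JVM, keep the sequence of operations and replay it. `ReplayableSession` and its `name=value` command format are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: session state is rebuilt by replaying the commands that produced it. */
class ReplayableSession {
    private final List<String> history = new ArrayList<>();     // commands, in order
    private final Map<String, Integer> state = new HashMap<>(); // the "interesting" state

    /** Applies a command of the form "name=value" and remembers it for replay. */
    void apply(String command) {
        String[] parts = command.split("=", 2);
        state.put(parts[0].trim(), Integer.parseInt(parts[1].trim()));
        history.add(command);
    }

    Map<String, Integer> state() {
        return Map.copyOf(state);
    }

    /** The whole session as text; small enough to write to a file. */
    String save() {
        return String.join("\n", history);
    }

    /** Rebuilds an equivalent session, possibly in a different JVM. */
    static ReplayableSession restore(String saved) {
        ReplayableSession session = new ReplayableSession();
        if (!saved.isEmpty()) {
            for (String command : saved.split("\n")) {
                session.apply(command);
            }
        }
        return session;
    }
}
```

The saved text contains only what the session actually did, so it sidesteps both the image-size concern and the open-resource problem: file handles and connections are reopened by the replayed commands rather than restored.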
This is currently not possible in any of the JVMs I know. It would not be very difficult to implement something like this in the JVM if programs run disconnected from their environments. However, many programs have hooks into their environment (think file handles, database connections) which would make implementing something like this very hairy.
As of early 2023, there's some progress in this space and it seems a lot of things can at least be tried, even if without claims for their production readiness.
One such feature is called CRaC. You can check their docs or even get an OpenJDK build that includes the feature. The project has its own repo under OpenJDK and looks quite promising.
Other vendors/products to check:
Azul ReadyNow!
OpenJ9 InstantOn
What's also really exciting is AWS Lambda SnapStart. It doesn't give you full snapshotting capabilities, and is intrinsically vendor-specific, but it's what a ton of Java engineers who use AWS Lambda had been waiting for.