I have a requirement where I have to host two (different) webapps on the same machine accessible via HTTP but on different ports. I am wondering what is the better solution, or basically what is the difference between
Starting two separate Tomcat instances with specific catalina_home / catalina_base and of course two conf dirs with corresponding server.xml(s)
Having one Tomcat instance and configure multiple Services in a single server.xml. There is a default Catalina Service in the server.xml and adding an another with the specific ports and appbase
Could someone describe which way to choose and why? I am interested in the main differences between the two instances vs two services?
The main difference between both approach is:
Many Tomcat instance
You can start or stop tomcat without affect the other instances, this approach is very helpful when you need to give maintenance to one application but you don't want to affect the availability of the other instances. Each instance will use their own resources, that could be a problem when the machine couldn't guarantee the amount of memory or processor that is need by every tomcat instance, this approach will demand more resources than the other one, in terms of tomcat knowledge this approach is very easy to implement.
One Tomcat instance And Many services x webapp
This approach is helpful when you need to share the resources between the web applications, you can have one single point of configuration for all the web application, with this approach is more difficult to isolate problems between web apps because they coexist in the same tomcat instance, for example if you need to troubleshoot one application how do that if both of them are running in the same tomcat? how read the log files? are the log files of both application in one log file? or are properly separated? be careful here if no proper configuration is perform then it could be nightmare in production. This approach will need more effort and knowledge in the tomcat configuration in order to define a proper separation of services, it is more difficult to configure but in terms of efficiency is better.
How to decide
Well it depend on
a. the amount of the resources of the server.
b. the knowledge level of the IT team in terms of tomcat configuration
c. how critical are the web applications, for example if one application is very critical is better to keep it in a separated tomcat instance because it helps you to isolate in a simple form any problem that can occurs with that specific application.
And finally it will depends on the context where you need to implement your solution and your business needs.
within our company it's kind of standard to create repositories for data which is originally stored in the database as described for example in https://thinkinginobjects.com/2012/08/26/dont-use-dao-use-repository/.
Our web infrastructure consist of a few independent web applications within Tomcat 7 for printing, product description, product order (this is not persisted in the database!), category description etc.
They are all build on Servlet 2 API.
So each instance/implementation of repository holds a specialised kind of data represented by serializable classes and the instances of this serialzable classes are set up/filled by an periodically executed database query (for every resultrow the setters of the fields are called; reminds me of domain oriented entity beans with CMP).
The repositories are initialized on the servlets init sequences (so every servlet keeps it's own set of instances).
Each context has a own connection to the Oracle database (set up by resource description file on deployment).
All the data is read only, we never need to write back to the database.
Because we need some of these data types for more than one web application (context) and some even for more than one servlet within the same web context repositories with an identical data type are instantiated more than once - e.g. four times, twice within the same application.
In the end some of the data is doubled and I'm not sure if this is as clever and efficient as it should be. It should be possible to share the same repository object to more than one application (JNDI?) but at least it must be possible to share it for several servlets within the same application context.
Despite I'm irritated by the idea to use a "self build" repository instead of something like a well tested, open developed cache (ehcache, jcs, ...) because some of these caches also provide options for distributed caches (so it should also work within the same container).
If certain entries are searched the search algorithm iterates over all entries in the repository (s. link above). For every search pattern there are specialised functions which are directly called from within the business logic classes using the "entity beans"; there's no specification object or interface.
In the end the application server as a whole does not perform that well and it uses a hell lot of RAM (at least for approximately 10000 DB entries); this is in my opinion most probably correlated to the use of serializeable XSD-to-JAXB-generated classes.
Additionally every time a application is deployed for tests you have to wait at least two minutes until all entries of the database have been loaded into the repositories - when deploying on live there's a well recognizable out of service phase on context/servlet start up.
I tend to think all of this is closely related to the solutions I described above.
Because I haven't got any experiences in this field and I'm new in the company I don't want to be to obtrusive.
Maybe you can help me to evaluate ideas for a better setup:
Is it for performance and memory better to unify all the repositories into one "repository servlet" and request objects from there via HTTP (don't think so, though it seems quite modular/distributed system friendly) or should I try to go with JNDI (never did that before) and connect to the repository similar to a JDBC database?
Wouldn't it be even more sensible, faster and efficient to at least use only one single connection pool for the whole Tomcat (and reference this connection pool from within the web apps deployment descriptor)? Or might that slow down connections or limit it in any other aspect?
I was told that the cache system (ehcache) didn't work well (at least not with the performance of the self written solution - though: I can't believe that). I imagine the usage of repositories backed by a distributed (as across all contexts) cache used in all web applications should not only reduce memory footprint significantly but should not be significantly slower. - I believe it will be faster and have shorter start up times respectively it shouldn't be needed to redeploy it that often.
I'm very grateful for every tip or hint and your thoughts. Would be marvellous to get a peer review of my ideas based on practical experiences.
So thank you very much in advance!
Is it better to hold a repository for every web application (context) or is it better to share a common instance by JDNI or a similar technique
Unless someone proves me otherwise I would say there is no way to do it, in a standard way, meaning as defined in the Servlet Sepc or in the rest of the Java EE spec canon.
There are technical ways to do it which probably depend on a specific application server implementation, but this cannot be "better" in its universal sense.
If you have two applications that operate on the same data, I wonder whether the partitioning of the applications is useful. Maybe all functionality operating on some kind of data needs to be in the same application?
within our company it's kind of standard to create repositories for data which is originally stored in the database as described for example in https://thinkinginobjects.com/2012/08/26/dont-use-dao-use-repository/.
I looked up Evans in our book shelf. The blog post is quite weird. A repository and a DAO are basically the same thing, it provides CRUD operations for an object or for a tree of objects (Evans says only the the aggregate roots).
The repositories are initialized on the servlets init sequences (so every servlet keeps it's own set of instances). Each context has a own connection to the Oracle database (set up by resource description file on deployment). [ ... ]
In the end the application server as a whole does not perform that well and it uses a hell lot of RAM
When something performs badly its the best to do profiling, e.g. with YourKit or with perf and FlameGraphs if you are on Linux. If your applications need a lot of RAM, analyze the heap e.g. with Eclipse MAT. There is no way somebody can give you a recommendation or hint on a best practice without seeing any line of code.
A general answer would include anyting about performance tuning for Oracle DBs, JDBC, Java Collections and Concurrent Programming, Networking and Operating Systems.
I was told that the cache system (ehcache) didn't work well (at least not with the performance of the self written solution - though: I can't believe that)
I can. EHCache is between 10-20 times slower then a simple HashMap. See: cache benchmarks. You only need a map, when you do a complete preload and don't have any mutations.
I imagine the usage of repositories backed by a distributed (as across all contexts) cache used in all web applications should not only reduce memory footprint significantly but should not be significantly slower
Distributed caches need to go over the network and add serialization/deserialization overhead. That's probably another factor 30 slower. When is the distributed cache updated?
I'm very grateful for every tip or hint and your thoughts.
Wrap up:
Do the normal software engineering homework, do profiling and analyzing and spend the effort of tuning at the right places
Ask specific questions on one topic on stackoverflow and share your code and performance data. Ask a question about one thing at one time and read https://stackoverflow.com/help/on-topic
You may also come to the conclusion that there is nothing to tune. There are applications out there that need a day to build up an in memory data structure from persistent data. Maybe its just a lot of data? If you do not like the downtime use green blue deployment. Also use smaller data sets for development and testing
I have an administrative page in a web application that resets the cache, but it only resets the cache on the current JVM.
The web application is deployed as a cluster to two WAS servers.
Any way that I can elegantly have the "clear cache" button on each server trigger the method on both JVMs?
Edit:
The original developer just wrote a singleton holding a HashMap to be the cache in question. Lightweight and (previously) worked just fine for the requirements. It caches content pulled from six or seven web services for specified amounts of time.
Edit:
The entire application in question is three pages, so the elegant solution might well be the lightest solution.
Since the Cache is internal to your application you are going to need to provide an interface to clear it within your application. Quotidian says to use a JMS queue. That will not work because only one instance will pick up the message assuming you have clustered MQ Queues.
If you want to reuse the same implementation then the only way to do this is to write some JMX that you will be able to interact with at the instance level.
If not you can either use the built in WAS cache (which is JMX enabled) or use a distributed cache like ehcache.
in the past I have created a subclassed LinkedHashMap that was linked to all instances on the network using JBOSS JGroups. Of course reinventing the wheel is always painful.
I tend to use a JMS queue for doing exactly that.
We currently have a web application loading a Spring application context which instantiates a stack of business objects, DAO objects and Hibernate. We would like to share this stack with another web application, to avoid having multiple instances of the same objects.
We have looked into several approaches; exposing the objects using JMX or JNDI, or using EJB3.
The different approaches all have their issues, and we are looking for a lightweight method.
Any suggestions on how to solve this?
Edit: I have received comments requesting me to elaborate a bit, so here goes:
The main problem we want to solve is that we want to have only one instance of Hibernate. This is due to problems with invalidation of Hibernate's 2nd level cache when running several client applications working with the same datasource. Also, the business/DAO/Hibernate stack is growing rather large, so not duplicating it just makes more sense.
First, we tried to look at how the business layer alone could be exposed to other web apps, and Spring offers JMX wrapping at the price of a tiny amount of XML. However, we were unable to bind the JMX entities to the JNDI tree, so we couldn't lookup the objects from the web apps.
Then we tried binding the business layer directly to JNDI. Although Spring didn't offer any method for this, using JNDITemplate to bind them was also trivial. But this led to several new problems: 1) Security manager denies access to RMI classloader, so the client failed once we tried to invoke methods on the JNDI resource. 2) Once the security issues were resolved, JBoss threw IllegalArgumentException: object is not an instance of declaring class. A bit of reading reveals that we need stub implementations for the JNDI resources, but this seems like a lot of hassle (perhaps Spring can help us?)
We haven't looked too much into EJB yet, but after the first two tries I'm wondering if what we're trying to achieve is at all possible.
To sum up what we're trying to achieve: One JBoss instance, several web apps utilizing one stack of business objects on top of DAO layer and Hibernate.
Best regards,
Nils
Are the web applications deployed on the same server?
I can't speak for Spring, but it is straightforward to move your business logic in to the EJB tier using Session Beans.
The application organization is straight forward. The Logic goes in to Session Beans, and these Session Beans are bundled within a single jar as an Java EE artifact with a ejb-jar.xml file (in EJB3, this will likely be practically empty).
Then bundle you Entity classes in to a seperate jar file.
Next, you will build each web app in to their own WAR file.
Finally, all of the jars and the wars are bundled in to a Java EE EAR, with the associated application.xml file (again, this will likely be quite minimal, simply enumerating the jars in the EAR).
This EAR is deployed wholesale to the app server.
Each WAR is effectively independent -- their own sessions, there own context paths, etc. But they share the common EJB back end, so you have only a single 2nd level cache.
You also use local references and calling semantic to talk to the EJBs since they're in the same server. No need for remote calls here.
I think this solves quite well the issue you're having, and its is quite straightforward in Java EE 5 with EJB 3.
Also, you can still use Spring for much of your work, as I understand, but I'm not a Spring person so I can not speak to the details.
What about spring parentContext?
Check out this article:
http://springtips.blogspot.com/2007/06/using-shared-parent-application-context.html
Terracotta might be a good fit here (disclosure: I am a developer for Terracotta). Terracotta transparently clusters Java objects at the JVM level, and integrates with both Spring and Hibernate. It is free and open source.
As you said, the problem of more than one client web app using an L2 cache is keeping those caches in synch. With Terracotta you can cluster a single Hibernate L2 cache. Each client node works with it's copy of that clustered cache, and Terracotta keeps it in synch. This link explains more.
As for your business objects, you can use Terracotta's Spring integration to cluster your beans - each web app can share clustered bean instances, and Terracotta keeps the clustered state in synch transparently.
Actually, if you want a lightweight solution and don't need transactions or clustering just use Spring support for RMI. It allows to expose Spring beans remotely using simple annotations in the latest versions. See http://static.springframework.org/spring/docs/2.0.x/reference/remoting.html.
You should take a look at the Terracotta Reference Web Application - Examinator. It has most of the components you are looking for - it's got Hibernate, JPA, and Spring with a MySQL backend.
It's been pre-tuned to scale up to 16 nodes, 20k concurrent users.
Check it out here: http://reference.terracotta.org/examinator
Thank you for your answers so far. We're still not quite there, but we have tried a few things now and see things more clearly. Here's a short update:
The solution which appears to be the most viable is EJB. However, this will require some amount of changes in our code, so we're not going to fully implement that solution right now. I'm almost surprised that we haven't been able to find some Spring feature to help us out here.
We have also tried the JNDI route, which ends with the need for stubs for all shared interfaces. This feels like a lot of hassle, considering that everything is on the same server anyway.
Yesterday, we had a small break through with JMX. Although JMX is definately not meant for this kind of use, we have proven that it can be done - with no code changes and a minimal amount of XML (a big Thank You to Spring for MBeanExporter and MBeanProxyFactoryBean). The major drawbacks to this method are performance and the fact that our domain classes must be shared through JBoss' server/lib folder. I.e., we have to remove some dependencies from our WARs and move them to server/lib, else we get ClassCastException when the business layer returns objects from our own domain model. I fully understand why this happens, but it is not ideal for what we're trying to achieve.
I thought it was time for a little update, because what appears to be the best solution will take some time to implement. I'll post our findings here once we've done that job.
Spring does have an integration point that might be of interest to you: EJB 3 injection nterceptor. This enables you to access spring beans from EJBs.
I'm not really sure what you are trying to solve; at the end of the day each jvm will either have replicated instances of the objects, or stubs representing objects existing on another (logical) server.
You could, setup a third 'business logic' server that has a remote api which your two web apps could call. The typical solution is to use EJB, but I think spring has remoting options built into its stack.
The other option is to use some form of shared cache architecture... which will synchronize object changes between the servers, but you still have two sets of instances.
Take a look at JBossCache. It allows you to easily share/replicate maps of data between mulitple JVM instances (same box or different). It is easy to use and has lots of wire level protocol options (TCP, UDP Multicast, etc.).
Is there a way to deactivate the optimization for collocated objects that Weblogic uses by default for a specific EJB ?
EDIT: Some context :
We have a scheduler service that runs inside one node of the cluster. This is for historic reasons and cannot be changed at the moment.
This service makes call to an EJB and we would like to load balance these calls. Unfortunately at the moment every calls runs on the node that hosts the scheduler service because of the optimization mentioned in the question.
I was thinking of coding a custom load balancing class however this optimization seems to be done before the load balancing step happens.
Supposing you are trying to call a remote EJB (load balancing on local ejbs can only be obtained through an indirection trick like Patrick mentioned) you will have to create a new InitialContext using the address of the cluster instead of a particular server. This new IC will provide stubs as if you were a foreign client, subject to the same load balancing strategies as they are.
Unfortunately, this means that EJB3 injections won't work. You will have to do the lookup yourself. There is a chance, an this is pure speculation, that those stubs you can get from the cluster IC are serializable. In other words, it might be possible to bind them and get them injected using #Resource afterwards.
Not being too familiar with guts of weblogic but reading their material I would say you cannot, without some amount of trickery.
It does seem though that you can put a JMS (MDB) facade in front of your EJB that does not abide by the collocated object optimization.
OR
If your scheduler is servlet based, you should be able to deploy it in a separate web-application within the container and have it execute calls to the EJB cluster.
How many nodes are there in your cluster? Have you considered deploying the EJB to nodes other than the one the scheduler service has been deployed to?
I happen to stumble upon this old question.
Is it possible that you create a JMS queue in between the scheduler and the actual executor? Since all managed servers will consume from the same queue, the queue will act as a load-balancer, in a way.
If you have solved this issue differently, I'd be interested to know how :)